Short-Time Fourier Transform (STFT)
The STFT breaks audio into overlapping windows and computes the frequency spectrum for each. This creates a time-frequency representation that drives all visualizations.
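Window size - sets the tradeoff between frequency resolution and time resolution: longer windows resolve frequencies more finely but blur timing
Hop length - sets the overlap between consecutive windows (75%)
Window function - tapers each frame to reduce spectral leakage caused by discontinuities at window boundaries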
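The three parameters above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the project's implementation: the function names, the 64-sample window, the 8000 Hz sample rate, and the choice of a Hann window are all assumptions made for the example, and the naive DFT stands in for the FFT a real analyzer would use.

```python
import cmath
import math

def hann(n):
    # Periodic Hann window (assumed here; the source does not name the window)
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def stft(signal, n_fft=64, overlap=0.75):
    # 75% overlap means each frame starts a quarter window after the previous one
    hop = int(n_fft * (1 - overlap))
    window = hann(n_fft)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(n_fft)]
        # Naive O(n^2) DFT over the non-negative frequency bins, for clarity only
        spectrum = [
            abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n_fft)
                    for t in range(n_fft)))
            for k in range(n_fft // 2 + 1)
        ]
        frames.append(spectrum)
    return frames

# Hypothetical test signal: a 1000 Hz sine sampled at 8000 Hz
sample_rate = 8000
n_fft = 64
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]

spec = stft(tone, n_fft=n_fft)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / n_fft  # bin width is sample_rate / n_fft
```

With these values each bin spans 8000 / 64 = 125 Hz and each window covers 64 / 8000 = 8 ms. Doubling the window size would halve the bin width and double the window duration, which is the resolution tradeoff in concrete terms.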
Extracted Audio Features
RMS (Root Mean Square) - measures loudness/intensity of the signal
Spectral centroid - the "center of mass" of the spectrum; indicates brightness/timbre
Bass energy - energy in low frequencies (20-250 Hz)
Mid energy - energy in mid frequencies (250-2000 Hz); vocals, instruments
High energy - energy in high frequencies (2000-8000 Hz); cymbals, brilliance
Onset detection - energy derivative followed by thresholding
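The features listed above can each be written as a short function. The sketch below assumes a time-domain frame for RMS and a list of magnitude bins (from one STFT frame) for the spectral features. All function names and the threshold value are hypothetical, and treating the band edges as half-open intervals is an assumption made so that no bin is counted in two bands.

```python
import math

def rms(frame):
    # Root mean square of the time-domain samples
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def spectral_centroid(magnitudes, sample_rate, n_fft):
    # Magnitude-weighted mean frequency; 0.0 for a silent frame
    total = sum(magnitudes)
    if total == 0:
        return 0.0
    weighted = sum(k * sample_rate / n_fft * m for k, m in enumerate(magnitudes))
    return weighted / total

def band_energy(magnitudes, sample_rate, n_fft, lo_hz, hi_hz):
    # Sum of squared magnitudes for bins whose frequency lies in [lo_hz, hi_hz)
    return sum(
        m * m
        for k, m in enumerate(magnitudes)
        if lo_hz <= k * sample_rate / n_fft < hi_hz
    )

def detect_onsets(energies, threshold):
    # Flag frame i when energy rises by more than `threshold` since frame i - 1
    return [
        i for i in range(1, len(energies))
        if energies[i] - energies[i - 1] > threshold
    ]

# Toy inputs, chosen only to make the arithmetic easy to follow
loudness = rms([1, -1, 1, -1])
centroid = spectral_centroid([0, 0, 1, 0, 0], sample_rate=8000, n_fft=8)
mid = band_energy([1, 2, 3, 4, 5], sample_rate=8000, n_fft=8, lo_hz=250, hi_hz=2000)
onsets = detect_onsets([0.1, 0.1, 0.9, 0.8, 0.1, 1.0], threshold=0.5)
```

In the toy spectrum the bins sit at 0, 1000, 2000, 3000 and 4000 Hz, so only the 1000 Hz bin falls in the mid band, and the onset detector fires at frames 2 and 5 where the energy jumps by 0.8 and 0.9.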
Analysis Pipeline
GPU Shader Uniforms
Audio features are passed to GPU shaders as uniform variables every frame.
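As a rough illustration of the per-frame hand-off, the sketch below pushes a dictionary of feature values through a caller-supplied setter. The `u_` prefix, the feature names, and the `set_uniform` callback are hypothetical; in a real renderer the callback would wrap whatever the graphics API provides for setting a float uniform (for example `glUniform1f` in OpenGL).

```python
def upload_features(features, set_uniform):
    # Called once per rendered frame: one float uniform per audio feature
    for name, value in features.items():
        set_uniform("u_" + name, float(value))

# Stand-in for a GPU binding: record what would have been uploaded
sent = {}
upload_features({"rms": 0.42, "bass": 0.8}, sent.__setitem__)
```

Passing the setter in as a parameter keeps the analysis code independent of any particular graphics library.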
Configuration
Audio analysis parameters can be configured via YAML or CLI.