Transform audio into stunning visualizations
Music Video Creator is a Python framework that transforms WAV audio files into professional music visualization videos. Using STFT (Short-Time Fourier Transform) analysis and GPU-accelerated rendering, it extracts audio features and maps them to dynamic visual effects.
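
To make the STFT step concrete, here is a minimal sketch using only NumPy. It is not the framework's own code (which is not public); the function name `stft_magnitude`, the FFT size, the hop length, and the synthetic test tone are all assumptions chosen for illustration.

```python
import numpy as np

def stft_magnitude(samples, sample_rate, n_fft=2048, hop=512):
    """Return (magnitudes, freqs, times) for a mono signal.

    magnitudes has shape (n_frames, n_fft // 2 + 1).
    Assumes len(samples) >= n_fft.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(samples) - n_fft) // hop
    # Slice overlapping frames and taper each one with a Hann window.
    frames = np.stack(
        [samples[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    # Time stamp of each frame, taken at the frame center.
    times = (np.arange(n_frames) * hop + n_fft / 2) / sample_rate
    return magnitudes, freqs, times

# A synthetic one-second 440 Hz tone stands in for a loaded WAV file.
sample_rate = 22050
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

magnitudes, freqs, times = stft_magnitude(tone, sample_rate)
peak_hz = freqs[np.argmax(magnitudes.mean(axis=0))]
print(f"{magnitudes.shape[0]} frames, strongest bin near {peak_hz:.0f} Hz")
```

Each row of `magnitudes` is the spectrum of one short slice of audio, which is the kind of per-frame data a visual style can react to.
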
Key Capabilities:
GPU styles (ModernGL):

| Style | Description |
|---|---|
| sphere_3d_gpu | 3D audio-reactive sphere with noise displacement |
| terrain_3d_gpu | Synthwave terrain flyover |
| tunnel_3d_gpu | Infinite tunnel with bass-reactive walls |
| reactive_disk_gpu | Bass-reactive spinning disk |
| particles_gpu | GPU particle system |
| frequency_bars_gpu | GPU spectrum analyzer bars |
| circular_gpu | GPU radial spectrum |
| waveform_gpu | GPU waveform display |
| spectrum_texture_gpu | 128-bin spectrum texture mapping |
| sphere_3d_calm_gpu | Calm sphere (reduced reactivity) |

Matplotlib styles:

| Style | Description |
|---|---|
| waveform_spectrogram | Combined waveform + spectrogram |
| spectrogram_only | Spectrogram with playhead |
| waveform_only | Waveform with playhead |
| frequency_bars | Classic spectrum analyzer |
| particles | Audio-reactive particle system |
| geometric | Pulsing polygons and circles |
| circular_spectrum | Radial spectrum analyzer |
| beat_pulse | Rhythm-synced pulsing effects |
| beat_rings | Expanding rings on beats |
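
Several of the styles above are described as bass-reactive, and one as a calm variant with reduced reactivity. The sketch below shows one common way such a mapping could work: an attack/release follower smooths the bass energy, and a scale factor controls how strongly it drives a visual parameter. The names `smooth_envelope`, `displacement`, and `reactivity`, and all constants, are hypothetical and not taken from the framework.

```python
def smooth_envelope(values, attack=0.6, release=0.1):
    """One-pole follower: rises quickly (attack), falls slowly (release)."""
    out = []
    level = 0.0
    for v in values:
        coeff = attack if v > level else release
        level += coeff * (v - level)
        out.append(level)
    return out

def displacement(bass_energy, reactivity=1.0, base=0.05, max_extra=0.5):
    """Map normalized bass energy (0..1) per frame to a displacement amount.

    `reactivity` is a hypothetical knob: lower values give a calmer look.
    """
    envelope = smooth_envelope(bass_energy)
    return [base + reactivity * max_extra * e for e in envelope]

# One loud bass hit followed by near silence.
bass = [0.0, 1.0, 0.2, 0.1, 0.0, 0.0]
normal = displacement(bass, reactivity=1.0)
calm = displacement(bass, reactivity=0.3)
for frame, (n, c) in enumerate(zip(normal, calm)):
    print(f"frame {frame}: normal={n:.3f} calm={c:.3f}")
```

The slow release keeps the shape from snapping back immediately after a hit, which tends to look smoother than feeding raw energy values straight into the renderer.
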




         WAV Input
             │
             ▼
    ┌─────────────────┐
    │  Audio Loader   │  Load, normalize, mono conversion
    └────────┬────────┘
             │
             ▼
    ┌─────────────────┐
    │  STFT Analysis  │  Short-Time Fourier Transform
    └────────┬────────┘
             │
             ▼
    ┌─────────────────┐
    │    Features     │  RMS, bass/mid/high energy,
    │   Extraction    │  spectral centroid, onset strength
    └────────┬────────┘
             │
             ▼
    ┌─────────────────┐
    │ Beat Detection  │  Onset detection + peak picking
    │ Scene Detection │  Auto-segment: intro→verse→drop→outro
    └────────┬────────┘
             │
             ▼
    ┌─────────────────┐
    │  Visualization  │  19 styles (Matplotlib or GPU/ModernGL)
    │    + Effects    │  Zoom, flash, shake on beats
    └────────┬────────┘
             │
             ▼
    ┌─────────────────┐
    │    Rendering    │  FFmpeg pipe (memory-efficient)
    │                 │  Optional: 1440p upscale for YouTube VP9
    └────────┬────────┘
             │
             ▼
      MP4/GIF Output
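
The feature extraction and beat detection stages in the pipeline above can be sketched with NumPy alone. This is an illustrative approximation, not the framework's implementation: the band edges (250 Hz and 4 kHz), the use of spectral flux as onset strength, the threshold rule, and every function name are assumptions. Scene detection is not covered here.

```python
import numpy as np

def frame_signal(samples, n_fft=1024, hop=256):
    """Slice a mono signal into overlapping frames of length n_fft."""
    n_frames = 1 + (len(samples) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    return samples[idx]

def extract_features(samples, sample_rate, n_fft=1024, hop=256):
    """Per-frame RMS, band energies, spectral centroid, and onset strength."""
    frames = frame_signal(samples, n_fft, hop)
    mags = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)

    def band(lo, hi):
        mask = (freqs >= lo) & (freqs < hi)
        return (mags[:, mask] ** 2).sum(axis=1)

    total = mags.sum(axis=1)
    centroid = (mags * freqs).sum(axis=1) / np.maximum(total, 1e-12)
    # Spectral flux: summed positive magnitude change between frames.
    flux = np.maximum(np.diff(mags, axis=0), 0.0).sum(axis=1)
    return {
        "rms": np.sqrt((frames ** 2).mean(axis=1)),
        "bass": band(0, 250),            # band edges are illustrative guesses
        "mid": band(250, 4000),
        "high": band(4000, sample_rate / 2 + 1),
        "centroid": centroid,
        "onset": np.concatenate([[0.0], flux]),
        "times": (np.arange(len(frames)) * hop + n_fft / 2) / sample_rate,
    }

def pick_beats(onset, times, min_gap_s=0.25, k=1.0):
    """Keep local maxima above mean + k * std, at least min_gap_s apart."""
    threshold = onset.mean() + k * onset.std()
    beats = []
    for i in range(1, len(onset) - 1):
        is_peak = onset[i] > onset[i - 1] and onset[i] >= onset[i + 1]
        far_enough = not beats or times[i] - beats[-1] >= min_gap_s
        if is_peak and onset[i] > threshold and far_enough:
            beats.append(float(times[i]))
    return beats

# Synthetic input: an 80 Hz decaying "kick" every 0.5 s (120 BPM) for 4 s.
sample_rate = 22050
signal = np.zeros(4 * sample_rate)
t = np.arange(2205) / sample_rate
kick = np.sin(2 * np.pi * 80.0 * t) * np.exp(-40.0 * t)
for k in range(1, 8):
    start = k * sample_rate // 2
    signal[start : start + len(kick)] += kick

features = extract_features(signal, sample_rate)
beats = pick_beats(features["onset"], features["times"])
print("beat times (s):", [round(b, 2) for b in beats])
```

On this synthetic input the detected beats should land roughly 0.5 s apart. Real music usually needs more careful thresholding, for example an adaptive local threshold instead of a single global one.
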

Usage examples:

    # Basic rendering with GPU style
    python main.py song.wav -o output.mp4 --style sphere_3d_gpu --pipe

    # YouTube-optimized (1440p upscale forces VP9 codec)
    python main.py song.wav -o output.mp4 --youtube --style reactive_disk_gpu

    # Scene-based rendering with genre preset
    python main.py song.wav -o output.mp4 --scenes --genre electronic --pipe

    # With beat effects
    python main.py song.wav -o output.mp4 --beat-zoom --beat-flash --style terrain_3d_gpu

    # GIF for Discord
    python main.py song.wav -o preview.gif --gif --gif-preset discord --max-duration 10

    # With lyrics (Whisper transcription)
    python main.py song.wav -o output.mp4 --lyrics --style sphere_3d_gpu

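
The `--pipe` flag used in some of the commands above selects FFmpeg pipe rendering. A plausible way to do this, sketched below, is to stream raw RGB frames to an `ffmpeg` process over stdin so that frames never accumulate in memory. The helper names and the `libx264`/`aac` codec choice are assumptions for illustration; the framework's actual encoder settings are not documented here.

```python
import subprocess

def ffmpeg_pipe_command(width, height, fps, audio_path, output_path):
    """Build an ffmpeg argv that reads raw RGB frames from stdin."""
    return [
        "ffmpeg", "-y",
        "-f", "rawvideo", "-pix_fmt", "rgb24",
        "-s", f"{width}x{height}", "-r", str(fps),
        "-i", "-",                  # video frames arrive on stdin
        "-i", audio_path,           # original audio is muxed in
        "-c:v", "libx264", "-pix_fmt", "yuv420p",  # assumed codec choice
        "-c:a", "aac", "-shortest",
        output_path,
    ]

def render_through_pipe(frames, width, height, fps, audio_path, output_path):
    """Write each frame to ffmpeg as soon as it is produced.

    `frames` is any iterable yielding bytes of length width * height * 3.
    Requires an ffmpeg binary on PATH.
    """
    cmd = ffmpeg_pipe_command(width, height, fps, audio_path, output_path)
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
    try:
        for frame in frames:
            if len(frame) != width * height * 3:
                raise ValueError("frame size does not match width x height x RGB")
            proc.stdin.write(frame)
    finally:
        proc.stdin.close()
        proc.wait()
    return proc.returncode

# Only build and show the command here; nothing is executed.
cmd = ffmpeg_pipe_command(1280, 720, 30, "song.wav", "output.mp4")
print(" ".join(cmd))
```

Because `frames` can be a generator, only one frame needs to exist in memory at a time, which is the usual reason a piped renderer has a low, roughly constant memory footprint.
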
| Renderer | Style Type | Speed | Memory |
|---|---|---|---|
| FFmpeg Pipe | GPU | ~30 fps | Low (streaming) |
| FFmpeg Pipe | Matplotlib | ~0.3 fps | Low (streaming) |
| Standard | GPU | ~25 fps | Medium |
| Standard | Matplotlib | ~0.2 fps | High (RAM) |
GPU styles render roughly 100x faster than Matplotlib styles (about 30 fps versus 0.3 fps through the FFmpeg pipe).
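
To put the throughput figures in perspective, the short calculation below estimates wall-clock render time. It assumes the fps values in the table are render throughput, and it assumes a 3-minute song rendered at 30 output frames per second; neither assumption is stated by the source.

```python
def render_time_seconds(duration_s, output_fps, render_fps):
    """Wall-clock seconds needed to render a clip at a given throughput."""
    total_frames = duration_s * output_fps
    return total_frames / render_fps

# Assumed workload: 180 s of audio at 30 output frames per second.
duration_s, output_fps = 180, 30
for label, render_fps in [("GPU + pipe", 30), ("Matplotlib + pipe", 0.3)]:
    minutes = render_time_seconds(duration_s, output_fps, render_fps) / 60
    print(f"{label}: about {minutes:.0f} min")
```

Under these assumptions the GPU path finishes in about 3 minutes, while the Matplotlib path takes about 5 hours.
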
All Rights Reserved. This showcase is for demonstration purposes only. The source code is not publicly available.
Music Video Creator was developed to provide musicians with professional-quality visualizations for their audio content. It’s optimized for YouTube publishing with automatic VP9 codec selection and supports various social media platforms through GIF presets.
Tech Stack: Python, NumPy, SciPy, Matplotlib, ModernGL, FFmpeg, faster-whisper