music-video-creator-showcase

Music Video Creator

Transform audio into stunning visualizations

Requires Python 3.8+.


Overview

Music Video Creator is a Python framework that transforms WAV audio files into professional music visualization videos. Using STFT (Short-Time Fourier Transform) analysis and GPU-accelerated rendering, it extracts audio features and maps them to dynamic visual effects.
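
The source code is not public, so the following is only a minimal sketch of how STFT analysis can yield per-frame band energies. The function names (`stft_power`, `band_energies`), the band edges, and the window and hop sizes are illustrative assumptions, not the framework's actual API or settings.

```python
import numpy as np

def stft_power(samples, n_fft=2048, hop=512):
    """Power spectrogram, shape (n_frames, n_fft // 2 + 1), from a Hann-windowed STFT."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(samples) - n_fft) // hop
    frames = np.stack([samples[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

# Band edges in Hz are assumed for illustration only.
BANDS = {"bass": (20, 250), "mid": (250, 4000), "high": (4000, float("inf"))}

def band_energies(samples, sample_rate, n_fft=2048, hop=512):
    """Sum the power of each frame inside every frequency band."""
    power = stft_power(samples, n_fft, hop)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    return {name: power[:, (freqs >= lo) & (freqs < hi)].sum(axis=1)
            for name, (lo, hi) in BANDS.items()}

# Demo: one second of a 100 Hz sine, which should land in the bass band.
sample_rate = 22050
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 100 * t)
energy = band_energies(tone, sample_rate)
```

Each array in `energy` holds one value per analysis frame, which is the kind of signal that could drive a visual parameter such as sphere displacement or bar height.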

Key Capabilities:


Visual Styles

GPU-Accelerated (Fast)

| Style | Description |
| --- | --- |
| sphere_3d_gpu | 3D audio-reactive sphere with noise displacement |
| terrain_3d_gpu | Synthwave terrain flyover |
| tunnel_3d_gpu | Infinite tunnel with bass-reactive walls |
| reactive_disk_gpu | Bass-reactive spinning disk |
| particles_gpu | GPU particle system |
| frequency_bars_gpu | GPU spectrum analyzer bars |
| circular_gpu | GPU radial spectrum |
| waveform_gpu | GPU waveform display |
| spectrum_texture_gpu | 128-bin spectrum texture mapping |
| sphere_3d_calm_gpu | Calm sphere (reduced reactivity) |

Matplotlib (Detailed)

| Style | Description |
| --- | --- |
| waveform_spectrogram | Combined waveform + spectrogram |
| spectrogram_only | Spectrogram with playhead |
| waveform_only | Waveform with playhead |
| frequency_bars | Classic spectrum analyzer |
| particles | Audio-reactive particle system |
| geometric | Pulsing polygons and circles |
| circular_spectrum | Radial spectrum analyzer |
| beat_pulse | Rhythm-synced pulsing effects |
| beat_rings | Expanding rings on beats |

Sample Output

3D Sphere with Lyrics

Terrain Flyover

Reactive Disk

Style Mashup


Architecture

WAV Input
    │
    ▼
┌─────────────────┐
│  Audio Loader   │  Load, normalize, mono conversion
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  STFT Analysis  │  Short-Time Fourier Transform
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│    Features     │  RMS, bass/mid/high energy,
│   Extraction    │  spectral centroid, onset strength
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Beat Detection  │  Onset detection + peak picking
│ Scene Detection │  Auto-segment: intro→verse→drop→outro
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Visualization  │  19 styles (Matplotlib or GPU/ModernGL)
│    + Effects    │  Zoom, flash, shake on beats
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Rendering     │  FFmpeg pipe (memory-efficient)
│                 │  Optional: 1440p upscale for YouTube VP9
└────────┬────────┘
         │
         ▼
    MP4/GIF Output
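
The beat detection stage in the diagram (onset detection followed by peak picking) can be sketched roughly as below. This is an assumption-laden illustration: the half-wave rectified difference used as onset strength, the threshold ratio, the minimum gap, and the function names are all hypothetical and may differ from what the framework does.

```python
import numpy as np

def onset_strength(energy):
    """Half-wave rectified frame-to-frame increase of an energy envelope."""
    diff = np.diff(energy, prepend=energy[0])
    return np.maximum(diff, 0.0)

def pick_peaks(strength, threshold_ratio=0.3, min_gap=4):
    """Return frame indices of local maxima above a relative threshold."""
    threshold = threshold_ratio * strength.max()
    peaks = []
    for i in range(1, len(strength) - 1):
        is_local_max = strength[i] > strength[i - 1] and strength[i] >= strength[i + 1]
        far_enough = not peaks or i - peaks[-1] >= min_gap
        if is_local_max and strength[i] >= threshold and far_enough:
            peaks.append(i)
    return peaks

# Demo: a synthetic envelope with three decaying hits at frames 10, 30 and 50.
envelope = np.zeros(60)
for start in (10, 30, 50):
    envelope[start:start + 4] = [1.0, 0.6, 0.3, 0.1]

beat_frames = pick_peaks(onset_strength(envelope))

# Convert frame indices to seconds (hop size and sample rate are assumed values).
hop, sample_rate = 512, 22050
beat_times = [frame * hop / sample_rate for frame in beat_frames]
```

Beat times obtained this way are what a renderer would need in order to trigger the zoom, flash, or shake effects at the right frames.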

CLI Examples

# Basic rendering with GPU style
python main.py song.wav -o output.mp4 --style sphere_3d_gpu --pipe

# YouTube-optimized (1440p upscale forces VP9 codec)
python main.py song.wav -o output.mp4 --youtube --style reactive_disk_gpu

# Scene-based rendering with genre preset
python main.py song.wav -o output.mp4 --scenes --genre electronic --pipe

# With beat effects
python main.py song.wav -o output.mp4 --beat-zoom --beat-flash --style terrain_3d_gpu

# GIF for Discord
python main.py song.wav -o preview.gif --gif --gif-preset discord --max-duration 10

# With lyrics (Whisper transcription)
python main.py song.wav -o output.mp4 --lyrics --style sphere_3d_gpu
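
What `--pipe` does internally is not documented here. A common way to get memory-efficient rendering is to stream raw frames to FFmpeg over stdin instead of holding them in RAM, and the sketch below shows that pattern. The function names, the codec choices (`libx264`, `aac`), and the RGB24 frame layout are assumptions for illustration, not the project's implementation.

```python
import subprocess

def build_ffmpeg_command(width, height, fps, audio_path, output_path):
    """Build an FFmpeg command that reads raw RGB frames from stdin and muxes the audio."""
    return [
        "ffmpeg", "-y",
        "-f", "rawvideo", "-pix_fmt", "rgb24",      # describe the piped frames
        "-s", f"{width}x{height}", "-r", str(fps),
        "-i", "-",                                  # video comes from stdin
        "-i", audio_path,                           # original audio track
        "-c:v", "libx264", "-pix_fmt", "yuv420p",
        "-c:a", "aac", "-shortest",
        output_path,
    ]

def render(frames, width, height, fps, audio_path, output_path):
    """Write each frame (width * height * 3 bytes) to FFmpeg as soon as it is produced."""
    cmd = build_ffmpeg_command(width, height, fps, audio_path, output_path)
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
    try:
        for frame in frames:
            proc.stdin.write(frame)
    finally:
        proc.stdin.close()
    return proc.wait()

# Only the command is built here; calling render() requires FFmpeg on the PATH.
cmd = build_ffmpeg_command(1280, 720, 30, "song.wav", "output.mp4")
```

Because `frames` can be a generator, only one frame needs to exist in memory at a time, which would be consistent with the low streaming memory use reported for the pipe renderer.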

Performance

| Renderer | Style Type | Speed | Memory |
| --- | --- | --- | --- |
| FFmpeg Pipe | GPU | ~30 fps | Low (streaming) |
| FFmpeg Pipe | Matplotlib | ~0.3 fps | Low (streaming) |
| Standard | GPU | ~25 fps | Medium |
| Standard | Matplotlib | ~0.2 fps | High (RAM) |

GPU styles render roughly 100x faster than Matplotlib styles (about 30 fps versus 0.3 fps through the FFmpeg pipe).
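
To put these speeds in perspective, here is a small back-of-the-envelope estimate. It assumes the Speed figures mean frames rendered per second of wall-clock time and that the output video runs at 30 fps; both are assumptions, and the helper name is hypothetical.

```python
def estimate_render_seconds(duration_s, output_fps, render_fps):
    """Wall-clock seconds needed to render every frame of the video."""
    total_frames = duration_s * output_fps
    return total_frames / render_fps

song_seconds = 180   # a three-minute track
output_fps = 30      # assumed output frame rate

gpu_seconds = estimate_render_seconds(song_seconds, output_fps, render_fps=30)
mpl_seconds = estimate_render_seconds(song_seconds, output_fps, render_fps=0.3)

print(f"GPU pipe:        about {gpu_seconds / 60:.0f} minutes")
print(f"Matplotlib pipe: about {mpl_seconds / 3600:.0f} hours")
```

Under these assumptions a three-minute song takes about 3 minutes with a GPU style and about 5 hours with a Matplotlib style.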


Interactive Demos

Explore the capabilities interactively:


Documentation


License

All Rights Reserved. This showcase is for demonstration purposes only. The source code is not publicly available.


About

Music Video Creator was developed to provide musicians with professional-quality visualizations for their audio content. It’s optimized for YouTube publishing with automatic VP9 codec selection and supports various social media platforms through GIF presets.

Tech Stack: Python, NumPy, SciPy, Matplotlib, ModernGL, FFmpeg, faster-whisper