About
Spleeter is an open-source audio source separation tool developed by Deezer, one of the world's leading music streaming services. Built on Python and TensorFlow, it uses deep learning models to decompose a mixed audio track into isolated stems such as vocals, drums, bass, piano, and other instruments. The library ships with pretrained models for 2-stem (vocals/accompaniment), 4-stem (vocals, drums, bass, other), and 5-stem (vocals, drums, bass, piano, other) separation. These models were trained on professional music datasets, so they give high-quality results on real-world audio out of the box.
Spleeter processes audio much faster than real time on a GPU and remains practical on a CPU. Primary use cases include music production, remixing, karaoke track generation, music information retrieval research, podcast editing, and vocal extraction. It offers both a command-line interface and a Python API, which makes it accessible to developers and researchers alike, and Docker support is provided for easy deployment.
With over 28,000 GitHub stars and an MIT license, Spleeter is one of the most popular tools in the audio AI ecosystem.
Key Features
- Pretrained Separation Models: Ships with ready-to-use models for 2-stem, 4-stem, and 5-stem audio separation without any training required.
- Multiple Stem Outputs: Isolate vocals, drums, bass, piano, and other instruments as separate stems from a mixed audio file; the piano stem is available only with the 5-stem model.
- Fast Processing: Processes audio faster than real-time on GPU, and remains practical on CPU for most use cases.
- CLI and Python API: Offers both a command-line interface for quick use and a Python API for integration into custom pipelines.
- Docker Support: Provides Docker images for easy, reproducible deployment across different environments.
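As a rough illustration of the Python API mentioned in the list above, the sketch below picks a pretrained model by stem count and writes the separated stems to disk. `Separator` and `separate_to_file` follow the usage shown in the project's README; the helper names `model_name` and `separate_file`, the `STEMS` table, and any file paths are illustrative assumptions, and the exact API may differ between Spleeter versions.

```python
from pathlib import Path

# Stem names per pretrained model, as listed in this page's description.
STEMS = {
    2: ("vocals", "accompaniment"),
    4: ("vocals", "drums", "bass", "other"),
    5: ("vocals", "drums", "bass", "piano", "other"),
}


def model_name(stem_count: int) -> str:
    """Return the pretrained model identifier, e.g. 'spleeter:4stems'."""
    if stem_count not in STEMS:
        raise ValueError(f"unsupported stem count: {stem_count}")
    return f"spleeter:{stem_count}stems"


def separate_file(audio_path: str, output_dir: str, stem_count: int = 2) -> None:
    """Separate one file into stems (hypothetical helper; needs spleeter installed)."""
    # Imported lazily so the helpers above still work without TensorFlow.
    from spleeter.separator import Separator

    separator = Separator(model_name(stem_count))
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    # Writes one audio file per stem under output_dir.
    separator.separate_to_file(audio_path, output_dir)
```

A call such as `separate_file("song.mp3", "output", stem_count=4)` (placeholder paths) would be expected to produce one file per stem named in `STEMS[4]`.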
Use Cases
- Creating karaoke versions of songs by extracting the instrumental accompaniment from a mixed track.
- Remixing tracks by isolating individual stems such as drums or bass lines.
- Studying individual audio components in music information retrieval research.
- Removing background music or isolating voice recordings when editing podcasts and other audio.
- Building automated audio processing pipelines that require stem separation as a preprocessing step.
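For the karaoke and pipeline use cases above, one possible approach is to drive the command-line interface from a script. The sketch below builds the command for each file in a folder and computes where the instrumental track is expected to land. The `spleeter separate -p ... -o ...` layout follows the project's README for recent releases (older releases took the input file via `-i`), and the `<output>/<track name>/accompaniment.wav` layout is assumed from Spleeter's default settings. All function and folder names are hypothetical.

```python
import subprocess
from pathlib import Path

# Extensions treated as audio input in this sketch (an assumption, not a Spleeter limit).
AUDIO_SUFFIXES = {".mp3", ".wav", ".ogg", ".flac"}


def build_command(audio_path, output_dir, model="spleeter:2stems"):
    """Build the CLI invocation as an argument list suitable for subprocess.run."""
    return ["spleeter", "separate", "-p", model, "-o", str(output_dir), str(audio_path)]


def karaoke_path(audio_path, output_dir):
    """Expected location of the instrumental stem for a 2-stem separation."""
    return Path(output_dir) / Path(audio_path).stem / "accompaniment.wav"


def separate_folder(input_dir, output_dir, model="spleeter:2stems"):
    """Run the CLI on every audio file in input_dir; return expected karaoke paths."""
    results = []
    for audio_path in sorted(Path(input_dir).iterdir()):
        if audio_path.suffix.lower() not in AUDIO_SUFFIXES:
            continue
        # check=True raises CalledProcessError if the spleeter command fails.
        subprocess.run(build_command(audio_path, output_dir, model), check=True)
        results.append(karaoke_path(audio_path, output_dir))
    return results
```

Keeping command construction separate from execution makes the pipeline easy to dry-run or log before any audio is processed.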
Pros
- Completely Free and Open Source: Licensed under MIT, Spleeter is free to use, modify, and distribute for personal and commercial projects.
- High-Quality Results: Trained on professional-grade music data, the pretrained models deliver competitive separation quality out of the box.
- Easy to Use: A simple CLI and Python API make it accessible for researchers and developers without deep ML expertise.
- Large Community: The project has 28k+ GitHub stars and an active fork community, along with extensive documentation and third-party tooling.
Cons
- Requires Technical Setup: Installation involves Python, TensorFlow, and optional GPU configuration, which may be complex for non-developers.
- TensorFlow Dependency: Relies on TensorFlow, which can be heavyweight and may introduce version compatibility issues with newer Python environments.
- Limited Real-Time Use: Designed for offline batch processing; not suited for live or real-time audio separation scenarios.
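Because setup is the main hurdle listed above, a small preflight check can help before a first run. The following is a minimal sketch that assumes Spleeter and TensorFlow are installed as the Python packages `spleeter` and `tensorflow` and that FFmpeg is reachable on the `PATH`; the function name `check_environment` and the report keys are hypothetical.

```python
import importlib.util
import shutil


def check_environment():
    """Report which pieces of a typical Spleeter setup appear to be present."""
    report = {
        "spleeter": importlib.util.find_spec("spleeter") is not None,
        "tensorflow": importlib.util.find_spec("tensorflow") is not None,
        "ffmpeg": shutil.which("ffmpeg") is not None,
        "gpu": False,
    }
    if report["tensorflow"]:
        try:
            import tensorflow as tf

            # An empty list means TensorFlow will fall back to the CPU.
            report["gpu"] = bool(tf.config.list_physical_devices("GPU"))
        except Exception:
            # A broken TensorFlow install is reported as "no GPU" in this sketch.
            report["gpu"] = False
    return report
```

A missing GPU is not an error here, since CPU processing is supported; the flag only indicates whether the faster path is likely to be used.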
Frequently Asked Questions
What audio formats does Spleeter support?
Spleeter supports common audio formats including MP3, WAV, OGG, and FLAC via the FFmpeg backend.
Does Spleeter require a GPU?
No, Spleeter works on CPU as well, though a GPU significantly speeds up processing, especially for longer audio files.
Which pretrained models are available?
Spleeter offers pretrained models for 2-stem (vocals + accompaniment), 4-stem (vocals, drums, bass, other), and 5-stem (vocals, drums, bass, piano, other) separation.
Can I train my own models?
Yes, Spleeter supports custom model training if you have a dataset of isolated audio sources, allowing you to build domain-specific separation models.
Can Spleeter be used in commercial applications?
Yes, Spleeter is released under the MIT license, which permits free use in commercial applications.
