Originally from: iky1e/demucs-mlx-fp16
Float32 variant: mlx-community/demucs-mlx
Demucs MLX (float16)
Float16 MLX-compatible weights for all 8 pretrained Demucs models, converted to safetensors format for inference on Apple Silicon.
This is the float16 variant of iky1e/demucs-mlx: same models, half the file size, identical output quality. Recommended for Apple Silicon where memory is constrained (iOS, smaller Macs).
Demucs is a music source separation model that splits audio into stems: drums, bass, other, vocals (and guitar, piano for 6-source models).
Models
| Model | What it is | Architecture | Sub-models | Sources | Weights (fp16) | Weights (fp32) |
|---|---|---|---|---|---|---|
| htdemucs | Default v4 model, best speed/quality balance | HTDemucs (v4) | 1 | 4 | 80 MB | 160 MB |
| htdemucs_ft | Fine-tuned v4, best overall quality | HTDemucs (v4) | 4 (fine-tuned) | 4 | 321 MB | 641 MB |
| htdemucs_6s | 6-source v4 (adds guitar + piano stems) | HTDemucs (v4) | 1 | 6 | 52 MB | 105 MB |
| hdemucs_mmi | v3 hybrid, trained on more data | HDemucs (v3) | 1 | 4 | 160 MB | 319 MB |
| mdx | v3 bag-of-models ensemble | Demucs + HDemucs | 4 (bag) | 4 | 659 MB | 1.3 GB |
| mdx_extra | v3 ensemble trained on extra data | HDemucs | 4 (bag) | 4 | 638 MB | 1.2 GB |
| mdx_q | Quantized v3 ensemble (same quality, smaller) | Demucs + HDemucs | 4 (bag) | 4 | 659 MB | 1.3 GB |
| mdx_extra_q | Quantized v3 extra ensemble | HDemucs | 4 (bag) | 4 | 638 MB | 1.2 GB |
All models output stereo audio at 44.1 kHz.
Float16 vs Float32
Output quality is identical for practical purposes: the maximum sample difference is 3.1e-5 (about one int16 LSB), with correlation > 0.999999999. MLX on Apple Silicon upcasts float16 weights to float32 for computation, so the math is the same.
| Metric | float32 (iky1e/demucs-mlx) | float16 (this repo) |
|---|---|---|
| htdemucs file size | 160 MB | 80 MB |
| htdemucs RSS (peak memory) | 1311 MB | 1210 MB |
| htdemucs speed (M1 Pro) | 7.1s | 7.9s |
| Output quality | reference | identical |
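The two quality figures quoted above can be checked with a few lines of Python. The sketch below is not the script used to produce the table. It assumes two separations of the same track are already available as flat lists of float samples in [-1, 1]; the names `ref` and `test` are hypothetical, and the data here is a synthetic sine wave rather than real stems. It also shows where the "one int16 LSB" figure comes from: 1/32768 is roughly 3.05e-5.

```python
import math

INT16_LSB = 1.0 / 32768.0  # one quantization step of 16-bit PCM on a [-1, 1) scale

def max_abs_diff(ref, test):
    """Largest absolute per-sample difference between two signals."""
    return max(abs(a - b) for a, b in zip(ref, test))

def pearson(ref, test):
    """Pearson correlation coefficient between two equal-length signals."""
    n = len(ref)
    mean_r = sum(ref) / n
    mean_t = sum(test) / n
    cov = sum((a - mean_r) * (b - mean_t) for a, b in zip(ref, test))
    var_r = sum((a - mean_r) ** 2 for a in ref)
    var_t = sum((b - mean_t) ** 2 for b in test)
    return cov / math.sqrt(var_r * var_t)

# Synthetic stand-in for real stems: a 440 Hz sine at 44.1 kHz and a copy
# perturbed by at most half an int16 LSB (hypothetical data, for illustration).
ref = [math.sin(2 * math.pi * 440 * i / 44100) for i in range(4410)]
test = [s + 0.5 * INT16_LSB * math.cos(i) for i, s in enumerate(ref)]

print(f"int16 LSB:   {INT16_LSB:.3e}")
print(f"max diff:    {max_abs_diff(ref, test):.3e}")
print(f"correlation: {pearson(ref, test):.12f}")
```

With real data, `ref` would hold the float32 model's output and `test` the float16 model's output for the same input file.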
Origin
- Original model/repo: adefossez/demucs
- Float32 weights: iky1e/demucs-mlx
- License: MIT (same as original Demucs)
- Conversion path: PyTorch checkpoints → safetensors float32 → float16
- Swift MLX port: kylehowells/demucs-mlx-swift
Files
Each model consists of two files at the repo root:
- {model_name}.safetensors: model weights (float16)
- {model_name}_config.json: model class, architecture config, and bag-of-models metadata
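To confirm that a downloaded weights file really is float16 without loading it into MLX, the safetensors header can be read directly. A safetensors file starts with an 8-byte little-endian length, followed by a JSON header that lists each tensor's dtype, shape, and byte offsets. The sketch below uses only the standard library. The helper names `read_safetensors_header` and `dtype_counts` are illustrative, and nothing here reflects how demucs-mlx-swift loads the files.

```python
import json
import struct
from collections import Counter

def read_safetensors_header(path):
    """Return the JSON header of a safetensors file as a dict."""
    with open(path, "rb") as f:
        # First 8 bytes: header length as an unsigned 64-bit little-endian integer
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))

def dtype_counts(header):
    """Count tensors per dtype, skipping the optional __metadata__ entry."""
    return Counter(
        entry["dtype"]
        for name, entry in header.items()
        if name != "__metadata__"
    )

# Example call (the path is a placeholder for a local copy of a model file):
# print(dtype_counts(read_safetensors_header("htdemucs.safetensors")))
```

For a float16 file, every tensor should be reported under the `F16` dtype.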
Usage
Swift (demucs-mlx-swift)
Point the model directory or repo to this float16 variant:
# Use float16 models from local directory
demucs-mlx-swift -n htdemucs --model-dir /path/to/demucs-mlx-fp16 song.wav
# Or set the HF repo environment variable
export DEMUCS_MLX_SWIFT_MODEL_REPO=iky1e/demucs-mlx-fp16
demucs-mlx-swift -n htdemucs song.wav
Or use the Swift API directly:
import DemucsMLX
let separator = try DemucsSeparator(modelName: "htdemucs")
let result = try separator.separate(fileAt: URL(fileURLWithPath: "song.wav"))
Converting from PyTorch
To reproduce the export directly from PyTorch Demucs checkpoints:
pip install demucs safetensors numpy
# Export all 8 models as float16 (default)
python export_from_pytorch.py --out-dir ./output
# Export as float32
python export_from_pytorch.py --out-dir ./output --dtype float32
The conversion script (export_from_pytorch.py) is available in the demucs-mlx-swift repo under scripts/.
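The float32 to float16 step of the conversion is a plain dtype cast. The sketch below is not `export_from_pytorch.py`; it is a standard-library illustration of what that cast does to individual values and to storage size, using the `struct` module's IEEE 754 half-precision format code `e`. The weight values are made up for the example.

```python
import struct

def to_float32_bytes(values):
    """Pack floats as little-endian IEEE 754 single precision (4 bytes each)."""
    return struct.pack(f"<{len(values)}f", *values)

def to_float16_bytes(values):
    """Pack floats as little-endian IEEE 754 half precision (2 bytes each)."""
    return struct.pack(f"<{len(values)}e", *values)

def from_float16_bytes(blob):
    """Unpack half-precision bytes back into Python floats."""
    return list(struct.unpack(f"<{len(blob) // 2}e", blob))

# Made-up values, for illustration only (not taken from any Demucs checkpoint)
weights = [0.0123, -0.4567, 0.25, 1.5e-3, -0.9999]

fp32 = to_float32_bytes(weights)
fp16 = to_float16_bytes(weights)
restored = from_float16_bytes(fp16)

print(f"fp32 bytes: {len(fp32)}, fp16 bytes: {len(fp16)}")
for w, r in zip(weights, restored):
    rel_err = abs(w - r) / abs(w)
    print(f"{w:+.6f} -> {r:+.6f} (relative error {rel_err:.2e})")
```

The byte count halves, which matches the fp16 and fp32 columns in the model table, and each value in the normal half-precision range is rounded with a relative error of at most 2^-11.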
Citation
@inproceedings{rouard2022hybrid,
title={Hybrid Transformers for Music Source Separation},
author={Rouard, Simon and Massa, Francisco and Defossez, Alexandre},
booktitle={ICASSP 23},
year={2023}
}
@inproceedings{defossez2021hybrid,
title={Hybrid Spectrogram and Waveform Source Separation},
author={Defossez, Alexandre},
booktitle={Proceedings of the ISMIR 2021 Workshop on Music Source Separation},
year={2021}
}