Originally from: iky1e/demucs-mlx-fp16

Float32 variant: mlx-community/demucs-mlx

Demucs - MLX (float16)

Float16 MLX-compatible weights for all 8 pretrained Demucs models, converted to safetensors format for inference on Apple Silicon.

This is the float16 variant of iky1e/demucs-mlx: same models, half the file size, identical output quality. Recommended for Apple Silicon devices where memory is constrained (iOS, smaller Macs).

Demucs is a music source separation model that splits audio into stems: drums, bass, other, vocals (and guitar, piano for 6-source models).

Models

Model | What it is | Architecture | Sub-models | Sources | Weights (fp16) | Weights (fp32)
htdemucs | Default v4 model, best speed/quality balance | HTDemucs (v4) | 1 | 4 | 80 MB | 160 MB
htdemucs_ft | Fine-tuned v4, best overall quality | HTDemucs (v4) | 4 (fine-tuned) | 4 | 321 MB | 641 MB
htdemucs_6s | 6-source v4 (adds guitar + piano stems) | HTDemucs (v4) | 1 | 6 | 52 MB | 105 MB
hdemucs_mmi | v3 hybrid, trained on more data | HDemucs (v3) | 1 | 4 | 160 MB | 319 MB
mdx | v3 bag-of-models ensemble | Demucs + HDemucs | 4 (bag) | 4 | 659 MB | 1.3 GB
mdx_extra | v3 ensemble trained on extra data | HDemucs | 4 (bag) | 4 | 638 MB | 1.2 GB
mdx_q | Quantized v3 ensemble (same quality, smaller) | Demucs + HDemucs | 4 (bag) | 4 | 659 MB | 1.3 GB
mdx_extra_q | Quantized v3 extra ensemble | HDemucs | 4 (bag) | 4 | 638 MB | 1.2 GB

All models output stereo audio at 44.1 kHz.
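
As a rough illustration of what stereo 44.1 kHz output means for memory, the sketch below estimates the shape and size of the separated stems for a track of a given length. It assumes the stems are held as 32-bit float samples in a (sources, channels, samples) layout; that layout and the helper names are illustrative assumptions, not something this repo specifies.

```python
SAMPLE_RATE = 44_100  # Hz, as stated for all models
CHANNELS = 2          # stereo


def stem_buffer_shape(num_sources, duration_seconds):
    """Shape of the separated stems as (sources, channels, samples)."""
    samples = round(duration_seconds * SAMPLE_RATE)
    return (num_sources, CHANNELS, samples)


def stem_buffer_bytes(num_sources, duration_seconds, bytes_per_sample=4):
    """Bytes needed to hold all stems (assumes 4-byte float32 samples)."""
    sources, channels, samples = stem_buffer_shape(num_sources, duration_seconds)
    return sources * channels * samples * bytes_per_sample


# A 3-minute track separated by a 4-source model such as htdemucs.
shape = stem_buffer_shape(4, 180)
size_bytes = stem_buffer_bytes(4, 180)
print(shape)                             # (4, 2, 7938000)
print(f"{size_bytes / 1e6:.1f} MB")      # 254.0 MB
```

The same track through a 6-source model such as htdemucs_6s would need 1.5 times as much space for its stems under these assumptions.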

Float16 vs Float32

Output quality is identical for practical purposes: the maximum sample difference is 3.1e-5 (about one int16 LSB) and the correlation is > 0.999999999. MLX on Apple Silicon upcasts float16 weights to float32 for computation, so the math is the same.

Metric | float32 (iky1e/demucs-mlx) | float16 (this repo)
htdemucs file size | 160 MB | 80 MB
htdemucs RSS (peak memory) | 1311 MB | 1210 MB
htdemucs speed (M1 Pro) | 7.1 s | 7.9 s
Output quality | reference | identical
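
The two metrics quoted above (maximum sample difference and correlation) can be computed along the following lines. This is a minimal sketch, not the procedure used to produce the numbers in the table: toy_filter is a hypothetical stand-in for a model, the input is random noise, and the arithmetic runs in Python's double precision rather than MLX float32. It only illustrates how rounding stored weights to float16 leads to a small maximum difference and a correlation close to 1.

```python
import math
import random
import struct

INT16_LSB = 1.0 / 32768.0  # one step of 16-bit PCM on a [-1, 1) scale, about 3.05e-5


def round_to_float16(x):
    """Round a value to the nearest IEEE 754 half-precision float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def max_abs_diff(a, b):
    """Largest absolute per-sample difference between two sequences."""
    return max(abs(x - y) for x, y in zip(a, b))


def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def toy_filter(weights, signal):
    """Hypothetical stand-in for a model: a small FIR filter (not Demucs)."""
    n = len(weights)
    return [
        sum(w * signal[i + k] for k, w in enumerate(weights))
        for i in range(len(signal) - n + 1)
    ]


rng = random.Random(0)
weights_full = [rng.uniform(-0.1, 0.1) for _ in range(16)]
weights_half = [round_to_float16(w) for w in weights_full]  # stored as float16
signal = [rng.uniform(-1.0, 1.0) for _ in range(2000)]

reference = toy_filter(weights_full, signal)
candidate = toy_filter(weights_half, signal)

print("max abs diff:", max_abs_diff(reference, candidate))
print("correlation: ", correlation(reference, candidate))
print("int16 LSB:   ", INT16_LSB)
```

Comparing the maximum difference against INT16_LSB is a convenient yardstick because a difference below one 16-bit step would largely disappear when the output is written as 16-bit PCM.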

Origin

Files

Each model consists of two files at the repo root:

  • {model_name}.safetensors: model weights (float16)
  • {model_name}_config.json: model class, architecture config, and bag-of-models metadata
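
To check that a weights file really stores float16 tensors, the safetensors header can be read without loading any tensor data: the file begins with an 8-byte little-endian header length, followed by a JSON header describing each tensor. The sketch below uses only the Python standard library. The demo file and the tensor name demo.weight are made up so the example runs on its own; the helper names are illustrative, not part of any tool mentioned here.

```python
import json
import os
import struct
import tempfile


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file as a dict."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))


def summarize_dtypes(header):
    """Count tensors per dtype, skipping the optional __metadata__ entry."""
    counts = {}
    for name, info in header.items():
        if name == "__metadata__":
            continue
        counts[info["dtype"]] = counts.get(info["dtype"], 0) + 1
    return counts


def write_demo_file(path):
    """Write a tiny safetensors file holding one float16 tensor (demo only)."""
    data = struct.pack("<2e", 0.5, -1.0)
    header = {
        "demo.weight": {"dtype": "F16", "shape": [2], "data_offsets": [0, len(data)]}
    }
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))
        f.write(header_bytes)
        f.write(data)


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.safetensors")
    write_demo_file(demo_path)
    demo_header = read_safetensors_header(demo_path)
    print(summarize_dtypes(demo_header))  # {'F16': 1}
```

For a real model, pass the path of a downloaded file such as htdemucs.safetensors to read_safetensors_header; float16 tensors are reported with the dtype string F16.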

Usage

Swift (demucs-mlx-swift)

Point the model directory or repo to this float16 variant:

# Use float16 models from local directory
demucs-mlx-swift -n htdemucs --model-dir /path/to/demucs-mlx-fp16 song.wav

# Or set the HF repo environment variable
export DEMUCS_MLX_SWIFT_MODEL_REPO=iky1e/demucs-mlx-fp16
demucs-mlx-swift -n htdemucs song.wav

Or use the Swift API directly:

import DemucsMLX

let separator = try DemucsSeparator(modelName: "htdemucs")
let result = try separator.separate(fileAt: URL(fileURLWithPath: "song.wav"))

Converting from PyTorch

To reproduce the export directly from PyTorch Demucs checkpoints:

pip install demucs safetensors numpy

# Export all 8 models as float16 (default)
python export_from_pytorch.py --out-dir ./output

# Export as float32
python export_from_pytorch.py --out-dir ./output --dtype float32

The conversion script (export_from_pytorch.py) is available in the demucs-mlx-swift repo under scripts/.

Citation

@inproceedings{rouard2022hybrid,
  title={Hybrid Transformers for Music Source Separation},
  author={Rouard, Simon and Massa, Francisco and Defossez, Alexandre},
  booktitle={ICASSP 23},
  year={2023}
}

@inproceedings{defossez2021hybrid,
  title={Hybrid Spectrogram and Waveform Source Separation},
  author={Defossez, Alexandre},
  booktitle={Proceedings of the ISMIR 2021 Workshop on Music Source Separation},
  year={2021}
}