Add Metal (Apple Silicon) build variants

#1

Add pre-built Metal kernel variants for Apple Silicon Macs.

Build variants:

  • torch210-metal-aarch64-darwin
  • torch29-metal-aarch64-darwin
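
For reference, the variant names above appear to follow a `torch<version>-<backend>-<arch>-<os>` pattern. The sketch below splits a name into those parts. It assumes the Torch major version is a single digit, so `torch210` reads as 2.10 and `torch29` as 2.9; `parse_variant` and `Variant` are hypothetical names used only for illustration, not part of any library.

```python
import re
from typing import NamedTuple


class Variant(NamedTuple):
    torch_version: str
    backend: str
    arch: str
    os: str


# Assumption: single-digit major version, remaining digits are the minor version.
_VARIANT_RE = re.compile(r"^torch(\d)(\d+)-([a-z0-9]+)-([a-z0-9_]+)-([a-z]+)$")


def parse_variant(name: str) -> Variant:
    """Split a build variant name into torch version, backend, arch and OS."""
    match = _VARIANT_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized build variant name: {name!r}")
    major, minor, backend, arch, os_name = match.groups()
    return Variant(f"{major}.{minor}", backend, arch, os_name)


for name in ("torch210-metal-aarch64-darwin", "torch29-metal-aarch64-darwin"):
    print(parse_variant(name))
```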

These enable GPU-accelerated fused RMS normalization on the MPS (Metal Performance Shaders) backend. Tested on M1/M2/M3/M4 with macOS 14+.
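
As a reminder of what the kernel computes, here is a minimal pure-Python sketch of RMS normalization over a single vector: each element is divided by the root mean square of the vector and scaled by a per-element weight. This is the standard formula, not the Metal implementation; the function name `rms_norm` and the default `eps` value are assumptions made for illustration and may differ from the kernel's actual interface.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Reference RMS norm: y_i = x_i / sqrt(mean(x^2) + eps) * weight_i."""
    if len(x) != len(weight):
        raise ValueError("x and weight must have the same length")
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


# With unit weights, the output has a mean square close to 1.
out = rms_norm([3.0, 4.0], [1.0, 1.0])
print(out)
```

A fused kernel would typically do the reduction and the scaling in a single GPU pass rather than as separate operations; the arithmetic is the same.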

74/74 tests passing across all dtypes and configurations.

Source: https://github.com/robtaylor/fused-rms-norm

Hey @robtaylor-chipflow, could you please open a PR at https://github.com/huggingface/kernels-community instead?
