Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16, 2025 • 166