BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation
Paper: [BitDistiller (arXiv:2402.10631)](https://arxiv.org/abs/2402.10631)
| PPL | ARC-Easy | ARC-Challenge | PIQA | WinoGrande | HellaSwag | MMLU | QA Avg |
|---|---|---|---|---|---|---|---|
| 11.99 | 41.96 ± 1.01 | 24.32 ± 1.25 | 66.21 ± 1.10 | 50.28 ± 1.41 | 39.00 ± 0.49 | - | 44.35 |
- **Training method:** based on the BitDistiller paper (self-distillation for sub-4-bit quantization)
- **Base model:** TinyLlama/TinyLlama_v1.1
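BitDistiller's core idea is self-distillation: the full-precision model serves as the teacher for its own sub-4-bit quantized student. As a minimal illustrative sketch (not the paper's exact objective, which uses a confidence-aware KL divergence), a plain temperature-softened KL distillation loss over one token's logits could look like:

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) between the softened output distributions.

    teacher_logits: logits from the full-precision model (the teacher).
    student_logits: logits from the quantized model (the student).
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Identical predictions give zero loss; divergent ones give a positive loss.
print(kd_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # → 0.0
print(kd_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]) > 0)  # → True
```

In training, this loss would be summed over positions and backpropagated through the student only; the function names and temperature value here are illustrative assumptions, not from the model card.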