RandomXiong committed on
Commit 5cceab6 · verified · 1 Parent(s): d41150b

Update README.md

Files changed (1)
  1. README.md +4 -0
README.md CHANGED
@@ -18,6 +18,10 @@ license: apache-2.0
  Quantized version of [deepseek-ai/DeepSeek-V3.1](https://huggingface.co/deepseek-ai/DeepSeek-V3.1)
 
 
+ | Model| MMLU |
+ |-------|-------|
+ | novita/Deepseek-V3.1-W4AFP8 | 0.8680 |
+
  ### Model Optimizations
  These models were obtained by quantizing the weights and activations of DeepSeek models to mixed-precision data types (W4(int)A(FP)8 for MoE layers and FP8 for dense layers).
  This optimization reduces the number of bits per parameter 4/8, significantly reducing GPU memory requirements.
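
The memory claim in the README text above can be illustrated with a little arithmetic. The following is a minimal sketch, not the authors' tooling: the function names, the parameter counts, and the 16-bit baseline are all hypothetical values chosen for illustration, and the estimate covers weights only (it ignores quantization scales, activations, and the KV cache).

```python
def weight_bytes(num_params: int, bits_per_param: int) -> float:
    """Approximate storage for weights only, in bytes."""
    return num_params * bits_per_param / 8


def mixed_precision_bytes(moe_params: int, dense_params: int) -> float:
    """INT4 weights for MoE layers plus FP8 weights for dense layers,
    following the scheme described in the README."""
    return weight_bytes(moe_params, 4) + weight_bytes(dense_params, 8)


# Hypothetical parameter counts, NOT the real DeepSeek-V3.1 layer split.
moe_params = 900_000_000
dense_params = 100_000_000

# Assumed 16-bit baseline for comparison (an assumption, not from the README).
baseline = weight_bytes(moe_params + dense_params, 16)
mixed = mixed_precision_bytes(moe_params, dense_params)

print(f"16-bit baseline: {baseline / 1e9:.2f} GB")
print(f"W4 (MoE) + FP8 (dense): {mixed / 1e9:.2f} GB")
print(f"reduction: {baseline / mixed:.2f}x")
```

Because the MoE layers are stored at 4 bits per weight, the larger the share of parameters in those layers, the closer the saving gets to 4x relative to a 16-bit baseline.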