add some perplexity data

Files changed:
- README.md (+91 -27)
- images/perplexity.png (+3 -0)
- logs/imatrix-Step-3.5-Flash-BF16.log (+780 -0)
- logs/perplexity-Step-3.5-Flash-BF16.log (+202 -0)
README.md
CHANGED
Removed lines (old side of the diff; content not captured by this view is marked […]):

@@ -11,11 +11,6 @@ tags:
 - step3p5
 ---

-## WIP
-
-Only one test quant for now, a custom `IQ4_XS` which runs on both mainline llama.cpp and [ik_llama.cpp now that this was just merged to main](https://github.com/ikawrakow/ik_llama.cpp/pull/1240).
-
-I'm cooking imatrix now and planning to release some more ik_llama.cpp quants on Saturday!
-
 ## `ik_llama.cpp` imatrix Quantizations of stepfun-ai/Step-3.5-Flash
 *NOTE*: `ik_llama.cpp` can also run your existing GGUFs from bartowski, unsloth, mradermacher, etc. if you want to try it out before downloading my quants.

@@ -35,31 +30,72 @@ Perplexity computed against *wiki.test.raw*. (lower is "better")

 ![perplexity](images/perplexity.png)

-These two are just a test quants for baseline perplexity comparison:
 * `BF16` 366.952 GiB (16.004 BPW)
 * `Q8_0` 195.031 GiB (8.506 BPW)

 *NOTE*: The first split file is much smaller on purpose to only contain metadata; it's fine!

-## IQ5_K

 <details>

 <summary>👈 Secret Recipe</summary>

 ```bash
-[…]
 ```

 </details>

 ## IQ4_XS 100.53 GiB (4.38 BPW)

-*NOTE*: This […]

 <details>

@@ -111,33 +147,61 @@ numactl -N ${SOCKET} -m ${SOCKET} \

 </details>

-## […]

 <details>

 <summary>👈 Secret Recipe</summary>

 ```bash
-[…]
 ```

 </details>

-## IQ2_KS TODO
 TODO

 <details>

@@ -185,9 +249,9 @@ numactl -N "$SOCKET" -m "$SOCKET" \
     --jinja
 ```

-For tool use you can always bring your own template with `--[…]

-Another option is to check out [pwilkin's autoparser branch](https://github.com/ggml-org/llama.cpp/pull/18675)

 ## References
 * [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp)
README.md after this commit (elided context marked […]):

[…]
- step3p5
---

## `ik_llama.cpp` imatrix Quantizations of stepfun-ai/Step-3.5-Flash
*NOTE*: `ik_llama.cpp` can also run your existing GGUFs from bartowski, unsloth, mradermacher, etc. if you want to try it out before downloading my quants.

[…]
Perplexity computed against *wiki.test.raw*. (lower is "better")

![perplexity](images/perplexity.png)

These two are just test quants for baseline perplexity comparison and are not available for download here:
* `BF16` 366.952 GiB (16.004 BPW)
  - PPL over 561 chunks for n_ctx=512 = 2.4169 +/- 0.01107
* `Q8_0` 195.031 GiB (8.506 BPW)
  - PPL over 561 chunks for n_ctx=512 = 2.4188 +/- 0.01109

*NOTE*: The first split file is much smaller on purpose to only contain metadata; it's fine!
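As a quick sanity check on the size figures above: bits-per-weight is just total model size in bits divided by parameter count (196.956 B parameters per the imatrix log below). A minimal sketch, not part of any tooling:

```python
# Reproduce the reported bits-per-weight (BPW) figures:
# BPW = (size in GiB * 2^30 bytes * 8 bits) / parameter count
def bpw(size_gib: float, params_billions: float) -> float:
    return size_gib * (1 << 30) * 8 / (params_billions * 1e9)

print(f"BF16: {bpw(366.952, 196.956):.3f} BPW")
print(f"Q8_0: {bpw(195.031, 196.956):.3f} BPW")
```

Both come out to the 16.004 and 8.506 BPW reported above.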
## IQ5_K 136.891 GiB (5.970 BPW)
PPL over 561 chunks for n_ctx=512 = 2.4304 +/- 0.01117

<details>

<summary>👈 Secret Recipe</summary>

```bash
#!/usr/bin/env bash

custom="
# 45 Repeating Layers [0-44]

# Attention [0-44] GPU
blk\..*\.attn_gate.*=q8_0
blk\..*\.attn_q.*=q8_0
blk\..*\.attn_k.*=q8_0
blk\..*\.attn_v.*=q8_0
blk\..*\.attn_output.*=q8_0

# First 3 Dense Layers [0-2] GPU
blk\..*\.ffn_down\.weight=q8_0
blk\..*\.ffn_(gate|up)\.weight=q8_0

# Shared Expert Layers [3-44] GPU
blk\..*\.ffn_down_shexp\.weight=q8_0
blk\..*\.ffn_(gate|up)_shexp\.weight=q8_0

# Routed Experts Layers [3-44] CPU
blk\..*\.ffn_down_exps\.weight=iq6_k
blk\..*\.ffn_(gate|up)_exps\.weight=iq5_k

# Non-Repeating Layers
token_embd\.weight=q8_0
output\.weight=q8_0
"

custom=$(
  echo "$custom" | grep -v '^#' | \
  sed -Ez 's:\n+:,:g;s:,$::;s:^,::'
)

numactl -N ${SOCKET} -m ${SOCKET} \
./build/bin/llama-quantize \
    --custom-q "$custom" \
    --imatrix /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat \
    /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/Step-3.5-Flash-288x7.4B-BF16-00001-of-00009.gguf \
    /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/Step-3.5-Flash-IQ5_K.gguf \
    IQ5_K \
    128
```

</details>
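The `grep | sed` pipeline in the recipe above just strips the comment lines and joins the remaining `tensor-regex=quant` pairs with commas before handing them to `--custom-q`. A minimal Python equivalent (illustrative only; the short `custom` list here is a made-up subset of the real recipe):

```python
custom = r"""
# Attention [0-44] GPU
blk\..*\.attn_q.*=q8_0
blk\..*\.attn_k.*=q8_0

# Routed Experts Layers [3-44] CPU
blk\..*\.ffn_down_exps\.weight=iq6_k
"""

# Drop comment lines and blanks, then comma-join, mirroring:
#   grep -v '^#' | sed -Ez 's:\n+:,:g;s:,$::;s:^,::'
pairs = [line for line in custom.splitlines() if line and not line.startswith("#")]
flat = ",".join(pairs)
print(flat)
```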
## IQ4_XS 100.53 GiB (4.38 BPW)
PPL over 561 chunks for n_ctx=512 = 2.5181 +/- 0.01178

*NOTE*: This mainline-compatible quant does not use imatrix.

<details>

[…]

</details>
## smol-IQ3_KS 75.934 GiB (3.312 BPW)
PPL over 561 chunks for n_ctx=512 = 2.7856 +/- 0.01365

<details>

<summary>👈 Secret Recipe</summary>

```bash
#!/usr/bin/env bash

custom="
# 45 Repeating Layers [0-44]

# Attention [0-44] GPU
blk\..*\.attn_gate.*=iq6_k
blk\..*\.attn_q.*=iq6_k
blk\..*\.attn_k.*=iq6_k
blk\..*\.attn_v.*=iq6_k
blk\..*\.attn_output.*=iq6_k

# First 3 Dense Layers [0-2] GPU
blk\..*\.ffn_down\.weight=iq6_k
blk\..*\.ffn_(gate|up)\.weight=iq6_k

# Shared Expert Layers [3-44] GPU
blk\..*\.ffn_down_shexp\.weight=iq6_k
blk\..*\.ffn_(gate|up)_shexp\.weight=iq6_k

# Routed Experts Layers [3-44] CPU
blk\..*\.ffn_down_exps\.weight=iq3_ks
blk\..*\.ffn_(gate|up)_exps\.weight=iq3_ks

# Non-Repeating Layers
token_embd\.weight=iq4_k
output\.weight=iq6_k
"

custom=$(
  echo "$custom" | grep -v '^#' | \
  sed -Ez 's:\n+:,:g;s:,$::;s:^,::'
)

numactl -N ${SOCKET} -m ${SOCKET} \
./build/bin/llama-quantize \
    --custom-q "$custom" \
    --imatrix /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat \
    /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/Step-3.5-Flash-288x7.4B-BF16-00001-of-00009.gguf \
    /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/Step-3.5-Flash-smol-IQ3_KS.gguf \
    IQ3_KS \
    128
```

</details>
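Pulling the perplexity numbers above together, the relative degradation against the BF16 baseline can be computed directly (all figures are the PPL values quoted in this README):

```python
# PPL over 561 chunks, n_ctx=512, from the sections above
ppl = {
    "BF16":        2.4169,
    "Q8_0":        2.4188,
    "IQ5_K":       2.4304,
    "IQ4_XS":      2.5181,
    "smol-IQ3_KS": 2.7856,
}
base = ppl["BF16"]
for name, p in ppl.items():
    print(f"{name:12s} {100 * (p / base - 1):+.2f}% vs BF16")
```

IQ5_K stays within about 0.6% of the BF16 baseline; smol-IQ3_KS trades roughly 15% higher perplexity for less than a quarter of the size.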
## smol-IQ2_KS TODO
TODO

<details>

[…]

```bash
[…]
    --jinja
```

For tool use you can always bring your own template with `--chat-template-file myTemplate.jinja`, and you might need `--special` etc. The chat template baked into these GGUFs comes from the [original one](https://huggingface.co/stepfun-ai/Step-3.5-Flash/blob/main/chat_template.jinja).

Another option for mainline tool-calling users is to check out [pwilkin's autoparser branch](https://github.com/ggml-org/llama.cpp/pull/18675).

## References
* [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp)
images/perplexity.png
ADDED (stored via Git LFS)

logs/imatrix-Step-3.5-Flash-BF16.log
ADDED
@@ -0,0 +1,780 @@
numactl -N 0 -m 0 ./build/bin/llama-imatrix --model /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/Step-3.5-Flash-288x7.4B-BF16-00001-of-00009.gguf -f ubergarm-imatrix-calibration-corpus-v02.txt -o /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat --no-fused-moe --no-fused-up-gate --no-fused-mul-multiadd --ctx-size 512 -ub 4096 -b 4096 --threads 96 --threads-batch 128 --no-mmap --numa numactl --verbosity 1 --layer-similarity

CPU: using device CPU - 0 MiB free
llama_model_loader: additional 8 GGUFs metadata loaded.
llama_model_loader: loaded meta data with 50 key-value pairs and 754 tensors from /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/Step-3.5-Flash-288x7.4B-BF16-00001-of-00009.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = step35
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.name str = Step 3.5 Flash
llama_model_loader: - kv 3: general.size_label str = 288x7.4B
llama_model_loader: - kv 4: general.license str = apache-2.0
llama_model_loader: - kv 5: general.base_model.count u32 = 1
llama_model_loader: - kv 6: general.base_model.0.name str = Step 3.5 Flash
llama_model_loader: - kv 7: general.base_model.0.organization str = Stepfun Ai
llama_model_loader: - kv 8: general.base_model.0.repo_url str = https://huggingface.co/stepfun-ai/ste...
llama_model_loader: - kv 9: step35.block_count u32 = 45
llama_model_loader: - kv 10: step35.context_length u32 = 262144
llama_model_loader: - kv 11: step35.embedding_length u32 = 4096
llama_model_loader: - kv 12: step35.feed_forward_length u32 = 11264
llama_model_loader: - kv 13: step35.attention.head_count arr[i32,45] = [64, 96, 96, 96, 64, 96, 96, 96, 64, ...
llama_model_loader: - kv 14: step35.rope.freq_base f32 = 5000000.000000
llama_model_loader: - kv 15: step35.rope.freq_base_swa f32 = 10000.000000
llama_model_loader: - kv 16: step35.expert_gating_func u32 = 2
llama_model_loader: - kv 17: step35.attention.key_length u32 = 128
llama_model_loader: - kv 18: step35.attention.value_length u32 = 128
llama_model_loader: - kv 19: general.file_type u32 = 32
llama_model_loader: - kv 20: step35.attention.head_count_kv arr[i32,45] = [8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, ...
llama_model_loader: - kv 21: step35.attention.sliding_window u32 = 512
llama_model_loader: - kv 22: step35.attention.sliding_window_pattern arr[i32,45] = [0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, ...
llama_model_loader: - kv 23: step35.expert_count u32 = 288
llama_model_loader: - kv 24: step35.expert_used_count u32 = 8
llama_model_loader: - kv 25: step35.expert_feed_forward_length u32 = 1280
llama_model_loader: - kv 26: step35.expert_shared_feed_forward_length u32 = 1280
llama_model_loader: - kv 27: step35.expert_weights_scale f32 = 3.000000
llama_model_loader: - kv 28: step35.expert_weights_norm bool = true
llama_model_loader: - kv 29: step35.leading_dense_block_count u32 = 3
llama_model_loader: - kv 30: step35.moe_every_n_layers u32 = 1
llama_model_loader: - kv 31: step35.attention.layer_norm_rms_epsilon f32 = 0.000010
llama_model_loader: - kv 32: step35.swiglu_clamp_exp arr[f32,45] = [0.000000, 0.000000, 0.000000, 0.0000...
llama_model_loader: - kv 33: step35.swiglu_clamp_shexp arr[f32,45] = [0.000000, 0.000000, 0.000000, 0.0000...
llama_model_loader: - kv 34: general.quantization_version u32 = 2
llama_model_loader: - kv 35: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 36: tokenizer.ggml.pre str = deepseek-v3
llama_model_loader: - kv 37: tokenizer.ggml.tokens arr[str,128896] = ["<|begin▁of▁sentence|>", "<�...
llama_model_loader: - kv 38: tokenizer.ggml.token_type arr[i32,128896] = [3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 39: tokenizer.ggml.merges arr[str,127741] = ["Ġ t", "Ġ a", "i n", "Ġ Ġ", "h e...
llama_model_loader: - kv 40: tokenizer.ggml.bos_token_id u32 = 0
llama_model_loader: - kv 41: tokenizer.ggml.eos_token_id u32 = 128007
llama_model_loader: - kv 42: tokenizer.ggml.padding_token_id u32 = 1
llama_model_loader: - kv 43: tokenizer.ggml.add_bos_token bool = true
llama_model_loader: - kv 44: tokenizer.ggml.add_sep_token bool = false
llama_model_loader: - kv 45: tokenizer.ggml.add_eos_token bool = false
llama_model_loader: - kv 46: tokenizer.chat_template str = {% macro render_content(content) %}{%...
llama_model_loader: - kv 47: split.no u16 = 0
llama_model_loader: - kv 48: split.count u16 = 9
llama_model_loader: - kv 49: split.tensors.count i32 = 754
llama_model_loader: - type f32: 266 tensors
llama_model_loader: - type bf16: 488 tensors
load: printing all EOG tokens:
load: - 128007 ('<|im_end|>')
load: special tokens cache size = 818
load: token to piece cache size = 0.8220 MB
llm_load_print_meta: format = GGUF V3 (latest)
llm_load_print_meta: arch = step35
llm_load_print_meta: n_ctx_train = 262144
llm_load_print_meta: n_embd = 4096
llm_load_print_meta: n_layer = 45
llm_load_print_meta: n_head = [64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64]
llm_load_print_meta: n_head_kv = 8
llm_load_print_meta: n_rot = 128
llm_load_print_meta: n_swa = 512
llm_load_print_meta: n_swa_pattern = 1
llm_load_print_meta: n_embd_head_k = 128
llm_load_print_meta: n_embd_head_v = 128
llm_load_print_meta: n_gqa = [8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8]
llm_load_print_meta: n_embd_k_gqa = 1024
llm_load_print_meta: n_embd_v_gqa = 1024
llm_load_print_meta: f_norm_eps = 0.0e+00
llm_load_print_meta: f_norm_rms_eps = 1.0e-05
llm_load_print_meta: f_clamp_kqv = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: f_logit_scale = 0.0e+00
llm_load_print_meta: n_ff = 11264
llm_load_print_meta: n_expert = 288
llm_load_print_meta: n_expert_used = 8
llm_load_print_meta: causal attn = 1
llm_load_print_meta: pooling type = 0
llm_load_print_meta: rope type = 2
llm_load_print_meta: rope scaling = linear
llm_load_print_meta: freq_base_train = 5000000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_ctx_orig_yarn = 262144
llm_load_print_meta: rope_finetuned = unknown
llm_load_print_meta: ssm_d_conv = 0
llm_load_print_meta: ssm_d_inner = 0
llm_load_print_meta: ssm_d_state = 0
llm_load_print_meta: ssm_dt_rank = 0
llm_load_print_meta: model type = ?B
llm_load_print_meta: model ftype = BF16
llm_load_print_meta: model params = 196.956 B
llm_load_print_meta: model size = 366.952 GiB (16.004 BPW)
llm_load_print_meta: repeating layers = 364.986 GiB (16.004 BPW, 195.900 B parameters)
llm_load_print_meta: general.name = Step 3.5 Flash
print_info: vocab type = BPE
print_info: n_vocab = 128896
print_info: n_merges = 127741
print_info: BOS token = 0 '<|begin▁of▁sentence|>'
print_info: EOS token = 128007 '<|im_end|>'
print_info: EOT token = 128007 '<|im_end|>'
print_info: PAD token = 1 '<|end▁of▁sentence|>'
print_info: LF token = 201 'Ċ'
print_info: FIM PRE token = 128801 '<|fim▁begin|>'
print_info: FIM SUF token = 128800 '<|fim▁hole|>'
print_info: FIM MID token = 128802 '<|fim▁end|>'
print_info: EOG token = 128007 '<|im_end|>'
print_info: max token length = 256
llm_load_tensors: ggml ctx size = 0.31 MiB
llm_load_tensors: offloading 0 repeating layers to GPU
llm_load_tensors: offloaded 0/46 layers to GPU
llm_load_tensors: CPU buffer size = 375759.27 MiB
....................................................................................................
llama_new_context_with_model: n_ctx = 512
llama_new_context_with_model: n_batch = 512
llama_new_context_with_model: n_ubatch = 512
llama_new_context_with_model: flash_attn = 1
llama_new_context_with_model: attn_max_b = 0
llama_new_context_with_model: fused_moe = 0
llama_new_context_with_model: grouped er = 0
llama_new_context_with_model: fused_up_gate = 0
llama_new_context_with_model: fused_mmad = 0
llama_new_context_with_model: rope_cache = 0
llama_new_context_with_model: graph_reuse = 1
llama_new_context_with_model: k_cache_hadam = 0
llama_new_context_with_model: split_mode_graph_scheduling = 0
llama_new_context_with_model: reduce_type = f16
llama_new_context_with_model: sched_async = 0
llama_new_context_with_model: ser = -1, 0
llama_new_context_with_model: freq_base = 5000000.0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init: CPU KV buffer size = 90.00 MiB
llama_new_context_with_model: KV self size = 90.00 MiB, K (f16): 45.00 MiB, V (f16): 45.00 MiB
llama_new_context_with_model: CPU output buffer size = 0.49 MiB
llama_new_context_with_model: CPU compute buffer size = 259.75 MiB
llama_new_context_with_model: graph nodes = 2369
llama_new_context_with_model: graph splits = 1
XXXXXXXXXXXXXXXXXXXXX Setting only active experts offload

system_info: n_threads = 96 (n_threads_batch = 128) / 512 | AVX = 1 | AVX_VNNI = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | AVX512_BF16 = 1 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 |
compute_imatrix: tokenizing the input ..
compute_imatrix: tokenization took 599.134 ms
compute_imatrix: computing over 812 chunks with batch_size 512
compute_imatrix: 4.10 seconds per pass - ETA 55.55 minutes
===================================== llama_new_context_with_model: f16
======================================= HAVE_FANCY_SIMD is defined
[1]92.2870,[2]15.6185,[3]9.0021,[4]5.2226,[5]3.8316,[6]3.1180,[7]2.6999,[8]2.4021,[9]2.2278,
save_imatrix: entry ' blk.43.ffn_up_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.42.ffn_down_exps.weight' has partial data (88.89%) 32 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.39.ffn_gate_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.38.ffn_gate_exps.weight' has partial data (98.26%) 5 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.39.ffn_down_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.37.ffn_gate_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.36.ffn_down_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.40.ffn_down_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.35.ffn_down_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.35.ffn_gate_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.34.ffn_gate_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.34.ffn_up_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.33.ffn_down_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.33.ffn_gate_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.39.ffn_up_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.32.ffn_down_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.32.ffn_up_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.34.ffn_down_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.31.ffn_down_exps.weight' has partial data (92.71%) 21 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.31.ffn_gate_exps.weight' has partial data (92.71%) 21 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.40.ffn_gate_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.43.ffn_gate_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.30.ffn_down_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.30.ffn_gate_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.29.ffn_gate_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.29.ffn_up_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.42.ffn_up_exps.weight' has partial data (88.89%) 32 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.28.ffn_gate_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.28.ffn_up_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.43.ffn_down_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.31.ffn_up_exps.weight' has partial data (92.71%) 21 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.27.ffn_gate_exps.weight' has partial data (93.06%) 20 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.26.ffn_gate_exps.weight' has partial data (92.36%) 22 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.26.ffn_up_exps.weight' has partial data (92.36%) 22 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.36.ffn_up_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.24.ffn_down_exps.weight' has partial data (91.67%) 24 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.24.ffn_gate_exps.weight' has partial data (91.67%) 24 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.28.ffn_down_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.23.ffn_down_exps.weight' has partial data (88.89%) 32 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.23.ffn_gate_exps.weight' has partial data (88.89%) 32 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.23.ffn_up_exps.weight' has partial data (88.89%) 32 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.38.ffn_up_exps.weight' has partial data (98.26%) 5 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.22.ffn_down_exps.weight' has partial data (89.58%) 30 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.22.ffn_gate_exps.weight' has partial data (89.58%) 30 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.25.ffn_gate_exps.weight' has partial data (90.97%) 26 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.15.ffn_down_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.7.ffn_down_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.11.ffn_up_exps.weight' has partial data (89.93%) 29 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.6.ffn_up_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.20.ffn_down_exps.weight' has partial data (91.32%) 25 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.11.ffn_down_exps.weight' has partial data (89.93%) 29 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.16.ffn_up_exps.weight' has partial data (89.24%) 31 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.41.ffn_down_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.33.ffn_up_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.4.ffn_up_exps.weight' has partial data (82.99%) 49 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.29.ffn_down_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.8.ffn_up_exps.weight' has partial data (92.36%) 22 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.10.ffn_gate_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.3.ffn_up_exps.weight' has partial data (99.31%) 2 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.6.ffn_down_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.37.ffn_down_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.9.ffn_gate_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.36.ffn_gate_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.3.ffn_down_exps.weight' has partial data (99.31%) 2 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.12.ffn_down_exps.weight' has partial data (88.54%) 33 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.21.ffn_down_exps.weight' has partial data (89.24%) 31 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.27.ffn_up_exps.weight' has partial data (93.06%) 20 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.41.ffn_gate_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
|
| 224 |
+
save_imatrix: entry ' blk.12.ffn_gate_exps.weight' has partial data (88.54%) 33 out of 288 experts are missing data - skipping
|
| 225 |
+
save_imatrix: entry ' blk.38.ffn_down_exps.weight' has partial data (98.26%) 5 out of 288 experts are missing data Storing **but be aware**
|
| 226 |
+
save_imatrix: entry ' blk.44.ffn_up_exps.weight' has partial data (91.67%) 24 out of 288 experts are missing data - skipping
|
| 227 |
+
save_imatrix: entry ' blk.4.ffn_gate_exps.weight' has partial data (82.99%) 49 out of 288 experts are missing data - skipping
|
| 228 |
+
save_imatrix: entry ' blk.19.ffn_up_exps.weight' has partial data (91.32%) 25 out of 288 experts are missing data - skipping
|
| 229 |
+
save_imatrix: entry ' blk.13.ffn_up_exps.weight' has partial data (83.33%) 48 out of 288 experts are missing data - skipping
|
| 230 |
+
save_imatrix: entry ' blk.44.ffn_down_exps.weight' has partial data (91.67%) 24 out of 288 experts are missing data - skipping
|
| 231 |
+
save_imatrix: entry ' blk.7.ffn_up_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
|
| 232 |
+
save_imatrix: entry ' blk.30.ffn_up_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
|
| 233 |
+
save_imatrix: entry ' blk.5.ffn_down_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
|
| 234 |
+
save_imatrix: entry ' blk.18.ffn_up_exps.weight' has partial data (90.62%) 27 out of 288 experts are missing data - skipping
|
| 235 |
+
save_imatrix: entry ' blk.4.ffn_down_exps.weight' has partial data (82.99%) 49 out of 288 experts are missing data - skipping
|
| 236 |
+
save_imatrix: entry ' blk.17.ffn_up_exps.weight' has partial data (89.24%) 31 out of 288 experts are missing data - skipping
|
| 237 |
+
save_imatrix: entry ' blk.41.ffn_up_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
|
| 238 |
+
save_imatrix: entry ' blk.9.ffn_down_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
|
| 239 |
+
save_imatrix: entry ' blk.25.ffn_up_exps.weight' has partial data (90.97%) 26 out of 288 experts are missing data - skipping
|
| 240 |
+
save_imatrix: entry ' blk.3.ffn_gate_exps.weight' has partial data (99.31%) 2 out of 288 experts are missing data Storing **but be aware**
|
| 241 |
+
save_imatrix: entry ' blk.8.ffn_gate_exps.weight' has partial data (92.36%) 22 out of 288 experts are missing data - skipping
|
| 242 |
+
save_imatrix: entry ' blk.9.ffn_up_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
|
| 243 |
+
save_imatrix: entry ' blk.5.ffn_up_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
|
| 244 |
+
save_imatrix: entry ' blk.13.ffn_down_exps.weight' has partial data (83.33%) 48 out of 288 experts are missing data - skipping
|
| 245 |
+
save_imatrix: entry ' blk.16.ffn_gate_exps.weight' has partial data (89.24%) 31 out of 288 experts are missing data - skipping
|
| 246 |
+
save_imatrix: entry ' blk.27.ffn_down_exps.weight' has partial data (93.06%) 20 out of 288 experts are missing data - skipping
|
| 247 |
+
save_imatrix: entry ' blk.26.ffn_down_exps.weight' has partial data (92.36%) 22 out of 288 experts are missing data - skipping
|
| 248 |
+
save_imatrix: entry ' blk.5.ffn_gate_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
|
| 249 |
+
save_imatrix: entry ' blk.11.ffn_gate_exps.weight' has partial data (89.93%) 29 out of 288 experts are missing data - skipping
|
| 250 |
+
save_imatrix: entry ' blk.37.ffn_up_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
|
| 251 |
+
save_imatrix: entry ' blk.18.ffn_gate_exps.weight' has partial data (90.62%) 27 out of 288 experts are missing data - skipping
|
| 252 |
+
save_imatrix: entry ' blk.20.ffn_gate_exps.weight' has partial data (91.32%) 25 out of 288 experts are missing data - skipping
|
| 253 |
+
save_imatrix: entry ' blk.13.ffn_gate_exps.weight' has partial data (83.33%) 48 out of 288 experts are missing data - skipping
|
| 254 |
+
save_imatrix: entry ' blk.40.ffn_up_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.14.ffn_up_exps.weight' has partial data (87.85%) 35 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.10.ffn_down_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.14.ffn_gate_exps.weight' has partial data (87.85%) 35 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.14.ffn_down_exps.weight' has partial data (87.85%) 35 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.8.ffn_down_exps.weight' has partial data (92.36%) 22 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.24.ffn_up_exps.weight' has partial data (91.67%) 24 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.12.ffn_up_exps.weight' has partial data (88.54%) 33 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.42.ffn_gate_exps.weight' has partial data (88.89%) 32 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.10.ffn_up_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.15.ffn_up_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.15.ffn_gate_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.16.ffn_down_exps.weight' has partial data (89.24%) 31 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.17.ffn_gate_exps.weight' has partial data (89.24%) 31 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.35.ffn_up_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.17.ffn_down_exps.weight' has partial data (89.24%) 31 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.18.ffn_down_exps.weight' has partial data (90.62%) 27 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.21.ffn_up_exps.weight' has partial data (89.24%) 31 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.25.ffn_down_exps.weight' has partial data (90.97%) 26 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.6.ffn_gate_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.19.ffn_gate_exps.weight' has partial data (91.32%) 25 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.19.ffn_down_exps.weight' has partial data (91.32%) 25 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.20.ffn_up_exps.weight' has partial data (91.32%) 25 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.22.ffn_up_exps.weight' has partial data (89.58%) 30 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.21.ffn_gate_exps.weight' has partial data (89.24%) 31 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.44.ffn_gate_exps.weight' has partial data (91.67%) 24 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.7.ffn_gate_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.32.ffn_gate_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
save_imatrix: warning: storing only 418 out of 529 entries

save_imatrix: stored collected data after 10 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
[10]2.1021,[11]2.3311,[12]2.4035,[13]2.3973,[14]2.4537,[15]2.3408,[16]2.2269,[17]2.1399,[18]2.0725,[19]2.0198,
save_imatrix: entry ' blk.43.ffn_up_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.42.ffn_down_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.37.ffn_gate_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.36.ffn_down_exps.weight' has partial data (97.57%) 7 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.35.ffn_down_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.35.ffn_gate_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.34.ffn_gate_exps.weight' has partial data (97.57%) 7 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.34.ffn_up_exps.weight' has partial data (97.57%) 7 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.33.ffn_down_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.33.ffn_gate_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.32.ffn_down_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.32.ffn_up_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.34.ffn_down_exps.weight' has partial data (97.57%) 7 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.31.ffn_down_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.31.ffn_gate_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.43.ffn_gate_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.30.ffn_down_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.30.ffn_gate_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.29.ffn_gate_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.29.ffn_up_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.42.ffn_up_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.28.ffn_gate_exps.weight' has partial data (94.44%) 16 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.28.ffn_up_exps.weight' has partial data (94.44%) 16 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.43.ffn_down_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.31.ffn_up_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.27.ffn_gate_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.26.ffn_gate_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.26.ffn_up_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.36.ffn_up_exps.weight' has partial data (97.57%) 7 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.24.ffn_down_exps.weight' has partial data (94.44%) 16 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.24.ffn_gate_exps.weight' has partial data (94.44%) 16 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.28.ffn_down_exps.weight' has partial data (94.44%) 16 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.23.ffn_down_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.23.ffn_gate_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.23.ffn_up_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.22.ffn_down_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.22.ffn_gate_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.25.ffn_gate_exps.weight' has partial data (94.79%) 15 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.15.ffn_down_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.11.ffn_up_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.6.ffn_up_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.20.ffn_down_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.11.ffn_down_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.16.ffn_up_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.41.ffn_down_exps.weight' has partial data (96.88%) 9 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.33.ffn_up_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.4.ffn_up_exps.weight' has partial data (93.06%) 20 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.29.ffn_down_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.8.ffn_up_exps.weight' has partial data (96.88%) 9 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.10.ffn_gate_exps.weight' has partial data (97.57%) 7 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.6.ffn_down_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.37.ffn_down_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.9.ffn_gate_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.36.ffn_gate_exps.weight' has partial data (97.57%) 7 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.12.ffn_down_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.21.ffn_down_exps.weight' has partial data (94.44%) 16 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.27.ffn_up_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.41.ffn_gate_exps.weight' has partial data (96.88%) 9 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.12.ffn_gate_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.44.ffn_up_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.4.ffn_gate_exps.weight' has partial data (93.06%) 20 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.19.ffn_up_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.13.ffn_up_exps.weight' has partial data (91.32%) 25 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.44.ffn_down_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.30.ffn_up_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.5.ffn_down_exps.weight' has partial data (98.61%) 4 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.18.ffn_up_exps.weight' has partial data (94.79%) 15 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.4.ffn_down_exps.weight' has partial data (93.06%) 20 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.17.ffn_up_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.41.ffn_up_exps.weight' has partial data (96.88%) 9 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.9.ffn_down_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.25.ffn_up_exps.weight' has partial data (94.79%) 15 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.8.ffn_gate_exps.weight' has partial data (96.88%) 9 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.9.ffn_up_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.5.ffn_up_exps.weight' has partial data (98.61%) 4 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.13.ffn_down_exps.weight' has partial data (91.32%) 25 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.16.ffn_gate_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.27.ffn_down_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.26.ffn_down_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.5.ffn_gate_exps.weight' has partial data (98.61%) 4 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.11.ffn_gate_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.37.ffn_up_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.18.ffn_gate_exps.weight' has partial data (94.79%) 15 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.20.ffn_gate_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.13.ffn_gate_exps.weight' has partial data (91.32%) 25 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.14.ffn_up_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.10.ffn_down_exps.weight' has partial data (97.57%) 7 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.14.ffn_gate_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.14.ffn_down_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.8.ffn_down_exps.weight' has partial data (96.88%) 9 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.24.ffn_up_exps.weight' has partial data (94.44%) 16 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.12.ffn_up_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.42.ffn_gate_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.10.ffn_up_exps.weight' has partial data (97.57%) 7 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.15.ffn_up_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.15.ffn_gate_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.16.ffn_down_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.17.ffn_gate_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.35.ffn_up_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.17.ffn_down_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.18.ffn_down_exps.weight' has partial data (94.79%) 15 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.21.ffn_up_exps.weight' has partial data (94.44%) 16 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.25.ffn_down_exps.weight' has partial data (94.79%) 15 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.6.ffn_gate_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.19.ffn_gate_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.19.ffn_down_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.20.ffn_up_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.22.ffn_up_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.21.ffn_gate_exps.weight' has partial data (94.44%) 16 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.44.ffn_gate_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.32.ffn_gate_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
save_imatrix: warning: storing only 478 out of 529 entries

save_imatrix: stored collected data after 20 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
[20]1.9686,[21]1.9365,[22]1.8869,[23]1.8609,[24]1.8847,[25]1.8792,[26]1.8445,[27]1.9620,[28]2.0677,[29]2.1517,
save_imatrix: entry ' blk.43.ffn_up_exps.weight' has partial data (94.79%) 15 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.42.ffn_down_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.43.ffn_gate_exps.weight' has partial data (94.79%) 15 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.30.ffn_down_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.30.ffn_gate_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.29.ffn_gate_exps.weight' has partial data (97.22%) 8 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.29.ffn_up_exps.weight' has partial data (97.22%) 8 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.42.ffn_up_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.28.ffn_gate_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.28.ffn_up_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.43.ffn_down_exps.weight' has partial data (94.79%) 15 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.24.ffn_down_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.24.ffn_gate_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.28.ffn_down_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.23.ffn_down_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.23.ffn_gate_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.23.ffn_up_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.25.ffn_gate_exps.weight' has partial data (97.22%) 8 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.15.ffn_down_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.20.ffn_down_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.4.ffn_up_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.29.ffn_down_exps.weight' has partial data (97.22%) 8 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.12.ffn_down_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.21.ffn_down_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.12.ffn_gate_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.4.ffn_gate_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.13.ffn_up_exps.weight' has partial data (92.71%) 21 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.30.ffn_up_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.18.ffn_up_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.4.ffn_down_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.17.ffn_up_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.25.ffn_up_exps.weight' has partial data (97.22%) 8 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.13.ffn_down_exps.weight' has partial data (92.71%) 21 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.18.ffn_gate_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.20.ffn_gate_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.13.ffn_gate_exps.weight' has partial data (92.71%) 21 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.14.ffn_up_exps.weight' has partial data (94.44%) 16 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.14.ffn_gate_exps.weight' has partial data (94.44%) 16 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.14.ffn_down_exps.weight' has partial data (94.44%) 16 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.24.ffn_up_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.12.ffn_up_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.42.ffn_gate_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
save_imatrix: entry ' blk.15.ffn_up_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.15.ffn_gate_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.17.ffn_gate_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.17.ffn_down_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.18.ffn_down_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
save_imatrix: entry ' blk.21.ffn_up_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
|
| 449 |
+
save_imatrix: entry ' blk.25.ffn_down_exps.weight' has partial data (97.22%) 8 out of 288 experts are missing data Storing **but be aware**
|
| 450 |
+
save_imatrix: entry ' blk.20.ffn_up_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
|
| 451 |
+
save_imatrix: entry ' blk.21.ffn_gate_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
|
| 452 |
+
save_imatrix: warning: storing only 511 out of 529 entries
|
| 453 |
+
|
| 454 |
+
save_imatrix: stored collected data after 30 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 455 |
+
[30]2.1645,[31]2.1882,[32]2.1900,[33]2.1667,[34]2.2092,[35]2.2241,[36]2.2535,[37]2.2534,[38]2.3061,[39]2.2955,
|
| 456 |
+
save_imatrix: entry ' blk.43.ffn_up_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
|
| 457 |
+
save_imatrix: entry ' blk.42.ffn_down_exps.weight' has partial data (98.26%) 5 out of 288 experts are missing data Storing **but be aware**
|
| 458 |
+
save_imatrix: entry ' blk.43.ffn_gate_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
|
| 459 |
+
save_imatrix: entry ' blk.42.ffn_up_exps.weight' has partial data (98.26%) 5 out of 288 experts are missing data Storing **but be aware**
|
| 460 |
+
save_imatrix: entry ' blk.43.ffn_down_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
|
| 461 |
+
save_imatrix: entry ' blk.23.ffn_down_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
|
| 462 |
+
save_imatrix: entry ' blk.23.ffn_gate_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
|
| 463 |
+
save_imatrix: entry ' blk.23.ffn_up_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
|
| 464 |
+
save_imatrix: entry ' blk.4.ffn_up_exps.weight' has partial data (98.26%) 5 out of 288 experts are missing data Storing **but be aware**
|
| 465 |
+
save_imatrix: entry ' blk.4.ffn_gate_exps.weight' has partial data (98.26%) 5 out of 288 experts are missing data Storing **but be aware**
|
| 466 |
+
save_imatrix: entry ' blk.13.ffn_up_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
|
| 467 |
+
save_imatrix: entry ' blk.4.ffn_down_exps.weight' has partial data (98.26%) 5 out of 288 experts are missing data Storing **but be aware**
|
| 468 |
+
save_imatrix: entry ' blk.13.ffn_down_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
|
| 469 |
+
save_imatrix: entry ' blk.13.ffn_gate_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
|
| 470 |
+
save_imatrix: entry ' blk.14.ffn_up_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
|
| 471 |
+
save_imatrix: entry ' blk.14.ffn_gate_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
|
| 472 |
+
save_imatrix: entry ' blk.14.ffn_down_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
|
| 473 |
+
save_imatrix: entry ' blk.42.ffn_gate_exps.weight' has partial data (98.26%) 5 out of 288 experts are missing data Storing **but be aware**
|
| 474 |
+
|
| 475 |
+
save_imatrix: stored collected data after 40 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 476 |
+
[40]2.3248,[41]2.3142,[42]2.2977,[43]2.3104,[44]2.3077,[45]2.3022,[46]2.3080,[47]2.2986,[48]2.2730,[49]2.2517,
|
| 477 |
+
save_imatrix: stored collected data after 50 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 478 |
+
[50]2.2370,[51]2.2344,[52]2.2286,[53]2.2301,[54]2.2405,[55]2.2243,[56]2.2017,[57]2.2026,[58]2.1997,[59]2.2053,
|
| 479 |
+
save_imatrix: stored collected data after 60 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 480 |
+
[60]2.1886,[61]2.2348,[62]2.2826,[63]2.3263,[64]2.3770,[65]2.4355,[66]2.4710,[67]2.5238,[68]2.5784,[69]2.6394,
|
| 481 |
+
save_imatrix: stored collected data after 70 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 482 |
+
[70]2.7132,[71]2.7529,[72]2.7870,[73]2.8046,[74]2.8227,[75]2.8730,[76]2.9207,[77]2.9354,[78]2.9547,[79]2.9834,
|
| 483 |
+
save_imatrix: stored collected data after 80 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 484 |
+
[80]3.0243,[81]3.0668,[82]3.1158,[83]3.1211,[84]3.1977,[85]3.2135,[86]3.2150,[87]3.2843,[88]3.3456,[89]3.4204,
|
| 485 |
+
save_imatrix: stored collected data after 90 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 486 |
+
[90]3.4383,[91]3.4328,[92]3.4361,[93]3.4487,[94]3.4527,[95]3.4903,[96]3.4952,[97]3.5381,[98]3.5657,[99]3.5372,
|
| 487 |
+
save_imatrix: stored collected data after 100 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 488 |
+
[100]3.5712,[101]3.6255,[102]3.6591,[103]3.7021,[104]3.7340,[105]3.7678,[106]3.8049,[107]3.7880,[108]3.7927,[109]3.7995,
|
| 489 |
+
save_imatrix: stored collected data after 110 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 490 |
+
[110]3.8065,[111]3.7946,[112]3.8325,[113]3.8564,[114]3.8682,[115]3.8427,[116]3.8048,[117]3.7913,[118]3.7995,[119]3.7751,
|
| 491 |
+
save_imatrix: stored collected data after 120 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 492 |
+
[120]3.7521,[121]3.7361,[122]3.7214,[123]3.7211,[124]3.7221,[125]3.7333,[126]3.7428,[127]3.7639,[128]3.7981,[129]3.8097,
|
| 493 |
+
save_imatrix: stored collected data after 130 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 494 |
+
[130]3.7767,[131]3.7424,[132]3.7104,[133]3.6785,[134]3.6792,[135]3.6723,[136]3.7012,[137]3.7375,[138]3.7522,[139]3.7524,
|
| 495 |
+
save_imatrix: stored collected data after 140 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 496 |
+
[140]3.7753,[141]3.8038,[142]3.8356,[143]3.8474,[144]3.8692,[145]3.8896,[146]3.9076,[147]3.9221,[148]3.9314,[149]3.9288,
|
| 497 |
+
save_imatrix: stored collected data after 150 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 498 |
+
[150]3.9329,[151]3.9521,[152]3.9701,[153]3.9697,[154]3.9747,[155]3.9853,[156]3.9910,[157]3.9969,[158]4.0021,[159]4.0095,
|
| 499 |
+
save_imatrix: stored collected data after 160 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 500 |
+
[160]4.0235,[161]4.0243,[162]4.0248,[163]4.0299,[164]4.0368,[165]4.0363,[166]4.0330,[167]4.0539,[168]4.0627,[169]4.0696,
|
| 501 |
+
save_imatrix: stored collected data after 170 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 502 |
+
[170]4.0906,[171]4.1067,[172]4.1002,[173]4.1049,[174]4.1072,[175]4.1209,[176]4.1278,[177]4.1407,[178]4.1391,[179]4.1392,
|
| 503 |
+
save_imatrix: stored collected data after 180 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 504 |
+
[180]4.1375,[181]4.1371,[182]4.1347,[183]4.1326,[184]4.1200,[185]4.1316,[186]4.1609,[187]4.1892,[188]4.2153,[189]4.2400,
|
| 505 |
+
save_imatrix: stored collected data after 190 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 506 |
+
[190]4.2773,[191]4.2866,[192]4.2995,[193]4.2804,[194]4.2936,[195]4.2837,[196]4.2593,[197]4.2321,[198]4.2527,[199]4.2750,
|
| 507 |
+
save_imatrix: stored collected data after 200 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 508 |
+
[200]4.2825,[201]4.2905,[202]4.3070,[203]4.3250,[204]4.3391,[205]4.3518,[206]4.3650,[207]4.3586,[208]4.3318,[209]4.3069,
|
| 509 |
+
save_imatrix: stored collected data after 210 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 510 |
+
[210]4.2806,[211]4.2549,[212]4.2300,[213]4.2044,[214]4.2076,[215]4.2338,[216]4.2205,[217]4.2112,[218]4.2377,[219]4.2507,
|
| 511 |
+
save_imatrix: stored collected data after 220 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 512 |
+
[220]4.2713,[221]4.2947,[222]4.3132,[223]4.3261,[224]4.3557,[225]4.3644,[226]4.3954,[227]4.4296,[228]4.4535,[229]4.4635,
|
| 513 |
+
save_imatrix: stored collected data after 230 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 514 |
+
[230]4.4711,[231]4.4786,[232]4.5008,[233]4.5066,[234]4.5144,[235]4.5423,[236]4.5473,[237]4.5815,[238]4.6125,[239]4.6244,
|
| 515 |
+
save_imatrix: stored collected data after 240 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 516 |
+
[240]4.6367,[241]4.6566,[242]4.6653,[243]4.6757,[244]4.6927,[245]4.7105,[246]4.7363,[247]4.7391,[248]4.7495,[249]4.7627,
|
| 517 |
+
save_imatrix: stored collected data after 250 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 518 |
+
[250]4.7765,[251]4.7811,[252]4.7926,[253]4.8027,[254]4.8115,[255]4.8230,[256]4.8389,[257]4.8518,[258]4.8654,[259]4.8754,
|
| 519 |
+
save_imatrix: stored collected data after 260 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 520 |
+
[260]4.8781,[261]4.8896,[262]4.8903,[263]4.9069,[264]4.9292,[265]4.9484,[266]4.9674,[267]4.9801,[268]4.9860,[269]4.9948,
|
| 521 |
+
save_imatrix: stored collected data after 270 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 522 |
+
[270]5.0074,[271]5.0283,[272]5.0497,[273]5.0671,[274]5.0720,[275]5.0726,[276]5.0887,[277]5.0971,[278]5.1116,[279]5.1267,
|
| 523 |
+
save_imatrix: stored collected data after 280 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 524 |
+
[280]5.1272,[281]5.1285,[282]5.1367,[283]5.1374,[284]5.1515,[285]5.1579,[286]5.1643,[287]5.1887,[288]5.2028,[289]5.2193,
|
| 525 |
+
save_imatrix: stored collected data after 290 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 526 |
+
[290]5.2383,[291]5.2501,[292]5.2751,[293]5.2875,[294]5.3043,[295]5.3194,[296]5.3327,[297]5.3388,[298]5.3604,[299]5.3687,
|
| 527 |
+
save_imatrix: stored collected data after 300 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 528 |
+
[300]5.3728,[301]5.3898,[302]5.4105,[303]5.4165,[304]5.4243,[305]5.4300,[306]5.4394,[307]5.4488,[308]5.4525,[309]5.4703,
|
| 529 |
+
save_imatrix: stored collected data after 310 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 530 |
+
[310]5.4786,[311]5.4932,[312]5.5118,[313]5.5282,[314]5.5483,[315]5.5213,[316]5.5219,[317]5.4985,[318]5.5149,[319]5.5228,
|
| 531 |
+
save_imatrix: stored collected data after 320 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 532 |
+
[320]5.5227,[321]5.5188,[322]5.5330,[323]5.5465,[324]5.5547,[325]5.5643,[326]5.5650,[327]5.5811,[328]5.5871,[329]5.6020,
|
| 533 |
+
save_imatrix: stored collected data after 330 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 534 |
+
[330]5.6110,[331]5.6185,[332]5.6276,[333]5.5984,[334]5.6102,[335]5.6322,[336]5.6522,[337]5.6740,[338]5.6883,[339]5.7097,
|
| 535 |
+
save_imatrix: stored collected data after 340 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 536 |
+
[340]5.7121,[341]5.7121,[342]5.7181,[343]5.7259,[344]5.7456,[345]5.7737,[346]5.7669,[347]5.7665,[348]5.7747,[349]5.7696,
|
| 537 |
+
save_imatrix: stored collected data after 350 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 538 |
+
[350]5.7718,[351]5.7753,[352]5.7696,[353]5.7763,[354]5.7891,[355]5.7864,[356]5.7856,[357]5.7662,[358]5.7433,[359]5.7300,
|
| 539 |
+
save_imatrix: stored collected data after 360 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 540 |
+
[360]5.7155,[361]5.6990,[362]5.6893,[363]5.6717,[364]5.6643,[365]5.6480,[366]5.6478,[367]5.6308,[368]5.6273,[369]5.6031,
|
| 541 |
+
save_imatrix: stored collected data after 370 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 542 |
+
[370]5.5829,[371]5.5735,[372]5.5600,[373]5.5399,[374]5.5243,[375]5.5158,[376]5.4972,[377]5.4881,[378]5.4873,[379]5.4862,
|
| 543 |
+
save_imatrix: stored collected data after 380 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 544 |
+
[380]5.4784,[381]5.4703,[382]5.4476,[383]5.4259,[384]5.4148,[385]5.4007,[386]5.3796,[387]5.3563,[388]5.3332,[389]5.3184,
|
| 545 |
+
save_imatrix: stored collected data after 390 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 546 |
+
[390]5.3114,[391]5.3143,[392]5.3078,[393]5.3050,[394]5.2956,[395]5.2810,[396]5.2605,[397]5.2439,[398]5.2358,[399]5.2181,
|
| 547 |
+
save_imatrix: stored collected data after 400 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 548 |
+
[400]5.2028,[401]5.1889,[402]5.1764,[403]5.1653,[404]5.1494,[405]5.1340,[406]5.1233,[407]5.1054,[408]5.0884,[409]5.0733,
|
| 549 |
+
save_imatrix: stored collected data after 410 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 550 |
+
[410]5.0606,[411]5.0529,[412]5.0454,[413]5.0384,[414]5.0269,[415]5.0170,[416]4.9984,[417]4.9798,[418]4.9609,[419]4.9444,
|
| 551 |
+
save_imatrix: stored collected data after 420 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 552 |
+
[420]4.9271,[421]4.9130,[422]4.8962,[423]4.8795,[424]4.8668,[425]4.8511,[426]4.8386,[427]4.8284,[428]4.8148,[429]4.7992,
|
| 553 |
+
save_imatrix: stored collected data after 430 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 554 |
+
[430]4.7833,[431]4.7701,[432]4.7664,[433]4.7576,[434]4.7630,[435]4.7521,[436]4.7382,[437]4.7269,[438]4.7143,[439]4.7061,
|
| 555 |
+
save_imatrix: stored collected data after 440 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 556 |
+
[440]4.6956,[441]4.6814,[442]4.6741,[443]4.6630,[444]4.6612,[445]4.6517,[446]4.6430,[447]4.6422,[448]4.6334,[449]4.6247,
|
| 557 |
+
save_imatrix: stored collected data after 450 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 558 |
+
[450]4.6132,[451]4.6061,[452]4.5944,[453]4.5835,[454]4.5715,[455]4.5610,[456]4.5472,[457]4.5362,[458]4.5256,[459]4.5127,
|
| 559 |
+
save_imatrix: stored collected data after 460 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 560 |
+
[460]4.5012,[461]4.4927,[462]4.4892,[463]4.4768,[464]4.4725,[465]4.4667,[466]4.4614,[467]4.4546,[468]4.4480,[469]4.4419,
|
| 561 |
+
save_imatrix: stored collected data after 470 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 562 |
+
[470]4.4352,[471]4.4286,[472]4.4222,[473]4.4156,[474]4.4099,[475]4.4034,[476]4.3970,[477]4.3924,[478]4.3810,[479]4.3720,
|
| 563 |
+
save_imatrix: stored collected data after 480 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 564 |
+
[480]4.3595,[481]4.3528,[482]4.3495,[483]4.3489,[484]4.3361,[485]4.3263,[486]4.3164,[487]4.3049,[488]4.2961,[489]4.2901,
|
| 565 |
+
save_imatrix: stored collected data after 490 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 566 |
+
[490]4.2823,[491]4.2764,[492]4.2675,[493]4.2613,[494]4.2508,[495]4.2465,[496]4.2389,[497]4.2302,[498]4.2205,[499]4.2204,
|
| 567 |
+
save_imatrix: stored collected data after 500 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 568 |
+
[500]4.2207,[501]4.2235,[502]4.2204,[503]4.2211,[504]4.2210,[505]4.2177,[506]4.2105,[507]4.2207,[508]4.2305,[509]4.2408,
|
| 569 |
+
save_imatrix: stored collected data after 510 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 570 |
+
[510]4.2493,[511]4.2572,[512]4.2653,[513]4.2721,[514]4.2796,[515]4.2847,[516]4.2921,[517]4.2970,[518]4.2971,[519]4.3143,
|
| 571 |
+
save_imatrix: stored collected data after 520 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 572 |
+
[520]4.3267,[521]4.3409,[522]4.3505,[523]4.3562,[524]4.3613,[525]4.3665,[526]4.3715,[527]4.3778,[528]4.3836,[529]4.3874,
|
| 573 |
+
save_imatrix: stored collected data after 530 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 574 |
+
[530]4.3930,[531]4.3979,[532]4.4008,[533]4.4036,[534]4.4078,[535]4.4051,[536]4.4066,[537]4.4142,[538]4.4193,[539]4.4241,
|
| 575 |
+
save_imatrix: stored collected data after 540 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 576 |
+
[540]4.4356,[541]4.4396,[542]4.4415,[543]4.4458,[544]4.4472,[545]4.4490,[546]4.4543,[547]4.4598,[548]4.4671,[549]4.4731,
|
| 577 |
+
save_imatrix: stored collected data after 550 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 578 |
+
[550]4.4794,[551]4.4874,[552]4.4927,[553]4.5002,[554]4.5035,[555]4.5077,[556]4.5118,[557]4.5194,[558]4.5195,[559]4.5247,
|
| 579 |
+
save_imatrix: stored collected data after 560 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 580 |
+
[560]4.5291,[561]4.5356,[562]4.5417,[563]4.5440,[564]4.5502,[565]4.5578,[566]4.5635,[567]4.5724,[568]4.5739,[569]4.5765,
|
| 581 |
+
save_imatrix: stored collected data after 570 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 582 |
+
[570]4.5767,[571]4.5807,[572]4.5769,[573]4.5728,[574]4.5711,[575]4.5738,[576]4.5735,[577]4.5762,[578]4.5760,[579]4.5796,
|
| 583 |
+
save_imatrix: stored collected data after 580 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 584 |
+
[580]4.5788,[581]4.5769,[582]4.5764,[583]4.5737,[584]4.5691,[585]4.5699,[586]4.5660,[587]4.5588,[588]4.5570,[589]4.5552,
|
| 585 |
+
save_imatrix: stored collected data after 590 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 586 |
+
[590]4.5495,[591]4.5451,[592]4.5398,[593]4.5345,[594]4.5309,[595]4.5299,[596]4.5265,[597]4.5261,[598]4.5232,[599]4.5186,
|
| 587 |
+
save_imatrix: stored collected data after 600 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 588 |
+
[600]4.5134,[601]4.5140,[602]4.5149,[603]4.5143,[604]4.5097,[605]4.5077,[606]4.5034,[607]4.5078,[608]4.5055,[609]4.5028,
|
| 589 |
+
save_imatrix: stored collected data after 610 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 590 |
+
[610]4.5024,[611]4.5070,[612]4.5081,[613]4.4981,[614]4.4911,[615]4.4818,[616]4.4730,[617]4.4655,[618]4.4565,[619]4.4458,
|
| 591 |
+
save_imatrix: stored collected data after 620 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 592 |
+
[620]4.4353,[621]4.4247,[622]4.4166,[623]4.4103,[624]4.4047,[625]4.4032,[626]4.3953,[627]4.3885,[628]4.3801,[629]4.3742,
|
| 593 |
+
save_imatrix: stored collected data after 630 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 594 |
+
[630]4.3734,[631]4.3750,[632]4.3698,[633]4.3645,[634]4.3603,[635]4.3512,[636]4.3435,[637]4.3355,[638]4.3274,[639]4.3192,
|
| 595 |
+
save_imatrix: stored collected data after 640 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 596 |
+
[640]4.3116,[641]4.3051,[642]4.2995,[643]4.2915,[644]4.2844,[645]4.2776,[646]4.2779,[647]4.2723,[648]4.2643,[649]4.2584,
|
| 597 |
+
save_imatrix: stored collected data after 650 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 598 |
+
[650]4.2522,[651]4.2453,[652]4.2373,[653]4.2301,[654]4.2238,[655]4.2182,[656]4.2116,[657]4.2119,[658]4.2109,[659]4.2123,
|
| 599 |
+
save_imatrix: stored collected data after 660 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 600 |
+
[660]4.2098,[661]4.2022,[662]4.1968,[663]4.1905,[664]4.1822,[665]4.1751,[666]4.1678,[667]4.1612,[668]4.1540,[669]4.1467,
|
| 601 |
+
save_imatrix: stored collected data after 670 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 602 |
+
[670]4.1401,[671]4.1332,[672]4.1270,[673]4.1204,[674]4.1135,[675]4.1059,[676]4.0989,[677]4.0931,[678]4.0862,[679]4.0799,
|
| 603 |
+
save_imatrix: stored collected data after 680 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 604 |
+
[680]4.0738,[681]4.0672,[682]4.0606,[683]4.0531,[684]4.0467,[685]4.0404,[686]4.0371,[687]4.0292,[688]4.0222,[689]4.0156,
|
| 605 |
+
save_imatrix: stored collected data after 690 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 606 |
+
[690]4.0082,[691]4.0020,[692]3.9976,[693]3.9952,[694]3.9912,[695]3.9880,[696]3.9845,[697]3.9813,[698]3.9780,[699]3.9749,
|
| 607 |
+
save_imatrix: stored collected data after 700 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 608 |
+
[700]3.9718,[701]3.9691,[702]3.9669,[703]3.9641,[704]3.9606,[705]3.9584,[706]3.9549,[707]3.9518,[708]3.9490,[709]3.9462,
|
| 609 |
+
save_imatrix: stored collected data after 710 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 610 |
+
[710]3.9466,[711]3.9471,[712]3.9477,[713]3.9479,[714]3.9485,[715]3.9479,[716]3.9493,[717]3.9502,[718]3.9503,[719]3.9497,
|
| 611 |
+
save_imatrix: stored collected data after 720 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 612 |
+
[720]3.9500,[721]3.9501,[722]3.9499,[723]3.9515,[724]3.9533,[725]3.9540,[726]3.9537,[727]3.9534,[728]3.9537,[729]3.9552,
|
| 613 |
+
save_imatrix: stored collected data after 730 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 614 |
+
[730]3.9562,[731]3.9560,[732]3.9550,[733]3.9541,[734]3.9559,[735]3.9573,[736]3.9575,[737]3.9584,[738]3.9593,[739]3.9593,
|
| 615 |
+
save_imatrix: stored collected data after 740 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 616 |
+
[740]3.9591,[741]3.9590,[742]3.9602,[743]3.9603,[744]3.9601,[745]3.9609,[746]3.9612,[747]3.9615,[748]3.9606,[749]3.9609,
|
| 617 |
+
save_imatrix: stored collected data after 750 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 618 |
+
[750]3.9599,[751]3.9609,[752]3.9604,[753]3.9600,[754]3.9608,[755]3.9604,[756]3.9608,[757]3.9618,[758]3.9614,[759]3.9625,
|
| 619 |
+
save_imatrix: stored collected data after 760 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 620 |
+
[760]3.9629,[761]3.9641,[762]3.9633,[763]3.9637,[764]3.9646,[765]3.9640,[766]3.9639,[767]3.9642,[768]3.9633,[769]3.9631,
|
| 621 |
+
save_imatrix: stored collected data after 770 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 622 |
+
[770]3.9635,[771]3.9626,[772]3.9624,[773]3.9617,[774]3.9616,[775]3.9632,[776]3.9632,[777]3.9638,[778]3.9640,[779]3.9624,
|
| 623 |
+
save_imatrix: stored collected data after 780 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 624 |
+
[780]3.9619,[781]3.9623,[782]3.9627,[783]3.9611,[784]3.9616,[785]3.9611,[786]3.9622,[787]3.9626,[788]3.9620,[789]3.9625,
|
| 625 |
+
save_imatrix: stored collected data after 790 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 626 |
+
[790]3.9628,[791]3.9645,[792]3.9663,[793]3.9661,[794]3.9649,[795]3.9648,[796]3.9660,[797]3.9665,[798]3.9659,[799]3.9668,
|
| 627 |
+
save_imatrix: stored collected data after 800 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 628 |
+
[800]3.9682,[801]3.9689,[802]3.9695,[803]3.9710,[804]3.9716,[805]3.9721,[806]3.9725,[807]3.9742,[808]3.9749,[809]3.9743,
|
| 629 |
+
save_imatrix: stored collected data after 810 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 630 |
+
[810]3.9744,[811]3.9747,[812]3.9755,
|
| 631 |
+
save_imatrix: stored collected data after 812 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
|
| 632 |
+
|
| 633 |
+
Final estimate: PPL = 3.9755 +/- 0.01997
|
| 634 |
+
|
| 635 |
+
======================== sorted layer importances
|
| 636 |
+
0: Layer 0, <cos_sim> = 0.191944
|
| 637 |
+
1: Layer 44, <cos_sim> = 0.794719
|
| 638 |
+
2: Layer 11, <cos_sim> = 0.880959
|
| 639 |
+
3: Layer 15, <cos_sim> = 0.889736
|
| 640 |
+
4: Layer 12, <cos_sim> = 0.892113
|
| 641 |
+
5: Layer 19, <cos_sim> = 0.896638
|
| 642 |
+
6: Layer 16, <cos_sim> = 0.902101
|
| 643 |
+
7: Layer 14, <cos_sim> = 0.904488
|
| 644 |
+
8: Layer 13, <cos_sim> = 0.904849
|
| 645 |
+
9: Layer 18, <cos_sim> = 0.90949
|
| 646 |
+
10: Layer 20, <cos_sim> = 0.912555
|
| 647 |
+
11: Layer 17, <cos_sim> = 0.913825
|
| 648 |
+
12: Layer 21, <cos_sim> = 0.916861
|
| 649 |
+
13: Layer 43, <cos_sim> = 0.920321
|
| 650 |
+
14: Layer 22, <cos_sim> = 0.920897
|
| 651 |
+
15: Layer 7, <cos_sim> = 0.925641
|
| 652 |
+
16: Layer 10, <cos_sim> = 0.928077
|
| 653 |
+
17: Layer 9, <cos_sim> = 0.930262
|
| 654 |
+
18: Layer 23, <cos_sim> = 0.930822
|
| 655 |
+
19: Layer 8, <cos_sim> = 0.932862
|
| 656 |
+
20: Layer 24, <cos_sim> = 0.936006
|
| 657 |
+
21: Layer 3, <cos_sim> = 0.940002
|
| 658 |
+
22: Layer 41, <cos_sim> = 0.945994
|
| 659 |
+
23: Layer 25, <cos_sim> = 0.946426
|
| 660 |
+
24: Layer 27, <cos_sim> = 0.946791
|
| 661 |
+
25: Layer 42, <cos_sim> = 0.94737
|
| 662 |
+
26: Layer 26, <cos_sim> = 0.948684
|
| 663 |
+
27: Layer 36, <cos_sim> = 0.949698
|
| 664 |
+
28: Layer 39, <cos_sim> = 0.949899
|
| 665 |
+
29: Layer 37, <cos_sim> = 0.9515
|
| 666 |
+
30: Layer 28, <cos_sim> = 0.951921
|
| 667 |
+
31: Layer 38, <cos_sim> = 0.953373
|
| 668 |
+
32: Layer 35, <cos_sim> = 0.955007
|
| 669 |
+
33: Layer 29, <cos_sim> = 0.955639
|
| 670 |
+
34: Layer 34, <cos_sim> = 0.955797
|
| 671 |
+
35: Layer 31, <cos_sim> = 0.956181
|
| 672 |
+
36: Layer 6, <cos_sim> = 0.956762
|
| 673 |
+
37: Layer 33, <cos_sim> = 0.958702
|
| 674 |
+
38: Layer 5, <cos_sim> = 0.959416
|
| 675 |
+
39: Layer 40, <cos_sim> = 0.96006
|
| 676 |
+
40: Layer 30, <cos_sim> = 0.960335
|
| 677 |
+
41: Layer 32, <cos_sim> = 0.961425
|
| 678 |
+
42: Layer 4, <cos_sim> = 0.963155
|
| 679 |
+
43: Layer 1, <cos_sim> = 0.977383
|
| 680 |
+
44: Layer 2, <cos_sim> = 0.981096
|
| 681 |
+
|
| 682 |
+
======================== sorted attention importances
|
| 683 |
+
0: Layer 3, <cos_sim> = 0.268473
|
| 684 |
+
1: Layer 5, <cos_sim> = 0.445389
|
| 685 |
+
2: Layer 1, <cos_sim> = 0.491229
|
| 686 |
+
3: Layer 4, <cos_sim> = 0.507703
|
| 687 |
+
4: Layer 2, <cos_sim> = 0.523524
|
| 688 |
+
5: Layer 7, <cos_sim> = 0.546491
|
| 689 |
+
6: Layer 6, <cos_sim> = 0.551201
|
| 690 |
+
7: Layer 0, <cos_sim> = 0.657228
8: Layer 9, <cos_sim> = 0.693649
9: Layer 8, <cos_sim> = 0.693792
10: Layer 10, <cos_sim> = 0.715702
11: Layer 11, <cos_sim> = 0.738956
12: Layer 13, <cos_sim> = 0.812073
13: Layer 14, <cos_sim> = 0.819818
14: Layer 12, <cos_sim> = 0.85671
15: Layer 15, <cos_sim> = 0.860875
16: Layer 17, <cos_sim> = 0.888072
17: Layer 18, <cos_sim> = 0.89278
18: Layer 16, <cos_sim> = 0.914259
19: Layer 19, <cos_sim> = 0.931089
20: Layer 21, <cos_sim> = 0.949091
21: Layer 22, <cos_sim> = 0.955978
22: Layer 20, <cos_sim> = 0.958918
23: Layer 23, <cos_sim> = 0.963765
24: Layer 24, <cos_sim> = 0.963995
25: Layer 28, <cos_sim> = 0.965883
26: Layer 43, <cos_sim> = 0.967174
27: Layer 42, <cos_sim> = 0.969761
28: Layer 26, <cos_sim> = 0.970181
29: Layer 25, <cos_sim> = 0.971553
30: Layer 39, <cos_sim> = 0.972275
31: Layer 41, <cos_sim> = 0.975387
32: Layer 29, <cos_sim> = 0.975487
33: Layer 36, <cos_sim> = 0.977112
34: Layer 32, <cos_sim> = 0.978462
35: Layer 38, <cos_sim> = 0.979173
36: Layer 27, <cos_sim> = 0.979313
37: Layer 35, <cos_sim> = 0.980944
38: Layer 34, <cos_sim> = 0.98212
39: Layer 30, <cos_sim> = 0.982521
40: Layer 33, <cos_sim> = 0.982989
41: Layer 37, <cos_sim> = 0.983563
42: Layer 40, <cos_sim> = 0.985181
43: Layer 31, <cos_sim> = 0.985454
44: Layer 44, <cos_sim> = 0.987712

======================== sorted ffn importances
0: Layer 0, <cos_sim> = 0.431108
1: Layer 2, <cos_sim> = 0.44518
2: Layer 3, <cos_sim> = 0.450093
3: Layer 4, <cos_sim> = 0.471592
4: Layer 5, <cos_sim> = 0.482406
5: Layer 6, <cos_sim> = 0.559887
6: Layer 1, <cos_sim> = 0.602544
7: Layer 8, <cos_sim> = 0.643123
8: Layer 7, <cos_sim> = 0.684008
9: Layer 9, <cos_sim> = 0.708513
10: Layer 10, <cos_sim> = 0.718472
11: Layer 13, <cos_sim> = 0.770861
12: Layer 12, <cos_sim> = 0.786273
13: Layer 44, <cos_sim> = 0.811898
14: Layer 14, <cos_sim> = 0.832882
15: Layer 11, <cos_sim> = 0.841347
16: Layer 16, <cos_sim> = 0.847809
17: Layer 17, <cos_sim> = 0.867317
18: Layer 18, <cos_sim> = 0.875668
19: Layer 15, <cos_sim> = 0.886359
20: Layer 19, <cos_sim> = 0.932629
21: Layer 21, <cos_sim> = 0.935681
22: Layer 20, <cos_sim> = 0.936905
23: Layer 22, <cos_sim> = 0.94295
24: Layer 23, <cos_sim> = 0.944582
25: Layer 27, <cos_sim> = 0.947721
26: Layer 24, <cos_sim> = 0.95027
27: Layer 25, <cos_sim> = 0.952
28: Layer 43, <cos_sim> = 0.953131
29: Layer 35, <cos_sim> = 0.954686
30: Layer 31, <cos_sim> = 0.954798
31: Layer 38, <cos_sim> = 0.958932
32: Layer 26, <cos_sim> = 0.960332
33: Layer 37, <cos_sim> = 0.960368
34: Layer 28, <cos_sim> = 0.96127
35: Layer 29, <cos_sim> = 0.961706
36: Layer 34, <cos_sim> = 0.962314
37: Layer 36, <cos_sim> = 0.964392
38: Layer 32, <cos_sim> = 0.965215
39: Layer 33, <cos_sim> = 0.9656
40: Layer 39, <cos_sim> = 0.965828
41: Layer 41, <cos_sim> = 0.966507
42: Layer 30, <cos_sim> = 0.966721
43: Layer 42, <cos_sim> = 0.967369
44: Layer 40, <cos_sim> = 0.970084

llama_print_timings: load time = 89422.20 ms
llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
llama_print_timings: prompt eval time = 3082057.10 ms / 415744 tokens ( 7.41 ms per token, 134.89 tokens per second)
llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
llama_print_timings: total time = 3182699.59 ms / 415745 tokens
logs/perplexity-Step-3.5-Flash-BF16.log
ADDED
#!/usr/bin/env bash

# echo 0 | sudo tee /proc/sys/kernel/numa_balancing
# sudo sync; echo 3 | sudo tee /proc/sys/vm/drop_caches

model=/mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/Step-3.5-Flash-288x7.4B-BF16-00001-of-00009.gguf
#model=/mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/Step-3.5-Flash-Q8_0.gguf
#model=/mnt/data/models/stepfun-ai/Step-3.5-Flash-Int4/step3p5_flash_Q4_K_S-00001-of-00012.gguf
#model=/mnt/raid/hf/Step-3.5-Flash-GGUF/IQ4_XS/Step-3.5-Flash-IQ4_XS-00001-of-00004.gguf
#model=/mnt/raid/hf/Step-3.5-Flash-GGUF/IQ5_K/Step-3.5-Flash-IQ5_K-00001-of-00004.gguf
#model=/mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/Step-3.5-Flash-IQ3_KS.gguf
#model=/mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/Step-3.5-Flash-smol-IQ3_KS.gguf
#model=/mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/Step-3.5-Flash-IQ2_KL.gguf

# Check if the SOCKET variable is unset or empty.
if [[ -z "${SOCKET}" ]]; then
    # If it is, print an error to standard error and exit with a non-zero status.
    echo "Error: The SOCKET environment variable is not set." >&2
    exit 1
else
    # If it is set, print its value and continue.
    echo "SOCKET is set to: ${SOCKET}"
fi
SOCKET="${SOCKET}"

numactl -N "$SOCKET" -m "$SOCKET" \
    ./build/bin/llama-perplexity \
    -m "$model" \
    -f wiki.test.raw \
    --seed 1337 \
    --ctx-size 512 \
    -ub 4096 -b 4096 \
    --numa numactl \
    --threads 96 \
    --threads-batch 128 \
    --validate-quants \
    --no-mmap

SOCKET is set to: 1
main: build = 4186 (82c4f273)
main: built with cc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 for x86_64-linux-gnu
main: seed = 1337
CPU: using device CPU - 0 MiB free
llama_model_loader: additional 8 GGUFs metadata loaded.
llama_model_loader: loaded meta data with 50 key-value pairs and 754 tensors from /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/Step-3.5-Flash-288x7.4B-BF16-00001-of-00009.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = step35
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.name str = Step 3.5 Flash
llama_model_loader: - kv 3: general.size_label str = 288x7.4B
llama_model_loader: - kv 4: general.license str = apache-2.0
llama_model_loader: - kv 5: general.base_model.count u32 = 1
llama_model_loader: - kv 6: general.base_model.0.name str = Step 3.5 Flash
llama_model_loader: - kv 7: general.base_model.0.organization str = Stepfun Ai
llama_model_loader: - kv 8: general.base_model.0.repo_url str = https://huggingface.co/stepfun-ai/ste...
llama_model_loader: - kv 9: step35.block_count u32 = 45
llama_model_loader: - kv 10: step35.context_length u32 = 262144
llama_model_loader: - kv 11: step35.embedding_length u32 = 4096
llama_model_loader: - kv 12: step35.feed_forward_length u32 = 11264
llama_model_loader: - kv 13: step35.attention.head_count arr[i32,45] = [64, 96, 96, 96, 64, 96, 96, 96, 64, ...
llama_model_loader: - kv 14: step35.rope.freq_base f32 = 5000000.000000
llama_model_loader: - kv 15: step35.rope.freq_base_swa f32 = 10000.000000
llama_model_loader: - kv 16: step35.expert_gating_func u32 = 2
llama_model_loader: - kv 17: step35.attention.key_length u32 = 128
llama_model_loader: - kv 18: step35.attention.value_length u32 = 128
llama_model_loader: - kv 19: general.file_type u32 = 32
llama_model_loader: - kv 20: step35.attention.head_count_kv arr[i32,45] = [8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, ...
llama_model_loader: - kv 21: step35.attention.sliding_window u32 = 512
llama_model_loader: - kv 22: step35.attention.sliding_window_pattern arr[i32,45] = [0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, ...
llama_model_loader: - kv 23: step35.expert_count u32 = 288
llama_model_loader: - kv 24: step35.expert_used_count u32 = 8
llama_model_loader: - kv 25: step35.expert_feed_forward_length u32 = 1280
llama_model_loader: - kv 26: step35.expert_shared_feed_forward_length u32 = 1280
llama_model_loader: - kv 27: step35.expert_weights_scale f32 = 3.000000
llama_model_loader: - kv 28: step35.expert_weights_norm bool = true
llama_model_loader: - kv 29: step35.leading_dense_block_count u32 = 3
llama_model_loader: - kv 30: step35.moe_every_n_layers u32 = 1
llama_model_loader: - kv 31: step35.attention.layer_norm_rms_epsilon f32 = 0.000010
llama_model_loader: - kv 32: step35.swiglu_clamp_exp arr[f32,45] = [0.000000, 0.000000, 0.000000, 0.0000...
llama_model_loader: - kv 33: step35.swiglu_clamp_shexp arr[f32,45] = [0.000000, 0.000000, 0.000000, 0.0000...
llama_model_loader: - kv 34: general.quantization_version u32 = 2
llama_model_loader: - kv 35: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 36: tokenizer.ggml.pre str = deepseek-v3
llama_model_loader: - kv 37: tokenizer.ggml.tokens arr[str,128896] = ["<|begin▁of▁sentence|>", "<�...
llama_model_loader: - kv 38: tokenizer.ggml.token_type arr[i32,128896] = [3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 39: tokenizer.ggml.merges arr[str,127741] = ["Ġ t", "Ġ a", "i n", "Ġ Ġ", "h e...
llama_model_loader: - kv 40: tokenizer.ggml.bos_token_id u32 = 0
llama_model_loader: - kv 41: tokenizer.ggml.eos_token_id u32 = 128007
llama_model_loader: - kv 42: tokenizer.ggml.padding_token_id u32 = 1
llama_model_loader: - kv 43: tokenizer.ggml.add_bos_token bool = true
llama_model_loader: - kv 44: tokenizer.ggml.add_sep_token bool = false
llama_model_loader: - kv 45: tokenizer.ggml.add_eos_token bool = false
llama_model_loader: - kv 46: tokenizer.chat_template str = {% macro render_content(content) %}{%...
llama_model_loader: - kv 47: split.no u16 = 0
llama_model_loader: - kv 48: split.count u16 = 9
llama_model_loader: - kv 49: split.tensors.count i32 = 754
llama_model_loader: - type f32: 266 tensors
llama_model_loader: - type bf16: 488 tensors
load: printing all EOG tokens:
load: - 128007 ('<|im_end|>')
load: special tokens cache size = 818
load: token to piece cache size = 0.8220 MB
llm_load_print_meta: format = GGUF V3 (latest)
llm_load_print_meta: arch = step35
llm_load_print_meta: n_ctx_train = 262144
llm_load_print_meta: n_embd = 4096
llm_load_print_meta: n_layer = 45
llm_load_print_meta: n_head = [64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64]
llm_load_print_meta: n_head_kv = 8
llm_load_print_meta: n_rot = 128
llm_load_print_meta: n_swa = 512
llm_load_print_meta: n_swa_pattern = 1
llm_load_print_meta: n_embd_head_k = 128
llm_load_print_meta: n_embd_head_v = 128
llm_load_print_meta: n_gqa = [8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8]
llm_load_print_meta: n_embd_k_gqa = 1024
llm_load_print_meta: n_embd_v_gqa = 1024
llm_load_print_meta: f_norm_eps = 0.0e+00
llm_load_print_meta: f_norm_rms_eps = 1.0e-05
llm_load_print_meta: f_clamp_kqv = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: f_logit_scale = 0.0e+00
llm_load_print_meta: n_ff = 11264
llm_load_print_meta: n_expert = 288
llm_load_print_meta: n_expert_used = 8
llm_load_print_meta: causal attn = 1
llm_load_print_meta: pooling type = 0
llm_load_print_meta: rope type = 2
llm_load_print_meta: rope scaling = linear
llm_load_print_meta: freq_base_train = 5000000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_ctx_orig_yarn = 262144
llm_load_print_meta: rope_finetuned = unknown
llm_load_print_meta: ssm_d_conv = 0
llm_load_print_meta: ssm_d_inner = 0
llm_load_print_meta: ssm_d_state = 0
llm_load_print_meta: ssm_dt_rank = 0
llm_load_print_meta: model type = ?B
llm_load_print_meta: model ftype = BF16
llm_load_print_meta: model params = 196.956 B
llm_load_print_meta: model size = 366.952 GiB (16.004 BPW)
llm_load_print_meta: repeating layers = 364.986 GiB (16.004 BPW, 195.900 B parameters)
llm_load_print_meta: general.name = Step 3.5 Flash
print_info: vocab type = BPE
print_info: n_vocab = 128896
print_info: n_merges = 127741
print_info: BOS token = 0 '<|begin▁of▁sentence|>'
print_info: EOS token = 128007 '<|im_end|>'
print_info: EOT token = 128007 '<|im_end|>'
print_info: PAD token = 1 '<|end▁of▁sentence|>'
print_info: LF token = 201 'Ċ'
print_info: FIM PRE token = 128801 '<|fim▁begin|>'
print_info: FIM SUF token = 128800 '<|fim▁hole|>'
print_info: FIM MID token = 128802 '<|fim▁end|>'
print_info: EOG token = 128007 '<|im_end|>'
print_info: max token length = 256
llm_load_tensors: ggml ctx size = 0.31 MiB
llm_load_tensors: offloading 0 repeating layers to GPU
llm_load_tensors: offloaded 0/46 layers to GPU
llm_load_tensors: CPU buffer size = 375759.27 MiB
....................................................................................................
llama_new_context_with_model: n_ctx = 4096
llama_new_context_with_model: n_batch = 4096
llama_new_context_with_model: n_ubatch = 4096
llama_new_context_with_model: flash_attn = 1
llama_new_context_with_model: attn_max_b = 0
llama_new_context_with_model: fused_moe = 1
llama_new_context_with_model: grouped er = 0
llama_new_context_with_model: fused_up_gate = 1
llama_new_context_with_model: fused_mmad = 1
llama_new_context_with_model: rope_cache = 0
llama_new_context_with_model: graph_reuse = 1
llama_new_context_with_model: k_cache_hadam = 0
llama_new_context_with_model: split_mode_graph_scheduling = 0
llama_new_context_with_model: reduce_type = f16
llama_new_context_with_model: sched_async = 0
llama_new_context_with_model: ser = -1, 0
llama_new_context_with_model: freq_base = 5000000.0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init: CPU KV buffer size = 720.00 MiB
llama_new_context_with_model: KV self size = 720.00 MiB, K (f16): 360.00 MiB, V (f16): 360.00 MiB
llama_new_context_with_model: CPU output buffer size = 3.93 MiB
llama_new_context_with_model: CPU compute buffer size = 2078.00 MiB
llama_new_context_with_model: graph nodes = 2201
llama_new_context_with_model: graph splits = 1
XXXXXXXXXXXXXXXXXXXXX Setting only active experts offload

system_info: n_threads = 96 (n_threads_batch = 128) / 512 | AVX = 1 | AVX_VNNI = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | AVX512_BF16 = 1 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 |
perplexity: tokenizing the input ..
perplexity: tokenization took 723.567 ms
perplexity: calculating perplexity over 561 chunks, n_ctx=512, batch_size=4096, n_seq=8
perplexity: 15.47 seconds per pass - ETA 18.07 minutes
===================================== llama_new_context_with_model: f16
======================================= HAVE_FANCY_SIMD is defined
[1]1.5125,[2]1.9280,[3]1.6178,[4]1.4760,[5]1.4000,[6]1.3378,[7]1.3006,[8]1.2759,[9]1.2557,[10]1.2356,[11]1.2434,[12]1.2544,[13]1.2647,[14]1.3110,[15]1.3541,[16]1.3996,[17]1.5016,[18]1.5846,[19]1.5731,[20]1.5561,[21]1.5612,[22]1.5507,[23]1.5331,[24]1.5278,[25]1.5172,[26]1.5098,[27]1.5009,[28]1.4958,[29]1.4917,[30]1.4977,[31]1.4967,[32]1.4862,[33]1.4809,[34]1.4913,[35]1.4946,[36]1.5053,[37]1.5355,[38]1.5694,[39]1.6011,[40]1.6472,[41]1.6760,[42]1.6825,[43]1.7172,[44]1.7369,[45]1.7776,[46]1.8155,[47]1.8165,[48]1.8110,[49]1.8061,[50]1.7953,[51]1.8169,[52]1.8152,[53]1.8328,[54]1.8418,[55]1.8548,[56]1.8631,[57]1.8643,[58]1.8699,[59]1.8768,[60]1.8919,[61]1.8872,[62]1.9143,[63]1.9293,[64]1.9433,[65]1.9454,[66]1.9417,[67]1.9391,[68]1.9453,[69]1.9449,[70]1.9460,[71]1.9421,[72]1.9417,[73]1.9501,[74]1.9627,[75]1.9628,[76]1.9498,[77]1.9417,[78]1.9370,[79]1.9331,[80]1.9279,[81]1.9234,[82]1.9262,[83]1.9219,[84]1.9180,[85]1.9127,[86]1.9152,[87]1.9229,[88]1.9163,[89]1.9171,[90]1.9167,[91]1.9127,[92]1.9089,[93]1.9056,[94]1.9002,[95]1.9012,[96]1.9053,[97]1.9162,[98]1.9152,[99]1.9091,[100]1.9069,[101]1.9066,[102]1.9153,[103]1.9198,[104]1.9363,[105]1.9437,[106]1.9683,[107]1.9908,[108]2.0093,[109]2.0375,[110]2.0637,[111]2.0878,[112]2.0815,[113]2.0835,[114]2.0887,[115]2.0900,[116]2.0982,[117]2.0991,[118]2.1001,[119]2.0971,[120]2.0958,[121]2.0988,[122]2.0957,[123]2.0949,[124]2.0910,[125]2.0874,[126]2.0863,[127]2.0868,[128]2.0853,[129]2.0883,[130]2.0891,[131]2.0895,[132]2.0910,[133]2.1011,[134]2.1063,[135]2.1041,[136]2.1007,[137]2.0981,[138]2.0948,[139]2.0931,[140]2.0920,[141]2.0920,[142]2.0917,[143]2.0939,[144]2.0946,[145]2.0887,[146]2.0841,[147]2.0816,[148]2.0776,[149]2.0759,[150]2.0711,[151]2.0657,[152]2.0632,[153]2.0603,[154]2.0590,[155]2.0579,[156]2.0557,[157]2.0555,[158]2.0547,[159]2.0544,[160]2.0526,[161]2.0621,[162]2.0724,[163]2.0757,[164]2.0811,[165]2.0870,[166]2.0970,[167]2.0994,[168]2.1125,[169]2.1201,[170]2.1320,[171]2.1389,[172]2.1361,[173]2.1299,[174]2.1337,[175]2.1363,[176]2.1380,[177]2.1385,[178]2.1385,[179]2.1403,[180]2.1426,[181]2.1545,[182]2.1659,[183]2.1785,[184]2.1920,[185]2.2015,[186]2.2151,[187]2.2300,[188]2.2433,[189]2.2494,[190]2.2499,[191]2.2526,[192]2.2557,[193]2.2550,[194]2.2580,[195]2.2577,[196]2.2627,[197]2.2682,[198]2.2709,[199]2.2707,[200]2.2704,[201]2.2808,[202]2.2752,[203]2.2756,[204]2.2758,[205]2.2770,[206]2.2777,[207]2.2782,[208]2.2809,[209]2.2834,[210]2.2825,[211]2.2798,[212]2.2796,[213]2.2796,[214]2.2784,[215]2.2749,[216]2.2745,[217]2.2700,[218]2.2682,[219]2.2687,[220]2.2680,[221]2.2684,[222]2.2646,[223]2.2630,[224]2.2663,[225]2.2666,[226]2.2632,[227]2.2648,[228]2.2669,[229]2.2686,[230]2.2755,[231]2.2821,[232]2.2807,[233]2.2788,[234]2.2786,[235]2.2789,[236]2.2814,[237]2.2857,[238]2.2896,[239]2.2969,[240]2.3025,[241]2.3097,[242]2.3165,[243]2.3226,[244]2.3274,[245]2.3365,[246]2.3413,[247]2.3413,[248]2.3395,[249]2.3395,[250]2.3363,[251]2.3351,[252]2.3388,[253]2.3440,[254]2.3505,[255]2.3528,[256]2.3542,[257]2.3561,[258]2.3563,[259]2.3555,[260]2.3564,[261]2.3564,[262]2.3564,[263]2.3570,[264]2.3558,[265]2.3556,[266]2.3568,[267]2.3584,[268]2.3604,[269]2.3629,[270]2.3620,[271]2.3645,[272]2.3624,[273]2.3610,[274]2.3581,[275]2.3584,[276]2.3541,[277]2.3568,[278]2.3644,[279]2.3720,[280]2.3785,[281]2.3816,[282]2.3827,[283]2.3869,[284]2.3908,[285]2.3995,[286]2.3997,[287]2.4026,[288]2.4078,[289]2.4094,[290]2.4075,[291]2.4083,[292]2.4165,[293]2.4195,[294]2.4216,[295]2.4238,[296]2.4268,[297]2.4273,[298]2.4297,[299]2.4306,[300]2.4315,[301]2.4335,[302]2.4351,[303]2.4356,[304]2.4357,[305]2.4437,[306]2.4474,[307]2.4559,[308]2.4506,[309]2.4480,[310]2.4432,[311]2.4426,[312]2.4399,[313]2.4376,[314]2.4357,[315]2.4354,[316]2.4353,[317]2.4330,[318]2.4308,[319]2.4298,[320]2.4300,[321]2.4270,[322]2.4274,[323]2.4282,[324]2.4255,[325]2.4236,[326]2.4203,[327]2.4176,[328]2.4185,[329]2.4184,[330]2.4218,[331]2.4228,[332]2.4261,[333]2.4255,[334]2.4253,[335]2.4257,[336]2.4261,[337]2.4274,[338]2.4280,[339]2.4294,[340]2.4319,[341]2.4356,[342]2.4404,[343]2.4458,[344]2.4486,[345]2.4474,[346]2.4446,[347]2.4455,[348]2.4445,[349]2.4417,[350]2.4408,[351]2.4423,[352]2.4415,[353]2.4421,[354]2.4420,[355]2.4420,[356]2.4403,[357]2.4410,[358]2.4415,[359]2.4387,[360]2.4372,[361]2.4374,[362]2.4370,[363]2.4360,[364]2.4361,[365]2.4331,[366]2.4331,[367]2.4333,[368]2.4315,[369]2.4314,[370]2.4304,[371]2.4320,[372]2.4343,[373]2.4323,[374]2.4299,[375]2.4292,[376]2.4321,[377]2.4358,[378]2.4335,[379]2.4320,[380]2.4310,[381]2.4324,[382]2.4333,[383]2.4354,[384]2.4386,[385]2.4416,[386]2.4447,[387]2.4495,[388]2.4516,[389]2.4481,[390]2.4448,[391]2.4411,[392]2.4397,[393]2.4389,[394]2.4375,[395]2.4345,[396]2.4323,[397]2.4286,[398]2.4258,[399]2.4222,[400]2.4188,[401]2.4144,[402]2.4113,[403]2.4074,[404]2.4042,[405]2.4002,[406]2.3964,[407]2.3934,[408]2.3907,[409]2.3869,[410]2.3861,[411]2.3874,[412]2.3864,[413]2.3888,[414]2.3894,[415]2.3861,[416]2.3825,[417]2.3850,[418]2.3814,[419]2.3801,[420]2.3776,[421]2.3747,[422]2.3706,[423]2.3670,[424]2.3663,[425]2.3636,[426]2.3602,[427]2.3576,[428]2.3562,[429]2.3537,[430]2.3505,[431]2.3470,[432]2.3453,[433]2.3430,[434]2.3409,[435]2.3391,[436]2.3380,[437]2.3377,[438]2.3381,[439]2.3395,[440]2.3425,[441]2.3478,[442]2.3534,[443]2.3516,[444]2.3511,[445]2.3515,[446]2.3537,[447]2.3564,[448]2.3580,[449]2.3595,[450]2.3612,[451]2.3634,[452]2.3642,[453]2.3656,[454]2.3641,[455]2.3664,[456]2.3674,[457]2.3700,[458]2.3738,[459]2.3739,[460]2.3745,[461]2.3727,[462]2.3734,[463]2.3768,[464]2.3811,[465]2.3792,[466]2.3804,[467]2.3820,[468]2.3835,[469]2.3839,[470]2.3849,[471]2.3872,[472]2.3892,[473]2.3895,[474]2.3912,[475]2.3928,[476]2.3930,[477]2.3936,[478]2.3945,[479]2.3961,[480]2.3975,[481]2.3948,[482]2.3958,[483]2.3948,[484]2.3977,[485]2.4024,[486]2.4038,[487]2.4061,[488]2.4079,[489]2.4099,[490]2.4128,[491]2.4155,[492]2.4188,[493]2.4186,[494]2.4172,[495]2.4168,[496]2.4166,[497]2.4169,[498]2.4168,[499]2.4157,[500]2.4170,[501]2.4207,[502]2.4199,[503]2.4202,[504]2.4209,[505]2.4228,[506]2.4245,[507]2.4259,[508]2.4280,[509]2.4251,[510]2.4246,[511]2.4238,[512]2.4222,[513]2.4199,[514]2.4195,[515]2.4192,[516]2.4170,[517]2.4164,[518]2.4161,[519]2.4153,[520]2.4149,[521]2.4149,[522]2.4137,[523]2.4146,[524]2.4141,[525]2.4148,[526]2.4135,[527]2.4115,[528]2.4113,[529]2.4105,[530]2.4100,[531]2.4090,[532]2.4065,[533]2.4042,[534]2.4025,[535]2.4023,[536]2.4037,[537]2.4056,[538]2.4072,[539]2.4089,[540]2.4121,[541]2.4149,[542]2.4175,[543]2.4190,[544]2.4184,[545]2.4186,[546]2.4160,[547]2.4136,[548]2.4108,[549]2.4085,[550]2.4071,[551]2.4058,[552]2.4042,[553]2.4031,[554]2.4033,[555]2.4028,[556]2.4056,[557]2.4078,[558]2.4111,[559]2.4132,[560]2.4173,[561]2.4169,
llama_print_timings: load time = 168747.99 ms
llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
llama_print_timings: prompt eval time = 879771.94 ms / 287232 tokens ( 3.06 ms per token, 326.48 tokens per second)
llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
llama_print_timings: total time = 890647.44 ms / 287233 tokens

Final estimate: PPL over 561 chunks for n_ctx=512 = 2.4169 +/- 0.01107