ubergarm committed on
Commit
084e588
·
1 Parent(s): ba689ec

add some perplexity data

README.md CHANGED
@@ -11,11 +11,6 @@ tags:
11
  - step3p5
12
  ---
13
 
14
- ## WIP
15
- Only one test quant for now, a custom `IQ4_XS` which runs on both mainline llama.cpp and [ik_llama.cpp now that this was just merged to main](https://github.com/ikawrakow/ik_llama.cpp/pull/1240).
16
-
17
- I'm cooking imatrix now and planning to release some more ik_llama.cpp quants on Saturday!
18
-
19
  ## `ik_llama.cpp` imatrix Quantizations of stepfun-ai/Step-3.5-Flash
20
  *NOTE*: `ik_llama.cpp` can also run your existing GGUFs from bartowski, unsloth, mradermacher, etc. if you want to try it out before downloading my quants.
21
 
@@ -35,31 +30,72 @@ Perplexity computed against *wiki.test.raw*. (lower is "better")
35
 
36
  ![Perplexity Chart](images/perplexity.png "Chart showing Perplexity vs Model Size.")
37
 
38
- These two are just a test quants for baseline perplexity comparison:
39
  * `BF16` 366.952 GiB (16.004 BPW)
40
- - TODO
41
  * `Q8_0` 195.031 GiB (8.506 BPW)
42
- - TODO
43
 
44
  *NOTE*: The first split file is much smaller on purpose since it only contains metadata; it's fine!
45
 
46
- ## IQ5_K TODO
47
- TODO
48
 
49
  <details>
50
 
51
  <summary>👈 Secret Recipe</summary>
52
 
53
  ```bash
54
- echo TODO
55
  ```
56
 
57
  </details>
58
 
59
  ## IQ4_XS 100.53 GiB (4.38 BPW)
60
- TODO
61
 
62
- *NOTE*: This is the first test quant and does not use imatrix. It is compatible with mainline llama.cpp as well.
63
 
64
  <details>
65
 
@@ -111,33 +147,61 @@ numactl -N ${SOCKET} -m ${SOCKET} \
111
 
112
  </details>
113
 
114
- ## IQ4_KSS TODO
115
- TODO
116
 
117
  <details>
118
 
119
  <summary>👈 Secret Recipe</summary>
120
 
121
  ```bash
122
- echo TODO
123
- ```
124
 
125
- </details>
126
 
127
- ## IQ3_KS TODO
128
- TODO
129
 
130
- <details>
131
 
132
- <summary>👈 Secret Recipe</summary>
133
 
134
- ```bash
135
- echo TODO
136
  ```
137
 
138
  </details>
139
 
140
- ## IQ2_KS TODO
141
  TODO
142
 
143
  <details>
@@ -185,9 +249,9 @@ numactl -N "$SOCKET" -m "$SOCKET" \
185
  --jinja
186
  ```
187
 
188
- For tool use you can always bring your own template with `--jinja --chat-template-file myTemplate.jinja` and might need `--special` etc. The chat template baked into these GGUFs from the [original one](https://huggingface.co/stepfun-ai/Step-3.5-Flash/blob/main/chat_template.jinja). However just for tool use, it is possible [to copy paste the line out of this one](https://huggingface.co/stepfun-ai/Step-3.5-Flash-Int4/blob/main/step3p5_flash_Q4_K_S-00001-of-00012.gguf) but seems to mess it up for normal usage.
189
 
190
- Another option is to check out [pwilkin's autoparser branch](https://github.com/ggml-org/llama.cpp/pull/18675) which might work best in many cases.
191
 
192
  ## References
193
  * [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp)
 
11
  - step3p5
12
  ---
13
 
14
  ## `ik_llama.cpp` imatrix Quantizations of stepfun-ai/Step-3.5-Flash
15
  *NOTE*: `ik_llama.cpp` can also run your existing GGUFs from bartowski, unsloth, mradermacher, etc. if you want to try it out before downloading my quants.
16
 
 
30
 
31
  ![Perplexity Chart](images/perplexity.png "Chart showing Perplexity vs Model Size.")
32
 
33
+ These two are just test quants for baseline perplexity comparison and are not available for download here:
34
  * `BF16` 366.952 GiB (16.004 BPW)
35
+ - PPL over 561 chunks for n_ctx=512 = 2.4169 +/- 0.01107
36
  * `Q8_0` 195.031 GiB (8.506 BPW)
37
+ - PPL over 561 chunks for n_ctx=512 = 2.4188 +/- 0.01109
38
 
39
  *NOTE*: The first split file is much smaller on purpose since it only contains metadata; it's fine!
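The PPL figures quoted throughout this card follow the standard definition: perplexity is the exponential of the mean per-token negative log-likelihood over all evaluated chunks. A minimal sketch with made-up per-chunk NLL values (the numbers are illustrative, not from these runs):

```shell
# Illustrative per-chunk mean negative log-likelihoods in nats (made up);
# llama-perplexity reports exp(mean NLL) aggregated over all chunks.
nll="0.88 0.89 0.87 0.88"
ppl=$(awk -v vals="$nll" 'BEGIN {
  n = split(vals, a, " ")
  s = 0
  for (i = 1; i <= n; i++) s += a[i]
  printf "%.4f", exp(s / n)   # perplexity = exp(mean NLL)
}')
echo "PPL = $ppl"
```

Lower is better because a lower mean NLL means the model assigned higher probability to the reference text.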
40
 
41
+ ## IQ5_K 136.891 GiB (5.970 BPW)
42
+ PPL over 561 chunks for n_ctx=512 = 2.4304 +/- 0.01117
43
 
44
  <details>
45
 
46
  <summary>👈 Secret Recipe</summary>
47
 
48
  ```bash
49
+ #!/usr/bin/env bash
50
+
51
+ custom="
52
+ # 45 Repeating Layers [0-44]
53
+
54
+ # Attention [0-44] GPU
55
+ blk\..*\.attn_gate.*=q8_0
56
+ blk\..*\.attn_q.*=q8_0
57
+ blk\..*\.attn_k.*=q8_0
58
+ blk\..*\.attn_v.*=q8_0
59
+ blk\..*\.attn_output.*=q8_0
60
+
61
+ # First 3 Dense Layers [0-2] GPU
62
+ blk\..*\.ffn_down\.weight=q8_0
63
+ blk\..*\.ffn_(gate|up)\.weight=q8_0
64
+
65
+ # Shared Expert Layers [3-44] GPU
66
+ blk\..*\.ffn_down_shexp\.weight=q8_0
67
+ blk\..*\.ffn_(gate|up)_shexp\.weight=q8_0
68
+
69
+ # Routed Experts Layers [3-44] CPU
70
+ blk\..*\.ffn_down_exps\.weight=iq6_k
71
+ blk\..*\.ffn_(gate|up)_exps\.weight=iq5_k
72
+
73
+ # Non-Repeating Layers
74
+ token_embd\.weight=q8_0
75
+ output\.weight=q8_0
76
+ "
77
+
78
+ custom=$(
79
+ echo "$custom" | grep -v '^#' | \
80
+ sed -Ez 's:\n+:,:g;s:,$::;s:^,::'
81
+ )
82
+
83
+ numactl -N ${SOCKET} -m ${SOCKET} \
84
+ ./build/bin/llama-quantize \
85
+ --custom-q "$custom" \
86
+ --imatrix /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat \
87
+ /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/Step-3.5-Flash-288x7.4B-BF16-00001-of-00009.gguf \
88
+ /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/Step-3.5-Flash-IQ5_K.gguf \
89
+ IQ5_K \
90
+ 128
91
  ```
92
 
93
  </details>
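The `custom=$( ... )` pipeline in the recipe above just turns the commented, one-rule-per-line block into the single comma-separated list that `--custom-q` expects. A standalone sketch of that preprocessing with a two-rule example (GNU sed's `-z` null-separated mode is assumed, as in the recipe):

```shell
# Two example tensor-regex=qtype rules plus a comment, one per line.
custom="
# attention stays high precision
blk\..*\.attn_q.*=q8_0
blk\..*\.ffn_down_exps\.weight=iq6_k
"

# Drop comment lines, then collapse newlines into commas and trim the
# leading/trailing commas -- the same pipeline the quantization recipes use.
custom=$(
  echo "$custom" | grep -v '^#' | \
  sed -Ez 's:\n+:,:g;s:,$::;s:^,::'
)
echo "$custom"
```

The result is a single `regex=qtype,regex=qtype,...` string suitable for `llama-quantize --custom-q`.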
94
 
95
  ## IQ4_XS 100.53 GiB (4.38 BPW)
96
+ PPL over 561 chunks for n_ctx=512 = 2.5181 +/- 0.01178
97
 
98
+ *NOTE*: This mainline-compatible quant does not use an imatrix.
99
 
100
  <details>
101
 
 
147
 
148
  </details>
149
 
150
+ ## smol-IQ3_KS 75.934 GiB (3.312 BPW)
151
+ PPL over 561 chunks for n_ctx=512 = 2.7856 +/- 0.01365
152
 
153
  <details>
154
 
155
  <summary>👈 Secret Recipe</summary>
156
 
157
  ```bash
158
+ #!/usr/bin/env bash
 
159
 
160
+ custom="
161
+ # 45 Repeating Layers [0-44]
162
 
163
+ # Attention [0-44] GPU
164
+ blk\..*\.attn_gate.*=iq6_k
165
+ blk\..*\.attn_q.*=iq6_k
166
+ blk\..*\.attn_k.*=iq6_k
167
+ blk\..*\.attn_v.*=iq6_k
168
+ blk\..*\.attn_output.*=iq6_k
169
 
170
+ # First 3 Dense Layers [0-2] GPU
171
+ blk\..*\.ffn_down\.weight=iq6_k
172
+ blk\..*\.ffn_(gate|up)\.weight=iq6_k
173
 
174
+ # Shared Expert Layers [3-44] GPU
175
+ blk\..*\.ffn_down_shexp\.weight=iq6_k
176
+ blk\..*\.ffn_(gate|up)_shexp\.weight=iq6_k
177
 
178
+ # Routed Experts Layers [3-44] CPU
179
+ blk\..*\.ffn_down_exps\.weight=iq3_ks
180
+ blk\..*\.ffn_(gate|up)_exps\.weight=iq3_ks
181
+
182
+ # Non-Repeating Layers
183
+ token_embd\.weight=iq4_k
184
+ output\.weight=iq6_k
185
+ "
186
+
187
+ custom=$(
188
+ echo "$custom" | grep -v '^#' | \
189
+ sed -Ez 's:\n+:,:g;s:,$::;s:^,::'
190
+ )
191
+
192
+ numactl -N ${SOCKET} -m ${SOCKET} \
193
+ ./build/bin/llama-quantize \
194
+ --custom-q "$custom" \
195
+ --imatrix /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat \
196
+ /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/Step-3.5-Flash-288x7.4B-BF16-00001-of-00009.gguf \
197
+ /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/Step-3.5-Flash-smol-IQ3_KS.gguf \
198
+ IQ3_KS \
199
+ 128
200
  ```
201
 
202
  </details>
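As a sanity check, a quant's on-disk size can be estimated from its reported bits-per-weight: bytes ≈ params × BPW / 8. Using the ~196.956 B parameter count from the BF16 log and the IQ4_XS figure of 4.38 BPW (small rounding differences against the listed 100.53 GiB are expected, since BPW itself is rounded):

```shell
# Estimate model file size in GiB from parameter count and bits-per-weight.
params=196956000000   # ~196.956 B parameters, per the BF16 log below
bpw=4.38              # reported BPW for the IQ4_XS quant
size=$(awk -v p="$params" -v b="$bpw" \
    'BEGIN { printf "%.2f GiB", p * b / 8 / (1024 ^ 3) }')
echo "$size"
```

The same arithmetic works for any of the quants listed above.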
203
 
204
+ ## smol-IQ2_KS TODO
205
  TODO
206
 
207
  <details>
 
249
  --jinja
250
  ```
251
 
252
+ For tool use you can always bring your own template with `--chat-template-file myTemplate.jinja` and might need `--special` etc. The chat template baked into these GGUFs comes from the [original one](https://huggingface.co/stepfun-ai/Step-3.5-Flash/blob/main/chat_template.jinja).
253
 
254
+ Another option for mainline tool-calling users is to check out [pwilkin's autoparser branch](https://github.com/ggml-org/llama.cpp/pull/18675).
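As a concrete sketch of the bring-your-own-template route, you write any Jinja template to a file and hand it to the server with `--jinja --chat-template-file`. The template below is a deliberately trivial stand-in, not the real Step-3.5-Flash template:

```shell
# Write a minimal, purely illustrative chat template (NOT the real
# Step-3.5-Flash one) that could then be passed as e.g.:
#   llama-server --jinja --chat-template-file myTemplate.jinja ...
cat > myTemplate.jinja <<'EOF'
{%- for message in messages -%}
<|{{ message.role }}|>{{ message.content }}<|im_end|>
{%- endfor -%}
EOF
wc -c < myTemplate.jinja
```

`<|im_end|>` is used here because the BF16 log shows it is this model's EOG token; a real template must of course match the model's full expected formatting.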
255
 
256
  ## References
257
  * [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp)
images/perplexity.png ADDED

Git LFS Details

  • SHA256: dc7c397099cb15347d757c12a474fb9cacc5ba8f13c3e9922946b7c7c777d95e
  • Pointer size: 131 Bytes
  • Size of remote file: 208 kB
logs/imatrix-Step-3.5-Flash-BF16.log ADDED
@@ -0,0 +1,780 @@
1
+ numactl -N 0 -m 0 ./build/bin/llama-imatrix --model /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/Step-3.5-Flash-288x7.4B-BF16-00001-of-00009.gguf -f ubergarm-imatrix-calibration-corpus-v02.txt -o /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat --no-fused-moe --no-fused-up-gate --no-fused-mul-multiadd --ctx-size 512 -ub 4096 -b 4096 --threads 96 --threads-batch 128 --no-mmap --numa numactl --verbosity 1 --layer-similarity
2
+
3
+ CPU: using device CPU - 0 MiB free
4
+ llama_model_loader: additional 8 GGUFs metadata loaded.
5
+ llama_model_loader: loaded meta data with 50 key-value pairs and 754 tensors from /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/Step-3.5-Flash-288x7.4B-BF16-00001-of-00009.gguf (version GGUF V3 (latest))
6
+ llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
7
+ llama_model_loader: - kv 0: general.architecture str = step35
8
+ llama_model_loader: - kv 1: general.type str = model
9
+ llama_model_loader: - kv 2: general.name str = Step 3.5 Flash
10
+ llama_model_loader: - kv 3: general.size_label str = 288x7.4B
11
+ llama_model_loader: - kv 4: general.license str = apache-2.0
12
+ llama_model_loader: - kv 5: general.base_model.count u32 = 1
13
+ llama_model_loader: - kv 6: general.base_model.0.name str = Step 3.5 Flash
14
+ llama_model_loader: - kv 7: general.base_model.0.organization str = Stepfun Ai
15
+ llama_model_loader: - kv 8: general.base_model.0.repo_url str = https://huggingface.co/stepfun-ai/ste...
16
+ llama_model_loader: - kv 9: step35.block_count u32 = 45
17
+ llama_model_loader: - kv 10: step35.context_length u32 = 262144
18
+ llama_model_loader: - kv 11: step35.embedding_length u32 = 4096
19
+ llama_model_loader: - kv 12: step35.feed_forward_length u32 = 11264
20
+ llama_model_loader: - kv 13: step35.attention.head_count arr[i32,45] = [64, 96, 96, 96, 64, 96, 96, 96, 64, ...
21
+ llama_model_loader: - kv 14: step35.rope.freq_base f32 = 5000000.000000
22
+ llama_model_loader: - kv 15: step35.rope.freq_base_swa f32 = 10000.000000
23
+ llama_model_loader: - kv 16: step35.expert_gating_func u32 = 2
24
+ llama_model_loader: - kv 17: step35.attention.key_length u32 = 128
25
+ llama_model_loader: - kv 18: step35.attention.value_length u32 = 128
26
+ llama_model_loader: - kv 19: general.file_type u32 = 32
27
+ llama_model_loader: - kv 20: step35.attention.head_count_kv arr[i32,45] = [8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, ...
28
+ llama_model_loader: - kv 21: step35.attention.sliding_window u32 = 512
29
+ llama_model_loader: - kv 22: step35.attention.sliding_window_pattern arr[i32,45] = [0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, ...
30
+ llama_model_loader: - kv 23: step35.expert_count u32 = 288
31
+ llama_model_loader: - kv 24: step35.expert_used_count u32 = 8
32
+ llama_model_loader: - kv 25: step35.expert_feed_forward_length u32 = 1280
33
+ llama_model_loader: - kv 26: step35.expert_shared_feed_forward_length u32 = 1280
34
+ llama_model_loader: - kv 27: step35.expert_weights_scale f32 = 3.000000
35
+ llama_model_loader: - kv 28: step35.expert_weights_norm bool = true
36
+ llama_model_loader: - kv 29: step35.leading_dense_block_count u32 = 3
37
+ llama_model_loader: - kv 30: step35.moe_every_n_layers u32 = 1
38
+ llama_model_loader: - kv 31: step35.attention.layer_norm_rms_epsilon f32 = 0.000010
39
+ llama_model_loader: - kv 32: step35.swiglu_clamp_exp arr[f32,45] = [0.000000, 0.000000, 0.000000, 0.0000...
40
+ llama_model_loader: - kv 33: step35.swiglu_clamp_shexp arr[f32,45] = [0.000000, 0.000000, 0.000000, 0.0000...
41
+ llama_model_loader: - kv 34: general.quantization_version u32 = 2
42
+ llama_model_loader: - kv 35: tokenizer.ggml.model str = gpt2
43
+ llama_model_loader: - kv 36: tokenizer.ggml.pre str = deepseek-v3
44
+ llama_model_loader: - kv 37: tokenizer.ggml.tokens arr[str,128896] = ["<|begin▁of▁sentence|>", "<�...
45
+ llama_model_loader: - kv 38: tokenizer.ggml.token_type arr[i32,128896] = [3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
46
+ llama_model_loader: - kv 39: tokenizer.ggml.merges arr[str,127741] = ["Ġ t", "Ġ a", "i n", "Ġ Ġ", "h e...
47
+ llama_model_loader: - kv 40: tokenizer.ggml.bos_token_id u32 = 0
48
+ llama_model_loader: - kv 41: tokenizer.ggml.eos_token_id u32 = 128007
49
+ llama_model_loader: - kv 42: tokenizer.ggml.padding_token_id u32 = 1
50
+ llama_model_loader: - kv 43: tokenizer.ggml.add_bos_token bool = true
51
+ llama_model_loader: - kv 44: tokenizer.ggml.add_sep_token bool = false
52
+ llama_model_loader: - kv 45: tokenizer.ggml.add_eos_token bool = false
53
+ llama_model_loader: - kv 46: tokenizer.chat_template str = {% macro render_content(content) %}{%...
54
+ llama_model_loader: - kv 47: split.no u16 = 0
55
+ llama_model_loader: - kv 48: split.count u16 = 9
56
+ llama_model_loader: - kv 49: split.tensors.count i32 = 754
57
+ llama_model_loader: - type f32: 266 tensors
58
+ llama_model_loader: - type bf16: 488 tensors
59
+ load: printing all EOG tokens:
60
+ load: - 128007 ('<|im_end|>')
61
+ load: special tokens cache size = 818
62
+ load: token to piece cache size = 0.8220 MB
63
+ llm_load_print_meta: format = GGUF V3 (latest)
64
+ llm_load_print_meta: arch = step35
65
+ llm_load_print_meta: n_ctx_train = 262144
66
+ llm_load_print_meta: n_embd = 4096
67
+ llm_load_print_meta: n_layer = 45
68
+ llm_load_print_meta: n_head = [64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64]
69
+ llm_load_print_meta: n_head_kv = 8
70
+ llm_load_print_meta: n_rot = 128
71
+ llm_load_print_meta: n_swa = 512
72
+ llm_load_print_meta: n_swa_pattern = 1
73
+ llm_load_print_meta: n_embd_head_k = 128
74
+ llm_load_print_meta: n_embd_head_v = 128
75
+ llm_load_print_meta: n_gqa = [8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8]
76
+ llm_load_print_meta: n_embd_k_gqa = 1024
77
+ llm_load_print_meta: n_embd_v_gqa = 1024
78
+ llm_load_print_meta: f_norm_eps = 0.0e+00
79
+ llm_load_print_meta: f_norm_rms_eps = 1.0e-05
80
+ llm_load_print_meta: f_clamp_kqv = 0.0e+00
81
+ llm_load_print_meta: f_max_alibi_bias = 0.0e+00
82
+ llm_load_print_meta: f_logit_scale = 0.0e+00
83
+ llm_load_print_meta: n_ff = 11264
84
+ llm_load_print_meta: n_expert = 288
85
+ llm_load_print_meta: n_expert_used = 8
86
+ llm_load_print_meta: causal attn = 1
87
+ llm_load_print_meta: pooling type = 0
88
+ llm_load_print_meta: rope type = 2
89
+ llm_load_print_meta: rope scaling = linear
90
+ llm_load_print_meta: freq_base_train = 5000000.0
91
+ llm_load_print_meta: freq_scale_train = 1
92
+ llm_load_print_meta: n_ctx_orig_yarn = 262144
93
+ llm_load_print_meta: rope_finetuned = unknown
94
+ llm_load_print_meta: ssm_d_conv = 0
95
+ llm_load_print_meta: ssm_d_inner = 0
96
+ llm_load_print_meta: ssm_d_state = 0
97
+ llm_load_print_meta: ssm_dt_rank = 0
98
+ llm_load_print_meta: model type = ?B
99
+ llm_load_print_meta: model ftype = BF16
100
+ llm_load_print_meta: model params = 196.956 B
101
+ llm_load_print_meta: model size = 366.952 GiB (16.004 BPW)
102
+ llm_load_print_meta: repeating layers = 364.986 GiB (16.004 BPW, 195.900 B parameters)
103
+ llm_load_print_meta: general.name = Step 3.5 Flash
104
+ print_info: vocab type = BPE
105
+ print_info: n_vocab = 128896
106
+ print_info: n_merges = 127741
107
+ print_info: BOS token = 0 '<|begin▁of▁sentence|>'
108
+ print_info: EOS token = 128007 '<|im_end|>'
109
+ print_info: EOT token = 128007 '<|im_end|>'
110
+ print_info: PAD token = 1 '<|end▁of▁sentence|>'
111
+ print_info: LF token = 201 'Ċ'
112
+ print_info: FIM PRE token = 128801 '<|fim▁begin|>'
113
+ print_info: FIM SUF token = 128800 '<|fim▁hole|>'
114
+ print_info: FIM MID token = 128802 '<|fim▁end|>'
115
+ print_info: EOG token = 128007 '<|im_end|>'
116
+ print_info: max token length = 256
117
+ llm_load_tensors: ggml ctx size = 0.31 MiB
118
+ llm_load_tensors: offloading 0 repeating layers to GPU
119
+ llm_load_tensors: offloaded 0/46 layers to GPU
120
+ llm_load_tensors: CPU buffer size = 375759.27 MiB
121
+ ....................................................................................................
122
+ llama_new_context_with_model: n_ctx = 512
123
+ llama_new_context_with_model: n_batch = 512
124
+ llama_new_context_with_model: n_ubatch = 512
125
+ llama_new_context_with_model: flash_attn = 1
126
+ llama_new_context_with_model: attn_max_b = 0
127
+ llama_new_context_with_model: fused_moe = 0
128
+ llama_new_context_with_model: grouped er = 0
129
+ llama_new_context_with_model: fused_up_gate = 0
130
+ llama_new_context_with_model: fused_mmad = 0
131
+ llama_new_context_with_model: rope_cache = 0
132
+ llama_new_context_with_model: graph_reuse = 1
133
+ llama_new_context_with_model: k_cache_hadam = 0
134
+ llama_new_context_with_model: split_mode_graph_scheduling = 0
135
+ llama_new_context_with_model: reduce_type = f16
136
+ llama_new_context_with_model: sched_async = 0
137
+ llama_new_context_with_model: ser = -1, 0
138
+ llama_new_context_with_model: freq_base = 5000000.0
139
+ llama_new_context_with_model: freq_scale = 1
140
+ llama_kv_cache_init: CPU KV buffer size = 90.00 MiB
141
+ llama_new_context_with_model: KV self size = 90.00 MiB, K (f16): 45.00 MiB, V (f16): 45.00 MiB
142
+ llama_new_context_with_model: CPU output buffer size = 0.49 MiB
143
+ llama_new_context_with_model: CPU compute buffer size = 259.75 MiB
144
+ llama_new_context_with_model: graph nodes = 2369
145
+ llama_new_context_with_model: graph splits = 1
146
+ XXXXXXXXXXXXXXXXXXXXX Setting only active experts offload
147
+
148
+ system_info: n_threads = 96 (n_threads_batch = 128) / 512 | AVX = 1 | AVX_VNNI = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | AVX512_BF16 = 1 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 |
149
+ compute_imatrix: tokenizing the input ..
150
+ compute_imatrix: tokenization took 599.134 ms
151
+ compute_imatrix: computing over 812 chunks with batch_size 512
152
+ compute_imatrix: 4.10 seconds per pass - ETA 55.55 minutes
153
+ ===================================== llama_new_context_with_model: f16
154
+ ======================================= HAVE_FANCY_SIMD is defined
155
+ [1]92.2870,[2]15.6185,[3]9.0021,[4]5.2226,[5]3.8316,[6]3.1180,[7]2.6999,[8]2.4021,[9]2.2278,
156
+ save_imatrix: entry ' blk.43.ffn_up_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
157
+ save_imatrix: entry ' blk.42.ffn_down_exps.weight' has partial data (88.89%) 32 out of 288 experts are missing data - skipping
158
+ save_imatrix: entry ' blk.39.ffn_gate_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
159
+ save_imatrix: entry ' blk.38.ffn_gate_exps.weight' has partial data (98.26%) 5 out of 288 experts are missing data Storing **but be aware**
160
+ save_imatrix: entry ' blk.39.ffn_down_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
161
+ save_imatrix: entry ' blk.37.ffn_gate_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
162
+ save_imatrix: entry ' blk.36.ffn_down_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
163
+ save_imatrix: entry ' blk.40.ffn_down_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
164
+ save_imatrix: entry ' blk.35.ffn_down_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
165
+ save_imatrix: entry ' blk.35.ffn_gate_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
166
+ save_imatrix: entry ' blk.34.ffn_gate_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
167
+ save_imatrix: entry ' blk.34.ffn_up_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
168
+ save_imatrix: entry ' blk.33.ffn_down_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
169
+ save_imatrix: entry ' blk.33.ffn_gate_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
170
+ save_imatrix: entry ' blk.39.ffn_up_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
171
+ save_imatrix: entry ' blk.32.ffn_down_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
172
+ save_imatrix: entry ' blk.32.ffn_up_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
173
+ save_imatrix: entry ' blk.34.ffn_down_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
174
+ save_imatrix: entry ' blk.31.ffn_down_exps.weight' has partial data (92.71%) 21 out of 288 experts are missing data - skipping
175
+ save_imatrix: entry ' blk.31.ffn_gate_exps.weight' has partial data (92.71%) 21 out of 288 experts are missing data - skipping
176
+ save_imatrix: entry ' blk.40.ffn_gate_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
177
+ save_imatrix: entry ' blk.43.ffn_gate_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
178
+ save_imatrix: entry ' blk.30.ffn_down_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
179
+ save_imatrix: entry ' blk.30.ffn_gate_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
180
+ save_imatrix: entry ' blk.29.ffn_gate_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
181
+ save_imatrix: entry ' blk.29.ffn_up_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
182
+ save_imatrix: entry ' blk.42.ffn_up_exps.weight' has partial data (88.89%) 32 out of 288 experts are missing data - skipping
183
+ save_imatrix: entry ' blk.28.ffn_gate_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
184
+ save_imatrix: entry ' blk.28.ffn_up_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
185
+ save_imatrix: entry ' blk.43.ffn_down_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
186
+ save_imatrix: entry ' blk.31.ffn_up_exps.weight' has partial data (92.71%) 21 out of 288 experts are missing data - skipping
187
+ save_imatrix: entry ' blk.27.ffn_gate_exps.weight' has partial data (93.06%) 20 out of 288 experts are missing data - skipping
188
+ save_imatrix: entry ' blk.26.ffn_gate_exps.weight' has partial data (92.36%) 22 out of 288 experts are missing data - skipping
189
+ save_imatrix: entry ' blk.26.ffn_up_exps.weight' has partial data (92.36%) 22 out of 288 experts are missing data - skipping
190
+ save_imatrix: entry ' blk.36.ffn_up_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
191
+ save_imatrix: entry ' blk.24.ffn_down_exps.weight' has partial data (91.67%) 24 out of 288 experts are missing data - skipping
192
+ save_imatrix: entry ' blk.24.ffn_gate_exps.weight' has partial data (91.67%) 24 out of 288 experts are missing data - skipping
193
+ save_imatrix: entry ' blk.28.ffn_down_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
194
+ save_imatrix: entry ' blk.23.ffn_down_exps.weight' has partial data (88.89%) 32 out of 288 experts are missing data - skipping
195
+ save_imatrix: entry ' blk.23.ffn_gate_exps.weight' has partial data (88.89%) 32 out of 288 experts are missing data - skipping
196
+ save_imatrix: entry ' blk.23.ffn_up_exps.weight' has partial data (88.89%) 32 out of 288 experts are missing data - skipping
197
+ save_imatrix: entry ' blk.38.ffn_up_exps.weight' has partial data (98.26%) 5 out of 288 experts are missing data Storing **but be aware**
198
+ save_imatrix: entry ' blk.22.ffn_down_exps.weight' has partial data (89.58%) 30 out of 288 experts are missing data - skipping
199
+ save_imatrix: entry ' blk.22.ffn_gate_exps.weight' has partial data (89.58%) 30 out of 288 experts are missing data - skipping
200
+ save_imatrix: entry ' blk.25.ffn_gate_exps.weight' has partial data (90.97%) 26 out of 288 experts are missing data - skipping
201
+ save_imatrix: entry ' blk.15.ffn_down_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
202
+ save_imatrix: entry ' blk.7.ffn_down_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
203
+ save_imatrix: entry ' blk.11.ffn_up_exps.weight' has partial data (89.93%) 29 out of 288 experts are missing data - skipping
204
+ save_imatrix: entry ' blk.6.ffn_up_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
205
+ save_imatrix: entry ' blk.20.ffn_down_exps.weight' has partial data (91.32%) 25 out of 288 experts are missing data - skipping
206
+ save_imatrix: entry ' blk.11.ffn_down_exps.weight' has partial data (89.93%) 29 out of 288 experts are missing data - skipping
207
+ save_imatrix: entry ' blk.16.ffn_up_exps.weight' has partial data (89.24%) 31 out of 288 experts are missing data - skipping
208
+ save_imatrix: entry ' blk.41.ffn_down_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
209
+ save_imatrix: entry ' blk.33.ffn_up_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
210
+ save_imatrix: entry ' blk.4.ffn_up_exps.weight' has partial data (82.99%) 49 out of 288 experts are missing data - skipping
211
+ save_imatrix: entry ' blk.29.ffn_down_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
212
+ save_imatrix: entry ' blk.8.ffn_up_exps.weight' has partial data (92.36%) 22 out of 288 experts are missing data - skipping
213
+ save_imatrix: entry ' blk.10.ffn_gate_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
214
+ save_imatrix: entry ' blk.3.ffn_up_exps.weight' has partial data (99.31%) 2 out of 288 experts are missing data Storing **but be aware**
215
+ save_imatrix: entry ' blk.6.ffn_down_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
216
+ save_imatrix: entry ' blk.37.ffn_down_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
217
+ save_imatrix: entry ' blk.9.ffn_gate_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
218
+ save_imatrix: entry ' blk.36.ffn_gate_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
219
+ save_imatrix: entry ' blk.3.ffn_down_exps.weight' has partial data (99.31%) 2 out of 288 experts are missing data Storing **but be aware**
220
+ save_imatrix: entry ' blk.12.ffn_down_exps.weight' has partial data (88.54%) 33 out of 288 experts are missing data - skipping
221
+ save_imatrix: entry ' blk.21.ffn_down_exps.weight' has partial data (89.24%) 31 out of 288 experts are missing data - skipping
222
+ save_imatrix: entry ' blk.27.ffn_up_exps.weight' has partial data (93.06%) 20 out of 288 experts are missing data - skipping
223
+ save_imatrix: entry ' blk.41.ffn_gate_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
224
+ save_imatrix: entry ' blk.12.ffn_gate_exps.weight' has partial data (88.54%) 33 out of 288 experts are missing data - skipping
225
+ save_imatrix: entry ' blk.38.ffn_down_exps.weight' has partial data (98.26%) 5 out of 288 experts are missing data Storing **but be aware**
226
+ save_imatrix: entry ' blk.44.ffn_up_exps.weight' has partial data (91.67%) 24 out of 288 experts are missing data - skipping
227
+ save_imatrix: entry ' blk.4.ffn_gate_exps.weight' has partial data (82.99%) 49 out of 288 experts are missing data - skipping
228
+ save_imatrix: entry ' blk.19.ffn_up_exps.weight' has partial data (91.32%) 25 out of 288 experts are missing data - skipping
229
+ save_imatrix: entry ' blk.13.ffn_up_exps.weight' has partial data (83.33%) 48 out of 288 experts are missing data - skipping
230
+ save_imatrix: entry ' blk.44.ffn_down_exps.weight' has partial data (91.67%) 24 out of 288 experts are missing data - skipping
231
+ save_imatrix: entry ' blk.7.ffn_up_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
232
+ save_imatrix: entry ' blk.30.ffn_up_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
233
+ save_imatrix: entry ' blk.5.ffn_down_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
234
+ save_imatrix: entry ' blk.18.ffn_up_exps.weight' has partial data (90.62%) 27 out of 288 experts are missing data - skipping
235
+ save_imatrix: entry ' blk.4.ffn_down_exps.weight' has partial data (82.99%) 49 out of 288 experts are missing data - skipping
236
+ save_imatrix: entry ' blk.17.ffn_up_exps.weight' has partial data (89.24%) 31 out of 288 experts are missing data - skipping
237
+ save_imatrix: entry ' blk.41.ffn_up_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
238
+ save_imatrix: entry ' blk.9.ffn_down_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
239
+ save_imatrix: entry ' blk.25.ffn_up_exps.weight' has partial data (90.97%) 26 out of 288 experts are missing data - skipping
240
+ save_imatrix: entry ' blk.3.ffn_gate_exps.weight' has partial data (99.31%) 2 out of 288 experts are missing data Storing **but be aware**
241
+ save_imatrix: entry ' blk.8.ffn_gate_exps.weight' has partial data (92.36%) 22 out of 288 experts are missing data - skipping
242
+ save_imatrix: entry ' blk.9.ffn_up_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
243
+ save_imatrix: entry ' blk.5.ffn_up_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
244
+ save_imatrix: entry ' blk.13.ffn_down_exps.weight' has partial data (83.33%) 48 out of 288 experts are missing data - skipping
245
+ save_imatrix: entry ' blk.16.ffn_gate_exps.weight' has partial data (89.24%) 31 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.27.ffn_down_exps.weight' has partial data (93.06%) 20 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.26.ffn_down_exps.weight' has partial data (92.36%) 22 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.5.ffn_gate_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.11.ffn_gate_exps.weight' has partial data (89.93%) 29 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.37.ffn_up_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.18.ffn_gate_exps.weight' has partial data (90.62%) 27 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.20.ffn_gate_exps.weight' has partial data (91.32%) 25 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.13.ffn_gate_exps.weight' has partial data (83.33%) 48 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.40.ffn_up_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.14.ffn_up_exps.weight' has partial data (87.85%) 35 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.10.ffn_down_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.14.ffn_gate_exps.weight' has partial data (87.85%) 35 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.14.ffn_down_exps.weight' has partial data (87.85%) 35 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.8.ffn_down_exps.weight' has partial data (92.36%) 22 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.24.ffn_up_exps.weight' has partial data (91.67%) 24 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.12.ffn_up_exps.weight' has partial data (88.54%) 33 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.42.ffn_gate_exps.weight' has partial data (88.89%) 32 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.10.ffn_up_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.15.ffn_up_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.15.ffn_gate_exps.weight' has partial data (90.28%) 28 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.16.ffn_down_exps.weight' has partial data (89.24%) 31 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.17.ffn_gate_exps.weight' has partial data (89.24%) 31 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.35.ffn_up_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.17.ffn_down_exps.weight' has partial data (89.24%) 31 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.18.ffn_down_exps.weight' has partial data (90.62%) 27 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.21.ffn_up_exps.weight' has partial data (89.24%) 31 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.25.ffn_down_exps.weight' has partial data (90.97%) 26 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.6.ffn_gate_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.19.ffn_gate_exps.weight' has partial data (91.32%) 25 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.19.ffn_down_exps.weight' has partial data (91.32%) 25 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.20.ffn_up_exps.weight' has partial data (91.32%) 25 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.22.ffn_up_exps.weight' has partial data (89.58%) 30 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.21.ffn_gate_exps.weight' has partial data (89.24%) 31 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.44.ffn_gate_exps.weight' has partial data (91.67%) 24 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.7.ffn_gate_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.32.ffn_gate_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
+ save_imatrix: warning: storing only 418 out of 529 entries
+
+ save_imatrix: stored collected data after 10 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
+ [10]2.1021,[11]2.3311,[12]2.4035,[13]2.3973,[14]2.4537,[15]2.3408,[16]2.2269,[17]2.1399,[18]2.0725,[19]2.0198,
+ save_imatrix: entry ' blk.43.ffn_up_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.42.ffn_down_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.37.ffn_gate_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.36.ffn_down_exps.weight' has partial data (97.57%) 7 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.35.ffn_down_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.35.ffn_gate_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.34.ffn_gate_exps.weight' has partial data (97.57%) 7 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.34.ffn_up_exps.weight' has partial data (97.57%) 7 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.33.ffn_down_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.33.ffn_gate_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.32.ffn_down_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.32.ffn_up_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.34.ffn_down_exps.weight' has partial data (97.57%) 7 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.31.ffn_down_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.31.ffn_gate_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.43.ffn_gate_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.30.ffn_down_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.30.ffn_gate_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.29.ffn_gate_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.29.ffn_up_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.42.ffn_up_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.28.ffn_gate_exps.weight' has partial data (94.44%) 16 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.28.ffn_up_exps.weight' has partial data (94.44%) 16 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.43.ffn_down_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.31.ffn_up_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.27.ffn_gate_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.26.ffn_gate_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.26.ffn_up_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.36.ffn_up_exps.weight' has partial data (97.57%) 7 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.24.ffn_down_exps.weight' has partial data (94.44%) 16 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.24.ffn_gate_exps.weight' has partial data (94.44%) 16 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.28.ffn_down_exps.weight' has partial data (94.44%) 16 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.23.ffn_down_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.23.ffn_gate_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.23.ffn_up_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.22.ffn_down_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.22.ffn_gate_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.25.ffn_gate_exps.weight' has partial data (94.79%) 15 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.15.ffn_down_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.11.ffn_up_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.6.ffn_up_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.20.ffn_down_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.11.ffn_down_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.16.ffn_up_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.41.ffn_down_exps.weight' has partial data (96.88%) 9 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.33.ffn_up_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.4.ffn_up_exps.weight' has partial data (93.06%) 20 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.29.ffn_down_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.8.ffn_up_exps.weight' has partial data (96.88%) 9 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.10.ffn_gate_exps.weight' has partial data (97.57%) 7 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.6.ffn_down_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.37.ffn_down_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.9.ffn_gate_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.36.ffn_gate_exps.weight' has partial data (97.57%) 7 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.12.ffn_down_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.21.ffn_down_exps.weight' has partial data (94.44%) 16 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.27.ffn_up_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.41.ffn_gate_exps.weight' has partial data (96.88%) 9 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.12.ffn_gate_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.44.ffn_up_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.4.ffn_gate_exps.weight' has partial data (93.06%) 20 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.19.ffn_up_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.13.ffn_up_exps.weight' has partial data (91.32%) 25 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.44.ffn_down_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.30.ffn_up_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.5.ffn_down_exps.weight' has partial data (98.61%) 4 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.18.ffn_up_exps.weight' has partial data (94.79%) 15 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.4.ffn_down_exps.weight' has partial data (93.06%) 20 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.17.ffn_up_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.41.ffn_up_exps.weight' has partial data (96.88%) 9 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.9.ffn_down_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.25.ffn_up_exps.weight' has partial data (94.79%) 15 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.8.ffn_gate_exps.weight' has partial data (96.88%) 9 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.9.ffn_up_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.5.ffn_up_exps.weight' has partial data (98.61%) 4 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.13.ffn_down_exps.weight' has partial data (91.32%) 25 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.16.ffn_gate_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.27.ffn_down_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.26.ffn_down_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.5.ffn_gate_exps.weight' has partial data (98.61%) 4 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.11.ffn_gate_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.37.ffn_up_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.18.ffn_gate_exps.weight' has partial data (94.79%) 15 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.20.ffn_gate_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.13.ffn_gate_exps.weight' has partial data (91.32%) 25 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.14.ffn_up_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.10.ffn_down_exps.weight' has partial data (97.57%) 7 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.14.ffn_gate_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.14.ffn_down_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.8.ffn_down_exps.weight' has partial data (96.88%) 9 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.24.ffn_up_exps.weight' has partial data (94.44%) 16 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.12.ffn_up_exps.weight' has partial data (93.40%) 19 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.42.ffn_gate_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.10.ffn_up_exps.weight' has partial data (97.57%) 7 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.15.ffn_up_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.15.ffn_gate_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.16.ffn_down_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.17.ffn_gate_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.35.ffn_up_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.17.ffn_down_exps.weight' has partial data (94.10%) 17 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.18.ffn_down_exps.weight' has partial data (94.79%) 15 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.21.ffn_up_exps.weight' has partial data (94.44%) 16 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.25.ffn_down_exps.weight' has partial data (94.79%) 15 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.6.ffn_gate_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.19.ffn_gate_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.19.ffn_down_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.20.ffn_up_exps.weight' has partial data (93.75%) 18 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.22.ffn_up_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.21.ffn_gate_exps.weight' has partial data (94.44%) 16 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.44.ffn_gate_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.32.ffn_gate_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: warning: storing only 478 out of 529 entries
+
+ save_imatrix: stored collected data after 20 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
+ [20]1.9686,[21]1.9365,[22]1.8869,[23]1.8609,[24]1.8847,[25]1.8792,[26]1.8445,[27]1.9620,[28]2.0677,[29]2.1517,
+ save_imatrix: entry ' blk.43.ffn_up_exps.weight' has partial data (94.79%) 15 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.42.ffn_down_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.43.ffn_gate_exps.weight' has partial data (94.79%) 15 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.30.ffn_down_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.30.ffn_gate_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.29.ffn_gate_exps.weight' has partial data (97.22%) 8 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.29.ffn_up_exps.weight' has partial data (97.22%) 8 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.42.ffn_up_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.28.ffn_gate_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.28.ffn_up_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.43.ffn_down_exps.weight' has partial data (94.79%) 15 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.24.ffn_down_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.24.ffn_gate_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.28.ffn_down_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.23.ffn_down_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.23.ffn_gate_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.23.ffn_up_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.25.ffn_gate_exps.weight' has partial data (97.22%) 8 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.15.ffn_down_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.20.ffn_down_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.4.ffn_up_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.29.ffn_down_exps.weight' has partial data (97.22%) 8 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.12.ffn_down_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.21.ffn_down_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.12.ffn_gate_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.4.ffn_gate_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.13.ffn_up_exps.weight' has partial data (92.71%) 21 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.30.ffn_up_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.18.ffn_up_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.4.ffn_down_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.17.ffn_up_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.25.ffn_up_exps.weight' has partial data (97.22%) 8 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.13.ffn_down_exps.weight' has partial data (92.71%) 21 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.18.ffn_gate_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.20.ffn_gate_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.13.ffn_gate_exps.weight' has partial data (92.71%) 21 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.14.ffn_up_exps.weight' has partial data (94.44%) 16 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.14.ffn_gate_exps.weight' has partial data (94.44%) 16 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.14.ffn_down_exps.weight' has partial data (94.44%) 16 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.24.ffn_up_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.12.ffn_up_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.42.ffn_gate_exps.weight' has partial data (95.14%) 14 out of 288 experts are missing data - skipping
+ save_imatrix: entry ' blk.15.ffn_up_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.15.ffn_gate_exps.weight' has partial data (95.49%) 13 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.17.ffn_gate_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.17.ffn_down_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.18.ffn_down_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.21.ffn_up_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.25.ffn_down_exps.weight' has partial data (97.22%) 8 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.20.ffn_up_exps.weight' has partial data (95.83%) 12 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.21.ffn_gate_exps.weight' has partial data (96.53%) 10 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: warning: storing only 511 out of 529 entries
+
+ save_imatrix: stored collected data after 30 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
+ [30]2.1645,[31]2.1882,[32]2.1900,[33]2.1667,[34]2.2092,[35]2.2241,[36]2.2535,[37]2.2534,[38]2.3061,[39]2.2955,
+ save_imatrix: entry ' blk.43.ffn_up_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.42.ffn_down_exps.weight' has partial data (98.26%) 5 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.43.ffn_gate_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.42.ffn_up_exps.weight' has partial data (98.26%) 5 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.43.ffn_down_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.23.ffn_down_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.23.ffn_gate_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.23.ffn_up_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.4.ffn_up_exps.weight' has partial data (98.26%) 5 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.4.ffn_gate_exps.weight' has partial data (98.26%) 5 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.13.ffn_up_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.4.ffn_down_exps.weight' has partial data (98.26%) 5 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.13.ffn_down_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.13.ffn_gate_exps.weight' has partial data (96.18%) 11 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.14.ffn_up_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.14.ffn_gate_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.14.ffn_down_exps.weight' has partial data (97.92%) 6 out of 288 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.42.ffn_gate_exps.weight' has partial data (98.26%) 5 out of 288 experts are missing data Storing **but be aware**
+
+ save_imatrix: stored collected data after 40 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
+ [40]2.3248,[41]2.3142,[42]2.2977,[43]2.3104,[44]2.3077,[45]2.3022,[46]2.3080,[47]2.2986,[48]2.2730,[49]2.2517,
+ save_imatrix: stored collected data after 50 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
+ [50]2.2370,[51]2.2344,[52]2.2286,[53]2.2301,[54]2.2405,[55]2.2243,[56]2.2017,[57]2.2026,[58]2.1997,[59]2.2053,
+ save_imatrix: stored collected data after 60 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
+ [60]2.1886,[61]2.2348,[62]2.2826,[63]2.3263,[64]2.3770,[65]2.4355,[66]2.4710,[67]2.5238,[68]2.5784,[69]2.6394,
+ save_imatrix: stored collected data after 70 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
+ [70]2.7132,[71]2.7529,[72]2.7870,[73]2.8046,[74]2.8227,[75]2.8730,[76]2.9207,[77]2.9354,[78]2.9547,[79]2.9834,
483
+ save_imatrix: stored collected data after 80 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
484
+ [80]3.0243,[81]3.0668,[82]3.1158,[83]3.1211,[84]3.1977,[85]3.2135,[86]3.2150,[87]3.2843,[88]3.3456,[89]3.4204,
485
+ save_imatrix: stored collected data after 90 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
486
+ [90]3.4383,[91]3.4328,[92]3.4361,[93]3.4487,[94]3.4527,[95]3.4903,[96]3.4952,[97]3.5381,[98]3.5657,[99]3.5372,
487
+ save_imatrix: stored collected data after 100 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
488
+ [100]3.5712,[101]3.6255,[102]3.6591,[103]3.7021,[104]3.7340,[105]3.7678,[106]3.8049,[107]3.7880,[108]3.7927,[109]3.7995,
489
+ save_imatrix: stored collected data after 110 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
490
+ [110]3.8065,[111]3.7946,[112]3.8325,[113]3.8564,[114]3.8682,[115]3.8427,[116]3.8048,[117]3.7913,[118]3.7995,[119]3.7751,
491
+ save_imatrix: stored collected data after 120 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
492
+ [120]3.7521,[121]3.7361,[122]3.7214,[123]3.7211,[124]3.7221,[125]3.7333,[126]3.7428,[127]3.7639,[128]3.7981,[129]3.8097,
493
+ save_imatrix: stored collected data after 130 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
494
+ [130]3.7767,[131]3.7424,[132]3.7104,[133]3.6785,[134]3.6792,[135]3.6723,[136]3.7012,[137]3.7375,[138]3.7522,[139]3.7524,
495
+ save_imatrix: stored collected data after 140 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
496
+ [140]3.7753,[141]3.8038,[142]3.8356,[143]3.8474,[144]3.8692,[145]3.8896,[146]3.9076,[147]3.9221,[148]3.9314,[149]3.9288,
497
+ save_imatrix: stored collected data after 150 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
498
+ [150]3.9329,[151]3.9521,[152]3.9701,[153]3.9697,[154]3.9747,[155]3.9853,[156]3.9910,[157]3.9969,[158]4.0021,[159]4.0095,
499
+ save_imatrix: stored collected data after 160 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
500
+ [160]4.0235,[161]4.0243,[162]4.0248,[163]4.0299,[164]4.0368,[165]4.0363,[166]4.0330,[167]4.0539,[168]4.0627,[169]4.0696,
501
+ save_imatrix: stored collected data after 170 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
502
+ [170]4.0906,[171]4.1067,[172]4.1002,[173]4.1049,[174]4.1072,[175]4.1209,[176]4.1278,[177]4.1407,[178]4.1391,[179]4.1392,
503
+ save_imatrix: stored collected data after 180 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
504
+ [180]4.1375,[181]4.1371,[182]4.1347,[183]4.1326,[184]4.1200,[185]4.1316,[186]4.1609,[187]4.1892,[188]4.2153,[189]4.2400,
505
+ save_imatrix: stored collected data after 190 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
506
+ [190]4.2773,[191]4.2866,[192]4.2995,[193]4.2804,[194]4.2936,[195]4.2837,[196]4.2593,[197]4.2321,[198]4.2527,[199]4.2750,
507
+ save_imatrix: stored collected data after 200 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
508
+ [200]4.2825,[201]4.2905,[202]4.3070,[203]4.3250,[204]4.3391,[205]4.3518,[206]4.3650,[207]4.3586,[208]4.3318,[209]4.3069,
509
+ save_imatrix: stored collected data after 210 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
510
+ [210]4.2806,[211]4.2549,[212]4.2300,[213]4.2044,[214]4.2076,[215]4.2338,[216]4.2205,[217]4.2112,[218]4.2377,[219]4.2507,
511
+ save_imatrix: stored collected data after 220 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
512
+ [220]4.2713,[221]4.2947,[222]4.3132,[223]4.3261,[224]4.3557,[225]4.3644,[226]4.3954,[227]4.4296,[228]4.4535,[229]4.4635,
513
+ save_imatrix: stored collected data after 230 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
514
+ [230]4.4711,[231]4.4786,[232]4.5008,[233]4.5066,[234]4.5144,[235]4.5423,[236]4.5473,[237]4.5815,[238]4.6125,[239]4.6244,
515
+ save_imatrix: stored collected data after 240 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
516
+ [240]4.6367,[241]4.6566,[242]4.6653,[243]4.6757,[244]4.6927,[245]4.7105,[246]4.7363,[247]4.7391,[248]4.7495,[249]4.7627,
517
+ save_imatrix: stored collected data after 250 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
518
+ [250]4.7765,[251]4.7811,[252]4.7926,[253]4.8027,[254]4.8115,[255]4.8230,[256]4.8389,[257]4.8518,[258]4.8654,[259]4.8754,
519
+ save_imatrix: stored collected data after 260 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
520
+ [260]4.8781,[261]4.8896,[262]4.8903,[263]4.9069,[264]4.9292,[265]4.9484,[266]4.9674,[267]4.9801,[268]4.9860,[269]4.9948,
521
+ save_imatrix: stored collected data after 270 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
522
+ [270]5.0074,[271]5.0283,[272]5.0497,[273]5.0671,[274]5.0720,[275]5.0726,[276]5.0887,[277]5.0971,[278]5.1116,[279]5.1267,
523
+ save_imatrix: stored collected data after 280 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
524
+ [280]5.1272,[281]5.1285,[282]5.1367,[283]5.1374,[284]5.1515,[285]5.1579,[286]5.1643,[287]5.1887,[288]5.2028,[289]5.2193,
525
+ save_imatrix: stored collected data after 290 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
526
+ [290]5.2383,[291]5.2501,[292]5.2751,[293]5.2875,[294]5.3043,[295]5.3194,[296]5.3327,[297]5.3388,[298]5.3604,[299]5.3687,
527
+ save_imatrix: stored collected data after 300 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
528
+ [300]5.3728,[301]5.3898,[302]5.4105,[303]5.4165,[304]5.4243,[305]5.4300,[306]5.4394,[307]5.4488,[308]5.4525,[309]5.4703,
529
+ save_imatrix: stored collected data after 310 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
530
+ [310]5.4786,[311]5.4932,[312]5.5118,[313]5.5282,[314]5.5483,[315]5.5213,[316]5.5219,[317]5.4985,[318]5.5149,[319]5.5228,
531
+ save_imatrix: stored collected data after 320 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
532
+ [320]5.5227,[321]5.5188,[322]5.5330,[323]5.5465,[324]5.5547,[325]5.5643,[326]5.5650,[327]5.5811,[328]5.5871,[329]5.6020,
533
+ save_imatrix: stored collected data after 330 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
534
+ [330]5.6110,[331]5.6185,[332]5.6276,[333]5.5984,[334]5.6102,[335]5.6322,[336]5.6522,[337]5.6740,[338]5.6883,[339]5.7097,
535
+ save_imatrix: stored collected data after 340 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
536
+ [340]5.7121,[341]5.7121,[342]5.7181,[343]5.7259,[344]5.7456,[345]5.7737,[346]5.7669,[347]5.7665,[348]5.7747,[349]5.7696,
537
+ save_imatrix: stored collected data after 350 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
538
+ [350]5.7718,[351]5.7753,[352]5.7696,[353]5.7763,[354]5.7891,[355]5.7864,[356]5.7856,[357]5.7662,[358]5.7433,[359]5.7300,
539
+ save_imatrix: stored collected data after 360 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
540
+ [360]5.7155,[361]5.6990,[362]5.6893,[363]5.6717,[364]5.6643,[365]5.6480,[366]5.6478,[367]5.6308,[368]5.6273,[369]5.6031,
541
+ save_imatrix: stored collected data after 370 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
542
+ [370]5.5829,[371]5.5735,[372]5.5600,[373]5.5399,[374]5.5243,[375]5.5158,[376]5.4972,[377]5.4881,[378]5.4873,[379]5.4862,
543
+ save_imatrix: stored collected data after 380 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
544
+ [380]5.4784,[381]5.4703,[382]5.4476,[383]5.4259,[384]5.4148,[385]5.4007,[386]5.3796,[387]5.3563,[388]5.3332,[389]5.3184,
545
+ save_imatrix: stored collected data after 390 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
546
+ [390]5.3114,[391]5.3143,[392]5.3078,[393]5.3050,[394]5.2956,[395]5.2810,[396]5.2605,[397]5.2439,[398]5.2358,[399]5.2181,
547
+ save_imatrix: stored collected data after 400 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
548
+ [400]5.2028,[401]5.1889,[402]5.1764,[403]5.1653,[404]5.1494,[405]5.1340,[406]5.1233,[407]5.1054,[408]5.0884,[409]5.0733,
549
+ save_imatrix: stored collected data after 410 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
550
+ [410]5.0606,[411]5.0529,[412]5.0454,[413]5.0384,[414]5.0269,[415]5.0170,[416]4.9984,[417]4.9798,[418]4.9609,[419]4.9444,
551
+ save_imatrix: stored collected data after 420 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
552
+ [420]4.9271,[421]4.9130,[422]4.8962,[423]4.8795,[424]4.8668,[425]4.8511,[426]4.8386,[427]4.8284,[428]4.8148,[429]4.7992,
553
+ save_imatrix: stored collected data after 430 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
554
+ [430]4.7833,[431]4.7701,[432]4.7664,[433]4.7576,[434]4.7630,[435]4.7521,[436]4.7382,[437]4.7269,[438]4.7143,[439]4.7061,
555
+ save_imatrix: stored collected data after 440 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
556
+ [440]4.6956,[441]4.6814,[442]4.6741,[443]4.6630,[444]4.6612,[445]4.6517,[446]4.6430,[447]4.6422,[448]4.6334,[449]4.6247,
557
+ save_imatrix: stored collected data after 450 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
558
+ [450]4.6132,[451]4.6061,[452]4.5944,[453]4.5835,[454]4.5715,[455]4.5610,[456]4.5472,[457]4.5362,[458]4.5256,[459]4.5127,
559
+ save_imatrix: stored collected data after 460 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
560
+ [460]4.5012,[461]4.4927,[462]4.4892,[463]4.4768,[464]4.4725,[465]4.4667,[466]4.4614,[467]4.4546,[468]4.4480,[469]4.4419,
561
+ save_imatrix: stored collected data after 470 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
562
+ [470]4.4352,[471]4.4286,[472]4.4222,[473]4.4156,[474]4.4099,[475]4.4034,[476]4.3970,[477]4.3924,[478]4.3810,[479]4.3720,
563
+ save_imatrix: stored collected data after 480 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
564
+ [480]4.3595,[481]4.3528,[482]4.3495,[483]4.3489,[484]4.3361,[485]4.3263,[486]4.3164,[487]4.3049,[488]4.2961,[489]4.2901,
565
+ save_imatrix: stored collected data after 490 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
566
+ [490]4.2823,[491]4.2764,[492]4.2675,[493]4.2613,[494]4.2508,[495]4.2465,[496]4.2389,[497]4.2302,[498]4.2205,[499]4.2204,
567
+ save_imatrix: stored collected data after 500 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
568
+ [500]4.2207,[501]4.2235,[502]4.2204,[503]4.2211,[504]4.2210,[505]4.2177,[506]4.2105,[507]4.2207,[508]4.2305,[509]4.2408,
569
+ save_imatrix: stored collected data after 510 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
570
+ [510]4.2493,[511]4.2572,[512]4.2653,[513]4.2721,[514]4.2796,[515]4.2847,[516]4.2921,[517]4.2970,[518]4.2971,[519]4.3143,
571
+ save_imatrix: stored collected data after 520 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
572
+ [520]4.3267,[521]4.3409,[522]4.3505,[523]4.3562,[524]4.3613,[525]4.3665,[526]4.3715,[527]4.3778,[528]4.3836,[529]4.3874,
573
+ save_imatrix: stored collected data after 530 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
574
+ [530]4.3930,[531]4.3979,[532]4.4008,[533]4.4036,[534]4.4078,[535]4.4051,[536]4.4066,[537]4.4142,[538]4.4193,[539]4.4241,
575
+ save_imatrix: stored collected data after 540 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
576
+ [540]4.4356,[541]4.4396,[542]4.4415,[543]4.4458,[544]4.4472,[545]4.4490,[546]4.4543,[547]4.4598,[548]4.4671,[549]4.4731,
577
+ save_imatrix: stored collected data after 550 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
578
+ [550]4.4794,[551]4.4874,[552]4.4927,[553]4.5002,[554]4.5035,[555]4.5077,[556]4.5118,[557]4.5194,[558]4.5195,[559]4.5247,
579
+ save_imatrix: stored collected data after 560 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
580
+ [560]4.5291,[561]4.5356,[562]4.5417,[563]4.5440,[564]4.5502,[565]4.5578,[566]4.5635,[567]4.5724,[568]4.5739,[569]4.5765,
581
+ save_imatrix: stored collected data after 570 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
582
+ [570]4.5767,[571]4.5807,[572]4.5769,[573]4.5728,[574]4.5711,[575]4.5738,[576]4.5735,[577]4.5762,[578]4.5760,[579]4.5796,
583
+ save_imatrix: stored collected data after 580 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
584
+ [580]4.5788,[581]4.5769,[582]4.5764,[583]4.5737,[584]4.5691,[585]4.5699,[586]4.5660,[587]4.5588,[588]4.5570,[589]4.5552,
585
+ save_imatrix: stored collected data after 590 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
586
+ [590]4.5495,[591]4.5451,[592]4.5398,[593]4.5345,[594]4.5309,[595]4.5299,[596]4.5265,[597]4.5261,[598]4.5232,[599]4.5186,
587
+ save_imatrix: stored collected data after 600 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
588
+ [600]4.5134,[601]4.5140,[602]4.5149,[603]4.5143,[604]4.5097,[605]4.5077,[606]4.5034,[607]4.5078,[608]4.5055,[609]4.5028,
589
+ save_imatrix: stored collected data after 610 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
590
+ [610]4.5024,[611]4.5070,[612]4.5081,[613]4.4981,[614]4.4911,[615]4.4818,[616]4.4730,[617]4.4655,[618]4.4565,[619]4.4458,
591
+ save_imatrix: stored collected data after 620 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
592
+ [620]4.4353,[621]4.4247,[622]4.4166,[623]4.4103,[624]4.4047,[625]4.4032,[626]4.3953,[627]4.3885,[628]4.3801,[629]4.3742,
593
+ save_imatrix: stored collected data after 630 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
594
+ [630]4.3734,[631]4.3750,[632]4.3698,[633]4.3645,[634]4.3603,[635]4.3512,[636]4.3435,[637]4.3355,[638]4.3274,[639]4.3192,
595
+ save_imatrix: stored collected data after 640 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
596
+ [640]4.3116,[641]4.3051,[642]4.2995,[643]4.2915,[644]4.2844,[645]4.2776,[646]4.2779,[647]4.2723,[648]4.2643,[649]4.2584,
597
+ save_imatrix: stored collected data after 650 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
598
+ [650]4.2522,[651]4.2453,[652]4.2373,[653]4.2301,[654]4.2238,[655]4.2182,[656]4.2116,[657]4.2119,[658]4.2109,[659]4.2123,
599
+ save_imatrix: stored collected data after 660 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
600
+ [660]4.2098,[661]4.2022,[662]4.1968,[663]4.1905,[664]4.1822,[665]4.1751,[666]4.1678,[667]4.1612,[668]4.1540,[669]4.1467,
601
+ save_imatrix: stored collected data after 670 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
602
+ [670]4.1401,[671]4.1332,[672]4.1270,[673]4.1204,[674]4.1135,[675]4.1059,[676]4.0989,[677]4.0931,[678]4.0862,[679]4.0799,
603
+ save_imatrix: stored collected data after 680 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
604
+ [680]4.0738,[681]4.0672,[682]4.0606,[683]4.0531,[684]4.0467,[685]4.0404,[686]4.0371,[687]4.0292,[688]4.0222,[689]4.0156,
605
+ save_imatrix: stored collected data after 690 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
606
+ [690]4.0082,[691]4.0020,[692]3.9976,[693]3.9952,[694]3.9912,[695]3.9880,[696]3.9845,[697]3.9813,[698]3.9780,[699]3.9749,
607
+ save_imatrix: stored collected data after 700 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
608
+ [700]3.9718,[701]3.9691,[702]3.9669,[703]3.9641,[704]3.9606,[705]3.9584,[706]3.9549,[707]3.9518,[708]3.9490,[709]3.9462,
609
+ save_imatrix: stored collected data after 710 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
610
+ [710]3.9466,[711]3.9471,[712]3.9477,[713]3.9479,[714]3.9485,[715]3.9479,[716]3.9493,[717]3.9502,[718]3.9503,[719]3.9497,
611
+ save_imatrix: stored collected data after 720 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
612
+ [720]3.9500,[721]3.9501,[722]3.9499,[723]3.9515,[724]3.9533,[725]3.9540,[726]3.9537,[727]3.9534,[728]3.9537,[729]3.9552,
613
+ save_imatrix: stored collected data after 730 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
614
+ [730]3.9562,[731]3.9560,[732]3.9550,[733]3.9541,[734]3.9559,[735]3.9573,[736]3.9575,[737]3.9584,[738]3.9593,[739]3.9593,
615
+ save_imatrix: stored collected data after 740 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
616
+ [740]3.9591,[741]3.9590,[742]3.9602,[743]3.9603,[744]3.9601,[745]3.9609,[746]3.9612,[747]3.9615,[748]3.9606,[749]3.9609,
617
+ save_imatrix: stored collected data after 750 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
618
+ [750]3.9599,[751]3.9609,[752]3.9604,[753]3.9600,[754]3.9608,[755]3.9604,[756]3.9608,[757]3.9618,[758]3.9614,[759]3.9625,
619
+ save_imatrix: stored collected data after 760 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
620
+ [760]3.9629,[761]3.9641,[762]3.9633,[763]3.9637,[764]3.9646,[765]3.9640,[766]3.9639,[767]3.9642,[768]3.9633,[769]3.9631,
621
+ save_imatrix: stored collected data after 770 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
622
+ [770]3.9635,[771]3.9626,[772]3.9624,[773]3.9617,[774]3.9616,[775]3.9632,[776]3.9632,[777]3.9638,[778]3.9640,[779]3.9624,
623
+ save_imatrix: stored collected data after 780 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
624
+ [780]3.9619,[781]3.9623,[782]3.9627,[783]3.9611,[784]3.9616,[785]3.9611,[786]3.9622,[787]3.9626,[788]3.9620,[789]3.9625,
625
+ save_imatrix: stored collected data after 790 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
626
+ [790]3.9628,[791]3.9645,[792]3.9663,[793]3.9661,[794]3.9649,[795]3.9648,[796]3.9660,[797]3.9665,[798]3.9659,[799]3.9668,
627
+ save_imatrix: stored collected data after 800 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
628
+ [800]3.9682,[801]3.9689,[802]3.9695,[803]3.9710,[804]3.9716,[805]3.9721,[806]3.9725,[807]3.9742,[808]3.9749,[809]3.9743,
629
+ save_imatrix: stored collected data after 810 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
630
+ [810]3.9744,[811]3.9747,[812]3.9755,
631
+ save_imatrix: stored collected data after 812 chunks in /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/imatrix-Step-3.5-Flash-BF16.dat
632
+
633
+ Final estimate: PPL = 3.9755 +/- 0.01997
634
+
635
+ ======================== sorted layer importances
636
+ 0: Layer 0, <cos_sim> = 0.191944
637
+ 1: Layer 44, <cos_sim> = 0.794719
638
+ 2: Layer 11, <cos_sim> = 0.880959
639
+ 3: Layer 15, <cos_sim> = 0.889736
640
+ 4: Layer 12, <cos_sim> = 0.892113
641
+ 5: Layer 19, <cos_sim> = 0.896638
642
+ 6: Layer 16, <cos_sim> = 0.902101
643
+ 7: Layer 14, <cos_sim> = 0.904488
644
+ 8: Layer 13, <cos_sim> = 0.904849
645
+ 9: Layer 18, <cos_sim> = 0.90949
646
+ 10: Layer 20, <cos_sim> = 0.912555
647
+ 11: Layer 17, <cos_sim> = 0.913825
648
+ 12: Layer 21, <cos_sim> = 0.916861
649
+ 13: Layer 43, <cos_sim> = 0.920321
650
+ 14: Layer 22, <cos_sim> = 0.920897
651
+ 15: Layer 7, <cos_sim> = 0.925641
652
+ 16: Layer 10, <cos_sim> = 0.928077
653
+ 17: Layer 9, <cos_sim> = 0.930262
654
+ 18: Layer 23, <cos_sim> = 0.930822
655
+ 19: Layer 8, <cos_sim> = 0.932862
656
+ 20: Layer 24, <cos_sim> = 0.936006
657
+ 21: Layer 3, <cos_sim> = 0.940002
658
+ 22: Layer 41, <cos_sim> = 0.945994
659
+ 23: Layer 25, <cos_sim> = 0.946426
660
+ 24: Layer 27, <cos_sim> = 0.946791
661
+ 25: Layer 42, <cos_sim> = 0.94737
662
+ 26: Layer 26, <cos_sim> = 0.948684
663
+ 27: Layer 36, <cos_sim> = 0.949698
664
+ 28: Layer 39, <cos_sim> = 0.949899
665
+ 29: Layer 37, <cos_sim> = 0.9515
666
+ 30: Layer 28, <cos_sim> = 0.951921
667
+ 31: Layer 38, <cos_sim> = 0.953373
668
+ 32: Layer 35, <cos_sim> = 0.955007
669
+ 33: Layer 29, <cos_sim> = 0.955639
670
+ 34: Layer 34, <cos_sim> = 0.955797
671
+ 35: Layer 31, <cos_sim> = 0.956181
672
+ 36: Layer 6, <cos_sim> = 0.956762
673
+ 37: Layer 33, <cos_sim> = 0.958702
674
+ 38: Layer 5, <cos_sim> = 0.959416
675
+ 39: Layer 40, <cos_sim> = 0.96006
676
+ 40: Layer 30, <cos_sim> = 0.960335
677
+ 41: Layer 32, <cos_sim> = 0.961425
678
+ 42: Layer 4, <cos_sim> = 0.963155
679
+ 43: Layer 1, <cos_sim> = 0.977383
680
+ 44: Layer 2, <cos_sim> = 0.981096
681
+
682
+ ======================== sorted attention importances
683
+ 0: Layer 3, <cos_sim> = 0.268473
684
+ 1: Layer 5, <cos_sim> = 0.445389
685
+ 2: Layer 1, <cos_sim> = 0.491229
686
+ 3: Layer 4, <cos_sim> = 0.507703
687
+ 4: Layer 2, <cos_sim> = 0.523524
688
+ 5: Layer 7, <cos_sim> = 0.546491
689
+ 6: Layer 6, <cos_sim> = 0.551201
690
+ 7: Layer 0, <cos_sim> = 0.657228
691
+ 8: Layer 9, <cos_sim> = 0.693649
692
+ 9: Layer 8, <cos_sim> = 0.693792
693
+ 10: Layer 10, <cos_sim> = 0.715702
694
+ 11: Layer 11, <cos_sim> = 0.738956
695
+ 12: Layer 13, <cos_sim> = 0.812073
696
+ 13: Layer 14, <cos_sim> = 0.819818
697
+ 14: Layer 12, <cos_sim> = 0.85671
698
+ 15: Layer 15, <cos_sim> = 0.860875
699
+ 16: Layer 17, <cos_sim> = 0.888072
700
+ 17: Layer 18, <cos_sim> = 0.89278
701
+ 18: Layer 16, <cos_sim> = 0.914259
702
+ 19: Layer 19, <cos_sim> = 0.931089
703
+ 20: Layer 21, <cos_sim> = 0.949091
704
+ 21: Layer 22, <cos_sim> = 0.955978
705
+ 22: Layer 20, <cos_sim> = 0.958918
706
+ 23: Layer 23, <cos_sim> = 0.963765
707
+ 24: Layer 24, <cos_sim> = 0.963995
708
+ 25: Layer 28, <cos_sim> = 0.965883
709
+ 26: Layer 43, <cos_sim> = 0.967174
710
+ 27: Layer 42, <cos_sim> = 0.969761
711
+ 28: Layer 26, <cos_sim> = 0.970181
712
+ 29: Layer 25, <cos_sim> = 0.971553
713
+ 30: Layer 39, <cos_sim> = 0.972275
714
+ 31: Layer 41, <cos_sim> = 0.975387
715
+ 32: Layer 29, <cos_sim> = 0.975487
716
+ 33: Layer 36, <cos_sim> = 0.977112
717
+ 34: Layer 32, <cos_sim> = 0.978462
718
+ 35: Layer 38, <cos_sim> = 0.979173
719
+ 36: Layer 27, <cos_sim> = 0.979313
720
+ 37: Layer 35, <cos_sim> = 0.980944
721
+ 38: Layer 34, <cos_sim> = 0.98212
722
+ 39: Layer 30, <cos_sim> = 0.982521
723
+ 40: Layer 33, <cos_sim> = 0.982989
724
+ 41: Layer 37, <cos_sim> = 0.983563
725
+ 42: Layer 40, <cos_sim> = 0.985181
726
+ 43: Layer 31, <cos_sim> = 0.985454
727
+ 44: Layer 44, <cos_sim> = 0.987712
728
+
729
+ ======================== sorted ffn importances
730
+ 0: Layer 0, <cos_sim> = 0.431108
731
+ 1: Layer 2, <cos_sim> = 0.44518
732
+ 2: Layer 3, <cos_sim> = 0.450093
733
+ 3: Layer 4, <cos_sim> = 0.471592
734
+ 4: Layer 5, <cos_sim> = 0.482406
735
+ 5: Layer 6, <cos_sim> = 0.559887
736
+ 6: Layer 1, <cos_sim> = 0.602544
737
+ 7: Layer 8, <cos_sim> = 0.643123
738
+ 8: Layer 7, <cos_sim> = 0.684008
739
+ 9: Layer 9, <cos_sim> = 0.708513
740
+ 10: Layer 10, <cos_sim> = 0.718472
741
+ 11: Layer 13, <cos_sim> = 0.770861
742
+ 12: Layer 12, <cos_sim> = 0.786273
743
+ 13: Layer 44, <cos_sim> = 0.811898
744
+ 14: Layer 14, <cos_sim> = 0.832882
745
+ 15: Layer 11, <cos_sim> = 0.841347
746
+ 16: Layer 16, <cos_sim> = 0.847809
747
+ 17: Layer 17, <cos_sim> = 0.867317
748
+ 18: Layer 18, <cos_sim> = 0.875668
749
+ 19: Layer 15, <cos_sim> = 0.886359
750
+ 20: Layer 19, <cos_sim> = 0.932629
751
+ 21: Layer 21, <cos_sim> = 0.935681
752
+ 22: Layer 20, <cos_sim> = 0.936905
753
+ 23: Layer 22, <cos_sim> = 0.94295
754
+ 24: Layer 23, <cos_sim> = 0.944582
755
+ 25: Layer 27, <cos_sim> = 0.947721
756
+ 26: Layer 24, <cos_sim> = 0.95027
757
+ 27: Layer 25, <cos_sim> = 0.952
758
+ 28: Layer 43, <cos_sim> = 0.953131
759
+ 29: Layer 35, <cos_sim> = 0.954686
760
+ 30: Layer 31, <cos_sim> = 0.954798
761
+ 31: Layer 38, <cos_sim> = 0.958932
762
+ 32: Layer 26, <cos_sim> = 0.960332
763
+ 33: Layer 37, <cos_sim> = 0.960368
764
+ 34: Layer 28, <cos_sim> = 0.96127
765
+ 35: Layer 29, <cos_sim> = 0.961706
766
+ 36: Layer 34, <cos_sim> = 0.962314
767
+ 37: Layer 36, <cos_sim> = 0.964392
768
+ 38: Layer 32, <cos_sim> = 0.965215
769
+ 39: Layer 33, <cos_sim> = 0.9656
770
+ 40: Layer 39, <cos_sim> = 0.965828
771
+ 41: Layer 41, <cos_sim> = 0.966507
772
+ 42: Layer 30, <cos_sim> = 0.966721
773
+ 43: Layer 42, <cos_sim> = 0.967369
774
+ 44: Layer 40, <cos_sim> = 0.970084
775
+
776
+ llama_print_timings: load time = 89422.20 ms
777
+ llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
778
+ llama_print_timings: prompt eval time = 3082057.10 ms / 415744 tokens ( 7.41 ms per token, 134.89 tokens per second)
779
+ llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
780
+ llama_print_timings: total time = 3182699.59 ms / 415745 tokens
logs/perplexity-Step-3.5-Flash-BF16.log ADDED
@@ -0,0 +1,202 @@
1
+ #!/usr/bin/env bash
2
+
3
+ # echo 0 | sudo tee /proc/sys/kernel/numa_balancing
4
+ # sudo sync; echo 3 | sudo tee /proc/sys/vm/drop_caches
5
+
6
+ model=/mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/Step-3.5-Flash-288x7.4B-BF16-00001-of-00009.gguf
7
+ #model=/mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/Step-3.5-Flash-Q8_0.gguf
8
+ #model=/mnt/data/models/stepfun-ai/Step-3.5-Flash-Int4/step3p5_flash_Q4_K_S-00001-of-00012.gguf
9
+ #model=/mnt/raid/hf/Step-3.5-Flash-GGUF/IQ4_XS/Step-3.5-Flash-IQ4_XS-00001-of-00004.gguf
10
+ #model=/mnt/raid/hf/Step-3.5-Flash-GGUF/IQ5_K/Step-3.5-Flash-IQ5_K-00001-of-00004.gguf
11
+ #model=/mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/Step-3.5-Flash-IQ3_KS.gguf
12
+ #model=/mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/Step-3.5-Flash-smol-IQ3_KS.gguf
13
+ #model=/mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/Step-3.5-Flash-IQ2_KL.gguf
14
+
15
+ # Check if the SOCKET variable is unset or empty.
16
+ if [[ -z "${SOCKET}" ]]; then
17
+ # If it is, print an error to standard error and exit with a non-zero status.
18
+ echo "Error: The SOCKET environment variable is not set." >&2
19
+ exit 1
20
+ else
21
+ # If it is set, print its value and exit successfully.
22
+ echo "SOCKET is set to: ${SOCKET}"
23
+ fi
24
+ SOCKET="${SOCKET}"
25
+
26
+ numactl -N "$SOCKET" -m "$SOCKET" \
27
+ ./build/bin/llama-perplexity \
28
+ -m "$model" \
29
+ -f wiki.test.raw \
30
+ --seed 1337 \
31
+ --ctx-size 512 \
32
+ -ub 4096 -b 4096 \
33
+ --numa numactl \
34
+ --threads 96 \
35
+ --threads-batch 128 \
36
+ --validate-quants \
37
+ --no-mmap
38
+
39
+ SOCKET is set to: 1
40
+ main: build = 4186 (82c4f273)
41
+ main: built with cc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 for x86_64-linux-gnu
42
+ main: seed = 1337
43
+ CPU: using device CPU - 0 MiB free
44
+ llama_model_loader: additional 8 GGUFs metadata loaded.
45
+ llama_model_loader: loaded meta data with 50 key-value pairs and 754 tensors from /mnt/data/models/ubergarm/Step-3.5-Flash-GGUF/Step-3.5-Flash-288x7.4B-BF16-00001-of-00009.gguf (version GGUF V3 (latest))
46
+ llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
47
+ llama_model_loader: - kv 0: general.architecture str = step35
48
+ llama_model_loader: - kv 1: general.type str = model
49
+ llama_model_loader: - kv 2: general.name str = Step 3.5 Flash
50
+ llama_model_loader: - kv 3: general.size_label str = 288x7.4B
51
+ llama_model_loader: - kv 4: general.license str = apache-2.0
52
+ llama_model_loader: - kv 5: general.base_model.count u32 = 1
53
+ llama_model_loader: - kv 6: general.base_model.0.name str = Step 3.5 Flash
54
+ llama_model_loader: - kv 7: general.base_model.0.organization str = Stepfun Ai
55
+ llama_model_loader: - kv 8: general.base_model.0.repo_url str = https://huggingface.co/stepfun-ai/ste...
56
+ llama_model_loader: - kv 9: step35.block_count u32 = 45
57
+ llama_model_loader: - kv 10: step35.context_length u32 = 262144
58
+ llama_model_loader: - kv 11: step35.embedding_length u32 = 4096
59
+ llama_model_loader: - kv 12: step35.feed_forward_length u32 = 11264
60
+ llama_model_loader: - kv 13: step35.attention.head_count arr[i32,45] = [64, 96, 96, 96, 64, 96, 96, 96, 64, ...
61
+ llama_model_loader: - kv 14: step35.rope.freq_base f32 = 5000000.000000
62
+ llama_model_loader: - kv 15: step35.rope.freq_base_swa f32 = 10000.000000
63
+ llama_model_loader: - kv 16: step35.expert_gating_func u32 = 2
64
+ llama_model_loader: - kv 17: step35.attention.key_length u32 = 128
65
+ llama_model_loader: - kv 18: step35.attention.value_length u32 = 128
66
+ llama_model_loader: - kv 19: general.file_type u32 = 32
+ llama_model_loader: - kv 20: step35.attention.head_count_kv arr[i32,45] = [8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, ...
+ llama_model_loader: - kv 21: step35.attention.sliding_window u32 = 512
+ llama_model_loader: - kv 22: step35.attention.sliding_window_pattern arr[i32,45] = [0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, ...
+ llama_model_loader: - kv 23: step35.expert_count u32 = 288
+ llama_model_loader: - kv 24: step35.expert_used_count u32 = 8
+ llama_model_loader: - kv 25: step35.expert_feed_forward_length u32 = 1280
+ llama_model_loader: - kv 26: step35.expert_shared_feed_forward_length u32 = 1280
+ llama_model_loader: - kv 27: step35.expert_weights_scale f32 = 3.000000
+ llama_model_loader: - kv 28: step35.expert_weights_norm bool = true
+ llama_model_loader: - kv 29: step35.leading_dense_block_count u32 = 3
+ llama_model_loader: - kv 30: step35.moe_every_n_layers u32 = 1
+ llama_model_loader: - kv 31: step35.attention.layer_norm_rms_epsilon f32 = 0.000010
+ llama_model_loader: - kv 32: step35.swiglu_clamp_exp arr[f32,45] = [0.000000, 0.000000, 0.000000, 0.0000...
+ llama_model_loader: - kv 33: step35.swiglu_clamp_shexp arr[f32,45] = [0.000000, 0.000000, 0.000000, 0.0000...
+ llama_model_loader: - kv 34: general.quantization_version u32 = 2
+ llama_model_loader: - kv 35: tokenizer.ggml.model str = gpt2
+ llama_model_loader: - kv 36: tokenizer.ggml.pre str = deepseek-v3
+ llama_model_loader: - kv 37: tokenizer.ggml.tokens arr[str,128896] = ["<|begin▁of▁sentence|>", "<�...
+ llama_model_loader: - kv 38: tokenizer.ggml.token_type arr[i32,128896] = [3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
+ llama_model_loader: - kv 39: tokenizer.ggml.merges arr[str,127741] = ["Ġ t", "Ġ a", "i n", "Ġ Ġ", "h e...
+ llama_model_loader: - kv 40: tokenizer.ggml.bos_token_id u32 = 0
+ llama_model_loader: - kv 41: tokenizer.ggml.eos_token_id u32 = 128007
+ llama_model_loader: - kv 42: tokenizer.ggml.padding_token_id u32 = 1
+ llama_model_loader: - kv 43: tokenizer.ggml.add_bos_token bool = true
+ llama_model_loader: - kv 44: tokenizer.ggml.add_sep_token bool = false
+ llama_model_loader: - kv 45: tokenizer.ggml.add_eos_token bool = false
+ llama_model_loader: - kv 46: tokenizer.chat_template str = {% macro render_content(content) %}{%...
+ llama_model_loader: - kv 47: split.no u16 = 0
+ llama_model_loader: - kv 48: split.count u16 = 9
+ llama_model_loader: - kv 49: split.tensors.count i32 = 754
+ llama_model_loader: - type f32: 266 tensors
+ llama_model_loader: - type bf16: 488 tensors
+ load: printing all EOG tokens:
+ load: - 128007 ('<|im_end|>')
+ load: special tokens cache size = 818
+ load: token to piece cache size = 0.8220 MB
103
+ llm_load_print_meta: format = GGUF V3 (latest)
+ llm_load_print_meta: arch = step35
+ llm_load_print_meta: n_ctx_train = 262144
+ llm_load_print_meta: n_embd = 4096
+ llm_load_print_meta: n_layer = 45
+ llm_load_print_meta: n_head = [64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64, 96, 96, 96, 64]
+ llm_load_print_meta: n_head_kv = 8
+ llm_load_print_meta: n_rot = 128
+ llm_load_print_meta: n_swa = 512
+ llm_load_print_meta: n_swa_pattern = 1
+ llm_load_print_meta: n_embd_head_k = 128
+ llm_load_print_meta: n_embd_head_v = 128
+ llm_load_print_meta: n_gqa = [8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8, 12, 12, 12, 8]
+ llm_load_print_meta: n_embd_k_gqa = 1024
+ llm_load_print_meta: n_embd_v_gqa = 1024
+ llm_load_print_meta: f_norm_eps = 0.0e+00
+ llm_load_print_meta: f_norm_rms_eps = 1.0e-05
+ llm_load_print_meta: f_clamp_kqv = 0.0e+00
+ llm_load_print_meta: f_max_alibi_bias = 0.0e+00
+ llm_load_print_meta: f_logit_scale = 0.0e+00
+ llm_load_print_meta: n_ff = 11264
+ llm_load_print_meta: n_expert = 288
+ llm_load_print_meta: n_expert_used = 8
+ llm_load_print_meta: causal attn = 1
+ llm_load_print_meta: pooling type = 0
+ llm_load_print_meta: rope type = 2
+ llm_load_print_meta: rope scaling = linear
+ llm_load_print_meta: freq_base_train = 5000000.0
+ llm_load_print_meta: freq_scale_train = 1
+ llm_load_print_meta: n_ctx_orig_yarn = 262144
+ llm_load_print_meta: rope_finetuned = unknown
+ llm_load_print_meta: ssm_d_conv = 0
+ llm_load_print_meta: ssm_d_inner = 0
+ llm_load_print_meta: ssm_d_state = 0
+ llm_load_print_meta: ssm_dt_rank = 0
+ llm_load_print_meta: model type = ?B
+ llm_load_print_meta: model ftype = BF16
+ llm_load_print_meta: model params = 196.956 B
+ llm_load_print_meta: model size = 366.952 GiB (16.004 BPW)
+ llm_load_print_meta: repeating layers = 364.986 GiB (16.004 BPW, 195.900 B parameters)
+ llm_load_print_meta: general.name = Step 3.5 Flash
144
+ print_info: vocab type = BPE
+ print_info: n_vocab = 128896
+ print_info: n_merges = 127741
+ print_info: BOS token = 0 '<|begin▁of▁sentence|>'
+ print_info: EOS token = 128007 '<|im_end|>'
+ print_info: EOT token = 128007 '<|im_end|>'
+ print_info: PAD token = 1 '<|end▁of▁sentence|>'
+ print_info: LF token = 201 'Ċ'
+ print_info: FIM PRE token = 128801 '<|fim▁begin|>'
+ print_info: FIM SUF token = 128800 '<|fim▁hole|>'
+ print_info: FIM MID token = 128802 '<|fim▁end|>'
+ print_info: EOG token = 128007 '<|im_end|>'
+ print_info: max token length = 256
+ llm_load_tensors: ggml ctx size = 0.31 MiB
+ llm_load_tensors: offloading 0 repeating layers to GPU
+ llm_load_tensors: offloaded 0/46 layers to GPU
+ llm_load_tensors: CPU buffer size = 375759.27 MiB
161
+ ....................................................................................................
+ llama_new_context_with_model: n_ctx = 4096
+ llama_new_context_with_model: n_batch = 4096
+ llama_new_context_with_model: n_ubatch = 4096
+ llama_new_context_with_model: flash_attn = 1
+ llama_new_context_with_model: attn_max_b = 0
+ llama_new_context_with_model: fused_moe = 1
+ llama_new_context_with_model: grouped er = 0
+ llama_new_context_with_model: fused_up_gate = 1
+ llama_new_context_with_model: fused_mmad = 1
+ llama_new_context_with_model: rope_cache = 0
+ llama_new_context_with_model: graph_reuse = 1
+ llama_new_context_with_model: k_cache_hadam = 0
+ llama_new_context_with_model: split_mode_graph_scheduling = 0
+ llama_new_context_with_model: reduce_type = f16
+ llama_new_context_with_model: sched_async = 0
+ llama_new_context_with_model: ser = -1, 0
+ llama_new_context_with_model: freq_base = 5000000.0
+ llama_new_context_with_model: freq_scale = 1
+ llama_kv_cache_init: CPU KV buffer size = 720.00 MiB
+ llama_new_context_with_model: KV self size = 720.00 MiB, K (f16): 360.00 MiB, V (f16): 360.00 MiB
+ llama_new_context_with_model: CPU output buffer size = 3.93 MiB
+ llama_new_context_with_model: CPU compute buffer size = 2078.00 MiB
+ llama_new_context_with_model: graph nodes = 2201
+ llama_new_context_with_model: graph splits = 1
+ XXXXXXXXXXXXXXXXXXXXX Setting only active experts offload
+
+ system_info: n_threads = 96 (n_threads_batch = 128) / 512 | AVX = 1 | AVX_VNNI = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | AVX512_BF16 = 1 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 |
+ perplexity: tokenizing the input ..
+ perplexity: tokenization took 723.567 ms
+ perplexity: calculating perplexity over 561 chunks, n_ctx=512, batch_size=4096, n_seq=8
+ perplexity: 15.47 seconds per pass - ETA 18.07 minutes
+ ===================================== llama_new_context_with_model: f16
+ ======================================= HAVE_FANCY_SIMD is defined
195
+ [1]1.5125,[2]1.9280,[3]1.6178,[4]1.4760,[5]1.4000,[6]1.3378,[7]1.3006,[8]1.2759,[9]1.2557,[10]1.2356,[11]1.2434,[12]1.2544,[13]1.2647,[14]1.3110,[15]1.3541,[16]1.3996,[17]1.5016,[18]1.5846,[19]1.5731,[20]1.5561,[21]1.5612,[22]1.5507,[23]1.5331,[24]1.5278,[25]1.5172,[26]1.5098,[27]1.5009,[28]1.4958,[29]1.4917,[30]1.4977,[31]1.4967,[32]1.4862,[33]1.4809,[34]1.4913,[35]1.4946,[36]1.5053,[37]1.5355,[38]1.5694,[39]1.6011,[40]1.6472,[41]1.6760,[42]1.6825,[43]1.7172,[44]1.7369,[45]1.7776,[46]1.8155,[47]1.8165,[48]1.8110,[49]1.8061,[50]1.7953,[51]1.8169,[52]1.8152,[53]1.8328,[54]1.8418,[55]1.8548,[56]1.8631,[57]1.8643,[58]1.8699,[59]1.8768,[60]1.8919,[61]1.8872,[62]1.9143,[63]1.9293,[64]1.9433,[65]1.9454,[66]1.9417,[67]1.9391,[68]1.9453,[69]1.9449,[70]1.9460,[71]1.9421,[72]1.9417,[73]1.9501,[74]1.9627,[75]1.9628,[76]1.9498,[77]1.9417,[78]1.9370,[79]1.9331,[80]1.9279,[81]1.9234,[82]1.9262,[83]1.9219,[84]1.9180,[85]1.9127,[86]1.9152,[87]1.9229,[88]1.9163,[89]1.9171,[90]1.9167,[91]1.9127,[92]1.9089,[93]1.9056,[94]1.9002,[95]1.9012,[96]1.9053,[97]1.9162,[98]1.9152,[99]1.9091,[100]1.9069,[101]1.9066,[102]1.9153,[103]1.9198,[104]1.9363,[105]1.9437,[106]1.9683,[107]1.9908,[108]2.0093,[109]2.0375,[110]2.0637,[111]2.0878,[112]2.0815,[113]2.0835,[114]2.0887,[115]2.0900,[116]2.0982,[117]2.0991,[118]2.1001,[119]2.0971,[120]2.0958,[121]2.0988,[122]2.0957,[123]2.0949,[124]2.0910,[125]2.0874,[126]2.0863,[127]2.0868,[128]2.0853,[129]2.0883,[130]2.0891,[131]2.0895,[132]2.0910,[133]2.1011,[134]2.1063,[135]2.1041,[136]2.1007,[137]2.0981,[138]2.0948,[139]2.0931,[140]2.0920,[141]2.0920,[142]2.0917,[143]2.0939,[144]2.0946,[145]2.0887,[146]2.0841,[147]2.0816,[148]2.0776,[149]2.0759,[150]2.0711,[151]2.0657,[152]2.0632,[153]2.0603,[154]2.0590,[155]2.0579,[156]2.0557,[157]2.0555,[158]2.0547,[159]2.0544,[160]2.0526,[161]2.0621,[162]2.0724,[163]2.0757,[164]2.0811,[165]2.0870,[166]2.0970,[167]2.0994,[168]2.1125,[169]2.1201,[170]2.1320,[171]2.1389,[172]2.1361,[173]2.1299,[174]2.1337,[175]2.1363,[176]2.1380,[177]2.1385,[178]2.1385,[179]2.1403,[180]2.1426,[181]2.1545,[182]2.1659,[183]2.1785,[184]2.1920,[185]2.2015,[186]2.2151,[187]2.2300,[188]2.2433,[189]2.2494,[190]2.2499,[191]2.2526,[192]2.2557,[193]2.2550,[194]2.2580,[195]2.2577,[196]2.2627,[197]2.2682,[198]2.2709,[199]2.2707,[200]2.2704,[201]2.2808,[202]2.2752,[203]2.2756,[204]2.2758,[205]2.2770,[206]2.2777,[207]2.2782,[208]2.2809,[209]2.2834,[210]2.2825,[211]2.2798,[212]2.2796,[213]2.2796,[214]2.2784,[215]2.2749,[216]2.2745,[217]2.2700,[218]2.2682,[219]2.2687,[220]2.2680,[221]2.2684,[222]2.2646,[223]2.2630,[224]2.2663,[225]2.2666,[226]2.2632,[227]2.2648,[228]2.2669,[229]2.2686,[230]2.2755,[231]2.2821,[232]2.2807,[233]2.2788,[234]2.2786,[235]2.2789,[236]2.2814,[237]2.2857,[238]2.2896,[239]2.2969,[240]2.3025,[241]2.3097,[242]2.3165,[243]2.3226,[244]2.3274,[245]2.3365,[246]2.3413,[247]2.3413,[248]2.3395,[249]2.3395,[250]2.3363,[251]2.3351,[252]2.3388,[253]2.3440,[254]2.3505,[255]2.3528,[256]2.3542,[257]2.3561,[258]2.3563,[259]2.3555,[260]2.3564,[261]2.3564,[262]2.3564,[263]2.3570,[264]2.3558,[265]2.3556,[266]2.3568,[267]2.3584,[268]2.3604,[269]2.3629,[270]2.3620,[271]2.3645,[272]2.3624,[273]2.3610,[274]2.3581,[275]2.3584,[276]2.3541,[277]2.3568,[278]2.3644,[279]2.3720,[280]2.3785,[281]2.3816,[282]2.3827,[283]2.3869,[284]2.3908,[285]2.3995,[286]2.3997,[287]2.4026,[288]2.4078,[289]2.4094,[290]2.4075,[291]2.4083,[292]2.4165,[293]2.4195,[294]2.4216,[295]2.4238,[296]2.4268,[297]2.4273,[298]2.4297,[299]2.4306,[300]2.4315,[301]2.4335,[302]2.4351,[303]2.4356,[304]2.4357,[305]2.4437,[306]2.4474,[307]2.4559,[308]2.4506,[309]2.4480,[310]2.4432,[311]2.4426,[312]2.4399,[313]2.4376,[314]2.4357,[315]2.4354,[316]2.4353,[317]2.4330,[318]2.4308,[319]2.4298,[320]2.4300,[321]2.4270,[322]2.4274,[323]2.4282,[324]2.4255,[325]2.4236,[326]2.4203,[327]2.4176,[328]2.4185,[329]2.4184,[330]2.4218,[331]2.4228,[332]2.4261,[333]2.4255,[334]2.4253,[335]2.4257,[336]2.4261,[337]2.4274,[338]2.4280,[339]2.4294,[340]2.4319,[341]2.4356,[342]2.4404,[343]2.4458,[344]2.4486,[345]2.4474,[346]2.4446,[347]2.4455,[348]2.4445,[349]2.4417,[350]2.4408,[351]2.4423,[352]2.4415,[353]2.4421,[354]2.4420,[355]2.4420,[356]2.4403,[357]2.4410,[358]2.4415,[359]2.4387,[360]2.4372,[361]2.4374,[362]2.4370,[363]2.4360,[364]2.4361,[365]2.4331,[366]2.4331,[367]2.4333,[368]2.4315,[369]2.4314,[370]2.4304,[371]2.4320,[372]2.4343,[373]2.4323,[374]2.4299,[375]2.4292,[376]2.4321,[377]2.4358,[378]2.4335,[379]2.4320,[380]2.4310,[381]2.4324,[382]2.4333,[383]2.4354,[384]2.4386,[385]2.4416,[386]2.4447,[387]2.4495,[388]2.4516,[389]2.4481,[390]2.4448,[391]2.4411,[392]2.4397,[393]2.4389,[394]2.4375,[395]2.4345,[396]2.4323,[397]2.4286,[398]2.4258,[399]2.4222,[400]2.4188,[401]2.4144,[402]2.4113,[403]2.4074,[404]2.4042,[405]2.4002,[406]2.3964,[407]2.3934,[408]2.3907,[409]2.3869,[410]2.3861,[411]2.3874,[412]2.3864,[413]2.3888,[414]2.3894,[415]2.3861,[416]2.3825,[417]2.3850,[418]2.3814,[419]2.3801,[420]2.3776,[421]2.3747,[422]2.3706,[423]2.3670,[424]2.3663,[425]2.3636,[426]2.3602,[427]2.3576,[428]2.3562,[429]2.3537,[430]2.3505,[431]2.3470,[432]2.3453,[433]2.3430,[434]2.3409,[435]2.3391,[436]2.3380,[437]2.3377,[438]2.3381,[439]2.3395,[440]2.3425,[441]2.3478,[442]2.3534,[443]2.3516,[444]2.3511,[445]2.3515,[446]2.3537,[447]2.3564,[448]2.3580,[449]2.3595,[450]2.3612,[451]2.3634,[452]2.3642,[453]2.3656,[454]2.3641,[455]2.3664,[456]2.3674,[457]2.3700,[458]2.3738,[459]2.3739,[460]2.3745,[461]2.3727,[462]2.3734,[463]2.3768,[464]2.3811,[465]2.3792,[466]2.3804,[467]2.3820,[468]2.3835,[469]2.3839,[470]2.3849,[471]2.3872,[472]2.3892,[473]2.3895,[474]2.3912,[475]2.3928,[476]2.3930,[477]2.3936,[478]2.3945,[479]2.3961,[480]2.3975,[481]2.3948,[482]2.3958,[483]2.3948,[484]2.3977,[485]2.4024,[486]2.4038,[487]2.4061,[488]2.4079,[489]2.4099,[490]2.4128,[491]2.4155,[492]2.4188,[493]2.4186,[494]2.4172,[495]2.4168,[496]2.4166,[497]2.4169,[498]2.4168,[499]2.4157,[500]2.4170,[501]2.4207,[502]2.4199,[503]2.4202,[504]2.4209,[505]2.4228,[506]2.4245,[507]2.4259,[508]2.4280,[509]2.4251,[510]2.4246,[511]2.4238,[512]2.4222,[513]2.4199,[514]2.4195,[515]2.4192,[516]2.4170,[517]2.4164,[518]2.4161,[519]2.4153,[520]2.4149,[521]2.4149,[522]2.4137,[523]2.4146,[524]2.4141,[525]2.4148,[526]2.4135,[527]2.4115,[528]2.4113,[529]2.4105,[530]2.4100,[531]2.4090,[532]2.4065,[533]2.4042,[534]2.4025,[535]2.4023,[536]2.4037,[537]2.4056,[538]2.4072,[539]2.4089,[540]2.4121,[541]2.4149,[542]2.4175,[543]2.4190,[544]2.4184,[545]2.4186,[546]2.4160,[547]2.4136,[548]2.4108,[549]2.4085,[550]2.4071,[551]2.4058,[552]2.4042,[553]2.4031,[554]2.4033,[555]2.4028,[556]2.4056,[557]2.4078,[558]2.4111,[559]2.4132,[560]2.4173,[561]2.4169,
+ llama_print_timings: load time = 168747.99 ms
+ llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_print_timings: prompt eval time = 879771.94 ms / 287232 tokens ( 3.06 ms per token, 326.48 tokens per second)
+ llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_print_timings: total time = 890647.44 ms / 287233 tokens
+
+ Final estimate: PPL over 561 chunks for n_ctx=512 = 2.4169 +/- 0.01107