# Qwen3 Lambda Gates — Knowledge/Reasoning Disentanglement

Collection of 10 items. Per-neuron sigmoid gates on Qwen3 FFN neurons to disentangle factual knowledge from reasoning.
A 1.7B scale-up of `qwen3-0.6b-lambda-gates-chat`: the same four variants and the same training recipe, applied to the larger base model.
| Folder | scale_mode | λ_f | λ_r | unmasked_retain_weight |
|---|---|---|---|---|
| chat_energy/ | energy | 0.1 | 0.5 | 0.0 |
| chat_energy_optA/ | energy | 0.1 | 0.5 | 0.05 |
| chat_energy_optB/ | energy | 1.0 | 0.5 | 0.1 |
| chat_mean/ | mean | 0.1 | 0.5 | 0.0 |
Common hyperparameters: β=4.0, distill T=2.0, forget_retain_ratio=1:2, lr=1e-2 cosine, 3 epochs, bf16, init_logit_std=0.1, use_chat_template=True, enable_thinking=False.
| Metric | Value |
|---|---|
| Total gates | 172,032 (28 layers × 6144 intermediate) |
| Mean sigmoid gate | ≈ 0.501 |
| Std | ≈ 0.030 |
| Min / Max | 0.321 / 0.672 |
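The summary statistics above can be recomputed directly from a variant's logits file. The helper below is a sketch; the `gate_stats` name and the example path are illustrative, not part of the release:

```python
import torch

def gate_stats(logits: torch.Tensor) -> dict:
    """Summarize sigmoid gate values computed from raw per-neuron logits."""
    gates = torch.sigmoid(logits.float())
    return {
        "mean": gates.mean().item(),
        "std": gates.std().item(),
        "min": gates.min().item(),
        "max": gates.max().item(),
    }

# e.g. logits = torch.load("chat_energy/lambda_logits.pt", map_location="cpu")
# gate_stats(logits)  # should roughly match the table above
```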
| Threshold | Off-fraction (gates below threshold) |
|---|---|
| 0.451 | 4.9% |
| 0.482 | 25.4% |
| 0.501 | 49.5% |
| 0.521 | 75.1% |
| 0.551 | 95.0% |
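The off-fraction is the share of gates whose sigmoid value falls below the threshold, so a threshold targeting a given off-fraction is just a quantile of the gate distribution. A sketch of both directions (function names are illustrative):

```python
import torch

def off_fraction(gates: torch.Tensor, threshold: float) -> float:
    """Fraction of gates whose value falls below `threshold` ("off")."""
    return (gates < threshold).float().mean().item()

def threshold_for(gates: torch.Tensor, target_off: float) -> float:
    """Gate value at which `target_off` of all gates are off (a quantile)."""
    return torch.quantile(gates.float(), target_off).item()
```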
Per-variant thresholds are in each folder's thresholds.txt / gate_stats.json.
```
<variant>/
├── lambda_logits.pt      # 172,032 per-neuron gate logits
├── neuron_indices.json   # knowledge neurons selected at threshold 0.5
├── gate_stats.json       # summary statistics and selected thresholds
└── thresholds.txt        # comma-separated thresholds
```
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-1.7B")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-1.7B", torch_dtype=torch.bfloat16)

# Build a chat prompt with thinking disabled, matching the training setup
messages = [{"role": "user", "content": "What is 2+2?"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False
)

# Load the learned per-neuron gate logits; sigmoid maps them to gate values
gate_logits = torch.load("chat_energy_optA/lambda_logits.pt", map_location="cpu")
gates = torch.sigmoid(gate_logits.float())
```
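To actually apply the gates at inference time, one option is to scale each layer's FFN intermediate activation by its per-neuron gates just before the down projection. The sketch below assumes the logits are stored layer-major (28 layers × 6144 neurons for Qwen3-1.7B); `attach_gates` is an illustrative helper, not part of the release:

```python
import torch
from torch import nn

def attach_gates(down_proj: nn.Module, layer_gates: torch.Tensor):
    """Scale the FFN intermediate activation by per-neuron gates
    just before the down projection, via a forward pre-hook."""
    def pre_hook(module, args):
        (hidden,) = args  # input to down_proj: the gated intermediate activation
        return (hidden * layer_gates.to(hidden.dtype),)
    return down_proj.register_forward_pre_hook(pre_hook)

# Sketch, assuming layer-major storage of the logits:
# gates = torch.sigmoid(
#     torch.load("chat_energy_optA/lambda_logits.pt").float()
# ).view(28, 6144)
# for i, layer in enumerate(model.model.layers):
#     attach_gates(layer.mlp.down_proj, gates[i])
```

Each call returns a hook handle, so the gating can be removed again with `handle.remove()`.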
See the 0.6B chat README for the complete gating recipe and scale-mode notes.