shunxing1234 committed · commit d3728cf (verified) · 1 parent: 2a58a3a

Update README.md

Files changed (1): README.md (+1 −148)

README.md CHANGED
@@ -25,178 +25,31 @@ library_name: transformers
# News

- We released the technical report of the **KAT-V1 model**, available at https://arxiv.org/pdf/2507.08297.
- Kwaipilot-AutoThink ranks first among all open-source models on [LiveCodeBench Pro](https://livecodebenchpro.com/), a challenging benchmark explicitly designed to prevent data leakage, and even surpasses strong proprietary systems such as Seed and o3-mini.

***

# Introduction

**KAT (Kwaipilot-AutoThink)** is an open-source large language model that mitigates *over-thinking* by learning **when** to produce an explicit chain-of-thought and **when** to answer directly.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/61ee40a269351366e29972ad/zdnsvBmv6hWIC2Qxxy1fD.png)

Its development follows a concise two-stage training pipeline:
<table>
  <thead>
    <tr>
      <th style="text-align:left; width:18%;">Stage</th>
      <th style="text-align:left;">Core Idea</th>
      <th style="text-align:left;">Key Techniques</th>
      <th style="text-align:left;">Outcome</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>1. Pre-training</strong></td>
      <td>Inject knowledge while separating “reasoning” from “direct answering”.</td>
      <td>
        <em>Dual-regime data</em><br>
        • <strong>Think-off</strong> queries labeled via a custom tagging system.<br>
        • <strong>Think-on</strong> queries generated by a multi-agent solver.<br><br>
        <em>Knowledge Distillation&nbsp;+&nbsp;Multi-Token Prediction</em> for fine-grained utility.
      </td>
      <td>Base model attains strong factual and reasoning skills without full-scale pre-training costs.</td>
    </tr>
    <tr>
      <td><strong>2. Post-training</strong></td>
      <td>Make reasoning optional and efficient.</td>
      <td>
        <em>Cold-start AutoThink</em> — majority vote sets the initial thinking mode.<br>
        <em>Step-SRPO</em> — intermediate supervision rewards correct <strong>mode selection</strong> and <strong>answer accuracy</strong> under that mode.
      </td>
      <td>Model triggers CoT only when beneficial, reducing token use and speeding inference.</td>
    </tr>
  </tbody>
</table>

![image/png](https://cdn-uploads.huggingface.co/production/uploads/61ee40a269351366e29972ad/cwFAEh7Rl3f4FU46z8gBZ.png)
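To make the post-training recipe above more concrete, here is a minimal, illustrative sketch. It is **not** the released training code: the function names `cold_start_mode` and `step_srpo_reward`, and the reward weights, are assumptions for illustration. It only shows how a majority vote over sampled judge decisions could set the cold-start thinking mode, and how a Step-SRPO-style reward could score the mode decision and the answer separately.

```python
from collections import Counter

def cold_start_mode(judgments: list[str]) -> str:
    """Majority vote over sampled judge decisions ("think_on" / "think_off")
    sets the initial thinking mode for a query (cold-start AutoThink sketch)."""
    return Counter(judgments).most_common(1)[0][0]

def step_srpo_reward(chosen_mode: str, preferred_mode: str, answer_correct: bool,
                     mode_weight: float = 0.3, answer_weight: float = 0.7) -> float:
    """Step-SRPO-style reward sketch: intermediate supervision rewards the
    mode selection, and the final answer is rewarded under that mode.
    The 0.3 / 0.7 weights are placeholders, not values from the paper."""
    mode_reward = 1.0 if chosen_mode == preferred_mode else 0.0
    answer_reward = 1.0 if answer_correct else 0.0
    return mode_weight * mode_reward + answer_weight * answer_reward

# Example: five sampled judgments for one query
print(cold_start_mode(["think_off", "think_on", "think_off", "think_off", "think_on"]))  # -> think_off
print(step_srpo_reward("think_off", "think_off", answer_correct=True))                   # -> 1.0
```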
 
***

# Data Format

KAT produces responses in a **structured template** that makes the reasoning path explicit and machine-parsable.
Two modes are supported:

![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/61ee40a269351366e29972ad/H8iAvQMMT02nyvlYnI5q1.jpeg)

## Special Tokens

| Token | Description |
|-------|-------------|
| `<judge>` | Analyzes the input to decide whether explicit reasoning is needed. |
| `<think_on>` / `<think_off>` | Indicates whether reasoning is **activated** (“on”) or **skipped** (“off”). |
| `<think>` | Marks the start of the chain-of-thought segment when `think_on` is chosen. |
| `<answer>` | Marks the start of the final user-facing answer. |
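Because the template is machine-parsable, a decoded response can be split on these tokens. The snippet below is an unofficial parsing sketch, not a utility shipped with the model; it assumes that `</judge>`, `</think>`, and `</answer>` closing tags mirror the opening tokens, as in the Quick Start example output further down.

```python
import re

def parse_kat_response(text: str) -> dict:
    """Split a KAT response into its template segments using the special tokens.
    Assumed layout: <judge>...</judge> <think_on|think_off> [<think>...</think>] <answer>...</answer>."""
    judge = re.search(r"<judge>(.*?)</judge>", text, re.DOTALL)
    mode = "think_on" if "<think_on>" in text else ("think_off" if "<think_off>" in text else None)
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    return {
        "judge": judge.group(1).strip() if judge else None,
        "mode": mode,
        "reasoning": think.group(1).strip() if think else None,
        "answer": answer.group(1).strip() if answer else None,
    }
```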
 
***

# 🔧 Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "Kwaipilot/KAT-V1-40B"

# Load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# Prepare the model input
prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate the completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=65536,
    temperature=0.6,
    top_p=0.95,
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
content = tokenizer.decode(output_ids, skip_special_tokens=True).strip("\n")
print("prompt:\n", prompt)
print("content:\n", content)
"""
prompt:
Give me a short introduction to large language model.
content:
<judge>
The user's request is to provide a concise factual introduction to large language models, which involves retrieving and summarizing basic information. This task is straightforward as it only requires recalling and presenting well-known details without deeper analysis. No complex reasoning is needed here—just a simple explanation will suffice.
</judge>

<think_off>
<answer>
A **Large Language Model (LLM)** is an advanced AI system trained on vast amounts of text data to understand, generate, and process human-like language. Here’s a concise introduction:

### Key Points:
1. **Training**: Trained on diverse text sources (books, websites, etc.) using deep learning.
2. **Capabilities**:
   - Answer questions, generate text, summarize content, translate languages.
   - Understand context, sentiment, and nuances in language.
3. **Architecture**: Often based on **transformer models** (e.g., BERT, GPT, LLaMA).
4. **Scale**: Billions of parameters, requiring massive computational resources.
5. **Applications**: Chatbots, content creation, coding assistance, research, and more.

### Examples:
- **OpenAI’s GPT-4**: Powers ChatGPT.
- **Google’s Gemini**: Used in Bard.
- **Meta’s LLaMA**: Open-source alternative.

### Challenges:
- **Bias**: Can reflect biases in training data.
- **Accuracy**: May hallucinate "facts" not grounded in reality.
- **Ethics**: Raises concerns about misinformation and job displacement.

LLMs represent a leap forward in natural language processing, enabling machines to interact with humans in increasingly sophisticated ways. 🌐🤖
</answer>
"""
```
 
***

# Future Releases

Looking ahead, we will publish a companion paper that fully documents the **AutoThink training framework**, covering:

* Cold-start initialization procedures
* Reinforcement-learning (Step-SRPO) strategies
* Data curation and reward design details

At the same time, we will open-source:

* **Training resources** – the curated dual-regime datasets and RL codebase
* **Model suite** – checkpoints at 1.5B, 7B, and 13B parameters, all trained with AutoThink gating
# Citation

```bibtex
@techreport{Zhan2025KATV1,
  title       = {KAT-V1: Kwai-AutoThink Technical Report},
  author      = {Zhan, Zizheng and Deng, Ken and Tang, Huaixi and Xiang, Wen and Wu, Kun and others},
  year        = {2025},
  institution = {arXiv},
  number      = {arXiv:2507.08297},
  url         = {https://arxiv.org/abs/2507.08297}
}
```
 