Update README.md
--- a/README.md
+++ b/README.md
@@ -138,15 +138,18 @@ We provide our training datasets:
 
 Please refer to our blog and research paper for more technical details of Satori.
 - [Blog](https://satori-reasoning.github.io/blog/satori/)
-- [Paper](https://
+- [Paper](https://arxiv.org/pdf/2502.02508)
 
 # **Citation**
 If you find our model and data helpful, please cite our paper:
 ```
-@
-
-
-
-
+@misc{shen2025satorireinforcementlearningchainofactionthought,
+title={Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search},
+author={Maohao Shen and Guangtao Zeng and Zhenting Qi and Zhang-Wei Hong and Zhenfang Chen and Wei Lu and Gregory Wornell and Subhro Das and David Cox and Chuang Gan},
+year={2025},
+eprint={2502.02508},
+archivePrefix={arXiv},
+primaryClass={cs.CL},
+url={https://arxiv.org/abs/2502.02508},
 }
 ```