| | --- |
| | base_model: Qwen/Qwen2.5-32B-Instruct |
| | library_name: transformers |
| | model_name: step-conditional-control |
| | tags: |
| | - generated_from_trainer |
| | - trl |
| | - sft |
| | license: apache-2.0 |
| | --- |
| | |
| | # Model Summary |
| |
|
| | - **Repository:** [simplescaling/s1](https://github.com/simplescaling/s1) |
| | - **Paper:** https://arxiv.org/abs/2501.19393 |
| |
|
| | # Use |
| |
|
| | This is the token-conditional control model for our paper. You can evaluate using the information [here](https://github.com/simplescaling/s1?tab=readme-ov-file#evaluation). |
| |
|
| | # Training information |
| |
|
| | [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/hashimoto-group/o1/runs/i3e03g4y) |
| |
|
| | - TRL: 0.13.0 |
| | - Transformers: 4.48.0 |
| | - Pytorch: 2.3.1 |
| | - Datasets: 3.0.1 |
| | - Tokenizers: 0.21.0 |
| |
|
| | # Citation |
| |
|
| | ```bibtex |
| | @misc{muennighoff2025s1simpletesttimescaling, |
| | title={s1: Simple test-time scaling}, |
| | author={Niklas Muennighoff and Zitong Yang and Weijia Shi and Xiang Lisa Li and Li Fei-Fei and Hannaneh Hajishirzi and Luke Zettlemoyer and Percy Liang and Emmanuel Candès and Tatsunori Hashimoto}, |
| | year={2025}, |
| | eprint={2501.19393}, |
| | archivePrefix={arXiv}, |
| | primaryClass={cs.CL}, |
| | url={https://arxiv.org/abs/2501.19393}, |
| | } |
| | ``` |