Surrogate code verifiers at three model sizes, trained with several algorithms as described in the Aletheia paper.
Aletheia
community
models 21
Aletheia-Bench/DPO-Think-14B
Text Generation • 15B • Updated • 10 • 1
Aletheia-Bench/DPO-Think-1.5B
Text Generation • 2B • Updated • 10
Aletheia-Bench/BatchOnline-GRPO-7B
Text Generation • 8B • Updated • 7 • 1
Aletheia-Bench/BatchOnline-GRPO-14B
Text Generation • 15B • Updated • 10 • 1
Aletheia-Bench/BatchOnline-GRPO-1.5B
Text Generation • 2B • Updated • 10
Aletheia-Bench/GRPO-Think-14B-8k
Text Generation • 15B • Updated • 1 • 1
Aletheia-Bench/GRPO-Think-7B-8k
Text Generation • 8B • Updated • 1
Aletheia-Bench/GRPO-Think-14B-4k
Text Generation • 15B • Updated
Aletheia-Bench/RAFT-7B
8B • Updated • 7
Aletheia-Bench/GRPO-Think-1.5B-8k
Text Generation • 2B • Updated • 1
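The repository names above appear to encode the training algorithm, the base model size, and in some cases a suffix such as 8k or 4k. The following is a minimal sketch, not part of the Aletheia release, that parses the ten listed IDs under that assumed naming pattern and groups the sizes by algorithm. The regular expression, the function names, and the field names are all hypothetical, and the meaning of the 4k/8k suffix is not stated on this page, so it is kept as an opaque string.

```python
import re
from collections import defaultdict

# Repository IDs copied from the listing above (the 10 shown out of 21).
MODEL_IDS = [
    "Aletheia-Bench/DPO-Think-14B",
    "Aletheia-Bench/DPO-Think-1.5B",
    "Aletheia-Bench/BatchOnline-GRPO-7B",
    "Aletheia-Bench/BatchOnline-GRPO-14B",
    "Aletheia-Bench/BatchOnline-GRPO-1.5B",
    "Aletheia-Bench/GRPO-Think-14B-8k",
    "Aletheia-Bench/GRPO-Think-7B-8k",
    "Aletheia-Bench/GRPO-Think-14B-4k",
    "Aletheia-Bench/RAFT-7B",
    "Aletheia-Bench/GRPO-Think-1.5B-8k",
]

# Assumed pattern (inferred from the names, not documented):
#   <algorithm>-<size>B[-<suffix>k]
NAME_RE = re.compile(
    r"^(?P<algo>.+?)-(?P<size>\d+(?:\.\d+)?)B(?:-(?P<suffix>\d+k))?$"
)


def parse_model_id(repo_id):
    """Split 'org/name' into org, algorithm, nominal size, and optional suffix."""
    org, name = repo_id.split("/", 1)
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized model name: {name!r}")
    return {
        "org": org,
        "algorithm": match["algo"],
        "size_b": float(match["size"]),  # nominal size in the name, in billions
        "suffix": match["suffix"],       # e.g. '8k'; None when absent
    }


def group_by_algorithm(repo_ids):
    """Map each algorithm name to the sorted list of nominal sizes it appears with."""
    groups = defaultdict(set)
    for repo_id in repo_ids:
        info = parse_model_id(repo_id)
        groups[info["algorithm"]].add(info["size_b"])
    return {algo: sorted(sizes) for algo, sizes in groups.items()}


if __name__ == "__main__":
    for algo, sizes in group_by_algorithm(MODEL_IDS).items():
        print(algo, sizes)
```

The nominal sizes in the names (1.5B, 7B, 14B) differ from the parameter counts the listing reports (2B, 8B, 15B); the sketch reads only the names.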
datasets 6
Aletheia-Bench/Aletheia-Heldout
Viewer • Updated • 33.3k • 49
Aletheia-Bench/Aletheia-Strong
Viewer • Updated • 57.3k • 48
Aletheia-Bench/Aletheia-Train
Viewer • Updated • 50k • 12
Aletheia-Bench/Aletheia-Adv
Viewer • Updated • 18k • 44
Aletheia-Bench/Aletheia-DPO
Viewer • Updated • 50k • 15
Aletheia-Bench/Aletheia-Hard
Viewer • Updated • 18k • 47