Text Generation • 2B • Updated • 8 • 1
Text Generation • 0.2B • Updated • 10 • 2
Text Generation • 0.5B • Updated • 9 • 1
Text Generation • 1B • Updated • 435 • 1
Text Generation • 7B • Updated • 10 • 3
fla-hub/SmolLM-1.7b-predecay
2B • Updated • 1
Text Generation • 0.2B • Updated • 78 • 6
Text Generation • 0.5B • Updated • 8 • 1
Text Generation • 2B • Updated • 88 • 1
Text Generation • 3B • Updated • 14 • 3
Text Generation • 7B • Updated • 302 • 3
fla-hub/Qwen2.5-3B-Instruct
3B • Updated • 254
8B • Updated • 2
fla-hub/Qwen2.5-7B-Instruct
8B • Updated
Text Generation • 3B • Updated • 88 • 4
Text Generation • 2B • Updated • 196 • 10
Text Generation • 0.2B • Updated • 120 • 2
Text Generation • 0.5B • Updated • 96 • 2
Text Generation • 1B • Updated • 15 • 1
Text Generation • 0.4B • Updated • 2 • 1
Text Generation • 0.2B • Updated • 24 • 5
fla-hub/transformer-340M-4K-0.5B-20480-lr3e-4-decay0.1-sqrt
0.4B • Updated • 2
fla-hub/transformer-340M-4K-0.5B-20480-lr3e-4-cosine
0.4B • Updated • 133 • 1
fla-hub/transformer-3B-qwen2.5
3B • Updated • 1
fla-hub/transformer-3B-qwen2.5-instruct
3B • Updated
fla-hub/transformer-1.5B-qwen2.5-instruct
2B • Updated
fla-hub/transformer-1.5B-qwen2.5
2B • Updated • 2 • 1
fla-hub/transformer-340M-10B
Text Generation • 0.3B • Updated • 3
fla-hub/delta_net-1.3B-100B
Text Generation • 1B • Updated • 882
Text Generation • 3B • Updated • 12