arxiv:2506.00956
Sungmin Cha
csm9493
models
25
csm9493/67_five_dataset_shuffle_10000_fewshot_lora_all_r16_alpha32_lr_1e5_decay_1e2_cosine_epoch_3_mbs_4 • Text Generation • 7B • Updated • 7
csm9493/26_one_dataset_cot_lora_all_r16_alpha32_lr_3e5_decay_1e2_cosine_epoch_3_mbs_4 • Text Generation • 7B • Updated • 8
csm9493/43_five_dataset_shuffle_10000_cot_lora_all_r4_alpha8_lr_1e-05_decay_1e2_cosine_epoch_3_mbs_16 • Text Generation • 7B • Updated • 3
csm9493/43_five_dataset_shuffle_10000_cot_lora_all_r32_alpha64_lr_1e-05_decay_1e2_cosine_epoch_3_mbs_16 • Text Generation • 7B • Updated • 8
csm9493/43_five_dataset_shuffle_10000_cot_lora_all_r8_alpha16_lr_1e-05_decay_1e2_cosine_epoch_3_mbs_16 • Text Generation • 7B • Updated • 5
csm9493/43_five_dataset_shuffle_10000_cot_lora_all_r16_alpha32_lr_1e-05_decay_1e2_cosine_epoch_3_mbs_16 • Text Generation • 7B • Updated • 8
csm9493/24_three_dataset_shuffle_50000_cot_lora_all_r16_alpha32_lr_3e5_decay_1e2_cosine_epoch_2_mbs_4 • Text Generation • 7B • Updated • 7
csm9493/41_three_dataset_shuffle_3200_cot_lora_all_r128_alpha256_lr_3e5_decay_1e2_cosine_epoch_3_mbs_4 • Text Generation • 7B • Updated • 7
csm9493/23_one_dataset_cot_lora_all_r16_alpha32_lr_3e5_decay_1e2_cosine_epoch_2_mbs_4 • Text Generation • 7B • Updated • 9
csm9493/37_three_dataset_shuffle_3200_cot_lora_all_r64_alpha128_lr_3e5_decay_1e2_cosine_epoch_3_mbs_4 • Text Generation • 7B • Updated • 8
datasets
0
None public yet