expanding my small dataset using
- contemplation on text (for further CPT)
- q&a generation (for GRPO)
after GRPO, the successful generations go through another round of SFT.
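a minimal sketch of the "successful ones go to SFT" step, assuming each GRPO rollout carries a numeric reward and that "successful" means clearing a threshold. the `Rollout` type, `select_for_sft`, and the `min_reward` value are all hypothetical names made up for illustration, not from any library.

```python
from dataclasses import dataclass


@dataclass
class Rollout:
    prompt: str
    completion: str
    reward: float  # score assigned by the GRPO reward function (assumed scale 0..1)


def select_for_sft(rollouts, min_reward=0.8):
    """Keep the best completion per prompt, but only if it clears min_reward."""
    best = {}
    for r in rollouts:
        if r.reward < min_reward:
            continue  # not "successful" under the assumed threshold
        if r.prompt not in best or r.reward > best[r.prompt].reward:
            best[r.prompt] = r
    # Emit plain prompt/completion pairs, a common shape for SFT data.
    return [{"prompt": r.prompt, "completion": r.completion} for r in best.values()]
```

keeping only one completion per prompt avoids overweighting questions that happened to get many good samples; dropping that rule is a reasonable alternative if more volume is wanted.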
this almost doubled my dataset. the new examples are synthetic, but they are derived from important sources and cover important matters. i am focusing on controversial claims more than anything else because those are what actually move models.
started fine tuning qwen 3.6. using vibe coding to play with LoRA adapters. i made lots of LoRAs for qwen 3.5 and now i can apply them to 3.6, except for one tensor type. all of the MLP tensors match 3.6 and most of the attention tensors do too. that will save me a lot of time. the 3.6 fine tune will probably come out sooner, and with better alignment since the dataset is expanded.
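a rough sketch of how one might check which adapter tensors carry over, assuming compatibility means "same module name and same (in, out) shape" in both models. the module names and shapes in the usage note are invented for illustration and are not the real qwen layouts; `split_compatible` is a hypothetical helper.

```python
def split_compatible(adapter_modules, target_modules):
    """Split adapter modules into those that fit the target model and those that do not.

    Both arguments map a module name to the (in_features, out_features)
    shape of the linear layer the LoRA pair was trained against.
    """
    usable, skipped = {}, {}
    for name, shape in adapter_modules.items():
        target_shape = target_modules.get(name)  # None if the module is missing
        if target_shape == shape:
            usable[name] = shape
        else:
            skipped[name] = (shape, target_shape)
    return usable, skipped
```

for example, with `{"layers.0.mlp.up_proj": (1024, 4096), "layers.0.self_attn.q_proj": (1024, 1024)}` as the adapter and a target whose `q_proj` is `(1024, 2048)`, the MLP entry lands in `usable` and the attention entry lands in `skipped` with both shapes recorded.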
started a truth db project where i will compare all the claims in the world with each other and give each one a score. claims will fight with each other, each one supporting or weakening others. the result will hopefully be very useful for fine tuning LLMs better. it will also automate my curation process.
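one way the "claims fight with each other" idea could be sketched, assuming each claim starts with a prior score in 0..1 and each link has a signed weight (positive supports, negative weakens). this is only an illustrative propagation scheme with made-up names (`score_claims`, `priors`, `links`), not the actual design of the truth db.

```python
def score_claims(priors, links, rounds=20, damping=0.5):
    """Iteratively adjust claim scores using support/attack links.

    priors: {claim: starting score in [0, 1]}
    links:  list of (source, target, weight); weight > 0 supports, < 0 weakens
    """
    scores = dict(priors)
    for _ in range(rounds):
        updated = {}
        for claim, prior in priors.items():
            # Sum the influence of every claim that points at this one.
            influence = sum(w * scores[src] for src, dst, w in links if dst == claim)
            raw = min(1.0, max(0.0, prior + influence))  # clamp to [0, 1]
            # Damping blends old and new scores so updates settle gradually.
            updated[claim] = damping * scores[claim] + (1 - damping) * raw
        scores = updated
    return scores
```

with a strong claim `a` that supports `b` and weakens `c`, `b` ends up above its prior and `c` below it, while `a` keeps its score because nothing points at it.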