40.3
TFLOPS
6
2
15
hanzlajavaid
PRO
hanzla
53 followers · 20 following
AI & ML interests
Direct Preference Optimization, Supervised Finetuning, Stable Diffusion
Recent Activity
posted an update 19 days ago
Reinforcement learning can sometimes produce emergent behavior with much simpler training setups than large-scale pre-training. I explored this idea by running a small GRPO experiment on Qwen3.5 4B, and the results were pretty exciting. Hypothesis: improving visual mathematical reasoning may also improve the model’s ability to transcribe LaTeX from images. I wrote a short breakdown of the experiment here: https://hanzlajavaid.github.io/blog/grpo-experiment-exploring-emergent-properties/
updated a model 27 days ago: hanzla/Qwen3.5-4B-mathvista-GRPO
published a model 27 days ago: hanzla/Qwen3.5-4B-mathvista-GRPO
Organizations
hanzla's datasets (3)
hanzla/STEM_Reasoning • Updated Mar 20, 2025 • 23.5k • 12 • 1
hanzla/webinstruct-reasoning-sft • Updated Mar 9, 2025 • 5.23k • 10
hanzla/datascience-instruct • Updated Mar 24, 2024 • 6.83k • 34 • 2