Bilingual LMs (L1 {es, fr, de, pl, tr, ar, zh} + L2 en) trained on Cultura-X (L1) and FineWebEdu (L2)
Suchir Salhan
suchirsalhan
AI & ML interests
Multilinguality and Cognitively-Inspired AI. Tokenization, Pretraining, Interpretability & Alignment.
Recent Activity
updated a dataset 13 minutes ago: Beetle-Data/en-for-ko-2B-pretok
published a dataset 13 minutes ago: Beetle-Data/en-for-ko-2B-pretok
updated a model 38 minutes ago: Beetle-FineWeb-2B/beetle-bilingual-l2-50-simultaneous-b2-fineweb-2b-eus-eng-1xa100