Ken Tsui
kenhktsui
AI & ML interests
ML engineer, researcher
VLM, LLM benchmark
Opinions are my own
Recent Activity
liked a model about 20 hours ago: talkie-lm/talkie-1930-13b-it
upvoted a paper 2 months ago: A Very Big Video Reasoning Suite
liked a model 3 months ago: moonshotai/Kimi-K2.5
Organizations
FastText Model for Pretraining Data Curation
- kenhktsui/llm-data-textbook-quality-fasttext-classifier-v2
Text Classification • Updated • 241 • 28
- kenhktsui/fineweb-edu-fasttext-classifier
Text Classification • Updated • 19 • 4
- kenhktsui/code-natural-language-fasttext-classifier
Text Classification • Updated • 546 • 5
- kenhktsui/math-fasttext-classifier
Text Classification • Updated • 6.4k • 2
Self Correction Bench
Benchmarking LLM capability of external and internal error correction
models 34
kenhktsui/math-fasttext-classifier
Text Classification • Updated • 6.4k • 2
kenhktsui/code-natural-language-fasttext-classifier
Text Classification • Updated • 546 • 5
kenhktsui/fineweb-edu-fasttext-classifier
Text Classification • Updated • 19 • 4
kenhktsui/llm-data-textbook-quality-fasttext-classifier-v2
Text Classification • Updated • 241 • 28
kenhktsui/finefineweb-domain-fasttext-classifier
Text Classification • Updated • 92 • 2
kenhktsui/Qwen2.5-3B-Instruct-GRPO-basic-sampling_temp_05
Text Generation • Updated • 5
kenhktsui/Qwen2.5-3B-Instruct-GRPO-minp-sampling_temp_05
Text Generation • Updated • 6
kenhktsui/Qwen-0.5B-GRPO
Text Generation • 0.5B • Updated • 5 • 1
kenhktsui/Qwen-0.5B-GRPO-gsm8k-count-wait-cap-cross-correct
Text Generation • 0.5B • Updated • 7
kenhktsui/llama3.1-8b-instruct-thinking-sft-merged-gguf
8B • Updated • 43 • 1
datasets 51
kenhktsui/scli5
Viewer • Updated • 286 • 25
kenhktsui/prm800k_sc
Viewer • Updated • 448 • 27
kenhktsui/gsm8k_sc
Viewer • Updated • 1.31k • 17
kenhktsui/FineFineWeb-First100K
Viewer • Updated • 6.7M • 133
kenhktsui/serp-bench
Updated • 4
kenhktsui/math-classifiers-data
Viewer • Updated • 2M • 206
kenhktsui/longtalk-cot-v0.1
Viewer • Updated • 61.2k • 48 • 13
kenhktsui/code-natural-language-classification-dataset
Viewer • Updated • 4.05M • 283
kenhktsui/github-code-permissive-sample
Viewer • Updated • 3.21M • 393
kenhktsui/llm-data-textbook-quality-v2
Viewer • Updated • 1.01M • 63