HAE-RAE

non-profit

Recent Activity

seungone

authored 5 papers 5 days ago

Measuring Sycophancy of Language Models in Multi-turn Dialogues

Paper • 2505.23840 • Published May 28, 2025 • 2

Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning

Paper • 2507.00432 • Published Jul 1, 2025 • 79

OptimalThinkingBench: Evaluating Over and Underthinking in LLMs

Paper • 2508.13141 • Published Aug 18, 2025

VideoJudge: Bootstrapping Enables Scalable Supervision of MLLM-as-a-Judge for Video Understanding

Paper • 2509.21451 • Published Sep 25, 2025

SPICE: Self-Play In Corpus Environments Improves Reasoning

Paper • 2510.24684 • Published Oct 28, 2025 • 17

seungone

authored a paper about 1 month ago

RefineBench: Evaluating Refinement Capability of Language Models via Checklists

Paper • 2511.22173 • Published Nov 27, 2025 • 14

Albertmade

authored a paper 2 months ago

AI PB: A Grounded Generative Agent for Personalized Investment Insights

Paper • 2510.20099 • Published Oct 23, 2025

amphora

updated a dataset 3 months ago

HAERAE-HUB/KoSimpleEval

Viewer • Updated Oct 12, 2025 • 123k • 884

Dasool

authored 11 papers 3 months ago

Understand, Solve and Translate: Bridging the Multilingual Mathematical Reasoning Gap

Paper • 2501.02448 • Published Jan 5, 2025

Multi-Step Reasoning in Korean and the Emergent Mirage

Paper • 2501.05712 • Published Jan 10, 2025

Improving Fine-grained Visual Understanding in VLMs through Text-Only Training

Paper • 2412.12940 • Published Dec 17, 2024

KoMultiText: Large-Scale Korean Text Dataset for Classifying Biased Speech in Real-World Online Services

Paper • 2310.04313 • Published Oct 6, 2023

HRET: A Self-Evolving LLM Evaluation Toolkit for Korean

Paper • 2503.22968 • Published Mar 29, 2025

Towards Machine Unlearning Benchmarks: Forgetting the Personal Identities in Facial Recognition Systems

Paper • 2311.02240 • Published Nov 3, 2023

Better Safe Than Sorry? Overreaction Problem of Vision Language Models in Visual Emergency Recognition

Paper • 2505.15367 • Published May 21, 2025 • 2

No Language Data Left Behind: A Comparative Study of CJK Language Datasets in the Hugging Face Ecosystem

Paper • 2507.04329 • Published Jul 6, 2025

When Good Sounds Go Adversarial: Jailbreaking Audio-Language Models with Benign Inputs

Paper • 2508.03365 • Published Aug 5, 2025 • 4

Ko-PIQA: A Korean Physical Commonsense Reasoning Dataset with Cultural Context

Paper • 2509.11303 • Published Sep 14, 2025

Pushing on Multilingual Reasoning Models with Language-Mixed Chain-of-Thought

Paper • 2510.04230 • Published Oct 5, 2025 • 26

Cartinoe5930

authored a paper 3 months ago

Pushing on Multilingual Reasoning Models with Language-Mixed Chain-of-Thought

Paper • 2510.04230 • Published Oct 5, 2025 • 26