Yinxu Pan

cppowboy

https://github.com/Cppowboy

AI & ML interests

RL for LLMs, Code & Math Reasoning, Function Calling, Code Interpreter, Vision-Language Pretraining

Recent Activity

liked a model 1 day ago

moonshotai/Kimi-K2.6

upvoted a paper about 1 hour ago

ClawEnvKit: Automatic Environment Generation for Claw-Like Agents

Paper • 2604.18543 • Published 2 days ago • 19

upvoted a paper 2 days ago

Qwen3.5-Omni Technical Report

Paper • 2604.15804 • Published 5 days ago • 43

upvoted 3 papers 7 days ago

Lightning OPD: Efficient Post-Training for Large Reasoning Models with Offline On-Policy Distillation

Paper • 2604.13010 • Published 8 days ago • 12

Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Paper • 2604.12374 • Published 8 days ago • 36

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published 8 days ago • 84

upvoted 2 papers 11 days ago

SkillClaw: Let Skills Evolve Collectively with Agentic Evolver

Paper • 2604.08377 • Published 13 days ago • 282

ClawBench: Can AI Agents Complete Everyday Online Tasks?

Paper • 2604.08523 • Published 13 days ago • 258

upvoted 2 papers 21 days ago

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published Mar 20 • 338

LongCat-Next: Lexicalizing Modalities as Discrete Tokens

Paper • 2603.27538 • Published 24 days ago • 144

upvoted a paper 26 days ago

SlopCodeBench: Benchmarking How Coding Agents Degrade Over Long-Horizon Iterative Tasks

Paper • 2603.24755 • Published 28 days ago • 30

upvoted a paper 29 days ago

LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning

Paper • 2603.21065 • Published Mar 22 • 77

upvoted 9 papers about 1 month ago

MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild

Paper • 2603.17187 • Published Mar 17 • 139

Online Experiential Learning for Language Models

Paper • 2603.16856 • Published Mar 17 • 58

TRUST-SQL: Tool-Integrated Multi-Turn Reinforcement Learning for Text-to-SQL over Unknown Schemas

Paper • 2603.16448 • Published Mar 17 • 58

InCoder-32B: Code Foundation Model for Industrial Scenarios

Paper • 2603.16790 • Published Mar 17 • 308

MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification

Paper • 2603.15726 • Published Mar 16 • 185

SWE-Skills-Bench: Do Agent Skills Actually Help in Real-World Software Engineering?

Paper • 2603.15401 • Published Mar 16 • 19

daVinci-Env: Open SWE Environment Synthesis at Scale

Paper • 2603.13023 • Published Mar 13 • 30

In-Context Reinforcement Learning for Tool Use in Large Language Models

Paper • 2603.08068 • Published Mar 9 • 43

Can Large Language Models Keep Up? Benchmarking Online Adaptation to Continual Knowledge Streams

Paper • 2603.07392 • Published Mar 8 • 18
