CoF-T2I: Video Models as Pure Visual Reasoners for Text-to-Image Generation • Paper 2601.10061 • Published 11 days ago • 30 upvotes
GARDO: Reinforcing Diffusion Models without Reward Hacking • Paper 2512.24138 • Published 27 days ago • 29 upvotes
GRAN-TED: Generating Robust, Aligned, and Nuanced Text Embedding for Diffusion Models • Paper 2512.15560 • Published Dec 17, 2025 • 25 upvotes
T2AV-Compass: Towards Unified Evaluation for Text-to-Audio-Video Generation • Paper 2512.21094 • Published Dec 24, 2025 • 25 upvotes
Scone: Bridging Composition and Distinction in Subject-Driven Image Generation via Unified Understanding-Generation Modeling • Paper 2512.12675 • Published Dec 14, 2025 • 41 upvotes
Monet: Reasoning in Latent Visual Space Beyond Images and Language • Paper 2511.21395 • Published Nov 26, 2025 • 17 upvotes
MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs • Paper 2511.07250 • Published Nov 10, 2025 • 18 upvotes
MT-Video-Bench: A Holistic Video Understanding Benchmark for Evaluating Multimodal LLMs in Multi-Turn Dialogues • Paper 2510.17722 • Published Oct 20, 2025 • 20 upvotes
IF-VidCap: Can Video Caption Models Follow Instructions? • Paper 2510.18726 • Published Oct 21, 2025 • 26 upvotes
VR-Thinker: Boosting Video Reward Models through Thinking-with-Image Reasoning • Paper 2510.10518 • Published Oct 12, 2025 • 19 upvotes
ReLook: Vision-Grounded RL with a Multimodal LLM Critic for Agentic Web Coding • Paper 2510.11498 • Published Oct 13, 2025 • 11 upvotes
OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs • Paper 2510.10689 • Published Oct 12, 2025 • 47 upvotes
AVoCaDO: An Audiovisual Video Captioner Driven by Temporal Orchestration • Paper 2510.10395 • Published Oct 12, 2025 • 31 upvotes