DreamOmni2: Multimodal Instruction-based Editing and Generation Paper • 2510.06679 • Published Oct 8, 2025 • 73
Paper2Video: Automatic Video Generation from Scientific Papers Paper • 2510.05096 • Published Oct 6, 2025 • 118
Code2Video: A Code-centric Paradigm for Educational Video Generation Paper • 2510.01174 • Published Oct 1, 2025 • 33
Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play Paper • 2509.25541 • Published Sep 29, 2025 • 140
LongLive: Real-time Interactive Long Video Generation Paper • 2509.22622 • Published Sep 26, 2025 • 184
DiffusionNFT: Online Diffusion Reinforcement with Forward Process Paper • 2509.16117 • Published Sep 19, 2025 • 22
SPATIALGEN: Layout-guided 3D Indoor Scene Generation Paper • 2509.14981 • Published Sep 18, 2025 • 27
FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait Paper • 2412.01064 • Published Dec 2, 2024 • 47
OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion Paper • 2507.06165 • Published Jul 8, 2025 • 59
ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation Paper • 2502.18364 • Published Feb 25, 2025 • 36
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs Paper • 2501.06186 • Published Jan 10, 2025 • 65