VLM4D: Towards Spatiotemporal Awareness in Vision Language Models Paper • 2508.02095 • Published Aug 4, 2025 • 10
AvatarPointillist: AutoRegressive 4D Gaussian Avatarization Paper • 2604.04787 • Published Apr 6 • 12
Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models Paper • 2605.21573 • Published 7 days ago • 92
An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion Paper • 2408.03178 • Published Aug 6, 2024 • 40