Fu-En Yang

FuEnYang

https://fuenyang1127.github.io/

AI & ML interests

Computer Vision, Deep Learning, Vision-Language Models (VLMs), Vision-Language-Action Models (VLAs), Reasoning Models, Embodied AI

Recent Activity

upvoted a paper 5 days ago

Generated Reality: Human-centric World Simulation using Interactive Video Generation with Hand and Camera Control

Paper • 2602.18422 • Published 8 days ago • 30

upvoted 6 papers 10 days ago

RISE: Self-Improving Robot Policy with Compositional World Model

Paper • 2602.11075 • Published 17 days ago • 30

GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning

Paper • 2602.12099 • Published 16 days ago • 57

Xiaomi-Robotics-0: An Open-Sourced Vision-Language-Action Model with Real-Time Execution

Paper • 2602.12684 • Published 15 days ago • 7

RLinf-Co: Reinforcement Learning-Based Sim-Real Co-Training for VLA Models

Paper • 2602.12628 • Published 15 days ago • 11

OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence

Paper • 2602.08683 • Published 19 days ago • 49

Olaf-World: Orienting Latent Actions for Video World Modeling

Paper • 2602.10104 • Published 18 days ago • 27

upvoted a paper 16 days ago

PhyCritic: Multimodal Critic Models for Physical AI

Paper • 2602.11124 • Published 17 days ago • 52

upvoted 3 papers 18 days ago

Self-Improving World Modelling with Latent Actions

Paper • 2602.06130 • Published 23 days ago • 30

WorldCompass: Reinforcement Learning for Long-Horizon World Models

Paper • 2602.09022 • Published 19 days ago • 20

Recurrent-Depth VLA: Implicit Test-Time Compute Scaling of Vision-Language-Action Models via Latent Iterative Reasoning

Paper • 2602.07845 • Published 20 days ago • 69

upvoted 3 papers 22 days ago

MemSkill: Learning and Evolving Memory Skills for Self-Evolving Agents

Paper • 2602.02474 • Published 26 days ago • 57

Context Forcing: Consistent Autoregressive Video Generation with Long Context

Paper • 2602.06028 • Published 23 days ago • 36

RISE-Video: Can Video Generators Decode Implicit World Rules?

Paper • 2602.05986 • Published 23 days ago • 26

upvoted 6 papers 23 days ago

UniReason 1.0: A Unified Reasoning Framework for World Knowledge Aligned Image Generation and Editing

Paper • 2602.02437 • Published 26 days ago • 77

Kimi K2.5: Visual Agentic Intelligence

Paper • 2602.02276 • Published 26 days ago • 251

Green-VLA: Staged Vision-Language-Action Model for Generalist Robots

Paper • 2602.00919 • Published 28 days ago • 305

No Global Plan in Chain-of-Thought: Uncover the Latent Planning Horizon of LLMs

Paper • 2602.02103 • Published 26 days ago • 71

EgoActor: Grounding Task Planning into Spatial-aware Egocentric Actions for Humanoid Robots via Visual-Language Models

Paper • 2602.04515 • Published 24 days ago • 38

Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization

Paper • 2602.02958 • Published 26 days ago • 33