Running on Zero • 670 • IndexTTS 2 Demo • Generate expressive voice from text using audio reference
Qwen/Qwen3-VL-30B-A3B-Instruct-FP8 Image-Text-to-Text • 31B • Updated Nov 26, 2025 • 151k • 94