Marcos Remar committed on
Commit 9bbdeda · 1 Parent(s): b65e164

Add version information for v1.0-cosyvoice-300m

Files changed (1)
  1. VERSION.md +30 -0
VERSION.md ADDED
@@ -0,0 +1,30 @@
+ # CosyVoice Version Information
+
+ ## Current Version: v1.0-cosyvoice-300m
+
+ ### Models Installed:
+ - CosyVoice-300M (Main model)
+ - CosyVoice-300M-SFT (Supervised Fine-Tuning)
+ - CosyVoice-300M-direct (Zero-shot inference)
+ - CosyVoice-ttsfrd (Required resources)
+
+ ### Features:
+ - Multi-language TTS (Chinese, English, Japanese, Korean)
+ - Zero-shot voice cloning
+ - Cross-lingual synthesis
+ - GPU acceleration with RTX A5000
+
+ ### Performance:
+ - Generation speed: ~1x real-time
+ - Model loading: 5-10 seconds
+ - GPU: RTX A5000 (24GB VRAM)
+
+ ### Known Issues:
+ - Chinese accent in English/Portuguese synthesis
+ - Model trained primarily on Chinese data
+
+ ### Next Version:
+ - CosyVoice2-0.5B (downloading)
+ - Improved English pronunciation
+ - Lower latency (150ms)
+ - 30-50% reduction in pronunciation errors
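
The "~1x real-time" figure in the Performance section is most naturally read as a real-time factor (RTF): synthesis wall-clock time divided by the duration of the generated audio, where 1.0 means the audio takes as long to generate as it does to play. The commit does not say how the number was measured, so the following is only a minimal sketch of one way to check it. The names `real_time_factor` and `measure_rtf`, and the `synthesize` callable that returns a sample count, are hypothetical and not part of CosyVoice.

```python
import time
from typing import Callable


def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Return synthesis time divided by audio duration (1.0 means real time)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def measure_rtf(synthesize: Callable[[str], int], text: str, sample_rate: int) -> float:
    """Time a synthesis callable and convert its output length to an RTF.

    `synthesize` is a hypothetical stand-in for whatever TTS call is in use;
    it is assumed to return the number of audio samples it generated.
    """
    start = time.perf_counter()
    num_samples = synthesize(text)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, num_samples / sample_rate)
```

For instance, 4 seconds of compute for 4 seconds of audio gives `real_time_factor(4.0, 4.0) == 1.0`, while values below 1.0 indicate faster-than-real-time generation. Model loading time (the 5-10 seconds listed above) would normally be excluded from this measurement.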