Chatterbox Multilingual
This Soniqo page documents Chatterbox Multilingual as implemented locally in speech-swift / speech-core. The Hugging Face bundle links follow the integration notes.
Internal page first
Landing cards and docs menus point to this page first; the source model and bundle links remain available here.
Summary
| Model | Chatterbox Multilingual |
|---|---|
| Role | Multilingual zero-shot voice-cloning TTS |
| Backend | MLX fp16 on Apple; LiteRT default-voice runtime in Speech Core |
| Output | 24 kHz mono waveform |
| Languages | 23 languages |
| License | MIT |
| Status | MLX cloning runtime; LiteRT greedy/default-voice runtime for edge deployments |
| Source | Resemble AI Chatterbox |
| Swift product | ChatterboxTTS |
| CLI / runtime | Programmatic speech-swift runtime; LiteRT example CLI in speech-core/examples/litert |
Usage
The snippet below matches the current speech-swift API or command.
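import ChatterboxTTS

// Load the pretrained Chatterbox Multilingual model.
let model = try await ChatterboxTTSModel.fromPretrained()

// `reference` holds the reference clip's samples; loading it is not shown in this snippet.
let pcm = try model.clone(
    referenceSamples: reference,
    sampleRate: 24_000,
    text: "The cloned voice now speaks a new sentence.",
    languageId: "en"
)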
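Since the summary lists the output as a 24 kHz mono waveform, one way to inspect a result is to wrap the samples in a WAV container. The following is a minimal sketch, not part of speech-swift: `makeWAV` is a hypothetical helper, and it assumes the samples are available as a `[Float]` in the range -1...1, which this page does not confirm.

```swift
import Foundation

/// Encodes mono Float samples in -1...1 as an in-memory 16-bit PCM WAV file.
/// Hypothetical helper for illustration; not part of speech-swift.
func makeWAV(samples: [Float], sampleRate: Int = 24_000) -> Data {
    let bytesPerSample = 2
    let dataSize = samples.count * bytesPerSample
    var data = Data()

    // Appends an integer in little-endian byte order, as RIFF/WAVE requires.
    func append<T: FixedWidthInteger>(_ value: T) {
        withUnsafeBytes(of: value.littleEndian) { data.append(contentsOf: $0) }
    }

    data.append(contentsOf: Array("RIFF".utf8))
    append(UInt32(36 + dataSize))                 // size of everything after this field
    data.append(contentsOf: Array("WAVE".utf8))
    data.append(contentsOf: Array("fmt ".utf8))
    append(UInt32(16))                            // fmt chunk size for linear PCM
    append(UInt16(1))                             // audio format 1 = linear PCM
    append(UInt16(1))                             // channel count: mono
    append(UInt32(sampleRate))                    // sample rate in Hz
    append(UInt32(sampleRate * bytesPerSample))   // byte rate
    append(UInt16(bytesPerSample))                // block align
    append(UInt16(16))                            // bits per sample
    data.append(contentsOf: Array("data".utf8))
    append(UInt32(dataSize))

    for sample in samples {
        // Replace non-finite values with silence and clamp to the valid range.
        let safe = sample.isFinite ? max(-1, min(1, sample)) : 0
        append(Int16((safe * 32767).rounded()))
    }
    return data
}
```

The returned `Data` can be written to disk with `write(to:)` and opened in any audio player. If the runtime returns a different sample type, the conversion loop is the only part that would need to change.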
Model links
Implementation notes
- Download is split between the main Chatterbox bundle and the separate S3 tokenizer repository.
- Pipeline is T3 text-to-speech tokens, S3Gen flow mel decoder, then HiFi-GAN / HiFTGenerator vocoder.
- Reference clips are resampled to 24 kHz for S3Gen and 16 kHz for speaker/token conditioning.
- speech-core's LiteRT path ships a default voice; fp16 T3 is recommended for Arabic quality.
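The two-rate conditioning described in the notes above can be illustrated with a small resampler. This is a rough sketch only: `resampleLinear` is a hypothetical helper using plain linear interpolation, the 44.1 kHz input rate is an assumed example, and the runtime's actual resampling method is not documented here.

```swift
/// Resamples mono audio by linear interpolation.
/// Hypothetical helper for illustration; not the runtime's resampler.
func resampleLinear(_ input: [Float], from sourceRate: Double, to targetRate: Double) -> [Float] {
    guard !input.isEmpty, sourceRate > 0, targetRate > 0 else { return [] }
    if sourceRate == targetRate { return input }

    let outputCount = Int((Double(input.count) * targetRate / sourceRate).rounded())
    let step = sourceRate / targetRate   // input samples advanced per output sample
    var output = [Float]()
    output.reserveCapacity(outputCount)

    for i in 0..<outputCount {
        let position = Double(i) * step
        let lower = min(Int(position), input.count - 1)
        let upper = min(lower + 1, input.count - 1)
        let fraction = Float(position - Double(lower))
        // Blend the two neighbouring input samples.
        output.append(input[lower] + (input[upper] - input[lower]) * fraction)
    }
    return output
}

// Assumed example: one second of a 44.1 kHz reference clip (silence here).
let reference44k = [Float](repeating: 0, count: 44_100)
let forS3Gen = resampleLinear(reference44k, from: 44_100, to: 24_000)        // 24 kHz copy
let forConditioning = resampleLinear(reference44k, from: 44_100, to: 16_000) // 16 kHz copy
```

Linear interpolation applies no low-pass filter, so downsampling real speech this way can alias. A band-limited converter, such as AVAudioConverter on Apple platforms, would be the more likely choice in practice.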