Chatterbox Multilingual

This Soniqo page documents Chatterbox Multilingual as implemented in the local speech-swift / speech-core runtimes. The Hugging Face bundle links follow the integration notes.

Internal page first

Landing cards and docs menus point to this page first; the source model and bundle links remain available here.

Summary

Model	Chatterbox Multilingual
Role	Multilingual zero-shot voice-cloning TTS
Backend	MLX fp16 on Apple; LiteRT default-voice runtime in Speech Core
Output	24 kHz mono waveform
Languages	23 languages
License	MIT
Status	MLX cloning runtime; LiteRT greedy/default-voice runtime for edge deployments
Source	Resemble AI Chatterbox
Swift product	`ChatterboxTTS`
CLI / runtime	Programmatic speech-swift runtime; LiteRT example CLI in speech-core/examples/litert

Usage

The snippet below matches the current speech-swift API or command.

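import ChatterboxTTS

// Load the pretrained Chatterbox Multilingual model.
let model = try await ChatterboxTTSModel.fromPretrained()

// `reference` holds the reference clip's audio samples and must be supplied by the caller.
let pcm = try model.clone(
    referenceSamples: reference,
    sampleRate: 24_000,
    text: "The cloned voice now speaks a new sentence.",
    languageId: "en"
)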
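
To listen to the result, the samples need a container. The following is a minimal sketch, not part of speech-swift: it assumes the synthesized audio is available as a mono `[Float]` array with values in [-1, 1] (the return type of `clone` is not stated on this page), and `wavData` is a hypothetical helper name. It wraps the samples in a 16-bit PCM WAV header at the 24 kHz mono output format listed in the summary.

```swift
import Foundation

/// Hypothetical helper (not a speech-swift API): encodes mono Float samples
/// as a 16-bit PCM WAV file. Assumes samples are nominally in [-1, 1].
func wavData(samples: [Float], sampleRate: UInt32 = 24_000) -> Data {
    let bytesPerSample: UInt32 = 2
    let dataSize = UInt32(samples.count) * bytesPerSample
    var data = Data()

    // Appends an integer in little-endian byte order, as WAV requires.
    func appendLE<T: FixedWidthInteger>(_ value: T) {
        withUnsafeBytes(of: value.littleEndian) { data.append(contentsOf: $0) }
    }

    data.append(contentsOf: Array("RIFF".utf8))
    appendLE(UInt32(36) + dataSize)          // size of everything after this field
    data.append(contentsOf: Array("WAVE".utf8))
    data.append(contentsOf: Array("fmt ".utf8))
    appendLE(UInt32(16))                     // fmt chunk size for PCM
    appendLE(UInt16(1))                      // audio format: PCM
    appendLE(UInt16(1))                      // channels: mono
    appendLE(sampleRate)
    appendLE(sampleRate * bytesPerSample)    // byte rate
    appendLE(UInt16(bytesPerSample))         // block align
    appendLE(UInt16(16))                     // bits per sample
    data.append(contentsOf: Array("data".utf8))
    appendLE(dataSize)

    // Clamp to [-1, 1] and scale to the Int16 range.
    for sample in samples {
        let clamped = max(-1, min(1, sample))
        appendLE(Int16((clamped * 32767).rounded()))
    }
    return data
}
```

If the samples are in a `[Float]` named `samples`, `try wavData(samples: samples).write(to: url)` would save them to a file URL of your choice.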

Model links

Implementation notes

The download is split between the main Chatterbox bundle and the separate S3 tokenizer repository.
The pipeline has three stages: T3 turns text into speech tokens, the S3Gen flow decoder turns those tokens into a mel spectrogram, and the HiFi-GAN / HiFTGenerator vocoder produces the waveform.
Reference clips are resampled to 24 kHz for S3Gen and 16 kHz for speaker/token conditioning.
speech-core's LiteRT path ships a default voice; fp16 T3 is recommended for Arabic quality.
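
The two-rate resampling of reference clips mentioned above can be illustrated with a small sketch. This is not the resampler used by speech-swift or speech-core, whose implementation is not described on this page; `resampleLinear` is a hypothetical helper that uses plain linear interpolation with no anti-aliasing filter, which is enough to show how one clip yields a 24 kHz copy and a 16 kHz copy.

```swift
/// Hypothetical helper (not a speech-swift API): resamples mono audio by
/// linear interpolation. No low-pass filtering is applied, so downsampling
/// can alias; a production resampler would filter first.
func resampleLinear(_ input: [Float], from sourceRate: Double, to targetRate: Double) -> [Float] {
    guard !input.isEmpty, sourceRate > 0, targetRate > 0 else { return [] }
    if sourceRate == targetRate { return input }

    let outputCount = Int((Double(input.count) * targetRate / sourceRate).rounded())
    let step = sourceRate / targetRate   // input samples advanced per output sample
    var output = [Float]()
    output.reserveCapacity(outputCount)

    for i in 0..<outputCount {
        let position = Double(i) * step
        let lower = min(Int(position), input.count - 1)
        let upper = min(lower + 1, input.count - 1)   // hold the last sample at the end
        let fraction = Float(position - Double(lower))
        output.append(input[lower] + (input[upper] - input[lower]) * fraction)
    }
    return output
}

// One second of a constant signal at 48 kHz, standing in for a reference clip.
let clip = [Float](repeating: 0.25, count: 48_000)
let forS3Gen = resampleLinear(clip, from: 48_000, to: 24_000)         // 24 kHz copy
let forConditioning = resampleLinear(clip, from: 48_000, to: 16_000)  // 16 kHz copy
```

Keeping the two copies separate mirrors the note above: one stream feeds S3Gen at 24 kHz, the other feeds speaker/token conditioning at 16 kHz.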