OmniVoice

This Soniqo page documents OmniVoice as implemented in the local speech-swift / speech-core code. The Hugging Face bundle links follow the integration notes.

Internal page first

Landing cards and docs menus point to this page first; the source model and bundle links remain available here.

Summary

Model	OmniVoice
Role	Massively multilingual zero-shot voice-cloning TTS
Backend	MLX int8 default bundle; fp16 bundle available
Output	24 kHz mono waveform
Languages	600+ languages
License	Apache-2.0 upstream family
Status	Programmatic speech-swift runtime used by Studio sidecar
Source	k2-fsa OmniVoice
Swift product	`OmniVoiceTTS`
CLI / runtime	Programmatic runtime; not a primary speech speak engine yet

Usage

The snippet below matches the current speech-swift API or command.

import OmniVoiceTTS

let model = try await OmniVoiceTTSModel.fromPretrained()
let pcm = try model.generate(
    text: "A new sentence in the reference speaker's voice.",
    referenceAudio: URL(fileURLWithPath: "reference.wav"),
    referenceText: "This is the reference voice.",
    language: "en"
)
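
To listen to the result, the generated audio can be wrapped in a WAV container. The sketch below is a minimal example under stated assumptions: it assumes `pcm` is a `[Float]` array of mono samples in the range -1...1 at the 24 kHz rate listed in the summary, which this page does not confirm. `wavData(fromMonoSamples:sampleRate:)` is a hypothetical helper written for illustration and is not part of speech-swift.

```swift
import Foundation

/// Hypothetical helper (not part of speech-swift): wraps mono Float samples
/// in a 16-bit PCM WAV container with a standard 44-byte header.
func wavData(fromMonoSamples samples: [Float], sampleRate: UInt32 = 24_000) -> Data {
    // Append an integer to the buffer in little-endian byte order.
    func appendLE<T: FixedWidthInteger>(_ value: T, to data: inout Data) {
        withUnsafeBytes(of: value.littleEndian) { data.append(contentsOf: $0) }
    }

    let bytesPerSample: UInt32 = 2
    let dataSize = UInt32(samples.count) * bytesPerSample

    var data = Data()
    data.append(contentsOf: Array("RIFF".utf8))
    appendLE(UInt32(36) + dataSize, to: &data)        // RIFF chunk size
    data.append(contentsOf: Array("WAVE".utf8))
    data.append(contentsOf: Array("fmt ".utf8))
    appendLE(UInt32(16), to: &data)                   // fmt chunk size
    appendLE(UInt16(1), to: &data)                    // audio format: PCM
    appendLE(UInt16(1), to: &data)                    // channels: mono
    appendLE(sampleRate, to: &data)                   // sample rate
    appendLE(sampleRate * bytesPerSample, to: &data)  // byte rate
    appendLE(UInt16(bytesPerSample), to: &data)       // block align
    appendLE(UInt16(16), to: &data)                   // bits per sample
    data.append(contentsOf: Array("data".utf8))
    appendLE(dataSize, to: &data)

    for sample in samples {
        // Replace non-finite values with silence, then clamp to [-1, 1].
        let safe = sample.isFinite ? sample : 0
        let clamped = max(-1, min(1, safe))
        appendLE(Int16((clamped * 32767).rounded()), to: &data)
    }
    return data
}

// Assumed usage, if `pcm` is [Float]:
// try wavData(fromMonoSamples: pcm).write(to: URL(fileURLWithPath: "output.wav"))
```

If the runtime returns a different sample type (for example an MLX array or an audio buffer), only the conversion to `[Float]` would need to change.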

Model links

Implementation notes

Download repairs incomplete caches by checking the backbone, tokenizer files, and audio_tokenizer model before loading.
Generation iteratively unmasks eight acoustic codebooks with classifier-free guidance.
The runtime combines a bidirectional Qwen3 backbone with a Higgs-audio v2 codec encoder/decoder.
Optional instructions cover restricted style controls such as accent, age, gender, pitch, and whisper.
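
The iterative unmasking with classifier-free guidance mentioned above can be pictured with a small schematic. This is a generic masked-decoding sketch, not the OmniVoice implementation: the `Logits` closure stands in for a model forward pass, and the reveal schedule (an even share of the remaining positions per step), the guidance scale, and all names are assumptions made for illustration.

```swift
/// Stand-in for a model forward pass (hypothetical): returns logits over the
/// codebook vocabulary for one position, with or without conditioning.
typealias Logits = (_ codebook: Int, _ frame: Int,
                    _ tokens: [[Int?]], _ conditioned: Bool) -> [Float]

/// Classifier-free guidance: move from the unconditional logits toward
/// (and, for scale > 1, past) the conditional logits.
func guided(_ cond: [Float], _ uncond: [Float], scale: Float) -> [Float] {
    zip(cond, uncond).map { c, u in u + scale * (c - u) }
}

/// Starts with every position masked (nil) and reveals the most confident
/// predictions over a fixed number of steps.
func iterativeUnmask(codebooks: Int, frames: Int, steps: Int,
                     scale: Float, logits: Logits) -> [[Int]] {
    precondition(steps > 0, "at least one decoding step is required")
    var tokens = [[Int?]](repeating: [Int?](repeating: nil, count: frames),
                          count: codebooks)

    for step in 0..<steps {
        // Score every still-masked position with guided logits.
        var candidates: [(cb: Int, t: Int, token: Int, score: Float)] = []
        for cb in 0..<codebooks {
            for t in 0..<frames where tokens[cb][t] == nil {
                let g = guided(logits(cb, t, tokens, true),
                               logits(cb, t, tokens, false),
                               scale: scale)
                guard let best = g.indices.max(by: { g[$0] < g[$1] }) else { continue }
                candidates.append((cb: cb, t: t, token: best, score: g[best]))
            }
        }
        if candidates.isEmpty { break }

        // Reveal an even share of what remains; the last step reveals the rest.
        let stepsLeft = steps - step
        let reveal = (candidates.count + stepsLeft - 1) / stepsLeft
        candidates.sort { $0.score > $1.score }
        for c in candidates.prefix(reveal) {
            tokens[c.cb][c.t] = c.token
        }
    }
    // Positions that never received logits fall back to token 0.
    return tokens.map { row in row.map { $0 ?? 0 } }
}

// Toy usage: eight codebooks, a 4-token vocabulary, and a fake model whose
// conditional logits favor token (codebook + frame) % 4.
let toy: Logits = { cb, t, _, conditioned in
    var out = [Float](repeating: 0, count: 4)
    if conditioned { out[(cb + t) % 4] = 1 }
    return out
}
let codes = iterativeUnmask(codebooks: 8, frames: 6, steps: 4, scale: 2, logits: toy)
```

In a real model the logits for a masked position depend on the tokens already revealed, which is why the closure receives the current `tokens` grid even though the toy model ignores it.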