Fish Audio S2 Pro

This Soniqo page documents the Fish Audio S2 Pro model as implemented in speech-swift / speech-core. Hugging Face links appear below, after the integration notes.

Internal page first

Homepage cards and documentation listings point to this page first; the model and package links stay inside it.

At a glance

Model	Fish Audio S2 Pro
Role	Experimental multilingual TTS with raw-reference cloning and style markers
Backend	MLX fp16
Output	44.1 kHz mono PCM
Languages	Multilingual
License	Research / non-commercial bundle; obtain Fish Audio license for commercial exposure
Status	Programmatic runtime; CLI integration is still pending
Source	Fish Audio S2 Pro
Swift product	`FishAudioTTS`
CLI / runtime	Programmatic runtime today; planned `speech speak --engine fish-audio`
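
The output row above lists 44.1 kHz mono PCM. As a rough illustration of what that format implies when saving audio, the sketch below wraps mono samples in a 16-bit WAV container. It assumes the samples are available as `[Float]` in the range [-1, 1], which this page does not confirm; `littleEndianBytes` and `wavData` are hypothetical helpers, not part of `FishAudioTTS`.

```swift
import Foundation

/// Little-endian bytes of a fixed-width integer, as WAV headers require.
func littleEndianBytes<T: FixedWidthInteger>(_ value: T) -> [UInt8] {
    withUnsafeBytes(of: value.littleEndian) { Array($0) }
}

/// Hypothetical helper (not part of FishAudioTTS): wraps mono samples in a
/// 16-bit PCM WAV container. Assumes Float samples in [-1, 1].
func wavData(samples: [Float], sampleRate: UInt32 = 44_100) -> Data {
    let channels: UInt16 = 1
    let bitsPerSample: UInt16 = 16
    let blockAlign = channels * bitsPerSample / 8          // bytes per frame
    let byteRate = sampleRate * UInt32(blockAlign)
    let dataSize = UInt32(samples.count) * UInt32(blockAlign)

    var bytes: [UInt8] = []
    bytes += Array("RIFF".utf8)
    bytes += littleEndianBytes(36 + dataSize)              // size of the rest of the file
    bytes += Array("WAVE".utf8)
    bytes += Array("fmt ".utf8)
    bytes += littleEndianBytes(UInt32(16))                 // fmt chunk size for PCM
    bytes += littleEndianBytes(UInt16(1))                  // audio format 1 = linear PCM
    bytes += littleEndianBytes(channels)
    bytes += littleEndianBytes(sampleRate)
    bytes += littleEndianBytes(byteRate)
    bytes += littleEndianBytes(blockAlign)
    bytes += littleEndianBytes(bitsPerSample)
    bytes += Array("data".utf8)
    bytes += littleEndianBytes(dataSize)

    for sample in samples {
        // Treat NaN as silence, clamp, then scale to the Int16 range.
        let clamped = sample.isNaN ? 0 : max(-1, min(1, sample))
        bytes += littleEndianBytes(Int16((clamped * 32_767).rounded()))
    }
    return Data(bytes)
}
```

The result can be written with `try wavData(samples: samples).write(to: URL(fileURLWithPath: "output.wav"))`. The file is always 44 header bytes plus two bytes per sample.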

Usage

The snippet below matches the current API or command in speech-swift.

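import Foundation  // provides URL
import FishAudioTTS

// Load the model, then synthesize Hindi speech in the reference speaker's voice.
let model = try await FishAudioTTSModel.fromPretrained()
let pcm = try await model.generate(
    text: "आज मैं बहुत खुश हूँ। [excited]",  // [excited] is a style control marker
    referenceAudioURL: URL(fileURLWithPath: "reference.wav"),  // raw reference clip to clone
    referenceText: "नमस्ते, यह संदर्भ आवाज है।"  // text paired with the reference clip
)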
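
The `[excited]` tag in the snippet above is one of the control markers listed in the implementation notes. A mistyped tag is easy to miss, so it can help to check input text before synthesis. The sketch below is one possible way to do that; `knownMarkers` and `unknownMarkers(in:)` are hypothetical names, not part of `FishAudioTTS`, and how the runtime itself treats unrecognized tags is not documented here.

```swift
import Foundation

// Control markers listed in this page's implementation notes.
let knownMarkers: Set<String> = [
    "pause", "emphasis", "laughing", "excited", "angry",
    "whisper", "screaming", "shouting", "surprised", "sad",
]

/// Hypothetical helper (not part of FishAudioTTS): returns every
/// bracketed tag in `text` that is not in `knownMarkers`.
func unknownMarkers(in text: String) -> [String] {
    var unknown: [String] = []
    var current: String? = nil          // non-nil while inside "[...]"
    for ch in text {
        if ch == "[" {
            current = ""                // start collecting a new tag
        } else if ch == "]", let tag = current {
            if !knownMarkers.contains(tag) { unknown.append(tag) }
            current = nil               // tag closed
        } else {
            current?.append(ch)         // no-op when outside brackets
        }
    }
    return unknown
}
```

Calling `unknownMarkers(in:)` on the `text` argument before `generate` gives an empty array when every tag is recognized, and the list of unrecognized tags otherwise.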

Model links

Implementation notes

Download uses the newer explicit byte-weighted file manifest, so progress reflects real transferred bytes across the multi-GB shards.
Control markers include [pause], [emphasis], [laughing], [excited], [angry], [whisper], [screaming], [shouting], [surprised], and [sad].
The runtime generates 10 DAC codebook rows, then decodes generated or reference-conditioned codebooks through FishAudioCodec.
Current tests cover bundle loading, codebook generation, codec encode/decode, Hindi cloning, and ASR round-trip gates.
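
The byte-weighted download progress mentioned in the first note can be illustrated with a short sketch. This is not the speech-swift implementation; `ManifestFile`, `ByteWeightedProgress`, and the file names in the usage example are all hypothetical. The idea is only that each file contributes to overall progress in proportion to its size, so a multi-GB shard counts for more than a small config file.

```swift
import Foundation

/// Hypothetical manifest entry: a file name plus its size in bytes.
struct ManifestFile {
    let name: String
    let byteCount: Int64
}

/// Hypothetical tracker that weights overall progress by bytes, not file count.
struct ByteWeightedProgress {
    let totalBytes: Int64
    private let sizes: [String: Int64]
    private var received: [String: Int64] = [:]

    init(manifest: [ManifestFile]) {
        // If a name repeats, keep the first entry.
        let table = Dictionary(
            manifest.map { ($0.name, $0.byteCount) },
            uniquingKeysWith: { first, _ in first }
        )
        sizes = table
        totalBytes = table.values.reduce(0, +)
    }

    /// Records the cumulative bytes received for one file, clamped to its size.
    mutating func update(file: String, bytesReceived: Int64) {
        guard let size = sizes[file] else { return }   // ignore unknown files
        received[file] = min(max(bytesReceived, 0), size)
    }

    /// Overall progress in 0...1, based on transferred bytes.
    var fraction: Double {
        guard totalBytes > 0 else { return 0 }
        return Double(received.values.reduce(0, +)) / Double(totalBytes)
    }
}

// Usage with made-up sizes: one small file and one large shard.
var progress = ByteWeightedProgress(manifest: [
    ManifestFile(name: "config.json", byteCount: 1_000),
    ManifestFile(name: "model.safetensors", byteCount: 9_000),
])
progress.update(file: "config.json", bytesReceived: 1_000)
print(progress.fraction)
```

With these made-up sizes, finishing the small file moves overall progress to one tenth rather than one half, which is the behavior a per-file count would hide.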