FlashSR

This Soniqo page documents FlashSR as implemented locally in speech-swift / speech-core. Hugging Face bundle links are given after the integration notes.

Internal page first

Landing cards and docs menus link to this page first; the source model and bundle links remain available here.

Summary

Model	FlashSR
Role	Audio super-resolution for low-bandwidth or lossy audio
Backend	MLX int4 default; int8 available
Output	48 kHz mono waveform, same length as input
Languages	Audio-content agnostic
License	MIT
Status	Ready through `speech upsample` and the `FlashSR` Swift product
Source	FlashSR / AudioSR distillation
Swift product	`FlashSR`
CLI / runtime	`speech upsample`

Usage

The snippet below matches the current speech-swift API or command.

# Upsample a low-bandwidth recording to 48 kHz mono.
.build/release/speech upsample noisy_lowres.wav \
  --variant int4 \
  -o clean_hr.wav

Model links

Implementation notes

The download requests the single model.safetensors bundle and its config explicitly; with only one weights file, byte-weighting is less critical than it is for sharded models.
Input is resampled to 48 kHz mono and processed in non-overlapping 5.12 second windows.
int4 is a download-size optimization: weights are dequantized to floating point at runtime, so the memory footprint matches int8.
The model conforms to SpeechEnhancementModel, but its semantics are super-resolution rather than denoising.
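
The windowing described in the notes above can be sketched roughly as follows. This is an illustration, not the speech-swift implementation: `splitIntoWindows` and `processInWindows` are hypothetical names, and zero-padding the final partial window before trimming the output back to the input length is an assumption made here so that the result matches the "same length as input" behaviour listed in the summary. At 48 kHz, a 5.12 second window works out to 245,760 samples.

```swift
// Illustrative sketch only; names and padding strategy are assumptions,
// not the speech-swift / FlashSR API.
let sampleRate = 48_000        // 48 kHz mono, per the summary table
let windowSeconds = 5.12       // window length, per the implementation notes
let windowSamples = Int((Double(sampleRate) * windowSeconds).rounded())  // 245_760

/// Splits `samples` into non-overlapping windows of `windowSamples` samples.
/// The last window is zero-padded to full length (assumed behaviour).
func splitIntoWindows(_ samples: [Float], windowSamples: Int) -> [[Float]] {
    precondition(windowSamples > 0, "window size must be positive")
    var windows: [[Float]] = []
    var start = 0
    while start < samples.count {
        let end = min(start + windowSamples, samples.count)
        var window = Array(samples[start..<end])
        if window.count < windowSamples {
            // Pad the trailing partial window with silence.
            window.append(contentsOf: [Float](repeating: 0, count: windowSamples - window.count))
        }
        windows.append(window)
        start += windowSamples
    }
    return windows
}

/// Hypothetical driver: applies `process` to each window, concatenates the
/// results, and trims the padding so the output length equals the input length.
func processInWindows(_ samples: [Float],
                      windowSamples: Int,
                      process: ([Float]) -> [Float]) -> [Float] {
    let output = splitIntoWindows(samples, windowSamples: windowSamples).flatMap(process)
    return Array(output.prefix(samples.count))
}
```

Passing an identity closure, for example `processInWindows(audio, windowSamples: windowSamples, process: { $0 })`, returns the input unchanged, which is a quick way to check the split and trim logic before substituting a real model call.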