FlashSR
This Soniqo page documents FlashSR as implemented locally in speech-swift / speech-core. Hugging Face bundle links follow the integration notes.
Internal page first
Landing cards and docs menus point to this page first; the source model and bundle links remain available here.
Summary
| Model | FlashSR |
|---|---|
| Role | Audio super-resolution for low-bandwidth or lossy audio |
| Backend | MLX int4 default; int8 available |
| Output | 48 kHz mono waveform, same length as input |
| Languages | Audio-content agnostic |
| License | MIT |
| Status | Ready through speech upsample and the FlashSR Swift product |
| Source | FlashSR / AudioSR distillation |
| Swift product | FlashSR |
| CLI / runtime | speech upsample |
Usage
The snippet below matches the current speech-swift API or command.
# Upsample a low-bandwidth recording to 48 kHz mono.
.build/release/speech upsample noisy_lowres.wav \
--variant int4 \
-o clean_hr.wav
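For scripted use, the same command can be driven from a small wrapper. The following is a minimal Python sketch, not part of speech-swift: the helper names `upsample_command` and `upsample_folder` are hypothetical, the binary path is copied from the snippet above, and passing `int8` to `--variant` is an assumption based on the summary table rather than a confirmed flag value.

```python
import subprocess
from pathlib import Path

# Binary path taken from the usage snippet; adjust for your build layout.
SPEECH_BIN = ".build/release/speech"


def upsample_command(src, dst, variant="int4"):
    """Build the argv list for one `speech upsample` call (hypothetical helper)."""
    # "int8" as a --variant value is assumed from the summary table.
    if variant not in ("int4", "int8"):
        raise ValueError(f"unknown variant: {variant}")
    return [SPEECH_BIN, "upsample", str(src), "--variant", variant, "-o", str(dst)]


def upsample_folder(in_dir, out_dir, variant="int4"):
    """Run the CLI once per .wav file in in_dir, writing results to out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for wav in sorted(Path(in_dir).glob("*.wav")):
        # check=True raises CalledProcessError if the CLI exits non-zero.
        subprocess.run(upsample_command(wav, out / wav.name, variant), check=True)
```

Building the argument list separately from running it keeps the command easy to log or inspect before any audio is processed.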
Model links
Implementation notes
- The download explicitly requests the single model.safetensors bundle and its config; with only one weights file, byte-weighting is less critical than for sharded models.
- Input is resampled to 48 kHz mono and processed in non-overlapping 5.12 second windows.
- int4 is a download-size optimization only; weights are dequantized to floating point at runtime, so the memory footprint matches int8.
- The model conforms to SpeechEnhancementModel, but its semantics are super-resolution rather than denoising.
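
The windowing described in the notes can be illustrated with a little arithmetic: at 48 kHz, a 5.12 second window holds 245,760 samples. The Python sketch below is illustrative only and is not the speech-swift implementation. The function names are hypothetical, and zero-padding the final partial window and then trimming back to the input length is an assumption made to match the "same length as input" row in the summary table.

```python
SAMPLE_RATE = 48_000          # Hz, from the notes above
WINDOW_SECONDS = 5.12         # seconds, from the notes above
WINDOW_SAMPLES = round(SAMPLE_RATE * WINDOW_SECONDS)  # 245_760 samples


def split_windows(samples):
    """Split a mono signal into non-overlapping fixed-size windows.

    The last window is zero-padded to full size (an assumption; the
    real padding strategy is not documented here).
    """
    windows = []
    for start in range(0, len(samples), WINDOW_SAMPLES):
        chunk = list(samples[start:start + WINDOW_SAMPLES])
        chunk += [0.0] * (WINDOW_SAMPLES - len(chunk))
        windows.append(chunk)
    return windows


def join_windows(windows, original_length):
    """Concatenate processed windows and trim the padding off the end."""
    flat = [sample for window in windows for sample in window]
    return flat[:original_length]
```

With this scheme, an input of two full windows plus 100 samples produces three windows, and joining them restores the original length.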