Getting Started — Android
speech-android provides on-device speech processing for Android using ONNX Runtime. The pipeline chains voice activity detection (VAD), speech-to-text (STT), and text-to-speech (TTS) with barge-in support, and runs fully offline once the models have been downloaded.
Requirements
- Android 8+ (API 26)
- arm64-v8a architecture
To try it immediately, download the pre-built demo app.
Gradle Dependency
Add the SDK to your build.gradle.kts:
```kotlin
implementation("audio.soniqo:speech:0.0.5")
```
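The requirements listed earlier (API 26, arm64-v8a) can also be expressed in the same build file. The fragment below is a sketch that assumes an application module built with the Android Gradle Plugin and the Kotlin DSL. The SDK itself may not require these settings, so treat them as a convenience and not as documented setup.

```kotlin
// build.gradle.kts (app module): illustrative settings, not taken from the SDK docs
android {
    defaultConfig {
        // Android 8 corresponds to API level 26.
        minSdk = 26

        ndk {
            // Package native libraries only for the supported ABI.
            abiFilters += "arm64-v8a"
        }
    }
}

dependencies {
    implementation("audio.soniqo:speech:0.0.5")
}
```

Restricting `abiFilters` keeps unsupported native libraries out of the APK, at the cost of the app not installing on other architectures.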
Quick Start
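```kotlin
// Resolve the local model directory (models are downloaded on first use).
val modelDir = ModelManager.ensureModels(context)
val pipeline = SpeechPipeline(SpeechConfig(modelDir = modelDir))

// Assuming `events` is a Kotlin Flow, collect() suspends, so run it in its own
// coroutine to keep it from blocking the calls below. `scope` is any
// CoroutineScope you own, for example lifecycleScope in an Activity.
scope.launch {
    pipeline.events.collect { event ->
        when (event) {
            is SpeechEvent.TranscriptionCompleted -> println(event.text)
            is SpeechEvent.ResponseDone -> pipeline.resumeListening()
            else -> {}
        }
    }
}

pipeline.start()
pipeline.pushAudio(samples) // 16 kHz mono float32
```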
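Android's `AudioRecord` typically delivers signed 16-bit PCM, while `pushAudio` expects float32 samples. The helper below sketches one way to convert between the two. `pcm16ToFloat` is a hypothetical name, and the code assumes the pipeline wants samples normalized to roughly [-1.0, 1.0], which the documentation above does not state explicitly.

```kotlin
/**
 * Converts signed 16-bit PCM samples to float32 samples in [-1.0, 1.0).
 * Hypothetical helper; the normalization range is an assumption.
 *
 * @param pcm    buffer filled by the recorder
 * @param length number of valid samples at the start of [pcm]
 */
fun pcm16ToFloat(pcm: ShortArray, length: Int = pcm.size): FloatArray {
    require(length in 0..pcm.size) { "length must be within the buffer" }
    // Dividing by 32768 maps Short.MIN_VALUE to exactly -1.0.
    return FloatArray(length) { i -> pcm[i] / 32768f }
}
```

If `pushAudio` accepts a `FloatArray` (an assumption), a capture loop could call `pipeline.pushAudio(pcm16ToFloat(buffer, samplesRead))` after each read from a recorder configured for 16 kHz mono.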
Important
Models auto-download from HuggingFace on first use (~1.2 GB total). After the initial download, all inference runs fully offline.
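Because the first run downloads roughly 1.2 GB, it can help to check free storage before triggering it. The sketch below uses only `java.io.File`. The function name, the byte constant, and the choice of directory to probe are illustrative assumptions, since the SDK's actual download location is not documented here.

```kotlin
import java.io.File

// Approximate download size quoted above (~1.2 GB); adjust as needed.
const val MODEL_DOWNLOAD_BYTES = 1_200_000_000L

/** Returns true if the volume holding [dir] has at least [requiredBytes] usable. */
fun hasRoomFor(dir: File, requiredBytes: Long = MODEL_DOWNLOAD_BYTES): Boolean {
    // usableSpace may report 0 for a path that does not exist yet,
    // so walk up to the nearest existing ancestor before asking.
    var probe: File? = dir.absoluteFile
    while (probe != null && !probe.exists()) probe = probe.parentFile
    return (probe?.usableSpace ?: 0L) >= requiredBytes
}
```

On Android, one might call `hasRoomFor(context.filesDir)` before `ModelManager.ensureModels(context)` and show a message to the user when it returns false.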
Models
All models run via ONNX Runtime with NNAPI acceleration. Models are INT8-quantized by default, except where the table notes otherwise (DeepFilterNet3 ships as FP16).
| Model | Task | Size |
|---|---|---|
| Parakeet TDT v3 (INT8) | Speech-to-Text (114 languages) | 490 MB |
| Kokoro-82M (INT8) | Text-to-Speech (7 languages) | 89 MB |
| Silero VAD v5 | Voice Activity Detection | 1.2 MB |
| DeepFilterNet3 (FP16) | Noise Cancellation | 4.2 MB |
Source code: github.com/soniqo/speech-android
Next Steps
- Benchmarks — Android inference performance
- Linux C API — embedded Linux setup