Getting Started — Android

speech-android provides on-device speech processing for Android using ONNX Runtime. The pipeline chains voice activity detection (VAD), speech-to-text (STT), and text-to-speech (TTS) with barge-in support, and runs fully offline once the models are downloaded.

Requirements

Download the pre-built demo app to try it immediately:

Gradle Dependency

Add the SDK to your build.gradle.kts:

implementation("audio.soniqo:speech:0.0.5")
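For context, here is a minimal sketch of where that line usually sits in a module-level build.gradle.kts. It assumes the artifact resolves from a repository your project already declares; the source does not say which repository hosts it.

```kotlin
// Module-level build.gradle.kts (sketch; repository setup is assumed to exist elsewhere)
dependencies {
    // Coordinates taken verbatim from the line above
    implementation("audio.soniqo:speech:0.0.5")
}
```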

Quick Start

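// Resolve the model directory (models are downloaded on first use).
val modelDir = ModelManager.ensureModels(context)
val pipeline = SpeechPipeline(SpeechConfig(modelDir = modelDir))

// If events is a Kotlin Flow, collect suspends while events keep arriving,
// so it runs in its own coroutine here. `scope` is an assumption: any
// CoroutineScope you own, for example lifecycleScope.
scope.launch {
    pipeline.events.collect { event ->
        when (event) {
            is SpeechEvent.TranscriptionCompleted -> println(event.text)
            is SpeechEvent.ResponseDone -> pipeline.resumeListening()
            else -> {}
        }
    }
}
pipeline.start()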
pipeline.pushAudio(samples) // 16 kHz mono float32

Important

Models auto-download from HuggingFace on first use (~1.2 GB total). After the initial download, all inference runs fully offline.
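
The quick start passes 16 kHz mono float32 samples to pushAudio, while many Android capture paths deliver signed 16-bit PCM. The sketch below shows one common way to convert between the two. The helper name pcm16ToFloat32 is hypothetical and not part of the SDK, and the [-1.0, 1.0) scaling is an assumption about what the pipeline expects, since the source only states the sample rate, channel count, and sample type.

```kotlin
/**
 * Hypothetical helper (not part of speech-android): converts signed 16-bit PCM
 * samples to float32 by dividing by 32768, which maps them into [-1.0, 1.0).
 * Assumes the input is already 16 kHz mono; no resampling or downmixing is done.
 */
fun pcm16ToFloat32(pcm: ShortArray): FloatArray =
    FloatArray(pcm.size) { i -> pcm[i] / 32768.0f }
```

With a buffer filled by your recorder, the call would look like pipeline.pushAudio(pcm16ToFloat32(buffer)). If the capture source can already produce float samples, this conversion step is unnecessary.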

Models

All models run via ONNX Runtime with NNAPI acceleration. Models are INT8-quantized by default; the table notes the exceptions.

Model                    Task                              Size
Parakeet TDT v3 (INT8)   Speech-to-Text (114 languages)    490 MB
Kokoro-82M (INT8)        Text-to-Speech (7 languages)      89 MB
Silero VAD v5            Voice Activity Detection          1.2 MB
DeepFilterNet3 (FP16)    Noise Cancellation                4.2 MB

Source code: github.com/soniqo/speech-android

Next Steps