Benchmarks — Android
RTF (real-time factor) is inference time divided by audio duration; a value below 1.0 means the model runs faster than real time.
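As a minimal sketch of that ratio, assuming the conventional definition (processing time divided by the duration of the audio processed), the snippet below recomputes the RTF for the Parakeet TDT v3 row of the benchmark table. The function name `real_time_factor` is illustrative and not part of any library.

```python
def real_time_factor(inference_ms: float, audio_ms: float) -> float:
    """Return processing time divided by audio duration (both in milliseconds)."""
    if audio_ms <= 0:
        raise ValueError("audio_ms must be positive")
    return inference_ms / audio_ms


# Parakeet TDT v3 row: 175 ms of inference for 1.5 s (1500 ms) of audio.
parakeet_rtf = real_time_factor(175, 1500)
print(f"{parakeet_rtf:.2f}")  # 0.12
```

The same function applies to the streaming rows, where the audio duration is the 32 ms chunk length.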
Android (ONNX Runtime)
Measured on an Android emulator (arm64-v8a, no NNAPI). Real hardware with NNAPI is expected to be 2–3x faster (see the note below).
| Model | Task | Audio duration | Inference time | RTF |
|---|---|---|---|---|
| Parakeet TDT v3 | STT (114 languages) | 1.5s | 175ms | 0.12 |
| Kokoro 82M | TTS (7 languages) | 1.9s output | 1,075ms | 0.58 |
| Silero VAD v5 | VAD | 32ms chunk | <1ms | <0.01 |
| DeepFilterNet3 | Noise cancellation | 32ms chunk | ~5ms | ~0.15 |

Hardware acceleration by platform:

| Platform | Acceleration | Chipsets |
|---|---|---|
| Android | NNAPI | Snapdragon 8 Gen 1+, Exynos 2200+, Google Tensor G2+ |
| Embedded Linux | QNN (Hexagon DSP) | SA8295P, SA8255P |
| Any | CPU (XNNPACK) | All arm64-v8a / x86_64 |

Note: The Android benchmarks come from an emulator without hardware acceleration. On real Snapdragon hardware with NNAPI delegation, expect 2–3x faster inference. Total model size: ~1.2 GB (INT8-quantized ONNX).
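To make the 2–3x figure concrete, the sketch below turns an emulator latency into a rough on-device range. These are projections from the stated speedup, not measurements, and `projected_latency_ms` is a hypothetical helper introduced only for illustration.

```python
def projected_latency_ms(
    emulator_ms: float,
    min_speedup: float = 2.0,  # lower bound of the 2-3x estimate
    max_speedup: float = 3.0,  # upper bound of the 2-3x estimate
) -> tuple[float, float]:
    """Return (best_case_ms, worst_case_ms) for the given speedup range."""
    if not 0 < min_speedup <= max_speedup:
        raise ValueError("require 0 < min_speedup <= max_speedup")
    return emulator_ms / max_speedup, emulator_ms / min_speedup


# Emulator latencies taken from the benchmark table.
for model, emulator_ms in [("Parakeet TDT v3", 175), ("Kokoro 82M", 1075)]:
    best, worst = projected_latency_ms(emulator_ms)
    print(f"{model}: roughly {best:.0f} to {worst:.0f} ms (projected)")
```

Actual speedups depend on the chipset and on how much of each model NNAPI can delegate, so treat the output as a planning estimate only.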