Use case · Pipeline

Diarized transcription.
Every speaker named.

From a meeting recording or a call file to a fully attributed transcript: speech recognition, speaker diarization, and speaker identification stitched into one on-device pipeline. No cloud APIs, no per-minute pricing, no data leaving the device.

What you can build

Four shapes of the same pipeline.

Each shape stitches together an ASR model, a diarizer, and an optional speaker-ID enrolment store. The components are interchangeable; which ones you choose depends on the audio source and your latency budget.
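To make the "stitching" step concrete, here is a minimal sketch of how ASR output and diarizer output might be merged. It assumes the ASR yields words with start/end times and the diarizer yields speaker turns with start/end times. The `Word` and `Turn` types and the `attribute` function are hypothetical names for illustration, not part of any specific library. Each word goes to the turn it overlaps most, and consecutive words from the same speaker are merged into one line.

```python
from dataclasses import dataclass


@dataclass
class Word:          # hypothetical ASR output: one recognised word
    text: str
    start: float     # seconds
    end: float


@dataclass
class Turn:          # hypothetical diarizer output: one speaker turn
    speaker: str
    start: float
    end: float


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def attribute(words, turns):
    """Assign each word to the turn it overlaps most, then merge
    consecutive words by the same speaker into (speaker, text) lines."""
    lines = []
    for w in words:
        speaker = "unknown"
        best = 0.0
        for t in turns:
            o = overlap(w.start, w.end, t.start, t.end)
            if o > best:
                speaker, best = t.speaker, o
        if lines and lines[-1][0] == speaker:
            lines[-1][1].append(w.text)
        else:
            lines.append((speaker, [w.text]))
    return [(speaker, " ".join(texts)) for speaker, texts in lines]


words = [Word("Hello", 0.0, 0.4), Word("there", 0.4, 0.8), Word("Hi", 1.0, 1.3)]
turns = [Turn("SPEAKER_00", 0.0, 0.9), Turn("SPEAKER_01", 0.9, 2.0)]
transcript = attribute(words, turns)
# transcript == [("SPEAKER_00", "Hello there"), ("SPEAKER_01", "Hi")]
```

Maximum-overlap assignment is only one possible rule. Words that fall in a gap between turns are labelled "unknown" here, and a real pipeline may prefer to snap them to the nearest turn instead.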

Meeting minutes

"Alice said …" / "Bob said …" attribution from a single Zoom export.

Call-center analytics

Agent vs. caller turns, sentiment per speaker, on-device for compliance.

Podcast transcripts

Host + guests identified across the episode, word-level timestamps.

Legal / interview records

Court-grade attribution with no audio ever leaving the device.
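Naming speakers, as in the shapes above, relies on the optional speaker-ID enrolment store. A minimal sketch of one way such a store could work: each enrolled person has a reference voice embedding, and each anonymous diarizer label is renamed to the closest enrolled speaker when the cosine similarity clears a threshold. The function names, the three-dimensional toy embeddings, and the 0.7 threshold are all assumptions for illustration. Real embeddings have hundreds of dimensions and the threshold must be tuned per model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def identify(embedding, enrolled, threshold=0.7):
    """Return the enrolled name most similar to `embedding`,
    or None if no enrolled speaker reaches the threshold."""
    best_name, best_score = None, threshold
    for name, reference in enrolled.items():
        score = cosine(embedding, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


def rename_speakers(cluster_embeddings, enrolled, threshold=0.7):
    """Map diarizer labels to enrolled names; unmatched labels are kept."""
    return {
        label: identify(emb, enrolled, threshold) or label
        for label, emb in cluster_embeddings.items()
    }


# Toy enrolment store and per-cluster embeddings (illustrative values only).
enrolled = {"Alice": [1.0, 0.0, 0.0], "Bob": [0.0, 1.0, 0.0]}
clusters = {"SPEAKER_00": [0.9, 0.1, 0.0], "SPEAKER_01": [0.0, 0.0, 1.0]}
names = rename_speakers(clusters, enrolled)
# names == {"SPEAKER_00": "Alice", "SPEAKER_01": "SPEAKER_01"}
```

Keeping unmatched labels rather than forcing a nearest match means a guest who was never enrolled stays anonymous instead of being misattributed.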

Deeper reading

Component guides.