Use case · Conversational

Voice in.
Voice out.

Three shapes of voice-first interfaces: a single full-duplex speech-to-speech model, a compositional wake → VAD → ASR → LLM → TTS pipeline you fully control, and wake-word activation for hands-free entry. Everything runs on-device, with no cloud APIs and no audio leaving the device.

Get started

Speech-to-speech guide

Desktop apps

Studio creates the voice. Runner talks with it.

Speech Studio and Runner are two sides of the same local voice stack: one for voice production, one for live voice-agent interaction.

Runner Agent

Runs the full loop from mic to a local voice assistant; the current preview targets a compact Apple Silicon memory budget.

Try Runner

Speech Studio

Clone voices, compare samples, and generate multi-speaker speech locally on a Mac.

Open Speech Studio

Three sub-use-cases

Pick the shape that fits your product.

Drop-in dialogue model, compositional pipeline with per-stage control, or a thin wake-word trigger. Each runs entirely on-device.

Full-duplex speech-to-speech

A single model takes mic input and produces voice output. Drop-in OpenAI-Realtime-compatible WebSocket; minimal code, opaque internals.
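As a rough illustration of what "OpenAI-Realtime-compatible" could mean on the wire, the sketch below builds the `input_audio_buffer.append` client event that the OpenAI Realtime API uses to stream microphone audio as base64-encoded chunks. Whether the local server accepts exactly this event is an assumption based on the compatibility claim above, and the helper name `audio_append_event` is hypothetical.

```python
import base64
import json


def audio_append_event(pcm_chunk: bytes) -> str:
    """Wrap one chunk of raw PCM audio in a Realtime-style JSON event.

    Assumption: the server follows the OpenAI Realtime event shape, where
    microphone audio is sent as base64 inside `input_audio_buffer.append`.
    """
    return json.dumps(
        {
            "type": "input_audio_buffer.append",
            "audio": base64.b64encode(pcm_chunk).decode("ascii"),
        }
    )


# Example: four bytes of silence (two 16-bit samples), ready to send as a text frame.
message = audio_append_event(b"\x00\x00\x00\x00")
```

Each string returned here would be sent as one text frame over the WebSocket connection; the connection URL and any session setup are left out because the page does not specify them.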

Compositional voice pipeline

Wake-word → VAD → streaming ASR → on-device LLM → TTS. Per-stage control, transcript visibility, swap engines freely. Build your own Siri.
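The stage order above can be sketched as a small frame-driven state machine. This is a minimal illustration, not the product's API: the class `VoicePipeline` and its five callables are hypothetical stand-ins for whichever wake-word, VAD, ASR, LLM, and TTS engines you plug in, and transcription here runs once per utterance rather than streaming.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class VoicePipeline:
    # Each stage is an injectable callable, so engines can be swapped freely.
    detect_wake: Callable[[bytes], bool]      # wake-word stage
    is_speech: Callable[[bytes], bool]        # VAD stage
    transcribe: Callable[[List[bytes]], str]  # ASR stage
    respond: Callable[[str], str]             # LLM stage
    synthesize: Callable[[str], bytes]        # TTS stage
    last_transcript: str = ""                 # kept visible for logging or UI
    _awake: bool = False
    _utterance: List[bytes] = field(default_factory=list)

    def feed(self, frame: bytes) -> Optional[bytes]:
        """Consume one audio frame; return synthesized audio when a turn ends."""
        if not self._awake:
            # Idle: only the wake-word detector sees audio.
            self._awake = self.detect_wake(frame)
            return None
        if self.is_speech(frame):
            # Listening: buffer frames the VAD marks as speech.
            self._utterance.append(frame)
            return None
        if not self._utterance:
            # Awake, but the user has not started speaking yet.
            return None
        # First non-speech frame after speech ends the turn.
        self.last_transcript = self.transcribe(self._utterance)
        reply = self.respond(self.last_transcript)
        self._utterance = []
        self._awake = False
        return self.synthesize(reply)
```

With stub stages (for example, a wake detector that matches a fixed frame and a responder that upper-cases the transcript), feeding frames one at a time returns `None` until the utterance ends, then the synthesized reply. Keeping the transcript on the object is one simple way to get the transcript visibility mentioned above.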

Wake-word activation

Hands-free trigger for any voice flow. Custom keywords with per-phrase thresholds; under 5 MB on-device and 26× faster than real time.
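Per-phrase thresholds can be illustrated with a short sketch: each keyword gets its own confidence cutoff, so an easily confused phrase can be made stricter than a distinctive one. The function `pick_keyword`, the score format, and the default threshold of 0.5 are assumptions for illustration, not the detector's actual interface.

```python
from typing import Dict, Optional


def pick_keyword(
    scores: Dict[str, float],
    thresholds: Dict[str, float],
    default_threshold: float = 0.5,
) -> Optional[str]:
    """Return the phrase that clears its own threshold by the widest margin.

    `scores` maps each phrase to a detector confidence in [0, 1];
    phrases without an explicit threshold fall back to `default_threshold`.
    """
    best: Optional[str] = None
    best_margin = 0.0
    for phrase, score in scores.items():
        margin = score - thresholds.get(phrase, default_threshold)
        if margin >= 0 and (best is None or margin > best_margin):
            best, best_margin = phrase, margin
    return best


# "stop" scores higher in absolute terms but misses its stricter threshold.
triggered = pick_keyword(
    {"hey runner": 0.72, "stop": 0.80},
    {"hey runner": 0.60, "stop": 0.90},
)
```

Comparing margins rather than raw scores keeps a strict phrase from winning just because its detector tends to output higher numbers.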

Deeper reading

Component guides.

PersonaPlex 7B

Qwen3.5 Chat

Streaming Dictation

Voice Activity Detection

Wake-Word / KWS

speech-server
