Speech Studio

Open-source Mac app for local voice cloning and multi-speaker dialog generation. Drop a voice sample, clone it, write a scene, synthesize — all on your laptop. No API keys, no cloud, no per-character pricing.

A 30-second blind test: a real voice, the same voice cloned locally by Speech Studio, and the same voice cloned by ElevenLabs in the cloud. Can you tell which is which?

What it does

Requirements

Install

Download the latest .dmg from GitHub Releases, open it, drag Speech Studio to /Applications, and launch it.

On first launch, macOS Gatekeeper will warn that the developer can't be verified. Until notarized builds ship, open the app via System Settings → Privacy & Security → Open Anyway. The first run also downloads ~2.75 GB of VoxCPM2 weights from HuggingFace into ~/.cache/huggingface/hub/; subsequent launches reuse the cache.
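To confirm that the first-run download landed, you can measure how much disk space the cache directory named above occupies. The following is a minimal sketch, not part of Speech Studio: the helper names `dir_size_bytes` and `format_gb` are made up for illustration. It reports the size of the whole hub cache, which may also contain models downloaded by other tools.

```python
import os
from pathlib import Path


def dir_size_bytes(root):
    """Sum the sizes of regular files under root (hypothetical helper).

    Symlinks are skipped so that files linked from several places
    are not counted twice. Returns 0 if root does not exist.
    """
    root = Path(root)
    if not root.is_dir():
        return 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if not path.is_symlink():
                total += path.stat().st_size
    return total


def format_gb(num_bytes):
    """Format a byte count as decimal gigabytes with two decimals."""
    return f"{num_bytes / 1e9:.2f} GB"


if __name__ == "__main__":
    # Cache location as stated in the text above.
    cache = Path.home() / ".cache" / "huggingface" / "hub"
    print(cache, format_gb(dir_size_bytes(cache)))
```

If the reported size is well below the ~2.75 GB mentioned above, the download has probably not completed yet.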

Prefer the CLI?

The same voice cloning pipeline ships in the speech CLI. Install it, then synthesize from a reference clip:

brew install soniqo/tap/speech
speech speak --engine voxcpm2 --voxcpm2-ref-audio reference.wav -o cloned.wav "Hello, this is my cloned voice."

This is useful for scripting or pre-rendering batches. See the voice cloning guide for the full flow.
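As a rough illustration of the batch use case, the sketch below builds one `speech speak` invocation per line of a script, using only the flags shown above. The function names, the `line_001.wav` naming scheme, and the `dry_run` switch are assumptions made for this example, not part of the CLI. By default it only returns the commands, so you can inspect them before running anything.

```python
import shlex
import subprocess
from pathlib import Path


def build_speak_command(text, ref_audio, out_path):
    """Build the argv for one `speech speak` call (flags taken from the text above)."""
    return [
        "speech", "speak",
        "--engine", "voxcpm2",
        "--voxcpm2-ref-audio", str(ref_audio),
        "-o", str(out_path),
        text,
    ]


def render_batch(lines, ref_audio, out_dir, dry_run=True):
    """Turn each non-empty line into its own numbered WAV command.

    With dry_run=True the commands are returned without being executed.
    With dry_run=False each command is run; this requires the speech CLI on PATH.
    """
    out_dir = Path(out_dir)
    texts = [line.strip() for line in lines if line.strip()]
    commands = []
    for index, text in enumerate(texts, start=1):
        out_path = out_dir / f"line_{index:03d}.wav"  # hypothetical naming scheme
        commands.append(build_speak_command(text, ref_audio, out_path))
    if not dry_run:
        out_dir.mkdir(parents=True, exist_ok=True)
        for command in commands:
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    script = ["Hello, this is my cloned voice.", "", "And this is a second line."]
    for command in render_batch(script, "reference.wav", "out"):
        print(shlex.join(command))  # print shell-quoted commands for review
```

Passing arguments as a list, rather than a single shell string, avoids quoting problems when a line of dialog contains apostrophes or quotation marks.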

Status

Speech Studio is in active preview (v0.0.2). The source repo at github.com/soniqo/speech-studio tracks the GUI app; star or watch it to be notified when notarized releases ship. Linux and Windows builds compile today via speech-core's LiteRT VoxCPM2 engine; the on-device runtime is wired up but not yet validated on hardware.

What it's built on

Speech Studio is a thin GUI on top of speech-swift, the open-source Swift library that ships every model used in the demo.

Roadmap

Feedback

Open an issue at github.com/soniqo/speech-studio/issues — every one gets read.