Under the hood

A local speech engine handles voice processing on your Mac, while a cloud or local AI provider powers the thinking layer. You choose how much stays on-device.

On-Device Speech Engine

A custom speech recognizer running locally extracts rich metadata — intent, emotion, entities — directly from your voice before anything else happens.
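The exact shape of that metadata isn't documented here, but as an illustration, a recognizer result carrying intent, emotion, and entities might look like the following sketch (all field names are hypothetical, not the app's actual schema):

```python
from dataclasses import dataclass, field

# Hypothetical sketch: field names are illustrative only,
# not the app's real on-device recognizer output.
@dataclass
class SpeechResult:
    transcript: str
    intent: str                      # e.g. "compose_email"
    emotion: str                     # e.g. "neutral"
    entities: dict = field(default_factory=dict)

result = SpeechResult(
    transcript="email Sam about the Friday demo",
    intent="compose_email",
    emotion="neutral",
    entities={"person": "Sam", "event": "Friday demo"},
)
print(result.entities["person"])  # → Sam
```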

Smart Routing Layer

Every request is analyzed and routed automatically — writing, research, live assist, or Mac control — based on what you said and what app you're in.
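As a rough sketch of that kind of routing, the four categories come from the text above, but the keyword heuristics and function name here are hypothetical, not the app's real classifier:

```python
# Hypothetical routing sketch: "writing", "research", "live assist", and
# "mac control" are the categories named above; the keyword rules are invented.
def route(utterance: str, frontmost_app: str) -> str:
    text = utterance.lower()
    if any(w in text for w in ("write", "draft", "reply")):
        return "writing"
    if any(w in text for w in ("search", "look up", "find")):
        return "research"
    if frontmost_app in ("zoom.us", "FaceTime"):
        return "live assist"
    return "mac control"

print(route("draft a reply to Sam", "Mail"))  # → writing
```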

Cloud AI (Preferred)

Add your Gemini API key from Google AI Studio for the best experience. Support for OpenAI and Claude is coming soon.

Local LLM Option

Prefer to keep everything on-device? Use any local model via Docker. Your data never leaves your machine.

Cloud AI vs local LLM

Cloud AI (e.g. Gemini): best quality and lowest latency from your Mac; your requests go to the provider you choose.

Local LLM: keep everything on your Mac using Docker; no voice or prompts leave your machine. See the Docker page for the image and setup.

Bring your own model

You can run a local LLM of your choice inside Docker. The app connects to the unified stack (gateway, ASR, agent, LLM) so all processing stays on your machine. For the exact image and ports, see the Docker page.

whissleasr/unified-nollm:latest — one image for the full local stack.
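A minimal way to bring that stack up might look like the following; the port mapping shown is a placeholder assumption, since the real ports are listed on the Docker page:

```shell
# Pull the unified local stack (gateway, ASR, agent, LLM in one image).
docker pull whissleasr/unified-nollm:latest

# Run it detached; 8080:8080 is a placeholder mapping, check the Docker page for the real ports.
docker run -d --name whissle-local -p 8080:8080 whissleasr/unified-nollm:latest
```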