🔒 Service Notice: Cloud services temporarily down — reinforcing our on-prem AI. Contact: hello@whissle.ai

Frequently Asked Questions

Find answers to common questions about Whissle's AI assistant, voice and text metadata APIs, and platform.

Everything you need to know about Whissle — from the personal AI assistant to our speech-to-text and text intelligence APIs.

What is Whissle?

Whissle is a self-hosted voice AI platform that combines real-time speech recognition, LLM inference, text-to-speech, speaker diarization, and rich metadata extraction — all running on a single GPU. Deploy on-prem, on your own cloud, or on edge devices.

What metadata does Whissle extract?

Beyond transcription, Whissle extracts rich metadata in a single pass — emotion, intent, named entities, speaker diarization, age, gender, and punctuation. No separate models or API calls needed.

How does Whissle compare to cloud speech APIs?

Whissle's META-1 model performs transcription and metadata extraction simultaneously, unlike cloud pipelines that require separate models. Self-hosted means zero data leaves your network, lower latency (no network hops), and no per-minute API costs.

Is Whissle free?

Yes. The self-hosted Docker gateway is free to deploy on your own hardware, with no usage fees. The Cloud API with managed infrastructure is returning soon.

How do I deploy Whissle?

Run one command:

  docker run -d --gpus all -p 9000:9000 whissleasr/whissle-gateway:standard

It works on any NVIDIA GPU, such as the T4, RTX 3090, A100, or H100. Full docs are at whissle.ai/docs.
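
After the container starts, a plain TCP check is a quick way to confirm that something is listening on the published port. The sketch below uses only the Python standard library. It assumes the gateway runs on localhost with port 9000 published as in the docker command, and it does not call any Whissle API; the helper name port_is_open is ours.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


if __name__ == "__main__":
    # 9000 is the host port published by `-p 9000:9000` (assumed unchanged).
    if port_is_open("localhost", 9000):
        print("Port 9000 is accepting connections")
    else:
        print("Nothing is listening on port 9000 yet")
```

An open port only shows that the container is reachable, not that models have finished loading, so treat it as a first smoke test.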

What is Lulu?

Lulu is Whissle's ambient AI companion — like a private Alexa that runs on your hardware. She listens, understands emotion and intent, and adapts to your communication style. Available in Whissle Browser and the macOS app.

What languages does Whissle support?

23 languages via the 1B-parameter model, plus specialized models for Hindi-English (Hinglish), Mandarin, and Gujarati. TTS supports English and Hindi with human-quality Orpheus voices.

Can I run Whissle on edge devices?

Yes. Whissle Gateway runs on any NVIDIA GPU — from DGX Spark and Mac Mini with eGPU to enterprise data center hardware. Cloud is only used for optional health monitoring and OTA model updates.

What is Instant Intelligence?

Instant Intelligence is AI that begins understanding and acting while you're still speaking — not after transcription is complete. Whissle's streaming-first architecture processes voice, text, and visual signals in real-time, extracting intent, emotion, and entities during the input stream.
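
To make the idea of acting during the input stream concrete, here is a toy Python sketch. It is not Whissle's implementation: a hypothetical keyword table stands in for the model, and the partial transcript chunks are assumed to arrive word-aligned. The point is only that an intent is reported at the chunk where it first becomes detectable, rather than after the full utterance.

```python
from typing import Iterable, Iterator

# Hypothetical keyword-to-intent table, purely for illustration.
INTENT_KEYWORDS = {
    "book": "booking",
    "cancel": "cancellation",
    "weather": "weather_query",
}


def stream_intents(chunks: Iterable[str]) -> Iterator[tuple[int, str]]:
    """Yield (chunk_index, intent) the first time each intent is seen."""
    seen: set[str] = set()
    for i, chunk in enumerate(chunks):
        for word in chunk.lower().split():
            # Strip trailing punctuation so "cancel," still matches.
            intent = INTENT_KEYWORDS.get(word.strip(".,!?"))
            if intent and intent not in seen:
                seen.add(intent)
                yield i, intent


if __name__ == "__main__":
    partials = ["please cancel", "my flight", "and book another"]
    for index, intent in stream_intents(partials):
        print(f"chunk {index}: {intent}")
```

Because stream_intents is a generator, a caller can react to the first intent while later chunks are still arriving.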

Can Whissle integrate with my existing search engine?

Yes. Whissle integrates as a drop-in layer on top of Solr, Elasticsearch, OpenSearch, or custom search indices. Voice queries are converted to structured filters (entities, date ranges, intents) as the user speaks, improving search relevance without replacing your existing infrastructure.
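
As a rough illustration of turning extracted metadata into structured filters, the Python sketch below builds a request body in the Elasticsearch/OpenSearch query DSL (a bool query with term and range filters). The shape of the meta dictionary, the content field name, and the function name are assumptions for the example, not Whissle's actual output format; Solr would need a different query syntax.

```python
from typing import Any


def build_search_body(meta: dict[str, Any]) -> dict[str, Any]:
    """Turn extracted voice metadata into an Elasticsearch-style bool query.

    `meta` uses a hypothetical shape:
      {"text": str,
       "entities": {field: value},
       "date_range": {"field": str, "gte": str, "lte": str}}
    """
    filters: list[dict[str, Any]] = []

    # Each named entity becomes an exact-match term filter.
    for field, value in meta.get("entities", {}).items():
        filters.append({"term": {field: value}})

    # A detected date range becomes a range filter on the given field.
    date_range = meta.get("date_range")
    if date_range:
        bounds = {k: date_range[k] for k in ("gte", "lte") if k in date_range}
        filters.append({"range": {date_range["field"]: bounds}})

    # The free text is scored with a full-text match on an assumed field.
    return {
        "query": {
            "bool": {
                "must": [{"match": {"content": meta.get("text", "")}}],
                "filter": filters,
            }
        }
    }


if __name__ == "__main__":
    example = {
        "text": "running shoes",
        "entities": {"brand": "nike"},
        "date_range": {"field": "released", "gte": "2024-01-01"},
    }
    print(build_search_body(example))
```

Putting entities and dates in the filter clause keeps them out of relevance scoring, so the existing index and ranking setup can stay as it is.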