🔒 Service Notice: Cloud services temporarily down — reinforcing our on-prem AI. Contact: hello@whissle.ai

Quick Start

Deploy Whissle Gateway

One command to run the full voice AI stack on any NVIDIA GPU:

docker run -d --gpus all -p 9000:9000 \
  -v whissle-trt:/tmp/trt_engines \
  whissleasr/whissle-gateway:standard

The gateway is ready at http://localhost:9000 about 2 minutes after the first start, while it builds its TensorRT engines. The engines are cached, so later starts skip the build.
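
Because the first start takes a couple of minutes, a script that depends on the gateway may want to wait for it rather than sleep for a fixed time. The following is a minimal sketch, not part of the Whissle tooling: the function name wait_for_gateway and its timeout and interval defaults are hypothetical, and it assumes that any successful HTTP response from the root URL means the gateway is ready.

```shell
#!/usr/bin/env bash
# Sketch: poll the gateway root URL until it answers or a timeout elapses.
# Assumption: a successful HTTP response from "/" means the gateway is ready.

wait_for_gateway() {
  local url="${1:-http://localhost:9000/}"
  local timeout="${2:-300}"   # total seconds to wait (hypothetical default)
  local interval="${3:-5}"    # seconds between attempts (hypothetical default)
  local waited=0

  # -f: exit non-zero on HTTP errors; -s: silent; -o /dev/null: discard the body
  until curl -fs -o /dev/null "$url"; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "gateway not ready after ${timeout}s" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "gateway ready at $url"
}
```

Calling wait_for_gateway http://localhost:9000/ 300 5 after the docker run command blocks until the gateway responds, or returns a non-zero status once the timeout is reached.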

What's Inside

Component     Details
ASR           23 languages, 440 ms TTFT, TensorRT accelerated
LLM           3B params, 265 tok/s on GPU, OpenAI-compatible API
TTS           Human-quality Orpheus, EN + Hindi, 230 ms TTFB
Diarization   ECAPA-TDNN speaker encoder, multi-speaker separation
Metadata      Emotion, intent, age, gender, entities per utterance

Verify

# Health check
curl http://localhost:9000/

# Quick ASR test
curl -X POST http://localhost:9000/asr/transcribe \
  -F "file=@audio.wav" -F "language=en"
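
For repeated tests, the ASR request above can be wrapped in a small function that checks the input file and surfaces HTTP errors. This is a sketch under assumptions: transcribe_file is a hypothetical helper name, and since the response format is not described here, the body is printed unmodified rather than parsed.

```shell
#!/usr/bin/env bash
# Sketch: send one audio file to the ASR endpoint and print the raw response.
# The response format is not specified here, so the body is printed as-is.

transcribe_file() {
  local file="$1"
  local language="${2:-en}"
  local base_url="${3:-http://localhost:9000}"

  if [ ! -f "$file" ]; then
    echo "no such file: $file" >&2
    return 2
  fi

  # -f makes curl exit non-zero on HTTP errors (4xx/5xx);
  # -sS hides the progress meter but still reports errors.
  curl -fsS -X POST "$base_url/asr/transcribe" \
    -F "file=@$file" \
    -F "language=$language"
}
```

For example, transcribe_file audio.wav en sends the same request as the curl command above, but returns a non-zero exit status if the file is missing or the gateway answers with an HTTP error.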