Intelligence that
aligns to you

Capturing meta-aware intelligence from text, audio, and audio-visual input streams.

Self-hostable · Privacy-first
lulu.whissle.ai

Quick Start

# Self-host the full stack. Works on macOS, Linux, and WSL.

$ curl -fsSL https://whissle.ai/install.sh | bash

The script pulls the Docker image, configures API keys, and starts the stack with Docker Compose.

Real-time natural language tokens

Traditional systems, such as ASR engines and LLMs, process words quickly but miss deeper meaning. Context, emotion, and intent disappear the moment words are captured, and LLMs do not work on text as it streams.

Multi-modal Intelligence

Multi-modal LLMs offer richer insights but can't keep up in real time. You shouldn't have to choose between depth and speed.


Whissle bridges the gap between discriminative and probabilistic AI.

A modular intelligence layer that converts any stream — audio, text, or video — into transcripts, emotion, intent, and actionable insights. Instantly, privately, at scale.

Stream2Action

Text, audio and video streamed IN, structured intelligence OUT.

META-1 extracts transcription, emotion, intent, entities, age, and gender by understanding what lies between the words. No accumulated errors, no added latency. Audio & text today, video tomorrow.

20+ Languages: Audio and text to action
7 Emotions: Real-time detection
9,900+ Action Tokens: Intent & entity vocab
5 Age Buckets: Voice biometrics
3 Gender Classes: Voice biometrics
Single Pass: No pipeline overhead

18,189 total vocabulary tokens — 9,919 metadata + 8,270 speech tokens decoded in a single CTC beam search. Discriminative AI: grounded outputs, zero hallucination.
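As a rough illustration of how speech and metadata tokens can share one output vocabulary, the sketch below applies a standard greedy CTC collapse (merge repeats, drop blanks) and then separates the two token families. It is not Whissle's decoder: the page describes a beam search, which this simpler greedy pass stands in for, and the blank symbol, the tag prefixes such as `INTENT_` and `EMOTION_`, and the sample frames are all assumptions made for the example. Only the vocabulary counts come from the text above.

```python
from itertools import groupby

# Vocabulary sizes quoted on this page.
METADATA_TOKENS = 9_919
SPEECH_TOKENS = 8_270
TOTAL_VOCAB = METADATA_TOKENS + SPEECH_TOKENS  # 18,189

BLANK = "<blank>"  # hypothetical name for the CTC blank symbol
# Hypothetical prefixes marking metadata tokens; the real tag format is not documented here.
META_PREFIXES = ("EMOTION_", "INTENT_", "ENTITY_", "AGE_", "GENDER_")


def ctc_collapse(frame_tokens):
    """Greedy CTC collapse: merge consecutive repeats, then drop blanks."""
    return [tok for tok, _ in groupby(frame_tokens) if tok != BLANK]


def split_tokens(tokens):
    """Separate speech tokens from metadata tokens in one decoded sequence."""
    speech, meta = [], []
    for tok in tokens:
        (meta if tok.startswith(META_PREFIXES) else speech).append(tok)
    return speech, meta


# Made-up per-frame best labels for a short utterance.
frames = [
    "<blank>", "book", "book", "<blank>", "a", "flight", "flight",
    "<blank>", "INTENT_Check_Flights", "EMOTION_Excited", "EMOTION_Excited",
]

tokens = ctc_collapse(frames)
speech, meta = split_tokens(tokens)
print(" ".join(speech))
print(meta)
```

Because both token families come out of the same pass, the transcript and its tags stay aligned without a second model running over the text.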

Stream2Action Architecture

Any input stream → META-1 → Structured JSON → Actions

Input Stream (Live)
META-1: Single Pass

JSON Board (audio_intelligence)
Transcription: Real-time speech-to-text with punctuation
Speaker Info: Age: 28-35, Gender: Female
Emotion: Excited, Nervous, Composed
Intent: Check_Flights
Entities: Places: London, Paris; Date: Tomorrow
Speech Analysis: Fluency, pitch, rhythm, vocabulary

Actions
LLM: Generative layer
Router: Auto-dispatch
Human: Escalation
3rd Party: APIs & webhooks

Audio: Available now
Text: Coming next month
Video: 3-month roadmap
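To make the "Structured JSON → Actions" step concrete, here is a minimal sketch of a payload shaped like the JSON Board above and a router that picks one of the listed actions. The field names, the escalation rule, and the webhook URL are assumptions for illustration, not Whissle's actual schema or API. Only the example values (Check_Flights, London, Paris, and so on) and the `audio_intelligence` label are taken from the diagram.

```python
import json

# Hypothetical payload; key names are assumed, values mirror the diagram.
payload_text = """
{
  "audio_intelligence": {
    "transcription": "Can you check flights from London to Paris tomorrow?",
    "speaker": {"age": "28-35", "gender": "Female"},
    "emotion": "Excited",
    "intent": "Check_Flights",
    "entities": {"places": ["London", "Paris"], "date": "Tomorrow"}
  }
}
"""

# Assumed routing rules: which emotions escalate, which intents have a webhook.
ESCALATE_EMOTIONS = {"Nervous"}
INTENT_WEBHOOKS = {"Check_Flights": "https://example.com/hooks/flights"}  # placeholder URL


def choose_action(result):
    """Return (action, target) for one structured result."""
    info = result["audio_intelligence"]
    if info["emotion"] in ESCALATE_EMOTIONS:
        return ("human", None)        # hand off to a person
    hook = INTENT_WEBHOOKS.get(info["intent"])
    if hook is not None:
        return ("third_party", hook)  # dispatch to an API or webhook
    return ("llm", None)              # fall back to the generative layer


result = json.loads(payload_text)
action, target = choose_action(result)
print(action, target)
```

Keeping the routing rules in plain dictionaries means the dispatch step never needs a generative model; the LLM is only the fallback when no rule matches.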

Experience it live

Click Start, speak into your microphone, and watch as Whissle transcribes your speech in real time — with emotion, intent, age, gender, and entity detection built in.


Ready to meet your personal AI?

Open source, self-hostable, and privacy-first. Try Whissle free in your browser — no sign-up required.

Try Whissle Free