Whissle AI Blog
Explore the latest in AI, machine learning, and voice technology.

Gujlish META-ASR: A Bilingual English-Gu...
We open-source a bilingual English-Gujarati speech recognition model that transcribes speech while e...
By Whissle Research Team
Date
May 16 2026

Building a Hindi Speech Model That Under...
We benchmark our META-ASR Hindi model against Deepgram Nova-2 and Gemini 2.5 Flash across two test s...
By Whissle Research Team
Date
May 03 2026

Introducing Whissle Browser — The First ...
Today we're launching Whissle Browser, a next-generation browser with Lulu — an ambient AI companion...
By Karan Jakhar
Date
May 01 2026

Mandarin ASR Beyond Words: Transcription...
We benchmarked Whissle's 157M-parameter Chinese model against Deepgram Nova-3 and Gemini 2.5 Flash o...
By Whissle Research Team
Date
Apr 27 2026

Does Your ASR's Metadata Actually Make A...
613 conversations across three public datasets. Six evaluation dimensions. The metadata-aware pipeli...
By Whissle Research Team
Date
Apr 20 2026

Beyond Transcription: How a Meta-Aware A...
A single CTC model that outputs transcription and metadata action tokens — emotion, intent, speech r...
By Whissle Research Team
Date
Apr 16 2026

We Benchmarked 3 Streaming ASR Providers...
4,915 samples. Four datasets. Clean speech, Indian-accented tech interviews, noisy soccer broadcasts...
By Whissle Research Team
Date
Apr 10 2026

Using Visuals for Better Sound Awareness
Humans naturally use visual cues to understand speech in noisy places. This article explores how we'...
By Whissle Research Team
Date
Apr 18 2025

Meta-aware Voice Action Model
To create AI companions that feel genuinely interactive, speech recognition must go beyond raw trans...
By Whissle Research Team
Date
Mar 16 2025

Summer of open-source with Whissle and R...
Multi-modal open-source research
By Karan Singla
Date
Aug 31 2024


