Whissle AI Blog

Explore the latest in AI, machine learning, and voice technology.

Gujlish META-ASR: A Bilingual English-Gujarati Speech Model with Built-in Speaker Profiling

We open-source a bilingual English-Gujarati speech recognition model that transcribes speech while e...

By Whissle Research Team

Date: May 16 2026

Building a Hindi Speech Model That Understands Context

We benchmark our META-ASR Hindi model against Deepgram Nova-2 and Gemini 2.5 Flash across two test s...

By Whissle Research Team

Date: May 03 2026

Introducing Whissle Browser — The First Browser That Reads the Room

Today we're launching Whissle Browser, a next-generation browser with Lulu — an ambient AI companion...

By Karan Jakhar

Date: May 01 2026

Mandarin ASR Beyond Words: Transcription, Demographics, and Named Entities in a Single Pass

We benchmarked Whissle's 157M-parameter Chinese model against Deepgram Nova-3 and Gemini 2.5 Flash o...

By Whissle Research Team

Date: Apr 27 2026

Does Your ASR's Metadata Actually Make AI Responses Better? We Benchmarked 613 Conversations to Find Out.

613 conversations across three public datasets. Six evaluation dimensions. The metadata-aware pipeli...

By Whissle Research Team

Date: Apr 20 2026

Beyond Transcription: How a Meta-Aware ASR Model Delivers Words, Emotion, and Intent in 200ms

A single CTC model that outputs transcription and metadata action tokens — emotion, intent, speech r...

By Whissle Research Team

Date: Apr 16 2026

We Benchmarked 3 Streaming ASR Providers Across 17 Hours of Audio. Here's What We Found.

4,915 samples. Four datasets. Clean speech, Indian-accented tech interviews, noisy soccer broadcasts...

By Whissle Research Team

Date: Apr 10 2026

Using Visuals for Better Sound Awareness

Humans naturally use visual cues to understand speech in noisy places. This article explores how we'...

By Whissle Research Team

Date: Apr 18 2025

Meta-aware Voice Action Model

To create AI companions that feel genuinely interactive, speech recognition must go beyond raw trans...

By Whissle Research Team

Date: Mar 16 2025

open-source

Summer of open-source with Whissle and Red Hen Lab

multi-modal open-source research

By Karan Singla

Date: Aug 31 2024

research

Audio visual speech recognition

noise-aware

By Karan Singla

Date: Jul 09 2024

research

1-step speech processing unit

speech processing

By Karan Singla

Date: Apr 10 2024

Ready to meet your personal AI?

Download the browser, try the web app, or build with our APIs — open source, self-hostable, and privacy-first.