MacWhisper has Automatic Speaker Recognition now

Simon Willison…
2025-11-19 7 min read

Inspired by this conversation on Hacker News, I decided to upgrade MacWhisper to try out NVIDIA Parakeet and the new Automatic Speaker Recognition feature.

It appears to work really well! Here's the result against this 39.7MB m4a file from my Gemini 3 Pro write-up this morning:

A screenshot of the MacWhisper transcription application interface displaying a file named "HMB_compressed." The center panel shows a transcript of a City Council meeting. Speaker 2 begins, "Thank you, Mr. Mayor, uh City Council... Victor Hernandez, Spanish interpreter," followed by Spanish instructions: "Buenas noches, les queremos dejar saber a todos ustedes que pueden acceder lo que es el canal de Zoom..." Speaker 1 responds, "Thank you. Appreciate that. Can we please have a roll call?" Speaker 3 then calls out "Councilmember Johnson?" and "Councilmember Nagengast?" to which Speaker 1 answers, "Here." The interface includes metadata on the right indicating the model "Parakeet v3" and a total word count of 26,109.

You can export the transcript with both timestamps and speaker names using the Share -> Segments -> .json menu item:

A close-up of the MacWhisper interface showing the export dropdown menu with "Segments" selected. A secondary menu lists various file formats including .txt, .csv, and .pdf, with a red arrow pointing specifically to the ".json" option, set against the background of the meeting transcript.

Here's the resulting JSON.
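
A segments export like this is easy to post-process. Below is a minimal Python sketch that collapses consecutive segments from the same speaker into single turns and prints them with a timestamp. The field names `speaker`, `start` and `text`, the assumption that `start` is in seconds, and the top-level list layout are all guesses for illustration, not MacWhisper's documented schema; check them against the real file and adjust before using this.

```python
import json


def load_segments(path):
    """Load a segments export. Assumes the file holds a JSON list of objects."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def format_timestamp(seconds):
    """Render a number of seconds as HH:MM:SS (fractions are truncated)."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def merge_speaker_turns(segments):
    """Join consecutive segments that share a speaker into one turn.

    The keys "speaker", "start" and "text" are hypothetical; rename them
    to match whatever the exported JSON actually contains.
    """
    turns = []
    for segment in segments:
        speaker = segment.get("speaker", "Unknown")
        text = segment.get("text", "").strip()
        if not text:
            continue  # skip empty segments
        if turns and turns[-1]["speaker"] == speaker:
            turns[-1]["text"] += " " + text
        else:
            turns.append(
                {"speaker": speaker, "start": segment.get("start", 0.0), "text": text}
            )
    return turns


def render(turns):
    """Format turns as one '[HH:MM:SS] Speaker: text' line each."""
    return "\n".join(
        f"[{format_timestamp(t['start'])}] {t['speaker']}: {t['text']}" for t in turns
    )


# Hand-written sample in the assumed shape, using lines from the screenshot.
sample_segments = [
    {"speaker": "Speaker 1", "start": 0.0, "text": "Thank you. Appreciate that."},
    {"speaker": "Speaker 1", "start": 2.5, "text": "Can we please have a roll call?"},
    {"speaker": "Speaker 3", "start": 4.2, "text": "Councilmember Johnson?"},
]

print(render(merge_speaker_turns(sample_segments)))
```

To run it against a real export, replace `sample_segments` with `load_segments("your-export.json")`, where the filename is a placeholder for wherever you saved the file.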

Tags: whisper, nvidia, ai, speech-to-text, macwhisper

Source: Simon Willison's Weblog
Word count: 2129 words
Published on 2025-11-19 06:19