Question 1

What is speaker diarization?

Accepted Answer

Speaker diarization is the process of automatically working out “who spoke when” in an audio recording: it partitions the recording into segments by speaker and gives each speaker a label. It is a separate task from transcribing what was said.
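As an illustration of what “who spoke when” looks like as data, a diarization result can be represented as a list of time-stamped segments, each tagged with a speaker label. The sketch below is hypothetical: the `Segment` type, the `talk_time` helper and the sample values are made up for illustration and are not this tool's actual output format. It totals the speaking time per speaker.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Segment:
    start: float   # seconds from the beginning of the recording
    end: float     # seconds from the beginning of the recording
    speaker: str   # label assigned by the diarizer, e.g. "Speaker 1"


def talk_time(segments):
    """Return total speaking time in seconds for each speaker label."""
    totals = defaultdict(float)
    for seg in segments:
        totals[seg.speaker] += seg.end - seg.start
    return dict(totals)


# Hypothetical diarization result for a 12-second clip.
segments = [
    Segment(0.0, 4.0, "Speaker 1"),
    Segment(4.0, 9.5, "Speaker 2"),
    Segment(9.5, 12.0, "Speaker 1"),
]

print(talk_time(segments))
```

Note that the segments carry no words at all; the text comes from transcription, which is a separate step.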

Question 2

Speaker diarization vs transcription - what’s the difference?

Accepted Answer

Transcription captures what was said; diarization captures who said it. This tool does both: a transcript with speaker labels.
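To make the combination concrete, here is a minimal sketch of how word timestamps from a transcriber could be merged with speaker segments from a diarizer. All names and sample data (`overlap`, `label_words`, the tuples below) are illustrative assumptions, not this tool's actual implementation. Each word simply takes the speaker whose segment overlaps it the most.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the overlap between two time ranges (0 if none)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_words(words, segments):
    """Build speaker-labeled transcript lines.

    words:    list of (start, end, text) from a transcriber
    segments: list of (start, end, speaker) from a diarizer
    """
    lines = []  # list of (speaker, [words...]) in spoken order
    for w_start, w_end, text in words:
        # Pick the segment that overlaps this word the most.
        best = max(segments, key=lambda s: overlap(w_start, w_end, s[0], s[1]))
        speaker = best[2]
        if lines and lines[-1][0] == speaker:
            lines[-1][1].append(text)       # same speaker keeps talking
        else:
            lines.append((speaker, [text]))  # speaker change starts a new line
    return [f"{speaker}: {' '.join(texts)}" for speaker, texts in lines]


# Hypothetical inputs.
words = [(0.0, 0.4, "Hi"), (0.5, 0.9, "there."), (1.2, 1.6, "Hello!"), (2.0, 2.5, "Ready?")]
segments = [(0.0, 1.0, "Speaker 1"), (1.0, 1.8, "Speaker 2"), (1.8, 3.0, "Speaker 1")]

for line in label_words(words, segments):
    print(line)
```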

Question 3

How does speaker diarization work?

Accepted Answer

It splits audio into short frames, turns each into a voice “embedding,” clusters similar voices together, and assigns each cluster a speaker label - then aligns those labels to the transcript.
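The embed-and-cluster steps above can be sketched in a few lines. This is a toy illustration under stated assumptions, not this tool's pipeline: the 2-D “embeddings” are hand-made stand-ins for what a neural speaker-embedding model would produce, and the greedy threshold clustering (`cluster_frames`, with an arbitrary `threshold=0.9`) is a simple substitute for the more robust clustering methods real diarizers commonly use.

```python
import math


def cosine(u, v):
    """Cosine similarity between two 2-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def cluster_frames(embeddings, threshold=0.9):
    """Greedy clustering: join the most similar cluster, or start a new one."""
    centroids, counts, labels = [], [], []
    for emb in embeddings:
        sims = [cosine(emb, c) for c in centroids]
        if sims and max(sims) >= threshold:
            k = sims.index(max(sims))
            counts[k] += 1
            # Move the centroid toward the new frame (running mean).
            centroids[k] = [c + (e - c) / counts[k] for c, e in zip(centroids[k], emb)]
        else:
            centroids.append(list(emb))
            counts.append(1)
            k = len(centroids) - 1
        labels.append(f"Speaker {k + 1}")
    return labels


def merge_frames(labels, frame_seconds=1.0):
    """Collapse runs of identical frame labels into (start, end, speaker)."""
    segments = []
    for i, label in enumerate(labels):
        start = i * frame_seconds
        if segments and segments[-1][2] == label:
            segments[-1] = (segments[-1][0], start + frame_seconds, label)
        else:
            segments.append((start, start + frame_seconds, label))
    return segments


# Hypothetical per-frame voice embeddings (one per second of audio).
frames = [(1.0, 0.1), (0.9, 0.2), (0.1, 1.0), (0.2, 0.9), (1.0, 0.0)]
labels = cluster_frames(frames)
print(merge_frames(labels))
```

The number of speakers is not given in advance here; it falls out of how many clusters the threshold produces, which is also why very similar voices can end up merged and one voice can be split in two.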

Question 4

Is this tool really free?

Accepted Answer

Yes. Upload audio and get a speaker-labeled transcript for free in your browser. You need to sign in with Google to run it, but there is no app to install.

Question 5

How many speakers can it detect?

Accepted Answer

Up to 32, labeled automatically (Speaker 1, Speaker 2…). You can rename the labels afterward.

Question 6

Can it tell who said what in a meeting or interview?

Accepted Answer

Yes - that’s the main use case: meetings, interviews, calls, podcasts and focus groups.

Question 7

What audio formats and lengths are supported?

Accepted Answer

mp3, m4a, wav, aac and ogg, up to 70 MB (about 1 hour of audio) per file.
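How much audio fits in 70 MB depends on how the file is encoded. As a rough back-of-the-envelope check (assuming 1 MB = 1,000,000 bytes and a constant bitrate; the bitrates below are common example values, not limits stated by this tool), the duration can be estimated like this:

```python
def minutes_that_fit(size_mb, bitrate_kbps):
    """Estimate how many minutes of constant-bitrate audio fit in size_mb."""
    bits = size_mb * 1_000_000 * 8           # file size in bits
    seconds = bits / (bitrate_kbps * 1000)   # bitrate is in kilobits per second
    return seconds / 60


# Example bitrates (assumptions for illustration only).
examples = [
    ("64 kbps compressed speech", 64),
    ("128 kbps MP3", 128),
    ("160 kbps MP3", 160),
    ("1411.2 kbps uncompressed CD-quality WAV", 1411.2),
]

for name, kbps in examples:
    print(f"{name}: about {minutes_that_fit(70, kbps):.0f} min")
```

Under these assumptions, a 70 MB file holds roughly an hour of audio at around 155 kbps, more at lower bitrates, and only a few minutes of uncompressed CD-quality WAV.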

Question 8

How accurate is it with accents or overlapping speech?

Accepted Answer

It is strong on clear multi-speaker audio across 90+ languages. Heavy crosstalk and overlapping speech are the hardest cases for any diarizer.

Question 9

Do I need to install anything?

Accepted Answer

No. It’s browser-based; there’s nothing to download.

Question 10

Is my audio private?

Accepted Answer

Your audio file is auto-deleted within 24 hours and your transcript after 30 days; neither is used for AI training.

Question 11

Can I edit or rename the speakers?

Accepted Answer

Yes - rename labels, correct text, and export (TXT and SRT) right in the browser.
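For reference, SRT is a plain-text subtitle format made of numbered cues, each with a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range followed by the text. The sketch below shows how speaker-labeled segments could be written in that format. The function names, the sample segments, and the choice to prefix each cue with the speaker name are illustrative assumptions; the tool's actual export may lay things out differently.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"


def to_srt(segments):
    """segments: list of (start, end, speaker, text) -> SRT document string."""
    cues = []
    for i, (start, end, speaker, text) in enumerate(segments, start=1):
        cues.append(
            f"{i}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{speaker}: {text}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(cues)


# Hypothetical segments; "Alice" stands in for a renamed speaker label.
sample = [
    (0.0, 4.0, "Alice", "Thanks for joining."),
    (4.0, 9.5, "Speaker 2", "Happy to be here."),
]

print(to_srt(sample))
```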

Question 12

How is this different from open-source tools (pyannote / WhisperX)?

Accepted Answer

Those require a Python environment and some setup; this tool gives you the same kind of “who said what” output in the browser, with no code.

Free Speaker Diarization: See Who Said What in Your Audio
