Speech Transcription with parakeet-tdt-0.6b-v2

This demo showcases parakeet-tdt-0.6b-v2, a 600-million-parameter model designed for high-quality English speech recognition.

Key Features:

Automatic punctuation and capitalization
Accurate word-level timestamps (click on a segment in the table below to play it!)
Efficient transcription of long audio segments of up to 20 minutes (for even longer audio, see this script)
Robust performance on spoken numbers and song lyrics
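
To illustrate how word-level timestamps can feed a segment table like the one in this demo, here is a minimal sketch. It is not the demo's own code. It assumes each word arrives as a dict with "word", "start", and "end" keys (times in seconds), which is a hypothetical shape chosen for illustration, and it relies on the model's automatic punctuation to decide where a segment ends. The names Segment and words_to_segments are likewise made up for this example.

```python
from dataclasses import dataclass

# Punctuation marks treated as the end of a segment (an assumption of this sketch).
SENTENCE_END = (".", "?", "!")


@dataclass
class Segment:
    start: float  # segment start time in seconds
    end: float    # segment end time in seconds
    text: str     # words of the segment joined by single spaces


def _close(words):
    """Build one Segment from a non-empty run of word dicts."""
    text = " ".join(w["word"] for w in words)
    return Segment(start=words[0]["start"], end=words[-1]["end"], text=text)


def words_to_segments(words):
    """Group word-level timestamps into sentence-like segments.

    `words` is assumed to be a list of dicts with "word", "start", "end"
    keys (hypothetical shape). A segment is closed whenever a word ends
    with sentence-final punctuation; trailing words form a last segment.
    """
    segments = []
    current = []
    for w in words:
        current.append(w)
        if w["word"].endswith(SENTENCE_END):
            segments.append(_close(current))
            current = []
    if current:
        segments.append(_close(current))
    return segments


# Illustrative input, not real model output.
example_words = [
    {"word": "Hello", "start": 0.0, "end": 0.4},
    {"word": "there.", "start": 0.5, "end": 0.9},
    {"word": "How", "start": 1.2, "end": 1.4},
    {"word": "are", "start": 1.5, "end": 1.6},
    {"word": "you?", "start": 1.7, "end": 2.0},
    {"word": "Fine", "start": 2.5, "end": 2.9},
]
example_segments = words_to_segments(example_words)
```

Each resulting Segment carries the start time of its first word and the end time of its last word, which is the information a clickable row would need to play back just that span.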

This model is available for commercial and non-commercial use.

Transcription Results (Click row to play segment)

Transcription Segments
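
Playing a single segment comes down to mapping its start and end times onto sample positions in the audio. The sketch below shows one way to do that under stated assumptions: the function name segment_to_sample_range is hypothetical, the 16 kHz sample rate in the usage lines is an assumed value for illustration, and the demo's actual playback logic may differ.

```python
def segment_to_sample_range(start_s, end_s, sample_rate, total_samples):
    """Convert a segment's [start_s, end_s] in seconds to a sample range.

    Returns (first, last) such that audio[first:last] covers the segment.
    The range is clamped to [0, total_samples] so a timestamp that runs
    slightly past the end of the file still yields a valid slice.
    """
    if end_s < start_s:
        raise ValueError("segment end must not precede its start")
    first = max(0, int(round(start_s * sample_rate)))
    last = min(total_samples, int(round(end_s * sample_rate)))
    # Guard against start times beyond the end of the audio.
    return min(first, total_samples), max(min(first, total_samples), last)


# Usage with assumed values: 3 seconds of audio at 16 kHz.
inside = segment_to_sample_range(1.2, 2.0, 16000, 48000)
clamped = segment_to_sample_range(2.5, 4.0, 16000, 48000)
```

Rounding to the nearest sample avoids off-by-one drift from floating-point times, and clamping keeps the slice valid for segments near the end of a recording.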

Selected Segment