Speech Transcription with parakeet-tdt-0.6b-v2

This demo showcases parakeet-tdt-0.6b-v2, a 600-million-parameter model designed for high-quality English speech recognition.

Key Features:

  • Automatic punctuation and capitalization
  • Accurate word-level timestamps (click on a segment in the table below to play it!)
  • Efficient transcription of long audio (up to 20 minutes per file; for longer recordings, see this script)
  • Robust performance on spoken numbers and song lyrics
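
For recordings longer than the 20-minute limit listed above, one option is to split the audio into windows and transcribe each window separately. The following is only a minimal sketch of the window arithmetic, not the approach taken by the linked script; the function name `plan_chunks` and its `overlap_s` parameter are hypothetical and chosen here for illustration.

```python
from typing import List, Tuple

# The 20-minute per-file limit stated in the feature list, in seconds.
MAX_CHUNK_SECONDS = 20 * 60


def plan_chunks(
    duration_s: float,
    max_chunk_s: float = MAX_CHUNK_SECONDS,
    overlap_s: float = 0.0,
) -> List[Tuple[float, float]]:
    """Return (start, end) windows in seconds that cover the whole recording.

    Hypothetical helper: each window is at most max_chunk_s long, and
    consecutive windows share overlap_s seconds so that words cut at a
    boundary appear whole in at least one window.
    """
    if duration_s <= 0:
        return []
    if not 0 <= overlap_s < max_chunk_s:
        raise ValueError("overlap_s must be >= 0 and smaller than max_chunk_s")

    step = max_chunk_s - overlap_s
    chunks: List[Tuple[float, float]] = []
    start = 0.0
    while start < duration_s:
        end = min(start + max_chunk_s, duration_s)
        chunks.append((start, end))
        if end >= duration_s:
            break  # the last window reached the end of the recording
        start += step
    return chunks
```

For a 50-minute recording, `plan_chunks(3000)` yields three windows: two full 20-minute windows and a final 10-minute one. Timestamps produced for each window would then need the window's start offset added back to place them on the original timeline.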

This model is available for commercial and non-commercial use.

๐ŸŽ™๏ธ Learn more about the Model | ๐Ÿ“„ Fast Conformer paper | ๐Ÿ“š TDT paper | ๐Ÿง‘โ€๐Ÿ’ป NeMo Repository

Example Audio Files (Click to Load)

Transcription Results (Click row to play segment)
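
A results table of this kind can be built from per-segment timestamps. The sketch below assumes each segment is a dictionary with `start` and `end` times in seconds and the text under a `segment` key; that layout is an assumption made for illustration, and the helper name `segments_to_rows` is hypothetical rather than part of the demo's code.

```python
from typing import Dict, List, Tuple


def segments_to_rows(segments: List[Dict]) -> List[Tuple[str, str, str]]:
    """Convert segment timestamp dicts into (start, end, text) table rows.

    Assumed input layout (not confirmed by this page):
    {"start": float seconds, "end": float seconds, "segment": str}
    """
    rows: List[Tuple[str, str, str]] = []
    for seg in segments:
        rows.append(
            (
                f"{seg['start']:.2f}",  # start time, two decimals
                f"{seg['end']:.2f}",    # end time, two decimals
                seg["segment"].strip(),  # text without surrounding whitespace
            )
        )
    return rows
```

Each row carries the start and end time needed to seek to and play the matching stretch of audio when the row is clicked.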

Transcription Segments
