AI Glossary

Whisper

OpenAI's open-source automatic speech recognition model that transcribes speech in roughly 100 languages and translates it into English, approaching human-level accuracy on English speech.

Architecture

Whisper is a transformer encoder-decoder trained on 680,000 hours of multilingual audio paired with transcripts collected from the web. A single model handles transcription, translation into English, language identification, and voice activity detection; special tokens at the start of the decoder input select the task.
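
One way to picture how a single model covers several tasks is to look at the decoder prompt, where Whisper specifies the language and task with special tokens. The sketch below is illustrative only: the token spellings follow the multitask format described for Whisper, but build_prompt is a hypothetical helper written for this entry, not part of any Whisper library, and it produces plain strings rather than real token IDs.

```python
from typing import List


def build_prompt(language: str, task: str = "transcribe",
                 timestamps: bool = False) -> List[str]:
    """Assemble a Whisper-style decoder prompt as a list of token strings.

    Hypothetical helper for illustration; a real tokenizer maps these
    special tokens to integer IDs.
    """
    if task not in ("transcribe", "translate"):
        raise ValueError(f"unknown task: {task!r}")

    # Every sequence starts with the start-of-transcript marker,
    # followed by a language tag and a task tag.
    prompt = ["<|startoftranscript|>", f"<|{language}|>", f"<|{task}|>"]

    # When timestamps are not requested, a dedicated token says so.
    if not timestamps:
        prompt.append("<|notimestamps|>")
    return prompt


# French audio, translated into English text, no timestamps.
print(build_prompt("fr", "translate"))
```

Switching the task string from "translate" to "transcribe" changes only one token in the prompt, which is why the same weights can serve both tasks.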

Impact

Released under the MIT license with open weights and inference code, Whisper became the foundation for a wide range of speech applications. Its multilingual coverage and robustness to noise made high-quality speech recognition available to developers without relying on proprietary services.
