Review



Summary

Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.
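As a rough illustration of those tasks, here is a minimal sketch assuming the `openai-whisper` Python package and `ffmpeg` are installed. `run_whisper` and `format_result` are hypothetical helper names made up for this example and are not part of the repo; `whisper.load_model` and `model.transcribe` are the package's own calls.

```python
from typing import Any, Dict


def run_whisper(audio_path: str, model_name: str = "base",
                task: str = "transcribe") -> Dict[str, Any]:
    """Run Whisper on one file (hypothetical helper).

    task="transcribe" keeps the spoken language;
    task="translate" produces English text.
    """
    # Lazy import so the formatting helper below works without the package.
    # Assumes `pip install -U openai-whisper` and ffmpeg on PATH.
    import whisper

    model = whisper.load_model(model_name)
    # No `language` argument is passed, so Whisper detects the language itself.
    return model.transcribe(audio_path, task=task)


def format_result(result: Dict[str, Any]) -> str:
    """Render the detected language and timestamped segments as plain text."""
    lines = [f"language: {result.get('language', 'unknown')}"]
    for seg in result.get("segments", []):
        lines.append(
            f"[{seg['start']:.2f} -> {seg['end']:.2f}] {seg['text'].strip()}"
        )
    return "\n".join(lines)
```

With a local recording (the path here is a placeholder), `print(format_result(run_whisper("audio.mp3")))` would show the detected language followed by the segments, and passing `task="translate"` would request English output instead.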

Most heavily developed recently by:

Pros

Cons

Next

I would love a better perspective on what the biggest issues for this repo are and whether anyone is using it in production.

The "Show and tell" discussion category has helpful stories.

Examples are also available at https://huggingface.co/spaces/aadnk/whisper-webui