AI Tools.


Automatic speech recognition models

32 models · ranked by HuggingFace downloads

speaker-diarization-3.1

speaker-diarization-3.1 is an open-source speaker diarization pipeline (it labels who speaks when rather than transcribing speech), listed under the automatic-speech-recognition task on HuggingFace. Details are sourced from the public model registry.

10,392,314 ↓ · 1,812 ♡

whisperkit-coreml

whisperkit-coreml is an open-source automatic-speech-recognition model available on HuggingFace. Details are sourced from the public model registry.

9,464,576 ↓ · 170 ♡

whisper-large-v3-turbo

whisper-large-v3-turbo is an open-source automatic-speech-recognition model available on HuggingFace. Details are sourced from the public model registry.

7,440,086 ↓ · 2,981 ♡

whisper-large-v3

whisper-large-v3 is an open-source automatic-speech-recognition model available on HuggingFace. Details are sourced from the public model registry.

5,005,049 ↓ · 5,644 ♡

wav2vec2-large-xlsr-53-russian

wav2vec2-large-xlsr-53-russian is an open-source automatic-speech-recognition model available on HuggingFace. Details are sourced from the public model registry.

4,786,573 ↓ · 74 ♡

mms-300m-1130-forced-aligner

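mms-300m-1130-forced-aligner is an open-source forced-alignment model (it aligns an existing transcript to audio rather than transcribing speech), listed under the automatic-speech-recognition task on HuggingFace. Details are sourced from the public model registry.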

3,676,467 ↓ · 86 ♡

wav2vec2-large-xlsr-53-portuguese

wav2vec2-large-xlsr-53-portuguese is an open-source automatic-speech-recognition model available on HuggingFace. Details are sourced from the public model registry.

3,472,487 ↓ · 53 ♡

voice-activity-detection

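voice-activity-detection is an open-source voice activity detection model (it finds the regions of audio that contain speech rather than transcribing them), listed under the automatic-speech-recognition task on HuggingFace. Details are sourced from the public model registry.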

3,114,831 ↓ · 231 ♡

speaker-diarization-community-1

speaker-diarization-community-1 is an open-source speaker diarization pipeline (it labels who speaks when rather than transcribing speech), listed under the automatic-speech-recognition task on HuggingFace. Details are sourced from the public model registry.

2,631,892 ↓ · 333 ♡

whisper-small

whisper-small is an open-source automatic-speech-recognition model available on HuggingFace. Details are sourced from the public model registry.

2,143,759 ↓ · 551 ♡

Qwen3-ASR-1.7B

Qwen3-ASR-1.7B is an open-source automatic-speech-recognition model available on HuggingFace. Details are sourced from the public model registry.

1,900,052 ↓ · 772 ♡

whisper-base

whisper-base is an open-source automatic-speech-recognition model available on HuggingFace. Details are sourced from the public model registry.

1,761,447 ↓ · 267 ♡

wav2vec2-large-xlsr-53-polish

wav2vec2-large-xlsr-53-polish is an open-source automatic-speech-recognition model available on HuggingFace. Details are sourced from the public model registry.

1,499,661 ↓ · 12 ♡

distil-large-v3

distil-large-v3 is an open-source automatic-speech-recognition model available on HuggingFace. Details are sourced from the public model registry.

1,466,997 ↓ · 376 ♡

mms-1b-all

mms-1b-all is an open-source automatic-speech-recognition model available on HuggingFace. Details are sourced from the public model registry.

1,335,002 ↓ · 198 ♡

wav2vec2-base-960h

wav2vec2-base-960h is an open-source automatic-speech-recognition model available on HuggingFace. Details are sourced from the public model registry.

1,231,714 ↓ · 396 ♡

wav2vec2-large-xlsr-53-japanese

wav2vec2-large-xlsr-53-japanese is an open-source automatic-speech-recognition model available on HuggingFace. Details are sourced from the public model registry.

957,614 ↓ · 56 ♡

wav2vec2-large-xlsr-53-chinese-zh-cn

wav2vec2-large-xlsr-53-chinese-zh-cn is an open-source automatic-speech-recognition model available on HuggingFace. Details are sourced from the public model registry.

939,660 ↓ · 133 ♡

wav2vec2-large-xlsr-korean

A wav2vec2-large XLSR model fine-tuned on Korean speech data from the Zeroth Korean dataset. Performs automatic speech recognition for Korean audio, leveraging cross-lingual self-supervised pretraining followed by supervised fine-tuning on Korean-specific acoustic patterns.

692,637 ↓ · 55 ♡
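
Each entry in this listing reports its counts in the same `downloads ↓ · likes ♡` format. As a minimal sketch of how those lines could be parsed and the download ranking checked, the snippet below uses only the Python standard library. The names `parse_stats` and `is_ranked` are hypothetical helpers for illustration, not part of any HuggingFace API.

```python
import re

# Matches stat lines such as "10,392,314 ↓ · 1,812 ♡".
STAT_RE = re.compile(r"^\s*([\d,]+)\s*↓\s*·\s*([\d,]+)\s*♡\s*$")


def parse_stats(line):
    """Return (downloads, likes) as integers, or None if the line is not a stat line."""
    match = STAT_RE.match(line)
    if match is None:
        return None
    downloads, likes = (int(group.replace(",", "")) for group in match.groups())
    return downloads, likes


def is_ranked(stat_lines):
    """Return True if download counts appear in non-increasing order."""
    downloads = [parse_stats(line)[0] for line in stat_lines]
    return all(a >= b for a, b in zip(downloads, downloads[1:]))


# Stat lines copied from the first three entries of the listing.
sample = [
    "10,392,314 ↓ · 1,812 ♡",
    "9,464,576 ↓ · 170 ♡",
    "7,440,086 ↓ · 2,981 ♡",
]

print(parse_stats(sample[0]))
print(is_ranked(sample))
```

Note that likes do not follow the download order: the third entry above has more likes than the first two despite fewer downloads.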