ASR Models

Architectures for automatic-speech-recognition (ASR), which convert acoustic signals into text sequences. They range from hybrid HMM-DNN systems to end-to-end Transformer and Conformer networks, and include streaming, non-streaming, and multimodal variants that integrate computer-vision or large-language-model context.

Notable Models & Updates

IBM Granite Speech 4.1: Open ASR model within the Granite 4.1 family spanning language, vision, speech, and embeddings; emphasized for inference speed and enterprise applicability.
Analysis Reference: "IBM Granite Speech 4.1 ASR Models: Features, Accuracy, and Enterprise Applications" (Sam Witteveen, 2026-05-08) covers features, accuracy benchmarks, and speed evaluation.
Key Capabilities: Open-weight availability; optimized for low-latency transcription; part of broader multimodal foundation suite.
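
Accuracy benchmarks for ASR models are commonly reported as word error rate (WER): the word-level edit distance between a reference transcript and the model's hypothesis, divided by the number of reference words. The source does not state which metric the referenced analysis uses, so WER is an assumption here. The function name `word_error_rate` and the sample sentences are illustrative only and not tied to Granite Speech. A minimal sketch in plain Python:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


# Illustrative transcripts: one substitution (sat -> sit) and one deletion (the).
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER: {wer:.3f}")
```

Published benchmarks typically normalize casing and punctuation before scoring; this sketch compares whitespace-separated tokens exactly as given, so its numbers are not directly comparable to reported results.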

Speech Processing
Language Modeling
model-efficiency
open-source

NemoClaw Knowledge Wiki
