Cohere releases an open source audio model built specifically for transcription

AI company Cohere on Thursday launched its first audio model: Transcribe, an open source automatic speech recognition model that can be used for tasks like note-taking and speech analysis.

Relatively lightweight at just 2 billion parameters, the model is intended for use with consumer GPUs for those who want to self-host it. It currently supports 14 languages: English, French, German, Italian, Spanish, Portuguese, Greek, Dutch, Polish, Chinese, Japanese, Korean, Vietnamese, and Arabic.

Cohere says Transcribe outperforms models like Zoom Scribe v1, IBM Granite 4.0 1B Speech, ElevenLabs Scribe v2, and Qwen3-ASR-1.7B on the Hugging Face Open ASR Leaderboard, achieving an average word error rate (WER) of 5.42, which is lower than any other model in the benchmark.
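For readers unfamiliar with the metric, WER is conventionally the number of word substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. The sketch below illustrates that standard definition on a made-up sentence pair. It is not Cohere's or the leaderboard's evaluation code, and the lowercasing step is only a simplified stand-in for the text normalization that real benchmarks apply.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference word count."""
    # Simplified normalization (an assumption): lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from transcript
                curr[j - 1] + 1,     # insertion: extra word in transcript
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example sentences, not taken from any benchmark.
reference = "the meeting starts at nine tomorrow morning"
hypothesis = "the meeting start at nine tomorrow"
wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.2%}")
```

In this example one word is substituted and one is dropped, so two errors over seven reference words. A leaderboard average such as 5.42 is presumably this ratio expressed as a percentage and averaged across test sets.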

The company claims that Transcribe achieved an average win rate of 61% against other models when human raters evaluated its transcripts for accuracy, consistency, and ease of use. However, the model fell behind its competitors when transcribing Portuguese, German, and Spanish.

Cohere says Transcribe can process 525 minutes of audio in one minute, which is high for its model class.
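To put that throughput figure in practical terms, the short calculation below converts it into an estimated processing time for a recording of a given length. It takes the 525-to-1 ratio at face value and assumes it scales linearly; the hardware and batch settings behind the figure are not stated here, so actual times may differ.

```python
# Throughput attributed to Cohere: minutes of audio processed per minute of compute.
AUDIO_MINUTES_PER_WALL_MINUTE = 525


def processing_seconds(audio_minutes: float,
                       speed: float = AUDIO_MINUTES_PER_WALL_MINUTE) -> float:
    """Estimated wall-clock seconds to transcribe `audio_minutes` of audio."""
    return audio_minutes / speed * 60


one_hour = processing_seconds(60)
print(f"One hour of audio: about {one_hour:.1f} seconds")
```

At that rate, an hour-long recording would take roughly seven seconds to transcribe.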

The company plans to integrate Transcribe into its enterprise agent orchestration platform, North. The model is available for free through Cohere's API, and it will also be available on Model Vault, Cohere's managed inference platform.

Speech recognition models are growing in popularity as demand rises for note-taking and dictation applications such as Granola and Wispr Flow.

Earlier this year, it was reported that Cohere told investors it had generated annual recurring revenue of $240 million in 2025, and its CEO, Aidan Gomez, was quoted as saying that the startup may go public "soon."
