Voxtral Transcribes at the Speed of Sound (2026)

Introducing Voxtral Transcribe 2

Today we're releasing Voxtral Transcribe 2, our next generation of speech-to-text models. The release includes two models: Voxtral Mini Transcribe V2 for batch transcription and Voxtral Realtime for live, low-latency applications. Voxtral Realtime is released as open weights under the Apache 2.0 license, so it can be deployed on your own infrastructure as well as used through the API.

We've also launched an interactive audio playground in Mistral Studio (https://console.mistral.ai/build/audio/speech-to-text) that allows you to test transcription instantly, complete with diarization and timestamps, powered by Voxtral Transcribe 2.

Here's a breakdown of the key features:

Voxtral Mini Transcribe V2

  • State-of-the-art Transcription: Speaker diarization, context biasing, and word-level timestamps in 13 languages.
  • Lowest Word Error Rate: Delivers the lowest word error rate at the lowest price point among competing transcription services.
  • Multilingual Support: Supports 13 languages, including English, Chinese, Hindi, Spanish, Arabic, French, Portuguese, Russian, German, Japanese, Korean, Italian, and Dutch.
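Word error rate (WER), the accuracy metric cited above, is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference, divided by the number of reference words. The following is a minimal sketch of that standard definition. The lowercasing and whitespace tokenization are simplifying assumptions for illustration, not the text normalization used in any benchmark behind this announcement.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    # Assumption: lowercase + whitespace split stands in for real text normalization.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (word-level Levenshtein distance).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

For example, if the reference is "the cat sat on the mat" and the transcript drops one "the", the result is one deletion over six reference words. Note that WER can exceed 1.0 when the transcript contains many inserted words.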

Voxtral Realtime

  • Real-Time Transcription: Purpose-built for low-latency use, with sub-200ms latency suited to voice agents and other live applications.
  • Multilingual: Strong transcription performance in 13 languages.
  • Edge Deployment: Deployable on edge devices for privacy-first applications.

Best-in-Class Efficiency

  • Industry-Leading Accuracy: Transcription quality that outperforms competing services at a fraction of their cost.
  • Lowest Price Point: At $0.003 per minute via API, Voxtral Mini Transcribe V2 offers the best price-performance ratio on the market.

Transforming Voice Applications

Voxtral supports voice applications across a range of industries:

  • Meeting Intelligence: Transcribes multilingual recordings with speaker diarization, so meeting content can be annotated accurately by speaker.
  • Voice Agents and Virtual Assistants: Enables conversational AI with sub-200ms transcription latency for natural voice interfaces.
  • Contact Center Automation: Transcribes calls in real time to drive sentiment analysis, response suggestions, and CRM field population.
  • Media and Broadcast: Generates live multilingual subtitles with minimal latency, using context biasing to get technical terms right.
  • Compliance and Documentation: Supports regulatory compliance monitoring with clear speaker attribution and timestamped transcripts that serve as audit trails.
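To illustrate how diarization and word-level timestamps combine in use cases such as meeting notes or audit trails, here is a rough sketch that groups timestamped words into speaker turns. The `Word` structure, its field names, and the speaker labels are hypothetical and chosen for illustration only; they are not the Voxtral API response format, which is described in the documentation.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical word-level record; not the actual API response schema."""
    text: str
    start: float   # seconds from the start of the audio
    end: float
    speaker: str   # diarization label, e.g. "speaker_1"


def format_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as HH:MM:SS (fractions are truncated)."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def render_transcript(words: list[Word]) -> list[str]:
    """Merge consecutive words from the same speaker into one timestamped line."""
    lines: list[str] = []
    current_speaker = None
    turn_start = 0.0
    turn_words: list[str] = []

    def flush() -> None:
        # Emit the turn collected so far, if there is one.
        if turn_words:
            stamp = format_timestamp(turn_start)
            lines.append(f"[{stamp}] {current_speaker}: {' '.join(turn_words)}")

    for word in words:
        if word.speaker != current_speaker:
            flush()
            current_speaker = word.speaker
            turn_start = word.start
            turn_words = []
        turn_words.append(word.text)
    flush()
    return lines
```

Each output line carries the start time of the turn and the speaker label, which is the minimum needed to attribute a statement to a speaker at a given point in a recording.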

Get Started

Voxtral Mini Transcribe V2 is available now via API at $0.003 per minute. Voxtral Realtime is available via API at $0.006 per minute and as open weights on Hugging Face.
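As a quick way to budget a workload against these prices, the sketch below multiplies audio duration by the per-minute rates quoted above. The dictionary keys are illustrative labels rather than API model identifiers, and the calculation assumes billing is proportional to duration with no rounding up to whole minutes, which the announcement does not specify.

```python
from decimal import Decimal

# Per-minute API prices in USD, as quoted in this article.
# The keys are illustrative labels, not official API model identifiers.
PRICE_PER_MINUTE = {
    "mini-transcribe-v2": Decimal("0.003"),
    "realtime": Decimal("0.006"),
}


def estimate_cost(model_key: str, audio_seconds: float) -> Decimal:
    """Estimate API cost in USD for a given amount of audio.

    Assumption: cost scales linearly with duration (no per-request minimum
    and no rounding up to whole minutes).
    """
    minutes = Decimal(str(audio_seconds)) / Decimal(60)
    cost = minutes * PRICE_PER_MINUTE[model_key]
    return cost.quantize(Decimal("0.0001"))  # round to a hundredth of a cent


# One hour of batch audio, and 1,000 hours of real-time audio.
one_hour_batch = estimate_cost("mini-transcribe-v2", 3600)
thousand_hours_realtime = estimate_cost("realtime", 1000 * 3600)
```

Under these assumptions, one hour of audio costs $0.18 with Voxtral Mini Transcribe V2, and 1,000 hours of audio costs $360 with Voxtral Realtime.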

Explore the full capabilities of Voxtral Transcribe 2 and Mistral's audio transcription features in our documentation: https://docs.mistral.ai/capabilities/audio_transcription

Join Our Team: We're hiring passionate individuals to build world-class speech AI. Apply now: https://mistral.ai/careers

Article information

Author: Edmund Hettinger DC
