Verify critical details (pricing, licensing, availability) with the model's source before making business decisions.
Voxtral Mini Transcribe V2
Model family: voxtral
Transcription-optimized Voxtral model for batch ASR of meetings, podcasts, and recordings. $0.003/min via Mistral's API. Apache 2.0 licensed.
Listing Notes
This is the transcription workhorse of the Voxtral family and the model that Mistral's hosted transcription endpoint routes to by default. Use it when you want batch speech-to-text (transcribing recorded meetings, podcasts, or voice notes) rather than speech understanding (Voxtral Small 24B) or real-time streaming (Voxtral Mini 4B Realtime). At $0.003 per minute of audio on Mistral's API, it costs roughly half as much per minute as ElevenLabs Scribe and, according to Mistral's comparative benchmarks, matches its transcription quality. The weights are released under Apache 2.0, so self-hosted deployment is also an option.
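To make the per-minute price concrete, here is a minimal Python sketch that estimates hosted-API cost from audio duration. The only figure taken from this listing is the $0.003/minute rate. The function name `estimate_transcription_cost` is hypothetical, and the sketch assumes billing is strictly proportional to duration with no minimum charge or rounding to whole minutes, which should be checked against Mistral's current pricing terms.

```python
from decimal import Decimal, ROUND_HALF_UP

# Price per minute of audio on Mistral's hosted API, as stated in this listing.
PRICE_PER_MINUTE_USD = Decimal("0.003")


def estimate_transcription_cost(
    audio_seconds: float,
    price_per_minute: Decimal = PRICE_PER_MINUTE_USD,
) -> Decimal:
    """Estimate the USD cost of transcribing `audio_seconds` of audio.

    Assumption (not confirmed by the listing): cost scales linearly with
    duration, with no minimum charge and no rounding up to whole minutes.
    """
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    # Convert via str() so float inputs do not carry binary rounding noise.
    minutes = Decimal(str(audio_seconds)) / Decimal(60)
    cost = minutes * price_per_minute
    # Round to a hundredth of a cent for display.
    return cost.quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    one_hour = estimate_transcription_cost(60 * 60)
    archive = estimate_transcription_cost(1000 * 60 * 60)
    print(f"1 hour of audio:     ${one_hour}")
    print(f"1,000 hours of audio: ${archive}")
```

Under these assumptions a one-hour meeting recording comes to about $0.18, and a 1,000-hour archive to about $180.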
Identity
- Creator: Mistral AI
- Model family: voxtral
- Release date: 2026-02-03
Technical specs
- Parameter count: 3B
- Context window: 33K tokens
- Modalities: Audio input, Text
- Primary capabilities: Multilingual, Speech-to-text
License
- License: Apache 2.0
- Commercial use: Allowed
- Terms: Modification ✓, Redistribution ✓, Attribution ✓
Access
- Openness: Open weight
- Access methods: First-party API, Local runtime (vLLM), Weights download (Hugging Face)
- Cost tier: Mixed
- Cost details: $0.003/minute on Mistral's hosted API, roughly half the price of ElevenLabs Scribe for comparable transcription quality according to Mistral's benchmarks.
Tags
- audio
- speech-to-text
- transcription
- multilingual
- open-weight
- commercial-friendly
- apache-licensed
- eu-based