Verify critical details (pricing, licensing, availability) with the model's source before business decisions.

Catalog entry last reviewed 92 days ago.

Codestral Embed

Model family: embeddings

Context

8,192 tokens

Released

2025-05-27

Openness

closed-api

License

Mistral AI Terms of Service (Proprietary API) · commercial: yes

Cost tier

paid-api

Rating

4.0 ★ — State-of-the-art code retrieval performance with aggressive pricing, and Matryoshka-style variable output dimensions (256 to 3072) let you trade retrieval quality for storage cost smoothly. The go-to first-party choice for code-specific RAG and code-agent retrieval workflows. Closed weights keep this from being the obvious universal pick.

Modalities

text

Capabilities

coding, embeddings

Access

api-first-party, api-third-party

embeddings
code
code-search
code-retrieval
proprietary
closed-api
eu-based
matryoshka

Quick Take

Mistral's code-specialized embedding model — purpose-built for code retrieval, outperforms Voyage Code 3 and OpenAI's embeddings on code benchmarks, and lets you pick your output dimensions to trade quality for storage cost.

Plain-English Description

Codestral Embed does one thing very well: convert source code into vector embeddings optimized for retrieval. That narrow focus matters because general-purpose text embedding models (OpenAI's text-embedding-3-large, Cohere Embed, Voyage AI's general models) treat code as just another text input. Codestral Embed was trained specifically on code with the downstream tasks of code search, repository retrieval, and code-agent context retrieval in mind. The results show it — on Mistral's own benchmarks and on independent code-retrieval evaluations, Codestral Embed outperforms the leading general-purpose embedders for code-specific retrieval, including at smaller output dimensions where general embedders lose quality rapidly.

The most interesting technical feature is variable output dimension. Codestral Embed supports Matryoshka-style nested embeddings: the model produces up to 3,072-dimensional vectors, and you can take the first N dimensions (256, 512, 1024, etc.) as a valid lower-dimensional representation without re-embedding. This is a practical operational win because storage costs for embedding indexes scale linearly with dimensionality — cutting from 3,072 to 512 reduces storage by 6× with modest quality loss. Codestral Embed at dimension 256 and int8 precision still outperforms general-purpose embedders at their full dimensions for code retrieval, which makes it genuinely attractive for very large code indexes.
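A minimal sketch of the prefix-truncation idea described above, using only the standard library. The helper name `truncate_embedding` and the toy 8-dimensional vector are illustrative, not part of any Mistral SDK. Rescaling the prefix to unit length is an assumption here: it is a common step when vectors are compared by cosine or dot product, but the entry does not say whether Codestral Embed requires it.

```python
import math

def truncate_embedding(vector, dims):
    """Keep the first `dims` components, then rescale to unit length.

    Hypothetical helper: the renormalization step is an assumption,
    not something the catalog entry specifies.
    """
    if not 0 < dims <= len(vector):
        raise ValueError("dims must be between 1 and len(vector)")
    prefix = vector[:dims]
    norm = math.sqrt(sum(x * x for x in prefix))
    if norm == 0.0:
        return list(prefix)  # nothing to rescale
    return [x / norm for x in prefix]

# Toy 8-dim vector standing in for a 3,072-dim embedding (made-up values).
full = [0.5, -0.5, 0.5, 0.5, 0.1, -0.1, 0.05, 0.0]
short = truncate_embedding(full, 4)
```

The point of the design is that `short` is derived from a vector you already stored, so moving an index to a smaller dimension needs no new API calls.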

Codestral Embed is closed-weight: it is available only as a hosted API, priced at $0.15 per million tokens on Mistral's platform, with a 50% discount on the batch API for offline indexing. For enterprise customers, Mistral offers on-premise deployment agreements, but the weights themselves aren't publicly released. This is consistent with Mistral's broader pattern for API-first products: Codestral Embed, Mistral Embed (general-purpose), Mistral Moderation, Mistral OCR, and Mistral Saba all follow the same arrangement.

Best For

Retrieval-Augmented Generation (RAG) systems over large codebases. Feeding relevant code context into a coding agent or IDE copilot. This is the workload Codestral Embed was built for.
Code search across enterprise repositories. Natural language or code-query search against proprietary code. Strong retrieval quality at variable dimensions matches enterprise-scale search requirements.
Near-duplicate detection and code similarity analysis. Finding functionally similar code across files or repositories. Useful for deduplication, license-policy enforcement, and refactoring-candidate identification.
Code clustering and repository analytics. Unsupervised grouping of code by functionality or architectural pattern. Useful for automated documentation, codebase visualization, and architecture analysis.
High-volume indexing workloads. The 50% batch API discount makes one-time indexing of very large codebases (millions of files) affordable.
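The search and near-duplicate workloads listed above both reduce to comparing embedding vectors. The sketch below shows that comparison with cosine similarity in plain Python. The file names, the 3-dimensional vectors, and the 0.95 threshold are all made up for illustration; a real index would hold vectors returned by the embedding API and would normally use a vector database rather than a brute-force loop.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)

def top_k(query, index, k=3):
    """Return the k (name, score) pairs most similar to `query`."""
    scored = [(cosine(query, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)
    return [(name, score) for score, name in scored[:k]]

def near_duplicates(index, threshold=0.95):
    """Return name pairs whose similarity meets the (assumed) threshold."""
    names = sorted(index)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(index[a], index[b]) >= threshold:
                pairs.append((a, b))
    return pairs

# Toy index: hypothetical file names with made-up 3-dim vectors.
toy_index = {
    "utils/retry.py": [0.9, 0.1, 0.0],
    "net/backoff.py": [0.88, 0.15, 0.02],
    "ui/button.tsx": [0.0, 0.2, 0.95],
}
hits = top_k([1.0, 0.0, 0.0], toy_index, k=2)
dupes = near_duplicates(toy_index)
```

Brute-force comparison is quadratic for duplicate detection, so this shape only suits small experiments.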

Not For

Teams with strict open-weights requirements. Codestral Embed is closed. For open-weight code embeddings, alternatives exist (various community-released Mistral/Ministral fine-tunes, Nomic Embed Code, and others) — none match Codestral Embed's first-party performance but they're inspectable.
General-purpose text embeddings. Codestral Embed is code-specialized. For mixed content (documentation, natural language, code), Mistral Embed (the general-purpose counterpart) or a hybrid approach is better.
Tiny-budget workloads where embedding quality is secondary. $0.15/M tokens is competitive but not free. For small hobby projects, free community code embedders suffice.
Cross-modal retrieval (code + images, code + diagrams). Codestral Embed is text-only; vision-language code retrieval needs a different architecture.

License — Plain-English Summary

Proprietary closed-API model. You pay per token to call Mistral's API and you get the right to use the embeddings in your applications. You don't get the model weights, you can't fine-tune it, and you can't redistribute it. For enterprise customers, Mistral will discuss on-premise deployment under separate commercial terms. Standard proprietary-API arrangement.

How It Compares

vs. OpenAI text-embedding-3-large — OpenAI is a general-purpose embedder with broader language coverage; Codestral Embed is code-specialized and outperforms OpenAI on code-retrieval benchmarks, especially at lower dimensions. OpenAI's model is better for mixed content; Codestral Embed is better for code-specific retrieval.
vs. Voyage Code 3 — Direct competitor; Mistral's benchmarks show Codestral Embed outperforming Voyage Code 3 on retrieval tasks. Voyage has a broader range of models and a longer track record in embedding-specific products.
vs. Cohere Embed v4.0 — Cohere's general-purpose embedder, not code-specialized. Codestral Embed wins on code; Cohere wins on multilingual and general-purpose coverage.
vs. Mistral Embed (general-purpose) — Mistral's own general-purpose text embedder. Use Codestral Embed for code-heavy content; use Mistral Embed for general documentation, mixed content, or natural-language-dominant workloads.

Under the Hood

Codestral Embed generates output vectors of up to 3,072 dimensions with Matryoshka-style nesting: dimensions are ordered by relevance, so the first N dimensions of a full vector form a valid lower-dimensional embedding. This is a practical win over models that require separate model variants or re-embedding for dimensional trade-offs. The model supports float (32-bit, the default) and int8 precision outputs, with int8 at dimension 256 still outperforming competitors at their full dimensions.
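To make the dimension and precision trade-off concrete, here is a small arithmetic sketch of raw vector storage. It counts only the vector payload (4 bytes per float32 value, 1 byte per int8 value); IDs, metadata, and any index structure a vector database adds are not included. The function names and the one-million-vector corpus size are illustrative assumptions.

```python
# Bytes per stored component for the two precisions named in the entry.
BYTES_PER_VALUE = {"float32": 4, "int8": 1}

def index_bytes(num_vectors, dims, dtype="float32"):
    """Raw payload size of an index, excluding IDs and index overhead."""
    return num_vectors * dims * BYTES_PER_VALUE[dtype]

def reduction_factor(dims, dtype, base_dims=3072, base_dtype="float32"):
    """How many times smaller one vector is than the full-size baseline."""
    base = base_dims * BYTES_PER_VALUE[base_dtype]
    return base / (dims * BYTES_PER_VALUE[dtype])

full_size = index_bytes(1_000_000, 3072, "float32")   # 12,288,000,000 bytes
small_size = index_bytes(1_000_000, 256, "int8")      # 256,000,000 bytes
factor_512 = reduction_factor(512, "float32")         # 6.0
factor_256_int8 = reduction_factor(256, "int8")       # 48.0
```

Under these assumptions, a million full-size float vectors take about 12.3 GB, while the same corpus at 256 dimensions and int8 takes about 0.26 GB.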

Context window is 8,192 tokens per chunk. Mistral's recommended chunking strategy for code retrieval is 3,000 characters with 1,000-character overlap; according to their documentation, larger chunks measurably degrade retrieval performance. The embedding space is optimized for code-to-code retrieval (given a query snippet, find similar code) and text-to-code retrieval (given a natural-language query or code documentation, find the matching code).
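A minimal sketch of the recommended chunk size and overlap, assuming a plain fixed-width character window. The function name `chunk_text` is illustrative, and the entry does not say whether Mistral's recommendation also involves splitting on function or class boundaries, so this version ignores code structure entirely.

```python
def chunk_text(text, size=3000, overlap=1000):
    """Split `text` into (start_offset, chunk) pairs of overlapping windows.

    Defaults follow the 3,000-character / 1,000-character-overlap figures
    in the entry; the fixed-window strategy itself is an assumption.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # each window starts this far after the last
    chunks = []
    start = 0
    while start < len(text):
        chunks.append((start, text[start:start + size]))
        if start + size >= len(text):
            break  # the current window already reached the end
        start += step
    return chunks

# A 7,000-character stand-in for a source file.
sample = "x" * 7000
pieces = chunk_text(sample)
offsets = [start for start, _ in pieces]
```

Keeping the start offset alongside each chunk makes it possible to map a retrieved chunk back to its position in the original file.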

The model is accessible via Mistral's Python and TypeScript SDKs, through Spring AI's Mistral integration, via OpenRouter's OpenAI-compatible embeddings API, and through Mistral's batch API for offline indexing workloads. Fine-tuning is not publicly supported — adaptation to domain-specific code is handled through Mistral's enterprise on-premise engagements.
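For orientation, a hedged sketch of what a call through the Python SDK might look like. Everything API-specific here is an assumption to verify against Mistral's current documentation: the `mistralai` package and its `Mistral` client class, the `embeddings.create` method, the model identifier `codestral-embed`, the `output_dimension` and `output_dtype` parameter names, and the `MISTRAL_API_KEY` environment variable. The helper names `build_embed_request` and `embed_code` are hypothetical.

```python
import os

def build_embed_request(snippets, dims=512, dtype="float"):
    """Assemble keyword arguments for an embeddings call.

    The model id and parameter names are assumptions based on Mistral's
    published docs; check them against the SDK version you install.
    """
    return {
        "model": "codestral-embed",
        "inputs": list(snippets),
        "output_dimension": dims,
        "output_dtype": dtype,
    }

def embed_code(snippets, dims=512, dtype="float"):
    """Call the hosted API and return one vector per snippet.

    Requires the third-party `mistralai` package, network access, and
    an API key; the import is deferred so the module loads without them.
    """
    from mistralai import Mistral  # assumed SDK entry point

    client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])
    response = client.embeddings.create(
        **build_embed_request(snippets, dims, dtype)
    )
    return [item.embedding for item in response.data]
```

A call such as `embed_code(["def add(a, b): return a + b"], dims=256, dtype="int8")` would then return one 256-dimensional vector, provided the assumptions above hold.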

Cost

API input (per 1M tokens): $0.15
API providers: mistral, openrouter
Notes: $0.15 per million tokens via Mistral's API. 50% discount available through the batch API for large offline indexing jobs. On-premise deployment is available for enterprise customers — contact Mistral's applied AI team. Recommended chunking: 3,000 characters with 1,000-character overlap for retrieval use cases (larger chunks degrade retrieval quality).
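A quick arithmetic sketch of the pricing above, using the listed $0.15 per million tokens and the 50% batch discount. The two-billion-token corpus size is a made-up example, and actual token counts depend on the tokenizer, so treat the output as a rough estimate only.

```python
PRICE_PER_MILLION_USD = 0.15   # listed API price per 1M tokens
BATCH_DISCOUNT = 0.5           # 50% off through the batch API

def embedding_cost_usd(total_tokens, batch=False):
    """Estimated cost in USD of embedding `total_tokens` tokens."""
    cost = total_tokens / 1_000_000 * PRICE_PER_MILLION_USD
    return cost * (1 - BATCH_DISCOUNT) if batch else cost

# Hypothetical corpus of 2 billion tokens.
standard = embedding_cost_usd(2_000_000_000)           # about 300 USD
batched = embedding_cost_usd(2_000_000_000, batch=True)  # about 150 USD
```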

Pricing data is 92 days old. Verify with the source before relying on it.

Comparable models

Mistral Embed