Verify critical details (pricing, licensing, availability) with the model's source before making business decisions.

Feature-frozen. The creator has frozen feature development on this model (critical fixes only).

DeepSeek-R1-Distill-Qwen-7B

distillation derivative of Qwen2.5-Math-7B by DeepSeek

Fine-tuned (distilled) from Qwen2.5-Math-7B on 800K reasoning samples generated by DeepSeek-R1, transferring R1's step-by-step chain-of-thought reasoning into a smaller dense model.

Size

small (7.0B params)

Context

131,072 tokens

Released

2025-01-19

Openness

open-weight

License

MIT License (over Apache 2.0 base) · commercial: yes

Cost tier

mixed

Rating

4.0 ★ — A laptop-class reasoning model with strong math, a clean MIT-over-Apache license, and broad ecosystem support — one of the most practical small reasoning models available.

Modalities

text

Capabilities

coding, math, reasoning

Access

local-runtime-llama-cpp, local-runtime-lm-studio, local-runtime-ollama, local-runtime-vllm, weights-download-hf

llm
open-weight
commercial-friendly
small
reasoning
math
on-device
distillation
china-based
apache-2-0

Quick Take

A laptop-class reasoning model: R1's chain-of-thought distilled onto Qwen2.5-Math-7B, with a clean MIT-over-Apache license and a big following.

Plain-English Description

The 7B R1 distill is one of the most widely used small reasoning models. It takes the full DeepSeek-R1's step-by-step problem-solving and compresses it into a 7-billion-parameter model built on Qwen2.5-Math, small enough to run on a single consumer GPU or a capable laptop.
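As a rough check on the "single consumer GPU or capable laptop" claim, the arithmetic for weight memory can be sketched as below. This is a back-of-the-envelope estimate, not a measured figure: it takes the 7.0B parameter count listed on this page at face value, counts only the weights, and ignores the KV cache, activations, and any per-format quantization overhead. The function name `weight_memory_gb` is made up for this illustration.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: parameter count as listed on this page.
N_PARAMS = 7.0e9

# Common precisions; real quantized files are somewhat larger because of
# scales, metadata, and layers kept at higher precision.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(N_PARAMS, bits):.1f} GB")
```

Under these assumptions the weights need about 14 GB at 16-bit, 7 GB at 8-bit, and 3.5 GB at 4-bit, which is why quantized builds are the usual route for laptops. Actual usage will be higher once context length and runtime overhead are included.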

It's particularly strong on math and logic, where the distilled chain-of-thought pays off, and it's a common choice for private, offline reasoning assistants. Like all the distills, it's a reasoning specialist rather than a general-purpose chatbot.

For a business that wants a capable reasoning model running entirely in-house at no per-token cost, the 7B distill hits a sweet spot of capability, footprint, and clean licensing.

Best For

A private, offline reasoning assistant for math, logic, and code on a laptop or single GPU.
Edge deployments needing real step-by-step reasoning in a small footprint.
Cost-free local experimentation and fine-tuning.
Drop-in reasoning for pipelines where a clean commercial license matters.
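For pipeline use, the step-by-step reasoning usually has to be separated from the final answer. A minimal sketch follows, assuming the model wraps its reasoning in `<think>` ... `</think>` tags as the R1 family typically does, and that some runtimes put the opening tag in the prompt so only the closing tag appears in the output. The helper name `split_reasoning` is hypothetical and not part of any library.

```python
def split_reasoning(text: str) -> tuple[str, str]:
    """Split raw model output into (reasoning, answer).

    Assumes reasoning ends at the first "</think>" tag. If no closing tag
    is present, the whole text is treated as the answer.
    """
    head, sep, tail = text.partition("</think>")
    if not sep:
        return "", text.strip()
    # Drop the opening tag if the runtime included it in the output.
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


raw = "<think>\n2 + 2 is 4.\n</think>\n\nThe answer is 4."
reasoning, answer = split_reasoning(raw)
print(answer)
```

Keeping the reasoning rather than discarding it can help with debugging and audit logs, while only the answer is passed downstream.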

Not For

General chat or open-ended writing — it's tuned for structured reasoning.
The strongest reasoning in this family: DeepSeek-R1-Distill-Qwen-14B and the 32B go further.
Multimodal tasks — text only.

License — Plain-English Summary

This distill is unusually clean on licensing. DeepSeek released its fine-tuned weights under the permissive MIT license, and the base it was built on — Qwen2.5-Math-7B — is Apache 2.0. Both layers allow commercial use, modification, fine-tuning, and redistribution with no royalties and no user-count carve-outs; you just keep the respective notices. That's a meaningful contrast with the Llama-based R1 distills, which inherit Meta's community license and its 700M-monthly-user clause. If clean commercial terms matter, the Qwen-based distills like this one are the easier choice.

How It Compares

Against DeepSeek-R1-Distill-Qwen-1.5B, the 7B is meaningfully stronger for a modest hardware step-up. Against the similarly sized DeepSeek-R1-Distill-Llama-8B, the 7B often edges it on math and comes with a cleaner license. Against DeepSeek-R1-Distill-Qwen-14B, it's the lighter option when you don't have a bigger GPU.

Cost

Self-hosted cost: $0.00 beyond compute
Notes: Free to self-host under the MIT license (over an Apache 2.0 base); also served by third-party hosts.

Commercial-use conditions

DeepSeek released the distilled weights under MIT; the base model (Qwen2.5-Math-7B) is Apache 2.0. Both layers are permissive and allow commercial use, so there are no carve-outs to worry about here.