DeepSeek R1 VRAM & GPU Requirements

How much VRAM each DeepSeek R1 variant needs — from the 8B distilled model on a single consumer GPU to the full 671B running across H100 clusters.

DeepSeek R1 comes in several sizes. The distilled variants run on consumer hardware; the full model does not. Here is what you actually need for each one.

Model variants at a glance

Model	Parameters	Minimum VRAM	Comfortable setup
R1 8B (Distilled)	8B	6 GB (Q4)	8 GB
R1 14B (Distilled)	14B	10 GB (Q4)	16 GB
R1 32B (Distilled)	32B	18 GB (Q4)	24 GB
R1 70B (Distilled)	70B	35 GB (Q4)	48 GB
R1 Full	671B	~335 GB (Q4)	640 GB+
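
The minimums in the table follow from simple arithmetic: parameter count times bits per weight, divided by 8. Here is a minimal sketch of that calculation. The function name `weight_gb` is illustrative, it uses decimal gigabytes, and it counts weights only, so KV cache and runtime overhead come on top. Real 4-bit formats also tend to store somewhat more than 4 bits per weight, which is part of why the table's minimums sit above the raw numbers.

```python
# Rough weight-memory estimate: parameters x bits per weight / 8.
# Assumptions: decimal gigabytes (1 GB = 1e9 bytes); weights only.
# KV cache, activations, and runtime overhead are NOT included.

def weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Return the approximate size of the model weights in GB."""
    return params_billions * bits_per_weight / 8


if __name__ == "__main__":
    variants = [("8B", 8), ("14B", 14), ("32B", 32), ("70B", 70), ("671B", 671)]
    for name, params in variants:
        q4 = weight_gb(params, 4)
        fp8 = weight_gb(params, 8)
        bf16 = weight_gb(params, 16)
        print(f"{name:>5}: Q4 ~{q4:.1f} GB, FP8 ~{fp8:.1f} GB, BF16 ~{bf16:.1f} GB")
```

For example, `weight_gb(70, 4)` gives 35.0 and `weight_gb(8, 16)` gives 16.0, matching the Q4 figure for the 70B model and the BF16 figure for the 8B model.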

DeepSeek R1 8B (Distilled)

Recommended VRAM: 8 GB+
Good GPUs: RTX 4060 8GB, RTX 3070 8GB
Runs comfortably at 4-bit quantization on any modern 8 GB card.
Full BF16 needs ~16 GB; stick to Q4/Q5 for 8 GB cards.

DeepSeek R1 14B (Distilled)

Recommended VRAM: 16 GB
Good GPUs: RTX 4080 16GB, RTX 4060 Ti 16GB
Q4 fits in 10–12 GB, but on a 12 GB card that leaves little room for context, and anything that spills to system RAM slows generation; 16 GB gives you headroom for longer contexts.

DeepSeek R1 32B (Distilled)

Recommended VRAM: 24 GB+
Good GPUs: RTX 3090 24GB, RTX 4090 24GB
Q4 can squeeze into ~18–20 GB, but 24 GB is the practical minimum for a usable workflow.
Long reasoning chains (R1's strong suit) consume extra KV cache — headroom matters here.
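
To get a feel for how much headroom long reasoning chains need, the KV cache can be estimated from the model's attention shape. The sketch below is a rough estimate, not a measurement. The default values (64 layers, 8 KV heads, head dimension 128, 16-bit cache) are assumptions about the 32B distilled model's architecture; check the model's config file before relying on them. The function name `kv_cache_gib` is illustrative.

```python
# KV cache estimate: 2 (keys and values) x layers x KV heads x head dim
# x bytes per value, per token of context.
# The defaults are ASSUMED values for the 32B distilled model; verify
# them against the model's own config before trusting the result.

def kv_cache_gib(
    tokens: int,
    n_layers: int = 64,
    n_kv_heads: int = 8,
    head_dim: int = 128,
    bytes_per_value: int = 2,  # 16-bit cache entries
) -> float:
    """Return the approximate KV cache size in GiB for one sequence."""
    bytes_per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return tokens * bytes_per_token / 2**30


if __name__ == "__main__":
    for ctx in (4096, 16384, 32768):
        print(f"{ctx:>6} tokens: ~{kv_cache_gib(ctx):.1f} GiB of KV cache")
```

Under these assumptions a 32,768-token context takes about 8 GiB on top of the weights, which is why an 18–20 GB Q4 model gets tight on a 24 GB card once the reasoning output grows.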

Can't run 32B locally? Compare RunPod, Vast.ai, and other cloud GPU rental prices on the homepage.

DeepSeek R1 70B (Distilled)

Recommended VRAM: 48 GB+
Good GPUs: 2× RTX 3090 24GB, 1× RTX A6000 48GB, 1× L40S 48GB
At Q4 you can squeeze by on ~35–40 GB, but a single 48 GB card is the cleanest single-GPU option.
Dual RTX 3090 is the cheapest path to 48 GB on consumer hardware. The model's layers are split across the two cards; NVLink is optional and does not merge them into a single 48 GB pool.
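
A dual-GPU setup works by assigning each card a share of the model's layers. The sketch below shows one simple way to divide layers in proportion to each card's VRAM. It is an illustration of the idea, not how any particular inference runtime does it; `split_layers` is a hypothetical helper, and the 80-layer count for the 70B model is an assumption to verify against the model's config.

```python
from typing import List

# Divide a model's layers across GPUs in proportion to their VRAM.
# Illustrative only: real runtimes also account for the KV cache,
# embeddings, and output layers when placing weights.

def split_layers(n_layers: int, vram_gb: List[float]) -> List[int]:
    """Return how many layers each GPU should hold."""
    total = sum(vram_gb)
    # Floor of each GPU's proportional share.
    counts = [int(n_layers * v / total) for v in vram_gb]
    # Hand any leftover layers to the GPUs with the most memory first.
    leftover = n_layers - sum(counts)
    by_size = sorted(range(len(vram_gb)), key=lambda i: vram_gb[i], reverse=True)
    for i in by_size[:leftover]:
        counts[i] += 1
    return counts


if __name__ == "__main__":
    # Assumed: 80 layers for the 70B distilled model, two 24 GB cards.
    print(split_layers(80, [24, 24]))
```

With two identical 24 GB cards the layers split evenly, 40 per card. Only the activations passed between the two halves cross the bus, which is why the split also works over plain PCIe.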

For multi-GPU cloud rentals at reasonable hourly rates, see the GPU rental comparison on the homepage.

DeepSeek R1 Full (671B)

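Recommended VRAM: 640 GB+ (8× 80 GB, enough for 4-bit weights plus KV cache)
Practical minimum: ~335 GB for 4-bit weights alone. Native FP8 weights are ~671 GB and BF16 is ~1.3 TB, so both exceed a single 8× 80 GB node.
Typical cloud setup: 8× H100 80GB SXM or 8× A100 80GB for quantized weights; native FP8 needs more memory, such as 8× H200 141GB or two 8× H100 nodes
This is not a local-deployment model. Even at Q4 the weights alone are ~335 GB.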
Expect to pay $20–$50/hr on cloud infrastructure for inference at this scale.
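
To see why the full model lands on multi-GPU clusters, it helps to count how many cards the weights alone require at each precision. This is a minimal sketch under stated assumptions: weights only, decimal gigabytes, and a hypothetical `usable_fraction` of 0.9 to leave room on each card for KV cache and runtime overhead. The function name `gpus_needed` is illustrative.

```python
import math

# How many GPUs are needed just to hold the weights?
# Assumptions: weights only, 1 GB = 1e9 bytes, and only a fraction of
# each card's VRAM is usable for weights (usable_fraction is a guess,
# not a measured value).

def gpus_needed(
    params_billions: float,
    bits_per_weight: float,
    gpu_vram_gb: float = 80,
    usable_fraction: float = 0.9,
) -> int:
    """Return the minimum GPU count whose usable VRAM covers the weights."""
    weights_gb = params_billions * bits_per_weight / 8
    usable_per_gpu = gpu_vram_gb * usable_fraction
    return math.ceil(weights_gb / usable_per_gpu)


if __name__ == "__main__":
    for label, bits in [("4-bit", 4), ("FP8", 8), ("BF16", 16)]:
        n = gpus_needed(671, bits)
        print(f"671B at {label}: at least {n} x 80 GB GPUs")
```

Under these assumptions the 671B model needs at least 5 cards at 4-bit, 10 at FP8, and 19 at BF16. In practice GPU counts are usually rounded up to a full node, so the 4-bit case typically ends up on an 8-GPU machine.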

This tier is cloud-only for virtually everyone. Compare H100 and A100 cluster pricing across providers.