DeepSeek-R1 Models Download

Main Models

Model	#Total Params	#Activated Params	Context Length	Download
DeepSeek-R1-Zero	671B	37B	128K	🤗 HuggingFace
DeepSeek-R1	671B	37B	128K	🤗 HuggingFace
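
The gap between the two parameter columns can be made concrete with a little arithmetic. The sketch below uses only the figures from the table and assumes that "#Activated Params" means the parameters used per token, which the table itself does not define; the names `MODELS` and `activated_fraction` are illustrative.

```python
# Figures copied from the table above, in billions of parameters.
MODELS = {
    "DeepSeek-R1-Zero": {"total_b": 671, "activated_b": 37},
    "DeepSeek-R1": {"total_b": 671, "activated_b": 37},
}


def activated_fraction(total_b: float, activated_b: float) -> float:
    """Return the share of total parameters that are activated."""
    return activated_b / total_b


for name, params in MODELS.items():
    share = activated_fraction(**params)
    print(f"{name}: {share:.1%} of parameters activated")
```

Under that assumption, both models activate roughly 5.5% of their parameters.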

Note:
DeepSeek-R1-Zero and DeepSeek-R1 are trained from DeepSeek-V3-Base. For details on the model architecture, see the DeepSeek-V3 repository.

Implementation Notes:

Distilled models are fine-tuned from open-source base models using samples generated by DeepSeek-R1.
Their configuration files and tokenizers have been slightly modified.
Important: Run these models with the provided configuration files and tokenizers, not those of the original base models.
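
As a minimal sketch of following the notes above, the snippet below fetches only the configuration and tokenizer files from a model repository, so they can be used in place of the base model's. It assumes the third-party `huggingface_hub` package is installed; the file-name patterns, the helper names `is_settings_file` and `fetch_settings`, and whatever repository ID is passed in are illustrative assumptions, not something this page specifies.

```python
from fnmatch import fnmatch

# Assumed file-name patterns for configuration and tokenizer files.
# Check the actual repository listing; this page does not enumerate them.
SETTINGS_PATTERNS = ["*.json", "tokenizer*"]


def is_settings_file(filename: str) -> bool:
    """Return True if the file name matches one of the assumed patterns."""
    return any(fnmatch(filename, pattern) for pattern in SETTINGS_PATTERNS)


def fetch_settings(repo_id: str, local_dir: str) -> str:
    """Download only the matching files from a Hugging Face repository.

    Returns the path of the local folder holding the downloaded files.
    """
    # Imported lazily so the helper above works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        local_dir=local_dir,
        allow_patterns=SETTINGS_PATTERNS,
    )
```

Call `fetch_settings` with the repository ID taken from the model's download link; weight files are skipped because they do not match the patterns.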