Through RL, numerous powerful and interesting reasoning behaviours emerged naturally in DeepSeek-R1-Zero. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorporates cold-start data before RL. DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, computer …