DeepSeek V4
DeepSeek V4 is a preview of two powerful MoE models: V4-Pro (1.6T params) and V4-Flash (284B params), with 1M context and state-of-the-art open-source performance.
Model details
Example usage
DeepSeek V4 is a preview release of two powerful Mixture-of-Experts (MoE) language models: DeepSeek-V4-Pro (1.6T parameters, 49B activated) and DeepSeek-V4-Flash (284B parameters, 13B activated), both supporting up to 1 million tokens of context.
Built on a hybrid attention architecture that dramatically cuts inference costs, the V4 series was pre-trained on over 32 trillion tokens and refined through a two-stage post-training pipeline combining supervised fine-tuning, reinforcement learning, and on-policy distillation. DeepSeek-V4-Pro delivers top-tier performance on coding, reasoning, and agentic tasks.
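As a quick sanity check on the MoE sizes quoted above, only a small fraction of each model's parameters is activated per token. A minimal sketch of that arithmetic (the parameter counts come from this page; the ratios are illustrative, not measured inference cost):

```python
# Parameter counts quoted above; the activation ratios below are simple
# arithmetic, not a measurement of actual inference cost.
PRO_TOTAL, PRO_ACTIVE = 1.6e12, 49e9      # DeepSeek-V4-Pro
FLASH_TOTAL, FLASH_ACTIVE = 284e9, 13e9   # DeepSeek-V4-Flash

for name, total, active in [("V4-Pro", PRO_TOTAL, PRO_ACTIVE),
                            ("V4-Flash", FLASH_TOTAL, FLASH_ACTIVE)]:
    print(f"{name}: {active / total:.1%} of parameters active per token")
# → V4-Pro: 3.1% of parameters active per token
# → V4-Flash: 4.6% of parameters active per token
```

This sparsity is what lets a 1.6T-parameter model run with roughly the per-token compute of a ~49B dense model.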
# You can use this model with any of the OpenAI clients in any language!
# Simply change the API key to get started.

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://inference.baseten.co/v1"
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V4-Pro",
    messages=[
        {
            "role": "user",
            "content": "Implement Hello World in Python"
        }
    ],
    stream=True,
    stream_options={
        "include_usage": True,
        "continuous_usage_stats": True
    },
    top_p=1,
    max_tokens=1000,
    temperature=1,
    presence_penalty=0,
    frequency_penalty=0
)

for chunk in response:
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)

Example response:

{
  "id": "143",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "[Model output here]",
        "role": "assistant",
        "audio": null,
        "function_call": null,
        "tool_calls": null
      }
    }
  ],
  "created": 1741224586,
  "model": "",
  "object": "chat.completion",
  "service_tier": null,
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 145,
    "prompt_tokens": 38,
    "total_tokens": 183,
    "completion_tokens_details": null,
    "prompt_tokens_details": null
  }
}
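The useful fields in that payload are the assistant message and the token usage. A minimal sketch of pulling them out of a (trimmed) copy of the sample response above, treated as a plain JSON document:

```python
import json

# A trimmed copy of the sample chat.completion payload shown above,
# parsed as a plain dict (the OpenAI client normally does this for you).
payload = json.loads("""{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {"role": "assistant", "content": "[Model output here]"}
    }
  ],
  "object": "chat.completion",
  "usage": {"completion_tokens": 145, "prompt_tokens": 38, "total_tokens": 183}
}""")

text = payload["choices"][0]["message"]["content"]
used = payload["usage"]["total_tokens"]
print(text)   # → [Model output here]
print(used)   # → 183
```

With the Python client the same fields are available as attributes, e.g. `response.choices[0].message.content` and `response.usage.total_tokens`.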