DeepSeek V4
DeepSeek V4 is a preview of two powerful MoE models: V4-Pro (1.6T params) and V4-Flash (284B params), with 1M context and state-of-the-art open-source performance.
Model details
Example usage
DeepSeek V4 is a preview release of two powerful Mixture-of-Experts (MoE) language models: DeepSeek-V4-Pro (1.6T parameters, 49B activated) and DeepSeek-V4-Flash (284B parameters, 13B activated), both supporting up to 1 million tokens of context.
Built on a hybrid attention architecture that dramatically cuts inference costs, the V4 series was pre-trained on over 32 trillion tokens and refined through a two-stage post-training pipeline combining supervised fine-tuning, reinforcement learning, and on-policy distillation. DeepSeek-V4-Pro delivers top-tier performance on coding, reasoning, and agentic tasks.
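As a quick sanity check on the MoE sizes quoted above, only a small fraction of each model's parameters is activated per token. A minimal sketch of that arithmetic (the parameter counts come from this page; the ratios are illustrative, not measured inference cost):

```python
# Parameter counts quoted above; the activation ratios below are simple
# arithmetic, not a measurement of actual inference cost.
PRO_TOTAL, PRO_ACTIVE = 1.6e12, 49e9      # DeepSeek-V4-Pro
FLASH_TOTAL, FLASH_ACTIVE = 284e9, 13e9   # DeepSeek-V4-Flash

for name, total, active in [("V4-Pro", PRO_TOTAL, PRO_ACTIVE),
                            ("V4-Flash", FLASH_TOTAL, FLASH_ACTIVE)]:
    print(f"{name}: {active / total:.1%} of parameters active per token")
# → V4-Pro: 3.1% of parameters active per token
# → V4-Flash: 4.6% of parameters active per token
```

This sparsity is what lets a 1.6T-parameter model run with roughly the per-token compute of a ~49B dense model.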
# You can use this model with any of the OpenAI clients in any language!
# Simply change the API key to get started.

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://inference.baseten.co/v1"
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V4-Pro",
    messages=[
        {
            "role": "user",
            "content": "Implement Hello World in Python"
        }
    ],
    stream=True,
    stream_options={
        "include_usage": True,
        "continuous_usage_stats": True
    },
    top_p=1,
    max_tokens=1000,
    temperature=1,
    presence_penalty=0,
    frequency_penalty=0
)

for chunk in response:
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)

Example response:

{
  "id": "143",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "[Model output here]",
        "role": "assistant",
        "audio": null,
        "function_call": null,
        "tool_calls": null
      }
    }
  ],
  "created": 1741224586,
  "model": "",
  "object": "chat.completion",
  "service_tier": null,
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 145,
    "prompt_tokens": 38,
    "total_tokens": 183,
    "completion_tokens_details": null,
    "prompt_tokens_details": null
  }
}
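The useful fields in that payload are the assistant message and the token usage. A minimal sketch of pulling them out of a (trimmed) copy of the sample response above, treated as a plain JSON document:

```python
import json

# A trimmed copy of the sample chat.completion payload shown above,
# parsed as a plain dict (the OpenAI client normally does this for you).
payload = json.loads("""{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {"role": "assistant", "content": "[Model output here]"}
    }
  ],
  "object": "chat.completion",
  "usage": {"completion_tokens": 145, "prompt_tokens": 38, "total_tokens": 183}
}""")

text = payload["choices"][0]["message"]["content"]
used = payload["usage"]["total_tokens"]
print(text)   # → [Model output here]
print(used)   # → 183
```

With the Python client the same fields are available as attributes, e.g. `response.choices[0].message.content` and `response.usage.total_tokens`.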