DeepSeek V4

DeepSeek V4 is a preview of two powerful MoE models: V4-Pro (1.6T params) and V4-Flash (284B params), both with a 1M-token context and state-of-the-art open-source performance.

Model details

DeepSeek V4 is a preview release of two powerful Mixture-of-Experts (MoE) language models: DeepSeek-V4-Pro (1.6T parameters, 49B activated) and DeepSeek-V4-Flash (284B parameters, 13B activated), both supporting up to 1 million tokens of context.
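To make the "activated" figures concrete, the short sketch below computes what share of each model's parameters is used per token, using only the counts quoted above. The dictionary layout and the helper name `activated_fraction` are illustrative choices, not part of any DeepSeek or Baseten API.

```python
# Parameter counts as stated in the model description above.
MODELS = {
    "DeepSeek-V4-Pro": {"total": 1.6e12, "activated": 49e9},
    "DeepSeek-V4-Flash": {"total": 284e9, "activated": 13e9},
}


def activated_fraction(total: float, activated: float) -> float:
    """Share of parameters that take part in a single token's forward pass."""
    return activated / total


for name, params in MODELS.items():
    share = activated_fraction(params["total"], params["activated"])
    print(f"{name}: {share:.1%} of parameters active per token")
# DeepSeek-V4-Pro: 3.1% of parameters active per token
# DeepSeek-V4-Flash: 4.6% of parameters active per token
```

In both models only a few percent of the weights are active for any given token, which is the property that lets an MoE model carry a large total parameter count at a much smaller per-token compute cost.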

Built on a hybrid attention architecture that dramatically cuts inference costs, the V4 series was pre-trained on over 32 trillion tokens and refined through a two-stage post-training pipeline combining supervised fine-tuning, reinforcement learning, and on-policy distillation. DeepSeek-V4-Pro-Max delivers top-tier performance on coding, reasoning, and agentic tasks.

Input
# You can use this model with any of the OpenAI clients in any language!
# Simply change the API Key to get started

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://inference.baseten.co/v1"
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V4-Pro",
    messages=[
        {
            "role": "user",
            "content": "Implement Hello World in Python"
        }
    ],
    stream=True,
    stream_options={
        "include_usage": True,
        "continuous_usage_stats": True
    },
    top_p=1,
    max_tokens=1000,
    temperature=1,
    presence_penalty=0,
    frequency_penalty=0
)

for chunk in response:
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)
JSON output
{
    "id": "143",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "logprobs": null,
            "message": {
                "content": "[Model output here]",
                "role": "assistant",
                "audio": null,
                "function_call": null,
                "tool_calls": null
            }
        }
    ],
    "created": 1741224586,
    "model": "",
    "object": "chat.completion",
    "service_tier": null,
    "system_fingerprint": null,
    "usage": {
        "completion_tokens": 145,
        "prompt_tokens": 38,
        "total_tokens": 183,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
    }
}
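Note that the JSON above has `"object": "chat.completion"`, which is the shape of a complete, non-streaming response (for example, one requested with `stream=False`), not of the individual chunks produced by the streaming example. As a minimal sketch of reading that shape, the snippet below parses a trimmed copy of the sample and pulls out the fields most callers need. `SAMPLE` and `summarize` are illustrative names, and the sample keeps only the keys the function reads.

```python
import json

# Trimmed copy of the sample response shown above.
SAMPLE = """
{
    "id": "143",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {"content": "[Model output here]", "role": "assistant"}
        }
    ],
    "object": "chat.completion",
    "usage": {"completion_tokens": 145, "prompt_tokens": 38, "total_tokens": 183}
}
"""


def summarize(raw: str) -> dict:
    """Extract the reply text, stop reason, and token counts from a response."""
    data = json.loads(raw)
    choice = data["choices"][0]
    usage = data["usage"]
    return {
        "content": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "prompt_tokens": usage["prompt_tokens"],
        "completion_tokens": usage["completion_tokens"],
        "total_tokens": usage["total_tokens"],
    }


summary = summarize(SAMPLE)
print(summary["finish_reason"], summary["total_tokens"])  # stop 183
```

When using the OpenAI Python client rather than raw JSON, the same values are available as attributes, for example `response.choices[0].message.content` and `response.usage.total_tokens`.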
