DeepSeek V3.2
DeepSeek's new hybrid reasoning model with efficient long-context scaling
Example usage
DeepSeek V3.2 runs using the Baseten Inference Stack and is accessible via an OpenAI-compatible API endpoint.
By default, DeepSeek V3.2 has reasoning disabled; the code sample below shows how to enable it. Note that for best performance in multi-turn chat, you should take the reasoning content from previous responses and pass it back in via the reasoning_content field (see the sketch after the snippet).
# To enable reasoning, add the required extra_body to the request
# (assumes the `client` configured in the full example below)
message = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3.2",
    messages=[{"role": "user", "content": "What is 2+2?"}],
    extra_body={"chat_template_args": {"enable_thinking": True}},
)
# message.reasoning contains the reasoning trace
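To illustrate the multi-turn note above, here is a minimal sketch, not an official snippet: it assumes the reasoning trace is exposed on the response message object and replays it under the reasoning_content field described above.

# Hedged sketch: replay the previous assistant turn with its reasoning trace
# attached via reasoning_content, per the multi-turn note above.
from openai import OpenAI
import os

client = OpenAI(
    api_key=os.getenv("BASETEN_API_KEY"),
    base_url="https://inference.baseten.co/v1"
)

messages = [{"role": "user", "content": "What is 2+2?"}]
first = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3.2",
    messages=messages,
    extra_body={"chat_template_args": {"enable_thinking": True}},
)
assistant = first.choices[0].message

# Feed the assistant turn back, including its reasoning trace, before the
# follow-up question. The exact attribute holding the trace may differ;
# None is sent if it is absent.
messages.append({
    "role": "assistant",
    "content": assistant.content,
    "reasoning_content": getattr(assistant, "reasoning_content", None),
})
messages.append({"role": "user", "content": "Now multiply the result by 10."})

second = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3.2",
    messages=messages,
    extra_body={"chat_template_args": {"enable_thinking": True}},
)
print(second.choices[0].message.content)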
Input

# You can use this model with any of the OpenAI clients in any language!
# Simply change the API Key to get started
from openai import OpenAI
import os

client = OpenAI(
    api_key=os.getenv("BASETEN_API_KEY"),
    base_url="https://inference.baseten.co/v1"
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3.2",
    messages=[
        {
            "role": "user",
            "content": "Implement Hello World in Python"
        }
    ],
    stream=True,
    stream_options={
        "include_usage": True,
        "continuous_usage_stats": True
    },
    top_p=1,
    max_tokens=1000,
    temperature=1,
    presence_penalty=0,
    frequency_penalty=0
)

for chunk in response:
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)

JSON output
{
  "id": "chatcmpl-130642b89fba4c4d9ea162ba88a1153a",
  "object": "text_completion",
  "created": 1764865060,
  "model": "deepseek-ai/DeepSeek-V3.2",
  "choices": [
    {
      "index": 0,
      "text": "Here's the classic \"Hello, World!\"...",
      "logprobs": null,
      "finish_reason": "stop",
      "matched_stop": 1
    }
  ],
  "usage": {
    "prompt_tokens": 9,
    "total_tokens": 237,
    "completion_tokens": 228,
    "prompt_tokens_details": null
  }
}
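Because the streaming request above sets include_usage and continuous_usage_stats, token usage is also delivered on the stream itself. Here is a minimal sketch, assuming the OpenAI client surfaces it as chunk.usage with the same fields shown in the JSON above:

# Sketch: read token usage from the stream. With include_usage, the final
# chunk carries a usage object; continuous_usage_stats adds running totals.
usage = None
for chunk in response:
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)
    if chunk.usage is not None:
        usage = chunk.usage  # prompt_tokens, completion_tokens, total_tokens

if usage is not None:
    print(f"\n\nTotal tokens: {usage.total_tokens}")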