Qwen3 Next 80B A3B Thinking

Qwen3-Next-80B-A3B-Thinking is the first installment in the Qwen3-Next series, featuring hybrid attention, high-sparsity Mixture-of-Experts (MoE), stability optimizations, and multi-token prediction (MTP).
Model details
Example usage
This example code shows how to call Qwen3-Next-80B-A3B-Thinking using the OpenAI client. You can also make requests to the /predict endpoint with a message, or to the /v1/completions endpoint with a prompt.
Qwen3-Next-80B-A3B-Thinking supports only thinking mode. To enforce model thinking, the default chat template automatically includes an opening <think> tag, so it is normal for the model's output to contain only a closing </think> tag without an explicit opening <think> tag.
Qwen3-Next-80B-A3B-Thinking may generate longer thinking content than its predecessor, so we strongly recommend using it for highly complex reasoning tasks.
# You can use this model with any of the OpenAI clients in any language!
# Simply change the API key to get started

from openai import OpenAI

model_id = "YOUR_MODEL_ID_HERE"
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url=f"https://model-{model_id}.api.baseten.co/environments/production/sync/v1",
)

response = client.chat.completions.create(
    model="Qwen/Qwen3-Next-80B-A3B-Thinking",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write FizzBuzz in Python"},
    ],
)

print(response.choices[0].message.content)
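Because the chat template injects the opening <think> tag, the returned content typically consists of the reasoning, a closing </think> tag, and then the final answer. Below is a minimal sketch for separating the two; the helper name is ours, not part of any SDK, and it assumes the output shape described above.

```python
def split_thinking(content: str) -> tuple[str, str]:
    """Split model output into (reasoning, answer).

    Assumes at most one closing </think> tag and no opening
    <think> tag, as produced by the default chat template.
    """
    marker = "</think>"
    if marker in content:
        reasoning, answer = content.split(marker, 1)
        return reasoning.strip(), answer.strip()
    # No closing tag found: treat the whole output as the answer
    return "", content.strip()

reasoning, answer = split_thinking(
    "First, iterate from 1 to 100...</think>Here is FizzBuzz in Python."
)
```

You would apply this to `response.choices[0].message.content` from the example above before showing the answer to end users.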
Example response:

{
  "id": "143",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "[Model output here]",
        "role": "assistant",
        "audio": null,
        "function_call": null,
        "tool_calls": null
      }
    }
  ],
  "created": 1741224586,
  "model": "",
  "object": "chat.completion",
  "service_tier": null,
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 145,
    "prompt_tokens": 38,
    "total_tokens": 183,
    "completion_tokens_details": null,
    "prompt_tokens_details": null
  }
}
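The usage block in the response reports token counts, which are useful for logging and cost tracking. A small sketch reading those fields from a raw JSON payload like the example above (with the OpenAI client you would read `response.usage.total_tokens` on the returned object instead):

```python
import json

# Trimmed example payload containing only the fields used below
raw = """
{
  "usage": {
    "completion_tokens": 145,
    "prompt_tokens": 38,
    "total_tokens": 183
  }
}
"""

usage = json.loads(raw)["usage"]
# total_tokens is the sum of prompt and completion tokens
print(f"prompt={usage['prompt_tokens']} completion={usage['completion_tokens']}")
```

Note that with thinking models, the reasoning tokens are billed as part of `completion_tokens`, so long thinking traces directly increase completion-token usage.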