

Qwen3 235B 2507

Mixture-of-experts LLM with math and reasoning capabilities


Model details

Developed by: Qwen
Model family: Qwen
Use case: large language
Version: 2507
Size: 235B
API: OpenAI SDK
License: Apache 2.0

Example usage

Baseten offers Dedicated Deployments and Model APIs for Qwen3 235B A22B Instruct 2507 powered by the Baseten Inference Stack.

Qwen3 has shown strong performance on math and reasoning tasks, but running it in production requires a highly optimized inference stack to avoid excessive latency.

Deployments of Qwen3 are OpenAI-compatible.

Input

from openai import OpenAI
import os

model_url = "" # Copy in from API pane in Baseten model dashboard

client = OpenAI(
    api_key=os.environ['BASETEN_API_KEY'],
    base_url=model_url
)

# Chat completion
response_chat = client.chat.completions.create(
    model="",
    messages=[
        {"role": "user", "content": "Write FizzBuzz."}
    ],
    temperature=0.6,
    max_tokens=100,
)
print(response_chat)
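
Because the deployment is OpenAI-compatible, the same client can in principle stream tokens as they are generated instead of waiting for the full reply. The following is a minimal sketch, assuming the endpoint accepts the OpenAI SDK's stream=True option (not confirmed on this page). The helper name stream_chat is hypothetical, and client is an OpenAI-style client such as the one constructed in the example above.

```python
def stream_chat(client, model, prompt, max_tokens=100):
    """Collect a streamed chat completion into a single string.

    Assumption: the deployment supports stream=True, as the
    OpenAI chat completions API does.
    """
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.6,
        max_tokens=max_tokens,
        stream=True,
    )
    parts = []
    for chunk in stream:
        # Some chunks carry no choices or an empty delta; skip those.
        if not chunk.choices:
            continue
        text = chunk.choices[0].delta.content
        if text:
            parts.append(text)
    return "".join(parts)
```

In an interactive application you would typically print or forward each piece as it arrives rather than joining the pieces at the end.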

JSON output

{
    "id": "143",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "logprobs": null,
            "message": {
                "content": "[Model output here]",
                "role": "assistant",
                "audio": null,
                "function_call": null,
                "tool_calls": null
            }
        }
    ],
    "created": 1741224586,
    "model": "",
    "object": "chat.completion",
    "service_tier": null,
    "system_fingerprint": null,
    "usage": {
        "completion_tokens": 145,
        "prompt_tokens": 38,
        "total_tokens": 183,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
    }
}
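
In practice you usually want only the reply text and the token counts from a response of this shape. The sketch below reads them from a trimmed copy of the sample JSON, parsed as a plain dict so it runs without a live deployment. The summarize helper is a hypothetical name. With the OpenAI SDK, the same fields are available as attributes, for example response_chat.choices[0].message.content.

```python
import json

# Trimmed copy of the sample response, standing in for a live call.
sample = json.loads("""
{
    "choices": [
        {"finish_reason": "stop", "index": 0,
         "message": {"content": "[Model output here]", "role": "assistant"}}
    ],
    "usage": {"completion_tokens": 145, "prompt_tokens": 38, "total_tokens": 183}
}
""")


def summarize(response):
    """Return the reply text, finish reason, and token counts."""
    choice = response["choices"][0]
    usage = response["usage"]
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "prompt_tokens": usage["prompt_tokens"],
        "completion_tokens": usage["completion_tokens"],
        "total_tokens": usage["total_tokens"],
    }


summary = summarize(sample)
print(summary["text"])
```

A finish_reason of "stop" means the model ended its reply on its own, while "length" means the reply was cut off at max_tokens, so it is worth checking before using the text.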