DeepSeek-R1 Qwen 7B

Qwen 7B fine-tuned with DeepSeek R1 for chain-of-thought (CoT) reasoning

Model details

Developed by: DeepSeek
Model family: DeepSeek
Use case: large language
Version: R1
Variant: Qwen
Size: 7B
Optimization: TRT-LLM
Hardware: H100 MIG 40GB
License: Deepseek License Agreement

Example usage

This fine-tuned version of Qwen uses the standard Llama-style multi-turn messaging format: the request carries a list of messages, each with a role (such as system or user) and its content.

Input

import os
import requests

# Replace the empty string with your model id below
model_id = ""
baseten_api_key = os.environ["BASETEN_API_KEY"]

data = {
    "messages": [
        {"role": "system", "content": "You are a helpful and harmless assistant. You are Qwen developed by Alibaba. You should think step-by-step."},
        {"role": "user", "content": "Which weighs more, a pound of bricks or a pound of feathers?"},
    ],
    "stream": True,
    "max_new_tokens": 2048,
    "temperature": 0.6,
}

# Call model endpoint
res = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json=data,
    stream=True,
)

# Decode the stream incrementally as UTF-8 so that multi-byte characters
# split across chunk boundaries are not broken
res.encoding = "utf-8"

# Print the generated tokens as they get streamed
for content in res.iter_content(chunk_size=None, decode_unicode=True):
    print(content, end="", flush=True)

JSON output

[
    "streaming",
    "output",
    "text"
]
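
The multi-turn message format used in the request above can be sketched as a small helper that assembles the payload from a running conversation. This is only an illustration: build_payload is a hypothetical name, not part of any Baseten or DeepSeek library, and the "assistant" role for earlier model replies is an assumption based on common chat conventions, since the example above shows only system and user messages.

```python
def build_payload(history, user_message, system_prompt=None,
                  max_new_tokens=2048, temperature=0.6):
    """Assemble a request body shaped like the example above.

    history is a list of earlier turns, each a dict with "role" and
    "content". The "assistant" role for model replies is an assumption.
    """
    messages = []
    if system_prompt:
        # The system prompt, if any, goes first
        messages.append({"role": "system", "content": system_prompt})
    # Earlier turns are replayed in order
    messages.extend(history)
    # The new user message is always the last entry
    messages.append({"role": "user", "content": user_message})
    return {
        "messages": messages,
        "stream": True,
        "max_new_tokens": max_new_tokens,
        "temperature": temperature,
    }


# Hypothetical follow-up turn after the question in the example above
history = [
    {"role": "user", "content": "Which weighs more, a pound of bricks or a pound of feathers?"},
    {"role": "assistant", "content": "They weigh the same: one pound each."},
]
payload = build_payload(
    history,
    "What if it were a kilogram of each?",
    system_prompt="You should think step-by-step.",
)
```

The resulting payload could be passed as the json argument of the requests.post call shown earlier. Keeping the history in a plain list leaves it to the caller to decide how many earlier turns to resend.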
