NVIDIA Nemotron 3 Nano
Open mixture of experts model with 30B total parameters
Model details
NVIDIA Nemotron 3 Nano is a small language model with a hybrid mixture-of-experts (MoE) architecture that offers high compute efficiency and accuracy for building specialized AI agents. The model is fully open across weights, datasets, and training recipes, so developers can easily customize, optimize, and deploy it.
[Figure: Nemotron 3 Nano vs other SOTA models of similar sizes]

Example usage

Input
from openai import OpenAI
import os

model_url = ""  # Copy in from API pane in Baseten model dashboard

client = OpenAI(
    api_key=os.environ['BASETEN_API_KEY'],
    base_url=model_url
)

# Chat completion
response_chat = client.chat.completions.create(
    model="",
    messages=[
        {"role": "user", "content": "Write FizzBuzz."}
    ],
    temperature=0.6,
    max_tokens=512,
)
print(response_chat)

JSON output
{
  "id": "143",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "[Model output here]",
        "role": "assistant",
        "audio": null,
        "function_call": null,
        "tool_calls": null
      }
    }
  ],
  "created": 1741224586,
  "model": "",
  "object": "chat.completion",
  "service_tier": null,
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 145,
    "prompt_tokens": 38,
    "total_tokens": 183,
    "completion_tokens_details": null,
    "prompt_tokens_details": null
  }
}
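Because the endpoint is OpenAI-compatible, the assistant's reply can be read directly from choices[0].message.content instead of printing the whole response object. The sketch below shows that, plus a streaming variant using the standard stream=True option of the OpenAI Python SDK; it assumes the same placeholder model_url and model slug copied from the Baseten dashboard as in the example above.

from openai import OpenAI
import os

# Minimal sketch, assuming the same Baseten OpenAI-compatible endpoint as above.
# model_url and the model slug are placeholders from the Baseten model dashboard.
model_url = ""  # Copy in from API pane in Baseten model dashboard

client = OpenAI(
    api_key=os.environ['BASETEN_API_KEY'],
    base_url=model_url
)

# Non-streaming: pull only the reply text out of the response object
response_chat = client.chat.completions.create(
    model="",
    messages=[{"role": "user", "content": "Write FizzBuzz."}],
    temperature=0.6,
    max_tokens=512,
)
print(response_chat.choices[0].message.content)

# Streaming: print tokens as they arrive
stream = client.chat.completions.create(
    model="",
    messages=[{"role": "user", "content": "Write FizzBuzz."}],
    temperature=0.6,
    max_tokens=512,
    stream=True,
)
for chunk in stream:
    # Some chunks (e.g. the final usage chunk) may carry no delta content
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()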