NVIDIA Nemotron 3 Nano
Open mixture of experts model with 30B total parameters
Model details
NVIDIA Nemotron 3 Nano is a small language model with a hybrid mixture-of-experts (MoE) architecture that offers high compute efficiency and accuracy for building specialized AI agents. The model is fully open across weights, datasets, and training recipes, so developers can easily customize, optimize, and deploy it.
[Figure: Nemotron 3 Nano vs other SOTA models of similar sizes]

Example usage

Input
from openai import OpenAI
import os

model_url = ""  # Copy in from API pane in Baseten model dashboard

client = OpenAI(
    api_key=os.environ['BASETEN_API_KEY'],
    base_url=model_url
)

# Chat completion
response_chat = client.chat.completions.create(
    model="",
    messages=[
        {"role": "user", "content": "Write FizzBuzz."}
    ],
    temperature=0.6,
    max_tokens=512,
)
print(response_chat)

JSON output
{
  "id": "143",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "[Model output here]",
        "role": "assistant",
        "audio": null,
        "function_call": null,
        "tool_calls": null
      }
    }
  ],
  "created": 1741224586,
  "model": "",
  "object": "chat.completion",
  "service_tier": null,
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 145,
    "prompt_tokens": 38,
    "total_tokens": 183,
    "completion_tokens_details": null,
    "prompt_tokens_details": null
  }
}
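Because the endpoint is OpenAI-compatible, the assistant's reply can be read directly from choices[0].message.content instead of printing the whole response object. The sketch below shows that, plus a streaming variant using the standard stream=True option of the OpenAI Python SDK; it assumes the same placeholder model_url and model slug copied from the Baseten dashboard as in the example above.

from openai import OpenAI
import os

# Minimal sketch, assuming the same Baseten OpenAI-compatible endpoint as above.
# model_url and the model slug are placeholders from the Baseten model dashboard.
model_url = ""  # Copy in from API pane in Baseten model dashboard

client = OpenAI(
    api_key=os.environ['BASETEN_API_KEY'],
    base_url=model_url
)

# Non-streaming: pull only the reply text out of the response object
response_chat = client.chat.completions.create(
    model="",
    messages=[{"role": "user", "content": "Write FizzBuzz."}],
    temperature=0.6,
    max_tokens=512,
)
print(response_chat.choices[0].message.content)

# Streaming: print tokens as they arrive
stream = client.chat.completions.create(
    model="",
    messages=[{"role": "user", "content": "Write FizzBuzz."}],
    temperature=0.6,
    max_tokens=512,
    stream=True,
)
for chunk in stream:
    # Some chunks (e.g. the final usage chunk) may carry no delta content
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()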