Nemotron 3 Nano Omni

NVIDIA Nemotron 3 Nano Omni Reasoning 30B A3B: a multimodal model (text, image, video, audio) served on a single H100 80GB GPU via vLLM with an OpenAI-compatible API.

Model details

Developed by: NVIDIA
Model family: Nemotron
Use case: large language
Version: V1
Variant: Latency
Size: 30B
Hardware: H100
API: openai
License: NVIDIA AI Foundation Models Community License Agreement

Example usage

Nemotron 3 Nano Omni is a multimodal large language model that unifies video, audio, image, and text understanding to support enterprise-grade Q&A, summarization, transcription, and document intelligence workflows. It extends the Nemotron Nano family with integrated video and speech comprehension, graphical user interface (GUI) understanding, and optical character recognition (OCR), enabling end-to-end processing of rich enterprise content such as meeting recordings, media and entertainment (M&E) assets, training videos, and complex business documents.

Input

import os

from openai import OpenAI

model_url = ""  # Copy in from API pane in Baseten model dashboard

# The API key is read from the BASETEN_API_KEY environment variable
client = OpenAI(
    api_key=os.environ["BASETEN_API_KEY"],
    base_url=model_url,
)

# Chat completion
response_chat = client.chat.completions.create(
    model="nvidia/nemotron-3-nano-omni",
    messages=[
        {"role": "user", "content": "Tell me a fun fact about cats."}
    ],
    temperature=0.6,
    max_tokens=100,
)

# Print the full response as JSON
print(response_chat.model_dump_json(indent=4))

JSON output

{
    "id": "143",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "logprobs": null,
            "message": {
                "content": "[Model output here]",
                "role": "assistant",
                "audio": null,
                "function_call": null,
                "tool_calls": null
            }
        }
    ],
    "created": 1741224586,
    "model": "",
    "object": "chat.completion",
    "service_tier": null,
    "system_fingerprint": null,
    "usage": {
        "completion_tokens": 145,
        "prompt_tokens": 38,
        "total_tokens": 183,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
    }
}
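
The example above sends text only. Because the model also accepts images, here is a minimal sketch of how an image might be attached to a request. It assumes the deployment accepts OpenAI-style `image_url` content parts with an inline base64 data URL, which is the usual convention for OpenAI-compatible vision endpoints but is not confirmed on this page. The helper name `build_image_message` is hypothetical, and the placeholder bytes stand in for a real image file.

```python
import base64


def build_image_message(prompt: str, image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Build one user message carrying a text prompt plus an inline base64 image.

    Hypothetical helper: assumes the endpoint accepts OpenAI-style
    `image_url` content parts containing a data URL.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
            },
        ],
    }


# Placeholder bytes for illustration only; in practice, read a real image file.
message = build_image_message("Describe this image.", b"placeholder-image-bytes")
print(message["content"][0])
```

If the assumption holds, the resulting dictionary can be passed as `messages=[message]` to the same `client.chat.completions.create(...)` call shown under Input. Video and audio inputs may use different content-part types, so check the deployment's documentation before adapting this sketch.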