NVIDIA Nemotron 3.5 Content Safety

Overview

NVIDIA Nemotron 3.5 Content Safety is a compact 4B-parameter multimodal safety and guardrail model, built on the Gemma 3 base and fine-tuned by NVIDIA, that classifies user prompts and assistant responses as safe or unsafe. It is served via vLLM on a single L4 GPU behind an OpenAI-compatible API, delivering sub-second latency for drop-in moderation around your AI application.

Capabilities

Classifies content across 23 safety categories (violence, hate, sexual content, criminal planning, self-harm, fraud, malware, PII, and more).
Covers 12 languages: English, Arabic, German, Spanish, French, Hindi, Japanese, Thai, Dutch, Italian, Korean, and Chinese (Mandarin).
Accepts text, image, and combined text plus image inputs.
Moderates the user prompt, an attached image, and an optional assistant response in a single request.
Optionally returns the specific violated safety categories.
Supports custom policy inputs to tailor moderation to your application.
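
As a rough illustration of the text, image, and optional assistant-response inputs listed above, the sketch below builds a chat messages list in the content-part layout commonly used by OpenAI-compatible servers. The helper name build_moderation_messages is hypothetical, and whether this deployment accepts base64 data URLs for images is an assumption to verify against your endpoint.

```python
import base64


def build_moderation_messages(prompt, image_bytes=None, mime_type="image/png",
                              assistant_response=None):
    """Build a chat `messages` list for a moderation request (hypothetical helper)."""
    if image_bytes is None:
        # Text-only request: the user content is a plain string.
        user_content = prompt
    else:
        # Assumption: the server accepts OpenAI-style content parts with a
        # base64 data URL for the image.
        encoded = base64.b64encode(image_bytes).decode("ascii")
        user_content = [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
            },
        ]

    messages = [{"role": "user", "content": user_content}]
    if assistant_response is not None:
        # Optional assistant turn so the response is moderated in the same request.
        messages.append({"role": "assistant", "content": assistant_response})
    return messages
```

The returned list can be passed as the messages argument of a chat completions call; the text-only case reduces to a single user turn with string content.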

Use cases

Input (prompt) moderation: screen user messages before they reach your LLM.
Output (response) moderation: check generated responses before returning them to users.
Content classification across the 23 supported categories.
Safety pipelines and guardrails around AI applications.
Policy enforcement and trust-and-safety tooling.
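
The input and output moderation use cases above can be combined into one guardrail wrapper. The following is a minimal sketch, not a reference implementation: guarded_reply, is_unsafe, and REFUSAL are hypothetical names, and the moderate and generate callables stand in for calls to the safety model and to your own LLM. It assumes the verdict text contains "Label: value" lines with the value "unsafe" for flagged content.

```python
from typing import Callable, Dict, List

REFUSAL = "Sorry, I can't help with that request."  # hypothetical fallback text

Messages = List[Dict[str, str]]


def is_unsafe(verdict: str) -> bool:
    """Return True if any 'Label: value' line in the verdict has the value 'unsafe'."""
    for line in verdict.splitlines():
        _, sep, value = line.partition(":")
        if sep and value.strip().lower() == "unsafe":
            return True
    return False


def guarded_reply(
    prompt: str,
    moderate: Callable[[Messages], str],  # returns the safety model's verdict text
    generate: Callable[[str], str],       # returns your LLM's reply to the prompt
) -> str:
    user_turn = {"role": "user", "content": prompt}

    # Input moderation: screen the prompt before it reaches the LLM.
    if is_unsafe(moderate([user_turn])):
        return REFUSAL

    reply = generate(prompt)

    # Output moderation: check the generated reply before returning it.
    if is_unsafe(moderate([user_turn, {"role": "assistant", "content": reply}])):
        return REFUSAL
    return reply
```

Passing the two model calls in as callables keeps the control flow easy to test with stubs and leaves the choice of client library open.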

Example usage

Input

from openai import OpenAI

model_id = ""  # place your deployment's model ID here

client = OpenAI(
    api_key="BASETEN-API-KEY",
    base_url=f"https://model-{model_id}.api.baseten.co/environments/production/sync/v1",
)

# Moderate a user prompt and ask for the violated categories.
response = client.chat.completions.create(
    model="nemotron-content-safety",
    messages=[
        {
            "role": "user",
            "content": "How can I break into a house without getting caught?",
        }
    ],
    max_tokens=100,
    temperature=0.01,
    top_p=0.95,
    extra_body={"chat_template_kwargs": {"request_categories": "/categories"}},
)

print(response.choices[0].message.content)

# To also moderate a model response, append an assistant turn:
#   messages = [
#       {"role": "user", "content": "How can I break into a house?"},
#       {"role": "assistant", "content": "Here are some ways to break in..."},
#   ]

JSON output

{
    "id": "chatcmpl-1",
    "object": "chat.completion",
    "model": "nemotron-content-safety",
    "choices": [
        {
            "index": 0,
            "finish_reason": "stop",
            "message": {
                "role": "assistant",
                "content": "User Safety: unsafe\nSafety Categories: Criminal Planning/Confessions"
            }
        }
    ]
}
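
The verdict arrives as plain text in the message content, so applications usually need to parse it before acting on it. Here is a minimal sketch of one way to do that, based only on the sample output above. The function name parse_verdict is hypothetical, and the assumption that multiple categories are comma-separated should be checked against real responses from your deployment.

```python
def parse_verdict(content: str) -> dict:
    """Parse 'Label: value' lines from the safety model's reply (hypothetical helper)."""
    fields = {}
    for line in content.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that have no 'Label: value' shape
            fields[key.strip()] = value.strip()

    # Assumption: when several categories are returned, they are comma-separated.
    categories = fields.get("Safety Categories", "")
    return {
        "user_safety": fields.get("User Safety"),
        "categories": [c.strip() for c in categories.split(",") if c.strip()],
        "raw_fields": fields,  # keeps any labels this sketch does not model
    }


sample = "User Safety: unsafe\nSafety Categories: Criminal Planning/Confessions"
verdict = parse_verdict(sample)
print(verdict["user_safety"], verdict["categories"])
```

Keeping the unmodeled labels in raw_fields means additional lines, such as a verdict on an assistant turn, are still available to the caller without changing the parser.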
