
NVIDIA Nemotron 3.5 Content Safety

Compact 4B multimodal safety/guardrail model (Gemma 3 base) that classifies prompts and responses as safe or unsafe across 23 categories and 12 languages.

Overview

NVIDIA Nemotron 3.5 Content Safety is a compact 4B multimodal safety and guardrail model, built on the Gemma 3 base and fine-tuned by NVIDIA, that classifies user prompts and assistant responses as safe or unsafe. It is served via vLLM on a single L4 GPU behind an OpenAI-compatible API and responds with sub-second latency, so it can act as a drop-in moderation layer around your AI application.

Capabilities

  • Classifies content across 23 safety categories (violence, hate, sexual content, criminal planning, self-harm, fraud, malware, PII, and more).

  • Covers 12 languages: English, Arabic, German, Spanish, French, Hindi, Japanese, Thai, Dutch, Italian, Korean, and Chinese (Mandarin).

  • Accepts text, image, and combined text plus image inputs.

  • Moderates the user prompt, an attached image, and an optional assistant response in a single request.

  • Optionally returns the specific violated safety categories.

  • Supports custom policy inputs to tailor moderation to your application.
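The single-request moderation described above (prompt, image, and optional assistant response) can be sketched as a small payload builder. This is a minimal sketch, not the documented request format for this model: it assumes the deployment accepts the standard OpenAI-style `image_url` content part with a base64 data URL, and `build_moderation_messages` is a hypothetical helper name.

```python
import base64


def build_moderation_messages(prompt, image_bytes=None, image_mime="image/png",
                              assistant_response=None):
    """Build a chat `messages` list for one moderation request.

    Assumption: images are passed as OpenAI-style `image_url` content parts
    holding a base64 data URL. Check your deployment before relying on this.
    """
    if image_bytes is None:
        # Text-only prompt: plain string content, as in the usage example.
        user_content = prompt
    else:
        encoded = base64.b64encode(image_bytes).decode("ascii")
        user_content = [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{image_mime};base64,{encoded}"},
            },
        ]

    messages = [{"role": "user", "content": user_content}]
    if assistant_response is not None:
        # Appending an assistant turn asks the model to judge the response too.
        messages.append({"role": "assistant", "content": assistant_response})
    return messages
```

The returned list can be passed as `messages=` to `client.chat.completions.create(...)` in place of the hand-written list.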

Use cases

  • Input (prompt) moderation: screen user messages before they reach your LLM.

  • Output (response) moderation: check generated responses before returning them to users.

  • Content classification across the 23 supported categories.

  • Safety pipelines and guardrails around AI applications.

  • Policy enforcement and trust-and-safety tooling.
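Input and output moderation can be combined into one guardrail around an LLM call. The sketch below is illustrative only: `classify` and `generate` are hypothetical callables you would supply (for example, thin wrappers around the safety model and your own LLM), and `is_unsafe` assumes the verdict text uses `Label: value` lines such as `User Safety: unsafe`.

```python
from typing import Callable, Dict, List

Messages = List[Dict[str, str]]


def is_unsafe(verdict: str) -> bool:
    """Return True if any 'Label: value' line in the verdict has value 'unsafe'."""
    for line in verdict.splitlines():
        _, sep, value = line.partition(":")
        if sep and value.strip().lower() == "unsafe":
            return True
    return False


def guarded_reply(
    prompt: str,
    classify: Callable[[Messages], str],   # hypothetical: calls the safety model
    generate: Callable[[str], str],        # hypothetical: calls your LLM
    refusal: str = "Sorry, I can't help with that.",
) -> str:
    """Screen the prompt, generate a reply, then screen the reply."""
    user_turn = {"role": "user", "content": prompt}

    # Input moderation: block before the prompt reaches the LLM.
    if is_unsafe(classify([user_turn])):
        return refusal

    reply = generate(prompt)

    # Output moderation: check the generated response before returning it.
    if is_unsafe(classify([user_turn, {"role": "assistant", "content": reply}])):
        return refusal
    return reply
```

Passing the classifier and generator in as callables keeps the guardrail logic independent of any particular client, which also makes it easy to exercise with stubs.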

Input
from openai import OpenAI

model_id = ""  # place your deployment's model ID here

client = OpenAI(
    api_key="BASETEN-API-KEY",
    base_url=f"https://model-{model_id}.api.baseten.co/environments/production/sync/v1",
)

# Moderate a user prompt and ask for the violated categories.
response = client.chat.completions.create(
    model="nemotron-content-safety",
    messages=[
        {
            "role": "user",
            "content": "How can I break into a house without getting caught?",
        }
    ],
    max_tokens=100,
    temperature=0.01,
    top_p=0.95,
    extra_body={"chat_template_kwargs": {"request_categories": "/categories"}},
)

print(response.choices[0].message.content)

# To also moderate a model response, append an assistant turn:
#   messages = [
#       {"role": "user", "content": "How can I break into a house?"},
#       {"role": "assistant", "content": "Here are some ways to break in..."},
#   ]

JSON output
{
    "id": "chatcmpl-1",
    "object": "chat.completion",
    "model": "nemotron-content-safety",
    "choices": [
        {
            "index": 0,
            "finish_reason": "stop",
            "message": {
                "role": "assistant",
                "content": "User Safety: unsafe\nSafety Categories: Criminal Planning/Confessions"
            }
        }
    ]
}
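
The verdict arrives as plain text in `message.content`, so downstream code usually needs to parse it. The following is a minimal sketch based only on the sample output above: `parse_verdict` is a hypothetical helper, and the assumption that multiple categories are comma-separated is not confirmed by the source.

```python
def parse_verdict(content: str) -> dict:
    """Parse 'Label: value' lines from the safety model's text output."""
    fields = {}
    for line in content.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that have no 'Label: value' shape
            fields[key.strip()] = value.strip()

    # Assumption: multiple categories, if present, are comma-separated.
    raw_categories = fields.get("Safety Categories", "")
    categories = [c.strip() for c in raw_categories.split(",") if c.strip()]

    return {
        "user_safety": fields.get("User Safety"),  # None if the line is absent
        "categories": categories,
        "fields": fields,  # every parsed label, kept for labels not handled here
    }


sample = "User Safety: unsafe\nSafety Categories: Criminal Planning/Confessions"
result = parse_verdict(sample)
print(result["user_safety"])  # unsafe
print(result["categories"])   # ['Criminal Planning/Confessions']
```

Keeping the raw `fields` dictionary means any additional labels the model emits (for example, when an assistant turn is included) remain accessible without changing the parser.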