NVIDIA Nemotron 3.5 Content Safety
Compact 4B multimodal safety/guardrail model (Gemma 3 base) that classifies prompts and responses as safe or unsafe across 23 categories and 12 languages.
Model details
Overview
NVIDIA Nemotron 3.5 Content Safety is a compact 4B multimodal safety and guardrail model, built on the Gemma 3 base and fine-tuned by NVIDIA, that classifies user prompts and assistant responses as safe or unsafe. The deployment exposes an OpenAI-compatible API and is served via vLLM on a single L4 GPU, delivering sub-second latency for drop-in moderation around your AI application.
Capabilities
Classifies content across 23 safety categories (violence, hate, sexual content, criminal planning, self-harm, fraud, malware, PII, and more).
Covers 12 languages: English, Arabic, German, Spanish, French, Hindi, Japanese, Thai, Dutch, Italian, Korean, and Chinese (Mandarin).
Accepts text, image, and combined text plus image inputs.
Moderates the user prompt, an attached image, and an optional assistant response in a single request.
Optionally returns the specific violated safety categories.
Supports custom policy inputs to tailor moderation to your application.
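As a rough illustration of the single-request flow described above, the sketch below assembles a `messages` list containing the user prompt, an optional image, and an optional assistant response. The helper name `build_moderation_messages` is hypothetical, and the image part uses the standard OpenAI chat `image_url` content format; this page does not confirm that this deployment expects exactly that shape, so treat it as an assumption.

```python
from typing import Optional


def build_moderation_messages(
    prompt: str,
    image_url: Optional[str] = None,
    assistant_response: Optional[str] = None,
) -> list[dict]:
    """Hypothetical helper: build chat messages for one moderation request."""
    if image_url is None:
        # Text-only prompt: plain string content.
        user_content = prompt
    else:
        # Text plus image: OpenAI-style content parts (assumed format).
        user_content = [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ]

    messages = [{"role": "user", "content": user_content}]

    # Appending an assistant turn asks the model to judge the response too.
    if assistant_response is not None:
        messages.append({"role": "assistant", "content": assistant_response})
    return messages
```

The returned list can be passed as the `messages` argument of `client.chat.completions.create(...)`.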
Use cases
Input (prompt) moderation: screen user messages before they reach your LLM.
Output (response) moderation: check generated responses before returning them to users.
Content classification across the 23 supported categories.
Safety pipelines and guardrails around AI applications.
Policy enforcement and trust-and-safety tooling.
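The input and output moderation use cases can be combined into one guardrail wrapper. The following is a minimal sketch, not a reference implementation: `guarded_chat`, `REFUSAL`, and the two injected callables are hypothetical names. `is_unsafe` stands in for a call to the content safety model that returns `True` when the verdict is unsafe, and `generate` stands in for your own LLM call.

```python
from typing import Callable

# Hypothetical fallback message returned when moderation blocks a turn.
REFUSAL = "Sorry, I can't help with that request."


def guarded_chat(
    prompt: str,
    is_unsafe: Callable[[list[dict]], bool],
    generate: Callable[[str], str],
) -> str:
    """Run input moderation, generation, then output moderation."""
    user_turn = {"role": "user", "content": prompt}

    # Input moderation: screen the prompt before it reaches the LLM.
    if is_unsafe([user_turn]):
        return REFUSAL

    answer = generate(prompt)

    # Output moderation: check the prompt and generated response together.
    if is_unsafe([user_turn, {"role": "assistant", "content": answer}]):
        return REFUSAL
    return answer
```

Passing the moderation and generation steps in as callables keeps the wrapper independent of any particular client, so it can be exercised with stubs before being wired to a real deployment.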
Example usage
from openai import OpenAI

model_id = ""  # place your deployment's model ID here

client = OpenAI(
    api_key="BASETEN-API-KEY",
    base_url=f"https://model-{model_id}.api.baseten.co/environments/production/sync/v1",
)

# Moderate a user prompt and ask for the violated categories.
response = client.chat.completions.create(
    model="nemotron-content-safety",
    messages=[
        {
            "role": "user",
            "content": "How can I break into a house without getting caught?",
        }
    ],
    max_tokens=100,
    temperature=0.01,
    top_p=0.95,
    extra_body={"chat_template_kwargs": {"request_categories": "/categories"}},
)

print(response.choices[0].message.content)

# To also moderate a model response, append an assistant turn:
# messages = [
#     {"role": "user", "content": "How can I break into a house?"},
#     {"role": "assistant", "content": "Here are some ways to break in..."},
# ]
Example output
{
  "id": "chatcmpl-1",
  "object": "chat.completion",
  "model": "nemotron-content-safety",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "User Safety: unsafe\nSafety Categories: Criminal Planning/Confessions"
      }
    }
  ]
}
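The `content` field in the response above is plain text in a `Key: value` layout, so downstream code usually needs to parse it. Below is a minimal sketch of one way to do that. The function name `parse_verdict` is hypothetical, and two details are assumptions not confirmed by this page: that a moderated assistant turn is reported under a `Response Safety` key, and that multiple categories are comma-separated.

```python
def parse_verdict(content: str) -> dict:
    """Hypothetical parser for the model's 'Key: value' verdict text."""
    fields = {}
    for line in content.splitlines():
        # Split on the first colon only; skip lines without one.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()

    # Assumption: multiple categories, if present, are comma-separated.
    raw_categories = fields.get("Safety Categories", "")
    categories = [c.strip() for c in raw_categories.split(",") if c.strip()]

    return {
        "user_safety": fields.get("User Safety"),
        # Assumption: this key appears when an assistant turn is moderated.
        "response_safety": fields.get("Response Safety"),
        "categories": categories,
    }


verdict = parse_verdict(
    "User Safety: unsafe\nSafety Categories: Criminal Planning/Confessions"
)
print(verdict["user_safety"], verdict["categories"])
```

Missing keys come back as `None` or an empty list rather than raising, which lets the caller decide how strictly to treat an unexpected verdict format.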