NVIDIA Nemotron 3.5 ASR Streaming Multilingual (0.6B)
600M Cache-Aware FastConformer-RNNT streaming ASR for real-time multilingual voice agents across ~36 languages, with native punctuation and low latency.
Model details
Overview
NVIDIA Nemotron 3.5 ASR Streaming Multilingual is a 600M-parameter Cache-Aware FastConformer-RNNT streaming speech recognition model built for real-time multilingual voice agents. It transcribes roughly 36 languages across 40 language-locale pairs, emits native punctuation and capitalization, and supports runtime-configurable streaming latency with chunk sizes as low as 80 ms.
Capabilities
Real-time streaming transcription with strictly non-overlapping chunks (no buffered-inference redundancy).
Prompt-guided language selection: pass target_lang as a locale tag, or "auto" for automatic language identification.
Native punctuation and capitalization in the output transcript.
Low-latency chunked decoding with configurable attention context (chunk sizes down to 80 ms).
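To make the chunk-size figure concrete, the sketch below works out how many samples an 80 ms chunk holds and splits a buffer into strictly non-overlapping chunks. It is only an illustration of the arithmetic: the 16 kHz sample rate and the helper names samples_per_chunk and iter_chunks are assumptions, not part of the model's API, and the model performs its own chunking internally.

```python
from typing import Iterator, Sequence


def samples_per_chunk(chunk_ms: int, sample_rate_hz: int = 16_000) -> int:
    """Number of audio samples in one chunk (16 kHz is an assumed sample rate)."""
    return sample_rate_hz * chunk_ms // 1000


def iter_chunks(
    samples: Sequence[int],
    chunk_ms: int = 80,
    sample_rate_hz: int = 16_000,
) -> Iterator[Sequence[int]]:
    """Yield consecutive, non-overlapping chunks; the last one may be shorter."""
    step = samples_per_chunk(chunk_ms, sample_rate_hz)
    for start in range(0, len(samples), step):
        yield samples[start:start + step]


audio = [0] * 3000  # placeholder buffer of 3000 silent samples
print(samples_per_chunk(80))                 # 1280
print([len(c) for c in iter_chunks(audio)])  # [1280, 1280, 440]
```

Because each sample appears in exactly one chunk, no audio is re-processed between steps, which is the property the "no buffered-inference redundancy" note refers to.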
Use cases
Real-time voice agents.
Live captioning.
Multilingual transcription pipelines.
This model uses a custom JSON request format and is not OpenAI-compatible. POST an audio_url (or base64-encoded audio in audio_b64) along with target_lang, set either to a locale tag or to "auto" for automatic language identification, to the deployment's predict endpoint.
import requests

model_id = ""  # place your deployment's model ID here

resp = requests.post(
    f"https://model-{model_id}.api.baseten.co/environments/production/predict",
    headers={"Authorization": "Api-Key BASETEN-API-KEY"},
    json={
        "audio_url": "https://dldata-public.s3.us-east-2.amazonaws.com/2086-149220-0033.wav",
        "target_lang": "auto",
    },
)

print(resp.json())

# Pass a specific locale (for example "es-ES") instead of "auto" to force a
# language, or send audio inline as base64 with the "audio_b64" field.

{
  "text": "The cut on his chest still dripping blood, the ache of his overstrained eyes, even the soaring arena around him with the thousands of spectators, were trivialities not worth a thought.",
  "target_lang": "auto"
}
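For the inline option, a minimal sketch of building the audio_b64 request body and reading the transcript from the response might look like the following. The helper names build_request and extract_text are hypothetical, and the sketch assumes the endpoint expects the raw bytes of the audio file (for example a WAV file) encoded as standard base64; check the deployment's documentation if it expects something else.

```python
import base64


def build_request(audio_bytes: bytes, target_lang: str = "auto") -> dict:
    """Build a JSON-serializable body that carries the audio inline.

    Assumption: "audio_b64" holds the whole audio file as standard base64.
    """
    if not audio_bytes:
        raise ValueError("audio_bytes is empty")
    return {
        "audio_b64": base64.b64encode(audio_bytes).decode("ascii"),
        "target_lang": target_lang,  # a locale tag such as "es-ES", or "auto"
    }


def extract_text(response_json: dict) -> str:
    """Return the transcript from a response shaped like the example output."""
    return response_json["text"]


# Usage with a placeholder payload; in practice read the bytes from a file,
# e.g. open("sample.wav", "rb").read() (the file name is hypothetical).
payload = build_request(b"RIFF", target_lang="es-ES")
print(sorted(payload))  # ['audio_b64', 'target_lang']
```

The resulting dictionary can be passed as the json= argument of the requests.post call shown above, in place of the audio_url body.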