
GLM-4.6V

A frontier vision language model by Z AI with native multimodal function calling and interleaved image-text content generation

GLM-4.6V scales its context window to 128k tokens in training and achieves SoTA performance in visual understanding among models of similar parameter scale. Crucially, the model integrates native function calling capabilities for the first time. This bridges the gap between "visual perception" and "executable action", providing a unified technical foundation for multimodal agents in real-world business scenarios. You can deploy GLM-4.6V on NVIDIA H100 GPUs with Baseten today.
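To make the function calling idea concrete, here is a minimal sketch of a request that pairs an image with a tool definition. It assumes the deployment accepts the OpenAI Chat Completions `tools` parameter and OpenAI-style `image_url` content parts, which this page does not confirm. The tool `lookup_product_price`, its parameter `product_name`, and the example image URL are hypothetical and exist only for illustration.

```python
import json

# Hypothetical tool definition in the OpenAI "tools" format.
# The tool name and its parameters are illustrative, not part of GLM-4.6V or Baseten.
PRICE_LOOKUP_TOOL = {
    "type": "function",
    "function": {
        "name": "lookup_product_price",
        "description": "Look up the current price of a product seen in an image.",
        "parameters": {
            "type": "object",
            "properties": {
                "product_name": {
                    "type": "string",
                    "description": "Name of the product identified in the image.",
                },
            },
            "required": ["product_name"],
        },
    },
}


def build_tool_request(image_url: str, question: str) -> dict:
    """Build keyword arguments for client.chat.completions.create()."""
    return {
        "model": "zai-org/GLM-4.6V",
        "messages": [
            {
                "role": "user",
                "content": [
                    # Assumed OpenAI-style image part
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": question},
                ],
            },
        ],
        "tools": [PRICE_LOOKUP_TOOL],
        "tool_choice": "auto",  # let the model decide whether to call the tool
        "max_tokens": 1024,
    }


def parse_tool_arguments(arguments_json: str) -> dict:
    """Decode the JSON string carried in a tool call's `function.arguments` field."""
    return json.loads(arguments_json)


# Hypothetical image URL, for illustration only
request = build_tool_request(
    "https://example.com/shelf.png",
    "What does the kettle in this photo cost?",
)
```

With an OpenAI client pointed at the deployment, the request could be sent as `client.chat.completions.create(**request)`. If the model chooses to call the tool, the call should appear under `message.tool_calls` in the response, with its arguments encoded as a JSON string that `parse_tool_arguments` can decode.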

[Figure: GLM-4.6V benchmarks]

Deployments of GLM-4.6V are OpenAI-compatible.

Input
from openai import OpenAI
import os

model_url = ""  # Copy in from API pane in Baseten model dashboard

client = OpenAI(
    api_key=os.environ["BASETEN_API_KEY"],
    base_url=model_url,
)

# Chat completion with one image part and one text part in the user message
response_chat = client.chat.completions.create(
    model="zai-org/GLM-4.6V",
    stream=False,  # non-streaming, so the call returns one complete response object
    messages=[
        {"role": "system", "content": "You are a helpful vision-language assistant."},
        {
            "role": "user",
            "content": [
                # Image part in the OpenAI Chat Completions format
                {"type": "image_url", "image_url": {"url": "https://upload.wikimedia.org/wikipedia/commons/f/fa/Grayscale_8bits_palette_sample_image.png"}},
                {"type": "text", "text": "Describe this image in detail."},
            ],
        },
    ],
    max_tokens=1024,
    temperature=0.7,
)
print(response_chat)
JSON output
{
    "id": "143",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "logprobs": null,
            "message": {
                "content": "[Model output here]",
                "role": "assistant",
                "audio": null,
                "function_call": null,
                "tool_calls": null
            }
        }
    ],
    "created": 1741224586,
    "model": "",
    "object": "chat.completion",
    "service_tier": null,
    "system_fingerprint": null,
    "usage": {
        "completion_tokens": 145,
        "prompt_tokens": 38,
        "total_tokens": 183,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
    }
}
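
A response shaped like the JSON output above can be reduced to the few fields most callers need. The following is a minimal sketch that works on a plain dictionary; the helper name `summarize_completion` and the trimmed sample payload are illustrative, not part of any SDK.

```python
def summarize_completion(completion: dict) -> dict:
    """Pull the assistant text, finish reason, tool calls, and token total
    out of a chat.completion payload represented as a dict."""
    choice = completion["choices"][0]
    message = choice["message"]
    usage = completion.get("usage") or {}
    return {
        "text": message.get("content"),
        "finish_reason": choice.get("finish_reason"),
        # tool_calls is null when the model answered directly
        "tool_calls": message.get("tool_calls") or [],
        "total_tokens": usage.get("total_tokens"),
    }


# Trimmed copy of the sample output, keeping only the fields read above
sample = {
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "content": "[Model output here]",
                "role": "assistant",
                "tool_calls": None,
            },
        }
    ],
    "usage": {"completion_tokens": 145, "prompt_tokens": 38, "total_tokens": 183},
}

summary = summarize_completion(sample)
print(summary["text"])
print(summary["total_tokens"])
```

When using the OpenAI Python SDK, the response is an object rather than a dictionary. Assuming a 1.x SDK release, where response objects are Pydantic models, `summarize_completion(response_chat.model_dump())` should produce the same summary.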
