How to run GLM-5.2 in any harness

GLM-5.2 is this year’s DeepSeek moment. It’s already shifting the trajectory of how we interact with and consume intelligence.

As we and our agents continue to tokenmax, tokenonomics and performance are more relevant than ever. And GLM 5.2 is the perfect frontier model that matches closed-source models in quality while exceeding them in speed and cost by multiples (e.g., ~4.5x faster and ~5x cheaper than Opus 4.8, many are saying it's a drop-in replacement).

Open is no longer a compromise but a logical winner across dimensions for scaling yourself and your org. Here's exactly how to use GLM-5.2 inside 3 of the popular harnesses today in <5 min, which can be generalized to any other.

Claude Code

Install Claude Code: npm install -g@anthropic-ai/claude-code
Create an account at baseten.co and grab an API key from app.baseten.co/settings/api_keys
Edit ~/.claude/settings.json and add:

"env": {
    "ANTHROPIC_AUTH_TOKEN": "your_baseten_api_key",
    "ANTHROPIC_BASE_URL": "https://inference.baseten.co",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "zai-org/GLM-5.2",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "zai-org/GLM-5.2",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "zai-org/GLM-5.2"
}

Run CC, and GLM-5.2 will power every Claude Code call.

Codex

Get a Baseten API key like above and store it in macOS Keychain:

security add-generic-password \
  -a "$USER" \
  -s codex-baseten-api-key \
  -w "YOUR_BASETEN_API_KEY_HERE" \
  -U

2. Add the Baseten provider to ~/.codex/config.toml:

[model_providers.baseten]
name = "Baseten"
base_url = "https://inference.baseten.co/v1"
wire_api = "responses"

[model_providers.baseten.auth]
command = "/usr/bin/security"
args = ["find-generic-password", "-a", "YOUR_MAC_USERNAME", "-s", "codex-baseten-api-key", "-w"]
timeout_ms = 5000

3. Create a profile at ~/.codex/baseten-glm.config.toml:

model = "zai-org/GLM-5.2"
model_provider = "baseten"
model_reasoning_effort = "medium"

Run: codex --profile baseten-glm to start Codex with GLM-5.2, and all your requests will be routed to this model.

Deep Agents CLI (LangChain)

1. Run:

curl -LsSf https://langch.in/dcode

2. Launch with dcode and inside it:

/install baseten → add Baseten integration
/auth → add your Baseten API key
/model → switch to GLM-5.2

A similar flow works with OpenCode, Cline, and other harnesses for a native installation experience of simply supplying the API key, so I won’t labor the setup process. I’d love to hear learnings from the token factory and what you build.

How to run GLM-5.2 in any harness

Authors

Last updated

Share

Claude Code

Codex

Deep Agents CLI (LangChain)

Related posts

Rolling deployments for zero-downtime model updates

Cost-efficient, high-performance TTS with Qwen3-TTS

Harnesses are everything. Here's how to optimize yours.

Explore Baseten today

Related posts

Rolling deployments for zero-downtime model updates

Cost-efficient, high-performance TTS with Qwen3-TTS

Harnesses are everything. Here's how to optimize yours.