"Inference Engineering" is now available. Get your copy here
large language

poolside Laguna XS

Laguna XS.2 is a 33B total-parameter Mixture-of-Experts model with 3B activated parameters per token, designed for agentic coding and long-horizon work.

Model details

View repository

Example usage

Laguna XS.2 is a 33B total-parameter Mixture-of-Experts model with 3B activated parameters per token, designed for agentic coding and long-horizon work. It uses Sliding Window Attention with per-head gating in 30 of its 40 layers for fast inference and low KV-cache requirements.
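To see why sliding-window layers shrink the KV cache, a rough sizing sketch helps: full-attention layers must cache keys and values for every token in context, while sliding-window layers only cache the window. All dimensions below are illustrative assumptions, not published Laguna XS.2 hyperparameters; only the 30-of-40 layer split comes from the description above.

```python
# Rough KV-cache sizing sketch. Every dimension except the 30/40 layer
# split is an assumption for illustration, not a real model spec.
BYTES = 2          # fp16/bf16 bytes per element
layers_full = 10   # assumed full-attention layers
layers_swa = 30    # sliding-window layers (per the model description)
window = 4096      # assumed sliding-window size
seq_len = 131072   # assumed context length
kv_heads = 8       # assumed KV heads (GQA)
head_dim = 128     # assumed head dimension

def kv_bytes(n_layers, tokens_cached):
    # 2x for keys and values
    return 2 * n_layers * tokens_cached * kv_heads * head_dim * BYTES

full = kv_bytes(layers_full + layers_swa, seq_len)
mixed = kv_bytes(layers_full, seq_len) + kv_bytes(layers_swa, min(window, seq_len))
print(f"all-full-attention KV cache: {full / 2**30:.1f} GiB")
print(f"mixed SWA KV cache:          {mixed / 2**30:.1f} GiB")
```

Under these assumed dimensions the mixed layout caches roughly a quarter of the memory at long context, since the 30 windowed layers cap out at the window size regardless of sequence length.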

This model was pre-trained on 30 trillion tokens and, after undergoing agent RL, delivers the strongest performance in Poolside's agent harness, pool. Poolside's RL stack is a custom-built system that loosely couples its major components: inference and rollout generation, orchestration of code-execution sandboxes, trajectory scoring, buffering and filtering, and distributed training.
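The loose coupling described above means each stage exchanges plain trajectory data rather than sharing state. The toy loop below sketches that pipeline shape; every function name and the reward logic are illustrative assumptions, not Poolside's actual stack.

```python
# Toy sketch of a loosely coupled agent-RL loop: each stage is a separate
# component that communicates only through trajectory dicts. All names and
# logic here are illustrative assumptions, not Poolside's real system.
import random

random.seed(0)

def generate_rollout(policy_version):
    # Inference / rollout generation: sample a trajectory from the policy.
    return {"policy": policy_version, "steps": [random.random() for _ in range(4)]}

def execute_in_sandbox(trajectory):
    # Sandbox orchestration: run the agent's code, record the outcome.
    trajectory["passed"] = sum(trajectory["steps"]) > 2.0
    return trajectory

def score(trajectory):
    # Trajectory scoring: turn the execution outcome into a scalar reward.
    trajectory["reward"] = 1.0 if trajectory["passed"] else 0.0
    return trajectory

buffer = []
policy_version = 0
for _ in range(32):
    traj = score(execute_in_sandbox(generate_rollout(policy_version)))
    if traj["reward"] > 0:   # buffering and filtering: keep useful trajectories
        buffer.append(traj)
    if len(buffer) >= 8:     # distributed training step (stubbed as a counter)
        policy_version += 1
        buffer.clear()

print(policy_version)
```

Because each stage only consumes and produces data, the real components can scale and fail independently, which is the usual motivation for this kind of decoupled design.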

Input
# You can use this model with any of the OpenAI clients in any language!
# Simply set the API key to get started

import os
from openai import OpenAI

model_url = ""  # Copy in from the API pane in the Baseten model dashboard

client = OpenAI(
    api_key=os.environ["BASETEN_API_KEY"],
    base_url=model_url,
)

response = client.chat.completions.create(
    model="poolside/laguna-xs.2",
    messages=[
        {
            "role": "user",
            "content": "Implement Hello World in Python",
        }
    ],
    stream=True,
    stream_options={
        "include_usage": True,
        "continuous_usage_stats": True,
    },
    top_p=1,
    max_tokens=1000,
    temperature=1,
    presence_penalty=0,
    frequency_penalty=0,
)

for chunk in response:
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)
JSON output
{
    "id": "143",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "logprobs": null,
            "message": {
                "content": "[Model output here]",
                "role": "assistant",
                "audio": null,
                "function_call": null,
                "tool_calls": null
            }
        }
    ],
    "created": 1741224586,
    "model": "",
    "object": "chat.completion",
    "service_tier": null,
    "system_fingerprint": null,
    "usage": {
        "completion_tokens": 145,
        "prompt_tokens": 38,
        "total_tokens": 183,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
    }
}
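Note that the JSON above shows the non-streaming response shape. With `stream=True`, as in the example code, the client instead receives a sequence of chunks whose `choices[0].delta` carries the text, and, because `stream_options.include_usage` is set, a final chunk with an empty `choices` array and a populated `usage` field. A minimal sketch of consuming that shape, using simplified dicts in place of the typed objects the real SDK returns:

```python
# Sketch of the streaming chunk shape when include_usage is set.
# These dict-shaped chunks and token counts are mock data for
# illustration; the real SDK yields typed ChatCompletionChunk objects.
chunks = [
    {"choices": [{"delta": {"content": "print("}}], "usage": None},
    {"choices": [{"delta": {"content": "'Hello, World!')"}}], "usage": None},
    # Final chunk: empty choices, usage stats attached.
    {"choices": [], "usage": {"prompt_tokens": 12,
                              "completion_tokens": 7,
                              "total_tokens": 19}},
]

text = ""
usage = None
for chunk in chunks:
    if chunk["choices"] and chunk["choices"][0]["delta"].get("content"):
        text += chunk["choices"][0]["delta"]["content"]
    if chunk["usage"] is not None:
        usage = chunk["usage"]

print(text)                       # the accumulated completion
print(usage["total_tokens"])      # billing/usage from the final chunk
```

This is why the example's streaming loop guards on `chunk.choices` before indexing into it: the usage-bearing final chunk has no choices at all.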
