Retrieve billing usage via API

You can now query your billing usage programmatically using the new GET /v1/billing/usage_summary endpoint. Pass a date range of up to 31 days to get a breakdown of costs across Dedicated Inference, Training, and Model APIs.
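Because each request covers at most 31 days, a longer reporting period has to be split into several calls. The sketch below shows one way to do that in Python. Only the endpoint path and the 31-day limit come from this post. The host `api.example.com`, the query parameter names `start_date` and `end_date`, and the `Authorization` header format are placeholders assumed for illustration, so check the billing API reference for the real values. Windows are counted inclusively (both endpoints), which is the conservative reading of the limit.

```python
import json
from datetime import date, timedelta
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "https://api.example.com"  # hypothetical host, not from the post
MAX_DAYS = 31  # maximum date range per request, per the post


def split_range(start: date, end: date, max_days: int = MAX_DAYS):
    """Split [start, end] into consecutive windows of at most max_days days each."""
    if end < start:
        raise ValueError("end must not be before start")
    windows = []
    cursor = start
    while cursor <= end:
        # Count both endpoints, so a window spans at most max_days calendar days.
        window_end = min(cursor + timedelta(days=max_days - 1), end)
        windows.append((cursor, window_end))
        cursor = window_end + timedelta(days=1)
    return windows


def usage_summary_url(start: date, end: date) -> str:
    """Build the request URL; the parameter names are assumed, not documented here."""
    query = urlencode({"start_date": start.isoformat(), "end_date": end.isoformat()})
    return f"{BASE_URL}/v1/billing/usage_summary?{query}"


def fetch_usage_summary(start: date, end: date, api_key: str) -> dict:
    """Send the GET request; the Authorization header format is an assumption."""
    request = Request(
        usage_summary_url(start, end),
        headers={"Authorization": f"Api-Key {api_key}"},
    )
    with urlopen(request) as response:
        return json.load(response)
```

To cover a full quarter, loop over `split_range(date(2024, 1, 1), date(2024, 3, 31))` and call `fetch_usage_summary` once per window.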

The response includes aggregate totals for each product and a breakdown[] array with one entry per resource or per model. Each entry carries a daily[] array with per-day figures. Deleted resources are still returned in breakdown[] with is_deleted: true, so historical cost attribution is preserved.

Example output:

{
  "dedicated_usage": {
    "subtotal": 123,
    "credits_used": 123,
    "total": 123,
    "minutes": 123,
    "breakdown": [{
      "billable_resource": {
        "id": "<string>",
        "kind": "MODEL_DEPLOYMENT",
        "name": "<string>",
        "is_deleted": true,
        "instance_type": "<string>",
        "environment_name": "<string>"
      },
      "subtotal": 123,
      "minutes": 123,
      "inference_requests": 123,
      "daily": [{ "date": "2023-12-25", "subtotal": 123, "minutes": 123, "inference_requests": 123 }]
    }]
  },
  "training_usage": {
    "subtotal": 123,
    "credits_used": 123,
    "total": 123,
    "minutes": 123,
    "breakdown": [{ ... }]
  },
  "model_apis_usage": {
    "subtotal": 123,
    "credits_used": 123,
    "total": 123,
    "breakdown": [{
      "model_name": "<string>",
      "model_family": "<string>",
      "subtotal": 123,
      "input_tokens": 123,
      "output_tokens": 123,
      "cached_input_tokens": 123,
      "daily": [{ "date": "2023-12-25", "subtotal": 123, "input_tokens": 123, "output_tokens": 123 }]
    }]
  }
}
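Once the response is parsed into a dictionary, the breakdown[] and daily[] arrays can be aggregated with a few lines of Python. The following is a minimal sketch that relies only on field names shown in the example output above. The helper names `deleted_resource_cost` and `daily_subtotals` are made up for illustration. The training breakdown is skipped because its entry shape is elided in the example, and the unit of `subtotal` is not stated in this post.

```python
from collections import defaultdict


def deleted_resource_cost(summary: dict) -> float:
    """Sum the subtotals of dedicated resources flagged with is_deleted: true."""
    entries = summary.get("dedicated_usage", {}).get("breakdown", [])
    return sum(
        entry["subtotal"]
        for entry in entries
        if entry["billable_resource"]["is_deleted"]
    )


def daily_subtotals(summary: dict) -> dict:
    """Add up per-day subtotals across Dedicated Inference and Model APIs."""
    totals = defaultdict(float)
    for section in ("dedicated_usage", "model_apis_usage"):
        for entry in summary.get(section, {}).get("breakdown", []):
            for day in entry.get("daily", []):
                totals[day["date"]] += day["subtotal"]
    return dict(totals)
```

Calling `daily_subtotals(summary)` returns a mapping from date strings such as "2023-12-25" to the combined subtotal for that day, which is convenient for charting or for reconciling against an invoice.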

Check out the billing API reference for the full schema and parameters.