
Mixtral 8x22B

A new state-of-the-art Mixtral model that can be used for general chat applications

Deploy Mixtral 8x22B behind an API endpoint in seconds.


Example usage

REST API Token Streaming Example

You can also make a REST API call using the requests library. To invoke the model using this method, you need the same three inputs: prompt, stream, and max_new_tokens.

Because this code example streams the tokens as they get generated, it does not produce a JSON output.

Input
import requests
import os

# Replace the empty string with your model id below
model_id = ""
baseten_api_key = os.environ["BASETEN_API_KEY"]

data = {
    "prompt": "Instruction: Answer the following question. What is a llama's favorite food?",
    "stream": True,
    "max_new_tokens": 512,
    "temperature": 0.9
}

# Call model endpoint
res = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json=data,
    stream=True
)

# Print the generated tokens as they get streamed
for content in res.iter_content():
    print(content.decode("utf-8"), end="", flush=True)
Output
[
    "Llamas",
    "are",
    "herbivorous",
    "animals.",
    "They",
    "eat",
    "grass",
    "......"
]
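
If you want the full completion as a single string rather than printing chunks as they arrive, you can collect the streamed bytes and decode them once at the end. This is a minimal sketch that reuses the same endpoint and payload as the streaming example above; the chunk-collection logic is illustrative and not a required part of the Baseten API.

import requests
import os

# Replace the empty string with your model id below
model_id = ""
baseten_api_key = os.environ["BASETEN_API_KEY"]

data = {
    "prompt": "Instruction: Answer the following question. What is a llama's favorite food?",
    "stream": True,
    "max_new_tokens": 512,
    "temperature": 0.9
}

res = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json=data,
    stream=True
)

# Collect the raw streamed chunks, then decode once so multi-byte
# characters are never split across chunk boundaries
chunks = []
for content in res.iter_content():
    chunks.append(content)

full_output = b"".join(chunks).decode("utf-8")
print(full_output)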

REST API Non-streaming Example

The way you send this request is the same as above; just remove the streaming bits. This produces valid JSON output.

Input
import requests
import os

# Replace the empty string with your model id below
model_id = ""
baseten_api_key = os.environ["BASETEN_API_KEY"]

data = {
    "prompt": "Instruction: Answer the following question. What is a llama's favorite food?",
    "stream": False,
    "max_new_tokens": 512,
    "temperature": 0.9
}

# Call model endpoint
res = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json=data
)

# Print the output of the model
print(res.json())
JSON output
{
    "result": "A llama’s favorite food is anything that’s soft and easy to eat. Llamas are herbivorous animals. They eat grass, hay, and vegetables"
}
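
Because the non-streaming call returns JSON, you can pull the generated text straight out of the result field shown above. Here is a minimal sketch, assuming the same request as the non-streaming example and that the generated text is returned under the "result" key as in the example output:

import requests
import os

# Replace the empty string with your model id below
model_id = ""
baseten_api_key = os.environ["BASETEN_API_KEY"]

# Same non-streaming request as above
res = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json={
        "prompt": "Instruction: Answer the following question. What is a llama's favorite food?",
        "stream": False,
        "max_new_tokens": 512,
        "temperature": 0.9
    }
)

res.raise_for_status()  # surface HTTP errors instead of trying to parse an error body

# The example output above shows the generated text under the "result" key
generated_text = res.json()["result"]
print(generated_text)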

Deploy any model in just a few commands

Avoid getting tangled in complex deployment processes. Deploy best-in-class open-source models and take advantage of optimized serving for your own models.

$ truss init -- example stable-diffusion-2-1-base ./my-sd-truss
$ cd ./my-sd-truss
$ export BASETEN_API_KEY=MdNmOCXc.YBtEZD0WFOYKso2A6NEQkRqTe
$ truss push
INFO Serializing Stable Diffusion 2.1 truss.
INFO Making contact with Baseten 👋 👽
INFO 🚀 Uploading model to Baseten 🚀
Upload progress: 0% | | 0.00G/2.39G