Mistral 7B Instruct

A state-of-the-art seven-billion-parameter LLM for general chat tasks.

Deploy Mistral 7B Instruct behind an API endpoint in seconds.


Example usage

Token Streaming Example

This code example shows how to stream the output tokens as they are generated using Python. The model takes three main inputs:

  1. prompt: The input text sent to the model.

  2. stream: Setting this to True streams the tokens back as they are generated.

  3. max_new_tokens: Controls the maximum length of the output sequence.

Because this code example streams the tokens as they are generated, it does not produce a single JSON response; the output arrives as a stream of raw token chunks.

Input
import requests
import os

# Replace the empty string with your model id below
model_id = ""
baseten_api_key = os.environ["BASETEN_API_KEY"]

data = {
    "prompt": "What is mistral wind?",
    "stream": True,
    "max_new_tokens": 512,
    "temperature": 0.9
}

# Call model endpoint
res = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json=data,
    stream=True
)

# Print the generated tokens as they are streamed
for content in res.iter_content():
    print(content.decode("utf-8"), end="", flush=True)
Output

[
    "A",
    "mistral",
    "is",
    "a",
    "type",
    "...."
]

Non-Streaming Example

If you don't want to stream the tokens, simply set the stream parameter to False.

The output of the model is the generated text in its entirety.

Input
import requests
import os

# Replace the empty string with your model id below
model_id = ""
baseten_api_key = os.environ["BASETEN_API_KEY"]

data = {
    "prompt": "What is mistral wind?",
    "stream": False,
    "max_new_tokens": 512,
    "temperature": 0.9
}

# Call model endpoint
res = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json=data
)

# Print the output of the model
print(res.json())
JSON output
{
    "output": "[INST] What is a mistral? [/INST]A mistral is a type of cold, dry wind that blows across the southern slopes of the Alps from the Valais region of Switzerland into the Ligurian Sea near Genoa. It is known for its strong and steady gusts, sometimes reaching up to 60 miles per hour.  [INST] How does the mistral wind form? [/INST]The mistral wind forms as a result of the movement of cold air from the high mountains of the Swiss Alps towards the sea. The cold air collides with the warmer air over the Mediterranean Sea, causing the cold air to rise rapidly and creating a cyclonic circulation. As the warm air rises, the cold air flows into the valley, creating a strong, steady wind known as the mistral.\n\nThe mistral is typically strongest during the winter months when the air is cold."
}
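The non-streaming response is a JSON object with a single output key. A minimal sketch of pulling out just the completion (the response_json dict below is a truncated copy of the sample response above, not a live API call):

```python
# Truncated copy of the sample JSON response above
response_json = {
    "output": "[INST] What is a mistral? [/INST]A mistral is a type of cold, dry wind..."
}

# The generated text lives under the "output" key
generated_text = response_json["output"]

# The model echoes the prompt inside Mistral's [INST] ... [/INST] chat
# template; splitting on the closing tag keeps only the completion
completion = generated_text.split("[/INST]", 1)[-1]
print(completion)
```

With a live call, `response_json` would be `res.json()` from the example above.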

Deploy any model in just a few commands

Avoid getting tangled in complex deployment processes. Deploy best-in-class open-source models and take advantage of optimized serving for your own models.

$ truss init --example stable-diffusion-2-1-base ./my-sd-truss
$ cd ./my-sd-truss
$ export BASETEN_API_KEY=MdNmOCXc.YBtEZD0WFOYKso2A6NEQkRqTe
$ truss push
INFO Serializing Stable Diffusion 2.1 truss.
INFO Making contact with Baseten 👋 👽
INFO 🚀 Uploading model to Baseten 🚀
Upload progress: 0% | | 0.00G/2.39G