All MiniLM L6 v2

A text embedding model with a 256-token context window that produces 384-dimensional embeddings.

Deploy All MiniLM L6 v2 behind an API endpoint in seconds.

Example usage

This model takes a list of strings and returns a list of embeddings, one per input string. Each embedding is a list of 384 floating-point numbers representing the semantic content of the associated string.

Strings can be up to 256 tokens in length (approximately 190 words). If the strings are longer, they'll be truncated before being run through the embedding model.
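Because truncation happens silently, it can help to flag inputs that are likely to exceed the limit before sending them. The following is a minimal sketch that uses the 190-word approximation above as a proxy. The helper name `may_be_truncated` and the constant `MAX_WORDS` are hypothetical and not part of any API, and the real cutoff depends on how the model's tokenizer splits the text, so treat the result as a rough hint only.

```python
# Rough word equivalent of the 256-token limit, taken from the estimate above.
# This is an assumption-based proxy, not an exact token count.
MAX_WORDS = 190


def may_be_truncated(text, max_words=MAX_WORDS):
    """Return True if the text has more whitespace-separated words than max_words."""
    return len(text.split()) > max_words


texts = ["I want to eat pasta", "word " * 300]
flags = [may_be_truncated(t) for t in texts]
print(flags)  # [False, True]
```

Inputs that are flagged could be split into shorter passages and embedded separately, though whether that is appropriate depends on the application.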

Input

import requests
import os

# Replace the empty string with your model id below
model_id = ""
baseten_api_key = os.environ["BASETEN_API_KEY"]

data = {
    "text": ["I want to eat pasta", "I want to eat pizza"],
}

# Call model endpoint
res = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json=data
)

# Print the output of the model
print(res.json())

JSON output

[
    [
        0.2593194842338562,
        "...",
        -1.4059709310531616
    ],
    [
        0.11028853803873062,
        "...",
        -0.9492666125297546
    ]
]
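A common way to use output like this is to compare two embeddings with cosine similarity, where values closer to 1 suggest more similar text. The sketch below is illustrative and not part of the model's API: `cosine_similarity` is a hypothetical helper, and the short vectors are made-up stand-ins for the 384-value lists the endpoint returns.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors standing in for real 384-dimensional embeddings.
# The values are invented for illustration, not actual model output.
embeddings = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
]

# These two vectors point in the same direction, so the score is close to 1.0.
score = cosine_similarity(embeddings[0], embeddings[1])
print(score)
```

With a real response, `embeddings` would be the parsed JSON list, so `embeddings[0]` and `embeddings[1]` would correspond to the first and second input strings.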
