All MiniLM L6 v2

A text embedding model with a context window of 256 tokens and a dimensionality of 384 values.

Deploy All MiniLM L6 v2 behind an API endpoint in seconds.


Example usage

This model takes a list of strings and returns a list of embeddings, where each embedding is a list of 384 floating-point numbers that represents the semantic content of the corresponding string.

Strings can be up to 256 tokens in length (approximately 190 words). Longer strings are truncated before they are passed to the embedding model.
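Because anything past the limit is dropped, longer documents are often split into pieces before embedding. The sketch below is one simple way to do that, assuming the 190-word figure above is a good enough stand-in for the 256-token limit. The helper name `chunk_words` is hypothetical and not part of any Baseten API, and word counts only approximate token counts, so an exact split would need the model's own tokenizer.

```python
def chunk_words(text: str, max_words: int = 190) -> list[str]:
    """Split text into consecutive chunks of at most max_words words.

    Assumption: ~190 words roughly corresponds to the 256-token limit.
    This is a heuristic, not an exact token count.
    """
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    # Split on any whitespace; original spacing and line breaks are not preserved.
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


# Example: a 400-word document becomes three chunks (190, 190, and 20 words).
long_text = " ".join(["word"] * 400)
chunks = chunk_words(long_text)
print([len(chunk.split()) for chunk in chunks])
```

Each chunk can then be sent as one entry in the "text" list, which yields one embedding per chunk.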

Input
import requests
import os

# Replace the empty string with your model id below
model_id = ""
baseten_api_key = os.environ["BASETEN_API_KEY"]

data = {
    "text": ["I want to eat pasta", "I want to eat pizza"],
}

# Call model endpoint
res = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json=data
)

# Print the output of the model
print(res.json())
JSON output
[
    [
        0.2593194842338562,
        "...",
        -1.4059709310531616
    ],
    [
        0.11028853803873062,
        "...",
        -0.9492666125297546
    ]
]
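A common next step with embeddings like these is to compare two strings by the cosine similarity of their vectors. The following is a minimal sketch using only the standard library. The function name `cosine_similarity` and the short toy vectors are illustrative assumptions; real embeddings from this model would have 384 values each.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for two embeddings (hypothetical values).
first = [0.25, -0.10, 0.40]
second = [0.20, -0.05, 0.35]
print(cosine_similarity(first, second))
```

With the response from the request above, the two embeddings would be `res.json()[0]` and `res.json()[1]`. Scores closer to 1 indicate that the two strings are more semantically similar.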

Deploy any model in just a few commands

Avoid getting tangled in complex deployment processes. Deploy best-in-class open-source models and take advantage of optimized serving for your own models.

$ truss init --example stable-diffusion-2-1-base ./my-sd-truss
$ cd ./my-sd-truss
$ export BASETEN_API_KEY=MdNmOCXc.YBtEZD0WFOYKso2A6NEQkRqTe
$ truss push
INFO Serializing Stable Diffusion 2.1 truss.
INFO Making contact with Baseten 👋 👽
INFO 🚀 Uploading model to Baseten 🚀
Upload progress: 0% | | 0.00G/2.39G