
Code Llama 7B Instruct

A seven-billion-parameter large language model tuned for chat-style assistant tasks on programming topics.

Deploy Code Llama 7B Instruct behind an API endpoint in seconds.


Example usage

This code example shows how to invoke the model using the requests library in Python. The model has two key inputs:

  1. prompt: The input text sent to the model.

  2. max_new_tokens: The maximum number of new tokens to generate, which controls the length of the output.

The model returns a JSON object with a key called output that contains the generated text.

Input
import requests
import os

# Replace the empty string with your model ID below
model_id = ""
baseten_api_key = os.environ["BASETEN_API_KEY"]

data = {
    "prompt": "Write some code in python that calculates the meaning of life",
    "max_new_tokens": 512
}

# Call the model endpoint
res = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json=data
)

# Print the output of the model
print(res.json())
JSON output
{
    "output": "<summary>Answer</summary>\n\n\t\t```python\n\t\t\t\tdef calculate_meaning_of_life():\n    \t\t\treturn 42\n\t\t```\n"
}
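To work with the generated code directly, read the output key from the parsed response. A minimal sketch, using a hard-coded response in the shape shown above rather than a live API call (in practice this value would come from res.json()):

```python
# A response in the shape returned by the model endpoint above,
# hard-coded here for illustration instead of calling the API.
response_json = {
    "output": "```python\ndef calculate_meaning_of_life():\n    return 42\n```"
}

# The generated text lives under the "output" key.
generated_text = response_json["output"]
print(generated_text)
```

From here you can strip the markdown code fence or log the text as needed.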

Deploy any model in just a few commands

Avoid getting tangled in complex deployment processes. Deploy best-in-class open-source models and take advantage of optimized serving for your own models.

$ truss init -- example stable-diffusion-2-1-base ./my-sd-truss
$ cd ./my-sd-truss
$ export BASETEN_API_KEY=MdNmOCXc.YBtEZD0WFOYKso2A6NEQkRqTe
$ truss push

INFO Serializing Stable Diffusion 2.1 truss.
INFO Making contact with Baseten 👋 👽
INFO 🚀 Uploading model to Baseten 🚀
Upload progress: 0% | | 0.00G/2.39G