Llama 3.1 8B Instruct
State-of-the-art open-source 8B LLM by Meta
Deploy Llama 3.1 8B Instruct behind an API endpoint in seconds.
Example usage
Call the model and stream results
Input
import os
import requests

# Replace the empty string with your model ID below
model_id = ""
baseten_api_key = os.environ["BASETEN_API_KEY"]

data = {
    "prompt": "What even is AGI?",
    "stream": True,
    "max_tokens": 1024
}

# Call the model endpoint
res = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json=data,
    stream=True
)

# Print the generated tokens as they stream in
for content in res.iter_content():
    print(content.decode("utf-8"), end="", flush=True)
JSON output
[
  "arrrg",
  "me hearty",
  "I",
  "be",
  "doing",
  "..."
]
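Because the endpoint streams raw UTF-8 chunks, you can also collect them into a single string rather than printing each token as it arrives. Below is a minimal sketch of that pattern, assuming the same endpoint, payload shape, and BASETEN_API_KEY environment variable as the example above; the generate helper is hypothetical and not part of the API.

import os
import requests

# Hypothetical helper: call the endpoint and return the full generated text.
# Assumes the same model_id, API key, and payload shape as the example above.
def generate(prompt: str, model_id: str, max_tokens: int = 1024) -> str:
    res = requests.post(
        f"https://model-{model_id}.api.baseten.co/production/predict",
        headers={"Authorization": f"Api-Key {os.environ['BASETEN_API_KEY']}"},
        json={"prompt": prompt, "stream": True, "max_tokens": max_tokens},
        stream=True,
    )
    res.raise_for_status()
    # Collect the streamed UTF-8 chunks into one string instead of printing them
    return "".join(chunk.decode("utf-8") for chunk in res.iter_content())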