Ultravox v0.4

Ultravox v0.4 is a multimodal Speech LLM built around a pretrained Llama3.1-8B-Instruct and Whisper-medium backbone.

Deploy Ultravox v0.4 behind an API endpoint in seconds.


Example usage

You can stream responses from this model through the OpenAI chat completions API.

Input
from openai import OpenAI

model_id = ""  # Set this to your Baseten model ID

client = OpenAI(
    api_key="YOUR-API-KEY",
    base_url="https://bridge.baseten.co/v1/direct"
)

response = client.chat.completions.create(
    model="ultravox",  # Replace with your model name
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Answer in one sentence. For lake Michigan,"
                },
                {
                    "type": "audio_url",
                    "audio_url": {"url": "http://study.aitech.ac.jp/tat/239977.mp3"}
                }
            ]
        }
    ],
    stream=True,
    extra_body={
        "baseten": {
            "model_id": model_id
        }
    }
)

# Each streamed chunk carries an incremental delta of the reply
for chunk in response:
    print(chunk.choices[0].delta)

JSON output
[
    "Lake",
    "Michigan",
    "is",
    "approximately",
    "..."
]

Deploy any model in just a few commands

Avoid getting tangled in complex deployment processes. Deploy best-in-class open-source models and take advantage of optimized serving for your own models.

$ truss init --example stable-diffusion-2-1-base ./my-sd-truss
$ cd ./my-sd-truss
$ export BASETEN_API_KEY=MdNmOCXc.YBtEZD0WFOYKso2A6NEQkRqTe
$ truss push
INFO Serializing Stable Diffusion 2.1 truss.
INFO Making contact with Baseten 👋 👽
INFO 🚀 Uploading model to Baseten 🚀
Upload progress: 0% | | 0.00G/2.39G