Ultravox v0.4

Ultravox v0.4 is a multimodal Speech LLM built around a pretrained Llama3.1-8B-Instruct and Whisper-medium backbone.

Deploy Ultravox v0.4 behind an API endpoint in seconds.

Example usage

You can stream responses from this LLM using OpenAI chat completions.

Input
from openai import OpenAI

model_id = ""  # Set this to your deployed model's ID

client = OpenAI(
    api_key="YOUR-API-KEY",
    base_url="https://bridge.baseten.co/v1/direct"
)

response = client.chat.completions.create(
    model="ultravox",  # Replace with your model name
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Answer in one sentence. For lake Michigan,"
                },
                {
                    "type": "audio_url",
                    "audio_url": {"url": "http://study.aitech.ac.jp/tat/239977.mp3"}
                }
            ]
        }
    ],
    stream=True,
    extra_body={
        "baseten": {
            "model_id": model_id
        }
    }
)

# Print each streamed delta as it arrives
for chunk in response:
    print(chunk.choices[0].delta)

JSON output
[
    "Lake",
    "Michigan",
    "is",
    "approximately",
    "..."
]
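
The streaming loop above prints one delta per chunk. If you would rather end up with a single string, a minimal sketch of collecting the fragments might look like the following. The helper names `collect_stream` and `fake_chunk` are hypothetical and not part of the OpenAI or Baseten APIs; the sketch assumes each chunk exposes `choices[0].delta.content`, as OpenAI-style streaming chunks do, and uses stand-in objects so it runs without a network call.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Join the text fragments of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        # Assumption: some chunks may carry an empty choices list; skip those.
        if not chunk.choices:
            continue
        content = chunk.choices[0].delta.content
        # A delta's content can be None (for example on a role-only chunk).
        if content:
            parts.append(content)
    return "".join(parts)


def fake_chunk(text):
    """Hypothetical stand-in for a streamed chunk, used only for this demo."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [fake_chunk("Lake"), fake_chunk(" Michigan"), fake_chunk(None), fake_chunk(" is")]
print(collect_stream(demo))
```

With a real request, you would pass the `response` object from the example above to `collect_stream` instead of the demo list.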

Deploy any model in just a few commands

Avoid getting tangled in complex deployment processes. Deploy best-in-class open-source models and take advantage of optimized serving for your own models.

$ truss init --example stable-diffusion-2-1-base ./my-sd-truss
$ cd ./my-sd-truss
$ export BASETEN_API_KEY=MdNmOCXc.YBtEZD0WFOYKso2A6NEQkRqTe
$ truss push
INFO Serializing Stable Diffusion 2.1 truss.
INFO Making contact with Baseten 👋 👽
INFO 🚀 Uploading model to Baseten 🚀
Upload progress: 0% | | 0.00G/2.39G