Model library / Fixie AI / Ultravox v0.4

Ultravox v0.4

LLM vLLM H100 MIG 40GB

Ultravox v0.4 is a multimodal speech LLM built around pretrained Llama3.1-8B-Instruct and Whisper-medium backbones.

Deploy Ultravox v0.4 behind an API endpoint in seconds.

Example usage

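You can stream responses from this model through the OpenAI Python client's chat completions API. The request sends a text prompt and an audio URL together in a single user message.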

Input

from openai import OpenAI

# ID of your deployed model; left empty here, fill in your own
model_id = ""

client = OpenAI(
    api_key="YOUR-API-KEY",
    base_url="https://bridge.baseten.co/v1/direct",
)

response = client.chat.completions.create(
    model="ultravox",  # Replace with your model name
    messages=[
        {
            "role": "user",
            "content": [
                # Text part of the prompt
                {
                    "type": "text",
                    "text": "Answer in one sentence. For lake Michigan,"
                },
                # Audio part of the prompt, referenced by URL
                {
                    "type": "audio_url",
                    "audio_url": {"url": "http://study.aitech.ac.jp/tat/239977.mp3"}
                },
            ],
        }
    ],
    stream=True,
    # Routes the request to the deployment identified by model_id
    extra_body={
        "baseten": {
            "model_id": model_id
        }
    },
)

# Print the incremental delta carried by each streamed chunk
for chunk in response:
    print(chunk.choices[0].delta)

JSON output

[
    "Lake",
    "Michigan",
    "is",
    "approximately",
    "..."
]
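
If you want the full reply as one string rather than a sequence of deltas, the streamed chunks can be accumulated as they arrive. The sketch below is one possible way to do that, not part of the official example: collect_stream and fake_chunk are hypothetical helper names, and the stand-in chunks only mimic the choices[0].delta.content shape of the streamed objects so the snippet runs without a deployed model.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Join the text fragments of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        # Some chunks may carry no choices at all; skip them.
        if not chunk.choices:
            continue
        text = chunk.choices[0].delta.content
        # Deltas without text (content is None or empty) are ignored.
        if text:
            parts.append(text)
    return "".join(parts)


def fake_chunk(text):
    """Hypothetical stand-in that mimics the shape of one streamed chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


# Illustrative data only; a real call would pass the streaming response instead.
demo = [
    fake_chunk(None),
    fake_chunk("Lake"),
    fake_chunk(" Michigan"),
    fake_chunk(" is"),
    SimpleNamespace(choices=[]),
]
print(collect_stream(demo))  # prints "Lake Michigan is"
```

With a live deployment, passing the streaming response object from the request above to collect_stream in place of demo should yield the complete answer, assuming each chunk exposes its text at choices[0].delta.content.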