Ultravox v0.4
Ultravox v0.4 is a multimodal Speech LLM built around a pretrained Llama3.1-8B-Instruct and Whisper-medium backbone.
Deploy Ultravox v0.4 behind an API endpoint in seconds.
Example usage
You can stream responses from this LLM using the OpenAI chat completions API.
Input
from openai import OpenAI

# ID of your deployed model; it is passed through in extra_body below.
model_id = ""

client = OpenAI(
    api_key="YOUR-API-KEY",
    base_url="https://bridge.baseten.co/v1/direct",
)

response = client.chat.completions.create(
    model="ultravox",  # Replace with your model name
    messages=[
        {
            "role": "user",
            # The content mixes a text prompt with an audio clip.
            "content": [
                {
                    "type": "text",
                    "text": "Answer in one sentence. For lake Michigan,",
                },
                {
                    "type": "audio_url",
                    "audio_url": {"url": "http://study.aitech.ac.jp/tat/239977.mp3"},
                },
            ],
        }
    ],
    stream=True,
    extra_body={
        "baseten": {
            "model_id": model_id
        }
    },
)

# With stream=True the response is an iterator of chunks; print each delta as it arrives.
for chunk in response:
    print(chunk.choices[0].delta)
JSON output
[
  "Lake",
  "Michigan",
  "is",
  "approximately",
  "..."
]
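
The streamed chunks in the example above arrive as separate fragments, so an application usually has to join them before showing a full answer. The following is a minimal sketch of one way to do that. The helper name `collect_text` and the `fake_chunk` stand-ins are hypothetical and not part of the original example. It assumes each chunk exposes its text at `chunk.choices[0].delta.content`, as in the OpenAI Python client, and that some chunks may carry no text at all.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_text(chunks: Iterable) -> str:
    """Join the text fragments of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        # Assumption: a chunk may arrive with an empty choices list; skip it.
        if not chunk.choices:
            continue
        piece = chunk.choices[0].delta.content
        # Deltas that only carry a role or a finish marker have no text.
        if piece:
            parts.append(piece)
    return "".join(parts)


def fake_chunk(text):
    """Hypothetical stand-in that mimics the shape of a streamed chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


# Try the helper without calling the endpoint.
demo = [fake_chunk(None), fake_chunk("Lake"), fake_chunk(" Michigan"), fake_chunk(" is")]
print(collect_text(demo))  # prints "Lake Michigan is"
```

With a live request, `collect_text(response)` could replace the printing loop, at the cost of waiting for the whole stream to finish before anything is displayed.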