AudioGen Medium
A text-to-audio model for creating short snippets of sound effects like dogs barking or footsteps in a hallway.
Deploy AudioGen Medium behind an API endpoint in seconds.
Example usage
This code example shows how to invoke the model using the requests library in Python. The model has two inputs:
prompts: A list of text descriptions that the model uses to determine the type of audio to generate.
duration: The duration in seconds of each output audio file.
The output of the model is a JSON object with a key called data, which holds a list of all the generated audio files. Each audio file in the list is represented as a base64 string.
import base64
import os

import requests

# Replace the empty string with your model id below
model_id = ""
baseten_api_key = os.environ["BASETEN_API_KEY"]

data = {
    "prompts": ['dog barking'],
    "duration": 8
}

# Call model endpoint
res = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json=data
)

# Parse the JSON response and print the output of the model
response = res.json()
print(response)
output = response.get("data")

# Convert the output base64 strings to audio files
for idx, clip in enumerate(output):
    with open(f"clip_{idx}.wav", "wb") as f:
        f.write(base64.b64decode(clip))
{
    "data": [
        "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBg..."
    ]
}
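The decoding step can be tried without calling the endpoint. The following is a minimal sketch, assuming only the response shape described above (a data key holding a list of base64 strings). The save_clips helper and its out_dir parameter are hypothetical names introduced for illustration, not part of the Baseten API.

```python
import base64
from pathlib import Path
from typing import List


def save_clips(response: dict, out_dir: str = ".") -> List[Path]:
    """Decode each base64 string under "data" and write it to a .wav file.

    Hypothetical helper: assumes the response shape described in this page.
    """
    clips = response.get("data")
    if not isinstance(clips, list):
        # A missing or malformed "data" key usually means the call failed
        raise ValueError('response has no "data" list')

    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)

    paths = []
    for idx, clip in enumerate(clips):
        path = target / f"clip_{idx}.wav"
        path.write_bytes(base64.b64decode(clip))
        paths.append(path)
    return paths
```

With the request example above, this would be called as save_clips(res.json()). Raising on a missing data list makes an error response fail loudly instead of surfacing as a TypeError inside the loop.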
Here is another example with the following prompt:
siren of an emergency vehicle
import base64
import os

import requests

# Replace the empty string with your model id below
model_id = ""
baseten_api_key = os.environ["BASETEN_API_KEY"]

data = {
    "prompts": ['siren of an emergency vehicle'],
    "duration": 8
}

# Call model endpoint
res = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json=data
)

# Parse the JSON response and print the output of the model
response = res.json()
print(response)
output = response.get("data")

# Convert the output base64 strings to audio files
for idx, clip in enumerate(output):
    with open(f"clip_{idx}.wav", "wb") as f:
        f.write(base64.b64decode(clip))
{
    "data": [
        "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBg..."
    ]
}
Final example with the prompt:
footsteps in a corridor
import base64
import os

import requests

# Replace the empty string with your model id below
model_id = ""
baseten_api_key = os.environ["BASETEN_API_KEY"]

data = {
    "prompts": ['footsteps in a corridor'],
    "duration": 8
}

# Call model endpoint
res = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json=data
)

# Parse the JSON response and print the output of the model
response = res.json()
print(response)
output = response.get("data")

# Convert the output base64 strings to audio files
for idx, clip in enumerate(output):
    with open(f"clip_{idx}.wav", "wb") as f:
        f.write(base64.b64decode(clip))
{
    "data": [
        "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBg..."
    ]
}
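Because prompts is a list, several descriptions could plausibly be sent in a single request instead of one request per prompt. The following is a minimal sketch under assumptions the page does not confirm: that the endpoint returns one clip per prompt, and that a long timeout is appropriate. The build_payload and generate helpers, the validation rules, and the timeout value are hypothetical choices for illustration.

```python
import base64


def build_payload(prompts, duration=8):
    """Build the request body: a list of prompts and a duration in seconds."""
    if isinstance(prompts, str):
        # A bare string would otherwise be split into single characters
        raise ValueError("prompts must be a list of strings, not a single string")
    prompts = list(prompts)
    if not prompts or not all(isinstance(p, str) and p.strip() for p in prompts):
        raise ValueError("prompts must be a non-empty list of non-empty strings")
    if duration <= 0:
        raise ValueError("duration must be positive")
    return {"prompts": prompts, "duration": duration}


def generate(model_id, api_key, prompts, duration=8, timeout=300):
    """POST the payload and return the decoded audio bytes for each clip.

    The timeout of 300 seconds is an assumption, not a documented value.
    """
    # Third-party dependency, imported here so build_payload works without it
    import requests

    res = requests.post(
        f"https://model-{model_id}.api.baseten.co/production/predict",
        headers={"Authorization": f"Api-Key {api_key}"},
        json=build_payload(prompts, duration),
        timeout=timeout,
    )
    res.raise_for_status()  # surface HTTP errors before parsing the body
    return [base64.b64decode(clip) for clip in res.json()["data"]]
```

For instance, generate(model_id, baseten_api_key, ["dog barking", "footsteps in a corridor"]) would send both prompts at once. Validating the payload locally catches an empty prompt list or a non-positive duration before any request is made.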