Nomic Embed v1.5
SOTA text embedding model with variable dimensionality — outperforms OpenAI text-embedding-ada-002 and text-embedding-3-small models.
Deploy Nomic Embed v1.5 behind an API endpoint in seconds.
Nomic Embed v1.5 is a state-of-the-art text embedding model with two special features:
You can choose whether to optimize the embeddings for retrieval, search, clustering, or classification.
You can trade off between cost and accuracy by choosing your own dimensionality thanks to Matryoshka Representation Learning (a rough sketch of this idea follows below).
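Roughly speaking, Matryoshka Representation Learning trains the model so that a prefix of the full vector is itself a usable embedding, so a shorter vector can be obtained by keeping the leading dimensions and renormalizing. The sketch below only illustrates that reduction with numpy on a placeholder vector; in practice you simply pass the dimensionality parameter and the endpoint returns vectors of that size.

import numpy as np

# Placeholder standing in for a full 768-dimensional embedding returned by the model.
full_embedding = np.random.default_rng(0).normal(size=768)

def shorten(vec, dim):
    """Illustrative Matryoshka-style reduction: keep the first `dim` values and renormalize."""
    prefix = vec[:dim]
    return prefix / np.linalg.norm(prefix)

# A 256-dimensional vector costs a third of the storage of the full 768-dimensional one.
short_embedding = shorten(full_embedding, 256)
print(short_embedding.shape)  # (256,)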
Nomic Embed v1.5 takes the following parameters:
texts: the strings to embed.
task_type: the task to optimize the embedding for. Can be search_document (default), search_query, clustering, or classification.
dimensionality: the size of each output vector; any integer between 64 and 768 (default).
This code sample demonstrates embedding a set of sentences for retrieval with a dimensionality of 512.
import requests
import os

# Replace the empty string with your model id below
model_id = ""
baseten_api_key = os.environ["BASETEN_API_KEY"]

data = {
    "texts": ["I want to eat pasta", "I want to eat pizza"],
    "task_type": "search_document",
    "dimensionality": 512
}

# Call model endpoint
res = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json=data
)

# Print the output of the model
print(res.json())
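The model returns a JSON array with one embedding vector per input string (values truncated here):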
[
    [
        -0.03811980411410332,
        "...",
        -0.023593541234731674
    ],
    [
        -0.042617011815309525,
        "...",
        -0.0191882885992527
    ]
]
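To use these document embeddings for retrieval, you can embed a query with task_type set to search_query and rank the documents by cosine similarity. The sketch below is illustrative, not part of the Baseten API: it reuses the model_id and API key setup from the example above, assumes the response format shown there (a JSON list of vectors, one per input), and uses an example query string of its own.

import os
import numpy as np
import requests

model_id = ""  # same model id as above
baseten_api_key = os.environ["BASETEN_API_KEY"]
url = f"https://model-{model_id}.api.baseten.co/production/predict"
headers = {"Authorization": f"Api-Key {baseten_api_key}"}

documents = ["I want to eat pasta", "I want to eat pizza"]

# Embed the documents for retrieval (same call as the example above)
doc_vectors = requests.post(url, headers=headers, json={
    "texts": documents,
    "task_type": "search_document",
    "dimensionality": 512
}).json()

# Embed the query with the query-optimized task type; the dimensionality
# must match the document embeddings so the vectors are comparable.
# Indexing [0] assumes the response is a list with one vector per input text.
query_vector = requests.post(url, headers=headers, json={
    "texts": ["What should I have for dinner?"],
    "task_type": "search_query",
    "dimensionality": 512
}).json()[0]

def cosine(a, b):
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rank documents by similarity to the query, highest first
scores = [cosine(query_vector, v) for v in doc_vectors]
for doc, score in sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True):
    print(f"{score:.3f}  {doc}")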