Nomic Embed v1.5
SOTA text embedding model with variable dimensionality — outperforms OpenAI text-embedding-ada-002 and text-embedding-3-small models.
Deploy Nomic Embed v1.5 behind an API endpoint in seconds.
Nomic Embed v1.5 is a state-of-the-art text embedding model with two special features:
You can choose whether to optimize the embeddings for retrieval, search, clustering, or classification.
You can trade off between cost and accuracy by choosing your own dimensionality thanks to Matryoshka Representation Learning (a rough sketch of this idea follows below).
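Roughly speaking, Matryoshka Representation Learning trains the model so that a prefix of the full vector is itself a usable embedding, so a shorter vector can be obtained by keeping the leading dimensions and renormalizing. The sketch below only illustrates that reduction with numpy on a placeholder vector; in practice you simply pass the dimensionality parameter and the endpoint returns vectors of that size.

import numpy as np

# Placeholder standing in for a full 768-dimensional embedding returned by the model.
full_embedding = np.random.default_rng(0).normal(size=768)

def shorten(vec, dim):
    """Illustrative Matryoshka-style reduction: keep the first `dim` values and renormalize."""
    prefix = vec[:dim]
    return prefix / np.linalg.norm(prefix)

# A 256-dimensional vector costs a third of the storage of the full 768-dimensional one.
short_embedding = shorten(full_embedding, 256)
print(short_embedding.shape)  # (256,)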
Nomic Embed v1.5 takes the following parameters:
texts: the strings to embed.
task_type: the task to optimize the embedding for. Can be search_document (default), search_query, clustering, or classification.
dimensionality: the size of each output vector; any integer between 64 and 768 (default).
This code sample demonstrates embedding a set of sentences for retrieval with a dimensionality of 512.
import requests
import os

# Replace the empty string with your model id below
model_id = ""
baseten_api_key = os.environ["BASETEN_API_KEY"]

data = {
    "texts": ["I want to eat pasta", "I want to eat pizza"],
    "task_type": "search_document",
    "dimensionality": 512
}

# Call model endpoint
res = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json=data
)

# Print the output of the model
print(res.json())
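The model returns a JSON array with one embedding vector per input string (values truncated here):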
[
    [
        -0.03811980411410332,
        "...",
        -0.023593541234731674
    ],
    [
        -0.042617011815309525,
        "...",
        -0.0191882885992527
    ]
]
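To use these document embeddings for retrieval, you can embed a query with task_type set to search_query and rank the documents by cosine similarity. The sketch below is illustrative, not part of the Baseten API: it reuses the model_id and API key setup from the example above, assumes the response format shown there (a JSON list of vectors, one per input), and uses an example query string of its own.

import os
import numpy as np
import requests

model_id = ""  # same model id as above
baseten_api_key = os.environ["BASETEN_API_KEY"]
url = f"https://model-{model_id}.api.baseten.co/production/predict"
headers = {"Authorization": f"Api-Key {baseten_api_key}"}

documents = ["I want to eat pasta", "I want to eat pizza"]

# Embed the documents for retrieval (same call as the example above)
doc_vectors = requests.post(url, headers=headers, json={
    "texts": documents,
    "task_type": "search_document",
    "dimensionality": 512
}).json()

# Embed the query with the query-optimized task type; the dimensionality
# must match the document embeddings so the vectors are comparable.
# Indexing [0] assumes the response is a list with one vector per input text.
query_vector = requests.post(url, headers=headers, json={
    "texts": ["What should I have for dinner?"],
    "task_type": "search_query",
    "dimensionality": 512
}).json()[0]

def cosine(a, b):
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rank documents by similarity to the query, highest first
scores = [cosine(query_vector, v) for v in doc_vectors]
for doc, score in sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True):
    print(f"{score:.3f}  {doc}")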