BGE Embedding ICL
BGE Embedding ICL is an excellent all-around model for text embedding.
Deploy BGE Embedding ICL behind an API endpoint in seconds.
Example usage
BAAI/bge-en-icl is a text-embedding model: given an input text, it produces a one-dimensional embedding vector. It is frequently used for downstream tasks such as clustering, and its output is commonly stored and searched in vector databases.
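As a rough illustration of the downstream use mentioned above, embeddings are usually compared with cosine similarity, which is also the kind of score a vector database typically ranks by. The sketch below uses only the standard library. The function name `cosine_similarity` and the three-element toy vectors are made up for illustration and are far shorter than a real embedding.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings (hypothetical values).
query = [1.0, 0.0, 1.0]
doc_similar = [2.0, 0.0, 2.0]    # same direction as the query
doc_unrelated = [0.0, 3.0, 0.0]  # orthogonal to the query

scores = {
    "doc_similar": cosine_similarity(query, doc_similar),
    "doc_unrelated": cosine_similarity(query, doc_unrelated),
}

# Pick the candidate whose embedding points most nearly the same way as the query.
best_match = max(scores, key=scores.get)
```

In practice the vectors would come from the embeddings endpoint rather than being written by hand, and a vector database would perform this ranking over many stored vectors.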
This model is quantized to FP8 for deployment. FP8 is supported by recent NVIDIA GPUs such as the H100, H100_40GB, or L4. Quantization is optional, but it makes inference more efficient.
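from openai import OpenAI
import os

# The endpoint is called through the standard OpenAI client.
# The API key is read from the BASETEN_API_KEY environment variable.
client = OpenAI(
    api_key=os.environ['BASETEN_API_KEY'],
    base_url="https://model-xxxxxx.api.baseten.co/environments/production/sync/v1"
)

# Request an embedding for a single input string.
embedding = client.embeddings.create(
    input="Baseten Embeddings are fast",
    model="model"
)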

{
  "data": [
    {
      "embedding": [
        0
      ],
      "index": 0,
      "object": "embedding"
    }
  ],
  "model": "thenlper/gte-base",
  "object": "list",
  "usage": {
    "prompt_tokens": 512,
    "total_tokens": 512
  }
}
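
To make the response shape concrete, here is a small sketch that reads the vectors and token counts out of a payload like the one above. It parses a raw JSON string with the standard library, so it runs without a deployed model. `raw_response` and `extract_embeddings` are hypothetical names, and the three-element vector and token counts are placeholders, not real model output.

```python
import json

# Placeholder payload mirroring the response shape shown above (values are made up).
raw_response = """
{
  "data": [
    {"embedding": [0.1, 0.2, 0.3], "index": 0, "object": "embedding"}
  ],
  "model": "model",
  "object": "list",
  "usage": {"prompt_tokens": 5, "total_tokens": 5}
}
"""

def extract_embeddings(payload):
    """Return the embedding vectors ordered by their `index` field."""
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]

payload = json.loads(raw_response)
vectors = extract_embeddings(payload)             # one vector per input
total_tokens = payload["usage"]["total_tokens"]   # tokens consumed by the request
```

When the request is made with the OpenAI client instead, the same fields are exposed as attributes on the returned object, for example `embedding.data[0].embedding` and `embedding.usage.total_tokens`.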