AutoEmbeddings is a class that automatically selects the appropriate embeddings handler for you, based on the model name you provide.

Installation

Embeddings require the appropriate library to be installed. See the Installation Guide for more information.

Usage

Load the embeddings handler for the model you want to use.

from chonkie import AutoEmbeddings

# Get the embeddings handler for SentenceTransformer
embeddings = AutoEmbeddings.get_embeddings("all-MiniLM-L6-v2")

# Get the embeddings handler for OpenAI
embeddings = AutoEmbeddings.get_embeddings("text-embedding-3-large")

# Get the embeddings handler for Model2Vec
embeddings = AutoEmbeddings.get_embeddings("minishlab/potion-base-8M")

After loading the embeddings handler, you can use it in the same way you would use any other embeddings handler.

from chonkie import SemanticChunker
chunker = SemanticChunker(embeddings=embeddings, similarity_threshold=0.7)

# Chunk the text
chunks = chunker(text)

SemanticChunkers interally call upon the AutoEmbeddings class to get the embeddings handler. So you can directly pass in a string to the embeddings parameter as well, as long as it matches one of the models supported by AutoEmbeddings, and its dependencies are installed.

Method: get_embeddings

The get_embeddings method is a factory method that returns an instance of the appropriate embeddings handler.

model_name
str

The name of the embeddings model to use.