The ElasticHandshake class connects Chonkie's chunking pipeline to Elasticsearch so you can use its vector search. It embeds your Chonkie chunks and stores them in an Elasticsearch index without leaving the Chonkie SDK. The handshake creates the index and the required vector field mapping automatically.
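
To make "vector field mapping" concrete, here is a minimal sketch of the kind of index body Elasticsearch expects for a dense_vector field. It is not taken from Chonkie's source: the field names `text` and `embedding`, the helper `build_vector_mapping`, and the dimension used below are all assumptions for illustration, and the mapping the handshake actually creates may differ.

```python
import json


def build_vector_mapping(dims: int, similarity: str = "cosine") -> dict:
    """Build an index body with one text field and one dense_vector field.

    The field names are hypothetical; only the dense_vector options
    (type, dims, index, similarity) are standard Elasticsearch settings.
    """
    if dims <= 0:
        raise ValueError("dims must be a positive integer")
    return {
        "mappings": {
            "properties": {
                # Raw chunk text, stored for retrieval alongside the vector.
                "text": {"type": "text"},
                # Embedding vector; dims must match the embedding model output.
                "embedding": {
                    "type": "dense_vector",
                    "dims": dims,
                    "index": True,
                    "similarity": similarity,
                },
            }
        }
    }


# 512 is an arbitrary example dimension, not a statement about any model.
print(json.dumps(build_vector_mapping(dims=512), indent=2))
```

The point of the sketch is that the vector dimension is fixed when the index is created, so an index built for one embedding model generally cannot be reused with a model of a different output size.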

Installation

Before using the Elasticsearch handshake, make sure to install the required dependencies:
pip install "chonkie[elastic]"

Basic Usage

Initialization

from chonkie import ElasticHandshake

# Connects to http://localhost:9200 by default
handshake = ElasticHandshake()

Parameters

client (Optional[Elasticsearch], default: None)
An existing elasticsearch.Elasticsearch client instance. If not provided, a new client will be created based on other parameters.

index_name (Union[str, Literal['random']], default: "random")
Name of the Elasticsearch index to use. If "random", a unique name will be generated.

embedding_model (Union[str, BaseEmbeddings], default: "minishlab/potion-retrieval-32M")
The embedding model to use for creating vectors. Can be a model name from Hugging Face or a BaseEmbeddings instance.

hosts (Optional[Union[str, List[str]]], default: None)
The URL(s) of the Elasticsearch instance(s) to connect to.

cloud_id (Optional[str], default: None)
The Cloud ID for connecting to an Elastic Cloud deployment.

api_key (Optional[str], default: None)
The API key for authenticating with Elasticsearch, commonly used for Elastic Cloud.
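
Since cloud_id, hosts, and api_key are usually deployment-specific, one way to keep them out of source code is to read them from the environment and pass the result to the constructor as keyword arguments. The sketch below is only an illustration: the helper `handshake_kwargs` and the variable names `ELASTIC_CLOUD_ID`, `ELASTIC_HOSTS`, and `ELASTIC_API_KEY` are hypothetical, and preferring cloud_id over hosts is a choice made here, not documented handshake behavior.

```python
import os
from typing import Any, Dict, Mapping, Optional


def handshake_kwargs(
    env: Optional[Mapping[str, str]] = None,
    index_name: str = "random",
) -> Dict[str, Any]:
    """Collect ElasticHandshake keyword arguments from environment variables."""
    env = os.environ if env is None else env
    kwargs: Dict[str, Any] = {"index_name": index_name}

    cloud_id = env.get("ELASTIC_CLOUD_ID")  # hypothetical variable name
    hosts = env.get("ELASTIC_HOSTS")        # hypothetical, comma-separated URLs
    api_key = env.get("ELASTIC_API_KEY")    # hypothetical variable name

    if cloud_id:
        # Assumption of this sketch: a Cloud ID takes priority over host URLs.
        kwargs["cloud_id"] = cloud_id
    elif hosts:
        kwargs["hosts"] = [h.strip() for h in hosts.split(",") if h.strip()]
    if api_key:
        kwargs["api_key"] = api_key
    return kwargs


example_env = {"ELASTIC_HOSTS": "http://localhost:9200", "ELASTIC_API_KEY": "secret"}
print(handshake_kwargs(example_env, index_name="my_documents"))
```

The returned dictionary uses only parameter names from the list above, so it can be unpacked directly, as in ElasticHandshake(**handshake_kwargs()).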

Writing Chunks to Elasticsearch

from chonkie import ElasticHandshake, SentenceChunker

# Initialize the handshake for your deployment
handshake = ElasticHandshake(
    cloud_id="YOUR_CLOUD_ID",
    api_key="YOUR_API_KEY",
    index_name="my_documents",
)

# Create some chunks
chunker = SentenceChunker()
chunks = chunker.chunk("Chonkie uses the bulk API for efficient indexing. It's fast and reliable!")

# Write chunks to Elasticsearch
handshake.write(chunks)
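
The sample sentence above mentions the bulk API. As a rough picture of what bulk indexing involves, the sketch below turns chunk texts and their vectors into action dictionaries in the `_index` / `_id` / `_source` shape accepted by the bulk helper in the official Python client. This is not Chonkie's internal code: the function `to_bulk_actions`, the field names `text` and `embedding`, and the deterministic ID scheme are assumptions made for illustration.

```python
import uuid
from typing import Iterator, List, Sequence


def to_bulk_actions(
    index_name: str,
    texts: List[str],
    vectors: List[Sequence[float]],
) -> Iterator[dict]:
    """Yield one bulk action per (text, vector) pair."""
    if len(texts) != len(vectors):
        raise ValueError("texts and vectors must have the same length")
    for text, vector in zip(texts, vectors):
        yield {
            "_index": index_name,
            # Deterministic ID (an assumption of this sketch): writing the
            # same text to the same index twice overwrites the document
            # instead of creating a duplicate.
            "_id": str(uuid.uuid5(uuid.NAMESPACE_URL, f"{index_name}/{text}")),
            "_source": {"text": text, "embedding": list(vector)},
        }


# Toy vectors stand in for real embeddings.
actions = list(
    to_bulk_actions(
        "my_documents",
        ["It's fast and reliable!"],
        [[0.1, 0.2, 0.3]],
    )
)
print(actions[0]["_source"])
```

Sending many documents in one bulk request avoids a network round trip per chunk, which is the usual reason bulk indexing is faster than indexing documents one at a time.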

Searching Chunks in Elasticsearch

You can retrieve the most similar chunks from your Elasticsearch index using the search method, which performs a k-Nearest Neighbor (kNN) vector search.
from chonkie import ElasticHandshake

# Initialize the handshake to connect to your index
handshake = ElasticHandshake(
    cloud_id="YOUR_CLOUD_ID",
    api_key="YOUR_API_KEY",
    index_name="my_documents",
)

results = handshake.search(query="fast and efficient indexing", limit=2)
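
To illustrate what a k-Nearest Neighbor vector search computes, here is a small brute-force version in plain Python: score every stored vector against the query vector by cosine similarity and keep the top `limit` results. It is a conceptual sketch only. The functions `cosine_similarity` and `knn_search` and the toy three-dimensional vectors are made up for this example, and Elasticsearch typically answers kNN queries with an approximate index rather than a full scan like this one.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_search(
    query_vector: Sequence[float],
    stored: List[Tuple[str, Sequence[float]]],
    limit: int = 2,
) -> List[Tuple[str, float]]:
    """Return the `limit` stored texts most similar to the query vector."""
    scored = [(text, cosine_similarity(query_vector, vec)) for text, vec in stored]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy vectors stand in for real embeddings of each chunk.
stored = [
    ("fast bulk indexing", [0.9, 0.1, 0.0]),
    ("slow manual uploads", [0.1, 0.9, 0.0]),
    ("reliable and quick writes", [0.8, 0.2, 0.1]),
]

for text, score in knn_search([1.0, 0.0, 0.0], stored, limit=2):
    print(f"{score:.3f}  {text}")
```

In the real search call, the query string is first embedded with the configured embedding model, and the resulting vector plays the role of `query_vector` here.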