The MilvusHandshake class integrates Chonkie's chunking system with Milvus, an open-source vector database. It embeds your Chonkie chunks and stores them in a Milvus collection, creating the schema and index automatically, so you never have to leave the Chonkie SDK.

Installation

Before using the Milvus handshake, make sure to install the required dependencies:
pip install "chonkie[milvus]"

Basic Usage

Initialization

from chonkie import MilvusHandshake

# Connects to Milvus at http://localhost:19530 by default
handshake = MilvusHandshake()

Parameters

collection_name (Union[str, Literal['random']], default: "random")
The name of the Milvus collection to use. If "random", a unique name is generated.

embedding_model (Union[str, BaseEmbeddings], default: "minishlab/potion-retrieval-32M")
The embedding model to use for creating vectors.

uri (Optional[str], default: None)
The full URI to connect to Milvus. This is the preferred method for specifying connection details.

host (str, default: "localhost")
The host of the Milvus instance. Used if uri is not provided.

port (str, default: "19530")
The port of the Milvus instance. Used if uri is not provided.

alias (str, default: "default")
The connection alias to use for this Milvus connection.
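
The precedence between uri, host, and port described above can be sketched in plain Python. This is only an illustration of the documented behavior, not Chonkie's internal code: the helper name resolve_milvus_uri is hypothetical, and the http:// scheme is an assumption taken from the default address shown in the initialization example.

```python
from typing import Optional


def resolve_milvus_uri(
    uri: Optional[str] = None,
    host: str = "localhost",
    port: str = "19530",
) -> str:
    """Hypothetical helper: pick the address a handshake would connect to.

    Mirrors the documented rule that `uri` is preferred and that
    `host`/`port` are only used when `uri` is not provided.
    """
    if uri:
        # An explicit URI always wins.
        return uri
    # Assumption: plain HTTP, matching the documented default address.
    return f"http://{host}:{port}"


print(resolve_milvus_uri())
print(resolve_milvus_uri(host="milvus.internal", port="19531"))
print(resolve_milvus_uri(uri="http://example.test:19530", host="ignored"))
```

With no arguments the sketch falls back to the documented default, http://localhost:19530.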

Writing Chunks to Milvus

from chonkie import MilvusHandshake, SentenceChunker

# Initialize the handshake for your deployment
handshake = MilvusHandshake(
    uri="http://localhost:19530",
    collection_name="my_documents",
)

# Create some chunks
chunker = SentenceChunker()
chunks = chunker.chunk("Milvus stores data in collections. Chonkie makes ingestion easy!")

# Write chunks to the Milvus collection
handshake.write(chunks)
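
For large documents you may prefer to write chunks in smaller groups rather than in one call. The following is a minimal sketch of that idea under stated assumptions: the batched helper and the batch size of 500 are hypothetical and not part of Chonkie, and the only thing relied on is that write accepts a list of chunks, as in the example above.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Hypothetical helper: yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Usage with a handshake (commented out because it needs a running Milvus instance):
# for batch in batched(chunks, 500):
#     handshake.write(batch)
```

Whether batching helps depends on your deployment; for small inputs a single write call is simpler.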

Searching Chunks in Milvus

You can retrieve the most similar chunks from your Milvus collection using the search method.

from chonkie import MilvusHandshake

# Initialize the handshake to connect to your collection
handshake = MilvusHandshake(
    uri="http://localhost:19530",
    collection_name="my_documents",
)

# Retrieve the 2 chunks most similar to the query text
results = handshake.search(query="easy data ingestion", limit=2)
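
This page does not show the shape of the search results. As a rough sketch, assuming each result is a dictionary with a "text" field and a numeric "score" field (both key names are assumptions, so check what your version actually returns), the hits could be printed like this. The sample data is made up for illustration and is not real search output.

```python
from typing import Any, Dict, List


def format_results(results: List[Dict[str, Any]]) -> List[str]:
    """Hypothetical helper: turn search hits into ranked, printable lines.

    Assumes each hit is a dict that may carry "text" and "score" keys;
    missing keys are tolerated rather than treated as errors.
    """
    lines = []
    for rank, hit in enumerate(results, start=1):
        text = hit.get("text", "")
        score = hit.get("score")
        score_str = f"{score:.3f}" if isinstance(score, (int, float)) else "n/a"
        lines.append(f"{rank}. [{score_str}] {text}")
    return lines


# Made-up sample hits standing in for the value returned by handshake.search(...)
sample = [
    {"text": "Chonkie makes ingestion easy!", "score": 0.91},
    {"text": "Milvus stores data in collections."},
]

for line in format_results(sample):
    print(line)
```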