The WeaviateHandshake class connects Chonkie’s chunking pipeline to Weaviate, a vector database. It embeds your Chonkie chunks and stores them in Weaviate without leaving the Chonkie SDK.

Installation

Before using the Weaviate handshake, install the required dependencies (the quotes keep shells such as zsh from interpreting the square brackets):
pip install "chonkie[weaviate]"

Basic Usage

Initialization

from chonkie import WeaviateHandshake

# Initialize with default settings (local Weaviate)
handshake = WeaviateHandshake()

# Or connect to a specific Weaviate server
handshake = WeaviateHandshake(url="http://localhost:8080", api_key="YOUR_API_KEY")

Parameters

client
Optional[weaviate.Client]
default:"None"
Weaviate client instance. If not provided, a new client will be created based on other parameters.
collection_name
Union[str, Literal['random']]
default:"random"
Name of the collection to use. If “random”, a unique name will be generated.
embedding_model
Union[str, BaseEmbeddings]
default:"minishlab/potion-retrieval-32M"
Embedding model to use. Can be a model name or a BaseEmbeddings instance.
url
Optional[str]
default:"None"
URL of the Weaviate server. If provided, will connect to this server.
api_key
Optional[str]
default:"None"
API key for Weaviate Cloud authentication.
auth_config
Optional[Dict[str, Any]]
default:"None"
OAuth configuration for authentication (optional).
batch_size
int
default:"100"
Batch size for batch operations.
batch_dynamic
bool
default:"True"
Whether to use dynamic batching.
batch_timeout_retries
int
default:"3"
Number of retries for batch timeouts.
additional_headers
Optional[Dict[str, str]]
default:"None"
Additional headers for the Weaviate client.
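
The connection parameters above are often kept out of source code. The following is a minimal sketch of one way to collect them from environment variables. The helper name handshake_kwargs_from_env and the variable names WEAVIATE_URL, WEAVIATE_API_KEY, and WEAVIATE_BATCH_SIZE are assumptions made for this example, not something Chonkie defines or reads on its own. Only url, api_key, batch_size, and collection_name come from the parameter list.

```python
import os
from typing import Any, Dict, Mapping, Optional


def handshake_kwargs_from_env(
    env: Optional[Mapping[str, str]] = None,
    collection_name: str = "random",
) -> Dict[str, Any]:
    """Build keyword arguments for WeaviateHandshake from environment variables.

    Hypothetical helper: the environment variable names used here are
    assumptions for this sketch, not names that Chonkie looks up itself.
    """
    env = os.environ if env is None else env
    kwargs: Dict[str, Any] = {"collection_name": collection_name}
    # Forward only the values that are set, so the documented defaults
    # (url=None, api_key=None, batch_size=100) apply to everything else.
    if env.get("WEAVIATE_URL"):
        kwargs["url"] = env["WEAVIATE_URL"]
    if env.get("WEAVIATE_API_KEY"):
        kwargs["api_key"] = env["WEAVIATE_API_KEY"]
    if env.get("WEAVIATE_BATCH_SIZE"):
        kwargs["batch_size"] = int(env["WEAVIATE_BATCH_SIZE"])
    return kwargs
```

With chonkie[weaviate] installed, the result could then be unpacked into the constructor, for example WeaviateHandshake(**handshake_kwargs_from_env(collection_name="my_documents")).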

Writing Chunks to Weaviate

from chonkie import WeaviateHandshake, SemanticChunker

# Initialize the handshake
handshake = WeaviateHandshake(
    url="YOUR_CLOUD_URL",
    api_key="YOUR_API_KEY",
    collection_name="my_documents"
)

# Create some chunks
chunker = SemanticChunker()
chunks = chunker.chunk("Chonkie loves to chonk your texts!")

# Write chunks to Weaviate
handshake.write(chunks)

Searching Chunks in Weaviate

You can retrieve the most similar chunks from your Weaviate collection using the search method:
from chonkie import WeaviateHandshake

# Initialize the handshake
handshake = WeaviateHandshake(
    url="YOUR_CLOUD_URL",
    api_key="YOUR_API_KEY",
    collection_name="my_documents"
)

# Retrieve the two most similar chunks and print each score with its text
results = handshake.search(query="chonk your texts", limit=2)
for result in results:
    print(result["score"], result["text"])
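
Search results often need a little post-processing before use. The sketch below filters and orders them, assuming each result is a dict with "score" and "text" keys (as in the loop above) and that a higher score means a closer match. That score ordering is an assumption worth checking against your own results. The helper top_results and the sample data are hypothetical and exist only for illustration.

```python
from typing import Any, Dict, List


def top_results(results: List[Dict[str, Any]], min_score: float) -> List[Dict[str, Any]]:
    """Keep results scoring at least min_score, highest score first.

    Assumes a higher "score" means a closer match; invert the comparison
    and the sort order if your scores are distances instead.
    """
    kept = [r for r in results if r["score"] >= min_score]
    return sorted(kept, key=lambda r: r["score"], reverse=True)


# Made-up results for illustration; real ones would come from handshake.search(...)
sample = [
    {"score": 0.42, "text": "an unrelated chunk"},
    {"score": 0.91, "text": "Chonkie loves to chonk your texts!"},
]
for result in top_results(sample, min_score=0.5):
    print(result["score"], result["text"])
```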