The TurbopufferHandshake class provides seamless integration between Chonkie’s chunking system and Turbopuffer, a high-performance vector database.

Embed and store your Chonkie chunks in Turbopuffer without ever leaving the Chonkie SDK.

The Turbopuffer Handshake requires a Turbopuffer API key. You can get one by signing up for a Turbopuffer account.

Installation

Before using the Turbopuffer handshake, make sure to install the required dependencies:

pip install chonkie[turbopuffer]

Basic Usage

Initialization

from chonkie import TurbopufferHandshake

# Initialize with default settings (requires TURBOPUFFER_API_KEY environment variable)
handshake = TurbopufferHandshake()

# Or provide an API key directly
handshake = TurbopufferHandshake(api_key="your_turbopuffer_api_key")

# Use a specific namespace
handshake = TurbopufferHandshake(namespace_name="my_documents")

# Or use an existing Turbopuffer namespace
import turbopuffer as tpuf
ns = tpuf.Namespace("existing_namespace")
handshake = TurbopufferHandshake(namespace=ns)

Writing Chunks to Turbopuffer

from chonkie import TurbopufferHandshake, SemanticChunker

handshake = TurbopufferHandshake(namespace_name="my_documents")

chunker = SemanticChunker()
chunks = chunker("Chonkie chunks, turbopuffer puffs!")

handshake.write(chunks)

Parameters

namespace
Optional[tpuf.Namespace]
default:"None"

An existing Turbopuffer Namespace instance to use. If not provided, a new namespace will be created.

namespace_name
Union[str, Literal['random']]
default:"random"

Name of the namespace to use. If “random”, a unique name will be generated. Only used if namespace parameter is not provided.

embedding_model
Union[str, BaseEmbeddings]
default:"minishlab/potion-retrieval-32M"

Embedding model to use. Can be a model name or a BaseEmbeddings instance.

api_key
Optional[str]
default:"None"

Turbopuffer API key. If not provided, will look for TURBOPUFFER_API_KEY environment variable.

Authentication

You can authenticate with Turbopuffer in one of two ways:

  1. Environment Variable (Recommended for development):

    export TURBOPUFFER_API_KEY='your-api-key-here'
    
  2. Directly in code (Not recommended for production):

    handshake = TurbopufferHandshake(api_key="your-api-key-here")
    

For production environments, it’s recommended to use environment variables or a secure secret management system to handle your API keys.