The PgvectorHandshake class connects Chonkie’s chunking system to PostgreSQL with the pgvector extension. It is built on vecs, Supabase’s Python client for pgvector, and adds a higher-level API with automatic indexing, metadata filtering, and simplified connection management.

Store your Chonkie chunks in PostgreSQL with vector embeddings and perform semantic search without ever leaving the Chonkie SDK.

Installation

Before using the Pgvector handshake, make sure to install the required dependencies:

pip install "chonkie[pgvector]"

You’ll also need a PostgreSQL server with the pgvector extension installed. Enable the extension in each database you plan to use:

-- Connect to your database and enable pgvector
CREATE EXTENSION IF NOT EXISTS vector;
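
To confirm the extension is active before creating a handshake, you can query the pg_extension catalog. The sketch below is not part of Chonkie: pgvector_enabled is a hypothetical helper that accepts any DB-API 2.0 cursor connected to your database (for example, one obtained from a PostgreSQL driver you already use).

```python
def pgvector_enabled(cursor) -> bool:
    """Return True if the pgvector extension is enabled in the current database.

    `cursor` is any DB-API 2.0 cursor connected to PostgreSQL (hypothetical helper).
    """
    # pgvector registers itself in the pg_extension catalog under the name "vector".
    cursor.execute("SELECT 1 FROM pg_extension WHERE extname = 'vector'")
    return cursor.fetchone() is not None
```

If this returns False, run the CREATE EXTENSION statement above with a role that has permission to create extensions.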

Initialization

from chonkie import PgvectorHandshake

# Initialize with individual connection parameters
handshake = PgvectorHandshake(
    host="localhost",
    port=5432,
    database="your_database",
    user="your_user",
    password="your_password",
    collection_name="chonkie_chunks"
)

# Or use a connection string
handshake = PgvectorHandshake(
    connection_string="postgresql://user:password@localhost:5432/database"
)

# Or use an existing vecs client
import vecs
client = vecs.create_client("postgresql://user:password@localhost:5432/database")
handshake = PgvectorHandshake(client=client, collection_name="my_collection")
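
Hard-coded credentials like those above are fine for local experiments, but deployed code usually reads them from the environment. The following is a minimal sketch under stated assumptions: build_connection_string is a hypothetical helper (not part of Chonkie or vecs), and the variable names PGHOST, PGPORT, PGDATABASE, PGUSER, and PGPASSWORD are borrowed from libpq conventions. Credentials are percent-encoded so that characters such as @ or / in a password do not break the URL.

```python
import os
from urllib.parse import quote


def build_connection_string(env=os.environ) -> str:
    """Assemble a PostgreSQL URL from environment variables (hypothetical helper)."""
    host = env.get("PGHOST", "localhost")
    port = env.get("PGPORT", "5432")
    database = env.get("PGDATABASE", "postgres")
    # Percent-encode credentials so reserved URL characters survive intact.
    user = quote(env.get("PGUSER", "postgres"), safe="")
    password = quote(env.get("PGPASSWORD", "postgres"), safe="")
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"
```

The result can then be passed as connection_string=build_connection_string() when constructing PgvectorHandshake.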

Usage

Parameters

client
Optional[vecs.Client]
default:"None"

An existing vecs.Client instance. If provided, other connection parameters are ignored.

host
str
default:"localhost"

PostgreSQL host address.

port
int
default:"5432"

PostgreSQL port number.

database
str
default:"postgres"

PostgreSQL database name.

user
str
default:"postgres"

PostgreSQL username.

password
str
default:"postgres"

PostgreSQL password.

connection_string
Optional[str]
default:"None"

Full PostgreSQL connection string. If provided, the individual connection parameters (host, port, database, user, password) are ignored.

collection_name
str
default:"chonkie_chunks"

Name of the collection to store chunks in.

embedding_model
Union[str, BaseEmbeddings]
default:"minishlab/potion-retrieval-32M"

Embedding model to use. Can be a model name or a BaseEmbeddings instance.

vector_dimensions
Optional[int]
default:"None"

Number of dimensions for the vector embeddings. If not provided, it is inferred from the embedding model.
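
Because the three ways of supplying a connection overlap, it can help to see the documented precedence in one place: an existing client wins, then connection_string, then the individual fields. The function below is an illustrative model of that rule, written only from the parameter descriptions above. resolve_connection is a hypothetical name, and this is not Chonkie's actual implementation.

```python
from typing import Any, Optional, Tuple


def resolve_connection(
    client: Optional[Any] = None,
    connection_string: Optional[str] = None,
    host: str = "localhost",
    port: int = 5432,
    database: str = "postgres",
    user: str = "postgres",
    password: str = "postgres",
) -> Tuple[str, Any]:
    """Report which connection source would be used, together with its value."""
    if client is not None:
        # An existing client overrides every other connection parameter.
        return ("client", client)
    if connection_string is not None:
        # A full connection string overrides the individual fields.
        return ("connection_string", connection_string)
    # Otherwise fall back to the individual fields and their documented defaults.
    url = f"postgresql://{user}:{password}@{host}:{port}/{database}"
    return ("parameters", url)
```

In practice this means a host or port argument passed alongside a connection string has no effect, so it is safest to choose one style per handshake.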