RecursiveChunker
Recursively chunk documents into smaller chunks.
The RecursiveChunker splits a document into progressively smaller chunks, applying one level of splitting rules at a time. It works well for long, well-structured documents such as books or research papers.
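The recursive idea described above can be sketched in plain Python. This is a minimal illustration, not Chonkie's implementation: the function name `recursive_split` and the delimiter list are hypothetical. It measures size in characters rather than tokens, and it drops the delimiters it splits on.

```python
from typing import List, Sequence


def recursive_split(text: str, delimiters: Sequence[str], max_chars: int) -> List[str]:
    """Split on the first delimiter, then recurse on oversized pieces with the rest."""
    # Stop when the text already fits, or when no finer delimiter is left.
    if len(text) <= max_chars or not delimiters:
        return [text]

    head, rest = delimiters[0], delimiters[1:]
    chunks: List[str] = []
    for piece in text.split(head):
        if not piece.strip():
            continue  # skip empty fragments produced by repeated delimiters
        chunks.extend(recursive_split(piece, rest, max_chars))
    return chunks


document = "Intro paragraph.\n\nFirst sentence is here. Second sentence follows."
# Paragraph breaks are tried first, then sentence boundaries.
pieces = recursive_split(document, delimiters=["\n\n", ". "], max_chars=30)
for piece in pieces:
    print(repr(piece))
```

The short first paragraph is kept whole, while the longer second paragraph is split again at the sentence level. A real chunker would typically also merge small neighbouring pieces back together up to the size limit, which this sketch omits.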
Installation
The RecursiveChunker is included in the base installation of Chonkie. No additional dependencies are required.
Initialization
Parameters
Tokenizer to use. Can be a string identifier or a tokenizer instance
Maximum number of tokens per chunk
Rules to use for chunking.
Minimum number of characters per chunk
Usage
Single Text Chunking
Batch Chunking
Using as a Callable
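The three usage modes listed here (single text, batch, and callable) follow a common pattern: the callable form dispatches to the single or batch method depending on the input type. The sketch below shows that pattern with a hypothetical stand-in class, `ToyChunker`, which cuts fixed-width character chunks. The method names `chunk` and `chunk_batch` are assumptions modelled on the headings above, so check the library reference for the actual signatures.

```python
from typing import List, Union


class ToyChunker:
    """Hypothetical stand-in (not Chonkie's class): fixed-width character chunks."""

    def __init__(self, chunk_size: int = 10) -> None:
        self.chunk_size = chunk_size

    def chunk(self, text: str) -> List[str]:
        # Single text in, flat list of chunks out.
        step = self.chunk_size
        return [text[i:i + step] for i in range(0, len(text), step)]

    def chunk_batch(self, texts: List[str]) -> List[List[str]]:
        # List of texts in, one list of chunks per text out.
        return [self.chunk(text) for text in texts]

    def __call__(self, text_or_texts: Union[str, List[str]]):
        # Calling the object directly picks the right method for the input.
        if isinstance(text_or_texts, str):
            return self.chunk(text_or_texts)
        return self.chunk_batch(text_or_texts)


chunker = ToyChunker(chunk_size=4)
single = chunker("abcdefghij")
batch = chunker(["abcde", "xy"])
```

The design choice is convenience: callers that do not care whether they hold one document or many can use the same call site.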
Return Type
The RecursiveChunker returns chunks as RecursiveChunk objects with additional sentence metadata.
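To give a feel for what a chunk object of this kind might carry, here is a hedged sketch. The class `ChunkSketch`, the helper `locate`, and every field name are assumptions made for illustration, and the `level` field (the recursion depth that produced the chunk) is a guess at the sort of metadata involved. Consult the library reference for the real RecursiveChunk attributes.

```python
from dataclasses import dataclass


@dataclass
class ChunkSketch:
    """Hypothetical chunk record; field names are illustrative assumptions."""
    text: str          # the chunk's content
    start_index: int   # offset of the first character in the source document
    end_index: int     # offset one past the last character
    token_count: int   # size of the chunk (whitespace-separated words here)
    level: int         # recursion depth at which the chunk was produced


def locate(source: str, piece: str, level: int, search_from: int = 0) -> ChunkSketch:
    """Build a ChunkSketch for `piece` by finding its position in `source`."""
    start = source.index(piece, search_from)
    return ChunkSketch(
        text=piece,
        start_index=start,
        end_index=start + len(piece),
        token_count=len(piece.split()),
        level=level,
    )


doc = "alpha beta. gamma"
chunk = locate(doc, "gamma", level=1)
```

Keeping character offsets alongside the text lets downstream code map a chunk back to its exact span in the original document.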
Additional Information
The RecursiveChunker uses the RecursiveRules class to determine the chunking rules. The rules are a list of RecursiveLevel objects, which define the delimiters and whitespace rules for each level of the recursive tree.
You can pass in custom rules to the RecursiveChunker, or use the default rules. The default rules are designed to be a good starting point for most documents, but you can customize them to your needs.
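The shape of such a rule list can be sketched as follows. This is an illustration under stated assumptions, not the library's classes: `LevelSketch` and `split_at_level` are hypothetical names, and the fields `delimiters` and `whitespace` simply mirror the description above of levels that define delimiters and whitespace rules.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class LevelSketch:
    """Hypothetical stand-in for one level of rules."""
    delimiters: List[str] = field(default_factory=list)
    whitespace: bool = False


def split_at_level(text: str, level: LevelSketch) -> List[str]:
    """Apply a single level: split on whitespace, or on any listed delimiter."""
    if level.whitespace:
        return text.split()
    if not level.delimiters:
        return [text]
    pattern = "|".join(re.escape(d) for d in level.delimiters)
    return [part.strip() for part in re.split(pattern, text) if part.strip()]


# Custom rules, coarsest first: paragraphs, then sentences, then words.
custom_rules = [
    LevelSketch(delimiters=["\n\n"]),
    LevelSketch(delimiters=[". ", "! ", "? "]),
    LevelSketch(whitespace=True),
]

sample = "Heading text\n\nOne idea. Another idea! Done"
paragraphs = split_at_level(sample, custom_rules[0])
sentences = split_at_level(paragraphs[1], custom_rules[1])
words = split_at_level(sentences[0], custom_rules[2])
```

Ordering the levels from coarse to fine means a chunker only falls back to sentence or word boundaries when a larger unit does not fit, which tends to keep related text together.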