Chonkie’s Release Notes and Updates 🦛✨
- `SlumberChunker`: Welcome Chonkie’s very own agentic chunker! Requires the `genie` optional install and a `GEMINI_API_KEY`. It leverages `Genie`, Chonkie’s interface for generative models.
- `NeuralChunker`: Introducing a fully neural approach to chunking! Requires the `neural` optional install. This uses a fine-tuned BERT-like model for fast, high-quality chunking.
- `auto` Language Detection for `CodeChunker`: `CodeChunker` can now automatically detect the programming language. Specify the language manually if performance is critical.
- `Genie`s: Added `Genie` to power `SlumberChunker` and future generative features. `Genie`s are Chonkie’s way to handle multiple generative APIs and model interfaces. The first is `GeminiGenie`, requiring the `genie` optional install.

Full Changelog: https://github.com/chonkie-inc/chonkie/compare/v1.0.5...v1.0.6
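Since `SlumberChunker` depends on both an optional install and a `GEMINI_API_KEY`, it can help to check for the key before building the chunker. The snippet below is a minimal sketch, not part of Chonkie: `require_gemini_key` is a hypothetical helper, and the install commands in the comments assume the optional installs are exposed as standard pip extras named after the `genie` and `neural` installs mentioned above.

```python
import os

# Assumed install commands (standard pip "extras" syntax; the extra names
# come from the notes above and are an assumption, not verified here):
#   pip install "chonkie[genie]"
#   pip install "chonkie[neural]"


def require_gemini_key(env=None):
    """Return the GEMINI_API_KEY value, or raise a clear error if it is missing.

    Hypothetical helper for illustration only; it is not a Chonkie function.
    """
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set; export it before using SlumberChunker."
        )
    return key
```

Calling `require_gemini_key()` at startup fails fast with a readable message instead of surfacing an authentication error in the middle of a chunking run.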
This is a quick patch release to include `CodeChunker` in the `__init__.py` for `chonkie` so it can be properly accessed via `from chonkie import CodeChunker`.

Full Changelog: https://github.com/chonkie-inc/chonkie/compare/v1.0.4...v1.0.5
- `CodeChunker`: Introducing the `CodeChunker`, specialized for handling code files across 100+ programming languages. It understands code structure to provide more meaningful chunks.
- JinaAI Embeddings Support: Added `JinaEmbeddings`, enabling their use with `SemanticChunker` and `SDPMChunker`. Just install the `jina` optional install to use it!
- `OverlapRefinery`: Enhance your chunks by adding overlapping context using the new `OverlapRefinery`. It’s included in the default install and works seamlessly with any chunker.
- `EmbeddingsRefinery`: Compute and attach embeddings directly to your chunks using the `EmbeddingsRefinery`. Streamline the process of loading chunks into vector databases.

Full Changelog: https://github.com/chonkie-inc/chonkie/compare/v1.0.3...v1.0.4
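To make the idea of "overlapping context" from the `OverlapRefinery` entry concrete, here is a minimal plain-Python sketch. It is not the `OverlapRefinery` implementation or its API: `add_suffix_overlap` is a hypothetical function, and it assumes chunks are plain strings and that overlap is measured in whitespace-separated words.

```python
def add_suffix_overlap(chunks, overlap_words=3):
    """Prefix each chunk (except the first) with the tail of the previous chunk.

    Illustrative only: chunks are plain strings and overlap is counted in
    whitespace-separated words, which is an assumption made for this sketch.
    """
    refined = []
    for i, chunk in enumerate(chunks):
        if i == 0 or overlap_words <= 0:
            refined.append(chunk)
            continue
        # Take the context from the ORIGINAL previous chunk so that overlap
        # does not accumulate from one chunk to the next.
        context = " ".join(chunks[i - 1].split()[-overlap_words:])
        refined.append(f"{context} {chunk}" if context else chunk)
    return refined


chunks = ["the quick brown fox", "jumps over the", "lazy dog"]
refined = add_suffix_overlap(chunks, overlap_words=2)
# refined[1] now starts with "brown fox", the last two words of chunks[0].
```

Carrying a little trailing context into the next chunk gives a retriever some continuity when a sentence or idea straddles a chunk boundary.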
- Chonkie Visualizer: Visualize and debug chunks easily via terminal printouts or HTML saves. Understand chunk quality and debug your chunker with visual feedback~ Use the `print` method to print rich text in your terminal, or use the `save` method to save a highlighted `html` file on your device! It’s very simple to use, just pass in your chunks~
- Recipes: Chonkie now supports `Recipes`, which let you use multilingual chunking out-of-the-box, as well as document-specific chunking methods. Initial support starts with `en`, `hi`, `zh`, `jp` and `ko`, while the document type `markdown` is supported too. Use it via the `from_recipe` class method with any chunker that takes delimiters or `RecursiveRules`.
- Performance enhancements in `RecursiveChunker`, `SentenceChunker`, and `WordTokenizer`.

Full Changelog: https://github.com/chonkie-inc/chonkie/compare/v1.0.2...v1.0.3
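The Recipes entry above hinges on the fact that sentence delimiters differ between languages. The sketch below illustrates that idea in plain Python; it is not how Chonkie stores or loads recipes. The `SENTENCE_DELIMITERS` table and `split_on_delimiters` function are hypothetical, and the delimiter sets are illustrative examples rather than the contents of any actual recipe.

```python
# Hypothetical, illustrative delimiter sets; NOT the contents of Chonkie's recipes.
SENTENCE_DELIMITERS = {
    "en": {".", "!", "?"},
    "hi": {"।", "!", "?"},   # Devanagari danda ends a Hindi sentence
    "zh": {"。", "！", "？"},  # full-width CJK punctuation
}


def split_on_delimiters(text, lang="en"):
    """Split text into sentences, keeping each delimiter with its sentence."""
    delimiters = SENTENCE_DELIMITERS[lang]
    pieces, current = [], ""
    for ch in text:
        current += ch
        if ch in delimiters:
            pieces.append(current.strip())
            current = ""
    # Keep any trailing text that has no closing delimiter.
    if current.strip():
        pieces.append(current.strip())
    return [p for p in pieces if p]


english = split_on_delimiters("Hello world. How are you? Fine!", lang="en")
chinese = split_on_delimiters("你好。今天天气很好！", lang="zh")
```

Using the English delimiter set on the Chinese sample would return the whole string as a single piece, which is the kind of mismatch a language-specific recipe is meant to avoid.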