
Chonkie’s Chunking API is an enterprise-grade text processing service built on Chonkie OSS. Get instant access to all of our chunking algorithms, embedding generation, and text refinement, available through simple REST APIs. No chunking logic to write, tokenizers to configure, or edge cases to debug. Just perfect chunks, delivered fast.
Key Features
All Our Chunkers
Use any of our awesome chunkers to split your data.
Each optimized for different document types and use cases.
Embeddings Refinery
Add vector embeddings to your chunks using any Hugging Face or OpenAI model.
Supports all major embedding providers.
Overlap Refinery
Add contextual overlap between chunks to prevent information loss at
boundaries. Configurable overlap sizes for optimal retrieval quality.
Multi-Language SDKs
Official Python and JavaScript/TypeScript SDKs with
full type safety and environment variable support.
Production-Ready
Battle-tested algorithms refined through thousands of real-world
deployments. Used by startups to enterprises for mission-critical RAG
systems.
Auto-Scaling
Automatically scales to handle your workload. From prototyping with single
documents to production pipelines processing millions of chunks.
Why Use Hosted Chunking?
Building chunking from scratch means dealing with:
- Algorithm complexity: Implementing semantic chunking, code parsing, recursive splitting
- Tokenizer headaches: Managing multiple tokenizers, handling edge cases, counting accurately
- Performance optimization: Caching, batching, parallelization, memory management
- Maintenance burden: Updating dependencies, fixing bugs, handling new file formats
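To make the tokenizer point concrete, here is a toy comparison (not Chonkie code; the whitespace "tokenizer" is a stand-in for a real BPE tokenizer) showing why chunk sizes must be counted in tokens rather than characters:

```python
# Toy illustration of one chunking edge case: a character-based limit can
# split mid-word, while a token-based limit keeps whole units intact.
# The whitespace split below is a stand-in for a real tokenizer (e.g. BPE).

def char_chunks(text: str, max_chars: int) -> list[str]:
    """Naive fixed-width character chunking: may cut words in half."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def token_chunks(text: str, max_tokens: int) -> list[str]:
    """Token-count chunking: splits only on token boundaries."""
    tokens = text.split()  # stand-in tokenizer
    return [" ".join(tokens[i:i + max_tokens])
            for i in range(0, len(tokens), max_tokens)]

text = "semantic chunking preserves meaning across boundaries"
print(char_chunks(text, 10))  # cuts "chunking" and other words apart
print(token_chunks(text, 3))  # every chunk holds whole words
```

Real tokenizers add further wrinkles (special tokens, multi-byte characters, model-specific vocabularies), which is exactly the bookkeeping the hosted API absorbs.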
Instant Integration
Add chunking to your application in minutes with simple REST APIs or SDKs.
No complex setup or configuration required.
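As a rough sketch of what such an integration looks like with only the standard library: the endpoint path, header, and field names below are assumptions for illustration, so check the API reference for the real request contract.

```python
# Sketch of a chunking request over REST (stdlib only). The endpoint URL
# and the payload fields "chunker" and "chunk_size" are hypothetical --
# consult the API reference for the actual schema.
import json
import urllib.request

payload = {
    "text": "Long document text to be split into chunks...",
    "chunker": "recursive",   # assumed parameter: which chunker to run
    "chunk_size": 512,        # assumed parameter: target size in tokens
}

req = urllib.request.Request(
    "https://api.chonkie.ai/v1/chunk",  # hypothetical endpoint
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
)
# chunks = json.load(urllib.request.urlopen(req))  # network call omitted here
```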
Battle-Tested Algorithms
Proven chunking strategies refined through thousands of production
deployments across diverse document types and use cases.
Always Up-to-Date
Automatic updates with new chunkers, optimizations, and bug fixes. You get
improvements without changing a line of code.
Use Cases
RAG Pipelines
Preprocess documents for retrieval augmented generation. Chunk once, embed,
and store in your vector database for lightning-fast retrieval.
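The chunk-once, embed, store, retrieve flow can be sketched end to end. Everything here is a toy stand-in: the bag-of-words "embedding" and in-memory list replace a real embedding model and vector database.

```python
# Toy RAG preprocessing sketch: chunk once, embed, store, retrieve.
# Bag-of-words vectors and a Python list stand in for a real embedding
# model and vector database, purely for illustration.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: word-count vector (stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

chunks = [
    "Chunking splits documents into retrieval-sized pieces",
    "Embeddings map text into a vector space",
    "Vector databases index embeddings for fast search",
]
store = [(c, embed(c)) for c in chunks]  # chunk once, embed, store

query = embed("how do embeddings map text")
best = max(store, key=lambda item: cosine(query, item[1]))
print(best[0])  # -> "Embeddings map text into a vector space"
```

In production the `chunks` list would come from the Chunking API and `store` would be your vector database; the shape of the pipeline stays the same.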
Document Analysis
Break down large documents for analysis, summarization, or classification.
Process documents too large for single LLM calls.
Semantic Search
Create search indices over documentation, support articles, or knowledge
bases. Combine with embeddings for powerful semantic search.
Code Indexing
Index codebases for AI-powered code search, documentation generation, or
code review tools. Language-aware chunking preserves context.
Content Processing
Process blogs, articles, and content at scale. Perfect for content
recommendation systems or editorial tools.
Refineries: Enhance Your Chunks
Add Embeddings
Generate vector embeddings for your chunks using any Hugging Face model.
Supports all major providers: OpenAI, Cohere, Sentence Transformers, and
more.
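A request to the embeddings refinery might look like the following. The field names and refinery identifier are assumptions for illustration; only the idea of passing chunks plus a model id comes from the description above.

```python
# Hypothetical embeddings-refinery request body. Field names ("refineries",
# "embedding_model") are assumed for illustration; any Hugging Face or
# OpenAI model identifier could be supplied per the feature description.
refinery_request = {
    "chunks": ["first chunk text", "second chunk text"],
    "refineries": [
        {
            "type": "embeddings",  # assumed refinery name
            "embedding_model": "sentence-transformers/all-MiniLM-L6-v2",
        },
    ],
}
# Each returned chunk would then carry an added vector, e.g.:
# {"text": "first chunk text", "embedding": [0.01, -0.23, ...]}
```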
Add Overlap
Add contextual overlap between chunks post-processing. Perfect for adding
overlap to existing chunks or experimenting with different overlap sizes.
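Conceptually, the overlap refinery prefixes each chunk with the tail of its predecessor. A minimal word-level sketch (the real service operates on tokens):

```python
# Minimal sketch of what an overlap refinery does: carry the tail of each
# chunk into the next one so boundary context is not lost. Word-level
# overlap is a simplification; the hosted refinery counts tokens.

def add_overlap(chunks: list[str], overlap_words: int) -> list[str]:
    out = [chunks[0]]
    for prev, cur in zip(chunks, chunks[1:]):
        tail = " ".join(prev.split()[-overlap_words:])
        out.append(f"{tail} {cur}")
    return out

chunks = ["alpha beta gamma", "delta epsilon zeta", "eta theta iota"]
print(add_overlap(chunks, 1))
# ['alpha beta gamma', 'gamma delta epsilon zeta', 'zeta eta theta iota']
```

Because this is pure post-processing, it works on chunks from any source, which is why it suits experimenting with different overlap sizes on existing chunks.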