# Endpoints

> API reference for all Chonkie chunkers and refineries

Start the server and visit `http://localhost:8000/docs` for an interactive Swagger UI where you can try every endpoint directly in your browser.

## Response Format

All chunking endpoints return a list of chunk objects:

```json
[
  {
    "text": "chunk content",
    "start_index": 0,
    "end_index": 42,
    "token_count": 8
  }
]
```

Submit a **list of strings** instead of a single string to get back a **list of lists**, with one inner list per input document.

***

## Chunkers

### Token Chunker

`POST /v1/chunk/token`

Splits text into fixed-size token windows. The fastest and most predictable chunker.

```bash
curl -X POST http://localhost:8000/v1/chunk/token \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Your text here...",
    "chunk_size": 512,
    "chunk_overlap": 50
  }'
```

- Text or list of texts to chunk.
- Tokenizer to use. Options: `"character"`, `"gpt2"`, `"cl100k_base"`, or any HuggingFace tokenizer name.
- Maximum tokens per chunk.
- Token overlap between consecutive chunks.

***

### Sentence Chunker

`POST /v1/chunk/sentence`

Groups sentences into chunks while respecting a token-size limit. Preserves sentence boundaries: no mid-sentence splits.

```bash
curl -X POST http://localhost:8000/v1/chunk/sentence \
  -H "Content-Type: application/json" \
  -d '{
    "text": "First sentence. Second sentence. Third sentence.",
    "chunk_size": 256,
    "min_sentences_per_chunk": 2
  }'
```

- Text or list of texts to chunk.
- Tokenizer to use.
- Maximum tokens per chunk.
- Token overlap between chunks.
- Minimum sentences to include in each chunk.
- Minimum characters required to count as a sentence.
- Use approximate token counting for faster processing.
- Sentence delimiter(s).
- Attach the delimiter to the previous (`"prev"`) or next (`"next"`) sentence.

***

### Recursive Chunker

`POST /v1/chunk/recursive`

Splits text using a hierarchy of separators defined by a named recipe. Great for structured text like Markdown or code. Chunker instances are cached per `(recipe, lang, tokenizer)` for speed.

```bash
curl -X POST http://localhost:8000/v1/chunk/recursive \
  -H "Content-Type: application/json" \
  -d '{
    "text": "# Heading\n\nParagraph one.\n\nParagraph two.",
    "chunk_size": 256,
    "recipe": "markdown"
  }'
```

- Text or list of texts to chunk.
- Tokenizer to use.
- Maximum tokens per chunk.
- Named splitting recipe. Options: `"default"` (paragraph → sentence → word), `"markdown"`, `"python"`, `"js"`.
- Language hint for the recipe.
- Minimum characters to include in a chunk.

***

### Semantic Chunker

`POST /v1/chunk/semantic`

Splits where semantic similarity between adjacent sentences drops below a threshold. Produces topically coherent chunks. Requires the `semantic` extra.

```bash
curl -X POST http://localhost:8000/v1/chunk/semantic \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Dogs are loyal and friendly pets. Cats are independent animals. Quantum physics studies subatomic particles.",
    "embedding_model": "minishlab/potion-base-8M",
    "threshold": 0.5
  }'
```

- Text or list of texts to chunk.
- Sentence-embedding model for computing similarity. Any model compatible with `sentence-transformers` works.
- Cosine-similarity threshold for splitting (0.0–1.0). Lower values produce larger, fewer chunks.
- Maximum tokens per chunk.
- Number of surrounding sentences to consider when computing similarity.
- Minimum sentences per chunk.
- Minimum characters per sentence.

***

### Code Chunker

`POST /v1/chunk/code`

Splits source code at syntactic boundaries using AST parsing. Never breaks inside a function or class. Requires the `code` extra.
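To make "syntactic boundaries" concrete, here is a minimal sketch that lists the top-level statements of a Python snippet with the standard-library `ast` module. It is an illustration only: the helper name `top_level_spans` is hypothetical, and nothing here is taken from Chonkie's implementation, which may use a different parser and a different splitting strategy.

```python
import ast


def top_level_spans(source: str) -> list[tuple[str, int, int]]:
    """Return (node type, first line, last line) for each top-level statement.

    Hypothetical helper for illustration; not part of Chonkie.
    """
    tree = ast.parse(source)
    # Each entry in tree.body is one complete top-level construct, so a
    # splitter that cuts only between entries never lands inside a function.
    return [(type(node).__name__, node.lineno, node.end_lineno) for node in tree.body]


SAMPLE = 'def hello():\n    print("Hello")\n\ndef world():\n    print("World")\n'

print(top_level_spans(SAMPLE))
# [('FunctionDef', 1, 2), ('FunctionDef', 4, 5)]
```

A chunker built on boundaries like these can pack whole constructs into a chunk until the token limit is reached, instead of cutting at an arbitrary token offset.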
```bash
curl -X POST http://localhost:8000/v1/chunk/code \
  -H "Content-Type: application/json" \
  -d '{
    "text": "def hello():\n print(\"Hello\")\n\ndef world():\n print(\"World\")",
    "language": "python",
    "chunk_size": 100
  }'
```

- Source code or list of source code snippets to chunk.
- Tokenizer to use.
- Maximum tokens per chunk.
- Programming language. Supported: `"python"`, `"javascript"`, `"typescript"`, `"java"`, `"go"`, `"rust"`, `"c"`, `"cpp"`, and more.
- Include AST node metadata (node type, line numbers) in the chunk output.

***

## Refineries

Refineries enrich an existing list of chunks. Pass the output of any chunker endpoint directly into a refinery.

### Overlap Refinery

`POST /v1/refine/overlap`

Appends or prepends overlapping context from neighbouring chunks. Useful when downstream consumers need continuity across chunk boundaries.

```bash
curl -X POST http://localhost:8000/v1/refine/overlap \
  -H "Content-Type: application/json" \
  -d '{
    "chunks": [
      {"text": "First chunk.", "start_index": 0, "end_index": 12, "token_count": 3},
      {"text": "Second chunk.", "start_index": 13, "end_index": 26, "token_count": 3}
    ],
    "context_size": 0.25,
    "method": "suffix"
  }'
```

- List of chunk objects from any chunker endpoint. Each must contain `text`, `start_index`, `end_index`, and `token_count`.
- Tokenizer to use.
- Size of the overlap context. A float (0–1) is treated as a fraction of the chunk size; an integer is an absolute token count.
- Strategy used to create the overlap window. `"suffix"` appends context from the next chunk; `"prefix"` prepends context from the previous chunk; `"justified"` adds context from both sides.
- Merge the overlap context into the chunk text field.

***

### Embeddings Refinery

`POST /v1/refine/embeddings`

Computes and attaches embeddings to each chunk via Chonkie's `AutoEmbeddings`. Each chunk in the response gains an `embedding` field containing a list of floats.

**Local models** (e.g. `minishlab/potion-base-8M`) run entirely on-device and require no API key. **API-based models** require the appropriate environment variable for your provider.

```bash
# Local model (no API key required)
curl -X POST http://localhost:8000/v1/refine/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "chunks": [
      {"text": "First chunk.", "start_index": 0, "end_index": 12, "token_count": 3},
      {"text": "Second chunk.", "start_index": 13, "end_index": 26, "token_count": 3}
    ],
    "embedding_model": "minishlab/potion-base-8M"
  }'

# OpenAI (requires OPENAI_API_KEY)
curl -X POST http://localhost:8000/v1/refine/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "chunks": [
      {"text": "First chunk.", "start_index": 0, "end_index": 12, "token_count": 3}
    ],
    "embedding_model": "text-embedding-3-small"
  }'
```

## Embeddings Providers

| Type              | Example Model                                                | Requirement      |
| ----------------- | ------------------------------------------------------------ | ---------------- |
| Local (model2vec) | `minishlab/potion-base-8M`, `minishlab/potion-retrieval-32M` | None             |
| OpenAI            | `text-embedding-3-small`, `text-embedding-3-large`           | `OPENAI_API_KEY` |
| Cohere            | `embed-english-v3.0`, `embed-multilingual-v3.0`              | `COHERE_API_KEY` |
| Voyage AI         | `voyage-large-2`, `voyage-code-2`                            | `VOYAGE_API_KEY` |

- List of chunk objects to embed.
- Embedding model name. Local model2vec models (e.g. `minishlab/potion-base-8M`) require no API key. For API-based models, set the appropriate environment variable for your provider.
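Once chunks carry an `embedding` field, a common next step is ranking them against a query vector. The sketch below is one way to do that in plain Python. The helper names `cosine_similarity` and `rank_chunks` are hypothetical, and it assumes the query embedding was produced with the same model as the chunk embeddings so that the vectors are comparable.

```python
import math
from typing import Any


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_chunks(
    query_embedding: list[float], chunks: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    """Return chunks sorted from most to least similar to the query.

    Assumes each chunk dict has an "embedding" key holding a list of floats,
    as described for the embeddings refinery response.
    """
    return sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_embedding, chunk["embedding"]),
        reverse=True,
    )
```

For larger collections a vector library or database is usually a better fit than a Python loop; this version only shows how the returned floats can be consumed.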
***

## Batch Processing

Send a list of strings to process multiple documents in one request:

```bash
curl -X POST http://localhost:8000/v1/chunk/token \
  -H "Content-Type: application/json" \
  -d '{
    "text": ["First document.", "Second document.", "Third document."],
    "chunk_size": 512
  }'
```

The response is a **list of lists**, with one inner list of chunks per input document:

```json
[
  [{"text": "First document.", "start_index": 0, "end_index": 15, "token_count": 3}],
  [{"text": "Second document.", "start_index": 0, "end_index": 16, "token_count": 3}],
  [{"text": "Third document.", "start_index": 0, "end_index": 15, "token_count": 3}]
]
```

***

## Chaining Chunkers and Refineries

Pipeline example that chunks semantically, adds overlap context, then embeds:

```python
import requests

BASE = "http://localhost:8000"

# Step 1: chunk
chunks = requests.post(f"{BASE}/v1/chunk/semantic", json={
    "text": "Your long document here...",
    "threshold": 0.5,
}).json()

# Step 2: add overlap
enriched = requests.post(f"{BASE}/v1/refine/overlap", json={
    "chunks": chunks,
    "context_size": 0.2,
}).json()

# Step 3: embed (requires OPENAI_API_KEY)
embedded = requests.post(f"{BASE}/v1/refine/embeddings", json={
    "chunks": enriched,
    "embedding_model": "text-embedding-3-small",
}).json()
```

***

## Error Handling

| Status | Meaning                                                      |
| ------ | ------------------------------------------------------------ |
| `200`  | Success                                                      |
| `400`  | Invalid request parameters or chunk format                   |
| `500`  | Internal error (missing extras, model loading failure, etc.) |

Error responses follow FastAPI's standard format:

```json
{
  "detail": "SemanticChunker requires the 'semantic' extra. Install it with: pip install 'chonkie[semantic]'"
}
```

***

## Health & Info

```bash
# Health check (used by load balancers and container orchestrators)
curl http://localhost:8000/health
# {"status": "ok"}

# API info
curl http://localhost:8000/
# {"name": "Chonkie OSS API", "version": "...", "docs": "/docs", ...}
```
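
The status codes and `detail` field described under Error Handling can be turned into Python exceptions with a small wrapper. The following is a minimal sketch, not part of Chonkie: `ChonkieAPIError` and `parse_response` are hypothetical names, and the code assumes only what the table states, namely that `200` means success and that error bodies are JSON objects with a `detail` key.

```python
import json
from typing import Any


class ChonkieAPIError(RuntimeError):
    """Hypothetical exception type for non-200 responses; not part of Chonkie."""

    def __init__(self, status: int, detail: str) -> None:
        super().__init__(f"HTTP {status}: {detail}")
        self.status = status
        self.detail = detail


def parse_response(status: int, body: str) -> Any:
    """Return the decoded JSON body on 200, otherwise raise ChonkieAPIError."""
    try:
        payload = json.loads(body)
        decoded = True
    except json.JSONDecodeError:
        payload, decoded = None, False

    if status == 200 and decoded:
        return payload

    # Prefer the "detail" message when the body has the documented error shape;
    # otherwise fall back to the raw body text.
    if decoded and isinstance(payload, dict) and "detail" in payload:
        detail = payload["detail"]
    else:
        detail = body or "empty response body"
    raise ChonkieAPIError(status, str(detail))
```

With `requests`, this would be called as `parse_response(resp.status_code, resp.text)` after each `requests.post(...)`, so a missing extra or a malformed chunk list surfaces as an exception instead of being passed to the next pipeline step.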