The Code Chunker splits source code at logical boundaries (functions, classes, methods) while preserving code structure and syntax.
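To make "logical boundaries" concrete, the sketch below splits a Python string at its top-level statements using the standard-library `ast` module. It only illustrates the idea and is not how the Code Chunker is implemented; the helper name `split_top_level` is hypothetical, and blank lines or comments between statements are not captured.

```python
import ast


def split_top_level(source: str) -> list[str]:
    """Return one snippet per top-level statement (import, function, class, ...)."""
    tree = ast.parse(source)
    lines = source.splitlines(keepends=True)
    pieces = []
    for node in tree.body:
        # lineno is 1-based and end_lineno is inclusive (Python 3.8+),
        # so this slice covers the whole statement, including nested bodies.
        pieces.append("".join(lines[node.lineno - 1:node.end_lineno]))
    return pieces


sample = (
    "import os\n"
    "\n"
    "def first():\n"
    "    return 1\n"
    "\n"
    "class Second:\n"
    "    def method(self):\n"
    "        return 2\n"
)

pieces = split_top_level(sample)
for piece in pieces:
    print(repr(piece))
```

Because every piece is a complete statement, each one can still be parsed on its own, which is the property that a boundary-aware splitter preserves and a fixed-width character splitter does not.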
Examples
Text Input
from chonkie.cloud import CodeChunker
chunker = CodeChunker(
    language="python",
    chunk_size=512
)
code = "def example():\n    return True"
chunks = chunker.chunk(code)
File Input
from chonkie.cloud import CodeChunker
chunker = CodeChunker(
    language="python",
    chunk_size=512
)
# Chunk from file
with open("script.py", "rb") as f:
    chunks = chunker.chunk(file=f)
Request
Parameters
The code to chunk. Can be a single string or an array of strings for batch processing. Either text or file is required.
Code file to chunk. Use multipart/form-data encoding. Either text or file is required.
Programming language of the code. Supported values include python, javascript, typescript, java, and cpp, among others.
Tokenizer to use for counting tokens.
Maximum number of tokens per chunk.
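As a rough illustration of how a per-chunk token limit can interact with logical boundaries, the sketch below greedily merges consecutive code units while the running total stays within `chunk_size`. This is an assumption about one plausible strategy, not a description of the service's behavior. `count_tokens` is a whitespace-based stand-in rather than a real tokenizer, and `pack_units` is a hypothetical helper.

```python
def count_tokens(text: str) -> int:
    # Stand-in token counter: whitespace-separated words, not a real tokenizer.
    return len(text.split())


def pack_units(units: list[str], chunk_size: int) -> list[str]:
    """Greedily merge consecutive units while the total stays within chunk_size."""
    chunks: list[str] = []
    current: list[str] = []
    current_tokens = 0
    for unit in units:
        n = count_tokens(unit)
        # Flush the current chunk if adding this unit would exceed the limit.
        if current and current_tokens + n > chunk_size:
            chunks.append("".join(current))
            current, current_tokens = [], 0
        current.append(unit)
        current_tokens += n
    if current:
        chunks.append("".join(current))
    return chunks


units = [
    "def a():\n    return 1\n",
    "def b():\n    return 2\n",
    "def c():\n    return 3\n",
]
packed = pack_units(units, chunk_size=8)
print(len(packed))
```

In this sketch a single unit that is larger than `chunk_size` is emitted as its own oversized chunk; a real chunker would need to split such a unit further.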
Response
Returns
Array of Chunk objects, each containing:
Starting character position in the original text.
Ending character position in the original text.
Number of tokens in the chunk.
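The character positions make it possible to map each chunk back to the original text. The sketch below shows that relationship with locally built chunks rather than a real API response. The `Chunk` dataclass and its field names (`text`, `start_index`, `end_index`, `token_count`) are assumptions for illustration, as is the choice of an exclusive end position; check the actual response for the real keys and conventions.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    # Hypothetical field names; the real response keys are not shown in this section.
    text: str
    start_index: int  # starting character position in the original text
    end_index: int    # ending character position (assumed exclusive here)
    token_count: int  # whitespace word count used as a stand-in token count


original = "def first():\n    return 1\n\ndef second():\n    return 2\n"
split_at = original.index("def second")

head, tail = original[:split_at], original[split_at:]
chunks = [
    Chunk(head, 0, split_at, len(head.split())),
    Chunk(tail, split_at, len(original), len(tail.split())),
]


def matches_source(chunk: Chunk, source: str) -> bool:
    """Check that slicing the source by the chunk's positions yields its text."""
    return source[chunk.start_index:chunk.end_index] == chunk.text


print(all(matches_source(c, original) for c in chunks))
```

Under these assumptions, consecutive chunks share a boundary (one chunk's end is the next chunk's start), so concatenating the chunk texts reproduces the input.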