The Sentence Chunker splits text at sentence boundaries, ensuring chunks contain complete sentences for better readability.
Examples
Text Input
from chonkie.cloud import SentenceChunker

chunker = SentenceChunker(
    chunk_size=512,
    min_sentences_per_chunk=2
)

text = "Your text here..."
chunks = chunker.chunk(text)
File Input
from chonkie.cloud import SentenceChunker

chunker = SentenceChunker(
    chunk_size=512,
    min_sentences_per_chunk=2
)

# Chunk from file (opened in binary mode for upload)
with open("document.txt", "rb") as f:
    chunks = chunker.chunk(file=f)
Request
Parameters
The text to chunk. Can be a single string or an array of strings for batch processing. Either text or file is required.
File to chunk. Use multipart/form-data encoding. Either text or file is required.
Tokenizer to use for counting tokens.
Maximum number of tokens per chunk.
Number of tokens to overlap between consecutive chunks.
Minimum number of sentences to include in each chunk.
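To make the interaction between the token limit and the sentence minimum concrete, here is a minimal local sketch. It is not the service's implementation: `count_tokens`, `split_sentences`, and `sentence_chunks` are hypothetical helpers, the tokenizer is a whitespace stand-in, the sentence splitter is a naive regex, and overlap is left out. It also assumes that the sentence minimum wins when it conflicts with the token limit, which the actual service may handle differently.

```python
import re


def count_tokens(s: str) -> int:
    # Stand-in tokenizer: whitespace-separated words, not a real model tokenizer.
    return len(s.split())


def split_sentences(text: str) -> list[str]:
    # Naive boundary rule: split after '.', '!' or '?' followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def sentence_chunks(text: str, chunk_size: int = 512,
                    min_sentences_per_chunk: int = 1) -> list[str]:
    chunks, current, current_tokens = [], [], 0
    for sentence in split_sentences(text):
        n = count_tokens(sentence)
        over_budget = current_tokens + n > chunk_size
        # Close the current chunk only if adding this sentence would exceed
        # the budget AND the chunk already holds the minimum sentence count.
        if current and over_budget and len(current) >= min_sentences_per_chunk:
            chunks.append(" ".join(current))
            current, current_tokens = [], 0
        current.append(sentence)
        current_tokens += n
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = "One two three. Four five six. Seven eight nine. Ten eleven twelve."
fits_budget = sentence_chunks(sample, chunk_size=6, min_sentences_per_chunk=2)
minimum_wins = sentence_chunks(sample, chunk_size=4, min_sentences_per_chunk=2)
one_each = sentence_chunks(sample, chunk_size=4, min_sentences_per_chunk=1)
```

In this sketch, every chunk ends on a sentence boundary. With a budget of 4 and a minimum of 2, each chunk still holds two sentences and therefore exceeds the budget, which is the trade-off this assumption implies.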
Response
Returns
Array of Chunk objects, each containing:
Starting character position in the original text.
Ending character position in the original text.
Number of tokens in the chunk.
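The positional fields are easiest to understand by checking them against the original text. The sketch below is a local illustration only: the class `ChunkSketch`, its field names (`text`, `start_index`, `end_index`, `token_count`), the helper `locate`, and the exclusive end position are all assumptions made for the example, and the token count uses a whitespace stand-in rather than a real tokenizer.

```python
from dataclasses import dataclass


@dataclass
class ChunkSketch:
    text: str         # assumed field name for the chunk's content
    start_index: int  # starting character position in the original text
    end_index: int    # ending character position (exclusive in this sketch)
    token_count: int  # number of tokens (whitespace words in this sketch)


def locate(original: str, pieces: list[str]) -> list[ChunkSketch]:
    chunks, cursor = [], 0
    for piece in pieces:
        # Search from the end of the previous chunk so repeated text
        # resolves to the correct occurrence; raises ValueError if absent.
        start = original.index(piece, cursor)
        end = start + len(piece)
        chunks.append(ChunkSketch(piece, start, end, len(piece.split())))
        cursor = end
    return chunks


original = "First sentence here. Second one follows. Third closes it."
pieces = ["First sentence here. Second one follows.", "Third closes it."]
chunks = locate(original, pieces)

# Slicing the original text with a chunk's positions recovers the chunk.
recovered = [original[c.start_index:c.end_index] for c in chunks]
```

Under these assumptions, the start and end positions let a caller map any chunk back to its exact span in the source document, for example to highlight it or to fetch surrounding context.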