Semantic Chunker
Chunk the given text using the Semantic Chunker.
Authorizations
Your API Key from the Chonkie Cloud dashboard
Body
Data to pass to the Semantic Chunker.
The input text or list of texts to be chunked.
Model identifier or embedding model instance
Similarity threshold. A value in the range [0, 1] is used directly as the similarity threshold for treating sentences as similar. A value in the range (1, 100] is interpreted as a percentile threshold. When set to 'auto', the threshold is calculated automatically.
"auto"
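The three threshold modes described above can be illustrated with a minimal sketch. It is not the service's implementation: the function name `resolve_threshold`, the `similarities` argument, and the nearest-rank percentile method are assumptions made for illustration, and the 'auto' search is deliberately left out because this page does not describe how it works.

```python
import math
from typing import Sequence, Union


def resolve_threshold(threshold: Union[float, str],
                      similarities: Sequence[float]) -> float:
    """Turn a threshold setting into a concrete similarity cutoff (hypothetical helper)."""
    if threshold == "auto":
        # The automatic calculation is not documented here, so it is not sketched.
        raise NotImplementedError("automatic threshold calculation is not sketched")

    value = float(threshold)
    if 0 <= value <= 1:
        # Range [0, 1]: the value is already a similarity threshold.
        return value
    if 1 < value <= 100:
        # Range (1, 100]: treat the value as a percentile of observed similarities.
        # Nearest-rank percentile is an assumption, not the documented method.
        ordered = sorted(similarities)
        if not ordered:
            raise ValueError("percentile mode needs at least one similarity score")
        rank = max(1, math.ceil(value / 100 * len(ordered)))
        return ordered[rank - 1]
    raise ValueError("threshold must be in [0, 1], in (1, 100], or 'auto'")


scores = [0.1, 0.2, 0.3, 0.4]
absolute_cutoff = resolve_threshold(0.7, scores)   # used as-is
percentile_cutoff = resolve_threshold(50, scores)  # 50th percentile of scores
```

Note that the boundary value 1 falls in the first range, so it is read as a similarity of 1.0 rather than as the 1st percentile.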
Maximum tokens per chunk
Number of sentences to consider for similarity threshold calculation
Minimum number of sentences per chunk
Minimum tokens per chunk
Minimum number of characters per sentence
Step size for similarity threshold calculation
Delimiters to split sentences on. Default is ['.', '!', '?', '\n']
Whether to include delimiters in the chunk text and, if so, whether each delimiter is attached to the previous or the next chunk.
Options: prev, next
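To make the difference between the two delimiter placements concrete, here is a small sketch of sentence splitting on the default delimiters. The function name `split_sentences`, its parameter names, and the `None` mode that drops delimiters are assumptions for illustration. The sketch handles single-character delimiters only and is not claimed to match the service's splitter.

```python
from typing import List, Optional, Sequence


def split_sentences(text: str,
                    delims: Sequence[str] = (".", "!", "?", "\n"),
                    include_delim: Optional[str] = "prev") -> List[str]:
    """Split text on single-character delimiters (illustrative sketch)."""
    pieces: List[str] = []
    current = ""
    for ch in text:
        if ch in delims:
            if include_delim == "prev":
                # The delimiter closes the piece that precedes it.
                pieces.append(current + ch)
                current = ""
            elif include_delim == "next":
                # The delimiter opens the piece that follows it.
                pieces.append(current)
                current = ch
            else:
                # Assumed mode: the delimiter is dropped entirely.
                pieces.append(current)
                current = ""
        else:
            current += ch
    pieces.append(current)
    # Discard empty pieces produced by leading or consecutive delimiters.
    return [p for p in pieces if p]


attached_to_prev = split_sentences("Hi. Bye!", include_delim="prev")
attached_to_next = split_sentences("Hi. Bye!", include_delim="next")
```

With `prev`, each sentence keeps its closing punctuation. With `next`, the punctuation moves to the start of the following piece, so a trailing delimiter ends up as a piece of its own.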
Return type for chunking. If 'chunks', returns a list of SemanticChunk objects. If 'texts', returns a list of strings.
Options: texts, chunks
Response
The actual text content of the chunk.
The starting character index of the chunk within the original input text.
The ending character index (exclusive) of the chunk within the original input text.
The number of tokens in this specific chunk, according to the tokenizer used.
List of semantic sentences contained within this chunk.
Represents a single sentence within a semantic chunk, potentially including an embedding.
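The response fields above can be modelled as plain data classes, which also makes the index convention easy to check: because the ending index is exclusive, slicing the original input with the two indices should reproduce the chunk text. The class and field names below (`ChunkSketch`, `SentenceSketch`, `start_index`, `end_index`, `token_count`) are hypothetical, since this page lists only the field descriptions, and the token count in the example is an arbitrary placeholder.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SentenceSketch:
    """One sentence inside a chunk; the embedding may be absent."""
    text: str
    embedding: Optional[List[float]] = None


@dataclass
class ChunkSketch:
    """Hypothetical mirror of one chunk in the response."""
    text: str
    start_index: int   # starting character index in the original input
    end_index: int     # ending character index, exclusive
    token_count: int   # depends on the tokenizer used
    sentences: List[SentenceSketch] = field(default_factory=list)


def chunk_matches_source(chunk: ChunkSketch, source: str) -> bool:
    """Check that the chunk's indices slice its text out of the source."""
    return source[chunk.start_index:chunk.end_index] == chunk.text


source_text = "Hello world. Bye."
example_chunk = ChunkSketch(
    text="Hello world.",
    start_index=0,
    end_index=12,
    token_count=3,  # placeholder value, not produced by a real tokenizer
    sentences=[SentenceSketch(text="Hello world.")],
)
indices_are_consistent = chunk_matches_source(example_chunk, source_text)
```

A check like `chunk_matches_source` is a cheap way to confirm that character offsets returned by a chunker refer to the same string that was sent.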