SDPM Chunker
Chunk the given text using the SDPM Chunker.
Authorizations
Your API Key from the Chonkie Cloud dashboard
Body
Data to pass to the SDPM Chunker.
The input text or list of texts to be chunked.
Model identifier or embedding model instance
Similarity threshold. A value in the range [0, 1] is used directly as the similarity threshold for considering sentences similar. A value in the range (1, 100] is interpreted as a percentile threshold. When set to 'auto', the threshold is calculated automatically.
"auto"
Mode for grouping sentences, either 'cumulative' or 'window'
Maximum tokens per chunk
Number of sentences to consider for similarity threshold calculation
Minimum number of sentences per chunk
Minimum number of characters per sentence
Step size for threshold calculation
Delimiters to split sentences on
Number of chunks to skip when looking for similarities
Return type for the chunking process. If 'chunks', returns a list of SemanticChunk objects. If 'texts', returns a list of strings.
Options: texts, chunks
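The body fields above can be sketched as a small client-side helper that validates a few values before a request is sent. This is a minimal sketch, not the official client: this page lists field descriptions without their names, so the keys `text`, `threshold`, `mode`, `return_type`, and `chunk_size` are assumptions, and `classify_threshold` and `build_sdpm_body` are hypothetical helpers. Only the value ranges and allowed options come from the descriptions above.

```python
import json

# Allowed values taken from the field descriptions above.
ALLOWED_MODES = {"cumulative", "window"}
ALLOWED_RETURN_TYPES = {"texts", "chunks"}


def classify_threshold(value):
    """Say how a threshold value would be read under the documented ranges."""
    if value == "auto":
        return "auto"
    # Reject booleans explicitly, since bool is a subclass of int in Python.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError(f"unsupported threshold: {value!r}")
    if 0 <= value <= 1:
        return "similarity"   # [0, 1]: used directly as a similarity threshold
    if 1 < value <= 100:
        return "percentile"   # (1, 100]: interpreted as a percentile
    raise ValueError(f"threshold out of range: {value!r}")


def build_sdpm_body(text, threshold=None, mode=None, return_type=None, **extra):
    """Assemble a JSON-serializable body. All key names are ASSUMED."""
    body = {"text": text}
    if threshold is not None:
        classify_threshold(threshold)  # raises ValueError if invalid
        body["threshold"] = threshold
    if mode is not None:
        if mode not in ALLOWED_MODES:
            raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}")
        body["mode"] = mode
    if return_type is not None:
        if return_type not in ALLOWED_RETURN_TYPES:
            raise ValueError(
                f"return_type must be one of {sorted(ALLOWED_RETURN_TYPES)}"
            )
        body["return_type"] = return_type
    # Any other fields (e.g. an assumed `chunk_size`) pass through unchecked.
    body.update(extra)
    return body


example_body = build_sdpm_body(
    "First sentence. Second sentence.",
    threshold=95,          # falls in (1, 100], so it reads as a percentile
    mode="window",
    return_type="chunks",
    chunk_size=512,        # assumed name for "Maximum tokens per chunk"
)
print(json.dumps(example_body, indent=2))
```

Fields that are not supplied are left out of the body, so the service's own defaults would apply. The helper deliberately does not guess them.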
Response
The actual text content of the chunk.
The starting character index of the chunk within the original input text.
The ending character index (exclusive) of the chunk within the original input text.
The number of tokens in this specific chunk, according to the tokenizer used.
List of semantic sentences contained within this chunk.
Represents a single sentence within a semantic chunk, potentially including an embedding.
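Because the end index is exclusive, a chunk's text should equal the slice of the original input between its start and end indices. The following minimal sketch checks that relationship. The key names `text`, `start_index`, and `end_index` are assumptions (this page gives only the field descriptions), `chunk_matches_source` is a hypothetical helper, and the sample chunk is built by hand rather than taken from a real response.

```python
def chunk_matches_source(source, chunk):
    """Return True if chunk text equals source[start_index:end_index]."""
    start, end = chunk["start_index"], chunk["end_index"]
    in_bounds = 0 <= start <= end <= len(source)
    # The end index is exclusive, so a plain Python slice lines up with it.
    return in_bounds and source[start:end] == chunk["text"]


source = "Cats purr. Dogs bark."

# Hand-built sample shaped like the response fields described above.
sample_chunk = {
    "text": "Dogs bark.",
    "start_index": source.index("Dogs"),
    "end_index": len(source),
}

print(chunk_matches_source(source, sample_chunk))
```

A check like this is a cheap sanity test when mapping chunks back to positions in the source document, for example to highlight them.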