POST /v1/chunk/recursive
curl --request POST \
  --url https://api.chonkie.ai/v1/chunk/recursive \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "text": "<string>",
  "tokenizer_or_token_counter": "gpt2",
  "chunk_size": 512,
  "recipe": "default",
  "lang": "en",
  "min_characters_per_chunk": 1,
  "return_type": "chunks"
}'
[
  {
    "text": "<string>",
    "start_index": 123,
    "end_index": 123,
    "token_count": 123,
    "level": 123
  }
]

Authorizations

Authorization
string
header
required

Your API Key from the Chonkie Cloud dashboard

Body

application/json

Data to pass to the Recursive Character Text Splitter.

text
required

The input text or list of texts to be chunked.

tokenizer_or_token_counter
string
default:gpt2

Tokenizer to use. Can be a string identifier or a tokenizer instance

chunk_size
integer
default:512

Maximum number of tokens per chunk

recipe
string
default:default

Rules to split text by. Find all available recipes on our Hugging Face page.

lang
string
default:en

Language of the text. This must match the language of the recipe.

min_characters_per_chunk
integer
default:1

Minimum number of characters per chunk

return_type
enum<string>
default:chunks

Whether to return chunks as text strings or as RecursiveChunk objects

Available options: texts, chunks
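
The body parameters above can be assembled in Python as well as with curl. The following is a minimal sketch using only the standard library; the helper name `build_recursive_chunk_request` is hypothetical and not part of any Chonkie SDK. It builds the request without sending it, and the default values are copied from the parameter list above.

```python
import json
import urllib.request

# Endpoint taken from the curl example on this page.
API_URL = "https://api.chonkie.ai/v1/chunk/recursive"


def build_recursive_chunk_request(text, api_key, **overrides):
    """Build (but do not send) a POST request for the recursive chunker.

    Hypothetical helper for illustration only.
    """
    # Defaults mirror the documented parameter defaults.
    payload = {
        "text": text,
        "tokenizer_or_token_counter": "gpt2",
        "chunk_size": 512,
        "recipe": "default",
        "lang": "en",
        "min_characters_per_chunk": 1,
        "return_type": "chunks",
    }
    # Reject parameter names that are not documented on this page.
    unknown = set(overrides) - set(payload)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    payload.update(overrides)
    # return_type is an enum with exactly two documented options.
    if payload["return_type"] not in ("texts", "chunks"):
        raise ValueError("return_type must be 'texts' or 'chunks'")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the result to `urllib.request.urlopen` and decode the body with `json.load`; error handling and retries are left out of this sketch.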

Response

200 - application/json
Successful Response: A list of recursive chunk objects.
text
string

The actual text content of the chunk.

start_index
integer

The starting character index of the chunk within the original input text.

end_index
integer

The ending character index (exclusive) of the chunk within the original input text.

token_count
integer

The number of tokens in this specific chunk, according to the tokenizer used.

level
integer

The level of this chunk in the recursive splitting process.
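
Because `end_index` is exclusive, the two indices can be used directly as a Python slice into the original input. The sketch below parses a response into a small dataclass and checks that property. The `RecursiveChunk` class here is a local stand-in defined for this example, and the sample response is hand-written for illustration (the `token_count` and `level` values are made up), not real API output.

```python
from dataclasses import dataclass


@dataclass
class RecursiveChunk:
    """Local stand-in mirroring the documented response fields."""
    text: str
    start_index: int
    end_index: int
    token_count: int
    level: int


def parse_chunks(items):
    # Assumes each item has exactly the five documented fields;
    # any extra field would raise a TypeError here.
    return [RecursiveChunk(**item) for item in items]


def offsets_match(source, chunk):
    # end_index is exclusive, so it maps directly onto slice syntax.
    return source[chunk.start_index:chunk.end_index] == chunk.text


# Hand-written sample shaped like a return_type="chunks" response.
source_text = "Hello world. Bye."
sample_response = [
    {"text": "Hello world. ", "start_index": 0, "end_index": 13,
     "token_count": 4, "level": 0},
    {"text": "Bye.", "start_index": 13, "end_index": 17,
     "token_count": 2, "level": 0},
]

chunks = parse_chunks(sample_response)
all_aligned = all(offsets_match(source_text, c) for c in chunks)
```

With `return_type` set to `texts`, the response would be plain strings rather than objects, so this parsing step would not apply.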