Quick Start
Once the container is running, the API is available at http://localhost:8000. Visit `/docs` for the interactive Swagger UI.
docker-compose.yml
The repository ships with a ready-to-use `docker-compose.yml`.
The `./data` volume mount persists the SQLite database (`chonkie.db`) across container restarts.
Environment Variables
| Variable | Default | Description |
|---|---|---|
| `LOG_LEVEL` | `INFO` | Log verbosity: `DEBUG`, `INFO`, `WARNING`, `ERROR` |
| `CORS_ORIGINS` | `*` | Comma-separated allowed origins. Use `*` to allow all. |
| `DATABASE_URL` | `sqlite+aiosqlite:///./data/chonkie.db` | SQLite database path. Override for custom locations. |
| `OPENAI_API_KEY` | (empty) | For OpenAI embeddings (`text-embedding-3-small`, etc.) |
| `COHERE_API_KEY` | (empty) | For Cohere embeddings (`embed-english-v3.0`, etc.) |
| `VOYAGE_API_KEY` | (empty) | For Voyage AI embeddings (`voyage-large-2`, etc.) |
| `MISTRAL_API_KEY` | (empty) | For Mistral embeddings (`mistral-embed`) |
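
To make the defaults in the table concrete, here is a minimal Python sketch of how a process could read these variables. It is an illustration only, not the project's actual settings code: the `Settings` class and its field names are hypothetical, and the way `CORS_ORIGINS` is split on commas is an assumption based on the description above.

```python
import os
from dataclasses import dataclass

# Provider key names taken from the table above.
API_KEY_VARS = ("OPENAI_API_KEY", "COHERE_API_KEY", "VOYAGE_API_KEY", "MISTRAL_API_KEY")


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the documented environment variables."""

    log_level: str
    cors_origins: list
    database_url: str
    api_keys: dict

    @classmethod
    def from_env(cls, env=None):
        # Accept any mapping so the function is easy to exercise without
        # touching the real process environment.
        env = os.environ if env is None else env
        raw_origins = env.get("CORS_ORIGINS", "*")
        return cls(
            log_level=env.get("LOG_LEVEL", "INFO").upper(),
            # Assumed parsing: split on commas and drop surrounding whitespace.
            cors_origins=[o.strip() for o in raw_origins.split(",") if o.strip()],
            database_url=env.get("DATABASE_URL", "sqlite+aiosqlite:///./data/chonkie.db"),
            # Unset keys fall back to the empty string, matching "(empty)".
            api_keys={name: env.get(name, "") for name in API_KEY_VARS},
        )
```

Calling `Settings.from_env({})` yields the documented defaults, which is a quick way to see what the container uses when no variables are set.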
These variables can be set in a `.env` file in the project root.
Build and Run Without Compose
Image Details
The Dockerfile uses a multi-stage build to keep the final image lean:

- Builder stage: installs `chonkie[api,semantic,code,openai]` into a virtual environment
- Runtime stage: copies only the venv; runs as a non-root `chonkie` user
- Exposed port: `8000`
- Health check: HTTP GET to `/health` every 30 seconds
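
The image's health check is evaluated by Docker itself, but a deploy script may also want to wait for the API before sending traffic. The following is a minimal sketch of such a readiness poll against the same `/health` endpoint. The function name, the default base URL, and the timeout values are illustrative assumptions, and the sketch assumes a healthy server answers with HTTP 200.

```python
import time
import urllib.error
import urllib.request


def wait_until_healthy(base_url="http://localhost:8000", timeout=60.0, interval=2.0):
    """Poll GET {base_url}/health until it returns HTTP 200 or the timeout expires.

    Returns True once the endpoint answers with 200, False if the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(f"{base_url}/health", timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Connection refused, timed out, or an HTTP error status:
            # treat all of these as "not ready yet" and retry.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A startup script could call `wait_until_healthy()` right after starting the container and abort the rollout if it returns `False`.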
Production Tips
Restrict CORS: in production, replace `*` in `CORS_ORIGINS` with a comma-separated list of your actual domains.
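
As a small illustration of the CORS tip, the sketch below builds a `CORS_ORIGINS` value from an explicit list of origins and rejects the wildcard. The helper `build_cors_origins` is hypothetical and not part of Chonkie, and the `example.com` domains are placeholders for your own.

```python
from urllib.parse import urlsplit


def build_cors_origins(origins):
    """Join explicit origins into a comma-separated CORS_ORIGINS value.

    Raises ValueError for "*" and for anything that is not scheme://host[:port].
    """
    cleaned = []
    for origin in origins:
        origin = origin.strip().rstrip("/")
        if origin == "*":
            raise ValueError("wildcard origin is not allowed in production")
        parts = urlsplit(origin)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"not a valid origin: {origin!r}")
        if parts.path or parts.query or parts.fragment:
            raise ValueError(f"origin must not include a path or query: {origin!r}")
        cleaned.append(origin)
    return ",".join(cleaned)


if __name__ == "__main__":
    # Placeholder domains; substitute the origins that actually call your API.
    value = build_cors_origins(["https://app.example.com", "https://admin.example.com/"])
    print(f"CORS_ORIGINS={value}")
```

The printed line can be copied into the environment configuration used by the container.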
The `SemanticChunker` loads its embedding model on first use. Send a warm-up request after startup to avoid cold-start latency on the first real request in production.
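
A warm-up request can be as small as one short POST sent right after startup. The sketch below shows one way to do that with the standard library. The route `/v1/chunk/semantic` and the JSON payload shape are assumptions made for illustration, not confirmed parts of the API; check `/docs` on your running instance for the real semantic chunking endpoint and its request schema.

```python
import json
import urllib.request

# Hypothetical route and payload: replace both with the values shown in /docs.
WARMUP_PATH = "/v1/chunk/semantic"
WARMUP_PAYLOAD = {"text": "Warm-up request. It only exists to load the embedding model."}


def warm_up(base_url="http://localhost:8000", path=WARMUP_PATH,
            payload=WARMUP_PAYLOAD, timeout=120.0):
    """POST one small JSON request and return the HTTP status code.

    The generous timeout allows for the model load on the first call.
    urllib raises urllib.error.HTTPError for 4xx and 5xx responses.
    """
    request = urllib.request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        response.read()  # drain the body so the request fully completes
        return response.status
```

Running `warm_up()` once from a deploy script or container start hook means the model load happens before user traffic arrives.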