TextChef
TextChef processes plain text files and returns structured Document objects for further processing.
Installation
TextChef is included in the base installation of Chonkie. No additional dependencies are required. For installation instructions, see the Installation Guide.
Initialization
Methods
process()
Process a text file and return a Document object.
Parameters
Path to the text file (string or Path object)
Returns
Document object containing the file content
process_batch()
Process multiple text files at once.
Parameters
List of file paths to process
Returns
List[Document], where each Document contains a file’s contents.
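The behaviour described for process() and process_batch() can be approximated with the standard library alone. The sketch below is not Chonkie's implementation: SketchDocument and SketchTextChef are hypothetical stand-ins, and the content attribute is an assumed field name. It only illustrates the contract stated above: one file in, one document out, and a list of files in, a list of documents out.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import List, Union


@dataclass
class SketchDocument:
    """Hypothetical stand-in for a Document; it holds only the file text."""
    content: str


class SketchTextChef:
    """Hypothetical stand-in mirroring the two methods described above."""

    def process(self, path: Union[str, Path]) -> SketchDocument:
        # Accept either a string or a Path, as the parameter description says.
        text = Path(path).read_text(encoding="utf-8")
        return SketchDocument(content=text)

    def process_batch(self, paths: List[Union[str, Path]]) -> List[SketchDocument]:
        # One document per input path, returned in the same order.
        return [self.process(p) for p in paths]


# Small demonstration using throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "a.txt"
    second = Path(tmp) / "b.txt"
    first.write_text("alpha", encoding="utf-8")
    second.write_text("beta", encoding="utf-8")

    chef = SketchTextChef()
    single = chef.process(str(first))            # string path
    batch = chef.process_batch([first, second])  # Path objects
```

In this sketch the batch method simply delegates to the single-file method, so both paths share the same reading and decoding logic.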
Usage
Integration with Chunkers
TextChef is designed to work with Chonkie’s chunkers: the Document objects it returns can be passed on for chunking.
Encoding
TextChef reads files with UTF-8 encoding by default, ensuring proper handling of:
- Unicode characters
- International text
- Special symbols
- Emoji and other non-ASCII characters
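To see why the UTF-8 default matters, the following sketch uses only Python's pathlib (not TextChef itself) to write a string containing accented, CJK, symbol, and emoji characters, then reads it back twice: once as UTF-8 and once with a deliberately wrong encoding. The sample string and file name are made up for illustration.

```python
import tempfile
from pathlib import Path

# Text mixing accented Latin, CJK, a math symbol, and an emoji.
sample = "café, 日本語, ∑, 🦛"

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample.txt"
    path.write_text(sample, encoding="utf-8")

    # Decoding with the encoding the file was written in restores the text.
    as_utf8 = path.read_text(encoding="utf-8")

    # Decoding the same bytes as Latin-1 does not raise an error, but each
    # byte becomes its own character, so multi-byte characters are garbled.
    as_latin1 = path.read_text(encoding="latin-1")
```

If a file was saved in a different encoding, reading it as UTF-8 may fail or produce unexpected characters, so it is worth confirming the source encoding before processing.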