Data Extraction
KARLI-hosted data-extraction models turn uploaded files into structured text that downstream components can consume. They are selected from the Read File component when its Extraction Backend is set to karli.
KARLI is currently the only provider offering this category of model.
Available Models
| Model | Accepts | Notes |
|---|---|---|
| `karli/default-data-extraction` | Any | KARLI-managed default; routes the file to a sensible extractor. |
| `karli/data-extraction-moe-latest` | Any | Mixture-of-Experts router; picks the optimal extractor per file type (and per page for PDFs). See details below. |
| `docling-project/docling` | Documents | Docling, run server-side by KARLI. |
| `datalab-to/marker` | Documents | Marker. |
| `opendatalab/MinerU` | Documents | MinerU. |
| `karli/multimodal-data-extraction` | Documents | Multimodal hybrid pipeline. |
| `openai/whisper-large-v3` | Audio | Audio transcription via Whisper. |
The Read File component validates the uploaded file against the chosen model's accepted type before uploading — submitting, for example, a PDF to the Whisper model produces an error rather than an upload.
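The pre-upload check can be pictured as a lookup against the Accepts column. The following is a minimal sketch, not the component's actual code: the helper names, the use of `ValueError`, and the rule that a file counts as audio when its extension appears in the audio row of the Mixture-of-Experts routing table are all assumptions made for illustration.

```python
from pathlib import Path

# Accepted input category per model, copied from the Available Models table.
MODEL_ACCEPTS = {
    "karli/default-data-extraction": "any",
    "karli/data-extraction-moe-latest": "any",
    "docling-project/docling": "documents",
    "datalab-to/marker": "documents",
    "opendatalab/MinerU": "documents",
    "karli/multimodal-data-extraction": "documents",
    "openai/whisper-large-v3": "audio",
}

# Assumption: audio is detected by extension, using the audio extensions
# from the MoE routing table; every other file is treated as a document.
AUDIO_EXTENSIONS = {"aac", "mpeg", "wav", "webm", "mp3", "mp4"}


def file_category(filename: str) -> str:
    """Classify a file as 'audio' or 'documents' from its extension."""
    ext = Path(filename).suffix.lower().lstrip(".")
    return "audio" if ext in AUDIO_EXTENSIONS else "documents"


def validate_before_upload(model: str, filename: str) -> None:
    """Raise ValueError if the model does not accept this kind of file."""
    accepts = MODEL_ACCEPTS[model]  # KeyError for an unknown model
    category = file_category(filename)
    if accepts != "any" and accepts != category:
        raise ValueError(
            f"{model} accepts {accepts} files, but {filename} looks like {category}"
        )
```

Under these assumptions, `validate_before_upload("openai/whisper-large-v3", "report.pdf")` raises before any upload is attempted, while the same file passes for either of the "Any" models.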
Mixture-of-Experts Routing
The `karli/data-extraction-moe-latest` model automatically selects the best extractor for each file type:
| File Category | Extensions | Extractor |
|---|---|---|
| PDF | pdf | Per-page router (see below) |
| Word documents | doc, docx | Docling |
| Presentations | ppt, pptx | Docling |
| Spreadsheets / tables | xls, xlsx, csv | Default |
| HTML | htm, html, xhtml | MarkItDown |
| Images | png, jpg, jpeg, gif, bmp, tiff, tif, webp | Vision |
| Audio | aac, mpeg, wav, webm, mp3, mp4 | Audio (Whisper) |
| Email | eml, msg, pst | Default |
| Plain text | txt | Default |
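The table amounts to an extension-to-extractor lookup. Here is a rough sketch of that idea; it illustrates the mapping only and is not how KARLI implements the router. The function name and the fallback to `Default` for unlisted extensions are assumptions.

```python
from pathlib import Path

# Rows of the routing table: (extensions, extractor).
_ROUTING_ROWS = [
    (("pdf",), "Per-page router"),
    (("doc", "docx"), "Docling"),
    (("ppt", "pptx"), "Docling"),
    (("xls", "xlsx", "csv"), "Default"),
    (("htm", "html", "xhtml"), "MarkItDown"),
    (("png", "jpg", "jpeg", "gif", "bmp", "tiff", "tif", "webp"), "Vision"),
    (("aac", "mpeg", "wav", "webm", "mp3", "mp4"), "Audio (Whisper)"),
    (("eml", "msg", "pst"), "Default"),
    (("txt",), "Default"),
]

# Flatten the rows into a single extension -> extractor dictionary.
EXTENSION_TO_EXTRACTOR = {
    ext: extractor for exts, extractor in _ROUTING_ROWS for ext in exts
}


def pick_extractor(filename: str, fallback: str = "Default") -> str:
    """Return the extractor the table assigns to this file's extension.

    The fallback for unlisted extensions is an assumption; the source does
    not say what the router does with unknown file types.
    """
    ext = Path(filename).suffix.lower().lstrip(".")
    return EXTENSION_TO_EXTRACTOR.get(ext, fallback)
```

For example, `pick_extractor("slides.PPTX")` returns `"Docling"`, and a PDF resolves to the per-page router rather than to a single extractor.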
Per-page PDF routing
For PDFs the MoE model inspects each page individually, classifies its complexity, and dispatches it to the most suitable extractor:
| Page Category | Extractor | Needs Vision |
|---|---|---|
| Scanned page | Marker | Yes |
| Handwriting | Marker | Yes |
| Image-dominant | Marker | Yes |
| Complex layout | Marker | No |
| Plain text | MinerU | No |
| Mixed content | Marker | No |
This means a single PDF can be processed by multiple extractors — for example, a text-heavy page goes through MinerU while a scanned page is routed to Marker with vision support.
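To make the per-page dispatch concrete, the sketch below groups page numbers by the extractor and vision setting the table assigns to each page category. The category keys and the `plan_pdf` helper are hypothetical names, and the page classification step is assumed to have already happened; only the category-to-extractor mapping comes from the table.

```python
from collections import defaultdict

# Page category -> (extractor, needs_vision), taken from the table above.
# The snake_case keys are illustrative labels, not KARLI identifiers.
PAGE_ROUTES = {
    "scanned": ("Marker", True),
    "handwriting": ("Marker", True),
    "image_dominant": ("Marker", True),
    "complex_layout": ("Marker", False),
    "plain_text": ("MinerU", False),
    "mixed_content": ("Marker", False),
}


def plan_pdf(page_categories: list[str]) -> dict[tuple[str, bool], list[int]]:
    """Group 1-based page numbers by (extractor, needs_vision)."""
    plan: dict[tuple[str, bool], list[int]] = defaultdict(list)
    for page_number, category in enumerate(page_categories, start=1):
        plan[PAGE_ROUTES[category]].append(page_number)
    return dict(plan)
```

A three-page PDF classified as `["plain_text", "scanned", "plain_text"]` would send pages 1 and 3 to MinerU and page 2 to Marker with vision enabled.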
Request Shape
When a file is sent for extraction, the component issues a POST to `{KARLI_BASE_URL}/data-extraction/extract` as a multipart upload:
- Form field `extractorModel` carries the selected model (mapped to its KARLI identifier).
- The file part carries the document or audio file.
- Authentication follows the priority described below.
The response is a JSON object whose segments are concatenated into a single text payload; segments with a title are emitted as `## <title>` Markdown headers.
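The concatenation step might look roughly like the following. The JSON field names (`segments`, `title`, `text`) and the blank-line separator are assumptions, since the source does not specify the response schema; only the `## <title>` header rule comes from the description above.

```python
def render_segments(response: dict) -> str:
    """Join response segments into one Markdown text payload.

    Assumed schema: {"segments": [{"title": str | None, "text": str}, ...]}.
    """
    parts = []
    for segment in response.get("segments", []):
        title = segment.get("title")
        text = segment.get("text", "")
        if title:
            # Titled segments get a level-2 Markdown header.
            parts.append(f"## {title}")
        if text:
            parts.append(text)
    # Assumption: pieces are separated by a blank line.
    return "\n\n".join(parts)
```

Segments without a title contribute only their text, so untitled content flows directly after the preceding segment.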
Authentication
Extraction requests authenticate using the following priority:
- JWT (if a Karli Studio session is active): sent as `Authorization: Bearer <token>`.
- `KARLI_API_KEY` (from the component attribute, provider variables, or the `KARLI_API_KEY` environment variable): sent as `X-API-Key: <key>`.
- Error: if neither is available, a `ValueError` is raised instructing the user to configure `KARLI_API_KEY` or access via Karli Studio.
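The priority order can be sketched as a small header-building function. The function name, its parameters, and the lookup order among the three API-key sources are assumptions; the header names and the `ValueError` fallback follow the list above.

```python
import os
from typing import Mapping, Optional


def build_auth_headers(
    jwt: Optional[str] = None,
    component_api_key: Optional[str] = None,
    provider_api_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Return the auth header for an extraction request.

    Priority: JWT first, then KARLI_API_KEY, otherwise ValueError.
    """
    env = os.environ if env is None else env

    # 1. An active Karli Studio session wins.
    if jwt:
        return {"Authorization": f"Bearer {jwt}"}

    # 2. Fall back to the API key. Checking the component attribute before
    #    provider variables and the environment is an assumed order.
    api_key = component_api_key or provider_api_key or env.get("KARLI_API_KEY")
    if api_key:
        return {"X-API-Key": api_key}

    # 3. Nothing to authenticate with.
    raise ValueError(
        "No credentials found: configure KARLI_API_KEY or access via Karli Studio."
    )
```

Passing `env` explicitly keeps the function easy to test without touching the real process environment.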
External API users
If you call the Agentlab API directly and your flows include a FileComponent with document extraction, you must configure the KARLI_API_KEY provider variable or set the KARLI_API_KEY environment variable on the server. See the Model Providers overview for details.
See Document Extraction for how the Read File component uses these models in practice, including the downstream Data shape.