Data Extraction

KARLI-hosted data-extraction models turn uploaded files into structured text that downstream components can consume. They are selected from the Read File component when its Extraction Backend is set to karli.

KARLI is currently the only provider offering this category of model.

Available Models

| Model | Accepts | Notes |
| --- | --- | --- |
| karli/default-data-extraction | Any | KARLI-managed default; routes the file to a sensible extractor. |
| karli/data-extraction-moe-latest | Any | Mixture-of-Experts router: picks the optimal extractor per file type (and per page for PDFs). See details below. |
| docling-project/docling | Documents | Docling, run server-side by KARLI. |
| datalab-to/marker | Documents | Marker. |
| opendatalab/MinerU | Documents | MinerU. |
| karli/multimodal-data-extraction | Documents | Multimodal hybrid pipeline. |
| openai/whisper-large-v3 | Audio | Audio transcription via Whisper. |

The Read File component validates the selected file against the chosen model's accepted type before uploading. Submitting a PDF to the Whisper model, for example, produces an error instead of an upload.
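The pre-upload check can be pictured with the minimal sketch below. It is not the component's actual code: the function names are hypothetical, and the rule that a file counts as audio when its extension appears in the audio row of the routing table (and as a document otherwise) is an assumption made for illustration.

```python
from pathlib import Path

# Accepted categories per model, copied from the Available Models table.
MODEL_ACCEPTS = {
    "karli/default-data-extraction": "any",
    "karli/data-extraction-moe-latest": "any",
    "docling-project/docling": "documents",
    "datalab-to/marker": "documents",
    "opendatalab/MinerU": "documents",
    "karli/multimodal-data-extraction": "documents",
    "openai/whisper-large-v3": "audio",
}

# Assumption: audio is recognized by extension; the real check may differ.
AUDIO_EXTENSIONS = {"aac", "mpeg", "wav", "webm", "mp3", "mp4"}


def file_category(filename: str) -> str:
    """Classify a file as 'audio' or 'documents' from its extension."""
    ext = Path(filename).suffix.lower().lstrip(".")
    return "audio" if ext in AUDIO_EXTENSIONS else "documents"


def validate_before_upload(filename: str, model: str) -> None:
    """Raise ValueError if the model does not accept this kind of file."""
    accepts = MODEL_ACCEPTS[model]
    category = file_category(filename)
    if accepts != "any" and accepts != category:
        raise ValueError(
            f"{model} accepts {accepts} files, but {filename!r} looks like {category}"
        )
```

Calling `validate_before_upload("report.pdf", "openai/whisper-large-v3")` raises a `ValueError` in this sketch, mirroring the documented behavior that no upload is attempted.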

Mixture-of-Experts Routing

The karli/data-extraction-moe-latest model automatically selects the best extractor for each file type:

| File Category | Extensions | Extractor |
| --- | --- | --- |
| PDF | pdf | Per-page router (see below) |
| Word documents | doc, docx | Docling |
| Presentations | ppt, pptx | Docling |
| Spreadsheets / tables | xls, xlsx, csv | Default |
| HTML | htm, html, xhtml | MarkItDown |
| Images | png, jpg, jpeg, gif, bmp, tiff, tif, webp | Vision |
| Audio | aac, mpeg, wav, webm, mp3, mp4 | Audio (Whisper) |
| Email | eml, msg, pst | Default |
| Plain text | txt | Default |
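As a rough illustration of the routing table above, the lookup can be modeled as a mapping from file extension to extractor. The routing itself happens server-side, so this sketch only restates the table; `pick_extractor` is a hypothetical name, and returning `None` for an unlisted extension is an assumption, since the behavior for unknown types is not documented here.

```python
from pathlib import Path
from typing import Optional

# Extractor -> extensions, transcribed from the routing table.
_ROUTES = {
    "Per-page router": ["pdf"],
    "Docling": ["doc", "docx", "ppt", "pptx"],
    "Default": ["xls", "xlsx", "csv", "eml", "msg", "pst", "txt"],
    "MarkItDown": ["htm", "html", "xhtml"],
    "Vision": ["png", "jpg", "jpeg", "gif", "bmp", "tiff", "tif", "webp"],
    "Audio (Whisper)": ["aac", "mpeg", "wav", "webm", "mp3", "mp4"],
}

# Invert into a flat extension -> extractor lookup.
EXTENSION_TO_EXTRACTOR = {
    ext: extractor for extractor, exts in _ROUTES.items() for ext in exts
}


def pick_extractor(filename: str) -> Optional[str]:
    """Return the extractor the table lists for this file, or None if unlisted."""
    ext = Path(filename).suffix.lower().lstrip(".")
    return EXTENSION_TO_EXTRACTOR.get(ext)
```

For instance, `pick_extractor("slides.pptx")` yields `"Docling"`, while a PDF maps to the per-page router described in the next section.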

Per-page PDF routing

For PDFs the MoE model inspects each page individually, classifies its complexity, and dispatches it to the most suitable extractor:

| Page Category | Extractor | Needs Vision |
| --- | --- | --- |
| Scanned page | Marker | Yes |
| Handwriting | Marker | Yes |
| Image-dominant | Marker | Yes |
| Complex layout | Marker | No |
| Plain text | MinerU | No |
| Mixed content | Marker | No |

This means a single PDF can be processed by multiple extractors — for example, a text-heavy page goes through MinerU while a scanned page is routed to Marker with vision support.
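The per-page dispatch can be sketched as a lookup applied to each page's category. The classifier that assigns categories is not described in this document, so the sketch takes the categories as input; the category keys and the `plan_pdf` function are hypothetical names chosen for illustration.

```python
from typing import List, Tuple

# Page category -> (extractor, needs_vision), transcribed from the table.
PAGE_ROUTES = {
    "scanned": ("Marker", True),
    "handwriting": ("Marker", True),
    "image_dominant": ("Marker", True),
    "complex_layout": ("Marker", False),
    "plain_text": ("MinerU", False),
    "mixed_content": ("Marker", False),
}


def plan_pdf(page_categories: List[str]) -> List[Tuple[int, str, bool]]:
    """Return (page_number, extractor, needs_vision) for each page, 1-indexed."""
    plan = []
    for number, category in enumerate(page_categories, start=1):
        extractor, needs_vision = PAGE_ROUTES[category]
        plan.append((number, extractor, needs_vision))
    return plan
```

A two-page PDF classified as `["plain_text", "scanned"]` would be planned as MinerU for page 1 and Marker with vision for page 2, matching the example in the text.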

Request Shape

When a file is sent for extraction, the component issues a POST to {KARLI_BASE_URL}/data-extraction/extract as a multipart upload:

  • Form field extractorModel carries the selected model (mapped to its KARLI identifier).
  • The file part carries the document or audio file.
  • Authentication follows the priority described below.
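For readers calling the endpoint by hand, the request described above might be assembled as in the sketch below. Only the URL path and the `extractorModel` field come from this page; the multipart field name `file`, the helper name, and passing the model identifier through unmapped are assumptions. The function builds the request pieces without sending anything.

```python
from typing import Dict


def build_extraction_request(
    base_url: str,
    model: str,
    filename: str,
    content: bytes,
    auth_headers: Dict[str, str],
) -> dict:
    """Assemble the pieces of the multipart POST without sending it."""
    return {
        "url": f"{base_url.rstrip('/')}/data-extraction/extract",
        # Form field named in the docs; the model-to-identifier mapping is not
        # shown here, so the value is passed through unchanged (assumption).
        "data": {"extractorModel": model},
        # Assumption: the file part is named "file".
        "files": {"file": (filename, content)},
        "headers": dict(auth_headers),
    }
```

The returned dictionary uses the keyword names accepted by the third-party `requests` library, so it could be sent with `requests.post(**request)` if that library is available.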

The response is a JSON object containing segments. The component concatenates these segments into a single text payload, emitting each segment's title, where present, as a ## <title> Markdown header.
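The flattening of the response can be illustrated as follows. The exact response schema is not given on this page, so the key names `segments`, `title`, and `text`, as well as the blank-line separator, are assumptions; treat this as a sketch of the idea and not the component's implementation.

```python
import json


def segments_to_text(response_body: str) -> str:
    """Join response segments into one Markdown-flavored text payload."""
    payload = json.loads(response_body)
    parts = []
    for segment in payload.get("segments", []):
        title = segment.get("title")
        if title:
            # Titled segments get a level-2 Markdown header.
            parts.append(f"## {title}")
        text = segment.get("text", "")
        if text:
            parts.append(text)
    # Assumption: parts are separated by a blank line.
    return "\n\n".join(parts)
```

With this sketch, a response holding one titled and one untitled segment produces the header, the first segment's text, and then the second segment's text, each separated by a blank line.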

Authentication

Extraction requests authenticate using the following priority:

  1. JWT (if a Karli Studio session is active) — sent as Authorization: Bearer <token>.
  2. KARLI_API_KEY (from the component attribute, provider variables, or the KARLI_API_KEY environment variable) — sent as X-API-Key: <key>.
  3. Error — if neither is available, a ValueError is raised instructing the user to configure KARLI_API_KEY or access via Karli Studio.
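The priority order above can be expressed as a small resolver. This is a sketch under assumptions: the function and parameter names are hypothetical, and the precedence among the three API-key sources (component attribute, then provider variable, then environment) simply follows the order in which they are listed, which this page does not state explicitly.

```python
import os
from typing import Dict, Mapping, Optional


def resolve_auth_headers(
    jwt: Optional[str] = None,
    component_key: Optional[str] = None,
    provider_key: Optional[str] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return the auth header for an extraction request, or raise ValueError."""
    environ = os.environ if environ is None else environ

    # 1. An active Karli Studio session wins.
    if jwt:
        return {"Authorization": f"Bearer {jwt}"}

    # 2. Otherwise fall back to an API key (source precedence is assumed).
    api_key = component_key or provider_key or environ.get("KARLI_API_KEY")
    if api_key:
        return {"X-API-Key": api_key}

    # 3. Nothing available.
    raise ValueError("Configure KARLI_API_KEY or access via Karli Studio.")
```

Passing `environ` explicitly keeps the sketch easy to exercise without touching the real process environment.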

External API users

If you call the Agentlab API directly and your flows include a FileComponent with document extraction, you must configure the KARLI_API_KEY provider variable or set the KARLI_API_KEY environment variable on the server. See the Model Providers overview for details.

See Document Extraction for how the Read File component uses these models in practice, including the downstream Data shape.