Safran PO Extraction

6-stage VLM pipeline, preceded by a stage 0 identification step, for extracting aviation spare parts Purchase Orders. Upload any PO (PDF or image) and get structured JSON in seconds.

POST /extract

Unified extraction: upload a PDF, PNG, or JPG. Images are auto-converted to PDF, then the full pipeline runs. Returns a task_id for async polling via GET /extract/{id}.
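
A minimal client sketch for the upload step, using only the Python standard library. The base URL and the multipart form field name "file" are assumptions, not documented here; only the /extract path and the task_id response key come from the description above.

```python
import json
import mimetypes
import os
import urllib.request
import uuid

BASE_URL = "http://localhost:8000"  # assumption: replace with the real host


def build_extract_request(filename: str, content: bytes) -> urllib.request.Request:
    """Build a multipart/form-data POST to /extract.

    The form field name "file" is an assumption about the API.
    """
    boundary = uuid.uuid4().hex
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return urllib.request.Request(
        f"{BASE_URL}/extract",
        data=head + content + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def submit_po(path: str) -> str:
    """Upload a PO file and return the task_id from the JSON response."""
    with open(path, "rb") as fh:
        request = build_extract_request(os.path.basename(path), fh.read())
    with urllib.request.urlopen(request) as response:
        return json.load(response)["task_id"]
```

Calling submit_po("po.pdf") would send the file and hand back the identifier to poll with. Any authentication headers the service requires are not covered by this sketch.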

POST /stage_0

Fast synchronous identification of customer ID, Safran entity, and PO format. A regex tier (~5 ms) runs first, with a Haiku VLM fallback (~1 s) when the regex tier finds no match.
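
The two-tier idea can be sketched as follows: try cheap regex patterns first and call the slower model only when none match. The customer names, patterns, and the vlm_fallback callable are hypothetical placeholders for illustration, not the service's actual rules.

```python
import re
from typing import Callable, Optional, Tuple

# Hypothetical pattern table: customer ID -> regex expected in the PO text.
CUSTOMER_PATTERNS = {
    "ACME_AIR": re.compile(r"\bACME\s+AIR(?:LINES)?\b", re.IGNORECASE),
    "GLOBEX_MRO": re.compile(r"\bGLOBEX\s+MRO\b", re.IGNORECASE),
}


def identify_customer(
    text: str,
    vlm_fallback: Callable[[str], Optional[str]],
) -> Tuple[Optional[str], str]:
    """Return (customer_id, tier), where tier is "regex" or "vlm"."""
    # Tier 1: fast regex scan over the extracted text.
    for customer_id, pattern in CUSTOMER_PATTERNS.items():
        if pattern.search(text):
            return customer_id, "regex"
    # Tier 2: slower model call, used only when no regex matched.
    return vlm_fallback(text), "vlm"
```

Returning the tier alongside the result makes it easy to log how often the slower fallback is actually needed.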

GET /extract/{id}

Poll extraction result. Returns full structured PO data with matching, validation flags, and routing decision.
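
A polling loop for this endpoint might look like the sketch below. The base URL, the "status" field, and its "pending" / "processing" values are assumptions about the response shape; adjust them to whatever the service actually returns.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "http://localhost:8000"  # assumption: replace with the real host


def fetch_result(task_id: str) -> dict:
    """GET /extract/{id} and decode the JSON body."""
    with urllib.request.urlopen(f"{BASE_URL}/extract/{task_id}") as response:
        return json.load(response)


def poll_extraction(
    task_id: str,
    fetch: Callable[[str], dict] = fetch_result,
    interval_s: float = 1.0,
    timeout_s: float = 60.0,
) -> dict:
    """Poll until the task leaves an in-progress state or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch(task_id)
        # Assumed in-progress markers; anything else is treated as final.
        if result.get("status") not in ("pending", "processing"):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"extraction {task_id} not finished after {timeout_s}s")
        time.sleep(interval_s)
```

The fetch parameter is injectable so the loop can be exercised without a running server.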

Pipeline

0. Client ID: regex (~5 ms) with Haiku fallback (~1 s); identifies customer and Safran entity
1. Document Analyzer: classify layout, triage pages, skip T&C
2. Unified Extractor: Sonnet VLM, full PO JSON in one shot
3. Rules Engine: dates, part numbers, arithmetic verification
4. Matcher: Snake SAT + fuzzy matching against article & client bases
5. Validation: Haiku cross-checks extraction vs original PDF
6. Router: auto-approve or human review, based on confidence
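
To make two of the stages above concrete, here is a small sketch of an arithmetic check in the spirit of the Rules Engine and a confidence-based decision in the spirit of the Router. The function names, the 0.01 tolerance, the 0.9 threshold, and the rule that any validation flag forces review are all illustrative assumptions, not the pipeline's actual logic.

```python
from decimal import Decimal
from typing import List


def line_total_ok(quantity: str, unit_price: str, line_total: str,
                  tolerance: str = "0.01") -> bool:
    """Check that quantity * unit_price matches the stated line total.

    Decimal avoids binary floating-point rounding on currency values.
    """
    expected = Decimal(quantity) * Decimal(unit_price)
    return abs(expected - Decimal(line_total)) <= Decimal(tolerance)


def route(confidence: float, flags: List[str], threshold: float = 0.9) -> str:
    """Auto-approve only when confidence is high and no flags were raised."""
    if confidence >= threshold and not flags:
        return "auto-approve"
    return "human-review"
```

For example, a line with quantity 3 at 19.99 and a stated total of 59.97 passes the check, while a stated total of 59.79 would fail and could be raised as a flag that sends the PO to human review.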