Send a purchase order PDF, scan, or photo to one REST endpoint and get back clean JSON: PO number, supplier, dates, terms, and the full line-item table. No template per supplier, no manual keying. Upload a real PO below to see the exact structured output the API returns.
Submit your purchase orders
Up to 50 files per batch
A purchase order PDF is a picture of a table, not data. Your application cannot post a supplier order to an ERP, run a match, or update inventory until every field is structured. Building that extraction in house means wiring up OCR, layout parsing, and per-supplier rules, then maintaining them as vendors change their templates.
Open-source OCR returns a wall of text with no field labels. You still have to figure out which number is the PO number, which block is the supplier, and where each line item starts. That parsing layer is the hard part, and it breaks on the next new layout.
Template-based extraction needs a rule set per vendor. The moment a supplier moves a field or a new one arrives, the template fails and a PO drops to manual review. Maintaining hundreds of templates is a permanent engineering cost.
Header fields are one thing; a multi-page line-item table with wrapped descriptions, merged cells, and per-row quantities and prices is where most parsers fail. Missing a row throws off matching and spend totals downstream.
A faxed or photographed PO is pure pixels. A PDF text parser returns nothing on it, so any solution that skips real OCR silently drops the messiest documents, which are exactly the ones a team wants automated.
Teams reach for a purchase order API when the volume of inbound supplier orders passes the point where a person can key them, and the data has to land in another system programmatically. The requirement is almost always the same: submit a file, receive validated JSON that maps to an ERP or database schema, and never build a per-vendor template. The work below is turning the document into that JSON.
The PurchaseOrders API reads each purchase order with AI and returns the fields your application needs as JSON. Post a file, poll for the result, and get a consistent object with header fields and a line-item array, whatever supplier sent it and whether it is a native PDF, a scan, or a phone photo.
Upload the file to the documents upload endpoint, start extraction with the extract endpoint, then fetch the result from the extraction endpoint by its hash. Standard JSON in, JSON out.
Authenticate with a bearer token tied to your account. A token can start extraction only if it has the extract permission, which is granted on the Pro plan, so you control which tokens can pull data.
The response includes each line item with SKU, description, quantity, unit of measure, unit price, and line total, not just the header, so your code can match and total at the row level.
Consistent field names across every supplier mean the JSON drops into your ERP, database, or accounting import without a cleanup pass per vendor.
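As a rough illustration of that mapping step, the sketch below flattens one extracted PO into one row per line item and writes the rows as CSV for an import job. The key names (`po_number`, `supplier`, `line_items`, `sku`, and so on) are assumptions made for the example; the real field names are the ones listed in the API documentation.

```python
import csv
import io

# Hypothetical field names, chosen for illustration only.
HEADER_KEYS = ["po_number", "supplier"]
LINE_KEYS = ["sku", "description", "quantity", "unit_of_measure", "unit_price", "line_total"]


def flatten_po(po: dict) -> list:
    """Return one flat row per line item, repeating the header fields on each row."""
    header = {key: po.get(key) for key in HEADER_KEYS}
    return [
        {**header, **{key: item.get(key) for key in LINE_KEYS}}
        for item in po.get("line_items", [])
    ]


def to_csv(rows: list) -> str:
    """Serialize flattened rows to CSV text for a database or ERP import."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=HEADER_KEYS + LINE_KEYS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample data in the assumed shape.
sample_po = {
    "po_number": "PO-1001",
    "supplier": "Acme Supply",
    "line_items": [
        {"sku": "A-1", "description": "Bolt", "quantity": 10, "unit_of_measure": "EA",
         "unit_price": "2.50", "line_total": "25.00"},
        {"sku": "B-2", "description": "Washer", "quantity": 100, "unit_of_measure": "EA",
         "unit_price": "0.10", "line_total": "10.00"},
    ],
}
rows = flatten_po(sample_po)
csv_text = to_csv(rows)
```

Because the same keys are expected for every supplier, a single mapping like `flatten_po` can serve all vendors instead of one mapping per layout.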
The API is the programmatic version of the same engine behind the on-site tools. For the full field breakdown, purchase order line item extraction covers how the table is captured, and purchase order OCR vs AI extraction explains why field-by-meaning parsing beats templates. For a no-code path, the PO PDF to CSV converter and bulk purchase order processing produce files instead of JSON. Teams automating the whole flow use the API to automate purchase order data entry end to end and to import purchase orders into an ERP. Full request and response details are in the API documentation.
Three approaches to getting structured PO data into your application programmatically.
| What you need | Build it on raw OCR | Generic OCR API | PurchaseOrders API |
|---|---|---|---|
| Structured PO fields, not raw text | You write the parser | Partial, generic fields | PO fields mapped for you |
| Handles any supplier layout | Template per vendor | Varies | No templates needed |
| Full line-item table capture | Hard to build | Often header only | Full line-item array |
| Reads scanned and photo POs | Add your own OCR | Sometimes | Yes, built in |
| Setup and maintenance | Ongoing engineering | Some mapping work | Call the endpoint |
| Cost model | Your build plus infra | Per call or seat | Per document processed |
PurchaseOrders is a data-extraction API: it returns structured PO data from documents you already have. It does not create, approve, match, or track purchase orders the way a procurement suite does; validation and matching stay in your system.
Upload, extract, retrieve. Standard REST, JSON responses.
POST the PO file to the upload endpoint with your bearer token. Native PDFs, scanned PDFs, and images are all accepted in the same call.
Tip: Multi-page purchase orders are handled in one upload.
Call the extract endpoint with the returned file reference. The AI reads the document and pulls the header fields and the full line-item table.
Tip: A token needs the extract permission, granted on the Pro plan.
Fetch the result by its hash and receive a consistent JSON object with PO number, supplier, dates, terms, and every line item, ready to write to your ERP or database.
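The three steps can be sketched as a single function. This is a minimal outline under stated assumptions, not the documented client: the base URL, the paths, the `file`, `hash`, `status`, and `data` keys, and the `"done"` status value are all placeholders invented for the example. The HTTP call is passed in as `send` so the flow can be exercised without a network connection.

```python
import json
import time
from typing import Callable, Optional

# Hypothetical base URL and paths; take the real ones from the API documentation.
BASE_URL = "https://api.example.com/v1"

# A transport takes (method, url, headers, body) and returns the decoded JSON response.
Transport = Callable[[str, str, dict, Optional[bytes]], dict]


def extract_po(send: Transport, token: str, file_bytes: bytes,
               poll_seconds: float = 2.0, max_polls: int = 30) -> dict:
    """Upload a PO file, start extraction, and poll until the result is ready."""
    headers = {"Authorization": f"Bearer {token}"}

    # Step 1: upload the file and keep the returned file reference (assumed key: "file").
    uploaded = send("POST", f"{BASE_URL}/documents/upload", headers, file_bytes)

    # Step 2: start extraction for that file reference (assumed key: "hash").
    body = json.dumps({"file": uploaded["file"]}).encode()
    started = send("POST", f"{BASE_URL}/extract", headers, body)

    # Step 3: fetch the result by its hash, waiting while it is still processing.
    for _ in range(max_polls):
        result = send("GET", f"{BASE_URL}/extraction/{started['hash']}", headers, None)
        if result.get("status") == "done":  # assumed status value
            return result["data"]
        time.sleep(poll_seconds)
    raise TimeoutError("extraction did not finish in time")
```

In a real integration `send` would wrap an HTTP client of your choice; bounding the loop with `max_polls` keeps a stuck document from blocking the caller indefinitely.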
Upload the PO file to the API, start extraction, then retrieve the result as JSON. With PurchaseOrders you POST the file to the upload endpoint, call the extract endpoint, and fetch the structured response by its hash. The JSON contains the PO number, supplier, dates, terms, and every line item, so your application can write it straight to an ERP or database.
The API returns a structured JSON object, not raw text. It includes header fields such as PO number, supplier, order and delivery dates, and payment terms, plus a line-item array where each row carries the SKU, description, quantity, unit of measure, unit price, and line total. Field names stay consistent across suppliers so the output maps to your schema.
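To make that shape concrete, here is one way an application might model the response with typed objects. The class and key names are assumptions that mirror the fields named above, not the API's documented schema; amounts are parsed as `Decimal` so prices are not rounded by floating point.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Tuple


@dataclass(frozen=True)
class LineItem:
    sku: str
    description: str
    quantity: Decimal
    unit_of_measure: str
    unit_price: Decimal
    line_total: Decimal


@dataclass(frozen=True)
class PurchaseOrder:
    po_number: str
    supplier: str
    order_date: str
    delivery_date: str
    payment_terms: str
    line_items: Tuple[LineItem, ...]


def parse_po(payload: dict) -> PurchaseOrder:
    """Convert a decoded JSON payload (assumed keys) into typed objects."""
    items = tuple(
        LineItem(
            sku=row["sku"],
            description=row["description"],
            quantity=Decimal(str(row["quantity"])),
            unit_of_measure=row["unit_of_measure"],
            unit_price=Decimal(str(row["unit_price"])),
            line_total=Decimal(str(row["line_total"])),
        )
        for row in payload["line_items"]
    )
    return PurchaseOrder(
        po_number=payload["po_number"],
        supplier=payload["supplier"],
        order_date=payload["order_date"],
        delivery_date=payload["delivery_date"],
        payment_terms=payload["payment_terms"],
        line_items=items,
    )
```

A missing key raises `KeyError` here, which surfaces a schema mismatch at the integration boundary rather than deeper in the ERP write.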
Yes, the API reads scanned and photographed purchase orders. It runs AI-based OCR, so native PDFs, scanned PDFs, and phone photos all go through the same endpoint. A scan or photo has no text layer, which is why a plain PDF text parser returns nothing on it, but the API reads the pixels and rebuilds the fields and line-item table.
Authentication uses a bearer token tied to your account, sent in the request header. The token needs the extract permission, which is available on the Pro plan. You can issue and manage tokens from your account, so you control which integrations are allowed to pull purchase order data.
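Sending the token in the request header typically looks like the sketch below, which uses Python's standard `urllib.request` and the conventional `Authorization: Bearer` scheme. The URL is a placeholder and `YOUR_TOKEN` stands in for a token issued from your account; the request is only built here, not sent.

```python
import urllib.request
from typing import Optional

# Hypothetical URL; the real endpoint path is in the API documentation.
EXTRACT_URL = "https://api.example.com/v1/extract"


def authorized_request(url: str, token: str,
                       body: Optional[bytes] = None) -> urllib.request.Request:
    """Build a request that carries the bearer token in the Authorization header."""
    method = "POST" if body is not None else "GET"
    request = urllib.request.Request(url, data=body, method=method)
    request.add_header("Authorization", f"Bearer {token}")
    request.add_header("Accept", "application/json")
    return request


request = authorized_request(EXTRACT_URL, "YOUR_TOKEN", b"{}")
```

Keeping the token out of source code, for example by reading it from an environment variable, makes it easier to rotate tokens per integration.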
Yes, the API extracts line items; that capture is the core of the extraction. The response includes the full line-item table as an array, with quantity, unit price, and total per row, even when the table spans multiple pages or descriptions wrap. That row-level detail is what makes downstream matching and spend analysis accurate.
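Row-level data also lets your own code sanity-check each line before it is posted. The sketch below flags rows where quantity times unit price does not equal the stated line total, within a small tolerance. The key names are assumed for illustration, and a flagged row is not necessarily an extraction error, since discounts or taxes on the source document can produce a legitimate difference.

```python
from decimal import Decimal


def line_total_mismatches(line_items: list,
                          tolerance: Decimal = Decimal("0.01")) -> list:
    """Return the indexes of rows where quantity * unit_price differs from line_total."""
    mismatched = []
    for index, row in enumerate(line_items):
        # Parse via str() so float inputs do not carry binary rounding into Decimal.
        expected = Decimal(str(row["quantity"])) * Decimal(str(row["unit_price"]))
        stated = Decimal(str(row["line_total"]))
        if abs(expected - stated) > tolerance:
            mismatched.append(index)
    return mismatched


# Made-up rows: the second one states 13.00 where 3 x 4.00 would be 12.00.
items = [
    {"quantity": 10, "unit_price": "2.50", "line_total": "25.00"},
    {"quantity": 3, "unit_price": "4.00", "line_total": "13.00"},
]
flagged = line_total_mismatches(items)
```

Rows returned by this check are candidates for manual review in your system before matching or spend reporting runs.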
No, the API does not match, approve, or track purchase orders. It is an extraction service that turns purchase order documents into structured JSON. Two-way or three-way matching, approval routing, and PO status tracking stay in your ERP or procurement system, which consumes the clean data the API provides.
How the line-item table is captured.
Why field-by-meaning beats templates.
Batch extraction without code.
Automate PO data entry end to end.
Get the data into your ERP.