readme.md
The Zanran Scaffolder server provides a web API that lets users automatically extract content from PDFs and images. It is designed primarily for reports (annual accounts, scientific papers, market reports, etc.). Zanran's Scaffolder engine automatically determines the structure and layout of these documents and separates the content into its constituent parts: blocks of text (e.g. paragraphs), tables, and images/graphics. It uses Computer Vision and Machine Learning and outputs data in structured formats such as Excel and XML. It is scalable and requires no manual intervention, pre-defined templates, training, or configuration. The software is language-agnostic and is built for automation / RPA environments that process millions of files.
Prerequisites
This connector accesses a free service for low-volume extraction of text and tables from PDFs. To use it, you need a user name (your email address) and a password of your choosing.
How to get credentials
Please register at: http://scaffolderlink.zanran.com/
Known issues and limitations
We recommend testing with 'native' (digitally generated) PDFs rather than scanned ones, so that OCR does not affect the results.
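If you are unsure whether a test file is a 'native' PDF, a rough local check can help before you upload it. The sketch below is not part of the Scaffolder service; `looks_native_pdf` is a hypothetical helper that assumes a native PDF references at least one font, while a purely scanned PDF contains only page images. It is a heuristic only: a scanned PDF that already carries a hidden OCR text layer will also be reported as native.

```python
import re
import zlib

# "/Font" as a complete PDF name (so "/FontDescriptor" alone does not count).
_FONT = re.compile(rb"/Font\b")
# Raw stream bodies; many PDFs keep their dictionaries in compressed streams.
_STREAM = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def looks_native_pdf(data: bytes) -> bool:
    """Heuristic (assumption): True if the PDF bytes reference a font."""
    if not data.startswith(b"%PDF-"):
        raise ValueError("not a PDF file")
    # Uncompressed dictionaries can be searched directly.
    if _FONT.search(data):
        return True
    # Otherwise try to inflate each stream and search the result.
    for match in _STREAM.finditer(data):
        try:
            chunk = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate-compressed (e.g. a JPEG image), skip it
        if _FONT.search(chunk):
            return True
    return False
```

To use it, read the file in binary mode and pass the bytes, for example `looks_native_pdf(open("report.pdf", "rb").read())`, where `report.pdf` stands for your own test file.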