OCR, Classification and Extraction for Appian
I spent this week at the AppianWorld Conference with one of our great partners, GxP Partners. GxP has done some really cool things with Ephesoft Transact, using our open capture platform and our OCR & Extraction web services to create ApnCapture. The solution ties in closely with a previous article I wrote, Document Capture+BPM, and below is a quick summary of the value of using intelligent document capture with any workflow tool:
With the rise of Document Capture as a Platform (CaaP), there is an enormous opportunity for organizations to leverage the power of capture as an intelligent document automation component to any business process or workflow solution. Here are the core use areas of document capture and automation with any Business Process Management System (BPMS):
1. The “Pre” – The logical fit is to use document capture software to “feed the beast”, or in other terms, as a front-end processor for inbound documents destined for workflows. You might ask, “Why? My BPM/Workflow solution has the capability to import documents.” Modern capture platforms add another dimension of automation through features like separation, document classification and automatic data extraction. Imagine a mortgage banking process where a single inbound PDF file houses 12 different document types. The power of capture is to auto-split the PDF, classify each document, extract information and then pass all of that in a neatly formatted package to the workflow engine. Now the workflow has a second dimension of intelligence, and it can use that to branch, route and execute. Platforms like Ephesoft Enterprise can ingest documents from email, folders, legacy document management systems, fax and also legacy capture (like Kofax and Captiva).
2. Mid-stream – What about activities during the workflow? Ones that are necessary mid-process? This is where the true power of a “platform” comes into play, and it requires a web services API (See other requirements of a Capture Platform in this article: 6 Key Components of a Document Capture Platform). Some examples of activities that can be accomplished through a capture platform API in workflow:
- Value Extraction – pass the engine a document and return extracted information.
- Read Barcodes – pass the engine an image, and read and return the value of a barcode.
- Classify a Document – pass the engine a document and return its document type.
- Create OCR – pass a non-searchable PDF and return a searchable file.
As you can imagine, this can provide extreme customization in any process that requires document automation, and can reduce end-user input, create added efficiency, and once again add that second dimension of intelligence after the workflow has begun. You can see an extensive list of API operations here: Document Capture API Guide
3. The “Post” – Depending on the process and requirements, a “post-process” capture may be in order. Most capture platforms have extensive integrations with third-party ECM systems like SharePoint, FileBound and OnBase, and can be leveraged as an integration point to these systems. In addition, there is a new wave in the big data and analytics world, with a focus on data contained within documents. Routing documents and data to analytics repositories can help organizations glean important insight into their operations. If you choose a capture platform with a tied-in document analytics component, this can be accomplished automatically.
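To make the “Pre” idea a little more concrete, here is a minimal sketch of what a “neatly formatted package” could look like once a single inbound PDF has been split, classified and extracted, and how a workflow might branch on it. This is not Ephesoft's or Appian's actual data model: the class names, field names, document types and route names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class CapturedDocument:
    """One document split out of an inbound file (hypothetical structure)."""
    doc_type: str                 # label assigned by classification
    pages: List[int]              # page numbers within the original PDF
    fields: Dict[str, str] = field(default_factory=dict)  # extracted values


@dataclass
class CapturePackage:
    """What a capture step could hand to the workflow for one inbound file."""
    source_file: str
    documents: List[CapturedDocument]


def choose_route(package: CapturePackage, required_types: Set[str]) -> str:
    """Pick a workflow branch based on which document types were found."""
    found = {doc.doc_type for doc in package.documents}
    missing = required_types - found
    if missing:
        # Hypothetical branch: ask the sender for whatever is missing.
        return "request_missing:" + ",".join(sorted(missing))
    return "continue_processing"


# Example: a loan file that arrived without an appraisal.
package = CapturePackage(
    source_file="loan_1234.pdf",
    documents=[
        CapturedDocument("loan_application", [1, 2, 3], {"borrower": "J. Smith"}),
        CapturedDocument("w2", [4], {"tax_year": "2015"}),
    ],
)
print(choose_route(package, {"loan_application", "w2", "appraisal"}))
# prints "request_missing:appraisal"
```

The point of the sketch is that the workflow never has to open the PDF itself. It branches on document types and extracted values that arrive alongside the file.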
ApnCapture: Capture and OCR for Appian
So, how did GxP implement ApnCapture and integrate with Appian? Below is a series of screenshots that give an overview from start to finish:
The capture process is initiated from any document source Ephesoft supports:
- Web-browser scanning
- Network folders
- Email + Attachments
- Mobile (Through our mobile client SnapDoc: Mobile OCR and Capture)
- CMIS-Based Repositories
- Custom Code
Those documents are processed, and if there are no confidence issues, they pass right through the process. If there are issues that require end-user correction or validation, users can access document batches through the ApnCapture Batch Report.
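The pass-through versus review decision above can be sketched in a few lines. This is only an illustration and rests on assumptions the article does not confirm: that each extracted field carries a confidence score between 0 and 1, and that a single cutoff decides whether a batch needs a person to look at it. The threshold value, function names and batch IDs are hypothetical.

```python
from typing import Dict, List, Tuple

REVIEW_THRESHOLD = 0.80  # hypothetical cutoff; a real deployment would tune this


def needs_review(field_confidences: Dict[str, float],
                 threshold: float = REVIEW_THRESHOLD) -> List[str]:
    """Return the names of fields scoring below the threshold, sorted."""
    return sorted(
        name for name, score in field_confidences.items() if score < threshold
    )


def triage(batches: Dict[str, Dict[str, float]]) -> Tuple[List[str], Dict[str, List[str]]]:
    """Split batches into those that pass straight through and those needing review."""
    passed: List[str] = []
    review: Dict[str, List[str]] = {}
    for batch_id, confidences in batches.items():
        low_fields = needs_review(confidences)
        if low_fields:
            review[batch_id] = low_fields   # would show up in a review report
        else:
            passed.append(batch_id)         # no confidence issues, pass through
    return passed, review


passed, review = triage({
    "BC1": {"invoice_number": 0.97, "total": 0.91},
    "BC2": {"invoice_number": 0.99, "total": 0.42},
})
print(passed)  # prints ['BC1']
print(review)  # prints {'BC2': ['total']}
```

In this sketch, only the batches in `review` would need to appear in something like the batch report, each with the specific fields an end user should correct or validate.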
Clicking any line in the Appian-produced form calls Ephesoft Web Services to open a validation and review screen.
Extracted data is then sent into Appian and can be used for all types of purposes: adding intelligence to workflows, enhancing business rules with data, and leveraging documents for approval and review.
Finally, all the extracted data can provide a deeper view of any capture-enabled process.
To find out more about Intelligent Capture and OCR for Appian, contact GxP Partners.