Contract Management: Ephesoft and SharePoint Online

SharePoint Contract Management

Capturing Contract Data for Analysis

We have had several requests recently to show how we can help in processing contracts and extracting metadata.  The video below uses Ephesoft Transact in two ways to process contracts:

  1.  Extracting historical contract data for analysis.  In example one, we utilize Ephesoft to import contract PDFs, classify them, and then extract pertinent data for routing to a SharePoint Contract library.
  2.  Routing and archiving new, inbound contracts.  This example brings in contracts from email, folders and other sources and classifies them, then places the contracts and data within a SharePoint Contract library.

Here is the overall Contract Management Solution:

 

 


Unstructured Data and the Cloud: The Benefits of Capture as a Service

Document Capture & Analytics in the Cloud

We launched the first fully functional Capture as a Service (CaaS) offering in Microsoft Azure this week at the Microsoft Inspire Partner Conference.  We were helped along the way by one of our larger partners, which had high demand for Capture as a Service, and we were seeing more and more requests for the intelligent processing of unstructured data in the Cloud.   Below are some core benefits of our cloud offering:


Time to Value – On-premise software implementations can be a long-term journey and require additional budget for hardware, IT resource time and long budget cycles for capital expenditure approvals.  With the Ephesoft Transact Cloud, your time to value is minimized and your intelligent capture platform can be up and running in a fraction of the time.  Want to see how you can calculate a quick ROI on CaaS?  See our recorded webinar here: The Ephesoft Effect

 


Cost of Ownership – Software as a Service (SaaS) reduces the overall solution cost of ownership by including support, and eliminating the need for hardware, backups, monitoring, dedicated administration and overall management.  By including these costs in one recurring fee, complexity and overhead are reduced, and IT spend becomes more predictable.

 

Competitiveness – No longer is intelligent document capture only for large organizations with an army of IT folks.  Now smaller organizations can have access to enterprise-class technology, and glean all the advantages and efficiency to stay competitive and challenge their larger rivals.

 

Gartner estimates that the annual cost of owning and managing software applications can be as much as four times the cost of the initial purchase.

 

Access to Innovation – With SaaS, access to the latest and greatest software is included.  As Ephesoft improves its processing engine, you can immediately take advantage of the added efficiency.   Your subscription provides continuous value, and appreciates over time as more features and functionality are added.

 

Scalability and Agility – The Ephesoft Transact Cloud is built for maximum scalability and agility.  You can easily add more cores, features and processing power, depending on your requirements and needs.  You can start small, and grow with our flexible subscription model.

 

Capture Anywhere – The Ephesoft Cloud provides intelligent document capture anywhere, on any device.  You can automate document processes with a browser, a smart phone or a tablet.  This allows today's distributed workforce to access all the benefits Ephesoft can provide.

 

Read more on our offering in the Cloud here:  Unlocking Unstructured Data: Document Capture, OCR and Scanning in the Cloud

 

AP Invoice Processing ROI & Value

Invoice Capture and OCR

Scanning and Capturing Invoices: Justifying AP Solution Costs

We had a great webinar this past week, all about using the efficiency and automation in Ephesoft to drive reduction in errors, and cost savings.   This rings especially true in the complex, unstructured world of invoice scanning and capture as it relates to software.  See the webinar below:

 

 

Using Ephesoft to Add Intelligent Automation to Microsoft Technologies

OCR and Automation for SharePoint

Ephesoft Automation for SharePoint, Azure, BI, Flow and Dynamics

We are ramping up our team for the Microsoft Inspire Conference (Booth 1237) in Washington, DC in a few weeks (July 9-13), and I thought I would put together some ideas on the power of Ephesoft technology when combined with Microsoft technologies.  We have been working with several Microsoft teams (Azure, SharePoint, Flow) to bring solutions to market, and provide extensive document-centric solutions to their partner and customer ecosystem.  So how do we fit?  I will outline a quick primer.

 

Just Who Is Ephesoft?

Ephesoft was founded in 2010 by leaders from the document capture industry that wanted to drive innovation and disrupt the legacy document automation space.  The company has shown explosive growth through its unique perspective on taming unstructured content using patented complex analytics and machine learning.   Its technology has garnered broad interest, and investment from top-tier firms like Fujitsu and In-Q-Tel.

Just What Does Ephesoft Technology Do?

At the heart of Ephesoft Technology is an engine that provides automated document classification and data extraction.  Feed it documents from any source (fax, scanners, copiers, folders, legacy ECM systems, mobile devices, repositories) and it will do all the heavy lifting –  sorting, separating, classifying and getting you the data you need to drive efficiency, productivity, automation and decision-making with minimal end-user intervention.  Providing SaaS and PaaS solutions, and available on premise or in the cloud, the Ephesoft platform can provide great value to any size organization.  Ephesoft has two products:

Ephesoft Transact – a transaction document capture platform for day-to-day document processing.

Ephesoft Insight – a document analytics platform for ingesting large volumes of existing unstructured content and extracting meaning.

How Does Ephesoft Fit With Microsoft?

Think of Ephesoft as an added intelligent document automation layer that can be placed on top of other technologies as a catalyst for automation.  Below is a list of core technologies from Microsoft, and how Ephesoft can fit from a business perspective.

Microsoft SharePoint and Ephesoft

With SharePoint, Ephesoft Transact can be an intelligent on-ramp for documents into SharePoint libraries.   As a front-end loader, Transact can auto-identify and route documents from just about any source, and make sure they wind up in the right library, as a searchable PDF, with all the important metadata extracted.   It provides a standardized, repeatable process for adding any type of document to Microsoft SharePoint.

With Ephesoft Insight, SharePoint libraries can now be consumed and leveraged for Document Analytics.  Insight provides the “document side” of the analytics equation.

You can get more information here:

Ephesoft/SharePoint Integration

Email Classification with SharePoint

Microsoft Flow and Ephesoft

Utilizing Ephesoft Web Services in the cloud, you can add intelligence to any Microsoft Flow workflow.   Using the classification or extraction services, you can use Ephesoft Transact technology to “open up” documents mid-process, and make workflow branching decisions based on what you find.   An example of a Flow use case here:

Ephesoft and Microsoft Flow with SharePoint Online

Scanning to Microsoft Flow

Microsoft Dynamics and Ephesoft

ERP and Accounting systems can leverage the power of Ephesoft in many different ways.  As a processing engine, Ephesoft Transact can extract information from critical documents, like invoices or sales orders, and pass the information on to Dynamics.  No longer will employees have to hand key information, and waste precious time.  Along with time savings, data entry errors can now be eliminated through Ephesoft Transact’s validation and exception processing capabilities.  More info:

Ephesoft Accounting ERP Solutions

Microsoft Azure and Ephesoft

Document capture and automation is a great fit for the cloud.  Ephesoft’s web-based technology and RESTful APIs are cloud ready, and are available in Microsoft Azure.  As a Cloud Infrastructure partner, Ephesoft has worked diligently to ensure compatibility with Azure, and also to take advantage of all the cloud has to offer from a scalability and availability perspective.  Read more on Ephesoft’s cloud platform:

Ephesoft Capture in the Cloud

This is just a short list of possibilities.  Ephesoft’s products are built for partners, and have an open architecture to facilitate the building of portable solutions to add value and drive revenue.  Come see  us at Inspire (Booth 1237), or reach out to us directly for more information: Contact Us.

 

Invoice Scanning and Capture Software

Invoice Scanning Software

Using OCR Invoice Scanning Software

There are typically 3 types of invoice scanning software solutions that companies implement.  If you need a quick primer before you read on, see our terminology page here: Invoice Scanning Software.   Below are the 3 types of solutions:

  1.  Scan Invoices and Extract the Vendor and Total with OCR – As a baby-step solution, many Accounts Payable departments look to start with a minimal impact solution to add some level of automation and efficiency.  They scan their invoices with a copier or scanner, and pass them through an application like Ephesoft Transact for AP Invoices.   Users can point and click in the interface to populate fields, type in information manually, or use ERP or accounting system lookups for their vendors, and then output their scanned invoices to structured folders, an accounting Microsoft SharePoint Library, or another content management system like Alfresco.
  2. Capture All Invoice Header Information – The next step is usually to capture all the scanned invoice header information (with document analytics + OCR).  Here is the typical information:
    1.  Invoice Number
    2. Invoice Date
    3. Invoice Vendor Name
    4. Invoice Total

This information can be extracted automatically through data extraction, and modern solutions use an analytics algorithm to improve accuracy and reduce end-user interaction with the invoice scanning process.   Once again, all accounts payable invoice information is sent to an end resting place, and typically in this level of solution, the information is sent to an accounting system automatically.
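
To make the idea of header extraction concrete, here is a minimal sketch in Python that pulls three of the header fields out of raw OCR text with regular expressions.  This is only an illustration of simple pattern-based extraction and is not how Ephesoft Transact works internally; the names `HEADER_PATTERNS` and `extract_header`, the label wording and the date format are all assumptions made for the example.  The vendor name is left out because it usually needs a lookup rather than a pattern.

```python
import re
from typing import Optional

# Hypothetical label patterns; real invoices vary far more than this.
HEADER_PATTERNS = {
    "invoice_number": re.compile(
        r"\bInvoice\s*(?:No\.?|Number|#)\s*:?\s*([A-Z0-9-]+)", re.I
    ),
    "invoice_date": re.compile(
        r"\b(?:Invoice\s*)?Date\s*:?\s*(\d{1,2}/\d{1,2}/\d{4})", re.I
    ),
    # \b stops "Subtotal" from being mistaken for "Total".
    "invoice_total": re.compile(
        r"\bTotal\s*(?:Due)?\s*:?\s*\$?\s*([\d,]+\.\d{2})", re.I
    ),
}


def extract_header(ocr_text: str) -> dict[str, Optional[str]]:
    """Return the first match for each header field, or None if absent."""
    result: dict[str, Optional[str]] = {}
    for name, pattern in HEADER_PATTERNS.items():
        match = pattern.search(ocr_text)
        result[name] = match.group(1) if match else None
    return result


sample = (
    "ACME Supply Co.\n"
    "Invoice Number: INV-10042\n"
    "Invoice Date: 07/14/2017\n"
    "Subtotal: $1,150.00\n"
    "Total: $1,234.50\n"
)
print(extract_header(sample))
```

A field that comes back as `None` is a natural candidate for end-user review, which is where analytics-driven extraction aims to cut down the manual work.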

3.  Scan to Capture Invoice Header and Line Item Data – Top tier invoice scanning applications provide the ability to extract all the information on the invoice automatically.  They do all of the above, and also provide line item/table extraction of invoice data.  This extracted information can be used to compare the invoice to the purchase order, and provide line item matching capability.  In addition, all OCR invoice data can be routed to an ERP or accounting system for update.
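
As a rough illustration of line item matching, the sketch below compares extracted invoice lines against purchase order lines in Python.  The dictionary layout (`item`, `qty`, `unit_price`) and the function name `match_line_items` are assumptions for the example, not a format produced by any particular product.

```python
from decimal import Decimal


def match_line_items(invoice_lines, po_lines):
    """Compare invoice lines to PO lines by item code and list discrepancies."""
    po_by_item = {line["item"]: line for line in po_lines}
    issues = []
    for line in invoice_lines:
        po = po_by_item.get(line["item"])
        if po is None:
            issues.append((line["item"], "not on purchase order"))
        elif line["qty"] != po["qty"]:
            issues.append((line["item"], "quantity mismatch"))
        # Decimal avoids float rounding surprises when comparing prices.
        elif Decimal(line["unit_price"]) != Decimal(po["unit_price"]):
            issues.append((line["item"], "price mismatch"))
    return issues


po = [
    {"item": "A-100", "qty": 10, "unit_price": "9.50"},
    {"item": "B-200", "qty": 2, "unit_price": "120.00"},
]
invoice = [
    {"item": "A-100", "qty": 10, "unit_price": "9.50"},
    {"item": "B-200", "qty": 3, "unit_price": "120.00"},
    {"item": "C-300", "qty": 1, "unit_price": "15.00"},
]
print(match_line_items(invoice, po))
```

An empty list means every invoice line agrees with the purchase order; anything else can be routed to an exception queue.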

There are tools, like Ephesoft Transact, that can provide all 3 invoice scanning solutions without the need to add additional modules or components.

 

Machine Learning and Distributed Document Capture and Scanning

Machine Learning for Copier Scanning

Using Copiers to Machine Learn Documents

I have been working with several of our MFP/Copier partners, and wanted to put together a video demo on how to use copiers to train Ephesoft when it comes to our machine learning engine.  This demo shows how you can use our document analytics engine and train it on HR documents.

 

Thoughts?

Appian and Intelligent Capture: Onramping to Workflow

OCR and Extraction for BPM

OCR, Classification and Extraction for Appian

I spent this week at the AppianWorld Conference with one of our great partners, GxP Partners.  GxP has done some really cool things with Ephesoft Transact, leveraging our open capture platform and OCR & Extraction web services to create ApnCapture.   The solution is tightly tied to a previous article I wrote, Document Capture+BPM, and below is a quick summary of the value of using intelligent document capture with any workflow tool:

With the rise of Document Capture as a Platform (CaaP), there is an enormous opportunity for organizations to leverage the power of capture as an intelligent document automation component to any business process or workflow solution.  Here are the core use areas of document capture and automation with any Business Process Management System (BPMS):

  1. The “Pre” – The logical fit is to use document capture software to “feed the beast”, or in other terms, as a front-end processor for inbound documents destined for workflows.  You might ask, “Why?  My BPM/Workflow solution has the capability to import documents.”  Modern capture platforms add another dimension of automation through the use of several features like separation, document classification and automatic data extraction.  Imagine a mortgage banking process where a PDF document is sent inbound that houses 12 different document types in a single PDF file.  The power of capture is to auto-split the PDF, classify each document, extract information and then pass all of that in a neatly formatted package to the workflow engine.   Now, the workflow has a second dimension of intelligence, and it can use that to branch, route and execute.  Platforms like Ephesoft Enterprise have the ability to ingest documents from email, folders, legacy document management systems, fax and also legacy capture (like Kofax and Captiva).

2.  Mid-stream –  What about activities during the workflow?  Ones that are necessary mid-process?  This is where the true power of a “platform” comes into play, and it requires a web services API (See other requirements of a Capture Platform in this article: 6 Key Components of a Document Capture Platform).   Some examples of activities that can be accomplished through a capture platform API in workflow:

  • Value Extraction – pass the engine a document and return extracted information.
  • Read Barcodes – pass the engine an image, and read and return the value of a barcode.
  • Classify a Document – pass a document and identify what it is
  • Create OCR – pass a non-searchable PDF and return a searchable file.

As you can imagine, this can provide extreme customization in any process that requires document automation, and can reduce end-user input, create added efficiency, and once again add that second dimension of intelligence after the workflow has begun.  You can see an extensive list of API operations here: Document Capture API Guide

3.  The “Post” – Depending on the process and requirements, a “post-process” capture may be in order.  Most capture platforms have extensive integrations with 3rd party ECM systems like SharePoint, Filebound and Onbase, and can be leveraged as an integration point to these systems.  In addition, there is a new wave in the big data and analytics world, with a focus on data contained within documents.   Routing documents and data to analytics repositories can help organizations glean important insight into their operations.   If you choose a capture platform with a tied-in document analytics component, this can be accomplished automatically.

ApnCapture: Capture and OCR for Appian

So, how did GxP implement ApnCapture and integrate with Appian?  Below is a series of screen shots as an overview from start to finish:

The capture process is initiated from any document source Ephesoft supports:

  • Web-browser scanning
  • Copiers/MFPs
  • Network folders
  • Email + Attachments
  • Mobile (Through our mobile client SnapDoc: Mobile OCR and Capture)
  • CMIS Based Repositories
  • Custom Code

Those documents are processed, and if there are no confidence issues, they pass right through the process.  If there are issues that require end-user correction or validation, users can access document batches through the ApnCapture Batch Report.
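
The pass-through versus review decision described above can be sketched in a few lines of Python.  This is a simplified illustration only: the `ExtractedField` class, the `route_batch` function and the 0.80 threshold are assumptions made for the example, not part of ApnCapture or Ephesoft Web Services.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 0.80  # hypothetical cut-off; tune per document type


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by the capture engine


def route_batch(fields):
    """Send a batch straight through unless any field falls below the threshold."""
    needs_review = [f.name for f in fields if f.confidence < REVIEW_THRESHOLD]
    if needs_review:
        return ("review", needs_review)
    return ("straight_through", [])


batch = [
    ExtractedField("invoice_number", "INV-10042", 0.98),
    ExtractedField("invoice_date", "07/14/2017", 0.61),
]
print(route_batch(batch))
```

Returning the names of the low-confidence fields lets a review screen point the user straight at what needs attention.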

Document Batches Are Queued for End User Review

Clicking any line in the Appian produced form interacts with Ephesoft Web Services to open a validation and review screen.

Documents Can Be Reviewed Through the ApnCapture Web Interface

Extracted data is then sent into Appian and can be used for all types of purposes: adding intelligence to workflows, enhancing business rules with data, and leveraging documents for approval and review.

Extracted Invoice Data is Presented in Appian for Approval

Finally, all the extracted data can provide a deeper view of any process that is capture enabled.

Captured Invoice Data is Presented for a Deeper Look Into the AP Process

To find out more about Intelligent Capture and OCR for Appian, contact GxP Partners.

 

GDPR and Compliance: Are Documents the Enterprise Minefield?

What is your privacy strategy for documents and content repositories?

The new General Data Protection Regulation (GDPR) is set to replace the older Data Protection Directive in the EU on May 25, 2018.  This new roll out of privacy protections for EU nations has broad and expansive implications for any company within the realm of the EU, or those that process EU citizen information and data.   Here is a summary of the major changes:

  • GDPR jurisdiction now applies to all organizations that process EU subject personal data, regardless of the organization's location.
  • Breaches of GDPR can draw fines of up to 4% of global annual turnover or 20M Euros (whichever is larger).
  • Consent when providing personal information must be clear and easy to understand.
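
To put the fine ceiling in perspective, here is a small Python sketch of the "whichever is larger" arithmetic.  The function name and the sample turnover figures are made up for illustration, and the result is only the upper limit described above; actual fines are set case by case by the regulator.

```python
FLAT_CEILING_EUR = 20_000_000  # the 20M Euro figure from the summary above


def max_gdpr_fine_eur(global_turnover_eur: int) -> int:
    """Upper limit of a fine: 4% of global turnover or 20M Euros, whichever is larger."""
    four_percent = global_turnover_eur * 4 // 100  # integer math avoids float rounding
    return max(four_percent, FLAT_CEILING_EUR)


# A company with 1 billion Euros in turnover: 4% is 40M, which exceeds 20M.
print(max_gdpr_fine_eur(1_000_000_000))
# A company with 100 million Euros in turnover: 4% is only 4M, so the 20M ceiling applies.
print(max_gdpr_fine_eur(100_000_000))
```

The second case is the one smaller organizations tend to overlook: the 20M Euro figure is the ceiling even when 4% of turnover is far lower.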

There are a set of core subject rights that apply, and below is a quick summary:

  • Breach Notification – any data breach requires notification within 72 hours.
  • Right to Access – subjects can request an electronic copy of all private data at any time.
  • Right to be Forgotten – aka Data Erasure, a subject at any time can request to have all private data removed from a controlling organization's systems.
  • Data Portability – subjects can request to have their information transferred to another organization at any time.  This will go hand in hand with the “right to be forgotten”.
  • Privacy by Design – now a legal requirement, organizations must show proof of “…appropriate technical and organizational measures…” within any system or process.
  • Data Protection Officers (DPOs) – certain organizations, such as public authorities and those processing sensitive personal data on a large scale, will now require DPOs.  This individual will be responsible for interfacing with EU nations and authorities, and will carry the heavy burden of responsibility for all data protection efforts.

So, with that quick outline, imagine the implications of  millions of application documents with personal information that are breached.    What about the accidental scan of medical records to an insecure document sync folder?  Or the directory of millions of scanned documents that have a few documents with private information?

Organizations need a two-pronged approach to clear the document minefield.   To get this under control and mitigate risk, two types of technology need to work hand in hand.

GDPR Document Strategy: Transactional and Analytical

First, a document and content capture technology that works as an ingestion point for new content and existing document-centric processes.  This form of enterprise input management can be placed as a non-invasive automation layer to flag/identify suspect content and provide reporting capabilities around private information for compliance.  This layer is focused on day-forward transactions.

Second, a solution to crawl existing repositories to classify, extract and identify documents that pose a risk.  This technology can work hand in hand with the transactional layer to build machine learning profiles, and establish analytical libraries of document and data profiles so the analytical side can become proactive and preemptive.  This can be a critical step in identifying possible legacy documents that house private information that could be subject to GDPR fines.

So, where does Ephesoft fit?  We have two products that span the transactional and analytical requirements to help organizations capture, classify, identify and visualize their documents in a broad sense, and comply with GDPR privacy rules.

Ephesoft Provides GDPR and Privacy Solutions for Documents

For the day-to-day, we have Ephesoft Transact, and for deep analytics, we have Ephesoft Insight.   If you need further information, you can contact us here: Ephesoft GDPR Solution Information.

 

The Borderless Enterprise, The Cloud and Capture 2.0

Cloud Document Content and Capture

Capture is the New Intelligent Document Transport Layer

 

As Enterprise infrastructure gets more and more complex, especially with a move to cloud content and line of business systems, organizations struggle with creating what I will call an “Intelligent Document Transport” layer.  The ability to move documents from system to system and maintain data integrity and standardization is paramount to driving organizational efficiency.  With 72% of larger organizations having 3 or more repositories, and 25% having 5 or more, allowing a seamless interchange of documents and data seems more like a dream than an actual reality. In addition, legacy, in-place capture systems just lack the modern web service oriented architecture to allow the adaptability and flexibility required to work with modern cloud infrastructure.  These “Fat” client applications are often laden with complex, host-based SDKs and legacy code, requiring extensive development cycles and specialized skill sets to extend and integrate. Here are some of the core challenges in organizations lacking this transport/integration layer:

  • Lack of Document “Intelligence” – many organizations move documents throughout their systems as a closed entity.  They may know it is  a PDF or a Word document, and that it came from accounting, but beyond that, it is a digital mystery.  They usually have limited data or information, and this usually requires human intervention or hand keying of info.
  •  Lost in Translation – As documents move from department to department, person to person and system to system, things get lost in translation.  Information may be misinterpreted, data may be lost or the interpretation may be different.
  • Lack of Standardization and Normalization – Without a standardized transport layer, problems begin to arise.  Take this simple example:  The difference in file naming.  Maybe one department calls it W4, another W_4, and yet a third W-4.  As documents flow back and forth, between systems, think of the headaches this minor difference can create in reporting, workflow and overall system operation.
  • Unified Security – the ability for users and integrations to span the on premise world and the cloud ether is reliant on complex authentication and authorization.  In this day and age, having centralized reporting and audit capability on document transactions can be critical, and a single sign on capability is required.
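
The W4 / W_4 / W-4 naming problem is easy to show in code.  Below is a minimal Python sketch of the kind of normalization a standardized layer could apply; the function name `normalize_doc_type` and the rule itself (strip everything but letters and digits, then upper-case) are assumptions chosen for the example.

```python
import re


def normalize_doc_type(name: str) -> str:
    """Collapse spelling variants of a form name to one canonical key."""
    # Drop every character that is not a letter or digit, then upper-case.
    return re.sub(r"[^A-Za-z0-9]", "", name).upper()


variants = ["W4", "W_4", "W-4", "w 4"]
print({normalize_doc_type(v) for v in variants})
```

All four spellings collapse to the same key, so reporting and workflow rules only have to deal with one name.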

So, what is required to eliminate these challenges and create an efficient document transport layer to connect people, departments and systems?  As I noted in my previous post about the new keys to digital transformation, companies are realizing the benefits of new age application architecture: open modular platforms, cloud adaptive technology, scale up and scale down and rapid deployment.  New age document capture and analytics platforms, like Ephesoft Transact, encompass these modern traits, and help create a smooth and efficient document transport layer through the following:

  • Bundling both the document and metadata in an intelligent “suitcase”.  When documents enter the capture layer, they are immediately classified into document types, and appropriate data is extracted.  All of this information travels with the document until it reaches its destination, and the document and data are translated into the required format.
  • Breaking down the barriers that exist between on-premise and cloud systems.  With a platform built for cloud adaptation, there are now no barriers between systems in corporate data centers and cloud based services.  New age capture platforms can now reside anywhere, and inter-operate with all types of repositories and applications.
Document Transport Enables a “Borderless” Enterprise
  • Creation of standard processing workflows and business rules.  Creating repeatable processes that are standardized regardless of the user, device or system reduces errors and streamlines operations.  Document processing becomes predictable, more efficient and agile when the need for change arises.
  • Security is enforced and an audit trail created.  With a single system that is the epicenter of document traffic, all transactions can be tracked and logged.  With authentication that spans all systems (through single sign on), access can be granted to only documents and systems that are in a user's security realm.
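
One way to picture the “suitcase” from the first bullet is as a small data structure that keeps the document reference, its type and its extracted data together until delivery.  The Python sketch below is purely illustrative: the `DocumentSuitcase` class, its field names and the JSON output format are assumptions, not an Ephesoft data model.

```python
import json
from dataclasses import dataclass, field


@dataclass
class DocumentSuitcase:
    """Hypothetical container: a document plus the data captured from it."""

    file_name: str
    doc_type: str  # assigned when the document is classified
    fields: dict[str, str] = field(default_factory=dict)  # extracted metadata

    def to_json(self) -> str:
        """Translate the suitcase into JSON for a destination that expects it."""
        return json.dumps(
            {"file": self.file_name, "type": self.doc_type, "fields": self.fields},
            sort_keys=True,
        )


suitcase = DocumentSuitcase("scan_0001.pdf", "W-4", {"employee": "Jane Doe"})
print(suitcase.to_json())
```

Other destinations would get their own translation method (for example, a mapping to library columns), while the suitcase itself stays the same from capture to delivery.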

New age capture technologies break down the barriers that exist, create a “borderless” Enterprise, and allow the exchange of documents and their associated data to enable improved efficiency and productivity.  Thoughts?

 

SharePoint and OCR 2.0: Out with The Old

Sharepoint optical character recognition

Using Adaptive OCR Technology & Analytics to Drive SharePoint Efficiency and Adoption

Optical Character Recognition technology, or OCR, has been around for quite some time.  It really became mainstream back in the ’70s when a man named Ray Kurzweil developed a technology to help the visually impaired.    He quickly realized the broad commercial implications of his invention, and so did Xerox, who purchased his company.   From there, OCR experienced broad adoption across all types of use cases.

At its simplest, OCR is a means to take an image and convert recognized characters to text.  In the Enterprise Content Management (ECM) world, it’s this technology that provides a broad range of metadata and content collection methods as documents are scanned and processed.   Here are the basic legacy forms of OCR that can be leveraged with SharePoint:

  • Full Text OCR – converts the entire document image to text, allowing full text search capabilities.  Using this OCR with SharePoint, documents are typically converted to an Image+Text PDF, which can be crawled, and the content made fully searchable.
  • Zone OCR – Zoning provides the ability to extract text from a specific location on the page.  In this form of “templated” processing, specific OCR metadata can be extracted and mapped to a SharePoint column.  This method is appropriate for structured documents that have the data in the same location.
  • Pattern Matching OCR – pattern matching is purely a method to filter, or match patterns within OCR text.  This technique can provide some capabilities when it comes to extracting data from unstructured, or non-homogeneous documents.  For example, you could extract a Social Security Number pattern (XXX-XX-XXXX) from the OCR text and map it to a SharePoint column.
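
The Social Security Number example in the last bullet can be sketched with a regular expression in Python.  This is a minimal illustration of pattern matching over OCR text in general; the names `SSN_PATTERN` and `find_ssn_candidates` are made up for the example, and the step of mapping the result to a SharePoint column is left out.

```python
import re

# Matches the XXX-XX-XXXX layout; \b keeps it from matching inside longer digit runs.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_ssn_candidates(ocr_text: str) -> list[str]:
    """Return every substring of the OCR text shaped like XXX-XX-XXXX."""
    return SSN_PATTERN.findall(ocr_text)


sample = "Employee: Jane Doe  SSN: 123-45-6789  Phone: 555-867-5309"
print(find_ssn_candidates(sample))
```

Note that the pattern only checks the shape of the text.  It has no idea whether the value really is a Social Security Number, which is exactly the limitation of working purely at the text level.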

These forms of OCR are deemed as legacy methods of extraction, and although they can provide some value when utilized with any document process that involves SharePoint, they are purely data driven at the text level.

In steps OCR 2.0.  Today, innovators like Ephesoft leverage OCR as the very bottom of their document analytics and intelligence stack.   The OCR text is now pushed through algorithms that create meaning out of all types of dimensions: location, size, font, patterns, values, zones, numbers, and more (You can read about this patented technology here: Document Analytics and Why It Matters in Capture and OCR ).  So rather than just being completely data-centric, or functioning at the text layer, we now create a high-functioning intelligence layer that can be used beyond just text searching and metadata.  And the best part?  This technology has been extended to non-scanned files like Office documents.   Examples?  See below:

  • Multi-dimensional Classification – using that analysis capability (with OCR as algorithm input), and all the collected dimensions of the document, document type or content type can now be accurately identified.  As documents are fed into SharePoint, they can be intelligently classified, and that information is now actionable with workflows, retention policies, security restrictions and more.  You can see more on this topic in this video on Multi-dimensional Classification Technology: Machine Learning and Classification of Documents
  • Machine Learning – legacy OCR technology provided no means or method to “get smarter” as documents were processed.  Just looking at pure text, it either recognized it, or not.  With a machine learning layer, you now have a system that gets more efficient the more you use it.   The key here is that learned intelligence must span documents, it cannot be tied to any one item.  It’s this added efficiency that can drive SharePoint usage and adoption through ease of use.  You can see more on machine learning in the videos below:

Machine Learning and OCR Data Extraction

Machine Learning and External Data

  • Document Analytics, Accuracy and Extraction – with legacy OCR, extracting the information you need can be problematic at best.  How do you raise confidence that the information you have is accurate?  With an analysis engine, we look not just at the text, but where it sits, what surrounds it, and known patterns or libraries.  This added layer provides the ability to express higher confidence in data extraction, and makes sure you are putting the right data into SharePoint.

This was just a quick overview of the benefits of moving away from legacy OCR, and embracing OCR 2.0 for SharePoint. Thoughts?