Teaching Machines to Understand Documents
When I first started out in the document capture and ECM world, I found myself sitting across from a CIO, presenting our technology, when he started asking pointed questions about configuration and services. We talked for about 15 minutes, then he stopped, and I could see the gears turning. He looked at me and said: “Why do you guys make it so damn hard?” I asked, “What do you mean?” He responded: “Why all the configuration and setup time? Why can’t it just understand my documents, and what I am trying to accomplish? I know that current technology is capable.” At that time, the industry relied heavily on regular expressions, a pattern-matching language that originated in 1956 and was born out of mathematical theory. So the CIO hit the nail on the head: we were using 1950s math theory to provide automation and value, but it came at a steep cost in expertise and services. Here we are 10 years later, and the majority of the industry still uses that same method to analyze, classify and extract data.
Rise of the Machines
In the document automation space, we typically present a magic world to end users, one where they just hit a button or upload a document, and stuff happens “automagically”. In reality, a lot of work goes on behind the scenes to get to that point, and the burden falls on IT in the form of education, configuration, service costs and testing. Machine learning strives to eliminate that burden by replacing configuration with simple training of the system. I think the goal, although lofty, is to reduce or eliminate configuration to the point that any user can create a workable system.
So, in document capture, what can machine learning provide? In modern document automation technologies, like Ephesoft’s Capture Platform, machine learning can be leveraged in several ways:
- Classifying Documents – If I had Ephesoft back in the day, I could have really made an impression on the CIO. With Ephesoft’s training interface, I can take my different types of documents and train the system. As I drag and drop new types of documents into the system, it “learns” all the nuances of the document. It understands the structure, the words, their proximity, typeface and other information and uses that as key identifiers in the process. For more on the extent of our Document Analytics/Analysis engine, see this post: Document Analysis, Analytics and Capture.
- Intelligence & Confidence – Just like people, good machines know when to ask questions and when to admit they might be wrong. In a machine learning environment, having a mechanism to ask questions is key. In document capture, this comes in the form of an established confidence level and a voting algorithm that can call attention to documents or data in question. When these questions are answered, the machine learns and gets smarter.
- Gathering Information – Just as we learned through experience growing up, the machine needs to learn from every interaction. Any form of human input should add to its understanding and overall document intelligence. Click on a missed piece of data, and the system now knows its location and its format. It also knows the proximity of other words, and gains an enhanced understanding of new dimensions of that document.
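To make the three ideas above a little more concrete, here is a minimal sketch in Python. It is a toy, not Ephesoft’s engine: the class name `TinyDocumentClassifier`, the `review_threshold` parameter and the simple word-voting score are all assumptions made for illustration. Each known word casts a vote for a document type, the winning share of votes serves as the confidence, and anything below the threshold is flagged for a person to review.

```python
from collections import Counter, defaultdict


class TinyDocumentClassifier:
    """Toy word-voting classifier (hypothetical; for illustration only)."""

    def __init__(self, review_threshold=0.6):
        # Below this confidence, the document is routed to a human.
        self.review_threshold = review_threshold
        # Maps each document type to the words seen in its training samples.
        self.word_counts = defaultdict(Counter)

    def train(self, doc_type, text):
        """Learn from a sample document, or from a human correction."""
        self.word_counts[doc_type].update(text.lower().split())

    def classify(self, text):
        """Return (doc_type, confidence, needs_review)."""
        words = text.lower().split()
        # Each word votes for every type that has seen it before.
        scores = {
            doc_type: sum(1 for w in words if w in counts)
            for doc_type, counts in self.word_counts.items()
        }
        total = sum(scores.values())
        if total == 0:
            # Nothing recognized: admit it and ask a person.
            return None, 0.0, True
        best = max(scores, key=scores.get)
        confidence = scores[best] / total
        return best, confidence, confidence < self.review_threshold


clf = TinyDocumentClassifier()
clf.train("invoice", "invoice number total amount due")
clf.train("resume", "resume experience education skills")

# A clear-cut document is classified without human help.
print(clf.classify("invoice total due"))

# An ambiguous document is flagged for review...
print(clf.classify("experience total"))

# ...and the reviewer's answer is fed back in as new training data.
clf.train("resume", "experience total")
print(clf.classify("experience total"))
```

A production system would weigh far more than word overlap (layout, proximity, typeface and so on), but the loop is the same: classify, check confidence, ask when unsure, and learn from the answer.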
These are just a few examples of machine learning and what it brings to the document capture industry. More to come when we release our new version, 4.1, at our conference next week.