SharePoint OCR and Conversion

SharePoint Library OCR

SharePoint OCR (Optical Character Recognition) creates searchable PDF documents in SharePoint libraries so that they can be crawled and indexed for full-text search. This usually requires an OCR server for processing, as well as an OAuth integration for authentication. Enhanced engines, such as Ephesoft, can also extract data, classify document types, and write the results to SharePoint columns. There are two types of SharePoint OCR:

  • In-place OCR – this crawls the library and performs the conversion “in place”, that is, on documents already stored in the library.
  • Pre-OCR – this performs the image-to-text conversion before the document is added to the SharePoint library.
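
The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular product works: the Library class is a hypothetical in-memory stand-in for a SharePoint document library, and fake_ocr is a placeholder for a real call to an OCR server. Neither is part of any SharePoint or Ephesoft API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Document:
    name: str
    content: bytes
    searchable: bool = False  # True once the file has a text layer


@dataclass
class Library:
    """Hypothetical in-memory stand-in for a SharePoint document library."""
    documents: Dict[str, Document] = field(default_factory=dict)

    def add(self, doc: Document) -> None:
        self.documents[doc.name] = doc


OcrEngine = Callable[[Document], Document]


def fake_ocr(doc: Document) -> Document:
    """Placeholder for an OCR server call; it only marks the document searchable."""
    return Document(doc.name, doc.content, searchable=True)


def in_place_ocr(library: Library, ocr: OcrEngine) -> int:
    """In-place OCR: crawl the library and convert documents already stored in it."""
    converted = 0
    for name, doc in list(library.documents.items()):
        if not doc.searchable:
            library.documents[name] = ocr(doc)
            converted += 1
    return converted


def pre_ocr_upload(library: Library, doc: Document, ocr: OcrEngine) -> None:
    """Pre-OCR: convert the document first, then add it to the library."""
    library.add(doc if doc.searchable else ocr(doc))
```

With in_place_ocr, image-only files sit in the library until the crawl reaches them; with pre_ocr_upload, a file is never stored without its text layer.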