Insights into ABBYY FlexiCapture
What it is, what it does and how it saves you money
For over 20 years ABBYY has been developing software technologies for automated text recognition, document capture, analysis, and classification. One of the most successful outcomes of this development, the ABBYY FlexiCapture system offers powerful and intuitive instruments to streamline the time-consuming and labour-intensive tasks associated with paper-based processes.
Document Options make extensive use of ABBYY software within the scanning bureau. It helps us process your documents quickly and efficiently, saving you time and money.
It is important to manage expectations and to appreciate the differences between structured, semi-structured and unstructured documents. The structure of a document, together with a number of other issues, affects set-up, testing and implementation costs, as well as accuracy and any post-processing checks and corrections that may be required.
Document Options use ABBYY software in conjunction with other applications to meet customers’ time-critical requirements. We are experts at selecting and using the most appropriate elements of different software vendors’ products to produce highly cost-effective, time-saving solutions to our customers’ needs.
What makes ABBYY FlexiCapture stand out from other document capture software on the market is the exceptional flexibility, superior accuracy and transparency of its technology.
Data vs. paper
Paper is an inescapable part of business. Despite predictions to the contrary, paper-based processes are still common in the vast majority of companies. Unfortunately, those companies are constantly struggling to keep the flood of paper under control.
Document analysis and recognition technologies have advanced significantly. Optical character recognition (OCR) is now an everyday practice. Automated processing of fixed forms such as application forms, questionnaires or ballot papers is also widely accepted. But what about processing other types of documents? Forms without a fixed layout, such as invoices, purchase orders, mortgage applications, explanations of benefits, legal documents, correspondence, contracts, patient records, etc., comprise about 80% of all business documents. The spectrum of critical document processing tasks can vary from simple image capture and indexing to highly intelligent data extraction with real-time integration into business applications.
In many cases, performing these tasks turns out to be quite time-consuming. Very often the document types we mentioned have distinctive features that make it challenging to process them.
Fixed forms are documents that always have exactly the same layout, which means that a particular data field is always located in the same place. This makes automation of form processing tasks quite easy, and the industry-standard approach is based on the creation of templates. Each type of structured document requires one template, which for a simple form can be created in a matter of minutes.
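To make the template idea concrete, here is a minimal Python sketch of fixed-form extraction. It is an illustration only and not how ABBYY FlexiCapture is implemented: the Word type, the TEMPLATE regions and the field names are all hypothetical, and the words are assumed to come from an OCR step that reports each word's position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One recognised word with the position of its top-left corner (assumed OCR output)."""
    text: str
    x: float
    y: float


# Hypothetical template: field name -> (left, top, right, bottom) region on the page.
TEMPLATE = {
    "surname": (100, 50, 300, 80),
    "date_of_birth": (100, 100, 300, 130),
}


def extract_fixed_form(words, template):
    """Collect the words that fall inside each field's fixed region."""
    result = {}
    for field, (left, top, right, bottom) in template.items():
        inside = [w for w in words if left <= w.x < right and top <= w.y < bottom]
        inside.sort(key=lambda w: w.x)  # read the words left to right
        result[field] = " ".join(w.text for w in inside)
    return result


words = [
    Word("Smith", 110, 55),
    Word("01/02/1980", 105, 110),
    Word("Page", 10, 5),  # outside every region, so it is ignored
]
fields = extract_fixed_form(words, TEMPLATE)
print(fields)
```

Because every field sits in a known region, one template per form type is enough. The same approach fails as soon as a field moves, which is the difficulty with the document types described next.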
Invoices are classified as semi-structured documents because the types of data they contain are generally similar, but the exact layout of the data fields may vary from one vendor to another. From invoice to invoice, the number of line items changes, as does the number of columns, pages, etc. The same applies to purchase orders, explanations of benefits, price lists and so on.
Correspondence and contracts are considered unstructured documents because it is impossible to predict what kind of information they contain. Some key data, such as a date or an address, may always be present, but the text of a contract or a letter is different in each case.
The unique characteristics of semi-structured and unstructured documents complicate processing. These require more sophisticated document recognition methods as compared to fixed form processing.
Based on its extensive experience and scientific research, ABBYY has developed its own techniques for managing semi-structured and unstructured documents. These have now been incorporated into ABBYY FlexiCapture software.
Imagine you need to distinguish an invoice from other documents in the pile and then find the key data on it. You would look first for specific words like “invoice”, which allow you to identify the document. The next step is to find the data fields. You would start by looking for the invoice number, date and vendor address at the top of the first page and for the total amount at the bottom of the last page of the invoice. An invoice may contain several numbers, several dates and various amounts, all of which must be interpreted correctly. Key words or elements located near a data field can help you to make the correct decision. However, there are cases when no keywords are available. In such cases you would most likely examine the whole document and make the final decision based on the knowledge obtained.
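The manual search described above can be approximated in a few lines of Python. This is a simplified sketch under stated assumptions, not ABBYY's method: the function names, the label patterns and the sample invoice text are invented for illustration, and the input is assumed to be plain text already produced by OCR.

```python
import re


def is_invoice(text):
    """Identify the document by looking for the word 'invoice'."""
    return re.search(r"\binvoice\b", text, re.IGNORECASE) is not None


def find_labelled(text, label_pattern, value_pattern):
    """Return the first value that follows a matching label on the same line, else None."""
    pattern = re.compile(
        rf"(?:{label_pattern})\s*[:#]?\s*({value_pattern})", re.IGNORECASE
    )
    for line in text.splitlines():
        match = pattern.search(line)
        if match:
            return match.group(1)
    return None


# Invented sample text standing in for the OCR result of one invoice.
sample = """ACME Ltd
INVOICE
Invoice No: INV-1042
Date: 12/03/2015
Order ref: 7781
Subtotal: 100.00
Total: 120.00
"""

invoice_number = find_labelled(sample, r"invoice\s+(?:no\.?|number)", r"[A-Z0-9-]+")
invoice_date = find_labelled(sample, r"date", r"\d{2}/\d{2}/\d{4}")
# The word boundary stops "Subtotal" from being mistaken for "Total".
total = find_labelled(sample, r"\btotal", r"\d+\.\d{2}")
print(is_invoice(sample), invoice_number, invoice_date, total)
```

The nearby label is what separates the invoice number from the order reference and the total from the subtotal. When no label is present, find_labelled returns None, which corresponds to the case where the whole document has to be examined instead.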
Unlike other recognition technologies, which focus on recognising patterns, ABBYY takes recognition a step further by using artificial intelligence. This trains the computer to analyse documents in the same way that the human mind would analyse them.
ABBYY FlexiCapture offers three classification modes: auto-learning, rule-based and combined. When processing documents, ABBYY FlexiCapture refers to the available classifier first. For images that pass classification successfully, the document type is added to their properties. Images that could not be confidently identified by the classifier are assigned to an “unknown document” class. At the next stage, the software applies FlexiLayouts based on the classification results instead of blind matching. For images in the “unknown document” class, it tries different FlexiLayouts just as it would if no classifier were used. Thus two different FlexiLayouts are applied to one image: the first one to identify the document type and the second one to find the data.
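The classify-first flow with its “unknown document” fallback can be sketched as follows. This is a hedged Python illustration, not the FlexiCapture API: the keyword rules stand in for a rule-based classifier, and the two layout functions are hypothetical stand-ins for FlexiLayouts that either return the data they find or None when they do not match.

```python
import re

UNKNOWN = "unknown document"

# Hypothetical rule-based classifier: document type -> keywords that suggest it.
RULES = {
    "invoice": ["invoice", "amount due"],
    "purchase order": ["purchase order"],
}


def invoice_layout(text):
    """Stand-in layout: matches if a labelled total is present."""
    m = re.search(r"\btotal:\s*(\d+\.\d{2})", text, re.IGNORECASE)
    return {"total": m.group(1)} if m else None


def order_layout(text):
    """Stand-in layout: matches if a labelled order reference is present."""
    m = re.search(r"\border ref:\s*(\d+)", text, re.IGNORECASE)
    return {"order_ref": m.group(1)} if m else None


LAYOUTS = {"invoice": invoice_layout, "purchase order": order_layout}


def classify(text, rules):
    """Pick the type with the most keyword hits, or UNKNOWN if nothing matches."""
    lowered = text.lower()
    best_type, best_hits = UNKNOWN, 0
    for doc_type, keywords in rules.items():
        hits = sum(1 for keyword in keywords if keyword in lowered)
        if hits > best_hits:
            best_type, best_hits = doc_type, hits
    return best_type


def process(text, rules, layouts):
    """Classify first, then apply only the matching layout; try all layouts if unknown."""
    doc_type = classify(text, rules)
    candidates = [doc_type] if doc_type in layouts else list(layouts)
    for name in candidates:
        data = layouts[name](text)
        if data is not None:
            return doc_type, name, data
    return doc_type, None, None


print(process("INVOICE\nTotal: 120.00", RULES, LAYOUTS))
print(process("Ref 7781\nTotal: 55.00", RULES, LAYOUTS))
```

In the first call the classifier recognises an invoice, so only the invoice layout is applied. In the second call nothing is recognised, so every layout is tried in turn, which is the slower blind matching that classification is meant to avoid.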
If the number of document types to be processed is small, there is less need to focus on the classification stage or to separate it from the data capture process. FlexiLayout matching can instead be accelerated by prioritising the FlexiLayouts and introducing header elements.
Burridge House Priestley Way Crawley West Sussex RH10 9NT
Tel: + 44 (0) 1293 426677
Fax: + 44 (0) 1293 403453