Kofax Indicius Now Offers State-Of-The-Art Scanning

Indicius arrives with a bag of techniques to extract content from multiple formats and can assist users in identifying which process will work best for a particular scan. For instance, the software takes advantage of Mohomine's language-independent tool to classify unstructured text and can be trained to identify structures in scanned text by using examples rather than rigid templates and rules. Indicius also uses Neurascript's algorithms to recognize and extract structured and semi-structured content from scanned text.

Document separation is usually done manually and is expensive. In addition to removing staples and sticky tape, operators today insert sheets between sets of forms to tell scanning systems where a set of documents begins and ends. This is usually done in batches by grabbing a large stack of forms and going through them by hand. Indicius overcomes this problem by automating document separation, so users do not have to insert separator sheets between sets of documents.

Kofax is in the process of patenting this algorithm, which uses a two-pass approach to extract documents accurately. The first pass tries to isolate every page it scans. It looks for known characteristics such as bar codes; if it finds one, that data tells Indicius the form type, and the process does not have to go any further with that file.

>> Scans multipage documents without the need for separator sheets between sets of documents
>> Recognizes handwritten and printed data from document images
>> Classifies and indexes documents without having to manually presort documents
>> Captures and extracts data from documents and generates recognition information for system monitoring
>> Provides a simple interface for validating document classification and recognized data
>> Processes multiple streams in parallel to suit target applications

Another advanced technique in the first pass is the image classification algorithm, which identifies documents by what they look like. Indicius does not have to read any text—it looks for known insignias, logos or marks on forms.

The third technique used in the first pass is pattern matching. Patterns are based on a specific set of rules created by operators and can draw on the look of forms, the text within them or a combination of both. For example, a group of words in a box in a specific corner of a document can be used to identify the document type. This is a one-time manual setup performed by operators. Once pattern-matching rules are created, they can be reused in any scan job.
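The three first-pass techniques can be pictured as a chain of checks applied to each page. The following is only a minimal sketch under stated assumptions, not Kofax's implementation: the Page class, the lookup tables and all function names are hypothetical, and the order in which the image and pattern checks run is assumed. The article confirms only that a bar code hit ends processing for that page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    """Hypothetical stand-in for one scanned page."""
    barcode: Optional[str] = None  # decoded bar code value, if any
    logo: Optional[str] = None     # name of a recognized logo or mark, if any
    text: str = ""                 # recognized text, used only by pattern rules


# Hypothetical tables an operator might configure once and reuse.
BARCODE_TO_FORM = {"F-1003": "loan_application"}
LOGO_TO_FORM = {"acme_bank": "bank_statement"}
PATTERN_RULES = [
    # (form type, predicate over the page)
    ("appraisal", lambda page: "appraisal report" in page.text.lower()),
]


def by_barcode(page: Page) -> Optional[str]:
    return BARCODE_TO_FORM.get(page.barcode)


def by_logo(page: Page) -> Optional[str]:
    return LOGO_TO_FORM.get(page.logo)


def by_pattern(page: Page) -> Optional[str]:
    for form_type, matches in PATTERN_RULES:
        if matches(page):
            return form_type
    return None


def classify_page(page: Page) -> Optional[str]:
    """Try each technique in turn and stop at the first one that succeeds."""
    for technique in (by_barcode, by_logo, by_pattern):
        form_type = technique(page)
        if form_type is not None:
            return form_type
    return None  # left unlabeled for the second pass
```

Pages that none of the checks can label are returned as None, which leaves them for the second pass to place.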

The second pass uses a learn-by-example approach that combines sets of documents based on the information gathered from the first pass. This approach uses fuzzy rules to identify when different form types are likely to occur. For instance, form type X will always come before form type Y. What's more, form type Y will always have five pages. This algorithm associates all of the known factors that operators look for when identifying forms visually.
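As a rough illustration of how ordering and page-count rules can turn per-page labels into documents, consider the sketch below. It is a deliberate simplification: it uses hard-coded, deterministic rules rather than the fuzzy, learned rules the article describes, and the tables EXPECTED_PAGES and NEXT_FORM and the function group_pages are hypothetical names.

```python
# Hypothetical rules mirroring the example in the text:
# form X is one page, form Y is five pages, and X is followed by Y.
EXPECTED_PAGES = {"X": 1, "Y": 5}
NEXT_FORM = {"X": "Y"}


def group_pages(labels):
    """Group first-pass labels (None = unlabeled page) into documents.

    Returns a list of (form_type, [page indices]) tuples.
    """
    documents = []
    for index, label in enumerate(labels):
        last = documents[-1] if documents else None
        last_full = (
            last is not None
            and len(last[1]) >= EXPECTED_PAGES.get(last[0], float("inf"))
        )
        if label is None and last is not None:
            # Unlabeled page: continue the current document, or, if that
            # document is already complete, infer the form that follows it.
            label = NEXT_FORM.get(last[0]) if last_full else last[0]
        if last is not None and label == last[0] and not last_full:
            last[1].append(index)
        else:
            documents.append((label, [index]))
    return documents


labels = ["X", "Y", None, None, None, None, "X", None]
print(group_pages(labels))
```

In this example the four unlabeled pages after the first Y page are absorbed into the five-page Y document, and the unlabeled page after the second X is assumed to start a new Y.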

The second pass works on large sets of legal documents. For instance, when applying for a mortgage, customers have to fill out a large number of different types of documents, and a customer's folder can comprise as many as 100 pages. Stacking folders in a batch is still a manual process because a folder is treated as an entity. Operators have to put a sheet between each folder, but with this process in place, they have to place far fewer separator sheets.

After taking an image of a piece of paper, Indicius extracts the entire content and indexes the data for use in a content management system or a database within a search system. This step is a prerequisite for its classification algorithm.

What's more, Indicius can be configured to retrieve only selected information from a document and pair each value with the appropriate field name. This technique is useful for high-volume processing because Indicius saves time when it only has to find selected information in a scan.
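The idea of pulling only selected values and pairing them with field names can be sketched with regular expressions over recognized text. This is an assumption-laden illustration, not how Indicius is configured: the field names, the patterns and the extract_fields function are all hypothetical.

```python
import re

# Hypothetical field definitions: field name -> pattern with one capture group.
FIELD_PATTERNS = {
    "loan_number": re.compile(
        r"Loan\s*(?:No\.?|Number)\s*[:#]?\s*(\d+)", re.IGNORECASE
    ),
    "borrower": re.compile(r"Borrower\s*:\s*(.+)", re.IGNORECASE),
}


def extract_fields(text, wanted):
    """Return only the requested fields, each paired with its name."""
    result = {}
    for name in wanted:
        match = FIELD_PATTERNS[name].search(text)
        result[name] = match.group(1).strip() if match else None
    return result


page_text = "Loan No: 12345\nBorrower: Jane Doe\nAmount: 250000"
print(extract_fields(page_text, ["loan_number", "borrower"]))
```

Because only the requested patterns are evaluated, the rest of the page (the amount line here) is never parsed, which is the time saving the text refers to.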

Kofax offers a 30 percent average reseller margin to certified solution providers. In addition, the company provides incentives such as market development funds (MDF), training and rebate programs. Pricing is based on how many pages a customer needs to push through the system.