Intelligent document

Papers published on a yearly basis

Papers

Patent

Document management system with enhanced intelligent document recognition capabilities

Suresh S. Pandian, Thyagarajan Swaminathan, Subramaniyan Neelagandan, Krishna K. Srinivasan, Randal J. Martin

10 Jun 2005

TL;DR: The authors describe an intelligent document recognition-based document management system with modules for image capture, image enhancement, image identification, optical character recognition (OCR), data extraction, and quality assurance.

Abstract: An intelligent document recognition-based document management system (Fig. 2) includes modules for image capture (32), image enhancement (32), image identification (34), optical character recognition (36), data extraction (37) and quality assurance (42). The system captures data from electronic documents as diverse as facsimile images, scanned images and images from document management systems. It processes these images and presents the data in, for example, a standard XML format. The document management system processes both structured document images (40) (ones which have a standard format) and unstructured document images (38) (ones which do not have a standard format). The system can extract images directly from a facsimile machine, a scanner or a document management system for processing.

233 citations
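
The module chain named in the patent abstract above (enhancement, identification, OCR, data extraction, quality assurance, XML output) can be pictured as a list of stages applied in order. The following is only a minimal sketch under stated assumptions: the `Document` class, every stage function, and the XML element names are hypothetical illustrations, not the patented system, and a plain string stands in for image data and OCR output.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Document:
    source: str                 # e.g. "fax", "scanner", "dms" (hypothetical labels)
    pixels: str                 # stand-in for image data
    doc_type: str = "unknown"
    text: str = ""
    fields: dict = field(default_factory=dict)
    needs_review: bool = False


def enhance(doc):
    # Placeholder for image clean-up: here we only trim whitespace.
    doc.pixels = doc.pixels.strip()
    return doc


def identify(doc):
    # Toy rule: a known header marks a structured (standard-format) document.
    doc.doc_type = "structured" if doc.pixels.startswith("INVOICE") else "unstructured"
    return doc


def recognize(doc):
    # Placeholder for OCR: the "image" already is text in this sketch.
    doc.text = doc.pixels
    return doc


def extract(doc):
    # Pull "key: value" pairs out of the recognized text.
    for line in doc.text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            doc.fields[key.strip().lower()] = value.strip()
    return doc


def quality_check(doc):
    # Flag documents with no extracted fields for manual review.
    doc.needs_review = not doc.fields
    return doc


PIPELINE = [enhance, identify, recognize, extract, quality_check]


def process(doc):
    for stage in PIPELINE:
        doc = stage(doc)
    return doc


def to_xml(doc):
    # Serialize the extracted data as a small XML fragment.
    root = ET.Element("document", source=doc.source, type=doc.doc_type,
                      needs_review=str(doc.needs_review).lower())
    for key, value in doc.fields.items():
        ET.SubElement(root, "field", name=key).text = value
    return ET.tostring(root, encoding="unicode")


result = process(Document(source="fax", pixels="  INVOICE\nTotal: 42.00\n"))
print(to_xml(result))
```

Keeping the stages in a list makes it easy to insert or skip a step for a given input source without touching the other stages.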

Proceedings Article • DOI: 10.1109/DIAL.2004.1263262

Machine learning methods for automatically processing historical documents: from paper acquisition to XML transformation

Floriana Esposito, Donato Malerba, Giovanni Semeraro, Stefano Ferilli, O. Altamura, Teresa Maria Altomare Basile, Margherita Berardi, Michelangelo Ceci, N. Di Mauro

23 Jan 2004

TL;DR: This work proposes a document processing system, WISDOM++, that relies heavily on machine learning techniques to transform printed documents into a Web-accessible form such as XML, and reports promising results from preliminary experiments.

Abstract: One of the aims of the EU project COLLATE is to design and implement a Web-based collaboratory for archives, scientists and end-users working with digitized cultural material. Since the originals of such material are often unique and scattered in various archives, severe problems arise for their wide fruition. A solution would be to develop intelligent document processing tools that automatically transform printed documents into a Web-accessible form such as XML. Here, we propose the use of a document processing system, WISDOM++, which relies heavily on machine learning techniques to perform this task, and report promising results obtained in preliminary experiments.

85 citations

Proceedings Article

The Development of a General Framework for Intelligent Document Image Retrieval.

David Doermann, Jaakko Sauvola, Hannu Kauniskangas, Christian K. Shin, Matti Pietikäinen, Azriel Rosenfeld

1 Jan 1996

TL;DR: This paper introduces the general framework, feature extraction modules, query capabilities, graphical query interface, and application interface of the IDIR system, and demonstrates each component and how the query mechanisms can handle both content and structural queries effectively.

Abstract: Work has recently begun on a joint project between the Universities of Maryland and Oulu on the development of a system for Intelligent Document Image Retrieval (IDIR). The IDIR system will provide close connections with and utilization of document analysis and image processing techniques, advanced computing and networking, and modern approaches to database management. The system design consists of aggressively modularized components to enhance the development of individual parts which are used in the complete solution, including: interface specifications, multipurpose feature extraction, an integrated efficient query language, physical retrieval from an object-oriented database, and delivery of retrieved objects. In this paper, we introduce the general framework, feature extraction modules, query capabilities, a graphical query interface, and the application interface. We demonstrate each component of the system and how the query mechanisms can be used to handle both content and structural queries effectively.

66 citations

Journal Article • DOI: 10.1023/A:1008735902918

Machine Learning for Intelligent Processing of Printed Documents

Floriana Esposito¹, Donato Malerba¹, Francesca A. Lisi¹

University of Bari¹

21 Mar 2000

TL;DR: This article proposes the application of machine learning techniques to acquire the specific knowledge required by an intelligent document processing system, named WISDOM++, that manages printed documents, such as letters and journals.

Abstract: A paper document processing system is an information system component which transforms information on printed or handwritten documents into a computer-revisable form. In intelligent systems for paper document processing this information capture process is based on knowledge of the specific layout and logical structures of the documents. This article proposes the application of machine learning techniques to acquire the specific knowledge required by an intelligent document processing system, named WISDOM++, that manages printed documents, such as letters and journals. Knowledge is represented by means of decision trees and first-order rules automatically generated from a set of training documents. In particular, an incremental decision tree learning system is applied for the acquisition of decision trees used for the classification of segmented blocks, while a first-order learning system is applied for the induction of rules used for the layout-based classification and understanding of documents. Issues concerning the incremental induction of decision trees and the handling of both numeric and symbolic data in first-order rule learning are discussed, and the validity of the proposed solutions is empirically evaluated by processing a set of real printed documents.

64 citations
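
To make the idea of learning a block classifier from training documents concrete, here is a minimal sketch of a one-level decision tree (a decision stump) that picks a threshold on a single numeric feature. It is an assumption-laden illustration, not the incremental decision tree learner or the first-order rule learner used in WISDOM++: the feature name `ink_density`, the labels, and the training values are all hypothetical.

```python
from collections import Counter

# Hypothetical training data: (features of a segmented block, block label).
TRAINING_BLOCKS = [
    ({"ink_density": 0.10}, "text"),
    ({"ink_density": 0.15}, "text"),
    ({"ink_density": 0.60}, "picture"),
    ({"ink_density": 0.70}, "picture"),
]


def majority(labels):
    """Return the most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def learn_stump(blocks, feature):
    """Find the threshold on `feature` that misclassifies the fewest blocks.

    Returns (threshold, label_if_at_or_below, label_if_above).
    """
    values = sorted({features[feature] for features, _ in blocks})
    best = None
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [label for features, label in blocks if features[feature] <= threshold]
        right = [label for features, label in blocks if features[feature] > threshold]
        left_label, right_label = majority(left), majority(right)
        errors = (sum(label != left_label for label in left)
                  + sum(label != right_label for label in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]


def classify(stump, features, feature):
    """Apply a learned stump to the features of one block."""
    threshold, left_label, right_label = stump
    return left_label if features[feature] <= threshold else right_label


stump = learn_stump(TRAINING_BLOCKS, "ink_density")
print(classify(stump, {"ink_density": 0.20}, "ink_density"))
```

A full decision tree would apply the same split search recursively to each side and over several features; the stump only shows the basic step of choosing a split from labeled examples.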

Patent

Intelligent electronic document content processing

Lifen Tian¹

Ricoh¹

8 Aug 2007

TL;DR: This patent describes a network device whose content processing module performs intelligent document content processing, such as confidential information processing, content optimization and workflow optimization, on electronic document data based on the authenticated user's preference data.

Abstract: A network device includes a content processing module that is configured to perform intelligent document content processing, such as confidential information processing, content optimization and workflow optimization. The network device authenticates a user and determines electronic document data that is to be processed. The electronic document data may be created at the network device, e.g., by a scanning module on the network device, or at a client device, e.g., by a word processing application executing on the client device. The content processing module retrieves particular user preference data based upon the user authentication. The particular user preference data may specify confidential information preferences, content optimization preferences and/or workflow preferences. The content processing module performs intelligent document content processing on the electronic document data based upon the particular user preference data and generates processed electronic document data.

58 citations

Year	Papers
2021	2
2020	7
2019	4
2018	5
2017	3
2015	1
