TL;DR: The paper describes how the symbol database was built and which methods were used to introduce noise in the model of the symbols, in order to evaluate the robustness of the recognition methods.
Abstract: In this paper, we present the synthesis of the first international symbol recognition contest, held during the fifth IAPR workshop on graphics recognition in Barcelona. We describe the framework of the contest (goals, kind of symbols, criteria of evaluation) and how the contest was organized. We describe in particular the way we have built the symbol database and which methods have been used to introduce noise in the model of the symbols, in order to evaluate the robustness of the recognition methods. The methods of the different participants are briefly described, as well as the results obtained with these methods on the test images available during the contest. The evaluation protocol is presented, the results are analyzed and we conclude with some remarks for the next symbol recognition contest.
TL;DR: A system that aims at recognizing chart images using a model-based approach and the results of type determination and the accuracies of the recovered data are reported.
Abstract: In this paper, we introduce a system that aims at recognizing chart images using a model-based approach. First of all, basic chart models are designed for four different chart types based on their characteristics. In a chart model, basic object features and constraints between objects are defined. During the chart recognition, there are two levels of matching: feature level matching to locate basic objects and object level matching to fit in an existing chart model. After the type of a chart is determined, the next step is to do data interpretation and recover the electronic form of the chart image by examining the object attributes. Experiments were done using a set of testing images downloaded from the internet or scanned from books and papers. The results of type determination and the accuracies of the recovered data are reported.
TL;DR: The method handles some layout irregularities frequently found in on-line handwritten formula recognition systems, like symbol overlapping and association of arguments of sum-like operators, as well as tabular arrangements, like matrices.
Abstract: We present a structural analysis method for the recognition of on-line handwritten mathematical expressions based on a minimum spanning tree construction and symbol dominance. The method handles some layout irregularities frequently found in on-line handwritten formula recognition systems, like symbol overlapping and association of arguments of sum-like operators. It also handles arguments of operators with non-standard layouts, as well as tabular arrangements, like matrices.
TL;DR: A brief survey on on-line graphics recognition is presented and major problems and sub-problems at three levels are identified: primitive shape recognition, composite graphic object recognition, and document recognition and understanding.
Abstract: A brief survey on on-line graphics recognition is presented. We first present some common scenarios and applications of on-line graphics recognition and then identify major problems and sub-problems at three levels: primitive shape recognition, composite graphic object recognition, and document recognition and understanding. Representative approaches to these problems are also presented. We also list several open problems at the end.
TL;DR: In this article, a method to separate and recognize the touching/overlapping alphanumeric characters is proposed, where characters are processed in raster-scanned color cartographic maps.
Abstract: A method to separate and recognize the touching/overlapping alphanumeric characters is proposed. The characters are processed in raster-scanned color cartographic maps. The map is segmented first to extract all text strings including those that are touching other symbols, strokes and characters. Second, OCR-based recognition with Artificial Neural Networks (ANN) is applied to define the coordinates, size and orientation of alphanumeric character strings in each case presented in the map. Third, four straight lines or a number of “curves”, computed as a function of the characters initially recognized by the ANN, are extrapolated to separate those symbols that are attached. Finally, the separated characters are input into the ANN again for final identification. Results showed that the method performs well in the context of raster-to-vector conversion of color cartographic images.
TL;DR: In this paper, a method of online sketchy shape recognition that can adapt to different user sketching styles is presented. The adaptation principle is based on incremental active learning and dynamic user modeling.
Abstract: This paper presents a method of online sketchy shape recognition that can adapt to different user sketching styles. The adaptation principle is based on incremental active learning and dynamic user modeling. Incremental active learning is used for sketchy stroke classification such that important data can actively be selected to train the classifiers. Dynamic user modeling is used to model the user’s sketching style in an incremental decision tree, which is then used to recognize the composite shapes dynamically by means of fuzzy matching. Experiments prove the proposed method both effective and efficient for user adaptation in online sketchy shape recognition.
TL;DR: The DocMining platform as discussed by the authors provides a general framework for document interpretation and integrates document processing units coming from different sources and communicating through the document being interpreted, where each unit is associated with a contract that describes the parameters, data and results of the unit as well as the way to run it.
Abstract: The DocMining platform is aimed at providing a general framework for document interpretation. It integrates document processing units coming from different sources and communicating through the document being interpreted. A task to be performed is represented by a scenario that describes the units to be run, and each unit is associated with a contract that describes the parameters, data and results of the unit as well as the way to run it. A controller interprets the scenario and triggers each required document processing unit at its turn. Documents, scenarios and contracts are all represented in XML, to make data manipulation and communications easier.
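The controller idea described above can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the XML element and attribute names, the two toy processing units, and the dictionary used as the shared document are hypothetical and are not taken from the DocMining platform. Contracts are not modelled.

```python
import xml.etree.ElementTree as ET

# Hypothetical scenario: element and attribute names are invented for this sketch.
SCENARIO_XML = """
<scenario>
  <step unit="binarize" threshold="128"/>
  <step unit="count_black"/>
</scenario>
"""

def binarize(document, threshold):
    # Toy unit: reads the gray image from the document, writes a binary image back.
    document["binary"] = [[1 if p < int(threshold) else 0 for p in row]
                          for row in document["gray"]]

def count_black(document):
    # Toy unit: consumes the result left in the document by the previous unit.
    document["black_pixels"] = sum(map(sum, document["binary"]))

UNITS = {"binarize": binarize, "count_black": count_black}

def run_scenario(xml_text, document):
    """Controller: trigger each unit in scenario order on the shared document."""
    for step in ET.fromstring(xml_text).iter("step"):
        params = {k: v for k, v in step.attrib.items() if k != "unit"}
        UNITS[step.get("unit")](document, **params)
    return document

doc = run_scenario(SCENARIO_XML, {"gray": [[0, 200], [90, 255]]})
```

The units never call each other; they only read from and write to the document, which mirrors the "communicating through the document being interpreted" principle.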
TL;DR: In this article, an approach to automatic digitization of raster-scanned color cartographic maps is presented, which is concerned with three steps involved in the digitizing process: pre-processing, processing and post-processing.
Abstract: An approach to automatic digitization of raster-scanned color cartographic maps is presented. This approach is concerned with three steps involved in the digitizing process: pre-processing, processing and post-processing. Most systems for vectorizing raster maps with automatic programs essentially carry out only one kind of operation: they follow discrete points along an arc (tracing, snapping), or they attempt to combine automatic and interactive modes when the tracing meets ambiguities. In our proposal, the automation problem is approached from a unified point of view, leading to the development of the A2R2V (Analogical-to-Raster-to-Vector) conversion system, which is able to recognize and vectorize a maximum number of cartographic patterns in raster maps. We discuss some strategies for solving this hard problem and illustrate them by briefly describing A2R2V. The place of the operator and of knowledge in a raster-to-vector conversion system is considered as well.
TL;DR: In this paper, a new approach is proposed to make use of planes of mirror symmetry detected in such sketches, which can significantly improve the reconstruction process and reduce the size of the reconstruction problem.
Abstract: We aim to reconstruct three-dimensional polyhedral solids from axonometric-like line drawings. A new approach is proposed to make use of planes of mirror symmetry detected in such sketches. Taking account of mirror symmetry of such polyhedra can significantly improve the reconstruction process. Applying symmetry as a regularity in optimisation-based reconstruction is shown to be adequate by itself, without the need for other inflation techniques or regularities. Furthermore, symmetry can be used to reduce the size of the reconstruction problem, leading to a reduction in computing time.
TL;DR: The parser the authors develop in DMOS, a generic method for structured document recognition, uses EPF, a grammatical language for describing documents to build a system with a generic approach for dealing with noise.
Abstract: To develop a generic method for document recognition, it is necessary to build a system with a generic approach for dealing with noise. Indeed, a lot of noise is present in an image, and a recognizer needs to find the right information amid this noise to perform recognition. We describe in this paper the parser we develop in DMOS, a generic method for structured document recognition. This method uses EPF, a grammatical language for describing documents. From an EPF description, a new recognition system is automatically built by compilation. DMOS has been successfully used for musical scores, mathematical formulae, table structure and old forms recognition (tested on 60,000 documents).
TL;DR: In this article, a method is proposed that combines OCR-based text recognition in raster-scanned maps with heuristics specially adapted for cartographic data to resolve the recognition ambiguities using, among other information sources, the spatial object relationships.
Abstract: To date, many methods and programs for automatic text recognition exist. However, there are no effective text recognition systems for graphic documents. Graphic documents usually contain a great variety of textual information. As a rule, the text appears in arbitrary spatial positions, in different fonts, sizes and colors. The text can touch and overlap graphic symbols. The text meaning is semantically much more ambiguous in comparison with standard text. To recognize the text of graphic documents, it is necessary first to separate it from linear objects, solids, and symbols and to define its orientation. Even so, the recognition programs nearly always produce errors. In the context of raster-to-vector conversion of graphic documents, the problem of text recognition is of special interest, because textual information can be used for verification of vectorization results (post-processing). In this work, we propose a method that combines OCR-based text recognition in raster-scanned maps with heuristics specially adapted for cartographic data to resolve the recognition ambiguities using, among other information sources, the spatial object relationships. Our goal is to form, in the vector thematic layers, geographically meaningful words correctly attached to the cartographic objects.
TL;DR: The Arc Segmentation Contest, as the fifth in the series of graphics recognition contests organized by IAPR TC10, was held in association with the GREC’2003 workshop and the contest rules, performance metrics, test images and their ground truths are presented.
Abstract: The Arc Segmentation Contest, as the fifth in the series of graphics recognition contests organized by IAPR TC10, was held in association with the GREC’2003 workshop. In this paper we present the report of the contest: the contest rules, performance metrics, test images and their ground truths, and the outcomes.
TL;DR: A method is suggested for recognizing main walls, the backbone of an apartment in an architectural drawing; its recognition rate is about 5.8% higher than that of Karl Tombre's method.
Abstract: This paper deals with plain figures on the architectural drawings of apartments. This kind of architectural drawing consists of main walls represented by two parallel bold lines, symbols (door, window, tile...), dimension lines, extension lines, and dimensions that represent various numerical values and characters. This paper suggests a method for recognizing the main wall, which is the backbone of an apartment in an architectural drawing. In this work, the following modules are realized: efficient image binarization, removal of thin lines, vectorization of detected lines, region bounding for main walls, calculation of extension lines, finding main walls based on extension lines, and field expansion by searching for other main walls that are linked with the detected main walls. Although the windows between main walls are not represented as main walls, a detection module for the windows is considered during the recognition period, so the windows are found as part of a main wall. An experimental result on 9 different architectural drawings shows 96.5% recognition of main walls and windows, which is about 5.8% higher than that of Karl Tombre.
TL;DR: In this article, the concept of the invisible interface is discussed, as an interface compatible with the cognitive process involved in architectural sketching, and illustrated by a software prototype EsQUIsE.
Abstract: In this paper, we discuss the concept of the “invisible interface”, a user interface compatible with the cognitive process involved in architectural sketching. We present the principles of such an interface, and illustrate them with our software prototype EsQUIsE.
TL;DR: In this paper, an intermediate representation of the document provides a precise description of all the shapes present in the initial image, and this representation constitutes the main part of a shared resource that will be used by different processes achieving the interpretation of the drawings.
Abstract: In this paper, we present different strategies for localization and recognition of graphical entities in line drawings. Most systems include first a segmentation step of the document followed by a sequential extraction of the graphical entities. Some other systems try to recognize symbols directly on the bitmap image using more or less sophisticated techniques. In our system, an intermediate representation of the document provides a precise description of all the shapes present in the initial image. Thereafter, this representation constitutes the main part of a shared resource that will be used by different processes achieving the interpretation of the drawings. The actions (recognition) done by these different specialists are scheduled in order to read and understand the content of the document. The knowledge that is provided by the shared representation is used instead of the bitmap image material to drive the interpretation process. In the current system, the specialists are trying, during several cycles to interpret the drawings in an intelligent way by interpreting the simplest parts of a drawing first and making the shared representation evolve until the total understanding of the document.
TL;DR: In this article, a vectorization system based on the use of strategic knowledge is presented. It is composed of two parts: a processing library and a graphic user interface, which allows users to construct and execute scenarios, exploiting any processing of the library, according to documents' contexts and users' adopted strategies.
Abstract: This paper presents a vectorisation system based on the use of strategic knowledge. The system is composed of two parts: a processing library and a graphic user interface. Our processing library is composed of image pre-processing and vectorisation tools. Our graphic user interface is used for the strategic knowledge acquisition and operationalisation. It allows users to construct and to execute scenarios, exploiting any processing of our library, according to documents’ contexts and users’ adopted strategies. An XML data representation is used, allowing easy data manipulation. A scenario example is presented for graphics recognition on utility maps.
TL;DR: In this paper, the authors propose a general decomposition of the local structural analysis into four steps: object graph extraction, mathematical approximation, high-level object construction, and object graph correction.
Abstract: The structural analysis is a processing step during which graphs are extracted from binary images. We can decompose the structural analysis into local and global approaches. The local approach decomposes the connected components, and the global approach groups them together. This paper deals especially with the local structural analysis. The local structural analysis is employed for different applications like symbol recognition, line drawing interpretation, and character recognition. We propose here a primer on the local structural analysis. First, we propose a general decomposition of the local structural analysis into four steps: object graph extraction, mathematical approximation, high-level object construction, and object graph correction. Then, we present some considerations on the method comparison and combination.
TL;DR: In this article, the authors tackle the problem of bootstrapping engineering documents recognition systems, and present a user-friendly interface to acquire knowledge concerning the graphical appearance of objects, but also to learn the best approach to use among their tools in order to recognize the learned objects.
Abstract: This paper tackles the problem of bootstrapping engineering documents recognition systems. A user-friendly interface is presented. Its aim is to acquire knowledge concerning the graphical appearance of objects, but also to learn the best approach to use among our tools in order to recognise the learned objects.
TL;DR: In this paper, the authors proposed a new method for extracting the topological feature of an object by connecting all the pixels constituting the object under the constraint to define the shortest path (minimum spanning tree).
Abstract: All the effective object recognition systems are based on a powerful shape descriptor. We propose a new method for extracting the topological feature of an object. By connecting all the pixels constituting the object under the constraint to define the shortest path (minimum spanning tree), we capture the shape topology. The tree length is, in the first approximation, the key of our object recognition system. This measure (with some adjustments) makes it possible to detect the target object in several geometrical configurations (translation / rotation), and it seems to have many desirable properties such as discrimination power and robustness to noise; that is the conclusion of the preliminary tests on characters and symbols.
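As a rough illustration of the tree-length measure, the Python sketch below computes the total edge length of a Euclidean minimum spanning tree over a set of pixel coordinates using Prim's algorithm. The function name, the toy "L" shape, and the choice of Prim's algorithm on the complete graph are assumptions made for this example; the paper's own construction and its "adjustments" are not specified in the abstract.

```python
import math

def mst_length(pixels):
    """Total edge length of the Euclidean MST over pixel coordinates (Prim, O(n^2))."""
    pts = list(pixels)
    if len(pts) < 2:
        return 0.0
    # dist[i] holds the shortest distance from point i to the growing tree.
    dist = [math.dist(pts[0], p) for p in pts]
    done = [False] * len(pts)
    done[0] = True
    total = 0.0
    for _ in range(len(pts) - 1):
        # Attach the closest point that is not yet in the tree.
        j = min((i for i in range(len(pts)) if not done[i]), key=dist.__getitem__)
        total += dist[j]
        done[j] = True
        for i in range(len(pts)):
            if not done[i]:
                dist[i] = min(dist[i], math.dist(pts[j], pts[i]))
    return total

# Hypothetical 'L' shape made of five pixels.
shape = [(0, 0), (0, 1), (0, 2), (1, 0), (2, 0)]
shifted = [(x + 10, y - 3) for x, y in shape]   # translation
rotated = [(-y, x) for x, y in shape]           # rotation by 90 degrees
```

Because translation and rotation preserve pairwise distances, `mst_length` gives the same value for `shape`, `shifted` and `rotated`, which is the invariance the abstract relies on.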
TL;DR: This paper proposes syntactical models to represent repetitive regular structures in graphical documents that can be automatically inferred from the document and used as signatures to describe salient features consisting of regular repetitions of primitives.
Abstract: In this paper we propose syntactical models to represent repetitive regular structures in graphical documents. We refer to these structures as texture symbols and they usually contain hatched or tiled patterns. Our grammar-based models can be automatically inferred from the document and used as signatures to describe salient features consisting of regular repetitions of primitives. These signatures compactly describe texture symbols, and their primitives can be used for indexing purposes. We describe different models suitable for a number of patterns, in particular a linear grammar to describe hatched patterns, and a plex grammar and a graph grammar for different types of tiled patterns.
TL;DR: A framework of a recognition system for the folding process of origami drill books is described, with a view to automatically converting a sequence of origami illustrations printed in an origami drill book into a 3D animation, so that users can observe how an origami is folded from different viewpoints.
Abstract: This paper describes a framework of a recognition system for the folding process of origami drill books, with a view to automatically converting a sequence of origami illustrations printed in an origami drill book into a 3D animation, so that users can observe how an origami is folded from different viewpoints. The internal model, which maintains the changes of origami states during interpretation of the folding process, plays an important role in the recognition phase. The model also makes it possible for a CG simulator to reconstruct the recognized folding process. Several experimental results of this system have shown the validity of the proposed framework.
TL;DR: This work explores automatic object recognition and semantic capture in vector graphics through shape description and two classifiers were implemented and proved accurate in their automatic recognition of objects from drawings in different domains.
Abstract: This work explores automatic object recognition and semantic capture in vector graphics through shape description. The low-level graphical content of graphical documents, such as a map or architectural drawing, is often captured manually, and the encoding of the semantic content is seen as an extension of this. The large quantity of new and archived graphical data available on paper makes automatic structuring of such graphical data desirable. Contour shape description techniques, such as Fourier descriptors and moment invariants, play an important role in systems for object recognition and representation. However, most work carried out in this area has concentrated on categories of object boundaries representing very specific shapes (for example, a particular type of aircraft). Two classifiers were implemented and proved accurate in their automatic recognition of objects from drawings in different domains. Classical classifier combination techniques were used to improve performance. Further work will employ more complex fusion techniques, and it is envisaged that they will be used in combination with recognition based on object context using various modelling methods. A demonstration system has been constructed using all these techniques.
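To make the idea of contour shape description concrete, here is a minimal Python sketch of one common form of Fourier descriptors. It is a textbook-style illustration and not the implementation used in the paper: the function name, the normalisation by the first coefficient, and the sample square outline are all assumptions.

```python
import cmath

def fourier_descriptors(contour, count=4):
    """Return `count` invariant descriptors of a closed contour [(x, y), ...]."""
    z = [complex(x, y) for x, y in contour]
    n = len(z)

    def coeff(k):
        # k-th coefficient of the discrete Fourier transform of the contour.
        return sum(p * cmath.exp(-2j * cmath.pi * k * t / n)
                   for t, p in enumerate(z)) / n

    scale = abs(coeff(1))  # assumed non-zero for the contours used here
    # Skipping k = 0 removes translation; taking magnitudes removes rotation
    # and the choice of start point; dividing by |c_1| removes scale.
    return [abs(coeff(k)) / scale for k in range(2, 2 + count)]

# Hypothetical outline: a square sampled at 8 points, counter-clockwise.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
# The same outline rotated by 90 degrees, scaled by 3 and shifted.
moved = [(-3 * y + 5, 3 * x - 2) for x, y in square]
```

Up to floating-point error, `fourier_descriptors(square)` and `fourier_descriptors(moved)` agree, so a classifier fed with these values sees the same shape regardless of position, orientation or size.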
TL;DR: In this article, the authors focus on the object recognition problem where a knowledge base defined by a finite set of representative prototypes or class objects is given, motivated by the need to model the application context in order to implement dynamic and adaptive systems.
Abstract: Object recognition is a very large problem that can be posed in different forms. In the domain of graphic recognition, many strategies are proposed, but many of them depend on the context in which they are applied [LVSM01]. This aspect implies the necessity to find a model for this context, and to use it for the implementation of dynamic and adaptive systems. In this paper, we focus on the object recognition problem where a knowledge base defined by a finite set of representative prototypes or class objects is given.
TL;DR: In this paper, a batch-processing system that automatically transforms raster illustrated parts drawings into intelligent, interactive, layered vector graphic files is proposed as an alternative to manually re-authoring the drawings.
Abstract: Illustrated Parts drawings are used extensively in the maintenance and repair of commercial and military aircraft. The Boeing Company has hundreds of thousands of illustrated parts drawings that must be transformed into richer, more intelligent formats for use in advanced technical data systems. Because manually re-authoring the drawings is prohibitively expensive, our solution is to provide a batch-processing system that automatically transforms raster illustrated parts drawings into intelligent, interactive, layered vector graphic files.
TL;DR: In this paper, the quality of a fingerprint image is obtained from its regional quality, which is measured in blocks at the enrollment stage. Because the amount of fingerprint varies with the size of each block, which makes the result unstable, the authors look for the right block size.
Abstract: This paper deals with the quality of the fingerprint image itself as well as with the algorithm used to evaluate it, with the aim of constructing an effective algorithm for evaluating the quality of fingerprints. The quality of the fingerprint is acquired by taking the regional quality of each fingerprint image. The regional quality is acquired by measuring the quality of the fingerprint in blocks in the enrollment stage. The amount of fingerprint varies according to the size of each block, which makes the result unsteady. We concentrated on finding the right size for the block used to acquire the fingerprint image as a graphical element. Also, the quality distribution included in the fingerprint image was acquired when the optimal block size was adopted. The threshold value of this distribution rate is expected to be used to acquire a high classification.
TL;DR: An easy ‘three-dimensionalizing’ method is suggested that incorporates a user’s own 2D sketches of characters, scenes and text into the story and presents them semi-automatically on the web in the form of 3D-like animation.
Abstract: This paper introduces a cheap engineering solution for authoring and presenting a user’s own story in the form of animation on the world wide web. Most existing story-making tools allow users to choose characters from a given database because the model building task is time-consuming and requires some level of expertise. In contrast, we suggest an easy ‘three-dimensionalizing’ method that incorporates a user’s own 2D sketches of characters, scenes and text into the story and presents them semi-automatically on the web in the form of 3D-like animation.
TL;DR: In this paper, the authors present an automatic method for recognizing the interconnections between a full set of wiring diagrams in order to derive a complete, global representation of the complete set of electrical connections in an aircraft.
Abstract: In this paper we present an automatic method for recognizing the interconnections between a full set of wiring diagrams in order to derive a complete, global representation of the full set of electrical connections in an aircraft. To fully understand the nature of a link between diagrams, the system must not only parse the link itself, but also determine the surrounding graphical context and discover exactly what object in the target diagram is being referenced. This global understanding creates the potential to significantly reduce the burden of electrical troubleshooting.
TL;DR: In this paper, an interactive approach to recognition of graphic objects in engineering drawings is proposed, where the user provides an example of one type of graphic object by selecting it in an engineering drawing, and then the system learns its graphical knowledge and uses this learnt knowledge to recognize or search for other similar graphic objects.
Abstract: In this paper, an interactive approach to recognition of graphic objects in engineering drawings is proposed. Interactively, the user provides an example of one type of graphic object by selecting it in an engineering drawing, and then the system learns its graphical knowledge and uses this learnt knowledge to recognize or search for other similar graphic objects. For improving the recognition accuracy of the system, we also propose a user feedback scheme based on multiple examples from both positive and negative aspects. We summarized four types of geometric constraints to represent the generic graphical knowledge of graphic objects. We also developed two algorithms for case-based graphical knowledge acquisition and knowledge-based graphics recognition, respectively. For the user feedback scheme, we adjust our original knowledge representation by associating a few types of tolerances to every piece of graphical knowledge and use different tolerances for recognizing different graphical objects. Experiments have shown that our proposed framework is both efficient and effective for recognizing various types of graphic objects in engineering drawings.
TL;DR: A genetic algorithm is used in conjunction with a hill climbing search technique to reduce the complexity, and experimental results show that good solutions can be found with lower cost using the proposed method.
Abstract: We are developing a recognition system which aims to convert a sequence of illustrations in origami drill books into a 3D animation, with a view to providing an easy way of learning and enjoying origami art. As a part of this work, this paper proposes a method that recognizes a target graph corresponding to an origami shape from an origami illustration image. The target graph, called ISG (Ideal Shape Graph), is generated from an internal model that maintains 3D information about changes of origami states during the interpretation of the whole folding process. In order to understand the folding operation applied to the origami at each step, an ISG of the origami generated at the current step has to be matched with the image of the origami illustration at the next step, to find whether there exists a similar graph within it. The image of the origami is usually very noisy. Since the computational cost of this matching is prohibitive even for problems of moderate size, we adopt a genetic algorithm in conjunction with a hill climbing search technique to reduce the complexity. Experimental results show that good solutions can be found with lower cost using our method.
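The combination of a genetic algorithm with hill climbing can be sketched in Python on a toy graph-matching problem. This is a generic illustration under stated assumptions and not the authors' algorithm: the cost function, the encoding (one image vertex per model vertex, not forced to be one-to-one), the operators, and all parameter values are hypothetical.

```python
import random

def cost(mapping, model_edges, image_edges):
    """Number of model edges whose mapped endpoints are not an image edge."""
    return sum(1 for a, b in model_edges
               if frozenset((mapping[a], mapping[b])) not in image_edges)

def hill_climb(mapping, n_image, model_edges, image_edges):
    """Local search: change one assignment at a time while the cost drops."""
    best, best_cost = list(mapping), cost(mapping, model_edges, image_edges)
    improved = True
    while improved:
        improved = False
        for i in range(len(best)):
            for v in range(n_image):
                trial = best[:i] + [v] + best[i + 1:]
                c = cost(trial, model_edges, image_edges)
                if c < best_cost:
                    best, best_cost, improved = trial, c, True
    return best

def genetic_search(n_model, n_image, model_edges, image_edges,
                   pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    score = lambda m: cost(m, model_edges, image_edges)
    pop = [[rng.randrange(n_image) for _ in range(n_model)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score)
        parents = pop[: pop_size // 2]          # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            p, q = rng.sample(parents, 2)
            cut = rng.randrange(1, n_model)     # one-point crossover
            child = p[:cut] + q[cut:]
            if rng.random() < 0.2:              # occasional mutation
                child[rng.randrange(n_model)] = rng.randrange(n_image)
            # Refine every child locally before it joins the population.
            children.append(hill_climb(child, n_image, model_edges, image_edges))
        pop = parents + children
    return min(pop, key=score)

# Toy data: look for a triangle (model) inside a small image graph.
model_edges = [(0, 1), (1, 2), (0, 2)]
image_edges = {frozenset(e) for e in [(0, 1), (1, 2), (2, 3), (3, 4), (2, 4)]}
best = genetic_search(3, 5, model_edges, image_edges)
```

The genetic operators explore the search space broadly, while the hill-climbing step pulls each child to a nearby local optimum, so fewer generations are typically needed than with the genetic algorithm alone.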
TL;DR: The experimental results show that a new pattern descriptor based on Attributed Relational Graph is effective for rotation-invariant retrieval of characters in regular style.
Abstract: Driven by new applications running on Tablet PC, a new pattern descriptor was proposed for retrieval of on-line Chinese scripts. The descriptor is based on Attributed Relational Graph (ARG). Fuzzy description is employed to enhance the adaptability of this model. To simplify the matching, the graph model is transformed to statistical features. Another improvement is that the descriptor is rotation-invariant. The experimental results show that this descriptor is effective for rotation-invariant retrieval of characters in regular style.