TL;DR: This paper uses an existing grammatical off-line method with an on-line a posteriori signal and applies it to a freely available database, introducing structural and syntactic knowledge about flowcharts to improve their recognition.
Abstract: In this paper, we address the problem of segmentation and recognition of on-line a posteriori flowcharts. Flowcharts are bi-dimensional documents, in the sense that the order of writing is not defined. Some statistical approaches have been proposed in the literature to label and segment flowcharts. However, as they are very well structured documents, we propose to introduce some structural and syntactic knowledge about flowcharts to improve their recognition. For this purpose, we have used an existing grammatical off-line method with an on-line a posteriori signal. We apply this work to a freely available database. The results demonstrate the value of structural knowledge about the context for improving recognition.
TL;DR: This graphics recognition technique eliminates the need for expert users in digitizing map images and provides opportunities to derive unique data for spatiotemporal research by facilitating time-consuming map digitization efforts.
Abstract: Historical maps contain rich cartographic information, such as road networks, but this information is "locked" in images and inaccessible to a geographic information system (GIS). Manual map digitization requires intensive user effort and cannot handle a large number of maps. Previous approaches for automatic map processing generally require expert knowledge in order to fine-tune parameters of the applied graphics recognition techniques and thus are not readily usable for non-expert users. This paper presents an efficient and effective graphics recognition technique that employs interactive user intervention procedures for processing historical raster maps with limited graphical quality. The interactive procedures are performed on color-segmented preprocessing results and are based on straightforward user training processes, which minimize the required user effort for map digitization. This graphics recognition technique eliminates the need for expert users in digitizing map images and provides opportunities to derive unique data for spatiotemporal research by facilitating time-consuming map digitization efforts. The described technique generated accurate road vector data from a historical map image and reduced the time for manual map digitization by 38%.
TL;DR: A probabilistic interpretation of both measures is developed and it is shown that, provided a sufficient number of data sources are available, it offers a viable performance measure to compare methods if no ground truth is available.
Abstract: In this paper we present a way to use precision and recall measures in total absence of ground truth. We develop a probabilistic interpretation of both measures and show that, provided a sufficient number of data sources are available, it offers a viable performance measure to compare methods if no ground truth is available. This paper also shows the limitations of the approach, in case a systematic bias is present in all compared methods, but shows that it maintains a very high level of overall coherence and stability. It opens broader perspectives and can be extended to handling partial or unreliable ground truth, as well as levels of prior confidence in the methods it aims to compare.
TL;DR: The evaluation framework is described, including datasets and evaluation measures --- and the results obtained by the only participant method are summarized.
Abstract: In this paper we summarize the framework and the results of the fourth edition of the International Symbol Recognition Contest, organized in the context of GREC'11. The contest follows the series started at the GREC'03 workshop and it is the first time that, in addition to recognition of isolated symbols, the contest includes the evaluation of symbol spotting. In this report we describe the evaluation framework --- including datasets and evaluation measures --- and we summarize the results obtained by the only participant method.
TL;DR: This paper presents the use of geometric matching for symbol recognition under similarity transformations and incorporates this matching approach in a complete symbol recognition/spotting system, which consists of denoising, symbol representation and recognition.
Abstract: Symbol recognition is important in many applications such as the automated interpretation of line drawings and retrieval-by-content search engines. This paper presents the use of geometric matching for symbol recognition under similarity transformations. We incorporate this matching approach in a complete symbol recognition/spotting system, which consists of denoising, symbol representation and recognition. The proposed system works for both isolated recognition and spotting symbols in context. For denoising, we use an adaptive preprocessing algorithm. For symbol representation, pixels and/or vectorial primitives can be used, then the recognition is done via geometric matching. When applied on the datasets of GREC'05 and GREC'11 symbol recognition contests, the system has performed significantly better than other statistical or structural methods.
TL;DR: This paper is focused on the proposed new experiment for an extended music competition, describing the new set of images and analyzing the new results.
Abstract: Since there has been a growing interest in the analysis of handwritten music scores, we have tried to foster this interest by proposing in ICDAR and GREC two different competitions: Staff removal and Writer identification. Both competitions have been tested on the CVC-MUSCIMA database of handwritten music score images. In the corresponding ICDAR publication, we have described the ground-truth, the evaluation metrics, the participants' methods and results. As a result of the discussions with attendees in ICDAR and GREC concerning our music competition, we decided to propose a new experiment for an extended competition. Thus, this paper is focused on this extended competition, describing the new set of images and analyzing the new results.
TL;DR: This paper demonstrates the integration of a classifier, based on an incremental learning method, in an interactive sketch analyzer based on a competitive breadth-first exploration of the analysis tree for interpreting 2D architectural floor plans.
Abstract: In this paper, we present the integration of a classifier, based on an incremental learning method, in an interactive sketch analyzer. The classifier recognizes the symbol with a degree of confidence. Sometimes the analyzer considers the response insufficient to make the right decision. The decision process then asks the user to explicitly validate the right decision. The user associates the symbol with an existing class or a newly created class, or ignores this recognition. The classifier learns during the interpretation phase. We thus obtain a method for auto-evolutionary interpretation of sketches. In fact, user participation greatly helps to avoid error accumulation during the analysis. This paper demonstrates this integration in an interactive method based on a competitive breadth-first exploration of the analysis tree for interpreting 2D architectural floor plans.
TL;DR: This paper is focused on the categorization of administrative document images based on the recognition of the supplier's graphical logo. Two different methods are proposed: the first uses a bag-of-visual-words model, whereas the second tries to locate logo images described by the blurred shape model descriptor within documents by a sliding-window technique.
Abstract: This paper is focused on the categorization of administrative document images (such as invoices) based on the recognition of the supplier's graphical logo. Two different methods are proposed: the first uses a bag-of-visual-words model, whereas the second tries to locate logo images described by the blurred shape model descriptor within documents by a sliding-window technique. Preliminary results are reported on a dataset of real administrative documents.
TL;DR: The final report of the outcome of the sixth edition of the Arc Segmentation Contest is presented, which shows that vectorization methods produce better results with low resolution scanned images.
Abstract: This paper presents the final report of the outcome of the sixth edition of the Arc Segmentation Contest. The theme of this edition is segmentation of images with different scanning resolutions. The contest was held offline before the workshop. Nine document images were scanned at three resolutions each and the ground truth images were created manually. Four participants have provided the output of their research prototypes. Two prototypes are more established while the other two are still in development. In general, vectorization methods produce better results with low resolution scanned images. Participants' comments on the behavior of their methods are also included in this report. A website devoted to this edition of the contest, holding the newly created dataset and other materials related to the contest, is also available.
TL;DR: This paper presents an evolution of a patch-based segmentation method working at pixel level and relying on the construction of a visual vocabulary, and finds the best system configuration, which substantially outperforms the results on wall segmentation obtained by the original paper.
Abstract: Architectural floor plans exhibit a large variability in notation. Therefore, segmenting and identifying the elements of any kind of plan becomes a challenging task for approaches based on grouping structural primitives obtained by vectorization. Recently, a patch-based segmentation method working at pixel level and relying on the construction of a visual vocabulary has been proposed in [1], showing its adaptability to different notations by automatically learning the visual appearance of the elements in each different notation. This paper presents an evolution of that previous work, after analyzing and testing several alternatives for each of the different steps of the method. Firstly, an automatic plan-size normalization process is done. Secondly, we evaluate different features to obtain the description of every patch. Thirdly, we train an SVM classifier to obtain the category of every patch instead of constructing a visual vocabulary. These variations of the method have been tested for wall detection on two datasets of architectural floor plans with different notations. After studying each of the steps in the process pipeline in depth, we are able to find the best system configuration, which substantially outperforms the results on wall segmentation obtained by the original paper.
TL;DR: A two-stage approach for signature segmentation from a document page is proposed, where a document is segmented into blocks and then blocks are classified into two classes: signature block and printed word block.
Abstract: Automatic signature segmentation from a printed document is a challenging task due to the nature of the signatory's handwriting, overlapping/touching of signature strokes with printed text, graphics, noise, etc. In this paper, we propose a two-stage approach for signature segmentation from a document page. In the first stage, a document is segmented into blocks and then blocks are classified into two classes: signature block and printed word block. Gradient-based features are used for block feature extraction and a support vector machine classifier is used for block-wise classification. In the second stage, printed characters that may be present in isolated form or overlapped/touched with the signature part are removed from signature blocks. From each of the detected signature blocks, the isolated printed characters (if any) are removed using context information. To detect overlapping/touching printed strokes in a signature block, some hypothetical zones where overlapping/touching may occur are detected first. Bounding box information of neighboring printed word blocks and local linearity of character strings near the signature blocks are used to detect hypothetical zones. Next, to detect the overlapping/touching printed strokes in hypothetical zones of a signature block, the corner points of contours obtained by the Douglas and Peucker polygonal approximation algorithm and skeleton junction points are used. Finally, the touching strokes of the signature are separated from text characters using the contour smoothness information near skeleton junction points. The experiment is performed on the "Tobacco-800" dataset [The legacy tobacco document library (ltdl), available at http://legacy.library.ucsf.edu/, University of California, San Francisco, 2007.] and the results obtained from the experiment are promising.
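The Douglas and Peucker polygonal approximation mentioned above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the sample contour and the tolerance `epsilon` are assumptions made for illustration. The vertices that survive the simplification are the kind of contour corner points the abstract refers to.

```python
import math


def perpendicular_distance(p, a, b):
    """Distance from point p to the line through a and b (or to a if a == b)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / length


def douglas_peucker(points, epsilon):
    """Simplify an open polyline, keeping only vertices that deviate more than epsilon."""
    if len(points) < 3:
        return list(points)
    # Find the interior point farthest from the chord joining the two endpoints.
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], points[0], points[-1])
        if d > max_dist:
            max_dist, index = d, i
    # If every point is close enough to the chord, the chord replaces them all.
    if max_dist <= epsilon:
        return [points[0], points[-1]]
    # Otherwise keep the farthest point and simplify both halves recursively.
    left = douglas_peucker(points[:index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right


# Hypothetical noisy L-shaped contour; epsilon = 0.1 is an illustrative tolerance.
contour = [(0, 0), (1, 0.05), (2, 0), (3, 0.02), (4, 0), (4, 1), (4.03, 2), (4, 3)]
print(douglas_peucker(contour, 0.1))  # [(0, 0), (4, 0), (4, 3)]
```

In this example the only interior vertex kept is (4, 0), the corner of the L shape.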
TL;DR: A de-noising approach using the dilation, erosion and thinning operators of mathematical morphology, with appropriately selected structuring elements, can clear up large amounts of noise in the glyphs of the character.
Abstract: This paper describes a recognition system for online handwritten Tibetan characters using advanced techniques in character recognition. To eliminate noise points of handwriting trajectories, we introduce a de-noising approach using the dilation, erosion and thinning operators of mathematical morphology. By selecting appropriate structuring elements, we can clear up large amounts of noise in the glyphs of the character. To enhance the recognition performance, we adopt a three-stage classification strategy, where the top-ranked output classes of the baseline classifier are re-classified by a similar-character discrimination classifier. Experiments have been carried out on two databases, MRG-OHTC and IIP-OHTC. Test results show that the recognition algorithm used is effective and can be applied in pen-based mobile devices.
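As a rough illustration of how erosion and dilation with a chosen structuring element remove noise points, the following Python sketch applies a morphological opening (erosion followed by dilation) to a small binary image. The image, the horizontal structuring element and all function names are assumptions for illustration; the abstract does not specify the structuring elements used, and thinning is not shown.

```python
def erode(image, offsets):
    """Binary erosion: a pixel stays 1 only if every offset neighbour is 1."""
    h, w = len(image), len(image[0])

    def on(y, x):
        return 0 <= y < h and 0 <= x < w and image[y][x] == 1

    return [[1 if all(on(y + dy, x + dx) for dy, dx in offsets) else 0
             for x in range(w)] for y in range(h)]


def dilate(image, offsets):
    """Binary dilation: a pixel becomes 1 if any reflected offset neighbour is 1."""
    h, w = len(image), len(image[0])

    def on(y, x):
        return 0 <= y < h and 0 <= x < w and image[y][x] == 1

    return [[1 if any(on(y - dy, x - dx) for dy, dx in offsets) else 0
             for x in range(w)] for y in range(h)]


def opening(image, offsets):
    """Erosion followed by dilation; removes blobs smaller than the element."""
    return dilate(erode(image, offsets), offsets)


# Hypothetical 1x3 horizontal structuring element, given as (dy, dx) offsets.
HORIZONTAL = [(0, -1), (0, 0), (0, 1)]

# A horizontal stroke (row 1) plus one isolated noise pixel (row 3).
noisy = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

for row in opening(noisy, HORIZONTAL):
    print(row)
```

With this element the stroke survives unchanged while the isolated pixel disappears. A differently shaped element would preserve differently oriented strokes, which is why the choice of structuring element matters.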
TL;DR: A method for region-based segmentation of sketch map objects as part of the sketch map understanding process that is robust to gaps in the drawing and can even handle open-ended streets.
Abstract: Sketch maps are an intuitive way to display and communicate geographic data, and automatic processing is of great benefit for human-computer interaction. This paper presents a method for segmentation of sketch map objects as part of the sketch map understanding process. We use region-based segmentation that is robust to gaps in the drawing and can even handle open-ended streets. To evaluate this approach, we manually generated a ground truth for 20 maps and conducted a preliminary quantitative performance study.
TL;DR: This paper proposes to reduce the feature dimensionality through principal component analysis (PCA); the resulting feature vector has interesting properties and enables graph-based structural representations to employ the range of efficient state-of-the-art computational models of statistical machine learning.
Abstract: The motivation of this work is to address the lack of efficient computational tools for graph-based structural representations. In this paper we take forward our work on graph embedding to address two important issues, the high dimensionality and the sparsity of the feature vector produced by our previously proposed fuzzy-interval based explicit graph embedding approach. The latter is a method to embed an attributed graph, with numeric as well as symbolic attributes on both nodes and edges, into a feature vector. We propose to reduce the feature dimensionality through principal component analysis (PCA). The resulting feature vector has interesting properties and enables graph-based structural representations to employ the range of efficient state-of-the-art computational models of statistical machine learning. A set of initial graphics recognition experiments on the IAM letter, GREC and fingerprint graph datasets shows that PCA successfully reduces the feature dimensionality without degrading the performance of the original graph embedding technique.
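A minimal sketch of the PCA step, assuming the embedded graphs are already available as rows of a feature matrix. The random sparse matrix below is a stand-in for real graph embeddings, and the function name `pca_reduce` and the target dimension are hypothetical. It uses NumPy's singular value decomposition and does not reproduce the authors' pipeline.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project row vectors onto their first n_components principal axes."""
    X = np.asarray(features, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    return centered @ components.T, components


# Stand-in data: 20 "graphs" embedded as sparse 50-dimensional vectors.
rng = np.random.default_rng(0)
embeddings = rng.random((20, 50))
embeddings[embeddings < 0.7] = 0.0  # mimic the sparsity mentioned above

reduced, axes = pca_reduce(embeddings, n_components=5)
print(reduced.shape)  # (20, 5)
```

The reduced vectors are dense and low-dimensional, so they can be passed to standard statistical classifiers or clustering methods.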
TL;DR: This paper describes a novel approach for extracting a library of symbols from a large collection of line drawings, and it achieved high accuracy in capturing and representing the contents of the line drawings.
Abstract: This paper describes a novel approach for extracting a library of symbols from a large collection of line drawings. This symbol library is a compact and indexable representation of the line drawings. Such a representation is important for efficient symbol retrieval. The proposed approach first identifies the candidate patterns in all images, and then it clusters the similar ones together to create a set of clusters. A representative pattern is chosen from each cluster, and these representative patterns form a library of symbols. We have tested our approach on a database of line drawings, and it achieved high accuracy in capturing and representing the contents of the line drawings.
TL;DR: This work presents a descriptor for symbols, especially for line drawings, based on the graph representation of graphical objects, and uses it for three applications: classification of graphical symbols, spotting of architectural symbols on floorplans, and classification of historical handwritten words.
Abstract: Graphical symbol recognition and spotting have recently become an important research activity. In this work we present a descriptor for symbols, especially for line drawings. The descriptor is based on the graph representation of graphical objects. We construct graphs from the vectorized information of the binarized images, where the critical points detected by the vectorization algorithm are considered as nodes and the lines joining them are considered as edges. Graph paths between two nodes in a graph are the finite sequences of nodes following the order from the starting to the final node. The occurrences of different graph paths in a given graph are an important feature, as they capture the geometrical and structural attributes of a graph. So the graph representing a symbol can efficiently be represented by the occurrences of its different paths. Their occurrences in a symbol can be obtained in terms of a histogram counting the number of some fixed prototype paths; we call this histogram the Bag-of-GraphPaths (BOGP). These BOGP histograms are used as a descriptor to measure the distance among the symbols in vector space. We use the descriptor for three applications: (1) classification of graphical symbols, (2) spotting of architectural symbols on floorplans, and (3) classification of historical handwritten words.
TL;DR: A new pre-processing step that denoises degraded printed documents using the Aujol and Chambolle algorithm, which extracts meaningful components from an image.
Abstract: With the improvement of printing technology since the 15th century, a huge number of printed documents have been published and distributed. These documents are degraded by time and need to be preprocessed to enhance image quality before being submitted to an image indexing strategy. This paper proposes a new pre-processing step that denoises these documents using the Aujol and Chambolle algorithm. This algorithm extracts meaningful components from an image; in this case, we can extract shapes, textures and noise. Some examples of specific processing applied to each layer are illustrated in this paper.
TL;DR: A complete framework is proposed, based on a user interaction scheme through a tactile device and exploiting image processing components, to achieve groundtruthing of real-life documents in a semi-automatic way.
Abstract: In this paper, we are interested in the groundtruthing problem for performance evaluation of symbol recognition & spotting systems. We propose a complete framework based on a user interaction scheme through a tactile device, exploiting image processing components to achieve groundtruthing of real-life documents in a semi-automatic way. It is based on a top-down matching algorithm, to make the recognition process less sensitive to context information. We have developed a specific architecture to achieve the recognition in constrained time, working with a sub-linear complexity and with extra memory cost.
TL;DR: A content-based scoring model for clothes matching is presented, and the scores it generates are compared with the scores given by human observers for real clothes images collected from internet shopping malls.
Abstract: A content-based scoring model for clothes matching is presented. The major color sets for upper and lower clothes are extracted using color grouping and clustering after discarding the background. All possible combinations between the two color sets are considered to measure the overall color harmony. The regions containing printed patterns on the clothes are detected and the pattern types are classified. The pattern matching score is also calculated using statistical characteristics of dispersity and directional uniformity of edge lines. The final clothes matching score is obtained via a linear weighted sum of the color harmony and pattern matching scores. In the experiments, the scores generated by the proposed model are compared with the scores given by human observers for real clothes images collected from internet shopping malls.
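The scoring scheme can be sketched in Python as follows. Several details are assumptions not stated in the abstract: the pairwise harmony function is a toy placeholder, the overall harmony is taken as the mean over all upper/lower color pairs, and the two weights are assumed to sum to one with a hypothetical default of 0.6 for color.

```python
from itertools import product


def overall_color_harmony(upper_colors, lower_colors, pair_harmony):
    """Average a pairwise harmony score over all upper/lower color combinations."""
    pairs = list(product(upper_colors, lower_colors))
    if not pairs:
        raise ValueError("both color sets must be non-empty")
    return sum(pair_harmony(u, l) for u, l in pairs) / len(pairs)


def clothes_matching_score(color_score, pattern_score, color_weight=0.6):
    """Linear weighted sum of the color harmony and pattern matching scores."""
    if not 0.0 <= color_weight <= 1.0:
        raise ValueError("color_weight must lie in [0, 1]")
    return color_weight * color_score + (1.0 - color_weight) * pattern_score


# Toy placeholder: two colors are "harmonious" only when they are identical.
def same_color(a, b):
    return 1.0 if a == b else 0.0


harmony = overall_color_harmony(["navy", "white"], ["navy", "beige"], same_color)
print(harmony)  # 0.25
print(clothes_matching_score(harmony, pattern_score=0.5))
```

In practice the weights would be tuned against the human observer scores mentioned in the abstract.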
TL;DR: A new method of chemical graph construction which is implemented in the chemical structure recognition and correction system ChemInfty, allowing users to interact with the graph-construction cycles and introduces semi-automated correction.
Abstract: This paper proposes a new method of chemical graph construction which is implemented in the chemical structure recognition and correction system ChemInfty (www.inftyproject.org/en/ChemInfty/).
The system starts with recognizing the graphical elements of the chemical structure such as lines and characters. In the chemical graph construction phase the validity of the chemical graph is checked to detect inconsistencies. The graph construction starts with an empty chemical graph using only the graphical components. After a solving cycle the system returns a partially solved graph which can be checked for inconsistencies again. This results in a flexible, cycle based and inconsistency-driven graph construction. Furthermore the system introduces semi-automated correction allowing users to interact with the graph-construction cycles.
TL;DR: A new algorithm is proposed for de-blurring of textual documents; there is no need to estimate the PSF and the proposed filter can be applied directly to the image.
Abstract: Document images may exhibit some blurred areas for a wide range of reasons, from digitization and filtering to storage problems. Most de-blurring algorithms are hard to implement, slow, and often try to be general, attempting to remove the blur in any kind of image. In the case of text document images, the transition between characters and the paper background has a high contrast. With that in mind, a new algorithm is proposed for de-blurring of textual documents; there is no need to estimate the PSF and the proposed filter can be applied directly to the image. The presented algorithm reached an improvement rate of 17.08% in the SSIM metric.
TL;DR: This paper generalizes a recent result on removing felt-pen highlighting, filtering out highlighting in monochromatic documents whose background is non-white due to the natural aging of paper.
Abstract: Text highlighting is often used to emphasize parts of a document. As highlighting is a personal choice of the reader, it can be seen as physically "damaging" the original document. A recent paper shows how to remove felt-pen highlighting in monochromatic documents with a white paper background. This paper generalizes that result to filter out highlighting in monochromatic documents whose background is non-white due to the natural aging of paper.
TL;DR: To improve the robustness, the parameters are altered to obtain image segmentation at multiple scales and perform component-level template matching across the image segmentations obtained at all scales.
Abstract: Symbol recognition in natural scenes plays an important role in a variety of applications such as driver assistance and environment awareness. We propose a solution including 3 phases: (1) Image segmentation, (2) component-level shape matching, and (3) structure matching. To improve the robustness, we alter the parameters to obtain image segmentation at multiple scales and perform component-level template matching across the image segmentation results obtained at all scales. By means of such exhaustive search across all possible segmentations, the chance to obtain finely matched components is increased. Some initial experimental results are obtained, which are encouraging.
TL;DR: The main idea is to decompose the symbol into a set of multi-scale local parts, some of which are unaffected or less affected by contextual interference, and then recognize the symbol based on detecting and integrating individual symbol parts.
Abstract: We present a new parts-based multi-scale recognition method for graphic symbols, especially those connecting or intersecting with other elements in the context. The main idea is to decompose the symbol into a set of multi-scale local parts, some of which are unaffected or less affected by contextual interference, and then recognize the symbol based on detecting and integrating individual symbol parts. An ensemble learning and classification scheme is employed, which combines three ingredients: 1) the multi-scale spatial pyramid representation of the symbol, which consists of local parts for matching; 2) the random-forest-based classification of symbol parts and discriminative learning of the mappings between parts and the symbol; 3) the probabilistic aggregation of individual part detections to form the symbol recognition output. The experimental results on simulation datasets show the effectiveness of the proposed method and its promising properties in handling non-segmented symbols.
TL;DR: A stroke ordering method that simulates online data from offline data is developed; the good curve fitting and stroke generation ability of the proposed method make it applicable to practical real-time symbol recognition applications.
Abstract: Approaches that integrate online and offline data types in a unified system are seldom found, although similar recognition algorithms are used for the two data streams. A practical solution that can represent online and offline hand-drawn graphic messages simultaneously is presented. Freehand-sketched graphics captured by a digitizing tablet (online) or a digital camera (offline) are approximated using the quadratic Bezier curve representation. A recursive architecture performing a piecewise curve approximation is proposed to represent pen strokes. As a primary unified tool for the online/offline symbol recognition system, a stroke ordering method that simulates online data from offline data is developed. The experimental results show the good curve fitting and stroke generation ability of the proposed method, which can be applied in practical real-time symbol recognition applications.
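A recursive piecewise approximation of a pen stroke by quadratic Bezier curves might look like the Python sketch below. The fitting strategy is an assumption rather than the authors' method: endpoints are fixed at the first and last samples, parameters come from chord length, the middle control point is a least-squares estimate, and the stroke is split at the worst-fitting sample. All names and the tolerance are hypothetical.

```python
import math


def bezier_point(p0, p1, p2, t):
    """Evaluate B(t) = (1-t)^2 P0 + 2(1-t)t P1 + t^2 P2."""
    a, b, c = (1 - t) ** 2, 2 * (1 - t) * t, t ** 2
    return (a * p0[0] + b * p1[0] + c * p2[0], a * p0[1] + b * p1[1] + c * p2[1])


def chord_params(points):
    """Chord-length parameter in [0, 1] for each sample of the stroke."""
    dists = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dists.append(dists[-1] + math.hypot(x1 - x0, y1 - y0))
    total = dists[-1] or 1.0
    return [d / total for d in dists]


def fit_quadratic(points):
    """Least-squares middle control point with endpoints fixed; returns curve and errors."""
    p0, p2 = points[0], points[-1]
    ts = chord_params(points)
    num_x = num_y = den = 0.0
    for (x, y), t in zip(points, ts):
        a, b, c = (1 - t) ** 2, 2 * (1 - t) * t, t ** 2
        num_x += b * (x - a * p0[0] - c * p2[0])
        num_y += b * (y - a * p0[1] - c * p2[1])
        den += b * b
    if den == 0.0:  # no interior information: fall back to the chord midpoint
        p1 = ((p0[0] + p2[0]) / 2, (p0[1] + p2[1]) / 2)
    else:
        p1 = (num_x / den, num_y / den)
    errors = []
    for (x, y), t in zip(points, ts):
        bx, by = bezier_point(p0, p1, p2, t)
        errors.append(math.hypot(bx - x, by - y))
    return (p0, p1, p2), errors


def fit_piecewise(points, tolerance):
    """Recursively split the stroke until each piece fits within tolerance."""
    curve, errors = fit_quadratic(points)
    worst = max(range(len(points)), key=errors.__getitem__)
    if errors[worst] <= tolerance or len(points) <= 3:
        return [curve]
    return (fit_piecewise(points[:worst + 1], tolerance)
            + fit_piecewise(points[worst:], tolerance))


# Hypothetical L-shaped stroke: a sharp corner forces at least one split.
stroke = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3)]
for segment in fit_piecewise(stroke, tolerance=0.05):
    print(segment)
```

Each returned segment is a triple of control points, and consecutive segments share an endpoint, so the pieces form a continuous curve along the stroke.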
TL;DR: A way to use precision and recall measures in total absence of ground truth to improve the quality of knowledge retrieval in the face of uncertainty.
Abstract: In this paper we present a way to use precision and recall measures in total absence of ground truth.
TL;DR: This paper presents a method for symbol description based on both spatio-structural and statistical features computed on elementary visual parts, called 'vocabulary'; the resulting attributed relational graph has interesting properties that allow it to be used efficiently for recognising structure and comparing its attribute signatures.
Abstract: In this paper, we present a method for symbol description based on both spatio-structural and statistical features computed on elementary visual parts, called 'vocabulary'. This extracted vocabulary is grouped by type (e.g., circle, corner) and serves as a basis for an attributed relational graph where spatial relational descriptors formalise the links between the vertices, formed by these types, labelled with global shape descriptors. The obtained attributed relational graph description has interesting properties that allow it to be used efficiently for recognising structure and comparing its attribute signatures. The method is experimentally validated in the context of electrical symbol recognition from wiring diagrams.
TL;DR: This paper proposes to rely on the black line that surrounds comic book drawings to automatically extract frames and text using a connected-component labeling analysis, and compares the approach with some existing methods found in the literature.
Abstract: Comic books constitute an important heritage in many countries. Nowadays, digitization allows searching directly by content instead of by metadata only (e.g. album title or author name). Few studies have been done in this direction. Only frame and speech balloon extraction have been tried, and only for simple page structures. In fact, the page structure depends on the author, which is why many different structures and drawings exist. Despite the differences, drawings have a common characteristic owing to the design process: they are all surrounded by a black line. In this paper, we propose to rely on this particularity of comic books to automatically extract frames and text using a connected-component labeling analysis. The approach is compared with some existing methods found in the literature and results are presented.
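Connected-component labeling itself can be sketched in a few lines of Python. The version below is a generic breadth-first labeling with 8-connectivity on a binary image where 1 marks black ink, returning a bounding box per component. The toy page and the function name are assumptions, and the paper's criteria for separating frames from text are not reproduced.

```python
from collections import deque


def label_components(image):
    """Label 8-connected components of 1-pixels; return labels and bounding boxes."""
    h, w = len(image), len(image[0])
    labels = [[0] * w for _ in range(h)]
    boxes = {}  # label -> (top, left, bottom, right)
    next_label = 0
    for y in range(h):
        for x in range(w):
            if image[y][x] != 1 or labels[y][x]:
                continue
            next_label += 1
            labels[y][x] = next_label
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and image[ny][nx] == 1 and not labels[ny][nx]):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
            boxes[next_label] = (top, left, bottom, right)
    return labels, boxes


# Toy page: a black frame outline with a small text-like blob inside it.
page = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 0, 1],
    [1, 0, 1, 1, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
]

_, boxes = label_components(page)
print(boxes)  # {1: (0, 0, 4, 6), 2: (2, 2, 2, 3)}
```

Here the large bounding box corresponds to the frame outline and the small one to the blob inside it. A size-based rule on the boxes is one plausible way to tell the two apart, though that rule is an assumption.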