TL;DR: An approach based on space density computations is used to choose an optimum indexing vocabulary for a collection of documents, demonstating the usefulness of the model.
Abstract: In a document retrieval, or other pattern matching environment where stored entities (documents) are compared with each other or with incoming patterns (search requests), it appears that the best indexing (property) space is one where each entity lies as far away from the others as possible; in these circumstances the value of an indexing system may be expressible as a function of the density of the object space; in particular, retrieval performance may correlate inversely with space density. An approach based on space density computations is used to choose an optimum indexing vocabulary for a collection of documents. Typical evaluation results are shown, demonstating the usefulness of the model.
TL;DR: A simple, efficient algorithm to locate all occurrences of any of a finite number of keywords in a string of text that has been used to improve the speed of a library bibliographic search program by a factor of 5 to 10.
Abstract: This paper describes a simple, efficient algorithm to locate all occurrences of any of a finite number of keywords in a string of text. The algorithm consists of constructing a finite state pattern matching machine from the keywords and then using the pattern matching machine to process the text string in a single pass. Construction of the pattern matching machine takes time proportional to the sum of the lengths of the keywords. The number of state transitions made by the pattern matching machine in processing the text string is independent of the number of keywords. The algorithm has been used to improve the speed of a library bibliographic search program by a factor of 5 to 10.
TL;DR: The Rete Match Algorithm is an efficient method for comparing a large collection of patterns to a largeCollection of objects that finds all the objects that match each pattern.
TL;DR: The algorithm has the unusual property that, in most cases, not all of the first i.” in another string, are inspected.
Abstract: An algorithm is presented that searches for the location, “il” of the first occurrence of a character string, “pat,” in another string, “string.” During the search operation, the characters of pat are matched starting with the last character of pat. The information gained by starting the match at the end of the pattern often allows the algorithm to proceed in large jumps through the text being searched. Thus the algorithm has the unusual property that, in most cases, not all of the first i characters of string are inspected. The number of characters actually inspected (on the average) decreases as a function of the length of pat. For a random English pattern of length 5, the algorithm will typically inspect i/4 characters of string before finding a match at i. Furthermore, the algorithm has been implemented so that (on the average) fewer than i + patlen machine instructions are executed. These conclusions are supported with empirical evidence and a theoretical analysis of the average behavior of the algorithm. The worst case behavior of the algorithm is linear in i + patlen, assuming the availability of array space for tables linear in patlen plus the size of the alphabet.
TL;DR: An image alignment algorithm based on the matching of relative gradient maps between images that has the advantage of robustness against inhomogeneous illumination variations and the efficiency and robustness of the proposed algorithm are presented.
Abstract: We present an image alignment algorithm based on the matching of relative gradient maps between images. This algorithm consists of two stages; namely, a learning-based approximate pattern search and an iterative energy-minimization procedure for matching relative image gradient. The first stage finds some candidate poses of the pattern in the image through a fast search of the best match of the relative gradient features from the database of training feature vectors. The training database is obtained from the synthesis of the template image under a number of uniform samplings in a range of the geometric transformation space. Subsequently, the approximate candidate poses are further verified and refined by matching the relative gradient images through an iterative energy-minimization procedure. This approach based on the matching of relative gradients has the advantage of robustness against inhomogeneous illumination variations. Some experimental results are shown to demonstrate the efficiency and robustness of the proposed algorithm.