About: Boyer–Moore–Horspool algorithm is a research topic. Over its lifetime, 105 publications have been published on this topic, receiving 10772 citations. The topic is also known as: Boyer-Moore-Horspool algorithm & Horspool's algorithm.
TL;DR: An algorithm is presented which finds all occurrences of one given string within another, in running time proportional to the sum of the lengths of the strings. A theoretical application shows that the set of concatenations of even palindromes, i.e., the language $\{\alpha \alpha ^R\}^*$, can be recognized in linear time.
Abstract: An algorithm is presented which finds all occurrences of one given string within another, in running time proportional to the sum of the lengths of the strings. The constant of proportionality is low enough to make this algorithm of practical use, and the procedure can also be extended to deal with some more general pattern-matching problems. A theoretical application of the algorithm shows that the set of concatenations of even palindromes, i.e., the language $\{\alpha \alpha ^R\}^*$, can be recognized in linear time. Other algorithms which run even faster on the average are also considered.
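The abstract above does not spell out the procedure. As a minimal sketch of one way to reach running time proportional to the sum of the two lengths, the following uses the failure-table technique commonly associated with this result. The function names `failure_table` and `find_all` are illustrative assumptions, not taken from the paper.

```python
def failure_table(pattern):
    """fail[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]  # fall back to a shorter border
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def find_all(pattern, text):
    """Return the start index of every (possibly overlapping) occurrence."""
    if not pattern:
        return []
    fail = failure_table(pattern)
    hits = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = fail[k - 1]  # keep going to catch overlapping matches
    return hits
```

The text index `i` never moves backwards, which is what keeps the scan linear in the text length; the table is built once, in time linear in the pattern length.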
TL;DR: A simple, efficient algorithm locates all occurrences of any of a finite number of keywords in a string of text; it has been used to improve the speed of a library bibliographic search program by a factor of 5 to 10.
Abstract: This paper describes a simple, efficient algorithm to locate all occurrences of any of a finite number of keywords in a string of text. The algorithm consists of constructing a finite state pattern matching machine from the keywords and then using the pattern matching machine to process the text string in a single pass. Construction of the pattern matching machine takes time proportional to the sum of the lengths of the keywords. The number of state transitions made by the pattern matching machine in processing the text string is independent of the number of keywords. The algorithm has been used to improve the speed of a library bibliographic search program by a factor of 5 to 10.
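The two phases described above (build a pattern matching machine from the keywords, then process the text in a single pass) might be sketched as follows. This is an illustrative reconstruction under assumptions: the names `build_machine` and `search`, and the representation of the machine as three parallel lists, are hypothetical and not taken from the paper.

```python
from collections import deque


def build_machine(keywords):
    """Build goto, failure, and output tables for a set of keywords."""
    goto = [{}]    # goto[state][char] -> next state (state 0 is the root)
    fail = [0]     # fail[state] -> state to fall back to on a mismatch
    output = [[]]  # output[state] -> keywords that end at this state
    for word in keywords:
        if not word:
            continue  # ignore empty keywords
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                output.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].append(word)
    # Breadth-first pass: failure links of shallower states are ready first.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            output[nxt] = output[nxt] + output[fail[nxt]]
    return goto, fail, output


def search(keywords, text):
    """Return (start_index, keyword) for every occurrence, in one pass."""
    goto, fail, output = build_machine(keywords)
    state = 0
    hits = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in output[state]:
            hits.append((i - len(word) + 1, word))
    return hits
```

For example, `search(["he", "she", "his", "hers"], "ushers")` reports `she` at index 1 and both `he` and `hers` at index 2, reading each text character once.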
TL;DR: The algorithm has the unusual property that, in most cases, not all of the first i characters of string are inspected.
Abstract: An algorithm is presented that searches for the location, “i,” of the first occurrence of a character string, “pat,” in another string, “string.” During the search operation, the characters of pat are matched starting with the last character of pat. The information gained by starting the match at the end of the pattern often allows the algorithm to proceed in large jumps through the text being searched. Thus the algorithm has the unusual property that, in most cases, not all of the first i characters of string are inspected. The number of characters actually inspected (on the average) decreases as a function of the length of pat. For a random English pattern of length 5, the algorithm will typically inspect i/4 characters of string before finding a match at i. Furthermore, the algorithm has been implemented so that (on the average) fewer than i + patlen machine instructions are executed. These conclusions are supported with empirical evidence and a theoretical analysis of the average behavior of the algorithm. The worst case behavior of the algorithm is linear in i + patlen, assuming the availability of array space for tables linear in patlen plus the size of the alphabet.
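To illustrate how matching from the last character of pat permits large jumps, here is a minimal sketch of the Horspool simplification that gives this topic its name. It keeps only a single per-character shift table, so it is not the full algorithm of the abstract and does not carry its linear worst-case bound. The function name `horspool_find` and the comparison counter are assumptions added for illustration.

```python
def horspool_find(pat, string):
    """Return (index of first occurrence or -1, number of comparisons made)."""
    m, n = len(pat), len(string)
    if m == 0:
        return 0, 0
    # For each character of pat except the last: distance from its
    # rightmost occurrence to the end of pat. Later positions overwrite
    # earlier ones, so the smallest (safe) shift is kept.
    shift = {ch: m - 1 - j for j, ch in enumerate(pat[:-1])}
    comparisons = 0
    pos = 0  # current alignment of pat against string
    while pos + m <= n:
        j = m - 1
        while j >= 0:  # compare right to left
            comparisons += 1
            if string[pos + j] != pat[j]:
                break
            j -= 1
        if j < 0:
            return pos, comparisons
        # Jump according to the text character under the last pattern slot;
        # characters absent from the table allow a full jump of m.
        pos += shift.get(string[pos + m - 1], m)
    return -1, comparisons
```

Searching for `"abc"` in `"xxxxxxabc"` finds the match at index 6 after 5 comparisons, fewer than the 9 characters that precede the end of the match.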
TL;DR: A substring search algorithm that is faster than the Boyer-Moore algorithm and does not depend on scanning the pattern string in any particular order is described.
Abstract: This article describes a substring search algorithm that is faster than the Boyer-Moore algorithm. This algorithm does not depend on scanning the pattern string in any particular order. Three variations of the algorithm are given that use three different pattern scan orders. These include: (1) a “Quick Search” algorithm; (2) a “Maximal Shift” and (3) an “Optimal Mismatch” algorithm.
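The abstract names the “Quick Search” variant without giving its shift rule. The sketch below follows the rule as it is commonly described: the shift is chosen from the text character immediately after the current window, so the window itself may be compared in any order. Treat that rule, and the name `quick_search`, as assumptions rather than as the article's exact formulation.

```python
def quick_search(pat, text):
    """Return the index of the first occurrence of pat in text, or -1."""
    m, n = len(pat), len(text)
    if m == 0:
        return 0
    # Shift for a character: distance from its rightmost occurrence in pat
    # to one position past the end of pat.
    shift = {ch: m - j for j, ch in enumerate(pat)}
    pos = 0
    while pos + m <= n:
        # The window comparison order is irrelevant to the shift rule,
        # so a plain slice comparison is used here.
        if text[pos:pos + m] == pat:
            return pos
        if pos + m >= n:
            break  # no character follows the window
        # Characters absent from pat allow a jump of m + 1.
        pos += shift.get(text[pos + m], m + 1)
    return -1
```

Because the deciding character lies outside the window, the maximum jump is one position longer than the pattern.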
TL;DR: A cleaner version and a simpler analysis of the GS algorithm that corrects the algorithm given in [GS2] for the computation of periods and presents an optimal parallel algorithm for pattern preprocessing.
Abstract: We address several technical problems related to the time-space optimal string-matching algorithm of Galil and Seiferas (called the GS algorithm). This algorithm contains a parameter k on which the complexity depends and that originally satisfies k ≥ 4. We show that k = 3 is the least integer for which the GS algorithm works. This value of the parameter k also minimizes the time of the search phase of the string-searching algorithm. With the parameter k = 2 we consider a simpler version of the algorithm working in linear time and logarithmic space. This algorithm is based on the following fact: any word of length n starts with fewer than $\log_{\Phi} n$ squares of primitive prefixes. Fibonacci words have a logarithmic number of square prefixes. Hence, the combinatorics of prefix squares and cubes is essential for string-matching with small memory.
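The combinatorial fact about prefix squares can be checked on small inputs by brute force. The sketch below is only an illustration of the statement, not part of the GS algorithm; the helper names are hypothetical, and Φ is assumed here to denote the golden ratio.

```python
import math


def is_primitive(u):
    """A word is primitive if it is not a power of a shorter word.
    Equivalently, u occurs in u + u only at offsets 0 and len(u)."""
    return (u + u).find(u, 1) == len(u)


def primitive_square_prefixes(word):
    """Return every primitive u such that u + u is a prefix of word."""
    roots = []
    for k in range(1, len(word) // 2 + 1):
        u = word[:k]
        if word[k:2 * k] == u and is_primitive(u):
            roots.append(u)
    return roots


def fibonacci_word(steps):
    """Start from 'a', 'ab' and repeatedly append the previous word."""
    a, b = "a", "ab"
    for _ in range(steps):
        a, b = b, b + a
    return b


word = fibonacci_word(4)                 # "abaababaabaab", length 13
roots = primitive_square_prefixes(word)  # ["aba", "abaab"]
phi = (1 + math.sqrt(5)) / 2             # assumed meaning of the base
bound = math.log(len(word), phi)
```

For this word of length 13 there are two such prefixes, which is below the bound of roughly 5.3.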