String-to-string correction problem

Papers

Journal Article • 10.1145/321796.321811

The String-to-String Correction Problem

Robert A. Wagner¹, Michael J. Fischer²

Vanderbilt University¹, Massachusetts Institute of Technology²

01 Jan 1974-Journal of the ACM

TL;DR: An algorithm is presented which solves the string-to-string correction problem in time proportional to the product of the lengths of the two strings.

Abstract: The string-to-string correction problem is to determine the distance between two strings as measured by the minimum cost sequence of “edit operations” needed to change the one string into the other. The edit operations investigated allow changing one symbol of a string into another single symbol, deleting one symbol from a string, or inserting a single symbol into a string. An algorithm is presented which solves this problem in time proportional to the product of the lengths of the two strings. Possible applications are to the problems of automatic spelling correction and determining the longest subsequence of characters common to two strings.
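
The dynamic-programming idea behind the quadratic-time bound can be sketched as follows. This is a minimal illustration, not the paper's own pseudocode: the function name `edit_distance` and the cost parameters `sub_cost`, `ins_cost`, and `del_cost` are assumptions chosen for this example, and unit costs are only a default.

```python
def edit_distance(a, b, sub_cost=1, ins_cost=1, del_cost=1):
    """Minimum total cost of edits that turn string a into string b.

    Hypothetical helper for illustration; the cost parameters are assumptions.
    """
    m, n = len(a), len(b)
    # d[i][j] holds the minimum cost of turning a[:i] into b[:j].
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i * del_cost          # delete every symbol of a[:i]
    for j in range(1, n + 1):
        d[0][j] = j * ins_cost          # insert every symbol of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            same = a[i - 1] == b[j - 1]
            change = d[i - 1][j - 1] + (0 if same else sub_cost)
            delete = d[i - 1][j] + del_cost
            insert = d[i][j - 1] + ins_cost
            d[i][j] = min(change, delete, insert)
    return d[m][n]


print(edit_distance("kitten", "sitting"))
```

The table has (m + 1) × (n + 1) cells and each is filled in constant time, which gives running time proportional to the product of the two lengths. Under the assumption that a substitution costs 2 while insertions and deletions cost 1, the result equals m + n minus twice the length of the longest common subsequence, which is one way to see the connection to the subsequence application mentioned in the abstract.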

3,523 citations

Journal Article • 10.1145/375360.375365

A guided tour to approximate string matching

Gonzalo Navarro¹

University of Chile¹

01 Mar 2001-ACM Computing Surveys

TL;DR: This work surveys the current techniques to cope with the problem of string matching that allows errors, and focuses on online searching and mostly on edit distance, explaining the problem and its relevance, its statistical behavior, its history and current developments, and the central ideas of the algorithms.

Abstract: We survey the current techniques to cope with the problem of string matching that allows errors. This is becoming a more and more relevant issue for many fast growing areas such as information retrieval and computational biology. We focus on online searching and mostly on edit distance, explaining the problem and its relevance, its statistical behavior, its history and current developments, and the central ideas of the algorithms and their complexities. We present a number of experiments to compare the performance of the different algorithms and show which are the best choices. We conclude with some directions for future work and open problems.
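
As a rough illustration of online searching under edit distance, the sketch below scans the text once and reports every end position where the pattern matches with at most k errors. It uses the classic column-by-column dynamic-programming scheme in which a match may start at any text position. It is one simple baseline, not necessarily among the best-performing algorithms the survey compares, and the name `approximate_matches` and its parameters are assumptions made for this example.

```python
def approximate_matches(pattern, text, k):
    """Return 1-based end positions in text where pattern occurs
    with edit distance at most k (unit costs).

    Hypothetical helper for illustration only.
    """
    m = len(pattern)
    # col[i] = best cost of matching pattern[:i] against some suffix
    # of the text read so far. Before any text is read, col[i] = i.
    col = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text, start=1):
        prev_diag = col[0]
        col[0] = 0                      # a match may start at any position
        for i in range(1, m + 1):
            old = col[i]                # value for the previous text position
            col[i] = min(
                prev_diag + (pattern[i - 1] != ch),  # match or substitute
                old + 1,                             # extra text symbol
                col[i - 1] + 1,                      # missing pattern symbol
            )
            prev_diag = old
        if col[m] <= k:
            ends.append(j)
    return ends


print(approximate_matches("abc", "xabdx", 1))
```

Only one column of the table is kept in memory, so space is proportional to the pattern length while time stays proportional to the product of pattern and text lengths.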

3,075 citations

Book

Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison

David Sankoff, Joseph B. Kruskal

1 Aug 1983

1,965 citations

Journal Article • 10.1109/34.682181

Learning string-edit distance

Eric Sven Ristad¹, Peter N. Yianilos¹

Princeton University¹

01 May 1998-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The stochastic model allows us to learn a string-edit distance function from a corpus of examples and is applicable to any string classification problem that may be solved using a similarity function against a database of labeled prototypes.

Abstract: In many applications, it is necessary to determine the similarity of two strings. A widely-used notion of string similarity is the edit distance: the minimum number of insertions, deletions, and substitutions required to transform one string into the other. In this report, we provide a stochastic model for string-edit distance. Our stochastic model allows us to learn a string-edit distance function from a corpus of examples. We illustrate the utility of our approach by applying it to the difficult problem of learning the pronunciation of words in conversational speech. In this application, we learn a string-edit distance with nearly one-fifth the error rate of the untrained Levenshtein distance. Our approach is applicable to any string classification problem that may be solved using a similarity function against a database of labeled prototypes.

994 citations

Journal Article • 10.1016/J.TCS.2004.12.030

A survey on tree edit distance and related problems

Philip Bille¹

IT University of Copenhagen¹

09 Jun 2005-Theoretical Computer Science

TL;DR: This work surveys the problem of comparing labeled trees based on simple local operations of deleting, inserting, and relabeling nodes and presents one or more of the central algorithms for solving the problem.

920 citations

Year	Papers
2018	1
2017	6
2016	13
2015	23
2014	18
2013	13
