TL;DR: This paper investigates a generalization of string matching in which the pattern is a sequence of pattern elements, each compatible with a set of symbols. It shows that generalized string matching requires a time-space product of $\Omega ({{n^2 } / {\log n}})$ on a powerful model of computation when the alphabet is restricted to n symbols.
Abstract: Given a pattern string of length n and an object string of length m, the string matching problem asks for the positions of all occurrences of the pattern in the object string. This paper investigates a generalization of string matching, in which the pattern is a sequence of pattern elements, each compatible with a set of symbols. The alphabet of symbols is infinite, with its members encoded in a finite alphabet. In contrast to standard string matching, which can be solved in simultaneous linear time and constant space, it is shown that generalized string matching requires a time-space product of $\Omega ({{n^2 } / {\log n}})$ on a powerful model of computation, when the alphabet is restricted to n symbols. Our proof uses a method of Borodin. The obvious algorithm for generalized string matching requires time $O(NM)$, where N is the length of the encoding of the pattern, and M is that of the object string. We describe an algorithm which solves generalized string matching in time $O(N + M + mN^{{1 / 2}} {\o...
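To make the problem statement concrete, the following is a minimal sketch of the obvious quadratic approach mentioned in the abstract, not the faster algorithm the paper describes. It assumes each pattern element is given as a Python set of compatible symbols; the function name `generalized_match` and the sample data are hypothetical. The abstract's $O(NM)$ bound is stated over encoding lengths, whereas this sketch simply counts set lookups.

```python
from typing import List, Sequence, Set


def generalized_match(pattern: Sequence[Set[str]], text: Sequence[str]) -> List[int]:
    """Return every start index at which the pattern occurs in the text.

    Each pattern element is a set of symbols it is compatible with
    (an assumed representation, chosen for illustration only).
    """
    n, m = len(pattern), len(text)
    hits = []
    # Try every alignment of the pattern against the object string.
    for start in range(m - n + 1):
        # An alignment matches when every element accepts its aligned symbol.
        if all(text[start + i] in pattern[i] for i in range(n)):
            hits.append(start)
    return hits


if __name__ == "__main__":
    pattern = [{"a", "b"}, {"c"}, {"a", "c", "d"}]
    text = list("acabcdbca")
    print(generalized_match(pattern, text))  # prints [0, 3, 6]
```

Standard string matching is the special case in which every pattern element is a singleton set.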
TL;DR: A text input method is described for an electronic apparatus having a user interface with text input means and a display screen, where word completion functionality predicts word candidates for partial word inputs made by the user with the text input means.
Abstract: A text input method is described for an electronic apparatus having a user interface with text input means and a display screen. Word completion functionality is provided for predicting word candidates for partial word inputs made by the user with the text input means. The method involves receiving a partial word input from the user and deriving a set of word completion candidates using the word completion functionality. Each of the word completion candidates in the set has a prefix and a suffix, wherein the prefix corresponds to the partial word input. The method also involves presenting the suffixes for at least a subset of the word completion candidates in a predetermined area on the display screen, wherein each of the presented suffixes is made selectable for the user. In an embodiment, this predetermined area is the space bar of a virtual keyboard, and the part of the space bar that retains its original function shrinks as more candidate suffixes are displayed there.
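The prefix and suffix split described above can be sketched as follows. This is an illustrative sketch only: the abstract does not say how candidates are derived or ranked, so the word list, its preference ordering, the `limit` parameter, and the function name `suffix_candidates` are all assumptions.

```python
from typing import List


def suffix_candidates(partial: str, lexicon: List[str], limit: int = 3) -> List[str]:
    """Return up to `limit` suffixes that complete `partial` into a lexicon word.

    The lexicon is assumed to be ordered by preference (hypothetical);
    a real system would use its own word completion functionality.
    """
    suffixes: List[str] = []
    for word in lexicon:
        # A candidate has the partial input as its prefix and a non-empty suffix.
        if word.startswith(partial) and len(word) > len(partial):
            suffix = word[len(partial):]
            if suffix not in suffixes:
                suffixes.append(suffix)
        if len(suffixes) == limit:
            break
    return suffixes


if __name__ == "__main__":
    lexicon = ["the", "there", "then", "this", "that", "they"]
    # Only the suffixes would be shown as selectable items on the display.
    print(suffix_candidates("the", lexicon))  # prints ['re', 'n', 'y']
```

Selecting a presented suffix would then append it to the partial input to form the completed word.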
TL;DR: In this paper, a device for automatically identifying the language of a text from a plurality of languages extracts words from the text and constructs all of the character strings contained in each extracted word.
Abstract: After prestoring first character strings that occur frequently in words of languages and second character strings that are atypical therein, a device for automatically identifying the language of a text from a plurality of languages extracts words from the text and constructs all of the character strings contained in each extracted word. Each string in an extracted word is compared to the first and second strings of a particular language. If the word contains a first string, a score of the language is increased by a coefficient depending in particular on the position of the first string in the word. If the word contains a second string, the score is decreased by a coefficient associated with the second string. The highest of the scores corresponding to the predetermined languages identifies the language of the text.
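The scoring scheme described above can be sketched as follows. The abstract does not give the string tables, the coefficients, or the exact position rule, so everything concrete here is an assumption: the tiny per-language tables, the integer weights, and the rule that a frequent string counts double when it starts or ends the word. The sketch also scores every occurrence of a string rather than once per word, which is a simplification.

```python
from typing import Dict, List, Tuple

# Maps language name to (frequent strings, atypical strings); hypothetical data.
Profile = Tuple[Dict[str, int], Dict[str, int]]


def substrings(word: str) -> List[Tuple[int, str]]:
    """Return (start index, string) for every character string in the word."""
    return [(i, word[i:j]) for i in range(len(word)) for j in range(i + 1, len(word) + 1)]


def score_language(words: List[str], frequent: Dict[str, int], atypical: Dict[str, int]) -> int:
    """Score one language: reward frequent strings, penalize atypical ones."""
    score = 0
    for word in words:
        for start, s in substrings(word):
            if s in frequent:
                # Assumed position rule: double weight at the word's start or end.
                at_edge = start == 0 or start + len(s) == len(word)
                score += frequent[s] * (2 if at_edge else 1)
            if s in atypical:
                score -= atypical[s]
    return score


def identify(text: str, profiles: Dict[str, Profile]) -> str:
    """Return the language whose score is highest for the given text."""
    words = text.lower().split()
    scores = {lang: score_language(words, freq, atyp) for lang, (freq, atyp) in profiles.items()}
    return max(scores, key=lambda lang: scores[lang])


if __name__ == "__main__":
    profiles: Dict[str, Profile] = {
        "english": ({"th": 2, "ing": 3}, {"sch": 2}),
        "german": ({"sch": 3, "ung": 3}, {"th": 1}),
    }
    print(identify("the thing is moving", profiles))         # prints english
    print(identify("die schule und die zeitung", profiles))  # prints german
```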
TL;DR: In this paper, a system and method for compiling weighted context-dependent rewrite rules into weighted finite-state transducers introduces context marking symbols only when and where they are needed. The compiler and compiling method use a composition of five simple finite-state transducers generated from a weighted context-dependent rewrite rule to represent that rule.
Abstract: A system and method for compiling weighted context-dependent rewrite rules into weighted finite-state transducers introduces context marking symbols only when and where they are needed. In particular, the compiler and compiling method use a composition of five simple finite-state transducers generated from a weighted context-dependent rewrite rule to represent that rule. An "r" finite-state transducer is generated from the right context portion ρ of the weighted context-dependent rewrite rule. An "f" finite-state transducer is generated from the rewritten portion φ of the weighted context-dependent rewrite rule. A "Replace" finite-state transducer is generated from the rewritten and replacement portions φ and ψ of the weighted context-dependent rewrite rule. Finally, "l 1 " and "l 2 " finite-state transducers are generated from the left context portion λ of the weighted context-dependent rewrite rule. The "r" and "f" finite-state transducer generators of the compiler and the transducer generating steps of the compiling method introduce the context marking symbols " " in the various finite-state transducers only when and where they are needed. The right context marker symbol ">" is added to the "r" finite-state transducer only immediately before each occurrence of ρ. The left context markers " ". The "Replace", "l 1 " and "l 2 " finite-state transducers then appropriately remove the right and left context markers when replacing φ with ψ, and whether λ precedes φ in the input string.
TL;DR: This paper presents algorithms for three problems having to do with approximate matching for such trees with variable length don't cares (VLDCs), with time complexity O(|P| × |D| × min(depth(P), leaves(P)) × min(depth(D), leaves(D))).