TL;DR: A statistical shift-reduce parser that bridges the gap between deterministic and probabilistic parsers, allowing for a simple combination that produces precision and recall of 90.9% and 90.7%, respectively.
Abstract: Recently proposed deterministic classifier-based parsers (Nivre and Scholz, 2004; Sagae and Lavie, 2005; Yamada and Matsumoto, 2003) offer attractive alternatives to generative statistical parsers. Deterministic parsers are fast, efficient, and simple to implement, but generally less accurate than optimal (or nearly optimal) statistical parsers. We present a statistical shift-reduce parser that bridges the gap between deterministic and probabilistic parsers. The parsing model is essentially the same as one previously used for deterministic parsing, but the parser performs a best-first search instead of a greedy search. Using the standard sections of the WSJ corpus of the Penn Treebank for training and testing, our parser has 88.1% precision and 87.8% recall (using automatically assigned part-of-speech tags). Perhaps more interestingly, the parsing model is significantly different from the generative models used by other well-known accurate parsers, allowing for a simple combination that produces precision and recall of 90.9% and 90.7%, respectively.
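The contrast between greedy and best-first search in a shift-reduce parser can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: `toy_action_probs` is a hypothetical stand-in for a trained classifier, the action set is reduced to unlabeled `shift` and binary `reduce`, and none of it reproduces the paper's actual model or features. A greedy parser would follow only the single most probable action in each state; here, partial parses are kept in a priority queue ordered by accumulated negative log-probability, so the first complete parse popped is the most probable one under the toy model.

```python
import heapq
import itertools
import math


def toy_action_probs(stack, buffer):
    """Hypothetical classifier stand-in: returns {action: probability}."""
    probs = {}
    if buffer:
        probs["shift"] = 0.6
    if len(stack) >= 2:
        probs["reduce"] = 0.4 if buffer else 1.0
    total = sum(probs.values())
    return {action: p / total for action, p in probs.items()}


def apply_action(action, stack, buffer):
    """Return the (stack, buffer) pair that results from one parser action."""
    if action == "shift":
        return stack + (buffer[0],), buffer[1:]
    # reduce: combine the top two stack items into one binary node
    left, right = stack[-2], stack[-1]
    return stack[:-2] + ((left, right),), buffer


def best_first_parse(words, action_probs=toy_action_probs, max_steps=10000):
    """Best-first search over parser states; returns (tree, probability) or None."""
    tie_breaker = itertools.count()  # keeps the heap from comparing states
    heap = [(0.0, next(tie_breaker), (), tuple(words))]
    for _step in range(max_steps):
        if not heap:
            return None
        cost, _order, stack, buffer = heapq.heappop(heap)
        if not buffer and len(stack) == 1:
            # Costs never decrease, so the first final state popped is the best.
            return stack[0], math.exp(-cost)
        for action, p in action_probs(stack, buffer).items():
            if p > 0:
                new_stack, new_buffer = apply_action(action, stack, buffer)
                heapq.heappush(
                    heap,
                    (cost - math.log(p), next(tie_breaker), new_stack, new_buffer),
                )
    return None


tree, prob = best_first_parse(["a", "b", "c"])
print(tree, round(prob, 3))
```

Because each action probability is at most 1, the accumulated cost of a state can only grow as the parse is extended, which is what makes popping the cheapest state first safe in this sketch.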
TL;DR: SARDSRN extends the SRN by explicitly representing the input sequence in a SARDNET self-organizing map, which allows SARDSRN to learn to parse sentences with more complicated structure than can the SRN alone, and suggests that the approach could scale up to realistic natural language.
Abstract: Simple Recurrent Networks (SRNs) have been widely used in natural language tasks. SARDSRN extends the SRN by explicitly representing the input sequence in a SARDNET self-organizing map. The distributed SRN component leads to good generalization and robust cognitive properties, whereas the SARDNET map provides exact representations of the sentence constituents. This combination allows SARDSRN to learn to parse sentences with more complicated structure than can the SRN alone, and suggests that the approach could scale up to realistic natural language.
TL;DR: A parsing technique for Bangla grammar recognition that can detect all forms of Bangla sentences, even nontraditional ones, and is free from the problems of left factoring and left recursion.
Abstract: Parsers play a very important role in computational linguistics. In this paper, we describe a parsing technique for Bangla grammar recognition. The parser is, by nature, a shift-reduce parser and constructs a parse table based on the LR strategy. It takes the context-free grammar (CFG) of the Bangla language as input and constructs a parse table from the grammar. The parse table is traversed bottom-up. This parser is free from the problems of left factoring and left recursion. To avoid handling the inflections (BIVOKTI) of Bangla, we describe a new approach; hence only the main form of each Bangla word is stored in the repository. Our experiments show that the scheme can detect all forms of Bangla sentences, even nontraditional forms.
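A table-driven shift-reduce recognizer of the kind described above can be sketched as follows. This is only an illustration under stated assumptions: the grammar is a hypothetical two-rule, left-recursive toy grammar (not the paper's Bangla CFG), and the `ACTION` and `GOTO` tables were written by hand for it rather than generated from a grammar. It shows why left recursion needs no grammar rewriting in an LR parser: the rule `S -> S n` is handled directly by the table.

```python
# Toy left-recursive grammar (an assumption, not the paper's Bangla CFG):
#   rule 1: S -> S n
#   rule 2: S -> n
RULES = {1: ("S", 2), 2: ("S", 1)}  # rule number -> (left-hand side, RHS length)

# Hand-built SLR(1) tables for the toy grammar; "$" marks end of input.
ACTION = {
    (0, "n"): ("shift", 2),
    (1, "n"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "n"): ("reduce", 2),
    (2, "$"): ("reduce", 2),
    (3, "n"): ("reduce", 1),
    (3, "$"): ("reduce", 1),
}
GOTO = {(0, "S"): 1}


def lr_recognize(tokens):
    """Return True if the token sequence is accepted by the toy grammar."""
    stack = [0]                      # stack of parser states
    stream = list(tokens) + ["$"]
    pos = 0
    while True:
        entry = ACTION.get((stack[-1], stream[pos]))
        if entry is None:
            return False             # no table entry: syntax error
        kind, arg = entry
        if kind == "shift":
            stack.append(arg)        # push the new state, consume the token
            pos += 1
        elif kind == "reduce":
            lhs, length = RULES[arg]
            del stack[-length:]      # pop one state per right-hand-side symbol
            stack.append(GOTO[(stack[-1], lhs)])
        else:
            return True              # accept


print(lr_recognize(["n", "n", "n"]))
print(lr_recognize(["n", "x"]))
```

A real parser generator would derive both tables from the grammar automatically; they are written out here only to keep the sketch short.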
TL;DR: The results from applying this algorithm to a diverse collection of faulty grammars show that the algorithm is practical, effective, and suitable for inclusion in other LALR parser generators.
Abstract: Writing a parser remains remarkably painful. Automatic parser generators offer a powerful and systematic way to parse complex grammars, but debugging conflicts in grammars can be time-consuming even for experienced language designers. Better tools for diagnosing parsing conflicts will alleviate this difficulty. This paper proposes a practical algorithm that generates compact, helpful counterexamples for LALR grammars. For each parsing conflict in a grammar, a counterexample demonstrating the conflict is constructed. When the grammar in question is ambiguous, the algorithm usually generates a compact counterexample illustrating the ambiguity. This algorithm has been implemented as an extension to the CUP parser generator. The results from applying this implementation to a diverse collection of faulty grammars show that the algorithm is practical, effective, and suitable for inclusion in other LALR parser generators.
TL;DR: A variant of the parsing algorithm for DeSR, a statistical transition-based dependency parser that learns from a training corpus suitable actions to take in order to build a parse tree while scanning a sentence, is proposed.
Abstract: DeSR is a statistical transition-based dependency parser that learns from a training corpus suitable actions to take in order to build a parse tree while scanning a sentence. DeSR can be configured to use different feature models and classifier types. We tuned the parser for the Evalita 2011 corpora by performing several feature selection experiments and by adding some new features. The submitted run used DeSR with two additional techniques: (1) reverse revision parsing, which addresses the problem of long-distance dependencies by extracting hints from the output of a first parser as input to a second parser running in the opposite direction; (2) parser combination, which consists of combining the outputs of different configurations of the parser. The submission achieved the best accuracy among pure statistical parsers. An analysis of the errors shows that the accuracy is quite high on half of the test set and lower on the second half, which belongs to a different domain. We propose a variant of the parsing algorithm to address these shortcomings.
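The general idea of transition-based dependency parsing, choosing one action at a time while scanning the sentence, can be sketched as below. This is a generic arc-standard style loop written for illustration, not DeSR's actual action set, features, or classifier: the names `transition_parse` and `choose_action` are hypothetical, and the policy in the usage example is a fixed script standing in for a learned classifier.

```python
def transition_parse(words, choose_action):
    """Build a head map {dependent index: head index} with shift/left/right actions.

    `choose_action(stack, buffer, words)` is a hypothetical policy returning
    "shift", "left", or "right"; in a trained parser it would be a classifier.
    The root word receives head -1.
    """
    stack = []
    buffer = list(range(len(words)))  # word indices still to be read
    heads = {}
    while buffer or len(stack) > 1:
        action = choose_action(stack, buffer, words)
        if action == "shift" and buffer:
            stack.append(buffer.pop(0))
        elif action == "left" and len(stack) >= 2:
            dependent = stack.pop(-2)   # second-from-top depends on the top
            heads[dependent] = stack[-1]
        elif action == "right" and len(stack) >= 2:
            dependent = stack.pop()     # top depends on the item below it
            heads[dependent] = stack[-1]
        else:
            raise ValueError(f"action {action!r} is not valid in this state")
    if stack:
        heads[stack[0]] = -1            # the last remaining word is the root
    return heads


# Usage with a scripted policy in place of a learned one.
script = iter(["shift", "shift", "left", "shift", "left"])
heads = transition_parse(["the", "dog", "barks"], lambda s, b, w: next(script))
print(heads)
```

Each word is shifted once and attached once, so the loop runs in a number of steps linear in sentence length when the policy is deterministic.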