TL;DR: This book is a guide for linguistic fieldworkers who wish to write a description of the morphology and syntax of one of the world's many under-documented languages.
Abstract: Current estimates are that around 3,000 of the 6,000 languages now spoken may become extinct during the next century. Some 4,000 of these existing languages have never been described, or described only inadequately. This book is a guide for linguistic fieldworkers who wish to write a description of the morphology and syntax of one of these many under-documented languages. It uses examples from many languages both well known and virtually unknown; it offers readers one possible outline for a grammatical description, with many questions designed to help them address the key topics. The appendices offer guidance on text and elicited data, and on sample reference grammars which readers might wish to consult.
TL;DR: This paper shows that the problem of prepositional phrase attachment ambiguity is analogous to n-gram language models in speech recognition, and that one of the most common methods for language modeling, the backed-off estimate, is applicable.
Abstract: Recent work has considered corpus-based or statistical approaches to the problem of prepositional phrase attachment ambiguity. Typically, ambiguous verb phrases of the form v np1 p np2 are resolved through a model which considers values of the four head words (v, n1, p and n2). This paper shows that the problem is analogous to n-gram language models in speech recognition, and that one of the most common methods for language modeling, the backed-off estimate, is applicable. Results on Wall Street Journal data of 84.5% accuracy are obtained using this method. A surprising result is the importance of low-count events: ignoring events which occur fewer than 5 times in training data reduces performance to 81.6%.
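The backed-off idea can be illustrated with a minimal Python sketch. It assumes training examples are head-word 4-tuples (v, n1, p, n2) labelled with noun or verb attachment, and that each back-off level drops head words while keeping the preposition. The class name, the exact choice of levels, the 0.5 decision threshold and the default for unseen prepositions are illustrative assumptions, not details given in the abstract.

```python
from collections import Counter


class BackedOffPPAttacher:
    """Toy backed-off estimator for PP attachment (illustrative sketch only)."""

    def __init__(self):
        self.total = Counter()  # how often each sub-tuple was seen
        self.noun = Counter()   # how often it was seen with noun attachment

    @staticmethod
    def _levels(v, n1, p, n2):
        # Assumed back-off order: most specific context first, and the
        # preposition is kept at every level. Each key carries a label so
        # that different sub-tuples can never collide.
        return [
            [("v,n1,p,n2", v, n1, p, n2)],
            [("v,n1,p", v, n1, p), ("v,p,n2", v, p, n2), ("n1,p,n2", n1, p, n2)],
            [("v,p", v, p), ("n1,p", n1, p), ("p,n2", p, n2)],
            [("p", p)],
        ]

    def train(self, examples):
        # examples: iterable of (v, n1, p, n2, is_noun_attachment)
        for v, n1, p, n2, is_noun in examples:
            for level in self._levels(v, n1, p, n2):
                for key in level:
                    self.total[key] += 1
                    if is_noun:
                        self.noun[key] += 1

    def prob_noun(self, v, n1, p, n2):
        # Use the most specific level that has any counts at all; low-count
        # events are deliberately NOT discarded.
        for level in self._levels(v, n1, p, n2):
            denom = sum(self.total[key] for key in level)
            if denom > 0:
                return sum(self.noun[key] for key in level) / denom
        return 1.0  # assumed default: noun attachment for unseen prepositions

    def attach(self, v, n1, p, n2):
        return "noun" if self.prob_noun(v, n1, p, n2) >= 0.5 else "verb"


# Hypothetical toy data, not drawn from the Wall Street Journal corpus.
toy_examples = [
    ("ate", "pizza", "with", "fork", False),
    ("ate", "pizza", "with", "anchovies", True),
    ("saw", "man", "with", "telescope", False),
    ("bought", "shares", "of", "stock", True),
]
model = BackedOffPPAttacher()
model.train(toy_examples)
print(model.attach("bought", "book", "of", "poems"))
```

Because the estimator falls back only when a level has no counts at all, a single observed event at a specific level overrides everything more general, which is one way to read the abstract's remark about the importance of low-count events.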
TL;DR: This paper tracks the historical development of the noun-heavy discourse style of academic writing over the last four centuries, focusing on nouns as nominal premodifiers and prepositional phrases as nominal postmodifiers in a corpus of academic research writing (compared to other registers such as fiction, newspaper reportage and conversation), and argues that new grammatical functions can emerge in writing rather than speech.
Abstract: Many discussions of grammatical change have focused on grammatical innovation in the discourse contexts of conversational interaction. We argue here that it is also possible for grammatical innovation to emerge out of the communicative demands of written discourse. In particular, the distinctive communicative characteristics of academic writing (informational prose) have led to the development of a discourse style that relies heavily on nominal structures, with extensive phrasal modification and a relative absence of verbs. By tracking the historical development of this discourse style, we can also observe the development of particular grammatical functions that are emerging in writing. We focus here on two grammatical features – nouns as nominal premodifiers and prepositional phrases as nominal postmodifiers – analyzing their historical development over the last four centuries in a corpus of academic research writing (compared to other registers such as fiction, newspaper reportage and conversation). Our analysis shows that these grammatical features were quite restricted in function and variability in earlier historical periods of English. However, in the nineteenth and twentieth centuries, they became much more frequent and productive, accompanied by major extensions in their functions, variants, and range of lexical associations. These extensions were restricted primarily to informational written discourse, illustrating ways in which new grammatical functions emerge in writing rather than speech.
TL;DR: This article argues that adnominal modifiers in a layered model of the noun phrase can be divided into two major subcategories: descriptive modifiers and discourse-referential modifiers, the latter being concerned with the status of entities as referents in the world of discourse.
Abstract: This article argues that adnominal modifiers in a layered model of the noun phrase can be divided into two major subcategories: descriptive modifiers and discourse-referential modifiers. Whereas descriptive modifiers can be subdivided into classifying, qualifying, quantifying and localizing modifiers (Section 2), discourse-referential modifiers in the noun phrase are concerned with the status of entities as referents in the world of discourse (Section 3). I will pay particular attention to three issues: (i) formal reflections of the layered, semantic structure of the noun phrase (Section 4), (ii) the special relationship between localizing and discourse-referential modifiers (Section 5), and (iii) semantic and morphosyntactic parallels between modifier categories in the noun phrase and the clause (Section 6). In addition this sample-based typological study shows (contra Hawkins's Universal 20') that there are languages with the adjective before and the demonstrative or numeral after the head noun. These word order patterns provide additional support for the layered model of the noun phrase defended here in that it can now be shown for the first time that all patterns that iconically reflect the layered structure of the simple noun phrase are actually attested in the languages of the world.