Topic

Native-language identification

About: Native-language identification is a research topic. Over its lifetime, 186 publications have appeared within this topic, receiving 3,580 citations.

Papers

Book•10.1017/CBO9781139649414•

The Cambridge Handbook of Learner Corpus Research

Sylviane Granger, Gaëtanelle Gilquin, Fanny Meunier

1 Jan 2015

TL;DR: This handbook surveys learner corpus research, past, present and future, covering topics that range from corpus design and methodology to the contribution of learner corpora to reference and instructional materials design.

Abstract: 1. Introduction: learner corpus research - past, present and future Sylviane Granger, Gaëtanelle Gilquin and Fanny Meunier

Part I. Learner Corpus Design and Methodology: 2. From design to collection of learner corpora Gaëtanelle Gilquin 3. Learner corpus methodology Marcus Callies 4. Learner corpora and psycholinguistics Philip Durrant and Anna Siyanova-Chanturia 5. Annotating learner corpora Bertus van Rooy 6. Speech annotation of learner corpora Nicolas Ballier and Philippe Martin 7. Error annotation systems Anke Lüdeling and Hagen Hirschmann 8. Statistics for learner corpus research Stefan Th. Gries

Part II. Analysis of Learner Language: 9. Learner corpora and lexis Tom Cobb and Marlise Horst 10. Learner corpora and phraseology Signe Oksefjell Ebeling and Hilde Hasselgård 11. Learner corpora and grammar Tom Rankin 12. Learner corpora and discourse JoAnne Neff-van Aertselaer 13. Learner corpora and pragmatics Nina Vyatkina and Joseph Cunningham

Part III. Learner Corpus Research and Second Language Acquisition: 14. Second language acquisition theory and learner corpus research Florence Myles 15. Transfer and learner corpus research John Osborne 16. Learner corpora and formulaic language in second language acquisition research Nick C. Ellis, Rita Simpson-Vlach, Ute Römer, Matthew Brook O'Donnell and Stefanie Wulff 17. Developmental patterns in learner corpora Fanny Meunier 18. Variability in learner corpora Annelie Ädel 19. Learner corpora and learning context Joybrato Mukherjee and Sandra Götz

Part IV. Learner Corpus Research and Language Teaching: 20. The learner corpus as a pedagogic corpus Angela Chambers 21. Learner corpora and language for academic and specific purposes Lynne Flowerdew 22. The contribution of learner corpora to reference and instructional materials design Sylviane Granger 23. Learner corpora and language testing Fiona Barker, Angeliki Salamoura and Nick Saville

Part V. Learner Corpus Research and Natural Language Processing: 24. Learner corpora and natural language processing Detmar Meurers 25. Automatic grammar- and spell-checking for language learners Claudia Leacock, Martin Chodorow and Joel Tetreault 26. Learner corpora and automated scoring Derrick Higgins, Chaitanya Ramineni and Klaus Zechner 27. Learner corpora and native language identification Scott Jarvis and Magali Paquot.

339 citations

Proceedings Article•10.21437/INTERSPEECH.2016-129•

The INTERSPEECH 2016 Computational Paralinguistics Challenge: Deception, Sincerity & Native Language

Björn Schuller¹,², Stefan Steidl³, Anton Batliner¹,³, Julia Hirschberg⁴, Judee K. Burgoon, Alice Baird⁴, Aaron C. Elkins⁵, Yue Zhang², Eduardo Coutinho²,⁶, Keelan Evanini•Institutions (6)

University of Passau¹, Imperial College London², University of Erlangen-Nuremberg³, Columbia University⁴, University of Arizona⁵, University of Liverpool⁶

8 Sep 2016

TL;DR: The INTERSPEECH 2016 Computational Paralinguistics Challenge addresses three different problems for the first time in research competition under well-defined conditions: classification of deceptive vs. non-deceptive speech, the estimation of the degree of sincerity, and the identification of the native language out of 11 L1 classes of English L2 speakers.

Abstract: The INTERSPEECH 2016 Computational Paralinguistics Challenge addresses three different problems for the first time in research competition under well-defined conditions: classification of deceptive vs. non-deceptive speech, the estimation of the degree of sincerity, and the identification of the native language out of 11 L1 classes of English L2 speakers. In this paper, we describe these sub-challenges, their conditions, and the baseline feature extraction and classifiers, as provided to the participants.

338 citations

Journal Article•10.1016/J.JSLW.2015.06.003•

Syntactic complexity in college-level English writing: Differences among writers with diverse L1 backgrounds

Xiaofei Lu¹, Haiyang Ai²•Institutions (2)

Pennsylvania State University¹, University of Cincinnati²

01 Sep 2015 - Journal of Second Language Writing

TL;DR: Differences in syntactic complexity in English writing among college-level writers with different first language (L1) backgrounds are explored, and the implications of the varied patterns for L2 writing research and pedagogy and for automatic native language identification of learner texts are considered.

305 citations

Journal Article•10.1002/J.2333-8504.2013.TB02331.X•

TOEFL11: A Corpus of Non-Native English

Daniel Blanchard¹, Joel Tetreault², Derrick Higgins¹, Aoife Cahill¹, Martin Chodorow³•Institutions (3)

Princeton University¹, Nuance Communications², The Graduate Center, CUNY³

01 Dec 2013 - ETS Research Report Series

TL;DR: This report presents a new corpus of non-native English writing, intended to be useful for the task of native language identification, as well as grammatical error detection and correction, and automatic essay scoring.

Abstract: This report presents work on the development of a new corpus of non-native English writing. It will be useful for the task of native language identification, as well as grammatical error detection and correction, and automatic essay scoring. In this report, the corpus is described in detail.

245 citations

Proceedings Article•10.18653/V1/W17-5007•

A Report on the First Native Language Identification Shared Task

Joel Tetreault¹, Daniel Blanchard¹, Aoife Cahill¹•Institutions (1)

Princeton University¹

1 Jun 2013

TL;DR: The fusion track showed that combining the written and spoken responses provides a large boost in prediction accuracy, and multiple classifier systems were the most effective in all tasks, with most based on traditional classifiers with lexical/syntactic features.

Abstract: Native Language Identification (NLI) is the task of automatically identifying the native language (L1) of an individual based on their language production in a learned language. It is typically framed as a classification task where the set of L1s is known a priori. Two previous shared tasks on NLI have been organized where the aim was to identify the L1 of learners of English based on essays (2013) and spoken responses (2016) they provided during a standardized assessment of academic English proficiency. The 2017 shared task combines the inputs from the two prior tasks for the first time. There are three tracks: NLI on the essay only, NLI on the spoken response only (based on a transcription of the response and i-vector acoustic features), and NLI using both responses. We believe this makes for a more interesting shared task while building on the methods and results from the previous two shared tasks. In this paper, we report the results of the shared task. A total of 19 teams competed across the three different sub-tasks. The fusion track showed that combining the written and spoken responses provides a large boost in prediction accuracy. Multiple classifier systems (e.g. ensembles and meta-classifiers) were the most effective in all tasks, with most based on traditional classifiers (e.g. SVMs) with lexical/syntactic features.

232 citations

Performance Metrics

Papers: 186

Citations: 1,359

No. of papers in the topic in previous years
Year	Papers
2021	1
2020	12
2019	13
2018	37
2017	35
2016	15
