Book Chapter, DOI: 10.1007/978-3-030-53360-1_9
Author Profiling of Tweets
Jacques Savoy
- 01 Jan 2020
- pp 211-227
TL;DR: This chapter explores the linguistic characteristics that distinguish Twitter from traditional oral or written forms, and identifies the linguistic features strongly related to bots and those associated with men or women.
Abstract: In this second chapter presenting stylometric applications, social networks, and more precisely Twitter, are the source of our dataset. To explore new forms of communication, this chapter examines the distinct linguistic characteristics of Twitter compared to the traditional oral or written form. For example, the frequency of mentions (e.g., @POTUS44), hyperlinks (e.g., www.nytimes.com), retweets, or emojis can be exploited to profile the author of a set of tweets. The dataset, freely available, is provided by the CLEF PAN evaluation campaign in 2019. With this corpus, the first classification task is to discriminate between tweets generated by bots and those written by humans. In a second application, the computer must determine whether tweets were written by men or by women. As a useful additional result, one can discover the linguistic features strongly related to bots, and those associated with men or women.
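The Twitter-specific features named in the abstract (mentions, hyperlinks, retweets, emojis) can be illustrated with a minimal Python sketch. This is not the chapter's implementation: the function name `twitter_feature_rates`, the regular expressions, and the rough emoji code-point ranges are all assumptions made for illustration, and a real system would likely use a proper tokenizer and a complete emoji table.

```python
import re
from collections import Counter

# Illustrative patterns only (assumptions, not taken from the chapter).
MENTION = re.compile(r"@\w+")                # e.g. @POTUS44
URL = re.compile(r"(?:https?://|www\.)\S+")  # e.g. www.nytimes.com
RETWEET = re.compile(r"^RT\s+@\w+")          # classic "RT @user" prefix
# Rough approximation of emoji code points; not an exhaustive list.
EMOJI = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")

FEATURES = ("mentions", "hyperlinks", "retweets", "emojis")


def twitter_feature_rates(tweets):
    """Return the mean number of each feature per tweet for one author."""
    counts = Counter()
    for text in tweets:
        counts["mentions"] += len(MENTION.findall(text))
        counts["hyperlinks"] += len(URL.findall(text))
        counts["retweets"] += 1 if RETWEET.match(text) else 0
        counts["emojis"] += len(EMOJI.findall(text))
    n = max(len(tweets), 1)  # avoid division by zero for an empty profile
    return {name: counts[name] / n for name in FEATURES}
```

Each author's set of tweets is reduced to a small vector of per-tweet rates, which could then be passed to any classifier for the bot-versus-human or gender task.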