Topic

Writeprint

About: Writeprint is a research topic. It comprises 31 publications, which have together received 1716 citations.

Papers

Journal Article•10.1145/1344411.1344413•

Writeprints: A stylometric approach to identity-level identification and similarity detection in cyberspace

Ahmed Abbasi¹, Hsinchun Chen¹•Institutions (1)

University of Arizona¹

08 Apr 2008-ACM Transactions on Information Systems

TL;DR: This study proposed the use of stylometric analysis techniques to help identify individuals based on writing style, and incorporated a rich set of stylistic features, including lexical, syntactic, structural, content-specific, and idiosyncratic attributes.

Abstract: One of the problems often associated with online anonymity is that it hinders social accountability, as substantiated by the high levels of cybercrime. Although identity cues are scarce in cyberspace, individuals often leave behind textual identity traces. In this study we proposed the use of stylometric analysis techniques to help identify individuals based on writing style. We incorporated a rich set of stylistic features, including lexical, syntactic, structural, content-specific, and idiosyncratic attributes. We also developed the Writeprints technique for identification and similarity detection of anonymous identities. Writeprints is a Karhunen-Loeve transforms-based technique that uses a sliding window and pattern disruption algorithm with individual author-level feature sets. The Writeprints technique and extended feature set were evaluated on a testbed encompassing four online datasets spanning different domains: email, instant messaging, feedback comments, and program code. Writeprints outperformed benchmark techniques, including SVM, Ensemble SVM, PCA, and standard Karhunen-Loeve transforms, on the identification and similarity detection tasks with accuracy as high as 94% when differentiating between 100 authors. The extended feature set also significantly outperformed a baseline set of features commonly used in previous research. Furthermore, individual-author-level feature sets generally outperformed use of a single group of attributes.
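The sliding-window idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it covers only window-level extraction of a few lexical features and omits the Karhunen-Loeve projection and the pattern disruption step. The function names, the feature choices, the function-word list, and the window sizes are all assumptions made for illustration.

```python
import re
from collections import Counter

# Illustrative subset of function words (an assumption, not the paper's list).
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in", "that", "is")


def lexical_features(text):
    """Return a small dict of lexical style features for one text window."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    n_words = len(words) or 1  # avoid division by zero on empty input
    n_chars = len(text) or 1
    counts = Counter(words)
    feats = {
        "avg_word_len": sum(len(w) for w in words) / n_words,
        "digit_ratio": sum(c.isdigit() for c in text) / n_chars,
        "upper_ratio": sum(c.isupper() for c in text) / n_chars,
    }
    # Relative frequency of each function word within the window.
    for fw in FUNCTION_WORDS:
        feats["fw_" + fw] = counts[fw] / n_words
    return feats


def sliding_windows(text, size=200, step=100):
    """Yield overlapping character windows; short texts yield one window.

    Trailing characters that do not fill a complete window are dropped.
    """
    if len(text) <= size:
        yield text
        return
    for start in range(0, len(text) - size + 1, step):
        yield text[start:start + size]


def window_feature_vectors(text, size=200, step=100):
    """Compute one feature dict per sliding window over the text."""
    return [lexical_features(w) for w in sliding_windows(text, size, step)]
```

Each window yields one feature vector, so a single author's text becomes a sequence of points in feature space. A dimensionality-reduction step such as the Karhunen-Loeve transform named in the abstract would then operate on those vectors.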

494 citations

Journal Article•10.1145/1121949.1121951•

From fingerprint to writeprint

Jiexun Li¹, Rong Zheng², Hsinchun Chen•Institutions (2)

University of Arizona¹, New York University²

01 Apr 2006-Communications of The ACM

TL;DR: This article identifies the key features that help identify and trace online authorship.

Abstract: Identifying the key features to help identify and trace online authorship.

186 citations

Journal Article•10.1016/J.DIIN.2010.03.003•

Mining writeprints from anonymous e-mails for forensic investigation

Farkhund Iqbal¹, Hamad Binsalleeh¹, Benjamin C. M. Fung¹, Mourad Debbabi¹•Institutions (1)

Concordia University¹

01 Oct 2010-Digital Investigation

TL;DR: Experiments on a real-life dataset suggest that clustering by writing style is a promising approach for grouping e-mails written by the same author.

175 citations

Journal Article•10.1016/J.INS.2011.03.006•

A unified data mining solution for authorship analysis in anonymous textual communications

Farkhund Iqbal¹, Hamad Binsalleeh¹, Benjamin C. M. Fung¹, Mourad Debbabi¹•Institutions (1)

Concordia University¹

01 May 2013-Information Sciences

TL;DR: This paper is the first work that presents a unified data mining solution to address authorship analysis problems based on the concept of frequent pattern-based writeprint, and extensive experiments suggest that the proposed solution can precisely capture the writing styles of individuals.

123 citations

Journal Article•10.1093/LLC/FQS003•

Detecting authorship deception: a supervised machine learning approach using author writeprints

Lisa Pearl¹, Mark Steyvers¹•Institutions (1)

University of California, Irvine¹

01 Jun 2012-Literary and Linguistic Computing

TL;DR: This paper describes a new supervised machine learning approach for detecting authorship deception, a specific type of authorship attribution task particularly relevant for cybercrime forensic investigations, and demonstrates its validity on two case studies drawn from realistic online data sets.

Abstract: We describe a new supervised machine learning approach for detecting authorship deception, a specific type of authorship attribution task particularly relevant for cybercrime forensic investigations, and demonstrate its validity on two case studies drawn from realistic online data sets. The core of our approach involves identifying uncharacteristic behavior for an author, based on a writeprint extracted from unstructured text samples of the author's writing. The writeprints used here involve stylometric features and content features derived from topic models, an unsupervised approach for identifying relevant keywords that relate to the content areas of a document. One innovation of our approach is to transform the writeprint feature values into a representation that individually balances characteristic and uncharacteristic traits of an author, and we subsequently apply a Sparse Multinomial Logistic Regression classifier to this novel representation. Our method yields high accuracy for authorship deception detection on the two case studies, confirming its utility.

63 citations

Performance Metrics

No. of papers in the topic in previous years
Year	Papers
2020	2
2018	1
2015	5
2014	4
2013	2
2012	5