Journal Article, DOI: 10.13053/CYS-21-4-2551
Text Analysis Using Different Graph-Based Representations
TL;DR: An overview of different graph-based representations proposed to solve text classification tasks, highlighting the importance of enriched and non-enriched co-occurrence graphs as an alternative to traditional feature representation models such as vector representations.
Abstract: This paper presents an overview of different graph-based representations proposed to solve text classification tasks. The core of this manuscript is to highlight the importance of enriched and non-enriched co-occurrence graphs as an alternative to traditional feature representation models such as vector representations, which most of the time cannot capture all the richness of text documents that come from the web (social media, blogs, personal web pages, news, etc.). For each text classification task, the type of graph created and the benefits of using it are presented and discussed. Specifically, the type of features or patterns extracted, the classification and similarity methods implemented, and the results obtained on datasets are explained. The theoretical and practical implications of using co-occurrence graphs are also discussed, pointing out the contributions and challenges of modeling text documents as graphs.
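As a rough illustration of what a non-enriched co-occurrence graph can look like, the sketch below links two words whenever they appear within a fixed distance of each other and counts how often that happens. The function name `cooccurrence_graph`, the whitespace tokenization, and the window size of 2 are assumptions made for this example; the abstract does not specify how the surveyed works build their graphs.

```python
from collections import Counter


def cooccurrence_graph(tokens, window=2):
    """Build a weighted, undirected word co-occurrence graph.

    Hypothetical helper for illustration only. Returns a Counter that
    maps an alphabetically sorted word pair (u, v) to the number of
    times the two words occur within `window` positions of each other.
    """
    edges = Counter()
    for i, word in enumerate(tokens):
        # Look ahead at the next `window` tokens only, so each pair of
        # positions is counted once.
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:  # skip self-loops
                edges[tuple(sorted((word, other)))] += 1
    return edges


# Assumed toy input; real pipelines would use a proper tokenizer.
text = "graphs model text and graphs model structure"
tokens = text.lower().split()
graph = cooccurrence_graph(tokens, window=2)

for (u, v), weight in sorted(graph.items()):
    print(u, v, weight)
```

Unlike a bag-of-words vector, the edge weights keep some information about which words appear near each other. An enriched variant might attach extra labels (for example part-of-speech tags) to nodes or edges, though the details would depend on the specific work.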
Citations
InfraNodus: Generating Insight Using Text Network Analysis
Dmitry Paranyushkin
- 13 May 2019
TL;DR: The tool (InfraNodus) can be used by researchers and writers to organize and to better understand their notes, to measure the level of bias in discourse, and to identify the parts of the discourse where there is a potential for insight and new ideas.
101
Graph-Based Siamese Network for Authorship Verification
TL;DR: The authors proposed a Siamese network architecture composed of graph convolutional networks along with pooling and classification layers to solve the authorship identification task in a cross-topic and open-set scenario.
An Ensemble of Automatic Keyword Extractors: TextRank, RAKE and TAKE
TL;DR: The ensemble method achieved a better overall performance when compared to the automatic keyword extractors that were used in its development as well as to some recent automatic keyword extraction methods.
Automatic Classification of Web News: A Systematic Mapping Study
Mauricio Pandolfi-González, Christian Quesada-López, Alexandra Martinez, Marcelo Jenkins +3 more
- 03 Sep 2020
TL;DR: In this paper, the authors performed a systematic literature mapping of 51 primary studies published between 2000 and 2019 and found that the most used techniques fall into these paradigms: clustering, support vector machines and generative models.
3
References
Finding and evaluating community structure in networks.
TL;DR: It is demonstrated that the algorithms proposed are highly effective at discovering community structure in both computer-generated and real-world network data, and can be used to shed light on the sometimes dauntingly complex structure of networked systems.
Machine learning in automated text categorization
TL;DR: This survey discusses the main approaches to text categorization that fall within the machine learning paradigm and discusses in detail issues pertaining to three different problems, namely, document representation, classifier construction, and classifier evaluation.
Book
Opinion Mining and Sentiment Analysis
Bo Pang, Lillian Lee
- 08 Jul 2008
TL;DR: This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems and focuses on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis.
The link-prediction problem for social networks
David Liben-Nowell, Jon Kleinberg
TL;DR: Experiments on large coauthorship networks suggest that information about future interactions can be extracted from network topology alone, and that fairly subtle measures for detecting node proximity can outperform more direct measures.
Friends and neighbors on the Web
Lada A. Adamic, Eytan Adar
TL;DR: In this paper, the authors show that some factors are better indicators of social connections than others, and that these indicators vary between user populations, and provide potential applications in automatically inferring real world connections and discovering, labeling, and characterizing communities.
3.1K