Mass Media Evaluation Using Topic Modelling

doi:10.1007/978-3-030-65218-0_13

Book Chapter


Kirill Yakunin, +8 more

- 17 Jun 2020

- pp 165-178


TL;DR: The chapter proposes an approach that identifies the most important, positive, negative, or resonant topics and publications and analyzes their dynamic characteristics. It requires neither a manually built keyword dictionary nor the labelling of large numbers of documents, and it can evaluate documents against an arbitrary criterion.

Abstract: Automatic evaluation of public opinion is a pressing problem in many areas, spanning both the governmental and private sectors. A number of research groups and corporations are working on the automatic evaluation of publications in media, social networks and other internet resources, in order to assess the public image of a company, product or person, evaluate the work of PR departments and agencies, and analyze the most socially significant and resonant newsmakers and issues. These tasks fall within natural language processing and understanding, which is considered technologically and mathematically complex and is currently addressed with deep learning models. Such models require a large labelled dataset of texts from a similar domain, which is hard and expensive to obtain. Performance is another problem for such systems. This work describes an information system that attempts to solve the outlined problems. It proposes an approach that classifies the most important, positive, negative, or resonant topics and publications and analyzes their dynamic characteristics. The approach relies neither on a manually created keyword dictionary nor on labelling large numbers of documents, and it allows documents to be evaluated according to an arbitrary criterion. The approach was verified on one criterion by comparing its results to those of a dictionary-based system.
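The abstract does not spell out the mechanics of the approach, so the following is only an illustrative sketch and not the authors' implementation. It assumes that a topic model has already produced per-document topic weights, and that someone has assigned each topic a score for the criterion of interest (for example, negative to positive tone). A document's score is then taken as the topic-weighted average of those scores. The function name `score_documents` and all numbers are hypothetical.

```python
from typing import List, Sequence


def score_documents(
    doc_topic: Sequence[Sequence[float]],
    topic_scores: Sequence[float],
) -> List[float]:
    """Score each document as the weighted sum of per-topic scores.

    doc_topic[i][k] is the weight of topic k in document i (assumed to
    come from any topic model; each row is expected to sum to about 1).
    topic_scores[k] is a hypothetical score for topic k on one criterion.
    """
    scores = []
    for weights in doc_topic:
        if len(weights) != len(topic_scores):
            raise ValueError("each document needs one weight per topic")
        scores.append(sum(w * s for w, s in zip(weights, topic_scores)))
    return scores


# Made-up example: three documents, three topics.
doc_topic = [
    [0.8, 0.2, 0.0],
    [0.1, 0.1, 0.8],
    [0.0, 0.5, 0.5],
]
# Hypothetical topic-level scores for one criterion (+1, 0, -1).
topic_scores = [1.0, 0.0, -1.0]

doc_scores = score_documents(doc_topic, topic_scores)

# Indices of documents ordered from highest to lowest score.
ranking = sorted(range(len(doc_scores)), key=lambda i: doc_scores[i], reverse=True)
```

In this sketch, only the topics need scoring, not individual documents, which is one way an evaluation could avoid both keyword dictionaries and large labelled corpora. Switching criterion would mean supplying a different `topic_scores` vector while reusing the same topic weights.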


