Book Chapter, DOI: 10.1007/978-3-030-65218-0_13
Mass Media Evaluation Using Topic Modelling
Kirill Yakunin, Ravil I. Mukhamediev, Rustam Mussabayev, Timur Buldybayev, Yan Kuchin, Sanzhar Murzakhmetov, Rassul Yunussov, Ulzhan Ospanova
- 17 Jun 2020
- pp 165-178
3
TL;DR: This chapter proposes an approach that classifies the most important/positive/negative/resonant topics and publications and analyzes their dynamic characteristics. It does not rely on a manually built keyword dictionary or on labelling large numbers of documents, and it can evaluate documents according to an arbitrary criterion.
Abstract: Automatic evaluation of public opinion is a pressing problem in many areas, in both the governmental and private sectors. A number of scientific schools and corporations are working on the automatic evaluation of publications in media, social networks, and other internet resources, with goals such as evaluating the public image of a company, product, or person, evaluating the work of PR departments and agencies, and analyzing the most socially significant and resonant newsmakers and issues. These problems belong to natural language processing and understanding, an area considered technologically and mathematically complex. They are currently addressed with deep learning models, which require a large labelled dataset of texts from a similar domain; such datasets are hard and expensive to obtain. Performance is another issue for such systems. This work describes an information system that attempts to solve the outlined problems. It proposes an approach that classifies the most important/positive/negative/resonant topics and publications and analyzes their dynamic characteristics. The approach does not rely on manual creation of a keyword dictionary or on labelling large numbers of documents, and it allows documents to be evaluated according to an arbitrary criterion. The approach was verified on one criterion by comparing its results to those of a dictionary-based system.
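The abstract does not spell out how documents are scored without labelling them. One plausible reading, offered here only as a minimal sketch, is that a topic model yields a topic distribution per document, each topic (rather than each document) receives a score for the chosen criterion, and a document's score is the topic-weighted average of those scores. The function name `score_documents`, the sample weights, and the topic scores below are all hypothetical illustrations and are not taken from the chapter.

```python
from typing import List


def score_documents(doc_topic: List[List[float]],
                    topic_scores: List[float]) -> List[float]:
    """Score each document as the topic-weighted average of per-topic scores.

    doc_topic[i][k] is the weight of topic k in document i (assumed to come
    from any topic model); topic_scores[k] is the score assigned to topic k
    for one criterion (assumed to be set per topic, not per document).
    """
    scores = []
    for weights in doc_topic:
        if len(weights) != len(topic_scores):
            raise ValueError("each document needs one weight per topic")
        total = sum(weights)
        if total == 0:
            # A document with no topic mass gets a neutral score.
            scores.append(0.0)
            continue
        weighted = sum(w * s for w, s in zip(weights, topic_scores))
        scores.append(weighted / total)
    return scores


# Hypothetical data: three documents over three topics.
example_doc_topic = [
    [0.8, 0.1, 0.1],
    [0.1, 0.1, 0.8],
    [0.3, 0.4, 0.3],
]
# Hypothetical criterion scores: topic 0 positive, 1 neutral, 2 negative.
example_topic_scores = [1.0, 0.0, -1.0]

example_scores = score_documents(example_doc_topic, example_topic_scores)
print(example_scores)
```

Under this assumption, changing the criterion only requires a new list of per-topic scores; the topic distributions are reused unchanged, which is one way an "arbitrary criterion" could be supported without relabelling documents.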
Citations
Identifying Sources of Opinions with Conditional Random Fields and
Yejin Choi, Claire Cardie, Ellen Riloff, Siddharth Patwardhan
- 01 Jan 2005
TL;DR: The authors adopted a hybrid approach that combines Conditional Random Fields (CRF) and a variation of AutoSlog (Riloff, 1996a) to identify sources of opinions, emotions, and sentiments.
11
KazNewsDataset: Single Country Overall Digital Mass Media Publication Corpus
Kirill Yakunin, Maksat Kalimoldayev, Ravil I. Mukhamediev, Rustam Mussabayev, V. B. Barakhnin, Yan Kuchin, Sanzhar Murzakhmetov, Timur Buldybayev, Ulzhan Ospanova, Marina Yelis, Akylbek Zhumabayev, Viktors I. Gopejenko, Zhazirakhanym Meirambekkyzy, Alibek Abdurazakov
- 14 Mar 2021
TL;DR: In this article, the authors present a corpus of Kazakhstan media containing over 4 million publications from 36 primary sources (each with at least 500 publications), including more than 2 million texts from Russian media for comparative analysis of the countries' publication activity, as well as about 4000 sections of state policy documents.
5
Evaluating latent content within unstructured text: an analytical methodology based on a temporal network of associated topics
Edwin Camilleri, Shah Jahan Miah
TL;DR: In this article, a step-by-step process is presented to facilitate the evaluation of latent topics from unstructured text, as well as the domain area that textual documents are sourced from.
References
•Posted Content
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TL;DR: A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
81.7K
Latent dirichlet allocation
TL;DR: This work proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model.
•Proceedings Article
Latent Dirichlet Allocation
David M. Blei, Andrew Y. Ng, Michael I. Jordan
- 03 Jan 2001
TL;DR: This paper proposed a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI).
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
- 11 Oct 2018
TL;DR: BERT pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; the pre-trained model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
24.6K
Thumbs up? Sentiment Classification using Machine Learning Techniques
Bo Pang, Lillian Lee, Shivakumar Vaithyanathan
- 06 Jul 2002
TL;DR: This work considers the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative, and concludes by examining factors that make the sentiment classification problem more challenging.