MIND: A Large-scale Dataset for News Recommendation
Fangzhao Wu, Ying Qiao, Jiun-Hung Chen, Chuhan Wu, Tao Qi, Jianxun Lian, Danyang Liu, Xing Xie, Jianfeng Gao, Winnie Wu, Ming Zhou
- 01 Jul 2020
- pp 3597-3606
TL;DR: This paper presents a large-scale dataset named MIND, constructed from the user click logs of Microsoft News, which contains 1 million users and more than 160k English news articles, each of which has rich textual content such as title, abstract and body.
Abstract: News recommendation is an important technique for personalized news service. Compared with product and movie recommendations, which have been comprehensively studied, research on news recommendation is much more limited, mainly due to the lack of a high-quality benchmark dataset. In this paper, we present a large-scale dataset named MIND for news recommendation. Constructed from the user click logs of Microsoft News, MIND contains 1 million users and more than 160k English news articles, each of which has rich textual content such as title, abstract and body. We demonstrate that MIND is a good testbed for news recommendation through a comparative study of several state-of-the-art news recommendation methods that were originally developed on different proprietary datasets. Our results show that the performance of news recommendation relies heavily on the quality of news content understanding and user interest modeling. Many natural language processing techniques, such as effective text representation methods and pre-trained language models, can effectively improve the performance of news recommendation. The MIND dataset will be available at https://msnews.github.io.
Citations
Capturing High-order Interactions for Interest-aware News Recommendation
Yuan Ji, Shanliang Pan, Yuanyuan Zhang, Jinying Yuan, Chen Chen, Hongzhuo Wu
- 01 Dec 2022
TL;DR: High-order interactions in user behavior sequences are captured using a novel hypergraph-based method to improve news recommendation accuracy.
Modeling Users and Items for Recommenders: There Is More than Semantics
Mete Sertkan
- 13 Sep 2021
TL;DR: The author proposes a statistical-learning-based characterization of items, a picture-based approach to profiling users and items, and a neural end-to-end approach that learns to represent users and items while simultaneously learning to recommend.
Multi-purpose Recommender Platform using Perceiver IO
Ali Cevahir, Kentaro Kanada
- 01 Nov 2022
TL;DR: This paper proposes a general-purpose framework for various recommendation tasks based on the Perceiver IO model, a general machine learning architecture built on transformer-style attention modules, which helps eliminate feature engineering across tasks.
Forgetting User Preference in Recommendation Systems with Label-Flipping
Manal A. Alshehri, Xiangliang Zhang
- 15 Dec 2023
TL;DR: FlipRec is proposed, a general and efficient framework for recommendation models to “forget” the preferences of specific users while retaining the model’s performance for all other users.
Open Text News Benchmark: A Novel Chinese News Benchmark for Text Classification
- 06 Oct 2022
TL;DR: The Open Text News benchmark (OTNSTARD) is a Chinese news text benchmark designed to support related research in natural language processing, such as summarization, text classification, and sentiment analysis.