MIND: A Large-scale Dataset for News Recommendation
Fangzhao Wu, Ying Qiao, Jiun-Hung Chen, Chuhan Wu, Tao Qi, Jianxun Lian, Danyang Liu, Xing Xie, Jianfeng Gao, Winnie Wu, Ming Zhou
- 01 Jul 2020
- pp 3597-3606
TL;DR: This paper presents a large-scale dataset named MIND, constructed from the user click logs of Microsoft News, which contains 1 million users and more than 160k English news articles, each of which has rich textual content such as title, abstract and body.
Abstract: News recommendation is an important technique for personalized news services. Compared with product and movie recommendation, which have been studied comprehensively, research on news recommendation is much more limited, mainly due to the lack of a high-quality benchmark dataset. In this paper, we present a large-scale dataset named MIND for news recommendation. Constructed from the user click logs of Microsoft News, MIND contains 1 million users and more than 160k English news articles, each of which has rich textual content such as a title, abstract, and body. We demonstrate that MIND is a good testbed for news recommendation through a comparative study of several state-of-the-art news recommendation methods that were originally developed on different proprietary datasets. Our results show that the performance of news recommendation relies heavily on the quality of news content understanding and user interest modeling. Many natural language processing techniques, such as effective text representation methods and pre-trained language models, can effectively improve the performance of news recommendation. The MIND dataset will be available at https://msnews.github.io.
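To make the two ingredients named in the abstract concrete (news content understanding and user interest modeling), here is a minimal, hypothetical sketch in Python. It is not the method of the paper or of any system it evaluates. The `NewsArticle` record, its field names, and the sample headlines are illustrative assumptions and do not reflect MIND's actual file schema. A bag-of-words vector stands in for a learned text encoder, and the sum of clicked-article vectors stands in for a learned user model.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class NewsArticle:
    # Hypothetical record; field names are illustrative, not MIND's schema.
    news_id: str
    title: str
    abstract: str = ""
    body: str = ""


def text_vector(article: NewsArticle) -> Counter:
    """Bag-of-words over title and abstract (stand-in for a text encoder)."""
    return Counter((article.title + " " + article.abstract).lower().split())


def user_vector(clicked: List[NewsArticle]) -> Counter:
    """Model user interest as the sum of the vectors of clicked articles."""
    total: Counter = Counter()
    for article in clicked:
        total.update(text_vector(article))
    return total


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(clicked: List[NewsArticle], candidates: List[NewsArticle]) -> List[NewsArticle]:
    """Order candidates by similarity to the user's click history."""
    user = user_vector(clicked)
    return sorted(candidates, key=lambda c: cosine(user, text_vector(c)), reverse=True)


# Made-up click history and candidate articles, for illustration only.
history = [
    NewsArticle("N1", "Local team wins championship game"),
    NewsArticle("N2", "Star player signs new contract with team"),
]
candidates = [
    NewsArticle("N3", "Stock markets fall on rate fears"),
    NewsArticle("N4", "Team announces championship parade"),
]
ranked = rank(history, candidates)
print([a.news_id for a in ranked])
```

In this toy setup the sports headline ranks above the finance headline because it shares words with the click history. Replacing `text_vector` with a stronger text representation is the kind of change the abstract reports as improving recommendation quality.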
Citations
RecPrompt: A Prompt Tuning Framework for News Recommendation Using Large Language Models
Dairui Liu, Boming Yang, Honghui Du, Derek Greene, Aonghus Lawlor, Ruihai Dong, Irene Li
TL;DR: RecPrompt is introduced, the first framework for news recommendation that leverages the capabilities of LLMs through prompt engineering, and incorporates a prompt optimizer that applies an iterative bootstrapping process, enhancing the LLM-based recommender's ability to align news content with user preferences and interests more effectively.
Reformulating Sequential Recommendation: Learning Dynamic User Interest with Content-enriched Language Modeling
Junzhe Jiang, Shang Qu, Mingyue Cheng, Qi Liu
TL;DR: This work adopts a new sequential recommendation paradigm and proposes LANCER, which leverages the semantic understanding capabilities of pre-trained language models to generate personalized recommendations, resulting in more human-like recommendations.
A contrastive news recommendation framework based on curriculum learning
Xingran Zhou, Nankai Lin, Weixiong Zheng, Dong Zhou, Aimin Yang
Removing AI’s sentiment manipulation of personalized news delivery
TL;DR: A sentiment debiasing method based on a decomposed adversarial learning framework is proposed, which can reduce 97.3% of sentiment bias with only a 2.9% accuracy sacrifice.
FaLA: Fast Linear Adaptation for Replacing Backbone Models on Edge Devices
Shuo Huang, Lizhen Qu, Xingliang Yuan, Chunyang Chen
TL;DR: FaLA is a novel framework for replacing backbone models on edge devices with lightweight linear transformations. It achieves significant performance improvements over traditional methods while reducing model size and inference latency.