Training Complex Models with Multi-Task Weak Supervision.
Alexander Ratner, Braden Hancock, Jared Dunnmon, Frederic Sala, Shreyash Pandey, Christopher Ré
- 17 Jul 2019
- Vol. 33, Iss: 01, pp 4763-4771
TL;DR: By solving a matrix completion-style problem, this work recovers the accuracies of multi-task weak supervision sources given their dependency structure, without any labeled data, which yields higher-quality supervision for training an end model.
Abstract: As machine learning models continue to increase in complexity, collecting large hand-labeled training sets has become one of the biggest roadblocks in practice. Instead, weaker forms of supervision that provide noisier but cheaper labels are often used. However, these weak supervision sources have diverse and unknown accuracies, may output correlated labels, and may label different tasks or apply at different levels of granularity. We propose a framework for integrating and modeling such weak supervision sources by viewing them as labeling different related sub-tasks of a problem, which we refer to as the multi-task weak supervision setting. We show that by solving a matrix completion-style problem, we can recover the accuracies of these multi-task sources given their dependency structure, but without any labeled data, leading to higher-quality supervision for training an end model. Theoretically, we show that the generalization error of models trained with this approach improves with the number of unlabeled data points, and characterize the scaling with respect to the task and dependency structures. On three fine-grained classification problems, we show that our approach leads to average gains of 20.2 points in accuracy over a traditional supervised approach, 6.8 points over a majority vote baseline, and 4.1 points over a previously proposed weak supervision method that models tasks separately.
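To make the idea of recovering source accuracies without labeled data concrete, here is a minimal sketch. It is not the paper's algorithm: it assumes a single binary task, sources that always vote, votes that are conditionally independent given the true label, and every source better than chance. Under those assumptions the pairwise agreement rates factor as a rank-one product, and a simple method-of-moments (triplet) estimate stands in for the matrix completion-style formulation. All function names are hypothetical.

```python
import math
import random


def simulate_votes(accuracies, n_points, seed=0):
    """Draw hidden labels in {-1, +1} and conditionally independent source votes."""
    rng = random.Random(seed)
    labels, votes = [], []
    for _ in range(n_points):
        y = rng.choice((-1, 1))
        labels.append(y)
        # Each source reports the true label with its own accuracy, else flips it.
        votes.append([y if rng.random() < p else -y for p in accuracies])
    return labels, votes


def agreement_matrix(votes):
    """Empirical second moments E[v_i * v_j], computed from votes alone."""
    n, m = len(votes), len(votes[0])
    return [[sum(row[i] * row[j] for row in votes) / n for j in range(m)]
            for i in range(m)]


def estimate_accuracies(votes):
    """Estimate each source's accuracy from agreement rates, using no labels.

    Assumption: for conditionally independent sources,
    E[v_i * v_j] = a_i * a_j for i != j, where a_i = 2 * accuracy_i - 1.
    Any triplet (i, j, k) then gives a_i^2 = M_ij * M_ik / M_jk.
    Assumption: every source beats chance, so we take the positive root.
    """
    M = agreement_matrix(votes)
    m = len(M)
    if m < 3:
        raise ValueError("the triplet estimate needs at least three sources")
    estimates = []
    for i in range(m):
        j, k = (i + 1) % m, (i + 2) % m
        a_i = math.sqrt(abs(M[i][j] * M[i][k] / M[j][k]))
        # Clip so the implied accuracy stays strictly below 1.
        estimates.append((1 + min(a_i, 0.999)) / 2)
    return estimates


if __name__ == "__main__":
    true_accuracies = [0.9, 0.8, 0.7]  # hidden from the estimator
    _, votes = simulate_votes(true_accuracies, n_points=20000)
    for truth, est in zip(true_accuracies, estimate_accuracies(votes)):
        print(f"true accuracy {truth:.2f}  estimated {est:.3f}")
```

The estimator never reads the simulated labels; it only uses how often sources agree with one another. The multi-task setting with correlated sources described in the abstract requires the dependency structure as an additional input, which this sketch does not model.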