Same Same, But Different: Conditional Multi-Task Learning for Demographic-Specific Toxicity Detection
Soumyajit Gupta, Sooyong Lee, Maria De-Arteaga, M. Lease
- 14 Feb 2023
TL;DR: This paper proposes Conditional MTL (CondMTL), a multi-task learning approach for toxicity detection in which only training examples relevant to a given demographic group are considered by the loss function. This removes the requirement of traditional MTL that labels for all tasks be present for every data point.
Abstract: Algorithmic bias often arises as a result of differential subgroup validity, in which predictive relationships vary across groups. For example, in toxic language detection, comments targeting different demographic groups can vary markedly. In such settings, trained models can be dominated by the relationships that best fit the majority group, leading to disparate performance. We propose framing toxicity detection as multi-task learning (MTL), allowing a model to specialize on the relationships that are relevant to each demographic group while also leveraging shared properties across groups. With toxicity detection, each task corresponds to identifying toxicity against a particular demographic group. However, traditional MTL requires labels for all tasks to be present for every data point. To address this, we propose Conditional MTL (CondMTL), wherein only training examples relevant to the given demographic group are considered by the loss function. This lets us learn group-specific representations in each branch that are not cross-contaminated by irrelevant labels. Results on synthetic and real data show that using CondMTL improves predictive recall over various baselines in general and for the minority demographic group in particular, while having similar overall accuracy.
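The conditional loss described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes each task head outputs a sigmoid probability, that the per-task loss is binary cross-entropy, and that a 0/1 relevance mask marks which examples belong to which demographic group. The function name `conditional_bce` and all array names are hypothetical.

```python
import numpy as np

def conditional_bce(probs, labels, mask, eps=1e-7):
    """Per-task mean binary cross-entropy over relevant examples only.

    probs, labels, mask: arrays of shape (n_examples, n_tasks).
    mask[i, t] is 1.0 if example i is relevant to task (group) t, else 0.0.
    Assumption: BCE on sigmoid outputs; the paper's exact loss may differ.
    """
    probs = np.clip(probs, eps, 1.0 - eps)  # avoid log(0)
    bce = -(labels * np.log(probs) + (1.0 - labels) * np.log(1.0 - probs))
    masked = bce * mask                      # zero out irrelevant examples
    counts = np.maximum(mask.sum(axis=0), 1.0)  # guard against empty tasks
    return masked.sum(axis=0) / counts

# Toy batch: 4 comments, 2 demographic-group tasks.
probs = np.array([[0.9, 0.5], [0.2, 0.5], [0.5, 0.8], [0.5, 0.1]])
labels = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
mask = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])

per_task_loss = conditional_bce(probs, labels, mask)
total_loss = per_task_loss.sum()
```

Because masked-out entries contribute nothing, the label a comment carries for a group it does not target has no effect on that group's branch, which is the sense in which the branches are not cross-contaminated.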
Citations
How (Not) to Use Sociodemographic Information for Subjective NLP Tasks
T. Beck, Hendrik Schuff, Anne Lauscher, Iryna Gurevych
TL;DR: It is shown that sociodemographic information affects model predictions and can be beneficial for improving zero-shot learning in subjective NLP tasks, and that sociodemographic prompting should be used with care for sensitive applications, such as toxicity annotation or when studying LLM alignment.
21
A Critical Survey on Fairness Benefits of XAI
Luca Deck, Jakob Schoeffer, Maria De-Arteaga, Niklas Kühl
TL;DR: This critical survey analyzes typical claims about the relationship between explainable AI (XAI) and fairness in order to disentangle the multidimensional relationship between the two concepts, and encourages conceiving of XAI not as an ethical panacea but as one of many tools for approaching the multidimensional, sociotechnical challenge of algorithmic fairness.
7
Hate Speech Detection with Generalizable Target-aware Fairness
Tong Chen, Danny Wang, Xue Li, Marten Risius, Gianluca Demartini, Hongzhi Yin
- 28 May 2024
TL;DR: Hate speech detection with generalizable target-aware fairness aims to address the bias problem in hate speech detection classifiers by generalizing to unseen targets.
Human-centered NLP Fact-checking: Co-Designing with Fact-checkers using Matchmaking for AI
Houjiang Liu, Anubrata Das, Alexander Boltz, Didi Zhou, Daisy Pinaroc, Matthew Lease, Min Kyung Lee
TL;DR: A co-design method is investigated, Matchmaking for AI, to enable fact-checkers, designers, and NLP researchers to collaboratively identify what fact-checker needs should be addressed by technology, and to brainstorm ideas for potential solutions.
Algorithmic Fairness: A Tolerance Perspective
Rongkui Luo, Tao Tang, Feng Guo, Jiaying Liu, C. Shan Xu, Leo Yu Zhang, Xiang Wang, Chengqi Zhang
- 26 Apr 2024
TL;DR: Algorithmic fairness survey exploring the social consequences of biased decisions, introducing a novel taxonomy based on 'tolerance', and outlining challenges and future directions.
References
•Proceedings Article
Adam: A Method for Stochastic Optimization
Diederik P. Kingma, Jimmy Ba
- 01 Jan 2015
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
138.5K
•Journal Article
Scikit-learn: Machine Learning in Python
Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, Edouard Duchesnay
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
Focal Loss for Dense Object Detection
Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár
- 07 Aug 2017
TL;DR: This paper proposes to address the extreme foreground-background class imbalance encountered during training of dense detectors by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples, and develops a novel Focal Loss, which focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training.
16.7K
•Posted Content
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
TL;DR: This work proposes a method to pre-train a smaller general-purpose language representation model, called DistilBERT, which can be fine-tuned with good performances on a wide range of tasks like its larger counterparts, and introduces a triple loss combining language modeling, distillation and cosine-distance losses.
7.3K