A Framework for Understanding Sources of Harm throughout the Machine Learning Life Cycle

doi:10.1145/3465416.3483305

- 05 Oct 2021

416

TL;DR: In this paper, the authors identify seven potential sources of downstream harm in machine learning, spanning data collection, development, and deployment, and propose a framework to facilitate more productive and precise communication around these issues.

Citations

•Journal Article•10.1007/978-3-031-23618-1_1

Gender Stereotyping Impact in Facial Expression Recognition

Iris Dominguez-Catena

- 01 Jan 2023

- Communications in computer and informati...

TL;DR: In this article, the authors use a popular FER dataset, FER+, to generate derivative datasets with different amounts of stereotypical bias by altering the gender proportions of certain labels. They then measure the discrepancy in performance across apparent gender groups for models trained on these datasets.

7

•Journal Article•10.15441/ceem.23.041

Current challenges in adopting machine learning to critical care and emergency medicine

Cyra Y. Kang, +1 more

- 15 May 2023

- Clinical and experimental emergency medi...

TL;DR: In this article, the authors discuss a series of current challenges in adopting ML models in clinical research, including data, feature generation, model design, performance assessment, and limited implementation of the research.

7

•Journal Article•10.1088/2515-7620/acde35

Social media and volunteer rescue requests prediction with random forest and algorithm bias detection: a case of Hurricane Harvey

Volodymyr Mihunov, +4 more

- 01 Jun 2023

- Environmental research communications

TL;DR: In this article, the authors evaluate a Random Forest regression model trained to predict Twitter rescue request rates from social-environmental data using three fairness criteria (independence, separation, and sufficiency).

7

Proceedings Article•10.1145/3593013.3594010

Rethinking Transparency as a Communicative Constellation

Florian Eyert, +1 more

- 12 Jun 2023

TL;DR: In this paper, the authors make the case for an expanded understanding of transparency and propose to view transparency as a communicative constellation that is a precondition for meaningful democratic deliberation.

7

Journal Article•10.48550/arXiv.2303.15889

Metrics for Dataset Demographic Bias: A Case Study on Facial Expression Recognition

Iris Dominguez-Catena, +2 more

- 28 Mar 2023

- arXiv.org

TL;DR: In this article, the authors develop a taxonomy for classifying metrics of dataset demographic bias, providing a practical guide for selecting appropriate metrics, and present a case study of 20 datasets used in Facial Emotion Recognition (FER), analyzing the biases present in them.

7

References

Proceedings Article•10.1109/CVPR.2009.5206848

ImageNet: A large-scale hierarchical image database

Jia Deng, +5 more

- 20 Jun 2009

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.

75.9K

•Journal Article•10.1613/JAIR.953

SMOTE: synthetic minority over-sampling technique

Nitesh V. Chawla, +3 more

- 01 Jan 2002

- Journal of Artificial Intelligence Resea...

TL;DR: This article presents a method of over-sampling the minority class by creating synthetic minority class examples; the method is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.

27.7K

•Journal Article•10.1613/JAIR.953

SMOTE: Synthetic Minority Over-sampling Technique

Nitesh V. Chawla, +3 more

- 09 Jun 2011

- arXiv: Artificial Intelligence

TL;DR: This article presents a method of over-sampling the minority class by creating synthetic minority class examples; the method is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.

11.5K

Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification

Joy Buolamwini, +1 more

- 21 Jan 2018

TL;DR: The authors show that, in commercial API-based classifiers of gender from facial images, including IBM Watson Visual Recognition, the highest error involves images of dark-skinned women, while the most accurate result is for light-skinned men.

4.3K

Journal Article•10.2139/SSRN.2477899

Big Data's Disparate Impact

Solon Barocas, +1 more

- 01 Jan 2016

- Social Science Research Network

TL;DR: The authors argue that, in the absence of a demonstrable intent to discriminate, the best doctrinal hope for data mining's victims would seem to lie in disparate impact doctrine. Under that doctrine, a practice can be justified as a business necessity when its outcomes are predictive of future employment outcomes, and data mining is specifically designed to find such statistical correlations.

2.8K

Related Papers (5)

A Framework for Understanding Sources of Harm throughout the Machine Learning Life Cycle

Toward a Taxonomy of Harm

A Framework of Severity for Harmful Content Online

Toward a Taxonomy of Harm in Knowledge Organization Systems

How Measuring Learning May Limit New Knowledge Creation