Fair mapping

doi:10.48550/arXiv.2209.00617

Journal Article

Sébastien Gambs, +1 more

- 01 Sep 2022

- arXiv.org

- Vol. abs/2209.00617

TL;DR: This work proposes a novel pre-processing method that transforms the distribution of protected groups onto a chosen target distribution, under additional privacy constraints designed to prevent the inference of sensitive attributes. The method preserves the interpretability of the data and can be used without exactly defining the sensitive groups.

Abstract: To mitigate the effects of undesired biases in models, several approaches propose to pre-process the input dataset to reduce the risks of discrimination by preventing the inference of sensitive attributes. Unfortunately, most of these pre-processing methods generate a new distribution that is very different from the original one, which often leads to unrealistic data. As a side effect, existing models need to be re-trained on this new data distribution to make accurate predictions. To address this issue, we propose a novel pre-processing method, which we call fair mapping, based on the transformation of the distribution of protected groups onto a chosen target one, with additional privacy constraints whose objective is to prevent the inference of sensitive attributes. More precisely, we leverage the recent Wasserstein GAN and AttGAN frameworks to achieve the optimal transport of data points, coupled with a discriminator enforcing the protection against attribute inference. Our proposed approach preserves the interpretability of data and can be used without defining exactly the sensitive groups. In addition, our approach can be specialized to model existing state-of-the-art approaches, thus proposing a unifying view on these methods. Finally, several experiments on real and synthetic datasets demonstrate that our approach is able to hide the sensitive attributes while limiting the distortion of the data and improving the fairness of subsequent data analysis tasks.
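
To make the idea of transporting a protected group's distribution onto a target distribution more concrete, here is a minimal one-dimensional sketch. It is not the method described in the abstract, which relies on Wasserstein GAN and AttGAN and includes a discriminator against attribute inference. Instead it uses plain quantile matching, which is the optimal transport map in one dimension for convex costs when both samples have the same size, and an approximation otherwise. The function name `quantile_map` and the toy values are assumptions made for illustration only.

```python
from bisect import bisect_right


def quantile_map(source, target):
    """Map each source value to the target value at the same empirical quantile.

    Illustrative stand-in for distribution mapping on a single numeric
    feature; not the GAN-based approach described in the abstract.
    """
    src_sorted = sorted(source)
    tgt_sorted = sorted(target)
    n, m = len(src_sorted), len(tgt_sorted)
    mapped = []
    for x in source:
        # Rank of x in the source sample: number of source points <= x (1..n).
        rank = bisect_right(src_sorted, x)
        # Index of the target point at the mid-rank quantile (rank - 0.5) / n,
        # computed with integers to avoid floating-point rounding.
        idx = ((2 * rank - 1) * m) // (2 * n)
        mapped.append(tgt_sorted[idx])
    return mapped


# Toy feature values for a protected group and for the chosen target group.
protected = [3.0, 1.0, 2.0]
target = [10.0, 20.0, 30.0]
print(quantile_map(protected, target))  # [30.0, 10.0, 20.0]
```

After the mapping, the protected group's values follow the target group's empirical distribution while the ordering of individuals within the group is preserved, so this single feature no longer separates the two groups.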

References

•Journal Article•10.3156/JSOFT.29.5_177_2

Generative Adversarial Nets

Ian Goodfellow, +7 more

- 08 Dec 2014

TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously trained: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.

48.6K

•Proceedings Article

Wasserstein Generative Adversarial Networks

Martin Arjovsky, +2 more

- 17 Jul 2017

TL;DR: This work introduces a new algorithm named WGAN, an alternative to traditional GAN training that can improve the stability of learning, get rid of problems like mode collapse, and provide meaningful learning curves useful for debugging and hyperparameter searches.

8.2K

•Proceedings Article

Sinkhorn Distances: Lightspeed Computation of Optimal Transport

Marco Cuturi

- 05 Dec 2013

TL;DR: This work smooths the classic optimal transport problem with an entropic regularization term, and shows that the resulting optimum is also a distance which can be computed through Sinkhorn's matrix scaling algorithm at a speed that is several orders of magnitude faster than that of transport solvers.

4K

•Proceedings Article

Equality of opportunity in supervised learning

Moritz Hardt, +2 more

- 05 Dec 2016

TL;DR: This work proposes a criterion for discrimination against a specified sensitive attribute in supervised learning, where the goal is to predict some target based on available features, and shows how to optimally adjust any learned predictor so as to remove discrimination according to this definition.

4K

•Proceedings Article•10.1109/ICDE.2007.367856

t-Closeness: Privacy Beyond k-Anonymity and l-Diversity

Ninghui Li, +2 more

- 15 Apr 2007

TL;DR: t-closeness requires that the distribution of a sensitive attribute in any equivalence class be close to the distribution of the attribute in the overall table (i.e., the distance between the two distributions should be no more than a threshold t).

3.9K
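
The Sinkhorn Distances entry in the list above describes smoothing the optimal transport problem with an entropic term and solving it by matrix scaling. A minimal pure-Python sketch of that scaling iteration follows, assuming small dense histograms that each sum to 1. The function name `sinkhorn_plan` and the parameters `reg` and `n_iters` are illustrative choices, not taken from any of the listed papers.

```python
import math


def sinkhorn_plan(a, b, cost, reg=1.0, n_iters=500):
    """Approximate the entropic-regularized transport plan between a and b.

    a, b : histograms (lists of positive weights, each summing to 1)
    cost : cost[i][j] is the cost of moving mass from bin i of a to bin j of b
    reg  : strength of the entropic regularization (assumed default)
    """
    n, m = len(a), len(b)
    # Gibbs kernel derived from the cost matrix.
    kernel = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Rescale rows so that the plan's row sums match a.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Rescale columns so that the plan's column sums match b.
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def transport_cost(plan, cost):
    """Total cost of a transport plan: sum of plan[i][j] * cost[i][j]."""
    return sum(p * c for p_row, c_row in zip(plan, cost) for p, c in zip(p_row, c_row))


a = [0.3, 0.7]
b = [0.6, 0.4]
cost = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn_plan(a, b, cost)
print(plan, transport_cost(plan, cost))
```

Smaller values of `reg` bring the plan closer to the unregularized optimum but slow convergence and can underflow the kernel, which is why practical implementations usually work in the log domain.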
