Open Access • Proceedings Article
Scalable Deletion-Robust Submodular Maximization: Data Summarization with Privacy and Fairness Constraints
Ehsan Kazemi, Morteza Zadimoghaddam, Amin Karbasi
- 03 Jul 2018
- pp 2544-2553
TL;DR: This work proposes the first memory-efficient centralized, streaming, and distributed methods with constant-factor approximation guarantees against any number of adversarial deletions, and shows that the solution is robust against even 80% of data deletion.
Abstract: Can we efficiently extract useful information from a large user-generated dataset while protecting the privacy of the users and/or ensuring fairness in representation? We cast this problem as an instance of deletion-robust submodular maximization, where part of the data may be deleted or masked due to privacy concerns or fairness criteria. We propose the first memory-efficient centralized, streaming, and distributed methods with constant-factor approximation guarantees against any number of adversarial deletions. We extensively evaluate the performance of our algorithms on real-world applications, including (i) Uber pick-up locations with location privacy constraints; (ii) feature selection with fairness constraints for income prediction and crime rate prediction; and (iii) deletion-robust summarization of census data consisting of 2,458,285 feature vectors. Our experiments show that our solution is robust against even 80% of data deletion.
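To make the problem setting in the abstract concrete, the sketch below selects a size-k summary by maximizing a monotone submodular objective, then recomputes the summary after some elements are deleted. It is only an illustration under stated assumptions, not the paper's method: it uses the classic greedy heuristic on a toy coverage objective and simply re-runs greedy on the surviving elements, which requires keeping the whole dataset in memory. The names `covers`, `coverage`, `greedy`, and `summarize_after_deletion` and the toy data are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def coverage(selected: Iterable[str], covers: Dict[str, Set[int]]) -> int:
    """Toy monotone submodular objective: number of distinct items covered."""
    covered: Set[int] = set()
    for element in selected:
        covered |= covers[element]
    return len(covered)


def greedy(candidates: Iterable[str], covers: Dict[str, Set[int]], k: int) -> List[str]:
    """Classic greedy: add the element with the largest marginal gain, k times."""
    chosen: List[str] = []
    remaining = set(candidates)
    covered: Set[int] = set()
    for _ in range(k):
        best, best_gain = None, 0
        for element in sorted(remaining):  # sorted for deterministic tie-breaking
            gain = len(covers[element] - covered)
            if gain > best_gain:
                best, best_gain = element, gain
        if best is None:  # no remaining element adds any value
            break
        chosen.append(best)
        covered |= covers[best]
        remaining.remove(best)
    return chosen


def summarize_after_deletion(
    covers: Dict[str, Set[int]], deleted: Set[str], k: int
) -> List[str]:
    """Naive baseline (an assumption, not the paper's algorithm):
    drop the deleted elements and re-run greedy on the survivors."""
    survivors = [element for element in covers if element not in deleted]
    return greedy(survivors, covers, k)


# Hypothetical toy dataset: each element covers a set of item ids.
covers = {
    "a": {1, 2, 3, 4},
    "b": {3, 4, 5},
    "c": {5, 6},
    "d": {1, 6},
    "e": {7},
}

summary = greedy(covers, covers, k=2)
robust_summary = summarize_after_deletion(covers, deleted={"a"}, k=2)
leftover = [element for element in summary if element != "a"]

print(summary, coverage(summary, covers))                # ['a', 'c'] 6
print(leftover, coverage(leftover, covers))              # ['c'] 2
print(robust_summary, coverage(robust_summary, covers))  # ['b', 'd'] 5
```

In this toy run, deleting the single element "a" reduces the value of the original summary from 6 to 2, while re-selecting from the survivors recovers a value of 5. Re-running greedy needs access to all surviving data, which is what memory-efficient deletion-robust methods aim to avoid.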
Citations
A Review on Fairness in Machine Learning
Dana Pessach, Erez Shmueli
TL;DR: An overview of the main concepts of identifying, measuring, and improving algorithmic fairness when using ML algorithms, focusing primarily on classification tasks, is presented.
•Posted Content
Algorithmic Fairness
Dana Pessach, Erez Shmueli
TL;DR: An overview of the main concepts of identifying, measuring and improving algorithmic fairness when using AI algorithms is presented and the most commonly used fairness-related datasets in this field are described.
•Proceedings Article
Do Less, Get More: Streaming Submodular Maximization with Subsampling
Moran Feldman, Amin Karbasi, Ehsan Kazemi
- 01 Jan 2018
TL;DR: This article proposes a one-pass streaming algorithm for submodular maximization that does not evaluate the entire stream even once; it achieves a tight approximation guarantee in various settings while having the smallest memory footprint and requiring the lowest number of function evaluations.
The one-way communication complexity of submodular maximization with applications to streaming and robustness
Moran Feldman, Ashkan Norouzi-Fard, Ola Svensson, Rico Zenklusen
- 22 Jun 2020
TL;DR: In this paper, the authors consider the problem of maximizing a monotone submodular function subject to a cardinality constraint, and show that the possibility of querying infeasible sets can actually be exploited to beat this bound, by presenting a tight 2/3-approximation taking exponential time.
•Posted Content
Submodular Maximization with Nearly Optimal Approximation, Adaptivity and Query Complexity
TL;DR: A distributed algorithm for maximizing a monotone submodular function with cardinality constraint $k$ that achieves a $(1-1/e-\varepsilon)$-approximation in expectation; the approximation guarantee and query complexity are optimal, and the adaptivity is nearly optimal.