Crowdsourced Selection on Multi-Attribute Data

doi:10.1145/3132847.3132891

Proceedings Article•10.1145/3132847.3132891

Xueping Weng, +3 more

- 06 Nov 2017

- pp 307-316

9

TL;DR: This paper studies the crowdsourced selection problem on multi-attribute data, e.g., selecting photos of women with dark eyes who are wearing sunglasses, and proposes a predicate-order-based framework to reduce monetary cost.

Abstract: Crowdsourced selection asks the crowd to select entities that satisfy a query condition, e.g., selecting the photos of people wearing sunglasses from a given set of photos. Existing studies focus on a single query predicate; in this paper we study the crowdsourced selection problem on multi-attribute data, e.g., selecting photos of women with dark eyes who are wearing sunglasses. A straightforward method asks the crowd to check every predicate in the query for every entity, which incurs a huge monetary cost. Instead, we can select an optimized predicate order and ask the crowd to answer the entities following that order: if an entity does not satisfy a predicate, we can prune it without asking the remaining predicates, which reduces the cost. There are two challenges in finding the optimized predicate order. The first is how to determine the predicate order, and the second is how to capture the correlation among different predicates. To address these challenges, we propose a predicate-order-based framework to reduce monetary cost. First, we define an expectation tree to store the selectivities of predicates and estimate the best predicate order. In each iteration, we estimate the best predicate order from the expectation tree and then choose a predicate as a question to ask the crowd. After getting the result for the current predicate, we choose the next predicate to ask, until the result is determined. We then update the expectation tree using the answers obtained from the crowd and continue to the next iteration. We also study the problem of answering multiple queries simultaneously and reduce its cost using the correlation between queries. Finally, we propose a confidence-based method to improve quality. The experimental results show that our predicate-order-based algorithm is effective and can reduce cost significantly compared with baseline approaches.
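The pruning idea in the abstract can be illustrated with a small sketch. This is not the paper's expectation tree. It assumes the predicates are independent (the abstract notes that correlation is one of the real challenges), that every question costs the same, and that the selectivities are known in advance. The function names and the selectivity values below are hypothetical, chosen only for illustration.

```python
from itertools import permutations


def expected_questions(order, selectivity):
    """Expected number of crowd questions per entity for a predicate order.

    An entity is pruned at the first predicate it fails, so predicate i is
    asked only if the entity satisfied every predicate before it.
    Assumption: predicates are independent, which the paper does not assume.
    """
    total = 0.0
    reach = 1.0  # probability that the entity survives to this predicate
    for pred in order:
        total += reach
        reach *= selectivity[pred]
    return total


def best_order(selectivity):
    """Brute-force the order with the lowest expected question count."""
    return min(
        permutations(selectivity),
        key=lambda order: expected_questions(order, selectivity),
    )


# Hypothetical selectivities: fraction of photos satisfying each predicate.
selectivity = {"female": 0.5, "dark_eyes": 0.4, "sunglasses": 0.1}

order = best_order(selectivity)
cost = expected_questions(order, selectivity)
naive_cost = len(selectivity)  # ask every predicate for every entity
```

Under these assumptions the cheapest order asks the most selective predicate first, and the expected cost per entity drops from 3 questions to about 1.14. Brute force is only practical for a handful of predicates, and once predicates are correlated, per-predicate selectivities no longer determine the cost, which is the gap the expectation tree in the abstract is meant to address.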

Citations

Journal Article•10.1109/JIOT.2021.3065716

A Cost-Efficient Framework for Crowdsourced Data Collection in Vehicular Networks

Bo Yin, +1 more

- 01 Sep 2021

- IEEE Internet of Things Journal

TL;DR: This work proposes a cost-efficient framework for crowdsourced data collection in vehicular networks while ensuring the accuracy of the response and designs an answer gathering scheme that considers both the length of the aggregation tree and data delivery and minimizes the communication cost for collecting answers from participants.

12

Posted Content

Crowd-Powered Data Mining

Chengliang Chai, +4 more

- 13 Jun 2018

- arXiv: Databases

TL;DR: This tutorial gives an overview of crowdsourcing, and then summarizes the fundamental techniques, including quality control, cost control, and latency control, which must be considered in crowdsourced data mining.

11

Proceedings Article•10.1145/3308558.3313749

Improving Multiclass Classification in Crowdsourcing by Using Hierarchical Schemes

Xiaoni Duan, +1 more

- 13 May 2019

TL;DR: A method of improving accuracy of multiclass classification tasks in crowdsourcing by reorganizing a given flat classification task into a hierarchical classification task consisting of several subtasks, and assigning each worker to an appropriate subtask.

7

Book

Crowdsourced Data Management : Hybrid Machine-Human Computing

Guoliang Li, +4 more

- 12 Oct 2018

5

Journal Article•10.1016/j.future.2021.09.008

Cost-effective crowdsourced join queries for entity resolution without prior knowledge

Jaco Lavinsky

- 01 Feb 2022

- Future Generation Computer Systems

TL;DR: In this article, a two-level confidence-based labeling model is proposed to minimize the monetary cost of labeling a single pair with a confidence guarantee, and the number of comparison pairs on the basis of transitive relations.

3

References

Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments

Gary B. Huang, +3 more

- 01 Oct 2008

TL;DR: The database contains labeled face photographs spanning the range of conditions typically encountered in everyday life, and exhibits “natural” variability in factors such as pose, lighting, race, accessories, occlusions, and background.

6.5K

Book Chapter•10.1007/978-3-642-33712-3_44

Describing clothing by semantic attributes

Huizhong Chen, +2 more

- 07 Oct 2012

TL;DR: A fully automated system that is capable of generating a list of nameable attributes for clothes on human body in unconstrained images is proposed, and a novel application of dressing style analysis is introduced that utilizes the semantic attributes produced by the system.

514

Journal Article•10.14778/3055540.3055547

Truth inference in crowdsourcing: is the problem solved?

Yudian Zheng, +4 more

- 01 Jan 2017

TL;DR: It is believed that the truth inference problem is not fully solved; the limitations of existing algorithms are identified and promising research directions are pointed out.

474

Journal Article•10.14778/2336664.2336676

CDAS: a crowdsourcing data analytics system

Xuan Liu, +5 more

- 01 Jun 2012

TL;DR: A quality-sensitive answering model is introduced, which guides the crowdsourcing query engine for the design and processing of the corresponding crowdsourcing jobs, and effectively reduces the processing cost while maintaining the required query answer quality.

319

Journal Article•10.1109/TKDE.2016.2535242

Crowdsourced Data Management: A Survey

Guoliang Li, +3 more

- 01 Sep 2016

- IEEE Transactions on Knowledge and Data ...

TL;DR: This paper surveys and synthesizes a wide spectrum of existing studies on crowdsourced data management and outlines key factors that need to be considered to improve crowdsourcing data management.

281

Related Papers (5)

A probabilistic optimization framework for the empty-answer problem

Constraint acquisition with recommendation queries

Differentially private top-k query over MapReduce

Make Up Your Mind: The Price of Online Queries in Differential Privacy

Advanced query optimization techniques for relational database systems