Proceedings Article, DOI: 10.1145/3132847.3132891
Crowdsourced Selection on Multi-Attribute Data
Xueping Weng, Guoliang Li, Huiqi Hu, Jianhua Feng +3 more
- 06 Nov 2017
- pp 307-316
TL;DR: This paper studies the crowdsourced selection problem on multi-attribute data, e.g., selecting the female photos with dark eyes and wearing sunglasses, and proposes a predicate order based framework to reduce monetary cost.
Abstract: Crowdsourced selection asks the crowd to select entities that satisfy a query condition, e.g., selecting the photos of people wearing sunglasses from a given set of photos. Existing studies focus on a single query predicate; in this paper we study the crowdsourced selection problem on multi-attribute data, e.g., selecting the photos of females with dark eyes who are wearing sunglasses. A straightforward method asks the crowd to check every predicate in the query for every entity, which incurs a huge monetary cost. Instead, we can select an optimized predicate order and ask the crowd to answer the entities following that order: if an entity does not satisfy a predicate, we can prune it without asking the remaining predicates, which reduces the cost. There are two challenges in finding the optimized predicate order. The first is how to determine the predicate order, and the second is how to capture the correlation among different predicates. To address these challenges, we propose a predicate order based framework to reduce monetary cost. First, we define an expectation tree to store the selectivities of predicates and estimate the best predicate order. In each iteration, we estimate the best predicate order from the expectation tree and choose a predicate as a question to ask the crowd. After getting the result for the current predicate, we choose the next predicate to ask, until the result for the entity is known. We then update the expectation tree using the answers obtained from the crowd and continue to the next iteration. We also study the problem of answering multiple queries simultaneously and reduce its cost using the correlation between queries. Finally, we propose a confidence based method to improve quality. Experimental results show that our predicate order based algorithm is effective and reduces cost significantly compared with baseline approaches.
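To make the pruning argument concrete, the following is a minimal sketch of how the expected number of crowd questions per entity depends on the predicate order. It is not the paper's expectation tree: it assumes the predicates are independent (the paper explicitly models correlation), that each question costs the same, and that crowd answers are correct. The function names and the selectivity values are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def expected_questions(order, selectivity):
    """Expected number of questions asked for one entity.

    Predicates are asked in `order`; the entity is pruned as soon as one
    predicate fails. Assumes independent predicates (a simplification).
    """
    cost = 0.0
    reach_prob = 1.0  # probability the entity passed all earlier predicates
    for predicate in order:
        cost += reach_prob                    # one question if we get this far
        reach_prob *= selectivity[predicate]  # chance of surviving this predicate
    return cost


def best_order(selectivity):
    """Brute-force search over all orders; fine for a handful of predicates."""
    return min(
        permutations(selectivity),
        key=lambda order: expected_questions(order, selectivity),
    )


# Hypothetical selectivities: fraction of photos satisfying each predicate.
selectivity = {"female": 0.5, "dark_eyes": 0.6, "sunglasses": 0.1}

order = best_order(selectivity)
print(order, expected_questions(order, selectivity))
print("ask-everything baseline:", len(selectivity))
```

Under these assumptions, asking the most selective predicate first prunes most entities after a single question, whereas the straightforward method always asks all three.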
Citations
A Cost-Efficient Framework for Crowdsourced Data Collection in Vehicular Networks
Bo Yin, Jiazhuang Lu +1 more
TL;DR: This work proposes a cost-efficient framework for crowdsourced data collection in vehicular networks while ensuring the accuracy of the response and designs an answer gathering scheme that considers both the length of the aggregation tree and data delivery and minimizes the communication cost for collecting answers from participants.
•Posted Content
Crowd-Powered Data Mining
TL;DR: This tutorial gives an overview of crowdsourcing, and then summarizes the fundamental techniques, including quality control, cost control, and latency control, which must be considered in crowdsourced data mining.
Improving Multiclass Classification in Crowdsourcing by Using Hierarchical Schemes
Xiaoni Duan, Keishi Tajima +1 more
- 13 May 2019
TL;DR: A method of improving accuracy of multiclass classification tasks in crowdsourcing by reorganizing a given flat classification task into a hierarchical classification task consisting of several subtasks, and assigning each worker to an appropriate subtask.
Cost-effective crowdsourced join queries for entity resolution without prior knowledge
TL;DR: In this article, a two-level confidence-based labeling model is proposed to minimize the monetary cost of labeling a single pair with a confidence guarantee, as well as the number of comparison pairs on the basis of transitive relations.
References
Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments
Gary B. Huang, Marwan Mattar, Tamara L. Berg, Eric Learned-Miller +3 more
- 01 Oct 2008
TL;DR: The database contains labeled face photographs spanning the range of conditions typically encountered in everyday life, and exhibits “natural” variability in factors such as pose, lighting, race, accessories, occlusions, and background.
Describing clothing by semantic attributes
Huizhong Chen, Andrew C. Gallagher, Bernd Girod +2 more
- 07 Oct 2012
TL;DR: A fully automated system that is capable of generating a list of nameable attributes for clothes on human body in unconstrained images is proposed, and a novel application of dressing style analysis is introduced that utilizes the semantic attributes produced by the system.
Truth inference in crowdsourcing: is the problem solved?
Yudian Zheng, Guoliang Li, Yuanbing Li, Caihua Shan, Reynold Cheng +4 more
- 01 Jan 2017
TL;DR: It is believed that the truth inference problem is not fully solved; the paper identifies the limitations of existing algorithms and points out promising research directions.
CDAS: a crowdsourcing data analytics system
Xuan Liu, Meiyu Lu, Beng Chin Ooi, Yanyan Shen, Sai Wu, Meihui Zhang +5 more
- 01 Jun 2012
TL;DR: A quality-sensitive answering model is introduced, which guides the crowdsourcing query engine for the design and processing of the corresponding crowdsourcing jobs, and effectively reduces the processing cost while maintaining the required query answer quality.
Crowdsourced Data Management: A Survey
TL;DR: This paper surveys and synthesizes a wide spectrum of existing studies on crowdsourced data management and outlines key factors that need to be considered to improve crowdsourcing data management.