Open AccessPosted Content
MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition
TL;DR: A benchmark task to recognize one million celebrities from their face images, by using all the possibly collected face images of this individual on the web as training data, which could lead to one of the largest classification problems in computer vision.
read more
Abstract: In this paper, we design a benchmark task and provide the associated datasets for recognizing face images and link them to corresponding entity keys in a knowledge base. More specifically, we propose a benchmark task to recognize one million celebrities from their face images, by using all the possibly collected face images of this individual on the web as training data. The rich information provided by the knowledge base helps to conduct disambiguation and improve the recognition accuracy, and contributes to various real-world applications, such as image captioning and news video analysis. Associated with this task, we design and provide concrete measurement set, evaluation protocol, as well as training data. We also present in details our experiment setup and report promising baseline results. Our benchmark task could lead to one of the largest classification problems in computer vision. To the best of our knowledge, our training dataset, which contains 10M images in version 1, is the largest publicly available one in the world.
read more
Chat with Paper
AI Agents for this Paper
Find similar papers on Google Scholar, PubMed and Arxiv
Write a critical review of this paper
Analyze citations of this paper to find unaddressed research gaps
Figures

Fig. 1. An example of our face recognition task. Our task is to recognize the face in the image and then link this face with the corresponding entity key in the knowledge base. By recognizing the left image to be “Anne Hathaway” and linking to the entity key, we know she is an American actress born in 1982, who has played Mia Thermopolis in The Princess Diaries, not the other Anne Hathaway who was the wife of William Shakespeare. Input image is from the web. 2 ![Fig. 2. Distribution of the properties of the celebrities in our one-million list in different aspects. The large scale of our dataset naturally introduces great diversity. As shown in (a) and (b), we include persons with more than 2000 different professions, and come from more than 200 distinct countries/regions. The figure (c) demonstrates that we don’t include celebrities who were born before 1846 (long time before the first rollfilm specialized camera “Kodak” was invented [19]) and covers celebrities of a large variance of age. In (d), we notice that we have more females than males in our onemillion celebrity list. This might be correlated with the profession distribution in our list.](/figures/figure2-1-4pi5ioqhkqmp.png)
Fig. 2. Distribution of the properties of the celebrities in our one-million list in different aspects. The large scale of our dataset naturally introduces great diversity. As shown in (a) and (b), we include persons with more than 2000 different professions, and come from more than 200 distinct countries/regions. The figure (c) demonstrates that we don’t include celebrities who were born before 1846 (long time before the first rollfilm specialized camera “Kodak” was invented [19]) and covers celebrities of a large variance of age. In (d), we notice that we have more females than males in our onemillion celebrity list. This might be correlated with the profession distribution in our list. 
Fig. 4. Examples (subset) of the training images for the celebrity with entity key m.06y3r (Steve Jobs). The image marked with a green rectangle is claimed to be Steve Jobs when he was in high school. The image marked with a red rectangle is considered as a noise sample in our dataset, since it is synthesized by combining one image of Steve Jobs and one image of Ashton Kutcher, who is the actor in the movie “Jobs”. 
Fig. 3. Labeling GUI for “Chuck Palhniuk”. (partial view) As shown in the figure, in the upper right corner, a representative image and a short description is provided. For a given image candidate, judge can label as “not for this celebrity” (red), “yes for this celebrity” (green), or “broken image” (dark gray). 
Table 1. Face recognition datasets
Citations
Face Recognition with Low False Positive Error Rate
TL;DR: This work proposes an approach to maximize face recognition accuracy for a fixed false positive error rate using limited amount of hardware.
JPEG Compression-Resistant Low-Mid Adversarial Perturbation against Unauthorized Face Recognition System
TL;DR: This work proposes a more natural solution called low frequency adversarial perturbation (LFAP), which turns to regularize the source model to employing more low-frequency features by adversarial training, and proposes the mid frequency components as the productive complement.
Computer Vision for Primate Behavior Analysis in the Wild
Richard Vogg,Timo Lüddecke,Jonathan Henrich,Sharmita Dey,Matthias Nuske,Valentin Hassler,Derek Murphy,Julia Fischer,Julia Ostner,Oliver Schülke,Peter M. Kappeler,Claudia Fichtel,Alexander Gail,Stefan Treue,Hansjörg Scherberger,Florentin Wörgötter,Alexander Ecker +16 more
TL;DR: A survey of the state-of-the-art methods for computer vision problems that are directly relevant to the video-based study of animal behavior, including object detection, multi-individual tracking, (inter)action recognition and individual identification.
ReliableSwap: Boosting General Face Swapping Via Reliable Supervision
TL;DR: ReliableSwap as discussed by the authors uses face reenactment and blending techniques to synthesize the swapped face from real images in advance, where the synthetic face preserves source identity and target attributes.
2
3D Human Face Reconstruction and 2D Appearance Synthesis
Yajie Zhao
- 01 Jan 2018
TL;DR: This dissertation proposes three image-based face reconstruction approaches according to different assumption of inputs and proposes a deep neutral network to solve the HMD removal problem considering it as a face inpainting problem and explores the applicability of these reconstructions on four interesting applications.
2
References
ImageNet classification with deep convolutional neural networks
TL;DR: A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky,Jia Deng,Hao Su,Jonathan Krause,Sanjeev Satheesh,Sean Ma,Zhiheng Huang,Andrej Karpathy,Aditya Khosla,Michael S. Bernstein,Alexander C. Berg,Li Fei-Fei +11 more
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) as mentioned in this paper is a benchmark in object category classification and detection on hundreds of object categories and millions of images, which has been run annually from 2010 to present, attracting participation from more than fifty institutions.
•Journal Article
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky,Jia Deng,Hao Su,Jonathan Krause,Sanjeev Satheesh,Sean Ma,Zhiheng Huang,Andrej Karpathy,Michael S. Bernstein,Li Fei-Fei,Alexander C. Berg,Aditya Khosla +11 more
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) has been running annually for five years (since 2010) and has become the standard benchmark for large-scale object recognition.
23.9K
FaceNet: A Unified Embedding for Face Recognition and Clustering
TL;DR: FaceNet as discussed by the authors uses a deep convolutional network trained to directly optimize the embedding itself, rather than an intermediate bottleneck layer as in previous deep learning approaches, and achieves state-of-the-art face recognition performance using only 128 bytes per face.
14.2K
•Proceedings Article
On Spectral Clustering: Analysis and an algorithm
Andrew Y. Ng,Michael I. Jordan,Yair Weiss +2 more
- 03 Jan 2001
TL;DR: A simple spectral clustering algorithm that can be implemented using a few lines of Matlab is presented, and tools from matrix perturbation theory are used to analyze the algorithm, and give conditions under which it can be expected to do well.
Related Papers (5)
Kaiming He,Xiangyu Zhang,Shaoqing Ren,Jian Sun +3 more
- 27 Jun 2016
Omkar M. Parkhi,Andrea Vedaldi,Andrew Zisserman +2 more
- 01 Jan 2015