FaceNet: A Unified Embedding for Face Recognition and Clustering
TL;DR: FaceNet uses a deep convolutional network trained to directly optimize the embedding itself, rather than an intermediate bottleneck layer as in previous deep learning approaches, and achieves state-of-the-art face recognition performance using only 128 bytes per face.
Abstract: Despite significant recent advances in the field of face recognition, implementing face verification and recognition efficiently at scale presents serious challenges to current approaches. In this paper we present a system, called FaceNet, that directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity. Once this space has been produced, tasks such as face recognition, verification and clustering can be easily implemented using standard techniques with FaceNet embeddings as feature vectors.
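Because distances in the embedding space are meant to track face similarity, verification can be reduced to a distance threshold and recognition to a nearest-neighbor lookup. The following is a minimal sketch of that idea, not the authors' implementation. The function names, the toy 3-dimensional vectors, and the threshold value of 1.1 are assumptions chosen for illustration; a real threshold would be tuned on held-out data.

```python
import math
from typing import Dict, Sequence


def l2_normalize(v: Sequence[float]) -> list:
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def same_identity(emb_a, emb_b, threshold: float = 1.1) -> bool:
    """Verification: accept the pair if the normalized embeddings are close.

    The default threshold is a hypothetical value, not taken from the text.
    """
    return squared_distance(l2_normalize(emb_a), l2_normalize(emb_b)) < threshold


def nearest_identity(query, gallery: Dict[str, Sequence[float]]) -> str:
    """Recognition: return the gallery name whose embedding is closest."""
    q = l2_normalize(query)
    return min(gallery, key=lambda name: squared_distance(q, l2_normalize(gallery[name])))


# Toy usage with made-up 3-dimensional "embeddings".
gallery = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
print(same_identity([1.0, 0.0, 0.0], [0.9, 0.1, 0.0]))
print(nearest_identity([0.8, 0.2, 0.0], gallery))
```

Clustering would follow the same pattern, feeding the embeddings to any standard distance-based clustering algorithm.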
Our method uses a deep convolutional network trained to directly optimize the embedding itself, rather than an intermediate bottleneck layer as in previous deep learning approaches. To train, we use triplets of roughly aligned matching / non-matching face patches generated using a novel online triplet mining method. The benefit of our approach is much greater representational efficiency: we achieve state-of-the-art face recognition performance using only 128 bytes per face.
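The abstract does not spell out the loss, so the sketch below uses the commonly cited hinge form of a triplet loss: the anchor should be closer to the matching (positive) face than to the non-matching (negative) face by at least a margin. The margin value of 0.2, the function names, and the negative-selection rule are assumptions for illustration. In particular, `pick_semi_hard_negative` is only one plausible way to mine triplets within a batch and should not be read as the paper's exact procedure.

```python
from typing import Optional, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin: float = 0.2) -> float:
    """Hinge loss: zero once the negative is farther than the positive by `margin`."""
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def pick_semi_hard_negative(anchor, positive, candidates) -> Optional[int]:
    """Return the index of the closest candidate that is still farther from
    the anchor than the positive is, or None if no candidate qualifies.

    This selection rule is an assumption made for this sketch.
    """
    d_pos = squared_distance(anchor, positive)
    best_index, best_dist = None, float("inf")
    for i, cand in enumerate(candidates):
        d_neg = squared_distance(anchor, cand)
        if d_pos < d_neg < best_dist:
            best_index, best_dist = i, d_neg
    return best_index


# Toy usage with made-up 2-dimensional embeddings.
anchor, positive = [0.0, 0.0], [0.1, 0.0]
negatives = [[0.05, 0.0], [0.3, 0.0], [1.0, 0.0]]
idx = pick_semi_hard_negative(anchor, positive, negatives)
if idx is not None:
    print(idx, triplet_loss(anchor, positive, negatives[idx]))
```

Selecting negatives from the current batch, rather than from a precomputed list, is what makes the mining "online": the choice tracks the embedding as it changes during training.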
On the widely used Labeled Faces in the Wild (LFW) dataset, our system achieves a new record accuracy of 99.63%. On YouTube Faces DB it achieves 95.12%. Our system cuts the error rate in comparison to the best published result by 30% on both datasets.
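To make the "30%" claim concrete, the reported accuracies can be converted to error rates and the implied prior error rates back-calculated. This is a back-of-the-envelope sketch: the helper names are hypothetical, and since the 30% figure in the abstract is presumably rounded, the derived baselines are approximations rather than reported results.

```python
def error_rate(accuracy_pct: float) -> float:
    """Error rate in percent for a given accuracy in percent."""
    return 100.0 - accuracy_pct


def implied_baseline_error(new_error_pct: float, relative_reduction: float) -> float:
    """Prior error rate implied by a relative reduction (0.30 means 30%)."""
    if not 0.0 <= relative_reduction < 1.0:
        raise ValueError("relative_reduction must be in [0, 1)")
    return new_error_pct / (1.0 - relative_reduction)


# Accuracies taken from the abstract; the baselines are derived, not reported.
for name, acc in [("LFW", 99.63), ("YouTube Faces DB", 95.12)]:
    err = error_rate(acc)
    base = implied_baseline_error(err, 0.30)
    print(f"{name}: error {err:.2f}%, implied prior error {base:.2f}%")
```

Under these assumptions the implied prior error rates are roughly 0.53% on LFW and 6.97% on YouTube Faces DB.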
We also introduce the concept of harmonic embeddings, and a harmonic triplet loss, which describe different versions of face embeddings (produced by different networks) that are compatible with each other and allow direct comparison between them.
Citations
Deep Learning for Biometrics: A Survey
TL;DR: This article surveys 100 different approaches that explore deep learning for recognizing individuals using various biometric modalities and discusses how deep learning methods can benefit the field of biometrics and the potential gaps that deep learning approaches need to address for real-world biometric applications.
•Proceedings Article
Sub-center ArcFace: Boosting Face Recognition by Large-Scale Noisy Web Faces.
Jiankang Deng, Jia Guo, Tongliang Liu, Mingming Gong, Stefanos Zafeiriou
- 01 Jan 2020
TL;DR: This paper relaxes the intra-class constraint of ArcFace to improve robustness to label noise: it designs K sub-centers for each class, so that a training sample only needs to be close to any one of the K positive sub-centers instead of a single positive center.
•Posted Content
Targeting Ultimate Accuracy: Face Recognition via Deep Embedding
TL;DR: This paper proposes a two-stage approach combining a multi-patch deep CNN with deep metric learning, which extracts low-dimensional but highly discriminative features for face verification and recognition, showing a clear path to practical high-performance face recognition systems in the real world.
Point in, Box Out: Beyond Counting Persons in Crowds
Yuting Liu, Miaojing Shi, Qijun Zhao, Xiaofang Wang
- 15 Jun 2019
TL;DR: The authors propose a curriculum learning strategy that first trains the network on images with relatively accurate and easy pseudo ground truth, yielding a network that can simultaneously detect the size and location of human heads and count them in crowds.
WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition
Zheng Zhu, Guan Huang, Jiankang Deng, Yun Ye, Junjie Huang, Xinze Chen, Jiagang Zhu, Tian Yang, Jiwen Lu, Dalong Du, Jie Zhou
- 06 Mar 2021
TL;DR: The authors propose a new million-scale face benchmark containing noisy training data with 4M identities/260M faces (WebFace260M) and cleaned training data with 2M identities/42M faces (WebFace42M), as well as an elaborately designed time-constrained evaluation protocol.