FaceNet: A Unified Embedding for Face Recognition and Clustering

doi:10.1109/CVPR.2015.7298682

Open Access • Proceedings Article • 10.1109/CVPR.2015.7298682

Florian Schroff, +2 more

- 12 Mar 2015

- arXiv: Computer Vision and Pattern Recog...

14K

TL;DR: FaceNet uses a deep convolutional network trained to directly optimize the embedding itself, rather than an intermediate bottleneck layer as in previous deep learning approaches, and achieves state-of-the-art face recognition performance using only 128 bytes per face.

Abstract: Despite significant recent advances in the field of face recognition, implementing face verification and recognition efficiently at scale presents serious challenges to current approaches. In this paper we present a system, called FaceNet, that directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity. Once this space has been produced, tasks such as face recognition, verification and clustering can be easily implemented using standard techniques with FaceNet embeddings as feature vectors. Our method uses a deep convolutional network trained to directly optimize the embedding itself, rather than an intermediate bottleneck layer as in previous deep learning approaches. To train, we use triplets of roughly aligned matching / non-matching face patches generated using a novel online triplet mining method. The benefit of our approach is much greater representational efficiency: we achieve state-of-the-art face recognition performance using only 128 bytes per face. On the widely used Labeled Faces in the Wild (LFW) dataset, our system achieves a new record accuracy of 99.63%. On YouTube Faces DB it achieves 95.12%. Our system cuts the error rate in comparison to the best published result by 30% on both datasets. We also introduce the concept of harmonic embeddings, and a harmonic triplet loss, which describe different versions of face embeddings (produced by different networks) that are compatible with each other and allow direct comparison between them.
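
The abstract describes two ideas that a few lines of code may make more concrete: training on triplets of matching / non-matching faces, and comparing faces by distance in the embedding space. The following is a minimal sketch, not the authors' implementation. The function names, the hinge form of the loss, the `margin` of 0.2, and the `threshold` of 1.0 are all illustrative assumptions; the abstract states none of them. Embeddings are plain Python lists here, standing in for the network's output.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (assumed preprocessing step)."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [v / norm for v in vec]


def squared_distance(a, b):
    """Squared Euclidean distance between two embeddings of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss for one triplet.

    The loss is zero once the anchor is closer to the matching face
    (positive) than to the non-matching face (negative) by at least
    `margin`. The margin value is a hypothetical default.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(gap + margin, 0.0)


def same_identity(emb_a, emb_b, threshold=1.0):
    """Verification as a distance threshold; `threshold` is hypothetical."""
    return squared_distance(emb_a, emb_b) < threshold


if __name__ == "__main__":
    anchor = l2_normalize([1.0, 0.0])
    positive = l2_normalize([0.9, 0.1])
    negative = l2_normalize([0.0, 1.0])
    print(triplet_loss(anchor, positive, negative))
    print(same_identity(anchor, positive), same_identity(anchor, negative))
```

With embeddings in hand, verification reduces to one distance comparison, and recognition or clustering can reuse the same distance with standard methods such as nearest-neighbor search, which is the sense in which the abstract calls the embedding "unified".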

Citations

•Posted Content

Ensemble Distribution Distillation

Andrey Malinin, +2 more

- 30 Apr 2019

- arXiv: Machine Learning

TL;DR: In this article, a prior network is proposed to distill the distribution of the predictions from an ensemble, rather than just the average prediction, into a single model, which is useful for uncertainty estimation.

211

Journal Article•10.1109/TIM.2021.3088489

A Hybrid Generalization Network for Intelligent Fault Diagnosis of Rotating Machinery Under Unseen Working Conditions

Te Han, +2 more

- 11 Jun 2021

- IEEE Transactions on Instrumentation and...

TL;DR: This paper proposes a domain generalization-based hybrid diagnosis network that regularizes the discriminant structure of the deep network with both intrinsic and extrinsic generalization objectives, so that the diagnostic model learns robust features and generalizes to unseen domains.

211

•Proceedings Article•10.1109/CVPR.2019.01108

AdaCos: Adaptively Scaling Cosine Logits for Effectively Learning Deep Face Representations

Xiao Zhang, +4 more

- 01 Jun 2019

TL;DR: In this paper, the authors investigate the effects of two important hyperparameters of cosine-based softmax losses, the scale parameter and angular margin parameter, by analyzing how they modulate the predicted classification probability.

211

•Journal Article•10.1109/TMM.2020.3042080

Parameter Sharing Exploration and Hetero-center Triplet Loss for Visible-Thermal Person Re-Identification

Haijun Liu, +2 more

- 02 Dec 2020

- IEEE Transactions on Multimedia

TL;DR: This paper explores how many parameters a two-stream network should share, a question not well investigated in the existing literature, by splitting the ResNet50 model into a modality-specific feature extraction network and a modality-sharing feature embedding network.

211

Book Chapter•10.1007/978-3-030-58568-6_2

SemanticAdv: Generating Adversarial Examples via Attribute-conditional Image Editing

Haonan Qiu, +5 more

- 19 Jun 2019

TL;DR: An algorithm is proposed which leverages disentangled semantic factors to generate adversarial perturbation by altering controlled semantic attributes to fool the learner towards various "adversarial" targets.

211

References

•Proceedings Article•10.1109/CVPR.2015.7298594

Going deeper with convolutions

Christian Szegedy, +8 more

- 07 Jun 2015

TL;DR: Inception is a deep convolutional neural network architecture that set a new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).

56.6K

Journal Article•10.1038/323533A0

Learning representations by back-propagating errors

David E. Rumelhart, +2 more

- 09 Oct 1986

- Nature

TL;DR: Back-propagation repeatedly adjusts the weights of the connections in the network so as to minimize a measure of the difference between the actual output vector of the net and the desired output vector, which helps to represent important features of the task domain.

30.1K

•Book Chapter•10.1007/978-3-319-10590-1_53

Visualizing and Understanding Convolutional Networks

Matthew D. Zeiler, +1 more

- 06 Sep 2014

TL;DR: A novel visualization technique is introduced that gives insight into the function of intermediate feature layers and the operation of the classifier in large Convolutional Network models, used in a diagnostic role to find model architectures that outperform Krizhevsky et al on the ImageNet classification benchmark.

16.6K

Journal Article•10.1162/NECO.1989.1.4.541

Backpropagation applied to handwritten zip code recognition

Yann LeCun, +6 more

- 01 Dec 1989

- Neural Computation

TL;DR: This paper demonstrates how constraints from the task domain can be integrated into a backpropagation network through the architecture of the network, successfully applied to the recognition of handwritten zip code digits provided by the U.S. Postal Service.

12.5K

•Posted Content

Visualizing and Understanding Convolutional Networks

Matthew D. Zeiler, +1 more

- 12 Nov 2013

- arXiv: Computer Vision and Pattern Recog...

TL;DR: In this article, the authors introduce a novel visualization technique that gives insight into the function of intermediate feature layers and the operation of the classifier, and perform an ablation study to discover the performance contribution from different model layers.

9.7K

Related Papers (5)

Deep Residual Learning for Image Recognition

DeepFace: Closing the Gap to Human-Level Performance in Face Verification

ImageNet Classification with Deep Convolutional Neural Networks

Going deeper with convolutions

Very Deep Convolutional Networks for Large-Scale Image Recognition