Top 1499 papers published on the topic of unsupervised learning in 2018

Showing papers on "Unsupervised learning" published in 2018

Posted Content•

Representation Learning with Contrastive Predictive Coding

Aaron van den Oord¹, Yazhe Li¹, Oriol Vinyals¹•Institutions (1)

10 Jul 2018-arXiv: Learning

TL;DR: This work proposes a universal unsupervised learning approach to extract useful representations from high-dimensional data, which it calls Contrastive Predictive Coding, and demonstrates that the approach is able to learn useful representations achieving strong performance on four distinct domains: speech, images, text and reinforcement learning in 3D environments.

Abstract: While supervised learning has enabled great progress in many applications, unsupervised learning has not seen such widespread adoption, and remains an important and challenging endeavor for artificial intelligence. In this work, we propose a universal unsupervised learning approach to extract useful representations from high-dimensional data, which we call Contrastive Predictive Coding. The key insight of our model is to learn such representations by predicting the future in latent space by using powerful autoregressive models. We use a probabilistic contrastive loss which induces the latent space to capture information that is maximally useful to predict future samples. It also makes the model tractable by using negative sampling. While most prior work has focused on evaluating representations for a particular modality, we demonstrate that our approach is able to learn useful representations achieving strong performance on four distinct domains: speech, images, text and reinforcement learning in 3D environments.
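
The abstract describes the contrastive loss only in words. A minimal Python sketch of a softmax-style contrastive loss with negative sampling is given here for orientation; the dot-product scoring, the toy vectors, and the function names are illustrative assumptions and are not taken from the paper.

```python
import math

def contrastive_loss(positive_score, negative_scores):
    """Cross-entropy of identifying the positive among positive + negatives."""
    scores = [positive_score] + list(negative_scores)
    top = max(scores)  # subtract the max before exponentiating for stability
    log_normalizer = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_normalizer - positive_score

def dot(u, v):
    """Dot-product score between two latent vectors (an assumed scoring rule)."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical toy data: a predicted future latent, the true one, two negatives.
prediction = [1.0, 0.0]
true_future = [0.9, 0.1]
negatives = [[-1.0, 0.0], [0.0, -1.0]]

loss = contrastive_loss(dot(prediction, true_future),
                        [dot(prediction, n) for n in negatives])
```

The loss shrinks as the positive score rises relative to the negatives, which is what pushes the latent space to keep information useful for predicting future samples.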

8,203 citations

Proceedings Article•10.1109/CVPR.2018.00393•

Unsupervised Feature Learning via Non-parametric Instance Discrimination

Zhirong Wu¹, Yuanjun Xiong², Stella X. Yu¹, Dahua Lin²•Institutions (2)

University of California, Berkeley¹, The Chinese University of Hong Kong²

18 Jun 2018

TL;DR: This work formulates instance-level feature learning as a non-parametric classification problem, and uses noise-contrastive estimation to tackle the computational challenges imposed by the large number of instance classes.

Abstract: Neural net classifiers trained on data with annotated class labels can also capture apparent visual similarity among categories without being directed to do so. We study whether this observation can be extended beyond the conventional domain of supervised learning: Can we learn a good feature representation that captures apparent similarity among instances, instead of classes, by merely asking the feature to be discriminative of individual instances? We formulate this intuition as a non-parametric classification problem at the instance-level, and use noise-contrastive estimation to tackle the computational challenges imposed by the large number of instance classes. Our experimental results demonstrate that, under unsupervised learning settings, our method surpasses the state-of-the-art on ImageNet classification by a large margin. Our method is also remarkable for consistently improving test performance with more training data and better network architectures. By fine-tuning the learned feature, we further obtain competitive results for semi-supervised learning and object detection tasks. Our non-parametric model is highly compact: With 128 features per image, our method requires only 600MB storage for a million images, enabling fast nearest neighbour retrieval at the run time.
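
To make the instance-level classification idea concrete, the following Python sketch treats every stored feature as its own class and computes a softmax over similarities to a memory bank. The temperature value, the cosine similarity, and all names are assumptions for illustration; the sketch evaluates the full softmax, whereas the paper uses noise-contrastive estimation to avoid that cost.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def instance_probabilities(feature, memory_bank, temperature=0.1):
    """Softmax over cosine similarities between one feature and every stored instance."""
    f = l2_normalize(feature)
    logits = [sum(a * b for a, b in zip(f, l2_normalize(m))) / temperature
              for m in memory_bank]
    top = max(logits)  # stabilise the exponentials
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def nearest_instance(feature, memory_bank):
    """Index of the stored instance most similar to the query feature."""
    probs = instance_probabilities(feature, memory_bank)
    return max(range(len(probs)), key=probs.__getitem__)

# Hypothetical three-instance memory bank with 2-D features.
bank = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
match = nearest_instance([0.9, 0.1], bank)
```

The same memory bank serves both purposes mentioned in the abstract: it defines the classes during training and supports nearest-neighbour retrieval at run time.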

4,693 citations

Book Chapter•10.1007/978-3-030-01264-9_9•

Deep Clustering for Unsupervised Learning of Visual Features

Mathilde Caron¹, Piotr Bojanowski¹, Armand Joulin¹, Matthijs Douze¹•Institutions (1)

Facebook¹

8 Sep 2018

TL;DR: DeepCluster is a clustering method that jointly learns the parameters of a neural network and the cluster assignments of the resulting features, using the assignments as supervision to update the weights of the network.

Abstract: Clustering is a class of unsupervised learning methods that has been extensively applied and studied in computer vision. Little work has been done to adapt it to the end-to-end training of visual features on large-scale datasets. In this work, we present DeepCluster, a clustering method that jointly learns the parameters of a neural network and the cluster assignments of the resulting features. DeepCluster iteratively groups the features with a standard clustering algorithm, k-means, and uses the subsequent assignments as supervision to update the weights of the network. We apply DeepCluster to the unsupervised training of convolutional neural networks on large datasets like ImageNet and YFCC100M. The resulting model outperforms the current state of the art by a significant margin on all the standard benchmarks.
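
The clustering half of this alternation can be sketched in a few lines of Python. The sketch below runs plain k-means on fixed feature vectors and returns the assignments that would serve as pseudo-labels; the deterministic initialisation, the iteration count, and the function names are assumptions, and the network-update step is deliberately left out.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans_pseudo_labels(features, k, iterations=10):
    """Cluster features with k-means and return one pseudo-label per feature."""
    # Assumption: seed the centroids with the first k features for reproducibility.
    centroids = [list(f) for f in features[:k]]
    labels = [0] * len(features)
    for _ in range(iterations):
        # Assignment step: each feature goes to its closest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(f, centroids[c]))
                  for f in features]
        # Update step: move each non-empty centroid to the mean of its members.
        for c in range(k):
            members = [f for f, label in zip(features, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical features: two tight groups in 2-D.
features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
pseudo_labels = kmeans_pseudo_labels(features, k=2)
```

In the full method these labels would be used as classification targets for one round of network training, after which the features are recomputed and clustered again.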

2,861 citations

Proceedings Article•10.1109/CVPR.2018.00392•

Maximum Classifier Discrepancy for Unsupervised Domain Adaptation

Kuniaki Saito¹, Kohei Watanabe¹, Yoshitaka Ushiku¹, Tatsuya Harada¹•Institutions (1)

University of Tokyo¹

18 Jun 2018

TL;DR: MCD-DA aligns the source and target distributions by using task-specific decision boundaries: it maximizes the discrepancy between two classifiers' outputs to detect target samples that are far from the support of the source, and trains the feature generator to minimize that discrepancy.

Abstract: In this work, we present a method for unsupervised domain adaptation. Many adversarial learning methods train domain classifier networks to distinguish the features as either a source or target and train a feature generator network to mimic the discriminator. Two problems exist with these methods. First, the domain classifier only tries to distinguish the features as a source or target and thus does not consider task-specific decision boundaries between classes. Therefore, a trained generator can generate ambiguous features near class boundaries. Second, these methods aim to completely match the feature distributions between different domains, which is difficult because of each domain's characteristics. To solve these problems, we introduce a new approach that attempts to align distributions of source and target by utilizing the task-specific decision boundaries. We propose to maximize the discrepancy between two classifiers' outputs to detect target samples that are far from the support of the source. A feature generator learns to generate target features near the support to minimize the discrepancy. Our method outperforms other methods on several datasets of image classification and semantic segmentation. The codes are available at https://github.com/mil-tokyo/MCD_DA
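
The abstract does not define how the discrepancy between the two classifiers is measured. The Python sketch below assumes the mean absolute difference between their class-probability vectors, and uses it to flag target samples on which the classifiers disagree; the threshold and all names are hypothetical.

```python
def discrepancy(probs_a, probs_b):
    """Mean absolute difference between two class-probability vectors (assumed measure)."""
    return sum(abs(a - b) for a, b in zip(probs_a, probs_b)) / len(probs_a)

def flag_uncertain_targets(outputs_a, outputs_b, threshold):
    """Indices of target samples whose two classifier outputs disagree strongly."""
    return [i for i, (pa, pb) in enumerate(zip(outputs_a, outputs_b))
            if discrepancy(pa, pb) > threshold]

# Hypothetical outputs of two classifiers on two target samples.
outputs_a = [[0.9, 0.1], [0.8, 0.2]]
outputs_b = [[0.9, 0.1], [0.2, 0.8]]
flagged = flag_uncertain_targets(outputs_a, outputs_b, threshold=0.3)
```

In the adversarial scheme described above, the classifiers would be trained to increase this quantity on target data while the feature generator is trained to decrease it.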

2,532 citations

Proceedings Article•

Learning deep representations by mutual information estimation and maximization

R Devon Hjelm¹, Alex Fedorov², Samuel Lavoie-Marchildon³, Karan Grewal, Philip Bachman¹, Adam Trischler¹, Yoshua Bengio³•Institutions (3)

Microsoft¹, University of New Mexico², Université de Montréal³

20 Aug 2018

TL;DR: Deep InfoMax (DIM) learns representations by maximizing mutual information between an input and the output of a deep neural network encoder, and further controls characteristics of the representation by matching it to a prior distribution adversarially.

Abstract: This work investigates unsupervised learning of representations by maximizing mutual information between an input and the output of a deep neural network encoder. Importantly, we show that structure matters: incorporating knowledge about locality in the input into the objective can significantly improve a representation’s suitability for downstream tasks. We further control characteristics of the representation by matching to a prior distribution adversarially. Our method, which we call Deep InfoMax (DIM), outperforms a number of popular unsupervised learning methods and compares favorably with fully-supervised learning on several classification tasks with some standard architectures. DIM opens new avenues for unsupervised learning of representations and is an important step towards flexible formulations of representation learning objectives for specific end-goals.

2,512 citations

Posted Content•

Deep Clustering for Unsupervised Learning of Visual Features

Mathilde Caron¹, Piotr Bojanowski¹, Armand Joulin¹, Matthijs Douze¹•Institutions (1)

Facebook¹

15 Jul 2018-arXiv: Computer Vision and Pattern Recognition

TL;DR: This work presents DeepCluster, a clustering method that jointly learns the parameters of a neural network and the cluster assignments of the resulting features and outperforms the current state of the art by a significant margin on all the standard benchmarks.

Abstract: Clustering is a class of unsupervised learning methods that has been extensively applied and studied in computer vision. Little work has been done to adapt it to the end-to-end training of visual features on large scale datasets. In this work, we present DeepCluster, a clustering method that jointly learns the parameters of a neural network and the cluster assignments of the resulting features. DeepCluster iteratively groups the features with a standard clustering algorithm, k-means, and uses the subsequent assignments as supervision to update the weights of the network. We apply DeepCluster to the unsupervised training of convolutional neural networks on large datasets like ImageNet and YFCC100M. The resulting model outperforms the current state of the art by a significant margin on all the standard benchmarks.

1,858 citations

Posted Content•

Deep Graph Infomax.

Petar Veličković¹, William Fedus², William L. Hamilton³, Pietro Liò¹, Yoshua Bengio⁴, R Devon Hjelm⁵•Institutions (5)

University of Cambridge¹, Google², Stanford University³, Université de Montréal⁴, Microsoft⁵

27 Sep 2018-arXiv: Machine Learning

TL;DR: Deep Graph Infomax (DGI) is presented, a general approach for learning node representations within graph-structured data in an unsupervised manner that is readily applicable to both transductive and inductive learning setups.

Abstract: We present Deep Graph Infomax (DGI), a general approach for learning node representations within graph-structured data in an unsupervised manner. DGI relies on maximizing mutual information between patch representations and corresponding high-level summaries of graphs, both derived using established graph convolutional network architectures. The learnt patch representations summarize subgraphs centered around nodes of interest, and can thus be reused for downstream node-wise learning tasks. In contrast to most prior approaches to unsupervised learning with GCNs, DGI does not rely on random walk objectives, and is readily applicable to both transductive and inductive learning setups. We demonstrate competitive performance on a variety of node classification benchmarks, which at times even exceeds the performance of supervised learning.

1,628 citations

Journal Article•10.1093/NSR/NWX105•

An Overview of Multi-task Learning

Yu Zhang¹, Qiang Yang¹•Institutions (1)

Hong Kong University of Science and Technology¹

01 Jan 2018-National Science Review

TL;DR: Many areas, including computer vision, bioinformatics, health informatics, speech, natural language processing, web applications and ubiquitous computing, use MTL to improve the performance of the applications involved and some representative works are reviewed.

Abstract: As a promising area in machine learning, multi-task learning (MTL) aims to improve the performance of multiple related learning tasks by leveraging useful information among them. In this paper, we give an overview of MTL by first giving a definition of MTL. Then several different settings of MTL are introduced, including multi-task supervised learning, multi-task unsupervised learning, multi-task semi-supervised learning, multi-task active learning, multi-task reinforcement learning, multi-task online learning and multi-task multi-view learning. For each setting, representative MTL models are presented. In order to speed up the learning process, parallel and distributed MTL models are introduced. Many areas, including computer vision, bioinformatics, health informatics, speech, natural language processing, web applications and ubiquitous computing, use MTL to improve the performance of the applications involved and some representative works are reviewed. Finally, recent theoretical analyses for MTL are presented.

1,602 citations

Journal Article•10.1145/3234150•

A Survey on Deep Learning: Algorithms, Techniques, and Applications

Samira Pouyanfar¹, Saad Sadiq², Yilin Yan², Haiman Tian¹, Yudong Tao², Maria Presa Reyes¹, Mei-Ling Shyu², Shu-Ching Chen¹, S. Sitharama Iyengar¹•Institutions (2)

Florida International University¹, University of Miami²

18 Sep 2018-ACM Computing Surveys

TL;DR: A comprehensive review of historical and recent state-of-the-art approaches in visual, audio, and text processing; social network analysis; and natural language processing is presented, followed by the in-depth analysis on pivoting and groundbreaking advances in deep learning applications.

Abstract: The field of machine learning is witnessing its golden era as deep learning slowly becomes the leader in this domain. Deep learning uses multiple layers to represent the abstractions of data to build computational models. Some key enabler deep learning algorithms such as generative adversarial networks, convolutional neural networks, and model transfers have completely changed our perception of information processing. However, there exists an aperture of understanding behind this tremendously fast-paced domain, because it was never previously represented from a multiscope perspective. The lack of core understanding renders these powerful methods as black-box machines that inhibit development at a fundamental level. Moreover, deep learning has repeatedly been perceived as a silver bullet to all stumbling blocks in machine learning, which is far from the truth. This article presents a comprehensive review of historical and recent state-of-the-art approaches in visual, audio, and text processing; social network analysis; and natural language processing, followed by the in-depth analysis on pivoting and groundbreaking advances in deep learning applications. It was also undertaken to review the issues faced in deep learning such as unsupervised learning, black-box models, and online learning and to illustrate how these challenges can be transformed into prolific future research avenues.

1,299 citations

Proceedings Article•10.1109/CVPR.2018.00029•

FoldingNet: Point Cloud Auto-Encoder via Deep Grid Deformation

Yaoqing Yang¹, Chen Feng, Yiru Shen², Dong Tian³•Institutions (3)

Carnegie Mellon University¹, Mitsubishi Electric², Clemson University³

14 Dec 2018

TL;DR: FoldingNet is an end-to-end deep auto-encoder that addresses unsupervised learning challenges on point clouds; its folding-based decoder deforms a canonical 2D grid onto the underlying 3D object surface of a point cloud.

Abstract: Recent deep networks that directly handle points in a point set, e.g., PointNet, have been state-of-the-art for supervised learning tasks on point clouds such as classification and segmentation. In this work, a novel end-to-end deep auto-encoder is proposed to address unsupervised learning challenges on point clouds. On the encoder side, a graph-based enhancement is enforced to promote local structures on top of PointNet. Then, a novel folding-based decoder deforms a canonical 2D grid onto the underlying 3D object surface of a point cloud, achieving low reconstruction errors even for objects with delicate structures. The proposed decoder only uses about 7% parameters of a decoder with fully-connected neural networks, yet leads to a more discriminative representation that achieves higher linear SVM classification accuracy than the benchmark. In addition, the proposed decoder structure is shown, in theory, to be a generic architecture that is able to reconstruct an arbitrary point cloud from a 2D grid. Our code is available at http://www.merl.com/research/license#FoldingNet
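
The folding idea can be pictured with a toy Python sketch: build a canonical 2D grid and map each grid point to 3D. Here the mapping is a fixed, hand-written function that wraps the grid onto a unit cylinder; this stand-in is an assumption for illustration only, since in the paper the deformation is learned by the decoder from the encoder's output.

```python
import math

def make_grid(n):
    """n x n points evenly spaced on the unit square [0, 1] x [0, 1] (n >= 2)."""
    step = 1.0 / (n - 1)
    return [(i * step, j * step) for i in range(n) for j in range(n)]

def fold_to_cylinder(u, v):
    """Hypothetical fixed 'fold': wrap the grid around a unit cylinder of height 1."""
    angle = 2.0 * math.pi * u
    return (math.cos(angle), math.sin(angle), v)

def decode(grid, fold):
    """Apply a folding function to every 2D grid point to obtain a 3D point cloud."""
    return [fold(u, v) for u, v in grid]

grid = make_grid(4)
point_cloud = decode(grid, fold_to_cylinder)
```

Because every output point is a function of a 2D grid coordinate, the decoder needs far fewer parameters than one that emits each 3D point independently, which is consistent with the parameter saving reported in the abstract.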

1,296 citations

Proceedings Article•10.1109/CVPR.2018.00212•

GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose

Zhichao Yin, Jianping Shi

6 Mar 2018

TL;DR: GeoNet is a jointly unsupervised learning framework for monocular depth, optical flow and ego-motion estimation from videos; it adds an adaptive geometric consistency loss to increase robustness towards outliers and non-Lambertian regions, which resolves occlusions and texture ambiguities effectively.

Abstract: We propose GeoNet, a jointly unsupervised learning framework for monocular depth, optical flow and egomotion estimation from videos. The three components are coupled by the nature of 3D scene geometry, jointly learned by our framework in an end-to-end manner. Specifically, geometric relationships are extracted over the predictions of individual modules and then combined as an image reconstruction loss, reasoning about static and dynamic scene parts separately. Furthermore, we propose an adaptive geometric consistency loss to increase robustness towards outliers and non-Lambertian regions, which resolves occlusions and texture ambiguities effectively. Experimentation on the KITTI driving dataset reveals that our scheme achieves state-of-the-art results in all of the three tasks, performing better than previously unsupervised methods and comparably with supervised ones.

Posted Content•

Disentangling by Factorising.

Hyunjik Kim¹, Andriy Mnih¹•Institutions (1)

Google¹

16 Feb 2018-arXiv: Machine Learning

TL;DR: FactorVAE, a method that disentangles by encouraging the distribution of representations to be factorial and hence independent across the dimensions, is proposed and it improves upon $\beta$-VAE by providing a better trade-off between disentanglement and reconstruction quality.

Abstract: We define and address the problem of unsupervised learning of disentangled representations on data generated from independent factors of variation. We propose FactorVAE, a method that disentangles by encouraging the distribution of representations to be factorial and hence independent across the dimensions. We show that it improves upon $\beta$-VAE by providing a better trade-off between disentanglement and reconstruction quality. Moreover, we highlight the problems of a commonly used disentanglement metric and introduce a new metric that does not suffer from them.

Proceedings Article•10.1109/CVPR.2018.00594•

Unsupervised Learning of Depth and Ego-Motion from Monocular Video Using 3D Geometric Constraints

Reza Mahjourian¹, Martin Wicke¹, Anelia Angelova¹•Institutions (1)

University of Texas at Austin¹

15 Feb 2018

TL;DR: The main contribution is to explicitly consider the inferred 3D geometry of the whole scene and enforce consistency of the estimated 3D point clouds and ego-motion across consecutive frames; the approach outperforms the state-of-the-art for both depth and ego-motion.

Abstract: We present a novel approach for unsupervised learning of depth and ego-motion from monocular video. Unsupervised learning removes the need for separate supervisory signals (depth or ego-motion ground truth, or multi-view video). Prior work in unsupervised depth learning uses pixel-wise or gradient-based losses, which only consider pixels in small local neighborhoods. Our main contribution is to explicitly consider the inferred 3D geometry of the whole scene, and enforce consistency of the estimated 3D point clouds and ego-motion across consecutive frames. This is a challenging task and is solved by a novel (approximate) backpropagation algorithm for aligning 3D structures. We combine this novel 3D-based loss with 2D losses based on photometric quality of frame reconstructions using estimated depth and ego-motion from adjacent frames. We also incorporate validity masks to avoid penalizing areas in which no useful information exists. We test our algorithm on the KITTI dataset and on a video dataset captured on an uncalibrated mobile phone camera. Our proposed approach consistently improves depth estimates on both datasets, and outperforms the state-of-the-art for both depth and ego-motion. Because we only require a simple video, learning depth and ego-motion on large and varied datasets becomes possible. We demonstrate this by training on the low quality uncalibrated video dataset and evaluating on KITTI, ranking among top performing prior methods which are trained on KITTI itself.

Journal Article•10.1038/S41928-018-0023-2•

Fully memristive neural networks for pattern classification with unsupervised learning

Zhongrui Wang¹, Saumil Joshi¹, Sergey Savel'ev², Wenhao Song¹, Rivu Midya¹, Yunning Li¹, Mingyi Rao¹, Peng Yan¹, Shiva Asapu¹, Ye Zhuo¹, Hao Jiang¹, Peng Lin¹, Can Li¹, Jung Ho Yoon¹, Navnidhi K. Upadhyay¹, Jiaming Zhang³, Miao Hu³, John Paul Strachan³, Mark Barnell⁴, Qing Wu⁴, Huaqiang Wu⁵, R. Stanley Williams³, Qiangfei Xia¹, Jianhua Yang¹•Institutions (5)

University of Massachusetts Amherst¹, Loughborough University², Hewlett-Packard³, Air Force Research Laboratory⁴, Tsinghua University⁵

8 Feb 2018

TL;DR: It is shown that a diffusive memristor based on silver nanoparticles in a dielectric film can be used to create an artificial neuron with stochastic leaky integrate-and-fire dynamics and tunable integration time, which is determined by silver migration alone or its interaction with circuit capacitance.

Abstract: Neuromorphic computers comprised of artificial neurons and synapses could provide a more efficient approach to implementing neural network algorithms than traditional hardware. Recently, artificial neurons based on memristors have been developed, but with limited bio-realistic dynamics and no direct interaction with the artificial synapses in an integrated network. Here we show that a diffusive memristor based on silver nanoparticles in a dielectric film can be used to create an artificial neuron with stochastic leaky integrate-and-fire dynamics and tunable integration time, which is determined by silver migration alone or its interaction with circuit capacitance. We integrate these neurons with nonvolatile memristive synapses to build fully memristive artificial neural networks. With these integrated networks, we experimentally demonstrate unsupervised synaptic weight updating and pattern classification.
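
For readers unfamiliar with leaky integrate-and-fire dynamics, a textbook discrete-time version can be sketched in Python as follows. This is a generic, deterministic software model with made-up parameter values; it is not the device physics of the memristive neuron, which the abstract describes as stochastic.

```python
def simulate_lif(inputs, leak_time=10.0, threshold=1.0, dt=1.0):
    """Return the time steps at which a leaky integrate-and-fire neuron spikes."""
    potential = 0.0
    spike_times = []
    for step, current in enumerate(inputs):
        # Integrate the input while the stored potential leaks away.
        potential += dt * (current - potential / leak_time)
        if potential >= threshold:
            spike_times.append(step)
            potential = 0.0  # reset after firing
    return spike_times

# Hypothetical constant inputs: one strong enough to fire, one too weak.
strong_spikes = simulate_lif([0.2] * 50)
weak_spikes = simulate_lif([0.05] * 50)
```

The `leak_time` parameter plays the role of an integration time constant: with a weak input the leak balances the drive below threshold and the neuron never fires.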

Posted Content•

GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose

Zhichao Yin, Jianping Shi

06 Mar 2018-arXiv: Computer Vision and Pattern Recognition

TL;DR: An adaptive geometric consistency loss is proposed to increase robustness towards outliers and non-Lambertian regions, which resolves occlusions and texture ambiguities effectively and achieves state-of-the-art results in all of the three tasks, performing better than previously unsupervised methods and comparably with supervised ones.

Abstract: We propose GeoNet, a jointly unsupervised learning framework for monocular depth, optical flow and ego-motion estimation from videos. The three components are coupled by the nature of 3D scene geometry, jointly learned by our framework in an end-to-end manner. Specifically, geometric relationships are extracted over the predictions of individual modules and then combined as an image reconstruction loss, reasoning about static and dynamic scene parts separately. Furthermore, we propose an adaptive geometric consistency loss to increase robustness towards outliers and non-Lambertian regions, which resolves occlusions and texture ambiguities effectively. Experimentation on the KITTI driving dataset reveals that our scheme achieves state-of-the-art results in all of the three tasks, performing better than previously unsupervised methods and comparably with supervised ones.

Proceedings Article•10.18653/V1/N18-1049•

Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features

Matteo Pagliardini¹, Prakhar Gupta¹, Martin Jaggi¹•Institutions (1)

École Polytechnique Fédérale de Lausanne¹

1 May 2018

TL;DR: This work presents a simple but efficient unsupervised objective to train distributed representations of sentences, which outperforms the state-of-the-art unsupervised models on most benchmark tasks, highlighting the robustness of the produced general-purpose sentence embeddings.

Abstract: The recent tremendous success of unsupervised word embeddings in a multitude of applications raises the obvious question if similar methods could be derived to improve embeddings (i.e. semantic representations) of word sequences as well. We present a simple but efficient unsupervised objective to train distributed representations of sentences. Our method outperforms the state-of-the-art unsupervised models on most benchmark tasks, highlighting the robustness of the produced general-purpose sentence embeddings.

Proceedings Article•10.1109/CVPR.2018.00964•

An Unsupervised Learning Model for Deformable Medical Image Registration

Guha Balakrishnan¹, Amy Zhao¹, Mert R. Sabuncu², Adrian V. Dalca¹, John V. Guttag¹•Institutions (2)

Massachusetts Institute of Technology¹, Cornell University²

7 Feb 2018

TL;DR: The proposed method uses a spatial transform layer to reconstruct one image from another while imposing smoothness constraints on the registration field, and demonstrates registration accuracy comparable to state-of-the-art 3D image registration, while operating orders of magnitude faster in practice.

Abstract: We present a fast learning-based algorithm for deformable, pairwise 3D medical image registration. Current registration methods optimize an objective function independently for each pair of images, which can be time-consuming for large data. We define registration as a parametric function, and optimize its parameters given a set of images from a collection of interest. Given a new pair of scans, we can quickly compute a registration field by directly evaluating the function using the learned parameters. We model this function using a CNN, and use a spatial transform layer to reconstruct one image from another while imposing smoothness constraints on the registration field. The proposed method does not require supervised information such as ground truth registration fields or anatomical landmarks. We demonstrate registration accuracy comparable to state-of-the-art 3D image registration, while operating orders of magnitude faster in practice. Our method promises to significantly speed up medical image analysis and processing pipelines, while facilitating novel directions in learning-based registration and its applications. Our code is available at https://github.com/balakg/voxelmorph.
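
The two ingredients named in the abstract, warping one image towards another and penalising a non-smooth registration field, can be illustrated on 1D signals. The Python sketch below is a simplification under stated assumptions: mean squared error as the similarity term, squared differences of neighbouring displacements as the smoothness term, and a hypothetical weight `smooth_weight`. The CNN that would predict the displacement is omitted.

```python
def warp_1d(moving, displacement):
    """Resample `moving` at positions i + displacement[i] with linear interpolation."""
    n = len(moving)
    warped = []
    for i, d in enumerate(displacement):
        x = min(max(i + d, 0.0), n - 1.0)  # clamp the sample position to the signal
        lo = int(x)
        hi = min(lo + 1, n - 1)
        w = x - lo
        warped.append((1.0 - w) * moving[lo] + w * moving[hi])
    return warped

def registration_loss(fixed, moving, displacement, smooth_weight=0.01):
    """Similarity between fixed and warped signals plus a smoothness penalty."""
    warped = warp_1d(moving, displacement)
    similarity = sum((f - w) ** 2 for f, w in zip(fixed, warped)) / len(fixed)
    smoothness = sum((b - a) ** 2 for a, b in zip(displacement, displacement[1:]))
    return similarity + smooth_weight * smoothness

# Hypothetical signals: `fixed` is `moving` shifted one sample to the left.
moving = [0.0, 0.0, 1.0, 0.0, 0.0]
fixed = [0.0, 1.0, 0.0, 0.0, 0.0]
aligned_loss = registration_loss(fixed, moving, [1.0] * 5)
unaligned_loss = registration_loss(fixed, moving, [0.0] * 5)
```

No ground-truth displacement appears anywhere in the loss, which is why this style of training needs no supervised registration fields.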

Proceedings Article•10.1109/CVPR.2018.00242•

Transferable Joint Attribute-Identity Deep Learning for Unsupervised Person Re-identification

Jingya Wang¹, Xiatian Zhu, Shaogang Gong¹, Wei Li¹•Institutions (1)

Queen Mary University of London¹

18 Jun 2018

TL;DR: In this article, a Transferable Joint Attribute-Identity Deep Learning (TJ-AIDL) model is proposed to simultaneously learn an attribute-semantic and identity-discriminative feature representation space transferrable to any new (unseen) target domain for re-id tasks without the need for collecting new labelled training data from the target domain.

Abstract: Most existing person re-identification (re-id) methods require supervised model learning from a separate large set of pairwise labelled training data for every single camera pair. This significantly limits their scalability and usability in real-world large scale deployments with the need for performing re-id across many camera views. To address this scalability problem, we develop a novel deep learning method for transferring the labelled information of an existing dataset to a new unseen (unlabelled) target domain for person re-id without any supervised learning in the target domain. Specifically, we introduce a Transferable Joint Attribute-Identity Deep Learning (TJ-AIDL) for simultaneously learning an attribute-semantic and identity-discriminative feature representation space transferrable to any new (unseen) target domain for re-id tasks without the need for collecting new labelled training data from the target domain (i.e. unsupervised learning in the target domain). Extensive comparative evaluations validate the superiority of this new TJ-AIDL model for unsupervised person re-id over a wide range of state-of-the-art methods on four challenging benchmarks including VIPeR, PRID, Market-1501, and DukeMTMC-ReID.

Journal Article•10.1145/3243316•

Unsupervised Person Re-identification: Clustering and Fine-tuning

Hehe Fan¹, Liang Zheng¹, Chenggang Yan², Yi Yang¹•Institutions (2)

University of Technology, Sydney¹, Hangzhou Dianzi University²

10 Oct 2018-ACM Transactions on Multimedia Computing, Communications, and Applications

TL;DR: A progressive unsupervised learning (PUL) method to transfer pretrained deep representations to unseen domains and demonstrates that PUL outputs discriminative features that improve the re-ID accuracy.

Abstract: The superiority of deeply learned pedestrian representations has been reported in very recent literature of person re-identification (re-ID). In this article, we consider the more pragmatic issue of learning a deep feature with no or only a few labels. We propose a progressive unsupervised learning (PUL) method to transfer pretrained deep representations to unseen domains. Our method is easy to implement and can be viewed as an effective baseline for unsupervised re-ID feature learning. Specifically, PUL iterates between (1) pedestrian clustering and (2) fine-tuning of the convolutional neural network (CNN) to improve the initialization model trained on the irrelevant labeled dataset. Since the clustering results can be very noisy, we add a selection operation between the clustering and fine-tuning. At the beginning, when the model is weak, CNN is fine-tuned on a small amount of reliable examples that locate near to cluster centroids in the feature space. As the model becomes stronger, in subsequent iterations, more images are being adaptively selected as CNN training samples. Progressively, pedestrian clustering and the CNN model are improved simultaneously until algorithm convergence. This process is naturally formulated as self-paced learning. We then point out promising directions that may lead to further improvement. Extensive experiments on three large-scale re-ID datasets demonstrate that PUL outputs discriminative features that improve the re-ID accuracy. Our code has been released at https://github.com/hehefan/Unsupervised-Person-Re-identification-Clustering-and-Fine-tuning.
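
The selection operation between clustering and fine-tuning can be sketched in Python as a simple filter that keeps only samples close to their assigned centroid. Euclidean distance and the `max_distance` threshold are assumptions made for this illustration; the abstract does not specify the measure.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def select_reliable(features, labels, centroids, max_distance):
    """Indices of samples lying within max_distance of their assigned centroid."""
    return [i for i, (feature, label) in enumerate(zip(features, labels))
            if euclidean(feature, centroids[label]) <= max_distance]

# Hypothetical features, cluster assignments, and centroids.
features = [[0.0, 0.0], [0.2, 0.0], [3.0, 0.0], [1.0, 0.0]]
labels = [0, 0, 1, 0]
centroids = [[0.0, 0.0], [3.0, 0.0]]

strict = select_reliable(features, labels, centroids, max_distance=0.5)
relaxed = select_reliable(features, labels, centroids, max_distance=1.5)
```

Loosening the threshold over iterations admits more samples, mirroring the self-paced schedule in which a stronger model is trusted with a larger share of the clustered images.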

Journal Article•10.1109/MSP.2018.2825478•

IoT Security Techniques Based on Machine Learning: How Do IoT Devices Use AI to Enhance Security?

Liang Xiao, Xiaoyue Wan, Xiaozhen Lu, Yanyong Zhang¹, Di Wu²•Institutions (2)

Rutgers University¹, Sun Yat-sen University²

03 Sep 2018-IEEE Signal Processing Magazine

TL;DR: The attack model for IoT systems is investigated, and the IoT security solutions based on machine-learning (ML) techniques including supervised learning, unsupervised learning, and reinforcement learning (RL) are reviewed.

Abstract: The Internet of things (IoT), which integrates a variety of devices into networks to provide advanced and intelligent services, has to protect user privacy and address attacks such as spoofing attacks, denial of service (DoS) attacks, jamming, and eavesdropping. We investigate the attack model for IoT systems and review the IoT security solutions based on machine-learning (ML) techniques including supervised learning, unsupervised learning, and reinforcement learning (RL). ML-based IoT authentication, access control, secure offloading, and malware detection schemes to protect data privacy are the focus of this article. We also discuss the challenges that need to be addressed to implement these ML-based security schemes in practical IoT systems.

Proceedings Article•10.1109/CVPR.2018.00043•

Unsupervised Learning of Monocular Depth Estimation and Visual Odometry with Deep Feature Reconstruction

Huangying Zhan¹, Ravi Garg¹, Chamara Saroj Weerasekera¹, Kejie Li¹, Harsh Agarwal², Ian Reid¹•Institutions (2)

University of Adelaide¹, Indian Institute of Technology (BHU) Varanasi²

18 Jun 2018

TL;DR: The use of stereo sequences for learning depth and visual odometry enables the use of both spatial and temporal photometric warp error, and constrains the scene depth and camera motion to be in a common, real-world scale.

Abstract: Despite learning based methods showing promising results in single view depth estimation and visual odometry, most existing approaches treat the tasks in a supervised manner. Recent approaches to single view depth estimation explore the possibility of learning without full supervision via minimizing photometric error. In this paper, we explore the use of stereo sequences for learning depth and visual odometry. The use of stereo sequences enables the use of both spatial (between left-right pairs) and temporal (forward backward) photometric warp error, and constrains the scene depth and camera motion to be in a common, real-world scale. At test time our framework is able to estimate single view depth and two-view odometry from a monocular sequence. We also show how we can improve on a standard photometric warp loss by considering a warp of deep features. We show through extensive experiments that: (i) jointly training for single view depth and visual odometry improves depth prediction because of the additional constraint imposed on depths and achieves competitive results for visual odometry; (ii) deep feature-based warping loss improves upon simple photometric warp loss for both single view depth estimation and visual odometry. Our method outperforms existing learning based methods on the KITTI driving dataset in both tasks. The source code is available at https://github.com/Huangying-Zhan/Depth-VO-Feat.

Book Chapter•10.1007/978-3-030-01228-1_3•

DF-Net: Unsupervised Joint Learning of Depth and Flow Using Cross-Task Consistency

Yuliang Zou¹, Zelun Luo², Jia-Bin Huang¹•Institutions (2)

Virginia Tech¹, Stanford University²

8 Sep 2018

TL;DR: The core idea is that for rigid regions the authors can use the predicted scene depth and camera motion to synthesize 2D optical flow by backprojecting the induced 3D scene flow to impose a cross-task consistency loss.

Abstract: We present an unsupervised learning framework for simultaneously training single-view depth prediction and optical flow estimation models using unlabeled video sequences. Existing unsupervised methods often exploit brightness constancy and spatial smoothness priors to train depth or flow models. In this paper, we propose to leverage geometric consistency as additional supervisory signals. Our core idea is that for rigid regions we can use the predicted scene depth and camera motion to synthesize 2D optical flow by backprojecting the induced 3D scene flow. The discrepancy between the rigid flow (from depth prediction and camera motion) and the estimated flow (from optical flow model) allows us to impose a cross-task consistency loss. While all the networks are jointly optimized during training, they can be applied independently at test time. Extensive experiments demonstrate that our depth and flow models compare favorably with state-of-the-art unsupervised methods.
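The rigid flow described above follows from standard pinhole geometry, and a minimal sketch makes the steps concrete. It is not the DF-Net implementation: the function name `rigid_flow`, the intrinsics tuple `(fx, fy, cx, cy)`, and the convention that the rotation and translation map points from the first camera frame into the second are assumptions for illustration. A pixel is back-projected to 3D using its depth, moved by the camera motion, and re-projected; the displacement in pixels is the rigid flow that a learned flow estimate can be compared against.

```python
def rigid_flow(u, v, depth, intrinsics, rotation, translation):
    """Flow at pixel (u, v) induced by camera motion alone, for a static scene point.

    Assumptions: pinhole camera with intrinsics (fx, fy, cx, cy); `rotation`
    (3x3 nested lists) and `translation` map frame-1 coordinates to frame 2.
    """
    fx, fy, cx, cy = intrinsics
    # Back-project the pixel to a 3D point in the first camera frame.
    point = (depth * (u - cx) / fx, depth * (v - cy) / fy, depth)
    # Apply the relative camera motion.
    moved = [
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    ]
    # Project into the second image and take the pixel displacement.
    u2 = fx * moved[0] / moved[2] + cx
    v2 = fy * moved[1] / moved[2] + cy
    return (u2 - u, v2 - v)

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
K = (100.0, 100.0, 50.0, 40.0)  # hypothetical intrinsics

static_flow = rigid_flow(60.0, 40.0, 2.0, K, IDENTITY, (0.0, 0.0, 0.0))
shift_flow = rigid_flow(60.0, 40.0, 2.0, K, IDENTITY, (0.5, 0.0, 0.0))
```

With no motion the flow is zero. A pure sideways translation of 0.5 at depth 2.0 gives a horizontal flow of fx * 0.5 / 2.0 = 25 pixels, which shows why nearer points produce larger rigid flow.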

Proceedings Article•10.1109/CVPR.2018.00400•

Collaborative and Adversarial Network for Unsupervised Domain Adaptation

Weichen Zhang¹, Wanli Ouyang¹, Wen Li², Dong Xu¹•Institutions (2)

University of Sydney¹, ETH Zurich²

1 Jun 2018

TL;DR: A new unsupervised domain adaptation approach called Collaborative and Adversarial Network (CAN) is proposed through domain-collaborative and domain-adversarial training of neural networks and extended as Incremental CAN (iCAN), in which a set of pseudo-labelled target samples are selected based on the image classifier and the last domain classifier from the previous training epoch.

Abstract: In this paper, we propose a new unsupervised domain adaptation approach called Collaborative and Adversarial Network (CAN) through domain-collaborative and domain-adversarial training of neural networks. We add several domain classifiers on multiple CNN feature extraction blocks, in which each domain classifier is connected to the hidden representations from one block and one loss function is defined based on the hidden representation and the domain labels (e.g., source and target). We design a new loss function by integrating the losses from all blocks in order to learn domain informative representations from lower blocks through collaborative learning and learn domain uninformative representations from higher blocks through adversarial learning. We further extend our CAN method as Incremental CAN (iCAN), in which we iteratively select a set of pseudo-labelled target samples based on the image classifier and the last domain classifier from the previous training epoch and re-train our CAN model by using the enlarged training set. Comprehensive experiments on two benchmark datasets Office and ImageCLEF-DA clearly demonstrate the effectiveness of our newly proposed approaches CAN and iCAN for unsupervised domain adaptation.

Journal Article•10.1016/J.MEDIA.2018.06.001•

Disease prediction using graph convolutional networks: Application to Autism Spectrum Disorder and Alzheimer's disease.

Sarah Parisot, Sofia Ira Ktena¹, Enzo Ferrante², Matthew C. H. Lee¹, Ricardo Guerrero, Ben Glocker¹, Daniel Rueckert¹•Institutions (2)

Imperial College London¹, National Scientific and Technical Research Council²

02 Jun 2018-Medical Image Analysis

TL;DR: A thorough evaluation of a generic framework that leverages both imaging and non‐imaging information and can be used for brain analysis in large populations, which shows that the novel framework can improve over state‐of‐the‐art results on both databases.

Journal Article•10.1021/ACS.JCIM.7B00616•

Mol2vec: Unsupervised Machine Learning Approach with Chemical Intuition.

Sabrina Jaeger, Simone Fulle, Samo Turk

10 Jan 2018-Journal of Chemical Information and Modeling

TL;DR: Mol2vec can be easily combined with ProtVec, which employs the same Word2vec concept on protein sequences, resulting in a proteochemometric approach that is alignment-independent and thus can also be easily used for proteins with low sequence similarities.

Abstract: Inspired by natural language processing techniques, we here introduce Mol2vec, which is an unsupervised machine learning approach to learn vector representations of molecular substructures. Like the Word2vec models, where vectors of closely related words are in close proximity in the vector space, Mol2vec learns vector representations of molecular substructures that point in similar directions for chemically related substructures. Compounds can finally be encoded as vectors by summing the vectors of the individual substructures and, for instance, be fed into supervised machine learning approaches to predict compound properties. The underlying substructure vector embeddings are obtained by training an unsupervised machine learning approach on a so-called corpus of compounds that consists of all available chemical matter. The resulting Mol2vec model is pretrained once, yields dense vector representations, and overcomes drawbacks of common compound feature representations such as sparseness and bit collision...
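The compound encoding described above, summing the vectors of a compound's substructures, can be sketched in a few lines. This is an illustration only and does not use the Mol2vec package: the function name `compound_vector`, the substructure identifiers and the two-dimensional embedding values are hypothetical stand-ins for a pretrained embedding table.

```python
def compound_vector(substructure_ids, embeddings, dim):
    """Encode a compound as the sum of its substructure vectors.

    Assumption: `embeddings` maps a substructure identifier to a list of
    `dim` floats, as a pretrained model would provide.
    """
    vec = [0.0] * dim
    for sid in substructure_ids:
        for k, value in enumerate(embeddings[sid]):
            vec[k] += value
    return vec

# Hypothetical 2-D embeddings for two substructure identifiers.
toy_embeddings = {"sub_a": [1.0, 0.0], "sub_b": [0.5, 2.0]}

# A compound containing sub_a twice and sub_b once.
encoded = compound_vector(["sub_a", "sub_b", "sub_a"], toy_embeddings, dim=2)
```

Because the encoding is a plain sum, a substructure that occurs several times contributes several times, and the result has the same dimensionality regardless of compound size, so it can be passed directly to a supervised model.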

Journal Article•10.1016/J.PHYSREP.2019.03.001•

A high-bias, low-variance introduction to Machine Learning for physicists

Pankaj Mehta¹, Marin Bukov², Ching-Hao Wang¹, Alexandre G. R. Day¹, Charles C. Richardson¹, Charles K. Fisher, David J. Schwab³•Institutions (3)

Boston University¹, University of California, Berkeley², City University of New York³

23 Mar 2018-arXiv: Computational Physics

TL;DR: In this paper, the authors provide an introduction to the core concepts and tools of machine learning in a manner easily understood and intuitive to physicists and emphasize the many natural connections between ML and statistical physics.

Abstract: Machine Learning (ML) is one of the most exciting and dynamic areas of modern research and application. The purpose of this review is to provide an introduction to the core concepts and tools of machine learning in a manner easily understood and intuitive to physicists. The review begins by covering fundamental concepts in ML and modern statistics such as the bias-variance tradeoff, overfitting, regularization, generalization, and gradient descent before moving on to more advanced topics in both supervised and unsupervised learning. Topics covered in the review include ensemble models, deep learning and neural networks, clustering and data visualization, energy-based models (including MaxEnt models and Restricted Boltzmann Machines), and variational methods. Throughout, we emphasize the many natural connections between ML and statistical physics. A notable aspect of the review is the use of Python Jupyter notebooks to introduce modern ML/statistical packages to readers using physics-inspired datasets (the Ising Model and Monte-Carlo simulations of supersymmetric decays of proton-proton collisions). We conclude with an extended outlook discussing possible uses of machine learning for furthering our understanding of the physical world as well as open problems in ML where physicists may be able to contribute. (Notebooks are available at this https URL )

Book Chapter•10.1007/978-3-030-01240-3_40•

Pose-Normalized Image Generation for Person Re-identification

Xuelin Qian¹, Yanwei Fu¹, Tao Xiang², Wenxuan Wang¹, Jie Qiu³, Yang Wu³, Yu-Gang Jiang¹, Xiangyang Xue¹•Institutions (3)

Fudan University¹, Queen Mary University of London², Nara Institute of Science and Technology³

8 Sep 2018

TL;DR: PN-GAN is a generative adversarial network (GAN) designed specifically for pose normalization in re-id, hence the name pose-normalization GAN.

Abstract: Person Re-identification (re-id) faces two major challenges: the lack of cross-view paired training data and learning discriminative identity-sensitive and view-invariant features in the presence of large pose variations. In this work, we address both problems by proposing a novel deep person image generation model for synthesizing realistic person images conditional on the pose. The model is based on a generative adversarial network (GAN) designed specifically for pose normalization in re-id, thus termed pose-normalization GAN (PN-GAN). With the synthesized images, we can learn a new type of deep re-id features free of the influence of pose variations. We show that these features are complementary to features learned with the original images. Importantly, a more realistic unsupervised learning setting is considered in this work, and our model is shown to have the potential to be generalizable to a new re-id dataset without any fine-tuning. The codes will be released at https://github.com/naiq/PN_GAN.

Journal Article•10.1109/TCCN.2018.2881442•

A Very Brief Introduction to Machine Learning With Applications to Communication Systems

Osvaldo Simeone¹•Institutions (1)

King's College London¹

21 Nov 2018-IEEE Transactions on Cognitive Communications and Networking

TL;DR: In this paper, the authors provide a high-level introduction to the basics of supervised and unsupervised learning, exemplifying applications to communication networks by distinguishing tasks carried out at the edge and at the cloud segments of the network at different layers of the protocol stack, with an emphasis on the physical layer.

Abstract: Given the unprecedented availability of data and computing resources, there is widespread renewed interest in applying data-driven machine learning methods to problems for which the development of conventional engineering solutions is challenged by modeling or algorithmic deficiencies. This tutorial-style paper starts by addressing the questions of why and when such techniques can be useful. It then provides a high-level introduction to the basics of supervised and unsupervised learning. For both supervised and unsupervised learning, exemplifying applications to communication networks are discussed by distinguishing tasks carried out at the edge and at the cloud segments of the network at different layers of the protocol stack, with an emphasis on the physical layer.

Proceedings Article•10.1109/CVPRW.2018.00113•

Unsupervised Image Super-Resolution Using Cycle-in-Cycle Generative Adversarial Networks

Yuan Yuan¹, Siyuan Liu², Jiawei Zhang¹, Yongbing Zhang², Chao Dong¹, Liang Lin¹•Institutions (2)

SenseTime¹, Tsinghua University²

18 Jun 2018

TL;DR: This work proposes a Cycle-in-Cycle network structure with generative adversarial networks (GAN) as the basic component to tackle the single image super-resolution problem in the more general case where the low-/high-resolution pairs and the down-sampling process are unavailable.

Abstract: We consider the single image super-resolution problem in the more general case where the low-/high-resolution pairs and the down-sampling process are unavailable. Unlike the traditional super-resolution formulation, the low-resolution input is further degraded by noise and blurring. This complicated setting makes supervised learning and accurate kernel estimation impossible. To solve this problem, we resort to unsupervised learning without paired data, inspired by the recent successful image-to-image translation applications. With generative adversarial networks (GAN) as the basic component, we propose a Cycle-in-Cycle network structure to tackle the problem within three steps. First, the noisy and blurry input is mapped to a noise-free low-resolution space. Then the intermediate image is up-sampled with a pre-trained deep model. Finally, we fine-tune the two modules in an end-to-end manner to get the high-resolution output. Experiments on NTIRE2018 datasets demonstrate that the proposed unsupervised method achieves results comparable to the state-of-the-art supervised models.

Book Chapter•10.1007/978-3-030-01228-1_37•

PPF-FoldNet: Unsupervised Learning of Rotation Invariant 3D Local Descriptors

Haowen Deng¹, Tolga Birdal¹, Slobodan Ilic¹•Institutions (1)

Technische Universität München¹

8 Sep 2018

TL;DR: It is demonstrated that despite having six degree-of-freedom invariance and lack of training labels, PPF-FoldNet achieves state of the art results in standard benchmark datasets and outperforms its competitors when rotations and varying point densities are present.

Abstract: We present PPF-FoldNet for unsupervised learning of 3D local descriptors on pure point cloud geometry. Based on the folding-based auto-encoding of well known point pair features, PPF-FoldNet offers many desirable properties: it necessitates neither supervision, nor a sensitive local reference frame, benefits from point-set sparsity, is end-to-end, fast, and can extract powerful rotation invariant descriptors. Thanks to a novel feature visualization, its evolution can be monitored to provide interpretable insights. Our extensive experiments demonstrate that despite having six degree-of-freedom invariance and lack of training labels, our network achieves state of the art results in standard benchmark datasets and outperforms its competitors when rotations and varying point densities are present. PPF-FoldNet achieves 9% higher recall on standard benchmarks, 23% higher recall when rotations are introduced into the same datasets and finally, a margin of >35% is attained when point density is significantly decreased.
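The invariance of point pair features under rigid motion can be checked with a small sketch. It shows only the classic four-component feature for a pair of oriented points (one distance and three angles), not the PPF-FoldNet network; the helper names, the sample points and the choice of a rotation about the z axis plus a translation are assumptions made for illustration.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

def angle(a, b):
    # atan2 form is numerically stable and returns a value in [0, pi].
    return math.atan2(norm(cross(a, b)), dot(a, b))

def point_pair_feature(p1, n1, p2, n2):
    """(distance, angle(n1, d), angle(n2, d), angle(n1, n2)) with d = p2 - p1."""
    d = sub(p2, p1)
    return (norm(d), angle(n1, d), angle(n2, d), angle(n1, n2))

def rotate_z(v, theta):
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1], v[2])

def move_point(p, theta, shift):
    # Points are rotated and translated; normals are only rotated.
    return tuple(x + t for x, t in zip(rotate_z(p, theta), shift))

p1, n1 = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
p2, n2 = (1.0, 2.0, 0.5), (0.0, 1.0, 0.0)
theta, shift = 0.7, (3.0, -1.0, 2.0)

original = point_pair_feature(p1, n1, p2, n2)
moved = point_pair_feature(move_point(p1, theta, shift), rotate_z(n1, theta),
                           move_point(p2, theta, shift), rotate_z(n2, theta))
```

The two tuples agree up to floating-point error because the feature is built only from a distance and from angles between vectors, and a rigid motion preserves both.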
