Scispace (Formerly Typeset)
Showing papers on "Unsupervised learning" published in 2018
Posted Content•
Representation Learning with Contrastive Predictive Coding

Aaron van den Oord1, Yazhe Li1, Oriol Vinyals1•
Google1
10 Jul 2018-arXiv: Learning
TL;DR: This work proposes a universal unsupervised learning approach to extract useful representations from high-dimensional data, which it calls Contrastive Predictive Coding, and demonstrates that the approach is able to learn useful representations achieving strong performance on four distinct domains: speech, images, text and reinforcement learning in 3D environments.
Abstract: While supervised learning has enabled great progress in many applications, unsupervised learning has not seen such widespread adoption, and remains an important and challenging endeavor for artificial intelligence. In this work, we propose a universal unsupervised learning approach to extract useful representations from high-dimensional data, which we call Contrastive Predictive Coding. The key insight of our model is to learn such representations by predicting the future in latent space by using powerful autoregressive models. We use a probabilistic contrastive loss which induces the latent space to capture information that is maximally useful to predict future samples. It also makes the model tractable by using negative sampling. While most prior work has focused on evaluating representations for a particular modality, we demonstrate that our approach is able to learn useful representations achieving strong performance on four distinct domains: speech, images, text and reinforcement learning in 3D environments.

8,203 citations
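Note: the probabilistic contrastive loss described in the abstract above can be illustrated with a minimal sketch. It assumes the loss is the cross-entropy of identifying the one true future sample among a set that also contains negative samples, and it scores candidates with a plain dot product. The function names, the dot-product score, and the toy vectors are illustrative assumptions, not the paper's implementation.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors (assumed similarity score)."""
    return sum(a * b for a, b in zip(u, v))


def info_nce_loss(scores, positive_index=0):
    """Cross-entropy of picking the positive candidate among all candidates.

    scores: one real-valued similarity score per candidate.
    Computed as log-sum-exp(scores) minus the positive score, using the
    max-subtraction trick for numerical stability.
    """
    m = max(scores)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum_exp - scores[positive_index]


# Toy data (hypothetical): a predicted latent, the true future latent,
# and two negative samples drawn from elsewhere.
prediction = [1.0, 0.0]
positive = [1.0, 0.0]
negatives = [[0.0, 1.0], [-1.0, 0.0]]

scores = [dot(prediction, z) for z in [positive] + negatives]
loss = info_nce_loss(scores)
print(round(loss, 4))
```

When every candidate receives the same score, the loss equals the logarithm of the number of candidates, which is the value for a uniform guess; a prediction that scores the true sample higher than the negatives drives the loss below that value.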

Proceedings Article•10.1109/CVPR.2018.00393•
Unsupervised Feature Learning via Non-parametric Instance Discrimination

Zhirong Wu1, Yuanjun Xiong2, Stella X. Yu1, Dahua Lin2•
University of California, Berkeley1, The Chinese University of Hong Kong2
18 Jun 2018
TL;DR: This work formulates instance-level feature learning as a non-parametric classification problem in which the feature must be discriminative of individual instances, and uses noise-contrastive estimation to tackle the computational challenges imposed by the large number of instance classes.
Abstract: Neural net classifiers trained on data with annotated class labels can also capture apparent visual similarity among categories without being directed to do so. We study whether this observation can be extended beyond the conventional domain of supervised learning: Can we learn a good feature representation that captures apparent similarity among instances, instead of classes, by merely asking the feature to be discriminative of individual instances? We formulate this intuition as a non-parametric classification problem at the instance-level, and use noise-contrastive estimation to tackle the computational challenges imposed by the large number of instance classes. Our experimental results demonstrate that, under unsupervised learning settings, our method surpasses the state-of-the-art on ImageNet classification by a large margin. Our method is also remarkable for consistently improving test performance with more training data and better network architectures. By fine-tuning the learned feature, we further obtain competitive results for semi-supervised learning and object detection tasks. Our non-parametric model is highly compact: With 128 features per image, our method requires only 600MB storage for a million images, enabling fast nearest neighbour retrieval at the run time.

4,693 citations

Book Chapter•10.1007/978-3-030-01264-9_9•
Deep Clustering for Unsupervised Learning of Visual Features

Mathilde Caron1, Piotr Bojanowski1, Armand Joulin1, Matthijs Douze1•
Facebook1
8 Sep 2018
TL;DR: DeepCluster is a clustering method that jointly learns the parameters of a neural network and the cluster assignments of the resulting features: it iteratively groups the features with k-means and uses the subsequent assignments as supervision to update the weights of the network.
Abstract: Clustering is a class of unsupervised learning methods that has been extensively applied and studied in computer vision. Little work has been done to adapt it to the end-to-end training of visual features on large-scale datasets. In this work, we present DeepCluster, a clustering method that jointly learns the parameters of a neural network and the cluster assignments of the resulting features. DeepCluster iteratively groups the features with a standard clustering algorithm, k-means, and uses the subsequent assignments as supervision to update the weights of the network. We apply DeepCluster to the unsupervised training of convolutional neural networks on large datasets like ImageNet and YFCC100M. The resulting model outperforms the current state of the art by a significant margin on all the standard benchmarks.

2,861 citations
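Note: the clustering half of the alternation described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions: it runs plain Lloyd-style k-means on fixed toy feature vectors and returns the cluster assignments that would serve as pseudo-labels. The network update is omitted, and the function names, the deterministic initialization from the first k points, and the toy features are all hypothetical.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans_pseudo_labels(features, k, iterations=10):
    """Assign each feature vector to one of k clusters with k-means.

    Centroids start at the first k feature vectors (an assumption made
    here to keep the example deterministic).
    """
    centroids = [list(f) for f in features[:k]]
    labels = [0] * len(features)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every feature vector.
        labels = [
            min(range(k), key=lambda c: squared_distance(f, centroids[c]))
            for f in features
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [f for f, l in zip(features, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy features (hypothetical): two well-separated groups.
features = [[0.0, 0.0], [5.0, 5.0], [0.1, -0.1],
            [5.2, 4.9], [-0.2, 0.1], [4.8, 5.1]]
labels = kmeans_pseudo_labels(features, k=2)
print(labels)
```

In a full training loop these labels would be used as classification targets for one round of network updates, after which the features would be recomputed and clustered again.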

Proceedings Article•10.1109/CVPR.2018.00392•
Maximum Classifier Discrepancy for Unsupervised Domain Adaptation

Kuniaki Saito1, Kohei Watanabe1, Yoshitaka Ushiku1, Tatsuya Harada1•
University of Tokyo1
18 Jun 2018
TL;DR: MCD-DA aligns source and target distributions by utilizing task-specific decision boundaries: the discrepancy between two classifiers' outputs is maximized to detect target samples that are far from the support of the source, and a feature generator learns to minimize that discrepancy.
Abstract: In this work, we present a method for unsupervised domain adaptation. Many adversarial learning methods train domain classifier networks to distinguish the features as either a source or target and train a feature generator network to mimic the discriminator. Two problems exist with these methods. First, the domain classifier only tries to distinguish the features as a source or target and thus does not consider task-specific decision boundaries between classes. Therefore, a trained generator can generate ambiguous features near class boundaries. Second, these methods aim to completely match the feature distributions between different domains, which is difficult because of each domain's characteristics. To solve these problems, we introduce a new approach that attempts to align distributions of source and target by utilizing the task-specific decision boundaries. We propose to maximize the discrepancy between two classifiers' outputs to detect target samples that are far from the support of the source. A feature generator learns to generate target features near the support to minimize the discrepancy. Our method outperforms other methods on several datasets of image classification and semantic segmentation. The codes are available at https://github.com/mil-tokyo/MCD_DA

2,532 citations
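Note: the idea of measuring the discrepancy between two classifiers' outputs, described in the abstract above, can be illustrated with a minimal sketch. It assumes the discrepancy is the mean absolute difference between the two classifiers' softmax probabilities for one sample; that choice of distance, the function names, and the toy logits are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert logits to probabilities, subtracting the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classifier_discrepancy(logits_a, logits_b):
    """Mean absolute difference between two classifiers' class probabilities."""
    p = softmax(logits_a)
    q = softmax(logits_b)
    return sum(abs(a - b) for a, b in zip(p, q)) / len(p)


# Toy logits (hypothetical) from two classifiers on the same target sample.
agree = classifier_discrepancy([2.0, 0.0], [2.0, 0.0])
disagree = classifier_discrepancy([2.0, 0.0], [0.0, 2.0])
print(agree, round(disagree, 4))
```

A sample on which the two classifiers disagree yields a large value, which is the signal the abstract associates with target samples far from the source support; the classifiers would be trained to increase this value and the feature generator to decrease it.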

Proceedings Article•
Learning deep representations by mutual information estimation and maximization

R Devon Hjelm1, Alex Fedorov2, Samuel Lavoie-Marchildon3, Karan Grewal, Philip Bachman1, Adam Trischler1, Yoshua Bengio3 •
Microsoft1, University of New Mexico2, Université de Montréal3
20 Aug 2018
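TL;DR: Deep InfoMax (DIM) learns representations by maximizing mutual information between an input and the output of a deep neural network encoder, incorporates knowledge about locality in the input into the objective, and controls characteristics of the representation by matching to a prior distribution adversarially.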
Abstract: This work investigates unsupervised learning of representations by maximizing mutual information between an input and the output of a deep neural network encoder. Importantly, we show that structure matters: incorporating knowledge about locality in the input into the objective can significantly improve a representation’s suitability for downstream tasks. We further control characteristics of the representation by matching to a prior distribution adversarially. Our method, which we call Deep InfoMax (DIM), outperforms a number of popular unsupervised learning methods and compares favorably with fully-supervised learning on several classification tasks with some standard architectures. DIM opens new avenues for unsupervised learning of representations and is an important step towards flexible formulations of representation learning objectives for specific end-goals.

2,512 citations

Posted Content•
Deep Clustering for Unsupervised Learning of Visual Features

Mathilde Caron1, Piotr Bojanowski1, Armand Joulin1, Matthijs Douze1•
Facebook1
15 Jul 2018-arXiv: Computer Vision and Pattern Recognition
TL;DR: This work presents DeepCluster, a clustering method that jointly learns the parameters of a neural network and the cluster assignments of the resulting features and outperforms the current state of the art by a significant margin on all the standard benchmarks.
Abstract: Clustering is a class of unsupervised learning methods that has been extensively applied and studied in computer vision. Little work has been done to adapt it to the end-to-end training of visual features on large scale datasets. In this work, we present DeepCluster, a clustering method that jointly learns the parameters of a neural network and the cluster assignments of the resulting features. DeepCluster iteratively groups the features with a standard clustering algorithm, k-means, and uses the subsequent assignments as supervision to update the weights of the network. We apply DeepCluster to the unsupervised training of convolutional neural networks on large datasets like ImageNet and YFCC100M. The resulting model outperforms the current state of the art by a significant margin on all the standard benchmarks.

1,858 citations

Posted Content•
Deep Graph Infomax.

Petar Veličković1, William Fedus2, William L. Hamilton3, Pietro Liò1, Yoshua Bengio4, R Devon Hjelm5 •
University of Cambridge1, Google2, Stanford University3, Université de Montréal4, Microsoft5
27 Sep 2018-arXiv: Machine Learning
TL;DR: Deep Graph Infomax (DGI) is presented, a general approach for learning node representations within graph-structured data in an unsupervised manner that is readily applicable to both transductive and inductive learning setups.
Abstract: We present Deep Graph Infomax (DGI), a general approach for learning node representations within graph-structured data in an unsupervised manner. DGI relies on maximizing mutual information between patch representations and corresponding high-level summaries of graphs---both derived using established graph convolutional network architectures. The learnt patch representations summarize subgraphs centered around nodes of interest, and can thus be reused for downstream node-wise learning tasks. In contrast to most prior approaches to unsupervised learning with GCNs, DGI does not rely on random walk objectives, and is readily applicable to both transductive and inductive learning setups. We demonstrate competitive performance on a variety of node classification benchmarks, which at times even exceeds the performance of supervised learning.

1,628 citations

Journal Article•10.1093/NSR/NWX105•
An Overview of Multi-task Learning

Yu Zhang1, Qiang Yang1•
Hong Kong University of Science and Technology1
01 Jan 2018-National Science Review
TL;DR: Many areas, including computer vision, bioinformatics, health informatics, speech, natural language processing, web applications and ubiquitous computing, use MTL to improve the performance of the applications involved and some representative works are reviewed.
Abstract: As a promising area in machine learning, multi-task learning (MTL) aims to improve the performance of multiple related learning tasks by leveraging useful information among them. In this paper, we give an overview of MTL by first giving a definition of MTL. Then several different settings of MTL are introduced, including multi-task supervised learning, multi-task unsupervised learning, multi-task semi-supervised learning, multi-task active learning, multi-task reinforcement learning, multi-task online learning and multi-task multi-view learning. For each setting, representative MTL models are presented. In order to speed up the learning process, parallel and distributed MTL models are introduced. Many areas, including computer vision, bioinformatics, health informatics, speech, natural language processing, web applications and ubiquitous computing, use MTL to improve the performance of the applications involved and some representative works are reviewed. Finally, recent theoretical analyses for MTL are presented.

1,602 citations

Journal Article•10.1145/3234150•
A Survey on Deep Learning: Algorithms, Techniques, and Applications

Samira Pouyanfar1, Saad Sadiq2, Yilin Yan2, Haiman Tian1, Yudong Tao2, Maria Presa Reyes1, Mei-Ling Shyu2, Shu-Ching Chen1, S. Sitharama Iyengar1 •
Florida International University1, University of Miami2
18 Sep 2018-ACM Computing Surveys
TL;DR: A comprehensive review of historical and recent state-of-the-art approaches in visual, audio, and text processing; social network analysis; and natural language processing is presented, followed by the in-depth analysis on pivoting and groundbreaking advances in deep learning applications.
Abstract: The field of machine learning is witnessing its golden era as deep learning slowly becomes the leader in this domain. Deep learning uses multiple layers to represent the abstractions of data to build computational models. Some key enabler deep learning algorithms such as generative adversarial networks, convolutional neural networks, and model transfers have completely changed our perception of information processing. However, there exists an aperture of understanding behind this tremendously fast-paced domain, because it was never previously represented from a multiscope perspective. The lack of core understanding renders these powerful methods as black-box machines that inhibit development at a fundamental level. Moreover, deep learning has repeatedly been perceived as a silver bullet to all stumbling blocks in machine learning, which is far from the truth. This article presents a comprehensive review of historical and recent state-of-the-art approaches in visual, audio, and text processing; social network analysis; and natural language processing, followed by the in-depth analysis on pivoting and groundbreaking advances in deep learning applications. It was also undertaken to review the issues faced in deep learning such as unsupervised learning, black-box models, and online learning and to illustrate how these challenges can be transformed into prolific future research avenues.

1,299 citations

Proceedings Article•10.1109/CVPR.2018.00029•
FoldingNet: Point Cloud Auto-Encoder via Deep Grid Deformation

Yaoqing Yang1, Chen Feng, Yiru Shen2, Dong Tian3•
Carnegie Mellon University1, Mitsubishi Electric2, Clemson University3
14 Dec 2018
TL;DR: FoldingNet is an end-to-end deep auto-encoder that addresses unsupervised learning challenges on point clouds; its folding-based decoder deforms a canonical 2D grid onto the underlying 3D object surface of a point cloud.
Abstract: Recent deep networks that directly handle points in a point set, e.g., PointNet, have been state-of-the-art for supervised learning tasks on point clouds such as classification and segmentation. In this work, a novel end-to-end deep auto-encoder is proposed to address unsupervised learning challenges on point clouds. On the encoder side, a graph-based enhancement is enforced to promote local structures on top of PointNet. Then, a novel folding-based decoder deforms a canonical 2D grid onto the underlying 3D object surface of a point cloud, achieving low reconstruction errors even for objects with delicate structures. The proposed decoder only uses about 7% parameters of a decoder with fully-connected neural networks, yet leads to a more discriminative representation that achieves higher linear SVM classification accuracy than the benchmark. In addition, the proposed decoder structure is shown, in theory, to be a generic architecture that is able to reconstruct an arbitrary point cloud from a 2D grid. Our code is available at http://www.merl.com/research/license#FoldingNet

1,296 citations

Proceedings Article•10.1109/CVPR.2018.00212•
GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose

Zhichao Yin, Jianping Shi
6 Mar 2018
TL;DR: GeoNet is a jointly unsupervised learning framework for monocular depth, optical flow and ego-motion estimation from videos; it uses an adaptive geometric consistency loss to increase robustness towards outliers and non-Lambertian regions, which resolves occlusions and texture ambiguities effectively.
Abstract: We propose GeoNet, a jointly unsupervised learning framework for monocular depth, optical flow and ego-motion estimation from videos. The three components are coupled by the nature of 3D scene geometry, jointly learned by our framework in an end-to-end manner. Specifically, geometric relationships are extracted over the predictions of individual modules and then combined as an image reconstruction loss, reasoning about static and dynamic scene parts separately. Furthermore, we propose an adaptive geometric consistency loss to increase robustness towards outliers and non-Lambertian regions, which resolves occlusions and texture ambiguities effectively. Experimentation on the KITTI driving dataset reveals that our scheme achieves state-of-the-art results in all of the three tasks, performing better than previous unsupervised methods and comparably with supervised ones.
Posted Content•
Disentangling by Factorising.

Hyunjik Kim1, Andriy Mnih1•
Google1
16 Feb 2018-arXiv: Machine Learning
TL;DR: FactorVAE, a method that disentangles by encouraging the distribution of representations to be factorial and hence independent across the dimensions, is proposed and it improves upon $\beta$-VAE by providing a better trade-off between disentanglement and reconstruction quality.
Abstract: We define and address the problem of unsupervised learning of disentangled representations on data generated from independent factors of variation. We propose FactorVAE, a method that disentangles by encouraging the distribution of representations to be factorial and hence independent across the dimensions. We show that it improves upon $\beta$-VAE by providing a better trade-off between disentanglement and reconstruction quality. Moreover, we highlight the problems of a commonly used disentanglement metric and introduce a new metric that does not suffer from them.
Proceedings Article•10.1109/CVPR.2018.00594•
Unsupervised Learning of Depth and Ego-Motion from Monocular Video Using 3D Geometric Constraints

Reza Mahjourian1, Martin Wicke1, Anelia Angelova1•
University of Texas at Austin1
15 Feb 2018
TL;DR: The main contribution is to explicitly consider the inferred 3D geometry of the whole scene and to enforce consistency of the estimated 3D point clouds and ego-motion across consecutive frames; the approach outperforms the state-of-the-art for both depth and ego-motion.
Abstract: We present a novel approach for unsupervised learning of depth and ego-motion from monocular video. Unsupervised learning removes the need for separate supervisory signals (depth or ego-motion ground truth, or multi-view video). Prior work in unsupervised depth learning uses pixel-wise or gradient-based losses, which only consider pixels in small local neighborhoods. Our main contribution is to explicitly consider the inferred 3D geometry of the whole scene, and enforce consistency of the estimated 3D point clouds and ego-motion across consecutive frames. This is a challenging task and is solved by a novel (approximate) backpropagation algorithm for aligning 3D structures. We combine this novel 3D-based loss with 2D losses based on photometric quality of frame reconstructions using estimated depth and ego-motion from adjacent frames. We also incorporate validity masks to avoid penalizing areas in which no useful information exists. We test our algorithm on the KITTI dataset and on a video dataset captured on an uncalibrated mobile phone camera. Our proposed approach consistently improves depth estimates on both datasets, and outperforms the state-of-the-art for both depth and ego-motion. Because we only require a simple video, learning depth and ego-motion on large and varied datasets becomes possible. We demonstrate this by training on the low quality uncalibrated video dataset and evaluating on KITTI, ranking among top performing prior methods which are trained on KITTI itself.
Journal Article•10.1038/S41928-018-0023-2•
Fully memristive neural networks for pattern classification with unsupervised learning

Zhongrui Wang1, Saumil Joshi1, Sergey Savel'ev2, Wenhao Song1, Rivu Midya1, Yunning Li1, Mingyi Rao1, Peng Yan1, Shiva Asapu1, Ye Zhuo1, Hao Jiang1, Peng Lin1, Can Li1, Jung Ho Yoon1, Navnidhi K. Upadhyay1, Jiaming Zhang3, Miao Hu3, John Paul Strachan3, Mark Barnell4, Qing Wu4, Huaqiang Wu5, R. Stanley Williams3, Qiangfei Xia1, Jianhua Yang1 •
University of Massachusetts Amherst1, Loughborough University2, Hewlett-Packard3, Air Force Research Laboratory4, Tsinghua University5
8 Feb 2018
TL;DR: It is shown that a diffusive memristor based on silver nanoparticles in a dielectric film can be used to create an artificial neuron with stochastic leaky integrate-and-fire dynamics and tunable integration time, which is determined by silver migration alone or its interaction with circuit capacitance.
Abstract: Neuromorphic computers comprised of artificial neurons and synapses could provide a more efficient approach to implementing neural network algorithms than traditional hardware. Recently, artificial neurons based on memristors have been developed, but with limited bio-realistic dynamics and no direct interaction with the artificial synapses in an integrated network. Here we show that a diffusive memristor based on silver nanoparticles in a dielectric film can be used to create an artificial neuron with stochastic leaky integrate-and-fire dynamics and tunable integration time, which is determined by silver migration alone or its interaction with circuit capacitance. We integrate these neurons with nonvolatile memristive synapses to build fully memristive artificial neural networks. With these integrated networks, we experimentally demonstrate unsupervised synaptic weight updating and pattern classification.
Posted Content•
GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose

Zhichao Yin, Jianping Shi
06 Mar 2018-arXiv: Computer Vision and Pattern Recognition
TL;DR: An adaptive geometric consistency loss is proposed to increase robustness towards outliers and non-Lambertian regions, which resolves occlusions and texture ambiguities effectively; the scheme achieves state-of-the-art results in all three tasks, performing better than previous unsupervised methods and comparably with supervised ones.
Abstract: We propose GeoNet, a jointly unsupervised learning framework for monocular depth, optical flow and ego-motion estimation from videos. The three components are coupled by the nature of 3D scene geometry, jointly learned by our framework in an end-to-end manner. Specifically, geometric relationships are extracted over the predictions of individual modules and then combined as an image reconstruction loss, reasoning about static and dynamic scene parts separately. Furthermore, we propose an adaptive geometric consistency loss to increase robustness towards outliers and non-Lambertian regions, which resolves occlusions and texture ambiguities effectively. Experimentation on the KITTI driving dataset reveals that our scheme achieves state-of-the-art results in all of the three tasks, performing better than previous unsupervised methods and comparably with supervised ones.
Proceedings Article•10.18653/V1/N18-1049•
Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features

Matteo Pagliardini1, Prakhar Gupta1, Martin Jaggi1•
École Polytechnique Fédérale de Lausanne1
1 May 2018
TL;DR: This work presents a simple but efficient unsupervised objective to train distributed representations of sentences, which outperforms the state-of-the-art unsupervised models on most benchmark tasks, highlighting the robustness of the produced general-purpose sentence embeddings.
Abstract: The recent tremendous success of unsupervised word embeddings in a multitude of applications raises the obvious question if similar methods could be derived to improve embeddings (i.e. semantic representations) of word sequences as well. We present a simple but efficient unsupervised objective to train distributed representations of sentences. Our method outperforms the state-of-the-art unsupervised models on most benchmark tasks, highlighting the robustness of the produced general-purpose sentence embeddings.
Proceedings Article•10.1109/CVPR.2018.00964•
An Unsupervised Learning Model for Deformable Medical Image Registration

Guha Balakrishnan1, Amy Zhao1, Mert R. Sabuncu2, Adrian V. Dalca1, John V. Guttag1 •
Massachusetts Institute of Technology1, Cornell University2
7 Feb 2018
TL;DR: The proposed method uses a spatial transform layer to reconstruct one image from another while imposing smoothness constraints on the registration field, and demonstrates registration accuracy comparable to state-of-the-art 3D image registration, while operating orders of magnitude faster in practice.
Abstract: We present a fast learning-based algorithm for deformable, pairwise 3D medical image registration. Current registration methods optimize an objective function independently for each pair of images, which can be time-consuming for large data. We define registration as a parametric function, and optimize its parameters given a set of images from a collection of interest. Given a new pair of scans, we can quickly compute a registration field by directly evaluating the function using the learned parameters. We model this function using a CNN, and use a spatial transform layer to reconstruct one image from another while imposing smoothness constraints on the registration field. The proposed method does not require supervised information such as ground truth registration fields or anatomical landmarks. We demonstrate registration accuracy comparable to state-of-the-art 3D image registration, while operating orders of magnitude faster in practice. Our method promises to significantly speed up medical image analysis and processing pipelines, while facilitating novel directions in learning-based registration and its applications. Our code is available at https://github.com/balakg/voxelmorph.
Proceedings Article•10.1109/CVPR.2018.00242•
Transferable Joint Attribute-Identity Deep Learning for Unsupervised Person Re-identification

Jingya Wang1, Xiatian Zhu, Shaogang Gong1, Wei Li1•
Queen Mary University of London1
18 Jun 2018
TL;DR: In this article, a Transferable Joint Attribute-Identity Deep Learning (TJ-AIDL) model is proposed to simultaneously learn an attribute-semantic and identity discriminative feature representation space transferrable to any new (unseen) target domain for re-id tasks without the need for collecting new labelled training data from the target domain.
Abstract: Most existing person re-identification (re-id) methods require supervised model learning from a separate large set of pairwise labelled training data for every single camera pair. This significantly limits their scalability and usability in real-world large scale deployments with the need for performing re-id across many camera views. To address this scalability problem, we develop a novel deep learning method for transferring the labelled information of an existing dataset to a new unseen (unlabelled) target domain for person re-id without any supervised learning in the target domain. Specifically, we introduce a Transferable Joint Attribute-Identity Deep Learning (TJ-AIDL) model for simultaneously learning an attribute-semantic and identity-discriminative feature representation space transferrable to any new (unseen) target domain for re-id tasks without the need for collecting new labelled training data from the target domain (i.e. unsupervised learning in the target domain). Extensive comparative evaluations validate the superiority of this new TJ-AIDL model for unsupervised person re-id over a wide range of state-of-the-art methods on four challenging benchmarks including VIPeR, PRID, Market-1501, and DukeMTMC-ReID.
Journal Article•10.1145/3243316•
Unsupervised Person Re-identification: Clustering and Fine-tuning

Hehe Fan1, Liang Zheng1, Chenggang Yan2, Yi Yang1•
University of Technology, Sydney1, Hangzhou Dianzi University2
10 Oct 2018-ACM Transactions on Multimedia Computing, Communications, and Applications
TL;DR: A progressive unsupervised learning (PUL) method to transfer pretrained deep representations to unseen domains and demonstrates that PUL outputs discriminative features that improve the re-ID accuracy.
Abstract: The superiority of deeply learned pedestrian representations has been reported in very recent literature of person re-identification (re-ID). In this article, we consider the more pragmatic issue of learning a deep feature with no or only a few labels. We propose a progressive unsupervised learning (PUL) method to transfer pretrained deep representations to unseen domains. Our method is easy to implement and can be viewed as an effective baseline for unsupervised re-ID feature learning. Specifically, PUL iterates between (1) pedestrian clustering and (2) fine-tuning of the convolutional neural network (CNN) to improve the initialization model trained on the irrelevant labeled dataset. Since the clustering results can be very noisy, we add a selection operation between the clustering and fine-tuning. At the beginning, when the model is weak, the CNN is fine-tuned on a small number of reliable examples located near cluster centroids in the feature space. As the model becomes stronger, in subsequent iterations, more images are adaptively selected as CNN training samples. Progressively, pedestrian clustering and the CNN model are improved simultaneously until algorithm convergence. This process is naturally formulated as self-paced learning. We then point out promising directions that may lead to further improvement. Extensive experiments on three large-scale re-ID datasets demonstrate that PUL outputs discriminative features that improve the re-ID accuracy. Our code has been released at https://github.com/hehefan/Unsupervised-Person-Re-identification-Clustering-and-Fine-tuning.
Journal Article•10.1109/MSP.2018.2825478•
IoT Security Techniques Based on Machine Learning: How Do IoT Devices Use AI to Enhance Security?

Liang Xiao, Xiaoyue Wan, Xiaozhen Lu, Yanyong Zhang1, Di Wu2 •
Rutgers University1, Sun Yat-sen University2
03 Sep 2018-IEEE Signal Processing Magazine
TL;DR: The attack model for IoT systems is investigated, and the IoT security solutions based on machine-learning (ML) techniques including supervised learning, unsupervised learning, and reinforcement learning (RL) are reviewed.
Abstract: The Internet of things (IoT), which integrates a variety of devices into networks to provide advanced and intelligent services, has to protect user privacy and address attacks such as spoofing attacks, denial of service (DoS) attacks, jamming, and eavesdropping. We investigate the attack model for IoT systems and review the IoT security solutions based on machine-learning (ML) techniques including supervised learning, unsupervised learning, and reinforcement learning (RL). ML-based IoT authentication, access control, secure offloading, and malware detection schemes to protect data privacy are the focus of this article. We also discuss the challenges that need to be addressed to implement these ML-based security schemes in practical IoT systems.
Proceedings Article•10.1109/CVPR.2018.00043•
Unsupervised Learning of Monocular Depth Estimation and Visual Odometry with Deep Feature Reconstruction

Huangying Zhan1, Ravi Garg1, Chamara Saroj Weerasekera1, Kejie Li1, Harsh Agarwal2, Ian Reid1 •
University of Adelaide1, Indian Institute of Technology (BHU) Varanasi2
18 Jun 2018
TL;DR: The use of stereo sequences for learning depth and visual odometry enables the use of both spatial and temporal photometric warp error, and constrains the scene depth and camera motion to be in a common, real-world scale.
Abstract: Despite learning based methods showing promising results in single view depth estimation and visual odometry, most existing approaches treat the tasks in a supervised manner. Recent approaches to single view depth estimation explore the possibility of learning without full supervision via minimizing photometric error. In this paper, we explore the use of stereo sequences for learning depth and visual odometry. The use of stereo sequences enables the use of both spatial (between left-right pairs) and temporal (forward backward) photometric warp error, and constrains the scene depth and camera motion to be in a common, real-world scale. At test time our framework is able to estimate single view depth and two-view odometry from a monocular sequence. We also show how we can improve on a standard photometric warp loss by considering a warp of deep features. We show through extensive experiments that: (i) jointly training for single view depth and visual odometry improves depth prediction because of the additional constraint imposed on depths and achieves competitive results for visual odometry; (ii) deep feature-based warping loss improves upon simple photometric warp loss for both single view depth estimation and visual odometry. Our method outperforms existing learning based methods on the KITTI driving dataset in both tasks. The source code is available at https://github.com/Huangying-Zhan/Depth-VO-Feat.
Book Chapter•10.1007/978-3-030-01228-1_3•
DF-Net: Unsupervised Joint Learning of Depth and Flow Using Cross-Task Consistency

Yuliang Zou1, Zelun Luo2, Jia-Bin Huang1•
Virginia Tech1, Stanford University2
8 Sep 2018
TL;DR: The core idea is that for rigid regions the authors can use the predicted scene depth and camera motion to synthesize 2D optical flow by backprojecting the induced 3D scene flow to impose a cross-task consistency loss.
Abstract: We present an unsupervised learning framework for simultaneously training single-view depth prediction and optical flow estimation models using unlabeled video sequences. Existing unsupervised methods often exploit brightness constancy and spatial smoothness priors to train depth or flow models. In this paper, we propose to leverage geometric consistency as additional supervisory signals. Our core idea is that for rigid regions we can use the predicted scene depth and camera motion to synthesize 2D optical flow by backprojecting the induced 3D scene flow. The discrepancy between the rigid flow (from depth prediction and camera motion) and the estimated flow (from optical flow model) allows us to impose a cross-task consistency loss. While all the networks are jointly optimized during training, they can be applied independently at test time. Extensive experiments demonstrate that our depth and flow models compare favorably with state-of-the-art unsupervised methods.
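The rigid-flow synthesis and the consistency term can be illustrated with a standard pinhole-camera sketch. This is a hedged example under stated assumptions and not the DF-Net implementation: the names `rigid_flow` and `consistency_loss`, the intrinsics matrix `K`, the pose `(R, t)`, the L1 penalty, and the `rigid_mask` argument are all introduced here for illustration.

```python
import numpy as np

def rigid_flow(depth, K, R, t):
    """Synthesize 2D flow induced by camera motion over a static scene.

    Assumes a pinhole camera with intrinsics K and a relative pose (R, t)
    mapping frame-1 coordinates into frame 2.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
    rays = pix @ np.linalg.inv(K).T       # back-project pixels to rays
    pts = rays * depth[..., None]         # 3D points in frame 1
    pts2 = pts @ R.T + t                  # same points in frame 2
    proj = pts2 @ K.T                     # project into image 2
    uv2 = proj[..., :2] / proj[..., 2:3]
    return uv2 - pix[..., :2]             # per-pixel displacement

def consistency_loss(rigid, estimated, rigid_mask):
    """Mean L1 gap between rigid and estimated flow over rigid pixels."""
    diff = np.abs(rigid - estimated).sum(axis=-1) * rigid_mask
    return diff.sum() / max(rigid_mask.sum(), 1)

# Toy usage: constant depth, pure sideways translation.
K = np.array([[100.0, 0.0, 0.0], [0.0, 100.0, 0.0], [0.0, 0.0, 1.0]])
depth = np.full((2, 3), 2.0)
flow = rigid_flow(depth, K, np.eye(3), np.array([0.1, 0.0, 0.0]))
loss = consistency_loss(flow, np.zeros_like(flow), np.ones((2, 3)))
```

In a training setup the depth, pose, and estimated flow would come from the respective networks, and the loss would be computed in a differentiable framework rather than NumPy.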
Proceedings Article•10.1109/CVPR.2018.00400•
Collaborative and Adversarial Network for Unsupervised Domain Adaptation

Weichen Zhang1, Wanli Ouyang1, Wen Li2, Dong Xu1•
University of Sydney1, ETH Zurich2
1 Jun 2018
TL;DR: A new unsupervised domain adaptation approach called Collaborative and Adversarial Network (CAN) is proposed through domain-collaborative and domain-adversarial training of neural networks and extended as Incremental CAN (iCAN), in which a set of pseudo-labelled target samples are selected based on the image classifier and the last domain classifier from the previous training epoch.
Abstract: In this paper, we propose a new unsupervised domain adaptation approach called Collaborative and Adversarial Network (CAN) through domain-collaborative and domain-adversarial training of neural networks. We add several domain classifiers on multiple CNN feature extraction blocks, in which each domain classifier is connected to the hidden representations from one block and one loss function is defined based on the hidden representation and the domain labels (e.g., source and target). We design a new loss function by integrating the losses from all blocks in order to learn domain-informative representations from lower blocks through collaborative learning and learn domain-uninformative representations from higher blocks through adversarial learning. We further extend our CAN method as Incremental CAN (iCAN), in which we iteratively select a set of pseudo-labelled target samples based on the image classifier and the last domain classifier from the previous training epoch and re-train our CAN model by using the enlarged training set. Comprehensive experiments on two benchmark datasets, Office and ImageCLEF-DA, clearly demonstrate the effectiveness of our newly proposed approaches CAN and iCAN for unsupervised domain adaptation.
Journal Article•10.1016/J.MEDIA.2018.06.001•
Disease prediction using graph convolutional networks: Application to Autism Spectrum Disorder and Alzheimer's disease.

Sarah Parisot, Sofia Ira Ktena1, Enzo Ferrante2, Matthew C. H. Lee1, Ricardo Guerrero, Ben Glocker1, Daniel Rueckert1 •
Imperial College London1, National Scientific and Technical Research Council2
02 Jun 2018-Medical Image Analysis
TL;DR: A thorough evaluation of a generic framework that leverages both imaging and non‐imaging information and can be used for brain analysis in large populations, which shows that the novel framework can improve over state‐of‐the‐art results on both databases.
Journal Article•10.1021/ACS.JCIM.7B00616•
Mol2vec: Unsupervised Machine Learning Approach with Chemical Intuition.

Sabrina Jaeger, Simone Fulle, Samo Turk
10 Jan 2018-Journal of Chemical Information and Modeling
TL;DR: Mol2vec can be easily combined with ProtVec, which employs the same Word2vec concept on protein sequences, resulting in a proteochemometric approach that is alignment-independent and thus can also be easily used for proteins with low sequence similarities.
Abstract: Inspired by natural language processing techniques, we here introduce Mol2vec, which is an unsupervised machine learning approach to learn vector representations of molecular substructures. Like the Word2vec models, where vectors of closely related words are in close proximity in the vector space, Mol2vec learns vector representations of molecular substructures that point in similar directions for chemically related substructures. Compounds can finally be encoded as vectors by summing the vectors of the individual substructures and, for instance, be fed into supervised machine learning approaches to predict compound properties. The underlying substructure vector embeddings are obtained by training an unsupervised machine learning approach on a so-called corpus of compounds that consists of all available chemical matter. The resulting Mol2vec model is pretrained once, yields dense vector representations, and overcomes drawbacks of common compound feature representations such as sparseness and bit collision...
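The step of encoding a compound by summing its substructure vectors can be sketched as below. This is a toy illustration and not the Mol2vec package API: the function `compound_vector`, the substructure identifiers, and the embedding values are hypothetical, and silently skipping unknown substructures is an assumption made here.

```python
import numpy as np

def compound_vector(substructure_ids, embeddings, dim):
    """Encode a compound as the sum of its substructure embeddings.

    `embeddings` maps a substructure identifier to a vector of length
    `dim`. Identifiers missing from the map are skipped (an assumption).
    """
    vec = np.zeros(dim)
    for sub_id in substructure_ids:
        if sub_id in embeddings:
            vec += embeddings[sub_id]
    return vec

# Toy embeddings with made-up identifiers and values.
toy = {
    "sub_a": np.array([1.0, 0.0, 2.0]),
    "sub_b": np.array([0.5, 1.0, 0.0]),
}
vec = compound_vector(["sub_a", "sub_b", "sub_a", "unseen"], toy, dim=3)
```

The resulting fixed-length vector could then be passed to any supervised model as a feature row, as the abstract describes.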
Journal Article•10.1016/J.PHYSREP.2019.03.001•
A high-bias, low-variance introduction to Machine Learning for physicists

Pankaj Mehta1, Marin Bukov2, Ching-Hao Wang1, Alexandre G. R. Day1, Charles C. Richardson1, Charles K. Fisher, David J. Schwab3 •
Boston University1, University of California, Berkeley2, City University of New York3
23 Mar 2018-arXiv: Computational Physics
TL;DR: In this paper, the authors provide an introduction to the core concepts and tools of machine learning in a manner easily understood and intuitive to physicists and emphasize the many natural connections between ML and statistical physics.
Abstract: Machine Learning (ML) is one of the most exciting and dynamic areas of modern research and application. The purpose of this review is to provide an introduction to the core concepts and tools of machine learning in a manner easily understood and intuitive to physicists. The review begins by covering fundamental concepts in ML and modern statistics such as the bias-variance tradeoff, overfitting, regularization, generalization, and gradient descent before moving on to more advanced topics in both supervised and unsupervised learning. Topics covered in the review include ensemble models, deep learning and neural networks, clustering and data visualization, energy-based models (including MaxEnt models and Restricted Boltzmann Machines), and variational methods. Throughout, we emphasize the many natural connections between ML and statistical physics. A notable aspect of the review is the use of Python Jupyter notebooks to introduce modern ML/statistical packages to readers using physics-inspired datasets (the Ising Model and Monte-Carlo simulations of supersymmetric decays of proton-proton collisions). We conclude with an extended outlook discussing possible uses of machine learning for furthering our understanding of the physical world as well as open problems in ML where physicists may be able to contribute. (Notebooks are available at this https URL )
Book Chapter•10.1007/978-3-030-01240-3_40•
Pose-Normalized Image Generation for Person Re-identification

Xuelin Qian1, Yanwei Fu1, Tao Xiang2, Wenxuan Wang1, Jie Qiu3, Yang Wu3, Yu-Gang Jiang1, Xiangyang Xue1 •
Fudan University1, Queen Mary University of London2, Nara Institute of Science and Technology3
8 Sep 2018
TL;DR: This work proposes PN-GAN, a generative adversarial network (GAN) designed specifically for pose normalization in re-id, hence the name pose-normalization GAN.
Abstract: Person Re-identification (re-id) faces two major challenges: the lack of cross-view paired training data and learning discriminative identity-sensitive and view-invariant features in the presence of large pose variations. In this work, we address both problems by proposing a novel deep person image generation model for synthesizing realistic person images conditional on the pose. The model is based on a generative adversarial network (GAN) designed specifically for pose normalization in re-id, thus termed pose-normalization GAN (PN-GAN). With the synthesized images, we can learn a new type of deep re-id features free of the influence of pose variations. We show that these features are complementary to features learned with the original images. Importantly, a more realistic unsupervised learning setting is considered in this work, and our model is shown to have the potential to be generalizable to a new re-id dataset without any fine-tuning. The codes will be released at https://github.com/naiq/PN_GAN.
Journal Article•10.1109/TCCN.2018.2881442•
A Very Brief Introduction to Machine Learning With Applications to Communication Systems

Osvaldo Simeone1•
King's College London1
21 Nov 2018-IEEE Transactions on Cognitive Communications and Networking
TL;DR: In this paper, the authors provide a high-level introduction to the basics of supervised and unsupervised learning, exemplifying applications to communication networks by distinguishing tasks carried out at the edge and at the cloud segments of the network at different layers of the protocol stack, with an emphasis on the physical layer.
Abstract: Given the unprecedented availability of data and computing resources, there is widespread renewed interest in applying data-driven machine learning methods to problems for which the development of conventional engineering solutions is challenged by modeling or algorithmic deficiencies. This tutorial-style paper starts by addressing the questions of why and when such techniques can be useful. It then provides a high-level introduction to the basics of supervised and unsupervised learning. For both supervised and unsupervised learning, exemplifying applications to communication networks are discussed by distinguishing tasks carried out at the edge and at the cloud segments of the network at different layers of the protocol stack, with an emphasis on the physical layer.
Proceedings Article•10.1109/CVPRW.2018.00113•
Unsupervised Image Super-Resolution Using Cycle-in-Cycle Generative Adversarial Networks

Yuan Yuan1, Siyuan Liu2, Jiawei Zhang1, Yongbing Zhang2, Chao Dong1, Liang Lin1 •
SenseTime1, Tsinghua University2
18 Jun 2018
TL;DR: This work proposes a Cycle-in-Cycle network structure with generative adversarial networks (GAN) as the basic component to tackle the single image super-resolution problem in a more general case that the low-/high-resolution pairs and the down-sampling process are unavailable.
Abstract: We consider the single image super-resolution problem in a more general case in which the low-/high-resolution pairs and the down-sampling process are unavailable. Different from the traditional super-resolution formulation, the low-resolution input is further degraded by noise and blurring. This complicated setting makes supervised learning and accurate kernel estimation impossible. To solve this problem, we resort to unsupervised learning without paired data, inspired by the recent successful image-to-image translation applications. With generative adversarial networks (GAN) as the basic component, we propose a Cycle-in-Cycle network structure to tackle the problem in three steps. First, the noisy and blurry input is mapped to a noise-free low-resolution space. Then the intermediate image is up-sampled with a pre-trained deep model. Finally, we fine-tune the two modules in an end-to-end manner to get the high-resolution output. Experiments on NTIRE2018 datasets demonstrate that the proposed unsupervised method achieves results comparable to those of the state-of-the-art supervised models.
Book Chapter•10.1007/978-3-030-01228-1_37•
PPF-FoldNet: Unsupervised Learning of Rotation Invariant 3D Local Descriptors

Haowen Deng1, Tolga Birdal1, Slobodan Ilic1•
Technische Universität München1
8 Sep 2018
TL;DR: It is demonstrated that despite having six degree-of-freedom invariance and lack of training labels, PPF-FoldNet achieves state of the art results in standard benchmark datasets and outperforms its competitors when rotations and varying point densities are present.
Abstract: We present PPF-FoldNet for unsupervised learning of 3D local descriptors on pure point cloud geometry. Based on the folding-based auto-encoding of well known point pair features, PPF-FoldNet offers many desirable properties: it necessitates neither supervision, nor a sensitive local reference frame, benefits from point-set sparsity, is end-to-end, fast, and can extract powerful rotation invariant descriptors. Thanks to a novel feature visualization, its evolution can be monitored to provide interpretable insights. Our extensive experiments demonstrate that despite having six degree-of-freedom invariance and lack of training labels, our network achieves state of the art results in standard benchmark datasets and outperforms its competitors when rotations and varying point densities are present. PPF-FoldNet achieves 9% higher recall on standard benchmarks, 23% higher recall when rotations are introduced into the same datasets and finally, a margin of >35% is attained when point density is significantly decreased.