Embedding

Papers published on a yearly basis

Papers

Posted Content

UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction

Leland McInnes, John Healy

09 Feb 2018-arXiv: Machine Learning

TL;DR: The UMAP algorithm is competitive with t-SNE for visualization quality, and arguably preserves more of the global structure with superior run time performance.

Abstract: UMAP (Uniform Manifold Approximation and Projection) is a novel manifold learning technique for dimension reduction. UMAP is constructed from a theoretical framework based in Riemannian geometry and algebraic topology. The result is a practical scalable algorithm that applies to real world data. The UMAP algorithm is competitive with t-SNE for visualization quality, and arguably preserves more of the global structure with superior run time performance. Furthermore, UMAP has no computational restrictions on embedding dimension, making it viable as a general purpose dimension reduction technique for machine learning.

9,939 citations

Proceedings Article • 10.1145/2736277.2741093

LINE: Large-scale Information Network Embedding

Jian Tang¹, Meng Qu², Mingzhe Wang², Ming Zhang², Jun Yan¹, Qiaozhu Mei³

Microsoft¹, Peking University², University of Michigan³

18 May 2015

TL;DR: LINE is a network embedding method suitable for arbitrary types of information networks (undirected, directed, and/or weighted); it optimizes a carefully designed objective function that preserves both the local and global network structures.

Abstract: This paper studies the problem of embedding very large information networks into low-dimensional vector spaces, which is useful in many tasks such as visualization, node classification, and link prediction. Most existing graph embedding methods do not scale for real world information networks which usually contain millions of nodes. In this paper, we propose a novel network embedding method called the "LINE," which is suitable for arbitrary types of information networks: undirected, directed, and/or weighted. The method optimizes a carefully designed objective function that preserves both the local and global network structures. An edge-sampling algorithm is proposed that addresses the limitation of the classical stochastic gradient descent and improves both the effectiveness and the efficiency of the inference. Empirical experiments prove the effectiveness of the LINE on a variety of real-world information networks, including language networks, social networks, and citation networks. The algorithm is very efficient, which is able to learn the embedding of a network with millions of vertices and billions of edges in a few hours on a typical single machine. The source code of the LINE is available online at https://github.com/tangjianpku/LINE.

4,980 citations
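
The edge-sampling idea in the LINE abstract above can be illustrated with a small sketch. It is not the authors' implementation: it assumes a first-order objective in which the score of a pair is the sigmoid of the dot product of the two node vectors, it draws negative nodes uniformly, and the names `train_first_order` and `sgd_step` are hypothetical. The point it shows is that edges are drawn with probability proportional to their weight and then treated as unweighted, so the weight never multiplies the gradient.

```python
import itertools
import math
import random


def sigmoid(x):
    # Clamp the input so math.exp stays well inside floating-point range.
    x = max(-30.0, min(30.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def train_first_order(edges, num_nodes, dim=8, steps=20000,
                      negatives=2, lr=0.05, seed=0):
    """Learn node vectors from (u, v, weight) edges of an undirected graph.

    Assumption: sigmoid(dot) scoring with uniform negative sampling.
    """
    rng = random.Random(seed)
    emb = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
           for _ in range(num_nodes)]
    # Cumulative weights let random.choices pick an edge with probability
    # proportional to its weight.
    cum = list(itertools.accumulate(w for _, _, w in edges))

    def sgd_step(i, j, label):
        # Gradient of the logistic loss with respect to the pair score.
        g = sigmoid(dot(emb[i], emb[j])) - label
        for k in range(dim):
            ei, ej = emb[i][k], emb[j][k]
            emb[i][k] -= lr * g * ej
            emb[j][k] -= lr * g * ei

    for _step in range(steps):
        u, v, _w = rng.choices(edges, cum_weights=cum, k=1)[0]
        if rng.random() < 0.5:
            u, v = v, u  # the graph is undirected, so orient at random
        sgd_step(u, v, 1.0)  # sampled edge is used as a binary edge
        for _neg in range(negatives):
            n = rng.randrange(num_nodes)
            if n != u and n != v:
                sgd_step(u, n, 0.0)
    return emb
```

Calling `train_first_order(edges, num_nodes)` on a small weighted graph returns one vector per node; strongly connected nodes should end up with larger dot products than unconnected ones. Because the step size does not depend on the edge weight, a single learning rate works even when weights vary widely.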

Posted Content

Learning Transferable Features with Deep Adaptation Networks

Mingsheng Long¹,², Yue Cao¹, Jianmin Wang¹, Michael I. Jordan²

Tsinghua University¹, University of California, Berkeley²

10 Feb 2015-arXiv: Learning

TL;DR: A new Deep Adaptation Network (DAN) architecture is proposed, which generalizes deep convolutional neural network to the domain adaptation scenario and can learn transferable features with statistical guarantees, and can scale linearly by unbiased estimate of kernel embedding.

Abstract: Recent studies reveal that a deep neural network can learn transferable features which generalize well to novel tasks for domain adaptation. However, as deep features eventually transition from general to specific along the network, the feature transferability drops significantly in higher layers with increasing domain discrepancy. Hence, it is important to formally reduce the dataset bias and enhance the transferability in task-specific layers. In this paper, we propose a new Deep Adaptation Network (DAN) architecture, which generalizes deep convolutional neural network to the domain adaptation scenario. In DAN, hidden representations of all task-specific layers are embedded in a reproducing kernel Hilbert space where the mean embeddings of different domain distributions can be explicitly matched. The domain discrepancy is further reduced using an optimal multi-kernel selection method for mean embedding matching. DAN can learn transferable features with statistical guarantees, and can scale linearly by unbiased estimate of kernel embedding. Extensive empirical evidence shows that the proposed architecture yields state-of-the-art image classification error rates on standard domain adaptation benchmarks.

4,433 citations
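
The DAN abstract above mentions matching mean embeddings in a reproducing kernel Hilbert space and scaling linearly through an unbiased estimate. A minimal sketch of one linear-time estimate of the squared maximum mean discrepancy follows. It assumes a single Gaussian kernel with a fixed bandwidth rather than the multi-kernel selection the abstract describes, and the function names are hypothetical.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    # k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def linear_time_mmd2(source, target, kernel=gaussian_kernel):
    """Estimate squared MMD between two samples in one pass.

    Consecutive samples are grouped into disjoint quadruples
    (x1, x2, y1, y2), so the cost grows linearly with the sample size
    instead of quadratically.
    """
    n_pairs = min(len(source), len(target)) // 2
    if n_pairs == 0:
        raise ValueError("need at least two samples from each domain")
    total = 0.0
    for i in range(n_pairs):
        x1, x2 = source[2 * i], source[2 * i + 1]
        y1, y2 = target[2 * i], target[2 * i + 1]
        total += (kernel(x1, x2) + kernel(y1, y2)
                  - kernel(x1, y2) - kernel(x2, y1))
    return total / n_pairs
```

The estimate hovers around zero when both samples come from the same distribution and grows as the distributions move apart. Because it is unbiased, it can be slightly negative on finite samples.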

Proceedings Article • 10.1145/2736277.2741093

LINE: Large-scale Information Network Embedding

Jian Tang¹, Meng Qu², Mingzhe Wang², Ming Zhang², Jun Yan¹, Qiaozhu Mei³

Microsoft¹, Peking University², University of Michigan³

12 Mar 2015-arXiv: Learning

TL;DR: LINE is a network embedding method suitable for arbitrary types of information networks (undirected, directed, and/or weighted); it optimizes a carefully designed objective function that preserves both the local and global network structures.

Abstract: This paper studies the problem of embedding very large information networks into low-dimensional vector spaces, which is useful in many tasks such as visualization, node classification, and link prediction. Most existing graph embedding methods do not scale for real world information networks which usually contain millions of nodes. In this paper, we propose a novel network embedding method called the "LINE," which is suitable for arbitrary types of information networks: undirected, directed, and/or weighted. The method optimizes a carefully designed objective function that preserves both the local and global network structures. An edge-sampling algorithm is proposed that addresses the limitation of the classical stochastic gradient descent and improves both the effectiveness and the efficiency of the inference. Empirical experiments prove the effectiveness of the LINE on a variety of real-world information networks, including language networks, social networks, and citation networks. The algorithm is very efficient, which is able to learn the embedding of a network with millions of vertices and billions of edges in a few hours on a typical single machine. The source code of the LINE is available online.

4,225 citations

Proceedings Article

Knowledge graph embedding by translating on hyperplanes

Zhen Wang¹, Jianwen Zhang², Jianlin Feng¹, Zheng Chen²

Sun Yat-sen University¹, Microsoft²

27 Jul 2014

TL;DR: This paper proposes TransH, which models a relation as a hyperplane together with a translation operation on it, and preserves mapping properties of relations (reflexive, one-to-many, many-to-one, many-to-many) with almost the same model complexity as TransE.

Abstract: We deal with embedding a large scale knowledge graph composed of entities and relations into a continuous vector space. TransE is a promising method proposed recently, which is very efficient while achieving state-of-the-art predictive performance. We discuss some mapping properties of relations which should be considered in embedding, such as reflexive, one-to-many, many-to-one, and many-to-many. We note that TransE does not do well in dealing with these properties. Some complex models are capable of preserving these mapping properties but sacrifice efficiency in the process. To make a good trade-off between model capacity and efficiency, in this paper we propose TransH which models a relation as a hyperplane together with a translation operation on it. In this way, we can well preserve the above mapping properties of relations with almost the same model complexity of TransE. Additionally, as a practical knowledge graph is often far from completed, how to construct negative examples to reduce false negative labels in training is very important. Utilizing the one-to-many/many-to-one mapping property of a relation, we propose a simple trick to reduce the possibility of false negative labeling. We conduct extensive experiments on link prediction, triplet classification and fact extraction on benchmark datasets like WordNet and Freebase. Experiments show TransH delivers significant improvements over TransE on predictive accuracy with comparable capability to scale up.

4,131 citations
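
The TransH abstract above describes a relation as a hyperplane plus a translation on it, and a sampling trick based on one-to-many and many-to-one statistics. The sketch below is one reading of those two ideas and is not taken from the abstract itself: it assumes the score is the squared distance between the projected head plus the translation and the projected tail, and that the head is corrupted with probability tph / (tph + hpt), where tph is the average number of tails per head and hpt the average number of heads per tail. All function names are hypothetical.

```python
import math
from collections import defaultdict


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project(v, unit_normal):
    # Drop the component of v along the hyperplane's unit normal.
    scale = dot(v, unit_normal)
    return [vi - scale * wi for vi, wi in zip(v, unit_normal)]


def transh_score(head, tail, normal, translation):
    """Lower score means a more plausible (head, relation, tail) triple."""
    norm = math.sqrt(dot(normal, normal))
    unit = [w / norm for w in normal]
    h_p = project(head, unit)
    t_p = project(tail, unit)
    return sum((h + d - t) ** 2 for h, d, t in zip(h_p, translation, t_p))


def head_corruption_prob(triples, relation):
    """Probability of replacing the head when building a negative triple.

    Assumption: for a one-to-many relation, corrupting the tail is more
    likely to hit another true triple, so the head is replaced more often.
    """
    tails_of = defaultdict(set)
    heads_of = defaultdict(set)
    for h, r, t in triples:
        if r == relation:
            tails_of[h].add(t)
            heads_of[t].add(h)
    if not tails_of:
        raise ValueError("relation does not occur in triples")
    tph = sum(len(s) for s in tails_of.values()) / len(tails_of)
    hpt = sum(len(s) for s in heads_of.values()) / len(heads_of)
    return tph / (tph + hpt)
```

Two entities that differ only along the hyperplane normal project to the same point, so several distinct tails can all score well for the same head and relation. This is how a hyperplane model can accommodate one-to-many relations without adding many parameters.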

Year	Papers
2026	36
2025	2,395
2024	3,331
2023	5,752
2022	9,811
2021	1,851
