Hierarchical Metadata-Aware Document Categorization under Weak Supervision

Open Access • Posted Content

- 26 Oct 2020

13

TL;DR: This paper proposes a novel joint representation learning module that allows simultaneous modeling of category dependencies, metadata information and textual semantics, and introduces a data augmentation module that hierarchically synthesizes training documents to complement the original, small-scale training set.

Citations

•Proceedings Article•10.1145/3485447.3512174

Metadata-Induced Contrastive Learning for Zero-Shot Multi-Label Text Classification

Yu Yvette Zhang, +7 more

- 11 Feb 2022

TL;DR: Experimental results show that MICoL significantly outperforms strong zero-shot text classification and contrastive learning baselines and is on par with the state-of-the-art supervised metadata-aware LMTC method trained on 10K–200K labeled documents. MICoL also tends to predict more infrequent labels than supervised methods, thus alleviating the deteriorated performance on long-tailed labels.

30

Journal Article•10.1145/3470888

dhCM: Dynamic and Hierarchical Event Categorization and Discovery for Social Media Stream

Jinjin Guo, +2 more

- 23 Sep 2021

- ACM Transactions on Intelligent Systems ...

TL;DR: Online event discovery in social media documents is useful, for example for disaster recognition and intervention, but the diverse events incrementally identified from social media streams still need to be addressed.

6

Journal Article•10.48550/arXiv.2203.10922

Who Should Review Your Proposal? Interdisciplinary Topic Path Detection for Research Proposals

Meng Xiao, +6 more

- 07 Mar 2022

- arXiv.org

TL;DR: A deep Hierarchical Interdisciplinary Research Proposal Classification Network (HIRPCN) is developed. It uses a hierarchical transformer to extract the textual semantic information of proposals and a level-wise prediction component to fuse the two types of knowledge representations and detect interdisciplinary topic paths for each proposal.

3

Journal Article•10.1371/journal.pcbi.1012006

Partial label learning for automated classification of single-cell transcriptomic profiles.

Malek Senoussi, +2 more

- 05 Apr 2024

- PLOS Computational Biology

TL;DR: Overall, the findings show how hierarchical and non-hierarchical partial label learning strategies can help solve the problem of automated classification of single-cell transcriptomic profiles. Notably, these methods rely on much less stringently annotated datasets than fully supervised learning methods.

2

Proceedings Article•10.1145/3534678.3542607

Adapting Pretrained Representations for Text Mining

Yu Meng, +3 more

- 14 Aug 2022

TL;DR: This tutorial introduces recent advances in pretrained text representations, as well as their applications to a wide range of text mining tasks, and focuses on minimally-supervised approaches that do not require massive human annotations.

1

References

•Posted Content

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, +3 more

- 11 Oct 2018

- arXiv: Computation and Language

TL;DR: A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.

81.7K

•Journal Article

Visualizing Data using t-SNE

Laurens van der Maaten, +1 more

- 01 Jan 2008

- Journal of Machine Learning Research

TL;DR: A new technique called t-SNE visualizes high-dimensional data by giving each datapoint a location in a two- or three-dimensional map. It is a variation of Stochastic Neighbor Embedding that is much easier to optimize and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.

45.8K

Proceedings Article•10.18653/V1/N19-1423

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, +3 more

- 11 Oct 2018

TL;DR: BERT pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, and can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.

24.6K

•Proceedings Article

Distributed Representations of Words and Phrases and their Compositionality

Tomas Mikolov, +4 more

- 05 Dec 2013

TL;DR: This paper presents a simple method for finding phrases in text, shows that learning good vector representations for millions of phrases is possible, and describes a simple alternative to the hierarchical softmax called negative sampling.

24.1K

•Posted Content

Distributed Representations of Words and Phrases and their Compositionality

Tomas Mikolov, +4 more

- 16 Oct 2013

- arXiv: Computation and Language

TL;DR: This paper presents several extensions to the Skip-gram model, a method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. The extensions improve both the quality of the vectors and the training speed.

22.9K