Open Access · Posted Content
Multi-Granular Text Encoding for Self-Explaining Categorization.
TL;DR: The model organizes multi-granular ngrams into a hierarchy and uses a tree-structured LSTM with parameter sharing to learn a context-independent representation for each ngram, which lets it extract intuitive multi-granular evidence to support its predictions.
Abstract: Self-explaining text categorization requires a classifier to make a prediction along with supporting evidence. A popular type of evidence is sub-sequences extracted from the input text which are sufficient for the classifier to make the prediction. In this work, we define multi-granular ngrams as basic units for explanation, and organize all ngrams into a hierarchical structure, so that shorter ngrams can be reused while computing longer ngrams. We leverage a tree-structured LSTM to learn a context-independent representation for each unit via parameter sharing. Experiments on medical disease classification show that our model is more accurate, efficient and compact than BiLSTM and CNN baselines. More importantly, our model can extract intuitive multi-granular evidence to support its predictions.
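The hierarchical reuse described in the abstract can be illustrated with a minimal sketch. It rests on assumptions the abstract does not spell out: each ngram of length n is taken to be composed from its two overlapping (n-1)-gram children, and a simple averaging function (the hypothetical `average`) stands in for the tree-structured LSTM cell. The names `encode_ngrams`, `compose`, and `max_n` are illustrative only.

```python
from typing import Callable, Dict, List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive
Vector = List[float]


def average(left: Vector, right: Vector) -> Vector:
    """Placeholder composition; a tree-structured LSTM cell would go here."""
    return [(a + b) / 2.0 for a, b in zip(left, right)]


def encode_ngrams(
    token_vectors: List[Vector],
    max_n: int,
    compose: Callable[[Vector, Vector], Vector] = average,
) -> Dict[Span, Vector]:
    """Encode every ngram up to length max_n, bottom-up.

    Assumption: an ngram covering [start, end) is built from the two
    (n-1)-grams [start, end-1) and [start+1, end), so shorter ngrams
    are computed once and reused by every longer ngram above them.
    """
    table: Dict[Span, Vector] = {}
    # Level 1: unigrams are the token vectors themselves.
    for i, vec in enumerate(token_vectors):
        table[(i, i + 1)] = vec
    # Level n: look up the two children from level n-1 and compose them.
    for n in range(2, max_n + 1):
        for start in range(0, len(token_vectors) - n + 1):
            end = start + n
            left = table[(start, end - 1)]
            right = table[(start + 1, end)]
            table[(start, end)] = compose(left, right)
    return table


# Toy 2-dimensional vectors for a four-token input.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
table = encode_ngrams(vectors, max_n=3)
print(len(table))     # 4 unigrams + 3 bigrams + 2 trigrams
print(table[(0, 3)])  # trigram built from bigrams (0, 2) and (1, 3)
```

Because the same `compose` function is applied at every node and each entry depends only on the tokens inside its span, the resulting vector for an ngram does not change with the surrounding text. This is one way to read "context-independent representation via parameter sharing".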
Citations
•Posted Content
Evaluating Explanation Methods for Neural Machine Translation
TL;DR: This work makes an initial attempt to evaluate explanation methods from an alternative viewpoint and proposes a principled metric based on fidelity with regard to the predictive behavior of the NMT model.
16
Intelligent Classification Method of Archive Data Based on Multigranular Semantics
TL;DR: An intelligent classification method for archive data based on multigranular semantics is proposed, which trains a semantic-label multigranular attention model on a multilabel data set and outputs the classification result.
2
•Posted Content
Answering Ambiguous Questions through Generative Evidence Fusion and Round-Trip Prediction
Yifan Gao, Henghui Zhu, Patrick Ng, Cicero Nogueira dos Santos, Zhiguo Wang, Feng Nan, Dejiao Zhang, Ramesh Nallapati, Andrew Arnold, Bing Xiang
TL;DR: The proposed round-trip prediction is a model-agnostic general approach for answering ambiguous open-domain questions, which improves the state-of-the-art Refuel as well as several baseline models.





