Learning Sentence-Level Representations with Predictive Coding

doi:10.3390/make5010005

Open Access•Journal Article•10.3390/make5010005

Vladimir Araujo, +2 more

- 09 Jan 2023

- Machine learning and knowledge extractio...

- Vol. 5, Iss: 1, pp 59-77

5

TL;DR: The authors extend BERT-style models with bottom-up and top-down computation to predict future sentences in latent space at each intermediate layer of the network, and they evaluate the approach extensively on benchmarks for English and Spanish.

Citations

Journal Article•10.1002/rrq.70069

The Linguistic Pathways Model: Capturing the Multiple Dimensions of Reading Development

Xiuhong Tong, +3 more

- 01 Oct 2025

- Reading Research Quarterly

Abstract: The importance of oral language skills in reading comprehension is widely recognized in contemporary models. Building on this foundation, we propose the Linguistic Pathways Model. In this model, we illuminate mechanistic and developmental detail by which individual components of oral language support reading comprehension and embrace the multiple dimensions across which reading development plays out. This is the level of theoretical detail needed to inform instruction in the classroom that is most likely to propel children on strong trajectories of reading development. We illustrate the value of this model by focusing on syntactic skills—the ability to understand and manipulate sentence structure. We hypothesize two core pathways by which syntactic skills impact reading comprehension. In the syntax‐to‐lexicon pathway, syntactic skills influence how readers construct lexical representations, ultimately impacting reading comprehension. In the syntax‐to‐sentence pathway, syntactic skills affect reading comprehension by shaping how readers parse sentences and generate predictions about upcoming information. In each, we elaborate on mechanisms of these influences. We also detail the nature of developmental effects, including changes in relative reliance on skills over time and the temporal order of effects, and the interactions between the two. This work provides a new theoretical model for understanding the precise pathways through which individual oral language skills contribute to reading comprehension development, making predictions that are testable in classrooms.

Journal Article•10.1007/978-981-97-2550-2_33

Next-Gen Language Mastery: Exploring Advances in Natural Language Processing Post-transformers

Mily Lal, +6 more

- 01 Jan 2024

Journal Article•10.48550/arxiv.2409.00070

Learning to Plan Long-Term for Language Modeling

Florian Mai, +2 more

- 23 Aug 2024

TL;DR: Researchers propose a planner for language models to predict long-term text continuations by sampling multiple plans, conditioning the model on a distribution of text continuations, and trading computation time for improved next token prediction accuracy.

Journal Article•10.48550/arxiv.2407.01948

Extracting and Encoding: Leveraging Large Language Models and Medical Knowledge to Enhance Radiological Text Representation

Pablo Messina, +4 more

- 02 Jul 2024

TL;DR: This study presents a two-stage framework leveraging large language models and medical knowledge to enhance radiological text representation, outperforming state-of-the-art methods in tasks like sentence ranking and label extraction from radiology reports.

Preprint•10.48550/arxiv.2405.19954

GenKubeSec: LLM-Based Kubernetes Misconfiguration Detection, Localization, Reasoning, and Remediation

Ehud Malul, +4 more

- 30 May 2024

TL;DR: GenKubeSec is an LLM-based method for detecting, localizing, reasoning about, and remediating misconfigurations in Kubernetes configuration files (KCFs). It achieves high precision and recall and provides detailed explanations for the misconfigurations it finds.

References

•Journal Article

Visualizing Data using t-SNE

Laurens van der Maaten, +1 more

- 01 Jan 2008

- Journal of Machine Learning Research

TL;DR: A new technique called t-SNE visualizes high-dimensional data by giving each datapoint a location in a two- or three-dimensional map. It is a variation of Stochastic Neighbor Embedding that is much easier to optimize and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.

45.8K

•Proceedings Article•10.3115/V1/D14-1179

Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation

Kyunghyun Cho, +8 more

- 01 Jan 2014

TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.

28.6K

•Proceedings Article•10.18653/V1/D19-1410

Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

Nils Reimers, +1 more

- 14 Aug 2019

TL;DR: This paper presents Sentence-BERT (SBERT), a modification of the pretrained BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine similarity.

12K

•Proceedings Article•10.18653/V1/N18-1202

Deep contextualized word representations

Matthew E. Peters, +6 more

- 15 Feb 2018

TL;DR: This paper introduced a new type of deep contextualized word representation that models both complex characteristics of word use (e.g., syntax and semantics), and how these uses vary across linguistic contexts (i.e., to model polysemy).

11.7K

•Proceedings Article

XLNet: Generalized Autoregressive Pretraining for Language Understanding

Zhilin Yang, +5 more

- 19 Jun 2019

TL;DR: The authors propose XLNet, a generalized autoregressive pretraining method that enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and overcomes the limitations of BERT.

6.1K