Improving Systematic Generalization Through Modularity and Augmentation

Journal Article

Laura Ruis, +1 more

- 22 Feb 2022

- arXiv.org

- Vol. abs/2202.10745

10

TL;DR: This work investigates how two well-known modeling principles, modularity and data augmentation, affect systematic generalization of neural networks in grounded language learning, and analyzes how large the vocabulary needs to be to achieve systematic generalization and how similar the augmented data needs to be to the problem at hand.

Figures

Figure 1: This figure depicts the two tests of adverb compositionality in gSCAN. Figure (a) denotes a few-shot learning test; a model has access to few (k) examples of how the adverb “cautiously” translates to an output sequence and needs to generalize to all other examples. Figure (b) denotes the “pull while spinning”-test; reminiscent of the “cycle cautiously”-example, a model learns all examples of pushing while spinning or walking while spinning, and is tested on its ability to interpret “pull while spinning”.
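The few-shot test in Figure 1(a) can be illustrated with a small sketch. It is not the gSCAN data pipeline: the `(command, target_actions)` pair layout, the function name `few_shot_adverb_split`, and the default `k` are assumptions made here for illustration only.

```python
import random


def few_shot_adverb_split(examples, adverb="cautiously", k=5, seed=0):
    """Keep only k training examples whose command contains `adverb`.

    `examples` is a list of (command, target_actions) pairs. This layout is
    a hypothetical stand-in, not the actual gSCAN file format.
    """
    def has_adverb(example):
        # Lowercase and drop a trailing period before tokenizing the command.
        tokens = example[0].lower().rstrip(".").split()
        return adverb in tokens

    with_adverb = [ex for ex in examples if has_adverb(ex)]
    without_adverb = [ex for ex in examples if not has_adverb(ex)]

    # Sample which k adverb examples the model gets to see during training.
    rng = random.Random(seed)
    rng.shuffle(with_adverb)

    train = without_adverb + with_adverb[:k]
    test = with_adverb[k:]  # all remaining adverb examples are held out
    return train, test
```

With 10 commands containing "cautiously" and `k=3`, the model would train on 3 of them plus every command without the adverb, and be evaluated on the other 7.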

Figure 2: The input command (“Push a circle cautiously.”) and world state are processed by different modules, each dealing with a different question about the input task. The final output is produced by the transformation module. *: cautious is in reality not a primitive action but a sequence of “turn left turn right turn right turn left”
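The footnote in Figure 2 says that "cautious" stands for a fixed sequence of turns rather than a single primitive. A minimal sketch of that expansion follows. The function name `apply_cautiously`, the action strings, and the choice to insert the turn sequence before every movement action are assumptions for illustration, not the authors' transformation module.

```python
# The turn sequence quoted in the Figure 2 footnote.
CAUTIOUS_PREFIX = ["turn left", "turn right", "turn right", "turn left"]


def apply_cautiously(actions, movement_actions=("walk", "push", "pull")):
    """Insert the cautious turn sequence before each movement action.

    `movement_actions` is a hypothetical set of action names; other actions
    (such as plain turns) are copied through unchanged.
    """
    expanded = []
    for action in actions:
        if action in movement_actions:
            expanded.extend(CAUTIOUS_PREFIX)
        expanded.append(action)
    return expanded


if __name__ == "__main__":
    print(apply_cautiously(["turn left", "walk", "push"]))
```

Under these assumptions, a plan of three actions with two movements grows to eleven actions, since each movement gains the four-turn prefix.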

Citations

Journal Article•10.48550/arxiv.2310.10899

Instilling Inductive Biases with Subnetworks

Enyan Zhang, +2 more

- 17 Oct 2023

- arXiv.org

TL;DR: This work discovers a functional subnetwork that implements a particular subtask within a trained model and uses it to instill inductive biases towards solutions utilizing that subtask, and demonstrates its effectiveness with two experiments.

2

Journal Article•10.48550/arXiv.2305.12169

Learn to Compose Syntactic and Semantic Representations Appropriately for Compositional Generalization

Shuangtao Li, +5 more

- 20 May 2023

- arXiv.org

TL;DR: The authors propose Composition (Compose Syntactic and Semantic Representations), an extension to Seq2Seq models that introduces a composed layer between the encoder and decoder, so that representations from different encoder layers are composed appropriately to generate the keys and values passed into different decoder layers.

1

Meta-learning from relevant demonstrations can improve compositional generalization

TL;DR: The proposed architecture can significantly improve the generalization capabilities of the agent on one of the most difficult gSCAN splits: the “adverb-to-verb” Split H.

Meta-learning from relevant demonstrations improves compositional generalization

Sam Spilsbury, +1 more

TL;DR: In this article, a meta-sequence-to-sequence learning approach is proposed to improve the generalization of language-instructed agents in gSCAN: the agent receives as context a few example pairs of instructions and action trajectories in a given instance of the environment (a support set) and is tasked with predicting an action sequence for a query instruction in the same environment instance.

Journal Article•10.48550/arXiv.2305.13092

Improved Compositional Generalization by Generating Demonstrations for Meta-Learning

Sam Spilsbury, +1 more

- 22 May 2023

- arXiv.org

TL;DR: The authors consider a grounded language learning problem (gSCAN) where good support examples for certain test splits might not even exist in the training data, or would be infeasible to search for.

References

Proceedings Article

Adam: A Method for Stochastic Optimization

Diederik P. Kingma, +1 more

- 01 Jan 2015

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.

138.5K

Journal Article•10.1186/S40537-019-0197-0

A survey on Image Data Augmentation for Deep Learning

Connor Shorten, +1 more

- 06 Jul 2019

- Journal of Big Data

TL;DR: This survey will present existing methods for Data Augmentation, promising developments, and meta-level decisions for implementing Data Augmentation, a data-space solution to the problem of limited data.

10.6K

Journal Article•10.2307/2184717

The modularity of mind

Robert Cummins, +1 more

- 06 Apr 1983

- The Philosophical Review

Abstract: This monograph synthesizes current information from the various fields of cognitive science in support of a new theory of mind. Most psychologists study horizontal processes like memory. Fodor postulates a vertical and modular psychological organization underlying biologically coherent behaviours. This view of mental architecture is consistent with the historical tradition of faculty psychology while integrating a computational approach to mental processes. One of the most notable aspects of Fodor’s work is that it articulates features not only of speculative cognitive architecture but also of current research in artificial intelligence. – Part I. Four accounts of mental structure; – Part II. A functional taxonomy of cognitive mechanisms; – Part III. Input systems as modules; – Part IV. Central systems; – Part V. Caveats and conclusions. M.-M. V.

7.6K

Journal Article•10.1016/0010-0277(88)90031-5

Connectionism and cognitive architecture: a critical analysis

Jerry A. Fodor, +1 more

- 09 Sep 1988

- Cognition

TL;DR: Differences between Connectionist proposals for cognitive architecture and the sorts of models that have traditionally been assumed in cognitive science are explored and the possibility that Connectionism may provide an account of the neural structures in which Classical cognitive architecture is implemented is considered.

3.9K

Posted Content

CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning

Justin Johnson, +5 more

- 20 Dec 2016

- arXiv: Computer Vision and Pattern Recog...

TL;DR: This work presents a diagnostic dataset that tests a range of visual reasoning abilities and uses this dataset to analyze a variety of modern visual reasoning systems, providing novel insights into their abilities and limitations.

1.7K
