Topic

Visual language

About: Visual language is a research topic. Over its lifetime, 1,407 publications have been published within this topic, receiving 18,444 citations.

Papers

Book

Introduction to Multimodal Analysis

David Machin

30 Mar 2007

TL;DR: Introduction to Multimodal Analysis is an accessible textbook that clearly and critically explains this approach to visual analysis, outlines the tools for analysis, and takes the reader through worked examples that provide a model to follow.

Abstract: Introduction to Multimodal Analysis is a unique and accessible textbook that clearly and critically explains this groundbreaking approach to visual analysis. Each chapter outlines the tools for analysis and takes the reader through examples of analysis, providing a model that can then be followed. All visual media compositions, such as photographs, advertisements, newspapers and websites, are carefully designed. A photograph of a soldier, an advertisement for a car, a magazine cover or the opening titles to a news programme are thought out to create the appropriate effect. Designers use semiotic tools such as colour, framing, focus, positioning of elements and font style to communicate with the viewer. These choices make up a visual language that we can analyse. Multimodal analysis looks at the separate components of this language to build up a toolkit for analysing the grammar of visual design. The book includes an assessment of the claim that there is a visual grammar, and it identifies important differences between images and language in the way they create meaning. Including images throughout and a colour plate section, Introduction to Multimodal Analysis is an essential resource for students studying multimodality within visual communication in media and cultural studies, critical discourse analysis, journalism studies or linguistics.

561 citations

Journal Article • DOI: 10.1177/007327537601400301

The Emergence of a Visual Language for Geological Science 1760–1840

Martin Rudwick¹

VU University Amsterdam¹

01 Sep 1976, History of Science

483 citations

Posted Content

ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks

Mohit Shridhar¹, Jesse Thomason¹, Daniel Gordon¹, Yonatan Bisk², Winson Han², Roozbeh Mottaghi², Luke Zettlemoyer¹, Dieter Fox³

University of Washington¹, Allen Institute for Artificial Intelligence², Nvidia³

03 Dec 2019, arXiv: Computer Vision and Pattern Recognition

TL;DR: It is shown that a baseline model based on recent embodied vision-and-language tasks performs poorly on ALFRED, suggesting that there is significant room for developing innovative grounded visual language understanding models with this benchmark.

Abstract: We present ALFRED (Action Learning From Realistic Environments and Directives), a benchmark for learning a mapping from natural language instructions and egocentric vision to sequences of actions for household tasks. ALFRED includes long, compositional tasks with non-reversible state changes to shrink the gap between research benchmarks and real-world applications. ALFRED consists of expert demonstrations in interactive visual environments for 25k natural language directives. These directives contain both high-level goals like "Rinse off a mug and place it in the coffee maker." and low-level language instructions like "Walk to the coffee maker on the right." ALFRED tasks are more complex in terms of sequence length, action space, and language than existing vision-and-language task datasets. We show that a baseline model based on recent embodied vision-and-language tasks performs poorly on ALFRED, suggesting that there is significant room for developing innovative grounded visual language understanding models with this benchmark.

465 citations

Journal Article

The visual language of comics: introduction to the structure and cognition of sequential images

Neil Cohn¹

University of California, Berkeley¹

26 Jan 2015, Facta Universitatis Series: Linguistics and Literature

TL;DR: This book introduces visual language in two sections: the first covers its structure (the visual lexicon, narrative grammar, navigation of compositional structure, and cognition), and the second surveys visual languages across the world (American, Japanese, and Central Australian).

Abstract:
Chapter 1. Introducing Visual Language
SECTION 1: STRUCTURE OF VISUAL LANGUAGE
Chapter 2. The Visual Lexicon, Part 1: Visual morphology
Chapter 3. The Visual Lexicon, Part 2: Panels and Constructions
Chapter 4. Visual Language Grammar: Narrative Structure
Chapter 5. Navigation of External Compositional Structure
Chapter 6. Cognition of Visual Language
SECTION 2: VISUAL LANGUAGE ACROSS THE WORLD
Chapter 7. American Visual Language
Chapter 8. Japanese Visual Language
Chapter 9. Central Australian Visual Language
Chapter 10. The Principle of Equivalence

456 citations

Proceedings Article • DOI: 10.1109/CVPR42600.2020.01075

ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks

Mohit Shridhar¹, Jesse Thomason¹, Daniel Gordon¹, Yonatan Bisk², Winson Han², Roozbeh Mottaghi², Luke Zettlemoyer¹, Dieter Fox³

University of Washington¹, Allen Institute for Artificial Intelligence², Nvidia³

14 Jun 2020

TL;DR: ALFRED (Action Learning From Realistic Environments and Directives) is a benchmark for learning a mapping from natural language instructions and egocentric vision to sequences of actions for household tasks.

Abstract: We present ALFRED (Action Learning From Realistic Environments and Directives), a benchmark for learning a mapping from natural language instructions and egocentric vision to sequences of actions for household tasks. ALFRED includes long, compositional tasks with non-reversible state changes to shrink the gap between research benchmarks and real-world applications. ALFRED consists of expert demonstrations in interactive visual environments for 25k natural language directives. These directives contain both high-level goals like “Rinse off a mug and place it in the coffee maker.” and low-level language instructions like “Walk to the coffee maker on the right.” ALFRED tasks are more complex in terms of sequence length, action space, and language than existing vision-and-language task datasets. We show that a baseline model based on recent embodied vision-and-language tasks performs poorly on ALFRED, suggesting that there is significant room for developing innovative grounded visual language understanding models with this benchmark.

408 citations

Performance Metrics

Papers: 1,564

Citations: 8,268

No. of papers in the topic in previous years
Year	Papers
2026	3
2025	31
2024	41
2023	30
2022	38
2021	41
