Sentiment Knowledge Enhanced Self-supervised Learning for Multimodal Sentiment Analysis

Proceedings Article

pp 12966-12978

6

TL;DR: Sentiment Knowledge Enhanced Self-supervised Learning (SKESL) is proposed to capture common sentimental patterns in unlabeled videos, which facilitates further learning on limited labeled data and achieves new state-of-the-art (SOTA) results.

Citations

Journal Article•10.1007/s13278-025-01461-8

Generalizing sentiment analysis: a review of progress, challenges, and emerging directions

Khaled Alahmadi, +3 more

- 28 Apr 2025

- Social Network Analysis and Mining

4

Journal Article•10.1109/lsp.2024.3359570

Capturing High-Level Semantic Correlations via Graph for Multimodal Sentiment Analysis

Fan Qian, +4 more

- IEEE Signal Processing Letters

TL;DR: Capsule networks are introduced to construct high-level semantic nodes in a graph, uncovering deep sentimental structures in multimodal sentiment analysis, and learnable adjacency matrices are employed to construct the graph's edges, thus adaptively learning the relations between nodes.

4

Journal Article•10.48550/arxiv.2409.07388

Recent Trends of Multimodal Affective Computing: A Survey from NLP Perspective

Guimin Hu, +7 more

- 11 Sep 2024

TL;DR: This survey presents recent trends in multimodal affective computing from an NLP perspective, covering four tasks, formalizing them, and discussing technical approaches, challenges, and future directions in analyzing human behaviors and intentions through text and multimodal data.

Journal Article•10.1109/icme59968.2025.11210018

A Multi-Grained Perception Model for Sentiment Analysis with Perceived Contrastive Focal Loss

Jin Wei, +5 more

- 30 Jun 2025

TL;DR: This study proposes MGSA1, a multi-grained perception model for multimodal sentiment analysis, addressing imbalanced modality discrimination and sample distributions with a novel MCP module and Perceived Contrastive Focal loss, outperforming baselines on MOSI and MOSEI datasets.

Journal Article•10.1145/3689062.3689375

AMTN: Attention-Enhanced Multimodal Temporal Network for Humor Detection

Yangyang Xu, +8 more

- 28 Oct 2024

References

•Posted Content

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, +3 more

- 11 Oct 2018

- arXiv: Computation and Language

TL;DR: BERT is a new language representation model designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; it can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.

81.7K

•Posted Content

PyTorch: An Imperative Style, High-Performance Deep Learning Library

Adam Paszke, +20 more

- 03 Dec 2019

- arXiv: Learning

TL;DR: PyTorch is a machine learning library that provides an imperative and Pythonic programming style, makes debugging easy, and is consistent with other popular scientific computing libraries, while remaining efficient and supporting hardware accelerators such as GPUs.

25.9K

•Posted Content

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Victor Sanh, +3 more

- 02 Oct 2019

- arXiv: Computation and Language

TL;DR: This work proposes a method to pre-train a smaller general-purpose language representation model, called DistilBERT, which can be fine-tuned with good performances on a wide range of tasks like its larger counterparts, and introduces a triple loss combining language modeling, distillation and cosine-distance losses.

7.3K

•Proceedings Article•10.25080/MAJORA-7B98E3ED-003

librosa: Audio and Music Signal Analysis in Python

Brian McFee, +6 more

- 01 Jan 2015

TL;DR: A brief overview of the librosa library's functionality is provided, along with explanations of the design goals, software development practices, and notational conventions.

2.9K

•Proceedings Article•10.18653/V1/P19-1139

ERNIE: Enhanced Language Representation with Informative Entities

Zhengyan Zhang, +5 more

- 17 May 2019

TL;DR: This paper utilizes both large-scale textual corpora and KGs to train an enhanced language representation model (ERNIE) which can take full advantage of lexical, syntactic, and knowledge information simultaneously, and is comparable with the state-of-the-art model BERT on other common NLP tasks.

1.8K
