Fuzzy Multi-task Learning for Hate Speech Type Identification

doi:10.1145/3308558.3313546

Open Access • Proceedings Article

Han Liu, +3 more

- 13 May 2019

- pp 3006-3012

39

TL;DR: A novel formulation of the hate speech type identification problem in the setting of multi-task learning through the proposed fuzzy ensemble approach and an experimental study on identification of four types of hate speech, namely: religion, race, disability and sexual orientation are reported.

Abstract: In traditional machine learning, classifier training is typically undertaken in the setting of single-task learning, so the trained classifier can discriminate between different classes. However, this rests on the assumption that different classes are mutually exclusive. In real applications, this assumption does not always hold. For example, the same book may belong to multiple subjects. From this point of view, researchers were motivated to formulate multi-label learning problems. In this context, each instance can be assigned multiple labels, but classifier training is still typically undertaken in the setting of single-task learning. When probabilistic approaches are adopted for classifier training, multi-task learning can be enabled through transformation of a multi-labelled data set into several binary data sets. This transformation usually results in class imbalance. Without it, multi-labelling of data results in an exponential increase in the number of classes, leading to fewer instances for each class and greater difficulty in identifying each class. In addition, multi-labelling of data is very time consuming and expensive in some application areas, such as hate speech detection. In this paper, we introduce a novel formulation of the hate speech type identification problem in the setting of multi-task learning through our proposed fuzzy ensemble approach. In this setting, single-labelled data can be used for semi-supervised multi-label learning, and two new metrics (detection rate and irrelevance rate) are thus proposed to measure performance on this kind of learning task more effectively. We report an experimental study on identification of four types of hate speech, namely: religion, race, disability and sexual orientation. The experimental results show that our proposed fuzzy ensemble approach outperforms other popular probabilistic approaches, with an overall detection rate of 0.93.
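
The two data-handling options the abstract contrasts can be illustrated with a small sketch. It is not taken from the paper: the toy instances, the label spelling `sexual_orientation`, and the helper names `to_binary_datasets` and `label_powerset_size` are all assumptions made for illustration. The first helper splits a multi-labelled data set into one binary data set per label, which tends to leave few positives per label; the second counts the classes that arise if every label combination is treated as its own class.

```python
from collections import Counter

# Hypothetical label names and placeholder instances (not from the paper).
LABELS = ["religion", "race", "disability", "sexual_orientation"]
data = [
    ("text a", {"religion"}),
    ("text b", {"race"}),
    ("text c", {"religion", "race"}),
    ("text d", {"disability"}),
    ("text e", {"sexual_orientation"}),
    ("text f", set()),
]


def to_binary_datasets(instances, labels):
    """Build one binary data set per label: 1 if the label applies, else 0."""
    return {
        label: [(text, int(label in tags)) for text, tags in instances]
        for label in labels
    }


def label_powerset_size(n_labels):
    """Count the classes when each label combination is its own class."""
    return 2 ** n_labels


binary = to_binary_datasets(data, LABELS)

# Count negatives (0) and positives (1) per binary data set to expose imbalance.
class_counts = {
    label: Counter(target for _, target in rows) for label, rows in binary.items()
}

for label, counts in class_counts.items():
    print(label, "positives:", counts[1], "negatives:", counts[0])
print("label combinations for 4 labels:", label_powerset_size(len(LABELS)))
```

Even in this tiny example each binary data set has more negatives than positives, and four labels already give 16 possible label combinations.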

Figures

Table 4: Topics detected from hate speech instances in four data sets using LDA

Table 3: Correlation analysis between different types of hate speech

Table 1: Detection rate on four types of hate speech

Table 2: Irrelevance rate on four types of hate speech

Figure 1: Trapezoidal Membership Function [19]
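
For readers unfamiliar with the shape named in Figure 1, the following is a minimal sketch of the standard textbook trapezoidal membership function. The parameter names `a`, `b`, `c`, `d` and the function name are assumptions; the parameterization actually used in the paper is not shown on this page.

```python
def trapezoidal_membership(x, a, b, c, d):
    """Textbook trapezoidal membership degree of x, assuming a <= b <= c <= d.

    The degree is 0 outside [a, d], 1 on the plateau [b, c], and changes
    linearly on the rising edge (a, b) and the falling edge (c, d).
    """
    if not a <= b <= c <= d:
        raise ValueError("parameters must satisfy a <= b <= c <= d")
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        # Rising edge: here a <= x < b, so b - a is strictly positive.
        return (x - a) / (b - a)
    # Falling edge: here c < x <= d, so d - c is strictly positive.
    return (d - x) / (d - c)


# Membership degrees at a few points of a trapezoid with corners 0, 1, 2, 3.
for point in (-1.0, 0.5, 1.5, 2.5, 4.0):
    print(point, trapezoidal_membership(point, 0.0, 1.0, 2.0, 3.0))
```

Setting `a == b` or `c == d` gives a vertical edge, and `b == c` reduces the trapezoid to a triangle.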

Citations

Journal Article • doi:10.3390/info13060273

A Literature Review of Textual Hate Speech Detection Methods and Datasets

Fatimah Alkomah, +1 more

- 26 May 2022

- Information

TL;DR: This study shows that several approaches do not give consistent results across hate speech categories, and that many datasets are small and unreliable for various hate speech detection tasks.

111

Journal Article • doi:10.1016/J.COSREV.2020.100311

Machine learning techniques for hate speech classification of twitter data: State-of-the-art, future challenges and research directions

Femi Emmanuel Ayo, +3 more

- 01 Nov 2020

- Computer Science Review

TL;DR: The results showed that the developed system performs very well for automatic topic detection and categorization, achieving an AUC of 0.97, higher than that of similar methods.

92

Posted Content

Transfer Learning for Hate Speech Detection in Social Media

Marian-Andrei Rizoiu, +3 more

- 10 Jun 2019

- arXiv: Social and Information Networks

TL;DR: This work develops automated text analytics methods capable of jointly learning a single representation of hate from several smaller, unrelated data sets. The representation enables an interpretable two-dimensional text visualization, called the Map of Hate, that can separate different types of hate speech and explain what makes text harmful.

59

Proceedings Article • doi:10.1145/3442442.3452313

A Comparative Study of Using Pre-trained Language Models for Toxic Comment Classification

Zhixue Zhao, +2 more

- 19 Apr 2021

TL;DR: This paper used pre-trained language model-based methods for toxic comment classification and found that using a basic linear downstream structure outperforms complex ones such as CNN and BiLSTM.

56

Journal Article • doi:10.1007/s00530-023-01051-8

A literature survey on multimodal and multilingual automatic hate speech identification

Anusha Chhabra, +1 more

- 20 Jan 2023

- Multimedia Systems

TL;DR: This survey presents a comprehensive analysis of hate speech definitions along with the motivation for detection and standard textual analysis methods that play a crucial role in identifying hate speech.

35

References

Book

C4.5: Programs for Machine Learning

J. Ross Quinlan

- 15 Oct 1992

TL;DR: A complete guide to the C4.5 system as implemented in C for the UNIX environment, which starts from simple core learning methods and shows how they can be elaborated and extended to deal with typical problems such as missing data and overfitting.

27.2K

Book

An Introduction to Support Vector Machines and Other Kernel-based Learning Methods

Nello Cristianini, +1 more

- 01 Jan 2000

TL;DR: This is the first comprehensive introduction to Support Vector Machines (SVMs), a new generation learning system based on recent advances in statistical learning theory, and will guide practitioners to updated literature, new applications, and on-line software.

15K

Programs for Machine Learning

Steven L. Salzberg, +1 more

- 01 Jan 1994

TL;DR: In his new book, C4.5: Programs for Machine Learning, Quinlan has put together a definitive, much needed description of his complete system, including the latest developments, which will be a welcome addition to the library of many researchers and students.

9.4K

Book

An Introduction to Support Vector Machines

Nello Cristianini, +1 more

- 01 Mar 2000

TL;DR: This book is the first comprehensive introduction to Support Vector Machines, a new generation learning system based on recent advances in statistical learning theory, and introduces Bayesian analysis of learning and relates SVMs to Gaussian Processes and other kernel based learning methods.

8.2K

Journal Article • doi:10.1109/TKDE.2013.39

A Review On Multi-Label Learning Algorithms

Min-Ling Zhang, +1 more

- 01 Aug 2014

- IEEE Transactions on Knowledge and Data ...

TL;DR: This paper aims to provide a timely review on this area with emphasis on state-of-the-art multi-label learning algorithms with relevant analyses and discussions.

3.4K

Related Papers (5)

Multi-task learning for intelligent data processing in granular computing context

Inductive multi-task learning with multiple view data

Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter

A Fuzzy Approach to Text Classification With Two-Stage Training for Ambiguous Instances

Heterogeneous transfer learning with RBMs