Conference
Knowledge and Systems Engineering
About: Knowledge and Systems Engineering is an academic conference. The conference publishes mainly in the areas of computer science and Vietnamese. Over its lifetime, the conference has published 764 publications, which have received 3877 citations.
Papers
1 Oct 2017
TL;DR: A Vietnamese corpus of comments/reviews, collected from Vietnamese commercial web pages and annotated by three human annotators, is introduced, and the proposed model outperforms SVM, LSTM, and CNN on the two datasets.
Abstract: Convolutional neural networks (CNN) and Long Short-Term Memory (LSTM) networks have shown state-of-the-art results for sentiment analysis on English corpora. However, there are few studies of this approach for Vietnamese corpora. In our work, CNN and LSTM are employed to generate information channels for Vietnamese sentiment analysis. Because each deep learning model (e.g. CNN, LSTM) has a particular advantage, this scenario provides a novel and efficient way of integrating the advantages of CNN and LSTM. In addition, we introduce a Vietnamese corpus of comments/reviews collected from Vietnamese commercial web pages and annotated by three human annotators. We evaluated our approach on our corpus and the VLSP corpus. According to the experimental results, the proposed model outperforms SVM, LSTM, and CNN on the two datasets.
92 citations
1 Oct 2017
TL;DR: Recent advances in deep learning are applied to propose effective deep Convolutional Neural Networks that can accurately interpret semantic information available in faces in an automated manner without hand-designed feature descriptors; their reduced parameter count makes the proposed networks well suited to real-time systems.
Abstract: Facial expressions convey non-verbal information between humans in face-to-face interactions. Automatic facial expression recognition, which plays a vital role in human-machine interfaces, has attracted increasing attention from researchers since the early nineties. Classical machine learning approaches often require a complex feature extraction process and produce poor results. In this paper, we apply recent advances in deep learning to propose effective deep Convolutional Neural Networks (CNNs) that can accurately interpret semantic information available in faces in an automated manner without hand-designed feature descriptors. We also apply different loss functions and training tricks in order to learn CNNs with strong classification power. The experimental results show that our proposed networks outperform state-of-the-art methods on the well-known FERC-2013 dataset provided by the Kaggle facial expression recognition competition. In comparison to the winning model of this competition, the number of parameters in our proposed networks decreases substantially, which accelerates overall processing speed and makes the proposed networks well suited to real-time systems.
79 citations
1 Nov 2018
TL;DR: The methods for building annotation guidelines and ensuring annotation accuracy and consistency are presented for UIT-VSFC, a free and high-quality corpus for research on two different tasks: sentiment-based and topic-based classification.
Abstract: Students’ feedback is a vital resource for interdisciplinary research combining two fields: sentiment analysis and education. To strengthen sentiment analysis of the Vietnamese language, which is a low-resource language, we build a Vietnamese Students’ Feedback Corpus (UIT-VSFC), a free and high-quality corpus for research on two different tasks: sentiment-based and topic-based classification. In this paper, we present the methods for building annotation guidelines and ensuring the annotation accuracy and consistency of this corpus. The resource consists of over 16,000 sentences which are human-annotated on the two tasks. To assess the quality of our corpus, we measure the inter-annotator agreement and classification accuracy on UIT-VSFC. As a result, we achieved an inter-annotator agreement of 91.20% for the sentiment-based task and 71.07% for the topic-based task. In addition, the best results come from the baseline model, a Maximum Entropy classifier, with overall F1-scores of 87.94% and 84.03% on the sentiment-based and topic-based tasks respectively. These results illustrate that the corpus is a reliable and helpful resource for research.
78 citations
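The inter-annotator agreement reported in the UIT-VSFC abstract above can be illustrated with a small sketch. The abstract does not state which agreement measure was used, so the following assumes simple percent agreement and, for comparison, Cohen's kappa between two annotators. The labels and function names are hypothetical illustration data, not drawn from the corpus.

```python
from collections import Counter


def percent_agreement(a, b):
    """Fraction of items on which two annotators assigned the same label."""
    if len(a) != len(b) or not a:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_o = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: probability both annotators pick the same label at random.
    p_e = sum(count_a[label] * count_b[label] for label in set(a) | set(b)) / (n * n)
    if p_e == 1.0:  # both annotators used a single identical label throughout
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical sentiment labels from two annotators (assumption, not corpus data).
annotator_1 = ["positive", "negative", "neutral", "positive", "negative"]
annotator_2 = ["positive", "negative", "positive", "positive", "negative"]

agreement = percent_agreement(annotator_1, annotator_2)
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"agreement = {agreement:.2%}, kappa = {kappa:.3f}")
```

Percent agreement is easy to read but does not account for agreement that would occur by chance, which is why a chance-corrected statistic such as kappa is often reported alongside it.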
24 Oct 2019
TL;DR: An algorithm that weights the sentiment scores of the hashtag and the cleaned text to obtain the sentiment, and an algorithm to train Support Vector Machine, Deep Learning, and Naïve Bayes classifiers to process Twitter data.
Abstract: In the big data era, data is generated in real time or near real time. Businesses can therefore use this ever-growing volume of data for data-driven or information-driven decision-making to improve their operations. Social media platforms such as Twitter generate an enormous amount of such data. However, social media data are often unstructured and difficult to manage. Hence, this study proposes an effective text data preprocessing technique and develops an algorithm to train Support Vector Machine (SVM), Deep Learning (DL), and Naive Bayes (NB) classifiers to process Twitter data. We develop an algorithm that weights the sentiment score in terms of the weights of the hashtag and the cleaned text. In this study, we (i) compare different preprocessing techniques (stemming, lemmatization, and spelling correction) on data collected from Twitter to identify the most efficient method, and (ii) develop an algorithm that weights the scores of the hashtag and the cleaned text to obtain the sentiment. We retrieved Twitter data (N = 1,314,000) and compared the popularity of two products, Google Now and Amazon Alexa. Using our data preprocessing algorithm and sentiment weight score algorithm, we train SVM, DL, and NB models. The results show that the stemming technique performed best in terms of computational speed. Additionally, the accuracy of the algorithm was tested against manually sorted sentiments and against sentiments produced before text data preprocessing. The results demonstrate that the result produced by the algorithm was close to the manually annotated sentiments. In terms of model performance, the SVM performed better, with an accuracy of 90.3%, perhaps due to the unstructured nature of Twitter data. Previous studies used conventional techniques; hence, no precise methods were used for cleaning the text. Therefore, our approach confirms that a proper text data preprocessing technique plays a significant role in the prediction accuracy and computational time of the classifier when using unstructured Twitter data.
57 citations
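The idea of weighting hashtag and cleaned-text sentiment scores, described in the Twitter abstract above, might look roughly like the following minimal sketch. The abstract does not give the actual weights, lexicon, or cleaning rules, so everything here is an assumption for illustration: the toy `LEXICON`, the 0.4/0.6 weights, the fallback to text-only scoring, and all function names are hypothetical.

```python
import re

# Hypothetical toy lexicon (assumption); a real system would use a full
# sentiment lexicon or a trained model.
LEXICON = {"love": 1.0, "great": 1.0, "good": 0.5, "bad": -0.5, "hate": -1.0}


def split_tweet(tweet):
    """Return (hashtags, cleaned_words) with URLs, mentions, and punctuation removed."""
    text = tweet.lower()
    hashtags = re.findall(r"#(\w+)", text)
    text = re.sub(r"https?://\S+|@\w+|#\w+", " ", text)
    words = re.findall(r"[a-z']+", text)
    return hashtags, words


def lexicon_score(tokens):
    """Mean lexicon polarity over tokens; 0.0 when there are no tokens."""
    if not tokens:
        return 0.0
    return sum(LEXICON.get(t, 0.0) for t in tokens) / len(tokens)


def weighted_sentiment(tweet, hashtag_weight=0.4, text_weight=0.6):
    """Combine hashtag and cleaned-text scores using assumed weights."""
    hashtags, words = split_tweet(tweet)
    if not hashtags:  # no hashtag evidence: fall back to the text score alone
        return lexicon_score(words)
    return hashtag_weight * lexicon_score(hashtags) + text_weight * lexicon_score(words)


score = weighted_sentiment("I love this assistant, great answers! #love @user http://t.co/x")
label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
print(round(score, 3), label)
```

In a setup like this, the weights control how much a hashtag can override the body text; they would normally be tuned against manually annotated tweets rather than fixed by hand.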
1 Oct 2015
TL;DR: A system for automated classification of rice varieties for rice seed production using computer vision and image processing techniques is presented, and it is demonstrated that the average accuracy of the classification system can reach 90.54% using the Random Forest method with a simple feature extraction technique.
Abstract: This paper presents a system for automated classification of rice varieties for rice seed production using computer vision and image processing techniques. Rice seeds of different varieties are visually very similar in color, shape, and texture, which makes classifying rice seed varieties with high accuracy challenging. We investigated various feature extraction techniques for efficient rice seed image representation. We analyzed the performance of powerful classifiers on the extracted features to find the most robust one. Images of six different rice seed varieties in northern Vietnam were acquired and analyzed. Our experiments have demonstrated that the average accuracy of our classification system can reach 90.54% using the Random Forest method with a simple feature extraction technique. This result can be used for developing a computer-aided machine vision system for automated assessment of rice seed purity.
54 citations
Performance Metrics
| Year | Papers |
|---|---|
| 2020 | 46 |
| 2019 | 103 |
| 2018 | 83 |
| 2017 | 51 |
| 2016 | 69 |
| 2015 | 140 |