Open Access, Posted Content
Arabic Spelling Correction using Supervised Learning
TL;DR: This work addresses the problem of spelling correction in the Arabic language using the new corpus provided by the QALB (Qatar Arabic Language Bank) project, an annotated corpus of sentences with errors and their corrections.
Abstract: In this work, we address the problem of spelling correction in the Arabic language using the new corpus provided by the QALB (Qatar Arabic Language Bank) project, an annotated corpus of sentences with errors and their corrections. The corpus contains edit, add before, split, merge, add after, move, and other error types. We are concerned with the first four error types, as they account for more than 90% of the spelling errors in the corpus. The proposed system uses a separate model for each error type and then integrates the models into an efficient and robust system that achieves an overall recall of 0.59, precision of 0.58, and F1 score of 0.58, including all the error types, on the development set. Our system participated in the QALB 2014 shared task "Automatic Arabic Error Correction" and achieved an F1 score of 0.6, placing sixth out of nine participants.
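As a quick consistency check on the development-set scores, the F1 score can be recomputed as the harmonic mean of precision and recall. The sketch below assumes the reported figures read as precision 0.58 and recall 0.59, and that the balanced F-measure is the one intended; the function name `f1_score` is illustrative and not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Balanced F-measure: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Development-set figures as read from the abstract (assumed values).
dev_f1 = f1_score(precision=0.58, recall=0.59)
print(round(dev_f1, 2))  # 0.58
```

The recomputed value rounds to 0.58, which agrees with the F1 score stated for the development set.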
Figures

Table 1: The feature set used by the edit errors classifier.
Table 2: The feature set used by the add before errors classifier.
Table 3: The incremental results after adding each error type model and applying them on the development set.
Table 4: The results of some combinations of the models and applying them on the development set. The models are abbreviated as Edit E, Merge M, Split S, and Add before A.
Citations
The First QALB Shared Task on Automatic Text Correction for Arabic
Behrang Mohit,Alla Rozovskaya,Nizar Habash,Wajdi Zaghouani,Ossama Obeid +4 more
- 01 Oct 2014
TL;DR: An overview of the QALB corpus which was the source of the datasets used for training and evaluation, an overview of participating systems, results of the competition and an analysis of the results and systems are presented.
Arib@QALB-2015 Shared Task: A Hybrid Cascade Model for Arabic Spelling Error Detection and Correction
Nouf AlShenaifi,Rehab AlNefie,Maha Al-Yahya,Hend S. Al-Khalifa +3 more
- 01 Jul 2015
TL;DR: The Arib system for Arabic spelling error detection and correction is presented as part of the second Shared Task on Automatic Arabic Error Correction, and results indicate that using the correction components in a cascaded way yields the best results.
Automatic Correction of Arabic Dyslexic Text
Maha Alamri,William J. Teahan +1 more
TL;DR: An automatic correction system that detects and corrects dyslexic errors in Arabic text, using a language model based on the Prediction by Partial Matching text compression scheme to generate possible alternatives for each misspelled word.
Improving Arabic morphological analyzers benchmark
TL;DR: Two new major improvements are presented: the establishment of the first version of a corpus dedicated to the evaluation of morphological analyzers, and the introduction of a new metric that combines all metrics related to results as well as the execution time of the analyzers.
References
•Posted Content
NLTK: The Natural Language Toolkit
Edward Loper,Steven Bird +1 more
TL;DR: NLTK, the Natural Language Toolkit, is a suite of open source program modules, tutorials and problem sets, providing ready-to-use computational linguistics courseware that covers symbolic and statistical natural language processing.
•Proceedings Article
MADAMIRA: A Fast, Comprehensive Tool for Morphological Analysis and Disambiguation of Arabic
Arfath Pasha,Mohamed Al-Badrashiny,Mona Diab,Ahmed El Kholy,Ramy Eskander,Nizar Habash,Manoj Pooleery,Owen Rambow,Ryan M. Roth +8 more
- 01 May 2014
TL;DR: MADAMIRA is a system for morphological analysis and disambiguation of Arabic that combines some of the best aspects of two previously commonly used systems for Arabic processing with a more streamlined Java implementation that is more robust, portable, extensible, and is faster than its ancestors by more than an order of magnitude.
•Proceedings Article
Better Evaluation for Grammatical Error Correction
Daniel Dahlmeier,Hwee Tou Ng +1 more
- 03 Jun 2012
TL;DR: This work presents a novel method for evaluating grammatical error correction: an algorithm that efficiently computes the sequence of phrase-level edits between a source sentence and a system hypothesis that achieves the highest overlap with the gold-standard annotation.
The First QALB Shared Task on Automatic Text Correction for Arabic
Behrang Mohit,Alla Rozovskaya,Nizar Habash,Wajdi Zaghouani,Ossama Obeid +4 more
- 01 Oct 2014
TL;DR: An overview of the QALB corpus which was the source of the datasets used for training and evaluation, an overview of participating systems, results of the competition and an analysis of the results and systems are presented.