TL;DR: This paper reports on the development of a Named Entity Recognition (NER) system for South and South East Asian languages, particularly Bengali, Hindi, Telugu, Oriya and Urdu, as part of the IJCNLP-08 NER Shared Task 1.
Abstract: This paper reports on the development of a Named Entity Recognition (NER) system for South and South East Asian languages, particularly Bengali, Hindi, Telugu, Oriya and Urdu, as part of the IJCNLP-08 NER Shared Task 1. We have
TL;DR: This study investigated the relationship between phonological awareness and reading in Oriya and English and found that phonological awareness in Oriya contributed significantly to reading Oriya and English words and pseudo-words for children in Oriya-medium schools.
Abstract: This study investigated the relationships between phonological awareness and reading in Oriya and English. Oriya is the official language of Orissa, an eastern state of India. The writing system is an alphasyllabary. Ninety-nine fifth grade children (mean age 9 years 7 months) were assessed on measures of phonological awareness, word reading and pseudo-word reading in both languages. Forty-eight of the children attended Oriya-medium schools where they received literacy instruction in Oriya from grade 1 and learned English from grade 2. Fifty-one children attended English-medium schools where they received literacy instruction in English from grade 1 and in Oriya from grade 2. The results showed that phonological awareness in Oriya contributed significantly to reading Oriya and English words and pseudo-words for the children in the Oriya-medium schools. However, it only contributed to Oriya pseudo-word reading and English word reading for children in the English-medium schools. Phonological awareness in English contributed to English word and pseudo-word reading for both groups. Further analyses investigated the contribution of awareness of large phonological units (syllable, onsets and rimes) and small phonological units (phonemes) to reading in each language. The data suggest that cross-language transfer and facilitation of phonological awareness to word reading is not symmetrical across languages and may depend both on the characteristics of the different orthographies of the languages being learned and whether the first literacy language is also the first spoken language.
TL;DR: This article discusses the Oriya Language Movement, a resistance movement against Bengali-medium education that was active in the Indian state of Orissa between 1868 and 1870, in the context of the colonial controversy over language policy between Orientalists, who favoured vernacular languages in education, and Anglicists, who favoured English.
Abstract: This article discusses the Oriya Language Movement, which was active between 1868 and 1870 in the Indian state of Orissa in the context of the colonial controversy over language policy between Orientalists, who claimed that vernacular languages were best for this purpose, and Anglicists, who favoured English. In the Orissa division, there were only seven Oriya schoolteachers; Bengalis formed the majority of teachers, even in remote areas. Consequently, Bengali books were prescribed textbooks for Oriya children. Emulating the Anglicists, the Bengalis made an effort to institutionalise Bengali medium education. After the Na'anka Famine in 1866, a resistance movement arose. It demanded that jobs be reserved for natives and that Oriya children read books in Oriya and not Bengali. It succeeded in dislodging Bengali from controlling schools in 1870. This victory of the native Oriya over the neo-colonising Bengali can be interpreted as a victory for Orientalism, with its tenet of vernacular education.
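TL;DR: This paper identifies spoken Indian languages from the CV behaviour of their syllables using an SVM classifier trained on limited data, and analyses the PPR Language Modelling concept for four major Indian languages (Hindi, Bengali, Oriya, and Telugu), reporting encouraging results.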
Abstract: Indian languages are either Indo-Aryan, influenced by Sanskrit, or Dravidian, influenced by Tamil. Dravidian languages are also influenced by Sanskrit. All Indian languages are influenced by the Pali language, whose graphemes are influenced by Brahmi. All Indian languages are phonetic in nature, and each has its own distinctive phone set. North Indian languages are Indo-Aryan and South Indian languages are Dravidian. Considering their respective phonetic properties in speech, we examine the special CV behaviour of each language in its syllables and are able to identify the language from the limited training data available using an SVM classifier. In the process we have analysed the PPR Language Modelling concept for four major Indian languages (Hindi, Bengali, Oriya, and Telugu), and the results are encouraging. General Terms: Spoken Language Identification, Speech Processing, Support Vector Machine