Scispace (Formerly Typeset)
Showing papers presented at "Artificial Intelligence and Natural Language in 2016"
Proceedings Article•
Lexical, morphological and semantic correlates of the Dark Triad personality traits in Russian Facebook texts

Polina Panicheva, Yanina Ledovaya, Olga Bogolyubova1•
Clarkson University1
1 Nov 2016
TL;DR: Morphological and semantic analysis are applied to investigate the relationship between the Dark traits and their linguistic manifestation in social network texts and identify correlated features, a step towards automatic Dark trait prediction and early detection of the potentially harmful mental states.
Abstract: The presented project is intended to make use of growing amounts of textual data in social networks in the Russian language, in order to find linguistic correlates of the Dark Triad personality traits, comprising non-clinical Narcissism, Machiavellianism and Psychopathy. The background for the investigation includes, on the one hand, psychological research on these phenomena and their measurement instruments, and on the other hand, recent advances in computational stylometry and text-based author profiling. The measures for these psychological phenomena are provided by recognized self-report psychological surveys adapted to Russian. Morphological and semantic analysis are applied to investigate the relationship between the Dark traits and their linguistic manifestation in social network texts. Significant morphological and semantic correlates of Narcissism, Machiavellianism and Psychopathy are identified and compared to respective advances in English author profiling. In order to deepen our understanding of the relation between these psychological characteristics and natural language use, the identified linguistic features are interpreted in terms of the fine-grained factor structure of the Dark traits. Identifying correlated features is a step towards automatic Dark trait prediction and early detection of the potentially harmful mental states.

20 citations

Proceedings Article•
Measuring influencers in Twitter ad-hoc discussions: active users vs. internal networks in the discourse on Biryuliovo bashings in 2013

Svetlana S. Bodrunova1, Ivan S. Blekanov1, Alexey Maksimov1•
Saint Petersburg State University1
1 Nov 2016
TL;DR: It is shown that users who post or even get commented most do not make it to the positions of most 'central' users by network metrics, and it is demonstrated that users that rank high by betweenness and PageRank centrality form circles of reciprocal commenting that show the social cleavage wider than the discussion itself.
Abstract: Despite the disputable possibility of extending analysis of social relations on Twitter to real life, Twitter discussions are still under the attention of scholars studying structures and meanings of news- and issue-based ad-hoc public discourse. One of the socially relevant aspects of Twitter studies is that of influencers: accounts that produce impact, either inside or outside Twitter. But there is still no agreement in the research community on how to define and measure who is an influencer: either by 'absolute figures' or by network analysis metrics; this issue is even rarely discussed. Politically, today's mediatized public sphere, where traditional media play the role of information hubs, is highly uneven in terms of access to opinion expression; it privileges institutional players, including political elites, corporations, and media themselves. Hopes that Twitter would provide a more equal space for public deliberation are still not proven well enough. Using web crawling and manual assessment of Twitter ad-hoc discussion on the Biryulyovo bashings of 2013, we show that users who post or even get commented most do not make it to the positions of most 'central' users by network metrics. We also demonstrate that users that rank high by betweenness and PageRank centrality form circles of reciprocal commenting that show the social cleavage wider than the discussion itself.

16 citations
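
The contrast drawn in this abstract between 'absolute figures' and network metrics can be illustrated with a minimal sketch. It is not the authors' pipeline: the reply data, the account names, and the `pagerank` helper are hypothetical, and only PageRank (not betweenness) is computed, using plain power iteration.

```python
from collections import Counter

def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank over a directed edge list (source -> target)."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            # Nodes without outgoing links spread their rank over all nodes.
            targets = out_links[src] or nodes
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank

# Hypothetical toy data: (commenter, addressee) pairs.
replies = [
    ("spammer", "a"), ("spammer", "b"), ("spammer", "c"), ("spammer", "d"),
    ("a", "hub"), ("b", "hub"), ("c", "hub"), ("d", "hub"), ("hub", "a"),
]

activity = Counter(src for src, _ in replies)   # 'absolute figures'
ranks = pagerank(replies)                       # network metric

most_active = activity.most_common(1)[0][0]
most_central = max(ranks, key=ranks.get)
print(most_active, most_central)
```

In this toy graph the most active account and the most central account differ, which is the kind of mismatch the abstract reports for real data.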

Proceedings Article•
Predicting the age of social network users from user-generated texts with word embeddings

Anton Alekseev, Sergey I. Nikolenko1•
Steklov Mathematical Institute1
1 Nov 2016
TL;DR: The efficiency of age prediction algorithms based on word2vec word embeddings is evaluated in a comprehensive experimental study that compares these algorithms with each other and with classical baseline approaches.
Abstract: Many web-based applications such as advertising or recommender systems often critically depend on the demographic information, which may be unavailable for new or anonymous users. We study the problem of predicting demographic information based on user-generated texts on a Russian-language dataset from a large social network. We evaluate the efficiency of age prediction algorithms based on word2vec word embeddings and conduct a comprehensive experimental evaluation, comparing these algorithms with each other and with classical baseline approaches.

12 citations

Proceedings Article•
Multiword expressions in Russian thesauri RuThes and RuWordNet

Natalia V. Loukachevitch1, German Lashevich2•
Bauman Moscow State Technical University1, Kazan Federal University2
1 Nov 2016
TL;DR: All the described expressions may look like compositional expressions but have specific relations that can be useful in applications, and it is proposed to automatically introduce additional relations for their better representation.
Abstract: We present the types of multiword expressions included in RuThes, a thesaurus of the Russian language. Many of these expressions may look like compositional expressions but have specific relations that can be useful in applications. The relation system of the RuThes thesaurus allows natural description of relations between an expression and its components if necessary. Transforming the RuThes knowledge into the Princeton WordNet structure for creating a Russian wordnet (RuWordNet), we also transfer all the described expressions into the new resource and propose to automatically introduce additional relations for their better representation.

11 citations

Proceedings Article•
Improving neural network models for natural language processing in Russian with synonyms

Ruslan Galinsky1, Anton Alekseev1, Sergey I. Nikolenko1•
Steklov Mathematical Institute1
1 Nov 2016
TL;DR: This work suggests a data augmentation method based on extending a given dataset with synonyms for the words appearing there and applies this approach to the morphologically rich Russian language and shows improvements for modern neural network NLP models on standard tasks such as sentiment analysis.
Abstract: Recent advances in deep learning for natural language processing achieve and improve over state-of-the-art results in many natural language processing tasks. One problem with neural network models, however, is that they require large datasets, including large labeled datasets for the corresponding problems. In this work, we suggest a data augmentation method based on extending a given dataset with synonyms for the words appearing there. We apply this approach to the morphologically rich Russian language and show improvements for modern neural network NLP models on standard tasks such as sentiment analysis.

10 citations
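
The idea of extending a dataset with synonyms can be sketched roughly as follows. This is an illustration under stated assumptions, not the method from the paper: the `augment_with_synonyms` helper, the tiny synonym dictionary, and the `max_variants` cap are all hypothetical, and the sketch ignores the inflection and agreement handling that a morphologically rich language such as Russian would require in practice.

```python
from itertools import product

def augment_with_synonyms(tokens, synonyms, max_variants=10):
    """Return the original token list plus variants with synonym substitutions."""
    # For every token, the candidates are the token itself and its synonyms.
    options = [[tok] + synonyms.get(tok, []) for tok in tokens]
    variants = []
    for combo in product(*options):
        variants.append(list(combo))
        if len(variants) >= max_variants:
            break
    return variants

# Hypothetical synonym dictionary (same gender and case, so agreement holds).
synonyms = {"хороший": ["отличный"], "фильм": ["кинофильм"]}
sample = ["очень", "хороший", "фильм"]

augmented = augment_with_synonyms(sample, synonyms)
for variant in augmented:
    print(" ".join(variant))
```

Each generated variant would inherit the label of the original example, for instance its sentiment class, so the labeled dataset grows without new annotation.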

Proceedings Article•
Towards cluster validity index evaluation and selection

Andrey Filchenkov1, Sergey Muravyov1, Vladimir Parfenov1•
Saint Petersburg State University of Information Technologies, Mechanics and Optics1
1 Nov 2016
TL;DR: This work introduces four quality measures for CVI evaluation and suggests an approach for the best CVI prediction for a given dataset based on meta-learning.
Abstract: In this work, we address the hard clustering problem. We study how well clustering algorithm efficacy measures (clustering validity indices) can reflect the clustering quality. We use assessors' estimations of cluster partition adequacy as the ground truth and explain why this is the only measure that can be used in this capacity. We compare different clustering validity indices and show that none of them can be universal, reflecting quality for each cluster partition. To do so, we introduce four quality measures for CVI evaluation. Also, we suggest an approach for the best CVI prediction for a given dataset based on meta-learning.

10 citations
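
For readers unfamiliar with clustering validity indices, the sketch below computes one widely used index, the mean silhouette width, for two partitions of the same toy points. The abstract does not say which indices were compared, so the choice of silhouette, the `silhouette_score` helper, and the data are assumptions made purely for illustration. The function expects at least two clusters.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette width; higher means tighter, better-separated clusters."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the distance to the point itself is 0, so it does not affect the sum).
        a = sum(math.dist(point, other) for other in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(point, other) for other in members) / len(members)
            for other_label, members in clusters.items()
            if other_label != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical toy data: two visually obvious groups.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good_score = silhouette_score(points, [0, 0, 1, 1])  # groups kept together
bad_score = silhouette_score(points, [0, 1, 0, 1])   # groups split apart
print(good_score, bad_score)
```

An index like this assigns a number to every partition; the question raised in the abstract is whether such numbers agree with human assessors' judgments.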

Proceedings Article•
A general method applicable to the search for anglicisms in Russian social network texts

Alena Fenogenova1, Ilia Karpov1, Viktor Kazorin•
National Research University – Higher School of Economics1
1 Nov 2016
TL;DR: A corpora-based approach to the automatic detection of anglicisms in Russian social network texts based on the idea of simultaneous scripting, phonetics, and semantics similarity of the original Latin word and its Cyrillic analogue is presented.
Abstract: In the process of globalization, the number of English words in other languages has rapidly increased. In automatic speech recognition systems, spell-checking, tagging, and other software in the field of natural language processing, loan words are not easily recognized and should be evaluated separately. In this paper we present a corpora-based approach to the automatic detection of anglicisms in Russian social network texts. The proposed method is based on the idea of simultaneous scripting, phonetic, and semantic similarity of the original Latin word and its Cyrillic analogue. We used a set of transliteration, phonetic transcribing, and morphological analysis methods to find possible hypotheses and distributional semantic models to filter them. The resulting list of borrowings, gathered from approximately 20 million LiveJournal texts, shows good intersection with manually colle

4 citations

Proceedings Article•
Speech analysis and synthesis systems for the Tatar language

Aidar Khusainov1, Alfira Khusainova2•
Russian Academy of Sciences1, Kazan Federal University2
1 Nov 2016
TL;DR: The work consists of three main elements: speech recognition system, speech synthesizer and language identification system that will be used in mobile and desktop applications, for instance, machine translation system, smart assistant.
Abstract: In this paper we describe our recent work on creating a speech human-machine interface for the Tatar language. Our work consists of three main elements: a speech recognition system, a speech synthesizer and a language identification system. These systems will be used in mobile and desktop applications, for instance, a machine translation system or a smart assistant.

2 citations

Proceedings Article•
Relational machine learning author disambiguation

Ekaterina Bastrakova1, Rodney Ledesma1, Jose Millan1, Fabien Rico2, Djamel A. Zighed •
Lumière University Lyon 21, Claude Bernard University Lyon 12
1 Nov 2016
TL;DR: A workflow that correctly disambiguates different instances of an author's name present in academic publications retrieved from the Internet is implemented using the best of a relational database engine and data mining techniques implemented in R.
Abstract: Author disambiguation is an open issue in the world of academic digital libraries. As many problems arise when trying to identify if two different signatures are from the same author and then group them, this issue has become more relevant inside the scientific community. This paper illustrates a workflow that aims to solve this issue. By using the best of a relational database engine and data mining techniques implemented in R, we have implemented a workflow that correctly disambiguates different instances of an author's name present in academic publications retrieved from the Internet. To evaluate the results we perform a two-step validation process inside the workflow, validating if two articles were written by the same author, and if so, validating the authors grouped together as a unique disambiguated author. With the validations performed, the workflow implemented allows the process of identifying and disambiguating any new author.

1 citation
