Scispace (Formerly Typeset)
Showing papers presented at "Artificial Intelligence and Natural Language in 2015"
Proceedings Article•10.1109/AINL-ISMW-FRUCT.2015.7382963•
Evaluation of the modern visual SLAM methods

Arthur Huletski1, Dmitriy Kartashov1, Kirill Krinkin•
Saint Petersburg Academic University1
1 Nov 2015
TL;DR: This paper gives a brief intuitive description of the ORB-SLAM, LSD-SLAM, L-SLAM and OpenRatSLAM algorithms, compares them theoretically (based on the given descriptions), and evaluates them on the TUM RGB-D benchmark.
Abstract: Simultaneous Localization and Mapping (SLAM) is a challenging task in robotics. Researchers work hard on it, so several novel SLAM algorithms, as well as enhancements for known ones, are published every year. We have selected recent (2013 to mid-2015) approaches that in theory can be run on a mobile robot and evaluated them. This paper gives a brief intuitive description of the ORB-SLAM, LSD-SLAM, L-SLAM and OpenRatSLAM algorithms, then compares the algorithms theoretically (based on the given descriptions) and evaluates them on the TUM RGB-D benchmark.

43 citations

Proceedings Article•10.1109/AINL-ISMW-FRUCT.2015.7382967•
Design and implementation Raspberry Pi-based omni-wheel mobile robot

Kirill Krinkin1, Elena Stotskaya, Yury Stotskiy2•
Saint Petersburg State Electrotechnical University1, EMC Corporation2
1 Nov 2015
TL;DR: Hardware design and control software are described for a small omni-wheel robot built for indoor testing of SLAM algorithms.
Abstract: Nowadays simultaneous localization and mapping (SLAM) algorithms are tested in at least two phases: software simulation and testing on a real hardware platform. This paper describes the hardware design and control software of a small omni-wheel robot built for indoor testing of SLAM algorithms.

14 citations

Proceedings Article•10.1109/AINL-ISMW-FRUCT.2015.7382966•
Recurrent neural network-based language modeling for an automatic Russian speech recognition system

Irina S. Kipyatkova1, Alexey Karpov2•
Saint Petersburg State University1, Russian Academy of Sciences2
1 Nov 2015
TL;DR: A study of recurrent neural network language models for N-best list rescoring in automatic continuous Russian speech recognition, achieving a relative word error rate reduction of 14% with respect to the baseline 3-gram model.
Abstract: In this paper, we describe a study of recurrent neural network language models for N-best list rescoring in automatic continuous Russian speech recognition. We tried recurrent neural networks with different numbers of units in the hidden layer. We achieved a relative word error rate reduction of 14% with respect to the baseline 3-gram model.

12 citations

Proceedings Article•10.1109/AINL-ISMW-FRUCT.2015.7382973•
Comparison of sentence similarity measures for Russian paraphrase identification

Ekaterina V. Pronoza1, Elena Yagunova1•
Saint Petersburg State University1
1 Nov 2015
TL;DR: The research disproves the supposition that it is more difficult to distinguish between precise and loose paraphrases than between loose paraphrases and non-paraphrases.
Abstract: In this paper we analyze and compare different types of sentence similarity measures applied to the problem of sentential paraphrase identification. We work with Russian, and all the experiments are conducted on the Russian paraphrase corpus we have collected from the news headlines (and are collecting at the moment). Apart from the similarity measures, we also analyze the corpus itself. As a result of the research we disprove the supposition that it is more difficult to distinguish between precise and loose paraphrases than between loose paraphrases and non-paraphrases. We also come up with the recommendations for the application of different similarity measures to identifying paraphrases derived from the news texts.

12 citations

Proceedings Article•10.1109/AINL-ISMW-FRUCT.2015.7382965•
Discovering text reuse in large collections of documents: A study of theses in history sciences

Anton Khritankov, Pavel V. Botov, Nikolay S. Surovenko, Sergey V. Tsarkov, Dmitriy V. Viuchnov, Yuri V. Chekhovich 
1 Nov 2015
TL;DR: Using algorithmic and statistical methods, groups of highly connected theses with a large amount of text reuse between them are discovered, works compiled from several other theses are located, and the sources of reuse are pointed out.
Abstract: In this paper we investigate graphs of text reuse cases in scientific degree theses in history sciences (07.xx.xx of Russian Higher Attestation Committee topic codes). Using algorithmic and statistical methods we discovered groups of highly connected theses with a large amount of text reuse between them. In addition, we located works compiled from several other theses and pointed out the sources of reuse.

12 citations

Proceedings Article•10.1109/AINL-ISMW-FRUCT.2015.7382975•
Morpho-syntactic parsing based on neural networks and corpus data

Roman Rybka1, Alexander Sboev1, Ivan Moloshnikov1, Dmitry Gudovskikh1•
Kurchatov Institute1
1 Nov 2015
TL;DR: Methods for constructing a procedure of morpho-syntactic parsing based on corpus dataset analysis are presented, including a method of parsing sentences on the basis of neural network algorithms and a selected set of parameters in the format of the corpus used.
Abstract: This article presents methods for constructing a procedure of morpho-syntactic parsing based on corpus dataset analysis. It contains 1) a method to eliminate morphological ambiguities using existing morphological parsers and then converting the parsing results into the format of the language corpus used; 2) a method of selecting parameters for syntactic parsing and assessing the achievable parsing accuracy that the data of the corpus used can provide; 3) a method of parsing sentences on the basis of neural network algorithms and a selected set of parameters in the format of the corpus used. The study is based on sentences with unambiguous morpho-syntactic marking from the Russian National Corpus.

8 citations

Proceedings Article•10.1109/AINL-ISMW-FRUCT.2015.7382969•
An information retrieval system for technology analysis and forecasting

Nikita Nikitinsky, Dmitry Ustalov1, Sergey Shashev•
Ural Federal University1
1 Nov 2015
TL;DR: A scientific information retrieval system designed for the Russian language that uses patents, research papers and government contracts for facilitating the expertise process by providing the experts with relevant documents is presented.
Abstract: Expert evaluation of grant proposals and research projects is often facilitated by specialized decision support systems, which analyze research and industry trends in a large domain-dependent text corpus. Although production-grade technological forecasting systems exist for English, Russian patent databases and citation indexes have been developed in isolation from the global ones. This complicates technology analysis and forecasting in research conducted in Russia. In this paper, we present a scientific information retrieval system designed for the Russian language. The system uses patents, research papers and government contracts to facilitate the expertise process by providing the experts with relevant documents. Comparison of our system with a popular baseline shows promising results.

6 citations

Proceedings Article•10.1109/AINL-ISMW-FRUCT.2015.7382972•
Communication between emergency medical system equipped with panic buttons and hospital information systems: Use case and interfaces

Ilya Paramonov1, Andrey Vasilyev1, Ivan Timofeev1•
Petrozavodsk State University1
1 Nov 2015
TL;DR: This paper identifies a typical use case of communication between emergency medical services equipped with the “panic button” and healthcare information systems, and analyzes possible ways of organizing such communication.
Abstract: For patients at risk of an out-of-hospital emergency, rapid provision of first aid is essential. Emergency medical services equipped with the “panic button” aim to reduce the time to first aid. Such services can be further improved by having them communicate with healthcare information systems deployed in hospitals. Such communication can be used to retrieve the patient's medical history directly during first aid provision, find an appropriate hospital for the patient's conveyance, automatically transmit the clinical handover information, etc. This paper identifies a typical use case of communication between emergency medical services equipped with the “panic button” and healthcare information systems, and analyzes possible ways of organizing such communication.

6 citations

Proceedings Article•10.1109/AINL-ISMW-FRUCT.2015.7382968•
Twitter as a transport layer platform

Dmitry Namiot1•
Moscow State University1
1 Nov 2015
TL;DR: This work introduces a programmable service called 411 for Twitter, which supports user-defined and application-specific commands through tweets, and describes how information systems can use Twitter as a transport layer for their own services.
Abstract: Internet messengers and social networks have become an integral part of modern digital life. We have in mind not only the interaction between individual users but also the variety of applications that exist within them. Typically, applications for social networks use the universal login system and rely on data from social networks. Also, such applications are likely to get more traction when they are inside a big social network like Facebook. At the same time, less attention is paid to the communication capabilities of social networks. In this paper, we treat Twitter primarily as a messaging system. We describe how information systems can use Twitter as a transport layer for their own services. Our work introduces a programmable service called 411 for Twitter, which supports user-defined and application-specific commands through tweets.

5 citations

Proceedings Article•10.1109/AINL-ISMW-FRUCT.2015.7382982•
Revealing potential changes of significant terms in streams of textual data written in natural languages using windowing and text mining

Jan Zizka1, Frantisek Darena1•
Mendel University1
1 Nov 2015
TL;DR: The presented research deals with analyzing continuous streams of textual data written in natural languages and demonstrates that the suggested method provides reliable results.
Abstract: The presented research deals with analyzing continuous streams of textual data written in natural languages. One of the problems is revealing possible significant concept changes in Internet blogs, discussions, etc., together with discovering what represents such data, whether it is more or less topically invariable or changing, and what kind of change occurred. A real-world textual dataset is analyzed using text mining with automatically generated decision trees to find significant words that affect the correct assignment of document labels (classes) and can be used for detecting noticeable changes. The changes and their detection are modeled here by an assorted gradual mixture of two languages, and the degree of change is measured by cosine, Euclidean, and Jaccard distance (similarity), which provide qualitatively the same result. The monitoring procedure is based on analyzing successive adjacent pairs of data windows in the stream, comparing the current window with the previous one, both represented by their lists of relevant features expressed in words. The presented results demonstrate that the suggested method provides reliable results.
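
Illustrative sketch (not from the paper): the window comparison described in the abstract can be pictured as computing cosine, Euclidean, and Jaccard distances between the word lists of each window and its predecessor. The function names and the representation of a window as a plain list of words are assumptions made for this example; the paper selects its relevant words with decision trees, which is not reproduced here.

```python
import math
from collections import Counter


def cosine_distance(a: Counter, b: Counter) -> float:
    """1 - cosine similarity of two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # assumption: an empty window is maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_distance(a: Counter, b: Counter) -> float:
    """Euclidean distance over the union vocabulary (missing words count 0)."""
    return math.sqrt(sum((a[w] - b[w]) ** 2 for w in a.keys() | b.keys()))


def jaccard_distance(a: Counter, b: Counter) -> float:
    """1 - |intersection| / |union| of the two vocabularies."""
    union = a.keys() | b.keys()
    if not union:
        return 0.0
    return 1.0 - len(a.keys() & b.keys()) / len(union)


def window_changes(windows):
    """Compare every window with the one before it (hypothetical helper)."""
    counts = [Counter(w) for w in windows]
    return [
        {
            "cosine": cosine_distance(prev, cur),
            "euclidean": euclidean_distance(prev, cur),
            "jaccard": jaccard_distance(prev, cur),
        }
        for prev, cur in zip(counts, counts[1:])
    ]
```

A rise in all three values between consecutive windows would correspond to the kind of change the abstract describes; which threshold counts as "significant" is left open here.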

4 citations

Proceedings Article•10.1109/AINL-ISMW-FRUCT.2015.7382960•
A monolingual approach to detection of text reuse in Russian-English collection

Oleg Bakhteev, Rita Kuznetsova, Alexey Romanov, Anton Khritankov
1 Nov 2015
TL;DR: A method for cross-lingual (Russian and English) text reuse detection is developed, based on the monolingual approach: translation of texts into one language and reduction to the text similarity problem.
Abstract: In this paper we develop a method for cross-lingual (Russian and English) text reuse detection. The method is based on the monolingual approach: translation of texts into one language and reduction to the text similarity problem. We split texts into non-overlapping fragments and compare fragments to each other by means of different metrics: BLEU(1–2), METEOR, cosine similarity between bag-of-words representations of each snippet, and cosine similarity between vectors obtained from a doc2vec-trained model. We explore the impact of the choice of metric on the quality of text reuse detection. We assess the quality of the method on a sample of a hundred scientific documents, originally in Russian, machine translated into English. Preliminary findings demonstrate the feasibility of the approach.
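
Illustrative sketch (not from the paper): a minimal version of the fragment comparison step, assuming both texts are already in the same language. Only the bag-of-words cosine metric is shown; the fragment size, the whitespace tokenizer, the threshold, and all function names are assumptions chosen for this example.

```python
import math
from collections import Counter


def split_fragments(text, size=5):
    """Split a text into non-overlapping fragments of `size` tokens."""
    tokens = text.lower().split()
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def cosine_similarity(x, y):
    """Cosine similarity between bag-of-words vectors of two token lists."""
    a, b = Counter(x), Counter(y)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def find_reuse(suspicious, source, size=5, threshold=0.8):
    """Return (suspicious_idx, source_idx, score) for similar fragment pairs."""
    hits = []
    for i, frag_s in enumerate(split_fragments(suspicious, size)):
        for j, frag_r in enumerate(split_fragments(source, size)):
            score = cosine_similarity(frag_s, frag_r)
            if score >= threshold:
                hits.append((i, j, score))
    return hits
```

The all-pairs loop is quadratic in the number of fragments, which is acceptable for a sketch but would need indexing or candidate filtering on a large collection.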
Proceedings Article•10.1109/AINL-ISMW-FRUCT.2015.7382979•
Multi-representation approach to text regression of financial risks

Roman Trusov1, Alexey Natekin2, Pavel Kalaidin, Sergey Ovcharenko, Alois Knoll3, Aida Fazylova •
Saint Petersburg State University of Information Technologies, Mechanics and Optics1, Deloitte2, Technische Universität München3
1 Nov 2015
TL;DR: This article explores opportunities for using multiple text representations simultaneously within one regression task, combining the conventional bag-of-words approach with more semantically rich embeddings, and investigates the performance of this multi-representation approach on the financial risk prediction problem.
Abstract: Different approaches for textual feature extraction have been proposed, starting with simple word count features and continuing with deeper representations capturing distributional semantics. In recent publications word embedding methods have been successfully used as a representation basis for a large number of NLP tasks such as text classification, part-of-speech tagging and many others. In this article we explore opportunities for using multiple text representations simultaneously within one regression task, combining the conventional bag-of-words approach with more semantically rich embeddings. We investigate the performance of this multi-representation approach on the financial risk prediction problem. Publicly available 10-K reports filed by US trading companies are used as the basis for predicting the next-year change in stock price volatility. Our study shows that models based on single representations achieve performance comparable to previously published results on risk prediction, and that models with multiple representations benefit from complementary information and outperform both baseline and single-representation models.
Proceedings Article•10.1109/AINL-ISMW-FRUCT.2015.7382981•
Implementation of the new REST API for open source LBS-platform Geo2Tag

Mark Zaslavskiy1, Dmitry Mouromtsev1•
Saint Petersburg State University of Information Technologies, Mechanics and Optics1
1 Nov 2015
TL;DR: The platform was improved to address the following challenges: data visualization, extended datetime processing, social network integration and background calculations support, and the recommendations were fully implemented in the API.
Abstract: The article describes the current state of the Geo2Tag LBS platform project and the implementation of the new API version. The platform was improved to address the following challenges: data visualization, extended datetime processing, social network integration and background calculations support. These challenges were justified by a review of the most important tendencies for geocontext applications and LBS platforms. The recommendations were fully implemented in the API. The article also describes the implementation of the new version. As an example, an Open Data import API and a specific plugin for the Open Karelia system were implemented. This extension made it possible to perform geocontext markup of complex spatiotemporal data inside the platform.
Proceedings Article•10.1109/AINL-ISMW-FRUCT.2015.7382980•
Crowdsourcing synset relations with Genus-Species-Match

Dmitry Ustalov
1 Nov 2015
TL;DR: Genus-Species-Match is presented, a crowdsourcing workflow for matching noisy pairs of synsets representing hyponymic/hypernymic relations; it demonstrates an F1 score of 80% in an experiment conducted on an online labor marketplace using the EMERCOM glossary and the Yet Another RussNet sense inventory.
Abstract: Enabling a domain-specific lexical resource is useful for improving the performance of a natural language processing system. However, such resources may be represented in the form of glossaries: terms provided with their sense definitions. Although the problem of integrating such domain-specific glossaries into more sophisticated general-purpose resources like thesauri is highly topical, it is complicated by the ambiguity of the individual terms. This paper presents Genus-Species-Match, a crowdsourcing workflow for matching noisy pairs of synsets representing hyponymic/hypernymic relations. The system demonstrates an F1 score of 80% in an experiment conducted on an online labor marketplace using the EMERCOM glossary and the Yet Another RussNet sense inventory.
Proceedings Article•10.1109/AINL-ISMW-FRUCT.2015.7382977•
Applying the P-medians in the design of modern systems-on-chip

Elena Suvorova1, Nadezhda Matveeva1, Lev Kurbanov1•
Saint Petersburg State University of Aerospace Instrumentation1
1 Nov 2015
TL;DR: This paper describes in detail the solution of the p-median problem for homogeneous systems-on-chip and describes different methods of calculating the p-medians.
Abstract: In this paper we consider using p-median search algorithms in the design of modern systems-on-chip. This mathematical apparatus can be used to solve some of the tasks that a developer faces. We consider the types of systems-on-chip for which the p-median problem is useful. We describe different methods of calculating the p-medians. We also examine which criteria can be used when searching for p-medians. We describe in detail the solution of the p-median problem for homogeneous systems-on-chip.
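
Illustrative sketch (not from the paper): the p-median problem asks for p vertices of a graph that minimize the total distance from every vertex to its nearest chosen vertex. The abstract does not say which algorithms or criteria the authors use, so the exhaustive search below is only a generic baseline for small graphs; the function names and the edge-list input format are assumptions.

```python
from itertools import combinations


def all_pairs_shortest_paths(n, edges):
    """Floyd-Warshall on an undirected weighted graph with vertices 0..n-1."""
    inf = float("inf")
    dist = [[0 if i == j else inf for j in range(n)] for i in range(n)]
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def p_median(n, edges, p):
    """Try every set of p vertices; return the cheapest set and its cost."""
    dist = all_pairs_shortest_paths(n, edges)
    best_cost, best_set = float("inf"), None
    for medians in combinations(range(n), p):
        # Each vertex is served by its nearest median.
        cost = sum(min(dist[v][m] for m in medians) for v in range(n))
        if cost < best_cost:
            best_cost, best_set = cost, medians
    return best_set, best_cost
```

Exhaustive search examines C(n, p) candidate sets, so it is only practical for small n; larger instances typically call for heuristics.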
Proceedings Article•10.1109/AINL-ISMW-FRUCT.2015.7382976•
Weighted finite-state transducer approach to German compound words reconstruction for Speech Recognition

Nickolay Shamraev, Alexander Batalshchikov, Mikhail Zulkarneev, Sergey Repalov, Anna Shirokova 
1 Nov 2015
TL;DR: An approach is proposed for German Large Vocabulary Speech Recognition, dealing with the problem of compound words, based on unsupervised word decomposition for German words and a probabilistic method for combining the words using finite state transducers.
Abstract: An approach is proposed for German Large Vocabulary Speech Recognition, dealing with the problem of compound words, based on unsupervised word decomposition for German words and a probabilistic method for combining the words using finite state transducers. The basic idea of the method is to train an n-gram language model on texts in which compound words are replaced by their parts plus a concatenation symbol. Thus, the context information is taken into account for the compound words and is used in the process of recombination to find the most probable variant of the recognition result. The advantage of this approach is improved word recognition accuracy and a more precise recombination of compound words.
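
Illustrative sketch (not from the paper): the text transformation mentioned in the abstract, in which compound words are replaced by their parts plus a concatenation symbol, can be pictured as below. The "+" marker, the hand-written split table, and the deterministic recombination are assumptions for this example; the paper uses unsupervised decomposition and a probabilistic recombination with finite state transducers, neither of which is reproduced here.

```python
MARK = "+"  # hypothetical concatenation symbol

# Hypothetical split table; the paper derives splits without supervision.
SPLITS = {
    "spracherkennung": ["sprach", "erkennung"],
    "sprachmodell": ["sprach", "modell"],
}


def decompose(tokens, splits):
    """Replace each known compound by its parts; non-final parts get MARK."""
    out = []
    for tok in tokens:
        parts = splits.get(tok)
        if parts is None:
            out.append(tok)
        else:
            out.extend(part + MARK for part in parts[:-1])
            out.append(parts[-1])
    return out


def recombine(tokens):
    """Glue every marked part onto the token that follows it."""
    out, prefix = [], ""
    for tok in tokens:
        if tok.endswith(MARK):
            prefix += tok[:-len(MARK)]
        else:
            out.append(prefix + tok)
            prefix = ""
    if prefix:  # dangling marked part at the end of the sequence
        out.append(prefix)
    return out
```

A language model trained on the output of `decompose` sees the parts in context, which is the idea the abstract describes; the recombination here simply follows the markers rather than choosing the most probable variant.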
Proceedings Article•10.1109/AINL-ISMW-FRUCT.2015.7382974•
Arabic manuscripts identification based on Feature Relation Graph

Oleg Redkin1, Olga Bernikova1, Dmitry S. Shalymov1, Vladislav A. Pavlov1•
Saint Petersburg State University1
1 Nov 2015
TL;DR: A new metric based on the Feature Relation Graph (FRG) has proved to be effective for text-independent Persian writer identification and may also be applied to Arabic manuscripts, since Persian script is based on Arabic writing.
Abstract: We investigate a new metric based on the Feature Relation Graph (FRG). This metric has proved to be effective for text-independent Persian writer identification. Since Persian script is based on Arabic writing, similar principles of analysis may also be applied to Arabic manuscripts. We have investigated the FRG for Arabic handwritten texts. Pattern-based features are extracted from handwritten texts using Gabor and XGabor filters. The extracted features are represented for each author by the FRG, which plays the role of a feature vector in classification problems. We have also investigated different parameters of the FRG.
Proceedings Article•10.1109/AINL-ISMW-FRUCT.2015.7382970•
Software-to-hardware tester for the STP-ISS transport protocol verification

Valentin Olenev1, Irina Lavrovskaya1, Nadezhda Chumakova1•
Saint Petersburg State University of Aerospace Instrumentation1
1 Nov 2015
TL;DR: A description is given of a conformance tester developed to test on-board devices that conform to the STP-ISS transport protocol standard and the SpaceWire networking standard.
Abstract: Implementation of conformance testers for communication protocols is an important task, which is being solved in the majority of industrial companies that develop communication equipment. This article describes such a tester, developed to test on-board devices that conform to the STP-ISS transport protocol standard and the SpaceWire networking standard. We give a brief description of the possible solutions for hardware testing and describe the STP-ISS protocol. Then we report on the implementation of the Software-to-Hardware STP-ISS tester and its fields of application.
Proceedings Article•10.1109/AINL-ISMW-FRUCT.2015.7382962•
Datasets meta-feature description for recommending feature selection algorithm

Andrey Filchenkov1, Arseniy Pendryak1•
Saint Petersburg State University of Information Technologies, Mechanics and Optics1
1 Nov 2015
TL;DR: A meta-feature set is found which showed the best result in predicting proper feature selection algorithms and a novel approach to engineer meta-features for data preprocessing algorithms is suggested, which is based on estimating the best parametrization of processing algorithms on small subsamples.
Abstract: Meta-learning is an approach for solving the algorithm selection problem, which is how to choose the best algorithm for a certain task. This task corresponds to a dataset in machine learning and data mining. The main challenge in meta-learning is to engineer a meta-feature description for datasets. In the paper we apply meta-learning for feature selection. We found a meta-feature set which showed the best result in predicting proper feature selection algorithms. We also suggested a novel approach to engineer meta-features for data preprocessing algorithms, which is based on estimating the best parametrization of processing algorithms on small subsamples.
