Top 43 papers published in the topic of Clef in 2015

Book Chapter•10.1007/978-3-319-24027-5_45•

General Overview of ImageCLEF at the CLEF 2015 Labs

Mauricio Villegas¹, Henning Müller², Andrew Gilbert³, Luca Piras⁴, Josiah Wang⁵, Krystian Mikolajczyk³, Alba García Seco de Herrera², Stefano Bromuri², M. Ashraful Amin⁶, Mahmood Kazi Mohammed⁷, Burak Acar⁸, Suzan Uskudarli⁸, Neda Barzegar Marvasti⁸, José F. Aldana⁹, María del Mar Roldán García⁹•Institutions (9)

Polytechnic University of Valencia¹, University of Applied Sciences Western Switzerland², University of Surrey³, University of Cagliari⁴, University of Sheffield⁵, Independence University⁶, Sir Salimullah Medical College⁷, Boğaziçi University⁸, University of Málaga⁹

8 Sep 2015

TL;DR: The x-ray task was the only fully novel task this year, although the other three tasks introduced modifications to keep up relevancy of the proposed challenges.

Abstract: This paper presents an overview of the ImageCLEF 2015 evaluation campaign, an event that was organized as part of the CLEF labs 2015. ImageCLEF is an ongoing initiative that promotes the evaluation of technologies for annotation, indexing and retrieval for providing information access to databases of images in various usage scenarios and domains. In 2015, the 13th edition of ImageCLEF, four main tasks were proposed: (1) automatic concept annotation, localization and sentence description generation for general images; (2) identification, multi-label classification and separation of compound figures from biomedical literature; (3) clustering of x-rays from all over the body; and (4) prediction of missing radiological annotations in reports of liver CT images. The x-ray task was the only fully novel task this year, although the other three tasks introduced modifications to keep up relevancy of the proposed challenges. The participation was considerably positive in this edition of the lab, receiving almost twice the number of submitted working notes papers as compared to previous years.

71 citations

Patent•

Intelligent keyboard interface for virtual musical instrument

Alexander H. Little¹, Eli T. Manjarrez¹•Institutions (1)

Apple Inc.¹

2 Jul 2015

TL;DR: A user interface for a virtual musical instrument presents a number of chord touch regions, each corresponding to a chord of a diatonic key and each containing treble clef and bass clef touch zones.

Abstract: A user interface for a virtual musical instrument presents a number of chord touch regions, each corresponding to a chord of a diatonic key. Within each chord region a number of touch zones are provided, including treble clef zones and bass clef zones. Each treble clef touch zone within a region will sound a different chord voicing. Each bass clef touch zone will sound a bass note of the chord. Other user interactions can modify or mute the chords, and vary the bass notes being played together with the chords. A set of related chords and/or a set of rhythmic patterns can be generated based on a selected instrument and a selected style of music.

38 citations

CLEF eHealth Evaluation Lab 2015 Task 1b: clinical named entity recognition

Aurélie Névéol¹, Cyril Grouin, Xavier Tannier, Thierry Hamon², Liadh Kelly³, Lorraine Goeuriot, Pierre Zweigenbaum•Institutions (3)

Université Paris-Saclay¹, University of Paris², Dublin City University³

1 Jan 2015

TL;DR: This paper reports on Task 1b of the 2015 CLEF eHealth evaluation lab, which extended the previous information extraction tasks of the ShARe/CLEF eHealth evaluation labs by considering ten types of entities, including disorders, that were to be extracted from biomedical text in French.

Abstract: This paper reports on Task 1b of the 2015 CLEF eHealth evaluation lab which extended the previous information extraction tasks of ShARe/CLEF eHealth evaluation labs by considering ten types of entities including disorders, that were to be extracted from biomedical text in French. The task consisted of two phases: entity recognition (phase 1), in which participants could supply plain or normalized entities, and entity normalization (phase 2). The entities to be extracted were defined according to Semantic Groups in the Unified

37 citations

Book Chapter•10.1007/978-3-319-24027-5_47•

Overview of the Living Labs for Information Retrieval Evaluation LL4IR CLEF Lab 2015

Anne Schuth¹, Krisztian Balog², Liadh Kelly³•Institutions (3)

University of Amsterdam¹, University of Stavanger², Trinity College, Dublin³

8 Sep 2015

TL;DR: The Living Labs for Information Retrieval Evaluation (LL4IR) CLEF Lab provides a benchmarking platform for researchers to evaluate their ranking systems in a live setting with real users in their natural task environments.

Abstract: In this paper we report on the first Living Labs for Information Retrieval Evaluation LL4IR CLEF Lab. Our main goal with the lab is to provide a benchmarking platform for researchers to evaluate their ranking systems in a live setting with real users in their natural task environments. For this first edition of the challenge we focused on two specific use-cases: product search and web search. Ranking systems submitted by participants were experimentally compared using interleaved comparisons to the production system from the corresponding use-case. In this paper we describe how these experiments were performed, what the resulting outcomes are, and conclude with some lessons learned.
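
The abstract does not say which interleaving scheme the lab used. As a rough illustration of how an interleaved comparison between a participant ranking and a production ranking can work, the sketch below implements team-draft interleaving, one common variant. The function names, document IDs and click list are hypothetical and are not taken from the paper.

```python
import random


def team_draft_interleave(ranking_a, ranking_b, length, rng=None):
    """Merge two rankings; return the merged list and each document's team."""
    rng = rng or random.Random()
    rankings = {"A": ranking_a, "B": ranking_b}
    picks = {"A": 0, "B": 0}
    interleaved, teams = [], {}
    while len(interleaved) < length:
        # The team with fewer picks goes next; a coin flip breaks ties.
        if picks["A"] < picks["B"]:
            team = "A"
        elif picks["B"] < picks["A"]:
            team = "B"
        else:
            team = rng.choice(["A", "B"])
        # Take the team's highest-ranked document that is not yet shown.
        doc = next((d for d in rankings[team] if d not in teams), None)
        if doc is None:
            # This team is exhausted, so fall back to the other ranking.
            team = "B" if team == "A" else "A"
            doc = next((d for d in rankings[team] if d not in teams), None)
            if doc is None:
                break  # both rankings are exhausted
        teams[doc] = team
        interleaved.append(doc)
        picks[team] += 1
    return interleaved, teams


def credit_clicks(teams, clicked_docs):
    """Count clicks per team; the team with more clicks wins the comparison."""
    wins = {"A": 0, "B": 0}
    for doc in clicked_docs:
        if doc in teams:
            wins[teams[doc]] += 1
    return wins


# Hypothetical example: A is a participant ranking, B the production ranking.
participant = ["d1", "d2", "d3"]
production = ["d2", "d4", "d1"]
merged, teams = team_draft_interleave(participant, production, 4, random.Random(0))
wins = credit_clicks(teams, ["d2", "d4"])
```

In a live setting, click credit would be aggregated over many impressions before declaring a winner; a single impression as shown here is only meant to show the mechanics.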

37 citations

SemGraphQA@ QALD5: LIMSI participation at QALD5@ CLEF.

Romain Beaumont, Brigitte Grau¹, Anne-Laure Ligozat¹•Institutions (1)

École Normale Supérieure¹

1 Sep 2015

TL;DR: An unsupervised method for the semantic analysis of questions that generates queries in two steps, based on graph transformations; the first step uses very general constraints on the query structure, which allows semantic ambiguities to be maintained in different graphs.

Abstract: For our participation to QALD-5, we developed a system for answering questions on a knowledge base. We proposed an unsupervised method for the semantic analysis of questions, that generates queries, based on graph transformations, in two steps. First step is independent of the knowledge base schema and makes use of very general constraints on the query structure that allows us to maintain semantic ambiguities in different graphs. Ambiguities are then solved globally at the final step when querying the knowledge base.

30 citations

Automatic Profiling of Twitter Users Based on Their Tweets: Notebook for PAN at CLEF 2015.

Octavia-Maria Sulea, Daniel Dichiu

1 Jan 2015

TL;DR: A novel way of computing the type/token ratio of an author is introduced and it is shown that, although strong correlations have been observed between high extroversion and low type/token ratios in the past, this ratio is not necessarily a strong indicator of extroversion.

Abstract: In this paper we go through our approach at solving the PAN Author Profiling task. We introduce a novel way of computing the type/token ratio of an author and show that, although strong correlations have been observed between high extroversion and low type/token ratios in the past, this ratio is not necessarily a strong indicator of extroversion. Since the text of a person is influenced by all 7 features (gender, age, and big five personality traits) that are required to be automatically identified in this task, we used this ratio, along with Term frequency-Inverse document frequency (tf-idf) matrices, in all 7 subtasks and all 4 corpora and obtained good results.
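
For readers unfamiliar with the measure, the sketch below computes the conventional type/token ratio (distinct word forms divided by total tokens) over all of an author's texts. The abstract does not describe the authors' novel variant, so this is only the standard baseline; the tokenizer and the sample tweets are assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase and split into word tokens (a simplistic, assumed tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def type_token_ratio(texts):
    """Standard type/token ratio over all texts of one author."""
    tokens = [tok for text in texts for tok in tokenize(text)]
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Hypothetical author with two short tweets: 6 distinct types over 9 tokens.
tweets = ["the cat sat on the mat", "the dog sat"]
ttr = type_token_ratio(tweets)
```

A low ratio indicates a more repetitive vocabulary. The raw ratio also depends on text length, which is one reason alternative ways of computing it are proposed.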

24 citations

Overview of CLEF QA Entrance Exams Task 2015.

Álvaro Rodrigo¹, Anselmo Peñas¹, Yusuke Miyao², Eduard H. Hovy², Noriko Kando³•Institutions (3)

National University of Distance Education¹, National Institute of Informatics², Carnegie Mellon University³

1 Jan 2015

TL;DR: The Entrance Exams task at the CLEF QA Track 2015 is described, requiring not only a high degree of textual inference, but also the development of strategies for selecting the correct answer.

Abstract: This paper describes the Entrance Exams task at the CLEF QA Track 2015. Following the last two editions, the data set has been extracted from actual university entrance examinations including a variety of topics and question types. Systems receive a set of Multiple-Choice Reading Comprehension tests where the task is to select the correct answer among a finite set of candidates, according to the given text. Questions are designed originally for testing human examinees, rather than evaluating computer systems. Therefore, the data set challenges human ability to show their understanding of texts. Thus, questions and answers are lexically distant from their supporting excerpts in text, requiring not only a high degree of textual inference, but also the development of strategies for selecting the correct answer.

22 citations

Developing Monolingual Persian Corpus for Extrinsic Plagiarism Detection Using Artificial Obfuscation: Notebook for PAN at CLEF 2015.

Khadijeh Khoshnavataher, Vahid Zarrabi, Salar Mohtaj, Habibollah Asghari

1 Jan 2015

TL;DR: The approach for construction of a monolingual Persian plagiarism corpus that can be used to evaluate the performance of Persian plagiarism detection systems is described.

Abstract: The task of text alignment corpus construction at PAN 2015 competition consists of preparing a plagiarism corpus so that it can provide various obfuscation types and versatile obfuscation degrees. Meanwhile, its format and metadata structure should follow previous PAN plagiarism corpora. In this paper, we describe our approach for construction of a monolingual Persian plagiarism corpus that can be used to evaluate the performance of Persian plagiarism detection systems.

20 citations

UniNE at CLEF 2015 Author Identification: Notebook for PAN at CLEF 2015.

Mirco Kocher, Jacques Savoy

1 Jan 2015

TL;DR: This paper describes and evaluates an unsupervised authorship verification model called SPATIUM-L1, which can be adapted without any problem to different languages (such as Dutch, English, Greek, and Spanish).

Abstract: This paper describes and evaluates an unsupervised authorship verification model called SPATIUM-L1. The suggested strategy can be adapted without any problem to different languages (such as Dutch, English, Greek, and Spanish) with genres and topics that differ significantly. As features, we suggest using the k most frequent terms of the disputed text (isolated words and punctuation symbols, where k may vary from 200 to 300). Applying a simple distance measure and a set of impostors, we determine whether or not the disputed text was written by the proposed author. Moreover, based on a simple rule, we can define when there is enough evidence to propose an answer with a high degree of confidence or when the attribution scheme is given without certainty. The evaluations are based on four test collections (PAN AUTHOR IDENTIFICATION task at CLEF 2015).
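
The pipeline described above (top-k terms of the disputed text, a simple distance, a set of impostors, and a rule for withholding an answer) might be sketched roughly as follows. This is not the authors' implementation: the L1 distance is inferred from the model's name, and the tokenizer, the `margin` parameter and the exact decision rule are assumptions chosen only to make the idea concrete.

```python
import re
from collections import Counter


def tokenize(text):
    """Split into lowercase words and individual punctuation symbols."""
    return re.findall(r"\w+|[^\w\s]", text.lower())


def relative_freqs(tokens, vocab):
    """Relative frequency of each vocabulary term in a token list."""
    counts = Counter(tokens)
    total = len(tokens) or 1
    return [counts[term] / total for term in vocab]


def l1_distance(disputed, other, k=200):
    """L1 distance over the k most frequent terms of the disputed text."""
    disputed_tokens = tokenize(disputed)
    vocab = [term for term, _ in Counter(disputed_tokens).most_common(k)]
    p = relative_freqs(disputed_tokens, vocab)
    q = relative_freqs(tokenize(other), vocab)
    return sum(abs(a - b) for a, b in zip(p, q))


def verify(disputed, known, impostors, k=200, margin=0.1):
    """Compare the distance to the known author with the closest impostor.

    The margin (an assumed parameter) leaves a band in which no answer
    is given because the evidence is too weak.
    """
    d_known = l1_distance(disputed, known, k)
    d_impostor = min(l1_distance(disputed, imp, k) for imp in impostors)
    if d_known < d_impostor * (1 - margin):
        return "same author"
    if d_known > d_impostor * (1 + margin):
        return "different author"
    return "unanswered"


# Toy example with hypothetical texts.
disputed = "the cat sat on the mat ."
result = verify(disputed, "the cat sat on the mat .", ["dogs bark loudly at night !"])
```

Because only the disputed text's own most frequent terms are compared, the approach needs no language-specific resources, which is consistent with the abstract's claim of easy adaptation across languages.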

19 citations

Book Chapter•10.1007/978-3-319-16486-1_2•

Question Answering Track Evaluation in TREC, CLEF and NTCIR

María-Dolores Olvera-Lobo¹, Juncal Gutiérrez-Artacho¹•Institutions (1)

University of Granada¹

1 Jan 2015

TL;DR: This study presents a historical overview of 15 years of QA evaluation tracks using the method of systematic review, and examines the different tasks or specific labs created in each QA track, the types of evaluation question used, as well as the evaluation measures used in the different competitions analyzed.

Abstract: Question Answering (QA) Systems are put forward as a real alternative to Information Retrieval systems as they provide the user with a fast and comprehensible answer to his or her information need. It has been 15 years since TREC introduced the first QA track. The principal campaigns in the evaluation of Information Retrieval have been specific tracks focusing on the development and evaluation of this type of system. This study is a brief review of the TREC, CLEF and NTCIR Conferences from the QA perspective. We present a historical overview of 15 years of QA evaluation tracks using the method of systematic review. We have examined the different tasks or specific labs created in each QA track, the types of evaluation question used, as well as the evaluation measures used in the different competitions analyzed. Of the conferences, it is CLEF that has applied the greater variety of types of test question (factoid, definition, list, causal, yes/no, amongst others). NTCIR, held on 13 occasions, is the conference which has made use of a greater number of different evaluation measures. Accuracy, precision and recall have been the three most used evaluation measures in the three campaigns.

19 citations

WI-ENRE in CLEF eHealth Evaluation Lab 2015: Clinical Named Entity Recognition Based on CRF

Jingchi Jiang, Yi Guan¹, Chao Zhao•Institutions (1)

Harbin Institute of Technology¹

1 Jan 2015

TL;DR: A novel method to recognize clinical entities based on conditional random fields (CRF), implemented in the WI-ENRE system, which is effective in the named entity recognition of biomedical texts.

Abstract: Named entity recognition of biomedical text is the shared task 1b of the 2015 CLEF eHealth evaluation lab, which focuses on making biomedical text easier to understand for patients and clinical workers. In this paper, we propose a novel method to recognize clinical entities based on conditional random fields (CRF). The biomedical texts are split into sections and paragraphs. Then the NLP tools are used for POS tagging and parsing, and four groups of features are extracted to train the entity recognition model. In the subsequent phase for entity normalization, the MetaMap of Unified Medical Language System (UMLS) tool is used to search for concept unique identifiers (CUIs) category. In addition, CRF++ package is adopted to recognize clinical entities in another phase for entity recognition. The experiments show that our system, named WI-ENRE, is effective in the named entity recognition of biomedical texts. The F-measures on EMEA and MEDLINE reach 0.56 and 0.45 respectively in exact match.

Authorship Verification: An Approach based on Random Forest: Notebook for PAN at CLEF 2015.

Promita Maitra, Souvick Ghosh, Dipankar Das

1 Jan 2015

TL;DR: This article used word-based and style-based features to identify the differences between the known and unknown problems of one given set and label the unknown ones accordingly using a Random Forest based classifier.

Abstract: Authorship attribution, being an important problem in many areas including information retrieval, computational linguistics, law and journalism etc., has been identified as a subject of increasingly research interest in the recent years. In case of Author Identification task in PAN at CLEF 2015, the main focus was given on cross-genre and cross-topic author verification tasks. We have used several word-based and style-based features to identify the differences between the known and unknown problems of one given set and label the unknown ones accordingly using a Random Forest based classifier.

Book Chapter•10.1007/978-3-319-24027-5_38•

A Method for Short Message Contextualization: Experiments at CLEF/INEX

Liana Ermakova

8 Sep 2015

TL;DR: The proposed method for tweet contextualization is based on named entity recognition, part-of-speech weighting and sentence quality measuring, with an algorithm for smoothing from the local context and a graph-based algorithm for sentence reordering.

Abstract: This paper presents the approach we developed for automatic multi-document summarization applied to short message contextualization, in particular to tweet contextualization. The proposed method is based on named entity recognition, part-of-speech weighting and sentence quality measuring. In contrast to previous research, we introduced an algorithm for smoothing from the local context. Our approach exploits topic-comment structure of a text. Moreover, we developed a graph-based algorithm for sentence reordering. The method has been evaluated at INEX/CLEF tweet contextualization track. We provide the evaluation results over the 4 years of the track. The method was also adapted to snippet retrieval and query expansion. The evaluation results indicate good performance of the approach.

ECNU at 2015 eHealth Task 2: User-centred Health Information Retrieval

Yang Song¹, Yun He¹, Qinmin Hu², Liang He¹, E. Mark Haacke•Institutions (2)

East China Normal University¹, Wayne State University²

1 Jan 2015

TL;DR: A Web-based query expansion model and a learning-to-rank algorithm are proposed to better understand and satisfy the 2015 CLEF eHealth Task 2.

Abstract: This paper presents our work on the 2015 CLEF eHealth Task 2. In particular, we propose a Web-based query expansion model and a learning-to-rank algorithm to better understand and satisfy the task.

A Graph Based Authorship Identification Approach: Notebook for PAN at CLEF 2015.

Helena Gómez-Adorno, Grigori Sidorov¹, David Pinto, Ilia Markov²•Institutions (2)

Instituto Politécnico Nacional¹, Benemérita Universidad Autónoma de Puebla²

1 Jan 2015

TL;DR: The paper describes the approach for the Authorship Identification task at the PAN CLEF 2015, which uses a predefined threshold in order to decide if the unknown document is written by the known author or not.

Abstract: The paper describes our approach for the Authorship Identification task at the PAN CLEF 2015. We extract textual patterns based on features obtained from shortest path walks over Integrated Syntactic Graphs (ISG). Then we calculate a similarity between the unknown document and the known document with these patterns. The approach uses a predefined threshold in order to decide if the unknown document is written by the known author or not.

Proceedings Article•

SIBM at CLEF e-Health Evaluation Lab 2015

Lina Fatima Soualmia, Chloé Cabot, Badisse Dahamna, Stéfan Jacques Darmoni

8 Sep 2015

TL;DR: This paper reports on the participation in the clinical named entity recognition task of the CLEF eHealth 2015 evaluation initiative i.e. to fully automatically identify clinically relevant entities in medical text in French using several biomedical knowledge organization systems containing terms and their variations already in French.

Abstract: In this paper, we report on our participation in the clinical named entity recognition task of the CLEF eHealth 2015 evaluation initiative i.e. to fully automatically identify clinically relevant entities in medical text in French. We address the task by using several biomedical knowledge organization systems (KOS) containing terms and their variations already in French or that we have partially translated in the context of existing projects. The extraction method is available online in the form of a web-based service that requests the KOS to extract clinical concepts from Electronic Health Records. It is also available via a user-friendly interface developed for clinicians. Our system has not obtained good results in inexact matching against the gold standard. However, this first participation allowed us to analyze our system and method and will allow us to

Book Chapter•10.1007/978-3-319-24027-5_50•

Overview of the CLEF Question Answering Track 2015

Anselmo Peñas¹, Christina Unger², Georgios Paliouras, Ioannis A. Kakadiaris³•Institutions (3)

National University of Distance Education¹, Bielefeld University², University of Houston³

8 Sep 2015

TL;DR: This paper describes the CLEF QA Track 2015, which was divided into four tasks: (i) QALD, focused on translating natural language questions into SPARQL; (ii) Entrance Exams, focused on answering questions to assess machine reading capabilities; (iii) BioASQ1, focused on large-scale semantic indexing; and (iv) BioASQ2, for Question Answering in the biomedical domain.

Abstract: This paper describes the CLEF QA Track 2015. Following the scenario stated last year for the CLEF QA Track, the starting point for accessing information is always a Natural Language question. However, answering some questions may need to query Linked Data especially if aggregations or logical inferences are required, some questions may need textual inferences and querying free-text, and finally, answering some queries may require both sources of information. In this edition, the Track was divided into four tasks: (i) QALD: focused on translating natural language questions into SPARQL; (ii) Entrance Exams: focused on answering questions to assess machine reading capabilities; (iii) BioASQ1 focused on large-scale semantic indexing and (iv) BioASQ2 for Question Answering in the biomedical domain.

Homotopy Based Classification for Author Verification Task: Notebook for PAN at CLEF 2015.

Josué Gerardo Gutiérrez Hernández, Jose Casillas, Paola Ledesma, Gibran Fuentes Pineda, Iván Vladimir Meza Ruíz

1 Jan 2015

TL;DR: This paper presents the experience of implementing a homotopy-based classification (HBC) system for the ‘PAN 2015 Author Identification’; the classification is embedded into the General Impostor Method, resulting in an ensemble of the SBC model.

Abstract: This paper presents our experience implementing a homotopy-based classification (HBC) system for the ‘PAN 2015 Author Identification’ [20]. Known documents from a specific author and randomly selected impostor documents are used as a dictionary to generate a contested document. Given the contribution of the known documents to the contested document we can verify the authorship of the document. This classification is embedded into the General Impostor Method resulting in an ensemble of the SBC model.

Syntactic N-grams as Features for the Author Profiling Task: Notebook for PAN at CLEF 2015.

Juan Pablo Posadas-Durán, Helena Gómez-Adorno, Ilia Markov, Grigori Sidorov, Ildar Z. Batyrshin, Alexander Gelbukh, Obdulia Pichardo-Lagunas

1 Jan 2015

TL;DR: This article uses syntactic features, such as syntactic n-grams of various types, to predict the age, gender and personality traits of the author of a given text.

Abstract: This paper describes our approach to tackle the Author Profiling task at PAN 2015. Our method relies on syntactic features, such as syntactic based n-grams of various types in order to predict the age, gender and personality traits that has the author of a given text. In this paper, we describe the used features, the employed classification algorithm, and other general ideas concerning the experiments we conducted.

GLAD: Groningen Lightweight Authorship Detection Notebook for PAN at CLEF 2015

Manuela Hürlimann, Benno Weck, Esther van den Berg, Malvina Nissim

1 Jan 2015

TL;DR: A binary linear classifier is trained both on the features describing known and unknown documents individually and on the joint features comparing these two types of documents, resulting in competitive results that outperform the baseline and position the system among the top PAN shared task participants.

Abstract: We present a simple and effective approach to authorship verification for Dutch, English, Spanish and Greek, which can be easily ported to yet other languages. We train a binary linear classifier both on the features describing known and unknown documents individually, and on the joint features comparing these two types of documents. The list of feature types includes, among others, character n-grams, the lexical overlap, visual text properties and a compression measure. We obtain competitive results that outperform the baseline and position our system among the top PAN shared task participants.

Historical Clicks for Product Search: GESIS at CLEF LL4IR 2015.

Philipp Schaer, Narges Tavakolpoursaleh

1 Jan 2015

TL;DR: GESIS participated in the first Living Labs for Information Retrieval (LL4IR) lab at CLEF and describes its product search system, which is based on the Solr search engine and includes a re-ranking based on historical click data.

Abstract: The Living Labs for Information Retrieval (LL4IR) lab was held for the first time at CLEF and GESIS participated in this pilot evaluation. We took part in the product search task and describe our system that is based on the Solr search engine and includes a re-ranking based on historical click data. This brief workshop note also includes some preliminary results, discussion and some lessons learned.

SNUMedinfo at CLEF QA track BioASQ 2015

Sungbin Choi

1 Jan 2015

TL;DR: This paper describes the author's participation in BioASQ Task 3b of the CLEF 2015 Question Answering track, covering the document retrieval subtask in Phase A and the ideal answer generation subtask in Phase B, where relevant passages are selected and combined to produce answer text.

Abstract: This paper describes our participation at the BioASQ Task 3b of CLEF 2015 Question Answering track. We participated at the document retrieval subtask in Phase A and the ideal answer generation subtask in Phase B. As of previous year, in the document retrieval task, we mostly experimented with semantic concept-enriched dependence model and sequential dependence model. In the ideal answer generation task, relevant passages are selected and combined to automatically produce answer text.

Proceedings Article•

LIG at CLEF 2015 SBS Lab

Philippe Mulhem, Nawal Ould Amer, Mathias Géry

8 Sep 2015

TL;DR: The proposal relies on a biased fusion of content-only retrieval, using BM25F and LGD retrieval models, user non-social profile based on the catalog of the requester, and social profiles using user/user links generated from their catalogs and ratings on books.

Abstract: This paper describes the work achieved by the MRIM research group of Grenoble, using some data from the LaHC of Saint-Étienne, in a way to test personalized retrieval of books for the Social Book Search Lab of CLEF 2015. Our proposal relies on a biased fusion of content-only retrieval, using BM25F and LGD retrieval models, user non-social profile based on the catalog of the requester, and social profiles using user/user links generated from their catalogs and ratings on books. The official results obtained show a clear positive impact of user profile, and a small positive impact of the social elements we used. Post official results that present non biased fusion scores are also presented.

LIMSI-CNRS@CLEF 2015: Tree Edit Beam Search for Multiple Choice Question Answering

Martin Gleize, Brigitte Grau¹•Institutions (1)

École Normale Supérieure¹

1 Jan 2015

TL;DR: The authors' system retrieves passages relevant to the question, through lexical expansion involving WordNet and word vectors, then a tree edit model is used on graph representations of the passages and answer choices to extract edit sequences.

Abstract: This paper describes our participation to the Entrance Exams Task of CLEF 2015's Question Answering Track. The goal is to answer multiple-choice questions on short texts. Our system first retrieves passages relevant to the question, through lexical expansion involving WordNet and word vectors. Then a tree edit model is used on graph representations of the passages and answer choices to extract edit sequences. Finally, features are computed from those edit sequences and used in various machine-learned models to take the final decision. We submitted several runs in the task, one of which yielding a c@1 of 0.36, which makes our team the second best on the task.
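
The c@1 measure cited above rewards a system for leaving a question unanswered rather than answering it wrongly: unanswered questions are credited at the system's observed accuracy. A minimal sketch of the usual definition, c@1 = (n_R + n_U · n_R / n) / n, follows; the counts in the example are hypothetical and are not the team's actual figures.

```python
def c_at_1(n_correct, n_unanswered, n_total):
    """c@1 = (n_R + n_U * n_R / n) / n.

    n_correct    -- questions answered correctly (n_R)
    n_unanswered -- questions deliberately left unanswered (n_U)
    n_total      -- all questions in the test set (n)
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if n_correct + n_unanswered > n_total:
        raise ValueError("correct + unanswered cannot exceed n_total")
    return (n_correct + n_unanswered * n_correct / n_total) / n_total


# Hypothetical run: 30 correct, 10 unanswered, 60 wrong, out of 100 questions.
score = c_at_1(30, 10, 100)
```

When every question is answered (no unanswered questions), the measure reduces to plain accuracy.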

Age, Gender and Personality Recognition using Tweets in a Multilingual setting: Notebook for PAN at CLEF 2015.

Mounica Arroju¹, Aftab Hassan¹, Golnoosh Farnadi²•Institutions (2)

University of Washington¹, Ghent University²

1 Jan 2015

TL;DR: The properties of the multilingual software submitted for PAN2015 which recognizes the age, gender and personality traits of Twitter users in four languages, namely, English, Spanish, Dutch and Italian are described.

Abstract: User generated text on social media sites is a rich source of information that can be used to identify different aspects of their authors. Proper mining of this content provides an automatic way of identifying users which is very valuable for applications that rely on personalisation. In this work, we describe the properties of our multilingual software submitted for PAN2015 which recognizes the age, gender and personality traits of Twitter users in four languages, namely, English, Spanish, Dutch and Italian.

Experimental IR meets multilinguality, multimodality, and interaction : 6th International Conference of the CLEF Association, CLEF '15, Toulouse, France, September 8-11, 2015 : proceedings

Cross-Language Evaluation Forum, Josiane Mothe

1 Jan 2015

TL;DR: This book constitutes the refereed proceedings of the 6th International Conference of the CLEF Initiative, CLEF 2015, held in Toulouse, France, in September 2015, and covers a broad range of issues in the fields of multilingual and multimodal information access evaluation.

Abstract: This book constitutes the refereed proceedings of the 6th International Conference of the CLEF Initiative, CLEF 2015, held in Toulouse, France, in September 2015. The 31 full papers and 20 short papers presented were carefully reviewed and selected from 68 submissions. They cover a broad range of issues in the fields of multilingual and multimodal information access evaluation, also included are a set of labs and workshops designed to test different aspects of mono and cross-language information retrieval systems.

XRCE Personal Language Analytics Engine for Multilingual Author Profiling: Notebook for PAN at CLEF 2015.

Scott Nowson, Julien Perez, Caroline Brun¹, Shachar Mirkin, Claude C. Roux•Institutions (1)

Xerox¹

1 Jan 2015

TL;DR: This technical notebook describes the methodology and results of the team from Xerox Research Centre Europe (XRCE) for the PAN 2015 Author Profiling Challenge, in which personality traits are introduced alongside age and gender in a corpus of tweets in four languages (English, Spanish, Italian and Dutch).

Abstract: This technical notebook describes the methodology used – and results achieved – for the PAN 2015 Author Profiling Challenge by the team from Xerox Research Centre Europe (XRCE). This year, personality traits are introduced alongside age and gender in a corpus of tweets in four languages – English, Spanish, Italian and Dutch. We describe a largely language agnostic methodology for classification which uses language specific linguistic processing to generate features. We also report on experiments in which we use machine translation to accommodate for languages in which there is less training data. Native language results are successful, but socio-demographic signals in language seem to be lost under MT conditions.

UniNE at CLEF 2015: Author Profiling Notebook for PAN at CLEF 2015

Mirco Kocher

1 Jan 2015

TL;DR: This paper describes and evaluates an effective author profiling model called SPATIUM-L1, which can be adapted without any problem to different languages (such as Dutch, English, Italian, and Spanish) in Twitter tweets.

Abstract: This paper describes and evaluates an effective author profiling model called SPATIUM-L1. The suggested strategy can be adapted without any problem to different languages (such as Dutch, English, Italian, and Spanish) in Twitter tweets. As features, we suggest using the 200 most frequent terms of the query text (isolated words and punctuation symbols). Applying a simple distance measure and looking at the three nearest neighbors, we can determine the gender (with the nominal values male and female), the age group (with the ordinal measurement 18-24|25-34|35-49|>50), and the Big Five personality traits (extraversion, neuroticism, agreeableness, conscientiousness, and openness on an interval scale containing eleven items). Evaluations are based on four test collections (PAN AUTHOR PROFILING task at CLEF 2015).

Integrating Social Features and Query Type Recognition in the Suggestion Track of CLEF 2015 Social Book Search Lab.

Shih-Hung Wu, Yi-Hsiang Hsieh, Liang-Pu Chen, Tsun Ku

1 Jan 2015

TL;DR: A social feature re-ranking system is built on a full-text search engine, and a set of rules for filtering out unnecessary books from the recommendation list is defined.

Abstract: The Social Book Search (SBS) Lab is part of the CLEF 2015 lab series. This is the third time that the CYUT CSIE team attends the SBS track. Based on a full-text search engine, we build a social feature re-ranking system and introduce more knowledge on understanding the queries. We defined a set of rules to filter out unnecessary books from the recommendation list. The official run results show that the system performance is improved from our previous system.

Proceedings Article•

IRIT at CLEF 2015: A product search model for head queries

Lamjed Ben Jabeur, Laure Soulier, Lynda Tamine

8 Sep 2015

TL;DR: This paper proposes a probabilistic model for product search based on the intuition that descriptive fields and the category might fit with the query and addresses head queries that are frequently submitted on e-commerce Web sites.

Abstract: We describe in this paper our participation in the product search task of LL4IR CLEF 2015 Lab. This task aims to evaluate, with living labs protective point of view, the retrieval effectiveness over e-commerce search engines. During the online shopping process, users would search for interesting products and quickly access those that fit with their needs among a long tail of similar or closely related products. Our contribution addresses head queries that are frequently submitted on e-commerce Web sites. Head queries usually target featured products with several variations, accessories, and complementary products. We propose a probabilistic model for product search based on the intuition that descriptive fields and the category might fit with the query. Finally, we present results obtained during the second round of the product search task.
