Top 45 papers published on the topic of CLEF in 2019

Book Chapter•10.1007/978-3-030-15719-7_41•

CheckThat! at CLEF 2019: Automatic Identification and Verification of Claims.

Tamer Elsayed¹, Preslav Nakov², Alberto Barrón-Cedeño², Maram Hasanain¹, Reem Suwaileh¹, Giovanni Da San Martino², Pepa Atanasova³•Institutions (3)

Qatar University¹, Qatar Computing Research Institute², Sofia University³

14 Apr 2019

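TL;DR: The second edition of the CheckThat! Lab at CLEF 2019 proposes two complementary tasks: prioritizing claims in political debates for fact-checking, and ranking Web pages and extracting passages from them to verify a check-worthy claim.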

Abstract: We introduce the second edition of the CheckThat! Lab, part of the 2019 Cross-Language Evaluation Forum (CLEF). CheckThat! proposes two complementary tasks. Task 1: predict which claims in a political debate should be prioritized for fact-checking. Task 2: rank Web-retrieved pages against a check-worthy claim based on their usefulness for fact-checking, extract useful passages from those pages, and then use them all to decide whether the claim is factually true or false. CheckThat! provides a full evaluation framework, consisting of data in English (derived from fact-checking sources) and Arabic (gathered and annotated from scratch) and evaluation based on mean average precision (MAP) for ranking and \(F_1\) for classification tasks.
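
As a rough illustration of the ranking metric named in this abstract, the sketch below computes average precision for one ranked list and the mean over several lists. It is not the lab's official scorer: the function names are hypothetical, labels are assumed to be 1 (check-worthy) or 0, and each list is assumed to contain all of its relevant items.

```python
from typing import List


def average_precision(ranked_labels: List[int]) -> float:
    """Average precision for one ranked list of 0/1 relevance labels."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == 1:
            hits += 1
            # Precision at the rank of each relevant item.
            precision_sum += hits / rank
    # Assumption: every relevant item appears somewhere in the ranking.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: List[List[int]]) -> float:
    """Mean of the per-list average precision (e.g. one list per debate)."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)
```

For example, a ranking with labels `[1, 0, 1]` has precision 1/1 at the first hit and 2/3 at the second, so its average precision is their mean.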

64 citations

Overview of the CLEF-2019 CheckThat! Lab: Automatic Identification and Verification of Claims. Task 1: Check-Worthiness.

Pepa Atanasova, Preslav Nakov, Georgi Karadzhov, Mitra Mohtarami, Giovanni Da San Martino

1 Jan 2019

TL;DR: An overview of the 2nd edition of the CheckThat! Lab at CLEF 2019, with a focus on Task 1: check-worthiness in political debates.

Abstract: We present an overview of the 2nd edition of the CheckThat! Lab, part of CLEF 2019, with focus on Task 1: Check-Worthiness in political debates. The task asks to predict which claims in a political debate should be prioritized for fact-checking. In particular, given a debate or a political speech, the goal is to produce a ranked list of its sentences based on their worthiness for fact-checking. This year, we extended the 2018 dataset with 16 more debates and speeches. A total of 47 teams registered to participate in the lab, and eleven of them actually submitted runs for Task 1 (compared to seven last year). The evaluation results show that the most successful approaches to Task 1 used various neural networks and logistic regression. The best system achieved mean average precision of 0.166 (0.250 on the speeches, and 0.054 on the debates). This leaves large room for improvement, and thus we release all datasets and scoring scripts, which should enable further research in check-worthiness estimation.

64 citations

Classifying German Animal Experiment Summaries with Multi-lingual BERT at CLEF eHealth 2019 Task 1.

Mario Sänger, Leon Weber, Madeleine Kittner, Ulf Leser

1 Jan 2019

TL;DR: This paper approaches CLEF eHealth 2019 Task 1, the automatic annotation of German non-technical summaries of animal experiments with ICD-10 codes, as a multi-label classification problem and leverages the multi-lingual version of the BERT text encoding model to represent the summaries.

Abstract: In this paper we present our contribution to the CLEF eHealth challenge 2019, Task 1. The task involves the automatic annotation of German non-technical summaries of animal experiments with ICD-10 codes. We approach the task as a multi-label classification problem and leverage the multi-lingual version of the BERT text encoding model [6] to represent the summaries. The model is extended by a single output layer to produce probabilities for individual ICD-10 codes. In addition, we make use of extra training data from the German Clinical Trials Register and ensemble several model instances to improve the overall performance of our approach. We compare our model with five baseline systems including a dictionary matching approach and single-label SVM and BERT classification models. Experiments on the development set highlight the advantage of our approach compared to the baselines with an improvement of 3.6%. Our model achieves the overall best performance in the challenge, reaching an F1 score of 0.80 in the final evaluation.

22 citations

Overview of the CLEF eHealth 2019 Multilingual Information Extraction.

Mariana Neves, Daniel Butzke, A. Dörendahl, Nora Leich, Benedikt Hummel, Gilbert Schönfelder, Barbara Grune

1 Jan 2019

TL;DR: The non-technical summaries (NTSs) of planned animal experiments in Germany are publicly available and have been manually assigned ICD-10 codes; this data was used to organize the Multilingual Information Extraction Task in the CLEF eHealth challenge.

Abstract: Non-technical summaries (NTSs) of animal experimentation can be valuable resources to foster more transparency of research made with animals and to better inform the community about this topic. The NTSs of planned animal experiments in Germany are publicly available and have been manually assigned ICD-10 codes. We used this data in the scope of organizing the Multilingual Information Extraction Task (Task 1) in the CLEF eHealth challenge. For the development phase, we released a training dataset containing more than 8,000 NTSs and their corresponding codes (if any assigned). For the test phase, we released 407 unseen NTSs for which the participants had to submit the predictions made by their systems. The best performing system obtained a precision (P), recall (R), and F-measure (FM) of 0.83, 0.77, and 0.80, respectively.
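
The three scores reported here can be checked against each other, since the balanced F-measure is the harmonic mean of precision and recall. The sketch below is only a consistency check on the rounded figures from the abstract; the function name is hypothetical, and how the official evaluation averaged scores over codes or documents is not stated in this listing.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Rounded values quoted in the abstract above.
best_f = f_measure(0.83, 0.77)
print(round(best_f, 2))
```

With P = 0.83 and R = 0.77 the harmonic mean is 2 × 0.83 × 0.77 / 1.60 ≈ 0.799, which agrees with the reported 0.80 after rounding.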

19 citations

Book•10.1007/978-3-030-28577-7•

Experimental IR meets multilinguality, multimodality, and interaction : 10th International Conference of the CLEF Association, CLEF 2019, Lugano, Switzerland, September 9–12, 2019, proceedings

Fabio Crestani, Henning Müller

1 Jan 2019

TL;DR: A summary of the motivations which led to the establishment of CLEF, and a description of how it has evolved over the years, the major achievements, and what the authors see as the next challenges are provided.

Abstract: 2019 marks the 20th birthday of CLEF, an evaluation campaign activity which has applied the Cranfield evaluation paradigm to the testing of multilingual and multimodal information access systems in Europe. This paper provides a summary of the motivations which led to the establishment of CLEF, and a description of how it has evolved over the years, the major achievements, and what we see as the next challenges.

19 citations

TheEarthIsFlat's Submission to CLEF'19CheckThat! Challenge.

Luca Favano, Mark J. Carman, Pier Luca Lanzi

1 Jan 2019

TL;DR: This report details the investigations in applying state-of-the-art pre-trained Deep Learning models to the problems of Automated Claim Detection and Fact Checking, as part of the CLEF’19 Lab: CheckThat!

Abstract: This report details our investigations in applying state-of-the-art pre-trained Deep Learning models to the problems of Automated Claim Detection and Fact Checking, as part of the CLEF’19 Lab: CheckThat!: Automatic Identification and Verification of Claims. The report provides an overview of the experiments performed on these tasks, which continue to be extremely challenging for current technology. The research focuses mainly on the use of pre-trained deep neural text embeddings that through transfer learning can allow for improved classification performance on small and unbalanced text datasets. We also investigate the effectiveness of external data sources for improving prediction accuracy on the claim detection and fact checking tasks. Our team submitted runs for every task/subtask of the challenge. The results appeared satisfactory for task 1 and promising but less satisfactory for task 2. A detailed explanation of the steps performed to obtain the submitted results is provided, including comparison tables between our submissions and other techniques investigated.

18 citations

UniNE at PAN-CLEF 2019: Bots and Gender Task.

Catherine Ikae, Sukunya Nath, Jacques Savoy

1 Jan 2019

TL;DR: A feature selection procedure is applied and a Zeta model is proposed to reduce the number of decisions taken by the kNN classifier in the “bots and gender” subtask.

Abstract: When participating in the “bots and gender” subtask (both in English and Spanish), our aim is to automatically detect different text sources (sequence of tweets sent by a bot or a human). When a text is identified as being sent by humans, the system must determine the author’s gender (author profiling). To solve these questions, we focus on a simple classifier (k-NN, k = 5) usually able to produce a correct answer but not in an efficient way. Thus, we apply a feature selection procedure to reduce the number of terms (around 200 to 500). We also propose to apply a Zeta model to reduce the number of decisions taken by the kNN classifier. In this case, we focus on terms used in one category and ignored or used rarely by the second. In addition, the Type-Token Ratio (TTR) or the lexical density (LD) presents some merit to discriminate between tweets sent by a bot (TTR < 0.2, LD ≥ 0.8) or humans (TTR ≥ 0.2, LD < 0.8).
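
The Type-Token Ratio threshold quoted in this abstract can be sketched as follows. This is a simplified illustration, not the authors' system: whitespace tokenization, lowercasing, and the function names are assumptions, and the lexical density criterion is left out because it depends on a function-word list that the abstract does not specify.

```python
TTR_BOT_THRESHOLD = 0.2  # threshold quoted in the abstract (TTR < 0.2 suggests a bot)


def type_token_ratio(text: str) -> float:
    """Number of distinct tokens divided by the total number of tokens."""
    tokens = text.lower().split()  # assumption: naive whitespace tokenization
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def looks_like_bot(text: str) -> bool:
    """Apply only the TTR half of the rule; empty input is never flagged."""
    if not text.split():
        return False
    return type_token_ratio(text) < TTR_BOT_THRESHOLD
```

A highly repetitive feed reuses the same few word types many times, which drives the ratio down, while varied human writing keeps it closer to 1.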

16 citations

Temporal Mood Variation: at the CLEF eRisk-2018

Waleed Ragheb, Jérôme Azé, Sandra Bringay, Maximilien Servajean

2 Jun 2019

TL;DR: The authors used text information without any hand-crafted features or dictionaries to model the temporal mood variation detected from users' posts and used two learning phases through exploration of state-of-the-art text vectorization.

Abstract: Two tasks were proposed at CLEF eRisk-2018 on predicting mental disorders from users' posts on Reddit. Depression and anorexia are to be detected as early as possible. In this paper we present the participation of LIRMM (Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier) in both tasks. The proposed architectures and models use only text information, without any hand-crafted features or dictionaries, to model the temporal mood variation detected from users' posts. The proposed models use two learning phases through exploration of state-of-the-art text vectorization. The proposed models perform comparably to other contributions, while experiments show that document-level vectorizations outperformed word-level ones.

13 citations

Book Chapter•10.1007/978-3-030-22948-1_1•

From Multilingual to Multimodal: The Evolution of CLEF over Two Decades

Nicola Ferro¹, Carol Peters²•Institutions (2)

University of Padua¹, Istituto di Scienza e Tecnologie dell'Informazione²

1 Jan 2019

TL;DR: This introductory chapter begins by explaining briefly what is intended by experimental evaluation in information retrieval in order to provide the necessary background for the rest of this volume.

Abstract: This introductory chapter begins by explaining briefly what is intended by experimental evaluation in information retrieval in order to provide the necessary background for the rest of this volume. The major international evaluation initiatives that have adopted and implemented in various ways this common framework are then presented and their relationship to CLEF indicated. The second part of the chapter details how the experimental evaluation paradigm has been implemented in CLEF by providing a brief overview of the main activities and results obtained over the last two decades. The aim has been to build a strong multidisciplinary research community and to create a sustainable technical framework that would not simply support but would also empower both research and development and evaluation activities, while meeting and at times anticipating the demands of a rapidly evolving information society.

10 citations

Entity Detection for Check-worthiness Prediction: Glasgow Terrier at CLEF CheckThat! 2019

Ting Su, Craig Macdonald, Iadh Ounis

14 Jun 2019

TL;DR: This paper proposes an effective approach for retrieving check-worthy sentences within American political debates, which relates to the first task of the CLEF CheckThat! 2019 Lab.

Abstract: Since information can be created and shared online by anyone, a lot of time and effort is required to manually fact-check all the information users encounter every day. Hence, an automatic fact-checking process is needed to effectively fact-check the vast information available online. However, gathering information related to every single claim can also be redundant, as not all sentences or articles are check-worthy. In this paper, we propose an effective approach for retrieving check-worthy sentences within American political debates, which relates to the first task of the CLEF CheckThat! 2019 Lab. To rank sentences based on their check-worthiness, we propose to represent each sentence by its mentioned entities, using a TF-IDF representation. We use an SVM classifier to predict the check-worthiness of each sentence. Our approach ranked 4th out of 12 submissions. Our experiments show that the pronoun and coreference resolution pre-processing procedure we use as part of our approach does improve the effectiveness of sentence check-worthiness prediction. Furthermore, our results show that entity analysis features provide valuable evidence for this task.

10 citations

TOBB-ETU at CLEF 2019: Prioritizing Claims Based on Check-Worthiness.

Bahadir Altun, Mucahid Kutlu

1 Jan 2019

TL;DR: This paper presents a hybrid approach which combines rule-based and supervised methods for the Check-Worthiness task of the CLEF-2019 CheckThat! Lab.

Abstract: In recent years, we have witnessed an incredible amount of misinformation spread over the Internet. However, it is extremely time consuming to analyze the veracity of every claim made on the Internet. Thus, we urgently need automated systems that can prioritize claims based on their check-worthiness, helping fact-checkers to focus on important claims. In this paper, we present our hybrid approach which combines rule-based and supervised methods for the CLEF-2019 CheckThat! Lab’s Check-Worthiness task. Our primary model ranked 9th based on MAP, and 6th based on the R-P, P@5, and P@20 metrics in the official evaluation of primary submissions.

Ranking studies for systematic reviews using query adaptation : University of Sheffield's approach to CLEF eHealth 2019 task 2 working notes for CLEF 2019

Amal Alharbi, Mark Stevenson

23 Jul 2019

TL;DR: This paper describes the University of Sheffield’s approach to the CLEF 2019 eHealth Task 2: Technologically Assisted Reviews in Empirical Medicine, which focuses on identifying relevant studies for systematic reviews.

Abstract: This paper describes the University of Sheffield’s approach to the CLEF 2019 eHealth Task 2: Technologically Assisted Reviews in Empirical Medicine. This task focuses on identifying relevant studies for systematic reviews. The University of Sheffield participated in subtask 2 (Abstract and Title Screening). Our approach used lexical statistics (LogLikelihood, Chi-Squared and Odds-Ratio) to identify terms that retrieve specific types of evidence. A total of 12 official runs were submitted.
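
The three lexical statistics named in this abstract can all be computed from a 2x2 contingency table of term occurrence against document relevance. The sketch below shows the standard formulas only; the table layout, the function names, and the example counts are assumptions, and how the Sheffield runs actually applied the scores is not described in this listing.

```python
import math


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """a: relevant docs with term, b: non-relevant with term,
    c: relevant without term, d: non-relevant without term."""
    return (a * d) / (b * c)  # assumes b and c are non-zero


def chi_squared(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-squared statistic for a 2x2 table (no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def log_likelihood(a: int, b: int, c: int, d: int) -> float:
    """Log-likelihood ratio G^2 = 2 * sum(O * ln(O / E)) over the four cells."""
    n = a + b + c + d
    observed = [a, b, c, d]
    expected = [
        (a + b) * (a + c) / n,
        (a + b) * (b + d) / n,
        (c + d) * (a + c) / n,
        (c + d) * (b + d) / n,
    ]
    # Cells with zero observations contribute nothing to the sum.
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
```

With made-up counts a=30, b=10, c=70, d=90, the term is over-represented in relevant documents, and all three statistics come out well above their null values.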

Book Chapter•10.1007/978-3-030-22948-1_7•

Lessons Learnt from Experiments on the Ad Hoc Multilingual Test Collections at CLEF

Jacques Savoy, Martin Braschler¹•Institutions (1)

Zurich University of Applied Sciences/ZHAW¹

1 Jan 2019

TL;DR: This chapter describes the lessons learnt from the ad hoc track at CLEF in the years 2000 to 2009, and describes the most important challenges when designing an IR system for a new language.

Abstract: This chapter describes the lessons learnt from the ad hoc track at CLEF in the years 2000 to 2009. This contribution focuses on Information Retrieval (IR) for languages other than English (monolingual IR), as well as bilingual IR (also termed “cross-lingual”; the request is written in one language and the searched collection in another), and multilingual IR (the information items are written in many different languages). During these years the ad hoc track has used mainly newspaper test collections, covering more than 15 languages. The authors themselves have designed, implemented and evaluated IR tools for all these languages during those CLEF campaigns. Based on our own experience and the lessons reported by other participants in these years, we are able to describe the most important challenges when designing an IR system for a new language. When dealing with bilingual IR, our experiments indicate that the critical point is the translation process. However, current online translation systems tend to offer rather effective translation from one language to another, especially when one of these languages is English. In order to solve the multilingual IR question, different IR architectures are possible. For the simplest approach based on query translation of individual language pairs, the crucial component is the merging of the intermediate bilingual results. When considering both document and query translation, the complexity of the whole system is clearly a main issue.

Book Chapter•10.1007/978-3-030-22948-1_17•

From XML Retrieval to Semantic Search and Beyond: The INEX, SBS, and MC2 Labs of CLEF 2012–2018

Jaap Kamps¹, Marijn Koolen², Shlomo Geva³, Ralf Schenkel⁴, Eric SanJuan⁵, Toine Bogers⁶•Institutions (6)

University of Amsterdam¹, Royal Netherlands Academy of Arts and Sciences², Queensland University of Technology³, University of Trier⁴, University of Avignon⁵, Aalborg University – Copenhagen⁶

14 Aug 2019

TL;DR: This chapter details the efforts of the INEX lab in CLEF (2012–2014), as well as the ongoing activities as separate labs, under the labels Social Book Search (2015–2016), and Microblog Contextualization (2016–2018).

Abstract: INEX ran as an independent evaluation forum for 10 years before it teamed up with CLEF in 2012. Even before 2012 there was considerable collaboration between INEX and CLEF, and these collaborations increased in intensity when CLEF moved beyond its traditional cross-lingual focus in 2009/2010 shifting to include all experimental IR. This led to the merger of CLEF and INEX, and effectively to the inclusion of INEX as a large track or lab into CLEF in 2012. This chapter details the efforts of the INEX lab in CLEF (2012–2014), as well as the ongoing activities as separate labs, under the labels Social Book Search (2015–2016), and Microblog Contextualization (2016–2018).

Book Chapter•10.1007/978-3-030-15719-7_36•

CLEF eHealth 2019 Evaluation Lab

Liadh Kelly¹, Lorraine Goeuriot², Hanna Suominen, Mariana Neves³, Evangelos Kanoulas⁴, René Spijker⁴,⁵, Leif Azzopardi⁶, Dan Li⁴, Jimmy⁷, Joao Palotti⁸,⁹, Guido Zuccon⁷•Institutions (9)

Maynooth University¹, University of Grenoble², Federal Institute for Risk Assessment³, University of Amsterdam⁴, Utrecht University⁵, University of Strathclyde⁶, University of Queensland⁷, Vienna University of Technology⁸, Qatar Computing Research Institute⁹

14 Apr 2019

TL;DR: The CLEF eHealth evaluation series to date is described and then the 2019 tasks, evaluation methodology, and resources are presented.

Abstract: Since 2012 CLEF eHealth has focused on evaluation resource building efforts around the easing and support of patients, their next of kin, clinical staff, and health scientists in understanding, accessing, and authoring eHealth information in a multilingual setting. This year’s lab offers three tasks: Task 1 on multilingual information extraction; Task 2 on technology assisted reviews in empirical medicine; and Task 3 on consumer health search in mono- and multilingual settings. Herein, we describe the CLEF eHealth evaluation series to date and then present the 2019 tasks, evaluation methodology, and resources.

Book Chapter•10.1007/978-3-030-22948-1_22•

The Scholarly Impact of CLEF 2010-2017 - A Google Scholar Analysis of CLEF Proceedings and Working Notes.

Birger Larsen¹•Institutions (1)

Aalborg University¹

1 Jan 2019

TL;DR: The analysis of the productivity and citation impact of CLEF in the period 2010–2017 shows that CLEF is a very strong and vibrant initiative that has managed a major change of format between 2009/2010 and continues to produce relevant research, datasets and tools.

Abstract: This chapter assesses the scholarly impact of the CLEF evaluation campaign by performing a bibliometric analysis of the citations of the CLEF 2010–2017 papers collected through Google Scholar. The analysis extends an earlier 2013 study by Tsikrika et al. of the CLEF Proceedings for the period 2000–2009 and compares the impact of the first half of CLEF to the second. It also extends the analysis by including the CLEF Working Notes, a less formal but important part of the CLEF oeuvre. Results show that, despite the different nature of the peer-reviewed CLEF Proceedings papers and the less formal and much more numerous Working Notes papers, both types of publications have high citation impact. In particular, overview papers from the various labs and tasks in CLEF attract large amounts of citations in both Proceedings and Working Notes. A significant proportion of the total number of citations appears to be from outside CLEF; there are simply not enough CLEF papers every year to explain that many citations. In conclusion, the analysis of the productivity and citation impact of CLEF in the period 2010–2017 shows that CLEF is a very strong and vibrant initiative that has managed a major change of format between 2009/2010 and continues to produce relevant research, datasets and tools.

Book Chapter•10.1007/978-3-030-22948-1_13•

About Sound and Vision: CLEF Beyond Text Retrieval Tasks

Gareth J. F. Jones¹•Institutions (1)

Dublin City University¹

1 Jan 2019

TL;DR: This chapter reviews tasks examining speech and video retrieval carried out within CLEF during its first 10 years, and overviews related work reported at other information retrieval benchmarks.

Abstract: CLEF was initiated with the intention of providing a catalyst to research in Cross-Language Information Retrieval (CLIR) and Multilingual Information Retrieval (MIR). Focusing principally on European languages, it initially provided CLIR benchmark tasks to the research community within an annual cycle of task design, conduct and reporting. While the early focus was on textual data, the emergence of technologies to enable collection, archiving and content processing of multimedia content led to several initiatives which sought to address search for spoken and visual content. Similar to the interest in multilingual search for text, interest arose in working multilingually with multimedia content. To support research in these areas CLEF introduced a number of tasks in multilingual search for multimedia content. While investigation of image retrieval has formed the focus of the ImageCLEF task over many years, this chapter reviews tasks examining speech and video retrieval carried out within CLEF during its first 10 years, and overviews related work reported at other information retrieval benchmarks.

Book Chapter•10.1007/978-3-030-28577-7_31•

Overview of the CLEF 2019 Personalised Information Retrieval Lab (PIR-CLEF 2019)

Gabriella Pasi¹, Gareth J. F. Jones², Lorraine Goeuriot, Liadh Kelly³, Stefania Marrara, Camilla Sanvitto¹•Institutions (3)

University of Milano-Bicocca¹, Dublin City University², Maynooth University³

9 Sep 2019

TL;DR: PIR-CLEF 2019 provided registered participants with two tracks: the Web Search Task and the Medical Search Task, which focuses on personalisation within an ad hoc search task introduced in previous editions of the CLEF eHealth Lab.

Abstract: The Personalised Information Retrieval Lab (PIR-CLEF 2019) is an initiative aimed at both providing and critically analysing the evaluation of Personalization in Information Retrieval (PIR) applications. PIR-CLEF 2019 is the second edition of the Lab after the successful Pilot lab organised at CLEF 2017 and the first edition of the Lab at CLEF 2018. PIR-CLEF 2019 provided registered participants with two tracks: the Web Search Task and the Medical Search Task. The Web Search Task continues the activities introduced in the previous editions of the PIR-CLEF Lab, while the Medical Search Track focuses on personalisation within an ad hoc search task introduced in previous editions of the CLEF eHealth Lab.

Bots and gender profiling on twitter using sociolinguistic features notebook for PAN at CLEF 2019

Edwin Puertas, Luis Gabriel Moreno-Sandoval, Flor Miriam Plaza-del-Arco, Jorge Andrés Alvarado-Valencia, Alexandra Pomares-Quimbaya, L. Alfonso Ureña-López

1 Jan 2019

TL;DR: In this paper, the authors describe the properties of their multilingual classification model submitted for PAN2019, which distinguishes bots from humans and females from males; it extracts 18 features from the user's posts and, by applying a machine learning algorithm, obtains good performance results.

Abstract: Unfortunately, in social networks, software bots or just bots are becoming more and more common because malicious people have seen their usefulness to spread false messages, spread rumors and even manipulate public opinion. Even though the text generated by users in social networks is a rich source of information that can be used to identify different aspects of its authors, not being able to recognize which users are truly humans and which are not is a big drawback. In this work, we describe the properties of our multilingual classification model submitted for PAN2019 that is able to distinguish bots from humans, and females from males. This solution extracts 18 features from the user’s posts and, by applying a machine learning algorithm, obtains good performance results.

Proceedings Article•

Author Profiling Using Semantic and Syntactic Features : Notebook for PAN at CLEF 2019

György Kovács¹, Vanda Balogh, Purvnashi Mehta, Kumar Shridhar², Pedro Alonso³, Marcus Liwicki²•Institutions (3)

University of Pécs¹, Kaiserslautern University of Technology², Luleå University of Technology³

1 Jan 2019

TL;DR: An approach for the PAN 2019 Author Profiling challenge is presented to detect Twitter bots and also to classify the gender of human Twitter users as male or female.

Abstract: In this paper we present an approach for the PAN 2019 Author Profiling challenge. The task here is to detect Twitter bots and also to classify the gender of human Twitter users as male or female, b ...

Experiences with the 2013-2016 CLEF interactive information retrieval tracks

Vivien Petras¹, Marijn Koolen², Maria Gäde¹, Toine Bogers•Institutions (2)

Humboldt University of Berlin¹, Royal Netherlands Academy of Arts and Sciences²

1 Jan 2019

TL;DR: This paper describes the experiences with the interactive IR tracks organized at CLEF from 2013-2016 and aggregates the lessons learned with each consecutive instance of the lab, to provide practical insights and lessons for future collaborative interactive IR evaluation exercises and for potential re-use scenarios.

Abstract: This paper describes our experiences with the interactive IR tracks organized at CLEF from 2013-2016 and aggregates the lessons learned with each consecutive instance of the lab. We end with a summary of practical insights and lessons for future collaborative interactive IR evaluation exercises and for potential re-use scenarios.

Proceedings Article•

Uppsala University and Gavagai at CLEF eRISK : Comparing Word Embedding Models

Elena Fano, Jussi Karlgren¹, Joakim Nivre²•Institutions (2)

Royal Institute of Technology¹, IBM²

1 Jan 2019

TL;DR: This paper describes an experiment to evaluate the performance of three different types of semantic vectors or word embeddings (random indexing, GloVe, and ELMo) and two different classification arch models.

Abstract: This paper describes an experiment to evaluate the performance of three different types of semantic vectors or word embeddings (random indexing, GloVe, and ELMo) and two different classification arch ...

A Distributed Effort Approach for Systematic Reviews. IMS Unipd At CLEF 2019 eHealth Task 2.

Giorgio Maria Di Nunzio

1 Jan 2019

TL;DR: This work presents a variation of the system presented last year, in particular, not only is the maximum amount of documents that the physician is willing to read set, but the effort is distributed proportionally to the number of documents in the pool.

Abstract: This is the third participation of the Information Management Systems (IMS) group at the CLEF eHealth Task of Technologically Assisted Reviews in Empirical Medicine. This task focuses on the problem of medical systematic reviews, a problem which requires a recall close (if not equal) to 100%. Semi-automated approaches are essential to support this type of search when the amount of data exceeds the limits of users, i.e. in terms of attention or patience. We present a variation of the system we presented last year; in particular, not only do we set the maximum number of documents that the physician is willing to read, but we also distribute the effort across the topics proportionally to the number of documents in the pool. We compare the results of this approach with the “frozen” system we used in 2018 and a BM25 baseline.
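
One way to read the proportional distribution of effort described in this abstract is as an allocation of a fixed reading budget across topics according to pool size. The sketch below is an interpretation under that assumption, not the IMS system itself: the function name, the topic identifiers, and the largest-remainder rounding are all hypothetical choices.

```python
from typing import Dict


def distribute_budget(pool_sizes: Dict[str, int], budget: int) -> Dict[str, int]:
    """Split a total reading budget across topics in proportion to pool size."""
    total = sum(pool_sizes.values())
    if total == 0 or budget <= 0:
        return {topic: 0 for topic in pool_sizes}
    budget = min(budget, total)  # never ask for more documents than exist
    # Integer arithmetic: whole share per topic plus the leftover numerator.
    alloc = {t: budget * n // total for t, n in pool_sizes.items()}
    remainder = {t: budget * n % total for t, n in pool_sizes.items()}
    leftover = budget - sum(alloc.values())
    # Hand out the remaining documents to the largest remainders first.
    for topic in sorted(pool_sizes, key=lambda t: remainder[t], reverse=True)[:leftover]:
        alloc[topic] += 1
    return alloc
```

For instance, a budget of 100 documents over pools of 300 and 100 documents would be split 75 to 25 under this scheme.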

CLEF ProtestNews Lab 2019: Contextualized Word Embeddings for Event Sentence Detection and Event Extraction.

Gabriella Skitalinskaya, Jonas Klaff, Maximilian Spliethöver

1 Jan 2019

TL;DR: Contextualized string embeddings are used for event sentence detection and event extraction; the models were trained on a data corpus collected from Indian news sources, but evaluated on data obtained from news sources from other countries as well, such as China.

Abstract: In this work we describe our results achieved in the ProtestNews Lab at CLEF 2019. To tackle the problems of event sentence detection and event extraction we decided to use contextualized string embeddings. The models were trained on a data corpus collected from Indian news sources, but evaluated on data obtained from news sources from other countries as well, such as China. Our models have obtained competitive results and have scored 3rd in the event sentence detection task and 1st in the event extraction task based on average F1-scores for different test datasets.

Non-local DenseNet for Plant CLEF 2019 Contest.

Dat Nguyen Thanh, Georges Quénot, Lorraine Goeuriot

1 Jan 2019

TL;DR: The DenseNet architecture with competitive performance and relatively low number of parameters is augmented with a non-local block in an attempt to tackle the data deficient challenge in PlantCLEF 2019.

Abstract: Image-based plant identification is a promising tool for the automation of agriculture and environmental conservation. In an attempt to tackle the data deficiency challenge in PlantCLEF 2019, the DenseNet architecture, which offers competitive performance with a relatively low number of parameters, is augmented with a non-local block. A variety of data sampling schemes are also evaluated as part of the work. The evaluation of the model and the methods is detailed in the paper.

Classification of Animal Experiments: A Reproducible Study. IMS Unipd at CLEF eHealth Task 1.

Giorgio Maria Di Nunzio

1 Jan 2019

TL;DR: The third participation of the Information Management Systems (IMS) group at CLEF eHealth 2019 Task 1.1 is described, in which participants are required to label health-related documents with ICD-10 codes, with a focus on the German language and on non-technical summaries of animal experiments.

Abstract: In this paper, we describe the third participation of the Information Management Systems (IMS) group at CLEF eHealth 2019 Task 1. In this task, participants are required to label health-related documents with ICD-10 codes, with a focus on the German language and on non-technical summaries (NTPs) of animal experiments. We tackled this task by focusing on reproducibility aspects, as we did in previous years. This time, we tried three different probabilistic Naïve Bayes classifiers that use different hypotheses on the distribution of terms in the documents and the collection. The experimental evaluation showed significantly different behavior of the classifiers between the training phase and the test phase. We are currently investigating possible sources of bias introduced in the training phase, as well as out-of-vocabulary issues and changes in terminology from the training set to the test set.
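
As a rough illustration of what a probabilistic Naïve Bayes text classifier looks like, the sketch below implements a multinomial variant with Laplace smoothing in plain Python. The abstract does not say which three term-distribution hypotheses were used, so the multinomial choice, the function names, and the toy German tokens and ICD-10 chapter labels are all assumptions. The sketch also predicts a single label per document, whereas the task itself may assign several codes.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, alpha=1.0):
    """Train a multinomial Naive Bayes model.

    docs: list of (tokens, label) pairs.
    alpha: Laplace smoothing constant (an assumption of this sketch).
    Returns (log_prior, log_likelihood, vocabulary).
    """
    label_counts = Counter(label for _, label in docs)
    term_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        term_counts[label].update(tokens)
        vocab.update(tokens)

    n_docs = len(docs)
    log_prior = {c: math.log(n / n_docs) for c, n in label_counts.items()}

    log_like = {}
    for c in label_counts:
        denom = sum(term_counts[c].values()) + alpha * len(vocab)
        log_like[c] = {t: math.log((term_counts[c][t] + alpha) / denom) for t in vocab}
    return log_prior, log_like, vocab


def predict(tokens, model):
    """Return the most probable label; out-of-vocabulary tokens are ignored."""
    log_prior, log_like, vocab = model
    scores = {
        c: log_prior[c] + sum(log_like[c][t] for t in tokens if t in vocab)
        for c in log_prior
    }
    return max(scores, key=scores.get)


if __name__ == "__main__":
    toy_docs = [
        (["maus", "tumor"], "II"),
        (["tumor", "krebs"], "II"),
        (["herz", "infarkt"], "IX"),
    ]
    model = train_multinomial_nb(toy_docs)
    print(predict(["tumor"], model))
```

Because unseen tokens are simply skipped at prediction time, a test document made up mostly of new terminology falls back on the class prior, which is one way the out-of-vocabulary issue mentioned in the abstract can surface.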

Book Chapter•10.1007/978-3-030-22948-1_18•

Results and Lessons of the Question Answering Track at CLEF

Anselmo Peñas¹, Álvaro Rodrigo¹, Bernardo Magnini, Pamela Forner, Eduard Hovy², Richard F. E. Sutcliffe³, Danilo Giampiccolo

National University of Distance Education¹, Carnegie Mellon University², University of Limerick³

1 Jan 2019

TL;DR: The Question Answering track at CLEF ran for 13 years, from 2003 until 2015; its campaigns are divided into four eras, and the description and main results are given for each era, together with the pilot exercises and other Question Answering tasks that ran in CLEF.

Abstract: The Question Answering track at CLEF ran for 13 years, from 2003 until 2015. Over these years, many different tasks, resources and evaluation methodologies were developed. We divide the CLEF Question Answering campaigns into four eras: (1) ungrouped, mainly factoid questions asked against monolingual newspapers (2003–2006), (2) grouped questions asked against newspapers and Wikipedias (2007–2008), (3) ungrouped questions against multilingual parallel-aligned EU legislative documents (2009–2010), and (4) questions about a single document using a related document collection as background information (2011–2015). We provide the description and the main results for each of these eras, together with the pilot exercises and other Question Answering tasks that ran in CLEF. Finally, we conclude with some of the lessons learnt over these years.

Ranking Studies for Systematic Reviews Using Query Adaptation: University of Sheffield's Approach to CLEF eHealth 2019 Task 2.

Amal Alharbi, Mark Stevenson

1 Jan 2019

CLEF, a Java library to Extract Logical Relationships from Multivalued Contexts.

Jessie Carbonnel

1 Jan 2019

DEMIR at CLEF eHealth 2019: Information Retrieval based Classification of Animal Experiments Summaries.

Nizar A. Ahmed, Aliriza Aribas, Adil Alpkocak

1 Jan 2019

TL;DR: This study proposes k-nearest neighbor (k-NN) and threshold (t-NN) approaches to classify animal experiment summaries into their correct ICD-10 codes, and two further methods to control and adjust the labels retrieved from the result documents in order to assign ICD-10 codes to the query document.

Abstract: Information retrieval systems have recently become powerful tools for retrieving full-text results for a given query (or a query document). Elasticsearch is an open-source search system built on Apache Lucene that works as a distributed search and analytics engine. This engine can therefore also be used as a machine learning approach to problems such as document classification. This study is published as a working-notes paper for CLEF eHealth 2019 Task 1 on Multilingual Information Extraction. It proposes k-nearest neighbor (k-NN) and threshold (t-NN) approaches to classify animal experiment summaries into their correct ICD-10 codes. Two further methods are then proposed to control and adjust the labels retrieved from the result documents in order to assign ICD-10 codes to the query document. These approaches achieved high precision, recall and F-measure in our experiments on the development dataset.
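
The retrieval-based labelling idea in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not define the threshold rule, so `threshold_labels` assumes a cutoff relative to the best retrieval score; the hit format (score, set of labels) and both function names are hypothetical; and no search engine is called, so the hits stand in for whatever the engine would return for a query document.

```python
def knn_labels(hits, k):
    """Return the union of the labels attached to the k best-scoring hits.

    hits: list of (score, labels) pairs for documents retrieved for a query
    document, where labels is a set of ICD-10 codes (hypothetical format).
    """
    top = sorted(hits, key=lambda hit: hit[0], reverse=True)[:k]
    return set().union(*(labels for _, labels in top))


def threshold_labels(hits, ratio):
    """Return the labels of every hit scoring at least ratio * best score.

    The relative cutoff is an assumption of this sketch, not a rule taken
    from the paper.
    """
    if not hits:
        return set()
    cutoff = ratio * max(score for score, _ in hits)
    return set().union(*(labels for score, labels in hits if score >= cutoff))


if __name__ == "__main__":
    example_hits = [(9.0, {"II"}), (8.5, {"II", "IX"}), (3.0, {"X"})]
    print(sorted(knn_labels(example_hits, 2)))
    print(sorted(threshold_labels(example_hits, 0.9)))
```

A fixed k always consults the same number of neighbours, while a score threshold lets the number of contributing documents vary with how closely the retrieved documents match the query.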
