TL;DR: This paper is a condensed report on Touché, the first shared task on argument retrieval, held at CLEF 2020. It ran two tasks: supporting individuals in finding arguments on socially important topics, and supporting individuals with arguments on everyday personal decisions.
Abstract: This paper is a condensed report on Touché: the first shared task on argument retrieval that was held at CLEF 2020. With the goal of creating a collaborative platform for research in argument retrieval, we run two tasks: (1) supporting individuals in finding arguments on socially important topics and (2) supporting individuals with arguments on everyday personal decisions.
TL;DR: The third edition of the CheckThat! Lab, held at CLEF 2020, featured five tasks in two languages, English and Arabic: check-worthiness estimation, retrieving previously fact-checked claims, evidence retrieval, and claim verification in social media, plus check-worthiness estimation in political debates and speeches.
Abstract: We present an overview of the third edition of the CheckThat! Lab at CLEF 2020. The lab featured five tasks in two different languages: English and Arabic. The first four tasks compose the full pipeline of claim verification in social media: Task 1 on check-worthiness estimation, Task 2 on retrieving previously fact-checked claims, Task 3 on evidence retrieval, and Task 4 on claim verification. The lab is completed with Task 5 on check-worthiness estimation in political debates and speeches. A total of 67 teams registered to participate in the lab (up from 47 at CLEF 2019), and 23 of them actually submitted runs (compared to 14 at CLEF 2019). Most teams used deep neural networks based on BERT, LSTMs, or CNNs, and achieved sizable improvements over the baselines on all tasks. Here we describe the task setup, the evaluation results, and a summary of the approaches used by the participants, and we discuss some lessons learned. Last but not least, we release to the research community all datasets from the lab as well as the evaluation scripts, which should enable further research in the important tasks of check-worthiness estimation and automatic claim verification.
TL;DR: The paper gives a brief overview of the four shared tasks that are to be organized at the PAN 2020 lab on digital text forensics and stylometry, hosted at the CLEF conference.
Abstract: The paper gives a brief overview of the four shared tasks that are to be organized at the PAN 2020 lab on digital text forensics and stylometry, hosted at the CLEF conference. The tasks include author profiling, celebrity profiling, cross-domain author verification, and style change detection, seeking to advance the state of the art and to evaluate it on new benchmark datasets.
TL;DR: This overview of the ChEMU2020 lab, which focuses on extracting the synthesis processes of new chemical compounds from chemical patents, describes the resources created for the two tasks, the evaluation methodology adopted, and participants' results.
Abstract: The discovery of new chemical compounds is perceived as a key driver of the chemistry industry and many other economic sectors. Information about new discoveries is usually disclosed in scientific literature and, in particular, in chemical patents, since patents are often the first venues where new chemical compounds are publicized. Despite the significance of the information provided in chemical patents, extracting it is costly due to the large volume of existing patents and its drastic expansion rate. The Cheminformatics Elsevier Melbourne University (ChEMU) evaluation lab 2020, part of the Conference and Labs of the Evaluation Forum 2020 (CLEF2020), provides a platform to advance the state of the art in automatic information extraction systems over chemical patents. In particular, we focus on extracting the synthesis processes of new chemical compounds from chemical patents. Using the ChEMU corpus of 1500 “snippets” (text segments) sampled from 170 patent documents and annotated by chemical experts, we defined two key information extraction tasks. Task 1 targets chemical named entity recognition, i.e., the identification of chemical compounds and their specific roles in chemical reactions. Task 2 targets event extraction, i.e., the identification of reaction steps, relating the chemical compounds involved in a chemical reaction. In this paper, we provide an overview of our ChEMU2020 lab. Herein, we describe the resources created for the two tasks, the evaluation methodology adopted, and participants' results. We also provide a brief summary of the methods employed by participants of this lab and the results obtained across 46 runs from 11 teams, finding that several submissions achieve substantially better results than the baseline methods prepared by the organizers.
TL;DR: In this article, a deep learning framework is proposed to identify the attributes (severity, course, temporal expression, and document creation time) associated with medical concepts extracted from electronic medical records.
Abstract: Event extraction is one of the crucial tasks in biomedical text mining that aims to extract specific information concerning incidents embedded in the texts. In this article, we propose a deep learning framework that aims to identify the attributes (severity, course, temporal expression, and document creation time) associated with the medical concepts extracted from electronic medical records. The bi-directional long short-term memory network assisted by the attention mechanism is utilized to uncover the important aspects of the patient’s medical conditions. The attention mechanism specific to the medical disorder mention can focus on various parts of the sentence when different disorders are considered as input. The proposed methodology is evaluated on the benchmark ShARe/CLEF eHealth Evaluation Lab 2014 shared task 2 datasets. In addition to the CLEF dataset, we also used social media text, in particular medical blog posts. Experimental results illustrate that the proposed approach achieves significant performance improvements over the state-of-the-art techniques and the highly competitive deep learning-based baseline methods.
TL;DR: The substantial community interest in the tasks and their resources has led to CLEF eHealth maturing as a primary venue for all interdisciplinary actors of the ecosystem for producing, processing, and consuming electronic health information.
Abstract: Laypeople’s increasing difficulties in retrieving and digesting valid and relevant information in their preferred language to make health-centred decisions have motivated CLEF eHealth to organize yearly labs since 2012. These 20 evaluation tasks on Information Extraction (IE), management, and Information Retrieval (IR) in 2013–2019 have been popular, as demonstrated by the large number of team registrations, submissions, papers, their included authors, and citations (748, 177, 184, 741, and 1299, respectively, up to and including 2018), and achieved statistically significant improvements in the processing quality. In 2020, CLEF eHealth is calling for participants to contribute to the following two tasks: The 2020 Task 1 on IE focuses on term coding for clinical textual data in Spanish. The terms considered are extracted from clinical case records and they are mapped onto the Spanish version of the International Classification of Diseases, the 10th Revision, including also textual evidence spans for the clinical codes. The 2020 Task 2 is a novel extension of the most popular and established task in CLEF eHealth on Consumer Health Search (CHS). This IR task uses the representative web corpus used in the 2018 challenge, but now also spoken queries, as well as textual transcripts of these queries, are offered to the participants. The task is structured into a number of optional subtasks, covering ad-hoc search using the spoken queries, textual transcripts of the spoken queries, or provided automatic speech-to-text conversions of the spoken queries. In this paper we describe the evolution of CLEF eHealth and this year’s tasks. The substantial community interest in the tasks and their resources has led to CLEF eHealth maturing as a primary venue for all interdisciplinary actors of the ecosystem for producing, processing, and consuming electronic health information.
TL;DR: In the CLEF 2020 Task of Chemical Reaction Extraction from Patent as discussed by the authors, the task consisted of two subtasks: (1) Named entity recognition to identify compounds and different semantic roles in the chemical reaction; (2) Event extraction to identify event-triggers of chemical reaction and their relations with the semantic roles recognized in subtask 1.
Abstract: This work describes the participation of the Melaxtech team in the CLEF 2020 – ChEMU Task of Chemical Reaction Extraction from Patent. The task consisted of two subtasks: (1) Named entity recognition to identify compounds and different semantic roles in the chemical reaction. (2) Event extraction to identify event-triggers of chemical reaction and their relations with the semantic roles recognized in subtask 1. We developed hybrid approaches combining both deep learning models and pattern-based rules for this task. Our approaches achieved state-of-the-art results in both subtasks, with the best F1 of 0.957 for entity recognition and the best F1 of 0.9536 for event extraction, indicating that the proposed approaches are promising.
TL;DR: The CLEF hypothesis provides a simple solution to all protein folding paradoxes, and proposes a “CLEF age” or “Stone Age” for the prebiotic evolution of proteins.
Abstract: The protein folding problem has been extensively studied for decades, and hundreds of thousands of protein structures have been solved. Yet, how proteins fold from a linear peptide chain to their unique 3D structures is not fully understood. With key clues having emerged unexpectedly from the field of nanoscience, a "Confined Lowest Energy Fragment" (CLEF) hypothesis was proposed. The CLEF hypothesis states that a protein chain can be divided into CLEFs, the semi-independent folding units, by a small number of key residues that form key long-range interactions. The native structure of a CLEF is the lowest energy state under the constraints of the key long-range interactions, but the native structure of the whole protein is not necessarily the lowest energy state, as Anfinsen's thermodynamic hypothesis suggested. The CLEF hypothesis proposes a unified CLEF mechanism for protein folding, basically a two-step process. In the first step, the favorable enthalpy of CLEFs for native structures quickly brings those residues for the key long-range interactions together, forming intermediates corresponding to the so-called hydrophobic collapse. In the second step, those collapsed key residues shuffle for the right combination to form the native key long-range interactions. The CLEF hypothesis provides a simple solution to all protein folding paradoxes, and proposes a "CLEF Age" or "Stone Age" for the prebiotic evolution of proteins.
TL;DR: A comparison study between a set of classifiers has been carried out, and the best results were achieved using the model LSVC, which yielded an F1-score of 76% and 58.50% for Spanish and English, respectively.
Abstract: In this paper, we present a description of our experiments on Profiling Fake News Spreaders on Twitter based on TF-IDF features and morphological processes such as stemming, lemmatization and part-of-speech tagging. A comparison study between a set of classifiers has been carried out. The best results were achieved using the model LSVC, which yielded an F1-score of 76% and 58.50% for Spanish and English, respectively.
TL;DR: This paper approaches this challenge through extracting linguistic and sentiment features from users’ tweet feed as well as retrieving the presence of emojis, hashtags and political bias in their tweets, and achieves 72% accuracy, being among the top-4 results obtained by systems for the task in the English language.
Abstract: Automatic detection of fake news in social media has become a prominent research topic due to its widespread, adverse effect not only on society and public health but also on the economy and democracy. Computational approaches to the automatic detection of fake news range from analyzing source credibility and user credibility to analyzing social network structure and news content. However, the studies on user credibility in this context have largely focused on the frequency and times of engaging in fake news propagation rather than profiling users based on the content of their tweets. In this paper, we approach this challenge through extracting linguistic and sentiment features from users’ tweet feed as well as retrieving the presence of emojis, hashtags and political bias in their tweets. These features are then used to classify users into spreaders or non-spreaders of fake news. Our proposed approach achieves 72% accuracy, being among the top-4 results obtained by systems for the task in the English language.
TL;DR: This paper tackled the task of automatically assigning ICD-10 diagnosis and procedure codes to Spanish electronic health records using a dictionary-based approach, and achieved an F1-score of 0.69 for the detection of diagnoses and 0.52 for the detection of procedures on a test set of 250 clinical cases.
Abstract: In this paper, we describe the approach and the results of our participation in task 1 (multilingual information extraction) of the CLEF eHealth 2020 challenge. We tackled the task of automatically assigning ICD-10 diagnosis and procedure codes to Spanish electronic health records. We used a dictionary-based approach using only materials provided by the task organizers. The training set consisted of 750 clinical cases annotated by a medical expert. Our system achieved an F1-score of 0.69 for the detection of diagnoses and 0.52 for the detection of procedures on a test set of 250 clinical cases.
TL;DR: In this paper, the effects of processing fluency on metamemory for written music were examined for short sequences notated in either treble or bass clef by playing them on a sile...
Abstract: We examined the effects of processing fluency on metamemory for written music. In Experiment 1, piano players studied short sequences notated in either treble or bass clef by playing them on a sile...
TL;DR: BERT-SciELO, a BERT-Base model pre-trained from scratch on an unlabeled corpus of biomedical articles in Spanish, achieved the best results among three submitted systems, obtaining a final Mean Average Precision (MAP) metric score of 0.482 on the evaluation set.
Abstract: This working notes paper presents our contribution to the CLEF eHealth 2020 Task 1. Our team has participated in the CodiEsp-D subtask, the first shared task on the automatic clinical coding of medical cases in Spanish annotated with ICD-10-CM codes. We tackled the task as a multi-label classification problem using a BERT model [4]. With the aim of leveraging all the language modeling capacities of the deep bidirectional encoder architecture of BERT, we developed a tailored approach to annotate short fragments of text extracted from the long clinical cases present in the CodiEsp corpus and use them as input to the model. Two publicly available Spanish versions of BERT, namely BETO [3] and BERT-SciELO [1], were fine-tuned on the CodiEsp-D corpus extended by a set of abstracts annotated with ICD-10 codes, following our fragment-based classification approach. BERT-SciELO, a BERT-Base model pre-trained from scratch on an unlabeled corpus of biomedical articles in Spanish, achieved the best results among our three submitted systems, obtaining a final Mean Average Precision (MAP) metric score of 0.482 on the evaluation set.
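Several of the CodiEsp entries in this list report Mean Average Precision (MAP). As a reading aid, the following is a minimal sketch of the textbook definition of MAP over ranked code lists. The function names and the toy inputs are illustrative assumptions; the official CodiEsp evaluation script may differ in details such as rank cutoffs or duplicate handling.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked list (assumed to contain no duplicates)."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant item appears.
            precision_sum += hits / rank
    # Normalise by the number of relevant items, so missed items lower the score.
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranked_list, relevant_set) pairs."""
    return sum(average_precision(r, g) for r, g in runs) / len(runs)


# Toy example with hypothetical labels (not real ICD-10 codes).
runs = [
    (["a", "b", "c"], {"a", "c"}),
    (["x", "y"], {"y"}),
]
print(mean_average_precision(runs))
```

In the first toy document the relevant labels sit at ranks 1 and 3, and in the second the single relevant label sits at rank 2; MAP averages the per-document scores.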
TL;DR: This paper describes the participation of the Document and Pattern Recognition Lab from the Rochester Institute of Technology in the CLEF 2020 ARQMath lab; the Task 2 (Formula Retrieval) runs yielded strong results, while the Task 1 (Question Answering) results were less competitive.
Abstract: This paper describes the participation of the Document and Pattern Recognition Lab from the Rochester Institute of Technology in the CLEF 2020 ARQMath lab. There are two tasks defined for ARQMath: (1) Question Answering, and (2) Formula Retrieval. Four runs were submitted for Task 1 using systems that take advantage of text and formula embeddings. For Task 2, three runs were submitted: one uses only formula embedding, another uses formula and text embeddings, and the final one uses formula embedding followed by re-ranking results by tree-edit distance. The Task 2 runs yielded strong results, while the Task 1 results were less competitive.
TL;DR: This paper describes the submission to the CLEF HIPE 2020 shared task on identifying named entities in multi-lingual historical newspapers in French, German and English, which uses an ensemble of fine-tuned BERT models for named entity recognition and three different methods for entity linking.
Abstract: This paper describes our submission to the CLEF HIPE 2020 shared task on identifying named entities in multi-lingual historical newspapers in French, German and English. The subtasks we addressed in our submission include coarse-grained named entity recognition, entity mention detection and entity linking. For the task of named entity recognition we used an ensemble of fine-tuned BERT models; entity linking was approached by three different methods: (1) a simple method relying on ElasticSearch retrieval scores, (2) an approach based on contextualised text embeddings, and (3) REL, a modular entity linking system based on several state-of-the-art components.
TL;DR: This paper elaborates on the submission to the ARQMath track at CLEF 2020, using a two-stage retrieval technique in which the first stage is a fusion of traditional BM25 scoring and tf-idf with cosine similarity-based retrieval while the second stage is a re-ranking technique using contextualized embeddings.
TL;DR: The task was a novel extension of the most popular and established task in CLEF eHealth, Consumer Health Search, covering ad-hoc search with spoken queries; the paper describes the resources created for the task and the evaluation methodology adopted.
TL;DR: These working notes present the participation of the IXA-AAA team in the CodiEsp Track, part of CLEF 2020; the team developed several systems to cope with the three sub-tasks, including tree-based multi-label classifiers, similarity match strategies, and ensemble models.
Abstract: These working notes present the participation of the IXA-AAA team in the CodiEsp Track, part of CLEF 2020. The track is about automatic coding of clinical records according to the International Classification of Diseases 10th revision (ICD-10). There are three sub-tasks: CodiEsp-D, CodiEsp-P and CodiEsp-X. The two main tasks, CodiEsp-D and CodiEsp-P, aim to develop systems able to automatically classify clinical texts according to the ICD-10, respectively for diagnostics and procedures. CodiEsp-X, by contrast, is an exploratory sub-task within the framework of Explainable AI in which the goal is to detect the text fragment that motivates the presence of the ICD code. For the IXA-AAA team participation, we have developed several systems to cope with the three sub-tasks, including tree-based multi-label classifiers, similarity match strategies, and ensemble models. For the similarity match, we have explored several approaches and algorithms, from string edit distances such as Levenshtein to dense representations with Transformer-based BERT models. Our best results overall are achieved by the combination of models, with a MAP of 69.8% for CodiEsp-D and 48.1% for CodiEsp-P. Regarding the exploratory task, CodiEsp-X, our best coder achieves a micro F1-Score of 30.6%.
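The string edit distance side of the similarity match mentioned above can be illustrated with a minimal sketch. This is not the team's system: the helper names, the code identifiers `CODE_A` and `CODE_B`, and the two descriptions are hypothetical placeholders, and a real coder would match against the full ICD-10 terminology.

```python
def levenshtein(a, b):
    """Levenshtein edit distance between strings a and b (dynamic programming)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def best_match(fragment, code_descriptions):
    """Return (code, distance) for the description closest to the fragment."""
    scored = ((code, levenshtein(fragment.lower(), desc.lower()))
              for code, desc in code_descriptions.items())
    # Ties on distance are broken alphabetically by code for determinism.
    return min(scored, key=lambda pair: (pair[1], pair[0]))


# Hypothetical mini-dictionary; the identifiers are placeholders, not ICD-10 codes.
descriptions = {"CODE_A": "diabetes mellitus", "CODE_B": "hypertension"}
print(best_match("diabetes melitus", descriptions))
```

An edit distance tolerates small spelling variations in clinical text, which is one reason to pair it with dense representations that capture paraphrases it cannot.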
TL;DR: This paper describes the team’s participation in Tracks 1 & 2 of the Conference and Labs of the Evaluation Forum (CLEF 2020) challenge organized by Cheminformatics Elsevier Melbourne University for extracting information about chemical reactions from patents, and discusses their systems: MedaCy, a Python-based supervised multi-class entity recognition system, and RelEx, a Python-based relation extraction system which includes rule-based and supervised learning pipelines.
Abstract: This paper describes our team’s participation in Tracks 1 & 2 of the Conference and Labs of the Evaluation Forum (CLEF 2020) challenge organized by Cheminformatics Elsevier Melbourne University for extracting information about chemical reactions from patents. We discuss our systems: MedaCy, a Python-based supervised multi-class entity recognition system, and RelEx, a Python-based relation extraction system which includes rule-based and supervised learning pipelines. Our best model for Task 1 obtained an overall relaxed precision of 0.95 and exact precision of 0.87; relaxed recall of 0.99 and exact recall of 0.86; and relaxed F1 score of 0.97 and exact F1 score of 0.87. Our best model for Task 2 obtained an overall precision of 0.80; recall of 0.54; and F1 score of 0.65.
TL;DR: This paper describes the system developed for ChEMU @ CLEF Cheminformatics Elsevier Melbourne University lab, Named Entity Recognition task for identifying chemical compounds as well as their types in context, i.e., to assign the label of a chemical compound according to the role which the compound plays within a chemical reaction from patent documents.
Abstract: This paper describes our system developed for ChEMU @ CLEF Cheminformatics Elsevier Melbourne University lab, Named Entity Recognition (NER) task for identifying chemical compounds as well as their types in context, i.e., to assign the label of a chemical compound according to the role which the compound plays within a chemical reaction from patent documents. We present two systems, one using Conditional Random Fields (CRFs) and one using Artificial Neural Networks (ANN). In this work we used a feature set that includes linguistic, orthographical and lexical clue features. In the development of the systems, we used only the training data provided by the track organizers; no other external resources or embedding models were used. We obtained an F-score of 0.6640 using CRFs and an F-score of 0.3764 using ANN on the test data.
TL;DR: The main finding was that combining word embeddings could be a useful strategy to apply for deep learning-based approaches, even though the combined embeddings do not belong to the medical domain.
Abstract: This paper describes the system presented by the SINAI team for the Multilingual Information Extraction task of the CLEF eHealth Lab 2020. This task focuses on the automatic assignment of the International Classification of Diseases (ICD) codes to health-related texts in Spanish. Our proposal follows a deep learning-based approach where we have used the bidirectional variant of a Long Short Term Memory (LSTM) network along with a stacked Conditional Random Fields (CRF) decoding layer (BiLSTM+CRF). The aim of the experiments carried out was to test the performance of different pre-trained word embeddings for recognizing diagnoses and procedures in clinical text. The main finding was that combining word embeddings could be a useful strategy to apply for deep learning-based approaches, even though the combined embeddings do not belong to the medical domain. The best MAP scores achieved were 0.314 and 0.293 for the CodiEsp-D and CodiEsp-P subtasks, respectively.
TL;DR: The ARQMath Lab at CLEF considers finding answers to new mathematical questions among posted answers on a community question answering site (Math Stack Exchange), which includes a formula retrieval sub-task.
Abstract: The ARQMath Lab at CLEF considers finding answers to new mathematical questions among posted answers on a community question answering site (Math Stack Exchange). Queries are question posts held out from the searched collection, each containing both text and at least one formula. This is a challenging task, as both math and text may be needed to find relevant answer posts. ARQMath also includes a formula retrieval sub-task: individual formulas from question posts are used to locate formulae in earlier question and answer posts, with relevance determined considering the context of the post from which a query formula is taken, and the posts in which retrieved formulae appear.
TL;DR: In this paper, the authors explore the distinct linguistic characteristics related to Twitter compared to the traditional oral or written form, and discover the linguistic features strongly related to bots, and those associated with men or women.
Abstract: In this second chapter presenting stylometric applications, the social networks, and more precisely Twitter, are the source of our dataset. To explore new forms of communication, this chapter explores the distinct linguistic characteristics related to Twitter compared to the traditional oral or written form. For example, the frequency of mentions (e.g., @POTUS44), hyperlinks (e.g., www.nytimes.com), retweets or emojis can be exploited to profile the author of a set of tweets. The dataset, freely available, is provided by the CLEF PAN evaluation campaign in 2019. With this corpus, the first classification task is to discriminate between tweets generated by bots or by humans. In a second application, the computer must identify tweets written by men or women. As a useful additional result, one can discover the linguistic features strongly related to bots, and those associated with men or women.
TL;DR: The unsupervised component is used to provide code evidences in EHRs exploiting a greater interpretability and the mixed approach improves the strict supervised proposals by more than 38% and 13% respectively.
Abstract: This paper describes our contribution to the CLEF eHealth 2020 Task 1, consisting of the CIE-10-ES annotation of Spanish Electronic Health Records (EHRs). CIE-10-ES coding is the extended version of the ICD-10 in Spain. One of the sub-tasks is aimed at the interpretability of proposals, which is in line with the latest demands in Natural Language Processing (NLP). Moreover, ICD-10 entries generated by hospitals usually follow an extreme distribution, involving complex annotation challenges. For that reason, an unsupervised semantic similarity-based method has been explored using a representation based on SNOMED-CT clinical terminology. Since example-based learning is able to capture complex patterns, the proposal has been combined with Gradient Boosting methods to model the codes with more instances. mAP scores of 0.517 are achieved for CIE-10-ES codes associated with diagnoses and 0.398 for CIE-10-ES procedure codes. The mixed approach improves the strict supervised proposals by more than 38% and 13% respectively. Finally, the unsupervised component is used to provide code evidences in EHRs exploiting a greater interpretability.
TL;DR: The University of Amsterdam’s participation in the CLEF 2020 Touché Track covers two tasks, Conversational Argument Retrieval and Comparative Argument Retrieval, and includes a pipeline to re-rank documents retrieved from Clueweb using three features: PageRank scores, web domains, and argumentativeness.
TL;DR: This suggested approach is based on a two-stage method ignoring infrequent terms and ranking the others according to their occurrence differences between the two categories, and a classifier is implemented combining decision tree, random forest, and boosting.
Abstract: In our participation in the “Profiling Fake News Spreaders on Twitter” task (both in English and Spanish), our main objective is to detect Twitter user accounts used to spread disinformation, fake news, and conspiracy theories. To solve this problem automatically based only on the tweets' contents, we suggest reducing the number of features (isolated words) to a few hundred. This suggested approach is based on a two-stage method ignoring infrequent terms and ranking the others according to their occurrence differences between the two categories. Finally, a classifier is implemented combining decision tree, random forest, and boosting. Our first evaluation experiments indicate an overall accuracy around 70%.
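The two-stage feature reduction described in this abstract can be sketched as follows. This is only one possible reading: the abstract does not specify the frequency threshold, the tokenisation, or the exact difference measure, so the whitespace tokeniser, the `min_count` parameter, and the use of the absolute difference of relative frequencies are all assumptions made for illustration.

```python
from collections import Counter


def select_terms(docs_a, docs_b, min_count=2, top_k=5):
    """Stage 1: drop infrequent terms. Stage 2: rank the rest by how
    differently they occur in the two categories."""
    counts_a = Counter(tok for doc in docs_a for tok in doc.lower().split())
    counts_b = Counter(tok for doc in docs_b for tok in doc.lower().split())
    total_a = sum(counts_a.values()) or 1  # guard against empty categories
    total_b = sum(counts_b.values()) or 1

    # Stage 1: keep terms whose overall count reaches the (assumed) threshold.
    vocabulary = set(counts_a) | set(counts_b)
    frequent = [t for t in vocabulary if counts_a[t] + counts_b[t] >= min_count]

    # Stage 2: absolute difference of relative frequencies (an assumed measure).
    def difference(term):
        return abs(counts_a[term] / total_a - counts_b[term] / total_b)

    # Sort by decreasing difference, then alphabetically for determinism.
    ranked = sorted(frequent, key=lambda t: (-difference(t), t))
    return ranked[:top_k]


# Toy tweets invented for illustration only.
spreaders = ["fake shocking news", "shocking claim"]
others = ["official report news", "official statement"]
print(select_terms(spreaders, others, top_k=2))
```

The selected terms would then serve as the reduced feature set for a downstream classifier such as the decision tree, random forest, and boosting combination named in the abstract.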
TL;DR: The presented models use the neural principles of convolution and attention to obtain their results and a hierarchical component is introduced as well as hierarchical post-processing heuristics that leverage the information that is inherently present in the ICD taxonomy.
Abstract: In this paper, we compare state-of-the-art neural network approaches to the 2020 CLEF eHealth task 1. The presented models use the neural principles of convolution and attention to obtain their results. Furthermore, a hierarchical component is introduced as well as hierarchical post-processing heuristics. These additions successfully leverage the information that is inherently present in the ICD taxonomy.