Top 32 papers published on the topic of CLEF in 2016

Clinical Information Extraction at the CLEF eHealth Evaluation lab 2016.

Aurélie Névéol¹, Kevin Bretonnel Cohen¹, Kevin Bretonnel Cohen², Cyril Grouin¹, Thierry Hamon³, Thomas Lavergne¹, Liadh Kelly⁴, Lorraine Goeuriot⁵, Grégoire Rey⁶, Aude Robert⁶, Xavier Tannier¹, Pierre Zweigenbaum¹•Institutions (6)

Université Paris-Saclay¹, University of Colorado Boulder², University of Paris³, Trinity College, Dublin⁴, University of Grenoble⁵, French Institute of Health and Medical Research⁶

1 Sep 2016

TL;DR: Task 2 of the 2016 CLEF eHealth lab extended the previous information extraction tasks of the ShARe/CLEF eHealth evaluation labs by introducing a large-scale classification task on French death certificates, which consisted of extracting causes of death as coded in the International Classification of Diseases, tenth revision (ICD10).

Abstract: This paper reports on Task 2 of the 2016 CLEF eHealth evaluation lab, which extended the previous information extraction tasks of the ShARe/CLEF eHealth evaluation labs. The task continued with named entity recognition and normalization in French narratives, as offered in CLEF eHealth 2015. Named entity recognition involved ten types of entities, including disorders, that were defined according to Semantic Groups in the Unified Medical Language System® (UMLS®), which was also used for normalizing the entities. In addition, we introduced a large-scale classification task on French death certificates, which consisted of extracting causes of death as coded in the International Classification of Diseases, tenth revision (ICD10). Participant systems were evaluated against a blind reference standard of 832 titles of scientific articles indexed in MEDLINE, 4 drug monographs published by the European Medicines Agency (EMEA) and 27,850 death certificates using Precision, Recall and F-measure. In total, seven teams participated, including five in the entity recognition and normalization task, and five in the death certificate coding task. Three teams submitted their systems to our newly offered reproducibility track. For entity recognition, the highest performance was achieved on the EMEA corpus, with an overall F-measure of 0.702 for plain entity recognition and 0.529 for normalized entity recognition. For entity normalization, the highest performance was achieved on the MEDLINE corpus, with an overall F-measure of 0.552. For death certificate coding, the highest performance was 0.848 F-measure.

68 citations

Erasmus MC at CLEF eHealth 2016: Concept recognition and coding in French texts

Erik M. van Mulligen, Zubair Afzal, Saber A. Akhondi, Dang Vo, Jan A. Kors

1 Jan 2016

TL;DR: This work used Peregrine, the authors' open-source indexing engine, with a dictionary based on French terms in the Unified Medical Language System supplemented with English UMLS terms that were translated into French with automatic translators to address entity recognition and normalization in a corpus of French drug labels and Medline titles.

Abstract: We participated in task 2 of the CLEF eHealth 2016 challenge. Two subtasks were addressed: entity recognition and normalization in a corpus of French drug labels and Medline titles, and ICD-10 coding of French death certificates. For both subtasks we used a dictionary-based approach. For entity recognition and normalization, we used Peregrine, our open-source indexing engine, with a dictionary based on French terms in the Unified Medical Language System (UMLS) supplemented with English UMLS terms that were translated into French with automatic translators. For ICD-10 coding, we used the Solr text tagger, together with one of two ICD-10 terminologies derived from the task training material. To reduce the number of false-positive detections, we implemented several post-processing steps. On the challenge test set, our best system obtained F-scores of 0.702 and 0.651 for entity recognition in the drug labels and in the Medline titles, respectively. For entity normalization, F-scores were 0.529 and 0.474. On the test set for ICD-10 coding, our system achieved an F-score of 0.848 (precision 0.886, recall 0.813). These scores were substantially higher than the average score of the systems that participated in the challenge.
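
The ICD-10 coding figures in this abstract can be checked by hand, since the F-score is the harmonic mean of precision and recall. The following is a minimal sketch, assuming the balanced F1 variant (beta = 1); the function name `f_measure` is an illustrative choice and is not taken from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta=1 gives the balanced F1 (harmonic mean)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # convention: no correct predictions at all
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall reported in the abstract for ICD-10 coding.
score = f_measure(0.886, 0.813)
print(round(score, 3))  # 0.848
```

The result agrees with the reported F-score of 0.848, which suggests the reported value is the balanced F1.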

42 citations

Proceedings Article•

GronUP: Groningen User Profiling: Notebook for PAN at CLEF 2016

Mart Busger op Vollenbroek, Talvany Carlotto, Tim Kreutz¹, Masha Medvedeva², Chris Pool, Johannes Bjerva², Hessel Haagsma², Malvina Nissim²•Institutions (2)

University of Antwerp¹, University of Groningen²

1 Sep 2016

TL;DR: An SVM model trained on tweets performs user profiling, in terms of gender and age, on non-Twitter social media data in English, Dutch, and Spanish, without any language-specific tuning of features or parameters.

Abstract: We trained an SVM model on tweets to perform user profiling, in terms of gender and age, on non-Twitter social media data. The system exploits features that we deemed appropriate to profile authors on social media, and that do not characterise too closely the specific usage of Twitter. Our system works on English, Dutch, and Spanish data without any language-specific tuning of features or parameters. Results on the cross-validated training set seem to indicate that features contribute rather equally to the model’s performance.

26 citations

ECSTRA-INSERM @ CLEF eHealth2016-task 2: ICD10 Code Extraction from Death Certificates.

Mohamed Dermouche, Vincent Looten, Rémi Flicoteaux, Sylvie Chevret, Julien Velcin, Namik Taright

5 Sep 2016

TL;DR: This paper casts the task as a machine learning problem involving the prediction of the ICD10 codes (categorical variable) from the raw text transformed into a bag-of-words matrix and relies on probabilistic topic models that are evaluated against classical classifiers such as SVM and Naive Bayes.

Abstract: This paper describes the participation of the ECSTRA-INSERM team at CLEF eHealth 2016, task 2.C. The task involves extracting ICD10 codes from death certificates, mainly described with short plain texts. We cast the task as a machine learning problem involving the prediction of the ICD10 codes (categorical variable) from the raw text transformed into a bag-of-words matrix. We rely on probabilistic topic models that we evaluate against classical classifiers such as SVM and Naive Bayes. We demonstrate the effectiveness of topic models for this task in terms of prediction accuracy and result interpretation.
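
To make the framing concrete, here is a minimal sketch of one of the classical baselines named in the abstract: a multinomial Naive Bayes classifier over bag-of-words counts, with add-one smoothing. It does not reproduce the team's topic models. The class name, the whitespace tokenizer, and the four training lines are assumptions made up for illustration; they are not CépiDC data.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes over bag-of-words counts, add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            total = sum(self.word_counts[label].values())
            score = math.log(n / n_docs)  # log prior
            for tok in tokenize(text):
                if tok in self.vocab:     # ignore words never seen in training
                    count = self.word_counts[label][tok]
                    score += math.log((count + 1) / (total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy lines labelled with ICD-10 categories
# (I21: acute myocardial infarction, J18: pneumonia, organism unspecified).
train_texts = ["infarctus du myocarde", "infarctus aigu du myocarde",
               "pneumonie", "pneumonie bilatérale"]
train_labels = ["I21", "I21", "J18", "J18"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("infarctus myocarde"))
print(model.predict("pneumonie"))
```

A real system would face thousands of codes and several codes per certificate, which this single-label toy setup does not address.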

26 citations

Book Chapter•10.1007/978-3-319-44564-9_30•

Overview of the CLEF 2016 Cultural Micro-blog Contextualization Workshop

Lorraine Goeuriot¹, Josiane Mothe², Philippe Mulhem¹, Fionn Murtagh³, Fionn Murtagh⁴, Eric SanJuan⁵•Institutions (5)

University of Grenoble¹, University of Toulouse², Goldsmiths, University of London³, University of Derby⁴, University of Avignon⁵

5 Sep 2016

TL;DR: The Cultural Micro-blog Contextualization Workshop aims to provide the research community with data sets to gather, organize and deliver relevant social data related to events that generate a large number of micro-blog posts and web documents.

Abstract: The CLEF Cultural Micro-blog Contextualization Workshop aims to provide the research community with data sets to gather, organize and deliver relevant social data related to events that generate a large number of micro-blog posts and web documents. It is also devoted to discussing tasks that could be run on this data set and that could serve applications.

22 citations

Proceedings Article•

SIBM at CLEF eHealth Evaluation Lab 2016: Extracting Concepts in French Medical Texts with ECMT and CIMIND.

Chloé Cabot, Lina Fatima Soualmia, Badisse Dahamna, Stéfan Jacques Darmoni

5 Sep 2016

TL;DR: This paper presents SIBM’s participation in the Multilingual Information Extraction task 2 of the CLEF eHealth 2016 evaluation initiative, which focuses on named entity recognition in French written text, and reports on the indexing of the provided QUAERO dataset with multiple knowledge organization systems (KOS) partially or totally translated into French.

Abstract: This paper presents SIBM’s participation in the Multilingual Information Extraction task 2 of the CLEF eHealth 2016 evaluation initiative, which focuses on named entity recognition in French written text. We report on the indexing of the provided QUAERO dataset with multiple knowledge organization systems (KOS) partially or totally translated into French. The extraction method is available online in the form of a web-based service that queries the KOS to extract clinical concepts from Electronic Health Records. It is also available via a user-friendly interface developed for clinicians. We addressed the identification of relevant clinical entities within the International Classification of Diseases version 10 in the CépiDC dataset with a system based on natural language processing and approximate string matching methods. The results obtained this year were rather satisfactory and attested to significant progress, particularly in exact match recognition, since our participation last year.

22 citations

IPL at CLEF 2016 Medical Task.

Leonidas Valavanis, Spyridon Stathopoulos, Theodore Kalamboukis

1 Jan 2016

TL;DR: This paper presents the image classification techniques performed by the IPL Group for the subfigure classification subtask of ImageCLEF 2016 Medical Task and presents the results of the runs and the extensive experiments applying early or late fusion on the results obtained from a multi-class linear kernel support vector machine.

Abstract: In this paper we present the image classification techniques performed by the IPL Group for the subfigure classification subtask of the ImageCLEF 2016 Medical Task. For the visual representation of images, various state-of-the-art visual features, such as Bag of Visual Words computed with pyramid histogram of visual words descriptors and quadtree bag-of-colors, were adopted. We present the results of our runs and our extensive experiments applying early or late fusion on the results obtained from a multi-class linear kernel support vector machine. Our top run was ranked 3rd among 34 runs.

16 citations

BiTeM at CLEF eHealth Evaluation Lab 2016 Task 2: Multilingual Information Extraction.

Luc Mottin, Julien Gobeill, Anaïs Mottaz, Emilie Pasche, Arnaud Gaudinat, Patrick Ruch

1 Jan 2016

TL;DR: This paper reports on the participation of the BiTeM/SIB Text Mining team in the CLEF eHealth 2016 evaluation lab, including an ad hoc solution based on simple pattern matching that was adopted to comply with the constraints of the CépiDC challenge.

Abstract: BiTeM/SIB Text Mining (http://bitem.hesge.ch/) is a university research group carrying out activities in semantic and text analytics applied to health and life sciences. This paper reports on the participation of our team in the CLEF eHealth 2016 evaluation lab. The processing applied to each evaluation corpus (QUAERO and CépiDC) was originally very similar. Our method is based on an Automatic Text Categorization (ATC) system. First, the system is set up with a specific input ontology (French UMLS), and ATC assigns a ranked list of related concepts to each input document. Then, a second module re-locates all of the positive matches in the text and normalizes the extracted entities. For the CépiDC corpus, the system was loaded with the Swiss ICD-10 GM thesaurus. However, a last-minute data transformation issue forced us to implement an ad hoc solution based on simple pattern matching to comply with the constraints of the CépiDC challenge. We obtained an average precision of 62% on the QUAERO entity extraction (over MEDLINE/EMEA texts, and exact/inexact), 48% on normalizing these entities, and 59% on the CépiDC subtask. Enhancing recall by expanding the coverage of the terminologies could be an interesting way to improve this system at moderate labour cost.

15 citations

Task 1 of the CLEF eHealth Evaluation Lab 2016: Handover Information Extraction

Hanna Suominen¹, Hanna Suominen², Liyuan Zhou, Lorraine Goeuriot³, Liadh Kelly⁴•Institutions (4)

University of Turku¹, Australian National University², University of Grenoble³, Dublin City University⁴

1 Jan 2016

TL;DR: This overview paper reports on how these issues were addressed in a shared task of the eHealth evaluation lab of the Conference and Labs of the Evaluation Forum (CLEF) in 2016, where all participant methods outperformed all 4 baselines.

Abstract: Cascaded speech recognition (SR) and information extraction (IE) could support the best practice for clinical handover and release clinicians’ time from writing documents to patient interaction and education. However, high requirements for processing correctness evoke methodological challenges and hence, processing correctness needs to be carefully evaluated as meeting the requirements. This overview paper reports on how these issues were addressed in a shared task of the eHealth evaluation lab of the Conference and Labs of the Evaluation Forum (CLEF) in 2016. This IE task built on the 2015 CLEF eHealth Task on SR by using its 201 synthetic handover documents for training and validation (approx. 8,500 + 7,700 words) and releasing another 100 documents with over 6,500 expert-annotated words for testing. It attracted 25 team registrations and 3 team submissions with 2 methods each. When using the macro-averaged F1 over the 35 form headings present in the training documents for evaluation on the test documents, all participant methods outperformed all 4 baselines, including the organizers’ method (F1 = 0.25), published in 2015 in a top-tier medical informatics journal and provided to the participants as an option to build on, a random classifier (F1 = 0.02), and majority classifiers for the two most common classes (i.e., NA to filter out text irrelevant to the form and the most common form heading, both with F1 < 0.00). The top-2 methods (F1 = 0.38 and 0.37) had statistically significantly (p < 0.05, Wilcoxon signed-rank test) better performance than the third-best method (F1 = 0.35). In comparison, the top-3 methods and the organizers’ method (7th) had F1 of 0.81, 0.80, 0.81, and 0.75 in the NA class, respectively.
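
The gap between the macro-averaged scores (below 0.4) and the NA-class scores (around 0.8) follows from how macro averaging works: each class contributes equally, however rare it is. Below is a minimal sketch of macro-averaged F1 for single-label predictions. The label names are placeholders, and details of the official evaluation (for example, exactly which classes enter the average) are not given in the abstract, so this is an assumption-laden illustration and not the lab's scoring script.

```python
from collections import Counter


def macro_f1(gold, predicted, labels=None):
    """Average the per-class F1 scores, giving every class equal weight."""
    if labels is None:
        labels = sorted(set(gold))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[g] += 1  # gold class gains a false negative
    per_class = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        per_class.append(2 * tp[label] / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Placeholder labels: "NA" plus two hypothetical form headings.
gold = ["NA", "NA", "NA", "NA", "HeadingA", "HeadingB"]
always_na = ["NA"] * len(gold)  # a majority-class baseline

print(macro_f1(gold, always_na))
```

In this toy case the majority baseline labels four of six words correctly, yet it scores zero on the two headings it never predicts, so its macro average stays low.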

14 citations

UniNE at CLEF 2016: Author Clustering.

Mirco Kocher

1 Jan 2016

TL;DR: An effective unsupervised author clustering and authorship linking model called SPATIUM-L1 is described and evaluated, which can be adapted without any problem to different languages (such as Dutch, English, and Greek) in different genres.

Abstract: This paper describes and evaluates an effective unsupervised author clustering and authorship linking model called SPATIUM-L1. The suggested strategy can be adapted without any problem to different languages (such as Dutch, English, and Greek) in different genres (e.g., newspaper articles and reviews). As features, we suggest using the m most frequent terms of each text (isolated words and punctuation symbols with m at most 200). Applying a simple distance measure, we determine whether there is enough indication that two texts were written by the same author. The evaluations are based on six test collections (PAN AUTHOR CLUSTERING task at CLEF 2016).
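
The feature set and distance described here can be sketched in a few lines. The code below assumes that the "simple distance measure" is an L1 (Manhattan) distance over relative term frequencies, as the model's name suggests; the tokenizer, the tie-breaking rule, and the function names are illustrative assumptions, and the decision rule that turns a distance into an authorship link is not given in the abstract, so it is left out.

```python
import re
from collections import Counter


def term_frequencies(text):
    """Relative frequencies of words and punctuation symbols in a text."""
    tokens = re.findall(r"\w+|[^\w\s]", text.lower())
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def l1_distance(text_a, text_b, m=200):
    """L1 distance over the m most frequent terms of text_a."""
    freq_a = term_frequencies(text_a)
    freq_b = term_frequencies(text_b)
    # Ties in frequency are broken alphabetically to keep the result stable.
    top_terms = sorted(freq_a, key=lambda t: (-freq_a[t], t))[:m]
    return sum(abs(freq_a[t] - freq_b.get(t, 0.0)) for t in top_terms)


print(l1_distance("the cat sat on the mat.", "the cat sat on the mat."))  # 0.0
print(l1_distance("a b", "c d"))  # 1.0
```

Because the term list comes from the first text only, this version of the distance is not symmetric; whether the original model symmetrises it is not stated in the abstract.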

14 citations

Book Chapter•10.1007/978-3-319-44564-9_2•

The CLEF Monolingual Grid of Points

Nicola Ferro¹, Gianmaria Silvello¹•Institutions (1)

University of Padua¹

5 Sep 2016

TL;DR: A systematic series of experiments is run for creating a grid of points where many combinations of retrieval methods and components adopted by MultiLingual Information Access (MLIA) systems are represented to provide insights about the effectiveness of the different components and their interaction.

Abstract: In this paper we run a systematic series of experiments for creating a grid of points where many combinations of retrieval methods and components adopted by MultiLingual Information Access (MLIA) systems are represented. This grid of points has the goal to provide insights about the effectiveness of the different components and their interaction and to identify suitable baselines with respect to which all the comparisons can be made.

Proceedings Article•

Cultural micro-blog Contextualization 2016 Workshop Overview: data and pilot tasks

Liana Ermakova, Lorraine Goeuriot¹, Josiane Mothe², Philippe Mulhem¹, Jian-Yun Nie, Eric SanJuan•Institutions (2)

University of Grenoble¹, University of Toulouse²

5 Sep 2016

LITL at CLEF eHealth2016: recognizing entities in French biomedical documents

Lydia-Mai Ho-Dac, Ludovic Tanguy, Céline Grauby¹, Nkauj Hnub Aurore Heu Mby¹, Justine Malosse¹, Laura Rivière¹, Amélie Veltz-Mauclair¹, Marine Wauquier¹•Institutions (1)

University of Toulouse¹

6 Sep 2016

TL;DR: This paper describes the participation of master's students (LITL programme, University of Toulouse) and their teachers in the CLEF eHealth 2016 campaign and the system used, a CRF classifier based on a number of different features (POS tagging, generic word lists and syntactic parsing).

Abstract: This paper describes the participation of master's students (LITL programme, University of Toulouse) and their teachers in the CLEF eHealth 2016 campaign. Two runs were submitted for task 2 (multilingual information extraction), which consisted of the recognition and categorization of medical entities in French biomedical documents. The system used consists of a CRF classifier based on a number of different features (POS tagging, generic word lists and syntactic parsing). In addition, several patterns were applied to the CRF's output in order to extract more complex entities. The best run achieved high precision (0.64-0.78) but lower recall (0.32-0.40), with an overall F1-measure of 0.43-0.53.

Proceedings Article•

CLEF NewsREEL 2016: Image based Recommendation

Francesco Corsini, Martha Larson¹•Institutions (1)

Radboud University Nijmegen¹

1 Jan 2016

TL;DR: This paper appears in the Working Notes of CLEF 2016, the Conference and Labs of the Evaluation Forum held in Evora, Portugal, 5-8 September 2016.

Abstract: CLEF 2016: Working Notes of CLEF 2016 - Conference and Labs of the Evaluation forum Evora, Portugal, 5-8 September, 2016

Query Expansion by Word Embedding in the Suggestion Track of CLEF 2016 Social Book Search Lab.

Shih-Hung Wu, Yi-Hsiang Hsieh, Liang-Pu Chen, Ping-Che Yang

1 Jan 2016

TL;DR: A query expansion module based on word2vec, a word embedding toolkit, is designed; it helps the CYUT CSIE system achieve better performance in the suggestion track.

Abstract: The Social Book Search (SBS) Lab is part of the CLEF 2016 lab series. This is the fourth time that the CYUT CSIE team has taken part in the SBS track. The organizers changed the content of the topics slightly; we therefore made the necessary modifications to our system, which is based on keyword searching and ranking by social features. This year, we designed a query expansion module based on word2vec, a word embedding toolkit. The new module helps our system achieve better performance in the suggestion track.
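
Embedding-based query expansion generally means adding to the query the terms whose vectors lie closest to those of the original query terms. The sketch below illustrates that idea with cosine similarity over a tiny hand-made embedding table; it does not call word2vec, and the vectors, the `k` and `min_sim` parameters, and the function names are all assumptions for illustration and do not describe the team's module.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_query(terms, embeddings, k=2, min_sim=0.5):
    """Append up to k nearest neighbours of each query term to the query."""
    expanded = list(terms)
    for term in terms:
        if term not in embeddings:
            continue  # out-of-vocabulary terms are kept but not expanded
        neighbours = sorted(
            ((cosine(embeddings[term], vec), word)
             for word, vec in embeddings.items() if word not in expanded),
            reverse=True,
        )
        for sim, word in neighbours[:k]:
            if sim >= min_sim:
                expanded.append(word)
    return expanded


# Hand-made 3-dimensional vectors; real embeddings would be learned from text.
toy_embeddings = {
    "novel": [0.9, 0.1, 0.0],
    "fiction": [0.8, 0.2, 0.1],
    "book": [0.7, 0.3, 0.0],
    "recipe": [0.0, 0.1, 0.9],
}

print(expand_query(["novel"], toy_embeddings))  # ['novel', 'fiction', 'book']
```

The similarity threshold keeps unrelated terms such as "recipe" out of the query even when fewer than `k` good neighbours exist.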

Book Chapter•10.1007/978-3-319-44564-9_16•

SS4MCT: A Statistical Stemmer for Morphologically Complex Texts

Javid Dadashkarimi¹, Hossein Nasr Esfahani¹, Heshaam Faili¹, Azadeh Shakery¹•Institutions (1)

University of Tehran¹

5 Sep 2016

TL;DR: A method is proposed for finding statistical inflectional rules based on the minimum edit distance table of word pairs and the likelihoods of the rules in a language; the rules are used to statistically stem words and can be used in different text mining tasks.

Abstract: There have been multiple attempts to resolve various inflection matching problems in information retrieval. Stemming is a common approach to this end. Among many techniques for stemming, statistical stemming has been shown to be effective in a number of languages, particularly highly inflected languages. In this paper we propose a method for finding affixes in different positions of a word. Common statistical techniques rely heavily on string similarity in terms of prefix and suffix matching. Since infixes are common in irregular/informal inflections in morphologically complex texts, infixes must also be found for stemming. We propose a method whose aim is to find statistical inflectional rules based on the minimum edit distance table of word pairs and the likelihoods of the rules in a language. These rules are used to statistically stem words and can be used in different text mining tasks. Experimental results on the CLEF 2008 and CLEF 2009 English-Persian CLIR tasks indicate that the proposed method significantly outperforms all the baselines in terms of MAP.
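
The building block named in this abstract, the minimum edit distance table of a word pair, is the standard Levenshtein dynamic-programming table. The sketch below builds that table with unit costs; how the authors extract inflectional rules and their likelihoods from it is not described in the abstract and is not attempted here. The function names and the English example words are illustrative choices.

```python
def edit_distance_table(source, target):
    """Levenshtein table: cell [i][j] is the distance between the first
    i characters of source and the first j characters of target."""
    rows, cols = len(source) + 1, len(target) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        table[i][0] = i  # delete all i characters
    for j in range(cols):
        table[0][j] = j  # insert all j characters
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            table[i][j] = min(
                table[i - 1][j] + 1,         # deletion
                table[i][j - 1] + 1,         # insertion
                table[i - 1][j - 1] + cost,  # match or substitution
            )
    return table


def edit_distance(source, target):
    """Minimum number of single-character edits between two words."""
    return edit_distance_table(source, target)[-1][-1]


print(edit_distance("man", "men"))      # 1
print(edit_distance("books", "book"))   # 1
```

The pair "man" / "men" differs by one substitution in the middle of the word, the kind of internal change that prefix and suffix matching alone would miss.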

The Effectiveness of Query Expansion when searching for Health related Content: InfoLab at CLEF eHealth 2016.

Ricardo Henrique Alves da Silva, Carla Teixeira Lopes

1 Jan 2016

TL;DR: The participation of InfoLab in the patient-centred information retrieval task of the CLEF eHealth 2016 lab is described and the performance of several query expansion strategies using different sources of terms and different methods to select the terms to be added to the original query is analysed.

Abstract: In this paper we describe the participation of InfoLab in the patient-centred information retrieval task of the CLEF eHealth 2016 lab. We analyse the performance of several query expansion strategies using different sources of terms and different methods to select the terms to be added to the original query. One of the strategies uses pseudo relevance feedback for term selection. The other strategies use external sources such as Wikipedia articles and definitions from the UMLS Metathesaurus for term selection. Finally, readability metrics such as SMOG, FOG and Flesch-Kincaid were used to re-rank the documents retrieved using the expanded queries. As the relevance and readability assessments are not yet available, we cannot draw any conclusions regarding the results of our approaches.

Team GU-IRLAB at CLEF eHealth 2016: Task 3.

Luca Soldaini, Will Edman, Nazli Goharian

1 Jan 2016

TL;DR: This work uses synonyms and hypernyms from a large medical ontology to generate alternative formulations for a query; the results obtained by the reformulated queries are fused using the Borda rank aggregation algorithm.

Abstract: Recent surveys have shown that a growing number of internet users seek medical help online. Yet, recent research [12] has shown that many commercial search engines still struggle to completely satisfy the information needs of users. In this work, we present a study on the use of medical terms for query reformulation. We use synonyms and hypernyms from a large medical ontology to generate alternative formulations for a query. Results obtained by the reformulated queries are fused using the Borda rank aggregation algorithm.
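
Borda rank aggregation treats each ranked list as a voter: a document earns more points the higher it appears in a list, and the fused ranking sorts documents by total points. The following is a minimal sketch under one simplifying assumption, stated in the code: a document missing from a list receives zero points from that list (other Borda variants share the leftover points among unranked documents). The document identifiers and the function name are placeholders.

```python
from collections import defaultdict


def borda_fuse(rankings):
    """Fuse several ranked lists of document ids with a Borda count."""
    candidates = {doc for ranking in rankings for doc in ranking}
    n = len(candidates)
    scores = defaultdict(float)
    for ranking in rankings:
        for position, doc in enumerate(ranking):
            scores[doc] += n - position  # top rank earns n points
        # Simplification: documents absent from this list earn 0 points here.
    # Sort by descending score, breaking ties by document id.
    return sorted(candidates, key=lambda d: (-scores[d], d))


# One ranked list per reformulated query (placeholder ids).
runs = [
    ["d1", "d2", "d3"],
    ["d2", "d1", "d4"],
    ["d2", "d3", "d1"],
]

print(borda_fuse(runs))  # ['d2', 'd1', 'd3', 'd4']
```

Because only ranks are used, lists produced by different query formulations can be combined without normalising their retrieval scores.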

CLEF NewsREEL 2016: Comparing Multi-Dimensional Offline and Online Evaluation of News Recommender Systems

Benjamin Kille, Andreas Lommatzsch, Frank Hopfgartner, Martha Larson, Jonas Seiler, Davide Malagoli, András Serény, Torben Brodt

1 Jan 2016

TL;DR: Results illustrate potentials for multi-dimensional evaluation of recommendation algorithms in a living lab and simulation based evaluation setting and overviews ideas on living lab evaluation that have been presented as part of a “New Ideas” track at the conference in Portugal.

Abstract: Running in its third year at CLEF, NewsREEL challenged participants to develop news recommendation algorithms and have them benchmarked in an online (Task 1) and offline setting (Task 2), respectively. This paper provides an overview of the NewsREEL scenario, outlines this year’s campaign, presents results of both tasks, and discusses the approaches of participating teams. Moreover, it overviews ideas on living lab evaluation that have been presented as part of a “New Ideas” track at the conference in Portugal. Presented results illustrate potentials for multi-dimensional evaluation of recommendation algorithms in a living lab and simulation based evaluation setting.

MayoNLPTeam at the 2016 CLEF eHealth information retrieval task 1

Yanshan Wang, Stephen Wu¹, Hongfang Liu•Institutions (1)

Oregon Health & Science University¹

1 Jan 2016

Book•

Experimental IR Meets Multilinguality, Multimodality, and Interaction 7th International Conference of the CLEF Association, CLEF 2016, Évora, Portugal, September 5-8, 2016, Proceedings

Norbert Fuhr

1 Jan 2016

Journal Article•10.1145/2888422.2888428•

Report on CLEF 2015: Experimental IR Meets Multilinguality, Multimodality, and Interaction

Linda Cappellato¹, Nicola Ferro¹, Gareth J. F. Jones², Jaap Kamps³, Josiane Mothe⁴, Karen Pinel-Sauvagnat⁴, Eric San Juan⁵, Jacques Savoy•Institutions (5)

University of Padua¹, Dublin City University², University of Amsterdam³, University of Toulouse⁴, University of Avignon⁵

29 Jan 2016

TL;DR: The focus of the Conference is "Experimental IR" as carried out in the CLEF Labs and other evaluation forums; it featured keynotes by Greg Grefenstette, Mounia Lalmas, and Doug Oard, and 43 papers were presented, covering a wide range of topics.

Abstract: This is a report on the sixth edition of the Conference and Labs of the Evaluation Forum (CLEF 2015), held in early September 2015 in Toulouse, France. CLEF was a four-day event combining a Conference and an Evaluation Forum. The focus of the Conference is "Experimental IR" as carried out in the CLEF Labs and other evaluation forums. It featured keynotes by Greg Grefenstette, Mounia Lalmas, and Doug Oard, and 43 papers were presented, covering a wide range of topics. There were a total of eight Labs: eHealth, ImageCLEF, LifeCLEF, Living Labs for IR, NEWSREEL, PAN, QA, and Social Book Search, addressing a wide range of tasks, media, languages, and ways to go beyond standard test collections.

Journal Article•10.12775/TSB.2015.022•

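Ewaluacja skuteczności systemów wyszukiwania informacji. Od eksperymentu Cranfield do laboratoriów TREC i CLEF. Geneza i metody [Evaluating the effectiveness of information retrieval systems. From the Cranfield experiment to the TREC and CLEF labs. Origins and methods]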

Piotr Malak, Adam Pawłowski¹•Institutions (1)

University of Wrocław¹

06 Apr 2016 - Toruńskie Studia Bibliologiczne

TL;DR: The article also presents the design of the CLEF (Conference and Labs of the Evaluation Forum) evaluation labs, with special attention paid to CHiC (Cultural Heritage in CLEF).

Abstract: We present the genesis and evolution of methods and measures for evaluating IR systems. The design of the Cranfield experiment, a long-term model for evaluation methodology, is described. The evolution of the current methodology of IR systems evaluation, developed at the annual TREC (Text REtrieval Conference), is outlined, and the most popular current measures are described. The article also presents the design of the CLEF (Conference and Labs of the Evaluation Forum) evaluation labs, with special attention paid to CHiC (Cultural Heritage in CLEF). We describe the design of the Polish Task in the CHiC lab and discuss conclusions from running the lab.

MayoNLPTeam at the 2016 CLEF eHealth Information Retrieval Task 1.

Yanshan Wang, Stephen Wu, Hongfang Liu

1 Jan 2016

TL;DR: A Part-of-Speech (POS) based query term weighting approach, which assigns different weights to query terms according to their POS categories, is explored and applied with the optimal weights obtained from the TREC 2011 and 2012 Medical Records Track.

Abstract: This paper presents the participation of MayoNLPTeam in the 2016 CLEF eHealth Information Retrieval Task (IR Task 1: ad-hoc search). We explored a Part-of-Speech (POS) based query term weighting approach which assigns different weights to the query terms according to their POS categories. The weights are learned by defining an objective function based on the mean average precision. We applied the proposed approach with the optimal weights obtained from TREC 2011 and 2012 Medical Records Track into the Query Likelihood model (Run 2) and Markov Random Field (MRF) models (Run 3). The conventional Query Likelihood model was implemented as the baseline (Run 1).

KISTI at CLEF eHealth 2016 Task 3: Ranking Medical Documents using Word Vectors.

Heung-Seon Oh, Yuchul Jung

1 Jan 2016

TL;DR: Two different approaches using word vectors learnt by Word2Vec on Wikipedia are attempted to deliver useful medical information; both apply pseudo-relevance feedback but use the word vectors in different ways.

Abstract: Searching for relevant medical information has become very common as the general public uses the Web as a source of health information. In response to this phenomenon, a number of approaches have been proposed to find useful information for diagnosing or understanding health conditions from the Web or the medical literature. As part of an ongoing effort to deliver useful medical information, we attempted two different approaches using word vectors learnt by Word2Vec on Wikipedia. First, initial documents are obtained using a search engine. Based on the retrieved documents, pseudo-relevance feedback is applied with two different uses of the word vectors. In the first approach, a feedback model is constructed from new relevance scores computed with the word vectors, while in the second it is constructed with a newly expanded query.

KDEIR at CLEF eHealth 2016: Health Documents Re-ranking Based on Query Variations.

Zia Ullah, Masaki Aono

1 Jan 2016

TL;DR: This paper describes the authors' participation in the CLEF eHealth 2016 task 3, Patient-Centred Information Retrieval, focusing on clinical web documents based on user queries in a health forum, and applies an unsupervised re-ranking method based on multiple features to the documents retrieved by a baseline system.

Abstract: In this paper, we describe our participation in the CLEF eHealth 2016 task 3: Patient-Centred Information Retrieval, focusing on clinical web documents based on user queries in a health forum. We submitted three runs in the ad-hoc search subtask and two runs in the query variation search subtask. In ad-hoc search, the main challenge is to retrieve high-quality clinical documents based on the user query. For ad-hoc search, we apply an unsupervised re-ranking method based on multiple features to the documents retrieved by a baseline system. In query variation search, the main challenge is to generate a ranked list of documents covering the different variations of the query. To tackle the query variation problem, we first formulate a query and a set of information needs from the query variations. Then, we re-rank the documents retrieved for the formulated query by focusing on the set of information needs.

Proceedings Article•

The IR task at the CLEF eHealth evaluation lab 2016

Henning Müller

1 Jan 2016

Proceedings Article•

LIG at CLEF 2016 Cultural Microblog Contextualization: TimeLine Illustration based on Microblogs

Nayanika Dogra¹, Philippe Mulhem¹, Nawal Ould Amer, Lorraine Goeuriot¹•Institutions (1)

University of Grenoble¹

5 Sep 2016

TL;DR: The approach used by the LIG-MRIM research group for task 3 (TimeLine illustration based on Microblogs) of the CLEF Cultural Microblog Contextualization track deals with the retrieval of tweets related to cultural events (music festivals).

Abstract: This paper presents the approach used by the LIG-MRIM research group for its participation in task 3 (TimeLine illustration based on Microblogs) of the CLEF Cultural Microblog Contextualization track. This task deals with the retrieval of tweets related to cultural events (music festivals). For the content-based elements, we use the classical BM25 model [4]. Then, we diversify the results through duplicate removal, using tf-based representations of tweets. In a third step, we apply optional re-ranking related to the time-line, activity and popularity of the authors of tweets.
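
The first step of this pipeline, BM25 scoring, can be sketched compactly. The code below uses a common formulation with the defaults k1 = 1.2 and b = 0.75 and a smoothed IDF that stays non-negative; these parameter values, the whitespace tokenizer, and the example tweets are assumptions for illustration, since the abstract does not state the group's settings.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=1.2, b=0.75):
    """Score each document against the query with a common BM25 variant."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(doc) for doc in tokenized) / n_docs
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in tokenized for term in set(doc))
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(
                1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5)
            )
            # Term frequency saturates with k1; b controls length normalisation.
            length_norm = 1 - b + b * len(doc) / avg_len
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


# Invented example tweets.
tweets = [
    "jazz festival tonight",
    "rock festival lineup announced",
    "traffic update for the city",
]

for tweet, score in zip(tweets, bm25_scores("jazz festival", tweets)):
    print(f"{score:.3f}  {tweet}")
```

The diversification and re-ranking steps described in the abstract would then operate on the list ordered by these scores.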

WHUIRGroup at the CLEF 2016 eHealth Lab Task 3.

Ruixue Wang, Wei Lu, Ke Ren

1 Jan 2016

TL;DR: This paper presents work on the 2016 CLEF eHealth Task 3, using CHV to expand queries and a learning-to-rank algorithm to re-rank the results.

Abstract: This paper presents our work on the 2016 CLEF eHealth Task 3. We used Indri to conduct our experiments. We used CHV to expand queries and proposed a learning-to-rank algorithm to re-rank the results.

Book Chapter•10.1007/978-3-319-44564-9_24•

Overview of the CLEF eHealth Evaluation Lab 2016

Liadh Kelly¹, Lorraine Goeuriot², Hanna Suominen³, Hanna Suominen⁴, Aurélie Névéol⁵, Joao Palotti⁶, Guido Zuccon⁷•Institutions (7)

Trinity College, Dublin¹, University of Grenoble², Australian National University³, University of Turku⁴, Centre national de la recherche scientifique⁵, Vienna University of Technology⁶, Queensland University of Technology⁷

5 Sep 2016

TL;DR: The resources created for these tasks, the evaluation methodology adopted, a brief summary of participants in this year’s challenges, and some of the results obtained are described.

Abstract: In this paper we provide an overview of the fourth edition of the CLEF eHealth evaluation lab. CLEF eHealth 2016 continues our evaluation resource building efforts around the easing and support of patients, their next of kin and clinical staff in understanding, accessing and authoring eHealth information in a multilingual setting. This year’s lab offered three tasks: Task 1 on handover information extraction related to Australian nursing shift changes, Task 2 on information extraction in French corpora, and Task 3 on multilingual patient-centred information retrieval considering query variations. In total, 20 teams took part in these tasks (3 in Task 1, 7 in Task 2 and 10 in Task 3). Herein, we describe the resources created for these tasks and the evaluation methodology adopted, and provide a brief summary of participants in this year’s challenges and some of the results obtained. As in previous years, the organizers have made data and tools associated with the lab tasks available for future research and development.
