Open Access Proceedings Article
MS^2: Multi-Document Summarization of Medical Studies.
Jay DeYoung,Iz Beltagy,Madeleine van Zuylen,Bailey Kuehl,Lucy Lu Wang +4 more
- 01 Nov 2021
- pp 7494-7513
TL;DR: This paper introduces MS^2, a dataset of over 470k documents and 20k summaries derived from the scientific literature. It facilitates the development of systems that can assess and aggregate contradictory evidence across multiple studies, and is the first large-scale, publicly available multi-document summarization dataset in the biomedical domain.
Abstract: To assess the effectiveness of any medical intervention, researchers must conduct a time-intensive and manual literature review. NLP systems can help to automate or assist in parts of this expensive process. In support of this goal, we release MS^2 (Multi-Document Summarization of Medical Studies), a dataset of over 470k documents and 20k summaries derived from the scientific literature. This dataset facilitates the development of systems that can assess and aggregate contradictory evidence across multiple studies, and is the first large-scale, publicly available multi-document summarization dataset in the biomedical domain. We experiment with a summarization system based on BART, with promising early results, though significant work remains to achieve higher summarization quality. We formulate our summarization inputs and targets in both free text and structured forms and modify a recently proposed metric to assess the quality of our system’s generated summaries. Data and models are available at https://github.com/allenai/ms2.