Using binary classification to prioritize and curate articles for the Comparative Toxicogenomics Database.

doi:10.1093/DATABASE/BAS050

Open Access•Journal Article•10.1093/DATABASE/BAS050

Dina Vishnyakova, +2 more

- 01 Jan 2012

- Database

- Vol. 2012, Iss: 2012, pp 1-9

10

TL;DR: This paper describes the integration of ToxiCat (Toxicogenomic Categorizer), an automatic text categorization pipeline developed to classify and prioritize biomedical documents in order to speed up curation of the Comparative Toxicogenomics Database (CTD).

Citations

•Journal Article•10.1093/NAR/GKT441

PubTator: a web-based text mining tool for assisting biocuration

Chih-Hsuan Wei, +2 more

- 01 Jul 2013

- Nucleic Acids Research

TL;DR: PubTator is described, a web-based system for assisting biocuration that features a PubMed-like interface and is equipped with multiple challenge-winning text mining algorithms to ensure the quality of its automatic results.

640

•Journal Article•10.1021/ACS.CHEMREV.6B00851

Information retrieval and text mining technologies for chemistry

Martin Krallinger, +5 more

- 05 May 2017

- Chemical Reviews

TL;DR: This Review provides a comprehensive and in-depth description of fundamental concepts, technical implementations, and current technologies for meeting information demands of chemical information contained in scientific literature, patents, technical reports, or the web.

279

•Book Chapter•10.1007/978-1-4939-3743-1_6

Text mining to support gene ontology curation and vice versa

Patrick Ruch

- 01 Jan 2017

- Methods of Molecular Biology

TL;DR: It is argued that automatic text categorization functions can ultimately be embedded into a Question-Answering (QA) system to answer questions related to protein functions, and a new type of QA system is emerging, so-called Deep QA which uses machine learning methods trained with curated contents.

22

•Journal Article•10.1186/S13321-014-0040-8

A document classifier for medicinal chemistry publications trained on the ChEMBL corpus

George Papadatos, +5 more

- 12 Aug 2014

- Journal of Cheminformatics

TL;DR: Large-scale machine learning document classification was shown to be very robust and flexible for this particular application, as illustrated in four distinct text-mining-based use cases.

13

•Journal Article•10.1093/DATABASE/BAY097

Document triage for identifying protein-protein interactions affected by mutations: a neural network ensemble approach.

Ling Luo, +3 more

- 01 Jan 2018

- Database

TL;DR: In this approach, several neural network models are used for document triage, and the ensemble performs better than any individual model and is incorporated into the approach to address the problem of the limited size of the training set.

5

References

•Journal Article•10.1145/1961189.1961199

LIBSVM: A library for support vector machines

Chih-Chung Chang, +1 more

- 06 May 2011

- ACM Transactions on Intelligent Systems ...

TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.

46.3K

•Book Chapter•10.1007/978-3-540-35488-8_13

Combining SVMs with Various Feature Selection Strategies

Yi-Wei Chen, +1 more

- 01 Jan 2006

TL;DR: This article investigates the performance of combining support vector machines (SVM) with various feature selection strategies. Some are filter-type approaches (general feature selection methods independent of SVM), and some are wrapper-type methods (modifications of SVM that can be used to select features).

1.1K

•Journal Article•10.1186/1471-2105-6-S1-S1

Overview of BioCreAtIvE: critical assessment of information extraction for biology

Lynette Hirschman, +3 more

- 24 May 2005

- BMC Bioinformatics

TL;DR: The first BioCreAtIvE assessment provided state-of-the-art performance results for a basic task (gene name finding and normalization), where the best systems achieved a balanced 80% precision / recall or better, which potentially makes them suitable for real applications in biology.

608

•Journal Article•10.1093/BIOINFORMATICS/BTI783

Automatic assignment of biomedical categories: toward a generic approach

Patrick Ruch

- 15 Mar 2006

- Bioinformatics

TL;DR: A lightweight categorizer, based on two ranking modules, combines a pattern matcher and a vector space retrieval engine, and uses both stems and linguistically-motivated indexing units, which shows the effectiveness of phrase indexing for both GO and MeSH categorization.

153

•Journal Article•10.1093/BIOINFORMATICS/BTP249

MeSH Up

Dolf Trieschnigg, +5 more

- 01 Jun 2009

- Bioinformatics

TL;DR: The annotation of biomedical texts using controlled vocabularies such as MeSH can be automated to improve text-only IR, and the proposed automatic MeSH annotation system is highly scalable and generates improvements in IR comparable with those observed for manual annotations.

142
