Top 3 papers published on the topic of Autocoding in 2004

Doublet method for very fast autocoding

15 Sep 2004 - BMC Medical Informatics and Decision Making

TL;DR: The doublet method of autocoding is a novel algorithm for rapid text autocoding that will work with any nomenclature and will parse any ASCII plain text.

Abstract: Background: Autocoding (or automatic concept indexing) occurs when a software program extracts terms contained within text and maps them to a standard list of concepts contained in a nomenclature. The purpose of autocoding is to provide a way of organizing large documents by the concepts represented in the text. Because textual data accumulates rapidly in biomedical institutions, the computational methods used to autocode text must be very fast. The purpose of this paper is to describe the doublet method, a new algorithm for very fast autocoding.

19 citations

Book Chapter, DOI: 10.1007/978-3-540-39615-4_17

Automatic Translation to Controlled Medical Vocabularies

András Kornai, Lisa Stone

1 Jan 2004

TL;DR: This chapter surveys the automatic translation, or autocoding, systems currently in use in the medical domain.

Abstract: In the medical domain, over the centuries several controlled vocabularies have emerged with the goal of mapping semantically equivalent terms such as fever, pyrexia, hyperthermia, and febrile on the same (numerical) value. Translating unstructured natural language texts or verbatims produced by healthcare professionals to categories defined by a controlled vocabulary is a hard problem, mostly solved by employing human coders trained both in medicine and in the details of the classification system. In this chapter we survey the automatic translation or autocoding systems currently in use.
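
The idea of mapping semantically equivalent terms to one value can be illustrated with a small Python lookup table. This is only a sketch: the code value 1001 and the names CONTROLLED_VOCABULARY and code_verbatim are hypothetical and do not come from the chapter or from any real vocabulary.

```python
# Hypothetical code value, chosen only for illustration.
FEVER_CODE = 1001

# Four surface forms from the abstract, all mapped to the same value.
CONTROLLED_VOCABULARY = {
    "fever": FEVER_CODE,
    "pyrexia": FEVER_CODE,
    "hyperthermia": FEVER_CODE,
    "febrile": FEVER_CODE,
}


def code_verbatim(verbatim):
    """Return the code for an exact, case-insensitive term match, else None."""
    return CONTROLLED_VOCABULARY.get(verbatim.strip().lower())
```

An exact lookup like this only covers verbatims that are already a listed synonym. Free text written by healthcare professionals rarely is, which is the part the abstract describes as a hard problem.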

7 citations

BMC Medical Informatics and Decision Making

Jules J Berman

1 Jan 2004

TL;DR: The doublet method of autocoding is a novel algorithm for rapid text autocoding that will work with any nomenclature and will parse any ASCII plain text.

Abstract: Background: Autocoding (or automatic concept indexing) occurs when a software program extracts terms contained within text and maps them to a standard list of concepts contained in a nomenclature. The purpose of autocoding is to provide a way of organizing large documents by the concepts represented in the text. Because textual data accumulates rapidly in biomedical institutions, the computational methods used to autocode text must be very fast. The purpose of this paper is to describe the doublet method, a new algorithm for very fast autocoding. Methods: An autocoder was written that transforms plain text into intercalated word doublets (e.g. "The ciliary body produces aqueous humor" becomes "The ciliary, ciliary body, body produces, produces aqueous, aqueous humor"). Each doublet is checked against an index of doublets extracted from a standard nomenclature. Matching doublets are assigned a numeric code specific for each doublet found in the nomenclature. Text doublets that do not match the index of doublets extracted from the nomenclature are not part of valid nomenclature terms. Runs of matching doublets from text are concatenated and matched against nomenclature terms (also represented as runs of doublets). Results: The doublet autocoder was compared for speed and performance against a previously published phrase autocoder. Both autocoders are Perl scripts, and both autocoders used an identical text (a 170+ Megabyte collection of abstracts collected through a PubMed search) and the same nomenclature (neocl.xml, containing over 102,271 unique names of neoplasms). In side-by-side comparison on the same computer, the doublet method autocoder was 8.4 times faster than the phrase autocoder (211 seconds versus 1,776 seconds). The doublet method codes 0.8 Megabytes of text per second on a desktop computer with a 1.6 GHz processor. In addition, the doublet autocoder successfully matched terms that were missed by the phrase autocoder, while the phrase autocoder found no terms that were missed by the doublet autocoder. Conclusions: The doublet method of autocoding is a novel algorithm for rapid text autocoding. The method will work with any nomenclature and will parse any ASCII plain text. An implementation of the algorithm in Perl is provided with this article. The algorithm, the Perl implementation, the neoplasm nomenclature, and Perl itself are all open source materials.
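
The steps in the Methods part of the abstract can be sketched in Python. This is a minimal illustration under stated assumptions, not the Perl implementation provided with the article. The function names (tokenize, doublets, build_doublet_index, autocode), the two-term nomenclature, and the codes C0001 and C0002 are all hypothetical. The sketch also simplifies two things: the doublet index is a plain set rather than a table of numeric doublet codes, and every contiguous phrase inside a run of matching doublets is looked up against the term list.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def doublets(words):
    """Overlapping word pairs: ['w0 w1', 'w1 w2', ...]."""
    return [f"{a} {b}" for a, b in zip(words, words[1:])]


def build_doublet_index(terms):
    """Collect every doublet that occurs in any nomenclature term."""
    index = set()
    for term in terms:
        index.update(doublets(term.split()))
    return index


def autocode(text, nomenclature):
    """Return (term, code) pairs for multi-word terms found in the text."""
    # Normalize term names the same way the text is normalized.
    terms = {" ".join(tokenize(name)): code for name, code in nomenclature.items()}
    index = build_doublet_index(terms)
    words = tokenize(text)

    # Split the word stream into runs in which every adjacent pair
    # is a doublet present in the nomenclature index.
    runs, current = [], []
    for a, b in zip(words, words[1:]):
        if f"{a} {b}" in index:
            if not current:
                current = [a]
            current.append(b)
        elif current:
            runs.append(current)
            current = []
    if current:
        runs.append(current)

    # Inside each run, look up contiguous phrases of two or more words.
    hits = []
    for run in runs:
        for start in range(len(run)):
            for end in range(start + 2, len(run) + 1):
                phrase = " ".join(run[start:end])
                if phrase in terms:
                    hits.append((phrase, terms[phrase]))
    return hits


# Hypothetical two-term nomenclature with made-up codes.
SAMPLE_NOMENCLATURE = {"ciliary body": "C0001", "aqueous humor": "C0002"}
SAMPLE_TEXT = "The ciliary body produces aqueous humor"
```

With these sample values, autocode(SAMPLE_TEXT, SAMPLE_NOMENCLATURE) yields the two terms "ciliary body" and "aqueous humor" with their codes. Doublets such as "body produces" are absent from the index, so they end a run and are never compared against the term list. Single-word terms fall outside this sketch because they contain no doublet.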

3 citations
