About: Library classification is a research topic. Over its lifetime, 2991 publications have been published within this topic, receiving 26781 citations.
TL;DR: Data Classification: Algorithms and Applications explores the underlying algorithms of classification as well as applications of classification in a variety of problem domains, including text, multimedia, social network, and biological data.
Abstract: Comprehensive Coverage of the Entire Area of Classification. Research on the problem of classification tends to be fragmented across such areas as pattern recognition, database, data mining, and machine learning. Addressing the work of these different communities in a unified way, Data Classification: Algorithms and Applications explores the underlying algorithms of classification as well as applications of classification in a variety of problem domains, including text, multimedia, social network, and biological data. This comprehensive book focuses on three primary aspects of data classification. Methods: The book first describes common techniques used for classification, including probabilistic methods, decision trees, rule-based methods, instance-based methods, support vector machine methods, and neural networks. Domains: The book then examines specific methods used for data domains such as multimedia, text, time-series, network, discrete sequence, and uncertain data. It also covers large data sets and data streams due to the recent importance of the big data paradigm. Variations: The book concludes with insight on variations of the classification process. It discusses ensembles, rare-class learning, distance function learning, active learning, visual learning, transfer learning, and semi-supervised learning as well as evaluation aspects of classifiers.
TL;DR: This work introduces a new methodology for constructing classification systems at the level of individual publications, and presents an application in which a classification system is produced that includes almost 10 million publications.
TL;DR: This article proposes a top-down, level-based hierarchical classification method that can classify documents to both leaf and internal categories, along with category-similarity and distance-based measures that consider the degree of misclassification when measuring classification performance.
Abstract: Hierarchical classification refers to the assignment of one or more suitable categories from a hierarchical category space to a document. While previous work in hierarchical classification focused on virtual category trees where documents are assigned only to the leaf categories, we propose a top-down level-based classification method that can classify documents to both leaf and internal categories. As the standard performance measures assume independence between categories, they have not considered the documents incorrectly classified into categories that are similar to or not far from correct ones in the category tree. We therefore propose category-similarity measures and distance-based measures to consider the degree of misclassification in measuring the classification performance. An experiment has been carried out to measure the performance of our proposed hierarchical classification method. The results showed that our method performs well for a Reuters text collection when enough training documents are given and the new measures have indeed considered the contributions of misclassified documents.
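The idea of crediting "near misses" in a category tree can be illustrated with a small sketch. This is not the authors' formula: the partial-credit rule (credit falling linearly with tree distance and reaching zero at `max_distance`), the function names, and the toy category tree are all assumptions made for illustration.

```python
def path_to_root(category, parent):
    """Return the categories from `category` up to the root, inclusive."""
    path = [category]
    while category in parent:
        category = parent[category]
        path.append(category)
    return path


def tree_distance(a, b, parent):
    """Number of edges on the path between categories a and b."""
    steps_from_a = {c: i for i, c in enumerate(path_to_root(a, parent))}
    for steps_from_b, c in enumerate(path_to_root(b, parent)):
        if c in steps_from_a:
            # c is the lowest common ancestor of a and b.
            return steps_from_a[c] + steps_from_b
    raise ValueError("categories are not in the same tree")


def distance_weighted_accuracy(predicted, actual, parent, max_distance):
    """Mean partial credit: 1 for an exact match, decreasing linearly
    with tree distance, and 0 at or beyond max_distance (an assumed rule)."""
    credits = [
        max(0.0, 1.0 - tree_distance(p, t, parent) / max_distance)
        for p, t in zip(predicted, actual)
    ]
    return sum(credits) / len(credits)


# Hypothetical category tree, stored as child -> parent.
PARENT = {
    "grain": "commodities",
    "wheat": "grain",
    "corn": "grain",
    "metals": "commodities",
    "gold": "metals",
}

score = distance_weighted_accuracy(
    predicted=["wheat", "corn", "gold"],
    actual=["wheat", "wheat", "wheat"],
    parent=PARENT,
    max_distance=4,
)
```

Under standard accuracy the three predictions above would score 1/3, since only the first is exactly right. With this assumed rule, the sibling category "corn" earns half credit and the distant "gold" earns none, so `score` is 0.5.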
TL;DR: It is suggested that external variables that affect perceived ease of use and usefulness need to be considered as important factors in the process of designing, implementing, and operating digital library systems, to help decrease the mismatch between system design and local users' realities and to facilitate the successful adoption of digital library systems in developing countries.
TL;DR: A system for searching and classifying U.S. patent documents, based on Inquery, that includes a unique “phrase help” facility to help users find and add phrases and terms related to those in their query.
Abstract: We present a system for searching and classifying U.S. patent documents, based on Inquery. Patents are distributed through hundreds of collections, divided up by general area. The system selects the best collections for the query. Users can search for patents or classify patent text. The user interface helps users search in fields without requiring knowledge of Inquery query operators. The system includes a unique “phrase help” facility, which helps users find and add phrases and terms related to those in their query.