Towards the self-annotating web
Philipp Cimiano, Siegfried Handschuh, Steffen Staab
- 17 May 2004
- pp 462-471
TL;DR: PANKOW (Pattern-based Annotation through Knowledge on the Web), a method which employs an unsupervised, pattern-based approach to categorize instances with regard to an ontology, is proposed.
Abstract: The success of the Semantic Web depends on the availability of ontologies as well as on the proliferation of web pages annotated with metadata conforming to these ontologies. Thus, a crucial question is where to acquire these metadata from. In this paper we propose PANKOW (Pattern-based Annotation through Knowledge on the Web), a method which employs an unsupervised, pattern-based approach to categorize instances with regard to an ontology. The approach is evaluated against the manual annotations of two human subjects. The approach is implemented in OntoMat, an annotation tool for the Semantic Web, and shows very promising results.
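The abstract describes the method only at a high level, so the following is a minimal sketch of what pattern-based categorization of an instance against a set of ontology concepts could look like, not the authors' implementation. The pattern templates, the `categorize` function, and the `count_hits` callback (standing in for whatever supplies occurrence counts from the Web) are all assumptions introduced for illustration; the toy counts are invented.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

# Hypothetical templates in the spirit of lexico-syntactic patterns;
# the abstract does not list the patterns PANKOW actually uses.
PATTERNS = (
    "{concept}s such as {instance}",
    "{instance} and other {concept}s",
    "{instance} is a {concept}",
)


def categorize(
    instance: str,
    concepts: Iterable[str],
    count_hits: Callable[[str], int],
) -> Tuple[Optional[str], Dict[str, int]]:
    """Score each candidate concept by summing the occurrence counts of
    all instantiated patterns, then pick the concept with the most evidence.

    `count_hits` is an assumed callback that returns how often a phrase
    occurs in some corpus (for example, a Web search hit count).
    """
    scores: Dict[str, int] = {}
    for concept in concepts:
        scores[concept] = sum(
            count_hits(p.format(concept=concept, instance=instance))
            for p in PATTERNS
        )
    # Return None when no pattern matched for any concept.
    if not scores or max(scores.values()) == 0:
        return None, scores
    return max(scores, key=scores.get), scores


# Invented toy counts that replace a real corpus lookup.
TOY_COUNTS = {
    "hotels such as Hilton": 120,
    "Hilton and other hotels": 45,
    "Hilton is a city": 3,
}


def toy_count_hits(phrase: str) -> int:
    return TOY_COUNTS.get(phrase, 0)


best, scores = categorize("Hilton", ["hotel", "city"], toy_count_hits)
print(best, scores)
```

No labeled training data is involved: the only inputs are the candidate concepts and phrase counts, which is what makes an approach of this shape unsupervised.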
Citations
BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network
TL;DR: An automatic approach to the construction of BabelNet, a very large, wide-coverage multilingual semantic network, is presented; key to this approach is the integration of lexicographic and encyclopedic knowledge from WordNet and Wikipedia.
1.8K
•Proceedings Article
Open information extraction from the web
Michele Banko, Michael Cafarella, Stephen Soderland, Matt Broadhead, Oren Etzioni
- 06 Jan 2007
TL;DR: Open Information Extraction (OIE) is a new extraction paradigm in which the system makes a single data-driven pass over its corpus and extracts a large set of relational tuples without requiring any human input.
Open information extraction for the web
Oren Etzioni, Michele Banko
- 01 Jan 2009
TL;DR: Open IE (OIE), a new extraction paradigm where the system makes a single data-driven pass over its corpus and extracts a large set of relational tuples without requiring any human input, is introduced.
Data cleansing for Web information retrieval using query independent features
TL;DR: It is found that there exists a large proportion of low-quality Web pages in both the English and the Chinese Web page corpora, and retrieval target pages can be identified using query-independent features and cleansing algorithms.
Semantic annotation for knowledge management: Requirements and a survey of the state of the art
Victoria Uren, Philipp Cimiano, José Iria, Siegfried Handschuh, Maria Vargas-Vera, Enrico Motta, Fabio Ciravegna
TL;DR: This analysis shows that, while there is still some way to go before semantic annotation tools will be able to address fully all the knowledge management needs, research in the area is active and making good progress.
645
References
Automatic acquisition of hyponyms from large text corpora
Marti A. Hearst
- 23 Aug 1992
TL;DR: A set of lexico-syntactic patterns that are easily recognizable, that occur frequently and across text genre boundaries, and that indisputably indicate the lexical relation of interest are identified.
•Journal Article
Assessing agreement on classification tasks: the kappa statistic
TL;DR: The authors discuss what is wrong with reliability measures as they are currently used for discourse and dialogue work in computational linguistics and cognitive science, and argue that we would be better off as a field adopting techniques from content analysis.
2.5K
•Book
Ontology Learning for the Semantic Web
Alexander Maedche, Steffen Staab
- 28 Feb 2002
TL;DR: The authors present an ontology learning framework that extends typical ontology engineering environments by using semiautomatic ontology construction tools and encompasses ontology import, extraction, pruning, refinement and evaluation.
Self-organization and identification of Web communities
TL;DR: This work shows that the Web self-organizes and its link structure allows efficient identification of communities and is significant because no central authority or process governs the formation and structure of hyperlinks.




