About: Web mapping is a research topic. Over its lifetime, 5392 publications have appeared within this topic, receiving 126716 citations. The topic is also known as: webmap & map service.
TL;DR: Open IE (OIE), a new extraction paradigm where the system makes a single data-driven pass over its corpus and extracts a large set of relational tuples without requiring any human input, is introduced.
Abstract: The World Wide Web contains a significant amount of information expressed using natural language. While unstructured text is often difficult for machines to understand, the field of Information Extraction (IE) offers a way to map textual content into a structured knowledge base. The ability to amass vast quantities of information from Web pages has the potential to increase the power with which a modern search engine can answer complex queries.
IE has traditionally focused on acquiring knowledge about particular relationships within a small collection of domain-specific text. Typically, a target relation is provided to the system as input along with extraction patterns or examples that have been specified by hand. Shifting to a new relation requires a person to create new patterns or examples. This manual labor scales linearly with the number of relations of interest.
The task of extracting information from the Web presents several challenges for existing IE systems. The Web is large and heterogeneous; the number of potentially interesting relations is massive and their identity often unknown. To enable large-scale knowledge acquisition from the Web, this thesis presents Open Information Extraction, a novel extraction paradigm that automatically discovers thousands of relations from unstructured text and readily scales to the size and diversity of the Web.
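The single-pass, relation-independent idea described above can be illustrated with a toy sketch. This is not the extractor presented in the thesis: the heuristic (runs of capitalized words are arguments, the lowercase words between two such runs form the relation) and the name `extract_tuples` are assumptions made purely for illustration. The point it shows is that no target relation is supplied as input.

```python
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (argument 1, relation phrase, argument 2)


def extract_tuples(sentences: Iterable[str]) -> List[Triple]:
    """Make one pass over the sentences and emit every relational tuple found.

    Assumed heuristic (illustrative only): a run of capitalized tokens is an
    argument, and the lowercase tokens between two arguments are the relation.
    """
    triples: List[Triple] = []
    for sentence in sentences:
        tokens = sentence.rstrip(".").split()
        # Group tokens into alternating runs of capitalized / lowercase words.
        runs: List[Tuple[bool, str]] = []
        for tok in tokens:
            is_arg = tok[0].isupper()
            if runs and runs[-1][0] == is_arg:
                runs[-1] = (is_arg, runs[-1][1] + " " + tok)
            else:
                runs.append((is_arg, tok))
        # Every argument / relation / argument window yields one tuple.
        for i in range(len(runs) - 2):
            (a1, left), (r, rel), (a2, right) = runs[i:i + 3]
            if a1 and not r and a2:
                triples.append((left, rel, right))
    return triples


corpus = [
    "Edison invented the phonograph in Menlo Park.",
    "Google acquired YouTube.",
]
for triple in extract_tuples(corpus):
    print(triple)
```

The relation phrases ("invented the phonograph in", "acquired") come from the text itself rather than from a hand-specified list, which is the property that lets the approach avoid per-relation manual effort.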
TL;DR: The practice of crowdsourcing is transforming the Web and giving rise to a new field of inquiry called "crowdsourcing", which aims to provide real-time information about events in a democratic manner.
Abstract: This document describes technical mechanisms employed in network censorship that regimes around the world use for blocking or impairing Internet traffic. It aims to make designers, implementers, and users of Internet protocols aware of the properties exploited and mechanisms used for censoring end-user access to information. This document makes no suggestions on individual protocol considerations, and is purely informational, intended as a reference.
TL;DR: This paper defines Web mining, presents an overview of the various research issues, techniques, and development efforts, briefly describes WEBMINER, a system for Web usage mining, and concludes by listing research issues.
Abstract: Application of data mining techniques to the World Wide Web, referred to as Web mining, has been the focus of several recent research projects and papers. However, there is no established vocabulary, leading to confusion when comparing research efforts. The term Web mining has been used in two distinct ways. The first, called Web content mining in this paper, is the process of information discovery from sources across the World Wide Web. The second, called Web usage mining, is the process of mining for user browsing and access patterns. We define Web mining and present an overview of the various research issues, techniques, and development efforts. We briefly describe WEBMINER, a system for Web usage mining, and conclude the paper by listing research issues.
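To make "mining for user browsing and access patterns" concrete, here is a minimal sketch of one common first step in Web usage mining: splitting an access log into sessions and counting page-to-page transitions. It does not reproduce WEBMINER. The record layout, the 30-minute inactivity timeout, and the function names are all assumptions chosen for the example.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed log record: (user id, unix timestamp in seconds, requested path).
Hit = Tuple[str, int, str]


def sessionize(hits: Iterable[Hit], timeout: int = 1800) -> List[List[str]]:
    """Split each user's hits into sessions separated by `timeout` idle seconds."""
    by_user: Dict[str, List[Tuple[int, str]]] = defaultdict(list)
    for user, ts, path in hits:
        by_user[user].append((ts, path))

    sessions: List[List[str]] = []
    for visits in by_user.values():
        visits.sort()  # chronological order per user
        current = [visits[0][1]]
        for (prev_ts, _), (ts, path) in zip(visits, visits[1:]):
            if ts - prev_ts > timeout:
                sessions.append(current)  # long gap: close the session
                current = []
            current.append(path)
        sessions.append(current)
    return sessions


def transition_counts(sessions: Iterable[List[str]]) -> Counter:
    """Count how often one page is followed directly by another within a session."""
    counts: Counter = Counter()
    for pages in sessions:
        counts.update(zip(pages, pages[1:]))
    return counts


log = [
    ("u1", 0, "/"), ("u1", 60, "/products"), ("u1", 5000, "/"),
    ("u2", 10, "/"), ("u2", 40, "/products"),
]
print(transition_counts(sessionize(log)).most_common(1))
```

Frequent transitions such as `("/", "/products")` are the kind of access pattern that usage mining looks for; Web content mining, by contrast, would examine the pages themselves.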
TL;DR: An overview of recent research efforts on automatic Web service composition from both the workflow and AI planning research communities is given.
Abstract: In today’s Web, Web services are created and updated on the fly. It is already beyond human ability to analyze them and generate a composition plan manually. A number of approaches have been proposed to tackle that problem. Most of them are inspired by research in cross-enterprise workflow and AI planning. This paper gives an overview of recent research efforts on automatic Web service composition from both the workflow and AI planning research communities.
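As a rough illustration of what a planning-style composition looks like, the sketch below treats each service as a pair of required inputs and produced outputs and chains services forward until the requested output is available. It is a simplified forward-chaining search, not any specific method from the surveyed literature; the service names and the `compose` function are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

# Assumed service model: (data types required, data types produced).
Service = Tuple[Set[str], Set[str]]


def compose(
    services: Dict[str, Service], have: Set[str], goal: Set[str]
) -> Optional[List[str]]:
    """Return an ordered list of service names that derives `goal` from `have`.

    Forward chaining: repeatedly invoke any unused service whose inputs are
    already known and which contributes something new. Returns None when the
    goal cannot be reached.
    """
    known = set(have)
    plan: List[str] = []
    progress = True
    while not goal <= known and progress:
        progress = False
        for name, (needs, gives) in services.items():
            if name not in plan and needs <= known and not gives <= known:
                plan.append(name)
                known |= gives
                progress = True
    return plan if goal <= known else None


# Hypothetical service registry.
registry: Dict[str, Service] = {
    "geocode": ({"address"}, {"lat_lon"}),
    "weather": ({"lat_lon"}, {"forecast"}),
    "translate": ({"text"}, {"text_fr"}),
}
print(compose(registry, have={"address"}, goal={"forecast"}))
```

This greedy version may include services that are applicable but not needed for the goal; a real planner would also handle preconditions, effects, and plan pruning.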
TL;DR: This work shows that the Web self-organizes and that its link structure allows efficient identification of communities; the result is significant because no central authority or process governs the formation and structure of hyperlinks.
Abstract: The vast improvement in information access is not the only advantage resulting from the increasing percentage of hyperlinked human knowledge available on the Web. Additionally, much potential exists for analyzing interests and relationships within science and society. However, the Web's decentralized and unorganized nature hampers content analysis. Millions of individuals operating independently and having a variety of backgrounds, knowledge, goals and cultures author the information on the Web. Despite the Web's decentralized, unorganized, and heterogeneous nature, our work shows that the Web self-organizes and its link structure allows efficient identification of communities. This self-organization is significant because no central authority or process governs the formation and structure of hyperlinks.
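The abstract does not spell out what counts as a community, so the sketch below rests on an assumed definition: a set of pages in which every member has more links (in either direction) to other members than to non-members. It only checks a candidate set against that definition using link structure alone; it is not the identification algorithm from the paper, and `is_community` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Link = Tuple[str, str]  # (source page, destination page)


def is_community(links: Iterable[Link], members: Set[str]) -> bool:
    """Check the assumed definition: each member links more inside than outside."""
    neighbours: Dict[str, Set[str]] = defaultdict(set)
    for src, dst in links:
        if src == dst:
            continue  # ignore self-links
        # Treat hyperlinks as undirected for this check.
        neighbours[src].add(dst)
        neighbours[dst].add(src)

    for page in members:
        inside = len(neighbours[page] & members)
        outside = len(neighbours[page] - members)
        if inside <= outside:
            return False
    return True


# Two tightly linked triangles joined by a single hyperlink (c -> d).
web = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("c", "d"),
    ("d", "e"), ("d", "f"), ("e", "f"),
]
print(is_community(web, {"a", "b", "c"}))
print(is_community(web, {"c", "d"}))
```

Note that the check uses no page content at all, which matches the abstract's emphasis on link structure.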