About: Microformat is a research topic. Over its lifetime, 38 publications have been published on this topic, receiving 882 citations. The topic is also known as: uF & μF.
TL;DR: A microformat called hRESTS (HTML for RESTful Services) is proposed for machine-readable descriptions of Web APIs, backed by a simple service model, together with two extensions: SA-REST, which captures the facets of public APIs important for mashup developers, and MicroWSMO, which provides support for semantic automation.
Abstract: The Web 2.0 wave brings, among other aspects, the programmable Web: increasing numbers of Web sites provide machine-oriented APIs and Web services. However, most APIs are only described with text in HTML documents. The lack of machine-readable API descriptions affects the feasibility of tool support for developers who use these services. We propose a microformat called hRESTS (HTML for RESTful Services) for machine-readable descriptions of Web APIs, backed by a simple service model. The hRESTS microformat describes main aspects of services, such as operations, inputs and outputs. We also present two extensions of hRESTS: SA-REST, which captures the facets of public APIs important for mashup developers, and MicroWSMO, which provides support for semantic automation.
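As a rough illustration of what "machine-readable" means here, the sketch below pulls operations, methods, addresses, inputs and outputs out of an HTML API description. It assumes the class names `service`, `operation`, `label`, `method`, `address`, `input` and `output` for the markup; the sample document, the `HrestsParser` class and the `extract_operations` helper are hypothetical and are not taken from the paper.

```python
from html.parser import HTMLParser

# Elements without a closing tag; they must not be pushed on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}
# Assumed hRESTS-style class names for the parts of an operation.
FIELDS = ("label", "method", "address", "input", "output")


class HrestsParser(HTMLParser):
    """Collects one dict per element marked with class="operation"."""

    def __init__(self):
        super().__init__()
        self.operations = []
        self._stack = []  # per open element: a field name, "operation", or None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if "operation" in classes:
            self.operations.append({})
            self._stack.append("operation")
        else:
            self._stack.append(next((f for f in FIELDS if f in classes), None))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text or "operation" not in self._stack:
            return
        # Attribute the text to the innermost enclosing field, if there is one.
        for entry in reversed(self._stack):
            if entry == "operation":
                return
            if entry is not None:
                op = self.operations[-1]
                op[entry] = (op.get(entry, "") + " " + text).strip()
                return


def extract_operations(html):
    parser = HrestsParser()
    parser.feed(html)
    parser.close()
    return parser.operations


# Hypothetical API description, written for this example only.
SAMPLE = """
<div class="service" id="svc">
  <p>Description of the <span class="label">ACME Hotels</span> service:</p>
  <div class="operation" id="op1">
    <p>The operation <code class="label">getHotelDetails</code> is invoked using
       <span class="method">GET</span> at
       <code class="address">http://example.com/h/{id}</code>,
       with <span class="input">the ID of the hotel</span> as the parameter.
       It returns <span class="output">the hotel details</span>.</p>
  </div>
</div>
"""

if __name__ == "__main__":
    for operation in extract_operations(SAMPLE):
        print(operation)
```

The human-readable text stays untouched; only the class annotations make the operation structure recoverable by a tool.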
TL;DR: The Web Data Commons project extracts all Microformat, Microdata and RDFa data from the Common Crawl web corpus, the largest and most up-to-date web corpus that is currently available to the public, and provides the extracted data for download in the form of RDF-quads.
Abstract: More and more websites embed structured data describing, for instance, products, people, organizations, places, events, resumes, and cooking recipes into their HTML pages using encoding standards such as Microformats, Microdata and RDFa. The Web Data Commons project extracts all Microformat, Microdata and RDFa data from the Common Crawl web corpus, the largest and most up-to-date web corpus that is currently available to the public, and provides the extracted data for download in the form of RDF-quads. In this paper, we give an overview of the project and present statistics about the popularity of the different encoding standards as well as the kinds of data that are published using each format.
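To make the "RDF-quads" form concrete, here is a minimal sketch that reads N-Quads-style lines and counts statements per source host. It assumes that each line carries a fourth term naming the page the statement was extracted from; the sample lines and the helper names `graph_of` and `quads_per_host` are invented for illustration and do not come from the project's actual dumps.

```python
from collections import Counter
from urllib.parse import urlparse


def graph_of(line):
    """Return the graph IRI (fourth term) of one N-Quads line, or None.

    Assumes every statement has a graph term; a plain triple whose object
    is an IRI would be misread, so this is only a sketch.
    """
    body = line.strip()
    if not body or body.startswith("#") or not body.endswith("."):
        return None
    # IRIs contain no spaces, so the graph is the last whitespace-separated
    # token before the final dot, even when the object is a literal.
    last = body[:-1].rstrip().rsplit(None, 1)[-1]
    if last.startswith("<") and last.endswith(">"):
        return last[1:-1]
    return None


def quads_per_host(lines):
    """Count statements per host name of the page they were extracted from."""
    counts = Counter()
    for line in lines:
        graph = graph_of(line)
        if graph is not None:
            counts[urlparse(graph).netloc] += 1
    return counts


# Hypothetical sample data, written for this example only.
SAMPLE_QUADS = [
    '<http://example.org/p#me> <http://xmlns.com/foaf/0.1/name> "Ann Lee" <http://example.org/p> .',
    '_:b0 <http://schema.org/name> "Widget" <http://shop.example.com/item/1> .',
    '_:b1 <http://schema.org/price> "9.99" <http://shop.example.com/item/1> .',
]

if __name__ == "__main__":
    print(quads_per_host(SAMPLE_QUADS))
```

Keeping the source page as the fourth term is what allows per-site statistics like this to be computed after extraction.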
TL;DR: A new format for representing both intermediate and final OCR results is described, developed in response to the needs of a newly developed OCR system and ground truth data release. The format embeds OCR information invisibly inside the HTML and CSS standards.
Abstract: Large scale scanning and document conversion efforts have led to a renewed interest in OCR systems and workflows. This paper describes a new format for representing both intermediate and final OCR results, developed in response to the needs of a newly developed OCR system and ground truth data release. The format embeds OCR information invisibly inside the HTML and CSS standards and therefore can represent a wide range of linguistic and typographic phenomena with already well-defined, widely understood markup and can be processed using widely available and known tools. The format is based on a new, multi-level abstraction of OCR results based on logical markup, common typesetting models, and OCR engine-specific markup, making it suitable both for the support of existing workflows and the development of future model-based OCR engines.
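The claim that such output "can be processed using widely available and known tools" can be sketched with nothing but the Python standard library. The example assumes the hOCR convention of marking words with `class="ocrx_word"` and storing layout data in the `title` attribute as `bbox x0 y0 x1 y1`; the abstract itself does not spell out these details, and the sample markup and the names `HocrWords`, `parse_title` and `extract_words` are illustrative only.

```python
from html.parser import HTMLParser


def parse_title(title):
    """Split a title such as 'bbox 36 92 96 116; x_wconf 95' into a dict."""
    props = {}
    for part in title.split(";"):
        tokens = part.split()
        if tokens:
            props[tokens[0]] = tokens[1:]
    return props


class HocrWords(HTMLParser):
    """Collects (text, bounding box) pairs for elements of class ocrx_word."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None  # bounding box of the word element being read

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "ocrx_word" in (attributes.get("class") or "").split():
            bbox = parse_title(attributes.get("title") or "").get("bbox", [])
            if len(bbox) == 4:
                self._bbox = tuple(int(value) for value in bbox)

    def handle_data(self, data):
        text = data.strip()
        if self._bbox is not None and text:
            self.words.append((text, self._bbox))
            self._bbox = None

    def handle_endtag(self, tag):
        # An empty word element must not pass its box on to later text.
        self._bbox = None


def extract_words(html):
    parser = HocrWords()
    parser.feed(html)
    parser.close()
    return parser.words


# Hypothetical OCR output, written for this example only.
SAMPLE_HOCR = """
<span class='ocr_line' title='bbox 36 92 580 122'>
  <span class='ocrx_word' title='bbox 36 92 96 116; x_wconf 95'>Hello</span>
  <span class='ocrx_word' title='bbox 109 92 190 116; x_wconf 91'>world</span>
</span>
"""

if __name__ == "__main__":
    for text, bbox in extract_words(SAMPLE_HOCR):
        print(text, bbox)
```

Because the layout data sits in attributes, a browser renders the same file as ordinary text while a script can still recover the coordinates.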
TL;DR: A language to specify semantically rich XML languages in terms of other XML languages, such as XHTML, that is versatile enough to represent templates that can capture the overall structure of large documents as well as the fine details of a microformat.
Abstract: Microformats and semantic XHTML add semantics to web pages while taking advantage of the existing (X)HTML infrastructure. This approach enables new applications that can be deployed smoothly on the web. But there is currently no way to rigorously describe this type of markup, and authors of web pages have very little help for creating and encoding semantic markup. A language that addresses these issues is presented in this paper. Its role is to specify semantically rich XML languages in terms of other XML languages, such as XHTML. The language is versatile enough to represent templates that can capture the overall structure of large documents as well as the fine details of a microformat. It is supported by an editing tool for producing documents encoded in a semantically rich markup language, still fully compatible with XHTML.
TL;DR: New Microformat and Microdata schemas for describing 3D web components and 3D scenes with metadata and semantic properties are presented and may be combined with X3D, a well-established 3D content description standard.
Abstract: The paper presents new Microformat and Microdata schemas for creating descriptions of interactive 3D web content. Microformats and Microdata are increasingly popular solutions for creating lightweight attribute-based built-in semantic metadata of web content. However, although Microformats and Microdata enable basic description of media objects, they have not been intended for 3D content. Describing 3D components is more complex than describing standard web pages, as the descriptions may relate to different aspects of the 3D content: spatial, temporal, structural, logical and behavioural. The main contribution of this paper is a set of new Microformat and Microdata schemas for describing 3D web components and 3D scenes with metadata and semantic properties. The proposed schemas may be combined with X3D, a well-established 3D content description standard. Thanks to the use of the standardized solutions, the presented approach facilitates widespread dissemination of 3D content for use in a variety of multimedia applications on the web.