About: Document type definition is a research topic. Over its lifetime, 2472 publications have been published within this topic, receiving 59767 citations. The topic is also known as: DTD.
TL;DR: It turns out that the relational approach can handle most (but not all) of the semantics of semi-structured queries over XML data, but is likely to be effective only in some cases.
Abstract: XML is fast emerging as the dominant standard for representing data in the World Wide Web. Sophisticated query engines that allow users to effectively tap the data stored in XML documents will be crucial to exploiting the full power of XML. While there has been a great deal of activity recently proposing new semistructured data models and query languages for this purpose, this paper explores the more conservative approach of using traditional relational database engines for processing XML documents conforming to Document Type Descriptors (DTDs). To this end, we have developed algorithms and implemented a prototype system that converts XML documents to relational tuples, translates semi-structured queries over XML documents to SQL queries over tables, and converts the results to XML. We have qualitatively evaluated this approach using several real DTDs drawn from diverse domains. It turns out that the relational approach can handle most (but not all) of the semantics of semi-structured queries over XML data, but is likely to be effective only in some cases. We identify the causes for these limitations and propose certain extensions to the relational model that would make it more appropriate for processing queries over XML documents.
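The pipeline in the abstract above (XML documents to relational tuples, then SQL over tables) can be illustrated with a minimal sketch. It uses a generic "edge table" with one row per element, which is a deliberately simpler technique than the DTD-driven algorithms the paper develops. The sample document, the table name `element`, and the helper names `shred` and `author_names` are all assumptions made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; not taken from the paper.
XML_DOC = """
<book>
  <title>XML and Databases</title>
  <author><name>Ann</name></author>
  <author><name>Bob</name></author>
</book>
"""


def shred(xml_text, conn):
    """Store every element as one tuple: (id, parent_id, tag, text)."""
    conn.execute(
        "CREATE TABLE element ("
        "id INTEGER PRIMARY KEY, parent_id INTEGER, tag TEXT, text TEXT)"
    )
    counter = 0

    def visit(node, parent_id):
        nonlocal counter
        counter += 1
        node_id = counter  # ids follow document order (pre-order traversal)
        conn.execute(
            "INSERT INTO element VALUES (?, ?, ?, ?)",
            (node_id, parent_id, node.tag, (node.text or "").strip()),
        )
        for child in node:
            visit(child, node_id)

    visit(ET.fromstring(xml_text), None)


def author_names(conn):
    """Path query book/author/name expressed as SQL self-joins."""
    rows = conn.execute(
        """
        SELECT n.text
        FROM element AS b
        JOIN element AS a ON a.parent_id = b.id AND a.tag = 'author'
        JOIN element AS n ON n.parent_id = a.id AND n.tag = 'name'
        WHERE b.tag = 'book'
        ORDER BY n.id
        """
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    shred(XML_DOC, connection)
    print(author_names(connection))
```

Each step of the path costs one self-join in this layout, which hints at why a purely relational encoding may be effective only in some cases.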
TL;DR: The invention includes a markup language according to the SGML standard, in which document type definitions divide electronic documents into blocks that are associated with logical fields specific to the type of block.
Abstract: The invention includes a markup language according to the SGML standard in which document type definitions are created under which electronic documents are divided into blocks that are associated with logical fields that are specific to the type of block. Each of many different types of electronic documents can have a record mapping to a particular environment, such as a legacy environment of a banking network, a hospital's computer environment for electronic record keeping, a lending institution's computer environment for processing loan applications, or a court or arbitrator's computer system. Semantic document type definitions for various electronic document types (including, for example, electronic checks, mortgage applications, medical records, prescriptions, contracts, and the like) can be formed using mapping techniques between the logical content of the document and the block that is defined to include such content. Also, the various document types are preferably defined to satisfy existing customs, protocols and legal rules.
TL;DR: The XRANK system is presented, designed to handle the novel features of XML keyword search, which naturally generalizes a hyperlink based HTML search engine such as Google and can be used to query a mix of HTML and XML documents.
Abstract: We consider the problem of efficiently producing ranked results for keyword search queries over hyperlinked XML documents. Evaluating keyword search queries over hierarchical XML documents, as opposed to (conceptually) flat HTML documents, introduces many new challenges. First, XML keyword search queries do not always return entire documents, but can return deeply nested XML elements that contain the desired keywords. Second, the nested structure of XML implies that the notion of ranking is no longer at the granularity of a document, but at the granularity of an XML element. Finally, the notion of keyword proximity is more complex in the hierarchical XML data model. In this paper, we present the XRANK system that is designed to handle these novel features of XML keyword search. Our experimental results show that XRANK offers both space and performance benefits when compared with existing approaches. An interesting feature of XRANK is that it naturally generalizes a hyperlink based HTML search engine such as Google. XRANK can thus be used to query a mix of HTML and XML documents.
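The first two challenges in the abstract above (results are nested elements, not whole documents) can be sketched as follows. This is a simplification under stated assumptions: it returns the deepest elements whose subtree text contains every keyword, and it does no ranking at all, so it is not XRANK's result definition or scoring. The function name `deepest_matches`, the path notation, and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; not taken from the paper.
DOC = """
<proceedings>
  <paper><title>XML keyword search</title><body>ranking elements</body></paper>
  <paper><title>Relational storage</title><body>XML shredding</body></paper>
</proceedings>
"""


def deepest_matches(xml_text, keywords):
    """Return paths of the deepest elements whose subtree holds all keywords.

    Matching is case-insensitive on whitespace-separated words; punctuation
    is not stripped, which a real system would need to handle.
    """
    wanted = {k.lower() for k in keywords}
    results = []

    def visit(node, path):
        # Collect words from this element's own text and from its subtree.
        words = set((node.text or "").lower().split())
        child_covers = False
        for index, child in enumerate(node):
            child_words, covers = visit(child, f"{path}/{child.tag}[{index}]")
            words |= child_words
            words |= set((child.tail or "").lower().split())
            child_covers = child_covers or covers
        covers_all = wanted <= words
        # Report an element only if no child already covers every keyword.
        if covers_all and not child_covers:
            results.append(path)
        return words, covers_all

    root = ET.fromstring(xml_text)
    visit(root, root.tag)
    return results


if __name__ == "__main__":
    print(deepest_matches(DOC, ["XML", "search"]))
    print(deepest_matches(DOC, ["XML", "ranking"]))
```

When the keywords sit in one title, the title element itself is returned. When they are spread across a title and a body, the enclosing paper element is returned instead of the whole document.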
TL;DR: A computer-based application program for managing electronic and/or paper-based documents provides an efficient way to automatically import, index, categorize, store, search, retrieve, manipulate and archive electronic documents, regardless of document type or format.
Abstract: A computer-based electronic document and/or paper-based document management application program. The program provides an efficient way to automatically import, index, categorize, store, search, retrieve, manipulate and archive electronic documents. The program is also capable of managing documents regardless of document type or document format.
TL;DR: A technique, system, and computer program for using servlets to isolate the retrieval of data from the rendering of the data into a presentation format. Data retrieval logic is isolated to a data servlet, presentation formatting is isolated to a rendering servlet, and servlet chaining sends the output of the former to the latter.
Abstract: A technique, system, and computer program for using servlets to isolate the retrieval of data from the rendering of the data into a presentation format. Data retrieval logic is isolated to a data servlet, and presentation formatting is isolated to a rendering servlet. Servlet chaining is used to send the output of the data servlet to the rendering servlet. The data servlet formats its output data stream for transfer to a downstream servlet. This data stream may be formatted using a language such as the Extensible Markup Language (XML), according to a specific Document Type Definition (DTD). The rendering servlet parses this XML data stream, using a style sheet that may be written using the Extensible Style Language (XSL), and creates a HyperText Markup Language (HTML) data stream as output.
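The separation described in the abstract above can be sketched as a two-stage chain. This is an analogy under stated assumptions, not the patented implementation: plain Python functions stand in for the Java servlets, a hand-written renderer stands in for the XSL style sheet, and the element names `customers`, `customer`, `name`, and `city` are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape


def data_stage(records):
    """Stand-in for the data servlet: serialize records as an XML stream."""
    root = ET.Element("customers")
    for record in records:
        item = ET.SubElement(root, "customer")
        ET.SubElement(item, "name").text = record["name"]
        ET.SubElement(item, "city").text = record["city"]
    return ET.tostring(root, encoding="unicode")


def render_stage(xml_stream):
    """Stand-in for the rendering servlet: parse the XML and emit HTML."""
    root = ET.fromstring(xml_stream)
    rows = []
    for item in root.findall("customer"):
        name = escape(item.findtext("name", default=""))
        city = escape(item.findtext("city", default=""))
        rows.append(f"<tr><td>{name}</td><td>{city}</td></tr>")
    return "<table>" + "".join(rows) + "</table>"


def chain(records):
    """Chaining: the output of the data stage feeds the rendering stage."""
    return render_stage(data_stage(records))


if __name__ == "__main__":
    print(chain([{"name": "Ann & Co", "city": "Oslo"}]))
```

Because the two stages share only the XML stream, the rendering stage could be swapped for one that targets another presentation format without touching the data retrieval logic.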