Topic

Document type definition

About: Document type definition is a research topic. Over its lifetime, 2,472 publications have appeared within this topic, receiving 59,767 citations. The topic is also known as: DTD.

Papers

Proceedings Article

Relational Databases for Querying XML Documents: Limitations and Opportunities

Jayavel Shanmugasundaram, Kristin Tufte, Chun Zhang, Gang He, David J. DeWitt, Jeffrey F. Naughton

7 Sep 1999

TL;DR: It turns out that the relational approach can handle most (but not all) of the semantics of semi-structured queries over XML data, but is likely to be effective only in some cases.

Abstract: XML is fast emerging as the dominant standard for representing data in the World Wide Web. Sophisticated query engines that allow users to effectively tap the data stored in XML documents will be crucial to exploiting the full power of XML. While there has been a great deal of activity recently proposing new semistructured data models and query languages for this purpose, this paper explores the more conservative approach of using traditional relational database engines for processing XML documents conforming to Document Type Descriptors (DTDs). To this end, we have developed algorithms and implemented a prototype system that converts XML documents to relational tuples, translates semi-structured queries over XML documents to SQL queries over tables, and converts the results to XML. We have qualitatively evaluated this approach using several real DTDs drawn from diverse domains. It turns out that the relational approach can handle most (but not all) of the semantics of semi-structured queries over XML data, but is likely to be effective only in some cases. We identify the causes for these limitations and propose certain extensions to the relational model that would make it more appropriate for processing queries over XML documents.

1,137 citations
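
The pipeline described in the abstract above (convert XML documents to relational tuples, run SQL over the tables, convert the results back to XML) can be illustrated with a minimal Python sketch. This is not the authors' prototype: the `<library>`/`<book>` vocabulary, the `book` table, and the functions `shred` and `titles_after` are all hypothetical names chosen for illustration, and the XML-to-table mapping is written by hand rather than derived from a DTD.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are assumptions.
XML_DOC = """
<library>
  <book year="1999"><title>XML Basics</title><author>Ann</author></book>
  <book year="2003"><title>Ranked Search</title><author>Bob</author></book>
</library>
"""

def shred(xml_text, conn):
    """Store each <book> element as one relational tuple."""
    conn.execute(
        "CREATE TABLE book ("
        "id INTEGER PRIMARY KEY, year INTEGER, title TEXT, author TEXT)"
    )
    root = ET.fromstring(xml_text)
    for book in root.findall("book"):
        conn.execute(
            "INSERT INTO book (year, title, author) VALUES (?, ?, ?)",
            (int(book.get("year")), book.findtext("title"), book.findtext("author")),
        )

def titles_after(conn, year):
    """SQL counterpart of the path query /library/book[@year > year]/title.

    The matching rows are re-tagged as XML before being returned.
    """
    rows = conn.execute(
        "SELECT title FROM book WHERE year > ? ORDER BY id", (year,)
    )
    result = ET.Element("result")
    for (title,) in rows:
        ET.SubElement(result, "title").text = title
    return ET.tostring(result, encoding="unicode")

conn = sqlite3.connect(":memory:")
shred(XML_DOC, conn)
print(titles_after(conn, 2000))
```

A flat record like `<book>` maps cleanly onto one table. Recursive or optional structure needs extra tables and joins, which is where the abstract reports that the relational approach becomes less effective.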

Patent

Method and system for processing electronic documents

Milton M. Anderson, Frank A. Jaffe, Chris Hibbert, Jyri Virkki, Jeffrey Kravitz, Sheveling Chang, Elaine R. Palmer

19 Dec 1997

TL;DR: The invention includes a markup language according to the SGML standard in which document type definitions are created, under which electronic documents are divided into blocks that are associated with logical fields specific to the type of block.

Abstract: The invention includes a markup language according to the SGML standard in which document type definitions are created under which electronic documents are divided into blocks that are associated with logical fields that are specific to the type of block. Each of many different types of electronic documents can have a record mapping to a particular environment, such as a legacy environment of a banking network, a hospital's computer environment for electronic record keeping, a lending institution's computer environment for processing loan applications, or a court or arbitrator's computer system. Semantic document type definitions for various electronic document types (including, for example, electronic checks, mortgage applications, medical records, prescriptions, contracts, and the like) can be formed using mapping techniques between the logical content of the document and the block that is defined to include such content. Also, the various document types are preferably defined to satisfy existing customs, protocols and legal rules.

945 citations

Proceedings Article • DOI: 10.1145/872757.872762

XRANK: ranked keyword search over XML documents

Lin Guo¹, Feng Shao¹, Chavdar Botev¹, Jayavel Shanmugasundaram¹

¹ Cornell University

9 Jun 2003

TL;DR: The XRANK system is presented, designed to handle the novel features of XML keyword search, which naturally generalizes a hyperlink based HTML search engine such as Google and can be used to query a mix of HTML and XML documents.

Abstract: We consider the problem of efficiently producing ranked results for keyword search queries over hyperlinked XML documents. Evaluating keyword search queries over hierarchical XML documents, as opposed to (conceptually) flat HTML documents, introduces many new challenges. First, XML keyword search queries do not always return entire documents, but can return deeply nested XML elements that contain the desired keywords. Second, the nested structure of XML implies that the notion of ranking is no longer at the granularity of a document, but at the granularity of an XML element. Finally, the notion of keyword proximity is more complex in the hierarchical XML data model. In this paper, we present the XRANK system that is designed to handle these novel features of XML keyword search. Our experimental results show that XRANK offers both space and performance benefits when compared with existing approaches. An interesting feature of XRANK is that it naturally generalizes a hyperlink based HTML search engine such as Google. XRANK can thus be used to query a mix of HTML and XML documents.

899 citations
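
The first point in the XRANK abstract above, that a keyword query can return deeply nested elements rather than whole documents, can be illustrated with a small Python sketch. It is not the XRANK algorithm: it performs no ranking and ignores hyperlinks and keyword proximity, and the function `deepest_matches` and the sample `<proceedings>` document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are assumptions.
DOC = (
    "<proceedings>"
    "<paper><title>XML search</title>"
    "<body><section>ranked keyword search</section></body></paper>"
    "<paper><title>Databases</title></paper>"
    "</proceedings>"
)

def deepest_matches(elem, keywords):
    """Return the most deeply nested elements whose subtree text
    contains every keyword (case-insensitive substring match)."""
    text = " ".join(elem.itertext()).lower()
    if not all(k.lower() in text for k in keywords):
        return []  # this subtree cannot contain a match
    # Prefer matches found further down; fall back to this element.
    hits = [m for child in elem for m in deepest_matches(child, keywords)]
    return hits or [elem]

root = ET.fromstring(DOC)
print([e.tag for e in deepest_matches(root, ["ranked", "keyword"])])
print([e.tag for e in deepest_matches(root, ["xml", "ranked"])])
```

The first query is answered by a single nested `section`. The second is answered by the enclosing `paper`, because no smaller element contains both keywords. This is the element-level result granularity the abstract refers to.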

Patent

Computer-based document management system

David R. Ferguson, Dani Suleman, Gregory L. Whittemore

7 Oct 1998

TL;DR: A computer-based application program for managing electronic and/or paper-based documents that provides an efficient way to automatically import, index, categorize, store, search, retrieve, manipulate and archive electronic documents.

Abstract: A computer-based electronic document and/or paper-based document management application program. The program provides an efficient way to automatically import, index, categorize, store, search, retrieve, manipulate and archive electronic documents. The program is also capable of managing documents regardless of document type or document format.

588 citations

Patent

Servlet pairing for isolation of the retrieval and rendering of data

Elias N. Bayeh¹, Mark W. Lumsden¹

¹ IBM

23 Feb 1998

TL;DR: A technique, system, and computer program for using servlets to isolate the retrieval of data from the rendering of the data into a presentation format. Data retrieval logic is isolated to a data servlet, and presentation formatting is isolated to a rendering servlet.

Abstract: A technique, system, and computer program for using servlets to isolate the retrieval of data from the rendering of the data into a presentation format. Data retrieval logic is isolated to a data servlet, and presentation formatting is isolated to a rendering servlet. Servlet chaining is used to send the output of the data servlet to the rendering servlet. The data servlet formats its output data stream for transfer to a downstream servlet. This data stream may be formatted using a language such as the Extensible Markup Language (XML), according to a specific Document Type Definition (DTD). The rendering servlet parses this XML data stream, using a style sheet that may be written using the Extensible Style Language (XSL), and creates a HyperText Markup Language (HTML) data stream as output.

452 citations
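
The separation described in the abstract above (one stage retrieves data and emits XML, a downstream stage parses that XML and produces HTML) can be sketched in Python. This is a stand-in under stated assumptions, not the patented servlet implementation: plain functions replace servlets, a hand-written renderer replaces the XSL style sheet, and the `<items>`/`<item>` vocabulary and the names `data_stage` and `render_stage` are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

def data_stage(records):
    """Retrieval only: serialise records as an XML stream.

    The <items>/<item> vocabulary is an assumption; in the patent the
    stream would follow a specific DTD.
    """
    root = ET.Element("items")
    for name, price in records:
        item = ET.SubElement(root, "item")
        ET.SubElement(item, "name").text = name
        ET.SubElement(item, "price").text = str(price)
    return ET.tostring(root, encoding="unicode")

def render_stage(xml_stream):
    """Presentation only: parse the XML stream and emit an HTML list."""
    root = ET.fromstring(xml_stream)
    rows = [
        f"<li>{escape(item.findtext('name'))}: {escape(item.findtext('price'))}</li>"
        for item in root.findall("item")
    ]
    return "<ul>" + "".join(rows) + "</ul>"

# Chaining: the output of the data stage is the input of the rendering stage.
html_out = render_stage(data_stage([("pen", 2), ("ink & paper", 5)]))
print(html_out)
```

Because the two stages share only the XML stream, the rendering stage could be swapped for a different presentation format without touching the retrieval logic.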

Performance Metrics

Papers: 2,497

Citations: 35,966

No. of papers in the topic in previous years
Year	Papers
2025	2
2023	5
2022	13
2021	9
2020	22
2019	27
