TL;DR: Techniques for automatically computing the geographical scope of web resources, based on the textual content of the resources, as well as on the geographical distribution of hyperlinks to them are introduced.
Abstract: Many information resources on the web are relevant primarily to limited geographical communities. For instance, web sites containing information on restaurants, theaters, and apartment rentals are relevant primarily to web users in geographical proximity to these locations. In contrast, other information resources are relevant to a broader geographical community. For instance, an on-line newspaper may be relevant to users across the United States. Unfortunately, current web search engines largely ignore the geographical scope of web resources. In this paper, we introduce techniques for automatically computing the geographical scope of web resources, based on the textual content of the resources, as well as on the geographical distribution of hyperlinks to them. We report an extensive experimental evaluation of our strategies using real web data. Finally, we describe a geographically aware search engine that we have built to showcase our techniques.
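As a rough illustration of the link-based idea in this abstract, the sketch below estimates scope from where a resource's inbound links originate. The two measures (a per-region link fraction and an entropy-based uniformity score), the threshold, and all names and counts are assumptions made for illustration; they are not taken from the paper.

```python
import math

def link_fractions(links_to_resource, pages_per_region):
    """Fraction of each region's pages that link to the resource."""
    return {
        region: links_to_resource.get(region, 0) / total
        for region, total in pages_per_region.items()
        if total > 0
    }

def uniformity(fractions):
    """Normalized entropy of the link fractions: 1.0 means perfectly even."""
    total = sum(fractions.values())
    if total == 0 or len(fractions) < 2:
        return 0.0
    probs = [f / total for f in fractions.values() if f > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(fractions))

def estimate_scope(links_to_resource, pages_per_region, min_uniformity=0.9):
    """Return 'broad' if links are spread evenly, else the dominant region."""
    fractions = link_fractions(links_to_resource, pages_per_region)
    if uniformity(fractions) >= min_uniformity:
        return "broad"
    return max(fractions, key=fractions.get)

# Hypothetical page and link counts per region.
pages = {"NY": 1000, "CA": 1000, "TX": 1000}
newspaper = {"NY": 50, "CA": 48, "TX": 52}
restaurant = {"NY": 40, "CA": 1, "TX": 0}
```

With these invented counts, the evenly linked newspaper would be classified as broad, while the restaurant would be assigned to the single region that dominates its inbound links.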
TL;DR: In this paper, a standard format is provided for a text string called an enterprise identifier, which acts as a handle to access resources from disparate sources and technologies, using extensible markup language format to allow a resource identifier to be created manually without accessing the resource.
Abstract: A standard format is provided for a text string called an enterprise identifier, which acts as a handle to access resources from disparate sources and technologies. Enterprise identifiers use extensible markup language format to allow a resource identifier to be created manually without accessing the resource. The identifier may be passed between enterprises via business-to-business connection, e-mail, telephone, or facsimile. Data may be extracted from the identifier for display or programmatic use without accessing the resource, thus avoiding unnecessary data access and transfer.
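To make the "extract data without accessing the resource" point concrete, here is a minimal sketch using Python's standard XML parser. The abstract does not give the identifier's schema, so every element and attribute name below is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical identifier: every element and attribute name here is invented.
IDENTIFIER = (
    '<resource-id source="document-archive">'
    "<title>Q3 supplier contract</title>"
    "<key>contract-0042</key>"
    "</resource-id>"
)

def describe(identifier_text):
    """Extract display data from the identifier string alone."""
    root = ET.fromstring(identifier_text)
    return {
        "source": root.get("source"),
        "title": root.findtext("title"),
        "key": root.findtext("key"),
    }
```

Because the string is self-describing, a caller can show the title or route on the source without ever contacting the system that holds the resource.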
TL;DR: In this article, a plurality of links are identified using a web browser component, where the links are selectable to open a corresponding web resource of a specified data type on the web site.
Abstract: Embodiments of the invention include a method to access a web site. A plurality of links are identified using a web browser component, where the links are selectable to open a corresponding web resource of a specified data type on the web site. The plurality of links are made available to a plurality of Internet enabled devices that select one or more of the links.
TL;DR: A user's Web surfing action is abstracted as a Markov model and a new rank algorithm is proposed, which synthesizes the relevance, authority, integrativity and novelty of each Web resource and can be computed efficiently through solving a group of linear equations.
Abstract: How to rank Web resources is critical to Web Resource Discovery (Search Engine). This paper not only points out the weakness of current approaches, but also presents in-depth analysis of the multidimensionality and subjectivity of rank algorithms. From a dynamics viewpoint, this paper abstracts a user's Web surfing action as a Markov model. Based on this model, we propose a new rank algorithm. The result of our rank algorithm, which synthesizes the relevance, authority, integrativity and novelty of each Web resource, can be computed efficiently not by iteration but through solving a group of linear equations.
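The idea of obtaining ranks by solving linear equations rather than iterating can be sketched with the familiar random-surfer Markov model. This is a stand-in chosen for illustration, not the paper's algorithm: it ignores relevance, integrativity and novelty, and the damping value and toy link graph are assumptions.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def rank(links, damping=0.85):
    """links[i] lists the pages that page i links to (every page links somewhere)."""
    n = len(links)
    # Build (I - damping * P^T), where P[i][j] = 1/outdegree(i) if i links to j.
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, targets in enumerate(links):
        for j in targets:
            a[j][i] -= damping / len(targets)
    b = [(1.0 - damping) / n] * n
    return solve(a, b)

# Toy graph: page 0 links to 1 and 2, page 1 links to 2, page 2 links to 0.
scores = rank([[1, 2], [2], [0]])
```

The scores are the stationary visit probabilities of the surfer, obtained in one direct solve; for this toy graph the page with the most inbound weight (page 2) ranks highest.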
TL;DR: A semi-automatic accessibility evaluation tool is proposed, which will guide evaluators through the auditing process and produce a set of tailored recommendations for making the subject site accessible.
Abstract: A majority of Web based information, facilities and services is unnecessarily inaccessible to people with certain disabilities, largely due to a lack of awareness of accessibility issues on the part of developers. This paper argues that currently available accessibility evaluation methods are unsatisfactory in the scope and presentation of their results. Consequently, there is a need for a meta-method which utilises the strengths of current methods, but which also bridges their weaknesses. The paper discusses a comprehensive, yet usable methodology for evaluating web sites for accessibility. Using this methodology, a semi-automatic accessibility evaluation tool is proposed, which will guide evaluators through the auditing process and produce a set of tailored recommendations for making the subject site accessible.
TL;DR: PIPE (Personalization is Partial Evaluation) is able to personalize Web resources, without enumerating the interaction sequences beforehand, and supports information integration, and varying levels of input by Web visitors.
Abstract: Partial evaluation is a technique popular in the programming languages community. It is applied here as a methodology for personalizing Web content. PIPE (Personalization is Partial Evaluation) is able to personalize Web resources, without enumerating the interaction sequences beforehand. It supports information integration, and varying levels of input by Web visitors. PIPE models personalization as a form of partial evaluation, a technique that uses incomplete input information to specialize programs. This article describes the PIPE methodology and presents experimental results demonstrating its effectiveness in two different domains.
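A toy sketch of personalization as partial evaluation: a site is treated as a program that branches on visitor attributes, and whatever the visitor has already supplied is used to specialize it. The tree encoding, attribute names and page names are assumptions for illustration, not PIPE's actual representation.

```python
def specialize(node, known):
    """Partially evaluate a site tree against the visitor attributes in `known`.

    A node is either a leaf (a string naming a page) or a tuple
    (variable, {value: subtree}) that branches on a visitor attribute.
    """
    if isinstance(node, str):
        return node
    variable, branches = node
    if variable in known:
        # The choice is already determined, so the branch point disappears.
        return specialize(branches[known[variable]], known)
    return (variable, {v: specialize(sub, known) for v, sub in branches.items()})

# Hypothetical site: browse by topic first, then by region.
site = ("topic", {
    "hiking": ("region", {"north": "hiking-north.html", "south": "hiking-south.html"}),
    "cycling": ("region", {"north": "cycling-north.html", "south": "cycling-south.html"}),
})
```

A visitor who supplies only the region, out of the site's browsing order, still gets a simplified site that asks just for the topic; no interaction sequence had to be listed in advance.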
TL;DR: In this article, the authors present a system for managing resources, which can take the form of computer-compatible information (such as data files and programs), non-computer-compatible information (such as data contained on microfiche), and physical objects.
Abstract: The invention concerns a system for managing resources, which can take the form of (a) computer-compatible information, such as data files and programs, and (b) non-computer-compatible information, such as data contained on microfiche, and (c) physical objects. The resources are located at geographically diverse sites. The invention contains a descriptive profile for each resource, and allows any user to search all profiles, and to search the profiles according to “fields” (a database term), such as by location of the resources, or by category of the resources. The user can order delivery of a selected resource, and the system causes delivery of the resource to be executed, irrespective of the form (e.g., physical object) of the resource. The invention allows a provider of a new resource to limit access to the resource, by identifying users who are authorized to obtain access to the resource. Non-authorized users cannot obtain access to the profiles of these resources.
TL;DR: In this paper, the authors propose a method and system for enabling resources to be defined, tracked, verified, resolved and managed both statically and dynamically, regardless of resource type.
Abstract: The present invention relates to a method and system for enabling resources to be defined, tracked, verified, resolved and managed both statically and dynamically. Resource management may be performed explicitly and consistently throughout a system, regardless of resource type. When a resource is defined, the resource may be assigned a unique specifier which may include a resource ID, type ID, version ID and/or other identifier. This information may be stored in a centralized repository, preventing redundant definitions of similar resources. Software or other applications may request (or require) access to a resource from a resource manager, regardless of resource type, retrieval mechanism or location.
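The specifier-plus-repository arrangement described here might look roughly like the following. Class names, field names and the duplicate-rejection policy are hypothetical; the abstract does not say how redundant definitions are actually prevented.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ResourceSpecifier:
    """Unique specifier; the three fields mirror the IDs named in the abstract."""
    resource_id: str
    type_id: str
    version_id: str

class ResourceManager:
    """Central repository keyed by specifier; all names here are hypothetical."""

    def __init__(self):
        self._resources = {}

    def define(self, specifier, payload):
        # Reject a second definition of the same specifier.
        if specifier in self._resources:
            raise ValueError(f"already defined: {specifier}")
        self._resources[specifier] = payload

    def request(self, specifier):
        # Callers use the same call regardless of resource type.
        return self._resources[specifier]
```

Making the specifier a frozen dataclass keeps it hashable, so it can serve directly as the repository key.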
TL;DR: The paper describes the OHIF format and demonstrates how the Webvise system handles OHIF, and argues for better support for handling user controlled meta data, e.g. support for linking in non-XML data, integration of external linking in the Web infrastructure, and collaboration support for external structures and meta-data.
Abstract: This paper introduces an approach to utilise open hypermedia structures such as links, annotations, collections and guided tours as meta data for Web resources. The paper introduces an XML based data format, called Open Hypermedia Interchange Format, OHIF, for such hypermedia structures. OHIF resembles XLink with respect to its representation of out-of-line links, but it goes beyond XLink with a richer set of structuring mechanisms, including e.g. composites. Moreover, OHIF includes an addressing mechanism (LocSpecs) that goes beyond XPointer and URL in its ability to locate non-XML data segments. By means of the Webvise system, OHIF structures can be authored, imposed on Web pages, and finally linked on the Web as any ordinary Web resource. Following a link to an OHIF file automatically invokes a Webvise download of the meta data structures, and the annotated Web content will be displayed in the browser. Moreover, the Webvise system provides support for users to create, manipulate, and share the OHIF structures together with custom made Web pages and MS Office 2000 documents on WebDAV servers. These Webvise facilities go beyond earlier open hypermedia systems in that they now allow fully distributed open hypermedia linking between Web pages and WebDAV aware desktop applications. The paper describes the OHIF format and demonstrates how the Webvise system handles OHIF. Finally, it argues for better support for handling user controlled meta data, e.g. support for linking in non-XML data, integration of external linking in the Web infrastructure, and collaboration support for external structures and meta-data.
TL;DR: The Library Visit study examines user accounts of what happened when users asked a reference question in a public or academic library of their choice between fall 1998 and spring 2000, finding that reference staff seem to regard the Internet as an external resource that users can search independently--at home or on the library's public access workstations--but not as a full-fledged reference tool for which reference librarians have a responsibility to help users search and evaluate.
Abstract: The Library Visit Study, Phase 2. Phase 2 of the Library Visit study examines 161 user accounts of what happened when users asked a reference question in a public or academic library of their choice between fall 1998 and spring 2000. Although the overall success rate increased from 60 percent in phase 1 in the early 1990s to 69 percent in the current period, the problems identified in the unsuccessful cases were the same as those identified in phase 1. These problems include failure to conduct a reference interview, unmonitored referral, and failure to ask follow-up questions. A new focus of analysis in phase 2 addresses how the availability of electronic resources, including the Web, may have changed the reference transaction and affected the user's experience. A troubling finding is that reference staff seem to regard the Internet as an external resource that users can search independently--at home or on the library's public access workstations--but not as a full-fledged reference tool for which reference librarians have a responsibility to help users search and evaluate. Like so many research projects, this one started with a puzzle that arose in the course of teaching. The authors teach sections of a basic reference course with some other colleagues at the University of Western Ontario. Over the years, those of us who teach this course have put together a large file of reference questions from which we select ten reference questions for a class assignment for each course offering. Questions that turn out to be too easy or too hard are discarded, so that the question file contains several hundred tried-and-tested reference questions and best answers, classified by type of reference tool needed to answer the question. The faculty members responsible for the course recently noticed a problem: questions that previously had been adequately challenging now were too easy.
For example, in the fall 1999 term, one of the authors used the question, "Who is Dart's child and what is Dart's child known for?" This question had formerly required students to analyze the question and take a series of logical steps to find the answer, but the 1999 students were able to find an answer in a few minutes. All they had to do was type "Dart's Child" into Google (www.google.com) and within the top ten hits could find several sources that either provided the answer or provided a shortcut to the answer. We wondered how many other of our reference questions could be answered by authoritative Web sources retrieved within the first ten Google hits. We decided to check systematically and discovered that we had to throw out a third of our tried-and-true questions. By virtue of answers available through just one search engine, a large number of questions--148 out of 442--had become too easy to provide useful learning experiences for students. For us, this was a dramatic illustration of the way in which the World Wide Web has transformed the landscape of reference tools. Despite all the frequently voiced criticisms of the Web--that it contains unfiltered information of dubious authority, that methods of subject control and retrieval are slap-dash, that Web pages are unstable and fugitive--one search engine, Google, proved to be an effective shortcut for finding an answer to one-third of our questions.
Literature Review
By 1998, evidence was beginning to accumulate in the library literature about the value of answering reference questions with resources in freely available Web sites. For example, Janes and McClure asked whether reference can now be done as well in the Web environment as in the traditional print environment and answered "yes."
This conclusion was based on search outcomes when twenty-four volunteer searchers tried to find answers to twelve ready reference questions using either Web or non-Web resources.[1] In this study, Web resources were restricted to freely available Web pages and did not include licensed commercial products. …
TL;DR: In this article, the psychological benefit of finding a web site devoted to the contact and with the contact's own identity as part of the domain name conditions the contact favorably and increases the chances that the result sought by the promoter will be achieved.
Abstract: Techniques for inducing a contact to invoke a resource prepared by a promoter, when the resource resides on a network, include generating a resource location description for the resource. The resource location description includes a name of the contact. The promoter provides access to the resource at a location on the network according to the resource location description. The promoter also prepares a message to notify the contact about the resource location description for the resource. Thus a promoter (e.g., wholesaler, retailer, advocate, charity or politician) can provide a web site for each contact (e.g., customer, potential customer, viewer, supporter or voter) whom the promoter has identified. Each web site can have a domain name that prominently displays the contact's identity. The psychological benefit to the contact of finding a web site devoted to the contact and with the contact's own identity as part of the domain name conditions the contact favorably and increases the chances that the result sought by the promoter will be achieved.
TL;DR: Content analyses of selected sections of 63 web reviews published by eleventh grade students in a project-based science class show that student summaries were usually accurate, but had room for improvement especially in areas of comprehensiveness and level of detail.
Abstract: This research explores a new web-based curriculum idea, that of having students write and publish critical web “reviews” of scientific resources as a means of both practicing critical evaluation of web resources, and of making an authentic value-added contribution to the web. This paper presents content analyses of selected sections of 63 web reviews published by eleventh grade students in a project-based science class. Two aspects of critical evaluation are focused upon: summarization of content and evaluation of credibility. Content analyses show that student summaries were usually accurate, but had room for improvement especially in areas of comprehensiveness and level of detail. An ideal model of a content review is developed from analysis of a second set of reviews. When asked to evaluate credibility, students struggled to identify scientific evidence of claims in web resources, but analysis of web documents shows that this is often because such evidence is missing. Students could accurately determine the publishing source of web documents, but challenges arose in identifying potential biases. Recommendations for future iterations of this curriculum idea are presented throughout. A companion paper that will appear in this journal will examine how student reviews serve the function of social filtering on the web.
TL;DR: HONselect, developed by the Health On the Net Foundation, is a search integrator based on the hierarchical structure of medical terms in the Medical Subject Headings (MeSH) thesaurus.
Abstract: The Internet carries a huge and growing amount of medical and health information. Users wishing to find specific information on a medical condition have two options. Either they consult topic lists that are often manually built or they use general or specialised full-text search engines. In both cases, they have to sort through and analyse the results they retrieve and only a few of these results correspond to the information sought. Further, the results in most cases are presented merely as a list of web sites with no annotation or pre-information such as a basic description of the disease and its hierarchical structure, an indication of the medical context to which it belongs, a global overview of the term or links to additional sources of knowledge. To pull together disparate types of information like scientific articles, bibliographic references, news, medical illustrations and multimedia, the user until recently has had to undertake repetitive searches. This is no longer necessary. Health On the Net Foundation has developed HONselect, a new search integrator based on the hierarchical structure of medical terms of the Medical Subject Headings (MeSH). It not only provides a solid foundation in the form of the MeSH thesaurus [1–4], it also precludes time-wasting repetition by combining five information types—MeSH classification, authoritative scientific articles, health and medical news, web sites and multimedia— into one tool to focus and accelerate a user’s search. The current version of HONselect is bilingual, in English and French, but HON’s intention is to develop a multilingual tool.
TL;DR: An overview of multilingual information access issues in relation to the Web is offered, showing that the variety of scripts in which the written forms of the world’s languages appear creates major problems in searching, inputting, displaying and printing text in non‐roman scripts.
Abstract: The World Wide Web offers access to information resources in many languages. Certain developments facilitate multilingual exploitation of these resources. Some search engines, for example, allow the user to restrict retrieved sites to those in particular languages; some also provide the searcher with an interface in a chosen language. Many web sites also offer their information in several languages, one of which typically is English. Systran, a machine translation system available from the AltaVista search engine, can even translate a search statement or a retrieved page from one language to another. Despite these features, however, language also creates obstacles to full exploitation of web resources. Not all languages are catered for by these multilingual tools. Machine translation output typically is but a rough and ready version of a human translation. The variety of scripts in which the written forms of the world’s languages appear also creates major problems in searching, inputting, displaying and printing text in non‐roman scripts. The paper offers an overview of multilingual information access issues in relation to the Web.
TL;DR: In this article, the authors describe a system, method, and device for computer networking that includes receiving from a remote client a request for a web resource containing renderable and non-renderable data.
Abstract: A system, method, and device for computer networking. According to one embodiment of the invention, the method includes receiving from a remote client a request for a web resource containing renderable and non-renderable data. The method further includes filtering at least a portion of the non-renderable data from the requested web resource, thereby creating a modified web resource. The method also includes sending the modified web resource to the remote client. Non-renderable data may include whitespace, comments, hard returns, meta tags, keywords, or other data not used by a browser to present a web page.
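The filtering step can be sketched with two regular expressions that drop HTML comments and collapse whitespace runs before the page is sent. This is a deliberately naive illustration, not the patented method: it would damage content inside `<pre>`, `<textarea>` or inline scripts, and it does not touch meta tags or keywords.

```python
import re

COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
WHITESPACE_RUN = re.compile(r"\s+")

def strip_non_renderable(html):
    """Drop comments and collapse whitespace runs to a single space."""
    html = COMMENT.sub("", html)
    return WHITESPACE_RUN.sub(" ", html).strip()

page = (
    "<html>\n"
    "  <!-- build 17 -->\n"
    "  <body>\n"
    "    <p>Hello,   world</p>\n"
    "  </body>\n"
    "</html>\n"
)
smaller = strip_non_renderable(page)
```

Collapsing runs to one space, rather than deleting whitespace outright, is a conservative choice: browsers already render a run of whitespace in normal flow as a single space, so the visible page should be unchanged while fewer bytes are transferred.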
TL;DR: Fifteen of the highest volume medical web sites are described in this paper, and attributes of the ideal site were categorized, and include a robust privacy and disclosure statement with an emphasis on education and an appropriate role for advertising.
Abstract: In 1998, 22 million individuals reported surfing the web for medical information, and this number will increase to over 30 million by 2000. Fifteen of the highest volume medical web sites are described in this paper. Sponsorship and/or ownership of the fifteen sites varied. The government sponsors one, and some are the products of well-known educational institutions. One site is supported by a consumer health organization, and the American Medical Association was in the top 15. However, the most common owners are commercial, for-profit businesses. Attributes of the ideal site were categorized, and include a robust privacy and disclosure statement with an emphasis on education and an appropriate role for advertising. The covering of Complementary and Alternative Medicine (CAM) should be in a balanced and unbiased manner. There has to be an emphasis on knowledge based evidence as opposed to testimonials, and sources should be timely and reviewed. Bibliographies of authors need to be available. Hyperlinking to other web resources is valuable, as even the largest of sites cannot come close to covering all of medicine.
TL;DR: In this paper, a Web resource comprising a plurality of user-selectable hyperlinks to a plurality of Web resources is provided to a client node via a computer network.
Abstract: A Web resource comprising a plurality of user-selectable hyperlinks to a plurality of Web resources is provided to a client node via a computer network. The Web resource comprises a plurality of advertiser-usable variables within at least one script. The advertiser-usable variables include a first advertiser-usable variable specific to a first Web resource and a second advertiser-usable variable specific to a second Web resource. An advertisement server node reads the advertiser-usable variables and stores same either at the client node or at the advertisement node. After a first hyperlink is user-selected from the Web resource, the advertisement server node retrieves the first advertiser-usable variable corresponding to the first Web resource. An advertisement is selected from a plurality of advertisements based on the first advertiser-usable variable. The advertisement is provided to the client node to display with the first Web resource.
TL;DR: An indispensable resource guide, this newly expanded book describes how to make use of conventional and assistive technologies, and explains how to determine needs.
Abstract: From the Publisher:
This newly expanded book describes how to make use of conventional and assistive technologies, and explains how to determine needs. An indispensable resource guide, it lists funding sources, government programs, publications, and technology vendors.
TL;DR: The role of events and processes in more expressive metadata and how simple resource-centric models, such as DCMES, are not equipped to express these semantics are discussed.
Abstract: The Dublin Core Metadata Element Set (DCMES) grew out of a recognized need for improved resource discovery of web resources. Initial work on the DCMES focused on the requirement of simplicity: "ordinary" users should be able to formulate descriptive records based on a relatively simple schema (fifteen free-text elements). Over the years there has been a movement within the Dublin Core community to use the DCMES for more complex and specialized resource description tasks and, correspondingly, develop mechanisms for incorporating such complexity within the basic element set. This work has generally been called qualified Dublin Core. We examine the notion of accommodating complexity in a simple metadata model and argue that the dual requirements are incompatible. We discuss the role of events and processes in more expressive metadata and how simple resource-centric models, such as DCMES, are not equipped to express these semantics.
TL;DR: In this article, the authors propose a four-step model, adapted from an earlier one by DeBloois & Alder (1973), with four levels: awareness, faculty support, faculty skills, and departmental effort.
Abstract: Our premise, that involvement with Web-based instruction proceeds best incrementally, has been explored in a variety of examples. A range of options, as well as a strategy for dealing with them systematically, has been presented. Our four-step model, adapted from an earlier one by DeBloois & Alder (1973), has four levels: awareness, faculty support, faculty skills, and departmental effort. Practitioners in higher education in the 21st Century will need to capitalize on creative ways to build Web resources, used one at a time, into existing courses, leading to enhanced learning opportunities for students and faculty alike.
TL;DR: The systems currently used by libraries to provide their users with access to free web resources are described and analysed, and the major obstacles to identifying and retrieving these resources are explored.
Abstract: This article describes and analyses the systems currently used by libraries to provide their users with access to free web resources and it explores the major obstacles for identifying and retrieving these resources. The ways in which libraries organise web resources can be described by three models: making lists, creating databases, and incorporating them into the opac. This study concentrates on the description, identification and characteristics of the latter two approaches, focussing on major experiences with special attention given to the following features in each case: selection criteria, types of descriptions, indexing and classification systems, information retrieval methods, and maintenance policies. Finally, current trends are discussed and considerations are offered as to how Spanish libraries might approach the organisation of web resources.
TL;DR: In this paper, access to and organization of Web-based catalogs, networked databases, CD-ROM databases, Web resources, and print resources are studied, with a focus on the integration of the resources, the consistency of annotations, and whether any instruction or assistance is provided on subjects or databases.
Abstract: This article reviews 63 academic business libraries or business collections on the Web. The focus is a study of the access to and organization of Web-based catalogs, networked databases, CD-ROM databases, Web resources, and print resources. Special attention is given to the integration of the resources, the consistency of annotations, and whether any instruction or assistance is provided on subjects or databases.
TL;DR: The Web way-back machine described here, currently under development by the author, utilizes collections of historical Web resources like the ones provided by the Internet Archive to allow online, read-only access to the revisioned resources.
Abstract: One of the deficiencies of the World Wide Web is that the Web does not have a memory. Web resources always display one revision only, namely the latest one. In addition, once a Web resource moves from one location (i.e., URL) to another, the resource at the original location ceases to exist.
Since the Web does not provide a mechanism to allow access to the revision history of resources, services like the Internet Archive have sprung into action to collect the history of important Web resources.
The Web way-back machine described here, currently under development by the author, utilizes collections of historical Web resources like the ones provided by the Internet Archive to allow online, read-only access to the revisioned resources.
TL;DR: An overview of Internet prostate cancer resources is provided, presenting a brief history of the Internet and its ubiquitous application, the World Wide Web, with a discussion of search engines, the utilization of web resources by physicians (including evaluating web sites, and a highly selected list of noteworthy sites), and the growing use of electronic mail in the patient-physician relationship.
Abstract: The hallmark of the age of personal computers is the ability to obtain information and communicate with others on nearly any subject using a computer connected to the global network known as the Internet. Information on many diseases is available on the World Wide Web. Information on prostate cancer, including its characteristics, diagnosis, and treatment, is abundantly present on the Internet. This article provides an overview of Internet prostate cancer resources, presenting a brief history of the Internet and its ubiquitous application, the World Wide Web, with a discussion of search engines, the utilization of web resources by physicians (including evaluating web sites, and a highly selected list of noteworthy sites), and the growing use of electronic mail (e-mail) in the patient-physician relationship.
TL;DR: A method, article of manufacture, system and data structure provide a chained or compound URL that includes a base URL and all subsequent navigation steps necessary to access a desired resource, such as a responsively instantiated resource or framed resource.
Abstract: A method, article of manufacture, system and data structure providing a chained or compound URL including a base URL and all subsequent navigation steps necessary to access a desired resource such as a responsively instantiated resource or framed resource.
TL;DR: The project has been finalized, the package works, and its performance has been tested both objectively and subjectively; it demonstrates superiority over similar solutions from the open literature.
Abstract: The main goal of the project called Socratenon was to build a new Web-based training environment that would go beyond traditional ones. In the open literature, there are several solutions trying to accomplish the same to a certain degree. Some of them are nothing more than plain virtual textbooks that only flip pages on mouse-clicks. More sophisticated techniques include user modeling in order to personalize the content for the user, adaptive interfaces, intelligent agents for improved assistance and search, neural networks and case-based reasoning for building intelligent back-ends, etc. In general, many existing learning environments lack interaction, full utilization of Web resources is scarce, while solutions utilizing a combination of all above are practically non-existent or in development. This paper tries to merge the potential of the new Internet technologies and the latest developments in cognitive sciences, on the one hand, with the comfort of learning at the most suitable time and in the most suitable place, on the other hand. The project has been finalized, the package works, and its performance has been tested both objectively and subjectively; it demonstrates superiority over similar solutions from the open literature. Its complexity is such that it can fit even the widespread PC platforms, although it demonstrates the best performance on state-of-the-art corporate platforms.
TL;DR: Creating DC metadata records of websites selected for seventy-five academic disciplines, Washtenaw investigates the use-ability of Dublin Core and the CORC system, as well as their implication for the future of Internet searching.
Abstract: In January 1999, OCLC embarked on a rapid prototyping project known as CORC, Cooperative Online Resource Catalog. It aims to develop a web resource catalog based on Dublin Core (DC), a metadata standard, which complies with W3C's Resource Description Framework (RDF), SML, and HTML. Washtenaw Community College Learning Resource Center is the only community college library joining eighty-four other research participants in this eighteen-month collaborative project. Creating DC metadata records of websites selected for seventy-five academic disciplines, Washtenaw investigates the use-ability of Dublin Core and the CORC system, as well as their implication for the future of Internet searching.