Data discovery algorithm for scientific data grid environment
Azizol Abdullah, Mohamed Othman, Md. Nasir Sulaiman, Hamidah Ibrahim, Abu Talib Othman
- 01 Nov 2005
- Vol. 65, Iss: 11, pp 1429-1434
TL;DR: This research work presents the scientific data grid as a large P2P-based distributed system model and studies various discovery algorithms for locating data sets in a data grid system based on the P2P architecture.
Abstract: In modern scientific computing communities, scientists are involved in managing massive amounts of very large data collections in a geographically distributed environment. Research in the area of grid computing has given us various ideas and solutions to address these requirements. A data grid mostly deals with large computational problems and provides geographically distributed resources for large-scale data-intensive applications that generate large data sets. Peer-to-peer (P2P) networks have also become a major research topic over the last few years. In a distributed P2P system, a discovery algorithm is required to locate specific information, applications, or users within the system. In this research work, we present our scientific data grid as a large P2P-based distributed system model. Using this model, we study various discovery algorithms for locating data sets in a data grid system. The algorithms we studied are based on the P2P architecture. We investigate these algorithms using our Grid Simulator, developed using PARSEC. In this paper, we illustrate our scientific data grid model and our Grid Simulator. We then analyze the performance of the discovery algorithms in terms of their average number of hops, success rate, and bandwidth consumption.
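The abstract does not name the discovery algorithms studied, so the following is only a minimal sketch of one widely used P2P discovery technique, TTL-limited flooding, together with the three metrics the abstract mentions. Everything here is an assumption made for illustration: the ring-plus-shortcuts overlay, the function names (`build_ring_with_shortcuts`, `flood_search`, `simulate`), the parameter values, and the use of message count as a rough stand-in for bandwidth consumption. It is not the authors' Grid Simulator, which was built with PARSEC.

```python
import random
from collections import deque


def build_ring_with_shortcuts(n, extra_links, rng):
    """Hypothetical overlay: a ring of n peers plus random shortcut links."""
    neighbors = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
    for _ in range(extra_links):
        a, b = rng.sample(range(n), 2)
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors


def flood_search(neighbors, holders, start, ttl):
    """TTL-limited flooding, modeled as a breadth-first search.

    Returns (hops, messages). hops is None when no holder is reached
    within ttl hops. Simplification: the flood stops at the first hit,
    whereas a real flood would keep propagating until the TTL expires.
    """
    if start in holders:
        return 0, 0
    visited = {start}
    frontier = deque([(start, 0)])
    messages = 0
    while frontier:
        node, depth = frontier.popleft()
        if depth == ttl:
            continue  # TTL exhausted: this peer does not forward the query
        for peer in neighbors[node]:
            messages += 1  # every forwarded query counts, duplicates included
            if peer in visited:
                continue
            visited.add(peer)
            if peer in holders:
                return depth + 1, messages
            frontier.append((peer, depth + 1))
    return None, messages


def simulate(n=200, extra_links=100, replicas=5, queries=500, ttl=4, seed=1):
    """Run random queries and report success rate, average hops, and messages."""
    rng = random.Random(seed)
    neighbors = build_ring_with_shortcuts(n, extra_links, rng)
    holders = set(rng.sample(range(n), replicas))  # peers storing the data set
    hops_found, total_messages = [], 0
    for _ in range(queries):
        hops, messages = flood_search(neighbors, holders, rng.randrange(n), ttl)
        total_messages += messages
        if hops is not None:
            hops_found.append(hops)
    return {
        "success_rate": len(hops_found) / queries,
        "avg_hops": sum(hops_found) / len(hops_found) if hops_found else None,
        "avg_messages": total_messages / queries,
    }


if __name__ == "__main__":
    print(simulate())
```

Raising `ttl` in this sketch generally trades a higher success rate for more messages per query, which is the kind of trade-off the three metrics are meant to expose.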
Citations
Enhanced Dynamic Hierarchical Replication and Weighted Scheduling Strategy in Data Grid
TL;DR: This paper proposes a novel job scheduling strategy, the Weighted Scheduling Strategy (WSS), which uses hierarchical scheduling to reduce the search time for an appropriate computing node, and a dynamic data replication strategy, an enhanced version of the Dynamic Hierarchical Replication strategy, which improves file access time.
30
Survey on Big Data Security Framework
Muthuraman Thangaraj, S. Balamurugan
- 21 Aug 2017
TL;DR: Big Data are large data sets, structured or unstructured and in various formats, that are mined for information extraction and decision-making; security in Big Data is one of the interesting areas being researched.
5
Data resource discovery model based on hybrid architecture in data grid environment
TL;DR: This paper designs a structured logic resource tree in each domain from the P2P system in order to effectively alleviate the load on the super‒node, and proposes a query recording learning algorithm based on this hybrid architecture to reduce traffic in the network and greatly shorten the response time.
5
A Semantic Concast service for data discovery, aggregation and processing on NDN
TL;DR: Multiple types and strategies of data aggregation and processing for combining and processing the positive data and suppressing the negative, futile data, as well as a determination of response completeness are introduced for enhancing relevant results recall and sharing.
1
Agent based optimized replica management in data grids
Priyanka Vashisht, Vijay Kumar
- 06 Mar 2020
TL;DR: This paper addresses replica management issues such as availability, placement and consistency in a distributed grid environment by proposing RCPC, which dynamically creates an optimal number of replicas and places them on nodes having minimum placement cost.
1
References
Chord: A scalable peer-to-peer lookup service for internet applications
Ion Stoica, Robert Morris, David R. Karger, M. Frans Kaashoek, Hari Balakrishnan
- 27 Aug 2001
TL;DR: Results from theoretical analysis, simulations, and experiments show that Chord is scalable, with communication cost and the state maintained by each node scaling logarithmically with the number of Chord nodes.
11.2K
A scalable content-addressable network
Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard M. Karp, Scott Shenker
- 27 Aug 2001
TL;DR: The concept of a Content-Addressable Network (CAN) as a distributed infrastructure that provides hash table-like functionality on Internet-like scales is introduced and its scalability, robustness and low-latency properties are demonstrated through simulation.
7.2K
Globus: a Metacomputing Infrastructure Toolkit
Ian Foster, Carl Kesselman
- 01 Jun 1997
TL;DR: The Globus system is intended to achieve a vertically integrated treatment of application, middleware, and network, an integrated set of higher level services that enable applications to adapt to heterogeneous and dynamically changing metacomputing environments.
The D0 Experiment Data Grid - SAM
L. Lueking, L. Loebel-Carpenter, W. Merritt, C. D. Moore, Ruth Pordes, I. Terekhov, Sinisa Veseli, Matt Vranicar, Steve White, V. White
- 12 Nov 2001
TL;DR: The D0 emphasis is to develop the more sophisticated global grid job, resource management, authentication and information services needed to fully meet the needs of the experiment during the next 6 years of data taking and analysis.
9
Condor-G: a computation management agent for multi-institutional grids
James W. Frey, Todd Tannenbaum, Miron Livny, Ian Foster, Steven Tuecke
- 07 Aug 2001
TL;DR: It is asserted that Condor-G can serve as a general-purpose interface to Grid resources, for use by both end users and higher-level program development tools.