Open Access Book
Cloud Computing: Data-Intensive Computing and Scheduling
Frederic Magoules,Jie Pan,Fei Teng +2 more
- 20 Sep 2012
34
TL;DR: Cloud Computing: Data-Intensive Computing and Scheduling explores the evolution of classical techniques and describes completely new methods and innovative algorithms that demonstrate how cloud computing can meet business requirements and serve as the infrastructure of multidimensional data analysis applications.
Abstract: As more and more data is generated at a faster-than-ever rate, processing large volumes of data is becoming a challenge for data analysis software. Addressing performance issues, Cloud Computing: Data-Intensive Computing and Scheduling explores the evolution of classical techniques and describes completely new methods and innovative algorithms. The book delineates many concepts, models, methods, algorithms, and software used in cloud computing. After a general introduction to the field, the text covers resource management, including scheduling algorithms for real-time tasks and practical algorithms for user bidding and auctioneer pricing. It next explains approaches to data analytical query processing, including pre-computing, data indexing, and data partitioning. Applications of MapReduce, a new parallel programming model, are then presented. The authors also discuss how to optimize multiple group-by query processing and introduce a MapReduce real-time scheduling algorithm. A useful reference for studying and using MapReduce and cloud computing platforms, this book presents various technologies that demonstrate how cloud computing can meet business requirements and serve as the infrastructure of multidimensional data analysis applications.
Citations
Scheduling-Guided Automatic Processing of Massive Hyperspectral Image Classification on Cloud Computing Architectures
TL;DR: An acceleration method for HSI classification that relies on scheduling metaheuristics to automatically and optimally distribute the workload of HSI applications across multiple computing resources on a cloud platform is proposed.
93
An Efficient and Scalable Framework for Processing Remotely Sensed Big Data in Cloud Computing Environments
Jin Sun,Yi Zhang,Zebin Wu,Yaoqin Zhu,Xianliang Yin,Zhongzheng Ding,Zhihui Wei,Javier Plaza,Antonio Plaza +8 more
TL;DR: A new big data framework for processing massive amounts of remote sensing images on cloud computing platforms that incorporates task scheduling strategy to further exploit the parallelism during the distributed processing stage and achieves promising results in terms of execution time.
85
Recent Developments in Parallel and Distributed Computing for Remotely Sensed Big Data Processing
Zebin Wu,Jin Sun,Yi Zhang,Zhihui Wei,Jocelyn Chanussot +4 more
- 17 Jun 2021
TL;DR: This article surveys state-of-the-art methods for processing remotely sensed big data and thoroughly investigates existing parallel implementations on diverse popular high-performance computing platforms, discussing them in terms of capability, scalability, reliability, and ease of use.
75
Data as a Service (DaaS) for Sharing and Processing of Large Data Collections in the Cloud
Olivier Terzo,Pietro Ruiu,Enrico M. Bucci,Fatos Xhafa +3 more
- 03 Jul 2013
TL;DR: A DaaS approach for intelligent sharing and processing of large data collections is proposed, with the aim of abstracting the data location (by making it relevant to the needs of sharing and accessing) and fully decoupling the data from its processing.
Hybrid Task Scheduling Method for Cloud Computing by Genetic and DE Algorithms
Amin Kamalinia,Ali Ghaffari +1 more
TL;DR: This paper proposes a hybrid meta-heuristic method using the HEFT algorithm that outperforms three other heuristic and genetic algorithms in terms of makespan on randomly generated Directed Acyclic Graphs (DAGs).
41
References
MapReduce: simplified data processing on large clusters
Jeffrey Dean,Sanjay Ghemawat +1 more
TL;DR: This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.
Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment
C. L. Liu,James W. Layland +1 more
TL;DR: The problem of multiprogram scheduling on a single processor is studied from the viewpoint of the characteristics peculiar to the program functions that need guaranteed service and it is shown that an optimum fixed priority scheduler possesses an upper bound to processor utilization.
The vision of autonomic computing
TL;DR: A 2001 IBM manifesto noted the almost impossible difficulty of managing current and planned computing systems, which require integrating several heterogeneous environments into corporate-wide computing systems that extend into the Internet.
7.2K
Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility
TL;DR: This paper defines Cloud computing and provides the architecture for creating Clouds with market-oriented resource allocation by leveraging technologies such as Virtual Machines (VMs), and provides insights on market-based resource management strategies that encompass both customer-driven service management and computational risk management to sustain Service Level Agreement (SLA) oriented resource allocation.
6.3K
Bigtable: A Distributed Storage System for Structured Data
Fay W. Chang,Jeffrey Dean,Sanjay Ghemawat,Wilson C. Hsieh,Deborah A. Wallach,Michael Burrows,Tushar Deepak Chandra,Andrew Fikes,Robert E. Gruber +8 more
TL;DR: The simple data model provided by Bigtable is described, which gives clients dynamic control over data layout and format, and the design and implementation of Bigtable are described.
3.5K