Benchmarking Smart Meter Data Analytics
Xiufeng Liu, Lukasz Golab, Wojciech Golab, Ihab F. Ilyas
- 01 Jan 2015
- pp 385-396
TL;DR: This paper proposes a performance benchmark that includes common data analysis tasks on smart meter data, and presents an algorithm for generating large realistic data sets from a small seed of real data.
Abstract: Smart electricity meters have been replacing conventional meters worldwide, enabling automated collection of fine-grained (every 15 minutes or hourly) consumption data. A variety of smart meter analytics algorithms and applications have been proposed, mainly in the smart grid literature, but the focus thus far has been on what can be done with the data rather than how to do it efficiently. In this paper, we examine smart meter analytics from a software performance perspective. First, we propose a performance benchmark that includes common data analysis tasks on smart meter data. Second, since obtaining large amounts of smart meter data is difficult due to privacy issues, we present an algorithm for generating large realistic data sets from a small seed of real data. Third, we implement the proposed benchmark using five representative platforms: a traditional numeric computing platform (Matlab), a relational DBMS with a built-in machine learning toolkit (PostgreSQL/MADLib), a main-memory column store (“System C”), and two distributed data processing platforms (Hive and Spark). We compare the five platforms in terms of application development effort and performance on a multi-core machine as well as a cluster of 16 commodity servers. We have made the proposed benchmark and data generator freely available online.
Citations
•Posted Content
The Role of Data Analysis in the Development of Intelligent Energy Networks
TL;DR: More comprehensive data analysis methods are needed to handle the increasing amount of data and to mine more valuable information in intelligent energy networks.
Smart Meter Data Analytics: Systems, Algorithms, and Benchmarking
TL;DR: This article designs a performance benchmark that includes common smart meter analytics tasks, describes an implemented framework for online anomaly detection, and presents an algorithm for generating large realistic datasets from a small seed of real data.
71
Understanding energy demand behaviors through spatio-temporal smart meter data analysis
TL;DR: Wang et al. propose a spatio-temporal visual analysis approach for discovering urban energy consumption patterns, in order to identify energy-saving potential, plan energy supply, and improve energy efficiency.
51
Data driven model for heat load prediction in buildings connected to District Heating by using smart heat meters
TL;DR: This paper presents a data-driven model for the characterization and prediction of heating demand in buildings connected to a district heating (DH) network. The model uses four climatic variables (outdoor ambient temperature, global solar radiation, and wind speed and direction), combined with time factors and data from smart meters.
48
•Posted Content
Regression-based Online Anomaly Detection for Smart Grid Data
Xiufeng Liu, Per Sieverts Nielsen
TL;DR: This paper empirically evaluates the system and the detection algorithm, and the results show the effectiveness and the scalability of the proposed lambda detection system.
38
References
•Proceedings Article
Spark: cluster computing with working sets
Matei Zaharia, Mosharaf Chowdhury, Michael J. Franklin, Scott Shenker, Ion Stoica
- 22 Jun 2010
TL;DR: Spark can outperform Hadoop by 10x in iterative machine learning jobs, and can be used to interactively query a 39 GB dataset with sub-second response time.
Hive: a warehousing solution over a map-reduce framework
Ashish Thusoo, Joydeep Sen Sarma, Namit Jain, Zheng Shao, Prasad Chakka, Suresh Anthony, Hao Liu, Pete Wyckoff, Raghotham Murthy
- 01 Aug 2009
TL;DR: Hadoop is a popular open-source map-reduce implementation which is being used as an alternative to store and process extremely large data sets on commodity hardware.
Short-term load forecasting
George Gross, Francisco D. Galiana
- 01 Dec 1987
TL;DR: In this paper, the authors discuss the state of the art in short-term load forecasting (STLF), that is, the prediction of the system load over an interval ranging from one hour to one week.
Comparisons among clustering techniques for electricity customer classification
TL;DR: Various techniques able to reduce the size of the clustering input data set are discussed and compared, so that the distribution service provider needs to store only a relatively small amount of data in its database for customer classification purposes.
567
An electric energy consumer characterization framework based on data mining techniques
TL;DR: This paper presents an electricity consumer characterization framework based on a knowledge discovery in databases (KDD) procedure, supported by data mining techniques, applied on the different stages of the process.