TL;DR: A novel study of the properties of Bitcoin's PoW, the challenges of a more “rational” solution such as PoX, and a comprehensive approach for PoX are provided.
Abstract: Cryptocurrency and blockchain technologies have gained wide adoption since the introduction of Bitcoin, being distributed, authority-free, and secure. Proof of Work (PoW) is at the heart of blockchain's security, asset generation, and maintenance. Although simple and secure, a hash-based PoW like Bitcoin's puzzle is often referred to as “useless”, and the intensive computations it requires are considered a “waste” of energy. A myriad of Proof of “something” alternatives have been proposed to mitigate energy consumption; however, they either introduced new security threats and limitations, or the “work” remained far from being really “useful”. In this work, we introduce Proof of eXercise (PoX): a sustainable alternative to PoW where an eXercise is a real-world matrix-based scientific computation problem. We provide a novel study of the properties of Bitcoin's PoW, the challenges of a more “rational” solution such as PoX, and we suggest a comprehensive approach for PoX.
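For readers unfamiliar with the hash-based puzzle this abstract refers to, the following is a minimal sketch of a Bitcoin-style proof of work. It is a simplification under stated assumptions: the header is an arbitrary byte string rather than a real 80-byte block header, difficulty is expressed as a number of leading zero bits rather than Bitcoin's compact target encoding, and the hash is read as a big-endian integer. The names `mine`, `verify`, and `difficulty_bits` are illustrative and do not come from the paper.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256 twice, as Bitcoin does for block hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def verify(header: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Check that the hash of header+nonce falls below the target."""
    target = 1 << (256 - difficulty_bits)  # simplified target (assumption)
    digest = double_sha256(header + nonce.to_bytes(4, "little"))
    return int.from_bytes(digest, "big") < target


def mine(header: bytes, difficulty_bits: int, max_nonce: int = 2**32):
    """Brute-force search for a nonce; returns None if none is found."""
    for nonce in range(max_nonce):
        if verify(header, nonce, difficulty_bits):
            return nonce
    return None
```

Finding the nonce takes about 2^difficulty_bits hash evaluations on average, while checking it takes one. That asymmetry is what makes the puzzle usable as a proof, and the search effort is the part that critics call wasted.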
TL;DR: This paper asks whether distributed deep learning computation on WAN-connected devices is feasible in spite of the traffic caused by learning tasks, and shows that such a setup raises some important challenges, most notably the ingress traffic that the servers hosting the up-to-date model have to sustain.
Abstract: A large portion of data mining and analytic services use modern machine learning techniques, such as deep learning. The state-of-the-art results by deep learning come at the price of an intensive use of computing resources. The leading frameworks (e.g., TensorFlow) are executed on GPUs or on high-end servers in datacenters. At the other end, there is a proliferation of personal devices with possibly free CPU cycles; this can enable services to run in users' homes, embedding machine learning operations. In this paper, we ask the following question: Is distributed deep learning computation on WAN-connected devices feasible, in spite of the traffic caused by learning tasks? We show that such a setup raises some important challenges, most notably the ingress traffic that the servers hosting the up-to-date model have to sustain. In order to reduce this stress, we propose AdaComp, a novel algorithm for compressing worker updates to the model on the server. Applicable to stochastic gradient descent based approaches, it combines efficient gradient selection and learning rate modulation. We then experiment and measure the impact of compression, device heterogeneity and reliability on the accuracy of learned models, with an emulator platform that embeds TensorFlow into Linux containers. We report a reduction of the total amount of data sent by workers to the server by two orders of magnitude (e.g., 191-fold reduction for a convolutional network on the MNIST dataset), when compared to a standard asynchronous stochastic gradient descent, while preserving model accuracy.
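To illustrate how sending only selected gradient entries can shrink worker-to-server traffic, here is a minimal sketch of generic top-k gradient sparsification. It is not the AdaComp selection rule, which the abstract does not specify, and it does not model the learning rate modulation; the function names and the fixed `k` are assumptions made for illustration.

```python
def sparsify_top_k(gradient, k):
    """Keep the k largest-magnitude entries as sorted (index, value) pairs."""
    ranked = sorted(range(len(gradient)),
                    key=lambda i: abs(gradient[i]), reverse=True)
    return sorted((i, gradient[i]) for i in ranked[:k])


def apply_sparse_update(weights, sparse_grad, learning_rate):
    """Server side: apply an SGD step using only the transmitted entries."""
    updated = list(weights)
    for index, value in sparse_grad:
        updated[index] -= learning_rate * value
    return updated
```

With a gradient of four entries and `k = 2`, a worker would transmit two (index, value) pairs instead of four values. The saving only becomes substantial when `k` is a small fraction of the model size.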
TL;DR: This paper presents the first study of Android Ad-Blocking apps (or Ad-Blockers), analysing 97 Ad-Blocking mobile apps extracted from a corpus of more than 1.5 million Android apps on Google Play.
Abstract: Online advertisers, third party trackers and analytics services are constantly tracking user activities as they access web services through their web browsers or mobile apps. While web browser plugins that disable and block Ads (and often the associated tracking/analytics scripts), e.g. AdBlock Plus[3], have been well studied and are relatively well understood, an emerging new category of apps in the mobile tracking eco-system, referred to as mobile Ad-Blocking apps, has received very little to no attention. With the recent significant increase in the number of mobile Ad-Blockers and the exponential growth of mobile Ad-Blocking apps' popularity, this paper aims to fill the gap and study this new category of players in the mobile ad/tracking eco-system. This paper presents the first study of Android Ad-Blocking apps (or Ad-Blockers), analysing 97 Ad-Blocking mobile apps extracted from a corpus of more than 1.5 million Android apps on Google Play. While the main (declared) purpose of the apps is to block advertisements and mobile tracking services, our data analysis revealed the paradoxical presence of third-party tracking libraries and permissions to access sensitive resources on users' mobile devices, as well as the existence of embedded malware code within some mobile Ad-Blockers. We also analysed user reviews and found that even though a fraction of users raised concerns about the privacy and the actual performance of the mobile Ad-Blocking apps, most of the apps still attract a relatively high rating.
TL;DR: An improved data fusion method, IICKPDA, is proposed for privacy protection in wireless sensor networks, based on the ICKPDA algorithm, which improves the data fusion precision and the fusion efficiency of the intermediate fusion node.
Abstract: Aiming at the privacy and security of data acquisition and monitoring in wireless sensor networks, an improved data fusion method, IICKPDA, is proposed for privacy protection in wireless sensor networks. The method is based on the ICKPDA algorithm, which optimizes the deadline between the same layer and the different layers, which can be randomly selected by the adjacent nodes of the data fragments to be transmitted, and also the non-leaf nodes which only the data is not collected to improve the characteristics of the data. The improved algorithm avoids the redundancy of the intermediate fusion process and protects the collected monitoring data, which improves the data fusion precision and the fusion efficiency of the intermediate fusion node. Theory and related experiments demonstrate the feasibility of the method.
TL;DR: This work proposes a more complete attack model, designs an advanced IFA, and shows the efficiency of this novel attack scheme by extensively assessing some of the state-of-the-art countermeasures.
Abstract: Named-Data Networking (NDN) has emerged as a clean-slate Internet proposal on the wave of Information-Centric Networking. Although the NDN data plane seems to offer many advantages, e.g., native support for multicast communications and flow balance, it also makes the network infrastructure vulnerable to a specific DDoS attack, the Interest Flooding Attack (IFA). In IFAs, a botnet issuing unsatisfiable content requests can be set up effortlessly to exhaust routers' resources and cause a severe performance drop for legitimate users. So far, several countermeasures have addressed this security threat; however, their efficacy was proved by means of simplistic assumptions on the attack model. Therefore, we propose a more complete attack model and design an advanced IFA. We show the efficiency of our novel attack scheme by extensively assessing some of the state-of-the-art countermeasures. Further, we release the software to perform this attack as an open-source tool to help design more robust future defense mechanisms.
TL;DR: This work presents three redundancy patterns to build a reliable path from a source to a destination: the well-known two node-Disjoint paths, a Triangular pattern, and a Braided pattern.
Abstract: Time Slotted Channel Hopping (TSCH) networks are emerging as a promising technology for the Internet of Things and the Industry 4.0 where ease of deployment, reliability, short latency, flexibility and adaptivity are required. Our goal is to improve reliability of data gathering in such wireless sensor networks. We present three redundancy patterns to build a reliable path from a source to a destination. The first one is the well-known two node-Disjoint paths. The second one is based on a Triangular pattern, and the third one on a Braided pattern. A comparative evaluation is carried out to analyze the reliability achieved, the number of failures tolerated, the number of message copies generated and the energy consumed by each node to ensure that at least one copy of the message is delivered to the destination. These results are validated by simulations.
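As a rough worked example of why path redundancy improves reliability, the sketch below computes the delivery probability of the two node-Disjoint paths pattern. It assumes every link succeeds independently with the same probability `p`, each path has the same hop count, and there are no retransmissions; these assumptions and the function names are illustrative, not taken from the paper. The Triangular and Braided patterns are not modeled because the abstract does not define their topology.

```python
def single_path_delivery(p: float, hops: int) -> float:
    """Probability that one copy crosses all links of a single path."""
    return p ** hops


def disjoint_paths_delivery(p: float, hops: int, n_paths: int = 2) -> float:
    """Probability that at least one copy arrives over n disjoint paths."""
    all_paths_fail = (1 - single_path_delivery(p, hops)) ** n_paths
    return 1 - all_paths_fail
```

Under these assumptions the redundant pattern always delivers at least as reliably as a single path, at the cost of one extra message copy per additional path.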
TL;DR: This paper focuses on the design and the integration of a novel Link reliable and Trust aware model into the RPL protocol, which aims to ensure Trust among entities and to provide QoS guarantees during the construction and the maintenance of the network routing topology.
Abstract: Internet of Things (IoT) is characterized by heterogeneous devices that interact with each other on a collaborative basis to fulfill a common goal. In this scenario, some of the deployed devices are expected to be constrained in terms of memory usage, power consumption and processing resources. To address the specific properties and constraints of such networks, a complete stack of standardized protocols has been developed, among them the Routing Protocol for Low-Power and Lossy Networks (RPL). However, this protocol is exposed to a large variety of attacks from inside the network itself. To fill this gap, this paper focuses on the design and the integration of a novel Link reliable and Trust aware model into the RPL protocol. Our approach aims to ensure Trust among entities and to provide QoS guarantees during the construction and the maintenance of the network routing topology. Our model targets both node and link Trust and follows a multidimensional approach to enable an accurate Trust value computation for IoT entities. To prove the efficiency of our proposal, it has been implemented and tested successfully within an IoT environment. A set of experiments has been carried out to show the high accuracy level of our system.
TL;DR: Using principles of Software Defined Networks (SDN), this paper presents an analysis of video streaming for military surveillance in which multiple UAVs are employed as data providers through an SDN-enabled network, with promising results.
Abstract: Video streaming is an important service provided by surveillance systems to enhance situation awareness. However, in military systems, data acquisition heavily depends on the network infrastructure. In this application domain, units are spread and the distance between the sources of data and the decision makers may be very large. In the case of video streaming, the demand for high network throughput poses some extra requirements on the network. Considering the mobility patterns of the military units and the diversity of the new generations of sensors, especially those used by Unmanned Aerial Vehicles (UAV), the configuration and the management of the network must be so dynamic and so sensitive to data flow parameters that manual configuration is not acceptable. For this reason, the capability of the network to configure itself to offer the necessary Quality of Service is a must. Using principles of Software Defined Networks (SDN), this paper presents an analysis of video streaming for military surveillance in which multiple UAVs are employed as data providers through an SDN-enabled network, with promising results.
TL;DR: A novel approach to integrate plugged-in electric vehicles, electric vehicle supply equipment, and smart grid infrastructure using state-of-the-art software-defined network technology has the potential to provide unprecedented flexibility to the smart grid communication network.
Abstract: This paper proposes a novel approach to integrate plugged-in electric vehicles (PEVs), electric vehicle supply equipment (EVSEs), and smart grid (SG) infrastructure using state-of-the-art software-defined network (SDN) technology, which has the potential to provide unprecedented flexibility to the smart grid communication network. We further present set-cardinality based search algorithms for assigning charging stations to PEVs to reduce their average charging time. Simulation results show a considerable improvement in the average charging time of PEVs as compared to conventional minimum distance based charging station selection.
TL;DR: RANSOMSAFEDROID is protected from malware by leveraging the ARM TrustZone extension and running in the secure world; it periodically backs up files to a secure local persistent partition and pushes these backups to external storage to protect them from ransomware.
Abstract: The growing popularity of Android and the increasing amount of sensitive data stored in mobile devices have led to the dissemination of Android ransomware. Ransomware is a class of malware that makes data inaccessible by blocking access to the device or, more frequently, by encrypting the data; to recover the data, the user has to pay a ransom to the attacker. A solution for this problem is to back up the data. Although backup tools are available for Android, these tools may be compromised or blocked by the ransomware itself. This paper presents the design and implementation of RANSOMSAFEDROID, a TrustZone based backup service for mobile devices. RANSOMSAFEDROID is protected from malware by leveraging the ARM TrustZone extension and running in the secure world. It periodically backs up files to a secure local persistent partition and pushes these backups to external storage to protect them from ransomware. Initially, RANSOMSAFEDROID does a full backup of the device filesystem, then it does incremental backups that save the changes since the last backup. As a proof-of-concept, we implemented a RANSOMSAFEDROID prototype and provide a performance evaluation using an i.MX53 development board.
TL;DR: This work evaluates and enhances an architecture for privacy in the integration of IoT and cloud computing by providing a method to protect the data generated by IoT devices without the use of a secure transport layer protocol, decreasing the amount of resources consumed by these devices.
Abstract: Through the Internet of Things (IoT) a large number of devices are connected to the Internet, resulting in a huge amount of produced data. Since these devices have limited resources, the use of cloud computing has been proposed to store, process and control access to these data. A fundamental challenge related to this integration is privacy, since confidential information about users may be collected by IoT devices and sent to the cloud. Consequently, it is imperative to provide mechanisms that allow users to control the use of their data. The architectures proposed so far either do not provide security inside the IoT network or do not evaluate the overhead imposed on the IoT devices. In order to fill this gap, this work evaluates and enhances an architecture for privacy in the integration of IoT and cloud computing by providing a method to protect the data generated by IoT devices without the use of a secure transport layer protocol, decreasing the amount of resources consumed by these devices. The proposed method is analyzed, in terms of delay and energy consumption, by an analytical and an experimental evaluation, comparing its performance with other approaches found in the literature.
TL;DR: A Data Dissemination protocol Based on Centrality (DDBC) for urban scenarios is proposed, and simulation results show that the DDBC protocol offers good efficiency in terms of delays and overhead, while achieving network coverage around 90%.
Abstract: Vehicular Ad-hoc NETworks (VANETs) are composed of moving vehicles with the ability to process, store, and communicate via a wireless medium. VANETs promise a wide scope of services, such as safety and security, traffic efficiency, and others. For instance, a VANET application can detect, control and reduce traffic congestion based on data that describes traffic patterns. However, disseminating data in a VANET is a challenging task, due to its particular characteristics, i.e., heterogeneous density, short-range communication, and node mobility. Since existing protocols for data dissemination do not effectively address the high overhead, in this paper we propose a Data Dissemination protocol Based on Centrality (DDBC) for urban scenarios. The simulation results show that the DDBC protocol offers good efficiency in terms of delays and overhead, while achieving network coverage around 90%.
TL;DR: Niflheim is presented, a generic end-to-end middleware that provides modular microservice-based orchestration of applications to deploy and manage them on all resources across the tiers of the IoT, from IoT end-devices through gateways to the cloud.
Abstract: The state-of-practice for Internet of Things (IoT) applications is deployment on specialised networks of embedded devices connected to a cloud backend. While this paradigm has successfully supported a range of IoT systems, its power is limited by the high latency and bandwidth consumption caused by communications with remote data servers and the inability to share specialised IoT infrastructure across applications. To improve these aspects, this paper proposes re-imagining all resources of the IoT infrastructure as microservice-hosting platforms. Applications decomposed as a set of services can then share IoT resources and run communicating modules closer together, tightening control loops and reducing latency and communications. To this end, we present Niflheim, a generic end-to-end middleware that provides modular microservice-based orchestration of applications to deploy and manage them on all resources across the tiers of the IoT, from IoT end-devices through gateways to the cloud. This enables increased flexibility in deployment and operations, while remaining efficient in terms of hardware and software requirements. We evaluate Niflheim in a smart building use case and demonstrate improved latency and bandwidth consumption for applications, while enabling efficient shared use of the IoT infrastructure resources.
TL;DR: This paper models the path selection problem for edge traffic engineering using a risk minimization technique inspired by portfolio theory in economics, uses machine learning to estimate path selection risks, and suggests that a Bayesian Network approach may lead to good latency (peak) estimation performance, as long as there are dependencies among the time series path latency measurements.
Abstract: Traffic engineering at network edges is challenging given the latency-sensitive nature of all applications that need to be supported. End-to-end delay estimation and forecasts were essential traffic engineering tools even before the mobile edge computing paradigm pushed the cloud closer to the end user. In this paper, we model the path selection problem for edge traffic engineering using a risk minimization technique inspired by portfolio theory in economics, and we use machine learning to estimate path selection risks. In particular, using real latency time series measurements, both existing and collected with and without the GENI testbed, we compare four short-horizon latency estimation techniques, commonly used by the finance community to estimate prices of volatile financial instruments. Our results suggest that a Bayesian Network approach may lead to good latency (peak) estimation performance, as long as there are dependencies among the time series path latency measurements.
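To make the portfolio analogy concrete, the sketch below applies the textbook two-asset minimum-variance rule to two network paths, treating the sample variance of each path's latency as its risk. This is an assumption-laden illustration and not the paper's formulation: the paper's risk model, its number of paths, and its estimators are not given in the abstract, and the function names are hypothetical.

```python
from statistics import mean


def covariance(xs, ys):
    """Sample covariance of two equally long latency series."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def min_variance_split(lat_a, lat_b):
    """Fraction of traffic on path A that minimizes combined latency variance.

    Uses w_a = (var_b - cov_ab) / (var_a + var_b - 2 * cov_ab),
    clamped to [0, 1] so that no negative share is assigned.
    """
    var_a = covariance(lat_a, lat_a)
    var_b = covariance(lat_b, lat_b)
    cov_ab = covariance(lat_a, lat_b)
    denom = var_a + var_b - 2 * cov_ab
    if denom == 0:
        return 0.5  # the two paths are indistinguishable in risk
    return min(1.0, max(0.0, (var_b - cov_ab) / denom))
```

The rule sends more traffic to the steadier path, and the covariance term matters in the way the abstract hints at: when latencies on the two paths move together, splitting traffic reduces risk less.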
TL;DR: The Resource Demand Aware Scheduling (RDAS) algorithm is designed that schedules workflows based on their resource demands and priorities considering workflow structure and outperforms three existing algorithms by 22%, 13% and 33%, on average, in terms of makespan, cost and the number of resources used, respectively.
Abstract: A major challenge of running applications in clouds is to determine the right number of resources (virtual machines or VMs) to rent in terms of both performance and cost. Such a challenge becomes greater if the application needs to run across multiple resources. In this paper, we address the problem of scheduling scientific workflow applications. The structure of workflows, dictated by precedence/data dependencies, and the diversity of resources in clouds, both at large scale, make resource provisioning and task scheduling very complex. To this end, we design the Resource Demand Aware Scheduling (RDAS) algorithm that schedules workflows based on their resource demands and priorities considering workflow structure. RDAS partitions workflows and allocates resources of possibly different capacities/types to the partitions in a “fair” manner such that their execution times do not vary significantly. RDAS turns resource and application heterogeneity (a major hindering factor in clouds) into an opportunity for optimizing resource provisioning for scientific workflows. Based on our experimental results, RDAS demonstrates its capacity for minimizing the overall workflow completion time (makespan) and in turn minimizing the cost of the execution. In particular, RDAS outperforms three existing algorithms by 22%, 13% and 33%, on average, in terms of makespan, cost and the number of resources used, respectively.
TL;DR: This work presents PPDAS, a mechanism which supports fine-grained read and write operations in a setting where decryption keys are generated by multiple attribute authorities, and the access policy is hidden from all unauthorized entities including the attribute authorities.
Abstract: Attribute-based encryption schemes provide read access to data based on users' attributes. In these schemes, user privacy is compromised as the access policies are visible. This privacy issue has been addressed in the literature by enabling the data owner to obfuscate the policy in a setting where a single authority generates decryption keys. However, a single authority can figure out the hidden access policy, which violates user privacy. We present PPDAS, a scheme which overcomes these limitations and makes two contributions. Firstly, we present a mechanism which supports fine-grained read and write operations in a setting where decryption keys are generated by multiple attribute authorities, and the access policy is hidden from all unauthorized entities including the attribute authorities. Our scheme is also accompanied by a user revocation mechanism. Secondly, we show through extensive experimental evaluations that it is possible to adapt the scheme for accessing data through resource-constrained devices such as smart watches and IoT devices.
TL;DR: Intensive experiments show that the proposed VM placement scheme outperforms the Minimum Correlation Coefficient (MCC) and Power Aware Best Fit Decreasing (PABFD) schemes in terms of power consumption, SLA violations, and number of VM migrations.
Abstract: The key to maintaining high standards of quality and power conservation of physical machines in data centers lies in efficient consolidation of virtual machines (VMs). Several schemes have been proposed for this purpose, including online migration and VM placement, which can offer the best in terms of resource utilization. The consolidation process can be made effective by finding “opportunities” to migrate VMs as well as approximating the resource utilization for the VM placement. An inefficient placement scheme, however, will lead to a substantial overloading of physical machines. The proposed VM placement scheme uses the correlation coefficient and predicted future requirements of computing resources to accurately compute the value of a variable termed LIFE (Lowest Interdependence Factor Exponent). This variable shows the extent to which a VM can be associated with a target physical machine. A higher value of LIFE will correspondingly result in a larger impact factor influencing the performance of existing VMs whenever a VM is selected for migration to a target machine. To minimize performance degradation, migration of a VM to a target machine will only take place if it corresponds to the lowest value of LIFE. Intensive experiments show that the proposed scheme outperforms the Minimum Correlation Coefficient (MCC) and Power Aware Best Fit Decreasing (PABFD) schemes on the following metrics: power consumption by 44.08% and 27.52%, SLA violation by 50.90% and 19.53%, and number of VM migrations by 52.91% and 9.66%, respectively.
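The correlation coefficient that this scheme builds on can be sketched as follows. The code computes the Pearson correlation between a VM's utilization trace and each candidate host's trace, then picks the least correlated host. This is only the correlation ingredient: the abstract does not give the LIFE formula or say how predicted future requirements enter it, so the selection rule and the names `pearson` and `least_correlated_host` are assumptions for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long traces."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant trace carries no correlation information
    return cov / (sx * sy)


def least_correlated_host(vm_trace, host_traces):
    """Return the host whose load moves least in step with the VM's load."""
    return min(host_traces, key=lambda h: pearson(vm_trace, host_traces[h]))
```

The intuition is that a VM whose peaks coincide with a host's existing peaks is more likely to overload that host, so low or negative correlation is preferable.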
TL;DR: Evaluation using data from a real-world smart grid pilot project, as well as extreme demand profiles that scale the demand up and down by 50% on average, confirms the cost-effectiveness of in-network aggregation empowered by self-adaptation.
Abstract: The Internet of Things empowers citizens to interconnect their devices, such as smart phones, into large-scale participatory decentralized networks, which they can use to make real-time collective measurements as a public good, for instance, crowd-sourcing the monitoring of traffic in a city. This approach is an alternative to big data analytics systems that are often expensive to access, privacy-intrusive and allow discriminatory and profiling actions over citizens' data. On the other hand, large-scale decentralized networks are complex to manage, and collective measurements, i.e. computations of aggregation functions, need to cope with several dynamics such as continuously changing input data streams and highly varying temporal demand for access to the collective measurements. This paper proposes a highly reactive self-adaptation model to tackle the challenge of dynamic computational demand in large-scale decentralized in-network aggregation. The self-adaptation process makes nodes self-aware of other nodes that join and leave the network and therefore makes them capable of self-orchestrating the communication to improve accuracy and minimize communication cost. The model is simple, yet agile. This is shown when it is applied in DIAS, the Dynamic Intelligent Aggregation Service, without introducing architectural changes. Evaluation using data from a real-world smart grid pilot project, as well as extreme demand profiles that scale the demand up and down by 50% on average, confirms the cost-effectiveness of in-network aggregation empowered by self-adaptation. The findings are confirmed both in simulation and in a large-scale live deployment in a cluster infrastructure with 3000 independent Java virtual machines, each running a DIAS node. Overall, the results encourage new promising pathways towards the broader adoption of self-adaptive participatory data analytics in large-scale decentralized networks.
TL;DR: This paper introduces new data-migration algorithms for the heterogeneous case and performs an empirical comparative study of the performance of these algorithms against algorithms from [1] and [2].
Abstract: In large scale storage systems such as data centers, data layouts need to be reconfigured over time for load balancing or in the event of system failure/upgrades. The data-migration problem pertains to computing an efficient plan to migrate data to their target locations. Most of the previous results on data-migration assume that storage devices have similar capabilities and can perform only one data transfer at a time. In this paper, we consider the heterogeneous data-migration problem where we associate a transfer constraint to each of the storage nodes, representing the number of simultaneous transfers that each of the nodes can handle. We introduce new data-migration algorithms for the heterogeneous case and perform an empirical comparative study of the performance of these algorithms against algorithms from [1] and [2].
TL;DR: A predictive handover mechanism is proposed that can estimate achievable throughput values from different candidate access networks after handoff execution; it significantly outperforms an existing reference approach based on base station efficiency.
Abstract: The fifth generation (5G) ultra dense network (UDN) is envisioned as a very dense deployment of low power base stations where heterogeneous radio access technologies are used to satisfy the data rate demand of users employing both licensed and unlicensed spectrum. In a UDN scenario, conditions of the channels operating in the licensed band may exhibit intermittent characteristics due to the varying level of interference received from a large number of nearby access networks. On the other hand, channel conditions in the unlicensed band may fluctuate drastically due to the interference caused by the co-existence of long term evolution (LTE-U) and wireless local area network (WLAN) in the unlicensed band. The traditional handover mechanisms rely on instantaneous assessments of link quality such as RSS and SINR. The target network selected based on such instantaneous values may not be the appropriate one when the actual handover is executed. In this work, we propose a predictive handover mechanism which can estimate achievable throughput values from different candidate access networks after handoff execution. Simulation results confirm that our proposed predictive handover mechanism significantly outperforms an existing reference approach based on base station efficiency.
TL;DR: A software-defined security service (SDS2) for protecting cloud infrastructures focuses on defining security concerns regarding physical and virtual boundaries of data, resources, tenants and detecting security breaches through violations of boundaries.
Abstract: Software-Defined Infrastructure (SDI) is a resource sharing infrastructure that embraces the concept of separation of the network control plane from its data plane, and software realization of network functions from the underlying hardware appliances through the virtualization technology in emerging infrastructures such as Cloud, Network Function Virtualization (NFV), and Software-Defined Networking (SDN). Virtualization and virtualized infrastructures bring with them new challenges regarding security and virtual resources protection. Traditional security measures and endpoint security are no longer adequate due to invisible boundaries created among shared logical and virtual entities among numerous users. This paper introduces a software-defined security service (SDS2) for protecting cloud infrastructures. SDS2 focuses on defining security concerns regarding physical and virtual boundaries of data, resources, tenants and detecting security breaches through violations of boundaries. Boundaries are defined by security policies and security violations by attackers are predicted, monitored, and detected when boundaries are crossed. This paper describes SDS2 and presents its initial implementation. The paper provides examples of policy-defined boundaries and shows the effectiveness and feasibility of our design in detecting invisible security boundaries through simulation of a security control structure and agile, dynamic, and intelligent VSFs.
TL;DR: First, an improved k-means clustering algorithm is used for residential electricity load clustering analysis to extract the typical load curve of each cell; then a deep belief network classifier is constructed to identify the residential power load patterns and provide reliable support for distribution network maintenance.
Abstract: The study of power load patterns is the premise and basis of power distribution network maintenance. Because existing power load models focus on industry, agriculture, commerce and other large users rather than on residents, this paper proposes a method for identifying residential power load patterns based on clustering and a deep belief network. First, an improved k-means clustering algorithm is used for residential electricity load clustering analysis to extract the typical load curve of each cell; then a deep belief network classifier is constructed to classify the typical load curves of each cell, to identify the residential power load patterns and provide reliable support for distribution network maintenance. The effectiveness of the method is demonstrated by experiments on power data.
TL;DR: This paper presents two Blockchain abbreviation schemes: the first, based on the Ethereum project, replaces the full Blockchain with a new Genesis block that summarizes everyone's account balances at a certain point in time; the second is a UNIX based architecture that uses the file system to implement Blockchain.
Abstract: Blockchain's ever-increasing size has become a major problem. Bitcoin [7], for example, has grown to 115120 MB as of May 2017, which is roughly 115 GB. This uncontrollable growth of the Blockchain is bound to become an issue in the future, as hard disks may become too small to store the entire Blockchain history and traversing the transaction databases may become increasingly slow. Already, there are lightweight clients in various Blockchain platforms (Bitcoin included) that do not store the entire chain locally but rely on a third party to send them the blocks they need. There are many issues with these clients, mainly security problems, since they go back to trusting a central authority rather than gaining trust from several distributed peers. These clients' knowledge of the Blockchain is solely based on some third party that should be trusted, while the conceptual basis of Blockchain is distributing trust. In this paper we present two Blockchain abbreviation schemes. The first one is based on the Ethereum [8] project and proposes replacing the full Blockchain with a new Genesis block, which summarizes everyone's account balances at a certain point in time. One possible benefit is to use less communication while still storing the prefix of the old Blockchain (or a signature of the Blockchain that can validate a version archived by other participants) in a local archive. Here we trade loss of transaction history for efficiency. Our second contribution is a UNIX-based architecture that uses the file system to implement Blockchain. We demonstrate a Blockchain abbreviation technique for this architecture too.
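The first abbreviation scheme described above can be illustrated with a minimal sketch. It assumes a toy chain in which each block is a dictionary holding a list of (sender, receiver, amount) transfers; the block layout, the `digest` helper and the `abbreviate` function are hypothetical names chosen for illustration and are not taken from the paper or from Ethereum.

```python
import hashlib
import json


def digest(obj):
    """SHA-256 over a canonical JSON encoding (an assumed serialization)."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()


def abbreviate(chain):
    """Fold a list of blocks into a single summary 'genesis' block.

    Each block is a dict with a 'txs' list of (sender, receiver, amount).
    A sender of None marks newly minted coins (illustrative convention).
    """
    balances = {}
    for block in chain:
        for sender, receiver, amount in block["txs"]:
            if sender is not None:
                balances[sender] = balances.get(sender, 0) - amount
            balances[receiver] = balances.get(receiver, 0) + amount
    return {
        "height": len(chain),             # number of blocks summarized
        "balances": balances,             # account state at the cut point
        "archive_digest": digest(chain),  # lets anyone validate an archived copy
    }


old_chain = [
    {"txs": [(None, "alice", 50)]},
    {"txs": [("alice", "bob", 20)]},
    {"txs": [("bob", "carol", 5), (None, "bob", 50)]},
]
new_genesis = abbreviate(old_chain)
print(new_genesis["balances"])
```

The transaction history is dropped from the new block, which is the trade-off the abstract mentions; the stored digest only allows a participant to check that an archived copy of the old chain is the one that was summarized.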
TL;DR: This paper examines the possibility of using the well-known approximations of the Jaccard metric to reduce the computational complexity of Edit Distance estimation, and formulates inequalities between the Jaccard metric and the Edit Distance.
Abstract: In this paper, we examine the possibility of using the well-known approximations of the Jaccard metric to reduce the computational complexity of Edit Distance estimation. Our analytical results apply to the representing strings rather than to the original (raw) textual data; still, in practice we obtained a solid indication that the results can be applied to (raw) strings that have low n-gram repetitions. We formulate inequalities between the Jaccard metric and the Edit Distance that impose upper and lower bounds on the Edit Distance values in terms of the Jaccard values. We validate our inequalities over strings of API call traces, where the (small) clusters obtained are refined by applying Edit Distance. Jaccard is a measure of similarity between two sets, while Edit Distance is a measure for two strings, such as traces of API calls. The computation associated with creating n-grams and using Jaccard similarity is much more efficient than the computation of Edit Distance (linear versus quadratic time complexity). Thus, our new bounds on the Edit Distance given the Jaccard value are of practical interest. Another new aspect we coped with in our research is the inherent imbalance between malicious and benign API traces that are harvested from the system, as most of the traces are benign. We performed clustering only on the malware traces, where each cluster concentrates malware with some specific common essence. The obtained clustering is used with great success in classifying new query traces as either benign or malware. The traces for our research were obtained from the KVM hypervisor Runtime Execution Introspection and Profiling (REIP) system, based on Virtual Machine Introspection (VMI) techniques, to profile hooked Windows API calls.
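To make the two measures concrete, the sketch below computes an n-gram Jaccard distance and a standard dynamic-programming Edit Distance (Levenshtein) on two short call traces. It is an illustration only: the trace contents are made-up labels, the choice of n = 2 is an assumption, and the code does not implement the paper's bounds.

```python
def ngrams(seq, n=2):
    """Set of contiguous n-grams of a sequence (string or list of calls)."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}


def jaccard_distance(a, b, n=2):
    """1 - |A ∩ B| / |A ∪ B| over n-gram sets; built in linear time."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga and not gb:
        return 0.0
    return 1.0 - len(ga & gb) / len(ga | gb)


def edit_distance(a, b):
    """Levenshtein distance via the quadratic-time DP, keeping one row."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


# Hypothetical API call traces, used only as example input.
trace_a = ["OpenFile", "ReadFile", "WriteFile", "CloseFile"]
trace_b = ["OpenFile", "ReadFile", "CloseFile"]

print(jaccard_distance(trace_a, trace_b))  # 0.75
print(edit_distance(trace_a, trace_b))     # 1
```

Here one deleted call changes three of the four distinct bigrams, which shows why the relation between the two measures has to be stated as bounds rather than as an equality.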
TL;DR: Results show that the server and network leak information to a level of detail that allows sorting out CPU from network bottlenecks, or even a combination of the two, in a large spectrum of cases, suggesting that a black-box monitoring approach is not only possible, but promising, as it may complement traditional white-box approaches.
Abstract: In spite of their growing maturity, current web monitoring tools are unable to observe all operating conditions. For example, clients in different geographical locations might get very diverse latencies to the server; the network between client and server might be slow; or third-party servers with external page resources might underperform. Ultimately, only the clients can determine whether a site is up and running in good conditions. In this paper, we use the response times experienced by clients to infer server and network performance. The goal is to detect internal and external bottlenecks through black-box monitoring, in particular CPU (internal) and network (external). We aim to determine to what extent clients are able to tell one type of bottleneck from the other, i.e., what kind of information the server and network leak regarding their operating conditions. To answer this question, we resort to an empirical approach. We subject an HTTP server and network to a large number of operating conditions and train two machine learning algorithms, a linear and a non-linear one, to identify the cause of the congestion affecting the system. Results show that the server and network leak information to a level of detail that allows sorting out CPU from network bottlenecks, or even a combination of the two, in a large spectrum of cases. This suggests that a black-box monitoring approach is not only possible, but promising, as it may complement traditional white-box approaches.
TL;DR: This paper designs and implements the egocentric betweenness measure, the most used centrality measure in ego-networks, in VANETs and compares it to the sociocentric betweenness measure.
Abstract: The ego-network concept has been systematically studied, since this kind of network employs only locally available information to analyze its structure. Degree, closeness, and betweenness are widely studied centrality measures. Among these three measures, betweenness centrality in ego-networks is the most used in several fields such as Wireless Mesh Networks, Wireless Sensor Networks, and Delay Tolerant Networks. However, surprisingly, that measure has not been widely investigated in Vehicular ad hoc Networks (VANETs). In this paper, we contribute to filling this gap by designing and implementing the egocentric betweenness measure in VANETs and by comparing it to the sociocentric betweenness measure.
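As a rough illustration of why egocentric betweenness needs only local information, the sketch below computes it for one node from its one-hop neighborhood, using the common formulation in which each pair of non-adjacent neighbors contributes the reciprocal of the number of two-hop paths between them inside the ego-network. The graph, the node names and the helper functions are assumptions made for this example, not the paper's VANET implementation.

```python
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def ego_betweenness(adj, ego):
    """Betweenness of `ego` restricted to its ego-network (ego + neighbors)."""
    alters = adj[ego]
    members = alters | {ego}
    score = 0.0
    for u, v in combinations(sorted(alters), 2):
        if v in adj[u]:
            continue  # directly linked alters never route through the ego
        # Two-hop paths u - w - v with w inside the ego-network;
        # the ego itself is always one of them.
        two_hop = len(adj[u] & adj[v] & members)
        score += 1.0 / two_hop
    return score


# Hypothetical snapshot of vehicle contacts; v5 is outside v0's ego-network.
edges = [("v0", "v1"), ("v0", "v2"), ("v0", "v3"), ("v0", "v4"),
         ("v1", "v2"), ("v2", "v3"), ("v1", "v5"), ("v4", "v5")]
adj = build_adjacency(edges)
print(ego_betweenness(adj, "v0"))  # 3.5
```

The path from v1 to v4 through v5 is ignored because v5 is not a neighbor of v0; the sociocentric measure would take such paths into account, which is one reason the two measures can differ.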
TL;DR: This work develops a stochastic integer programming (SIP) based model to optimize the average network delay while keeping the packet loss below a threshold; it uses a random walk mobility model for simplicity, but can be extended to other mobility models with known relevant distributions.
Abstract: Device to device (D2D) communication in 5G is on the rise to compensate for the exponential increase in mobile users and their data requirements. In D2D communication, users in proximity can communicate and control the links among themselves without the need for a base station. This makes it a real challenge to model the network so as to incorporate the uncertainty that arises due to the mobility of users. We have developed a stochastic integer programming (SIP) based model to optimize the average network delay while keeping the packet loss below a threshold. This SIP model involves a probabilistic constraint which deals with the conditional probability of link breakage in the next time instance given that the link was active at the current time instance. By exploiting the SIP model, we have developed a greedy metric termed connectivity factor (CF), which captures the nodes' mobility and hence takes care of link reliability, which in turn controls packet loss and delay per hop. Based on CF, we give a purely distributed greedy algorithm for forwarding the packets to an appropriate next hop node. An advantage of our algorithm is that finding an appropriate next hop node requires only computing the expectation and variance of path loss, which in turn can be computed from the distributions of mobility related parameters. Here we have used a random walk mobility model for simplicity, but the approach can be extended to other mobility models with known relevant distributions. Through simulation we have shown that our proposed algorithm gives a significant improvement in packet loss due to mobility of nodes over the traditional received signal strength (RSS) based approach and an existing contention based forwarding (CBF) approach.
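The conditional link-breakage probability that the probabilistic constraint refers to can be estimated numerically. The sketch below is a Monte Carlo illustration under strong simplifying assumptions that are not from the paper: a link is treated as active whenever two nodes are within a fixed radio range (a unit-disk model instead of path loss), and each node takes one random-walk step of fixed length in a uniformly random direction. All function and parameter names are hypothetical.

```python
import math
import random


def step(pos, step_len, rng):
    """One random-walk move: fixed length, uniformly random direction."""
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return (pos[0] + step_len * math.cos(theta),
            pos[1] + step_len * math.sin(theta))


def break_probability(distance, radio_range, step_len, trials=20000, seed=0):
    """Estimate P(link broken at t+1 | link active at t).

    The two nodes start `distance` apart (assumed <= radio_range, so the
    link is active now) and each takes one independent random-walk step.
    """
    rng = random.Random(seed)
    broken = 0
    for _ in range(trials):
        a = step((0.0, 0.0), step_len, rng)
        b = step((distance, 0.0), step_len, rng)
        if math.dist(a, b) > radio_range:
            broken += 1
    return broken / trials


# Hypothetical numbers: 100 m range, 5 m step per time instance.
near = break_probability(distance=10.0, radio_range=100.0, step_len=5.0)
edge = break_probability(distance=98.0, radio_range=100.0, step_len=5.0)
print(near, edge)
```

A forwarding rule in the spirit of the abstract would prefer neighbors whose estimated breakage probability is low, although the paper's connectivity factor is derived from the expectation and variance of path loss rather than from a simulation like this one.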
TL;DR: This paper introduces a cognition layer that runs in parallel with, and interacts with, the network layer and comprises two cognitive processes: path learning (routing) and trust learning; the resulting algorithm, TQOR, shows better end-to-end delay and communication overhead, both of which further improve as time progresses, without sacrificing the data packet delivery ratio.
Abstract: The dynamic and infrastructure-less nature of MANETs exposes routing in such networks to a variety of attacks and, moreover, makes conventional fixed policy routing algorithms inefficient. To deal with the routing challenges and the varying behavior of malicious nodes in such networks, employing reinforcement learning algorithms and proper trust models seems promising. In this paper, we introduce a cognition layer, in parallel and interacting with the network layer, which comprises two cognitive processes: path learning (routing) and trust learning. The first process is based on machine learning algorithms and the latter is based on trust management. We compare our algorithm, TQOR, with a well-known trust-based routing protocol, TQR, in terms of three measures of performance. The simulation results show better end-to-end delay and communication overhead, which further improve as time progresses, without sacrificing the data packet delivery ratio.
TL;DR: This proposal aims at efficiently handling ultra-dense networking 5G use cases to achieve benefits at unprecedented levels and presents a solution for slicing WLAN infrastructures, aiming to provide differentiated services on top of the same substrate through customized, isolated and independent digital building blocks.
Abstract: The advent of future 5th Generation (5G) use cases, such as ultra-dense networking and ultra-low latency, propelled by Smart Cities and IoT projects, will demand revolutionary network infrastructures. Low latency, high bandwidth, scalability, ubiquitous access and support for resource-constrained IoT devices are some of the prominent requirements that networks have to meet to support future 5G use cases, and that current wireless and mobile infrastructures are not able to fulfill. In particular, the pervasiveness and high density of Wireless Local Area Networks (WLAN) in urban centers, together with their growing capacity and evolving standards, can be leveraged to support such demand. We argue that the integration of key 5G cornerstone technologies, such as Network Function Virtualization (NFV) and softwarization, fills some of the abovementioned gaps with regard to proper WLAN management and service orchestration. In this paper, we present a solution for slicing WLAN infrastructures, aiming to provide differentiated services on top of the same substrate through customized, isolated and independent digital building blocks. Through this proposal, we aim at efficiently handling ultra-dense networking 5G use cases to achieve benefits at unprecedented levels. Towards this goal, we present a proof of concept realised over a real testbed and assess its feasibility.
TL;DR: The effort to integrate a machine learning-based framework, which can predict the remaining time to failure of computing nodes, with Hadoop applications is illustrated, setting a possible path towards the definition of best practices for the development of systems to support autonomic management of cloud applications.
Abstract: This paper illustrates the effort to integrate a machine learning-based framework, which can predict the remaining time to failure of computing nodes, with Hadoop applications. This work is part of a larger effort targeting the development of a cloud-oriented autonomic framework to increase the availability of applications subject to software anomalies, and to jointly improve their performance. The framework uses machine learning, software rejuvenation, and load distribution techniques to proactively prevent failures. We believe that this work sets a possible path towards the definition of best practices for the development of systems to support autonomic management of cloud applications, illustrating the issues that should be addressed by the research community. Indeed, given the scale and the complexity of modern computing infrastructures, effective autonomic management approaches for cloud applications are becoming mandatory.