TL;DR: The results show that the proposed designs accomplish significant reductions in power dissipation, delay and transistor count compared to an exact design; moreover, two of the proposed multiplier designs provide excellent capabilities for image multiplication with respect to average normalized error distance and peak signal-to-noise ratio.
Abstract: Inexact (or approximate) computing is an attractive paradigm for digital processing at nanometric scales. Inexact computing is particularly interesting for computer arithmetic designs. This paper deals with the analysis and design of two new approximate 4-2 compressors for utilization in a multiplier. These designs rely on different features of compression, such that imprecision in computation (as measured by the error rate and the so-called normalized error distance) can be traded off against circuit-based figures of merit of a design (number of transistors, delay and power consumption). Four different schemes for utilizing the proposed approximate compressors are proposed and analyzed for a Dadda multiplier. Extensive simulation results are provided and an application of the approximate multipliers to image processing is presented. The results show that the proposed designs accomplish significant reductions in power dissipation, delay and transistor count compared to an exact design; moreover, two of the proposed multiplier designs provide excellent capabilities for image multiplication with respect to average normalized error distance and peak signal-to-noise ratio (more than 50 dB for the considered image examples).
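The peak signal-to-noise ratio quoted above can be illustrated with a minimal sketch. It assumes 8-bit pixel values and uses a hypothetical `truncated_multiply` helper as a stand-in approximate multiplier; it does not model the paper's 4-2 compressors or its Dadda multiplier schemes.

```python
import math

def psnr(reference, approximate, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(approximate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - a) ** 2 for r, a in zip(reference, approximate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

def truncated_multiply(a, b, dropped_bits=2):
    """Hypothetical approximate multiplier: clear the low bits of each operand first."""
    mask = ~((1 << dropped_bits) - 1)
    return (a & mask) * (b & mask)

# Pixel-wise image multiplication, scaled back to 8 bits (illustrative data only).
image_a = [12, 200, 37, 255]
image_b = [90, 45, 180, 255]
exact = [(a * b) >> 8 for a, b in zip(image_a, image_b)]
approx = [truncated_multiply(a, b) >> 8 for a, b in zip(image_a, image_b)]
print(psnr(exact, approx))
```

A higher PSNR means the approximate product image is closer to the exact one.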
TL;DR: This work relates logic encryption to fault propagation analysis in IC testing and develops a fault analysis-based logic encryption technique that enables a designer to controllably corrupt the outputs.
Abstract: Globalization of the integrated circuit (IC) design industry is making it easy for rogue elements in the supply chain to pirate ICs, overbuild ICs, and insert hardware Trojans. Due to supply chain attacks, the IC industry is losing approximately $4 billion annually. One way to protect ICs from these attacks is to encrypt the design by inserting additional gates such that correct outputs are produced only when specific inputs are applied to these gates. The state-of-the-art logic encryption technique inserts gates randomly into the design, but does not necessarily ensure that wrong keys corrupt the outputs. Our technique ensures that wrong keys corrupt the outputs. We relate logic encryption to fault propagation analysis in IC testing and develop a fault analysis-based logic encryption technique. This technique enables a designer to controllably corrupt the outputs. Specifically, to maximize the ambiguity for an attacker, this technique targets 50% Hamming distance between the correct and wrong outputs (ideal case) when a wrong key is applied. Furthermore, this 50% Hamming distance target is achieved using a smaller number of additional gates when compared to random logic encryption.
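To make the Hamming distance target concrete, the following is a minimal sketch of how the metric could be measured. The three-input `toy_circuit`, the `CORRECT_KEY` value and the placement of XOR key gates directly on the outputs are all assumptions made for illustration; the sketch does not implement the fault analysis-based gate placement described in the abstract.

```python
from itertools import product

CORRECT_KEY = (1, 0, 1)  # hypothetical secret key

def toy_circuit(bits):
    """Hypothetical 3-input, 3-output combinational function (illustration only)."""
    a, b, c = bits
    return (a ^ b, b & c, a | c)

def encrypted_circuit(bits, key):
    """XOR key gates on each output; the design is correct only for CORRECT_KEY."""
    out = toy_circuit(bits)
    return tuple(o ^ k ^ ck for o, k, ck in zip(out, key, CORRECT_KEY))

def hamming_distance_percent(key):
    """Percentage of output bits that differ from the correct outputs, over all inputs."""
    differing = 0
    total_bits = 0
    for bits in product((0, 1), repeat=3):
        good = encrypted_circuit(bits, CORRECT_KEY)
        bad = encrypted_circuit(bits, key)
        differing += sum(g != b for g, b in zip(good, bad))
        total_bits += len(good)
    return 100.0 * differing / total_bits

def average_over_wrong_keys():
    """Mean Hamming distance percentage over every key except the correct one."""
    wrong = [k for k in product((0, 1), repeat=3) if k != CORRECT_KEY]
    return sum(hamming_distance_percent(k) for k in wrong) / len(wrong)

print(average_over_wrong_keys())
```

In this toy setup each wrong key bit flips exactly one output bit, so the distance depends only on how many key bits are wrong. In a real netlist the corruption depends on how key-gate effects propagate to the outputs.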
TL;DR: This paper introduces outsourcing computation into IBE for the first time, proposes a revocable IBE scheme in the server-aided setting, and proposes another construction that is provably secure under the recently formalized Refereed Delegation of Computation model.
Abstract: Identity-Based Encryption (IBE), which simplifies the public key and certificate management at Public Key Infrastructure (PKI), is an important alternative to public key encryption. However, one of the main efficiency drawbacks of IBE is the computation overhead at the Private Key Generator (PKG) during user revocation. Efficient revocation has been well studied in the traditional PKI setting, but the cumbersome management of certificates is precisely the burden that IBE strives to alleviate. In this paper, aiming at tackling the critical issue of identity revocation, we introduce outsourcing computation into IBE for the first time and propose a revocable IBE scheme in the server-aided setting. Our scheme offloads most of the key generation related operations during key-issuing and key-update processes to a Key Update Cloud Service Provider, leaving only a constant number of simple operations for PKG and users to perform locally. This goal is achieved by utilizing a novel collusion-resistant technique: we employ a hybrid private key for each user, in which an AND gate is involved to connect and bind the identity component and the time component. Furthermore, we propose another construction which is provably secure under the recently formalized Refereed Delegation of Computation model. Finally, we provide extensive experimental results to demonstrate the efficiency of our proposed construction.
TL;DR: This paper studies, for the first time, the multi-user computation partitioning problem (MCPP), which considers the partitioning of multiple users' computations together with the scheduling of offloaded computations on the cloud resources, and designs an offline heuristic algorithm, namely SearchAdjust, to solve MCPP.
Abstract: Elastic partitioning of computations between mobile devices and cloud is an important and challenging research topic for mobile cloud computing. Existing works focus on single-user computation partitioning, which aims to optimize the application completion time for one particular user. These works assume that the cloud always has enough resources to execute the computations immediately when they are offloaded to the cloud. However, this assumption does not hold for large scale mobile cloud applications. In these applications, due to the competition for cloud resources among a large number of users, the offloaded computations may be executed with a certain scheduling delay on the cloud. Single-user partitioning that does not take into account the scheduling delay on the cloud may yield significant performance degradation. In this paper, we study, for the first time, the multi-user computation partitioning problem (MCPP), which considers the partitioning of multiple users’ computations together with the scheduling of offloaded computations on the cloud resources. Instead of pursuing the minimum application completion time for every single user, we aim to achieve minimum average completion time for all the users, based on the number of provisioned resources on the cloud. We show that MCPP is different from and more difficult than the classical job scheduling problems. We design an offline heuristic algorithm, namely SearchAdjust, to solve MCPP. We demonstrate through benchmarks that SearchAdjust outperforms both the single-user partitioning approaches and classical job scheduling approaches by 10 percent on average in terms of application delay. Based on SearchAdjust, we also design an online algorithm for MCPP that can be easily deployed in practical systems. We validate the effectiveness of our online algorithm using real world load traces.
TL;DR: This paper proposes defect and fault models specific to RRAM, i.e., the Over-Forming (OF) defect and the Read-One-Disturb (R1D) fault, and develops a novel squeeze-search scheme to identify the OF defect, which leads to the Stuck-At Fault (SAF).
Abstract: The Resistive Random Access Memory (RRAM) is a new type of non-volatile memory based on the resistive memory device. Researchers are currently moving from resistive device development to memory circuit design and implementation, hoping to fabricate memory chips that can be deployed in the market in the near future. However, so far the low manufacturing yield is still a major issue. In this paper, we propose defect and fault models specific to RRAM, i.e., the Over-Forming (OF) defect and the Read-One-Disturb (R1D) fault. We then propose a March algorithm to cover these defects and faults in addition to the conventional RAM faults, which is called March C*. We also develop a novel squeeze-search scheme to identify the OF defect, which leads to the Stuck-At Fault (SAF). The proposed test algorithm is applied to a first-cut 4-Mb HfO$_2$-based RRAM test chip. Results show that OF defects and R1D faults do exist in the RRAM chip. We also identify specific failure patterns from the test results, which are shown to be induced by multiple short defects between bit-lines. By identifying the defects and faults, designers and process engineers can improve the RRAM yield in a more cost-effective way.
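As background for the March algorithm mentioned above, here is a minimal sketch of a conventional March test run against a simulated memory. It uses the classic March C- element sequence, not the March C* proposed in the abstract (whose elements are not given here), and the `FaultyMemory` model with injected stuck-at cells is a hypothetical illustration that does not capture RRAM-specific behavior such as the OF defect or the R1D fault.

```python
class FaultyMemory:
    """Bit-addressable memory model with optional stuck-at cells (hypothetical)."""

    def __init__(self, size, stuck_at=None):
        self.cells = [0] * size
        self.stuck_at = dict(stuck_at or {})  # address -> forced value

    def write(self, addr, value):
        # A stuck-at cell ignores the written value.
        self.cells[addr] = self.stuck_at.get(addr, value)

    def read(self, addr):
        return self.stuck_at.get(addr, self.cells[addr])

# Classic March C-: {any(w0); up(r0,w1); up(r1,w0); down(r0,w1); down(r1,w0); any(r0)}
MARCH_C_MINUS = [
    ("up", [("w", 0)]),
    ("up", [("r", 0), ("w", 1)]),
    ("up", [("r", 1), ("w", 0)]),
    ("down", [("r", 0), ("w", 1)]),
    ("down", [("r", 1), ("w", 0)]),
    ("up", [("r", 0)]),
]

def run_march(memory, size, elements=MARCH_C_MINUS):
    """Apply each march element to every address; return sorted failing addresses."""
    failing = set()
    for direction, ops in elements:
        addrs = range(size) if direction == "up" else range(size - 1, -1, -1)
        for addr in addrs:
            for op, value in ops:
                if op == "w":
                    memory.write(addr, value)
                elif memory.read(addr) != value:
                    failing.add(addr)  # read returned an unexpected value
    return sorted(failing)

print(run_march(FaultyMemory(8, stuck_at={3: 1, 5: 0}), 8))
```

Each march element is an address order plus a list of read/write operations applied to one cell before moving to the next, which is why the test length grows linearly with memory size.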
TL;DR: A cellular computing model in the slime mold Physarum polycephalum is exploited to solve the Steiner tree problem, which is an important NP-hard problem in various applications, especially in network design.
Abstract: Using insights from biological processes could help to design new optimization techniques for long-standing computational problems. This paper exploits a cellular computing model in the slime mold Physarum polycephalum to solve the Steiner tree problem, which is an important NP-hard problem in various applications, especially in network design. Inspired by the path-finding and network formation capability of Physarum, we develop a new optimization algorithm, named Physarum optimization, with low complexity and high parallelism. To validate and evaluate our proposed models and algorithm, we further apply Physarum optimization to the minimal exposure problem, which is a fundamental problem corresponding to the worst-case coverage in wireless sensor networks. Complexity analysis and simulation results show that our proposed algorithm could achieve good performance with low complexity. Moreover, the core mechanism of our Physarum optimization may also provide a useful starting point to develop some practical distributed algorithms for network design.
TL;DR: This paper presents MuR-DPA, a public data auditing scheme based on the Merkle hash tree (MHT), which not only incurs much less communication overhead for both update verification and integrity verification of cloud datasets with multiple replicas, but also provides enhanced security against dishonest cloud service providers.
Abstract: Cloud computing that provides elastic computing and storage resources on demand has become increasingly important due to the emergence of “big data”. Cloud computing resources are a natural fit for processing big data streams, as they allow big data applications to run at the scale required for handling their complexities (data volume, variety and velocity). With the data no longer under users’ direct control, data security is becoming one of the main concerns in the adoption of cloud computing resources. In order to improve data reliability and availability, storing multiple replicas along with original datasets is a common strategy for cloud service providers. Public data auditing schemes allow users to verify their outsourced data storage without having to retrieve the whole dataset. However, existing data auditing techniques suffer from efficiency and security problems. First, for dynamic datasets with multiple replicas, the communication overhead for update verifications is very large, because each update requires updating all replicas, and verification of each update requires O(log n) communication complexity. Second, existing schemes cannot provide public auditing and authentication of block indices at the same time. Without authentication of block indices, the server can build a valid proof based on data blocks other than the blocks the client requested to verify. In order to address these problems, in this paper, we present a novel public auditing scheme named MuR-DPA. The new scheme incorporates a novel authenticated data structure (ADS) based on the Merkle hash tree (MHT), which we call MR-MHT. To support fully dynamic data updates and authentication of block indices, we include rank and level values in the computation of MHT nodes. In contrast to existing schemes, level values of nodes in MR-MHT are assigned in a top-down order, and all replica blocks for each data block are organized into the same replica sub-tree. Such a configuration allows efficient verification of updates for multiple replicas. Compared to existing integrity verification and public auditing schemes, theoretical analysis and experimental results show that the proposed MuR-DPA scheme not only incurs much less communication overhead for both update verification and integrity verification of cloud datasets with multiple replicas, but also provides enhanced security against dishonest cloud service providers.
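The O(log n) proof size of a Merkle hash tree can be seen in a minimal sketch of a plain MHT. This is not the MR-MHT structure from the abstract: it omits rank and level values and replica sub-trees, and the function names (`build_levels`, `prove`, `verify`) and the odd-node promotion rule are assumptions made for illustration.

```python
import hashlib

def h(data: bytes) -> bytes:
    """SHA-256 digest used for both leaves and internal nodes."""
    return hashlib.sha256(data).digest()

def build_levels(blocks):
    """Return all tree levels, leaves first; an unpaired node is promoted unchanged."""
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(h(level[i] + level[i + 1]))
            else:
                nxt.append(level[i])
        level = nxt
        levels.append(level)
    return levels

def prove(levels, index):
    """Collect (sibling_hash, sibling_is_left) pairs from the leaf up to the root."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):
            path.append((level[sibling], sibling < index))
        index //= 2
    return path

def verify(block, path, root):
    """Recompute the root from a block and its sibling path."""
    node = h(block)
    for sibling, sibling_is_left in path:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

blocks = [b"block-%d" % i for i in range(5)]
levels = build_levels(blocks)
root = levels[-1][0]
print(verify(blocks[3], prove(levels, 3), root))
```

Note that this verifier trusts the left/right flags supplied with the proof, so it does not by itself authenticate which block index was proven. That gap is the kind of problem the rank and level values in the abstract are meant to address.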
TL;DR: This work proposes a DoS attack detection system based on a widely used dissimilarity measure, Earth Mover's Distance (EMD); the system can detect unknown DoS attacks and achieves 99.95 percent detection accuracy on the KDD Cup 99 dataset and 90.12 percent on the ISCX 2012 IDS evaluation dataset.
Abstract: Detection of Denial-of-Service (DoS) attacks has attracted researchers since the 1990s. A variety of detection systems has been proposed to achieve this task. Unlike the existing approaches based on machine learning and statistical analysis, the proposed system treats traffic records as images and detection of DoS attacks as a computer vision problem. A multivariate correlation analysis approach is introduced to accurately depict network traffic records and to convert the records into their respective images. The images of network traffic records are used as the observed objects of our proposed DoS attack detection system, which is developed based on a widely used dissimilarity measure, namely Earth Mover’s Distance (EMD). EMD takes cross-bin matching into account and provides a more accurate evaluation of the dissimilarity between distributions than some other well-known dissimilarity measures, such as Minkowski-form distance $L_{p}$ and $\chi^{2}$ statistics. These unique merits give our proposed system effective detection capabilities. To evaluate the proposed EMD-based detection system, ten-fold cross-validations are conducted using the KDD Cup 99 dataset and the ISCX 2012 IDS Evaluation dataset. The results presented in the system evaluation section illustrate that our detection system can detect unknown DoS attacks and achieves 99.95 percent detection accuracy on the KDD Cup 99 dataset and 90.12 percent detection accuracy on the ISCX 2012 IDS evaluation dataset, with a processing capability of approximately 59,000 traffic records per second.
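The cross-bin property of EMD can be shown in a minimal sketch. It covers only the one-dimensional special case with unit bin spacing and equal total mass, where EMD reduces to a running sum of carried mass; the abstract's system compares two-dimensional traffic images, which this sketch does not attempt. The function names are illustrative.

```python
def emd_1d(p, q):
    """EMD between two 1-D histograms with equal total mass and unit bin spacing."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    if abs(sum(p) - sum(q)) > 1e-9:
        raise ValueError("histograms must have equal total mass")
    carried = 0.0  # mass that still has to be moved to later bins
    work = 0.0     # total mass-times-distance moved so far
    for pi, qi in zip(p, q):
        carried += pi - qi
        work += abs(carried)
    return work

def l1_distance(p, q):
    """Bin-by-bin Minkowski-form distance with p = 1, for comparison."""
    return sum(abs(a - b) for a, b in zip(p, q))

base = [1, 0, 0, 0]
near = [0, 1, 0, 0]   # mass shifted by one bin
far = [0, 0, 0, 1]    # mass shifted by three bins

print(emd_1d(base, near), emd_1d(base, far))            # 1.0 3.0
print(l1_distance(base, near), l1_distance(base, far))  # 2 2
```

The bin-by-bin distance cannot tell the near shift from the far one, while EMD grows with how far the mass has to travel.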
TL;DR: This paper proposes a new verifiable auditing scheme for outsourced database, which can simultaneously achieve the correctness and completeness of search results even if the dishonest CSP purposely returns an empty set.
Abstract: The notion of database outsourcing enables the data owner to delegate the database management to a cloud service provider (CSP) that provides various database services to different users. Recently, plenty of research work has been done on the primitive of outsourced database. However, it seems that no existing solutions can perfectly support the properties of both correctness and completeness for the query results, especially in the case when the dishonest CSP intentionally returns an empty set for the query request of the user. In this paper, we propose a new verifiable auditing scheme for outsourced database, which can simultaneously achieve the correctness and completeness of search results even if the dishonest CSP purposely returns an empty set. Furthermore, we can prove that our construction can achieve the desired security properties even in the encrypted outsourced database. Besides, the proposed scheme can be extended to support the dynamic database setting by incorporating the notion of verifiable database with updates.
TL;DR: This paper proposes new distributed deduplication systems with higher reliability in which the data chunks are distributed across multiple cloud servers, and achieves the security requirements of data confidentiality and tag consistency by introducing a deterministic secret sharing scheme in distributed storage systems.
Abstract: Data deduplication is a technique for eliminating duplicate copies of data, and has been widely used in cloud storage to reduce storage space and upload bandwidth. However, there is only one copy for each file stored in cloud even if such a file is owned by a huge number of users. As a result, deduplication system improves storage utilization while reducing reliability. Furthermore, the challenge of privacy for sensitive data also arises when they are outsourced by users to cloud. Aiming to address the above security challenges, this paper makes the first attempt to formalize the notion of distributed reliable deduplication system. We propose new distributed deduplication systems with higher reliability in which the data chunks are distributed across multiple cloud servers. The security requirements of data confidentiality and tag consistency are also achieved by introducing a deterministic secret sharing scheme in distributed storage systems, instead of using convergent encryption as in previous deduplication systems. Security analysis demonstrates that our deduplication systems are secure in terms of the definitions specified in the proposed security model. As a proof of concept, we implement the proposed systems and demonstrate that the incurred overhead is very limited in realistic environments.
TL;DR: This paper proposes a solution, to deploy wireless sensors at strategic locations to achieve the best estimates of structural health by following the widely used wired sensor system deployment approach from civil/structural engineering.
Abstract: Structural health monitoring (SHM) systems are implemented for structures (e.g., bridges, buildings) to monitor their operations and health status. Wireless sensor networks (WSNs) are becoming an enabling technology for SHM applications that are more prevalent and more easily deployable than traditional wired networks. However, SHM brings new challenges to WSNs: engineering-driven optimal deployment, a large volume of data, sophisticated computing, and so forth. In this paper, we address two important challenges: sensor deployment and decentralized computing. We propose a solution, to deploy wireless sensors at strategic locations to achieve the best estimates of structural health (e.g., damage) by following the widely used wired sensor system deployment approach from civil/structural engineering. We found that faults (caused by communication errors, unstable connectivity, sensor faults, etc.) in such a deployed WSN greatly affect the performance of SHM. To make the WSN resilient to the faults, we present an approach, called FTSHM (fault-tolerance in SHM), to repair the WSN and guarantee a specified degree of fault tolerance. FTSHM searches the repairing points in clusters in a distributed manner, and places a set of backup sensors at those points in such a way that still satisfies the engineering requirements. FTSHM also includes an SHM algorithm suitable for decentralized computing in the energy-constrained WSN, with the objective of guaranteeing that the WSN for SHM remains connected in the event of a fault, thus prolonging the WSN lifetime under connectivity and data delivery constraints. We demonstrate the advantages of FTSHM through extensive simulations and real experimental settings on a physical structure.
TL;DR: A double resource renting scheme is designed in which short-term renting and long-term renting are combined to address the existing issues, and the results show that the scheme can not only guarantee the service quality of all requests, but also obtain more profit than the single renting scheme.
Abstract: As an effective and efficient way to provide computing resources and services to customers on demand, cloud computing has become more and more popular. From cloud service providers’ perspective, profit is one of the most important considerations, and it is mainly determined by the configuration of a cloud service platform under given market demand. However, a single long-term renting scheme is usually adopted to configure a cloud platform, which cannot guarantee the service quality and leads to serious resource waste. In this paper, a double resource renting scheme is first designed, in which short-term renting and long-term renting are combined to address the existing issues. This double renting scheme can effectively guarantee the quality of service of all requests and reduce the resource waste greatly. Secondly, a service system is considered as an M/M/m+D queuing model and the performance indicators that affect the profit of our double renting scheme are analyzed, e.g., the average charge, the ratio of requests that need temporary servers, and so forth. Thirdly, a profit maximization problem is formulated for the double renting scheme and the optimized configuration of a cloud platform is obtained by solving the profit maximization problem. Finally, a series of calculations are conducted to compare the profit of our proposed scheme with that of the single renting scheme. The results show that our scheme can not only guarantee the service quality of all requests, but also obtain more profit than the latter.
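As background for the queuing analysis, here is a minimal sketch of the waiting probability in a plain M/M/m queue (the Erlang C formula). It deliberately leaves out the deadline component D of the M/M/m+D model in the abstract, as well as the paper's profit function; the function and parameter names are assumptions made for illustration.

```python
from math import factorial

def erlang_c(arrival_rate, service_rate, servers):
    """Probability that an arriving request must wait in an M/M/m queue."""
    offered_load = arrival_rate / service_rate   # a = lambda / mu
    utilization = offered_load / servers         # rho = a / m
    if utilization >= 1:
        raise ValueError("queue is unstable: utilization must be below 1")
    tail = offered_load ** servers / factorial(servers) / (1 - utilization)
    head = sum(offered_load ** k / factorial(k) for k in range(servers))
    return tail / (head + tail)

def mean_wait(arrival_rate, service_rate, servers):
    """Mean time spent waiting in the queue (excluding service time)."""
    p_wait = erlang_c(arrival_rate, service_rate, servers)
    return p_wait / (servers * service_rate - arrival_rate)

# Illustrative numbers only: 8 requests per unit time, each server handles 1 per unit time.
for m in (9, 10, 12):
    print(m, erlang_c(8.0, 1.0, m), mean_wait(8.0, 1.0, m))
```

Quantities like these show how the number of long-term servers affects how often requests have to wait, which is the kind of indicator the abstract ties to the share of requests that need temporary servers.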
TL;DR: Simulation studies based on both random topologies and real network topologies of a 74-node physical wireless sensor network testbed demonstrate that the analysis provides safe and reasonably tight upper bounds of the end-to-end delays of real-time flows, and hence enables effective schedulability tests for WirelessHART networks.
Abstract: WirelessHART is a new standard specifically designed for real-time and reliable communication between sensor and actuator devices for industrial process monitoring and control applications. End-to-end communication delay analysis for WirelessHART networks is required to determine the schedulability of real-time data flows from sensors to actuators for the purpose of acceptance test or workload adjustment in response to network dynamics. In this paper, we consider a network model based on WirelessHART, and map the scheduling of real-time periodic data flows in the network to real-time multiprocessor scheduling. We then exploit the response time analysis for multiprocessor scheduling and propose a novel method for the delay analysis that establishes an upper bound of the end-to-end communication delay of each real-time flow in the network. Simulation studies based on both random topologies and real network topologies of a 74-node physical wireless sensor network testbed demonstrate that our analysis provides safe and reasonably tight upper bounds of the end-to-end delays of real-time flows, and hence enables effective schedulability tests for WirelessHART networks.
TL;DR: Two optimizations coupled with a novel precomputation technique are introduced, drastically reducing the computation latency for all FHE primitives, and the GH FHE scheme is implemented on two GPUs to further speed up the operations.
Abstract: In 2010, Gentry and Halevi presented the first FHE implementation. FHE allows the evaluation of arbitrary functions directly on encrypted data on untrusted servers. However, even for the small setting with 2048 dimensions, the authors reported a performance of 1.8 s for a single bit encryption and 32 s for recryption on a high-end server. Much of the latency is due to computationally intensive multi-million-bit modular multiplications. In this paper, we introduce two optimizations coupled with a novel precomputation technique. In the first optimization, called partial FFT, we adopt Strassen’s FFT-based multiplication algorithm along with Barrett reduction to speed up modular multiplications. For the encrypt primitive, we employ a window-based evaluation technique along with a modest degree of precomputation. In the full FFT optimization, we delay modular reductions and change the window algorithm, which allows us to carry out the bulk of computations in the frequency domain. We manage to eliminate all FFT conversions except the final inverse transformation, drastically reducing the computation latency for all FHE primitives. We implemented the GH FHE scheme on two GPUs to further speed up the operations. Our experimental results with the small parameter setting show speedups of 174, 7.6, and 13.5 times for encryption, decryption, and recryption, respectively, when compared to the Gentry–Halevi implementation. The speedup is enhanced in the medium setting. However, in the large setting, memory becomes the bottleneck and the speedup is somewhat diminished.
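The Barrett reduction step named above can be sketched in a few lines. This is a minimal illustration, not the paper's GPU implementation: Python's built-in big-integer multiplication stands in for the FFT-based multiplier, and the helper names (`barrett_setup`, `barrett_reduce`, `mod_mul`) are assumptions.

```python
def barrett_setup(modulus):
    """Precompute k = bit length of the modulus and mu = floor(4**k / modulus)."""
    k = modulus.bit_length()
    mu = (1 << (2 * k)) // modulus
    return k, mu

def barrett_reduce(x, modulus, k, mu):
    """Reduce 0 <= x < modulus**2 using multiplications and shifts instead of division."""
    q = (x * mu) >> (2 * k)   # estimate of floor(x / modulus), never too large
    r = x - q * modulus
    while r >= modulus:       # the estimate is off by a small constant at most
        r -= modulus
    return r

def mod_mul(a, b, modulus, k, mu):
    """Modular multiplication of two residues a, b < modulus."""
    return barrett_reduce(a * b, modulus, k, mu)

modulus = (1 << 127) - 1      # illustrative modulus, far smaller than FHE sizes
k, mu = barrett_setup(modulus)
a, b = 123456789123456789, 987654321987654321
print(mod_mul(a, b, modulus, k, mu) == (a * b) % modulus)
```

Because `mu` is computed once per modulus, each reduction costs two large multiplications and a few subtractions, which is why pairing it with a fast multiplier is attractive for multi-million-bit operands.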
TL;DR: This contribution presents the first full realization of FHE in hardware, based on the Gentry-Halevi fully homomorphic encryption scheme and using an optimized multi-million bit multiplier based on the Schönhage-Strassen multiplication algorithm.
Abstract: We present a custom architecture for realizing the Gentry-Halevi fully homomorphic encryption (FHE) scheme. This contribution presents the first full realization of FHE in hardware. The architecture features an optimized multi-million bit multiplier based on the Schönhage-Strassen multiplication algorithm. Moreover, a number of optimizations, including spectral techniques as well as a precomputation strategy, are used to significantly improve the performance of the overall design. When synthesized using 90 nm technology, the presented architecture performs the encryption, decryption, and recryption operations in 18.1 msec, 16.1 msec, and 3.1 sec, respectively, and occupies a footprint of less than 30 million gates.
TL;DR: The fault analysis reveals that unique faults occur in addition to some conventional memory faults, and the detection of such unique faults cannot be guaranteed with just the application of traditional march tests, so a new Design-for-Testability (DfT) concept is presented to facilitate the detection of the unique faults.
Abstract: Memristor-based memory technology, also referred to as resistive RAM (RRAM), is one of the emerging memory technologies that could potentially replace conventional semiconductor memories such as SRAM, DRAM, and flash. Existing research on such novel circuits focuses mainly on the integration between CMOS and non-CMOS, fabrication techniques, and reliability improvement. However, research on (manufacturing) test for yield and quality improvement is still in its infancy. This paper presents fault analysis and modeling for open defects based on electrical simulation, introduces fault models, and proposes test approaches for RRAMs. The fault analysis reveals that unique faults occur in addition to some conventional memory faults, and the detection of such unique faults cannot be guaranteed with just the application of traditional march tests. The paper also presents a new Design-for-Testability (DfT) concept to facilitate the detection of the unique faults. Two DfT schemes are developed by exploiting the access time duration and supply voltage level of the RRAM cells, and their simulation results show that the fault coverage can be increased with minor circuit modification. As the fault behavior may vary due to process variations, the DfT schemes are extended to be programmable to track the changes and further improve the fault/defect coverage.
TL;DR: The mathematical analysis proves that there exists a threshold for the proportion of faulty nodes, above which the system collapses, and the robustness of the model against random attacks or failures is analyzed by calculating the size of functioning parts in both networks.
Abstract: In this paper, we focus on the cyber-physical system consisting of interdependent physical-resource and computational-resource networks, e.g., smart power grids, automated traffic control systems, and wireless sensor and actuator networks, where the physical-resource and computational-resource networks are connected and mutually dependent. A failure in the physical-resource network might cause failures in the computational-resource network, and vice versa. A small failure in either of them could trigger a cascade of failures within the entire system. We aim to investigate the issue of cascading failures occurring in such systems. We propose a typical and practical model by introducing the interdependent complex network. The interdependence between two networks is practically defined as follows: Each node in the computational-resource network has only one support link from the physical-resource network, while each node in the physical-resource network is connected to multiple computational nodes. We study the effect of cascading failures using percolation theory and present detailed mathematical analysis of failure propagation in the system. We analyze the robustness of our model against random attacks or failures by calculating the size of functioning parts in both networks. Our mathematical analysis proves that there exists a threshold for the proportion of faulty nodes, above which the system collapses. Using extensive simulations, we determine the critical values for different system parameters. Our simulation also shows that, when the proportion of faulty nodes approaches the critical value, the size of functioning parts undergoes a second-order transition. An important observation is that the sizes of the physical-resource and computational-resource networks, and the ratio between their sizes, do not affect the system robustness.
TL;DR: Results from both trace-driven simulations and extensive real-world experiments show that AppATP can be applied to a variety of application scenarios while achieving 30-50 percent energy savings for mobile devices.
Abstract: Many mobile applications require frequent wireless transmissions between the content provider and mobile devices, consuming much energy in mobile devices. Motivated by the popularity of prefetch-friendly or delay-tolerant apps (e.g., social networking, app updates, cloud storage), we design and implement an application-layer transmission protocol, AppATP, which leverages cloud computing to manage data transmissions for mobile apps, transferring data to and from mobile devices in an energy-efficient manner. Measurements show that a significant amount of energy is consumed by mobile devices during poor connectivity. Based on this observation, AppATP adaptively seizes periods of good bandwidth condition to prefetch frequently used data with minimum energy consumption, while deferring delay-tolerant data during poor network connectivity. Using the stochastic control framework, AppATP only relies on the current network information and data queue sizes to make an online decision on transmission scheduling, and performs well under unpredictable wireless network conditions. We implement AppATP on Samsung Note 2 smartphones and Amazon EC2. Results from both trace-driven simulations and extensive real-world experiments show that AppATP can be applied to a variety of application scenarios while achieving 30-50 percent energy savings for mobile devices.
TL;DR: This paper investigates the local-recoding problem for big data anonymization against proximity privacy breaches and attempts to identify a scalable solution to this problem; it presents a proximity privacy model that allows semantic proximity of sensitive values and multiple sensitive attributes, and models the problem of local recoding as a proximity-aware clustering problem.
Abstract: Cloud computing provides promising scalable IT infrastructure to support various processing of a variety of big data applications in sectors such as healthcare and business. Data sets like electronic health records in such applications often contain privacy-sensitive information, which potentially brings about privacy concerns if the information is released or shared to third parties in cloud. A practical and widely adopted technique for data privacy preservation is to anonymize data via generalization to satisfy a given privacy model. However, most existing privacy preserving approaches tailored to small-scale data sets often fall short when encountering big data, due to their insufficiency or poor scalability. In this paper, we investigate the local-recoding problem for big data anonymization against proximity privacy breaches and attempt to identify a scalable solution to this problem. Specifically, we present a proximity privacy model that allows semantic proximity of sensitive values and multiple sensitive attributes, and model the problem of local recoding as a proximity-aware clustering problem. A scalable two-phase clustering approach consisting of a t-ancestors clustering (similar to k-means) algorithm and a proximity-aware agglomerative clustering algorithm is proposed to address the above problem. We design the algorithms with MapReduce to gain high scalability by performing data-parallel computation in cloud. Extensive experiments on real-life data sets demonstrate that our approach significantly improves the capability of defending against proximity privacy breaches, as well as the scalability and the time-efficiency of local-recoding anonymization, over existing approaches.
TL;DR: NextCell is a novel algorithm that aims to enhance location prediction by harnessing the social interplay revealed in cellular call records; it achieves higher precision and recall than the state-of-the-art schemes at cell tower level in the forthcoming one to six hours.
Abstract: Location prediction based on cellular network traces has recently spurred lots of attention. However, predicting user mobility remains a very challenging task due to the fuzziness of human mobility patterns. Our preliminary study included in this paper shows that there is a strong correlation between the calling patterns and co-cell patterns of users (i.e., co-occurrence in the same cell tower at the same time). Based on this finding, we propose NextCell—a novel algorithm that aims to enhance the location prediction by harnessing the social interplay revealed in cellular call records. Moreover, our proposal removes the assumption held in previous schemes that binds locations of cell towers to concrete physical coordinates, e.g., GPS coordinates. We validate our approach with the MIT Reality Mining dataset that involves 32,579 symbolic cell tower locations and 350,000 hours of continuous activity information. Experimental results show that NextCell achieves higher precision and recall than the state-of-the-art schemes at cell tower level in the forthcoming one to six hours.
TL;DR: It is concluded that mimicking attacks can be discriminated from genuine flash crowds using second order statistical metrics; a new fine correntropy metric is defined and shown to be effective compared to others.
Abstract: Botnets have become major engines for malicious activities in cyberspace. To sustain their botnets and disguise their malicious actions, botnet owners mimic legitimate cyber behavior to fly under the radar. This poses a critical challenge for anomaly detection. In this paper, we use web browsing on popular web sites as an example to tackle this problem. First, we establish a semi-Markov model for browsing behavior. Based on this model, we find that it is impossible to detect mimicking attacks based on statistics if the number of active bots of the attacking botnet is sufficiently large (no less than the number of active legitimate users). However, we also find that it is hard for botnet owners to satisfy this condition most of the time. With this new finding, we conclude that mimicking attacks can be discriminated from genuine flash crowds using second order statistical metrics. We define a new fine correntropy metric and show its effectiveness compared to others. Our real-world data set experiments and simulations confirm our theoretical claims. Furthermore, the findings can be widely applied to similar situations in other research fields.
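The abstract does not define its "fine correntropy" metric, so the following is only a reference point: a minimal sketch of the standard sample correntropy of two sequences, using an unnormalized Gaussian kernel. The function name, the bandwidth `sigma`, and the sample request counts are illustrative assumptions, not taken from the paper.

```python
import math

def sample_correntropy(x, y, sigma=1.0):
    """Standard sample correntropy: the mean of exp(-(x_i - y_i)^2 / (2 sigma^2)).

    The Gaussian kernel is left unnormalized (an assumption), so two
    identical sequences score exactly 1.0.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    total = sum(math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for a, b in zip(x, y))
    return total / len(x)

# Hypothetical per-interval page-request counts for two traffic flows.
flow_a = [12.0, 15.0, 11.0, 14.0]
flow_b = [12.5, 14.0, 11.0, 16.0]
similarity = sample_correntropy(flow_a, flow_b, sigma=2.0)
```

Values close to 1 indicate flows that track each other closely at the chosen kernel width; how such a score is thresholded to separate bots from flash crowds is not specified in the abstract.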
TL;DR: A large-scale PARAFAC method is developed, which is supported by general-purpose computing on the graphics processing unit (GPGPU) and forms the basis of a model for the analysis of electrocorticography (ECoG) recordings obtained from epilepsy patients, which proves to be effective in epilepsy state detection.
Abstract: Analysis of neural data with multiple modes and high density has recently become a trend with the advances in neuroscience research and practice. There is a pressing need for an approach that accurately and uniquely captures the features without loss or destruction of the interactions amongst the modes (typically) of space, time, and frequency. Moreover, the approach must be able to quickly analyze neural data of exponentially growing scales and sizes, in tens or even hundreds of channels, so that timely conclusions and decisions may be made. A salient approach to multi-way data analysis is the parallel factor analysis (PARAFAC), which has proven effective in the decomposition of the electroencephalography (EEG). However, the conventional PARAFAC is only suited for offline data analysis due to its high complexity, which grows as $O(n^{2})$ with increasing data size. In this study, a large-scale PARAFAC method has been developed, which is supported by general-purpose computing on the graphics processing unit (GPGPU). Compared to PARAFAC running on a conventional CPU-based platform, the new approach dramatically excels by ${>}360$ times in run-time performance, and effectively scales by ${>}400$ times in all dimensions. Moreover, the proposed approach forms the basis of a model for the analysis of electrocorticography (ECoG) recordings obtained from epilepsy patients, which proves to be effective in epilepsy state detection. The time evolutions of the proposed model are well correlated with the clinical observations. Moreover, the frequency signature is stable and high in the ictal phase. Furthermore, the spatial signature explicitly identifies the propagation of neural activities among various brain regions. The model supports real-time analysis of ECoG in ${>}1{,}000$ channels on an inexpensive and available cyber-infrastructure.
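For readers unfamiliar with PARAFAC, the sketch below shows the trilinear model it fits: a three-way array is approximated as a sum of rank-one terms, one factor matrix per mode (here read as space, time, and frequency, following the abstract). It only rebuilds the array from given factors; it shows neither the fitting procedure nor the GPGPU parallelization, and all names and numbers are illustrative assumptions.

```python
def parafac_reconstruct(A, B, C):
    """Rebuild X[i][j][k] = sum over r of A[i][r] * B[j][r] * C[k][r].

    A, B, C are factor matrices (lists of rows) sharing the same number
    of columns, which is the number of components (the rank).
    """
    rank = len(A[0])
    return [[[sum(A[i][r] * B[j][r] * C[k][r] for r in range(rank))
              for k in range(len(C))]
             for j in range(len(B))]
            for i in range(len(A))]

# Hypothetical rank-1 example: 2 channels x 2 time points x 2 frequency bins.
space = [[1.0], [2.0]]
time_course = [[1.0], [3.0]]
frequency = [[2.0], [5.0]]
X = parafac_reconstruct(space, time_course, frequency)
```

Each column of a factor matrix is one component's signature along that mode, which is what the abstract refers to as the temporal, frequency, and spatial signatures.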
TL;DR: This paper proves that $BH_n$ $(n\ge 2)$ is super-$\lambda^{\prime}$ but not super-$\kappa^{\prime}$.
Abstract: Huang and Wu in [IEEE Transactions on Computers 46 (1997) 484-490] introduced the balanced hypercube $BH_n$ as an interconnection network topology for computing systems, and they proved that $BH_n$ is vertex-transitive. However, some other symmetric properties, say edge-transitivity and arc-transitivity, of $BH_n$ remained unknown. In this paper, we solve this problem and prove that $BH_n$ is an arc-transitive Cayley graph. Using this, we also investigate some reliability measures, including super-connectivity, cyclic connectivity, etc., in $BH_n$. First, we prove that every minimum edge-cut of $BH_n$ $(n\ge 2)$ isolates a vertex, and every minimum vertex-cut of $BH_n$ $(n\ge 3)$ isolates a vertex. This is stronger than the result obtained by Wu and Huang, which shows that the connectivity and edge-connectivity of $BH_n$ are $2n$. Second, Yang [Applied Mathematics and Computation 219 (2012) 970-975] proved that for $n\ge 2$, the super-connectivity of $BH_n$ is $4n-4$ and the super edge-connectivity of $BH_n$ is $4n-2$. In this paper, we prove that $BH_n$ $(n\ge 2)$ is super-$\lambda^{\prime}$ but not super-$\kappa^{\prime}$. That is, every minimum super edge-cut of $BH_n$ $(n\ge 2)$ isolates an edge, but the minimum super vertex-cut of $BH_n$ $(n\ge 2)$ does not isolate an edge. Third, we also obtain that for $n\ge 2$, the cyclic connectivity of $BH_n$ is $4n-4$ and the cyclic edge-connectivity of $BH_n$ is $4(2n-2)$. That is, to obtain a disconnected graph which has at least two components containing cycles, we need to remove at least $4n-4$ vertices (resp. $4(2n-2)$ edges) from $BH_n$ $(n\ge 2)$.
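As a small sanity check on the edge counts quoted above, the sketch below builds $BH_n$ and counts the edges leaving one 4-cycle. The abstract does not spell out the adjacency rule; the code assumes the usual definition, in which vertex $(a_0,\dots,a_{n-1})$ with $a_i\in\{0,1,2,3\}$ is adjacent to $(a_0\pm 1, a_1,\dots,a_{n-1})$ and to $(a_0\pm 1,\dots,a_j+(-1)^{a_0},\dots)$ for $1\le j\le n-1$, all modulo 4. Function names are illustrative.

```python
from itertools import product

def bh_neighbors(v):
    """Neighbors of vertex v in the balanced hypercube (assumed standard definition)."""
    a0 = v[0]
    sign = 1 if a0 % 2 == 0 else -1          # (-1) ** a0
    nbrs = []
    for d in (1, -1):
        b0 = (a0 + d) % 4
        nbrs.append((b0,) + v[1:])            # change only the first coordinate
        for j in range(1, len(v)):
            w = list(v)
            w[0] = b0
            w[j] = (v[j] + sign) % 4          # also shift coordinate j
            nbrs.append(tuple(w))
    return nbrs

def balanced_hypercube(n):
    """Adjacency sets of BH_n, keyed by vertex tuples."""
    return {v: set(bh_neighbors(v)) for v in product(range(4), repeat=n)}

def boundary_of_4cycle(n):
    """Count edges leaving the 4-cycle obtained by varying only the first coordinate."""
    g = balanced_hypercube(n)
    cycle = {(i,) + (0,) * (n - 1) for i in range(4)}
    return sum(1 for u in cycle for w in g[u] if w not in cycle)
```

Under this definition every vertex has degree $2n$, and the edges leaving the chosen 4-cycle number $4(2n-2)$, which matches the cyclic edge-connectivity stated in the abstract. The sketch only exhibits one cut of that size; it does not show that no smaller cyclic edge-cut exists.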
TL;DR: This paper first formulates a novel data collection maximization problem by adopting multi-rate data transmissions and performing transmission time slot scheduling, shows that the problem is NP-hard, and devises an offline algorithm with a provable approximation ratio by exploiting the combinatorial property of the problem.
Abstract: In this paper we study data collection in an energy renewable sensor network for scenarios such as traffic monitoring on busy highways, where sensors are deployed along a predefined path (the highway) and a mobile sink travels along the path to collect data from one-hop sensors periodically. As sensors are powered by renewable energy sources, the time-varying characteristics of ambient energy sources pose great challenges in the design of efficient routing protocols for data collection in such networks. We first formulate a novel data collection maximization problem by adopting multi-rate data transmissions and performing transmission time slot scheduling, and show that the problem is NP-hard. We then devise an offline algorithm with a provable approximation ratio for the problem by exploiting the combinatorial property of the problem, assuming that the harvested energy at each node is given and link communications in the network are reliable. We also extend the proposed algorithm, with minor modifications, to a general case of the problem where the harvested energy at each sensor is not known in advance and link communications are not reliable. Third, we develop a fast, scalable online distributed algorithm for the problem in realistic sensor networks in which neither global knowledge of the network topology nor sensor profiles, such as sensor locations and their harvested energy profiles, are given. Furthermore, we consider a special case of the problem where each node has only a fixed transmission power, for which we propose an exact solution. Finally, we conduct extensive simulation experiments to evaluate the performance of the proposed algorithms. Experimental results demonstrate that the proposed algorithms are efficient and that the solutions obtained achieve a fraction of the optimum.
TL;DR: This work exploits the dynamic frequency scaling technique and formulates an optimization problem that minimizes OPEX while guaranteeing the quality-of-service, i.e., the expected response time of tasks.
Abstract: With the rising demand for cloud services, electricity consumption has been increasing drastically and has become the main operational expenditure (OPEX) of data center providers. The geographical heterogeneity of electricity prices motivates us to study the task placement problem over geo-distributed data centers. We exploit the dynamic frequency scaling technique and formulate an optimization problem that minimizes OPEX while guaranteeing the quality-of-service, i.e., the expected response time of tasks. Furthermore, an optimal solution is discovered for this formulated problem. The experimental results show that our proposal achieves much higher cost-efficiency than the traditional resizing scheme, i.e., activating/deactivating certain servers in data centers.
TL;DR: This paper models the workload and the power consumption of a multicore processor as random variables and exploits the monotonicity property of their distribution functions to establish a quantitative relationship between the random variables.
Abstract: Quantitatively estimating the relationship between the workload and the corresponding power consumption of a multicore processor is an essential step towards achieving energy proportional computing. Most existing and proposed approaches use Performance Monitoring Counters (Hardware Monitoring Counters) for this task. In this paper we propose a complementary approach that employs the statistics of CPU utilization (workload) only. Hence, we model the workload and the power consumption of a multicore processor as random variables and exploit the monotonicity property of their distribution functions to establish a quantitative relationship between the random variables. We will show that for a single-core processor the relationship is best approximated by a quadratic function, whereas for a dual-core processor the relationship is best approximated by a linear function. We will demonstrate the plausibility of our approach by estimating the power consumption of both custom-made and standard benchmarks (namely, the SPEC power benchmark and the Apache benchmarking tool) for an Intel and an AMD processor.
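One way to read the monotonicity argument, offered here as an interpretation rather than as the authors' procedure: if power is an increasing function $g$ of workload, then equal quantiles of the two distributions correspond, since $F_P(g(w)) = F_W(w)$. The sketch below pairs empirical quantiles by rank; the function name and the synthetic quadratic relation are assumptions made for illustration.

```python
def rank_matched_pairs(workload_samples, power_samples):
    """Pair the i-th smallest workload with the i-th smallest power reading.

    For a monotonically increasing workload-to-power relation, these pairs
    lie on that relation even if the two sample lists are not time-aligned.
    """
    if len(workload_samples) != len(power_samples):
        raise ValueError("both sample lists must have the same length")
    return list(zip(sorted(workload_samples), sorted(power_samples)))

def hypothetical_power(u):
    """Synthetic quadratic relation (illustrative only): watts at utilization u."""
    return 20.0 + 50.0 * u * u

workload = [0.9, 0.1, 0.5, 0.3]
power = [hypothetical_power(u) for u in (0.3, 0.9, 0.5, 0.1)]  # shuffled order
pairs = rank_matched_pairs(workload, power)
```

A curve (for example quadratic or linear, as the abstract reports for single-core and dual-core processors) could then be fitted to such pairs; the fitting step is not shown.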
TL;DR: A new, complete proof that an n-dimensional Star Graph is actually ((k + 1)n - 3k - 1)/k-diagnosable, and an O(N log N) diagnostic algorithm to locate all faulty processors, among which at most k fault-free processors might be wrongly diagnosed as faulty.
Abstract: The $t/k$-diagnosis is a system-level diagnostic strategy that can significantly enhance a system's self-diagnosing capability. It can detect up to $t$ faulty processors (or nodes, units), which might include at most $k$ misdiagnosed processors, where $k$ is typically a small number. Somani and Peleg (1996) claimed that an $n$-dimensional Star Graph (denoted $S_n$), a well-studied interconnection model for multiprocessor systems, is $((k + 1)n - 3k - 2)/k$-diagnosable. Recently, Chen and Liu (2012) found counterexamples to that diagnosability result, without further pursuing the cause of the flaw. In this paper, we provide a new, complete proof that an $n$-dimensional Star Graph is actually $((k + 1)n - 3k - 1)/k$-diagnosable, where $1 \leq k \leq 3$, and investigate the reason for the flawed result. Based on our newly obtained fault-tolerance properties, we also outline an $O(N \log N)$ diagnostic algorithm ($N = n!$ is the number of nodes in $S_n$) to locate all (up to $(k + 1)n - 3k - 1$) faulty processors, among which at most $k$ $(1 \leq k \leq 3)$ fault-free processors might be wrongly diagnosed as faulty.
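To make the two bounds concrete, the sketch below evaluates the proven value $t = (k+1)n - 3k - 1$ next to the earlier claim $(k+1)n - 3k - 2$, together with the node count $N = n!$. The function names are illustrative, and the sample values of $n$ and $k$ are chosen arbitrarily within the stated range $1 \leq k \leq 3$.

```python
import math

def star_graph_t(n, k):
    """Fault bound t from the abstract: S_n is t/k-diagnosable with t = (k+1)n - 3k - 1."""
    if not 1 <= k <= 3:
        raise ValueError("the abstract states the result for 1 <= k <= 3")
    return (k + 1) * n - 3 * k - 1

def earlier_claimed_t(n, k):
    """The earlier claimed bound, (k+1)n - 3k - 2, quoted in the abstract."""
    return (k + 1) * n - 3 * k - 2

def star_graph_nodes(n):
    """Number of nodes of the n-dimensional Star Graph, N = n!."""
    return math.factorial(n)

# Example: n = 5, k = 2 gives t = 8 among N = 120 nodes (earlier claim: 7).
t_proven = star_graph_t(5, 2)
t_claimed = earlier_claimed_t(5, 2)
nodes = star_graph_nodes(5)
```

The two expressions differ by exactly one for every $n$ and $k$.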
TL;DR: The experimental results show that the problem modeling approach and the proposed selection algorithm make it feasible to manage the fault tolerance of complex service-oriented systems both efficiently and effectively.
Abstract: Functionally equivalent web services can be composed to form more reliable service-oriented systems. However, the choice of fault tolerance strategy can have a significant effect on the quality-of-service (QoS) of the resulting service-oriented systems. In this paper, we investigate the problem of selecting an optimal fault tolerance strategy for building reliable service-oriented systems. We formulate the user requirements as local and global constraints and model the selection of fault tolerance strategy as an optimization problem. A heuristic algorithm is proposed to efficiently solve the optimization problem. Fault tolerance strategy selection for semantically related tasks is also investigated in this paper. Large-scale real-world experiments are conducted to illustrate the benefits of the proposed approach. The experimental results show that our problem modeling approach and the proposed selection algorithm make it feasible to manage the fault tolerance of complex service-oriented systems both efficiently and effectively.
TL;DR: A new efficient framework named Constant-size Ciphertext Policy Comparative Attribute-Based Encryption (CCP-CABE) with the support of negative attributes and wildcards that embeds the comparable attribute ranges of all the attributes into the user's key, and incorporates the attribute constraints into one piece of ciphertext during the encryption process to enforce flexible access control policies with various range relationships.
Abstract: With the proliferation of mobile devices in recent years, there is a growing concern regarding secure data storage, secure computation, and fine-grained access control in data sharing for these resource-constrained devices in a cloud computing environment. In this work, we propose a new efficient framework named Constant-size Ciphertext Policy Comparative Attribute-Based Encryption (CCP-CABE) with the support of negative attributes and wildcards. It embeds the comparable attribute ranges of all the attributes into the user’s key, and incorporates the attribute constraints of all the attributes into one piece of ciphertext during the encryption process to enforce flexible access control policies with various range relationships. Accordingly, CCP-CABE achieves efficiency because it generates constant-size keys and ciphertext regardless of the number of involved attributes, and it also keeps the computation cost constant on lightweight mobile devices. We further discuss how to extend CCP-CABE to fit a scenario with multiple attribute domains, such that decryption proceeds from the least privileged attribute domain to the most privileged one to help protect the privacy of the access policy. Finally, we provide a security analysis and a performance evaluation to demonstrate the framework's security and efficiency.
TL;DR: The architecture, design, analysis, and simulation and measurement results of the 3D-MAPS (3D massively parallel processor with stacked memory) chip built with a 1.5 V, 130 nm process technology and a two-tier 3D stacking technology are described.
Abstract: This paper describes the architecture, design, analysis, and simulation and measurement results of the 3D-MAPS (3D massively parallel processor with stacked memory) chip built with a 1.5 V, 130 nm process technology and a two-tier 3D stacking technology using 1.2 µm-diameter, 6 µm-height through-silicon vias (TSVs) and 3.4 µm-diameter face-to-face bond pads. 3D-MAPS consists of a core tier containing 64 cores and a memory tier containing 64 memory blocks. Each core communicates with its dedicated 4KB SRAM block using face-to-face bond pads, which provide negligible data transfer delay between the core and the memory tiers. The maximum operating frequency is 277 MHz and the maximum memory bandwidth is 70.9 GB/s at 277 MHz. The peak measured memory bandwidth usage is 63.8 GB/s and the peak measured power is approximately 4 W based on eight parallel benchmarks.
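The reported bandwidth figures can be cross-checked with simple arithmetic. The sketch below assumes that each of the 64 core-to-memory links moves 4 bytes per clock cycle; that width is not stated in the abstract and is chosen only because it reproduces the quoted 70.9 GB/s at 277 MHz. The function name is illustrative.

```python
def peak_bandwidth_gb_per_s(blocks, bytes_per_cycle, freq_hz):
    """Aggregate bandwidth in GB/s (1 GB = 1e9 bytes) if every block transfers each cycle."""
    return blocks * bytes_per_cycle * freq_hz / 1e9

# 64 memory blocks and 277 MHz come from the abstract;
# 4 bytes per cycle per block is an assumption.
peak = peak_bandwidth_gb_per_s(blocks=64, bytes_per_cycle=4, freq_hz=277e6)

# Share of the quoted maximum reached by the peak measured usage (both from the abstract).
utilization = 63.8 / 70.9
```

Under that assumption the computed peak is about 70.9 GB/s, and the measured 63.8 GB/s corresponds to roughly 90% of the quoted maximum.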