TL;DR: It is shown that it is possible to perform a successful cache attack against this AES implementation, in AES-256/GCM mode, using widely available hardware.
Abstract: The ARM TrustZone is a security extension which is used in recent Samsung flagship smartphones to create a Trusted Execution Environment (TEE) called a Secure World, which runs secure processes (Trustlets). The Samsung TEE includes cryptographic key storage and functions inside the Keymaster trustlet. The secret key used by the Keymaster trustlet is derived by a hardware device and is inaccessible to the Android OS. However, the ARM32 AES implementation used by the Keymaster is vulnerable to side channel cache-attacks. The Keymaster trustlet uses AES-256 in GCM mode, which makes mounting a cache attack against this target much harder. In this paper we show that it is possible to perform a successful cache attack against this AES implementation, in AES-256/GCM mode, using widely available hardware. Using a laptop’s GPU to parallelize the analysis, we are able to extract a raw AES-256 key with 7 min of measurements and under a minute of analysis time and an AES-256/GCM key with 40 min of measurements and 30 min of analysis.
TL;DR: An energy-efficient reconfigurable platform for in-memory processing based on novel four-terminal spin Hall effect-driven domain wall motion devices that could be employed as both nonvolatile memory cell and in-memory logic unit is proposed.
Abstract: In this paper, we propose an energy-efficient reconfigurable platform for in-memory processing based on novel four-terminal spin Hall effect-driven domain wall motion devices that could be employed as both nonvolatile memory cell and in-memory logic unit. The proposed designs lead to unity of memory and logic. The device to system level simulation results show that, with 28% area increase in memory structure, the proposed in-memory processing platform achieves a write energy ~15.6 fJ/bit with 79% reduction compared to that of SOT-MRAM counterpart while keeping the identical 1 ns writing speed. In addition, the proposed in-memory logic scheme improves the operating energy by 61.3%, as compared with the recent nonvolatile in-memory logic designs. An extensive reliability analysis is also performed over the proposed circuits. We employ advanced encryption standard (AES) algorithm as a case study to elucidate the efficiency of the proposed platform at application level. Simulation results exhibit that the proposed platform can show up to 75.7% and 30.4% lower energy consumption compared to CMOS-ASIC and recent pipelined domain wall (DW) AES implementations, respectively. In addition, the AES energy-delay product can show 15.1% and 6.1% improvements compared to the DW-AES and CMOS-ASIC implementations, respectively.
TL;DR: A high performance encryption system based on AES is proposed, in which AES can work at all three modes including AES-128, AES-192, and AES-256, and a design of $1^{st}$ order mask has been proposed which can resist $1^{st}$ order differential (or correlation) power attack.
Abstract: The Advanced Encryption Standard (AES), defined by the National Institute of Standards and Technology (NIST), is widely used for symmetric cryptography. In this paper, a high performance encryption system based on AES is proposed, in which AES can work in all three modes: AES-128, AES-192, and AES-256. In addition, the proposed AES implementation is pipelined into 4 stages for each round operation, with the decryption module reusing some circuits of the encryption module, which leads to a performance improvement in terms of area and throughput. Furthermore, a $1^{st}$ order masking design is proposed which can resist $1^{st}$ order differential (or correlation) power attacks. Our design can be easily extended to other SPN-based primitives.
TL;DR: This paper introduces Correlation Optimization (CO), a novel approach that improves CEMA attacks by formulating the selection of useful EM leakage samples in a trace as a machine learning optimization problem, and proposes the correlation loss function, which aims to maximize the Pearson correlation between a set of EM traces and the true AES key during training.
Abstract: Sensitive cryptographic information, e.g. AES secret keys, can be extracted from the electromagnetic (EM) leakages unintentionally emitted by a device using techniques such as Correlation Electromagnetic Analysis (CEMA). In this paper, we introduce Correlation Optimization (CO), a novel approach that improves CEMA attacks by formulating the selection of useful EM leakage samples in a trace as a machine learning optimization problem. To this end, we propose the correlation loss function, which aims to maximize the Pearson correlation between a set of EM traces and the true AES key during training. We show that CO works with high-dimensional and noisy traces, regardless of time-domain trace alignment and without requiring prior knowledge of the power consumption characteristics of the cryptographic hardware. We evaluate our approach using the ASCAD benchmark dataset and a custom dataset of EM leakages from an Arduino Duemilanove, captured with a USRP B200 SDR. Our results indicate that the masked AES implementation used in all three ASCAD datasets can be broken with a shallow Multilayer Perceptron model, whilst requiring only 1,000 test traces on average. A similar methodology was employed to break the unprotected AES implementation from our custom dataset, using 22,000 unaligned and unfiltered test traces.
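The correlation loss described above can be sketched as the negated Pearson correlation between a model's outputs and the target leakage values. This is only one plausible form: the abstract does not give the formula, and the names `pearson` and `correlation_loss` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant sequence
    return cov / math.sqrt(var_x * var_y)


def correlation_loss(predictions, targets):
    """Assumed loss: minimising it maximises the Pearson correlation."""
    return -pearson(predictions, targets)
```

Under this assumption the loss ranges from -1 (perfect positive correlation) to 1 (perfect negative correlation), so a gradient-based optimiser that minimises it pushes the model output towards the key-dependent target.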
TL;DR: This work builds a heuristic leakage model and a novel leakage model to exploit the simultaneous EM leakages in parallel scenarios and evaluates the effectiveness of EM attacks on GPU-based AES implementation to show that GPU-based AES implementation is vulnerable to EM attacks.
Abstract: In this work, for the first time, we investigate Electro-Magnetic (EM) attacks on GPU-based AES implementation. In detail, we first sample EM traces using a delicate trigger; then, we build a heuristic leakage model and a novel leakage model to exploit the simultaneous EM leakages in parallel scenarios. After that, we evaluate the effectiveness of EM attacks on GPU-based AES implementation. Our evaluation results show that GPU-based AES implementation is vulnerable to EM attacks. This work also suggests that GPU-based AES implementation needs to be protected against EM attacks in real scenarios.
TL;DR: A representation of the AES S-box exploiting rotational symmetry which leads to a 50% reduction of the area footprint on FPGA devices is demonstrated and a heuristic-based algorithm to find a masking of a given function with d + 1 shares is introduced, allowing the smallest masked AES implementation on Xilinx FPGAs, to-date.
Abstract: The effort in reducing the area of AES implementations has largely been focused on Application-Specific Integrated Circuits (ASICs) in which a tower field construction leads to a small design of the AES S-box. In contrast, a naive implementation of the AES S-box has been the status-quo on Field-Programmable Gate Arrays (FPGAs). A similar discrepancy holds for masking schemes – a well-known side-channel analysis countermeasure – which are commonly optimized to achieve minimal area in ASICs. In this paper we demonstrate a representation of the AES S-box exploiting rotational symmetry which leads to a 50% reduction of the area footprint on FPGA devices. We present new AES implementations which improve on the state of the art and explore various trade-offs between area and latency. For instance, at the cost of increasing 4.5 times the latency, one of our design variants requires 25% fewer look-up tables (LUTs) than the smallest known AES on Xilinx FPGAs by Sasdrich and Guneysu at ASAP 2016. We further explore the protection of such implementations against first-order side-channel analysis attacks. Targeting the small area footprint on FPGAs, we introduce a heuristic-based algorithm to find a masking of a given function with d + 1 shares. Its application to our new construction of the AES S-box allows us to introduce the smallest masked AES implementation on Xilinx FPGAs, to-date.
TL;DR: To deal with the attacks and improve AES circuit's information security, one protection, namely Registered Data Obfuscation, is presented and experiment results show that with the proposed protection, the scan-based attack is invalidated to leak the critical data.
Abstract: With the rapid development and globalization of the semiconductor industry, data security is becoming a more critical issue for highly confidential devices, especially for cryptography related applications. Advanced Encryption Standard (AES) is widely used for information security. For AES, the most important data are plaintext and keys, which are the targets of attacks. In this paper, AES security vulnerabilities are analyzed first. Information leakage would be a major concern for AES. Hence one of the most common types of attacks that could leak information at the AES implementation, inserted into AES and utilizing scan chains in or around AES to extract keys or plaintext, is discussed. To deal with the attacks and improve AES circuit's information security, one protection, namely Registered Data Obfuscation, is presented. Experiment results show that with the proposed protection, the scan-based attack is invalidated to leak the critical data. Meanwhile, the proposed protection can also disable key Trojan attack introduced in [1, 2]. The cost analysis shows that the additional area and power overhead incurred by the proposed protection are 1.09% and 0.46%, respectively.
TL;DR: The catch-the-flag challenge of CHES 2017 as discussed by the authors was the first white-box cryptography challenge, where the participants were not expected to disclose their identities or the underlying designing/attacking techniques.
Abstract: White-box cryptography (WBC) protects key extraction from software implementations of cryptographic primitives. Many academic works have been done achieving partial results toward WBC, but a complete solution has not been found yet by the cryptography community. As a result, the industry can only rely on proprietary and non-publicly scrutinized white-box implementations. It is therefore of interest to investigate the obtainable resistance of an AES implementation to thwart a white-box adversary in this paradigm. To this purpose, the ECRYPT CSA project has organized the WhibOx contest as the catch the flag challenge of CHES 2017. Researchers and engineers were invited to participate either as designers by submitting the source code of an AES-128 white-box implementation with a freely chosen key, or as breakers by trying to extract the hard-coded keys in the submissions. The participants were not expected to disclose their identities or the underlying designing/attacking techniques. In the end, 94 submitted challenges were all broken, and only 13 of them held more than one day. The strongest (in terms of surviving time) implementation survived for 28 days (which is more than twice as much as the second one). It was only broken by the authors of the present paper with reverse engineering and algebraic analysis. In this paper, we give a detailed description of the different steps of our cryptanalysis. We then generalize it to an attack methodology to break further obscure white-box implementations. In particular, we formalize and generalize the linear decoding analysis that we use to extract the key from the encoded intermediate variables of the target challenge.
TL;DR: The practical issues that relate to white-box block cipher implementations from lightweight block ciphers are discussed and the performance and the costs are compared with the white-box AES implementation.
TL;DR: The authors show that the secret key of the Luo-Lai-You (LLY) implementation can be recovered with a time complexity of about $2^{44}$, and propose a new white-box AES implementation based on table lookups, which is shown to be resistant against the existing table-composition-targeting white-box attacks.
Abstract: White-box cryptography protects cryptographic software in a white-box attack context (WBAC), where the dynamic execution of the cryptographic software is under full control of an adversary. Protecting AES in the white-box setting attracted many scientists and engineers, and several solutions emerged. However, almost all these solutions have been badly broken by various efficient white-box attacks, which target compositions of key-embedding lookup tables. In 2014, Luo, Lai, and You proposed a new WBAC-oriented AES implementation, and claimed that their implementation is secure against both Billet et al.'s attack and De Mulder et al.'s attack. In this study, based on the existing table-composition-targeting cryptanalysis techniques, the authors show that the secret key of the Luo-Lai-You (LLY) implementation can be recovered with a time complexity of about $2^{44}$. Furthermore, the authors propose a new white-box AES implementation based on table lookups, which is shown to be resistant against the existing table-composition-targeting white-box attacks. The authors' key-embedding tables are obfuscated with large affine mappings, which cannot be cancelled out by table compositions of the existing cryptanalysis techniques. Although their implementation requires twice as much memory as the LLY WBAES to store the tables, its speed is about 63 times that of the latter.
TL;DR: This paper presents the modifications of a common ZYBO board, that are necessary to perform the CPA attack, and illustrates the whole process of attacking both software and hardware implementations of AES-128.
Abstract: Differential power analysis (DPA) and its enhanced variant, correlation power analysis (CPA), are among the most common side channel attacks today. A dedicated hardware platform is often used when performing this kind of attack for experimental purposes. In this paper, we present the modifications of a common ZYBO board that are necessary to perform the CPA attack. We illustrate the whole process of attacking both software and hardware implementations of AES-128 and we present our experimental results.
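The core of a CPA attack on one AES key byte can be sketched on simulated data as follows. This is a minimal illustration, not the paper's setup: the Hamming-weight leakage of the first-round S-box output, the Gaussian noise, and the names `simulate_traces` and `cpa_recover_byte` are all assumptions, and real traces from a board would contain many sample points per encryption.

```python
import math
import random


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_sbox():
    """Build the AES S-box: field inverse followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = 0 if x == 0 else next(y for y in range(1, 256) if gf_mul(x, y) == 1)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()
HW = [bin(v).count("1") for v in range(256)]  # Hamming weight table


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return 0.0 if vx == 0 or vy == 0 else cov / math.sqrt(vx * vy)


def simulate_traces(key_byte, n_traces, noise_sd, rng):
    """Assumed leakage: HW of SBOX[p ^ k] plus Gaussian noise, one sample per trace."""
    plaintexts = [rng.randrange(256) for _ in range(n_traces)]
    traces = [HW[SBOX[p ^ key_byte]] + rng.gauss(0.0, noise_sd) for p in plaintexts]
    return plaintexts, traces


def cpa_recover_byte(plaintexts, traces):
    """Return the key guess whose predicted leakage correlates best with the traces."""
    best_guess, best_corr = 0, -2.0
    for guess in range(256):
        model = [HW[SBOX[p ^ guess]] for p in plaintexts]
        corr = pearson(model, traces)
        if corr > best_corr:
            best_guess, best_corr = guess, corr
    return best_guess, best_corr


rng = random.Random(1)
plaintexts, traces = simulate_traces(0x2B, 500, 1.0, rng)
guess, corr = cpa_recover_byte(plaintexts, traces)
```

Repeating the same search for each of the 16 key bytes would recover a full AES-128 key under this simulated leakage model; on real hardware the number of traces needed depends on the noise and on how well the model matches the device.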
TL;DR: An extensive performance analysis is presented, including a proof-of-concept, software-based AES implementation featuring the masking technique to resist both fault and side-channel attacks at the same time.
Abstract: Side channel analysis and fault attacks are two powerful methods to analyze and break cryptographic implementations. At CHES 2011, Roche and Prouff applied secure multiparty computation to prevent side-channel attacks. While multiparty computation is known to be fault-resistant as well, the particular scheme used for side-channel protection does not currently offer this feature. This work introduces a new secure multiparty circuit to prevent both fault injection attacks and side-channel analysis. The new scheme extends the Roche and Prouff scheme to make faults detectable. Arithmetic operations have been redesigned to propagate fault information until a new secrecy-preserving fault detection can be performed. A new recombination operation ensures randomization of the output in the case of a fault, ensuring that nothing can be learned from the faulty output. The security of the new scheme is proved in the ISW probing model, using the reformulated t-SNI security notion. Besides the new scheme and its security proof, we also present an extensive performance analysis, including a proof-of-concept, software-based AES implementation featuring the masking technique to resist both fault and side-channel attacks at the same time. The performance analysis for different security levels is given for the ARM-M0+ MCU with its memory requirements. A comprehensive leakage analysis shows that a careful implementation of the scheme achieves the expected security level.
TL;DR: There is a significant difference in the speed performance between the two versions favouring the proposed AES algorithm using multiple S-Boxes, and the AES-2SBox performed more efficiently in both encryption and decryption processes.
Abstract: This paper proposes a modified version of the AES algorithm using multiple substitution boxes (S-Boxes). While many studies have been conducted specifically on modifying the S-Box, these studies were made to replace the Rijndael S-boxes in the AES cipher. We propose to implement two substitution boxes, where the first S-Box is the Rijndael S-box and will be used as is. The second S-Box was constructed through an XOR operation and affine transformation and will replace the MixColumns operation within the internal rounds in the cipher. Based on the simulation testing conducted, it was found that there is a significant difference in the speed performance between the two versions, favouring the proposed AES algorithm using multiple S-Boxes. The findings also revealed that in both encryption and decryption processes, the AES-2SBox performed more efficiently at 27.638% and 108.369% respectively as compared to the original AES algorithm. However, when tested using the avalanche effect, the changes in the output bits were below the minimum expected rate.
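The avalanche test mentioned above can be sketched as a measurement of how many output bits change when a single input bit is flipped; a ratio near one half is the usual ideal. The sketch below is hedged: `avalanche_ratio` is a hypothetical helper, and SHA-256 from the standard library is used purely as a stand-in primitive because neither AES nor the AES-2SBox variant is available there.

```python
import hashlib
import random


def avalanche_ratio(func, block_len, trials, rng):
    """Average fraction of output bits that flip when one random input bit flips."""
    flipped = total = 0
    for _ in range(trials):
        data = bytearray(rng.randrange(256) for _ in range(block_len))
        base = func(bytes(data))
        bit = rng.randrange(block_len * 8)
        data[bit // 8] ^= 1 << (bit % 8)  # flip exactly one input bit
        changed = func(bytes(data))
        flipped += sum(bin(a ^ b).count("1") for a, b in zip(base, changed))
        total += len(base) * 8
    return flipped / total


def sha256_stand_in(block):
    # Stand-in primitive, NOT the paper's AES-2SBox; swap in a real cipher here.
    return hashlib.sha256(block).digest()


ratio = avalanche_ratio(sha256_stand_in, 16, 64, random.Random(0))
```

To evaluate a cipher, `func` would be replaced by an encryption call under a fixed key; a function with no diffusion at all, such as the identity, flips only the one bit that was changed.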
TL;DR: Based on a theoretical analysis on how to quantify the remaining entropy, a practical search algorithm is derived and shows that even in a setting with high noise or few available traces the authors can either successfully recover the full AES key or reduce its entropy significantly.
Abstract: Side Channel Attacks are an important attack vector on secure AES implementations. The Correlation-Enhanced Power Analysis Collision Attack by Moradi et al. [MME10] is a powerful collision attack that exploits leakage caused by collisions in between S-Box computations of AES. The attack yields observations from which the AES key can be inferred. Due to noise, an insufficient number of collisions, or errors in the measurement setup, the attack does not find the correct AES key uniquely in practice, and it is unclear how to determine the key in such a scenario. Based on a theoretical analysis on how to quantify the remaining entropy, we derive a practical search algorithm. Both our theoretical analysis and practical experiments show that even in a setting with high noise or few available traces we can either successfully recover the full AES key or reduce its entropy significantly.
TL;DR: This paper presents a novel strategy to recover the secret AES key by exploiting the properties of the FPGA’s memory elements called Block RAM (BRAM) that are often used to store the Rijndael S-boxes.
Abstract: Fault injection attacks constitute a major attack vector on cryptographic implementations, such as the Advanced Encryption Standard (AES). On Field Programmable Gate Arrays (FPGAs), the circuit can be altered by tampering with the configuration data and thereby causing a desired faulty execution that leaks information about the secret key. Often it is not even necessary to conduct extensive reverse engineering of the proprietary bitstream file format. In this paper, we present a novel strategy to recover the secret AES key by exploiting the properties of the FPGA’s memory elements called Block RAM (BRAM) that are often used to store the Rijndael S-boxes. The attack can be performed by a single reconfiguration with a faulty bitstream without any knowledge of either design properties or plaintext input. The advantage of our approach is that this attack works also with encrypted bitstreams. However, our experiments show that the number of reconfigurations might increase in this case.
TL;DR: It is suggested that cache-collision on GPU does give rise to leakages via EM side-channels and it should be considered in the design of secure GPU-based cryptographic implementations.
Abstract: For computationally-intensive tasks like cryptographic applications, GPU is thought to be an ideal platform due to its parallel computing power. However, some vulnerabilities of GPU have been published due to overflow attacks, covert-channel attacks and side-channel attacks. In this work, for the first time, we investigate cache-collision attacks on GPU-based AES implementation utilizing Electro-Magnetic (EM) leakages. We construct a much more efficient leakage model based on generalized simultaneous cache-collision in multi-threads scenarios, and we mount a key-recovery attack with Differential Electro-Magnetic Analysis (DEMA). Our evaluation results show that the 16-byte secret key of GPU-based AES implementation can be recovered with only 5,000 EM traces, and 600 EM traces are enough when assisted with appropriate key enumeration algorithm (KEA). This work suggests that cache-collision on GPU does give rise to leakages via EM side-channels and it should be considered in the design of secure GPU-based cryptographic implementations.
TL;DR: The use of encoding with the one-hot masking technique does not provide the maximum countermeasure effect against CPA-based attacks, and a CPA attack can successfully reveal the AES secret key.
Abstract: Modern communication systems use cryptographic algorithms to keep data confidential, intact, and authentic. A cryptographic algorithm becomes vulnerable in a new way when it is implemented on a hardware device: the power analysis attack, which is considered capable of uncovering the secret key used by the algorithm. Previous research has introduced countermeasures against this vulnerability. Some researchers suggest a logic-level countermeasure that encodes the AES, which is low-cost and efficient. The contribution of this paper is to analyze CPA on an encryption device that has been given a logic-level countermeasure. Our finding is that encoding with the one-hot masking technique does not provide the maximum countermeasure effect against CPA-based attacks. In this research, the CPA attack successfully reveals the AES secret key.
TL;DR: This work performs a power analysis attack on an FPGA implementation of the Advanced Encryption Standard (AES) which is not protected against side-channel attacks and estimates the number of power traces required to extract its secret key.
Abstract: Side-channel attacks are currently one of the most powerful attacks against implementations of cryptographic algorithms. They exploit the correlation between the physical measurements (power consumption, electromagnetic emissions, timing) taken at different points during the computation and the secret key. Some of the existing countermeasures offer a protection against one specific type of side channel only. We show that this can be a bad practice which can make exploitation of other side-channels easier. First, we perform a power analysis attack on an FPGA implementation of the Advanced Encryption Standard (AES) which is not protected against side-channel attacks and estimate the number of power traces required to extract its secret key. Then, we repeat the attack on AES implementations which are protected against fault injections by hardware redundancy and show that they can be broken with three times fewer power traces than the unprotected AES. We also demonstrate that the problem cannot be solved by complementing the duplicated module, as previously proposed. Our results show that there is a need for increasing knowledge about side-channel attacks and designing stronger countermeasures.
TL;DR: This work utilized traditional as well as LabVIEW FPGA platforms to get an optimized high speed design of AES (Advanced Encryption Standard) to secure the communication between ROV and control station in a marine environment.
Abstract: The LabVIEW FPGA platform is based on a graphical programming approach, which makes FPGA programming and I/O interfacing easy. LabVIEW FPGA significantly improves design productivity and helps to reduce the time to market. On the other hand, the traditional FPGA platform helps to obtain an efficient, optimized design by providing control over each bit using HDL programming languages. This work utilized traditional as well as LabVIEW FPGA platforms to get an optimized high speed design of AES (Advanced Encryption Standard). AES is considered to be a secure and reliable cryptographic algorithm that is used worldwide to provide encryption services, which hide information during communication over untrusted networks such as the Internet. Here, the AES core is proposed to secure the communication between an ROV (Remotely Operated Vehicle) and a control station in a marine environment, but this core can fit any other high speed electronic communication. This work provides encryption of 128-byte, 256-byte and 512-byte sets of inputs (individually and simultaneously) using a 128-bit key. In the simultaneous implementation, all the above-mentioned sets of inputs are encrypted in parallel. This simultaneous implementation results in throughput in the Gbps range.
TL;DR: An AES implementation using energy efficient shift register with ADOC and RTPG gating technique which improves the efficiency of the AES implemented design is introduced.
Abstract: Security is one of the most important aspects of the Internet of Things (IoT). The popularity of IoT has increased the need for Radio Frequency Identification (RFID) chips and smart cards that are energy efficient and more secure against attacks. A lightweight encryption circuit, which is required to be energy and area efficient, is essential for any IoT application. This paper introduces an AES implementation using an energy efficient shift register with ADOC and RTPG gating techniques, which improves the efficiency of the implemented AES design. A comparison is made with current AES designs with respect to power, area and delay. The proposed AES design shows a 29.32 percent improvement in power consumption over the register renaming based AES design.
TL;DR: This paper utilizes the special architecture of All Programmable SoC to implement a secure AES encryption scheme which can efficiently resist both cache timing and power/electromagnetic analysis attacks.
Abstract: With the rapid development of IoT devices in the direction of multifunction and personalization, All Programmable SoC has been used more and more frequently because of its unrivaled levels of system performance, flexibility, and scalability. On the other hand, this type of SoC faces a growing range of security threats. Among these threats, cache timing attacks and power/electromagnetic analysis attacks are two considerable ones which have been widely studied. Although many countermeasures have been proposed to resist these two types of attacks, most of them can only withstand a single type but are often incapable when facing multi-type attacks. In this paper, we utilize the special architecture of All Programmable SoC to implement a secure AES encryption scheme which can efficiently resist both cache timing and power/electromagnetic analysis attacks. The AES implementation has a beginning software stage, a middle hardware stage and a final software stage. Operations in software and start/end round of hardware are all randomized, which allows our implementation to withstand two types of attacks. To illustrate the security of the implementation, we conduct the three types of attacks on unprotected software/hardware AES, shuffled software AES and our scheme. Furthermore, we use Test Vector Leakage Assessment (TVLA) to test their security on encryption times and power/electromagnetic traces. The final result indicates that our encryption implementation achieves a high secure level with almost 0.86 times data throughput of the shuffled software AES implementation.
TL;DR: A new scheme based on randomizing power consumption of a fixed-operation logic gate is proposed, enhancing the immunity of AES algorithm against DPA and can be used as a general hardening method in the majority of cryptographic algorithms.
Abstract: Side-channel attacks are considered to be among the most important problems of modern digital security systems. Today, Differential Power Attack (DPA) is one of the most powerful tools for attacking hardware encryption algorithms in order to discover the correct key of the system. In this work, a new scheme based on randomizing power consumption of a fixed-operation logic gate is proposed. The goal of this method is enhancing the immunity of AES algorithm against DPA. Having a novel topology to randomize the power consumption of each Exclusive-NOR gate, the proposed circuit causes random changes in the overall power consumption of the steps of the algorithm; thus, the correlation between the instantaneous power consumption and the correct key is decreased, and the immunity of AES implementations in which the key is injected through Exclusive-NOR gates is greatly increased. The proposed method can be used as a general hardening method in the majority of cryptographic algorithms. The results of theoretical analysis and simulations in 90-nm technology demonstrate the capability of the proposed circuits to strengthen AES against DPA. The CMOS area and power consumption overhead is less than 1%.
TL;DR: This review paper will concentrate on two kinds of method of constructing AES S-Box, which are algebraic approach and heuristic approach and the finding may offer the potential approach to develop a new S-box that is better than the original one.
Abstract: Although attacks on the cryptosystem are still not severe, the development of the scheme is still ongoing, especially for the design of the S-Box. Two main approaches have been used: the heuristic method and the algebraic method. The algebraic method, as in the current AES implementation, has been proven to be the most secure S-Box design to date. This review paper will concentrate on two kinds of method of constructing the AES S-Box, which are the algebraic approach and the heuristic approach. The objective is to review methods of constructing S-Boxes which are comparable or close to the original construction of the AES S-Box, especially for the heuristic approach. Finally, all the listed S-Boxes from these two methods will be compared in terms of their security performance, namely the nonlinearity and differential uniformity of the S-Box. The finding may offer a potential approach to develop a new S-Box that is better than the original one.
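The two comparison metrics named above can be computed directly for any 8-bit S-box. The sketch below builds the standard AES S-box from its algebraic definition and evaluates both metrics; the function names are hypothetical, and the nonlinearity is taken as the minimum over all nonzero output masks, using a fast Walsh-Hadamard transform.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_sbox():
    """Build the AES S-box: field inverse followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = 0 if x == 0 else next(y for y in range(1, 256) if gf_mul(x, y) == 1)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


def differential_uniformity(sbox):
    """Largest count of x with S(x) ^ S(x ^ a) == b over all a != 0 and all b."""
    worst = 0
    for a in range(1, 256):
        counts = [0] * 256
        for x in range(256):
            counts[sbox[x] ^ sbox[x ^ a]] += 1
        worst = max(worst, max(counts))
    return worst


def max_walsh(signs):
    """Largest absolute Walsh coefficient of a +1/-1 valued function on 8 bits."""
    w = list(signs)
    h = 1
    while h < 256:
        for i in range(0, 256, 2 * h):
            for j in range(i, i + h):
                a, b = w[j], w[j + h]
                w[j], w[j + h] = a + b, a - b
        h *= 2
    return max(abs(v) for v in w)


def nonlinearity(sbox):
    """128 minus half the largest Walsh coefficient over all nonzero output masks."""
    worst = 0
    for mask in range(1, 256):
        signs = [1 - 2 * (bin(sbox[x] & mask).count("1") & 1) for x in range(256)]
        worst = max(worst, max_walsh(signs))
    return 128 - worst // 2


AES_SBOX = build_sbox()
du = differential_uniformity(AES_SBOX)
nl = nonlinearity(AES_SBOX)
```

For the standard AES S-box these evaluate to a differential uniformity of 4 and a nonlinearity of 112, the reference values against which a candidate S-box would be compared; lower differential uniformity and higher nonlinearity are better.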
TL;DR: An AES implementation based on one lookup table of 512 B with optimised structure, named 1-T, to improve the access-driven cache attack resistant ability and optimise the implementation of round function of 1-T to eliminate the speed influence from the shrunken lookup table.
Abstract: The traditional advanced encryption standard (AES) implementations based on four lookup tables (4-T) of 1 KB size have high encryption performance, but at the same time face access-driven cache attacks. In this paper, we present an AES implementation based on one lookup table of 512 B with optimised structure, named 1-T, to improve resistance to access-driven cache attacks. Furthermore, we optimise the implementation of the round function of 1-T to eliminate the speed influence from the shrunken lookup table. The experiment result shows that the attack resistance of 1-T is much higher than that of 4-T under the same cache setting; the encryption time of 1-T is 43.5% and 106.3% higher than that of 4-T on the ARM and the x86 platform respectively, but the storage overhead is only 28% of that of 4-T.
TL;DR: The research shows that the secret key of the target implementation can be recovered with less cost than expected, which suggests that the side-channel security of parallel cryptographic implementations should be reevaluated before application.
Abstract: Parallel cryptographic implementations are generally considered to be more advantageous than their non-parallel counterparts in mitigating side-channel attacks because of their higher noise-level. So far as we know, the side-channel security of GPU-based cryptographic implementations has been studied in recent years, and those implementations then turn out to be susceptible to some side-channel attacks. Unfortunately, the target parallel implementations in their work do not achieve strict parallelism because of the occurrence of cached memory accesses or the use of conditional branches, so how strict parallelism affects the side-channel security of cryptographic implementations is still an open problem. In this work, we make a case study of the side-channel security of a GPU-based bitsliced AES implementation in terms of bit-level parallelism and thread-level parallelism in order to show the way that works to reduce the side-channel security of strict parallel implementations. We present GPU-based bitsliced AES implementation as the study case because (1) it achieves strict parallelism so as to be resistant to cache-based attacks and timing attacks; and (2) it achieves both bit-level parallelism and thread-level parallelism (a.k.a. task-level parallelism), which enables us to research from multiple perspectives. More specifically, we first set up our testbed and collect electro-magnetic (EM) traces with some special techniques. Then, the measured traces are analyzed in two granularities. In bit-level parallelism, we give a non-profiled leakage detection test before mounting attacks with our proposed bit-level fusion techniques like multi-bits feature-level fusion attacks (MBFFA) and multi-bits decision-level fusion attacks (MBDFA). In thread-level parallelism, a profiled leakage detection test is employed to extract some special information from multi-threads leakages, and with the help of this information our proposed multi-threads hybrid fusion attack (MTHFA) method takes effect. Last, we propose a simple metric to quantify the side-channel security of parallel cryptographic implementations. Our research shows that the secret key of our target implementation can be recovered with less cost than expected, which suggests that the side-channel security of parallel cryptographic implementations should be reevaluated before application. Keywords: Side-Channel Attacks (SCA), Side-Channel Fusion Attacks (SCFA), Electro-Magnetic Attacks (EMA), Strict Parallel Cryptographic Implementation, Warp Asynchronous Leakages
TL;DR: This paper adapts an ID-based authentication scheme that can significantly increase the authentication speed compared to conventional schemes and can verify the authenticity of a prover among 2^70 different provers within 0.59 s; this could not be handled effectively using previous schemes.
Abstract: Various electronic devices are increasingly being connected to the Internet. Meanwhile, security problems, such as fake silicon chips, still exist. The significance of verifying the authenticity of these devices has led to the proposal of side-channel authentication. Side-channel authentication is a promising technique for enriching digital authentication schemes. Motivated by the fact that each cryptographic device leaks side-channel information depending on the secret keys it uses, cryptographic devices with different keys can be distinguished by analyzing the side-channel information leaked during their calculation. Based on the original side-channel authentication scheme, this paper adapts an ID-based authentication scheme that can significantly increase the authentication speed compared to conventional schemes. A comprehensive study is also conducted on the proposed ID-based side-channel authentication scheme. The performance of the proposed authentication scheme is evaluated in terms of speed and accuracy based on an FPGA-based AES implementation. With the proposed scheme, our experimental setup can verify the authenticity of a prover among 2^70 different provers within 0.59 s; this could not be handled effectively using previous schemes.
TL;DR: A thorough comparison between different AES S-box circuits in 28nm Fully Depleted Silicon-On-Insulator (FD-SOI) technology of STMicroelectronics allows cryptographic hardware designers to select the most suitable S-box design for their resource-limited AES implementation.
Abstract: This paper elaborates on the results of a thorough comparison between different AES S-box circuits in 28nm Fully Depleted Silicon-On-Insulator (FD-SOI) technology of STMicroelectronics. The three evaluated S-boxes are strategically chosen to provide a maximum coverage of the design space. Simulation results regarding area, speed, power and energy are presented and analyzed. Further, ultra low-power implementations are considered by simulating the circuits in the sub-threshold region. The presented performance comparison allows cryptographic hardware designers to select the most suitable S-box design for their resource-limited AES implementation.
TL;DR: This work realized an effective cache attack on AES implementations for Android smartphones based on the Prime+Probe strategy and proposed using the K-S statistical test to help rank private key assumptions in a noisy execution environment.
Abstract: Cache attacks are mainly based on information leakage through the timing difference between cache hits and misses. They are an effective technique for attacking AES implementations on the x86 platform. However, since the cache architecture and instruction set of smartphones differ from those of the Intel platform, an effective cache attack on AES implementations for smartphones still faces several challenges. In this work, we realized an effective cache attack on AES implementations for Android smartphones based on the Prime+Probe strategy. We also proposed using the K-S statistical test to help rank private key assumptions in a noisy execution environment. Our results show that cache attacks on AES implementations for the Android platform are practical and that countermeasures are needed to ensure mobile security.
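The ranking step mentioned in this abstract can be sketched as follows, assuming "K-S" refers to the two-sample Kolmogorov-Smirnov test. The abstract does not say how the test is applied, so the setup here is an assumption: each key hypothesis is scored by how far its probe-timing distribution lies from a baseline distribution. The function names, the dictionary layout, and the timing values are all hypothetical and are not taken from the paper.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: largest gap between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    gap = 0.0
    # The largest gap always occurs at an observed value, so checking
    # every sample point is sufficient.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def rank_hypotheses(timings_by_hypothesis, baseline):
    """Order key hypotheses from most to least distinguishable from baseline.

    timings_by_hypothesis is a hypothetical layout: it maps a key-byte guess
    to the probe timings grouped under that guess.
    """
    scores = {
        guess: ks_statistic(timings, baseline)
        for guess, timings in timings_by_hypothesis.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Made-up timings for illustration only.
baseline = [100, 101, 102]
candidates = {0x10: [100, 101, 102], 0x2A: [139, 140, 141]}
ranking = rank_hypotheses(candidates, baseline)
```

In this toy input the guess `0x2A` ranks first because its timings do not overlap the baseline at all. A distribution-level statistic like this compares whole samples rather than means, which is one plausible reason to prefer it when individual measurements are noisy.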
TL;DR: An FPGA architecture for a 512-bit AES implementation using a pre-ciphered lookup table approach is developed in Verilog HDL and synthesized for a Virtex-7 device, showing a 290.71% increase in throughput in comparison with the previous implementation.
Abstract: This paper proposes an FPGA architecture for a 512-bit AES implementation using a pre-ciphered lookup table approach. The hardware realization uses a 512-bit block message and a 512-bit key. The architecture is designed to give an increased throughput for applications where session keys are used for communication. The architecture exploits the fact that the session key does not change for the duration of a session; therefore, a pre-ciphered lookup table can be used to enhance the encryption throughput. The design is suitable for applications where communication is performed in sessions and the key does not change frequently, such as HTTP or Telnet remote login sessions in the application layer. The FPGA architecture is developed in Verilog HDL and synthesized for a Virtex-7 device, showing a 290.71% increase in throughput in comparison with the previous implementation.
TL;DR: An AES-like cipher is proposed by replacing AES’s S-boxes and MixColumn matrices with key-dependent components while keeping their good cryptographic properties, and its white-box implementation is shown to resist currently known attacks.
Abstract: It is becoming increasingly common to deploy cryptographic algorithms within software applications which are executed in untrusted environments owned and controlled by a possibly malicious party. White-box cryptography aims to protect the secret key in such an environment. Chow et al. developed a white-box AES implementation in 2002 by hiding secret keys in lookup tables. Afterwards, some improvements were proposed. However, all the published schemes have been shown to be insecure. AES was originally designed without consideration of execution in a white-box attack context. Because of the fixed confusion and diffusion operations, it is easy to break AES’s white-box version. In this paper, we propose an AES-like cipher by replacing AES’s S-boxes and MixColumn matrices with key-dependent components while keeping their good cryptographic properties. We show that the white-box implementation of our AES-like cipher can resist currently known attacks.