Topic

512-bit

About: 512-bit is a research topic. Over its lifetime, 19 publications have appeared within this topic, receiving 327 citations.



Papers

Journal Article•10.1049/IP-CDT:20040791•

Modified Montgomery modular multiplication and RSA exponentiation techniques


C. McIvor¹, M. McLoone¹, John V. McCanny¹•Institutions (1)

Queen's University Belfast¹

20 Dec 2004

TL;DR: The approach presented is based on a reformulation of the solution to modular multiplication within the context of RSA exponentiation, and the resulting RSA units exhibit the highest data rates reported in the literature to date, reflecting the very low and word length independent critical path delay achieved.


Abstract: Modified Montgomery multiplication and associated RSA modular exponentiation algorithms and circuit architectures are presented. These modified multipliers use carry save adders (CSAs) to perform large word length additions. These have the attraction that, when repeatedly used to perform RSA modular exponentiation, the (carry save) format of the output words is compatible with that required by the multiplier inputs. This avoids the repeated interim output/input format conversion, needed when previously reported Montgomery multipliers are used for RSA modular exponentiation. Thus, the lengthy and costly conventional additions required at each stage are avoided. As a consequence, the critical path delay and, hence, the data throughput rate of the resulting Montgomery multiplier architectures are also word length independent. The approach presented is based on a reformulation of the solution to modular multiplication within the context of RSA exponentiation. Two algorithmic variants are presented, one based on a five-to-two CSA and the other on a four-to-two CSA plus multiplexer. The practical application of the approach has been demonstrated by using this to design special purpose RSA processing units with 512-bit and 1024-bit key sizes. The resulting RSA units exhibit the highest data rates reported in the literature to date, reflecting the very low and word length independent critical path delay achieved.
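
As a rough software illustration of the arithmetic this abstract builds on, the sketch below shows textbook Montgomery multiplication and a square-and-multiply RSA exponentiation on top of it. It is not the carry-save hardware architecture the paper proposes; it uses ordinary Python integers, and the function names (`montgomery_setup`, `mont_mul`, `mont_pow`) are hypothetical names chosen for this example.

```python
def montgomery_setup(n):
    """Precompute constants for an odd modulus n, with R = 2**k > n."""
    if n < 3 or n % 2 == 0:
        raise ValueError("Montgomery reduction needs an odd modulus > 1")
    k = n.bit_length()
    r = 1 << k
    # n_prime satisfies n * n_prime ≡ -1 (mod R); pow(n, -1, r) needs Python 3.8+.
    n_prime = (-pow(n, -1, r)) % r
    return k, r, n_prime


def mont_mul(a, b, n, k, n_prime):
    """Return a * b * R^-1 mod n for 0 <= a, b < n (no division by n)."""
    mask = (1 << k) - 1
    t = a * b
    m = ((t & mask) * n_prime) & mask   # chosen so that t + m*n is divisible by R
    u = (t + m * n) >> k                # exact division by R is just a shift
    return u - n if u >= n else u       # single conditional subtraction


def mont_pow(x, e, n):
    """Compute x**e mod n by left-to-right square-and-multiply in Montgomery form."""
    k, r, n_prime = montgomery_setup(n)
    x_bar = (x % n) * r % n             # convert x to Montgomery form
    acc = r % n                         # Montgomery form of 1
    for bit in bin(e)[2:]:
        acc = mont_mul(acc, acc, n, k, n_prime)
        if bit == "1":
            acc = mont_mul(acc, x_bar, n, k, n_prime)
    return mont_mul(acc, 1, n, k, n_prime)  # convert back out of Montgomery form


# Usage with a 512-bit odd modulus (an arbitrary test value, not an RSA key).
n = (1 << 511) + 111
result = mont_pow(0x1234567890ABCDEF, 65537, n)
```

Every intermediate result stays in Montgomery form between multiplications, which loosely mirrors the abstract's point that the multiplier's output format should feed straight back into its inputs without conversion.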


181 citations

Proceedings Article•10.1109/HPCS48598.2019.9188239•

Energy Efficiency Features of the Intel Skylake-SP Processor and Their Impact on Performance


Robert Schöne¹, Thomas Ilsche¹, Mario Bielert¹, Andreas Gocht¹, Daniel Hackenberg¹, and 1 more•Institutions (1)

Dresden University of Technology¹

15 Jul 2019

TL;DR: In this article, the effects of hardware controlled energy efficiency features for the Intel Skylake-SP processor were analyzed and it was shown that data has a significant impact on processor power consumption which causes a large error in energy models relying only on instructions.


Abstract: The overwhelming majority of High Performance Computing (HPC) systems and server infrastructure uses Intel x86 processors. This makes an architectural analysis of these processors relevant for a wide audience of administrators and performance engineers. In this paper, we describe the effects of hardware controlled energy efficiency features for the Intel Skylake-SP processor. Due to the prolonged micro-architecture cycles, which extend the previous Tick-Tock scheme by Intel, our findings will also be relevant for succeeding architectures. The findings of this paper include the following: C-state latencies increased significantly over the Haswell-EP processor generation. The mechanism that controls the uncore frequency has a latency of approximately 10 ms, and it is not possible to truly fix the uncore frequency to a specific level. The out-of-order throttling for workloads using 512-bit-wide vectors also occurs at low processor frequencies. Data has a significant impact on processor power consumption, which causes a large error in energy models relying only on instructions.


57 citations

Book Chapter•10.1007/3-540-38424-3_35•

CORSAIR: A SMART Card for Public Key Cryptosystems


Dominique De Waleffe¹, Jean-Jacques Quisquater¹•Institutions (1)

Philips¹

11 Aug 1990

TL;DR: The new smart card is in the final design stage; the first test chips should be available by the end of 1990, and CORSAIR achieves up to 40 (8-bit) MIPS with a clock speed of 6 MHz.


Abstract: Algorithms best suited for flexible smart card applications are based on public key cryptosystems -- RSA, zero-knowledge protocols ... Their practical implementation (execution in ? 1 second) entails a computing power beyond the reach of classical smart cards, since large integers (512 bits) have to be manipulated in complex ways (exponentiation). CORSAIR achieves up to 40 (8-bit) MIPS with a clock speed of 6 MHz. This allows computing X^E mod M, with 512-bit operands, in less than 1.5 seconds (0.4 seconds for a signature). The new smart card is in the final design stage; the first test chips should be available by the end of 1990.
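
To give a feel for why a 512-bit exponentiation is demanding on a small processor, the sketch below counts the modular squarings and multiplications used by plain left-to-right binary exponentiation. This is a generic textbook method, assumed here only for illustration; the abstract does not say which exponentiation algorithm CORSAIR uses, and `modexp_count` is a hypothetical helper name.

```python
def modexp_count(x, e, m):
    """Binary (square-and-multiply) exponentiation.

    Returns (x**e % m, number of squarings, number of multiplications).
    """
    if e == 0:
        return 1 % m, 0, 0
    acc = x % m                      # accounts for the leading 1 bit of e
    squarings = multiplications = 0
    for bit in bin(e)[3:]:           # skip the '0b' prefix and the leading 1 bit
        acc = (acc * acc) % m
        squarings += 1
        if bit == "1":
            acc = (acc * x) % m
            multiplications += 1
    return acc, squarings, multiplications


# Worst case for a 512-bit exponent: every bit set.
m = (1 << 511) + 111                 # arbitrary 512-bit odd modulus for the demo
worst = modexp_count(3, (1 << 512) - 1, m)
print(worst[1], worst[2])            # squarings and multiplications performed
```

Each of these operations is itself a 512-bit by 512-bit modular multiplication, which an 8-bit processor has to assemble from many byte-sized steps.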


34 citations

Proceedings Article•10.1145/3068943.3068949•

SCAN-XP: Parallel Structural Graph Clustering Algorithm on Intel Xeon Phi Coprocessors


Tomokatsu Takahashi¹, Hiroaki Shiokawa¹, Hiroyuki Kitagawa¹•Institutions (1)

University of Tsukuba¹

14 May 2017

TL;DR: Extensive evaluations on real-world graphs demonstrate the performance superiority over existing approaches of SCAN-XP, which runs approximately 100 times faster than SCAN.


Abstract: The structural graph clustering method SCAN, proposed by Xu et al., is successfully used in many applications because it not only detects densely connected nodes as clusters but also extracts sparsely connected nodes as hubs or outliers. However, it is difficult to apply SCAN to large-scale graphs, since SCAN needs to evaluate the density for all adjacent nodes included in the given graphs. In this paper, to address this problem, we present a novel algorithm, SCAN-XP, that runs on the Intel Xeon Phi. We designed SCAN-XP to make the best use of the hardware potential of the Intel Xeon Phi by employing the following approaches. First, SCAN-XP avoids the bottlenecks that arise from parallel graph computations by providing good load balance among cores on the Intel Xeon Phi. Second, SCAN-XP effectively exploits the 512-bit SIMD instructions implemented in the Intel Xeon Phi to speed up the density evaluations. As a result, SCAN-XP detects clusters, hubs, and outliers from large-scale graphs with much shorter computation time than SCAN. Specifically, SCAN-XP runs approximately 100 times faster than SCAN; for graphs with 100 million edges, SCAN-XP is able to finish in a few seconds. Extensive evaluations on real-world graphs demonstrate the performance superiority of SCAN-XP over existing approaches.
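
The per-edge "density" evaluation mentioned above can be sketched as follows, assuming it refers to the structural similarity used by the original SCAN algorithm: the size of the overlap between two nodes' closed neighborhoods, normalized by the geometric mean of their sizes. This is a plain, sequential Python illustration with hypothetical function names; it contains none of the load balancing or 512-bit SIMD techniques that SCAN-XP contributes.

```python
from math import sqrt


def structural_similarity(adj, u, v):
    """Similarity of u and v based on their closed neighborhoods (node included)."""
    gamma_u = adj[u] | {u}
    gamma_v = adj[v] | {v}
    return len(gamma_u & gamma_v) / sqrt(len(gamma_u) * len(gamma_v))


def similar_edges(adj, eps):
    """Evaluate every edge once and keep those with similarity >= eps."""
    kept = {}
    for u in adj:
        for v in adj[u]:
            if u < v:                          # visit each undirected edge once
                s = structural_similarity(adj, u, v)
                if s >= eps:
                    kept[(u, v)] = s
    return kept


# Toy graph: a triangle 0-1-2 with a pendant node 3 attached to node 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
dense = similar_edges(adj, eps=0.8)            # eps is an assumed threshold value
```

Because this set intersection has to be computed for every edge, it dominates the running time on large graphs, which is the cost the abstract says SCAN-XP targets.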


28 citations

Proceedings Article•10.1109/ISCAS.2008.4542068•

ASIC hardware implementations for 512-bit hash function Whirlpool


Akashi Satoh

18 May 2008

TL;DR: Hardware architectures for the 512-bit hash function Whirlpool, which is one of the ISO/IEC 10118-3 standard algorithms, are proposed and the performances of the proposed architectures are evaluated using a 0.18-μm CMOS standard cell library.


Abstract: Hardware architectures for the 512-bit hash function Whirlpool, which is one of the ISO/IEC 10118-3 standard algorithms, are proposed and the performances of the proposed architectures are evaluated using a 0.18-μm CMOS standard cell library. The fastest implementation achieved a throughput of 9.59 Gbps with a gate count of 167.4 K, which is two times faster than the fastest conventional implementation on an FPGA platform. A compact implementation obtained 38.9 Kgates with 2.49 Gbps. The FIPS 180-2 standard hash functions SHA-256 and SHA-512, which are the most popular algorithms in practical use, were also synthesized using the same ASIC library for performance comparisons. The small and fast SHA-256 implementations achieved 11.0 Kgates with 726 Mbps and 30.7 Kgates with 1.97 Gbps, respectively. The gate count and throughput are both approximately 1/4 those of Whirlpool, and thus the hardware efficiencies, defined as throughput per gate, are almost the same for SHA-256/-512 and Whirlpool in the present implementations. However, Whirlpool is more flexible than SHA-256/-512 in terms of the variety of hardware architectures. The various architectures for the datapath and primitive function blocks are also described in the present paper.
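
The hardware-efficiency comparison can be checked with a few lines of arithmetic. The sketch below takes the throughput and gate-count figures quoted in the abstract and computes throughput per gate for each design; the dictionary labels and the helper name `efficiency_kbps_per_gate` are made up for this example.

```python
# (throughput in bits per second, gate count), as quoted in the abstract.
designs = {
    "Whirlpool (fastest)": (9.59e9, 167.4e3),
    "Whirlpool (compact)": (2.49e9, 38.9e3),
    "SHA-256 (small)": (726e6, 11.0e3),
    "SHA-256 (fast)": (1.97e9, 30.7e3),
}


def efficiency_kbps_per_gate(throughput_bps, gates):
    """Hardware efficiency as throughput per gate, in kbps per gate."""
    return throughput_bps / gates / 1e3


efficiency = {
    name: efficiency_kbps_per_gate(bps, gates)
    for name, (bps, gates) in designs.items()
}

for name, value in efficiency.items():
    print(f"{name}: {value:.1f} kbps/gate")
```

All four designs land between roughly 57 and 66 kbps per gate, which is consistent with the abstract's statement that the efficiencies are almost the same.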


17 citations


Performance Metrics

Papers

144

Citations

No. of papers in the topic in previous years
Year	Papers
2021	1
2020	1
2019	3
2017	2
2016	2
2014	1
