Mixed-precision in-memory computing
Manuel Le Gallo, Abu Sebastian, Roland Mathis, Matteo Manica, Heiner Giefers, Tomas Tuma, Costas Bekas, Alessandro Curioni, Evangelos Eleftheriou
- 01 Apr 2018
- Vol. 1, Iss: 4, pp 246-253
TL;DR: A hybrid system that combines a von Neumann machine with a computational memory unit can offer both the high precision of digital computing and the energy/areal efficiency of in-memory computing, which is illustrated by accurately solving a system of 5,000 equations using 998,752 phase-change memory devices.
Abstract: As complementary metal–oxide–semiconductor (CMOS) scaling reaches its technological limits, a radical departure from traditional von Neumann systems, which involve separate processing and memory units, is needed in order to extend the performance of today’s computers substantially. In-memory computing is a promising approach in which nanoscale resistive memory devices, organized in a computational memory unit, are used for both processing and memory. However, to reach the numerical accuracy typically required for data analytics and scientific computing, limitations arising from device variability and non-ideal device characteristics need to be addressed. Here we introduce the concept of mixed-precision in-memory computing, which combines a von Neumann machine with a computational memory unit. In this hybrid system, the computational memory unit performs the bulk of a computational task, while the von Neumann machine implements a backward method to iteratively improve the accuracy of the solution. The system therefore benefits from both the high precision of digital computing and the energy/areal efficiency of in-memory computing. We experimentally demonstrate the efficacy of the approach by accurately solving systems of linear equations, in particular, a system of 5,000 equations using 998,752 phase-change memory devices.
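The division of labour described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the computational memory unit is imitated by `noisy_matvec`, which applies fresh relative Gaussian noise to every matrix entry on each read (an assumed noise model); the inner solver is a simple Jacobi iteration chosen for brevity; and the test matrix, noise level, tolerance and all function names are hypothetical. Only the residual and the update of x are computed exactly, as the high-precision unit would do.

```python
import math
import random


def matvec(A, v):
    """Exact matrix-vector product, standing in for the high-precision unit."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def noisy_matvec(A, v, rng, rel_noise):
    """Inexact product: every stored entry is read with fresh relative
    Gaussian noise (assumed model of analog device variability)."""
    return [
        sum(a * (1.0 + rng.gauss(0.0, rel_noise)) * x for a, x in zip(row, v))
        for row in A
    ]


def inexact_solve(A, r, rng, rel_noise, sweeps=20):
    """Approximately solve A z = r with Jacobi sweeps whose only
    matrix-vector products are the noisy ones."""
    z = [0.0] * len(r)
    for _ in range(sweeps):
        Az = noisy_matvec(A, z, rng, rel_noise)
        z = [zi + (ri - azi) / A[i][i]
             for i, (zi, ri, azi) in enumerate(zip(z, r, Az))]
    return z


def refine(A, b, rng, rel_noise=0.05, tol=1e-10, max_iters=50):
    """Outer loop: exact residual, inexact correction, exact update.
    Returns the solution and the number of corrections applied."""
    x = [0.0] * len(b)
    b_norm = math.sqrt(sum(bi * bi for bi in b))
    for it in range(1, max_iters + 1):
        r = [bi - axi for bi, axi in zip(b, matvec(A, x))]
        if math.sqrt(sum(ri * ri for ri in r)) <= tol * b_norm:
            return x, it - 1
        z = inexact_solve(A, r, rng, rel_noise)
        x = [xi + zi for xi, zi in zip(x, z)]
    return x, max_iters


def demo_system(n=30):
    """Diagonally dominant test matrix and a known solution (both made up)."""
    A = [[4.0 if i == j else 1.0 / (1 + abs(i - j)) ** 2 for j in range(n)]
         for i in range(n)]
    x_true = [math.sin(i) for i in range(n)]
    return A, matvec(A, x_true), x_true
```

Calling `refine(*demo_system()[:2], random.Random(0))` returns the refined solution together with the number of corrections it took. Each correction is only accurate to a few per cent, yet the exactly computed residual lets the outer loop drive the error down to the requested tolerance.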
Figures

FIG. 3. Solution of a system of linear equations involving a model covariance matrix. Norm of the error between the computed solution x and the exact solution x_exa of Eq. (1) as a function of the number of iterative refinements, for different covariance matrix sizes. x_exa was computed by direct inversion of Eq. (1) in double-precision floating point. The inset shows a heat map (colormap in log scale) of the model covariance matrix A used for N = 1,000.
FIG. 2. Scalar multiplication. a, Schematic of a PCM device and the scalar multiplication implementation based on Ohm’s law. TE (BE) denotes top (bottom) electrode. The grey arrows indicate mappings from one variable to another. b, Plot showing the proportionality between I_n and G_n f(V_n) (Eq. (2)) for the 1,024 different combinations of {β_n, γ_n}. c, Final result of the computed scalar multiplication θ̂_n plotted against the exact result θ_n. d, Error distributions for different numbers of averaged devices K. The inset in d shows the standard deviation (s.d.) of the distributions versus K^(-0.5).
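The K^(-0.5) trend in panel d is what one would expect from averaging independent, identically distributed errors. The following sketch illustrates that statistical point only. It assumes a purely additive Gaussian error per device with a made-up standard deviation `sigma`, which is a simplification and not the measured PCM behaviour; the function names are hypothetical.

```python
import random
import statistics


def device_product(beta, gamma, rng, sigma):
    """One device's estimate of beta * gamma with additive Gaussian error
    (hypothetical noise model, not the measured PCM behaviour)."""
    return beta * gamma + rng.gauss(0.0, sigma)


def averaged_product(beta, gamma, k, rng, sigma):
    """Average the estimates of k independent devices."""
    return sum(device_product(beta, gamma, rng, sigma) for _ in range(k)) / k


def error_sd(k, rng, sigma=0.1, trials=4000):
    """Standard deviation of the averaged estimate's error over random inputs."""
    errors = []
    for _ in range(trials):
        beta, gamma = rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0)
        errors.append(averaged_product(beta, gamma, k, rng, sigma) - beta * gamma)
    return statistics.pstdev(errors)
```

Under these assumptions, evaluating `error_sd(k, random.Random(1))` for K = 1, 4, 16 and 64 and plotting the results against K^(-0.5) should give points close to a straight line through the origin with slope `sigma`.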
FIG. 1. Concept of mixed-precision ‘memcomputing’. a, Possible architecture of a mixed-precision ‘memcomputing’ system. The high-precision processing unit (left) performs digital logic computation and is based on the standard von Neumann computing architecture. The low-precision ‘memcomputing’ unit (right) performs analog in-memory computation using one or multiple memristive arrays. The system bus (middle) implements the overall management (control, data, addressing) between the two units. The purple dotted arrows indicate control communication and the plain arrows (red, blue) indicate data transfers. b, Algorithm for solving a system of linear equations Ax = b using the mixed-precision ‘memcomputing’ system of a. The blue boxes show the steps implemented in the high-precision processing unit and the red box shows the matrix-vector multiplication step implemented in the low-precision ‘memcomputing’ unit.
FIG. 4. Estimation of autophagy-related gene interactions from RNA measurements. a, Convergence of the mixed-precision ‘memcomputing’ algorithm for the 40 linear equations solved for the cancer and normal tissues. x_n,exa was computed by direct inversion of Eq. (1) in double-precision floating point. b, Matrix of computed partial correlations of the 40 genes studied for cancer and normal tissues (left) and their distributions (right). For visualization purposes, only the interactions for which the magnitude of the partial correlations is larger than a threshold of 0.13, corresponding to the 90th percentile of the normal tissue, are displayed. c, Interactome obtained from normal tissue. d, Interactome obtained from cancer tissue. In c and d, the upstream nodes are dark colored and the downstream targets are light colored. The blue edges denote positive interactions and the red edges denote negative interactions.
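The caption does not spell out how partial correlations follow from the linear solves. One standard route, assumed here for illustration, is to obtain the inverse covariance (precision) matrix P column by column by solving Σ x_n = e_n for each unit vector e_n, and then to use the textbook identity ρ_ij = -P_ij / sqrt(P_ii P_jj). In the sketch below an exact Gaussian elimination stands in for whichever solver is actually used, and the function names and the 3 × 3 covariance matrix are made up.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def precision_matrix(cov):
    """Inverse covariance, built from one linear solve per unit vector."""
    n = len(cov)
    columns = []
    for k in range(n):
        e_k = [1.0 if i == k else 0.0 for i in range(n)]
        columns.append(solve(cov, e_k))
    return [[columns[j][i] for j in range(n)] for i in range(n)]


def partial_correlations(cov):
    """rho_ij = -P_ij / sqrt(P_ii * P_jj), with ones on the diagonal."""
    P = precision_matrix(cov)
    n = len(P)
    return [[1.0 if i == j else -P[i][j] / math.sqrt(P[i][i] * P[j][j])
             for j in range(n)] for i in range(n)]
```

For the made-up covariance `[[0.75, 0.5, 0.25], [0.5, 1.0, 0.5], [0.25, 0.5, 0.75]]`, whose inverse is tridiagonal, the first and third variables come out with zero partial correlation even though their covariance is nonzero. This is the kind of distinction between direct and indirect interactions that a partial-correlation analysis is meant to expose.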
Citations
Towards spike-based machine intelligence with neuromorphic computing.
TL;DR: An overview of the developments in neuromorphic computing for both algorithms and hardware is provided and the fundamentals of learning and hardware frameworks are highlighted, with emphasis on algorithm–hardware codesign.
1.6K
Memory devices and applications for in-memory computing
TL;DR: This Review provides an overview of memory devices and the key computational primitives enabled by these memory devices as well as their applications spanning scientific computing, signal processing, optimization, machine learning, deep learning and stochastic computing.
1.5K
Parallel convolutional processing using an integrated photonic tensor core.
Johannes Feldmann, Nathan Youngblood, Maxim Karpov, Helge Gehring, Xuan Li, Maik Stappers, M. Le Gallo, Xin Fu, Anton Lukashchuk, Arslan S. Raja, Junqiu Liu, C.D. Wright, Abu Sebastian, Tobias J. Kippenberg, Wolfram H. P. Pernice, Harish Bhaskaran
TL;DR: In this paper, the authors demonstrate a computationally specific integrated photonic hardware accelerator (tensor core) that is capable of operating at speeds of trillions of multiply-accumulate operations per second.
1.1K
Resistive switching materials for information processing
Zhongrui Wang, Huaqiang Wu, Geoffrey W. Burr, Cheol Seong Hwang, Kang L. Wang, Qiangfei Xia, Jianhua Yang
TL;DR: Resistive switching materials (RSMs) enable novel, in-memory information processing, which may resolve the von Neumann bottleneck. This Review surveys the four physical mechanisms that lead to resistive switching and examines the device requirements for systems based on RSMs.
948
Parallel convolution processing using an integrated photonic tensor core
Johannes Feldmann, Nathan Youngblood, Maxim Karpov, Helge Gehring, Xuan Li, Maik Stappers, Manuel Le Gallo, Xin Fu, Anton Lukashchuk, Arslan S. Raja, Junqiu Liu, David Wright, Abu Sebastian, Tobias J. Kippenberg, Wolfram H. P. Pernice, Harish Bhaskaran
TL;DR: The results indicate the potential of integrated photonics for parallel, fast, and efficient computational hardware in data-heavy AI applications such as autonomous driving, live video processing, and next-generation cloud computing services.
819