Xiaoming Chen
Intel
21 Papers
172 Citations
Xiaoming Chen is an academic researcher from Intel. The author has contributed to research on the topics Artificial neural network and Thread (computing). The author has an h-index of 7 and has co-authored 21 publications.
Papers
Patent
Programmable coarse grained and sparse matrix compute hardware with advanced scheduling
Eriko Nurvitadhi, Balaji Vembu, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Sinha Kamal, Nadathur Satish, Jeremy Bottleson, Farshad Akhbari, Koker Altug, Narayan Srinivasa, Dukhwan Kim, Sara S. Baghsorkhi, Justin Gottschlich, Chen Feng, Elmoustapha Ould-Ahmed-Vall, Kevin Nealis, Xiaoming Chen, Anbang Yao +18 more
- 28 Apr 2017
TL;DR: One embodiment provides a compute apparatus for machine learning operations, comprising a decode unit that decodes a single instruction into a decoded instruction, which causes the compute apparatus to perform a complex machine learning compute operation.
32
Patent
Instructions and logic to perform floating-point and integer operations for machine learning
Himanshu Kaul, Mark A. Anders, Sanu Mathew, Anbang Yao, Ray Joydeep, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Appu Abhishek R, Koker Altug, Sinha Kamal, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Ranganathan Vasanth, Sanjeev Jahagirdar +18 more
- 18 Oct 2017
TL;DR: The patent describes a graphics processing unit comprising a multiprocessor with a single instruction, multiple thread (SIMT) architecture, in which at least one single instruction causes a first compute unit included within the multiprocessor to perform a two-dimensional matrix multiply and accumulate operation.
24
Patent
Mixed inference using low and high precision
Elmoustapha Ould-Ahmed-Vall, Barath Lakshmanan, Tatiana Shpeisman, Ray Joydeep, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Anbang Yao, Ben Ashbaugh, Linda L. Hurd, Liwei Ma +10 more
- 14 Mar 2018
TL;DR: The patent describes a general-purpose graphics processing unit comprising a streaming multiprocessor with a single instruction, multiple thread (SIMT) architecture that includes hardware multithreading.
20
Patent
Compute optimizations for neural networks
Kevin Nealis, Anbang Yao, Xiaoming Chen, Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Eriko Nurvitadhi, Balaji Vembu, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Sinha Kamal +10 more
- 24 Apr 2017
TL;DR: The patent describes decoding a single instruction into a decoded instruction that specifies multiple operands, including an input value and a quantized weight value associated with a neural network, together with an arithmetic logic unit that includes a barrel shifter, an adder, and an accumulator register.
18
Patent
Compute optimizations for low precision machine learning operations
Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Anbang Yao, Kevin Nealis, Xiaoming Chen, Koker Altug, Appu Abhishek R, John C. Weast, Mike B. MacPherson, Dukhwan Kim, Linda L. Hurd, Ben Ashbaugh, Barath Lakshmanan, Liwei Ma, Ray Joydeep, Ping T. Tang, Michael S. Strickland +16 more
- 28 Apr 2017
TL;DR: The patent describes an accelerator module comprising a memory stack that includes multiple memory dies and a graphics processing unit (GPU) coupled with the memory stack via one or more memory controllers. The GPU includes a plurality of multiprocessors with a single instruction, multiple thread (SIMT) architecture, and at least one single instruction causes at least a portion of the GPU to perform a floating-point operation on inputs of differing precisions.
17