Xiaoming Chen

Intel

21 Papers

172 Citations

Xiaoming Chen is an academic researcher from Intel. The author has contributed to research in topics: Artificial neural network & Thread (computing). The author has an h-index of 7 and has co-authored 21 publications.

Papers

Patent

Programmable coarse grained and sparse matrix compute hardware with advanced scheduling

Eriko Nurvitadhi, +18 more

- 28 Apr 2017

TL;DR: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction, the decoded instruction to cause the compute apparatus to perform a complex machine learning compute operation.


Patent

Instructions and logic to perform floating-point and integer operations for machine learning

Himanshu Kaul, +18 more

- 18 Oct 2017

TL;DR: This patent describes a graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture and a first compute unit included within the multiprocessor, where at least one single instruction causes the first compute unit to perform a two-dimensional matrix multiply and accumulate operation.


Patent

Mixed inference using low and high precision

Elmoustapha Ould-Ahmed-Vall, +10 more

- 14 Mar 2018

TL;DR: This patent describes a general-purpose graphics processing unit comprising a streaming multiprocessor having a single instruction, multiple thread (SIMT) architecture including hardware multithreading.


Patent

Compute optimizations for neural networks

Kevin Nealis, +10 more

- 24 Apr 2017

TL;DR: This patent describes decoding a single instruction into a decoded instruction that specifies multiple operands, including an input value and a quantized weight value associated with a neural network, together with an arithmetic logic unit that includes a barrel shifter, an adder, and an accumulator register.


Patent

Compute optimizations for low precision machine learning operations

Elmoustapha Ould-Ahmed-Vall, +16 more

- 28 Apr 2017

TL;DR: This patent describes an accelerator module comprising a memory stack including multiple memory dies and a graphics processing unit (GPU) coupled with the memory stack via one or more memory controllers. The GPU includes a plurality of multiprocessors having a single instruction, multiple thread (SIMT) architecture, and at least one single instruction causes at least a portion of the GPU to perform a floating-point operation on input having differing precisions.

