Matthew Boyd

6 Papers

2 Citations

Matthew Boyd is an academic researcher who has contributed to research on computer science and scalability. The author has an h-index of 2 and has co-authored 6 publications.

Papers

Proceedings Article•10.1145/3470496.3527405

A software-defined tensor streaming multiprocessor for large-scale machine learning

Dennis Abts, +21 more

- 11 Jun 2022

TL;DR: The topology, routing and flow control are described to characterize the performance of the network that serves as the fabric for a large-scale parallel machine learning system with up to 10,440 TSPs and more than 2 TeraBytes of global memory accessible in less than 3 microseconds of end-to-end system latency.

Proceedings Article•10.1109/HCS55958.2022.9895630

The Groq Software-defined Scale-out Tensor Streaming Multiprocessor : From chips-to-systems architectural overview

Dennis Abts, +9 more

- 21 Aug 2022

Proceedings Article•10.1109/ASAP54787.2022.00022

Answer Fast: Accelerating BERT on the Tensor Streaming Processor

Ibrahim Ahmed, +8 more

- 22 Jun 2022

TL;DR: By carefully fusing all the nonlinear components with the matrix multiplication components, the on-chip matrix multiplication units are efficiently utilized, resulting in a deterministic tail latency of 130 μs for a batch-1 inference through BERT-base, which is 6× faster than the current state-of-the-art.

Proceedings Article•10.1109/dsn-s54099.2022.00016

Challenges/Opportunities to Enable Dependable Scale-out System with Groq Deterministic Tensor-Streaming Processors

Dennis Abts, +6 more

- 01 Jun 2022

TL;DR: This work explores the challenges and opportunities of scaling such a deterministic architecture across multiple processors to ensure a dependable scale-out system. The high-radix, low-diameter topology enables N + 1 redundancy to improve the reliability of the system.

A Comprehensive Evaluation of Novel AI Accelerators for Deep Learning Workloads
