Fathom: Reference Workloads for Modern Deep Learning Methods

doi:10.1109/IISWC.2016.7581275

Open AccessProceedings Article10.1109/IISWC.2016.7581275

Fathom: Reference Workloads for Modern Deep Learning Methods

Robert Adolf, +4 more

- 23 Aug 2016

- arXiv: Learning

145

TL;DR: Fathom as discussed by the authors is a collection of eight archetypal deep learning workloads for study, ranging from the familiar deep convolutional neural network of Krizhevsky et al. to the more exotic memory networks from Facebook's AI research group.

Abstract: Deep learning has been popularized by its recent successes on challenging artificial intelligence problems. One of the reasons for its dominance is also an ongoing challenge: the need for immense amounts of computational power. Hardware architects have responded by proposing a wide array of promising ideas, but to date, the majority of the work has focused on specific algorithms in somewhat narrow application domains. While their specificity does not diminish these approaches, there is a clear need for more flexible solutions. We believe the first step is to examine the characteristics of cutting edge models from across the deep learning community. Consequently, we have assembled Fathom: a collection of eight archetypal deep learning workloads for study. Each of these models comes from a seminal work in the deep learning community, ranging from the familiar deep convolutional neural network of Krizhevsky et al., to the more exotic memory networks from Facebook's AI research group. Fathom has been released online, and this paper focuses on understanding the fundamental performance characteristics of each model. We use a set of application-level modeling tools built around the TensorFlow deep learning framework in order to analyze the behavior of the Fathom workloads. We present a breakdown of where time is spent, the similarities between the performance profiles of our models, an analysis of behavior in inference and training, and the effects of parallelism on scaling.

Chat with Paper

AI Agents for this Paper

Find similar papers on Google Scholar, PubMed and Arxiv
Write a critical review of this paper
Analyze citations of this paper to find unaddressed research gaps

Citations

•Posted Content

In-Datacenter Performance Analysis of a Tensor Processing Unit

Norman P. Jouppi, +74 more

- 16 Apr 2017

- arXiv: Hardware Architecture

TL;DR: This paper evaluates a custom ASIC-called a Tensor Processing Unit (TPU)-deployed in datacenters since 2015 that accelerates the inference phase of neural networks (NN) and compares it to a server-class Intel Haswell CPU and an Nvidia K80 GPU, which are contemporaries deployed in the samedatacenters.

...read moreread less

4.1K

•Proceedings Article•10.1145/3079856.3080246

In-Datacenter Performance Analysis of a Tensor Processing Unit

Norman P. Jouppi, +75 more

- 24 Jun 2017

TL;DR: The Tensor Processing Unit (TPU) as discussed by the authors is a custom ASIC deployed in datacenters since 2015 that accelerates the inference phase of neural networks (NN) using a 65,536 8-bit MAC matrix multiply unit that offers a peak throughput of 92 TeraOps/second (TOPS).

...read moreread less

3.8K

•Proceedings Article•10.1109/ISCA45697.2020.00045

MLPerf inference benchmark

Vijay Janapa Reddi, +46 more

- 30 May 2020

TL;DR: This paper presents the benchmarking method for evaluating ML inference systems, MLPerf Inference, and prescribes a set of rules and best practices to ensure comparability across systems with wildly differing architectures.

...read moreread less

554

Proceedings Article•10.1145/3173162.3173177

Google Workloads for Consumer Devices: Mitigating Data Movement Bottlenecks

Amirali Boroumand, +10 more

- 19 Mar 2018

TL;DR: This work comprehensively analyzes the energy and performance impact of data movement for several widely-used Google consumer workloads, and finds that processing-in-memory (PIM) can significantly reduceData movement for all of these workloads by performing part of the computation close to memory.

...read moreread less

368

•Proceedings Article•10.1109/HPCA47549.2020.00047

The Architectural Implications of Facebook's DNN-Based Personalized Recommendation

Udit Gupta, +15 more

- 01 Feb 2020

TL;DR: A set of real-world, production-scale DNNs for personalized recommendation coupled with relevant performance metrics for evaluation are presented and in-depth analysis is conducted that underpins future system design and optimization for at-scale recommendation.

...read moreread less

318

...

Expand

References

•Proceedings Article•10.1109/CVPR.2016.90

Deep Residual Learning for Image Recognition

Kaiming He, +3 more

- 27 Jun 2016

TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.

...read moreread less

198.7K

•Posted Content

Deep Residual Learning for Image Recognition

Kaiming He, +3 more

- 10 Dec 2015

- arXiv: Computer Vision and Pattern Recog...

TL;DR: This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.

...read moreread less

117.9K

•Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

- 04 Sep 2014

TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

...read moreread less

102.6K

•Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

- 03 Dec 2012

TL;DR: The state-of-the-art performance of CNNs was achieved by Deep Convolutional Neural Networks (DCNNs) as discussed by the authors, which consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.

...read moreread less

88.4K

Proceedings Article•10.1109/CVPR.2009.5206848

ImageNet: A large-scale hierarchical image database

Jia Deng, +5 more

- 20 Jun 2009

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.

...read moreread less

75.9K

...

Expand

Fathom: Reference Workloads for Modern Deep Learning Methods

Chat with Paper

AI Agents for this Paper

Citations

In-Datacenter Performance Analysis of a Tensor Processing Unit

In-Datacenter Performance Analysis of a Tensor Processing Unit

MLPerf inference benchmark

Google Workloads for Consumer Devices: Mitigating Data Movement Bottlenecks

The Architectural Implications of Facebook's DNN-Based Personalized Recommendation

References

Deep Residual Learning for Image Recognition

Deep Residual Learning for Image Recognition

Very Deep Convolutional Networks for Large-Scale Image Recognition

ImageNet Classification with Deep Convolutional Neural Networks

ImageNet: A large-scale hierarchical image database

Related Papers (5)

Deep Residual Learning for Image Recognition

ImageNet: A large-scale hierarchical image database

Very Deep Convolutional Networks for Large-Scale Image Recognition

ImageNet Classification with Deep Convolutional Neural Networks

Gradient-based learning applied to document recognition