Proceedings Article, DOI: 10.1109/ICPP.2015.32
Optimizing Image Sharpening Algorithm on GPU
Mengran Fan,Haipeng Jia,Yunquan Zhang,Xiaojing An,Ting Cao +4 more
- 01 Sep 2015
- pp 230-239
TL;DR: This paper proposes a complete solution for implementing and optimizing the sharpness algorithm on GPU, built on five techniques: Data Transfer Optimization, Kernel Fusion, Vectorization for Data Locality, Border Optimization, and Reduction Optimization.
Abstract: Sharpness is an algorithm used to sharpen images. As image size, resolution, and real-time processing requirements increase, the performance of sharpness needs to improve greatly. Because sharpness computes each pixel independently, it is a good candidate for GPU acceleration. However, one challenge in porting it to the GPU is that sharpness executes in several stages. Each stage has its own characteristics and may or may not have data dependencies on other stages. Based on those characteristics, this paper proposes a complete solution to implement and optimize sharpness on GPU. Our solution includes five major and effective techniques: Data Transfer Optimization, Kernel Fusion, Vectorization for Data Locality, Border Optimization, and Reduction Optimization. Experiments show that, compared to a well-optimized CPU version, our GPU solution reaches a speedup of 10.7 to 69.3 times for different image sizes on an AMD FirePro W8000 GPU.
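To make the Kernel Fusion idea concrete, here is a minimal CPU-side sketch in plain Python. It is not the paper's sharpness algorithm or its GPU code: it assumes a simple two-stage unsharp-mask pipeline (a 3x3 box blur followed by a per-pixel sharpen step) as a stand-in, and every function name and the `amount` parameter are hypothetical. The unfused version stores the intermediate blurred image between stages, while the fused version computes the blur and the sharpen step for each pixel in a single pass, which is the kind of intermediate traffic that fusing GPU kernels is meant to avoid.

```python
def clamp(v, lo, hi):
    """Restrict v to the closed range [lo, hi]."""
    return max(lo, min(hi, v))


def box_blur_at(img, x, y):
    """Mean of the 3x3 neighbourhood of (x, y), replicating edge pixels at the border."""
    h, w = len(img), len(img[0])
    total = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            total += img[clamp(y + dy, 0, h - 1)][clamp(x + dx, 0, w - 1)]
    return total / 9.0


def sharpen_pixel(orig, blurred, amount):
    """Unsharp mask (assumed stand-in): push the pixel away from its blurred value."""
    return clamp(round(orig + amount * (orig - blurred)), 0, 255)


def sharpen_unfused(img, amount=1.0):
    """Two passes: stage 1 writes a full intermediate image, stage 2 reads it back."""
    h, w = len(img), len(img[0])
    blurred = [[box_blur_at(img, x, y) for x in range(w)] for y in range(h)]
    return [[sharpen_pixel(img[y][x], blurred[y][x], amount) for x in range(w)]
            for y in range(h)]


def sharpen_fused(img, amount=1.0):
    """One pass: each pixel is blurred and sharpened without an intermediate image."""
    h, w = len(img), len(img[0])
    return [[sharpen_pixel(img[y][x], box_blur_at(img, x, y), amount) for x in range(w)]
            for y in range(h)]


if __name__ == "__main__":
    image = [[10, 10, 10],
             [10, 100, 10],
             [10, 10, 10]]
    assert sharpen_fused(image) == sharpen_unfused(image)
    print(sharpen_fused(image))
```

Both functions return the same image; only the amount of intermediate storage differs. On a GPU the analogous saving would be a round trip through global memory between two kernel launches, although whether fusion pays off depends on the data dependencies between stages, as the abstract notes.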
Citations
Optimization Techniques for GPU Programming
TL;DR: This survey discusses optimization techniques found in 450 articles published over the last 14 years and analyzes them from different perspectives, showing that the optimizations are highly interrelated, which explains the need for techniques such as auto-tuning.
References
Scalable parallel programming with CUDA
John R. Nickolls,Ian Buck,Michael Garland,Kevin Skadron +3 more
- 11 Aug 2008
TL;DR: Presents a collection of slides covering the following topics: CUDA parallel programming model; CUDA toolkit and libraries; performance optimization; and application development.
OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems
TL;DR: The OpenCL standard offers a common API for program execution on systems composed of different types of computational devices such as multicore CPUs, GPUs, or other accelerators.
A Descriptive Algorithm for Sobel Image Edge Detection
O. R. Vincent,Olusegun Folorunso +1 more
- 01 Jan 2009
TL;DR: The Sobel operator performs a 2-D spatial gradient measurement on an image; this helps remove redundant data and so reduces the amount of data required to represent a digital image.
CUDA cuts: Fast graph cuts on the GPU
Vibhav Vineet,P. J. Narayanan +1 more
- 23 Jun 2008
TL;DR: This paper presents an implementation of the push-relabel algorithm for graph cuts on the GPU that can perform over 60 graph cuts per second on 1024×1024 images and over 150 graph cuts per second on 640×480 images on an Nvidia 8800 GTX.
Parallel Image Processing Based on CUDA
Zhiyi Yang,Yating Zhu,Yong Pu +2 more
- 12 Dec 2008
TL;DR: The distinct features of the CUDA GPU are analyzed, the general programming model of CUDA is summarized, and several classical image processing algorithms are implemented in CUDA, including histogram equalization, cloud removal, edge detection, and DCT encoding and decoding.