Design and Implementation of 2D Convolution on x86/x64 Processors

Kelefouras, Vasileios; Keramidas, G

Date

2022-04-29

Author

Kelefouras, Vasileios

Keramidas, G

Subject

Convolution

Gaussian blur

code optimization

vectorization

AVX

OpenMP

OpenCV

Intel MKL

Intel IPP

high performance computing (HPC)

image processing

Abstract

In this paper, a new method for accelerating the 2D direct convolution operation on x86/x64 processors is presented. It combines efficient vectorization using SIMD intrinsics, bit-twiddling optimizations, optimization of the division operation, multi-threading using OpenMP, register blocking, and the shortest possible bit width for the intermediate results. The proposed method, which is provided as open source, is general and can be applied to other processor families too, e.g., Arm. It has been evaluated on two different multi-core Intel CPUs, using twenty different image sizes, 8-bit integer computations, and the most commonly used kernel sizes (3x3, 5x5, 7x7, 9x9). It achieves from 2.8× to 40× speedup over the Intel IPP library (OpenCV GaussianBlur and Filter2D routines), from 105× to 400× speedup over the gemm-based convolution method (using the Intel MKL int8 matrix multiplication routine), and from 8.5× to 618× speedup over the vslsConvExec Intel MKL direct convolution routine. The proposed method is superior because it executes far fewer arithmetic and load/store instructions.
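To make two of the ideas named in the abstract concrete (direct convolution on 8-bit data with narrow intermediate results, and removal of the division), the following is a minimal scalar sketch in Python. It is not the authors' implementation: the function names, the choice of kernels, the border handling (border pixels are copied unchanged), and the specific multiply-and-shift constant are all assumptions made for illustration, and the sketch leaves out the SIMD, OpenMP, and register-blocking parts entirely.

```python
# Illustrative sketch only; names and constants are hypothetical,
# not taken from the paper or its open-source code.

GAUSSIAN_3X3 = ((1, 2, 1),
                (2, 4, 2),
                (1, 2, 1))   # weights sum to 16

BOX_3X3 = ((1, 1, 1),
           (1, 1, 1),
           (1, 1, 1))        # weights sum to 9


def div_by_16(acc):
    """Divide by a power of two with a right shift instead of a division."""
    return acc >> 4


def div_by_9(acc):
    """Divide by 9 with a multiply and a shift.

    7282 is ceil(2**16 / 9). The result equals acc // 9 for every
    accumulator value a 3x3 box filter can produce on 8-bit pixels
    (0 .. 9 * 255).
    """
    return (acc * 7282) >> 16


def convolve_3x3(image, kernel, divide):
    """Direct 2D convolution of an 8-bit image with a 3x3 integer kernel.

    `image` is a list of rows, each a list of ints in 0..255.
    `divide` maps the accumulated sum back to the 0..255 range.
    Border pixels are copied unchanged (an assumption of this sketch).
    """
    height = len(image)
    width = len(image[0])
    out = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # With weights summing to at most 16 the sum is at most
            # 255 * 16 = 4080, so a 16-bit intermediate would suffice.
            acc = 0
            for ky in range(3):
                for kx in range(3):
                    acc += kernel[ky][kx] * image[y + ky - 1][x + kx - 1]
            out[y][x] = divide(acc)
    return out
```

For example, `convolve_3x3(img, GAUSSIAN_3X3, div_by_16)` blurs the interior of `img` without any division instruction. The small upper bound on `acc` is what would allow a vectorized version to pack more values into each SIMD register.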

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Journal

IEEE Transactions on Parallel and Distributed Systems

Volume

Issue

Pagination

3800-3815

Author URL

https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000831139000004&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=11bb513d99f797142bcfeffcc58ea008
