OpenCL Kernel Fusion for GPU, Xeon Phi and CPU
Kernel fusion is an optimization method in which the code of several kernels is combined into a new, fused kernel. It can push performance beyond the limits that apply to the kernels in their isolated, unfused form. In this paper, we introduce a classification of the different types of kernel fusion for both data-dependent and data-independent kernels. We study kernel fusion on three types of OpenCL devices: GPU, Xeon Phi and CPU. These hardware platforms have quite different properties, so kernel fusion often affects their performance in different ways. We analyze the impact of kernel fusion on these platforms and show how it can be used to improve performance. Based on our study, we also introduce a basic transformation method for generating fused kernels, which has good potential to be automated.
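As a rough illustration of the idea in the abstract, the following minimal sketch shows fusion of two data-dependent element-wise operations, written as plain C loops rather than actual OpenCL kernels. The function names (`scale_kernel`, `offset_kernel`, `scale_offset_fused`) and the choice of operations are hypothetical and not taken from the paper. The sketch is not the paper's transformation method. It only shows why a fused form can avoid writing and re-reading an intermediate array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "kernel" 1: scales every element and stores the result
   in an intermediate array tmp (one full pass over memory). */
void scale_kernel(const float *in, float *tmp, size_t n, float a) {
    assert(in != NULL && tmp != NULL);
    for (size_t i = 0; i < n; ++i) {
        tmp[i] = a * in[i];
    }
}

/* Hypothetical "kernel" 2: consumes the output of kernel 1, so the two
   kernels are data dependent (a second full pass over memory). */
void offset_kernel(const float *tmp, float *out, size_t n, float b) {
    assert(tmp != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i) {
        out[i] = tmp[i] + b;
    }
}

/* Fused form: both operations run in a single pass. The intermediate
   value lives in a local variable, so no tmp array is written or read. */
void scale_offset_fused(const float *in, float *out, size_t n,
                        float a, float b) {
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i) {
        float t = a * in[i]; /* was tmp[i] in the unfused form */
        out[i] = t + b;
    }
}
```

Calling `scale_kernel` followed by `offset_kernel` gives the same result as a single call to `scale_offset_fused` for inputs where the arithmetic is exact. Whether the fused form is actually faster depends on the device, which is the question the paper studies.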
- Filipovic, Jiri
- Benkner, Siegfried
Category | Paper in Conference Proceedings or in Workshop Proceedings
Event Title | 27th International Symposium on Computer Architecture and High Performance Computing - SBAC-PAD 2015
Divisions | Scientific Computing
Event Location | Florianopolis, Brazil
Event Type | Conference
Event Dates | 18-21 Oct. 2015
Date | October 2015