Please use this identifier to cite or link to this item: https://hdl.handle.net/11147/14403
Title: Compiler-Managed Replication of CUDA Kernels for Reliable Execution of GPGPU Applications∗
Authors: Kaya, E.
Öz, I.
Keywords: compiler support
GPU computing
redundancy
soft errors
Publisher: World Scientific
Abstract: As Graphics Processing Units (GPUs) evolve for general-purpose computations besides inherently fault-tolerant graphics programs, soft error reliability becomes a first-class citizen in program design. Especially, safety-critical systems utilizing GPU devices need to employ fault-tolerance techniques to recover from errors in hardware components. While software-level redundancy approaches, based on the replication of the application code, offer high reliability for safe program execution, it is essential to perform redundancy by utilizing parallel execution units in the target architecture not to hurt performance with redundant computations. In this work, we propose redundancy approaches using the parallel GPU cores and implement a compiler-level redundancy framework that enables the programmer to configure the target GPGPU program for redundant execution. We run redundant executions for GPGPU programs from the PolyBench benchmark suite by applying our kernel-level redundancy approaches and evaluate their performance by considering the parallelism level of the programs. Our results reveal that redundancy approaches utilizing parallelism offered by GPU cores yield higher performance for redundant executions, while the programs that already make use of parallel GPU cores in their original form suffer from overhead caused by contention among redundant threads. © World Scientific Publishing Company.
URI: https://doi.org/10.1142/S0218126624502542
https://hdl.handle.net/11147/14403
ISSN: 0218-1266
Appears in Collections:Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection

Show full item record



CORE Recommender

Page view(s)

10
checked on Jun 17, 2024

Google ScholarTM

Check




Altmetric


Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.