Evaluation of Thread Level Speculation in BlueGene/Q

Finkel, Hal; Bhattacharyya, Arnamoy; Amaral, Jose Nelson

doi:10.7939/R39W09512

Communities and Collections

Computing Science, Department of / Technical Reports (Computing Science)

Abstract

Thread Level Speculation (TLS) is a hardware/software technique that guarantees correct parallel execution of loops even in the presence of dependences. It has the potential to deliver performance gains by parallelizing loops that cannot be proven free of dependences at compile time. However, given the overhead of TLS execution, the selection of loops to be speculated is important to avoid performance degradation. Data-dependence profiling is often used to find out whether the may-dependences reported by a compiler's static analysis materialize at runtime. A cost analysis may conclude that some loops with a lower probability of dependence should be speculatively parallelized. This report addresses the question of whether a loop's dependence behaviour changes when the input to the program changes; a study of 57 different benchmarks indicates that it usually does not. The report then describes SpecEval, a new automatic speculative parallelization framework that uses single-input data-dependence profiles to find speculation candidates in the SPEC2006 and PolyBench/C benchmarks. The report also presents the first performance evaluation of the TLS implementation in IBM's BlueGene/Q supercomputer and shows that the performance of TLS is affected by several factors, including the number and coverage of speculated loops, misspeculation overhead due to function calls in a speculated loop, the L1 cache miss rate, and the dynamic instruction path length.

TRID-ID TR14-02
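
The data-dependence profiling mentioned in the abstract can be illustrated with a minimal Python sketch. It is not SpecEval's implementation, which this record does not describe. All names here (`trace_indirect_update`, `profile_loop`, `dependence_fraction`) are hypothetical, and the sketch assumes a loop of the form `a[idx[i]] = a[i] + 1`, whose dependences a compiler cannot rule out statically because `idx` is only known at runtime. Only cross-iteration read-after-write dependences are tracked.

```python
def trace_indirect_update(idx):
    """Per-iteration access trace for the assumed loop: a[idx[i]] = a[i] + 1.

    Iteration i reads element i and writes element idx[i].
    """
    reads = [[i] for i in range(len(idx))]
    writes = [[idx[i]] for i in range(len(idx))]
    return reads, writes


def profile_loop(reads, writes):
    """Return the set of (writer_iter, reader_iter) cross-iteration
    read-after-write dependences observed in the trace."""
    last_writer = {}  # element -> most recent iteration that wrote it
    deps = set()
    for i, (r, w) in enumerate(zip(reads, writes)):
        # Reads happen before writes within one iteration of the assumed loop.
        for elem in r:
            src = last_writer.get(elem)
            if src is not None and src != i:
                deps.add((src, i))
        for elem in w:
            last_writer[elem] = i
    return deps


def dependence_fraction(deps, n_iters):
    """Fraction of iterations that consume a value from an earlier iteration."""
    if n_iters == 0:
        return 0.0
    return len({dst for _, dst in deps}) / n_iters


if __name__ == "__main__":
    for idx in ([0, 1, 2, 3], [1, 2, 3, 3]):
        deps = profile_loop(*trace_indirect_update(idx))
        print(idx, sorted(deps), dependence_fraction(deps, len(idx)))
```

With `idx = [0, 1, 2, 3]` no may-dependence materializes, so under this simplified model the loop would be a speculation candidate. With `idx = [1, 2, 3, 3]` each iteration after the first reads a value written by its predecessor, so speculation would likely be squashed repeatedly.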
Date created

2014
Type of Item

Report
DOI

https://doi.org/10.7939/R39W09512
License

Attribution 3.0 International

Language
- English