OSCAR Parallelizing Compiler and API for Low Power High Performance Multicores

Hironori Kasahara
Professor, Department of Computer Science & Engineering
Director, Advanced Multicore Processor Research Institute
Waseda University, Tokyo, Japan

Abstract

The OSCAR (Optimally Scheduled Advanced Multiprocessor) Multigrain Parallelizing Compiler automatically parallelizes an ordinary sequential program written in “Fortran” or “Parallelizable C” (C with restrictions on pointer usage, intended for embedded applications) into a parallelized “Fortran” or “C” program that uses the OSCAR API for code portability. The API consists of four directives from OpenMP plus new directives for memory management (local memory, distributed shared memory, on-chip centralized shared memory, and off-chip centralized shared memory), for power reduction through frequency/voltage control, clock gating, and power gating, and for time management in real-time processing. The generated parallel programs can be executed on commercial SMP and CC-NUMA machines using ordinary OpenMP compilers, and also on low-power embedded multicores for consumer electronics. The compiler exploits coarse-grain task parallelism, globally optimizes data locality for cache and local memory, and controls frequency/voltage and power shutdown for each core during parallel execution of a single application program. This source-to-source compiler has improved on the performance of vendor compilers on various multicores and servers, including the IBM p595 SMP server using Power6, the Intel quad-core Xeon, and the SGI Altix 450 using Intel Itanium2 (Montvale), as well as on low-power embedded multicores such as the Renesas-Hitachi-Waseda RP2 with 8 SH4A cores, the Fujitsu FR1000 with 4 VLIW cores and non-coherent caches, and the NEC NaviEngine with a 4-core NEC-ARM MPCore.
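
As a rough illustration of the coarse-grain task parallelism mentioned above, the following minimal sketch runs two independent coarse-grain tasks on separate threads using the standard OpenMP "parallel sections" construct. It is not output of the OSCAR compiler and uses no OSCAR-specific directives; the function names sum_range and parallel_sum are hypothetical and chosen only for this example.

```c
#include <stddef.h>

/* Hypothetical coarse-grain task: sum the elements a[lo] .. a[hi-1]. */
static long sum_range(const int *a, int lo, int hi) {
    long s = 0;
    for (int i = lo; i < hi; i++) {
        s += a[i];
    }
    return s;
}

/* Hypothetical driver: the two halves of the array are independent
 * coarse-grain tasks, so each is placed in its own OpenMP section.
 * s0 and s1 are declared outside the construct and are therefore shared;
 * each section writes only its own variable, so there is no data race. */
long parallel_sum(const int *a, int n) {
    long s0 = 0, s1 = 0;
    #pragma omp parallel sections
    {
        #pragma omp section
        s0 = sum_range(a, 0, n / 2);

        #pragma omp section
        s1 = sum_range(a, n / 2, n);
    }
    /* The implicit barrier at the end of the sections region guarantees
     * both partial sums are complete before they are combined. */
    return s0 + s1;
}
```

Built with an OpenMP-enabled compiler (for example, with GCC's -fopenmp flag), the two sections may run on different cores; built without OpenMP support, the pragmas are ignored and the same code runs sequentially with the same result.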
