Ashari, A., Del Vento, D., & Sadayappan, P. (2012). Machine learning-based compiler optimization [poster]. In AGU Fall Meeting 2012. American Geophysical Union: San Francisco, CA, US.
Scientists running high performance geophysical models want to achieve the fastest runtime possible for their software on any machine. To this end, they usually select the compiler's default aggressive optimization flag; however, this is often a suboptimal choice. In fact, the best set of optimization flags depends both on the underlying hardware and on the characteristics of the program. Rapid change in microprocessor technology and the diversity of program profiles make finding such a set a challenging task. This problem is NP-complete, so no efficient general method for finding the exact solution is known. However, an approximate solution may be found. Recently, computer science researchers have applied machine learning algorithms to both static and dynamic profiling features of computer programs, including hardware performance counters, to achieve near-optimal transformations on different benchmarks. cTuning is one such tool: an open source framework that profiles static features of computer programs and uses machine learning algorithms to find a near-optimal set of compiler optimization flags, which is often better than the default aggressive optimization, such as -O3 for GCC and -fast for the PGI compiler. In this work, we extended cTuning by adding new kernels, more relevant to atmospheric models, to the training database. We tested the framework on a few simple models in use at NCAR: Shallow Water, EULAG and HD3D. We performed our experiments on the Janus high performance computer system.
We show our results comparing the execution time of iterative compilation, default aggressive optimization flags, and flags chosen by machine learning for both the GCC and the PGI Fortran compilers.