Parallelization and Performance of the NIM Weather Model on CPU, GPU, and MIC Processors
The design and performance of the Non-Hydrostatic Icosahedral Model (NIM) global weather prediction model is described. NIM is a dynamical core designed to run on central processing unit (CPU), graphics processing unit (GPU), and Many Integrated Core (MIC) processors. It demonstrates efficient parallel performance and scalability to tens of thousands of compute nodes and has been an effective way to make comparisons between traditional CPU and emerging fine-grain processors. The design of the NIM also serves as a useful guide in the fine-grain parallelization of the finite volume cubed (FV3) model recently chosen by the National Weather Service (NWS) to become its next operational global weather prediction model.
This paper describes the code structure and parallelization of NIM using standards-compliant open multiprocessing (OpenMP) and open accelerator (OpenACC) directives. NIM uses the directives to support a single, performance-portable code that runs on CPU, GPU, and MIC systems. Performance results are compared for five generations of computer chips including the recently released Intel Knights Landing and NVIDIA Pascal chips. Single and multinode performance and scalability is also shown, along with a cost–benefit comparison based on vendor list prices.