In this project, my team and I ported a serial hypersonic glide vehicle trajectory simulation from Fortran to C. We made the conversion because we were more familiar with C.

We then implemented the simulation in CUDA, using the C code as the serial reference program. We chose CUDA with future work in mind: integrating varying control surfaces into the program will require many more time steps per simulation, and at that scale a GPU implementation can cut simulation time from days to minutes.

We focused on optimizing memory coalescing and data reuse, and we profiled the code in detail with NVIDIA’s Nsight tools. These efforts were crucial in achieving a 13x speedup in our simulations. We expect the benefit of GPU acceleration to be even more pronounced for longer simulations, because the initial startup cost and one-time data transfer are amortized over more time steps and become relatively negligible.
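
The amortization argument can be made concrete with a simple cost model. This is a sketch under stated assumptions, not a measurement from the project: the symbols are introduced here for illustration, with $T_0$ the fixed one-time GPU cost (startup plus data transfer), $N$ the number of time steps, and $t_c$ and $t_g$ the per-step times on the CPU and GPU, both assumed constant. Any fixed cost on the CPU side is neglected.

$$
S(N) = \frac{N\,t_c}{T_0 + N\,t_g},
\qquad
\frac{dS}{dN} = \frac{t_c\,T_0}{(T_0 + N\,t_g)^2} > 0,
\qquad
\lim_{N \to \infty} S(N) = \frac{t_c}{t_g}.
$$

Under this model the speedup $S(N)$ rises with every added time step and approaches the ratio of per-step costs, which is why a longer run should show a larger speedup than a short one.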

This project taught me a great deal. I developed a deeper understanding of loop fission (splitting one loop into several) and of how critical data management is when working with GPUs. I also learned the importance of maximizing GPU occupancy and tuning kernel launch configurations for performance.
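
As a rough illustration of splitting one loop into several (commonly called loop fission or loop distribution), the C sketch below shows a fused loop and its split form. The function names, the arrays, and the simple position and velocity updates are all hypothetical and are not taken from the project's code. The sketch assumes the output arrays do not overlap each other.

```c
#include <stddef.h>

/* Hypothetical fused loop: each iteration performs two independent updates. */
void update_fused(size_t n, double dt, const double *vel, const double *acc,
                  double *pos, double *vel_out)
{
    for (size_t i = 0; i < n; ++i) {
        pos[i] = pos[i] + vel[i] * dt;      /* position update */
        vel_out[i] = vel[i] + acc[i] * dt;  /* velocity update */
    }
}

/* The same work after loop fission: one loop per update.
 * Valid here because neither statement reads what the other writes
 * (assumes pos and vel_out do not alias). */
void update_split(size_t n, double dt, const double *vel, const double *acc,
                  double *pos, double *vel_out)
{
    for (size_t i = 0; i < n; ++i) {
        pos[i] = pos[i] + vel[i] * dt;
    }
    for (size_t i = 0; i < n; ++i) {
        vel_out[i] = vel[i] + acc[i] * dt;
    }
}
```

Both functions produce the same results. The split form gives each loop a smaller body that touches fewer arrays, and on a GPU each loop could plausibly become its own kernel with its own launch configuration. Whether that pays off depends on the code and should be checked with a profiler.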