Supercomputing 2025, where “HPC Ignites,” took place this week in St. Louis, MO, and Rohan Marwaha helped light the fire with his presentation on “Frameworks for LLM Serving in HPC Environments.”

The paper was authored by Rohan Marwaha, Qinren Zhou, Kastan Day, Asmita Dabholkar, and Volodymyr Kindratenko.
Rohan presented it during the workshop session “Frontiers in Generative AI for HPC Science and Engineering: Foundations, Challenges, and Opportunities,” which is described as follows:
Realizing the promise of large-scale foundation models for scientific discovery—enabling self-driving laboratories, hypothesis generation, and more—requires unprecedented computational scale and multidisciplinary efforts to prepare diverse scientific data. While only a few organizations can train state-of-the-art models from scratch (e.g., trillions of parameters, tens of trillions of tokens), advances in training strategies and fine-tuning have expanded accessibility. Simultaneously, breakthroughs in training methodologies and data quality are dramatically reducing training costs and improving the performance of even smaller AI models. As AI models advance in general-purpose tasks, the scientific community is refining methods to evaluate and enhance their scientific reasoning capabilities, a critical challenge for trustworthy AI in science. This workshop, catalyzed by the Trillion Parameter Consortium (TPC), will highlight collaborations in scientific skills evaluation, performance optimization, federated learning, responsible AI, and other topics. SC24 drew 33 submissions, with 13 presented to nearly 200 attendees, underscoring the rapid evolution of this field.
If you would like to view the slide deck, it is located HERE.