
ASPLOS’25 Tutorial: GenAI Catalyst

Abstract

In the rapidly evolving landscape of generative artificial intelligence (AI), the efficiency of underlying systems and compilers plays a crucial role in enabling scalable, sustainable, and accessible AI technologies. This tutorial aims to provide participants with a comprehensive understanding of the state-of-the-art techniques in the design and implementation of systems and compilers that optimize the performance of generative AI models, especially for large language models (LLMs).

First, we will present Mirage, the first multi-level superoptimization-based tensor program compiler, which helps developers generate fast GPU kernels without programming in CUDA or Triton. Second, we will present FlexFlow Serve, a distributed runtime system for low-latency, high-performance LLM serving. We will also introduce FlexLLM, a distributed runtime for memory-efficient LLM finetuning.

Participants will learn about the latest research in AI infrastructure for significantly improving the efficiency of generative AI applications. The tutorial will also feature interactive sessions and hands-on demonstrations, allowing participants to work directly with the systems and compilers discussed.

This tutorial

This tutorial will be held at ASPLOS 2025 in Rotterdam on the morning of Sunday, March 30, 2025, in Room Goudriaan II of the Postillion Hotel & Convention Centre WTC Rotterdam.

Tentative Schedule

Time Topic
10 mins Introduction
60 mins Efficient Compilers for Generative AI
30 mins Coffee break
90 mins Efficient Systems for Generative AI
Closing Remarks

Organizers

Xupeng Miao is a Kevin C. and Suzanne L. Kahn New Frontiers Assistant Professor in the Department of Computer Science at Purdue University.
Zhihao Jia is an assistant professor in the Computer Science Department at Carnegie Mellon University.

Contact us

For any further questions, please contact Xupeng Miao via xupeng@purdue.edu.