ClusterFusion: Expanding Operator Fusion Scope for LLM Inference via Cluster-Level Collective Primitive

Published in NeurIPS 2025

ClusterFusion introduces cluster-level collective primitives to expand operator fusion scope for LLM inference on modern GPUs.

Recommended citation: Xinhao Luo, Zihan Liu, Yangjie Zhou, Shihan Fang, Ziyu Huang, Yu Feng, Chen Zhang, Shixuan Sun, Zhenzhe Zheng, Jingwen Leng, and Minyi Guo. (2025). "ClusterFusion: Expanding Operator Fusion Scope for LLM Inference via Cluster-Level Collective Primitive." NeurIPS 2025.