Pengfei Zheng

 

Staff Researcher

Huawei Technologies

 

Ph.D. in Computer Science

Machine Learning and AI Systems

 

 

 

News: Mirage: MOE + Decision Transformer for non-interruption, non-overlap resource provision in ML training will appear at SC'23!

 

News: Shockwave: Fair and Efficient Scheduling for Dynamic Adaptation in Machine Learning will appear at NSDI'23!

 

I am a full-time staff researcher at Huawei Technologies. I can be reached via email at .

 

I work in the area of machine learning and artificial intelligence systems. In one direction, I am interested in the design of statistical and neural learning methods to model the dynamics and uncertainty of large-scale computer systems (e.g., distributed systems or runtimes, warehouse-scale clouds, and supercomputers), and furthermore, the design of algorithmic decision-making mechanisms (e.g., convex and nonlinear optimization, Bayesian optimization, contextual bandits, and reinforcement learning) to optimize system performance, efficiency, and scalability. My recent work builds black-box optimization algorithms for fast, autonomous system tuning, recommendation algorithms for real-time customization of preconditioners and solvers for systems of linear equations, and dynamic market theory and stochastic policy learning for scheduling and resource allocation.

 

In the other direction, I work on high-performance machine learning training and inference. Specifically, I build multi-dimensional, hybrid parallelism solvers (e.g., data, tensor, pipeline, sequence, expert parallelism, offload and re-materialization schedule, pipeline interleaving schedule) that optimize token throughput, MFU (Model FLOPs Utilization) and scale-out linearity for hyper-scale LLM (Large Language Model) training on massive GPU/NPU clusters. Moreover, I work on boosting LLM inference with dynamic MOE layers and real-time model and input pruning techniques.

 

I received my PhD degree in Computer Science from Duke University. My PhD advisor is Prof. Benjamin C. Lee and my dissertation is on machine learning for datacenter operations.

 

From September 2020 to July 2021, I was a Postdoctoral Fellow in the Department of Computer Sciences at the University of Wisconsin-Madison (UW-Madison). My faculty mentors were Prof. Aditya Akella and Prof. Shivaram Venkataraman. My postdoctoral research at UW-Madison was supported by the Computing Research Association (CRA) Computing Innovation Fellowship through July 2021.

 

As a research intern, I was delighted to work with Dr. Kim Hazelwood at Facebook during the summers of 2017 and 2018. Dr. Hazelwood works at the interface between systems and artificial intelligence, and serves as the engineering director at Facebook AI Research (FAIR). My intern project at Facebook was about understanding datacenter workloads at Facebook's scale with graph theory and statistical semantic learning algorithms. During my internship, I also worked closely with Dr. Xiaodong Wang, who is a Research Scientist at Facebook AI Infra Foundation, and Dr. David Brooks, who is a Professor in Computer Science at Harvard University. After the internship, I have been working for Facebook as a part-time researcher.

 

I was a research intern at Lenovo Research under the supervision of principal researcher Jianming Zhang during the summer of 2019. My intern project at Lenovo was about causal inference for online performance diagnosis of containerized microservices and black-box optimization for microservice performance recovery.

 

I have supervised the following students on multiple research projects:

Lynn Liu (on HyperAPX, now Ph.D. in CS at UC Berkeley)

Rui Pan (on Shockwave, now Ph.D. in CS at Princeton University)

Calvin Ma (on Hound, now Software Engineer at Goldman Sachs)

 

Before joining Duke, I studied Computer Science at Xi'an Jiaotong University (XJTU). At XJTU, I worked on modeling computer systems with statistical learning and econometric theories. As a student, I also visited the Institute of Computing Technology, Chinese Academy of Sciences, and the Trustworthy and Intelligent Internet Computing Laboratory at The Chinese University of Hong Kong for joint research projects.