Pengfei Zheng
Staff Researcher
Huawei Technologies
Ph.D. in Computer Science
Machine Learning and AI Systems
Biography
Publications
News
Our work on distribution-free Bayesian optimization for data-system auto-tuning will appear at SIGMOD'25.
Mirage: MoE + Decision Transformer for non-interruption, non-overlap resource provisioning in ML training appeared at SC'23!
Shockwave: Fair and Efficient Scheduling for Dynamic Adaptation in Machine Learning appeared at NSDI'23!

I am a full-time staff researcher at Huawei Technologies. I can be reached via email.

I work in the areas of autonomous system agents, distributed systems, and disaggregated datacenter architecture. In one direction, I design statistical and neural learning methods to model the dynamics and uncertainty of large-scale distributed computer systems, and I design algorithmic decision-making mechanisms (e.g., convex and nonlinear optimization, Bayesian optimization, contextual bandits, and reinforcement learning) to optimize system performance, efficiency, and scalability. My recent work builds black-box optimization algorithms for fast, autonomous system tuning; recommendation algorithms for real-time customization of preconditioners and solvers for linear equation systems; and dynamic market theory and stochastic policy learning for scheduling and resource allocation. In the other direction, I work on high-performance machine learning training and inference. Specifically, I build multi-dimensional, hybrid parallelism solvers (e.g., data, tensor, pipeline, sequence, and expert parallelism; offload and re-materialization schedules; pipeline interleaving schedules) that optimize token throughput, MFU (Model FLOPs Utilization), and scale-out linearity for hyper-scale LLM (Large Language Model) training on massive GPU/NPU clusters. I also work on accelerating LLM inference with dynamic MoE layers and real-time model and input pruning techniques.

I received my Ph.D. in Computer Science from Duke University. My Ph.D. advisor was Prof. Benjamin C. Lee, and my dissertation is on machine learning for datacenter operations.
From September 2020 to July 2021, I was a Postdoctoral Fellow in the Department of Computer Sciences at the University of Wisconsin-Madison (UW-Madison). My faculty mentors were Prof. Aditya Akella and Prof. Shivaram Venkataraman. My postdoctoral research at UW-Madison was supported by the Computing Research Association (CRA) Computing Innovation Fellowship.

As a research intern, I was delighted to work with Dr. Kim Hazelwood at Facebook during the summers of 2017 and 2018. Dr. Hazelwood works at the interface between systems and artificial intelligence and serves as Engineering Director at Facebook AI Research (FAIR). My intern project at Facebook was about understanding datacenter workloads at Facebook's scale with graph theory and statistical semantic learning algorithms. During my internships, I also worked closely with Dr. Xiaodong Wang, a Research Scientist at Facebook AI Infra Foundation, and Dr. David Brooks, a Professor of Computer Science at Harvard University. After my internships, I continued working for Facebook as a part-time researcher.

I was a research intern at Lenovo Research during the summer of 2019, under the supervision of principal researcher Jianming Zhang. My intern project at Lenovo was about causal inference for online performance diagnosis of containerized microservices and black-box optimization for microservice performance recovery.

I have supervised the following students on multiple research projects:
Lynn Liu (on HyperAPX, now Ph.D. in CS at UC Berkeley)
Rui Pan (on Shockwave, now Ph.D. in CS at Princeton University)
Calvin Ma (on Hound, now Software Engineer at Goldman Sachs)

Before joining Duke, I studied Computer Science at Xi'an Jiaotong University (XJTU). At XJTU, I worked on modeling computer systems with statistical learning and econometric theories.
As a student, I also visited The Institute of Computing Technology, Chinese Academy of Sciences, and the Trustworthy and Intelligent Internet Computing Laboratory, The Chinese University of Hong Kong for joint research projects.