Pengfei Zheng

 

Ph.D. in Computer Science

 

Computer System Research:

An Artificial Intelligence and Algorithmic Decision Making Assisted Approach

 

 

 

Qiyang Ding*, Pengfei Zheng*, Shreyas Kudari, Shivaram Venkataraman, Zhao Zhang, Mirage: Towards Low-interruption Services on Batch GPU Clusters with Reinforcement Learning. The International Conference for High Performance Computing, Networking, Storage and Analysis (SC'23), *co-first author.

 

Pengfei Zheng, Rui Pan, Tarannum Khan, Shivaram Venkataraman, Aditya Akella, et al. Shockwave: Proactive, Efficient and Fair Scheduling for Dynamic Adaptation in Machine Learning. 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI'23).

 

Pengfei Zheng, et al. Resource-aware Bandit and Mixed-Precision Computing for Fast Hyperparameter Exploration. arXiv preprint 2021.

 

Pengfei Zheng, et al. Limelight+: Graph Theory and Semantic Learning for Understanding Workload at Facebook Datacenter Scale. arXiv preprint 2021.

 

Pengfei Zheng, Benjamin C. Lee. Hound: Causal Learning for Datacenter-scale Straggler Diagnosis. Proc. of the ACM on Measurement and Analysis of Computing Systems (SIGMETRICS'18), Irvine, CA, June 2018. [Full conference paper]


Pengfei Chen, Yong Qi, Pengfei Zheng, Di Hou. CauseInfer: Automatic and distributed performance diagnosis with hierarchical causality graph in large distributed systems. IEEE Conference on Computer Communications (INFOCOM 2014), Toronto, ON, 2014. [Full conference paper]


Pengfei Zheng, Yong Qi, Yangfan Zhou, Pengfei Chen, Jianfeng Zhan, Michael Rung-Tsong Lyu. An Automatic Framework for Detecting and Characterizing Performance Degradation of Software Systems. IEEE Transactions on Reliability, vol. 63, no. 4, pp. 927-943, Dec. 2014. [Full journal paper]


Pengfei Chen, Yong Qi, Yangfan Zhou, Pengfei Zheng, Yihan Wu. Multi-scale Entropy: One Metric of Software Aging. SOSE '13: Proceedings of the 2013 IEEE Seventh International Symposium on Service-Oriented System Engineering. [Full conference paper]