I am a Postdoctoral Associate in the Department of Electrical and Computer Engineering at Duke University, working with Prof. Yiran Chen and Prof. Hai (Helen) Li. I received my Ph.D. in Computer Science from Shanghai Jiao Tong University in 2023, supervised by Prof. Jingwen Leng.

My research focuses on computer architecture, especially scalable hardware–software co-design for efficient AI systems. I have developed sparsity- and quantization-aware architectures for model compression and overall efficiency. Recently, I have been exploring architectural designs tailored for large language models (LLMs).

Over the past five years, I have published 16 papers at the four flagship computer architecture conferences (ISCA, MICRO, HPCA, and ASPLOS), of which 11 are first- or corresponding-author publications (ISCA ×5, HPCA ×3, MICRO ×2, ASPLOS ×1). My work has received an HPCA 2026 Best Paper Nomination, an ASPLOS 2026 Best Paper Nomination, and a 2022 IEEE Micro Top Picks Honorable Mention. An up-to-date publication and citation record is available on my Google Scholar profile.

🔥 News

2026.07: 🎉 One paper was accepted to MICRO 2026.
2026.03: 🎉 One paper was accepted to ISCA 2026.
2026.03: 🔥 Our ASPLOS 2026 paper (M2XFP) was nominated for Best Paper.
2026.02: 🎉🎉 Two papers were accepted to DAC 2026.
2026.01: 🎉 One paper was accepted to ICLR 2026.
2026.01: 🔥 Our HPCA 2026 paper (Focus) was nominated for Best Paper (one of four nominees among 119 accepted papers).
2025.11: 🎉🎉 Two papers were accepted to HPCA 2026.
2025.10: 🎉 Nominated for the 2025 Outstanding Postdoctoral Award at Duke University.
2025.09: 🎉 One paper was accepted to ASP-DAC 2026.
2025.03: 🎉🎉🎉 Three papers were accepted to ISCA 2025.
2024.11: 🎉🎉🎉 Three papers were accepted to HPCA 2025.
2024.03: 🎉 I received the 2023 Shanghai Jiao Tong University Outstanding Doctoral Dissertation Award (15 recipients university-wide, <1% per year).
2023.11: 🎉🎉 Two papers were accepted to ASPLOS 2024.

💻 Experience

2023.12 - Now, Postdoctoral Associate, Department of ECE, Duke University.
2021.06 - 2023.12, Research intern, Shanghai Qi Zhi Institute.
2023.04 - 2023.09, Research intern, Ant Group (Alipay).
2020.06 - 2021.05, Research intern, Microsoft Research Asia (Beijing).
2019.05 - 2019.12, Intern, NVIDIA (Shanghai).

📝 Publications

Selected Publications

*: Corresponding Author; =: Equal Contribution

[1] MICRO 2026 Feng Cheng, Cong Guo*, Junyao Zhang, Haoxuan Shan, Chiyue Wei, Hong Wang, Hai “Helen” Li, Yiran Chen; Gossamer: A Utility-Driven Architecture for Constant-Budget KV Cache Compression in Reasoning LLMs. In IEEE/ACM International Symposium on Microarchitecture (MICRO), 2026.

[2] ISCA 2026 Bowen Duan=, Cong Guo=*, Chiyue Wei, Haoxuan Shan, Yuzhe Fu, Xinhua Chen, Yifan Xu, Ziyue Zhang, Changchun Zhou, Hai “Helen” Li, Yiran Chen; EVA: Accelerating LLM Decoding via an Efficient Vector Quantization Architecture. In International Symposium on Computer Architecture (ISCA), 2026.

[3] ICLR 2026 Xinhua Chen=, Sitao Huang=, Cong Guo=*, Chiyue Wei, Yintao He, Jianyi Zhang, Hai “Helen” Li, Yiran Chen; DPad: Efficient Diffusion Language Models with Suffix Dropout. In International Conference on Learning Representations (ICLR), 2026.

[4] HPCA 2026 Chiyue Wei=, Cong Guo=*, Junyao Zhang, Haoxuan Shan, Yifan Xu, Ziyue Zhang, Yudong Liu, Qinsi Wang, Changchun Zhou, Hai “Helen” Li, Yiran Chen; Focus: A Streaming Concentration Architecture for Efficient Vision-Language Models. In IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2026. (Best Paper Nomination)

[5] HPCA 2026 Yuzhe Fu, Changchun Zhou*, Hancheng Ye, Bowen Duan, Qiyu Huang, Chiyue Wei, Cong Guo*, Hai “Helen” Li, Yiran Chen; FractalCloud: A Fractal-Inspired Architecture for Efficient Large-Scale Point Cloud Processing. In IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2026.

[6] ASP-DAC 2026 Haoxuan Shan, Cong Guo*, Chiyue Wei, Feng Cheng, Junyao Zhang, Hai “Helen” Li, Yiran Chen; Platinum: Path-Adaptable LUT-Based Accelerator Tailored for Low-Bit Weight Matrix Multiplication. In Asia and South Pacific Design Automation Conference (ASP-DAC), 2026.

[7] DAC 2026 Yuzhe Fu, Hancheng Ye, Cong Guo*, Junyao Zhang, Qinsi Wang, Yueqian Lin, Changchun Zhou, Hai “Helen” Li, Yiran Chen; FlashFPS: Efficient Farthest Point Sampling for Large-Scale Point Clouds via Pruning and Caching. In Design Automation Conference (DAC), 2026.

[8] DAC 2026 Yi Xiong, Jiale Xu, Rui Zhang, Cong Guo*, Zihan Liu, Yangjie Zhou, Weiming Hu, Hao Wu, Boyu Li, Junping Zhao, Minyi Guo, Jingwen Leng, Zongwei Zhu, Xuehai Zhou; eLLM: Elastic Memory Management Framework for Efficient LLM Serving. In Design Automation Conference (DAC), 2026.

[9] ISCA 2025 Cong Guo*, Chiyue Wei, Jiaming Tang, Bowen Duan, Song Han, Hai Li, Yiran Chen; Transitive Array: An Efficient GEMM Accelerator with Result Reuse. In International Symposium on Computer Architecture (ISCA), 2025.

[10] ISCA 2025 Chiyue Wei, Bowen Duan, Cong Guo*, Jingyang Zhang, Qingyue Song, Hai Li, Yiran Chen; Phi: Leveraging Pattern-based Hierarchical Sparsity for High-Efficiency Spiking Neural Networks. In International Symposium on Computer Architecture (ISCA), 2025.

[11] ISCA 2025 Feng Cheng, Cong Guo*, Chiyue Wei, Junyao Zhang, Changchun Zhou, Edward Hanson, Jiaqi Zhang, Xiaoxiao Liu, Hai Li, Yiran Chen; Ecco: Improving Memory Bandwidth and Capacity for LLMs via Entropy-aware Cache Compression. In International Symposium on Computer Architecture (ISCA), 2025.

[12] HPCA 2025 Chiyue Wei, Cong Guo*, Feng Cheng, Shiyu Li, Hao Yang, Hai Li, Yiran Chen; Prosperity: Accelerating Spiking Neural Networks via Product Sparsity. In IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2025.

[13] IEEE CAS Mag 2025 Cong Guo, Feng Cheng, Zhixu Du, James Kiessling, Jonathan Ku, Shiyu Li, Ziru Li, Mingyuan Ma, Tergel Molom-Ochir, Benjamin Morris, Haoxuan Shan, Jingwei Sun, Yitu Wang, Chiyue Wei, Xueying Wu, Yuhao Wu, Hao Frank Yang, Jingyang Zhang, Junyao Zhang, Qilin Zheng, Guanglei Zhou, Hai Li, Yiran Chen; A Survey: Collaborative Hardware and Software Design in the Era of Large Language Models. In IEEE Circuits and Systems Magazine (CAS Mag), 2025.

[14] ASPLOS 2024 Cong Guo, Rui Zhang, Jiale Xu, Jingwen Leng, Zihan Liu, Ziyu Huang, Minyi Guo, Hao Wu, Shouren Zhao, Junping Zhao, Ke Zhang; GMLake: Efficient and Transparent GPU Memory Defragmentation for Large-scale DNN Training with Virtual Memory Stitching. In Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2024.

[15] IEEE TC 2024 Cong Guo, Fengchen Xue, Jingwen Leng, Yuxian Qiu, Yue Guan, Weihao Cui, Quan Chen, Minyi Guo; Accelerating Sparse DNNs Based on Tiled GEMM. In IEEE Transactions on Computers (TC), 2024.

[16] ISCA 2023 Cong Guo=, Jiaming Tang=, Weiming Hu, Jingwen Leng, Chen Zhang, Fan Yang, Yunxin Liu, Minyi Guo, Yuhao Zhu; OliVe: Accelerating Large Language Models via Hardware-friendly Outlier-Victim Pair Quantization. In International Symposium on Computer Architecture (ISCA), 2023.

[17] MICRO 2022 Cong Guo, Chen Zhang, Jingwen Leng, Zihan Liu, Fan Yang, Yunxin Liu, Minyi Guo, Yuhao Zhu; ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization. In IEEE/ACM International Symposium on Microarchitecture (MICRO), 2022. (2022 IEEE Micro Top Picks Honorable Mention)

[18] ICLR 2022 Cong Guo, Yuxian Qiu, Jingwen Leng, Xiaotian Gao, Chen Zhang, Yunxin Liu, Fan Yang, Yuhao Zhu, Minyi Guo; SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation. In International Conference on Learning Representations (ICLR), 2022.

[19] ICCD 2022 Cong Guo, Yuxian Qiu, Jingwen Leng, Chen Zhang, Ying Cao, Quanlu Zhang, Yunxin Liu, Fan Yang, Minyi Guo; Nesting Forward Automatic Differentiation for Memory-Efficient Deep Neural Network Training. In IEEE International Conference on Computer Design (ICCD), 2022.

[20] SC 2020 Cong Guo, Bo Yang Hsueh, Jingwen Leng, Yuxian Qiu, Yue Guan, Zehuan Wang, Xiaoying Jia, Xipeng Li, Minyi Guo, Yuhao Zhu; Accelerating Sparse DNN Models without Hardware-Support via Tile-Wise Sparsity. In International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2020.

[21] DAC 2020 Cong Guo, Yangjie Zhou, Jingwen Leng, Yuhao Zhu, Zidong Du, Quan Chen, Chao Li, Bin Yao, Minyi Guo; Balancing Efficiency and Flexibility for DNN Acceleration via Temporal GPU-Systolic Array Integration. In Design Automation Conference (DAC), 2020.

Collaborative Publications

[22] ASPLOS 2026 Weiming Hu, Zihan Zhang, Haoyan Zhang, Chen Zhang, Cong Guo, Yu Feng, Tianchi Hu, Guanglin Li, Guipeng Hu, Junsong Wang, Jingwen Leng; M2XFP: A Metadata-Augmented Microscaling Data Format for Efficient Low-bit Quantization. In Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2026. (Best Paper Nomination)

[23] SC 2025 Yangjie Zhou, Honglin Zhu, Qian Qiu, Weihao Cui, Zihan Liu, Peng Chen, Mohamed Wahib, Cong Guo, Siyuan Feng, Jintao Meng, Haidong Lan, Jingwen Leng, Yun Lin, Jin Song Dong, Wenxi Zhu, Minwen Deng; A Sample-Free Compilation Framework for Efficient Dynamic Tensor Computation. In International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2025.

[24] IEEE TCAD 2025 Yangjie Zhou, Zhihui Zhang, Shuwen Lu, Cong Guo, Jingwen Leng, Feng Zhang, Yufei Ma, Yun Liang, Minyi Guo; A Full-Stack Framework for GNN Acceleration via Partition-Compiler-Architecture Co-Design. In IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2025.

[25] HPCA 2025 Zihan Liu, Xinhao Luo, Junxian Guo, Wentao Ni, Yangjie Zhou, Yue Guan, Cong Guo, Weihao Cui, Yu Feng, Minyi Guo, Yuhao Zhu, Minjia Zhang, Jingwen Leng, Chen Jin; VQ-LLM: High-performance Code Generation for Vector Quantization Augmented LLM Inference. In IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2025.

[26] HPCA 2025 Weiming Hu, Haoyan Zhang, Cong Guo, Yu Feng, Renyang Guan, Zhendong Hua, Zihan Liu, Yue Guan, Minyi Guo, Jingwen Leng; MANT: Efficient Low-bit Group Quantization for LLMs via Mathematically Adaptive Numerical Type. In IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2025.

[27] ICCV 2025 Linshen Liu, Boyan Su, Junyue Jiang, Guanlin Wu, Cong Guo, Ceyu Xu, Hao Frank Yang; Towards Accurate and Efficient 3D Object Detection for Autonomous Driving: A Mixture of Experts Computing System on Edge. In International Conference on Computer Vision (ICCV), 2025.

[28] ASPLOS 2024 Zihan Liu, Wentao Ni, Jingwen Leng, Yu Feng, Cong Guo, Quan Chen, Chao Li, Minyi Guo, Yuhao Zhu; JUNO: Optimizing High-Dimensional Approximate Nearest Neighbour Search with Sparsity-Aware Algorithm and Ray-Tracing Core Mapping. In Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2024.

[29] IEEE TC 2024 Chen Zhang, Yang Wang, Zhiqiang Xie, Cong Guo, Yunxin Liu, Jingwen Leng, Guangyu Sun, Zhigang Ji, Runsheng Wang, Yuan Xie, Ru Huang; DSTC: Dual-Side Sparsity Tensor Core for DNNs Acceleration on Modern GPU Architectures. In IEEE Transactions on Computers (TC), 2024.

[30] CF 2023 Yangjie Zhou, Yaoxu Song, Jingwen Leng, Zihan Liu, Weihao Cui, Zhendong Zhang, Cong Guo, Quan Chen, Li Li, Minyi Guo; AdaptGear: Accelerating GNN Training via Adaptive Subgraph-Level Kernels on GPUs. In Computing Frontiers (CF), 2023.

[31] MSN 2022 Mustafa Tarik Sanic, Cong Guo, Jingwen Leng, Minyi Guo, Weiyin Ma; Towards Reliable AI Applications via Algorithm-Based Fault Tolerance on NVDLA. In International Conference on Mobility, Sensing and Networking (MSN), 2022. (Best Paper Award)

[32] IISWC 2021 Yangjie Zhou, Mengtian Yang, Cong Guo, Jingwen Leng, Yun Liang, Quan Chen, Minyi Guo, Yuhao Zhu; Characterizing and Demystifying the Implicit Convolution Algorithm on Commercial Matrix-Multiplication Accelerators. In IEEE International Symposium on Workload Characterization (IISWC), 2021.

[33] ISCA 2021 Yang Wang, Chen Zhang, Zhiqiang Xie, Cong Guo, Yunxin Liu, Jingwen Leng; Dual-side Sparse Tensor Core. In International Symposium on Computer Architecture (ISCA), 2021.

[34] CVPR 2019 Yuxian Qiu, Jingwen Leng, Cong Guo, Quan Chen, Chao Li, Minyi Guo, Yuhao Zhu; Adversarial Defense Through Network Profiling Based Path Extraction. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019.

🏆 Honors and Awards

2026.03 ASPLOS 2026 Best Paper Nomination
2026.01 HPCA 2026 Best Paper Nomination
2025.10 Nominee for the 2025 Outstanding Postdoctoral Award (24 nominees university-wide), Duke University
2024.03 Outstanding Doctoral Dissertation Award (15 recipients university-wide), Shanghai Jiao Tong University
2023.07 IEEE Micro Top Picks from 2022 Computer Architecture Conferences Honorable Mention
2023.06 Outstanding Doctoral Graduates, Shanghai Jiao Tong University
2022.08 Excellent Ph.D. Scholarship of Yang Yuanqing Education Fund (Top-3/500+), Shanghai Jiao Tong University
2020.11 Ph.D. National Scholarship (Top-8/500+), Ministry of Education, PRC
2020.07 DAC 2020 Richard Newton Young Student Fellow, Design Automation Conference
2018.11 VMware Scholarship, Shanghai Jiao Tong University
2017.11 National Second Prize, The 14th China Post-Graduate Mathematical Contest in Modeling

👔 Academic Service

Journal Reviewer

ACM Transactions on Embedded Computing Systems (TECS)
ACM Transactions on Architecture and Code Optimization (TACO)
IEEE Computer Architecture Letters (CAL)
IEEE Transactions on Computers (TC)
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
IEEE Transactions on Circuits and Systems for Artificial Intelligence (TCASAI)
IEEE Transactions on Very Large Scale Integration (VLSI) Systems (TVLSI)
Journal of Systems Architecture (JSA)
Science China Information Sciences (SCIS)

Conference Service

Publicity Co-Chair, HPCA 2027
Session Chair, ISCA 2026
Session Chair, DAC 2026
Technical Program Committee (TPC) Member, DAC 2026
Technical Program Committee (TPC) Member, MICRO 2026
Reviewer, AAAI 2027
Reviewer, NeurIPS 2026
Reviewer, ICCV 2025

📖 Education

2020.09 - 2023.09, Ph.D. in Computer Science, Department of Computer Science and Engineering, Shanghai Jiao Tong University.
2017.09 - 2020.03, M.E. in Computer Technology, Department of Computer Science and Engineering, Shanghai Jiao Tong University.
2012.09 - 2016.06, B.S. in Computer Science, College of Computer Science and Software Engineering, Shenzhen University.