Cong Fang

Assistant Professor
Peking University
Email: fangcong at pku dot edu dot cn

Full Paper List

Scaling Law for Stochastic Gradient Descent in Quadratically Parameterized Linear Regression [arXiv]
Shihong Ding¹, Haihan Zhang¹, Hanzhen Zhao, Cong Fang*
Optimal Algorithms in Linear Regression under Covariate Shift: On the Importance of Precondition [arXiv]
Yuanshi Liu¹, Haihan Zhang¹, Qian Chen, Cong Fang*
Fundamental Computational Limits in Pursuing Invariant Causal Prediction and Invariance-Guided Regularization [arXiv]
Yihong Gu, Cong Fang, Yang Xu, Zijian Guo, and Jianqing Fan
Learning Curves of Stochastic Gradient Descent in Kernel Regression
Haihan Zhang¹, Weicheng Lin¹, Yuanshi Liu, Cong Fang*
Hessian-Aware Zeroth-Order Optimization [arXiv]
Haishan Ye, Zhichao Huang, Cong Fang, Junchi Li, and Tong Zhang,
IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI), 2025.
SEPARATE: A Simple Low-rank Projection for Gradient Compression in Modern Large-scale Model Training Process [arXiv]
Hanzhen Zhao, Xingyu Xie, Cong Fang*, and Zhouchen Lin,
International Conference on Learning Representations (ICLR), 2025.
A Provable PArticle-based Primal-Dual ALgorithm for Mixed Nash Equilibrium [arXiv]
Shihong Ding¹, Hanze Dong¹, Cong Fang*, Zhouchen Lin, and Tong Zhang,
Journal of Machine Learning Research (JMLR), 2024.
The Optimality of (Accelerated) SGD for High-Dimensional Quadratic Optimization [arXiv]
Haihan Zhang¹, Yuanshi Liu¹, Qianwen Chen, Cong Fang*
Causality Pursuit from Heterogeneous Environments via Neural Adversarial Invariance Learning [arXiv]
Yihong Gu, Cong Fang, Peter Bühlmann, and Jianqing Fan
The Implicit Bias of Heterogeneity towards Invariance and Causality [arXiv]
Yang Xu¹, Yihong Gu¹, Cong Fang*,
Advances in Neural Information Processing Systems (NeurIPS), 2024.
Optimizing over Multiple Distributions under Generalized Quasar-Convexity Condition [arXiv]
Shihong Ding, Long Yang, Luo Luo, Cong Fang*,
Advances in Neural Information Processing Systems (NeurIPS), 2024.
Separation and Bias of Deep Equilibrium Models on Expressivity and Learning Dynamics
Zhoutong Wu, Yimu Zhang, Cong Fang*, Zhouchen Lin*,
Advances in Neural Information Processing Systems (NeurIPS), 2024.
Accelerated Gradient Algorithms with Adaptive Subspace Search for Instance-Faster Optimization [arXiv]
Yuanshi Liu, Hanzhen Zhao, Pengyun Yue, Cong Fang*
Environment Invariant Linear Least Squares [arXiv]
Jianqing Fan, Cong Fang, Yihong Gu, and Tong Zhang (α-β order),
Annals of Statistics (AoS), 2024.
Quantum Algorithms and Lower Bounds for Finite-Sum Optimization [arXiv]
Yexin Zhang, Chenyi Zhang, Cong Fang*, Liwei Wang*, and Tongyang Li*,
International Conference on Machine Learning (ICML), 2024.
INSIGHT: End-to-End Neuro-Symbolic Visual Reinforcement Learning with Language Explanations [arXiv]
Luoli Rui, Guoxi Zhang, Hongming Xu, Yaodong Yang, Cong Fang*, and Qing Li*,
International Conference on Machine Learning (ICML), 2024.
Relational Learning in Pre-Trained Models: A Theory from Hypergraph Recovery Perspective [arXiv]
Yang Chen, Cong Fang*, Zhouchen Lin*, and Bing Liu,
International Conference on Machine Learning (ICML), 2024.
Double Randomized Underdamped Langevin with Dimension-Independent Convergence Guarantee
Yuanshi Liu, Cong Fang*, and Tong Zhang,
Advances in Neural Information Processing Systems (NeurIPS), 2023.
Task-Robust Pre-Training for Worst-Case Downstream Adaptation [arXiv]
Jianghui Wang¹, Yang Chen¹, Xingyu Xie, Cong Fang*, and Zhouchen Lin*,
Advances in Neural Information Processing Systems (NeurIPS), 2023.
Zeroth-order Optimization with Weak Dimension Dependency
Pengyun Yue, Long Yang, Cong Fang*, and Zhouchen Lin*,
Annual Conference on Learning Theory (COLT), 2023.
On the Lower Bound of Minimizing Polyak-Łojasiewicz Functions [arXiv]
Pengyun Yue, Cong Fang*, and Zhouchen Lin*,
Annual Conference on Learning Theory (COLT), 2023.
Layer-Peeled Model: Toward Understanding Well-Trained Deep Neural Networks [arXiv]
Cong Fang, Hangfeng He, Qi Long, and Weijie Su (α-β order),
Proceedings of the National Academy of Sciences (top journal: PNAS), 2021, accepted.
Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Network [arXiv]
Cong Fang, Jason D. Lee, Pengkun Yang, and Tong Zhang (α-β order),
Annual Conference on Learning Theory (COLT), 2021.
Mathematical Models of Overparameterized Neural Networks [arXiv]
Cong Fang, Hanze Dong, and Tong Zhang,
Proceedings of the IEEE (the flagship journal of IEEE: PIEEE), 2021.
How to Characterize the Landscape of Overparameterized Convolutional Neural Networks [paper]
Yihong Gu, Weizhong Zhang, Cong Fang, Jason D. Lee, and Tong Zhang,
Advances in Neural Information Processing Systems (NeurIPS), 2020.
Improved Analysis of Clipping Algorithms for Non-convex Optimization [paper][arXiv]
Bohang Zhang, Jikai Jin, Cong Fang, and Liwei Wang,
Advances in Neural Information Processing Systems (NeurIPS), 2020.
Accelerated First-Order Optimization Algorithms for Machine Learning [paper]
Huan Li*, Cong Fang*, and Zhouchen Lin (*equal contribution),
Proceedings of the IEEE (the flagship journal of IEEE: PIEEE), 2020.
Decentralized Accelerated Gradient Methods With Increasing Penalty Parameters [paper][arXiv]
Huan Li, Cong Fang, Zhouchen Lin, and Wotao Yin,
IEEE Trans. on Signal Processing (top signal processing journal: TSP), 2020.
Training Deep Neural Networks by Lifted Proximal Operator Machines [paper]
Jia Li, Mingqing Xiao, Cong Fang, Yue Dai, Chao Xu, and Zhouchen Lin,
IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI), 2020.
Complexities in Projection-Free Stochastic Non-convex Minimization [paper]
Zebang Shen, Cong Fang, Peilin Zhao, Junzhou Huang, and Hui Qian,
The 22nd International Conference on Artificial Intelligence and Statistics (AISTATS), 2019.
Sharp Analysis for Nonconvex SGD Escaping from Saddle Points [paper][arXiv]
Cong Fang, Zhouchen Lin, and Tong Zhang (α-β order),
Annual Conference on Learning Theory (COLT), 2019.
Lifted Proximal Operator Machines [paper][arXiv]
Jia Li, Cong Fang, and Zhouchen Lin,
Thirty-Second AAAI Conference on Artificial Intelligence (AAAI), 2018.
SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path-Integrated Differential Estimator [paper][arXiv]
Cong Fang, Chris Junchi Li, Zhouchen Lin, and Tong Zhang (α-β order),
Advances in Neural Information Processing Systems (NeurIPS), 2018.
Dictionary Learning with Structured Noise [paper]
Pan Zhou, Cong Fang, Zhouchen Lin, Chao Zhang, and Edward Chang,
Neurocomputing, 2018.
Faster and Non-ergodic O(1/K) Stochastic Alternating Direction Method of Multipliers [paper]
Cong Fang, Feng Cheng, and Zhouchen Lin,
Advances in Neural Information Processing Systems (NeurIPS), 2017.
Parallel Asynchronous Stochastic Variance Reduction for Nonconvex Optimization [paper]
Cong Fang and Zhouchen Lin,
Thirty-First AAAI Conference on Artificial Intelligence (AAAI), 2017.
Feature Learning via Partial Differential Equation with Applications to Face Recognition [paper]
Cong Fang, Zhenyu Zhao, Pan Zhou, and Zhouchen Lin,
Pattern Recognition (PR), 2017.
A Robust Hybrid Method for Text Detection in Natural Scenes by Learning-based Partial Differential Equations [paper]
Zhenyu Zhao, Cong Fang, Zhouchen Lin, and Yi Wu,
Neurocomputing, 2015.

Books

Accelerated Optimization in Machine Learning: First-Order Algorithms [book]
Zhouchen Lin, Huan Li, and Cong Fang, Springer, 2020.
I am responsible for the chapters on stochastic and distributed algorithms (Chapters 5 and 6).

Alternating Direction Method of Multipliers for Machine Learning [book]
Zhouchen Lin, Huan Li, and Cong Fang, Springer, 2020.
I am responsible for the chapter on stochastic algorithms (Chapter 5).