Cong Fang
Assistant Professor
Peking University
Email: fangcong at pku dot edu dot cn
Overview
I am an assistant professor at Peking University. I was a postdoctoral researcher at the University of Pennsylvania, hosted by Prof. Weijie Su and Prof. Qi Long, in 2021, and at Princeton University, hosted by Prof. Jason D. Lee, in 2020. I received my Ph.D. from Peking University, advised by Prof. Zhouchen Lin. I also work closely with Prof. Tong Zhang at UIUC.
My research interests are broadly in machine learning algorithms and theory. I currently work on various aspects of optimization and the foundations of deep learning. The topics include, but are not limited to:
-
Optimization and Sampling: convex and non-convex opt., stochastic opt., distributed/federated opt., min-max opt., gradient-based sampling, etc. I focus on building more realistic mathematical foundations for solvers of machine learning models and on designing faster algorithms.
-
Deep Learning Theory: complexity analysis, feature/representation learning analysis, implicit regularization, etc. I am interested in proposing new models and analyses that lead to a better theoretical understanding of machine learning models, including neural networks.
-
AI Foundations: data-driven causal learning without prior knowledge, understanding the emergence of values and rules, etc. I am interested in proposing new theoretical frameworks and methodologies toward high-level AI.
-
Applications: new architectures and code packages, LLM training, and efficient methods to collect and use data.
I am recruiting self-motivated Ph.D. students and interns with strong mathematical abilities or coding skills to work with me (you do not need to come from a mathematics department). If you are interested, please send your detailed CV to my email. You may also be co-advised by Prof. Zhouchen Lin and may have the opportunity to work with my former advisors and close collaborators, in particular Prof. Tong Zhang and Prof. Weijie Su.
[Selected Papers]  [Books]  [Selected Talks]
-
Causality Pursuit from Heterogeneous Environments via Neural Adversarial Invariance Learning [arXiv]
Yihong Gu, Cong Fang, Peter Bühlmann, and Jianqing Fan,
Understand invariance-based causal learning for general non-linear models.
-
The Implicit Bias of Heterogeneity towards Invariance and Causality [arXiv]
Yang Xu, Yihong Gu, and Cong Fang*,
Advances in Neural Information Processing Systems (NeurIPS), 2024.
Argue that data heterogeneity causes the model to learn beyond association, toward invariance and causality.
-
Accelerated Gradient Algorithms with Adaptive Subspace Search for Instance-Faster Optimization [arXiv]
Yuanshi Liu, Hanzhen Zhao, Pengyun Yue, and Cong Fang*,
Propose adaptive optimization algorithms that achieve provably faster rates on problems with degenerate Hessians.
-
Environment Invariant Linear Least Squares [arXiv]
Jianqing Fan, Cong Fang, Yihong Gu, and Tong Zhang (α-β order),
Annals of Statistics (AoS), 2024.
Understand learning invariance across environments for linear models.
-
On the Lower Bound of Minimizing Polyak-Łojasiewicz Functions [arXiv]
Pengyun Yue, Cong Fang*, and Zhouchen Lin*,
Annual Conference on Learning Theory (COLT), 2023.
Establish a lower bound on the first-order complexity of minimizing Polyak-Łojasiewicz functions.
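For context, the standard Polyak-Łojasiewicz (PL) condition with parameter μ > 0 requires (in LaTeX notation)

    \frac{1}{2}\,\|\nabla f(x)\|^{2} \;\ge\; \mu \bigl( f(x) - f^{*} \bigr) \quad \text{for all } x,

where f^{*} denotes the minimum value of f; the lower bound here concerns first-order methods on this function class.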
-
Layer-Peeled Model: Toward Understanding Well-Trained Deep Neural Networks [arXiv]
Cong Fang, Hangfeng He, Qi Long, and Weijie Su (α-β order),
Proceedings of the National Academy of Sciences (PNAS), 2021.
Propose a simple model to explain and predict some behaviors of neural networks.
-
Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [arXiv]
Cong Fang, Jason D. Lee, Pengkun Yang, and Tong Zhang (α-β order),
Annual Conference on Learning Theory (COLT), 2021.
Prove that gradient descent finds a global minimum for deep neural networks in the mean-field regime.
-
Sharp Analysis for Nonconvex SGD Escaping from Saddle Points [paper][arXiv]
Cong Fang, Zhouchen Lin, and Tong Zhang (α-β order),
Annual Conference on Learning Theory (COLT), 2019.
Propose a new kind of analysis to study non-convex objectives that have continuous Hessian matrices.
-
SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path-Integrated Differential Estimator [paper][arXiv]
Cong Fang, Chris Junchi Li, Zhouchen Lin, and Tong Zhang (α-β order),
Advances in Neural Information Processing Systems (NeurIPS), 2018.
Design a new technique that achieves the first near-optimal rate for finding a stationary point in stochastic non-convex optimization.
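For readers unfamiliar with the estimator, below is a minimal sketch of a SPIDER-style recursive (path-integrated) gradient estimator on a toy finite-sum least-squares problem. The problem, function names, batch size, and step size are illustrative assumptions for exposition, not the paper's implementation.

    import numpy as np

    # Toy finite-sum least-squares problem: f(x) = (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2.
    # All constants below (n, d, batch size, step size) are illustrative choices.
    rng = np.random.default_rng(0)
    n, d = 1000, 20
    A = rng.normal(size=(n, d))
    b = A @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

    def grad(x, idx):
        # Mini-batch gradient of the loss over the samples indexed by idx.
        Ai, bi = A[idx], b[idx]
        return Ai.T @ (Ai @ x - bi) / len(idx)

    def spider_sketch(x, epochs=20, q=50, batch=32, eta=0.05):
        # Every q iterations, refresh the estimator v with a full gradient;
        # in between, update it recursively with mini-batch gradient differences:
        #     v_t = grad_S(x_t) - grad_S(x_{t-1}) + v_{t-1}.
        v, x_prev = None, x.copy()
        for t in range(epochs * q):
            if t % q == 0:
                v = grad(x, np.arange(n))                 # full-gradient refresh
            else:
                idx = rng.choice(n, size=batch, replace=False)
                v = grad(x, idx) - grad(x_prev, idx) + v  # path-integrated update
            x_prev = x.copy()
            x = x - eta * v                               # plain step, for simplicity
        return x

    x_hat = spider_sketch(np.zeros(d))
    print("gradient norm at output:", np.linalg.norm(grad(x_hat, np.arange(n))))

The key point is that between full-gradient refreshes each step needs only two mini-batch gradients, while the recursion keeps the estimator's variance small along the optimization path.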
-
Mathematical Models of Overparameterized Neural Networks [arXiv]
Cong Fang, Hanze Dong, and Tong Zhang,
Proceedings of the IEEE (the flagship journal of the IEEE: PIEEE), 2021.
-
Accelerated First-Order Optimization Algorithms for Machine Learning [paper]
Huan Li*, Cong Fang*, and Zhouchen Lin (*equal contribution),
Proceedings of the IEEE (the flagship journal of the IEEE: PIEEE), 2020.
-
Accelerated Optimization in Machine Learning: First-Order Algorithms [book]
Zhouchen Lin, Huan Li, and Cong Fang, Springer, 2020.
I wrote the chapters introducing stochastic and distributed algorithms (Chapters 5 and 6).
- Layer-Peeled Model: Toward Understanding Well-Trained Deep Neural Networks
University of Pennsylvania, 2021.
- Stochastic Nonconvex Optimization, SPIDER
Guest lecture for EE539 at Princeton University, invited by Chi Jin, 2021.
- Convex formulation of Overparameterized Deep Neural Networks
Theory of Deep Learning Conference at Duke, invited by Rong Ge, 2020.