About me

Hi! I am a Statistics Ph.D. student at the Wharton School, University of Pennsylvania. Previously, I obtained my M.S. in Computer Science (2020 – 2023) and B.S. in Mathematics (2016 – 2020) from Tsinghua University.

I am broadly interested in the theoretical aspects of modern machine learning, recently focusing on transformers and diffusion models. Feel free to reach out if you’d like to have a chat!

News:

  • Mar. 2024: New preprint on accelerating convergence of diffusion models!
  • Mar. 2024: New preprint on the theoretical understanding of self-supervised learning with transformers!
  • Mar. 2024: I will present our work “in-context convergence of transformers” at CISS 2024 (Princeton).
  • Feb. 2024: I will give a talk on “in-context convergence of transformers” at the AI-EDGE seminar.
  • Oct. 2023: New preprint on a theoretical analysis of the in-context learning dynamics of one-layer transformers!