publications

2025

  1. Conference
    Simple Policy Optimization
    Zhengpeng Xie, Qiang Zhang, Fan Yang, Marco Hutter, and Renjing Xu
    In Forty-second International Conference on Machine Learning, 2025
  2. Preprint
    Representation Convergence: Mutual Distillation is Secretly a Form of Regularization
    Zhengpeng Xie, Jiahang Cao, Qiang Zhang, Jianxiong Zhang, Changwei Wang, and Renjing Xu
    2025
  3. Conference
    Zeroth-Order Optimization is Secretly Single-Step Policy Optimization
    Junbin Qiu, Zhengpeng Xie, Xiangda Yan, Yongjie Yang, and Yao Shu
    In Tiny Titans: The next wave of On-Device Learning for Foundational Models (TTODLer-FM), 2025