I am a PhD student majoring in Computer Science at Shanghai Jiao Tong University (SJTU), supervised by Prof. Yue Gao. Currently, I conduct research on Embodied AI at the Shanghai Innovation Institute and the MoE Key Lab of Artificial Intelligence. Previously, I received my bachelor's degree in Computer Science from the IEEE honor class at SJTU.

My research interests include:

(1) Reinforcement Learning (RL) algorithms: robust RL algorithms; improving sample efficiency; multi-task/meta RL algorithms.

(2) Embodied Artificial Intelligence: Vision-Language-Action (VLA)-based robot manipulation; RL-based locomotion and imitation on legged/humanoid robots.

I am a final-year CS PhD student, expected to graduate in March 2026. I am currently looking for a position in Reinforcement Learning or Embodied AI.

📝 Publications

🧠 Reinforcement Learning Algorithms

ICLR 2025

Select before Act: Spatially Decoupled Action Repetition for Continuous Control

Buqing Nie, Yangqing Fu, Yue Gao. arXiv, OpenReview

  • a flexible action repetition framework for continuous control.
  • higher efficiency, superior performance, reduced fluctuation.
  • first work to incorporate spatial features into temporal abstraction.
AAAI 2024

Improve Robustness of Reinforcement Learning against Observation Perturbations via $l_\infty$ Lipschitz Policy Networks

Buqing Nie, Jingtian Ji, Yangqing Fu, Yue Gao. arXiv

  • improve certified robustness under observation adversaries.
  • first work to improve robustness using the Lipschitz property.
  • improve performance by over 20% (30% under strong perturbations).
Under Review

Action Robust Reinforcement Learning via Optimal Adversary Aware Policy Optimization

Buqing Nie, Yangqing Fu, Yue Gao. arXiv

  • improve robustness under various action adversaries.
  • formulate the OA-PI framework with theoretical proofs.
  • training without finding adversaries explicitly.

🤖 Robotics & Embodied Artificial Intelligence

arXiv

Symmetry Equivariant Deep Reinforcement Learning Policy for Humanoid Robots

Buqing Nie, et al.

  • DRL-based humanoid robot policy with strict symmetry equivariance.
  • simple to implement without additional hyper-parameters.
  • higher tracking accuracy with coordinated motions.
ICRA 2022

DanceHAT: Generate Stable Dances for Humanoid Robots with Adversarial Training

Buqing Nie and Yue Gao. Paper

  • humanoid robot imitation learning using adversarial training.
  • first learning-based imitation learning (IL) work for humanoid robots that addresses stability.
RAL 2024

Robust Locomotion Policy with Adaptive Lipschitz Constraint for Legged Robots

Yang Zhang, Buqing Nie, and Yue Gao. Paper

  • introduce an adaptive Lipschitz constraint for quadruped locomotion tasks.
  • smoother actions, lower energy cost, and robustness to observation noise and disturbances.

📖 Education

  • 2022.04 - 2026.03, PhD Candidate (combined master and doctoral program), Computer Science, Department of Computer Science, Shanghai Jiao Tong University.
  • 2020.09 - 2022.04, Master, Control Science and Engineering, Department of Automation, Shanghai Jiao Tong University.
  • 2016.06 - 2020.04, Bachelor, Computer Science (IEEE honor class), Department of Computer Science, Shanghai Jiao Tong University.

💻 Academic Services

I serve as a reviewer for AI and robotics conferences and journals, including ICLR 2025, NeurIPS 2025, AAAI 2025-2026, ICRA 2025, IROS 2025, and RAL 2024-2025.

🎖 Honors and Awards

  • 2021.11 Huawei Scholarship
  • 2019.05 MCM/ICM Meritorious Winner (team leader)