Multi Path Gait Control Method for Bipedal Robots based on Deep Reinforcement Learning

Lehui Lin; Pingli Lv

doi:10.12694/scpe.v26i5.4790

Published: Jul 14, 2025

Keywords:

Reinforcement learning; Bipedal robot; Multi path gait control; Actor critic; Robot robustness

Lehui Lin

Dongbei University of Finance and Economics, Dalian 116025, China

Pingli Lv

Xuzhou Industrial Vocational and Technical College, Xuzhou 221140, China

Abstract

We propose a multipath gait control strategy based on deep reinforcement learning (DRL) for bipedal robot motion planning on diverse and challenging terrains. Traditional control methods, such as PID controllers and model-based motion planning, often struggle in complex environments because they rely on precise mathematical models or predefined rules, which makes them ill-suited to nonlinear, uncertain, and dynamic settings. They also have difficulty adapting their control strategies on unpredictable, changing terrain, where unforeseen disturbances can lead to instability or failure. Because DRL combines deep learning with reinforcement learning, it can independently learn optimal control policies from environmental feedback without requiring a precise model. In this work, we use DRL algorithms built on actor-critic (AC) architectures (DDPG, TRPO, PPO, A3C, SAC, etc.) to enable reliable gait control of bipedal robots in a continuous motion environment. DRL addresses the difficulty that traditional approaches have in converging in complicated environments: it copes effectively with the high nonlinearity of complex terrain and adapts its strategy through continuous interaction with the environment. Using goal-conditioned techniques, we built a motion planning model and tested it on the Cassie hardware platform. The experimental results show that the approach successfully transfers the policy learned in simulation to the real environment, and that the robot accurately completes the goal task without global position feedback. It can also perform a variety of complex tasks, such as jumping on discontinuous and flat terrain. Furthermore, multithreaded asynchronous training and randomized strategy selection give the method significant robustness and adaptability, addressing the shortcomings of conventional motion planning methods in hyperparameter tuning and strategy convergence.
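
As a rough illustration of the actor-critic idea named in the abstract, the sketch below performs one environment step and one TD(0) actor-critic update on a toy one-dimensional goal-reaching task. It is not the authors' implementation and has nothing to do with the Cassie controller: the task, the linear function approximators, the hyperparameter values, and all names (`actor_features`, `critic_features`, `actor_critic_step`) are assumptions made for illustration. The observation is goal-conditioned in the simplest sense, in that the policy sees only the offset to the goal rather than an absolute position.

```python
import random


def actor_features(position, goal):
    """Goal-conditioned observation for the policy: offset to the goal plus a bias."""
    return [goal - position, 1.0]


def critic_features(position, goal):
    """Features for the value estimate: distance to the goal plus a bias."""
    return [abs(goal - position), 1.0]


def dot(weights, x):
    return sum(w * xi for w, xi in zip(weights, x))


def actor_critic_step(actor_w, critic_w, position, goal, rng,
                      sigma=0.5, gamma=0.95, lr_actor=0.01, lr_critic=0.05):
    """Take one step in the toy task and update both weight lists in place.

    Returns the next position and the TD error used for the update.
    All hyperparameter defaults are illustrative assumptions.
    """
    x_actor = actor_features(position, goal)
    x_critic = critic_features(position, goal)

    # Actor: sample an action from a Gaussian policy with a linear mean.
    mean = dot(actor_w, x_actor)
    action = rng.gauss(mean, sigma)

    # Toy environment: the displacement per step is clipped to [-1, 1],
    # and the reward is the negative distance to the goal after moving.
    next_position = position + max(-1.0, min(1.0, action))
    reward = -abs(goal - next_position)

    # Critic: one-step temporal-difference error.
    next_value = dot(critic_w, critic_features(next_position, goal))
    td_error = reward + gamma * next_value - dot(critic_w, x_critic)

    # Critic update: semi-gradient TD(0) on the linear value estimate.
    for i, xi in enumerate(x_critic):
        critic_w[i] += lr_critic * td_error * xi

    # Actor update: policy gradient weighted by the TD error.
    # For a Gaussian policy, d/dw log pi(a) = (a - mean) / sigma^2 * x.
    score = (action - mean) / (sigma ** 2)
    for i, xi in enumerate(x_actor):
        actor_w[i] += lr_actor * td_error * score * xi

    return next_position, td_error
```

Calling `actor_critic_step` repeatedly over many short episodes with randomly drawn goals would form a basic training loop. The algorithms listed in the abstract (DDPG, TRPO, PPO, A3C, SAC) replace the linear approximators with neural networks and add their own stabilizing machinery, none of which is shown here.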

Issue

Vol. 26 No. 5 (2025)

Section

Special Issue - Adaptive AI-ML Technique for 6G/ Emerging Wireless Networks

This work is licensed under a Creative Commons Attribution 4.0 International License.
