Title:
Deep Reinforcement Learning for the Velocity Control of a Magnetic, Tethered Differential-Drive Robot
Author(s)
Rawal, Devarsi Paresh
Advisor(s)
Pradalier, Cédric
Lau, Mackenzie
Ha, Sehoon
Abstract
The ROBOPLANET Altiscan crawler is a magnetic-wheeled, differential-drive robot being explored as a way to aid, if not completely replace, humans in the inspection and maintenance of marine vessels. Reliable velocity control of the crawler is crucial to establishing trust among its operators. However, because of its elongated magnetic wheels and umbilical tether, the crawler operates in a complex environment rich with nonlinear dynamics, which makes control challenging. Model-based control approaches, which aim to formalize the physics of the system mathematically, require in-depth knowledge of the domain.
Reinforcement learning (RL) is a trial-and-error-based approach that can solve control problems in nonlinear systems. To accommodate high-dimensional, continuous state spaces, deep neural networks (DNNs) can be used as nonlinear function approximators to extend RL, yielding a method known as deep reinforcement learning (DRL). DRL coupled with a simulated environment provides a way for a model to learn physics-naive control. This thesis explored the efficacy of a DRL algorithm, proximal policy optimization (PPO), in learning velocity control of the Altiscan crawler by modeling its operating environment in Isaac Gym, a novel GPU-accelerated simulator. The evaluation measured the error between the target velocities and the base velocities of the crawler that resulted from the actions of the DRL model, in six different environments. Two variants of PPO, standard and recurrent, were compared against the inverse velocity kinematics model of a differential-drive robot. The results show that velocity control in simulation is possible using PPO, but evaluation on the real crawler is needed to reach a meaningful conclusion.
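For readers unfamiliar with the baseline, the following is a minimal sketch of the textbook inverse velocity kinematics of an ideal differential-drive robot, together with a simple per-component velocity error. It is not taken from the thesis: the class and function names, the wheel radius, and the track width are hypothetical, and the model ignores the magnetic adhesion, wheel slip, and tether effects that motivate the learned controller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiffDrive:
    """Ideal differential-drive geometry (values are illustrative assumptions)."""
    wheel_radius: float  # metres
    track_width: float   # metres, distance between the two wheels


def inverse_kinematics(robot, v, omega):
    """Map a target base velocity (v in m/s, omega in rad/s)
    to left and right wheel angular velocities in rad/s."""
    half_track = robot.track_width / 2.0
    w_left = (v - omega * half_track) / robot.wheel_radius
    w_right = (v + omega * half_track) / robot.wheel_radius
    return w_left, w_right


def forward_kinematics(robot, w_left, w_right):
    """Map wheel angular velocities back to the base velocity (v, omega)."""
    v = robot.wheel_radius * (w_left + w_right) / 2.0
    omega = robot.wheel_radius * (w_right - w_left) / robot.track_width
    return v, omega


def velocity_error(target, measured):
    """Absolute error per component between target and measured (v, omega)."""
    return tuple(abs(t - m) for t, m in zip(target, measured))


if __name__ == "__main__":
    robot = DiffDrive(wheel_radius=0.05, track_width=0.30)  # hypothetical sizes
    target = (0.2, 0.5)
    wheels = inverse_kinematics(robot, *target)
    measured = forward_kinematics(robot, *wheels)
    print(wheels, velocity_error(target, measured))
```

On an ideal robot the round trip reproduces the target velocity, so the error is zero up to floating-point rounding. On a real or simulated crawler the measured velocity would come from the platform rather than from `forward_kinematics`, and the error would reflect the unmodeled dynamics.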
Date Issued
2022-12-15
Resource Type
Text
Resource Subtype
Thesis