Investigating Sim-to-Real Transfer and Multi-Agent Learning in Assistive Gym

Schaffer, Holden C.
Kemp, Charles C.
As the world's population ages and the number of available caregivers decreases, assistive robotics offers an opportunity for older adults and people with disabilities to continue receiving the care they need. Recent work has shown considerable progress in using deep reinforcement learning to teach robotic caregivers to assist people in simulation, where robots can learn to interact with humans in a safe, controlled manner. However, transferring what a robot has learned in simulation to reality remains a challenge, and the literature offers few techniques for overcoming it in the assistive domain. The first part of this research uses the Assistive Gym simulation framework and its simulated drinking environment to test several approaches to sim-to-real transfer for assistive robotics. The result is a set of baseline steps necessary to transfer the drinking task in Assistive Gym to a physical PR2 robot. Avenues for future work are then addressed by investigating a few potential modifications to the drinking task that could make policy transfer more successful. The second part of the research investigates how multi-agent learning could be implemented in Assistive Gym. It implements multi-agent assistance for the bed-bathing environment, then evaluates how well three algorithms solve this new multi-agent task: two variations of single-agent Proximal Policy Optimization (PPO) modified for multi-agent use, and Multi-Agent Deep Deterministic Policy Gradient (MADDPG). Finally, future work on multi-agent assistance is discussed, namely trying alternate implementations of MADDPG and investigating the dressing environment for its greater potential for cooperation between robots.
Resource Subtype
Undergraduate Thesis