Enhancing Tactile-based Reinforcement Learning for Robotic Control

University of Edinburgh


TL;DR:
1. Superhuman dexterity is possible with only proprioception and sparse binary contacts.
2. Tactile sensing can provide distinct benefits over proprioceptive histories in specific scenarios.
3. Self-supervised learning (SSL) is an effective solution to the representation problem in RL for sensory-based agents.

Abstract

Achieving safe, reliable real-world robotic manipulation requires agents to evolve beyond vision and incorporate tactile sensing to overcome sensory deficits and reliance on idealised state information. Despite its potential, the efficacy of tactile sensing in reinforcement learning (RL) remains inconsistent.

We address this by developing self-supervised learning (SSL) methodologies to more effectively harness tactile observations, focusing on a scalable setup of proprioception and sparse binary contacts. We empirically demonstrate that sparse binary tactile signals are critical for dexterity, particularly for interactions that proprioceptive control errors do not register, such as decoupled robot-object motions.
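To make the "proprioception and sparse binary contacts" setup concrete, here is a minimal sketch of how such an observation might be assembled. All names, dimensions, and the threshold value are illustrative assumptions, not the paper's actual interface:

```python
import numpy as np

# Hypothetical sensor readings (dimensions are illustrative, not the paper's).
rng = np.random.default_rng(0)
joint_pos = rng.uniform(-1.0, 1.0, size=16)      # proprioception: joint positions
joint_vel = rng.uniform(-1.0, 1.0, size=16)      # proprioception: joint velocities
contact_force = rng.uniform(0.0, 2.0, size=10)   # raw fingertip contact forces

# Sparse binary contacts: threshold raw forces into touch / no-touch flags,
# discarding magnitude. The threshold here is an arbitrary placeholder.
CONTACT_THRESHOLD = 0.5
binary_contacts = (contact_force > CONTACT_THRESHOLD).astype(np.float32)

# Agent observation: proprioception concatenated with binary contact flags.
obs = np.concatenate([joint_pos, joint_vel, binary_contacts]).astype(np.float32)
```

The point of the binary signal is that it registers events, such as an object leaving or striking the hand, that leave no trace in joint-level control errors.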

Our agents achieve superhuman dexterity in complex contact tasks (ball bouncing and Baoding ball rotation). Furthermore, we find that decoupling the SSL memory from the on-policy memory can improve performance. We release the Robot Tactile Olympiad (RoTO) benchmark to standardise and promote future research in tactile-based manipulation.
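The idea of decoupling the two memories can be sketched structurally: one recurrent state drives the policy, while a separate recurrent state feeds only a self-supervised dynamics head that predicts the next observation. Everything below (the toy tanh recurrence, dimensions, and weights) is an illustrative assumption rather than the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, ACT_DIM, MEM_DIM = 8, 4, 16

def recurrent_step(mem, obs, W_m, W_o):
    """One step of a toy recurrent memory: tanh(W_m @ mem + W_o @ obs)."""
    return np.tanh(W_m @ mem + W_o @ obs)

# Two independent memories with their own parameters: the policy's on-policy
# memory, and an auxiliary memory updated only for the SSL objective.
W_m_pi  = rng.normal(size=(MEM_DIM, MEM_DIM)) * 0.1
W_o_pi  = rng.normal(size=(MEM_DIM, OBS_DIM)) * 0.1
W_m_ssl = rng.normal(size=(MEM_DIM, MEM_DIM)) * 0.1
W_o_ssl = rng.normal(size=(MEM_DIM, OBS_DIM)) * 0.1
W_pred  = rng.normal(size=(OBS_DIM, MEM_DIM + ACT_DIM)) * 0.1  # dynamics head

mem_pi = np.zeros(MEM_DIM)
mem_ssl = np.zeros(MEM_DIM)

obs = rng.normal(size=OBS_DIM)
for _ in range(5):
    act = rng.normal(size=ACT_DIM)       # stand-in for the policy's action
    next_obs = rng.normal(size=OBS_DIM)  # stand-in for the environment step

    mem_pi = recurrent_step(mem_pi, obs, W_m_pi, W_o_pi)      # drives the policy
    mem_ssl = recurrent_step(mem_ssl, obs, W_m_ssl, W_o_ssl)  # drives the SSL head

    # Self-supervised dynamics loss: predict the next observation from the
    # auxiliary memory and the action. Gradients of this loss would update
    # only the SSL branch, leaving the policy's memory untouched.
    pred = W_pred @ np.concatenate([mem_ssl, act])
    ssl_loss = float(np.mean((pred - next_obs) ** 2))

    obs = next_obs
```

The design choice being sketched: because the SSL branch has its own memory, the representation-learning signal cannot interfere with the recurrent state the policy relies on during on-policy updates.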

Best PPO agent

Trained end-to-end

Our best agent

Trained with self-supervised dynamics + an auxiliary memory

Best PPO agent

Trained end-to-end

Our best agent

Trained with self-supervised dynamics

Best PPO agent

Trained end-to-end

Our best agent

Trained with self-supervised dynamics

BibTeX

@inproceedings{miller2025tactilerl,
  author    = {Miller, Elle and McInroe, Trevor and Abel, David and Mac Aodha, Oisin and Vijayakumar, Sethu},
  title     = {Enhancing Tactile-based Reinforcement Learning for Robotic Control},
  booktitle = {NeurIPS},
  year      = {2025},
}