Unsupervised Action Space Learning
Josh Robinson

Abstract
Despite superhuman performance in a variety of domains, deep reinforcement learning algorithms are often sample inefficient, requiring agents to interact with an environment many times before learning an effective policy. Such extensive interaction can be costly, and in some problem domains infeasible. We propose a novel algorithm that enables fully unsupervised learning of an environment’s action space. Learning in this way allows agents to leverage unlabeled training data to discover the available actions and how those actions affect future states. Experiments show that the algorithm is effective across a wide range of environments.
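To make the idea concrete, here is a minimal sketch of one way to discover a discrete action space from unlabeled transitions. This is an illustrative assumption, not the paper's algorithm: it treats each transition's state delta as the signature of a hidden action and clusters the deltas, so each cluster centre acts as a latent action whose effect on future states is its mean delta.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic unlabeled data (hypothetical environment): a point on a 2-D
# plane is moved by one of four hidden actions, never observed directly.
true_actions = np.array([[0.0, 1.0], [0.0, -1.0], [1.0, 0.0], [-1.0, 0.0]])
states = rng.uniform(-5, 5, size=(500, 2))
labels = rng.integers(0, 4, size=500)
next_states = states + true_actions[labels] + rng.normal(0, 0.05, (500, 2))


def discover_actions(s, s_next, k, iters=20):
    """Cluster transition deltas (s' - s); centroids become latent actions.

    Uses farthest-point initialization followed by Lloyd iterations, so
    well-separated clusters each receive one centre deterministically.
    """
    deltas = s_next - s
    centres = [deltas[0]]
    for _ in range(k - 1):
        # Distance from every delta to its nearest existing centre.
        d = np.min(((deltas[:, None] - np.array(centres)[None]) ** 2).sum(-1), axis=1)
        centres.append(deltas[np.argmax(d)])
    centres = np.array(centres)
    for _ in range(iters):
        assign = np.argmin(((deltas[:, None] - centres[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(assign == j):
                centres[j] = deltas[assign == j].mean(axis=0)
    return centres


latent = discover_actions(states, next_states, k=4)
# Each recovered latent action lands near one of the true action vectors.
```

In this toy setting the recovered centroids match the hidden actions up to noise; richer environments would need a learned transition model rather than raw deltas, but the unsupervised signal, regularity in how states change, is the same.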