Abstract by Aadesh Neupane
Learning from Linear Temporal Logic based Goal Formulation
Artificial agents will learn safe and reliable behaviors only when given well-defined goal specifications. Natural language is highly expressive for goal specification but, unlike objective functions and reward signals, it is ambiguous and ill-defined. Temporal logic (TL) based goal specification strikes a balance between goal expressiveness and computability. However, automaton-based TL approaches incur doubly exponential complexity to accomplish a goal. Thus, fulfilling complex and dependent goals is inefficient with the standard temporal logic specification.
This work addresses this limitation by using the recognizer-generator duality from the theory of computation to propose an algorithm, GenRecProp. The algorithm learns, in exponential time, to generate behaviors that satisfy the goal specification using an error signal from the recognizer. We also show that GenRecProp can specify a variety of goal types and generate behaviors that satisfy those goals for the Taxi and Key-Door MDP problems.
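
To make the recognizer-generator idea concrete, the following is a minimal sketch and not the GenRecProp algorithm itself. It assumes a hypothetical four-cell corridor loosely inspired by the Key-Door problem, finite-trace readings of the temporal operators "eventually" and "until", and a brute-force generator that enumerates action sequences until the recognizer accepts one. All names (`step`, `rollout`, `recognizer`, `generate`) and the world layout are illustrative assumptions.

```python
from itertools import product

ACTIONS = ("left", "right")
# Hypothetical world: cells 0..3, key at cell 0, door at cell 3, agent starts at cell 1.
INIT = {"pos": 1, "has_key": False, "door_open": False}

def eventually(p, trace):
    """F p on a finite trace: p holds at some step."""
    return any(p(s) for s in trace)

def until(p, q, trace):
    """p U q on a finite trace: q holds at some step, p holds at every step before it."""
    for s in trace:
        if q(s):
            return True
        if not p(s):
            return False
    return False

def step(state, action):
    """Assumed transition model: move, pick up the key at cell 0, open the door at cell 3."""
    delta = 1 if action == "right" else -1
    pos = max(0, min(3, state["pos"] + delta))
    has_key = state["has_key"] or pos == 0
    door_open = state["door_open"] or (pos == 3 and has_key)
    return {"pos": pos, "has_key": has_key, "door_open": door_open}

def rollout(actions):
    """Run an action sequence from INIT and return the visited states."""
    trace = [INIT]
    for a in actions:
        trace.append(step(trace[-1], a))
    return trace

def recognizer(trace):
    """Goal: (stay away from the door U have the key) and (F door open)."""
    key_first = until(lambda s: s["pos"] != 3, lambda s: s["has_key"], trace)
    opened = eventually(lambda s: s["door_open"], trace)
    return key_first and opened

def generate(max_len):
    """Brute-force generator: return the first action sequence the recognizer accepts."""
    for n in range(1, max_len + 1):
        for actions in product(ACTIONS, repeat=n):
            if recognizer(rollout(actions)):
                return actions
    return None

print(generate(6))
```

The enumeration visits a number of candidates that grows exponentially with the sequence length, which is why a learned generator guided by the recognizer's error signal is attractive; this sketch only shows the accept/reject interface between the two components.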