Seminar on Computational Learning and Adaptation


 
Value Driven Agents

Dan Shapiro
Stanford EES & OR Dept, and
DaimlerChrysler Research and Technology North America
dgs@stanford.edu

This talk introduces a novel class of value driven artificial agents that act to maximize an internal measure of reward. This design increases autonomy by motivating agents from a sense of what is important instead of what to do, and it enhances human trust by supporting a theoretical guarantee: a well-aligned value driven agent will maximize human utility as a consequence of learning to maximize its own reward. A value driven agent consists of a reward function and a set of skills that express an approximate plan for how to behave. We represent skills in a reactive language (called Icarus) that employs a hierarchical reinforcement learning algorithm (Sharsha) to develop a policy over the options within skills. Sharsha has proven convergence properties. We discuss Icarus and Sharsha, the process of aligning agent reward with human utility, and the results of two experiments in a simulated vehicle control domain that demonstrate the benefit of the value driven agent architecture. The first shows that the use of structured Icarus plans increases learning rate and performance, and can decrease plan size by three orders of magnitude relative to the common formulation of reinforcement learning problems. This hints at a qualitative change in the scope and efficacy of feasible learning applications. The second experiment shows that different reward functions can generate qualitatively different behavior from the same set of skills. This suggests a novel design method, programming by reward: we can develop agents for a given application domain by changing the reward function, without writing new skills.


Date: Thurs., June 8

Time: 4:15-5:30PM

Place: Cordura 100

