Seminar on Computational Learning and Adaptation
Value Driven Agents
Dan Shapiro
Stanford EES & OR Dept, and
DaimlerChrysler Research and Technology North America
dgs@stanford.edu
This talk introduces a novel class of value driven artificial agents
that act to maximize an internal measure of reward. This design
increases autonomy by motivating agents from a sense of what is
important instead of what to do, and it enhances human trust by
supporting a theoretical guarantee: a well-aligned value driven agent
will maximize human utility as a consequence of learning to maximize
its own reward. A value driven agent consists of a reward function
and a set of skills that express an approximate plan for how to
behave. We represent skills in a reactive language (called Icarus)
that employs a hierarchical reinforcement learning algorithm (Sharsha)
to develop a policy over the options within skills. Sharsha has
proven convergence properties. We discuss Icarus and Sharsha, the
process of aligning agent reward with human utility, and the results
of two experiments in a simulated vehicle control domain that
demonstrate the benefit of the value driven agent architecture. The
first shows that the use of structured Icarus plans increases learning
rate and performance, and can decrease plan size by three orders of
magnitude relative to the common formulation of reinforcement learning
problems. This hints at a qualitative change in the scope and
efficacy of feasible learning applications. The second experiment
shows that different reward functions can generate qualitatively
different behavior from the same set of skills. This suggests a novel
design method: we can develop agents for a given application domain
without writing new skills via programming by reward.
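
As a rough illustration of "programming by reward", the sketch below trains two agents that share one set of options and differ only in their reward function. It is not Icarus or Sharsha: it uses ordinary tabular SARSA, and the two-state driving domain, the option names ("follow", "pass"), the DYNAMICS table, and the maneuver_cost parameter are all assumptions invented for this example.

```python
import random

# Hypothetical toy domain (an assumption, not the talk's simulator):
# (state, option) -> (speed, maneuvers, next_state)
DYNAMICS = {
    ("clear", "follow"): (1.0, 0, "blocked"),
    ("clear", "pass"): (1.0, 1, "clear"),
    ("blocked", "follow"): (0.4, 0, "blocked"),
    ("blocked", "pass"): (1.0, 1, "clear"),
}
STATES = ("clear", "blocked")
OPTIONS = ("follow", "pass")


def make_reward(maneuver_cost):
    """Build a reward function that values speed and penalizes maneuvers."""
    return lambda speed, maneuvers: speed - maneuver_cost * maneuvers


def choose(q, state, rng, epsilon):
    """Epsilon-greedy choice among the options available in a state."""
    if rng.random() < epsilon:
        return rng.choice(OPTIONS)
    return max(OPTIONS, key=lambda o: q[(state, o)])


def learn_policy(reward, steps=20000, alpha=0.1, gamma=0.5,
                 epsilon=0.1, seed=0):
    """Run plain tabular SARSA and return the greedy option per state."""
    rng = random.Random(seed)
    q = {(s, o): 0.0 for s in STATES for o in OPTIONS}
    state = "clear"
    option = choose(q, state, rng, epsilon)
    for _ in range(steps):
        speed, maneuvers, nxt = DYNAMICS[(state, option)]
        nxt_option = choose(q, nxt, rng, epsilon)
        # On-policy (SARSA) target: uses the option actually chosen next.
        target = reward(speed, maneuvers) + gamma * q[(nxt, nxt_option)]
        q[(state, option)] += alpha * (target - q[(state, option)])
        state, option = nxt, nxt_option
    return {s: max(OPTIONS, key=lambda o: q[(s, o)]) for s in STATES}


# Same options, two reward functions.
hurried = learn_policy(make_reward(0.1))   # maneuvers are cheap
cautious = learn_policy(make_reward(2.0))  # maneuvers are expensive
print(hurried["blocked"], cautious["blocked"])
```

In this toy setting the agent with the cheap maneuver cost would be expected to pass when blocked, while the agent with the expensive maneuver cost stays behind the slower car, even though neither the options nor the learning rule changed.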
Date: Thurs., June 8
Time: 4:15-5:30PM
Place: Cordura 100