Seth O. Rogers. Symbolic Performance & Learning in
Continuous-valued Environments. PhD Thesis, The University of
Michigan, Dept. of Computer Science and Electrical Engineering,
January 1997.
Abstract:
Real-world and simulated real-world domains, such as flying and
driving, commonly have the characteristics of continuous-valued (CV)
environments. These environments are frequently complex and difficult
to control, requiring a great deal of specific, detailed knowledge.
Although past approaches to learning control policies employed various
forms of numerical processing, symbolic agents can also perform and
learn in CV environments. There are both functional and theoretical
motivations for choosing symbolic processing. SPLICE (Symbolic
Performance & Learning In Continuous-Valued Environments) is a
symbolic agent for adaptive control implemented in the Soar
architecture. SPLICE uses a three-level framework to first classify
its sensory information into symbolic regions, then map the set of
regions to a local model, then use the local model to determine an
action that the agent predicts will achieve the goal in the current
situation. The agent monitors the results of the action and learns by
changing its action mapping and local models. Learning causes the
models to gradually become more specific and more accurate. SPLICE is
suited to complex dynamic environments because its incremental
learning ensures the response time will not increase as the agent
gains more experience, and it can be proven to achieve any goal for
any environment that meets certain assumptions. The performance of
the SPLICE agent is evaluated in simple domains using four
experimental methodologies. SPLICE is shown to exhibit flexibility in
learning a variety of domains. Lesion studies show that SPLICE's most
effective capabilities are generalization and constructing local linear
response functions. SPLICE's performance is compared to other
adaptive control algorithms and found to behave comparably. We also
instantiate SPLICE in a complex environment, simulated flight, and
evaluate its performance. The final goal is not only to create an
effective controller for complex continuous environments, but also to
understand more clearly the ramifications of symbolic learning in
continuous domains.
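
The classify, model, act, and learn cycle described in the abstract can be
illustrated with a minimal sketch. It is not the thesis's Soar
implementation: the names RegionAgent, LocalModel, toy_environment, and
run_trials, the fixed-width regions, the one-parameter linear model, and
the learning rate are all assumptions chosen only to make the loop
concrete.

```python
from dataclasses import dataclass, field


@dataclass
class LocalModel:
    # Hypothetical local linear response: outcome is about slope * action.
    slope: float = 1.0


@dataclass
class RegionAgent:
    # Illustrative agent; the names and parameters are assumptions, not SPLICE's.
    region_width: float = 1.0
    learning_rate: float = 0.5
    models: dict = field(default_factory=dict)

    def classify(self, sensor: float) -> int:
        # Level 1: map a continuous reading to a symbolic region index.
        return int(sensor // self.region_width)

    def model_for(self, sensor: float) -> LocalModel:
        # Level 2: map the region to its local model, creating one on demand.
        return self.models.setdefault(self.classify(sensor), LocalModel())

    def act(self, sensor: float, goal: float) -> float:
        # Level 3: pick the action the local model predicts will reach the goal.
        return goal / self.model_for(sensor).slope

    def learn(self, sensor: float, action: float, outcome: float) -> None:
        # Move the local slope toward the response that was actually observed.
        if action == 0:
            return
        model = self.model_for(sensor)
        model.slope += self.learning_rate * (outcome / action - model.slope)


def toy_environment(sensor: float, action: float) -> float:
    # Assumed plant: the gain differs between two regions of the sensor range.
    gain = 2.0 if sensor < 1.0 else 0.5
    return gain * action


def run_trials(agent: RegionAgent, sensor: float, goal: float, trials: int) -> list:
    # Repeat act, observe, learn; record the absolute goal error of each trial.
    errors = []
    for _ in range(trials):
        action = agent.act(sensor, goal)
        outcome = toy_environment(sensor, action)
        errors.append(abs(goal - outcome))
        agent.learn(sensor, action, outcome)
    return errors
```

In this toy setting, calling run_trials(RegionAgent(), 0.5, 4.0, 20) yields
an error sequence that shrinks as the local slope estimate for that region
approaches the environment's gain, and a second call with a sensor value
in the other region adds a separate local model rather than altering the
first.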
Keywords: artificial intelligence, machine learning,
function approximation, incremental learning, regression
Postscript (3903 KB, 176 pages)