Seth O. Rogers. Symbolic Performance & Learning in Continuous-valued Environments. PhD Thesis, The University of Michigan, Dept. of Computer Science and Electrical Engineering, January 1997.

Abstract:

Real-world and simulated real-world domains, such as flying and driving, commonly have the characteristics of continuous-valued (CV) environments. These environments are frequently complex and difficult to control, requiring a great deal of specific, detailed knowledge. Although past approaches to learning control policies employed various forms of numerical processing, symbolic agents can also perform and learn in CV environments. There are both functional and theoretical motivations for choosing symbolic processing. SPLICE (Symbolic Performance & Learning In Continuous-Valued Environments) is a symbolic agent for adaptive control implemented in the Soar architecture. SPLICE uses a three-level framework to first classify its sensory information into symbolic regions, then map the set of regions to a local model, then use the local model to determine an action that the agent predicts will achieve the goal in the current situation. The agent monitors the results of the action and learns by changing its action mapping and local models. Learning causes the models to gradually become more specific and more accurate. SPLICE is suited to complex dynamic environments because its incremental learning ensures the response time will not increase as the agent gains more experience, and it can be proven to achieve any goal for any environment that meets certain assumptions. The performance of the SPLICE agent is evaluated in simple domains using four experimental methodologies. SPLICE is shown to exhibit flexibility in learning a variety of domains. Lesion studies show that SPLICE's most effective capabilities are generalization and constructing local linear response functions. SPLICE's performance is compared to that of other adaptive control algorithms and found to be comparable. We also instantiate SPLICE in a complex environment, simulated flight, and evaluate its performance.
The final goal is not only to create an effective controller for complex continuous environments, but also to understand more clearly the ramifications of symbolic learning in continuous domains.
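
The three-level loop described in the abstract (classify a sensor reading into a region, look up a local model for that region, use the model to pick an action, then learn from the observed result) can be illustrated with a toy sketch. This is not the thesis's Soar implementation: the names `classify`, `LocalModel`, `Agent`, and `toy_environment`, the fixed-width binning, the single-sensor simplification, and the secant-style slope update are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


def classify(reading: float, low: float = 0.0, high: float = 1.0, bins: int = 4) -> int:
    """Level 1 (assumed scheme): map a continuous reading to a symbolic region index."""
    clipped = min(max(reading, low), high)
    return min(int((clipped - low) / (high - low) * bins), bins - 1)


@dataclass
class LocalModel:
    """Hypothetical local linear response model: outcome ~ slope * action + offset."""
    slope: float = 1.0
    offset: float = 0.0
    last: Optional[Tuple[float, float]] = None  # previous (action, outcome) pair

    def choose_action(self, goal: float) -> float:
        """Level 3: invert the linear model to find the action predicted to reach the goal."""
        return (goal - self.offset) / self.slope

    def learn(self, action: float, outcome: float) -> None:
        """Refine the model from the observed outcome (secant-style update, an assumption)."""
        if self.last is not None:
            prev_action, prev_outcome = self.last
            if abs(action - prev_action) > 1e-12:
                new_slope = (outcome - prev_outcome) / (action - prev_action)
                if abs(new_slope) > 1e-12:  # keep the model invertible
                    self.slope = new_slope
        self.offset = outcome - self.slope * action
        self.last = (action, outcome)


class Agent:
    """Level 2: keeps one local model per region, created on first visit."""

    def __init__(self) -> None:
        self.models: Dict[int, LocalModel] = {}

    def step(self, reading: float, goal: float,
             environment: Callable[[float, float], float]) -> float:
        region = classify(reading)
        model = self.models.setdefault(region, LocalModel())
        action = model.choose_action(goal)
        outcome = environment(reading, action)  # act, then monitor the result
        model.learn(action, outcome)
        return outcome


def toy_environment(reading: float, action: float) -> float:
    """Made-up linear plant used only to exercise the loop."""
    return 2.0 * action + reading


agent = Agent()
outcomes = [agent.step(0.3, 1.0, toy_environment) for _ in range(3)]
```

In this toy setting the outcome moves toward the goal as the local model is corrected after each action. The per-step cost does not grow with experience here because each step touches only one region's model, which loosely mirrors the abstract's point about response time.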

Keywords: artificial intelligence, machine learning, function approximation, incremental learning, regression

Postscript (3903 KB, 176 pages)