Seminar on Computational Learning and Adaptation




A Case Study of Data Mining in a Complex Technical Domain


Michael Borth
Daimler-Benz AG
Research and Technology, FT3/KL
P.O. Box 2360
89013 Ulm, Germany
Michael.Borth@dbag.ulm.DaimlerBenz.COM
mborth@engr.sgi.com [until 10/30/98]



Determining the key attributes that predict car breakdowns is one of many data-mining tasks we have studied at the Daimler-Benz Research Center in Ulm, Germany. The high complexity of the technical systems in question and an unknown number of outside influences make a theoretical analysis of this problem impossible, but the availability of technical data makes data mining a natural approach. In this talk we present a case study of recent work in this area. The aim was to identify, within a production line, a subset of cars with accelerated usage that show early the characteristics other cars will exhibit at a later date. These "early identification" cars let one predict (and sometimes prevent) breakdowns and identify car configurations with atypical behavior, thus pointing to technical problems. Our approach relies on the CRISP-DM (CRoss Industry Standard Process for Data Mining) model, which includes data understanding and data preparation as major phases. As our results show, these phases are vital to the data-mining process and can be more important than the actual induction step. Searching for ways to improve our predictions and our understanding of the system, we then take a closer look at the data-mining process itself, once more with the hope of analyzing a complex system.

This talk describes joint work with Ruediger Wirth and Thomas Reinartz.


Date: Thursday, October 22; Time: 4:15-5:30 PM; Place: Gates 104


The goal of this seminar is to increase communication among local researchers with interests in computational approaches to learning and adaptation. If you would like to be added to (or removed from) the mailing list, or if you are interested in giving a talk in the seminar, please send email to iba@isle.org.

