Seminar on Computational Learning and Adaptation


 
Lazy Bayesian Rules:
A Technique for Making Highly Accurate Predictions

Zijian Zheng
Data Mining Applications
Blue Martini Software, Inc.
San Mateo, California

zijian@bluemartini.com


The naive Bayesian classifier provides a simple and effective approach to classifier learning, but its attribute independence assumption is often violated in the real world. A number of approaches have sought to alleviate this problem. A Bayesian tree learning algorithm builds a decision tree and generates a local naive Bayesian classifier at each leaf. The tests leading to a leaf can alleviate attribute inter-dependencies for the local naive Bayesian classifier. However, Bayesian tree learning still suffers from the replication, fragmentation, and small disjunct problems of tree learning. While inferred Bayesian trees demonstrate low average prediction error rates, there is reason to believe that error rates will be higher for those leaves with few training examples. In this talk, I apply lazy learning techniques to Bayesian tree induction and present the resulting lazy Bayesian rule learning algorithm, called LBR. For each test example, LBR builds a most appropriate rule with a local naive Bayesian classifier as its consequent. I will show that, across a wide cross-section of natural domains, LBR on average obtains lower error rates significantly more often than the reverse in comparison to each of the following: a naive Bayesian classifier; C4.5; a Bayesian tree learning algorithm; a constructive Bayesian classifier that eliminates attributes and constructs new attributes using Cartesian products of existing nominal attributes; and a lazy decision tree learning algorithm. LBR also outperforms a selective naive Bayesian classifier, although that result is not statistically significant. I will also demonstrate, using experiments with these domains, that the computational requirements of LBR are reasonable.

Furthermore, I analyze the LBR algorithm using the bias and variance decomposition. I will show that LBR significantly reduces the bias of naive Bayesian classification at the cost of a slight increase in variance. Empirical comparison of LBR with boosting decision trees, a technique considered a breakthrough in recent machine learning research, shows that LBR has, on average, significantly lower variance and higher bias. As a result of the interaction of these effects, the average prediction error of LBR over a range of learning tasks is at a level directly comparable to boosting. Empirical comparison of LBR with bagging decision trees shows that LBR has lower average variance and bias, and thus lower average error.

All these results suggest that LBR provides a very competitive learning technique where error minimization is an important criterion.
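
The idea described in the abstract (for each test example, build a rule whose consequent is a local naive Bayesian classifier) can be illustrated with a minimal Python sketch. This is not the speaker's implementation. The abstract does not say how the "most appropriate" rule is chosen, so the greedy leave-one-out comparison below is an assumption, as are the function names (nb_predict, loo_error_rate, lazy_rule_predict), the min_size parameter, and the restriction to nominal attributes stored in dictionaries.

```python
import math
from collections import Counter

def nb_predict(train, x, attrs):
    """Naive Bayes prediction with Laplace smoothing over nominal attributes."""
    classes = Counter(y for _, y in train)
    best, best_score = None, -math.inf
    for c, n_c in sorted(classes.items()):
        score = math.log((n_c + 1) / (len(train) + len(classes)))
        for a in attrs:
            n_values = len({row[a] for row, _ in train})
            n_match = sum(1 for row, y in train if y == c and row[a] == x[a])
            score += math.log((n_match + 1) / (n_c + n_values))
        if score > best_score:
            best, best_score = c, score
    return best

def loo_error_rate(train, attrs):
    """Leave-one-out error rate of naive Bayes on `train`."""
    wrong = 0
    for i, (row, y) in enumerate(train):
        rest = train[:i] + train[i + 1:]
        if nb_predict(rest, row, attrs) != y:
            wrong += 1
    return wrong / len(train)

def lazy_rule_predict(train, x, min_size=3):
    """Grow a rule for the test example x, then classify x with a local
    naive Bayes trained on the examples that satisfy the rule.

    ASSUMPTION: a condition (attribute == x[attribute]) is added greedily
    whenever it lowers the leave-one-out error rate; the selection
    criterion of the actual LBR algorithm is not given in the abstract.
    """
    local, attrs, conditions = list(train), sorted(x), []
    best_err = loo_error_rate(local, attrs)
    while attrs:
        candidates = []
        for a in attrs:
            subset = [(row, y) for row, y in local if row[a] == x[a]]
            if len(subset) >= min_size:
                rest = [b for b in attrs if b != a]
                candidates.append((loo_error_rate(subset, rest), a, subset, rest))
        if not candidates:
            break
        err, a, subset, rest = min(candidates, key=lambda cand: cand[0])
        if err >= best_err:
            break
        # Move the attribute from the local classifier into the rule antecedent.
        local, attrs, best_err = subset, rest, err
        conditions.append((a, x[a]))
    return nb_predict(local, x, attrs), conditions

# Toy data (hypothetical): the class is a XOR b, so the two attributes
# are strongly inter-dependent given the class.
train = [({"a": a, "b": b}, a ^ b) for a in (0, 1) for b in (0, 1) for _ in range(5)]
label, rule = lazy_rule_predict(train, {"a": 0, "b": 1})
```

On this toy XOR data the two attributes are uninformative when treated independently, while a rule that fixes one attribute leaves a local classifier over the other attribute that separates the classes. The sketch only shows the mechanism and says nothing about the error rates reported in the talk.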


Date: Thurs., Feb. 24

Time: 4:15-5:30PM

Place: Ventura 17

