Some of the best statistical learning algorithms are computationally expensive when implemented in the obvious way. Fortunately, simple data structures from computational geometry can often provide huge improvements. For many of the algorithms, the average-case asymptotic performance is provably good for samples drawn from an arbitrary smooth underlying distribution. I'll give several examples of such data structures and indicate how the analysis works.
To introduce the approach I'll describe a linear expected time sorting algorithm called double-bucket sort. In practice, the algorithm is several times faster than quicksort on real data sets.
These data structures work by adapting to the data they represent, and so they are intimately connected with non-parametric density estimation. I'll describe the connection to histogram, volume, kernel, and adaptive kernel estimators, stressing the role of the beta distribution in the analysis.
I'll then describe k-d trees, boxtrees, and balltrees and present efficient algorithms for constructing them and for a variety of queries including finding the k-nearest neighbors of a point.
Finally I'll describe bumptrees which are useful for cheaply evaluating
Gaussian mixtures. I'll then show how to use them in the context of local
weighted linear regression. In learning a robot kinematics task, bumptrees
lead to factor of 50 speedup over a straight radial basis function implementation.
| Date: Thurs., Jan. 28 |
|
Place: Cordura 100
|
Return to the seminar schedule