Spectral Learning: Extending Spectral Clustering to Classification
Dan Klein
Stanford University
Spectral clustering methods use the eigenvectors of similarity matrices to
detect cluster structure. These methods have traditionally been applied to
fully unsupervised pattern detection problems, but we extend the general
approach to incorporate supervisory information. In this talk, I'll first
give an overview of how basic spectral clustering methods work. I'll also
discuss the relationship between spectral clustering and better-known
eigenvector-based methods, such as latent semantic analysis (LSA) and
principal component analysis (PCA). I'll then describe a simple,
easy-to-implement spectral algorithm which, in the absence of supervisory
information, reduces to spectral clustering. When supervisory information
is available, however, we incorporate it by modifying the input similarity
matrix before clustering. Then, the same kind of representational
transformation used in spectral clustering can be used for
classification. This approach performs comparably to other spectral
clustering algorithms in unsupervised cases, and, in a partially supervised
text categorization setting, has achieved high accuracy on the
categorization of thousands of documents given only a few dozen labeled
training examples.
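
As a rough illustration of the idea described above, the following Python sketch modifies a similarity matrix using a few labels, embeds the points with the leading eigenvectors, and classifies unlabeled points in that spectral space. The abstract does not say how the matrix is modified or which classifier is used, so the specific choices here are assumptions: similarity is set to 1 for same-label pairs and 0 for different-label pairs, a symmetric degree normalization is applied, and each unlabeled point takes the label of its nearest labeled neighbor. The function names (spectral_embed, inject_labels, classify_spectral) are hypothetical and not taken from the talk.

```python
import numpy as np


def spectral_embed(affinity, k):
    """Embed points as unit-length rows of the top-k eigenvectors.

    Assumption: symmetric normalization D^{-1/2} A D^{-1/2}; the talk
    may use a different normalization.
    """
    degree = affinity.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(degree)
    normalized = affinity * inv_sqrt[:, None] * inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order, so take the last k columns.
    _, eigvecs = np.linalg.eigh(normalized)
    top = eigvecs[:, -k:]
    norms = np.linalg.norm(top, axis=1, keepdims=True)
    return top / np.maximum(norms, 1e-12)


def inject_labels(affinity, labels):
    """Return a copy of the affinity matrix adjusted with label information.

    labels uses -1 for unlabeled points. Assumption: labeled pairs get
    similarity 1 if their labels match and 0 otherwise.
    """
    adjusted = affinity.copy()
    idx = np.flatnonzero(labels >= 0)
    same = labels[idx][:, None] == labels[idx][None, :]
    adjusted[np.ix_(idx, idx)] = same.astype(float)
    return adjusted


def classify_spectral(affinity, labels, k):
    """Label each unlabeled point by its nearest labeled neighbor
    in the spectral embedding (an assumed, simple classifier)."""
    labels = np.asarray(labels)
    embedding = spectral_embed(inject_labels(affinity, labels), k)
    labeled = np.flatnonzero(labels >= 0)
    predicted = labels.copy()
    for i in np.flatnonzero(labels < 0):
        dists = np.linalg.norm(embedding[labeled] - embedding[i], axis=1)
        predicted[i] = labels[labeled[np.argmin(dists)]]
    return predicted


# Toy usage: two well-separated groups, one labeled point in each.
base = np.array([[0, 0], [0.5, 0], [0, 0.5], [0.5, 0.5], [0.25, 0.25]], dtype=float)
points = np.vstack([base, base + 10.0])
sq_dists = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
affinity = np.exp(-sq_dists / 2.0)  # Gaussian kernel with unit bandwidth
labels = np.full(len(points), -1)
labels[0], labels[5] = 0, 1
print(classify_spectral(affinity, labels, k=2))
```

With no labeled points, inject_labels leaves the matrix unchanged, so the embedding is the one an unsupervised spectral clustering step would use; that mirrors the reduction to spectral clustering mentioned in the abstract.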
Date: Thursday, November 6
Time: 4:15-5:30 PM
Place: Cordura 100