The Bias-Variance Tradeoff in Active Learning
Hinrich Schuetze
Enkata Technologies
Training set generation is the main cost in many applications of
statistical classification. Active learning is often used to extract
as much information as possible from each labeling
decision. However, training sets generated by active learning are
biased samples of the underlying population. How can we compute
an unbiased estimate of classifier performance in this scenario? We
propose a solution, evaluate it on a text classification problem, and
discuss the bias-variance tradeoff we face.
Date: Wednesday, August 18
Time: 4:15-5:30 PM
Place: Cordura 100