Seminar on Computational Learning and Adaptation


  A procedure for unsupervised lexicon learning

Anand Venkataraman
Speech Technology and Research Laboratory
SRI International
Menlo Park, CA
anand@speech.sri.com

How do we learn new words by just listening to continuous speech? What kinds of cues might we employ to discover word boundaries? What if we didn't use any cues at all? In particular, how well might we hope to do at the word discovery problem if we just relied on a few "short and sweet" utterances to fuel our lexicon acquisition task? In this talk, we discuss a bare-bones statistical model for segmentation and word discovery in continuous speech. We also look at an incremental unsupervised learning algorithm to infer word boundaries based on this model and the results of some empirical tests to evaluate it.



Date: Thursday, October 4

Time: 4:15-5:30 PM

Place: Cordura 100

