Seminar on Computational Learning and Adaptation
Learning Structure from Sequences, with Applications in a Digital Library
Ian H. Witten
Department of Computer Science
University of Waikato
Hamilton, New Zealand
ihw@cs.waikato.ac.nz
www.cs.waikato.ac.nz/~ihw
The services that digital libraries provide to users can be greatly enhanced by
automatically gleaning certain kinds of information from the full text of the
documents they contain. This talk will review recent work that applies novel
techniques of machine learning (broadly interpreted) to extract information
from plain text. It will cover three areas of research: hierarchical phrase
browsing, including efficient methods for inferring a phrase hierarchy from a
large corpus of text; text mining using adaptive compression techniques, giving
a new approach to word segmentation, generic entity extraction, and acronym
extraction; and keyphrase extraction and its application in a digital library.
Date: Thursday, November 29
Time: 4:15-5:30 PM
Place: Cordura 100