Stochastic Spatio-Temporal Grammars for Images and Video
Jeffrey Mark Siskind
Electrical and Computer Engineering
Purdue University
Probabilistic context-free grammars induce distributions over strings, which
can be viewed as observations that are maps from indices to terminals. The
domains of such maps are totally ordered and the terminals are discrete. We
extend this framework to induce densities over observations with unordered
domains and continuous-valued terminals. We call our extension spatial
random tree grammars (SRTGs). Although this formalism is context sensitive,
the inside-outside algorithm can be extended to support exact likelihood
calculation, MAP estimates, and ML estimation updates in polynomial time.
We call this extension the center-surround algorithm. SRTGs extend mixture
models by adding hierarchical structure that can vary across observations.
The center-surround algorithm can recover the structure of observations,
learn structure from observations, and classify observations based on their
structure. We have used SRTGs and the center-surround algorithm to process
both static images and dynamic video. In static images, SRTGs have been
trained to distinguish houses from cars. In dynamic video, they have
been trained to distinguish events such as entering, exiting, picking
up, putting down, sitting down, and standing up. We demonstrate how
the structural priors provided by SRTGs support these tasks.
This talk describes joint work with Charles Bouman, Shawn Brownfield,
Bingrui Foo, Mary Harper, Ilya Pollak, and James Sherman.
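
As background for the abstract's opening sentence, the following is a minimal
sketch of the standard inside pass for an ordinary PCFG in Chomsky normal
form, which computes the probability a grammar assigns to a string. It is not
the center-surround algorithm and is not taken from the talk. The toy grammar
(S -> S S with probability 0.4, S -> a with probability 0.6) and the names
inside_probability, BINARY, and LEXICAL are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (illustrative only).
BINARY = {("S", "S", "S"): 0.4}   # (parent, left child, right child) -> prob
LEXICAL = {("S", "a"): 0.6}       # (parent, terminal) -> prob


def inside_probability(words, binary, lexical, start="S"):
    """Return P(words) under a CNF PCFG, summed over all parse trees."""
    n = len(words)
    if n == 0:
        return 0.0
    # inside[(i, j)][A] = probability that nonterminal A derives words[i:j]
    inside = defaultdict(lambda: defaultdict(float))
    # Base case: spans of length one come from lexical rules.
    for i, w in enumerate(words):
        for (a, term), p in lexical.items():
            if term == w:
                inside[(i, i + 1)][a] += p
    # Recursive case: combine two adjacent sub-spans with a binary rule.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (a, b, c), p in binary.items():
                    left = inside[(i, k)].get(b, 0.0)
                    right = inside[(k, j)].get(c, 0.0)
                    if left and right:
                        inside[(i, j)][a] += p * left * right
    return inside[(0, n)].get(start, 0.0)


if __name__ == "__main__":
    print(inside_probability(["a", "a", "a"], BINARY, LEXICAL))
```

The chart is indexed by contiguous spans, which relies on the total order of
string positions; the abstract describes SRTGs as dropping exactly that
ordering assumption.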
Date: Wednesday, April 13, 2005
Time: 4:15-5:30PM
Place: Gates 104