Talk Abstract
Statistical Models of Text: From Bags of Words to Structure

Ralph Weischedel
BBN Technologies


During the last five years, applying statistical language models to computational linguistics has led to new capabilities in processing text. In this paper, we survey these techniques (named entity identification and classification, parsing, and fact extraction), which provide structural and semantic features that can serve as input to text mining algorithms, so that those algorithms need not rely solely on bag-of-words models.


