Spring 2000
IMA Hot Topics Workshop
Text Mining
April 17-18, 2000

Army High Performance Computing Research Center (AHPCRC),
Supercomputing Institute for Digital Simulation and Advanced Computation,
West Group


James Allan
University of Massachusetts

Vipin Kumar

Paul Thompson
West Group

Text mining is a new interdisciplinary field. It is related to data mining, a relatively mature technology that is typically applied to data stored in structured databases. Text mining seeks to apply some of the same types of analysis, such as knowledge discovery and trend analysis, to unstructured textual data. It combines the disciplines of data mining, information extraction, information retrieval, text categorization, probabilistic modeling, linear algebra, machine learning, and computational linguistics to discover structure, patterns, and knowledge in large textual corpora.

Advances in computational resources and new statistical algorithms for text analysis have helped text mining develop as a field. This two-day workshop is intended to bring together leading researchers in this new field, representing its various constituencies, including computer science, mathematics and statistics, information retrieval, and artificial intelligence. There is not yet a consensus within the text mining community as to exactly what text mining is, and one purpose of this workshop will be to help the community come closer to such a consensus. More generally, it will provide an opportunity to share research among the diverse groups represented at the workshop.


All talks are in Lecture Hall EE/CS 3-180 unless otherwise noted.

MONDAY, April 17, 2000
8:00 am Coffee and Registration Reception Room EE/CS 3-176
8:30 am Willard Miller, Fred Dulles, and Vipin Kumar
8:45 am Chris Wolff
West Group
Riding the XML Wave
9:15 am Dharmendra Modha
IBM Almaden

Clustering Hypertext with Applications to Web Searching

9:45 am Daniel Boley
University of Minnesota

Principal Direction Partitioning in Text Data Mining

10:15 am Break Reception Room EE/CS 3-176
10:45 am Michael W. Berry
University of Tennessee

Level Search Filtering for IR Model Reduction

11:15 am Inderjit S. Dhillon
University of Texas, Austin
Matrix Approximations for Large, Sparse Text Data Using Clustering
11:45 am - 12:15 pm George Karypis
University of Minnesota

Concept Indexing: A Fast Dimensionality Reduction Algorithm with Applications to Document Retrieval & Categorization

2:00 pm Marti Hearst
University of California, Berkeley

Untangling Text Data Mining

2:45 pm Michael Steinbach
University of Minnesota
Document Clustering: Is Hierarchical Clustering Really Better?
3:15 pm Break Reception Room EE/CS 3-176
3:45 pm Thomas Hofmann
Brown University
Probabilistic Models for Information Retrieval and Text Mining
4:15 pm David Lewis
AT&T Labs
Online Text Classification with ATTICS
4:45 pm Ralph Weischedel
BBN Technologies

Statistical Models of Text: From Bags of Words to Structure

5:15 pm IMA Tea IMA East, 400 Lind Hall
A variety of appetizers and beverages will be served.
TUESDAY, April 18, 2000
8:15 am Coffee Reception Room EE/CS 3-176
8:30 am Jaime Carbonell
Carnegie Mellon University
Prospecting for Novelty in Text Mining
9:15 am Lucy T. Nowell
Information Visualization: Changing the Balance of Power
9:45 am Breck Baldwin
Baldwin Language Technologies
Coreference Driven Link Analysis Through Visualization
10:15 am Break Reception Room EE/CS 3-176
10:45 am David Jensen
University of Massachusetts, Amherst
Relational Knowledge Discovery: Applications to Text
11:15 - 11:45 am Peter Jackson
West Group
Information Extraction Project
1:30 pm Henry Lieberman
Text Mining in Real Time
2:00 pm Eui-Hong (Sam) Han
University of Minnesota
Centroid-Based Document Classification Algorithms: Analysis & Experimental Results
2:30 pm Panel Discussions  
3:30 pm Coffee Reception Room EE/CS 3-176



Participants as of 4/17/2000
Name | Department | Affiliation
Breck Baldwin | Institute for Research in Cognitive Science | University of Pennsylvania
Arun Batchu | Graduate Programs in Software | University of St. Thomas
Michael Berry | Computer Science | University of Tennessee
Daniel Boley | Comp. Sci. & Eng. | University of Minnesota
Kelsey Bruso | Computer Science | University of Minnesota
Jaime Carbonell | Computer Science, Language Technologies Inst. | Carnegie Mellon University
Jack Conrad | Computer Science Research | West Group
Inderjit Dhillon | Computer Science | University of Texas at Austin
Sucharita Gopal | Geography & Cen. for Remote Sensing | Boston University
Eui-Hong Han | Computer Science & Engineering | University of Minnesota
Marti Hearst | School of Information Management & Systems | University of California - Berkeley
Thomas Hofmann | Computer Science | Brown University
Peter Jackson | | West Group
Ravi Janardan | Computer Science and Engineering | University of Minnesota
David Jensen | Computer Science | University of Massachusetts
Moon-Gu Jeon | Computer Science | University of Minnesota
Yunjae Jung | Computer Science | University of Minnesota
George Karypis | CS & E | University of Minnesota
Krishna Kataria | ISG Content Services | Unisys
Vipin Kumar | Computer Science/Engineering | University of Minnesota
Paul Lareau | Information Management | 3M Corporation
David D. Lewis | | AT&T Labs - Research
Henry Lieberman | Media Laboratory | MIT
Jayanth Majhi | Computer Science | Synopsys
Karen Michael | Software Engineering | University of St. Thomas
Dharmendra Modha | | IBM Almaden Research Center
Isabelle Moulinier | Computer Science Research | West Group
Lucy T. Nowell | SAVI Group-Synthesis, Analysis & Visualization | Battelle/Pacific Northwest National Laboratory
Haesun Park | Computer Science & Engr. | University of Minnesota
Chang Peng | Computer Science and Engineering | University of Minnesota
William Pottenger | National Center for Supercomputing Applications | University of Illinois at Urbana-Champaign
Kashif Riaz | Capture and C | West Group
Quentin Ritchi | Worldwide Transportation | Unisys
Michael Steinbach | Computer Science & Engineering | University of Minnesota
Rick Steinheiser | | Advanced Analytic Tools
Paul Thompson | | West Group
Mark Wasson | | LEXIS-NEXIS
Ralph Weischedel | | BBN Technologies
Chris Wolff | VP of Publishing Technology | West Group
Shakila Xavier | Computer Science Research | West Group

