Abstracts for the IMA "Hot Topics" Workshop

Enhancing the Searching of Mathematics

April 26-27, 2004

**Su-Shing Chen** (Department of Computer & Information Science & Engineering, University of Florida) suchen@cise.ufl.edu

**Indexing Mathematical Abstracts by Metadata and Ontology**

Statement of Interest

Slides: html pdf ps ppt

Based on earlier results, I will describe some ideas for indexing mathematical abstracts or papers by metadata and ontology. Metadata includes existing subject classification schemes and some recent metadata for electronic records. Ontology offers a different approach: abstracts are clustered and presented through an information visualization interface, so that users may select by ontology as well as by metadata.

**Timothy W. Cole** (Library Administration, University of Illinois at Urbana-Champaign) t-cole3@uiuc.edu

**Enriching Metadata for XML Journal Articles through Extraction of MathML and Function Names**

Slides: html pdf ps ppt

Two automated approaches are being investigated. In the first approach we extract all occurrences of MathML contained in the full text of articles in a sample corpus of XML-encoded sci-tech journal literature published by ACM, AIP, and IEEE-CS (the articles include legacy SGML ISO 12083 math fragments previously converted to MathML). We then filter and normalize those MathML fragments recognized as potentially useful for search and discovery, adding the normalized fragments to qualified Dublin Core metadata records describing the articles. The second approach adopts the hierarchical browse vocabulary of the Wolfram Functions Website as a controlled vocabulary for descriptive metadata. Function name strings from this vocabulary that occur in a journal article are added to its metadata record, along with their frequency of occurrence. These approaches are seen as having the potential to enhance discoverability of journal articles and to facilitate linkages between journal literature and reference mathematics literature (e.g., the Wolfram Functions Website).
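The two approaches can be illustrated with a minimal Python sketch. It is not the project's actual pipeline: the sample article, the `VOCABULARY` list, the `tag:text` normalization, and the record field names are all assumptions made for illustration, and a real record would follow a qualified Dublin Core schema.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical sample article; real publisher XML is far richer.
ARTICLE = f"""<article>
  <title>Asymptotics of the Bessel function</title>
  <body>
    <p>The Bessel function is bounded and the Gamma function appears in
    the constant. We compare the Bessel function with
    <math xmlns="{MATHML_NS}"><mi>J</mi><mn>0</mn></math>.</p>
  </body>
</article>"""

# Hypothetical stand-in for a controlled vocabulary of function names.
VOCABULARY = ["Bessel function", "Gamma function", "Airy function"]


def normalize_mathml(math_node):
    """Reduce a MathML fragment to 'tag:text' tokens (assumed normalization)."""
    tokens = []
    for element in math_node.iter():
        text = (element.text or "").strip()
        if text:
            local_name = element.tag.split("}", 1)[-1]  # drop the namespace
            tokens.append(f"{local_name}:{text}")
    return " ".join(tokens)


def build_record(article_xml, vocabulary=VOCABULARY):
    """Return a metadata-like dict enriched with math and function names."""
    root = ET.fromstring(article_xml)
    # Approach 1: collect and normalize every MathML fragment.
    fragments = [normalize_mathml(m) for m in root.iter(f"{{{MATHML_NS}}}math")]
    # Approach 2: count vocabulary strings in the whitespace-normalized text.
    full_text = " ".join("".join(root.itertext()).split()).lower()
    counts = {}
    for name in vocabulary:
        occurrences = full_text.count(name.lower())
        if occurrences:
            counts[name] = occurrences
    return {
        "title": root.findtext("title"),
        "math_fragments": fragments,
        "function_names": counts,
    }


record = build_record(ARTICLE)
```

Here `record["function_names"]` maps each vocabulary term found in the article to its frequency, which mirrors the idea of storing function names together with their counts.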

**James Crowley** (SIAM) crowley@siam.org

**A Publisher's Perspective on Searching and Metadata**

There is a diverse array of solutions that various publishers, especially scientific societies, are pursuing to provide better searching capabilities for the online journal literature. Each of these approaches promises improved capability but comes with costs. These will be discussed from the perspective of a scientific society publisher.

**Matthias Graefenhan** (Department of Mathematics and Computer Science, University of Marburg) Matthias@Graefenhan.de

**A Mathematical Knowledge Base Using Coherent Notation** (poster session)

Slides: html pdf ps ppt

We present an XML-based system of mathematical documents currently being developed at the University of Marburg, Germany, which aims at a comprehensive and systematic description of all aspects of pure mathematics. The system consists of numerous documents, each devoted to one mathematical topic, which are organized in a highly coherent way. This is achieved through the following features:

1. uniform symbolic notation for all mathematical and logical objects, based on specially created symbols

2. treelike arrangement of the single documents (considered as atomic elements) in order to find each document via a unique path; freely definable further arrangements, e.g. cross references or collections of documents for classroom use

3. elaborate network of interconnections between the atomic elements

The structure of the notation mentioned above enables us to perform searching without the need for extra metadata.
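The treelike arrangement in feature 2 can be sketched in Python as a nested mapping in which every document is reached by exactly one path. This is only an illustration: the topic names, the `TREE` layout, and the slash-separated path syntax are hypothetical and are not taken from the Marburg system.

```python
# Hypothetical topic tree; leaf values stand for atomic documents.
TREE = {
    "algebra": {
        "groups": {"definition": "Group", "lagrange": "Lagrange's theorem"},
        "rings": {"definition": "Ring"},
    },
    "topology": {"compactness": "Compact space"},
}


def iter_paths(tree, prefix=()):
    """Yield (path, document) pairs; each document has exactly one path."""
    for name, child in tree.items():
        path = prefix + (name,)
        if isinstance(child, dict):
            yield from iter_paths(child, path)
        else:
            yield "/".join(path), child


def lookup(tree, path):
    """Follow a slash-separated path from the root to a single document."""
    node = tree
    for name in path.split("/"):
        node = node[name]
    return node


# Flat index from unique path to document, built once from the tree.
INDEX = dict(iter_paths(TREE))
```

Further arrangements, such as cross references or classroom collections, could then be expressed as plain lists of these paths without changing the tree itself.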

**Laurent Guillopé** (Cellule MathDoc (CNRS/Université Joseph Fourier, Grenoble) & Université de Nantes) Laurent.Guillope@math.univ-nantes.fr

**Metadata: Exchange and Fusion**

Slides: pdf

The NUMDAM program is a component of the World Digital Mathematics Library (WDML). The metadata description of its content is the basis for efficient navigation on the webbed WDML. Of particular importance are links in both directions: from NUMDAM papers to related documents (reviews, cited articles, ...) and, conversely, from bibliographical databases, digital archives, and preprint databases to NUMDAM papers. Such linking requires freely available metadata; the tools that make it convenient (OAI server, lookup engines, ...) may be reused to merge metadata sets for building partial slices of the WDML. Current Cellule MathDoc projects on such gateways will be discussed.
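The idea of merging metadata sets can be sketched in Python as follows. The sketch assumes records are plain dictionaries sharing an `id` key and carrying a `links` list; these field names and the sample identifiers are hypothetical, and no actual OAI harvesting is performed.

```python
def merge_records(*record_sets):
    """Merge metadata records by identifier, taking the union of links."""
    merged = {}
    for records in record_sets:
        for record in records:
            target = merged.setdefault(
                record["id"], {"id": record["id"], "links": []}
            )
            for key, value in record.items():
                if key == "links":
                    # Keep every distinct link, preserving first-seen order.
                    for link in value:
                        if link not in target["links"]:
                            target["links"].append(link)
                elif key not in target:
                    # The first source to supply a field wins.
                    target[key] = value
    return merged


# Hypothetical records from a digital archive and from a review database.
archive = [
    {"id": "paper-1", "title": "On a theorem", "links": ["fulltext:paper-1"]},
]
reviews = [
    {"id": "paper-1", "links": ["review:r-42"]},
    {"id": "paper-2", "title": "Another paper", "links": []},
]

merged = merge_records(archive, reviews)
```

After the merge, the record for `paper-1` carries both the full-text link and the review link, which is the kind of two-way linking the abstract describes.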

**Nigel Kerr** (JSTOR) nigelk@jstor.org

**Tensions and Questions in JSTOR Data, Math and Otherwise**

Statement of Interest

Paper: html pdf ps doc

JSTOR has a large body of data, in mathematics journals and beyond, that has historically been encoded in LaTeX snippets in an attempt to reproduce information from the print articles accurately. This strategy has its faults, as does some of JSTOR's LaTeX data itself. JSTOR is at a crossroads of data migration and systems rebuilding, and wants to try to Do The Right Thing. This talk is a description of the pressures and challenges we're aware of, and a request for advice and comment about what JSTOR could do for mathematical content.

**Heinz Kröger** (FIZ Karlsruhe - Zentralblatt MATH) heinz@zblmath.fiz-karlsruhe.de

**Searching Mathematics with Zentralblatt MATH - Overview and Outlook**

Slides: html pdf ps ppt

We describe the present structure and services of Zentralblatt MATH. Then we look at how Zentralblatt MATH is embedded in a European environment to serve the mathematics community in the future. New trends in retrieval and data presentation are touched upon.

**Bernard F. Schutz** (Max Planck Institute for Gravitational Physics (Albert Einstein Institute)) Bernard.Schutz@aei.mpg.de

**HERMES: An Effective Converter from TeX into MathML**

Slides: html pdf ps ppt

As part of the European Union-funded research project MOWGLI, the Albert Einstein Institute (AEI) in Germany has developed a new and very effective package to enable authors to write mathematics for the Web. Called HERMES, this package not only generates MathML from TeX but also allows authors to insert metadata into the XML environment of the mathematical expressions, which will allow intelligent searches to be performed on documents. The package is being tested by a major mathematics journal, and it will be used by the AEI's own web journal Living Reviews in Relativity to produce fully indexed mathematical documents.

**Masakazu Suzuki** (Faculty of Mathematics, Kyushu University) suzuki@math.kyushu-u.ac.jp
http://www.math.kyushu-u.ac.jp/~suzuki/

**From Paper to XML in Mathematics**

Slides: pdf

Statement of Interest

There are several levels of digitization of mathematics:

level 1: bitmap images of printed materials (e.g. GIF, TIFF),

level 2: searchable digitized document (e.g. PDF with hidden text),

level 3: structured document with links (e.g. HTML(+MathML), LaTeX),

level 4: (partially) executable document (e.g. Mathematica, Maple),

level 5: formally presented document (e.g. Mizar, OMDoc).

Currently most mathematical knowledge is stored and used mainly in printed materials (level 1) like books or electronic journals.

For active use, it is preferable that mathematical text be stored at as high a level of digitization as possible. However, digitizing documents to a higher level takes considerable effort. The talk gives an overview of the key technologies for moving from level 1 to level 3, their present state, and future problems. The results of our research in this area can be found at the web site http://infty.math.kyushu-u.ac.jp, and some applications can be downloaded from the site. The talk will include a demonstration of our OCR software, which digitizes mathematical papers into XML in our original format, LaTeX source files, and HTML files with mathematical notation in MathML.

**Michael Trott** (Wolfram Research, Inc.) mtrott@wolfram.com

**Mathematical Searching in the Wolfram Functions Site**

Statement of Interest

Mathematica notebook files: IMA_MS_2004.tar

In this talk I will give an overview of the Wolfram Functions site. The website functions.wolfram.com is generated from a set of Mathematica notebooks. Mathematica notebooks are structured ASCII files that can be processed and manipulated by the Mathematica kernel. The notebooks contain about 90,000 mathematical formulas for elementary and special functions in typeset form. Because the formulas are readable and "understandable" by Mathematica, it is possible for the software to completely analyze and classify them with respect to their mathematical structure and occurring functions. A first version of a mathematical search interface to be deployed on the website will be shown and demonstrated.
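Classifying formulas by their occurring functions can be illustrated with a small Python analogue. This is not how the Wolfram Functions site works internally: it assumes formulas written in Python expression syntax, uses the standard `ast` module in place of the Mathematica kernel, and the sample `FORMULAS` are made up for illustration.

```python
import ast
from collections import defaultdict

# Hypothetical formulas written in Python expression syntax.
FORMULAS = [
    "sin(x)**2 + cos(x)**2",
    "exp(log(x))",
    "sin(2*x) - 2*sin(x)*cos(x)",
]


def functions_in(formula):
    """Return the set of function names called anywhere in the formula."""
    tree = ast.parse(formula, mode="eval")
    return {
        node.func.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
    }


def build_index(formulas):
    """Map each function name to the formulas in which it occurs."""
    index = defaultdict(list)
    for formula in formulas:
        for name in sorted(functions_in(formula)):
            index[name].append(formula)
    return dict(index)


INDEX = build_index(FORMULAS)
```

Because the formula is parsed into a tree rather than scanned as text, a query such as "all formulas containing `cos`" is answered from structure, not from string matching.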

**Abdou Youssef** (Department of Computer Science, The George Washington University) ayoussef@gwu.edu

**Advanced Math Search: Issues and Techniques**

Slides: html pdf ps ppt

Worldwide efforts are underway to create digital libraries of mathematical content, such as the Digital Library of Mathematical Functions (DLMF) at the National Institute of Standards and Technology. A fundamental goal of such libraries is to enable users to search not only for text but also for equations. Mature information retrieval (IR) technology is designed primarily for text content. When applied to math search, text IR is inadequate because of its inability to understand mathematical symbols and structures. In this talk, we will identify the issues involved in building an advanced math search system and present techniques for addressing those issues. Some of the techniques are based on current text search technology, while others will be based on emerging XML-based technologies. Some of the math search capabilities that we have already developed for DLMF will be demonstrated in the talk.
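A toy Python example may help show why plain text tokenization falls short for equations. It is a sketch under stated assumptions, not the DLMF search technique: the two tokenizers and their regular expressions are hypothetical, chosen only to contrast a word-oriented view with one that keeps mathematical operators.

```python
import re


def text_tokens(query):
    """Word-style tokenization: keeps only alphanumeric runs."""
    return re.findall(r"[A-Za-z0-9]+", query)


def math_tokens(query):
    """Symbol-aware tokenization: also keeps operators and brackets."""
    return re.findall(r"[A-Za-z]+|\d+|[\^_+\-*/=()]", query)


power = "x^2 + y^2"      # squares of x and y
subscript = "x_2 + y_2"  # second components of x and y

# A text tokenizer cannot tell the two expressions apart.
same_as_text = text_tokens(power) == text_tokens(subscript)

# Keeping the operators preserves the distinction.
same_as_math = math_tokens(power) == math_tokens(subscript)
```

Here `same_as_text` is true while `same_as_math` is false, so an index built on word tokens alone would return subscripted variables for a query about squares.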