HOME    »    SCIENTIFIC RESOURCES    »    Volumes
Abstracts and Talk Materials
Topological Data Analysis
October 7 - 11, 2013

Henry Adams (Colorado State University-Pueblo)

Evasion Paths in Mobile Sensor Networks

Suppose that ball-shaped sensors wander in a bounded and contractible domain. A sensor doesn't know its location but does know when it overlaps a nearby sensor. We say that an evasion path exists in this sensor network if a moving evader can avoid detection. Vin de Silva and Robert Ghrist give a necessary condition, depending only on the time-varying connectivity graph of the sensors, for an evasion path to exist. Can we sharpen this result? We show that the existence of an evasion path depends not only on the fibrewise homotopy type of the region covered by sensors but also on its embedding in spacetime. Furthermore, we study the entire space of evasion paths using a homotopy spectral sequence for diagrams of spaces. This is joint work with Gunnar Carlsson.

Robert Adler (Technion-Israel Institute of Technology)

TDA and the Euler Characteristic Curve

Keywords of the presentation: Euler characteristic curves, topological inference, Gaussian processes

The theory and practice of topological data analysis (TDA) over the past decade has been dominated by the notion and application of persistent homology. Given a representation of the persistent homology of a filtered structure, it is easy to compute the Euler characteristic curve (ECC) which gives the value of the Euler characteristic of the structure at each point of the filtration.

Although ECCs contains less information than, for example, persistence diagrams, they are typically much easier to compute numerically, using techniques not much different to those used by Euler almost three centuries ago. From a practical point of view, ECCs often seem to contain most of the useful information that is necessary for data analysis and, from a more theoretical point of view, it has turned out to be much easier to study ECCs for random structures than persistent homology or even Betti numbers at some point of a filtration.

In the early 1990's, more or less a decade before the appearance of modern TDA, researchers in medical imaging coined the term "Topological Inference" to describe a collection of techniques that exploited the ease of computation and availability of theoretical results for ECCs that were used to analyse PET, MRI and fMRI data.

In this talk I will describe the relation between persistent homology and ECCs in a number of practical and theoretical scenarios, as well as the somewhat different but often quite parallel approaches of TDA and Topological Inference.

F. Javier Arsuaga (San Francisco State University)

Topological Analysis of Copy Number in Breast Cancer

Keywords of the presentation: Breast Cancer, Computational Homology, Her2

Breast cancer is a complex genetic set of diseases that remains to be understood. It is widely believed that for initiation and progression of each of these diseases a number of genes need to be missregulated. Gene missregulation may occur through gains or losses of the genome, commonly termed DNA copy number abnormalities (CNAs). CNAs are routinely detected in the laboratory through a number of techniques including comparative genomic hybridization (CGH). While these techniques lend themselves to the profiling of the entire genome and therefore the identification of multiple CNAs, statistical methods has been mostly limited to the identification of large, independent CNAs. Smaller, co-ocurring CNAs have however been mostly neglected due to the combinatorial explosion of cases that needs to be considered for their identification. One potential approach to partially overcome such combinatorial explosion is by encoding relationships among co-occurring CNAs as single events. Here we propose an approach that uses computational algebraic topology to encode such relationships and therefore partially overcome the combinatorial explosion associated to the identification of co-occurring CNAs in CGH data.

We illustrate our methodology analyzing a set of Her2 positive (Her2+) breast cancer, an aggressive form of the disease that comprise 25% of all breast cancers. Her2+ tumors is characterized by overexpression of the gene ERBB2, which is commonly regulated by copy number gains at ERBB2. When we apply our method to the analysis of ERRB2 in the data published by Horlings et al 2010, we find the amplification of chromosome 17q which harbors the gene ERBB2 and also co-amplifications of chromosome arms 4q and 9q. To test whether these CNAs have functional implications we perform gene expression analysis on the two independent data sets published using SAM. We find differentially expressed genes in 9q that map to the identified CNAs. Our results suggest that these genes involved in Her2+ tumor development are regulated by small, co-occurring CNAS.

Omer Bobrowski (Duke University)

Phase Transitions in Random Čech Complexes

Keywords of the presentation: homology, random cech complexes

In manifold learning, one wishes to infer geometric and topological features of an unknown manifold, embedded in a d-dimensional Euclidean space, from a finite (random) point cloud. One topological invariant of a considerable interest is the homology of the underlying space.

A common method for recovering the homology of a manifold from a set of random samples, is to cover each point with a d-dimensional ball, and study the union of these balls. By the Nerve Lemma, this method is equivalent to studying the homology of the Čech complex generated from the random point cloud.

In this talk we discuss the limiting behavior of random Čech complexes, as the sample size goes to infinity, and the radius of the balls goes to zero. We show that the limiting behavior exhibits multiple phase transitions at different levels, depending on the rate at which the radius of the balls goes to zero. We present the different regimes and phase transitions discovered so far, and observe the nicely ordered fashion in which homology groups of different dimensions appear and vanish.

One interesting consequence of this analysis, is a simple sufficient condition under which the random Čech complex successfully recovers the homology of the original manifold.

Gunnar Carlsson (Stanford University)

Overview of Topological Data Analysis

We will give a summary of the topological methods for understanding complex data sets. We will discuss the persistent homology theme as well as the theme which deals with geometric representations of data.

Frédéric Chazal (INRIA Saclay - Île-de-France )

Stability and Convergence Rates of Persistence Diagrams in Topological Data Analysis

Keywords of the presentation: persistence diagrams, metric spaces, statistics

In TDA, persistent homology appears as a fundamental tool to infer relevant topological information from data. Persistence diagrams are usually computed from filtrations built on top of data sets sampled from some unknown (metric) space. They provide "topological signatures" revealing the structure of the underlying space. To ensure the relevance of such signatures, it is necessary to prove that they come with stability properties with respect to the way data are sampled. In this talk, we will present a few results on the stability of persistence diagrams built on top of general metric spaces. We will show that the use of persistent homology can be naturally considered in general statistical frameworks and persistence diagrams can be used as statistics with interesting convergence properties.

Harish Chintakunta (North Carolina State University)

Distributed Computation of Homology Using Harmonics

We present a distributed algorithm to compute the first homology of a simplicial complex. We employ spanning trees to compute a basis for algebraic 1-cycles, and then use harmonics to efficiently identify the contractible and homologous cycles. The computational complexity of the algorithm is $O(|P|^omega)$, where $|P|$ is much smaller than the number of edges, and $omega$ is the complexity order of matrix multiplication. For geometric graphs, we show using simulations that $|P|$ is very close to the first Betti number.

Isabel K. Darcy (The University of Iowa)

Free Online Course on Applied Algebraic Apology

This Fall 2013 I am teaching a free online course MATH:7450 (22M:305) Topics in Topology: Scientific and Engineering Applications of Algebraic Topology offered through the Mathematics Department and Division of Continuing Education at University of Iowa.

Goal: To prepare students and other researchers for the IMA Thematic Year on Scientific and Engineering Applications of Algebraic Topology, but all interested participants are welcome

Target Audience: Anyone interested in topological data analysis including graduate students, faculty, industrial researchers in bioinformatics, biology, business, computer science, cosmology, engineering, imaging, mathematics, neurology, physics, statistics, etc.

If you are interested in helping to teach a similar course in the spring, please let me know.

More information about the Fall 2013 course can be found at www.math.uiowa.edu/~idarcy/AppliedTopology.html

Vin de Silva (Pomona College)

Persistence and Category Theory

Keywords of the presentation: persistence, categories

I find it convenient to think about persistence in terms of category theory. I will explain why this is natural, and how it simplifies my thinking. Sometimes it gives access to deeper theorems, and more often it is a tidy language that expresses just what I need. Examples will include: sublevelset and interlevelset homology; merge trees and Reeb graphs; generalized factor persistence; a fractional category for "physically measurable" persistent features; comparison theorems for Cech and Rips complexes.

I thank my collaborators in these projects: Peter Bubenik, Gunnar Carlsson, Fred Chazal, Bill Crawley-Boevey, Marc Glisse, Dmitriy Morozov, Liz Munch, Vidit Nanda, Steve Oudot, Amit Patel, Jonathan Scott.

Alireza Dirafzoon (North Carolina State University)

Topological Mapping of Unknown Environments using Biobotic Swarms

Mapping and exploration are essential tasks for swarm robotic systems. These tasks become extremely challenging when localization information is not available. In this study, we explore how stochastic motion models and weak encounter information can be exploited to learn topological information about an unknown environment. Our system behavior mimics a probabilistic motion model of cockroaches, as it is inspired by current biobotic (cyborg insect) systems. We employ tools from algebraic topology to extract spatial information of the environment based on neighbor to neighbor interactions among the biologically inspired agents with no need for localization data. This information is used to build a map of persistent topological features of the environment. We analyze the performance of our estimation and propose a switching control mechanism for the motion models to extract features of complex environments in an effective way. We also propose improved encounter metric to enhance the performance of the persistence feature extraction.

Saba Emrani (North Carolina State University)

Persistent Homology of Delay Embeddings and its Application in Wheeze Detection

We propose a new approach to detect and quantify the periodic structure of dynamical systems using topological methods. We propose to use delay-coordinate embedding as a tool to measure the periodicity, and to use persistent homology to analyse the structure of point clouds of delay-coordinate embeddings. A method for finding the appropriate value of delay is proposed based on an autocorrelation like (ACL) function of the signals. Density based landmark selection and Euler characteristics are used in order to reduce the computational complexity. We apply the introduced topological approach to breathing sound signals for wheeze detection. Simulations have been carried out to substantiate the capabilities of the proposed method.

Brittany Terese Fasy (Tulane University)
Fabrizio Lecci (Carnegie-Mellon University)

Statistical Inference For Persistent Homology

Persistent homology is a method for probing topological properties of point clouds and functions. The method involves tracking the birth and death of topological features as one varies a tuning parameter. Features with short lifetimes are informally considered to be "topological noise," and those with long barcodes as "topological signal." In this poster, we will present some statistical ideas relating to persistent homology; in particular, defining summary functions and confidence intervals that allow us to separate topological signal from topological noise.

Adam R Ferguson (University of California, San Francisco)

TDA-Based Big-Data Analytics for Translational Neurotrauma

Spinal cord (SCI) and traumatic brain injury (TBI) produce a complex constellation of biological and functional changes that are difficult to interpret, replicate, and translate into novel therapies. In the age of “big-data” novel analytics such as topological data analysis (TDA) can help to rapidly integrate, visualize and interpret these complex datasets for precision medicine. METHODS: We formed a collaboration between an academic medical center (UCSF-BASIC) and a Silicon Valley startup (Ayasdi Inc.) to harness topological data analysis/visualization to assist bench-to-bedside translation for TBI and SCI data. Preclinical models of graded cervical SCI, mild TBI, and combined SCI/TBI in female Long Evans rats (N=208) from UCSF-BASIC were analyzed using topological data analysis (TDA) with the Iris software developed by Ayasdi. TDA produced rapid visual mapping of neurological disease patterns from 236 variables, including injury biomechanics, functional, anatomical and health measures. TDA digested and rendered datasets from sham controlled graded cervical SCI, mild TBI, and a combined injury for mild TBI and 75kdyn IH contusion SCI into geometric visual representations of the neurological syndromic space. A correlation metric with a SVD lens clustered subjects with shared syndromic features into nodes, and connected them via edges into an elastic 2-dimensional network. RESULTS: Separate TDAs for graded cervical SCI and combined SCI/TBI produced two distinct network topologies to describe the syndromic space of each dataset. The topology for graded cervical SCI produced 3 distinct flares representing 3 classes of injury (mild, moderate, and severe). Subjects in the first flare included a gradient of shams surgery, hemisections, 75kdyn and 100kdyn IH injuries, with more histological sparing and better functional performance along this gradient. Flare 2 distinguishes a separate group of 75kdyn IH subjects that had unusual improvement due, it turns out, to their condition in a blinded-randomized preclinical drug trial. Flare 3 distinguishes NYU injuries from the rest of the subjects, in a gradient of severity (6.25-12.5 mm). This gradient also correlated to more tissue pathology and limited functional recovery. The topology for combined SCI/TBI produced 2 distinct, unconnected groups. One group represented the SCI syndromic gradient from SCI alone (moderate injury) to SCI+contralateral TBI (severe). The other cluster represented the TBI syndromic gradient ranging from TBI to sham and SCI+ipsilateral TBI subjects. TDA on the neurotrauma syndromic space unveiled the surprising discovery that SCI+ipsilateral TBI are functionally similar to the sham controls, differentiating this type of combined injury from other forms of neurotrauma. CONCLUSIONS: Application of TDA enables rapid, automated visualization of the full neurological syndromic space, allowing for interpretation of multiple neurological disease studies simultaneously. TDA and similar unsupervised, data-driven analytics can help define the full syndromic space of complex neurotrauma models, such as SCI and TBI. By leveraging all of the data that was collected in the study, TDA-based analysis of neurological syndromic space enabled comparison across diverse datasets, accelerating the discovery of consistent features to improve our understanding of neurological injuries and how they respond to treatment effects. Such approaches lay a foundation to promote more efficient design and testing of current and future preclinical/clinical trials of neurological disorders. FUNDING: NIH grants NS067092, NS069537, NS038079, NS079030; DoD CDMRP W81XWH-10-1-0910, and Craig H. Neilsen Foundation 224308.

Joint work: Jessica L. Nielson1, Tomoo Inoue1, Jesse Paquette2, Amity Lin1, Jefferey Sacramento1, Aiwen Liu1, Cristian F. Guandique1, Karen-Amanda Irvine3, John C. Gensel4, Michael S. Beattie1, Jacqueline C. Bresnahan1, Geoffrey T. Manley1, Gunnar E. Carlsson2,3, Pek Yee Lum2, Adam R. Ferguson1 1University of California, San Francisco, Brain and Spinal Injury Center (UCSF-BASIC), 2Ayasdi Inc., Palo Alto, CA, 3Stanford University, Palo Alto CA, 4Spinal Cord and Brain Injury Research Center, University of Kentucky, Lexington, KY.

Athanasios Gentimis (North Carolina State University)

Modeling Collaborations With Persistent Homology

How do mathematicians interact? How do they collaborate to create papers? In our work we are going to give an approach using ideas from applied topology, specifically Persistent Homology, and try to expand classical data analysis.We create a theoretical metric amongst mathematicians using some common features (field of expertise, affiliation etc) and we use it to create a network of potential collaborations and subsequently a nested sequence of Socio-plexes. The topological holes are identified and connected to information barriers or in other words "obstructions to collaborations".

Leonidas Guibas (Stanford University)

The Space Between the Data

The information contained across many data sets is often highly correlated. Such connections and correlations can arise because the data captured comes from the same or similar objects, or because of particular repetitions, symmetries or other relations and self-relations that the data sources satisfy. This is particularly true for data sets of a geometric character, such as GPS traces, images, videos, 3D scans, 3D models, etc.

We argue that when extracting knowledge from the data in a given data set, we can do significantly better if we exploit the wider context provided by all the relationships between this data set and a "society" or "social network" of other related data sets. We discuss mathematical and algorithmic issues on how to represent and compute relationships or mappings between data sets at multiple levels of detail. We also show how to analyze and leverage networks of maps, small and large, between inter-related data. The network can act as a regularizer, allowing us to to benefit from the "wisdom of the collection" in performing operations on individual data sets or in map inference between them.

This "functorial" view of data puts the spotlight on consistent, shared relations and maps as the key to understanding structure in data. It is a little different from the current dominant paradigm of extracting supervised or unsupervised feature sets, defining distance or similarity metrics, and doing regression or classification -- though sparsity still plays an important role. The inspiration is more from ideas in homological algebra or algebraic topology, exploiting the algebraic structure of data relationships or maps in an effort to disentangle dependencies and assign importance to the vast web of all possible relationships among multiple data sets.

We illustrate these ideas largely using examples from the realm of 3D shapes and images, e.g., for segmentation or for capturing and visualizing differences between 3D data sets -- and more generally for abstraction, analogy, compression, error correction, and summarization.

This is an overview of joint work with multiple collaborators, as discussed in the talk.

Susan Holmes (Stanford University)

Geometric Methods for Heterogeneous Data

Keywords of the presentation: Multiple Table Methods, RV coefficient, Nonparametric multivariate data analysis.

Modern Data is currently available as combinations of heterogeneous data, medical images can be combined with genetic sequences and protein network information, the challenges of this heterogeneity involve choosing methods that can integrate multiple tables of data without loss of information, I will talk mostly about methods that use various metrics and operator approaches.


Annette Marie Honken (The University of Iowa)

Graph Theoretical Analysis of Knot Distances

A knot is the embedding of S^1 in three dimensional space up to ambient isotopy. We define the distance between two knots, K and L, to be the minimum number of crossing changes required to obtain L from a sequence of crossing changes on K. Then, using rational knots as vertices and placing an edge between two vertices if they are of distance one, we have knot distances graphs. Thus far, we have studied knot distance graphs of rational knots with up to ten crossings. We aim to classify these knots as random, small-world, or scale-free networks through taking measurements of degree, degree distribution, distance statistics, clustering coefficient, and scale-freeness. Additionally, we look at what aligning graphs with different maximum crossing number may tell us. Knots model DNA as well as locally knotted protein, and a crossing change models the action of a type II topoisomerase on double stranded circular DNA. Therefore, we will work to find new patterns and properties among knots via crossing changes and hope to extract biological information as well.

Matthew Kahle (The Ohio State University)

The Threshold for Integer Homology

Keywords of the presentation: random topology, homology

This is joint work with Hoffman and Paquette.

We describe the vanishing threshold for integer homology in Bernoulli random d-dimensional simplicial complexes, answering a 2003 question of Linial and Meshulam. Our bound is tight, up to a constant factor.

The argument is fundamentally different, and surprisingly much simpler, than earlier arguments that gave homology-vanishing thresholds for field coefficients.


Claudia Landi (Università di Modena)

On the Stability of Reeb Graphs of Surfaces

Reeb graphs are very popular shape descriptors in computational frameworks. In this context, a shape is modeled as a space endowed with a function. A number of methodologies have been proposed in the literature to estimate the similarity of the shapes by comparing Reeb graphs. In this context, one of the most important open questions is whether Reeb graphs are robust against function perturbations. In fact, it is clear that any data acquisition is subject to perturbations, noise and approximation errors and, if Reeb graphs were not stable, then distinct computational investigations of the same object could produce completely different results. In this poster we present a stable similarity measure for Reeb graphs. More precisely, focusing our attention on surfaces endowed with Morse functions, we define an editing distance between Reeb graphs, in terms of the cost necessary to transform one graph into another. Our main result is that changes in Morse functions imply smaller changes in the editing distance between Reeb graphs.

Marina Meila (University of Washington)

Geometrically preserving manifold learning

In recent years, manifold learning has become increasingly popular as a tool for performing non-linear dimensionality reduction. This has led to the development of numerous algorithms of varying degrees of complexity that aim to recover manifold geometry using either local or global features of the data.

Building on the Laplacian Eigenmap framework, we propose a new paradigm that off ers a guarantee, under reasonable assumptions, that any manifold learning algori thm will preserve the geometry of a data set. Our approach is based on augmentin g the output of embedding algorithms with geometric information embodied in the Riemannian metric of the manifold. The Riemannian metric allows us to compute ge ometric quantities (such as angle, length, or volume) for any coordinate system or embedding of the manifold. This geometric faithfulness, which is not guarante ed for most algorithms, allows us to define geometric measurements that are inde pendent of the algorithm used, and hence move seamlessly from one algorithm to a nother. In this work, we provide an algorithm for estimating the Riemannian metric from data, consider its consistency, and demonstrate the advantages of our approach in a variety of examples.

As an application of this new framework, we develop a new, principled, unsupervised to selecting the scale parameter in manifold learning, based on optimizing the geometric self-consistency w.r.t the scale.

Facundo Mémoli (The Ohio State University)

Curvature Sets Over Persistence Diagrams

Keywords of the presentation: persistence diagrams, concentration of measure, Gromov-Wasserstein distance

A combinatorial idea of Gromov is to assign to each metric space X the collection of all distance matrices corresponding to all possible n-tuples of points in X. Given a filtration the functor F on finite metric spaces we consider the set of all possible F-persistence diagrams generated by metric subsets of X of cardinality n. For a class of filtration functors which we call compatible, the answer is positive, and these admit stability results in the Gromov-Hausdorff sense.

In order to capture frequency or statistics, it is more useful to consider that, in addition to a metric structure, a probability measure has been specified. Then, given n, to an mm-space X one assigns a probability measure Un induced by pushing forward the reference probability measure on X into the space of all F-persistence diagrams on n-point samples of X. The stability of these constructions can now be expressed in Gromov-Wasserstein sense. As a consequence of this, one can now establish the concentration of the measure Un as n goes to infinity.

Yuriy Mileyko (University of Hawaii at Manoa)

Letting Loops Loose

Problems in applied topology often require computation of an optimal (in some sense) representative among shapes satisfying particular topological constraints. In many cases this is a very challenging, if not infeasible task. Interestingly, such complex problem often can be tackled using biologically inspired algorithms. For example, slime mold can be used to construct optimal networks, ants can teach us how to find shortest paths, and simple evolutionary principles can help us find global optima. Frequently, these processes are driven by stochastic dynamics, suggesting that nature has figured out a way to efficiently sample from a distribution concentrated around the optimum. In this talk we address an example of such a phenomenon. We show that a closed polygonal chain (with large number of short links) chosen uniformly in the space of chains representing a fixed free homotopy class in a (multi-)punctured plane is, with overwhelming probability, close to the minimizing loop in this class. As a consequence, sampling from the stationary distribution of a Markov process (using, say, the Metropolis-Hastings algorithm) gives us a simple way to approximate the minimal length representative in a free homotopy class. This result finds applications in multi-agent systems, as it shows that the swarms of agents subject to obvious constraints move towards the optimal configuration without any control.

Konstantin Mischaikow (Rutgers, The State University Of New Jersey )

Quantitative Description of Multiparameter Nonlinear Dynamics

Keywords of the presentation: global dynamics, Conley theory

It is a classical result that Newton's method converges rapidly to a nondegenerate zero if the initial condition is sufficiently close to the zero. In a recent work, Dupont and Scott ask the following question: do iterates converge to the limit in a particular direction? In the case of a planar system, their numerical simulations suggest that this is often but not always the case. Since, understanding finer information about the asymptotic dynamics may have implications for further algorithm refinement, have a more precise quantitative answer to this question, both in a positive and negative sense seems desirable.

The challenge is that a reasonable model for the asymptotic dynamics consists of a multi parameter family of nonlinear circle maps, where the parameter space is a solid 4-dimensional tours. I will describe our efforts to use a combinatorial, algebraic topological description of the global dynamics to provide a description of the possible dynamics over all of parameter space.

This is joint work with J. Bush, W. Cowen, and S. Harker


Dmitriy Morozov (Lawrence Berkeley Laboratory)

Back to Basics: Merge Trees

Keywords of the presentation: merge trees, interleaving distance

This talk revisits merge trees, a basic topological descriptor that records connectivity of sublevel sets of a scalar function. We introduce an interleaving distance between two merge trees and establish its stability to perturbations of the function. We show that this distance is never smaller than the bottleneck distance between 0-dimensional persistence diagrams of the function.

On the computational side, we consider a distributed representation of merge trees that not only improves their parallel computation, but also supports parallel analysis. As an example, we show how to extract a prescribed levelset component of the function with minimum communication.

Marian Mrozek (Jagiellonian University)

Persistent Homology of Self-maps

Keywords of the presentation: persistent homology, homology of maps, homology algorithms, eigenvalues

When a topological space is known only from sampling, persistence provides a useful tool to study its homological properties. In many applications one can sample not only the space, but also a map acting on the space. The understanding of the topological features of the map is often of interest, in particular in the analysis of time series dynamics but also in the dynamics of a map or differential equation given explicitely when the rigorous study is computationally too expensive and only numerical experiments are available. The aim of the talk is to present an extension of persistent homology to the case of a continuous self-map together with the associated algorithm and numerical examples based on the implementation of the algorithm.

(joint with H. Edelsbrunner and G. Jablonski)

Elizabeth Munch (University of Minnesota, Twin Cities)

A Continuous Mean for Finite Sets of Persistence Diagrams

Keywords of the presentation: Persistent homology, statistics, Frechet mean

Recent progress has shown that the abstract space of persistence diagrams is Polish, and found necessary and sufficient conditions for a set to be compact. This also allowed for the definition of Frechet means, a construction which is possible for any metric space. Since the Frechet mean is a set, not an element, there were no guarantees that this set was non-empty, however Mileyko et al showed that for well behaved distributions, the Frechet mean is, in fact, non-empty.

However, there are simple counterexamples showing that the mean is non-unique, which creates issues when we are interested in continuous, time-varying systems. In order to solve this problem, we present a continuous mean which is a generalization of the Frechet mean.

Monica Nicolau (Stanford University)

On the Effect of Geometry on the Topology of Data

Keywords of the presentation: hypothesis testing, disease subtypes, network topology

I will discuss the effect of geometric transformations on the topology of data. Geometric transformations are central in highlighting characteristics in the data that extract information. A common feature is that the same data set can provide answers to multiple problems. Thus the choice of underlying geometry is crucial in highlighting the answers to the correct problem. There will be many examples from systems biology.

Steve Yann Oudot (INRIA Saclay - Île-de-France )

Zigzag Zoology: Rips Zigzags for Homology Inference

For points sampled near a compact set X, the persistence barcode of the Rips filtration built from the sample contains information about the homology of X as long as X satisfies some geometric assumptions. The Rips filtration is prohibitively large, however zigzag persistence can be used to keep the size linear. In this work we introduce several species of Rips-like zigzags and compare them with respect to the signal-to-noise ratio, a measure of how well the underlying homology is represented in the persistence barcode relative to the noise in the barcode at the relevant scales. Some of these Rips-like zigzags have been available as part of the Dionysus library for several years while others are new. Interestingly, we show that some species of Rips zigzags will exhibit less noise than the (non-zigzag) Rips filtration itself. Thus, Rips zigzags can offer improvements in both size complexity and signal-to-noise ratio.

Along the way, we develop new techniques for manipulating and comparing persistence barcodes from zigzag modules. In particular, we give methods for reversing arrows and removing spaces from a zigzag while controlling the changes occurring in its barcode. These techniques were developed to provide our theoretical analysis of the signal-to-noise ratio of Rips-like zigzags, but they are of independent interest as they apply to zigzag modules generally.

Amit Patel (Institute for Advanced Study)

Connecting Persistent Homology Groups

Keywords of the presentation: persistent homology, well groups, 2-category

This talk is about the categorical structure of the persistent homology group. This is part of an upcoming paper with Robert MacPherson.

Given a function f: X -> R, persistence examines the evolution of the homology of the sublevel sets. There is a persistent homology group for each pair of values r M to manifolds. For now, we are calling the generalized persistent homology group the well group. To each open set of M, there is a well group. For a pair of nested open sets V subset U, we construct a morphism from the well group over U to the well group over V. For a triple of nested open sets W subset V subset U, the triangle consisting of the three well groups and three morphisms does not commute. However, it does commute up to a 2-morphism. We will use pictures to illustrate these ideas.

Jose A Perea (Duke University)

Persistent Homology of Time-Delay Embeddings

We present in this talk a theoretical framework for studying the persistent homology of point clouds from time-delay (or sliding window) embeddings. We will show that maximum 1-d persistence yields a suitable measure of periodicity at the signal level, and present theorems which relate the resulting diagrams to the choices of window size, embedding dimension and field of coefficients. If time permits, we will demonstrate how this methodology can be applied to the study of periodicity on time series from gene expression data.

David Sallach (University of Chicago)

Ideal Types and Prototypes: The Topologies of Social Dynamics

The complexity of social processes has made it difficult to formulate social theories that are mathematically effective. Prototypical social patterns occupy high-dimensional spaces that resist simple characterization. However, category theoretic and topological strategies open up the prospect of innovative interaction between mathematics and social theory. Specifically, ideal types provide the potential to define social invariances relative to which prototypical social concepts can be specified. As statistical distributions shift over time, subject to idealized confordances, dynamic social processes can be clarified. Topological data analysis will play a significant role in this research strategy.

Stephen Smale (City University of Hong Kong)

What is Protein Folding? Mathematical Perspective, New Results

A National Research Council report has called attention to mathematicians the problem of protein folding. Here we will give a setting for the problem by putting a geometrical structure on the space of proteins. A significant question in the understanding of the folding process is this. Which pairs of positions in a linear string representing the protein will come into contact in the 3 dimensional structure?

Katharine Felicity Turner (University of Chicago)

Means, Medians and Null Hypothesis Testing

Keywords of the presentation: persistence diagrams, statistics

We can view persistence diagrams as stochastic variables themselves. As such we wish to be able to analyze sets of persistence diagrams statistically. To this end I will define and characterize the central tendencies of the mean and the median. Separately, I will also provide a method of null hypothesis testing.

Rien van de Weijgaert (Rijksuniversiteit te Groningen)

The Geometry and Topology of the Cosmic Web

The Cosmic Web is the fundamental spatial organization of matter on scales of a few up to a hundred Megaparsec, scales at which the Universe still resides in a state of moderate dynamical evolution. galaxies, intergalactic gas and dark matter exist in a wispy weblike spatial arrangement consisting of dense compact clusters, elongated filaments, and sheetlike walls, amidst large near-empty void regions.

While the complex intricate structure of the cosmic web contains a wealth of cosmological information, its quantification has remained a major challenge. In this lecture, we describe recent work towards invoking concepts from computational topology and computational geometry to our understanding of the structure of the Cosmic Web and to new insights into the conditions in the primordial Universe out of which it emerged.

An important aspect of our understanding of the Cosmic Web concerns the connectivity of the various components. This leads us to recent work on the topological analysis of the Megaparsec scale distribution. To this end, we resort to the homology of the weblike structure, and determine the scale-dependent Betti numbers. To infer this from the discrete spatial galaxy distribution (or of particles in computer models of cosmic structure formation) we extract the Betti numbers from alpha shapes. We have studied the alpha complex of the cosmic weblike point patterns, in order to assess the signature of filaments, walls and voids. Of considerable importance within the context of the multiscale nature of the cosmic mass distrbution, is the information contained in persistence diagrams. Amongst others, I will describe recent work on persistence properties of Gaussian and non-Gaussian random density fields, which form the initial conditions out of which the cosmic web emerged through the force of gravity.

Bei Wang (The University of Utah)

Topology-Inspired Adaptive Sampling Algorithms for Probabilistic Risk Assessment of Nuclear Simulations

Nuclear simulations are often computationally expensive, time-consuming, and high-dimensional with respect to the number of input parameters. Thus exploring the space of all possible simulation outcomes is infeasible using finite computing resources. During simulation-based probabilistic risk analysis, it is important to discover the relationship between a potentially large number of input parameters and the output of a simulation using as few simulation trials as possible. This is a typical context for performing adaptive sampling where a few observations are obtained from the simulation, a surrogate model is built to represent the simulation space, and new samples are selected based on the model constructed. The surrogate model is then updated based on the simulation results of the sampled points. In this way, we attempt to gain the most information possible with a small number of carefully selected sampled points, limiting the number of expensive trials needed to understand features of the simulation space.

We analyze the specific use case of identifying the limit surface, i.e., the boundaries in the simulation space between system failure and system success. In this study, we explore several techniques for adaptively sampling the parameter space in order to reconstruct the limit surface. We focus on several adaptive sampling schemes. First, we seek to learn a global model of the entire simulation space using prediction models or neighborhood graphs and extract the limit surface as an iso-surface of the global model. Second, we estimate the limit surface by sampling in the neighborhood of the current estimate based on topological segmentations obtained locally.

Our techniques draw inspirations from topological structure known as the Morse-Smale complex. We highlight the advantages and disadvantages of using a global prediction model versus local topological view of the simulation space, comparing several different strategies for adaptive sampling in both contexts. One of the most interesting models we propose attempt to marry the two by obtaining a coarse global representation using prediction models, and a detailed local representation based on topology. Our methods are validated on several analytical test functions as well as a small nuclear simulation dataset modeled after a simplified Pressurized Water Reactor.

Joint work with: Dan Maljovec, Valerio Pascucci, Peer-Timo Bremer and Diego Mandelli

Frank Weilandt (Jagiellonian University)

An Algorithm for Computing the Conley Index of a Poincare Map

Poincare maps and their Conley indices are useful tools to study continuous dynamical systems. We are developing a method to compute the Conley index of Poincare maps of non-autonomous periodic ODEs. We apply algorithms for rigorous proofs in dynamical systems and cubical homology calculations. The new method requires only integration of the equation over a short time interval.

The main ideas of the method are developed by Marian Mrozek and Roman Srzednicki.

Shmuel Weinberger (University of Chicago)

Complexity of Some Topological Inference Problems

Keywords of the presentation: topologcial inference, complexity, comparison geometry, controlled topology, concentration of measure

I will discuss some stylized topological inference problems and give some information about Kolmogorov, sample and computational complexity. I will then discuss some tentative first steps towards a theory of feasibly computable invariants.

Michael Werman (Hebrew University)

The Euler Characteristic Feature

The Euler characteristic is a basic topological property. We present an object descriptor for supervised classification based on the Euler characteristic of subsets created by thresholding a function defined over the domain at multiple levels. We demonstrate the effectiveness of the descriptor in different domains - images, videos and 3D mesh surfaces. In addition, we propose methods for efficient calculation of the Euler characteristic for multiple threshold values and calculation of the Euler characteristic in a sliding window.

Mathijs Wintraecken (Rijksuniversiteit te Groningen)

Uniqueness of Gauss-Bonnet Integral Invariant

An elementary geometrical proof of the fact that the Euler characteristic is the only topological invariant of a surface that can be found by integration (using Gauss-Bonnet) is given. A similar method is also applied to three dimensional manifolds.

Matthew L. Wright (St. Olaf College)

Hadwiger Integration and Applications

The intrinsic volumes generalize both Euler characteristic and volume, quantifying the “size” of a set in various ways. Lifting the intrinsic volumes from sets to functions over sets, we obtain the Hadwiger Integrals, a family of integrals that generalize both the Euler integral and the Lebesgue integral. The classic Hadwiger Theorem says that the intrinsic volumes form a basis for the space of all valuations on sets. An analogous result holds for valuations on functions: with certain assumptions, any valuation on functions can be expressed in terms of Hadwiger integrals. These integrals provide various notions of the size of a function, which are potentially useful for analyzing data arising from sensor networks, cell dynamics, image processing, and other areas. This poster provides an overview of the intrinsic volumes, Hadwiger integrals, and possible applications.

Connect With Us: