October 7-11, 2013
Suppose that ball-shaped sensors wander in a bounded and contractible domain. A sensor doesn't know its location but does know when it overlaps a nearby sensor. We say that an evasion path exists in this sensor network if a moving evader can avoid detection. Vin de Silva and Robert Ghrist give a necessary condition, depending only on the time-varying connectivity graph of the sensors, for an evasion path to exist. Can we sharpen this result? We show that the existence of an evasion path depends not only on the fibrewise homotopy type of the region covered by sensors but also on its embedding in spacetime. Furthermore, we study the entire space of evasion paths using a homotopy spectral sequence for diagrams of spaces. This is joint work with Gunnar Carlsson.
Keywords of the presentation: Euler characteristic curves, topological inference, Gaussian processes
The theory and practice of topological data analysis (TDA) over the past decade has been dominated by the notion and application of persistent homology. Given a representation of the persistent homology of a filtered structure, it is easy to compute the Euler characteristic curve (ECC) which gives the value of the Euler characteristic of the structure at each point of the filtration.
Although ECCs contains less information than, for example, persistence diagrams, they are typically much easier to compute numerically, using techniques not much different to those used by Euler almost three centuries ago. From a practical point of view, ECCs often seem to contain most of the useful information that is necessary for data analysis and, from a more theoretical point of view, it has turned out to be much easier to study ECCs for random structures than persistent homology or even Betti numbers at some point of a filtration.
In the early 1990's, more or less a decade before the appearance of modern TDA, researchers in medical imaging coined the term "Topological Inference" to describe a collection of techniques that exploited the ease of computation and availability of theoretical results for ECCs that were used to analyse PET, MRI and fMRI data.
In this talk I will describe the relation between persistent homology and ECCs in a number of practical and theoretical scenarios, as well as the somewhat different but often quite parallel approaches of TDA and Topological Inference.
Keywords of the presentation: Breast Cancer, Computational Homology, Her2
Breast cancer is a complex genetic set of diseases that remains to be understood. It is widely believed that for initiation and progression of each of these diseases a number of genes need to be missregulated. Gene missregulation may occur through gains or losses of the genome, commonly termed DNA copy number abnormalities (CNAs). CNAs are routinely detected in the laboratory through a number of techniques including comparative genomic hybridization (CGH). While these techniques lend themselves to the profiling of the entire genome and therefore the identification of multiple CNAs, statistical methods has been mostly limited to the identification of large, independent CNAs. Smaller, co-ocurring CNAs have however been mostly neglected due to the combinatorial explosion of cases that needs to be considered for their identification. One potential approach to partially overcome such combinatorial explosion is by encoding relationships among co-occurring CNAs as single events. Here we propose an approach that uses computational algebraic topology to encode such relationships and therefore partially overcome the combinatorial explosion associated to the identification of co-occurring CNAs in CGH data.
We illustrate our methodology analyzing a set of Her2 positive (Her2+) breast cancer, an aggressive form of the disease that comprise 25% of all breast cancers. Her2+ tumors is characterized by overexpression of the gene ERBB2, which is commonly regulated by copy number gains at ERBB2. When we apply our method to the analysis of ERRB2 in the data published by Horlings et al 2010, we find the amplification of chromosome 17q which harbors the gene ERBB2 and also co-amplifications of chromosome arms 4q and 9q. To test whether these CNAs have functional implications we perform gene expression analysis on the two independent data sets published using SAM. We find differentially expressed genes in 9q that map to the identified CNAs. Our results suggest that these genes involved in Her2+ tumor development are regulated by small, co-occurring CNAS.
Keywords of the presentation: homology, random cech complexes
In manifold learning, one wishes to infer geometric and topological features of an unknown manifold, embedded in a d-dimensional Euclidean space, from a finite (random) point cloud. One topological invariant of a considerable interest is the homology of the underlying space.
A common method for recovering the homology of a manifold from a set of random samples, is to cover each point with a d-dimensional ball, and study the union of these balls. By the Nerve Lemma, this method is equivalent to studying the homology of the Čech complex generated from the random point cloud.
In this talk we discuss the limiting behavior of random Čech complexes, as the sample size goes to infinity, and the radius of the balls goes to zero. We show that the limiting behavior exhibits multiple phase transitions at different levels, depending on the rate at which the radius of the balls goes to zero. We present the different regimes and phase transitions discovered so far, and observe the nicely ordered fashion in which homology groups of different dimensions appear and vanish.
One interesting consequence of this analysis, is a simple sufficient condition under which the random Čech complex successfully recovers the homology of the original manifold.
We will give a summary of the topological methods for understanding complex data sets. We will discuss the persistent homology theme as well as the theme which deals with geometric representations of data.
Keywords of the presentation: persistence diagrams, metric spaces, statistics
In TDA, persistent homology appears as a fundamental tool to infer relevant topological information from data. Persistence diagrams are usually computed from filtrations built on top of data sets sampled from some unknown (metric) space. They provide "topological signatures" revealing the structure of the underlying space. To ensure the relevance of such signatures, it is necessary to prove that they come with stability properties with respect to the way data are sampled.
In this talk, we will present a few results on the stability of persistence diagrams built on top of general metric spaces. We will show that the use of
persistent homology can be naturally considered in general statistical frameworks and persistence diagrams can be used as statistics with interesting convergence properties.
We present a distributed algorithm to compute the first homology of a simplicial complex. We employ spanning trees to compute a basis for algebraic 1-cycles, and then use harmonics to efficiently identify the contractible and homologous cycles. The computational complexity of the algorithm is $O(|P|^omega)$, where $|P|$ is much smaller than the number of edges, and $omega$ is the complexity order of matrix multiplication. For geometric graphs, we show using simulations that $|P|$ is very close to the first Betti number.
This Fall 2013 I am teaching a free online course MATH:7450 (22M:305) Topics in Topology: Scientific and Engineering Applications of Algebraic Topology offered through the Mathematics Department and Division of Continuing Education at University of Iowa.
Goal: To prepare students and other researchers for the IMA Thematic Year on Scientific and Engineering Applications of Algebraic Topology, but all interested participants are welcome
Target Audience: Anyone interested in topological data analysis including graduate students, faculty, industrial researchers in bioinformatics, biology, business, computer science, cosmology, engineering, imaging, mathematics, neurology, physics, statistics, etc.
If you are interested in helping to teach a similar course in the spring, please let me know.
More information about the Fall 2013 course can be found at www.math.uiowa.edu/~idarcy/AppliedTopology.html
Keywords of the presentation: persistence, categories
I find it convenient to think about persistence in terms of category theory. I will explain why this is natural, and how it simplifies my thinking. Sometimes it gives access to deeper theorems, and more often it is a tidy language that expresses just what I need. Examples will include: sublevelset and interlevelset homology; merge trees and Reeb graphs; generalized factor persistence; a fractional category for "physically measurable" persistent features; comparison theorems for Cech and Rips complexes.
I thank my collaborators in these projects: Peter Bubenik, Gunnar Carlsson, Fred Chazal, Bill Crawley-Boevey, Marc Glisse, Dmitriy Morozov, Liz Munch, Vidit Nanda, Steve Oudot, Amit Patel, Jonathan Scott.
Mapping and exploration are essential tasks for swarm robotic systems. These tasks become extremely challenging when localization information is not available. In this study, we explore how stochastic motion models and weak encounter
information can be exploited to learn topological information about an unknown environment. Our system behavior mimics a probabilistic motion model of cockroaches, as it is inspired by current biobotic (cyborg insect) systems. We employ tools from algebraic topology to extract spatial information of the environment based on neighbor to neighbor interactions among the biologically inspired agents with no need for localization data. This information is used to build a map of persistent topological features of the environment. We analyze the performance of our estimation and propose a switching control mechanism for the
motion models to extract features of complex environments in an effective way. We also propose improved encounter metric to enhance the performance of the persistence feature extraction.
We propose a new approach to detect and quantify the periodic structure of dynamical systems using topological methods. We propose to use delay-coordinate embedding as a tool to measure the periodicity, and to use persistent homology to analyse the structure of point clouds of delay-coordinate embeddings. A method for finding the appropriate value of delay is proposed based on an autocorrelation like (ACL) function of the signals. Density based landmark selection and Euler characteristics are used in order to reduce the computational complexity. We apply the introduced topological approach to breathing sound signals for wheeze detection. Simulations have been carried out to substantiate the capabilities of the proposed method.
Persistent homology is a method for probing topological properties of point clouds and functions. The method involves tracking the birth and death of topological features as one varies a tuning parameter. Features with short lifetimes are informally considered to be "topological noise," and those with long barcodes as "topological signal." In this poster, we will present some statistical ideas relating to persistent homology; in particular, defining summary functions and confidence intervals that allow us to separate topological signal from topological noise.
Spinal cord (SCI) and traumatic brain injury (TBI) produce a complex constellation of biological and functional changes that are difficult to interpret, replicate, and translate into novel therapies. In the age of “big-data” novel analytics such as topological data analysis (TDA) can help to rapidly integrate, visualize and interpret these complex datasets for precision medicine. METHODS: We formed a collaboration between an academic medical center (UCSF-BASIC) and a Silicon Valley startup (Ayasdi Inc.) to harness topological data analysis/visualization to assist bench-to-bedside translation for TBI and SCI data. Preclinical models of graded cervical SCI, mild TBI, and combined SCI/TBI in female Long Evans rats (N=208) from UCSF-BASIC were analyzed using topological data analysis (TDA) with the Iris software developed by Ayasdi. TDA produced rapid visual mapping of neurological disease patterns from 236 variables, including injury biomechanics, functional, anatomical and health measures. TDA digested and rendered datasets from sham controlled graded cervical SCI, mild TBI, and a combined injury for mild TBI and 75kdyn IH contusion SCI into geometric visual representations of the neurological syndromic space. A correlation metric with a SVD lens clustered subjects with shared syndromic features into nodes, and connected them via edges into an elastic 2-dimensional network. RESULTS: Separate TDAs for graded cervical SCI and combined SCI/TBI produced two distinct network topologies to describe the syndromic space of each dataset. The topology for graded cervical SCI produced 3 distinct flares representing 3 classes of injury (mild, moderate, and severe). Subjects in the first flare included a gradient of shams surgery, hemisections, 75kdyn and 100kdyn IH injuries, with more histological sparing and better functional performance along this gradient. Flare 2 distinguishes a separate group of 75kdyn IH subjects that had unusual improvement due, it turns out, to their condition in a blinded-randomized preclinical drug trial. Flare 3 distinguishes NYU injuries from the rest of the subjects, in a gradient of severity (6.25-12.5 mm). This gradient also correlated to more tissue pathology and limited functional recovery. The topology for combined SCI/TBI produced 2 distinct, unconnected groups. One group represented the SCI syndromic gradient from SCI alone (moderate injury) to SCI+contralateral TBI (severe). The other cluster represented the TBI syndromic gradient ranging from TBI to sham and SCI+ipsilateral TBI subjects. TDA on the neurotrauma syndromic space unveiled the surprising discovery that SCI+ipsilateral TBI are functionally similar to the sham controls, differentiating this type of combined injury from other forms of neurotrauma. CONCLUSIONS: Application of TDA enables rapid, automated visualization of the full neurological syndromic space, allowing for interpretation of multiple neurological disease studies simultaneously. TDA and similar unsupervised, data-driven analytics can help define the full syndromic space of complex neurotrauma models, such as SCI and TBI. By leveraging all of the data that was collected in the study, TDA-based analysis of neurological syndromic space enabled comparison across diverse datasets, accelerating the discovery of consistent features to improve our understanding of neurological injuries and how they respond to treatment effects. Such approaches lay a foundation to promote more efficient design and testing of current and future preclinical/clinical trials of neurological disorders. FUNDING: NIH grants NS067092, NS069537, NS038079, NS079030; DoD CDMRP W81XWH-10-1-0910, and Craig H. Neilsen Foundation 224308.
Jessica L. Nielson1, Tomoo Inoue1, Jesse Paquette2, Amity Lin1, Jefferey Sacramento1, Aiwen Liu1, Cristian F. Guandique1, Karen-Amanda Irvine3, John C. Gensel4, Michael S. Beattie1, Jacqueline C. Bresnahan1, Geoffrey T. Manley1, Gunnar E. Carlsson2,3, Pek Yee Lum2, Adam R. Ferguson1
1University of California, San Francisco, Brain and Spinal Injury Center (UCSF-BASIC), 2Ayasdi Inc., Palo Alto, CA, 3Stanford University, Palo Alto CA, 4Spinal Cord and Brain Injury Research Center, University of Kentucky, Lexington, KY.
How do mathematicians interact? How do they collaborate to create papers? In our work we are going to give an approach using ideas from applied topology, specifically Persistent Homology, and try to expand classical data analysis.We create a theoretical metric amongst mathematicians using some common features (field of expertise, affiliation etc) and we use it to create a network of potential collaborations and subsequently a nested sequence of Socio-plexes. The topological holes are identified and connected to information barriers or in other words "obstructions to collaborations".
The information contained across many data sets is often highly correlated. Such connections and correlations can arise because the data captured comes from the same or similar objects, or because of particular repetitions, symmetries or other relations and self-relations that the data sources satisfy. This is particularly true for data sets of a geometric character, such as GPS traces, images, videos, 3D scans, 3D models, etc.
We argue that when extracting knowledge from the data in a given data set, we can do significantly better if we exploit the wider context provided by all the relationships between this data set and a "society" or "social network" of other related data sets. We discuss mathematical and algorithmic issues on how to represent and compute relationships or mappings between data sets at multiple levels of detail. We also show how to analyze and leverage networks of maps, small and large, between inter-related data. The network can act as a regularizer, allowing us to to benefit from the "wisdom of the collection" in performing operations on individual data sets or in map inference between them.
This "functorial" view of data puts the spotlight on consistent, shared relations and maps as the key to understanding structure in data. It is a little different from the current dominant paradigm of extracting supervised or unsupervised feature sets, defining distance or similarity metrics, and doing regression or classification -- though sparsity still plays an important role. The inspiration is more from ideas in homological algebra or algebraic topology, exploiting the algebraic structure of data relationships or maps in an effort to disentangle dependencies and assign importance to the vast web of all possible relationships among multiple data sets.
We illustrate these ideas largely using examples from the realm of 3D shapes and images, e.g., for segmentation or for capturing and visualizing differences between 3D data sets -- and more generally for abstraction, analogy, compression, error correction, and summarization.
This is an overview of joint work with multiple collaborators, as discussed in the talk.
Keywords of the presentation: Multiple Table Methods, RV coefficient, Nonparametric multivariate data analysis.
Modern Data is currently available as combinations of heterogeneous data, medical images can be combined with genetic sequences and protein network information, the challenges of this heterogeneity involve choosing methods that can integrate multiple tables of data without loss of information, I will talk mostly about methods that use various metrics and operator approaches.Read More...
A knot is the embedding of S^1 in three dimensional space up to ambient isotopy. We define the distance between two knots, K and L, to be the minimum number of crossing changes required to obtain L from a sequence of crossing changes on K. Then, using rational knots as vertices and placing an edge between two vertices if they are of distance one, we have knot distances graphs. Thus far, we have studied knot distance graphs of rational knots with up to ten crossings. We aim to classify these knots as random, small-world, or scale-free networks through taking measurements of degree, degree distribution, distance statistics, clustering coefficient, and scale-freeness. Additionally, we look at what aligning graphs with different maximum crossing number may tell us. Knots model DNA as well as locally knotted protein, and a crossing change models the action of a type II topoisomerase on double stranded circular DNA. Therefore, we will work to find new patterns and properties among knots via crossing changes and hope to extract biological information as well.
Keywords of the presentation: random topology, homology
This is joint work with Hoffman and Paquette.Read More...
We describe the vanishing threshold for integer homology in Bernoulli random d-dimensional simplicial complexes, answering a 2003 question of Linial and Meshulam. Our bound is tight, up to a constant factor.
The argument is fundamentally different, and surprisingly much simpler, than earlier arguments that gave homology-vanishing thresholds for field coefficients.
Reeb graphs are very popular shape descriptors in computational frameworks. In this context, a shape is modeled as a space endowed with a function. A number of methodologies have been proposed in the literature to estimate the similarity of the shapes by comparing Reeb graphs. In this context, one of the most important open questions is whether Reeb graphs are robust against function perturbations. In fact, it is clear that any data acquisition is subject to perturbations, noise and approximation errors and, if Reeb graphs were not stable, then distinct computational investigations of the same object could produce completely different results. In this poster we present a stable similarity measure for Reeb graphs. More precisely, focusing our attention on surfaces endowed with Morse functions, we define an editing distance between Reeb graphs, in terms of the cost necessary to transform one graph into another. Our main result is that changes in Morse functions imply smaller changes in the editing distance between Reeb graphs.
In recent years, manifold learning has become increasingly popular as a tool for performing non-linear dimensionality reduction. This has led to the development of numerous algorithms of varying degrees of complexity that aim to recover manifold geometry using either local or global features of the data.
Building on the Laplacian Eigenmap framework, we propose a new paradigm that off
ers a guarantee, under reasonable assumptions, that any manifold learning algori
thm will preserve the geometry of a data set. Our approach is based on augmentin
g the output of embedding algorithms with geometric information embodied in the
Riemannian metric of the manifold. The Riemannian metric allows us to compute ge
ometric quantities (such as angle, length, or volume) for any coordinate system
or embedding of the manifold. This geometric faithfulness, which is not guarante
ed for most algorithms, allows us to define geometric measurements that are inde
pendent of the algorithm used, and hence move seamlessly from one algorithm to a
nother. In this work, we provide an algorithm for estimating the Riemannian metric from data, consider its consistency, and demonstrate the advantages of our approach in a variety of examples.
As an application of this new framework, we develop a new, principled,
unsupervised to selecting the scale parameter in manifold learning,
based on optimizing the geometric self-consistency w.r.t the scale.
Keywords of the presentation: persistence diagrams, concentration of measure, Gromov-Wasserstein distance
A combinatorial idea of Gromov is to assign to each metric space X the
collection of all distance matrices corresponding to all possible n-tuples of
points in X. Given a filtration the functor F on finite
metric spaces we consider the set of all
possible F-persistence diagrams generated by metric
subsets of X of cardinality n. For a class of filtration functors
which we call compatible, the answer is positive, and these admit stability results in the Gromov-Hausdorff sense.
In order to capture frequency or statistics, it is more useful to
consider that, in addition to a metric structure, a probability
measure has been specified. Then, given n, to an mm-space X one assigns a probability measure Un induced by pushing forward the reference probability measure on X into the space of all F-persistence diagrams on n-point samples of X. The stability of these constructions can now be expressed in Gromov-Wasserstein sense. As a consequence of this, one can now establish the concentration of the measure Un as n goes to infinity.
Problems in applied topology often require computation of an optimal (in some sense) representative among shapes satisfying particular topological constraints. In many cases this is a very challenging, if not infeasible task. Interestingly, such complex problem often can be tackled using biologically inspired algorithms. For example, slime mold can be used to construct optimal networks, ants can teach us how to find shortest paths, and simple evolutionary principles can help us find global optima. Frequently, these processes are driven by stochastic dynamics, suggesting that nature has figured out a way to efficiently sample from a distribution concentrated around the optimum. In this talk we address an example of such a phenomenon. We show that a closed polygonal chain (with large number of short links) chosen uniformly in the space of chains representing a fixed free homotopy class in a (multi-)punctured plane is, with overwhelming probability, close to the minimizing loop in this class. As a consequence, sampling from the stationary distribution of a Markov process (using, say, the Metropolis-Hastings algorithm) gives us a simple way to approximate the minimal length representative in a free homotopy class. This result finds applications in multi-agent systems, as it shows that the swarms of agents subject to obvious constraints move towards the optimal configuration without any control.
Keywords of the presentation: global dynamics, Conley theory
It is a classical result that Newton's method converges rapidly to a nondegenerate zero if the
initial condition is sufficiently close to the zero. In a recent work, Dupont and Scott ask the following question: do iterates converge to the limit in a particular direction? In the case of a planar system, their numerical simulations suggest that this is often but not always the case. Since, understanding finer information about the asymptotic dynamics may have implications for further algorithm refinement, have a more precise quantitative answer to this question, both in a positive and negative sense seems desirable. Read More...
The challenge is that a reasonable model for the asymptotic dynamics consists of a multi parameter family of nonlinear circle maps, where the parameter space is a solid 4-dimensional tours. I will describe our efforts to use a combinatorial, algebraic topological description of the global dynamics to provide a description of the possible dynamics over all of parameter space.
This is joint work with J. Bush, W. Cowen, and S. Harker
Keywords of the presentation: merge trees, interleaving distance
This talk revisits merge trees, a basic topological descriptor that records
connectivity of sublevel sets of a scalar function. We introduce an interleaving
distance between two merge trees and establish its stability to perturbations of
the function. We show that this distance is never smaller than the bottleneck
distance between 0-dimensional persistence diagrams of the function.
On the computational side, we consider a distributed representation of merge
trees that not only improves their parallel computation, but also supports
parallel analysis. As an example, we show how to extract a prescribed levelset
component of the function with minimum communication.
Keywords of the presentation: persistent homology, homology of maps, homology algorithms, eigenvalues
When a topological space is known only from sampling,
persistence provides a useful tool to study its homological
properties. In many applications one can sample not only the space,
but also a map acting on the space. The understanding of the
topological features of the map is often of interest, in particular
in the analysis of time series dynamics but also in the dynamics
of a map or differential equation given explicitely when the rigorous study
is computationally too expensive and only numerical experiments are available.
The aim of the talk is to present an extension of persistent homology to the case of a continuous
self-map together with the associated algorithm and numerical examples based on the implementation
of the algorithm.
(joint with H. Edelsbrunner and G. Jablonski)
Keywords of the presentation: Persistent homology, statistics, Frechet mean
Recent progress has shown that the abstract space of persistence diagrams is Polish, and found necessary and sufficient conditions for a set to be compact.
This also allowed for the definition of Frechet means, a construction which is possible for any metric space. Since the Frechet mean is a set, not an element, there were no guarantees that this set was non-empty, however Mileyko et al showed that for well behaved distributions, the Frechet mean is, in fact, non-empty.
However, there are simple counterexamples showing that the mean is non-unique, which creates issues when we are interested in continuous, time-varying systems. In order to solve this problem, we present a continuous mean which is a generalization of the Frechet mean.
Keywords of the presentation: hypothesis testing, disease subtypes, network topology
I will discuss the effect of geometric transformations on the topology of data. Geometric transformations are central in highlighting characteristics in the data that extract information. A common feature is that the same data set can provide answers to multiple problems. Thus the choice of underlying geometry is crucial in highlighting the answers to the correct problem. There will be many examples from systems biology.
For points sampled near a compact set X, the persistence barcode of
the Rips filtration built from the sample contains information about the
homology of X as long as X satisfies some geometric assumptions. The
Rips filtration is prohibitively large, however zigzag persistence can
be used to keep the size linear.
In this work we introduce several species of Rips-like zigzags and
compare them with respect to the signal-to-noise ratio, a measure of how
well the underlying homology is represented in the persistence barcode
relative to the noise in the barcode at the relevant scales. Some of
these Rips-like zigzags have been available as part of the Dionysus
library for several years while others are new. Interestingly, we show
that some species of Rips zigzags will exhibit less noise than the
(non-zigzag) Rips filtration itself. Thus, Rips zigzags can offer
improvements in both size complexity and signal-to-noise ratio.
Along the way, we develop new techniques for manipulating and comparing
persistence barcodes from zigzag modules. In particular, we give methods
for reversing arrows and removing spaces from a zigzag while controlling
the changes occurring in its barcode. These techniques were developed to
provide our theoretical analysis of the signal-to-noise ratio of
Rips-like zigzags, but they are of independent interest as they apply to
zigzag modules generally.
Keywords of the presentation: persistent homology, well groups, 2-category
This talk is about the categorical structure of the persistent homology group. This is part of an upcoming paper with Robert MacPherson.
Given a function f: X -> R, persistence examines the evolution of the homology of the sublevel sets. There is a persistent homology group for each pair of values r M to manifolds. For now, we are calling the generalized persistent homology group the well group. To each open set of M, there is a well group. For a pair of nested open sets V subset U, we construct a morphism from the well group over U to the well group over V. For a triple of nested open sets W subset V subset U, the triangle consisting of the three well groups and three morphisms does not commute. However, it does commute up to a 2-morphism. We will use pictures to illustrate these ideas.
We present in this talk a theoretical framework for studying the
persistent homology of point clouds from time-delay (or sliding window)
embeddings. We will show that maximum 1-d persistence yields a suitable
measure of periodicity at the signal level, and present theorems which
relate the resulting diagrams to the choices of window size, embedding
dimension and field of coefficients. If time permits, we will
demonstrate how this methodology can be applied to the study of
periodicity on time series from gene expression data.
The complexity of social processes has made it difficult to formulate social theories that are mathematically effective. Prototypical social patterns occupy high-dimensional spaces that resist simple characterization. However, category theoretic and topological strategies open up the prospect of innovative interaction between mathematics and social theory. Specifically, ideal types provide the potential to define social invariances relative to which prototypical social concepts can be specified. As statistical distributions shift over time, subject to idealized confordances, dynamic social processes can be clarified. Topological data analysis will play a significant role in this research strategy.
A National Research Council report has called attention to
mathematicians the problem of protein folding. Here we will give a setting
for the problem by putting a geometrical structure on the space of proteins.
A significant question in the understanding of the folding process is this.
Which pairs of positions in a linear string representing the protein will
come into contact in the 3 dimensional structure?
Keywords of the presentation: persistence diagrams, statistics
We can view persistence diagrams as stochastic variables themselves. As such we wish to be able to analyze sets of persistence diagrams statistically. To this end I will define and characterize the central tendencies of the mean and the median. Separately, I will also provide a method of null hypothesis testing.
The Cosmic Web is the fundamental spatial organization of matter on scales of a few up to a hundred
Megaparsec, scales at which the Universe still resides in a state of moderate dynamical evolution.
galaxies, intergalactic gas and dark matter exist in a wispy weblike spatial arrangement consisting of
dense compact clusters, elongated filaments, and sheetlike walls, amidst large near-empty void regions.
While the complex intricate structure of the cosmic web contains a wealth of cosmological
information, its quantification has remained a major challenge. In this lecture, we describe
recent work towards invoking concepts from computational topology and computational
geometry to our understanding of the structure of the Cosmic Web and to new insights into
the conditions in the primordial Universe out of which it emerged.
An important aspect of our understanding of the Cosmic Web concerns the connectivity
of the various components. This leads us to recent work on the topological analysis
of the Megaparsec scale distribution. To this end, we resort to the homology of the
weblike structure, and determine the scale-dependent Betti numbers. To infer this from the discrete
spatial galaxy distribution (or of particles in computer models of cosmic structure formation)
we extract the Betti numbers from alpha shapes. We have studied the alpha complex of the cosmic weblike
point patterns, in order to assess the signature of filaments, walls and voids. Of considerable
importance within the context of the multiscale nature of the cosmic mass distrbution, is the
information contained in persistence diagrams. Amongst others, I will describe recent work on
persistence properties of Gaussian and non-Gaussian random density fields, which form the initial
conditions out of which the cosmic web emerged through the force of gravity.
Nuclear simulations are often computationally expensive, time-consuming, and
high-dimensional with respect to the number of input parameters.
Thus exploring the space of all possible simulation outcomes is
infeasible using finite computing resources. During simulation-based
probabilistic risk analysis, it is important to discover the
relationship between a potentially large number of input parameters
and the output of a simulation using as few simulation trials as
possible. This is a typical context for performing adaptive sampling
where a few observations are obtained from the simulation, a surrogate
model is built to represent the simulation space, and new samples are
selected based on the model constructed. The surrogate model is then
updated based on the simulation results of the sampled points. In this
way, we attempt to gain the most information possible with a small
number of carefully selected sampled points, limiting the number of
expensive trials needed to understand features of the simulation
We analyze the specific use case of identifying the limit surface,
i.e., the boundaries in the simulation space between system failure
and system success. In this study, we explore several techniques for
adaptively sampling the parameter space in order to reconstruct the
We focus on several adaptive sampling schemes.
First, we seek to learn a global model of the entire simulation space
using prediction models or neighborhood graphs and extract the limit
surface as an iso-surface of the global model.
Second, we estimate the limit surface by sampling in the neighborhood
of the current estimate based on topological segmentations obtained locally.
Our techniques draw inspirations from topological structure known as
the Morse-Smale complex. We highlight the advantages and disadvantages
of using a global prediction model versus local topological view of
the simulation space, comparing several different strategies for
adaptive sampling in both contexts. One of the most interesting models
attempt to marry the two by obtaining a coarse global representation
using prediction models, and a detailed local representation based on
Our methods are validated on several analytical test functions as well
as a small nuclear simulation dataset modeled after a simplified
Pressurized Water Reactor.
Joint work with: Dan Maljovec, Valerio Pascucci, Peer-Timo Bremer and
Poincare maps and their Conley indices are useful tools to study continuous dynamical systems. We are developing a method to compute the Conley index of Poincare maps of non-autonomous periodic ODEs. We apply algorithms for rigorous proofs in dynamical systems and cubical homology calculations. The new method requires only integration of the equation over a short time interval.
The main ideas of the method are developed by Marian Mrozek and Roman Srzednicki.
Keywords of the presentation: topologcial inference, complexity, comparison geometry, controlled topology, concentration of measure
I will discuss some stylized topological inference problems and give some information about Kolmogorov, sample and computational complexity. I will then discuss some tentative first steps towards a theory of feasibly computable invariants.
The Euler characteristic is a basic topological property. We present
an object descriptor for supervised classification based on the Euler
characteristic of subsets created by thresholding a function defined
over the domain at multiple levels. We demonstrate the effectiveness
of the descriptor in different domains - images, videos and 3D mesh
surfaces. In addition, we propose methods for efficient calculation of
the Euler characteristic for multiple threshold values and calculation
of the Euler characteristic in a sliding window.
An elementary geometrical proof of the fact that the Euler characteristic is the only topological invariant of a surface that can be found by integration (using Gauss-Bonnet) is given. A similar method is also applied to three dimensional manifolds.
The intrinsic volumes generalize both Euler characteristic and volume, quantifying the “size” of a set in various ways. Lifting the intrinsic volumes from sets to functions over sets, we obtain the Hadwiger Integrals, a family of integrals that generalize both the Euler integral and the Lebesgue integral. The classic Hadwiger Theorem says that the intrinsic volumes form a basis for the space of all valuations on sets. An analogous result holds for valuations on functions: with certain assumptions, any valuation on functions can be expressed in terms of Hadwiger integrals. These integrals provide various notions of the size of a function, which are potentially useful for analyzing data arising from sensor networks, cell dynamics, image processing, and other areas. This poster provides an overview of the intrinsic volumes, Hadwiger integrals, and possible applications.