November 18-21, 2013
Keywords of the presentation: carbon flux, variational data assimilation
We consider the estimation of carbon flux at the ocean-atmosphere interface from the perspective of variational data assimilation. The flux is treated as a time-dependent Neumann boundary condition for the carbon mixing ratio, which evolves according to a linear diffusive transport equation. The variational problem has a natural, infinite-dimensional Bayesian formulation. We present an explicit computation of the resulting posterior covariance, and, by examining its dependence on the observation operator, explain how our results can be used to guide future observational strategies.
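For orientation, the finite-dimensional linear-Gaussian analogue of such a posterior covariance can be written down explicitly. This is a standard textbook identity, not the infinite-dimensional computation of the talk; the symbols $f$ (flux), $G$ (observation operator), $C_0$ (prior covariance), and $\Gamma$ (observation-noise covariance) are notation assumed here for illustration.

```latex
% Assumed setup: prior f ~ N(m_0, C_0), data y = G f + \eta, noise \eta ~ N(0, \Gamma).
\begin{aligned}
C_{\mathrm{post}}
  &= \left( C_0^{-1} + G^{*} \Gamma^{-1} G \right)^{-1} \\
  &= C_0 - C_0 G^{*} \left( G C_0 G^{*} + \Gamma \right)^{-1} G C_0 .
\end{aligned}
```

In this form the observation operator $G$ enters only through the second term, which is the reduction in uncertainty attributable to the data; comparing that term across candidate choices of $G$ is one way to compare observational strategies.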
I will describe a recently-developed approach to studying experimental data obtained from a periodic or quasiperiodic dynamical system. The idea is to use persistent cohomology to construct robust new coordinate functions on the data. These coordinates take values in the circle, so they may correspond to phase variables or angular variables implicit in the system being studied. This is joint work with Primoz Skraba and Mikael Vejdemo-Johansson, with a key technical contribution by Dmitriy Morozov.
Our planet is experiencing simultaneous changes in global population, urbanization, and climate. These changes, along with the rapid growth of climate data and the increasing popularity of data mining techniques, may lead to the conclusion that the time is ripe for data mining to spur major innovations in climate science. However, climate data bring forth unique challenges that are unfamiliar to the traditional data mining literature, and unless they are addressed, data mining will not have the same impact that it has had on fields such as biology or e-commerce.
This talk provides a technical audience with an introduction to mining climate data with an emphasis on the singular characteristics of the datasets and research questions climate science attempts to address. We demonstrate some of the concepts discussed in the earlier parts of the talk with a spatio-temporal pattern mining application to monitor global ocean eddy dynamics. We show that insightfully mining the spatio-temporal context of climate datasets can yield significant improvements in the performance of learning algorithms.
Keywords of the presentation: data assimilation, ensemble Kalman filter, coupled climate, meridional overturning circulation
Motivated by the need to properly address near-term (i.e., interannual to interdecadal) climate prediction as an initial-value problem, intense interest has emerged in the development of data assimilation for coupled atmosphere-ocean global climate models. Basic research on this problem is challenging due to the large computational expense associated with ensemble simulation of coupled climate models over long periods of time. Here we apply an idealized atmosphere-ocean climate model and an ensemble Kalman filter to explore three basic questions on this problem:
(1) Under what circumstances is data assimilation needed?
(2) Is fully coupled assimilation required?
(3) What are the ideal properties of observations for coupled assimilation?
Robust solutions for large samples over the model attractor reveal that the slow overturning circulation in the ocean may be constrained with properly handled atmospheric observations alone. Results from the idealized model are compared with those obtained for data from the Community Earth System Model (CESM).
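The idea that observations of one component can constrain another can be illustrated with a minimal sketch of a stochastic (perturbed-observation) ensemble Kalman filter analysis step. This is a toy two-variable example, not the idealized climate model of the talk: the function name `enkf_analysis`, the prior correlation of 0.8, and the labels "atmosphere" and "ocean" for the two variables are all assumptions made for illustration.

```python
import numpy as np

def enkf_analysis(ensemble, H, y, R, rng):
    """One stochastic EnKF analysis step.

    ensemble: (n_state, n_members) forecast ensemble
    H: (n_obs, n_state) linear observation operator
    y: (n_obs,) observation vector
    R: (n_obs, n_obs) observation-error covariance
    """
    n_members = ensemble.shape[1]
    anomalies = ensemble - ensemble.mean(axis=1, keepdims=True)
    P = anomalies @ anomalies.T / (n_members - 1)      # sample forecast covariance
    S = H @ P @ H.T + R                                # innovation covariance
    K = np.linalg.solve(S, H @ P).T                    # Kalman gain P H^T S^{-1}
    # Each member assimilates its own randomly perturbed copy of the observation.
    noise = rng.multivariate_normal(np.zeros(len(y)), R, size=n_members).T
    return ensemble + K @ (y[:, None] + noise - H @ ensemble)

rng = np.random.default_rng(0)
# Hypothetical prior: variable 0 ("atmosphere") and variable 1 ("ocean") are correlated.
prior_cov = np.array([[1.0, 0.8], [0.8, 1.0]])
prior = rng.multivariate_normal([0.0, 0.0], prior_cov, size=500).T
H = np.array([[1.0, 0.0]])          # only variable 0 is observed
R = np.array([[0.1]])
y = np.array([1.0])

posterior = enkf_analysis(prior, H, y, R, rng)
prior_spread = prior.std(axis=1, ddof=1)
posterior_spread = posterior.std(axis=1, ddof=1)
print("prior spread:    ", prior_spread)
print("posterior spread:", posterior_spread)
```

Because the update uses the sample cross-covariance between the two variables, the spread of the unobserved variable shrinks as well, which is the mechanism by which cross-component information can propagate in coupled assimilation.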
Keywords of the presentation: filtering, model errors, multiscale systems
Model error is a fundamental obstacle to improving state estimation (or filtering). It stems from an incomplete understanding of the underlying physics and from a lack of computational resources to resolve physical processes across the relevant time and length scales. In this talk, we will discuss a linear theory for filtering multiscale dynamical systems with model error. In particular, we will use the notion of "consistency" to show the existence and uniqueness of an optimal filter with a reduced stochastic model. By optimality, we mean that both the posterior mean and covariance estimates from the reduced filter match the true filtered statistical solutions. Subsequently, we will construct an accurate reduced filter in a simple yet challenging nonlinear setting where, as in practical situations, the optimal filter is not available. Finally, we will discuss a stochastic parameterization strategy to account for model errors in filtering high-dimensional nonlinear problems. We will demonstrate our stochastic parameterization method in a numerical example by using a 9-dimensional reduced model to filter an 81-dimensional model that exhibits many of the characteristics seen in practical applications.
Keywords of the presentation: data assimilation, forcing, ocean
The focus of this talk is assimilating observations into a system whose errors are more dominated by forcing than by dynamical chaos. I will discuss my results and experience working with ensemble Kalman filtering on the Chesapeake Bay. Errors in Bay forecasts are driven mostly by errors in the wind and river input fields, which necessitates accounting for these uncertainties. I will also touch on how this work has tied into using EnKF for cardiac models.
Keywords of the presentation: multi-scale, balance
Geophysical fluid dynamics exhibits a wide range of spatial and temporal scales that are partially balanced. Observations, consisting of in-situ and remote-sensing measurements, are sampled at multiple resolutions. To assimilate spatially high-resolution and sparse observations together, a data assimilation scheme must fulfill two specific objectives while taking balances into account. First, the large-scale flow components must be effectively constrained using the sparse observations. Second, small-scale features that are resolved by the high-resolution observations must be utilized to the maximum degree possible. In this talk, we present a practical multi-scale approach to data assimilation and demonstrate its advantage over conventional data assimilation.
Keywords of the presentation: Climate, data
In climate science, the computer is the laboratory and climate models are the experimental tools. In this introductory talk I will discuss the hierarchy of climate models and give some examples of how data are collected and what we can do with them.
Keywords of the presentation: EnKF, filter divergence, stochastic analysis
The Ensemble Kalman Filter (EnKF) is a widely used tool for assimilating data with high dimensional nonlinear models. Nevertheless, our theoretical understanding of the filter is largely supported by observational evidence rather than rigorous statements.
In this talk we attempt to make rigorous statements regarding "filter divergence", where the filter loses track of the underlying signal. To be specific, we focus on the more exotic phenomenon known as "catastrophic filter divergence", where the filter reaches machine infinity in finite time.
Keywords of the presentation: Model reduction, Fluctuation-dissipation, fast-slow systems, Lorenz 96
Numerical computation of multiscale systems can be quite challenging, often requiring a very fine discretization to adequately capture the fast component. In this talk I present a closure approximation method for the slow component of a two-timescale system of ODEs, replacing the coupled fast component with a simple parametrization. To incorporate the first-order response of the fast component in this parametrization, we use the fluctuation-dissipation relation, a result from statistical mechanics that predicts the response to perturbations from observations of the unperturbed dynamics. Results will be shown for the Lorenz 96 system for various coupling and forcing regimes.
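As a point of reference, the commonly used two-scale form of the Lorenz 96 system can be sketched as follows. This shows only the coupled fast-slow test system, not the closure method of the talk; the parameter values (`F`, `h`, `c`, `b`), the system size, the time step, and the function names are assumptions chosen for illustration.

```python
import numpy as np

def l96_two_scale_rhs(X, Y, F=10.0, h=1.0, c=10.0, b=10.0):
    """Tendencies of the two-scale Lorenz 96 system.

    X: (K,) slow variables; Y: (K*J,) fast variables, stored so that
    entries k*J .. k*J+J-1 are coupled to X[k]. Both are cyclic.
    """
    K = X.size
    J = Y.size // K
    coupling = h * c / b
    dX = (-np.roll(X, 1) * (np.roll(X, 2) - np.roll(X, -1)) - X + F
          - coupling * Y.reshape(K, J).sum(axis=1))
    dY = (-c * b * np.roll(Y, -1) * (np.roll(Y, -2) - np.roll(Y, 1)) - c * Y
          + coupling * np.repeat(X, J))
    return dX, dY

def rk4_step(X, Y, dt, **params):
    """Advance (X, Y) by one classical fourth-order Runge-Kutta step."""
    k1x, k1y = l96_two_scale_rhs(X, Y, **params)
    k2x, k2y = l96_two_scale_rhs(X + 0.5 * dt * k1x, Y + 0.5 * dt * k1y, **params)
    k3x, k3y = l96_two_scale_rhs(X + 0.5 * dt * k2x, Y + 0.5 * dt * k2y, **params)
    k4x, k4y = l96_two_scale_rhs(X + dt * k3x, Y + dt * k3y, **params)
    X_new = X + dt / 6.0 * (k1x + 2 * k2x + 2 * k3x + k4x)
    Y_new = Y + dt / 6.0 * (k1y + 2 * k2y + 2 * k3y + k4y)
    return X_new, Y_new

rng = np.random.default_rng(0)
K, J = 8, 32                                  # assumed sizes
X = 10.0 + rng.standard_normal(K)
Y = 0.1 * rng.standard_normal(K * J)
for _ in range(1000):                         # small step to resolve the fast scale
    X, Y = rk4_step(X, Y, dt=0.001)
print("slow variables after integration:", X)
```

The time step must resolve the fast variables `Y`, which is the cost that a closure for the slow variables `X` alone is meant to avoid.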
Keywords of the presentation: Data assimilation, high performance computing, bias correction
The Local Ensemble Transform Kalman Filter (LETKF) is an accurate state estimation scheme that has been applied successfully to several different weather and ocean models. I will address two issues in this talk. First, the LETKF is an embarrassingly parallel algorithm: every grid point is updated independently using a local set of observations. I will discuss some difficulties associated with getting good efficiency on modern parallel computer architectures. Second, I will discuss an approach to bias correction with potential application to atmospheric modeling.
Keywords of the presentation: ensemble Kalman filter, uncertainty
To investigate the impacts of observing only surface pressure, the Data Assimilation Research Testbed and the Community Earth System Model (DART/CESM) are used for observing system simulation experiments with the ensemble Kalman filter (EnKF). An empirical localization function (ELF) is used to effectively spread the surface pressure observations in the vertical. The ELF is defined to minimize the root mean square (RMS) difference between the truth and the posterior ensemble mean for state variables. The spatial density and temporal frequency of the observations are varied. The land surface and sea ice models are fully active and coupled to the atmosphere model but the ocean is specified. Surface pressure observations can constrain uncertainty throughout the entire depth of the troposphere. However, the atmospheric boundary layer sometimes has large uncertainty over land. In general the error of the entire depth of the troposphere can be better constrained with increased observation density and frequency.
Classical time series analysis shows conclusive evidence that variations in the Earth's orbital parameters are present in the geological record of the Earth's climate. The analysis also shows that there is much more to the story. Some of the open questions concerning the glacial cycles of the last few million years will be discussed.
Keywords of the presentation: data assimilation, ensemble Kalman filter, localization, multi-model ensemble, big data
Ensemble data assimilation methods have been improved consistently and have become a viable choice in operational numerical weather prediction. Dealing with multi-scale error covariance and model errors are among the unresolved issues that play essential roles in analysis performance. With higher-resolution models, narrower localization is generally required to reduce sampling errors in ensemble-based covariance between distant locations. However, such narrow localization limits the use of observations that carry larger-scale information. This study aims to separate the scales of the analysis increments, independently of observing systems. Inspired by M. Buehner, we applied two different localization scales to find analysis increments at the two separate scales, and obtained improvements in simulation experiments using an intermediate AGCM known as the SPEEDY model. Another important issue is model error. Among many other efforts since Dee and da Silva’s model bias estimation, we explore a discrete Bayesian approach to adaptively choosing model physics schemes that produce a better fit to observations. This presentation summarizes our recent progress at RIKEN on these theoretical and practical topics, and also introduces our future perspectives and challenges, including “Big Data Assimilation” for extremely-short-range weather forecasting using next-generation high-resolution weather simulations, supercomputers, and new observing instruments.
Keywords of the presentation: data management analysis and visualization
Effective use of data management techniques for massive climate models is a crucial ingredient for the success of modern environmental research. Developing such techniques involves a number of major challenges such as the real-time management of massive data, or the quantitative analysis of scientific features of unprecedented complexity. The Center for Extreme Data Management Analysis and Visualization (CEDMAV) addresses these challenges with interdisciplinary research in diverse topics including the mathematical foundations of data representations, the design of robust, efficient algorithms, and the integration with relevant applications.
In this talk, I will discuss one approach developed for dealing with massive amounts of information via a framework for processing large-scale scientific data with high-performance selective queries on multiple terabytes of raw data. The combination of this data model with progressive streaming techniques allows achieving interactive processing rates on a variety of computing devices ranging from handheld devices like an iPhone, to simple workstations, to the I/O of parallel supercomputers. With this framework we demonstrated how one can enable the real-time streaming of massive simulations from DOE platforms such as Hopper2 at LBNL and Intrepid at ANL.
I will also present the application of a discrete topological framework for the representation and analysis of the same large-scale scientific data. Due to the combinatorial nature of this framework, we can implement the core constructs of Morse theory without the approximations and instabilities of classical numerical techniques. The inherent robustness of the combinatorial algorithms allows us to address the high complexity of the feature extraction problem for high-resolution scientific data such as eddies in ocean simulations.
During the talk, I will provide a live demonstration of the effectiveness of some software tools developed in CEDMAV and discuss the deployment strategies in an increasingly heterogeneous computing environment.
Traditional data assimilation is cast as amplitude data assimilation and contrasted with displacement data assimilation, the latter being able to correct phase information in a physically meaningful way. We use area-preserving maps to correct phase errors in problems wherein feature preservation is essential. An example of a problem where phase information is crucial is the tracking of hurricanes/cyclones/tornadoes.
I will first motivate the use of this method by describing how variance-minimizing techniques are less successful in problems where feature preservation/detection is critical. I will describe one of our own amplitude data assimilation methods, which is capable of handling nonlinear/non-Gaussian problems, albeit of small dimension, as a benchmark of what is possible with a traditional amplitude data assimilation method.
I will then contrast its results with those of the displacement assimilation technique and describe how both of these approaches could be combined to obtain improved estimates of the first few moments of the posterior density of states, given observations.
Joint work with Steven Rosenthal and Shankar Venkataramani.
Keywords of the presentation: data assimilation, model features, uncertainty structure, Kalman filters
Data assimilation algorithms predict the true model state by minimizing uncertainty due to errors from multiple sources, typically between forecast models and physical measurements of the state variables. The accuracy of state predictions is affected by how model error is parameterized, and taking care to preserve model physics in statistical estimators can improve performance. A popular assimilation framework in geoscience applications is the Kalman filter, which is a weighted average that can blur important features such as fronts and extrema. These optimal estimates predict features with reduced intensity and obscured locality; moreover, the statistics of the estimator describe the intensity of feature information, but not its position. This may be detrimental to understanding the trajectories of large scale features. One possible remedy is parameterizing model error in terms of positional shifts of coherent model features over time. Physical constraints can be built into the prediction of the most likely coordinate system supporting the true state, so that its uncertainty can be reduced with much less impact on the intensity and geometry of dominant features. A modified extended Kalman filter is developed to track Lagrangian and Eulerian solutions to a vorticity advection model. A basis for canonical transformations ensures the area of each solution contour is preserved throughout assimilation, and a Tikhonov regularization is proposed to facilitate the change of measure of model state uncertainty into the canonical basis. Passive tracers are considered for an observing system as an indirect measurement of vorticity, and due to their availability and recent attention in ocean transport research. Twin experiments show that this assimilation methodology successfully tracks a simulated truth in the presence of noisy perturbations to the coordinate system and tracers.
This is joint work with Juan Restrepo and Shankar Venkataramani (Univ. of Ariz.), and Arthur Mariano (Univ. of Miami, RSMAS).
Keywords of the presentation: Lagrangian Data Assimilation, Parameter Estimation
Inferring parameters in a geophysical flow model is a challenge for Lagrangian data assimilation (LaDA). We present a filtering-based method that combines a particle filter and the EnKF to track time-varying state vectors (positions of drifters) and fixed model parameters in a quasi-geostrophic two-layer shallow water model. Our method uses a dual strategy that performs parameter estimation by particle filtering and subsequently uses the "best" parameter to track the positions of drifters by EnKF. This method suits situations where the parameter space is low-dimensional but the state vector (the drifters) is high-dimensional.
Keywords of the presentation: decadal prediction, climate prediction
Climate prediction may be defined as probabilistic forecasting of the state of the atmosphere-ocean system for lead times of a season or longer. Explicitly or implicitly, climate prediction forms the basis of all socio-economic planning, such as water resource allocation or insurance risk assessment. Climate prediction is carried out using dynamical and statistical models over a range of timescales, ranging from seasonal to centennial. Prediction skill derives from two distinct sources of information: (I) knowledge of the initial state of the atmosphere-ocean system, and (II) knowledge (or projections) of the time evolution of external “boundary conditions” such as the radiative forcing associated with greenhouse gases.
In this talk, some of the challenges in carrying out seasonal to decadal climate predictions are analyzed, encompassing both dynamical and statistical approaches. One of the biggest challenges is the estimation of the initial state of the global ocean, and data assimilation can play an important role in addressing this. Another challenge is development of dynamical models of the atmosphere-ocean system that are able to simulate the current state of climate with fidelity. The rapid loss in the skill of climate prediction at regional spatial scales, as opposed to predictions of low-order spatial moments like the global-mean surface temperature, poses yet another challenge to the utility of climate predictions.
Keywords of the presentation: data assimilation, particle filters, nonlinear systems
Particle filters were developed as an ensemble data assimilation method that, unlike many traditional methods, could approximate non-Gaussian probability distributions well. Unfortunately, the number of particles needed for this method scales exponentially with the size of the problem, and thus may not be feasible for large-scale problems. In this talk I will review previous results which show how filter collapse is related to the size of the problem in linear systems. I will then discuss new results which numerically show that similar results hold in the nonlinear case, and discuss some complications that arise when we switch from the linear to the nonlinear regime. Finally, I will discuss some encouraging results regarding the optimal proposal vs the standard proposal distributions.
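The link between problem size and filter collapse can be seen in a small numerical experiment. The sketch below is an assumed toy setup, not the experiments of the talk: a standard-normal prior, an identity observation operator, unit observation noise, and the standard proposal (particles drawn from the prior). The function names are hypothetical. It reports the effective sample size of the importance weights for a fixed number of particles as the dimension grows.

```python
import numpy as np

def effective_sample_size(log_weights):
    """Effective sample size 1 / sum(w_i^2) of normalized importance weights."""
    w = np.exp(log_weights - log_weights.max())   # subtract max for numerical stability
    w /= w.sum()
    return 1.0 / np.sum(w ** 2)

def ess_for_dimension(dim, n_particles, rng):
    """ESS of prior-sampled particles after one observation in `dim` dimensions."""
    truth = rng.standard_normal(dim)
    obs = truth + rng.standard_normal(dim)                # y = x + noise, unit variance
    particles = rng.standard_normal((n_particles, dim))   # draws from the N(0, I) prior
    log_w = -0.5 * np.sum((obs - particles) ** 2, axis=1) # Gaussian log-likelihood
    return effective_sample_size(log_w)

rng = np.random.default_rng(1)
ess = {dim: ess_for_dimension(dim, 1000, rng) for dim in (1, 10, 100)}
for dim, value in ess.items():
    print(f"dimension {dim:4d}: effective sample size {value:.1f} of 1000")
```

With the particle count held fixed, the effective sample size falls toward one as the dimension increases, meaning a single particle carries nearly all of the weight.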
We investigate the assimilation of data that are collected while Lagrangian ocean instruments are in transit between surfacings. Effectively utilizing such data presents a challenge as the subsurface paths of these instruments are unknown. We introduce an observation operator that takes these data into account in addition to the data that are typically assimilated. A key point is that the subsurface, en-route paths of these ocean instruments are estimated as part of the assimilation scheme.
We will also discuss a hybrid assimilation scheme being developed that is quite well suited to Lagrangian data assimilation. Finally we will posit how we see these two schemes being used together.
Keywords of the presentation: forecast errors, spatiotemporal dynamics, TIGGE
The main focus of this presentation is on the spatiotemporal dynamics of errors in global model-based forecasts of the atmospheric state. First, a review of the latest important results from the literature is provided. Then, the results of a new investigation, which is based on the THORPEX Interactive Grand Global Ensemble (TIGGE) data set, are presented. This data set provides a unique opportunity to study the dynamics of model errors, because it includes data produced by the top global forecast systems of the world. These systems use a variety of techniques to represent the effects of the initial condition and model errors on the forecast errors. The point that exponential growth of the forecast errors at the baroclinically unstable scales plays a central role in the error dynamics is reiterated.
Keywords of the presentation: particle filter, implicit sampling, data assimilation, sequential Monte Carlo
In this talk, we will introduce the implicit particle filter, a new particle filter that focuses particle paths on regions of high probability by solving equations with a random input. Thus it is applicable even if the state dimension is large.
We will also present the details of the implementation of the implicit particle filter and its connection with other data assimilation methods. Several examples will be provided to illustrate the efficiency and accuracy of the algorithm.
Keywords of the presentation: Lyapunov exponents, continuous matrix factorizations, error analysis, Lyapunov vectors
In this talk we present computational techniques for approximation of Lyapunov exponents based upon smooth matrix factorizations and some potential applications of these techniques to earth system processes. Lyapunov exponents characterize stability properties of time-dependent solutions to differential equations. We introduce methods for approximation of Lyapunov exponents, review results on the sensitivity of Lyapunov exponents to perturbations, describe codes we have developed for their computation, and present results on the approximation of Lyapunov vectors and some possible applications of these results.
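A simpler relative of the factorization-based techniques is the discrete QR method, sketched below. This is not the continuous-factorization code described in the talk; the function name `lyapunov_exponents_qr` and the constant test matrix are assumptions chosen so that the answer can be checked against the eigenvalues.

```python
import numpy as np

def lyapunov_exponents_qr(jacobian_at, n_steps, dim):
    """Estimate Lyapunov exponents of a discrete-time linear system.

    jacobian_at(step) returns the (dim, dim) one-step propagator at that step.
    The orthonormal frame Q is re-orthogonalized by QR at every step, and the
    exponents are the time averages of log|diag(R)|.
    """
    Q = np.eye(dim)
    log_sums = np.zeros(dim)
    for step in range(n_steps):
        Q, R = np.linalg.qr(jacobian_at(step) @ Q)
        log_sums += np.log(np.abs(np.diag(R)))
    return log_sums / n_steps

# Test case: a constant propagator, for which the exponents are log|eigenvalue|.
A = np.array([[2.0, 1.0], [1.0, 1.0]])
exponents = lyapunov_exponents_qr(lambda step: A, n_steps=2000, dim=2)
expected = np.sort(np.log(np.abs(np.linalg.eigvals(A))))[::-1]
print("QR estimate:     ", exponents)
print("log|eigenvalues|:", expected)
```

Re-orthogonalizing at each step keeps the frame well conditioned, which is what allows exponents beyond the largest one to be recovered; for a time-dependent problem, `jacobian_at` would return the linearized propagator along the trajectory.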
Keywords of the presentation: topological data analysis, persistent homology, climate reanalysis data
Topological data analysis is a robust tool for examining noisy, high-dimensional data. We adapt topological techniques, such as Mapper and persistent homology, to examine the dynamics of sea surface temperatures and identify climate oscillations.
This is joint work with D. Morozov and Prabhat.