Main navigation | Main content

HOME » SCIENTIFIC RESOURCES » Volumes

Abstracts and Talk Materials

August 4, 2010

Boundary integral equations are an efficient approach for the scattering of acoustic and electromagnetic waves from periodic arrays (gratings) of obstacles. The standard way to periodize is to replace the free-space Green's function kernel with its quasi-periodic cousin. This idea has two drawbacks: i) the quasi-periodic Green's function diverges (does not exist) for parameter families known as Wood's anomalies, even though the scattering problem remains well-posed, and ii) the lattice sum representation of the quasi-periodic Green's function converges in a disc, thus becomes unwieldy when obstacles have high aspect ratio.

We bypass both problems by means of a new integral representation that relies on the free-space Green's function alone, and adds auxiliary layer potentials on the boundary of the unit cell strip while expanding the linear system to enforce quasi-periodicity. The result is a (slightly larger) 2nd kind system that achieves spectral accuracy, is immune to Wood's anomalies, avoids lattice sums, handles large aspect ratios, and couples to existing scattering code. By careful summing of neighboring images, obstacles may intersect the unit cell walls.

If time, we will discuss a similar robust new approach to the photonic crystal band structure eigenvalue problem.

Joint work with Leslie Greengard (NYU).

We bypass both problems by means of a new integral representation that relies on the free-space Green's function alone, and adds auxiliary layer potentials on the boundary of the unit cell strip while expanding the linear system to enforce quasi-periodicity. The result is a (slightly larger) 2nd kind system that achieves spectral accuracy, is immune to Wood's anomalies, avoids lattice sums, handles large aspect ratios, and couples to existing scattering code. By careful summing of neighboring images, obstacles may intersect the unit cell walls.

If time, we will discuss a similar robust new approach to the photonic crystal band structure eigenvalue problem.

Joint work with Leslie Greengard (NYU).

In an approach similar to Ewald’s method for evaluating lattice sums, we split the application of Helmholtz Green’s functions between the spatial and the Fourier domains and, for any finite accuracy, approximate their kernels.
In the spatial domain we use a near optimal linear combination of decaying Gaussians with positive coefficients and, in the Fourier domain, a multiplication by a band-limited kernel obtained by using new quadratures appropriate for the singularity in the Fourier domain. Applying this approach to the free space and the quasi-periodic Green’s functions, as well as those with Dirichlet, Neumann or mixed boundary conditions on simple domains, we obtain fast algorithms in dimensions two and three for computing volumetric integrals involving these Green's functions.

This is a joint work with Chris Kurcz amd Lucas Monzon.

This is a joint work with Chris Kurcz amd Lucas Monzon.

The fast multipole method (FMM) is an efficient algorithm for what is
known as "N-body problems." I will present a new scalable algorithm
and a new implementation of the kernel-independent fast multipole
method, in which both distributed memory parallelism (using MPI) and
shared memory/SIMD parallelism (via GPU acceleration) are employed. I
will conclude my talk by discussing the direct numerical simulation of
blood flow in the Stokes regime using the FMM. I will describe
simulations with 200 million red blood cells, an improvement of four
orders of magnitude over previous results.

Bio:

George Biros holds Associate Professor appointments with the Schools of Computational Science and Engineering at Georgia Tech and The Wallace H. Coulter Department of Biomedical Engineering at Georgia Tech and Emory University. Prior to joining Georgia Tech, he was an assistant professor in Mechanical Engineering and Applied Mechanics, Bioengineering and Computer and Information Science at the University of Pennsylvania. He received his BS in Mechanical Engineering from Aristotle University Greece (1995), his MS in Biomedical Engineering from Carnegie Mellon (1996), and his PhD in Computational Science and Engineering also from Carnegie Mellon (2000). He was a postdoctoral associate at the Courant Institute from 2000 to 2003.

Bio:

George Biros holds Associate Professor appointments with the Schools of Computational Science and Engineering at Georgia Tech and The Wallace H. Coulter Department of Biomedical Engineering at Georgia Tech and Emory University. Prior to joining Georgia Tech, he was an assistant professor in Mechanical Engineering and Applied Mechanics, Bioengineering and Computer and Information Science at the University of Pennsylvania. He received his BS in Mechanical Engineering from Aristotle University Greece (1995), his MS in Biomedical Engineering from Carnegie Mellon (1996), and his PhD in Computational Science and Engineering also from Carnegie Mellon (2000). He was a postdoctoral associate at the Courant Institute from 2000 to 2003.

We describe a collection of techniques which allow
for the fast and accurate solution of boundary integral
equations on two-dimensional domains whose boundaries
have corner points. Our approach has two key advantages
over existing and recently suggested schemes: (1) it
does not require a prior analytic estimates for solutions,
and (2) many aspects of the scheme generalize readily
to singular three-dimensional domains.

August 3, 2010

We show how to construct higher-order (10 and above) finite-difference schemes using
Minimum Sobolev Norm techniques that lead to O(N^{2}) condition number discretizations of second-order elliptic PDEs. Numerical experiments comparing with standard FEM solvers show the benefit of the new approach. We then discuss fast numerical methods to convert the discrete equations into discretized integral equations that can have bounded condition number, and hence achieve even higher numerical accuracy.

Sparse solution to an underdetermined linear integral equation is
the central problem for a broad range of applications - scattering,
sensing, imaging, machine learning, signal and image processing, data
analysis and compression, model reduction, optimal control and
design. We will introduce a weak formulation of the problem and
construct its sparse solution by a nonlinear process - the design
of a Gaussian quadrature for the kernel of the integral equation.
We will present a systematic method to solve the resulting quadrature
problem by eigen decomposition.

Solving electromagnetics problem is a challenging task, especially when the structure is multi-scale. These structures are often encountered in circuits in electronic packaging, small antenna designs, as well as small sensor designs. A challenging problem in computational electromagnetics is in solving multi-scale problems in the low frequency regime, especially in the regime between statics and electrodynamics. When the wavelength is much longer than the size of the structure, the physics of the electromagnetic field resembles those of circuits, and hence, this is the circuit physics regime. When the wavelength is sizeable compared to the structure, than wave physics becomes important, and it is important that a simulation method can capture the wave physics interaction. When a structure is multi-scale, and has parts that are small compared to wavelength, but at the same time, is on the order of wavelength, then both circuit physics and wave physics are important. A simulation method has to capture both physics. We will discuss the use of the equivalence principle algorithm (EPA) to capture the multi-scale physics of complex structures. In this method, complex structures are partitioned into parts by the use of equivalence surfaces. The interaction of electromagnetic field with structures within the equivalence surface is done through scattering operators working via the equivalence currents on the equivalence surfaces. The solution within the equivalence surface can be obtained by various numerical methods. Then the interaction between equivalence surfaces is obtained via the use of translation operators. When accelerated with the mixed-form fast multipole method, large multi-scale problems can be solved in this manner. We will also discuss the augmented electric field integral equation (A-EFIE) approach in solving the low-frequency breakdown problem as encountered in circuits in electronic packaging. In this method, the EFIE is augmented with an additional charge unknown, and an additional continuity equation relating the charge to the current. The resultant equation, after proper frequency normalization, is frequency stable down to very low frequency. This belongs to a generalized saddle-point method, and apparently it does not suffer from the low-frequency breakdown, but it does have the low-frequency inaccuracy problem. We will discuss the use of the perturbation method to derive accurate solutions when the low-frequency inaccuracy problem occurs. We will also discuss the hybridization of EPA and A-EFIE to tackle some multi-scale problems.

Joint work with Cris Cecka and Pierre-David Letourneau (Stanford University).

The fast multipole method (FMM) is a technique allowing the fast calculation of sums in O(N) or O(N ln N) steps with some prescribed tolerance. Original FMMs required analytical expansions of the kernel, for example using spherical harmonics or Taylor expansions. In recent years, the range of applicability and the ease of use of FMMs has been considerably extended by the introduction of black box or kernel independent techniques. In these approaches, the user only provides a subroutine to numerically calculate the kernel K. This allows changing the definition of K with practically no change to the computer program. In this talk we will present a novel method that attempts to address some of the shortcomings of existing kernel independent FMMs. This method is applicable to all kernels that are analytic in some appropriate region. An important property of the method is the fact that the multipole-to-local operator is diagonal, that is for p multipole coefficients, the cost of applying the operator is O(p). The computational cost is O(N) for non-oscillatory kernels and O(N ln N) for oscillatory kernels. The approach is based on Cauchy's integral formula. We will present a numerical analysis of the convergence, and methods to find the optimal parameters in the FMM. Numerical results will be presented for some benchmark calculations to demonstrate the accuracy as a function of the number of multipole coefficients, and the computational cost of the different steps.

The fast multipole method (FMM) is a technique allowing the fast calculation of sums in O(N) or O(N ln N) steps with some prescribed tolerance. Original FMMs required analytical expansions of the kernel, for example using spherical harmonics or Taylor expansions. In recent years, the range of applicability and the ease of use of FMMs has been considerably extended by the introduction of black box or kernel independent techniques. In these approaches, the user only provides a subroutine to numerically calculate the kernel K. This allows changing the definition of K with practically no change to the computer program. In this talk we will present a novel method that attempts to address some of the shortcomings of existing kernel independent FMMs. This method is applicable to all kernels that are analytic in some appropriate region. An important property of the method is the fact that the multipole-to-local operator is diagonal, that is for p multipole coefficients, the cost of applying the operator is O(p). The computational cost is O(N) for non-oscillatory kernels and O(N ln N) for oscillatory kernels. The approach is based on Cauchy's integral formula. We will present a numerical analysis of the convergence, and methods to find the optimal parameters in the FMM. Numerical results will be presented for some benchmark calculations to demonstrate the accuracy as a function of the number of multipole coefficients, and the computational cost of the different steps.

The butterfly algorithm is a robust alternative to the FFT for
computing certain oscillatory integrals in a fast and accurate manner. In
this approach low-rank interactions are updated in a hierarchical fashion
up and down quadtrees. We review the method, its expected accuracy, and
present an application to synthetic aperture radar imaging. Joint work
with Lexing Ying.

I will discuss recent joint work with Leslie Greengard on a new representation of solutions to the time harmonic Maxwell Equations. Using this representation we reduce the solution of the classical problems of scattering off of a smooth bounded interface to the solution of Fredholm integral equations of second kind on the interface. What distinguishes our representation is that it does not have any spurious "interior" resonances, or suffer from low frequency breakdown. It also reveals some interesting topological features of the time harmonic equations at non-zero frequencies.

The general form of the boundary integral representation formula
for a first order linear system in convervation law form is
reviewed.
The Frechet derivative of this representation is then studied
with the aid
of the parametric derivative formulae for the normal boundary
element,
n ds (resp. n dS or n dV) for boundaries in R^{2}
(resp. R^{3} or
R^{4}). It
is shown how the proper handling of the surface variation
enables the
recovery of the obvious representation formula the parametric
derivative
of the field vector.

December 31, 1969

Poisson-Boltzmann (PB) equation based implicit solvent model can greatly reduce the computational cost in simulating solvated biomolecular systems by applying the mean field approximation in permittivities and capturing the mobile ions with Boltzmann distribution. However, solving PB equation suffers many numerical difficulties ranging from discontinuous permittivities and electrostatic field across the dielectric interface, the infinite boundary conditions, and charge singularities. In this project, we provide an efficient and accurate numerical algorithm, which adopts a well-conditioned boundary integral equation to handle these difficulties while accelerates the Krylov subspace based iterative methods such as GMRES with treecode. This Cartesian coordinates based treecode is an O(N*log(N)) scheme with properties of easy implementation, efficient memory usage O(N) , and straightforward parallelization. Benchmark testing results on Kirkwood sphere plus simulations of biomoleculars in various sizes are provided to demonstrate the accuracy, efficiency and robustness of the present methods.

In this talk, we present a simple and efficient method for rotating a
spherical harmonic expansion. This is a well-studied problem, arising
in classical scattering theory, quantum mechanics and numerical
analysis, usually addressed through the explicit construction of the
Wigner rotation matrices. Existing fast algorithms, based on
recurrence relations, are subject to a variety of instabilities,
limiting the effectiveness of the approach for expansions of high
degree.

We show that rotation can be carried out easily and stably through "pseudospectral" projection, without ever constructing the matrix entries themselves. In the simplest version of the method, projection is carried out on the equator of the rotated sphere. If only the lowest angular modes are required, the algorithm can be further accelerated by using a sequence of constant latitude circles.

This is joint work with Leslie Greengard.

We show that rotation can be carried out easily and stably through "pseudospectral" projection, without ever constructing the matrix entries themselves. In the simplest version of the method, projection is carried out on the equator of the rotated sphere. If only the lowest angular modes are required, the algorithm can be further accelerated by using a sequence of constant latitude circles.

This is joint work with Leslie Greengard.

August 3, 2010

We have developed new versions of the FMM in three dimensions for both static and dynamic problems that are reasonably efficient, easy to implement, and fully adaptive. By combining these schemes with suitable quadrature codes, we have constructed "triangle-based" libraries for layer potentials that permit
the rapid deployment of robust integral equation solvers in complex geometry. We will describe some of
the core algorithmic ideas as well as some applications in electrodynamics and acoustics. This is joint work with Zydrunas Gimbutas.

An exact representation is presented for the field inside a sphere
(the observation sphere) due to primary sources enclosed by a second sphere
(the source sphere). The regions bounded by the two spheres have no common
points. The field of the primary sources is expressed in terms of Gaussian
beams whose branch-cut disks are centered in the source sphere. The
expansion coefficients for the standing spherical waves in the observation
sphere are expressed in terms of the output of Gaussian-beam receivers,
whose branch-cut disks are centered in the observation sphere. In this
configuration the patterns of the transmitting and receiving beams
"multiply" to produce a higher directivity than is usually seen with
Gaussian beams. This leads to a fast method for computing matrix-vector
multiplications in scattering calculations, as will be illustrated for a
Dirichlet square plate.

First I discuss low-threshold stagnation problems in the GMRES
iterative solver. I show how to alleviate them in certain situations.

Then I turn to the main topic of the talk – a method to enhance the efficiency of integral equation based schemes for elliptic PDEs on domains with corners, multi-wedge points, and mixed boundary conditions. The key ingredients are a block-diagonal inverse preconditioner 'R' and a fast recursion, 'i=1,...,n', where step 'i' inverts and compresses contributions to 'R' from the outermost quadrature panels on level 'i' of a locally 'n'-ply refined mesh. From an efficiency point of view, this method converts a non-smooth boundary into a smooth boundary and mixed boundary conditions into pure boundary conditions. The spectral properties of the system matrices in the linear systems that eventually have to be solved are essentially the same. The "corner difficulties" are gone.

The talk is available as a pdf-file: http://www.maths.lth.se/na/staff/helsing/corners.pdf

Then I turn to the main topic of the talk – a method to enhance the efficiency of integral equation based schemes for elliptic PDEs on domains with corners, multi-wedge points, and mixed boundary conditions. The key ingredients are a block-diagonal inverse preconditioner 'R' and a fast recursion, 'i=1,...,n', where step 'i' inverts and compresses contributions to 'R' from the outermost quadrature panels on level 'i' of a locally 'n'-ply refined mesh. From an efficiency point of view, this method converts a non-smooth boundary into a smooth boundary and mixed boundary conditions into pure boundary conditions. The spectral properties of the system matrices in the linear systems that eventually have to be solved are essentially the same. The "corner difficulties" are gone.

The talk is available as a pdf-file: http://www.maths.lth.se/na/staff/helsing/corners.pdf

In this talk, we discuss a numerical scheme for the accurate and efficient
solution of time dependent partial differential equations. The technique
first discretizes the temporal direction using Gaussian type nodes and
spectral integration, and applies low-order time marching schemes to
form a preconditioned elliptic system. The better conditioned system is
then solved iteratively using Jacobi-Free Newton–Krylov techniques, and
each function evaluation is simply one low-order time-stepping
approximation
of the error by solving a decoupled system using available fast elliptic
equation solvers. Preliminary numerical experiments show that this
technique
can be unconditionally stable and spectrally accurate in both temporal and
spatial directions.

December 31, 1969

We describe the implementation of EMX, a commercial full-wave 3D Electromagnetic simulator used for Integrated Circuit (IC) design. EMX uses a volume integral equation formulation and a specialized iterative solver that is accelerated using a new 3D kernel independent Fast Multipole Method. The code has been implemented to take advantage of multiple-core machines. In addition, domain specific implementation issues are discussed. For example, in advanced IC processes, the physical properties of wires vary depending on the surrounding wiring. EMX automatically modifies the drawn layout to mimic the IC fabrication process. We compare simulation results to measurements over a large variety of IC structures.

We present recent work on treecode algorithms for computing integrals
that arise in particle simulations. Our approach uses a Cartesian
Taylor series for the far-field expansion and is suitable for a
variety of kernels including multiquadric radial basis functions, the
screened Coulomb potential, and a regularized Biot-Savart kernel.
Sample results are shown.

We present recent work on treecode algorithms for computing
integrals that arise in particle simulations. Our approach uses a
Cartesian Taylor series for the far-field expansion and is suitable
for a variety of kernels including multiquadric radial basis functions
and the screened Coulomb potential. A sample of results will be
presented for vortex sheet motion in 3D fluid flow and boundary
integral simulations of the Poisson-Boltzmann equation for solvated
biomolecules.

August 2, 2010

We present an efficient integral equation approach to solve the heat equation in a two-dimensional, multiply connected domain, and with Dirichlet boundary conditions. Instead of using integral equations based on the heat kernel, we take the approach of discretizing in time, first. This leads to a non-homogeneous modified Helmholtz equation that is solved at each time step. The solution to this equation is formulated as a volume potential plus a double layer potential. The volume potential is evaluated using a fast multipole-accelerated solver. The boundary conditions are then satisfied by solving an integral equation for the homogeneous modified Helmholtz equation.
The integral equation solver is also accelerated by the fast multipole method (FMM). For a total of N points in the discretization of the boundary and the domain, the total computational cost per time step is O(N) or O(N log N).

Water diffusion magnetic resonance imaging (DMRI) is a method
which uses a combination of applied magnetic fields to measure,
statistically, the diffusion of water molecules due to Brownian motion.
Its spatial resolution is on the order of millimeters.

The cellular structure inside the human brain varies on the scale of micrometers, which is much smaller than the size of a voxel. There may be thousands of irregularly-shaped cells within a voxel, and they all contribute to the environment seen by water molecules whose displacement is measured by the MRI scanner. In the typical DMRI experiment, the time interval over which water diffusion is measured is in the range of 50-100 milliseconds. Using the diffusion coefficient of 'free' water at 37 degrees Celsius, D = 3e-9 m2/s, we get an estimated diffusion distance of 15-25 micrometers. Clearly, in a DMRI experiment, water molecules encounter numerous times inhomogeneities in the cellular environment, such as cell membranes, fibers, and macromolecules. We want to simulate the DMRI signal at the scale of a single voxel, while taking into account cellular structure and the shape and duration of the diffusion gradients.

I will describe the Bloch-Torrey partial differential equation which is a phenomenological equation that describes the time evolution of the nuclear magnetization in a sample. I will talk about the numerical solution of the Bloch-Torrey PDE using Green's functions and alternatively by finite elements discretization. Finally, the measured diffusion MRI signal is the sum of the magnetization in the sample and I give some analytical results on the aggregate signal under some simplying assumptions.

The cellular structure inside the human brain varies on the scale of micrometers, which is much smaller than the size of a voxel. There may be thousands of irregularly-shaped cells within a voxel, and they all contribute to the environment seen by water molecules whose displacement is measured by the MRI scanner. In the typical DMRI experiment, the time interval over which water diffusion is measured is in the range of 50-100 milliseconds. Using the diffusion coefficient of 'free' water at 37 degrees Celsius, D = 3e-9 m2/s, we get an estimated diffusion distance of 15-25 micrometers. Clearly, in a DMRI experiment, water molecules encounter numerous times inhomogeneities in the cellular environment, such as cell membranes, fibers, and macromolecules. We want to simulate the DMRI signal at the scale of a single voxel, while taking into account cellular structure and the shape and duration of the diffusion gradients.

I will describe the Bloch-Torrey partial differential equation which is a phenomenological equation that describes the time evolution of the nuclear magnetization in a sample. I will talk about the numerical solution of the Bloch-Torrey PDE using Green's functions and alternatively by finite elements discretization. Finally, the measured diffusion MRI signal is the sum of the magnetization in the sample and I give some analytical results on the aggregate signal under some simplying assumptions.

System-level electromagnetic design problems are multiscale and very challenging to solve. They remain a significant barrier to system design optimization for a foreseeable future. Such multiscale problems often contain three electrical scales, i.e., the fine scale (geometrical feature size much smaller than a wavelength), the coarse scale (geometrical feature size greater than a wavelength), and the intermediate scale between the two extremes. Existing commercial tools are based on single methodologies (such as finite element method or finite-difference time-domain method), and are unable to solve large multiscale problems. For example, no commercial software package is known to be able to model a simple package (say at 0.1 mm resolution) inside a reverberation chamber (1 m) for RF interference testing.

We will present our recent work in solving realistic multiscale system-level EM design simulation problems in both frequency domain and time domain. In frequency domain, we combine the spectral element method for coarse scales and finite element method for fine scales with a fast spectral integral method. In time domain, we hybridize the spectral element time-domain method and finite-element time-domain method together with the perfectly matched layer, with explicit and implicit time integration schemes for different subdomains. The discontinuous Galerkin method is used as the subdomain interface condition. In time domain, we further incorporate a nonlinear circuit solver, making it possible to perform nonlinear circuit simulation with RF interactions in a seamless manner. Several previously intractable multiscale problems will be illustrated.

We will present our recent work in solving realistic multiscale system-level EM design simulation problems in both frequency domain and time domain. In frequency domain, we combine the spectral element method for coarse scales and finite element method for fine scales with a fast spectral integral method. In time domain, we hybridize the spectral element time-domain method and finite-element time-domain method together with the perfectly matched layer, with explicit and implicit time integration schemes for different subdomains. The discontinuous Galerkin method is used as the subdomain interface condition. In time domain, we further incorporate a nonlinear circuit solver, making it possible to perform nonlinear circuit simulation with RF interactions in a seamless manner. Several previously intractable multiscale problems will be illustrated.

December 31, 1969

Poisson-Boltzmann (PB) electrostatics is a well established model in biophysics, however its application to the study of large scale biosystem dynamics such as the protein-protein encounter is still limited by the efficiency and memory constraints of existing numerical techniques. In this poster, we present an efficient and accurate scheme which incorporates recently developed novel numerical techniques to further enhance the present computational ability. In particular, a boundary integral equation approach is applied to discretize the linearized PB (LPB) equation; the resulting integral formulas are well conditioned and are extended to systems with arbitrary number of biomolecules; a robust meshing method is developed for molecular surface meshing; the solution process is accelerated by the Krylov subspace methods and an adaptive new version of fast multipole method (FMM). The Adaptive Fast Multipole Poisson-Boltzmann (AFMPB) solver is released under open source license agreement, and the meshing tool TMSmesh will also be released.

The talk will describe recently developed fast solvers for the
linear systems arising upon the discretization of elliptic
PDEs. While most existing fast methods tend to be based on
iterative solvers such as GMRES, the new techniques directly
construct an approximate inverse (or LU factorization) of the
coefficient matrix. This makes the techniques robust and
particularly fast for problems involving multiple right hand
sides. Such "fast direct solvers" have been developed both for
the sparse (and often very large) matrices that arise upon
finite element discretizations of elliptic PDEs, and for the
dense matrices arising upon discretization of the associated
integral equations.

In this talk we will present accurate two and three dimensional methods for
interpolating singular or smoothed force fields. The methods are meant to be
used in particle mesh or particle-particle particle-mesh calculations so that
the resulting schemes conserve momentum. The interpolation weights use a
discretization of the differential equation the interaction potential satisfies,
and the methods are most accurate when the force satisfies a homogeneous
elliptic differential equation or system of equations away from the sources.
We will show why the precise accuracy of the interpolation formulas depends
on the accuracy of certain corresponding quadrature formulas, and
give results of numerical experiments.

Wave problems in periodic domains have many interesting applications
such as photonic crystals and metamaterials in Maxwell's equations and
phononic crystals in elastodynamics, etc. Fast multipole methods are
considered to be effective as solvers of such problems, particularly
when the problems under consideration are of scattering nature. In
view of this, we have developed periodic FMMs for various wave
problems. We are now interested in further enhancing the performance
of the periodic FMMs with the help of effective preconditioners. In
this presentation we shall discuss our recent efforts on the use of
preconditioners based on Calderon's formulae in periodic transmission
problems in Helmholtz' equation and elastodynamics. In Helmholtz, we
shall show that the matrix of the discretised integral equations
itself serves as an effective preconditioner. This fact leads to a
very simple preconditioning scheme as one uses GMRES as the solver. We
shall also see that a similar approach is possible in elastodynamics,
but either with some restrictions on the material constants or with
more complicated formulations. We shall present a number of numerical
examples to test the performance of the proposed approaches.

In plasma physics, the Boltzmann equation, which describes the evolution of the plasma over time, has a nonlinear term representing the collisions of various species of the plasma. Current plasma edge simulations do not take this collision effect into account, because of the difficulties in the accurate evaluation of this term. Using the Rosenbluth potential formalism, the collision operator can be written in terms of solutions of a Poisson and a biharmonic free space PDE. Due to the inherent axisymmetry of the input data, cylindrical coordinate solvers are preferred for efficient computation. Standard numerical techniques (based typically on finite differences and finite element approximations) encounter difficulties in achieving high order accuracy, especially in the computation of derivatives of the solution (required in the collision operator formulation), and in imposing radiation conditions at infinity. Our new solver achieves arbitrary order accuracy in
cylindrical coordinates based on a combination of separation of variables, Fourier analysis and the careful solution of the resulting radial ODE. A weak singularity arises in the the continuous Fourier transform of the solution that can be handled effectively with special purpose quadrature rules and spectral accuracy can be achieved in derivatives without loss of precision.

This is joint work with Leslie Greengard.

This is joint work with Leslie Greengard.

We present preliminary results of parallelizing the fast multipole
method (FMM) for multicore processors using POSIX threads.

Short-range interactions are straight forward to parallelize. We invoke multiple threads per compute core to alleviate partition and load imbalances. For the calculation of long-range interactions, we assign the multipole subtrees below a certain level to compute threads with affinity settings that conform to the interaction lists of the tree nodes and exploit the memory hierarchy.

On a Sun SunFire X4600 with 8 AMD Opteron 885 processors (16 cores) running at 2.6 GHz clock rate and 64 GB of memory, we observe a better than 15x speedup compared to the sequential version of FMM-Yukawa for two sample benchmark problems of 10 to 100 million charges uniformly distributed inside a unit box or on the surface of a unit sphere and require six-digit accuracy.

Short-range interactions are straight forward to parallelize. We invoke multiple threads per compute core to alleviate partition and load imbalances. For the calculation of long-range interactions, we assign the multipole subtrees below a certain level to compute threads with affinity settings that conform to the interaction lists of the tree nodes and exploit the memory hierarchy.

On a Sun SunFire X4600 with 8 AMD Opteron 885 processors (16 cores) running at 2.6 GHz clock rate and 64 GB of memory, we observe a better than 15x speedup compared to the sequential version of FMM-Yukawa for two sample benchmark problems of 10 to 100 million charges uniformly distributed inside a unit box or on the surface of a unit sphere and require six-digit accuracy.

In August of 2008, Candes, Demanet, and Ying introduced a fast
algorithm for applying Fourier integral operators of the form
∫ e^{iΦ(x,k)}f(k)dk, where Φ(x,k) is the so-called
phase function and is required to be real-valued and linear in
the second argument. Using a resolution of 1/N in each
dimension, the transform of arbitrary sources in the unit
d-dimensional cube of the frequency domain may be naively
evaluated over the unit cube of the spatial domain with
computational complexity O(N^{2d}). The algorithm of Candes
et al. yields the near-optimal complexity
O(N^{d}log_{2}(N)).

The contribution of the author is a new method for parallelizing the above fast algorithm on distributed-memory machines. The method assumes only a power-of-two number of processes and has been shown to strong scale up to thousands of cores of Blue Gene/P with 90% efficiency even for extremely small problems. This is achieved by carefully peeling off partitions of the frequency domain and applying them to the spatial domain as the butterfly algorithm progresses. The result is that, using 2^{P} processes in d dimensions, the
only communication required by each process is P
reduce-scatter summations over a team of 2^{d} processes.
Results are presented for a generalized Radon transform in
two-dimensions, though the implementation has been shown to
work in arbitrary dimensions. The source code is available at
http://code.google.com/p/butterflyfio.

The contribution of the author is a new method for parallelizing the above fast algorithm on distributed-memory machines. The method assumes only a power-of-two number of processes and has been shown to strong scale up to thousands of cores of Blue Gene/P with 90% efficiency even for extremely small problems. This is achieved by carefully peeling off partitions of the frequency domain and applying them to the spatial domain as the butterfly algorithm progresses. The result is that, using 2

Randomized algorithms are ubiquitous in computer science and
computer
engineering. Many problems that are intractable when viewed
deterministically can be effectively solved with probabilistic techniques.
Perhaps the most
important aspect of most randomized procedures in current use
is the fact
that they produce the correct result with (practically
speaking) 100% reliability, and with (essentially) machine precision.

Historically, randomized techniques have been less popular in numerical analysis. Most of them trade accuracy for speed, and in many numerical environments one does not want to add yet another source of inaccuracy to the calculation that is already sufficiently inaccurate. One could say that in numerical analysis probabilistic methods are an approach of last resort.

I will discuss several probabilistic algorithms of numerical linear algebra that are never less accurate than their deterministic counterparts, and in fact tend to produce better accuracy. In many situations, the new schemes have lower CPU time requirements than existing methods, both asymptotically and in terms of actual timings. I will illustrate the approach with several numerical examples, and discuss possible extensions.

Historically, randomized techniques have been less popular in numerical analysis. Most of them trade accuracy for speed, and in many numerical environments one does not want to add yet another source of inaccuracy to the calculation that is already sufficiently inaccurate. One could say that in numerical analysis probabilistic methods are an approach of last resort.

I will discuss several probabilistic algorithms of numerical linear algebra that are never less accurate than their deterministic counterparts, and in fact tend to produce better accuracy. In many situations, the new schemes have lower CPU time requirements than existing methods, both asymptotically and in terms of actual timings. I will illustrate the approach with several numerical examples, and discuss possible extensions.

We present "locally-corrected spectral boundary integral
methods" for accurate discretization and fast solution
of linear elliptic systems of PDEs. Arbitrary elliptic
systems are transformed to overdetermined first-order form,
and a boundary integral equation is derived. Ewald
summation separates the boundary integral equation into a
low-rank system with regular spectral structure, followed
by a simple local correction formula. Geometric nonuniform
fast Fourier transforms produce accurate Fourier coefficients
of piecewise-polynomial data on a d-dimensional simplicial
tessellation in R^{D}, for arbitrary dimensions d and D.

We introduce numerical study on the discrete counterpart of Gauss'
theorem. The purpose is to seek and establish a third approach,
beside the analytical and the kernel-independent approaches,
for efficient dimension reduction and preconditioning of equations
initially in differential form. Integration is done locally,
or globally, using analytical/symbolic rules as
well as numerical rules and utilizing geometric information.

December 31, 1969

Joint work with Robert J. Harrison and George Fann.

We report the first all electron adaptive refinement multiresolution band structure solver for crystalline systems. Most current software implementations of density functional theory for crystalline systems either use plane waves or a split representation such as linear augmented plane waves (LAPW) or linear muffin tin orbitals (LMTO) for their basis set. MADNESS, on the other hand, uses a fully numerical basis set which resides on an adaptive mesh. This choice has introduced numerous challenges in the implementation of a solid state band structure code. Instead of explicitly diagonalizing the one-particle Hamiltonian to update the single particle orbitals, we reformulate the problem into a bound state Helmholtz integral equation. Another challenge has been implementing the bound state Helmholtz Green's function and the Coulomb Green's function to include a lattice sum over the periodic images in the unit cell. To ensure timely convergence, we wrap the main Kohn-Sham loop inside a Krylov accelerated inexact Newton solver. We present all of these items and some preliminary results in our poster.

We report the first all electron adaptive refinement multiresolution band structure solver for crystalline systems. Most current software implementations of density functional theory for crystalline systems either use plane waves or a split representation such as linear augmented plane waves (LAPW) or linear muffin tin orbitals (LMTO) for their basis set. MADNESS, on the other hand, uses a fully numerical basis set which resides on an adaptive mesh. This choice has introduced numerous challenges in the implementation of a solid state band structure code. Instead of explicitly diagonalizing the one-particle Hamiltonian to update the single particle orbitals, we reformulate the problem into a bound state Helmholtz integral equation. Another challenge has been implementing the bound state Helmholtz Green's function and the Coulomb Green's function to include a lattice sum over the periodic images in the unit cell. To ensure timely convergence, we wrap the main Kohn-Sham loop inside a Krylov accelerated inexact Newton solver. We present all of these items and some preliminary results in our poster.

A spectrally accurate method for the fast evaluation of N-particle
sums of the periodic Stokeslet is presented. Such sums occur in
boundary integral- and potential methods for viscous flow
problems. Two different decomposition methods, leading to one sum in
real space and one in reciprocal space, are considered. An FFT
based method is applied to the reciprocal part of the sum, using
convolutions with a Gaussian function to place the point sources on
a grid. Due to the spectral accuracy of the method, the grid size
needed is low and also in practice, for a fixed domain size,
independent of N. The leading cost therefore arise from the
to-grid and from-grid operations, that are linear in N. Combining
this FFT based method for the reciprocal sum with the direct
evaluation of the real space sum, a spectrally accurate algorithm
with a total complexity of O(N log N) is obtained.

Joint work with H. Reid, L. Zhang, C. Coelho, Z. Zhu, T. Klemas, and S. Johnson.

Despite decades of zealous effort, there are remarkably few applications where fast integral equation solvers dominate. To help explain why, we will describe effective strategies, assess performance, and present remaining challenges associated with applying fast solvers to problems in interconnect model extraction, biomolecular electrostatics, bodies in microflows, nanophotonics, and Casimir force calculation.

Despite decades of zealous effort, there are remarkably few applications where fast integral equation solvers dominate. To help explain why, we will describe effective strategies, assess performance, and present remaining challenges associated with applying fast solvers to problems in interconnect model extraction, biomolecular electrostatics, bodies in microflows, nanophotonics, and Casimir force calculation.

Numerical solution of the Helmholtz equation in the high
frequency regime is a challenging computational problem because of the
indefiniteness of the operator and the size of the discrete system
(due to the high frequency nature of the problem). In this talk, we
will discuss some recent progress on the numerical solution of the
Helmholtz equation in two and three dimensions.

Vesicles are locally-inextensible fluid membranes that can
sustain bending. We consider the dynamics of flows of
vesicles suspended in Stokesian fluids. We use a boundary
integral formulation for the fluid that results in a set of
nonlinear integro-differential equations for the vesicle dynamics.
The motion of the vesicles is determined by balancing
the nonlocal hydrodynamic forces with the elastic
forces due to bending and tension. Numerical simulations
of such vesicle motions are quite challenging. On one hand,
explicit time-stepping schemes suffer from a severe stability
constraint due to the stiffness related to high-order spatial
derivatives and a milder constraint due to a transport-like
stability condition. On the other hand, an implicit scheme
can be expensive because it requires the solution of a set
of nonlinear equations at each time step. We present a set
of numerical techniques for efficient simulation of vesicle
flows. The distinctive features of these numerical methods
include (1) using boundary integral method accelerated with the
fast multipole method (2) spectral (spherical harmonic) discretization
of deforming surfaces in space (3) an algorithm for
surface reparameterization ensuring stability of the
time-stepping scheme and spectral filtering accuracy while minimizing
computational costs.
By introducing these algorithmic components, we obtain a
time-stepping scheme that experimentally is unconditionally
stable and has a low cost per time step. We present numerical results
to analyze the cost and convergence rates of the scheme.