
Florida State University Libraries

Electronic Theses, Treatises and Dissertations The Graduate School

2011

Parallel Grid Generation and Multi-Resolution Methods for Climate Modeling Applications

Douglas W. (Douglas William) Jacobsen


THE FLORIDA STATE UNIVERSITY

COLLEGE OF ARTS AND SCIENCES

PARALLEL GRID GENERATION AND MULTI-RESOLUTION METHODS FOR

CLIMATE MODELING APPLICATIONS

By

DOUGLAS W. JACOBSEN

A Dissertation submitted to the Department of Scientific Computing

in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

Degree Awarded: Summer Semester, 2011

The members of the committee approve the dissertation of Douglas W. Jacobsen

defended on June 14th, 2011.

Max Gunzburger, Professor Directing Thesis

Doron Nof, University Representative

Janet Peterson, Committee Member

Gordon Erlebacher, Committee Member

Michael Navon, Committee Member

John Burkardt, Committee Member

Todd Ringler, Committee Member

Approved:

Max Gunzburger, Chair, Department of Scientific Computing

Joseph Travis, Dean, College of Arts and Sciences

The Graduate School has verified and approved the above-named committee members.


I would like to dedicate this dissertation to my loving and supportive wife who helped me significantly through all of my school work. Also, I would like to thank my parents and brothers for their continued support.


ACKNOWLEDGMENTS

I would like to thank Dan Voss, Geoff Womeldorff, Mark Peterson, Michael Duda, and Phil Jones for many useful discussions. The work contained in this dissertation was supported by the US Department of Energy under grant numbers DE-SC0002624 and DE-FG02-07ER64432.


TABLE OF CONTENTS

List of Tables

List of Figures

Abstract

1 Introduction

1.1 Personal Contributions

2 Parallel SCVT Generator Background

2.1 Delaunay Triangulations

2.2 Voronoi Tessellations

2.3 Stereographic Projections

2.4 Parallel Algorithm Details

2.4.1 Convergence Criteria

2.4.2 Initial Conditions

3 Parallel SCVT Generator Results

3.1 Quasi-Uniform Results

3.2 Variable Resolution Results

3.3 Grid Generator Performance

4 Numerical Model Background

4.1 Shallow-Water Equations and Numerical Method

4.2 Shallow-Water Test Cases

4.2.1 Non-linear Geostrophic Flow (TC2)

4.2.2 Zonal Flow Over an Isolated Mountain (TC5)

4.2.3 Barotropic Instability (BTI)

5 Numerical Model Results

5.1 Shallow-Water Model Setup

5.2 Shallow Water Test Case Results

5.2.1 Shallow Water Test Case 5

5.2.2 Shallow Water Test Case 2

5.2.3 Barotropic Instability Test Case

6 Adaptive Mesh Refinement Background

6.1 AMR Background

6.2 SCVT-AMR Framework

7 Adaptive Mesh Refinement Results

7.1 642 Point Suite

7.2 2562 Point Suite

8 Discussion

8.1 Future Work

Bibliography

Biographical Sketch


LIST OF TABLES

3.1 Timing results for MPI-SCVT with bisection and Monte Carlo initial conditions and the speedup of bisection relative to Monte Carlo initial conditions

3.2 Comparison of STRIPACK with serial and parallel versions of MPI-SCVT using final triangulations

3.3 Comparison of STRIPACK with serial and parallel versions of MPI-SCVT using per iteration triangulations

3.4 Timings based on the domain decomposition used. Uniform uses a coarse quasi-uniform SCVT to define region centers and their associated radii, and sorts using a simple dot product. x16 uses a coarse x16 SCVT to define region centers and their associated radii, and sorts using a simple dot product. Voronoi uses a coarse x16 SCVT to define region centers and their associated radii, and sorts using a Voronoi cell based sort.

5.1 Table of grid sizes and spacings for quasi-uniform grids used in shallow-water exploration

5.2 Minimum values and grid spacing factors

5.3 Approximate mesh resolutions (km) of the fine-mesh ($dx_f$) and coarse-mesh ($dx_c$) regions of the global domain for the x1 through x16 meshes as a function of the number of grid points.

7.1 Error norms associated with the suite of AMR meshes based on the 642 grid point reference mesh. Presented are $L_2$ and $L_\infty$ norms of the error in the thickness field, compared to a T511 reference simulation

7.2 Error norms for AMR grids based on 2562 grid point reference mesh. $L_2$ and $L_\infty$ norms are computed with the thickness field relative to a T511 simulation.


LIST OF FIGURES

2.1 Cross-sectional illustration of a stereographic projection from a sphere into a tangent plane.

2.2 Domain Decomposition Example. Figure 2.2(a) is an SCVT used for a 12 processor domain decomposition, where Figure 2.2(b) is a 10242 generator Delaunay triangulation computed using the 12 generator SCVT for parallelization. Each colored ring represents a region’s radius $R_k$, where region centers $T_k$ are the Voronoi cell centers, at the center of each pentagonal structure in Figure 2.2(a).

2.3 Triangulations in a plane after stereographic projection. 2.3(a) is the triangulation before (2.8) is applied, and 2.3(b) is after it is applied

2.4 Triangle division used for integrating Voronoi cells using only the Delaunay triangulation without any adjacency information. Kite sections contribute to the Voronoi cell centered at the vertex that is part of the kite. A, B, C vertices are generators in the point set, where the point at the center of the triangle is the circumcenter of this triangle. Triangular regions that are colored similarly contribute to the same vertex.

3.1 Timings for a STRIPACK based SCVT Generator at 162, 642, 10242, 40962 and 163842 generators. The red solid line represents the time spent in STRIPACK computing a triangulation, where the green dashed line represents the time spent integrating the Voronoi cells outside of STRIPACK in one iteration of Lloyd’s algorithm. Timings in this figure were computed using an Intel Core 2 Duo T8100 CPU with 3GB of RAM.


3.2 Timings for various portions of MPI-SCVT using 2 processors and 2 regions. As the problem size increases the slopes of both the triangulation (Red-Solid) and the integration (Green-Dashed) remain constant. The triangulation does not become more expensive than the integration until after roughly 163842 generators, as compared to Figure 3.1 where triangulation was more expensive after only 2562 generators. Also, a triangulation using 2621442 generators costs roughly the same using MPI-SCVT and 2 processors as a triangulation using 163842 generators in STRIPACK.

3.3 Timing results from MPI-SCVT vs. number of processors. Constant problem size, shown as parallelization is increased. Red solid lines represent the cost of computing a triangulation, where green dashed lines represent the cost of integrating all Voronoi cells, and blue dotted lines represent the cost of communicating each region’s updated point set to its neighbors.

3.4 Density function that creates a grid with resolutions that differ by a factor of 16 between the coarse and the fine region. The maximum value of the density function is 1, where the minimum value is $(1/16)^4$.

3.5 Figures show a variable resolution grid created using a density function with the format defined in (3.1). All three figures are of the same grid, only the viewing perspective is changed. Figure 3.5(a) shows the coarse region of the grid, 3.5(b) shows the transition region of the grid, and 3.5(c) shows the fine region of the grid.

3.6 Number of points each processor has to triangulate. 3.6(a) uses a quasi-uniform SCVT for its decomposition, with a simple dot product. 3.6(b) uses a x16 SCVT for its decomposition, with a simple dot product. 3.6(c) uses a x16 SCVT for its decomposition, with a more complicated sort based on the region’s Voronoi diagram.

3.7 Scalability results based on number of generators. Green is a linear reference where Red is the Speedup computed using parallel version of MPI-SCVT against a serial version

4.1 Four members of a family of meshes constructed from (3.1). Each mesh uses 2562 grid points, and the meshes differ only in the setting of the parameter γ. x1, x2, x4 and x16 shown in the top-left, top-right, bottom-left and bottom-right, respectively.


4.2 C-grid staggering of variables for the finite-volume scheme used in MPAS. Fluid thickness, topography, and kinetic energy are stored at Voronoi cell centers. The normal component of the velocity field is defined at the mid-point of line segments connecting cell centers. Vorticity related fields such as relative, absolute, and potential vorticity are stored at Voronoi cell vertices. Derived fields, $h_e$, $q_e$, and $F_e^\perp$ must be reconstructed at each velocity point.

5.1 The fluid height, $h_i + b_i$, at day 15 for TC5. Starting at the upper left and moving clockwise shows results from the X1, X2, X16 and X4 meshes using 40962 cells. The black oval denotes the location of the mountain. The figures are generated by filling each Voronoi cell with a single color, i.e. there is no interpolation due to rendering. This allows the coarse-mesh grid cells to be seen in the X4 and X16 simulations. All results are plotted with an identical color scheme with a maximum of 5975 m and a minimum of 5025 m.

5.2 $\log_{10}$ of the relative change in available total energy for TC5 as a function of time for the x1, x2, x4, x8 and x16 meshes with 40962 grid points.

5.3 Globally averaged potential enstrophy as a function of time for x1, x2, x4, x8, and x16 meshes with 40962 grid points. Simulations are run for 15 days. Figures show decreasing potential enstrophy for x1 and x2 meshes, and increasing potential enstrophy for x4, x8, and x16 meshes.

5.4 $\log_{10}$ of the relative change in available potential enstrophy for TC5 as a function of time for the x1, x2, x4, x8 and x16 meshes with 40962 grid points.

5.5 The $L_2$ error of the thickness field at day 15 for TC5 shown for the x1, x2, x4, x8 and x16 meshes. Figure 5.5(a) shows errors as a function of number of generators, and Figure 5.5(b) shows errors as a function of coarse-mesh grid spacing. Error norms are computed against a T511 reference solution.

5.6 The $L_2$ error of the thickness field at day 12 for TC2 for the x1, x2, x4, x8 and x16 meshes. Figure 5.6(a) shows errors as a function of number of generators, and Figure 5.6(b) shows errors as a function of coarse-mesh grid spacing. Error norms are computed against the analytic initial conditions.


5.7 Each panel depicts the relative vorticity field at day 6 for a barotropically-unstable jet using 655362 cells. The panels differ only in the mesh used in the simulation. The vertical extent of each panel covers the northern hemisphere. The horizontal extent covers all longitudes starting at -90 degrees such that the fine-mesh region is approximately centered on each panel. The color scales are identical for every panel and saturate at $\pm 1.0 \times 10^{-4}$.

6.1 Density field obtained after one simulation day using the relative vorticity field from shallow-water test case 5, on a x1 2562 generator grid, corresponding to the first four steps in Algorithm 3. Figure 6.1(a) has no smoothings applied, Figure 6.1(b) has 16 smoothings applied, Figure 6.1(c) has 64 smoothings applied, and Figure 6.1(d) has 128 smoothings applied. The smoothing operator is defined in (6.2). Red represents the minimum, where blue represents the maximum. To show transitions color represents $\log_2(\rho^{1/4})$

6.2 Three triangles with subdivision based on density values. Figure 6.2(a) shows a triangle whose density value is $1^4$ providing no divisions. Figure 6.2(b) shows a triangle whose density value is $2^4$ providing one division. Figure 6.2(c) shows a triangle whose density value is $4^4$ providing two divisions

7.1 AMR grids based on a 642 grid cell quasi-uniform grid. Color represents cell area, where Red is the minimum area and Purple is the maximum area. Presented are grids with 0, 16, 64, and 128 iterations of Laplacian smoothing applied.

7.2 Reference data fields for 642 quasi-uniform mesh. Shallow-water test case 5 was simulated for 1 day, plotted in Figure 7.2(a) is the fluid thickness field, Figure 7.2(b) is the potential vorticity field, and Figure 7.2(c) is the relative vorticity field.

7.3 Thickness fields from the 642 suite of AMR meshes. Figure 7.3(a) shows the thickness field from an unsmoothed AMR mesh. Figure 7.3(b) shows the thickness field from a mesh with 16 smoothings. Figure 7.3(c) shows the thickness field from a mesh with 64 smoothings. Figure 7.3(d) shows the thickness field from a mesh with 128 smoothings.

7.4 Potential vorticity fields from the 642 suite of AMR meshes. Figure 7.4(a) shows the potential vorticity field from an unsmoothed AMR mesh. Figure 7.4(b) shows the potential vorticity field from a mesh with 16 smoothings. Figure 7.4(c) shows the potential vorticity field from a mesh with 64 smoothings. Figure 7.4(d) shows the potential vorticity field from a mesh with 128 smoothings.


7.5 Relative vorticity fields from the 642 suite of AMR meshes. Figure 7.5(a) shows the relative vorticity field from an unsmoothed AMR mesh. Figure 7.5(b) shows the relative vorticity field from a mesh with 16 smoothings. Figure 7.5(c) shows the relative vorticity field from a mesh with 64 smoothings. Figure 7.5(d) shows the relative vorticity field from a mesh with 128 smoothings.

7.6 AMR grids based on a 2562 grid cell quasi-uniform grid. Color represents cell area, where Red is the minimum area and Purple is the maximum area. Presented are grids with 0, 16, 64, and 128 iterations of Laplacian smoothing applied.

7.7 Reference data fields for 2562 quasi-uniform mesh. Shallow-water test case 5 was simulated for 1 day, plotted in Figure 7.7(a) is the fluid thickness field, Figure 7.7(b) is the potential vorticity field, and Figure 7.7(c) is the relative vorticity field.

7.8 Thickness fields from the 2562 suite of AMR meshes. Figure 7.8(a) shows the thickness field from an unsmoothed AMR mesh. Figure 7.8(b) shows the thickness field from a mesh with 16 smoothings. Figure 7.8(c) shows the thickness field from a mesh with 64 smoothings. Figure 7.8(d) shows the thickness field from a mesh with 128 smoothings.

7.9 Potential vorticity fields from the 2562 suite of AMR meshes. Figure 7.9(a) shows the potential vorticity field from an unsmoothed AMR mesh. Figure 7.9(b) shows the potential vorticity field from a mesh with 16 smoothings. Figure 7.9(c) shows the potential vorticity field from a mesh with 64 smoothings. Figure 7.9(d) shows the potential vorticity field from a mesh with 128 smoothings.

7.10 Relative vorticity fields from the 2562 suite of AMR meshes. Figure 7.10(a) shows the relative vorticity field from an unsmoothed AMR mesh. Figure 7.10(b) shows the relative vorticity field from a mesh with 16 smoothings. Figure 7.10(c) shows the relative vorticity field from a mesh with 64 smoothings. Figure 7.10(d) shows the relative vorticity field from a mesh with 128 smoothings.


ABSTRACT

Spherical centroidal Voronoi tessellations (SCVTs) are used in many applications in a variety of fields, one being climate modeling. They are a natural choice for spatial discretizations on the surface of the Earth. New modeling techniques have recently been developed that allow the simulation of ocean and atmosphere dynamics on arbitrarily unstructured meshes, including SCVTs. Creating ultra-high-resolution SCVTs can be computationally expensive. A newly developed algorithm couples current algorithms for the generation of SCVTs with existing computational geometry techniques to provide the parallel computation of SCVTs and spherical Delaunay triangulations. Using this new algorithm, computing spherical Delaunay triangulations shows a speedup on the order of 4000 over other well-known algorithms when using 42 processors.

As mentioned previously, newly developed numerical models allow the simulation of ocean and atmosphere systems on arbitrary Voronoi meshes, providing a multi-resolution modeling framework. A multi-resolution grid allows modelers to provide areas of interest with higher resolution in the hope of increasing accuracy. However, one method of providing higher resolution lowers the resolution in other areas of the mesh, which could potentially increase error. To determine the effect of multi-resolution meshes on numerical simulations in the shallow-water context, a standard set of shallow-water test cases is explored using the Model for Prediction Across Scales (MPAS), a new modeling framework jointly developed by the Los Alamos National Laboratory and the National Center for Atmospheric Research.

An alternative approach to multi-resolution modeling is Adaptive Mesh Refinement (AMR). AMR typically uses information about the simulation to determine optimal locations for degrees of freedom; however, standard AMR techniques are not well suited for SCVT meshes. In an effort to solve this issue, a framework is developed to allow AMR simulations on SCVT meshes within MPAS.

The research contained in this dissertation ties together a newly developed parallel SCVT generator with a numerical method for use on arbitrary Voronoi meshes. Simulations are performed within the shallow-water context. New algorithms and frameworks are described and benchmarked.


CHAPTER 1

INTRODUCTION

Modeling the Earth’s climate has been considered a grand-challenge problem due to the broad range of spatial and temporal scales required for robust simulation of its subcomponents. For example, the climate of the ocean is controlled by both basin scales of motion, $O(10^4)$ km, and sub-mesoscale processes with $O(10^{-1})$ km scales [2]. These scales are highly interacting, as is typical of nonlinear systems, in that the $O(10^4)$ km global scales modify and are modified by the $O(10^{-1})$ km local scales. Because of this strong inter-scale dependence, robust simulation of the climate requires an accurate representation of the smallest scales. This broad scale interaction is present in both the atmosphere and the ocean, which makes it difficult to accurately simulate the full climate system.

One major deficiency in climate modeling today is the representation of small-scale processes. These processes are typically handled in one of two ways: parameterization or direct simulation. Direct simulation is computationally expensive because it requires a spatial resolution high enough to resolve even the smallest-scale processes. Currently, the computational resources available are not sufficient to directly simulate all scales associated with the fundamental processes in the atmosphere and ocean, such as clouds and ocean eddies [22]. As an alternative to direct simulation, many models use parameterizations of processes. However, parameterizing a process can be extremely difficult because it requires a priori knowledge of the cross-scale interaction of the process. This requires developers to have a greater understanding of the underlying physics than those trying to perform direct simulations of the same process. Although parameterizations are indispensable tools, the underlying difficulty in developing accurate parameterizations leads climate modelers to increase model resolution, thereby allowing more direct simulation of small-scale processes.

A novel technique in climate modeling is explored as part of this dissertation. This new technique, referred to as a multi-resolution method, is complementary to three existing branches of research that are active in the climate modeling community today. The first is global ultra-high-resolution climate system modeling [18], which attempts to pair ultra-high-resolution climate system models with state-of-the-art high-performance computing systems to achieve simulations at unprecedented resolution. However, this approach has a disadvantage in that reducing horizontal grid spacing by a factor of two typically requires a factor of $2^3$ increase in computing resources, where longitude, latitude, and time each account for a factor of 2 individually. This estimate ignores any extra expense from increases in vertical resolution. Given this significant increase in computational expense, global ultra-high-resolution simulations can represent only a small portion of all the simulations performed.
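The cost argument above can be made concrete with a short calculation. The following is a minimal sketch, assuming (as the text states) that each of the two horizontal directions and the time step contributes one factor of the refinement ratio; the function name `relative_cost` is hypothetical and vertical resolution is ignored.

```python
def relative_cost(refinement_factor, horizontal_dims=2):
    """Cost multiplier when horizontal grid spacing shrinks by refinement_factor.

    Assumption: each horizontal dimension and the time step each contribute
    one factor of refinement_factor; vertical resolution is held fixed.
    """
    return refinement_factor ** (horizontal_dims + 1)


# Halving the grid spacing repeatedly compounds the cost quickly.
for factor in (2, 4, 8):
    print(f"spacing reduced {factor}x -> cost increased {relative_cost(factor)}x")
```

Under these assumptions, three successive halvings of the grid spacing already raise the cost by a factor of 512, which illustrates why only a small share of simulations can be run at the finest global resolution.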

The second approach, intended to circumvent the cost of global high-resolution climate modeling, is called limited-area climate modeling. Limited-area climate modeling has been explored over the last two decades [12, 19, 34]. Typically this approach uses a high-resolution mesh only over an area of interest, thus spanning only a portion of the sphere. Utilizing a limited-area mesh reduces the computational requirements significantly; however, one-way, non-interactive lateral boundary conditions are required. Typically these lateral boundary conditions are obtained from either reanalysis data or coarse-resolution global climate simulations.


The third approach currently being explored is referred to as multi-scale modeling. Multi-scale modeling couples models at different scales to create a full simulation. Multi-scale modeling has previously been investigated for atmospheric modeling [13], and a preliminary exploration of the method for ocean modeling is in progress [4]. Multi-scale methods are built under the assumption that a scale separation exists that can be exploited in modeling the physical system, meaning that the fine-scale and coarse-scale processes act on temporal and spatial scales sufficiently far from each other. However, this assumption remains unvalidated.

As mentioned previously, the work contained in this dissertation and in [25] attempts to address some of the existing computational challenges in modeling the climate system, and is intended to become a fourth approach. This new method is informally referred to as a multi-resolution approach, and essentially merges traditional global climate modeling approaches with regional limited-area approaches. A global modeling framework is maintained in multi-resolution simulations in the sense that the entire spatial extent of the atmosphere and/or ocean is simulated within a single model; however, arbitrary regions of local mesh refinement are allowed, similar to limited-area or multi-scale methods. A global, conforming mesh is employed, similar to the stretched-grid or conformal mapping approaches previously explored [9, 10]. Stretched-grid approaches require a deformation of the mesh through a continuous mapping, so that an increase of resolution in one region requires a decrease of resolution in other regions. Also, stretched-grid approaches are limited in their ability to place enhanced resolution in multiple regions. The multi-resolution approach developed in [25] and explored as part of this dissertation alleviates several of the disadvantages of stretched-grid methods. However, as with stretched-grid approaches, scale-aware parameterizations need to be developed for use with multi-resolution methods.

Multi-resolution approaches allow one or more regions with significantly higher grid resolution than the remainder of the mesh, as can be seen in Figures 3.5 and 4.1. These meshes can be used to directly simulate processes in high-resolution areas, while parameterizing those same processes in low-resolution regions, similar to multi-scale methods. Following the motivation and requirements in [25], this multi-resolution method requires two key components: first, a finite-volume method capable of maintaining conservation properties when implemented on highly non-uniform grids, and second, a conforming variable-resolution mesh with exceptional mesh-quality characteristics.

Before describing the spatial meshes that are used, the finite-volume scheme capable of conservative simulations on highly varying meshes is introduced. As described in [24, 31], a new finite-volume method has been developed which allows the use of Voronoi meshes to produce robust simulations of rotationally-dominated geophysical flows. Robust finite-volume techniques used in global atmosphere and ocean models often showcase their ability to constrain the spurious growth of nonlinear quantities, such as potential enstrophy and total energy [1]. Constraining this growth is particularly difficult on non-uniform meshes. Combining the recent works of [24, 31] provides a finite-volume approach that allows for the conservation of nonlinear quantities, even when the underlying mesh is highly variable.

Although results presented in [24, 31] only showcase quasi-uniform meshes, the numerical method described allows the use of arbitrary Voronoi meshes. As part of this dissertation, the numerical method’s ability to simulate on highly varying meshes is explored. Jointly developed by the Los Alamos National Laboratory (LANL) and the National Center for Atmospheric Research (NCAR), the Model for Prediction Across Scales (MPAS) provides a framework suitable for the rapid prototyping and development of dynamical cores. LANL has developed a shallow-water and a full three-dimensional ocean dynamical core for use in MPAS, while NCAR has developed an atmospheric model. MPAS implements the numerical method described in [24, 31], allowing simulation on arbitrary Voronoi meshes, and will be used for the exploration in this dissertation. In order to explore the model’s ability to simulate on variable-resolution meshes, the standard suite of shallow-water test cases described in [39] is used.

Before describing their use in multi-resolution modeling, a brief history of Voronoi diagrams is provided. Voronoi diagrams have gone by many different names, such as Thiessen polygons, Wigner-Seitz unit cells, and Brillouin zones [21]. Voronoi diagrams are used in a wide range of applications, from condensed matter physics to measuring spatially distributed geophysical and meteorological data. Although their use today is broad, their past use can be traced back to Descartes in 1644. Dirichlet originally derived modern Voronoi diagrams, but only in two- and three-dimensional spaces. Georgy Fedoseevich Voronoi generalized this work in 1908 to arbitrary dimensions, providing the definition of what we call Voronoi diagrams today [33].

One version of these Voronoi diagrams, called a Spherical Centroidal Voronoi Tessellation (SCVT), fulfills the requirements of a conforming, variable-resolution mesh. Recently in climate modeling, Voronoi-like meshing of the sphere has found success in global atmosphere modeling [14, 32, 35]. Each of these examples motivates the use of Voronoi-like meshing through the ability to produce high-quality meshes of uniform resolution. In addition to the high quality of Voronoi-like meshes, problematic grid singularities associated with other meshing approaches are eliminated. Recent work suggests that even though Voronoi meshes are well suited for uniform spherical meshes, they are perhaps even more valuable with respect to variable-resolution meshes.

As discussed in Chapter 2, the generation of variable-resolution SCVTs requires two key components. First, a point-density function must be defined over the surface of the sphere, providing high density in areas of interest. This density function helps to enforce the variable-resolution nature of the grid. Second, a centroid constraint must be iteratively enforced in every Voronoi cell. Coupling these two together allows the creation of general variable-resolution meshes. However, current algorithms for the generation of SCVTs deliver less performance than desired as the point set increases in size. In an effort to aid multi-resolution modeling, a new algorithm is developed as part of this dissertation to allow the parallel computation of SCVTs. SCVT generation involves two steps: first, a triangulation step in which all points are triangulated, and second, an integration step which enforces the centroidal constraint on the Voronoi diagram. In current algorithms, the performance bottleneck is the triangulation step, because of its sequential implementation. Previous research has attempted to parallelize planar triangulation computations [6]; however, this work does not directly translate onto the surface of the sphere. By combining existing computational geometry tools, such as stereographic projections and domain decomposition, this new algorithm provides the parallel computation of spherical Delaunay triangulations.

One method yet to be explored using SCVTs in geophysical simulations is adaptive

mesh refinement (AMR). AMR has previously been explored in the context of the

shallow-water equations [5, 29]. However, the majority of work uses cubed sphere

meshes that provide static degrees of freedom. Grid cells are used to represent the root

nodes of quad-trees providing easily implemented coarsening and refining. Typically,

some criterion is defined in order to determine if a grid cell should be coarsened or

refined; however grid cells are not allowed to be coarser than their initial size. In

order to satisfy the name of Adaptive Mesh Refinement, refinement of the meshes is

performed adaptively as the simulation progresses. Usually a field of interest is used to

define the refinement criteria, such as relative vorticity. As the simulation progresses,

the field of interest propagates within the domain, providing new regions that need

to be refined while previously refined regions might need to be coarsened. This

process provides a usable framework typical of standard AMR techniques; however


this method does not translate easily to SCVT meshes. In order to relate AMR

techniques to SCVT meshes and the MPAS framework, a new technique is explored

providing AMR-like grid generation. Currently the tools to fully implement an AMR

framework do not exist, however part of the work in this dissertation is intended to

aid the creation of an AMR framework within the context of SCVTs.

The parallel generation of SCVTs is described in detail in Chapter 2, and results

from the new algorithm are presented in Chapter 3. Background material on the

MPAS model, as well as the test cases used, are provided in Chapter 4. Results from

the exploration of MPAS on variable-resolution meshes are provided in Chapter 5. A

brief background on AMR and the new AMR framework are provided in Chapter 6.

The results of this new AMR framework are presented in Chapter 7. Finally, this

dissertation concludes with a discussion of the presented material in Chapter 8.

1.1 Personal Contributions

This section explains my personal contributions to the work contained in this

dissertation. In Chapters 2 and 3 my personal contributions are as follows:

• Developed and implemented algorithm for parallel computation of spherical

Delaunay triangulations and spherical centroidal Voronoi tessellations;

• Developed and implemented load balancing algorithm for variable resolution

grids;

• Benchmarked algorithm to produce results.

In Chapters 4 and 5 my personal contributions are as follows:

• Wrote software for conversion from a point set and triangulation to MPAS grid;

• Wrote software for visualization of MPAS input/output/restart files;

• Generated all grids for simulations;

• Wrote software for computation of globally averaged diagnostic quantities;


• Wrote initial condition generator for barotropic instability test case;

• Wrote software for computation of global error norms;

• Ran all 25 simulations and computed global error norms.

In Chapters 6 and 7 my personal contributions are as follows:

• Developed AMR framework for SCVT meshes;

• Wrote software for refining SCVT meshes based on a field from the output of

MPAS;

• Wrote software for mapping a density field and smoothing it;

• Wrote software for computation of global error norms;

• Ran all 8 simulations and computed global error norms.


CHAPTER 2

PARALLEL SCVT GENERATOR BACKGROUND

This chapter provides the necessary background for, and describes the newly devel-

oped algorithm for the parallel generation of spherical centroidal Voronoi tessellations

that was created as part of this dissertation work. Results for this new grid generator

are presented in Chapter 3.

To begin, constructs required for the definition of SCVTs are described, beginning

with Delaunay triangulations and Voronoi tessellations. Stereographic projections

and their associated properties are then introduced, followed by a detailed description

of the parallel algorithm used for the construction of SCVTs.

2.1 Delaunay Triangulations

A k-simplex is defined as a k-dimensional polytope which is the convex hull of its

k + 1 vertices. For example, a 2-simplex would be a triangle, and a 3-simplex would

be a tetrahedron. A k-simplex is made up of what are referred to as s-faces, where

an s-face is made up of any s + 1 distinct vertices of the k-simplex. For example, a

2-face is a triangular face, a 1-face is an edge, and a 0-face is a vertex.

Given a point set, $P$, in $\mathbb{R}^d$, the Delaunay triangulation of this point set, $D(P)$, is the set of $d$-simplices such that:

• A point, $p$, in $\mathbb{R}^d$ is a vertex of a simplex in $D(P)$ $\iff p \in P$;

• The intersection of two simplices in $D(P)$ is either the empty set or a common face;

• The interior of the circumscribing $d$-sphere through the $d + 1$ vertices of a particular simplex contains no other points from the set $P$.

If the circumscribing $d$-sphere has more than $d + 1$ points lying on its perimeter, the triangulation is Delaunay, but not unique. The Delaunay triangulation of a point set defined in $\mathbb{R}^d$ is related to the convex hull of the point set when projected onto a paraboloid in $\mathbb{R}^{d+1}$ [6].
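The empty-circumcircle condition in the third bullet can be checked directly in the planar case (d = 2). The following is a minimal sketch and not part of the original work: the names `in_circumcircle` and `is_delaunay` are illustrative, triangles are assumed to be index triples listed in counter-clockwise order, and the brute-force loop is intended for small examples only.

```python
def in_circumcircle(a, b, c, p):
    """Return True if p lies strictly inside the circumcircle of triangle (a, b, c).

    The triangle vertices are assumed to be in counter-clockwise order.
    """
    rows = []
    for (x, y) in (a, b, c):
        dx, dy = x - p[0], y - p[1]
        rows.append((dx, dy, dx * dx + dy * dy))
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = rows
    # Standard 3x3 in-circle determinant, translated so that p is the origin.
    det = (a1 * (b2 * c3 - b3 * c2)
           - a2 * (b1 * c3 - b3 * c1)
           + a3 * (b1 * c2 - b2 * c1))
    return det > 0.0


def is_delaunay(points, triangles):
    """Check that no point lies inside the circumcircle of any triangle."""
    for (i, j, k) in triangles:
        for m, p in enumerate(points):
            if m in (i, j, k):
                continue
            if in_circumcircle(points[i], points[j], points[k], p):
                return False
    return True
```

For four points in general position, exactly one of the two possible diagonals yields a triangulation that passes this check.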

2.2 Voronoi Tessellations

The dual mesh of a Delaunay triangulation is called the Voronoi tessellation. Given

a set of points, $P$, called generators, the Voronoi tessellation, $V = \{V_i\}$, is defined as

\[ \lVert x - x_i \rVert < \lVert x - x_j \rVert \quad \forall x \in V_i, \tag{2.1} \]

where $V_i$ represents a Voronoi cell, and $x_i \in P$ and $x_j \in P$ represent distinct generators.

This property, called the Voronoi property, states that every point contained inside

a Voronoi cell is closer to its cell generator than to any other generator in the set P .

To be a centroidal Voronoi tessellation, the cell generators xi are required to be the

centers of mass for the cells, meaning $x_i = x_i^*$, with $x_i^*$ defined as

\[ x_i^* = \frac{\int_{V_i} x \, \rho(x) \, dx}{\int_{V_i} \rho(x) \, dx}, \tag{2.2} \]

where ρ(x) defines a non-negative point-density function which can be used to create

variable resolution meshes.

The center of mass and the generator of a Voronoi cell are generally not coincident.

The requirement that xi and x∗i be the same can be imposed through one of many

algorithms, such as Lloyd’s algorithm [17]. Lloyd’s algorithm imposes this by iterating


on the point set, moving each generator to its Voronoi cell’s center of mass until they

are identical. Lloyd’s algorithm is more rigorously discussed in [7].
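To make the iteration concrete, the following is a minimal sketch of Lloyd's algorithm on the interval [0, 1] rather than on the sphere. The one-dimensional setting, the function name `lloyd_1d`, and the midpoint quadrature are simplifying assumptions made for illustration; they are not the implementation used in this work. Each pass builds the Voronoi cells (intervals bounded by midpoints between neighboring generators) and moves every generator to its cell's center of mass as in (2.2).

```python
def lloyd_1d(generators, density, iterations=200, samples=200):
    """Lloyd's algorithm on [0, 1] for a positive point-density function."""
    x = sorted(generators)
    for _ in range(iterations):
        # Voronoi cell boundaries are the midpoints between adjacent generators.
        edges = [0.0] + [(a + b) / 2.0 for a, b in zip(x, x[1:])] + [1.0]
        new_x = []
        for left, right in zip(edges, edges[1:]):
            h = (right - left) / samples
            mass = 0.0
            moment = 0.0
            # Midpoint-rule quadrature for the two integrals in the centroid.
            for k in range(samples):
                s = left + (k + 0.5) * h
                w = density(s) * h
                mass += w
                moment += s * w
            new_x.append(moment / mass)
        x = new_x
    return x
```

With a constant density the generators settle at evenly spaced positions; a non-constant density clusters generators where the density is large.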

In general, the density function in (2.2) affects the grid spacing of the final SCVT.

If we arbitrarily select two Voronoi cells from a tessellation, and index them i and j,

their grid spacing and density are related as

\[ \frac{h_i}{h_j} \approx \left[ \frac{\rho(x_j)}{\rho(x_i)} \right]^{\frac{1}{d'+2}}, \tag{2.3} \]

where d′ is the dimension of the simplicial elements in the tessellation, ρ(xi) is the

density function as in (2.2) evaluated at a point xi ∈ Vi, and hi is a measure of the

local grid spacing at the point xi. Though (2.3) is an open conjecture, it has been

supported through many numerical studies as can be seen further in [25].
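As a worked instance of (2.3), assume $d' = 2$ (triangular elements on a surface) and take the density extremes quoted for the density function of Figure 3.4, namely a maximum of $1$ in the fine region and a minimum of $(1/16)^4$ in the coarse region. With cell $i$ in the coarse region and cell $j$ in the fine region, the relation gives

\[ \frac{h_i}{h_j} \approx \left[ \frac{\rho(x_j)}{\rho(x_i)} \right]^{\frac{1}{d'+2}} = \left[ \frac{1}{(1/16)^4} \right]^{\frac{1}{4}} = 16, \]

which is consistent with the factor-of-16 difference in resolution reported for that grid.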

Replacing all of the constructs defined in Sections 2.1 and 2.2 with their analogous

components on the surface of a sphere creates the spherical complements to Delaunay

triangulations and Voronoi tessellations. The spherical versions of Delaunay triangu-

lations and Voronoi tessellations are used for the construction of SCVTs as opposed

to planar CVTs which have been discussed above for simplicity. While planar CVTs

tessellate a 2-dimensional region with polygons, an SCVT tessellates the surface of a

3-dimensional sphere with polygons.

2.3 Stereographic Projections

Stereographic projections are special mappings between the surface of a sphere and

a plane tangent to the sphere. Not only are stereographic projections a conformal

mapping, meaning that angles are preserved, but the projections also preserve circles.

As will be discussed below, preserving circles is a particularly important property of

stereographic projections. Stereographic projections also map the interior of these

circles to the interior of the mapped circles [3, 26]. Preserving circularity implies that

the stereographic projection preserves Delaunay criteria as described in Section 2.1,


because Delaunay triangle circumcircles (along with their interiors) are preserved,

and therefore Delaunay triangulations are preserved. This projection can be used to

compute a triangulation of a portion of the sphere, by allowing the triangulation to

be carried out in the more convenient geometry of the plane.

To define the stereographic projection, we need to define the following quantities,

all in Cartesian coordinates in R3. C is the center of the sphere, typically the origin,

T is the point of tangency (where the projection plane is tangent to the sphere), F is

the focus point, which is a reflection about C of T, and P is a point on the surface

of the sphere. The stereographic projection of P into a point Q in the plane defined

by T is defined by

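\[ s = 2 \, \frac{(\mathbf{C} - \mathbf{F}) \cdot (\mathbf{C} - \mathbf{F})}{(\mathbf{C} - \mathbf{F}) \cdot (\mathbf{P} - \mathbf{F})} \tag{2.4} \]

\[ \mathbf{Q} = s \, \mathbf{P} + (1 - s) \, \mathbf{F}. \tag{2.5} \]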

Figure 2.1 illustrates the stereographic projection, using the variables defined for

(2.4) and (2.5).

For the purposes of this research, it is more useful to define the projection relative

to T, rather than F for reasons that will be explained later. A simple substitution of

T = C− F produces

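\[ s = \frac{2}{\mathbf{T} \cdot (\mathbf{P} + \mathbf{T})} \tag{2.6} \]

\[ \mathbf{Q} = s \, \mathbf{P} + (s - 1) \, \mathbf{T}. \tag{2.7} \]

This projection can be used to project from $\mathbb{R}^d$ to $\mathbb{R}^{d-1}$, and can be repeated until $d - 1 = 2$.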
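The projection relative to T in (2.6) and (2.7) can be sketched in a few lines. This is an illustrative sketch that assumes a unit sphere centered at the origin, so that T is a unit vector; the function name `stereographic_project` is hypothetical. Under these assumptions the projected point Q satisfies Q · T = 1, meaning it lies in the plane tangent at T.

```python
def stereographic_project(p, t):
    """Project unit-sphere point p into the plane tangent at unit vector t.

    Implements s = 2 / (T . (P + T)) and Q = s P + (s - 1) T.
    The projection is singular at the focus point p = -t (division by zero).
    """
    dot = sum(ti * (pi + ti) for pi, ti in zip(p, t))
    s = 2.0 / dot
    return tuple(s * pi + (s - 1.0) * ti for pi, ti in zip(p, t))
```

For example, with T at the north pole (0, 0, 1), the equatorial point (1, 0, 0) maps to (2, 0, 1), which is where the line from the south pole through (1, 0, 0) meets the plane z = 1.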

Figure 2.1: Cross-sectional illustration of a stereographic projection from a sphere into a tangent plane.

2.4 Parallel Algorithm Details

The parallel algorithm closely follows the layout of Lloyd’s algorithm, with a

few modifications. The key modification is computing a Delaunay triangulation in

parallel, since all other portions are considered embarrassingly parallel. The idea

of computing a planar triangulation in parallel has been discussed for several years

[6]. Typically, such algorithms divide the point set up into smaller regions that can

then be triangulated independently from each other. Each triangulation needs to be

stitched together to form a global triangulation. This stitching, or merge step, is

typically computed serially because it could involve modifying significant portions of

each triangulation if the division was not performed correctly. The merge step is the

main difference between most parallel algorithms. The main benefit of the algorithm

here is that the merge step is done in parallel. To create a spherical triangulation in

parallel, a similar technique is employed as in the planar triangulations.

First, the sphere is divided into $N$ overlapping regions $Y_k(\mathbf{T}_k, R_k)$ for $k = 1, \dots, N$, which are defined by a geodesic arc length $R_k$ and a tangent plane defined by the region's point of tangency $\mathbf{T}_k$. Each of these regions is owned by an independent

processor, and these regions also have some connectivity, or list of neighbors, defined.

On the sphere, these regions would look like overlapping umbrellas, as can be seen

in Figure 2.2(a). Each region (or processor) would take from the global point set,

pi ∈ P , the points that are inside of its region radius, where cos−1(Tk · pi) ≤ Rk.

Keep in mind, this sorting may cause one point to be in several regions, as in Figure

2.2, where Figure 2.2(a) shows an example domain decomposition with 12 regions

that could be used on a set of generators shown triangulated in Figure 2.2(b). Since

the end goal of this algorithm is to compute an SCVT, the regional triangulations do

not need to be merged on every iteration because they overlap.
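The sorting of points into overlapping regions can be sketched as follows. This is a minimal serial illustration, assuming unit vectors for both points and region centers; the function name `sort_points_into_regions` is hypothetical, and in the parallel setting each processor would only evaluate the test for its own region.

```python
import math


def sort_points_into_regions(points, centers, radii):
    """Return, for each region k, the indices i with acos(T_k . p_i) <= R_k."""
    regions = [[] for _ in centers]
    for i, p in enumerate(points):
        for k, (t, r) in enumerate(zip(centers, radii)):
            d = sum(a * b for a, b in zip(t, p))
            d = max(-1.0, min(1.0, d))  # guard acos against rounding error
            if math.acos(d) <= r:
                regions[k].append(i)  # a point may belong to several regions
    return regions
```

Because the regions overlap, a point near a region boundary appears in more than one list, which is the behavior described above.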

After a spherical point set, $P_k$, is determined, its stereographic projection $S[P_k, \mathbf{T}_k]$ into the plane defined by the point of tangency $\mathbf{T}_k$ is computed. Because a stereographic projection preserves circles (and their interiors), the projection

also preserves the Delaunay criteria that every triangle’s circumcircle needs to be

empty. The newly projected point set is now triangulated using some planar trian-

gulation algorithm, such as Triangle [28] which is used in this study. If the mapping

from global point index to local point index is appropriately maintained, a simple

map from local index to global index gives the approximate triangulation for the re-

gion on the sphere. One final step is needed to make this the true triangulation for

the region, which is to remove all “non-Delaunay” triangles. The criteria required to

be a Delaunay triangle in the global triangulation is defined as

cos−1 ||Tk − ci||+ ri < Rk, (2.8)

where Tk is a region center, Rk is a region radius, ri is a triangle circumradius, and

ci is a triangle circumcenter.

Since each region is unaware of the triangles and points outside of its radius, only

triangles whose circumcircles are completely contained inside of the region radius Rk

are guaranteed to be Delaunay, as no other points from the point set can be in their

circumcircle. Any triangle whose circumcircle extends outside of its region's radius

may contain points that were not in Pk, and should be discarded from the region’s

triangulation because this triangle is not guaranteed to adhere to the Delaunay criteria

for the entire point set. Figure 2.3 visualizes this point, where Figure 2.3(a) shows a

projected planar triangulation Pk before removing triangles that do not satisfy (2.8),

and Figure 2.3(b) shows the exact same triangulation after removing these potentially

non-Delaunay triangles. After this step is complete, the regional triangulation is now

exactly Delaunay. After the regional triangulation is computed, the integration step

of Lloyd’s algorithm can begin. The overlapping of regions is key to this portion of

the algorithm, because if the overlap is not large enough some true Delaunay triangles

might not be entirely in at least one region.
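The filtering step can be sketched on the sphere as follows. This is an illustration under stated assumptions, not the original implementation: the distance between the region center and a triangle circumcenter is taken to be the geodesic arc length between them, points are unit vectors, and the names `spherical_circumcircle` and `keep_guaranteed_delaunay` are hypothetical. A triangle is kept only if its circumcircle lies entirely inside the region.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _arc(a, b):
    """Geodesic arc length between two unit vectors."""
    return math.acos(max(-1.0, min(1.0, _dot(a, b))))


def spherical_circumcircle(a, b, c):
    """Return (center, arc radius) of the circle through unit vectors a, b, c."""
    u = tuple(x - y for x, y in zip(b, a))
    v = tuple(x - y for x, y in zip(c, a))
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    norm = math.sqrt(_dot(n, n))
    center = tuple(x / norm for x in n)
    if _dot(center, a) < 0.0:  # pick the pole on the same side as the triangle
        center = tuple(-x for x in center)
    return center, _arc(center, a)


def keep_guaranteed_delaunay(points, triangles, t_k, r_k):
    """Keep triangles whose circumcircle is fully inside the region (t_k, r_k)."""
    kept = []
    for tri in triangles:
        a, b, c = (points[i] for i in tri)
        center, radius = spherical_circumcircle(a, b, c)
        if _arc(t_k, center) + radius < r_k:
            kept.append(tri)
    return kept
```

Triangles that fail the test are not necessarily wrong; they are discarded because the region cannot verify them with the points it holds.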


In Lloyd’s algorithm, after the Delaunay triangulation of the point set is computed,

every Voronoi cell center of mass must be computed by integration, so its generator

can be replaced. This step typically requires the computation of the Voronoi diagram

for a region in addition to the Delaunay triangulation previously computed. However,

some careful geometry can reveal that one doesn’t actually need the Voronoi diagram.

A single triangle from a Delaunay triangulation contributes to the integration of three

different Voronoi cells. As seen in Figure 2.4, if the triangle is split into three kites,

each made up of two edge midpoints, the triangle’s circumcenter, and a vertex of the

triangle, each one contributes to the Voronoi cell associated with the triangle vertex

that is part of the kite. Integrating each kite, and updating a portion of the centroid

integral allows one to only use the Delaunay triangulation when computing a CVT or

an SCVT, so that no mesh connectivity needs to be computed on an iteration basis.
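The kite decomposition can be sketched in the plane as follows. This is a simplified illustration assuming a uniform density and planar triangles listed counter-clockwise; the function names are hypothetical, and the actual generator integrates a density on the sphere. Each triangle adds the area and first moments of its three kites to the running totals of the corresponding vertices, so the centroid of a Voronoi cell is its accumulated moment divided by its accumulated mass.

```python
def circumcenter(a, b, c):
    """Circumcenter of a planar triangle."""
    d = 2.0 * (a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1]))
    a2, b2, c2 = (p[0] ** 2 + p[1] ** 2 for p in (a, b, c))
    ux = (a2 * (b[1] - c[1]) + b2 * (c[1] - a[1]) + c2 * (a[1] - b[1])) / d
    uy = (a2 * (c[0] - b[0]) + b2 * (a[0] - c[0]) + c2 * (b[0] - a[0])) / d
    return (ux, uy)


def area_and_moments(poly):
    """Signed area and first moments (integrals of x and y) of a polygon."""
    area = mx = my = 0.0
    for (x0, y0), (x1, y1) in zip(poly, poly[1:] + poly[:1]):
        cross = x0 * y1 - x1 * y0
        area += cross
        mx += (x0 + x1) * cross
        my += (y0 + y1) * cross
    return 0.5 * area, mx / 6.0, my / 6.0


def midpoint(p, q):
    return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)


def accumulate_kites(points, triangles):
    """Accumulate per-generator mass and moments using only the triangulation."""
    mass = [0.0] * len(points)
    moment = [[0.0, 0.0] for _ in points]
    for tri in triangles:
        cc = circumcenter(*(points[i] for i in tri))
        for m in range(3):
            v = points[tri[m]]
            nxt = points[tri[(m + 1) % 3]]
            prv = points[tri[(m + 2) % 3]]
            # Kite: vertex, midpoint of outgoing edge, circumcenter,
            # midpoint of incoming edge.
            kite = [v, midpoint(v, nxt), cc, midpoint(prv, v)]
            area, mx, my = area_and_moments(kite)
            mass[tri[m]] += area
            moment[tri[m]][0] += mx
            moment[tri[m]][1] += my
    return mass, moment
```

Using signed areas means the three kite contributions of a triangle always sum to the triangle itself, even when the circumcenter falls outside an obtuse triangle.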

To make this algorithm parallel, one simply has to ensure that each generator

is only updated by one region. This can be done using one of a variety of domain

decomposition methods. The method used in this particular algorithm uses the set

of generators from a coarse SCVT to define region centers. Each region then updates

only the generators that are inside of its defined Voronoi cell based on (2.1), using

region centers, or points of tangency, Tk as xi and generators pi ∈ P as x. Since

Voronoi cells are non-overlapping, each generator will only get updated by one region.

As mentioned earlier, the overlapping of regions is necessary to ensure that the trian-

gulation of all points contained inside each region’s Voronoi cells is exact. In practice,

a region radius corresponding to the maximum distance to any adjacent region center

allows enough overlap for the triangulation to be exact, and is defined by

\[ R_i = \max_{j=1,\dots,N} \cos^{-1}(\mathbf{T}_i \cdot \mathbf{T}_j), \tag{2.9} \]

where N is the number of region neighbors, Ti is the region center of interest, Tj is

a neighboring region center, and Ri is the geodesic arc distance for region i.
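The radius heuristic in (2.9) amounts to a single maximum over the neighboring region centers. The sketch below assumes unit vectors and a precomputed neighbor list; the function name `region_radius` is illustrative.

```python
import math


def region_radius(center, neighbor_centers):
    """Largest geodesic arc distance from a region center to any neighbor."""
    radius = 0.0
    for t_j in neighbor_centers:
        d = sum(a * b for a, b in zip(center, t_j))
        radius = max(radius, math.acos(max(-1.0, min(1.0, d))))
    return radius
```

A region whose neighbors are all a quarter of a great circle away therefore receives a radius of π/2.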


Whereas this heuristic allows the algorithm to work correctly, it may not be op-

timal for variable resolution grids, as some regions might contain many more points

than they need to when they border both a fine and a coarse region.

Once each of the generators is updated, each region needs to transfer its newly

updated points only to its adjacent neighbors, not to all of the active processors. This

limits each processor’s communications to roughly 6 sends and receives, regardless of

the total number of processors used. After this step is over, the convergence of the

grid is checked, and the iterations continue, or stop depending on the result.


(a) 12 Generator SCVT

(b) 10242 Generator Delaunay Triangulation

Figure 2.2: Domain Decomposition Example. Figure 2.2(a) is an SCVT used for a 12 processor domain decomposition, where Figure 2.2(b) is a 10242 generator Delaunay triangulation computed using the 12 generator SCVT for parallelization. Each colored ring represents a region's radius Rk, where region centers Tk are the Voronoi cell centers, at the center of each pentagonal structure in Figure 2.2(a).


(a) Before application of (2.8)

(b) After application of (2.8)

Figure 2.3: Triangulations in a plane after stereographic projection. Figure 2.3(a) is the triangulation before (2.8) is applied, and Figure 2.3(b) is the triangulation after it is applied.

Figure 2.4: Triangle division used for integrating Voronoi cells using only the Delaunay triangulation without any adjacency information. Kite sections contribute to the Voronoi cell centered at the vertex that is part of the kite. A, B, C vertices are generators in the point set, where the point at the center of the triangle is the circumcenter of this triangle. Triangular regions that are colored similarly contribute to the same vertex.

2.4.1 Convergence Criteria

When checking for convergence, two metrics are used. Currently, the L2 norm

(2.10) of the generator movement and the L∞ norm (2.11) of generator movement

are compared with some tolerance. If the norm of interest reaches the tolerance, the

iteration process is deemed to have converged. The L∞ is more strict, but both of

these norms follow similar convergence paths when plotted against iteration number.

There are other grid metrics that can be used, such as the clustering energy [8] as in

(2.12), but in practice this tends to be less strict, and more computationally expensive,

when compared with generator movement.

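\[ L_2 = \frac{\sqrt{\sum_{i=1}^{N_{pts}} \left( x_i^n - x_i^{n+1} \right)^2}}{N_{pts}} \tag{2.10} \]

\[ L_\infty = \max_{i=1,\dots,N_{pts}} \left( \left| x_i^n - x_i^{n+1} \right| \right) \tag{2.11} \]

\[ CE = \sum_{i=1}^{N_{pts}} \int_{V_i} \left( \rho(x) \lVert x - x_i \rVert^2 \, dx \right) \tag{2.12} \]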
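The two movement-based metrics can be sketched as follows. This is an illustrative sketch: the function names are hypothetical, generator movement is measured as Euclidean distance between successive positions, and the normalization of the L2 norm (square root of the summed squared movement, divided by the number of points) is an assumption about how (2.10) should be read.

```python
import math


def movement_norms(old_points, new_points):
    """Return (L2, Linf) norms of generator movement between two iterations."""
    distances = [math.dist(p, q) for p, q in zip(old_points, new_points)]
    l2 = math.sqrt(sum(d * d for d in distances)) / len(distances)
    linf = max(distances)
    return l2, linf


def has_converged(old_points, new_points, tolerance=1e-6, use_linf=False):
    """Compare the chosen norm of generator movement against a tolerance."""
    l2, linf = movement_norms(old_points, new_points)
    return (linf if use_linf else l2) <= tolerance
```

Under this reading the L2 value never exceeds the L∞ value, which agrees with the observation that the L∞ criterion is the stricter of the two.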

2.4.2 Initial Conditions

A variety of initial conditions can be used in an SCVT generator. The most

obvious is Monte Carlo points [20]. These can either be uniformly distributed over the

sphere, to create a quasi-uniform initial condition, or they can be sampled using the

target point-density function, to potentially reduce the number of iterations required

for convergence. In addition to using Monte Carlo initial conditions, one can use a

bisection method to build fine grids from a coarse grid [14]. To create a bisection

grid, a coarse grid will be converged, using as few points as possible. After this coarse

grid is converged, the midpoint of every Voronoi cell edge, or Delaunay triangle edge

is added to the set of points. This causes the overall grid spacing to be reduced by

roughly a factor of two in every cell. It also makes the point set roughly four times


as large. In addition to Monte Carlo and bisection initial conditions, there are many

other choices that can be used.
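The growth of the point set under bisection can be made exact for a closed triangulation of the sphere. By Euler's formula such a triangulation with V vertices has 3V - 6 edges, so adding one midpoint per edge gives 4V - 6 points. The sketch below, with the hypothetical function name `bisection_point_counts`, assumes the coarsest grid has 12 generators; the resulting sequence contains the grid sizes used in this work.

```python
def bisection_point_counts(n_coarse, levels):
    """Point counts after repeated edge bisection of a closed triangulation."""
    counts = [n_coarse]
    for _ in range(levels):
        # V vertices plus one new point on each of the 3V - 6 edges.
        counts.append(4 * counts[-1] - 6)
    return counts
```

Starting from 12 points, nine bisections produce 2621442 points, and each level is close to four times the size of the previous one.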


CHAPTER 3

PARALLEL SCVT GENERATOR RESULTS

Two different types of grids are presented to show the robustness of this algorithm.

To begin, quasi-uniform meshes are created, followed by more complicated variable

resolution meshes, which cover the entire sphere. This method can also be used to

create limited area grids on the sphere, however this is outside the scope of this

dissertation.

All of the results presented below were computed using Florida State University’s

High Performance Computing Facility.

3.1 Quasi-Uniform Results

STRIPACK [23] is an ACM TOMS algorithm that computes Delaunay triangula-

tions on a sphere. STRIPACK is a serial code used as a baseline for comparison in

this study. It is currently one of the few well-known spherical triangulation libraries

available, and is written in Fortran 77. Figure 3.1 shows the performance of STRI-

PACK [23] as the number of generators is increased through bisection as mentioned in

Section 2.4.2. The green dashed line represents the portion of the code that performs

the integration of the Voronoi cells and the red solid line represents the portion of

the code that performs the Delaunay triangulation. It is clear that the majority of

the time per iteration is spent in computing the Delaunay triangulation, and as the


number of generators increases the time spent computing a Delaunay triangulation

grows more rapidly than the time to integrate all Voronoi cells.

[Figure 3.1 plot: average time (ms) versus number of generators on logarithmic axes; series: Triangulation and Integration.]

Figure 3.1: Timings for a STRIPACK-based SCVT generator at 162, 642, 10242, 40962 and 163842 generators. The red solid line represents the time spent in STRIPACK computing a triangulation, where the green dashed line represents the time spent integrating the Voronoi cells outside of STRIPACK in one iteration of Lloyd’s algorithm. Timings in this figure were computed using an Intel Core 2 Duo T8100 CPU with 3GB of RAM.

Since most climate models are shifting towards global high resolution simulations,

the target quasi-uniform grid for this research is a global 15km resolution grid, which

corresponds to 2621442 grid points, or Voronoi cells. Grids created based on a uniform

Monte Carlo and bisection initial conditions are compared. The time for these grids

to converge to an SCVT with a tolerance of $10^{-6}$ in the L2 norm, as in (2.10), is presented. A threshold of $10^{-6}$ is the strictest convergence level that the Monte Carlo grid can attain. For this reason, we use $10^{-6}$ as the convergence threshold for

this study. However, the bisection grid can converge much further beyond this point.

Table 3.1 shows timing results for the parallel algorithm comparing these two different

options of initial conditions. It is clear from this table that bisection initial conditions

provide a significant speedup in the overall cost to generate a grid, seeing as it takes

roughly 1/20th of the time to converge a bisection grid compared to a Monte Carlo grid.

Based on the results presented in Table 3.1, only bisection initial conditions are used

for the following experiments, unless otherwise specified.

Table 3.1: Timing results for MPI-SCVT with bisection and Monte Carlo initial conditions and the speedup of bisection relative to Monte Carlo initial conditions

Timed Portion              Bisection (B)   Monte Carlo (MC)   Speedup (MC/B)
Total Time (ms)                3,526,041         70,581,300            20.01
Triangulation Time (ms)           73,684         21,164,512           287.23
Integration Time (ms)            235,016         12,211,376            51.95
Communication Time (ms)        3,152,376         33,713,473            10.69

Tables 3.2 and 3.3 compare the algorithm described in this dissertation (MPI-

SCVT) with STRIPACK [23], for computing spherical Delaunay triangulations. The

results in these tables compare the cost to compute a single triangulation of a 163842

generator (60km global) grid. Table 3.2 compares STRIPACK with the final triangu-

lation routine in MPI-SCVT. This routine produces a full triangulation of the entire

sphere, and is only called once, at the very end of the grid generation process.

Table 3.2: Comparison of STRIPACK with serial and parallel versions of MPI-SCVT using final triangulations

Algorithm   Procs   Regions   Time (ms)    Speedup
STRIPACK        1         1   207528.81    Baseline
MPI-SCVT        1         2     9504.02    21
MPI-SCVT       42        42     5663.30    37

Table 3.3 compares STRIPACK with the triangulation routine in MPI-SCVT that

is called on every iteration. The results presented relative to MPI-SCVT in Table 3.3

are averages over 2000 iterations. It is clear from this table that we see a significant

speedup over both the serial versions of MPI-SCVT and STRIPACK when using only

42 processors.

As was previously mentioned, the drastic difference between Tables 3.2 and 3.3 is

due to the different algorithms for computing triangulations. While Table 3.2 presents

timings that are directly comparable to STRIPACK, Table 3.3 presents timings more

useful in computing SCVTs.

As a comparison with Figure 3.1, Figures 3.2 and 3.3 present timing graphs made

from MPI-SCVT. From these three plots, it is clear that the increase in time to com-

pute the Delaunay triangulation does not grow as fast with problem size as it did

in STRIPACK. Two processors are used, because this is the minimum amount of

parallelization that MPI-SCVT supports, and MPI-SCVT requires at least 2 regions

because the stereographic projection has a singularity at the focus point. Eventu-

ally, at around 163842 generators, the triangulation becomes more expensive than

the integration step at least for 2 processors. Figure 3.2 represents the timings of

MPI-SCVT for 2 regions as the problem size increases. Figure 3.3(a) represents the

timings for a 40962 generator grid, which is a global 120km resolution, where Figure

3.3(b) represents a 163842 generator grid, with a 60km resolution, and Figure 3.3(c)

represents a 2621442 generator grid with a 15km resolution.

Table 3.3: Comparison of STRIPACK with serial and parallel versions of MPI-SCVT using per iteration triangulations

Algorithm   Procs   Regions   Time (ms)    Speedup
STRIPACK        1         1   207528.81    Baseline
MPI-SCVT        1         2     3623.09    57
MPI-SCVT       42        42     50.6572    4092

[Figure 3.2 plot: average time (ms) on logarithmic axes as the problem size increases; the legend includes Communication.]

Figure 3.2: Timings for various portions of MPI-SCVT using 2 processors and 2 regions. As the problem size increases the slopes of both the triangulation (red solid) and the integration (green dashed) remain constant. The triangulation doesn’t become more expensive than the integration until after roughly 163842 generators, as compared to Figure 3.1 where triangulation was more expensive after only 2562 generators. Also, a triangulation using 2621442 generators costs roughly the same using MPI-SCVT and 2 processors as a triangulation using 163842 generators in STRIPACK.

[Figure 3.3 plots: average time (ms) versus number of processors on logarithmic axes, with panels (a) 40962 Generator Timings, (b) 163842 Generator Timings, and (c) 2621442 Generator Timings.]

Figure 3.3: Timing results from MPI-SCVT vs. number of processors. Constant problem size, shown as parallelization is increased. Red solid lines represent the cost of computing a triangulation, where green dashed lines represent the cost of integrating all Voronoi cells, and blue dotted lines represent the cost of communicating each region’s updated point set to its neighbors.

3.2 Variable Resolution Results

Variable resolution grids here are only computed using MPI-SCVT. This is done

because STRIPACK performs comparably in both the uniform case and variable

resolution cases. The main issue with regards to variable resolution grids is the

domain decomposition used for MPI-SCVT. For example, a poor choice of domain

decomposition could force the overlap in regions to be significantly larger than it needs

to be. The larger the overlap of regions, the more points each region has to, needlessly,

triangulate. This is especially apparent when using variable resolution grids as will

be seen later. Because of this, two simple domain decompositions are used on a grid

with a highly varying density function applied, in addition to one, more complicated

domain decomposition method. Timings are presented to determine which performs

better, and gives better load balancing. The density function used to compute the

grids in this section can be seen in Figure 3.4.

[Figure 3.4 plot: density versus distance from the center of the density function (radians).]

Figure 3.4: Density function that creates a grid with resolutions that differ by a factor of 16 between the coarse and the fine region. The maximum value of the density function is 1, where the minimum value is $(1/16)^4$.

The analytic form of the density function used in Figure 3.4 is defined as

\[ \rho(x_i) = \frac{1}{2(1 - \gamma)} \left[ \tanh\left( \frac{\beta - |x_c - x_i|}{\alpha} \right) + 1 \right] + \gamma, \tag{3.1} \]

where xi is constrained to lie on the surface of the unit sphere. This function results

in relatively large values of ρ within a distance β of the point xc where β is measured

in radians and xc is also constrained to lie on the surface of the sphere. The function

transitions to relatively small values of ρ across a radian distance of α. The distance

between xc and xi is computed as |xc − xi| = cos−1(xc · xi) with a range from 0 to

π. Figure 3.5 shows an example grid created using this density function, with xc set

to be the center of the mountain defined in shallow-water test case number 5 from

[39] with $\phi_c = 3\pi/2$, $\lambda_c = \pi/6$ representing longitude and latitude respectively, $\gamma = (1/16)^4$, $\beta = \pi/6$, and $\alpha = 0.15$ with 10242 generators. This set of parameters used in (3.1) is referred to as x16.
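The x16 density function can be evaluated with a short sketch. The function names `lonlat_to_xyz` and `density` are illustrative, points are assumed to be unit vectors, and (3.1) is read with the prefactor 1/(2(1 - γ)); these are assumptions of this sketch rather than details confirmed by the original code.

```python
import math


def lonlat_to_xyz(lon, lat):
    """Convert longitude and latitude (radians) to a unit vector."""
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))


def density(x, x_c, gamma, beta, alpha):
    """Evaluate (3.1) using the geodesic distance acos(x_c . x)."""
    dot = sum(a * b for a, b in zip(x, x_c))
    dist = math.acos(max(-1.0, min(1.0, dot)))
    return (math.tanh((beta - dist) / alpha) + 1.0) / (2.0 * (1.0 - gamma)) + gamma


# x16 parameter set taken from the text above.
X16 = dict(gamma=(1.0 / 16.0) ** 4, beta=math.pi / 6.0, alpha=0.15)
X_C = lonlat_to_xyz(3.0 * math.pi / 2.0, math.pi / 6.0)
```

Evaluating `density(X_C, X_C, **X16)` gives a value close to 1, while the antipodal point gives a value close to γ, so the fourth root of their ratio is close to 16.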

It was previously mentioned in Section 2.4 that the heuristic used to determine the

region radius does not provide good load balancing with respect to variable resolution

grids. To resolve this issue, a new algorithm was developed. The new algorithm

begins by sorting each point into a Voronoi cell. After all regions have their point

sets, the union of this point set with the neighboring Voronoi cell’s point sets gives

the final point set used. This sort method is more expensive to perform, however the

better load balancing reduces idle computing time from processors that have small

loads. Timings using this new method in addition to two dot-product-based methods

for domain decomposition can be seen in Table 3.4. Figure 3.6 shows the number

of points that each processor has to triangulate on a per iteration basis. These

timings and figures were computed using the exact same initial conditions, which

was a converged x16 grid with 163842 generators, and they all used 42 processors,

and 42 regions. Timings presented in Table 3.4 are averages over 3000 iterations.

Based on Table 3.4 and Figure 3.6 there is a significant advantage to the Voronoi

based decomposition in that it not only speeds up the overall cost per iteration, but

it provides a more balanced load across the processors. In Table 3.4 note that the

timings are taken relative to processor number 0, and as can be seen in Figure 3.6(a),


processor 0 has a very small load so the majority of its iteration time is spent waiting

for the processors with large loads to finish and catch up which is included in the

Communication column of the table.
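The Voronoi-based sort described above can be sketched as follows. This is a minimal serial illustration with hypothetical names; it assumes unit vectors, so the nearest region center is the one with the largest dot product, and it assumes the neighbor list of each region is already known.

```python
def voronoi_region_point_sets(points, centers, neighbors):
    """Assign points to region Voronoi cells, then add the neighbors' points.

    Returns (owned, region_sets): owned[k] holds the point indices inside the
    Voronoi cell of region k, and region_sets[k] is the union of owned[k] with
    the owned sets of the regions listed in neighbors[k].
    """
    owned = [[] for _ in centers]
    for i, p in enumerate(points):
        nearest = max(range(len(centers)),
                      key=lambda k: sum(a * b for a, b in zip(centers[k], p)))
        owned[nearest].append(i)
    region_sets = []
    for k in range(len(centers)):
        indices = set(owned[k])
        for j in neighbors[k]:
            indices.update(owned[j])
        region_sets.append(sorted(indices))
    return owned, region_sets
```

Each region triangulates its `region_sets` entry but updates only the generators in its `owned` entry, so the amount of work follows the local point density rather than a fixed radius.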

Table 3.4: Timings based on the domain decomposition used. Uniform uses a coarsequasi-uniform SCVT to define region centers and their associated radii,and sorts using a simple dot product. x16 uses a coarse x16 SCVT todefine region centers and their associated radii, and sorts using a simpledot product. Voronoi uses a coarse x16 SCVT to define region centers andtheir associated radii, and sorts using a Voronoi cell based sort.

Decomposition Triangulation Integration Communication Iteration SpeedupUniform 14.9779 39.3149 2556.971 2611.35 Base

x16 104.793 276.681 1560.71 1965.56 1.32Voronoi 98.5482 249.77 288.694 640.472 4.07

(a) Coarse Region (b) Transition Region (c) Fine Region

Figure 3.5: Figures show a variable resolution grid created using a density function with the format defined in (3.1). All three figures are of the same grid, only the viewing perspective is changed. Figure 3.5(a) shows the coarse region of the grid, 3.5(b) shows the transition region of the grid, and 3.5(c) shows the fine region of the grid.

[Figure 3.6 panels: (a) Uniform Decomposition, (b) x16 Decomposition, (c) Voronoi based Decomposition. Each panel plots Number Of Points In Region (0 to 160000) against Region Number.]

Figure 3.6: Number of points each processor has to triangulate. 3.6(a) uses a quasi-uniform SCVT for its decomposition, with a simple dot product. 3.6(b) uses a x16 SCVT for its decomposition, with a simple dot product. 3.6(c) uses a x16 SCVT for its decomposition, with a more complicated sort based on the region’s Voronoi diagram.

3.3 Grid Generator Performance

To assess the overall performance of MPI-SCVT, some scalability results are presented in Figure 3.7. Figure 3.7(a) shows that this algorithm can easily under-saturate processors. When this happens, communication dominates the overall runtime of the algorithm, as can be seen in Figure 3.3(a), and scalability becomes sub-linear. As the number of generators increases (as seen in Figures 3.7(b) and 3.7(c)), the processor count at which under-saturation sets in becomes higher. Currently, communications in the algorithm are done asynchronously using non-blocking sends and receives. Overall communication is also reduced by communicating only with a region’s neighbors. This is possible because points can only move within a region radius between any two subsequent iterations, and therefore can only move into a region that overlaps the current region. Further efficiency gains could be realized through improvements in the communication and integration algorithms, which could result in linear scaling. In theory, because all of the computation is local, this algorithm should scale linearly up to hundreds if not thousands of processors.
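The restriction of communication to overlapping regions can be illustrated with a short sketch. It assumes each region is a spherical cap given by a unit-vector center and an angular radius, and that two regions overlap when the angle between their centers is smaller than the sum of their radii; the function name and the example configuration are hypothetical.

```python
import numpy as np

def overlapping_regions(centers, radii):
    """Return, for each region, the indices of regions whose caps overlap it."""
    cos_angle = np.clip(centers @ centers.T, -1.0, 1.0)
    angle = np.arccos(cos_angle)              # angular distance between centers
    count = len(radii)
    return [[j for j in range(count)
             if j != i and angle[i, j] < radii[i] + radii[j]]
            for i in range(count)]

# Six made-up regions centered on the coordinate axes, each of radius pi/3.
centers = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                    [0, -1, 0], [0, 0, 1], [0, 0, -1]], dtype=float)
radii = np.full(6, np.pi / 3)
comm_partners = overlapping_regions(centers, radii)
```

Each processor would then exchange points only with the regions listed for it, so the number of messages per iteration depends on the local overlap rather than on the total processor count.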

[Figure 3.7 panels: (a) 40962 Generator Speedup, (b) 163842 Generator Speedup, (c) 2621442 Generator Speedup. Each panel plots SpeedUp (Serial/Parallel) and a Linear Reference against Number of Processors, with both axes running from 0 to 70.]

Figure 3.7: Scalability results based on number of generators. Green is a linear reference where Red is the Speedup computed using parallel version of MPI-SCVT against a serial version

CHAPTER 4

NUMERICAL MODEL BACKGROUND

Climate models are broken into sub-component models. These dynamical cores model

the physics associated with various portions of the climate and are combined in intel-

ligent ways to create full climate models. Dynamical cores can be used to represent

the ocean, atmosphere, sea-ice, ice sheet, and other components of the climate. The

work contained in this dissertation relates specifically to ocean models.

Ocean models typically utilize equations that describe fluid dynamics with full

3-dimensional motion. These equations can be complicated to solve, and overly ex-

pensive for certain problems. Due to the cost of solving these equations, they can

be less than desirable for model development. For this reason, a simplification of

these equations is used as a starting point for ocean models, called the shallow-water

equations.

The shallow-water equations can be used to explore a less expensive system that

can still capture some of the key physical features in the full ocean system. To explore

the capabilities of their models, developers use a shallow-water model coupled with

test cases that showcase the model’s ability to simulate specific physical processes.

For example, [39] provides test cases suitable for modeling anything from advection, to non-linear geostrophic flow, to flow over an isolated mountain. Combining these test cases allows a developer to determine how well their numerical model performs in specific situations, and to benchmark the overall conservation properties of the numerical method. After the numerical method is explored in the shallow-water context, it can then be implemented in a full 3-dimensional system to simulate ocean processes. Once implemented, similar benchmarks can be performed, though they are significantly more expensive.

This chapter begins by giving a basic background into the shallow-water equations

and the numerical method used for the research contained in this portion of the

dissertation, followed by an introduction of several test cases which are used in the

shallow water system to benchmark the numerical method.

4.1 Shallow-Water Equations and Numerical

Method

The shallow-water equations are described as follows:

$$\frac{\partial h}{\partial t} + \nabla \cdot (h\mathbf{u}) = 0, \tag{4.1}$$

$$\frac{\partial \mathbf{u}}{\partial t} + \eta\,\mathbf{k} \times \mathbf{u} = -g\nabla(h + b) - \nabla K, \tag{4.2}$$

where $h$ represents the fluid layer thickness and $\mathbf{u}$ represents the fluid velocity along the surface of the sphere. The absolute vorticity, $\eta$, is defined as $\mathbf{k} \cdot (\nabla \times \mathbf{u}) + f$ and the kinetic energy, $K$, is defined as $|\mathbf{u}|^2/2$. At all points on the surface of the sphere the vector $\mathbf{k}$ points in the local vertical direction and we require $\mathbf{k} \cdot \mathbf{u} = 0$ at all points. The three parameters in the system are gravity, $g$, Coriolis parameter, $f$, and bottom topography, $b$.

When using the shallow-water equations four quantities are expected to be con-

served; these quantities are total mass, total energy, potential vorticity, and potential

enstrophy. All of these conservation properties are explored in the results section

using MPAS.

A more appropriate form of the continuous equations is expressed as:

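$$\frac{\partial h}{\partial t} + \nabla \cdot \mathbf{F} = 0, \tag{4.3}$$

$$\frac{\partial \mathbf{u}}{\partial t} + q\,\mathbf{F}^{\perp} = -g\nabla(h + b) - \nabla K, \tag{4.4}$$

where $\mathbf{F} = h\mathbf{u}$, $\mathbf{F}^{\perp} = \mathbf{k} \times h\mathbf{u}$, and $\eta = hq$, where $q$ is the total potential vorticity. Using the definition of potential vorticity, potential enstrophy is defined as the thickness-weighted variance of potential vorticity by $\frac{1}{2} q^2$.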
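The step from (4.1) and (4.2) to (4.3) and (4.4) uses only the definitions of the thickness flux and the relation between absolute and potential vorticity. Written out as a short check, not as a new result:

```latex
\eta\,\mathbf{k}\times\mathbf{u}
  = (hq)\,\mathbf{k}\times\mathbf{u}
  = q\,\left(\mathbf{k}\times h\mathbf{u}\right)
  = q\,\mathbf{F}^{\perp},
\qquad
\nabla\cdot(h\mathbf{u}) = \nabla\cdot\mathbf{F}.
```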

The numerical method used in this research to model the shallow-water system is

discussed at length in [24, 31]. An analysis of the linearized version of (4.1) and

(4.2) is conducted in [31] in order to derive a numerical method that is able to

reproduce stationary geostrophic modes found in the continuous system, even when

the numerical method is implemented on variable resolution meshes such as those

shown in Figure 3.5. [24] extends the analysis to the nonlinear shallow-water equations

shown in (4.3) and (4.4) in order to derive a method that conserves total energy and

potential vorticity while allowing for a physically-appropriate amount of potential

enstrophy dissipation.

The work in this dissertation and in [25] focuses on variable resolution meshes, as seen in Figure 4.1, whereas both [24, 31] present results for quasi-uniform meshes, even though the method is suitable for arbitrary Voronoi tessellations.

The numerical scheme is a standard finite-volume method that makes use of a

C-grid staggering as shown in Figure 4.2.

The discrete approximations of the divergence and gradient operator are shown

in Figure 3 of [24], and are used throughout this derivation.

The thickness field is defined on the Voronoi cells while all vorticity-related fields,

such as relative vorticity, absolute vorticity and potential vorticity, are defined on

the Delaunay triangles. Using a discrete approximation to the divergence operator,

a discrete thickness equation is derived. The equation for the normal-component

velocity is derived by taking the inner product of $\mathbf{n}_e$ (from Figure 4.2) and (4.4). The resulting discrete system is expressed as:

$$\frac{\partial h_i}{\partial t} = -\left[\nabla \cdot F_e\right]_i, \tag{4.5}$$

$$\frac{\partial u_e}{\partial t} + F_e^{\perp} q_e = -\left[\nabla\left(g(h_i + b_i) + K_i\right)\right]_e, \tag{4.6}$$

where $F_e = h_e u_e$ represents the mass flux across the edge of a Voronoi cell and $F_e^{\perp}$ represents the mass flux across the edge of each Delaunay cell. $K_i$, $h_e$, $q_e$ and $F_e^{\perp}$ are defined following [24]. Also following [24], the anticipated potential vorticity method [27] is used to dissipate potential enstrophy.

The derivations in [24, 31] provide a numerical method that conserves total energy

to within time-truncation error, conserves total potential vorticity to within machine

round-off error and dissipates potential enstrophy at a rate that depends on a single

parameter. As mentioned previously, this derivation was carried out for use on a

general Voronoi mesh.

In an effort to produce a framework suitable for the rapid prototyping and devel-

opment of dynamical cores, this numerical method has been implemented in a joint

effort between Los Alamos National Laboratory (LANL) and the National Center

for Atmospheric Research (NCAR). This new framework is called the Model for Pre-

diction Across Scales (MPAS). Currently LANL is using this framework to develop

ocean and shallow-water models, and NCAR is developing an atmospheric model.

For the purposes of this dissertation, the ocean and shallow-water models developed

at LANL are used within MPAS.

(a) Quasi-Uniform Grid (x1) (b) Variable Resolution Grid (x2) (c) Variable Resolution Grid (x4) (d) Variable Resolution Grid (x16)

Figure 4.1: Four members of a family of meshes constructed from (3.1). Each mesh uses 2562 grid points; the meshes differ only in the setting of the parameter γ. The x1, x2, x4 and x16 meshes are shown in the top-left, top-right, bottom-left and bottom-right, respectively.

Figure 4.2: C-grid staggering of variables for the finite-volume scheme used in MPAS. Fluid thickness, topography, and kinetic energy are stored at Voronoi cell centers. The normal component of the velocity field is defined at the mid-point of line segments connecting cell centers. Vorticity related fields such as relative, absolute, and potential vorticity are stored at Voronoi cell vertices. Derived fields, $h_e$, $q_e$, and $F_e^{\perp}$, must be reconstructed at each velocity point.

4.2 Shallow-Water Test Cases

The ocean modeling community tests their models using a variety of techniques.

One technique is to apply test problems that showcase various features present in the

ocean, and explore the errors associated with resolving these features. The test prob-

lems can involve anything from advection, to geostrophic flow, to Rossby-Haurwitz

waves. There are several test problems generally accepted by the community for the

testing of an ocean or shallow water dynamical core. One set of these test problems

can be found in [39]. This section describes two test cases defined in [39] that are used to benchmark the MPAS shallow-water dynamical core, along with the test problem defined in [11]. Some additional tests used to benchmark dynamical cores can be seen in [37, 38].

4.2.1 Non-linear Geostrophic Flow (TC2)

As defined in [39] this test case represents nonlinear geostrophic flow. Geostrophic

flow is an extremely important physical process that naturally occurs in the ocean

and atmosphere. It occurs when the nonlinear Coriolis force balances the horizontal

pressure gradient. This leads to the momentum equation becoming steady state,

taking the form of

$$f\,\mathbf{k} \times \mathbf{u} = -g\nabla(h + b). \tag{4.7}$$

The initial conditions for the geostrophic flow defined for this test case are given by

$$u = u_0\left(\cos(\lambda)\cos(\alpha) + \cos(\phi)\sin(\lambda)\sin(\alpha)\right), \tag{4.8}$$

$$v = -u_0 \sin(\phi)\sin(\alpha), \tag{4.9}$$

$$gh = gh_0 - \left(a\Omega u_0 + \frac{u_0^2}{2}\right)\left(-\cos(\phi)\cos(\lambda)\sin(\alpha) + \sin(\lambda)\cos(\alpha)\right)^2, \tag{4.10}$$

where $\lambda$ represents the latitude, $\phi$ represents the longitude, $a$ represents the radius of the earth, $\Omega$ represents the rotational rate of the earth, and $\alpha$ represents the angle between the axis of solid body rotation and the polar axis, which is taken to be 0.0 in the simulations presented in Chapter 5.

The velocity field described in (4.8) and (4.9) can also be written in a stream function form as

$$\psi = -a u_0\left(\sin(\lambda)\cos(\alpha) - \cos(\phi)\cos(\lambda)\sin(\alpha)\right), \tag{4.11}$$

$$\chi = 0. \tag{4.12}$$

With $\alpha = 0$, these stream functions provide only zonal (u direction) flow, with no flow in the meridional (v) direction. To define the initial flow field, the stream function (4.11) is sampled at Delaunay cell points $x_v$, and $u_e$ is computed as $\mathbf{k} \times \nabla\psi$. The thickness field is defined by sampling (4.10) at Voronoi cell points. Even though errors in $u_e$ are present at $t = 0$, this approach guarantees that the discrete divergence is identically zero at $t = 0$.
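The reason sampling a stream function at cell corners yields zero discrete divergence can be seen in a small planar sketch. It is not the spherical MPAS operator: a single flat hexagonal cell stands in for a Voronoi cell, its corners stand in for the Delaunay cell points, and all function names and the example stream function are hypothetical. The normal velocity on each edge is a difference of corner values of ψ, so the net flux around a closed cell telescopes to zero.

```python
import numpy as np

def edge_normal_velocities(vertices, psi):
    """Outward-normal velocity on each edge of a planar polygon.

    vertices : (n, 2) array of corner coordinates in counter-clockwise order
    psi      : callable stream function psi(x, y)
    """
    nxt = np.roll(np.arange(len(vertices)), -1)
    lengths = np.linalg.norm(vertices[nxt] - vertices, axis=1)
    psi_v = np.array([psi(x, y) for x, y in vertices])
    # For u = k x grad(psi), the outward-normal component along a
    # counter-clockwise edge equals minus the derivative of psi along the edge.
    u_n = -(psi_v[nxt] - psi_v) / lengths
    return u_n, lengths

def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    x, y = vertices[:, 0], vertices[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def discrete_divergence(u_n, lengths, area):
    """Finite-volume divergence: net outward flux divided by cell area."""
    return np.sum(u_n * lengths) / area

angles = np.arange(6) * np.pi / 3
hexagon = np.column_stack([np.cos(angles), np.sin(angles)])
u_n, lengths = edge_normal_velocities(hexagon, lambda x, y: np.sin(x) * y + x**2)
div = discrete_divergence(u_n, lengths, polygon_area(hexagon))
```

The edge velocities themselves are only approximations of the continuous field, yet `div` vanishes to round-off for any choice of ψ, which mirrors the statement about errors in the edge velocities and the discrete divergence.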

4.2.2 Zonal Flow Over an Isolated Mountain (TC5)

Shallow-water test case number 5, as defined in [39], represents zonal flow over an isolated mountain. This test case begins with geostrophic flow as described in Section 4.2.1; however, at the initial time step a mountain is added to the topography. The center of the mountain is placed at $\phi_c = \frac{3\pi}{2}$, $\lambda_c = \frac{\pi}{6}$, with a height described by

$$h_s = h_{s0}\left(1 - \frac{r}{R}\right), \tag{4.13}$$

where $h_{s0} = 2000$ m, $R = \pi/9$, and $r^2 = \min\left(R^2, (\phi - \phi_c)^2 + (\lambda - \lambda_c)^2\right)$.

The zonal flow interacts with the added mountain, causing gravity and Rossby waves to propagate as the flow adjusts to the presence of the topography. This interaction leads to strong nonlinearity, which makes this test case useful for exploring a numerical method’s conservation properties.
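As a concrete reading of (4.13), the mountain height can be evaluated at any longitude and latitude as sketched below. The function and argument names are hypothetical; the default values are the ones quoted for this test case, with angles in radians.

```python
import math

def mountain_height(lon, lat, hs0=2000.0, R=math.pi / 9,
                    lon_c=1.5 * math.pi, lat_c=math.pi / 6):
    """Surface height (m) of the TC5 mountain, following (4.13)."""
    # r is capped at R, so the height is exactly zero outside the mountain.
    r = math.sqrt(min(R**2, (lon - lon_c)**2 + (lat - lat_c)**2))
    return hs0 * (1.0 - r / R)

peak = mountain_height(1.5 * math.pi, math.pi / 6)
halfway = mountain_height(1.5 * math.pi + math.pi / 18, math.pi / 6)
outside = mountain_height(0.0, 0.0)
```

The height falls linearly from 2000 m at the center to zero at a distance of R, giving a cone in longitude-latitude coordinates.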

4.2.3 Barotropic Instability (BTI)

As defined in [11], this test case starts with a barotropically unstable zonal flow

that includes a simple perturbation added to induce the instability. The perturbation

first causes global gravity waves to propagate around the sphere within a few hours.

Secondly, it creates complex vortical dynamics which develop over a few days. This

test case requires initial conditions of the form

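$$u(\phi) = \begin{cases} 0 & \text{for } \phi \le \phi_0, \\ \dfrac{u_{max}}{e_n} \exp\left[\dfrac{1}{(\phi - \phi_0)(\phi - \phi_1)}\right] & \text{for } \phi_0 < \phi < \phi_1, \\ 0 & \text{for } \phi \ge \phi_1, \end{cases} \tag{4.14}$$

$$v(\phi, \lambda) = 0, \tag{4.15}$$

$$gh(\phi) = gh_0 - \int^{\phi} a\,u(\phi')\left[f + \frac{\tan(\phi')}{a}\,u(\phi')\right] d\phi', \tag{4.16}$$

where we take $u_{max} = 80$ m/s, $\phi_0 = \frac{\pi}{7}$, $\phi_1 = \frac{\pi}{2} - \phi_0$, and $e_n = \exp\left[\frac{-4}{(\phi_1 - \phi_0)^2}\right]$, and $a$ is the radius of the Earth. In this test case $\phi$ denotes latitude and $\lambda$ denotes longitude. In (4.16) $h_0$ is chosen such that the global average sea surface height is 10 km.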

(4.14) is then used to derive a stream function, which is sampled at Delaunay

cell locations as in TC2 in Section 4.2.1 to define the flow field. The height field is

generated based on (4.16). After the initial conditions are generated, a perturbation

is added to the height field that will drive the barotropic instability throughout the

system. This perturbation is defined as

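$$h'(\lambda, \phi) = \hat{h}\cos(\phi)\, e^{-(\lambda/\alpha)^2}\, e^{-\left[(\phi_2 - \phi)/\beta\right]^2} \quad \text{for } -\pi < \lambda < \pi, \tag{4.17}$$

where $\phi_2 = \frac{\pi}{4}$, $\alpha = \frac{1}{3}$, $\beta = \frac{1}{15}$, and $\hat{h} = 120$ m.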
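The jet profile (4.14) and the height perturbation (4.17) are simple enough to evaluate directly, as in the sketch below. The function and constant names are hypothetical, angles are in radians, and the balanced height field (4.16), which requires numerical integration, is not included.

```python
import math

U_MAX = 80.0                      # m/s
PHI0 = math.pi / 7
PHI1 = math.pi / 2 - PHI0
E_N = math.exp(-4.0 / (PHI1 - PHI0) ** 2)

def jet_velocity(lat):
    """Zonal velocity (m/s) of the unstable jet, following (4.14)."""
    if lat <= PHI0 or lat >= PHI1:
        return 0.0
    return U_MAX / E_N * math.exp(1.0 / ((lat - PHI0) * (lat - PHI1)))

def height_perturbation(lon, lat, h_hat=120.0, phi2=math.pi / 4,
                        alpha=1.0 / 3.0, beta=1.0 / 15.0):
    """Height perturbation (m) added to trigger the instability, following (4.17)."""
    return (h_hat * math.cos(lat)
            * math.exp(-(lon / alpha) ** 2)
            * math.exp(-((phi2 - lat) / beta) ** 2))

u_mid = jet_velocity(0.5 * (PHI0 + PHI1))
h_center = height_perturbation(0.0, math.pi / 4)
```

The normalization by e_n makes the jet reach exactly u_max at the mid-latitude between φ0 and φ1, which is a quick consistency check on the constants.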

CHAPTER 5

NUMERICAL MODEL RESULTS

5.1 Shallow-Water Model Setup

SCVTs are used, as described in Chapter 2, for the spatial discretizations used in the finite-volume scheme. These SCVTs are generated using (3.1) as the prescribed density function. In total, 25 different grids are generated and used in this work; however, only a subset of these is shown. Of the generated grids, 20 are variable resolution and 5 are quasi-uniform. The quasi-uniform grids have grid spacings and generator counts that can be found in Table 5.1.

Table 5.1: Table of grid sizes and spacings for quasi-uniform grids used in shallow-water exploration

Generators   Approx. Grid Spacing
2562         480 km
10242        240 km
40962        120 km
163842       60 km
655362       30 km

The variable resolution grids used have the same number of generators as their quasi-uniform counterparts, and differ only in γ in the density function (3.1). The values used for γ can be found in Table 5.2. In these grids, there are distinct fine and coarse regions that are connected by a smooth transition region. Based on the parameters used for the density function, the fine region has a radius of π/6 radians from the center of the mountain as defined in TC5, the transition region extends past the fine region another π/9 radians, and the coarse region makes up the remainder of the sphere. γ is varied to give specific factors for the grid spacing between the coarse and fine regions, as can be seen in (2.3). As an example, the x2 grid (seen in Table 5.2) has a factor of 2 between the grid spacings in the coarse and fine regions.

Table 5.2: Minimum values and grid spacing factors

Grid Name   Grid Spacing Factor   Minimum Value
x1          1.0                   1.0
x2          2.0                   0.062500000
x4          4.0                   0.003906250
x8          8.0                   0.000244141
x16         16.0                  0.000015259
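The minimum values in Table 5.2 match the inverse fourth power of the grid spacing factor. The sketch below reproduces the table under that assumption; the function name is hypothetical, and the fourth-power relation is inferred here from the tabulated values rather than quoted from (2.3).

```python
def minimum_density(spacing_factor):
    """Minimum density value for a requested coarse-to-fine spacing factor,
    assuming grid spacing scales with density to the power -1/4."""
    return spacing_factor ** -4

table_5_2 = {name: minimum_density(factor)
             for name, factor in [("x1", 1.0), ("x2", 2.0), ("x4", 4.0),
                                  ("x8", 8.0), ("x16", 16.0)]}
```

Under this relation, each doubling of the spacing factor lowers the minimum density value by a factor of 16.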

Holding the number of grid points constant and varying the density function used to create the grids has advantages and disadvantages. In terms of disadvantages, making one region finer requires the rest of the sphere to become coarser, so a x2 2562 grid has cells that are larger than any cell in a x1 2562 grid. As far as advantages go, the refinement provided in the variable resolution grids can provide an accuracy increase in specific regions, similar to limited-area modeling approaches. Table 5.3 shows the approximate resolutions for the fine and coarse mesh regions when using the described density function. An example of the grids can also be seen in Figure 4.1.

Table 5.3: Approximate mesh resolutions (km) of the fine-mesh (dxf) and coarse-mesh (dxc) regions of the global domain for the x1 through x16 meshes as a function of the number of grid points.

Grid Points   x1 (dxf, dxc)   x2 (dxf, dxc)   x4 (dxf, dxc)   x8 (dxf, dxc)   x16 (dxf, dxc)
2562          (480, 480)      (282, 537)      (196, 737)      (169, 1293)     (163, 2419)
10242         (240, 240)      (141, 169)      (98, 368)       (85, 648)       (81, 1222)
40962         (120, 120)      (70, 134)       (49, 184)       (42, 324)       (40, 611)
163842        (60, 60)        (35, 67)        (25, 92)        (21, 162)       (20, 305)
655362        (30, 30)        (16, 32)        (12, 48)        (10, 78)        (9, 148)

All grid points are generated using the bisection method as described in Section

2.4.2. Unique to x1, the grid points are also associated with the recursive bisection-

projection of an inscribed icosahedron [14]. This method results in a particularly

uniform distribution of grid points resulting in a relatively small solution error. This

special distribution of nodes is lost when producing the variable-resolution meshes.

As a result, a relatively large cost, in terms of global error, is incurred by choosing

to move away from the special quasi-uniform meshes, but very little additional cost

is incurred by increasing the extent of the mesh variation.

5.2 Shallow Water Test Case Results

Using the MPAS shallow-water dynamical core, three test cases are explored in

this Section, as defined in Sections 4.2.1, 4.2.2, and 4.2.3. The results in this Chapter

are also published as [25].

5.2.1 Shallow Water Test Case 5

The analysis of TC5 is presented first because it offers insight into the conserva-

tion properties of MPAS. TC5 contains a single mountain that is responsible for the

evolution of the system. While the mountain is large in scale, it is still localized and,

in that sense, is well suited for local mesh refinement. All of the meshes depicted in

Figure 4.1 and Table 5.3 enhance resolution in the vicinity of the defined mountain.

TC5 prescribes an analytic initial condition of large-scale geostrophic flow that

would be in steady state, if not for the presence of the mountain. This mountain

is centered at xc and extends π/9 radians in latitude and longitude. As described

in Chapter 3 the variable resolution meshes created are also centered at xc and the

fine-mesh region extends a distance of π/6 radians, meaning that the fine-mesh region

includes all of the mountain.

To begin, a qualitative assessment of TC5 is presented. Figure 5.1 shows the fluid

height field hi + bi at day 15 for the x1, x2, x4, and x16 meshes, with 40962 cells.

As depicted, all four simulations appear to be identical. This is expected because

the flow is characterized by large-scale Rossby waves that are well resolved on the

coarse-mesh resolutions of all of the 40962 meshes. In the x16 simulation result, the

coarse grid cells can clearly be seen.

Based on the numerical scheme two quantities are conserved to round-off error in

every simulation: the area-weighted global sum of thickness and the volume-weighted

potential vorticity. As found throughout the simulation,

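$$\frac{\partial}{\partial t} V = \frac{\partial}{\partial t} \sum_{i=1}^{N_i} h_i A_i = 0, \tag{5.1}$$

$$\frac{\partial}{\partial t} \sum_{v=1}^{N_v} q_v h_v A_v = 0, \tag{5.2}$$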

to within round-off error in all simulations, where the quantity V represents the total

fluid volume.

In order to evaluate the energetics of the system, the total energy is computed following [24, Eq. (70)] as

$$E = \sum_e A_e \left[\frac{h_e u_e^2}{2}\right] + \sum_i A_i \left[g h_i \left(\frac{1}{2} h_i + b_i\right)\right] - E_r, \tag{5.3}$$

where $E_r$ represents the unavailable potential energy, which has been subtracted, and has the form

$$E_r = \sum_i g H_i A_i \left[\frac{H_i}{2} + b_i\right], \tag{5.4}$$

where

$$H_i = \frac{\sum_i A_i (h_i + b_i)}{\sum_i A_i} - b_i. \tag{5.5}$$

From now on, “total energy” implies “total available energy”. $E_r$ represents the potential energy of the fluid at rest, which is unavailable to the system. Figure 5.2 demonstrates the conservation of the total energy in the simulations. The figure shows $\log_{10}\left(|E(t) - E(0)| / |E(0)|\right)$ over the 15 day integration for the x1, x2, x4, x8 and x16 meshes with 40962 grid points. At day 15, all solutions conserve total energy to within $1.0 \times 10^{-8}$ relative to the total energy present at $t = 0$. This finding is orders of magnitude better than is required when considering the dissipation mechanisms present in the real atmosphere and ocean [31].
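The diagnostic in (5.3) through (5.5) can be sketched with plain arrays, as below. This is an illustration under stated assumptions, not the MPAS diagnostic code: the function names, the flat 1-D array layout, the value of g, and the three-cell example are all made up for the sketch. For a fluid at rest with a flat free surface the available energy should vanish, which gives a simple check.

```python
import numpy as np

def available_energy(h_e, u_e, A_e, h_i, b_i, A_i, g=9.80616):
    """Total available energy following (5.3)-(5.5).

    Edge arrays (h_e, u_e, A_e) and cell arrays (h_i, b_i, A_i) are 1-D.
    The default g is an assumed value in m/s^2.
    """
    kinetic = np.sum(A_e * h_e * u_e**2 / 2.0)
    potential = np.sum(A_i * g * h_i * (0.5 * h_i + b_i))
    # Resting thickness: area-weighted mean surface height minus topography.
    H_i = np.sum(A_i * (h_i + b_i)) / np.sum(A_i) - b_i
    unavailable = np.sum(g * H_i * A_i * (H_i / 2.0 + b_i))
    return kinetic + potential - unavailable

def log10_relative_change(series):
    """log10(|X(t) - X(0)| / |X(0)|) for every sample after the first."""
    series = np.asarray(series, dtype=float)
    return np.log10(np.abs(series[1:] - series[0]) / np.abs(series[0]))

# Fluid at rest with a flat free surface over uneven topography (made-up values).
b_i = np.array([0.0, 100.0, 250.0])
h_i = 1000.0 - b_i
A_i = np.array([1.0, 2.0, 1.5])
E_rest = available_energy(np.ones(4), np.zeros(4), np.ones(4), h_i, b_i, A_i)
change = log10_relative_change([1.0, 1.0 + 1e-8])
```

The second helper is the quantity plotted on the vertical axis of the relative-change figures; a relative drift of 1e-8 maps to a value of about -8.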

Because the total energy is conserved in a physically appropriate manner, the nonlinear Coriolis force neither creates nor destroys kinetic energy, and the exchange of energy between its potential and kinetic forms is equal and opposite. The degree to which the nonlinear Coriolis force is energetically neutral is explored by computing the time it would take for the nonlinear Coriolis force to double the kinetic energy in the system. With 40962 grid points, the time required for the nonlinear Coriolis force to double the kinetic energy is approximately $10^4$ years for all meshes, which is in agreement with Figure 4 of [24].

The other important component in the total energy budget is the conservative exchange of energy between its potential and kinetic forms. The potential and kinetic energy equations each have a source term. These source terms are equal and opposite (see (15) and (16) of [24]). Following (65) and (67) from [24], the source terms for kinetic and potential energy, respectively, are explored. Since these RHS sources are algebraically equivalent in the discrete system, a very high degree of cancellation between the sources is expected. All 25 simulations show that the time scale for doubling the kinetic energy of the system due to the imperfect cancellation of the KE and PE source terms is approximately $10^{10}$ years. This is essentially machine precision round-off error.

With regard to conservation, the final quantity of interest is potential enstrophy. Figure 5.4 shows $\log_{10}\left(|R(t) - R(0)| / R(0)\right)$, where $R$ is the globally integrated potential enstrophy defined as

$$R = \frac{1}{V} \sum_{v=1}^{N_v} q_v^2 h_v A_v - R_r. \tag{5.6}$$

Potential enstrophy also has an unavailable reservoir that is equal to the amount of potential enstrophy that exists when the fluid is at rest. This unavailable reservoir, $R_r$, is removed from the computation in order to obtain a more representative evaluation of potential enstrophy conservation.

Figure 5.3 shows the globally averaged potential enstrophy as a function of time over a 15 day simulation for each of the x1, x2, x4, x8, and x16 meshes. Figure 5.4 shows the relative change in globally-averaged potential enstrophy for the x1, x2, x4, x8 and x16 meshes with 40962 nodes. At day 15, the relative changes in globally-averaged potential enstrophy vary between $10^{-4}$ and $10^{-2.5}$ for the x1 and x16 meshes, respectively. The x1 and x2 simulations show a monotonic decrease in globally-averaged potential enstrophy, while the x4, x8, and x16 simulations show a monotonic increase. A scale-aware Anticipated Potential Vorticity method would clearly help to reduce this discrepancy.

In terms of formal L2 global error norms, previous works using local mesh re-

finement with the shallow-water system all find that the solution error is relatively

unchanged when adding resolution in a specific region (e.g. [5, 29, 36]). This means

the solution error appears to be controlled by the coarse region of the mesh when us-

ing static mesh refinement. The global L2 error norm for each of the 25 simulations,

as a function of coarse-mesh resolution, is shown in Figure 5.5(b). Since TC5 does

not have a known analytic solution, error norms are computed with respect to a T511

global spectral model [30]. For TC5 at T511, the global spectral model requires a scale-selective $\nabla^4$ dissipation of $8.0 \times 10^{12}$ m$^4$/s in order to prevent the accumulation of energy and potential enstrophy at the grid scale.

Figure 5.5 shows the error norms for TC5. Figure 5.5(a) shows the normalized L2 error as a function of number of generators, and Figure 5.5(b) shows the error as a function of grid spacing in the coarse-mesh region. Based on Figure 5.5(b), the solution error appears to be controlled by the mesh resolution in the coarse region. All of the simulations show the same convergence rate of approximately 1.5. Note that these error norms are plotted on a log-log scale to emphasize the primary finding that the L2 error is controlled by the coarse-mesh resolution. Looking at the results more closely, it is apparent that the variable resolution meshes provide a small, but measurable, improvement in solution error.
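A convergence rate such as the one quoted here is the slope of the error curve on a log-log plot. The sketch below shows one way to compute an area-weighted normalized L2 error and a least-squares convergence rate. The function names are hypothetical, the normalization is an assumed common form rather than the exact one used for the figures, and the input values are synthetic, not results from this work.

```python
import numpy as np

def normalized_l2_error(h, h_ref, area):
    """Area-weighted L2 norm of (h - h_ref), normalized by the norm of h_ref."""
    return np.sqrt(np.sum(area * (h - h_ref)**2) / np.sum(area * h_ref**2))

def convergence_rate(spacing, error):
    """Least-squares slope of log(error) against log(spacing)."""
    slope, _ = np.polyfit(np.log(spacing), np.log(error), 1)
    return slope

# Synthetic errors that decay exactly as spacing**1.5 (illustration only).
spacing = np.array([480.0, 240.0, 120.0, 60.0, 30.0])
error = 1.0e-6 * spacing**1.5
rate = convergence_rate(spacing, error)

# Two-cell toy field to exercise the norm.
l2 = normalized_l2_error(np.array([1.0, 2.0]), np.array([1.0, 1.0]),
                         np.array([1.0, 1.0]))
```

Fitting against the coarse-mesh spacing rather than the generator count is what lets the different meshes be compared on a common axis.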

(a) x1 grid (b) x2 grid (c) x4 grid (d) x16 grid

Figure 5.1: The fluid height, $h_i + b_i$, at day 15 for TC5. Starting at the upper left and moving clockwise shows results from the X1, X2, X16 and X4 meshes using 40962 cells. The black oval denotes the location of the mountain. The figures are generated by filling each Voronoi cell with a single color, i.e. there is no interpolation due to rendering. This allows the coarse-mesh grid cells to be seen in the X4 and X16 simulations. All results are plotted with an identical color scheme with a maximum of 5975 m and a minimum of 5025 m.

[Figure 5.2 plot: Relative Change in Total Energy (logarithmic axis, 1e-13 to 1e-08) against Time (s), from 0 to 1.296e+06, for the x1, x2, x4, x8 and x16 meshes.]

Figure 5.2: Log10 of the relative change in available total energy for TC5 as a function of time for the x1, x2, x4, x8 and x16 meshes with 40962 grid points.

[Figure 5.3 panels: (a) x1, (b) x2, (c) x4, (d) x8, (e) x16. Each panel plots Potential Enstrophy against Time (s), from 0 to 1.296e+06.]

Figure 5.3: Globally averaged potential enstrophy as a function of time for x1, x2, x4, x8, and x16 meshes with 40962 grid points. Simulations are run for 15 days. Figures show decreasing potential enstrophy for x1 and x2 meshes, and increasing potential enstrophy for x4, x8, and x16 meshes.

[Figure 5.4 plot: Relative Change in Potential Enstrophy (logarithmic axis, 1e-08 to 0.01) against Time (s), from 0 to 1.296e+06, for the x1, x2, x4, x8 and x16 meshes.]

Figure 5.4: Log10 of the relative change in available potential enstrophy for TC5 as a function of time for the x1, x2, x4, x8 and x16 meshes with 40962 grid points.

[Figure 5.5 panels: (a) Normalized error as a function of number of generators, (b) Normalized error as a function of coarse-mesh grid spacing. Both panels plot Normalized L2 Error on logarithmic axes for the x1, x2, x4, x8 and x16 meshes; the horizontal axis of panel (b) is Coarse Grid Spacing (km).]

Figure 5.5: The L2 error of the thickness field at day 15 for TC5 shown for the x1, x2, x4, x8 and x16 meshes. Figure 5.5(a) shows errors as a function of number of generators, and Figure 5.5(b) shows errors as a function of coarse-mesh grid spacing. Error norms are computed against a T511 reference solution.

5.2.2 Shallow Water Test Case 2

Having confirmed the ability of the numerical model to simulate transient flows

in a robust manner with TC5, TC2 is now used to measure the method’s ability to

maintain large-scale geostrophic balance. Because TC2 is steady-state, any deviation

of the numerical solution from its initial condition is considered to be numerical error.

While TC5 offers a reason for mesh refinement, no comparable reason is present in TC2. The motivation for evaluating the variable resolution meshes using TC2 is not to demonstrate the approach’s utility, but rather to measure the cost of mesh refinement. Maintaining large-scale balance is an important property of any numerical model of the atmosphere or ocean. TC2 provides an environment to precisely measure, through the L2 error norm, the impact of mesh refinement on maintaining geostrophic balance.

Figure 5.6 plots the error norms for TC2. Figure 5.6(a) plots the normalized error norms as a function of number of generators, and Figure 5.6(b) shows the normalized L2 error as a function of the coarse-mesh grid spacing. As found with TC5, essentially all

of the variation in the L2 error in the simulations is controlled by the coarse resolution

grid spacing. For a given coarse resolution, solution error increases by approximately

a factor of 2 between the x2 and x16 meshes. However, the solution error for the x1

mesh is approximately a factor of 10 smaller, regardless of the coarse mesh resolution.

Unfortunately, the rate of convergence for TC2 does not appear to be uniform. Meshes with minimum grid resolutions above 100 km show a convergence rate of approximately 1.9 with respect to the coarse mesh resolution. As the minimum resolution of the mesh becomes smaller, the rate of convergence decreases. This reduction in convergence rate is likely caused by at least one of the following: deficiencies in the structure of the grids, deficiencies in the manner in which error norms are computed, and/or deficiencies in the numerical model. Currently none of these possibilities has been excluded, and there is ongoing research to determine the underlying cause of this issue. It is expected that the 2nd-order convergence rate would continue indefinitely as resolution is increased.

[Figure 5.6 panels: (a) Errors as a function of number of generators, (b) Errors as a function of coarse-mesh grid spacing. Both panels plot Normalized L2 Error on logarithmic axes for the x1, x2, x4, x8 and x16 meshes; the horizontal axis of panel (b) is Coarse Grid Spacing (km).]

Figure 5.6: The L2 error of the thickness field at day 12 for TC2 for the x1, x2, x4, x8 and x16 meshes. Figure 5.6(a) shows errors as a function of number of generators, and Figure 5.6(b) shows errors as a function of coarse-mesh grid spacing. Error norms are computed against the analytic initial conditions.

5.2.3 Barotropic Instability Test Case

The final test case explored in the shallow-water system is the growth of a barotropic

instability on a zonally-symmetric zonal jet [11].

Figure 5.7 shows the relative vorticity field at day 6 for the x1, x2, x4, x8 and

x16 meshes with 655362 cells. The fine-mesh region is coincident with the center of

each panel. In addition, the envelope of the growing barotropic instability is roughly

coincident with the fine mesh region at day 6, with parts of the wave system entering

and exiting the fine-mesh region at this point in time.

Test cases based on instabilities that grow on a zonally-symmetric base state

are particularly challenging for MPAS. The test case is zonally symmetric and the
instability is triggered by a small-amplitude perturbation; however, the SCVT meshes
used are not always zonally symmetric and, as a result, introduce some truncation error
that projects onto non-zero zonal wave numbers. This truncation error serves as an

additional trigger for the instability and can lead to wave growth that is either too

fast or not in the correct location. As the resolution is increased, the amplitude of the

spurious forcing by truncation error diminishes and the instability is solely controlled

by the perturbation contained in the initial conditions.

In addition, the growth of the unstable waves depends strongly on the type and

strength of the sub-grid scale closures that are either implicit in the underlying nu-

merical formulation or explicitly added to the numerical models. For example, the

x1 panel in Figure 5.7 agrees very closely with panel D in Figure 17 of [15], but is

significantly different from panel D in Figure 9 of [11]. This is because the simulations
presented here and in [15] do not use any explicit closure, whereas [11] uses

hyper-diffusion on the RHS of the momentum equation.

The strong correspondence of the x1 simulations with panel D in Figure 17 of

[15] indicates that the x1 simulation is broadly representative of the instability when

simulated in a system with minimal or no damping. The primary purpose here is


to understand how the use of variable resolution meshes alters the growth of the

barotropic instability.

First, focusing on the deep, tilted trough just right of center in each panel, along
with the ridge-trough-ridge system just upstream to the west, one finds that these
dominant features are present in all simulations with the same amplitude and phase.

The x2 simulation is qualitatively equivalent to the x1 simulation in all respects.

In addition, the x8 simulation is qualitatively equivalent to the x4 simulation in all

respects. The x4 simulation differs from the x2 simulation only along the edges of

the panels that correspond to the center of the coarse-mesh regions. The primary

difference between these two groups of simulations is that the x4/x8 simulations

produce an additional ridge in the upstream wave. The x16 simulation is qualitatively

different from the other simulations in all regions other than the fine-mesh region.

The x16 simulation produces relatively strong ridge-trough systems in the coarse-

mesh region that are not present in the other simulations. It is important to note

that the fine-mesh resolutions of the x8 and x16 simulations are essentially the same at

approximately 10 km, yet the coarse-mesh resolutions of these simulations differ

by a factor of two (as in Table 5.3). The x16/655362 simulation is more similar to

the x1/40962 simulation (not shown) than any of the other simulations with 655362

nodes. Since the coarse resolution of the x16/655362 simulation is comparable to

the x1/40962 simulation, this finding is consistent with Figures 5.5(b) and 5.6(b)

which demonstrate that the accuracy of the simulation is controlled primarily by the

resolution in the coarse-mesh region.


Figure 5.7: Each panel depicts the relative vorticity field at day 6 for a barotropically unstable jet using 655362 cells. The panels differ only in the mesh used in the simulation. The vertical extent of each panel covers the northern hemisphere. The horizontal extent covers all longitudes starting at -90 degrees such that the fine-mesh region is approximately centered on each panel. The color scales are identical for every panel and saturate at ±1.0 × 10^{-4}.

CHAPTER 6

ADAPTIVE MESH REFINEMENT BACKGROUND

This chapter describes the framework used for the exploration of Adaptive Mesh

Refinement (AMR) in the context of the shallow water equations using SCVT grids.

Results from the described AMR framework are presented in Chapter 7.

6.1 AMR Background

Adaptive Mesh Refinement is typically used to increase spatial accuracy without a
significant increase in computational cost. Ordinarily, one would have to increase the
global resolution of a mesh to increase the global accuracy; AMR instead uses output
data from simulations to generate new meshes that are better suited to that specific
simulation. Because AMR meshes apply local refinement around features of interest,
they are a type of multi-resolution mesh. As a simulation progresses, the meshes
generated typically track features of interest. As was seen in Section 5.2.1, error norms
associated with multi-resolution meshes appear to be controlled by the coarse mesh
resolution. Because of this, AMR SCVT meshes are used here only with the motivation
of reducing horizontal grid spacing in an area defined by simulation output. At a later
point in time, scale-aware parameterizations, which adapt to changes in grid spacing,
may reduce the error norms by providing more accurate simulations on these
multi-resolution AMR meshes.


The work in this portion of this dissertation follows two similar explorations [5, 29].

Both of these AMR approaches are implemented using cubed sphere grids that provide

two beneficial features for implementing AMR. First, each cell can be thought of as

the root of a quad tree allowing for local refinement simply by subdividing each cell

into four sub-cells. Second, since the cells do not move over time, de-refinement
is as simple as removing the sub-cells and replacing them with the previous “root” cell.
One disadvantage of locally refined cubed sphere grids is that they are non-conforming
meshes and, because of this, have hanging nodes. The hanging nodes require
interpolation schemes to remove spurious waves generated by reflection from the
artificial boundary.

Both [5, 29] use the absolute value of the relative vorticity field as the criterion for

cell refinement. In their explorations they both use several of the standard shallow-

water test cases found in [39] to test their AMR schemes. A common test between

both explorations is shallow-water test case number 5, as defined in Section 4.2.2.

This test case is useful in testing AMR schemes due to the evolution of the vortical

dynamics. As the vortices migrate around the sphere, the AMR scheme should adapt

the grid by refining and derefining cells to compensate for this movement.

6.2 SCVT-AMR Framework

A typical AMR framework for use on cubed sphere meshes with static grid points

has a general format similar to the pseudo-code given in Algorithm 1.

While Algorithm 1 is a typical AMR framework for use on structured grids with

static grid points, it does not have a direct translation for use on SCVT meshes.

As the focus of this dissertation is on multi-resolution meshes within SCVTs, a new

framework is developed for use within this context, and is described in Algorithm 2.

Currently, the tools required for the use of SCVTs in Algorithm 2 do not exist.
Particularly difficult is re-mapping data between two SCVT meshes. There are
methods of re-mapping data on SCVTs [16]; however, the implementations are not
mature enough to support this type of application. Due to this constraint, the
framework presented in this dissertation only incorporates one time step of a typical AMR
framework. This is done under the assumption that eventually the re-mapping tools
will become mature enough to be combined with this AMR scheme, and will allow a
comparison with previously implemented schemes which use Algorithm 1.

The AMR framework used for a single time step is given in Algorithm 3.

Following [5, 29] the relative vorticity field, ξ, is used to define both the point-

density field and the refinement criteria. After one AMR time step, in this case one

day, the output from MPAS’ shallow-water model is used to build a density field.

In order to create a density field, several steps are required. To begin, the absolute

value of the relative vorticity field, |ξ|, is cut off at a threshold comparable to [5, 29] of
|ξ| > 1.0 × 10^{-5}. This step ensures only cells with extreme values of relative vorticity
are refined, which ignores any relative vorticity in the mean flow. After the threshold
is applied, the remaining relative vorticity field is rescaled, where this scaling is defined
as

\[ \rho_i = \frac{|\xi_i| - \min_{j=1}^{N} |\xi_j|}{\max_{j=1}^{N} |\xi_j|} \left(\gamma^4 - 1.0\right) + 1.0 \tag{6.1} \]

where |ξ_i| represents the absolute value of the relative vorticity at cell i, N represents
the number of cells, ρ_i represents the density value at a cell, and γ represents an
arbitrary scaling.

Algorithm 1 General AMR Framework
  t = 0
  Initialize simulation
  while t < T do
    Iterate simulation for time of ∆t
    Refine mesh based on chosen criteria, e.g. relative vorticity, ξ
    Map data from previous mesh to refined mesh
  end while
  Compute error norms

Using the scaling defined in (6.1) and the cut off previously discussed, the density
field has a minimum value of 1, and a maximum value defined by γ^4. The minimum
grid spacing is then given as a factor of the coarse grid spacing, based on γ^4, as in

(2.3). The resulting density field obtained by cutting off and mapping the relative

vorticity field tends to have sharp gradients and can be very concentrated in certain

areas. These sharp gradients cause issues when refinement is applied, as neighboring

cells are allowed to differ by more than one level of refinement.
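The thresholding and rescaling in (6.1) can be sketched as follows. This is a minimal illustration, not the implementation used for the results: it assumes that cells at or below the threshold are assigned the minimum density of 1 and that the minimum and maximum in (6.1) are taken over the retained cells only. The function name and the sample values are hypothetical.

```python
def vorticity_to_density(xi, gamma, threshold=1.0e-5):
    """Map relative vorticity values to a point-density field following (6.1).

    Assumption: cells with |xi| <= threshold receive density 1, and the
    min/max in (6.1) are evaluated over the retained cells only.
    """
    abs_xi = [abs(v) for v in xi]
    kept = [v for v in abs_xi if v > threshold]
    if not kept:
        return [1.0] * len(abs_xi)
    lo, hi = min(kept), max(kept)
    scale = gamma ** 4 - 1.0
    return [(v - lo) / hi * scale + 1.0 if v > threshold else 1.0
            for v in abs_xi]

# Hypothetical three-cell example with gamma = 2 (maximum density near 2**4).
rho = vorticity_to_density([0.0, 2.0e-5, -4.0e-5], gamma=2.0)
```

Because the denominator in (6.1) is the maximum rather than the range, the largest density produced this way is slightly below γ^4 whenever the retained minimum is non-zero.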

In an attempt to remove these sharp gradients in the density field a Laplacian

smoothing operator can be applied an arbitrary number of times. Laplacian smooth-

ing will cause the density field to diffuse over the sphere, which will smooth out

gradients in the density field. The motivation for this technique is that grids will be
of higher quality if the density field has smooth gradients. The Laplacian smoothing
operator is defined as

\[ \rho_i^* = \frac{1}{2}\rho_i + \sum_{j=1}^{n} \left( \frac{1}{2n} \rho_j \right) \tag{6.2} \]

where n is the number of neighbors of cell i, the sum runs over those neighbors, and ρ
is the density defined at cell centers.

This Laplacian smoothing operator replaces a cell value with a weighted average
of its previous value and its neighbors' values. Because this operator smooths slowly,
it may have to be applied many times before a reasonably smooth density field is
obtained. Figure 6.1 illustrates this point with four plots. Presented are density fields
with no smoothings (Figure 6.1(a)), 16 smoothings (Figure 6.1(b)), 64 smoothings
(Figure 6.1(c)), and 128 smoothings (Figure 6.1(d)). 128 is used as the maximum
number of smoothings because, by that point, most of the features present in the fields
with 0 and 16 smoothings are no longer present. Also, in the 128 smoothings case,
refinement is applied within areas that do not appear to require it.
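A minimal sketch of repeated application of the smoothing operator (6.2) is given below. It assumes that all cells are updated simultaneously from the previous iterate and that the mesh connectivity is available as a list of neighbor indices per cell; both the function name and this data layout are hypothetical.

```python
def laplacian_smooth(rho, neighbors, iterations=1):
    """Apply the operator in (6.2) `iterations` times.

    rho       -- density value per cell
    neighbors -- neighbors[i] is the list of cells adjacent to cell i
    Assumption: every cell is updated from the previous iterate (Jacobi style).
    """
    rho = list(rho)
    for _ in range(iterations):
        rho = [0.5 * rho[i] + sum(rho[j] for j in nbrs) / (2.0 * len(nbrs))
               for i, nbrs in enumerate(neighbors)]
    return rho

# Hypothetical ring of four cells with a single sharp peak in cell 0.
ring = [[1, 3], [0, 2], [1, 3], [2, 0]]
smoothed = laplacian_smooth([16.0, 1.0, 1.0, 1.0], ring, iterations=1)
```

After one pass the peak has only halved, which illustrates why many applications may be needed before the field is reasonably smooth.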


After Laplacian smoothing is applied, the grid refinement levels are validated to

ensure only one level of refinement occurs across edges of Delaunay triangles. Al-

though this could add more points than are required based on the refinement criteria,

this technique is standard in the AMR techniques presented in [5, 29].
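The validation step described above can be sketched as a simple balancing pass. The sketch assumes an integer refinement level per Delaunay triangle and a list of edge-adjacent triangles for each triangle, and it assumes that violations are repaired by raising the coarser neighbor; these choices, and the function name, are hypothetical and only illustrate why extra points may be added.

```python
def balance_levels(levels, neighbors):
    """Raise levels until edge-adjacent triangles differ by at most one level.

    levels    -- integer refinement level per triangle
    neighbors -- neighbors[i] lists the triangles sharing an edge with i
    Assumption: a violation is repaired by refining the coarser triangle.
    """
    levels = list(levels)
    changed = True
    while changed:
        changed = False
        for i, nbrs in enumerate(neighbors):
            required = max(levels[j] for j in nbrs) - 1
            if levels[i] < required:
                levels[i] = required
                changed = True
    return levels

# Hypothetical chain of four triangles with one highly refined triangle.
balanced = balance_levels([3, 0, 0, 0], [[1], [0, 2], [1, 3], [2]])
```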

Cubed sphere AMR techniques maintain the coarse grid spacing of elements due

to the static grid points. In an attempt to retain this advantage refinement can be

used to intelligently add points to SCVTs. As was discussed previously, bisection

provides a refinement in horizontal grid spacing of roughly a factor of two. In order

to refine the reference mesh, Delaunay triangles are bisected based on the density

value given from re-mapping and smoothing the relative vorticity field. This refine-

ment procedure subdivides Delaunay triangles, adding points on edges, as well as

the interior of triangles. The number of points added depends on the density in the

triangle compared with the minimum density. One edge of a Delaunay triangle is

subdivided n times, where n is defined as

\[ n = \log_2\left(\rho_n^{1/4}\right) \tag{6.3} \]

where ρn is the density value associated with the Delaunay triangle.

The Delaunay triangle can be refined either when n > 1 or when |ξ| > α. Both [5, 29]
use |ξ| > α and set α = 2.0 × 10^{-5}. Because of the threshold on ξ, the implemented
method combines both of these criteria. Triangles are refined when n > 1; however,
n is only larger than 1 when |ξ| > α. As mentioned previously, refining triangles
based on n > 1 causes the coarse mesh resolution to be roughly preserved, based on
(2.3). Controlling γ from (6.1) allows control over the total number of points added
to a mesh. Although low values of γ produce grids with fewer total points, they also
produce grids with lower variances in horizontal grid spacing. Figure 6.2 shows three
triangles with varying levels of division based on the density value of the triangle.
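As a worked illustration of (6.3), the sketch below evaluates the formula for the three density values quoted in the caption of Figure 6.2. The function name is hypothetical, and the sketch returns the real-valued result of (6.3) without assuming how non-integer values are rounded.

```python
import math

def subdivision_level(rho):
    """Evaluate (6.3): n = log2(rho ** (1/4)) for a triangle density rho > 0."""
    if rho <= 0.0:
        raise ValueError("density must be positive")
    return math.log2(rho ** 0.25)

# Density values 1**4, 2**4 and 4**4, as in the caption of Figure 6.2.
levels = [subdivision_level(r) for r in (1.0, 16.0, 256.0)]
```

These values evaluate to 0, 1 and 2, matching the "no divisions", "one division" and "two divisions" described in that caption.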

After a grid has been refined and smoothed using a combination of (6.1), (6.2), and


(6.3), the grid is no longer an SCVT. An SCVT generator, as described in Chapter

2, can be used to converge the point set to an SCVT. Before an SCVT generator

can be used to iterate on a refined point set, a spatial density function needs to be

defined in order to evaluate the point-density of each generator as they move around

the mesh. There are several choices of spatial density functions, each with its own
drawbacks and benefits. The main difference among them is computational cost.
Some potential options for spatial density functions are piecewise constant, piecewise
linear, and pointwise constant density functions. While pointwise constant density
functions have significantly lower computational cost, they have
less dependence on the overlaying density function and can potentially move refined
regions out of the area of interest. Piecewise constant density functions are almost as
expensive as piecewise linear density functions; however, they do not smooth the final
point set over the mesh well, and end up with points clustered in reference Voronoi
cells. Because of the drawbacks of piecewise constant and pointwise constant density
functions, piecewise linear density functions are used.

Before the refined mesh is output, a reference mesh is output, which contains

points, density values associated with those points, and a triangulation of those points.

The piecewise linear density function is defined as a barycentric interpolation within

reference Delaunay triangles. In order to evaluate the density function at an arbitrary

point, first the Delaunay triangle which contains the point must be determined. This

is done through a combination of vector dot and cross products to check the orien-

tation with each edge of the Delaunay triangle. After the in-out test is completed,

the barycentric weights of the test point need to be computed. This computation is

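described as

\[ \alpha = \frac{\mathrm{Area}(B, C, P)}{\mathrm{Area}(A, B, C)} \tag{6.4} \]

\[ \beta = \frac{\mathrm{Area}(C, A, P)}{\mathrm{Area}(A, B, C)} \tag{6.5} \]

\[ \gamma = \frac{\mathrm{Area}(A, B, P)}{\mathrm{Area}(A, B, C)} \tag{6.6} \]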

where A, B, and C are the vertices of the triangle, given in counter-clockwise order,

P is the test point contained inside triangle ABC, and α, β, and γ are the barycentric

weights for P associated with A, B, and C respectively.

After the barycentric weights are computed, the density function is a simple map

defined as follows

\[ \rho_P = \rho_A \alpha + \rho_B \beta + \rho_C \gamma \tag{6.8} \]

where ρ is the density value, and α, β, and γ are the barycentric weights for point P .
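The evaluation in (6.4) through (6.8) can be sketched as follows. This is a minimal illustration under stated assumptions: triangle areas are computed for flat triangles from a cross product rather than on the sphere, the test point P is assumed to have already passed the in-out test, and all function names are hypothetical.

```python
import math

def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def triangle_area(a, b, c):
    """Area of the flat triangle abc (assumption: planar, not spherical)."""
    n = _cross(_sub(b, a), _sub(c, a))
    return 0.5 * math.sqrt(n[0] ** 2 + n[1] ** 2 + n[2] ** 2)

def barycentric_density(a, b, c, p, rho_a, rho_b, rho_c):
    """Piecewise linear density at p inside triangle abc, per (6.4)-(6.8)."""
    total = triangle_area(a, b, c)
    alpha = triangle_area(b, c, p) / total  # weight of vertex A, (6.4)
    beta = triangle_area(c, a, p) / total   # weight of vertex B, (6.5)
    gamma = triangle_area(a, b, p) / total  # weight of vertex C, (6.6)
    return rho_a * alpha + rho_b * beta + rho_c * gamma  # (6.8)

# Hypothetical triangle and interior test point.
A, B, C = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
rho_p = barycentric_density(A, B, C, (0.25, 0.25, 0.0), 1.0, 2.0, 4.0)
```

For a point inside the triangle the three weights sum to one, so the interpolated density reproduces the vertex values exactly at A, B, and C.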

The resulting framework provides a method of producing AMR-like grids that

maintain the specified coarse mesh resolution, while increasing resolution in areas

of interest based on reference simulations. Using this scheme, and some yet-to-be-

developed tools for re-mapping, a full AMR scheme can be implemented. Any of the

fields present in these simulations can be used for computing the density field and

refinement; relative vorticity is only used as a comparison to [5, 29].


Algorithm 2 Full AMR Framework for SCVT meshes
  t = 0
  Initialize simulation
  while t < T do
    Iterate simulation for time of ∆t
    Convert field of interest, ξ, into a point-density field
    If desired, smooth point-density field
    Refine mesh based on point-density field
    Converge refined mesh to an SCVT as described in Chapter 2
    Map data from previous SCVT to new SCVT
  end while
  Compute error norms

Algorithm 3 Single time step SCVT AMR Framework
  t = 0
  Initialize simulation
  Iterate simulation for time of ∆t
  Convert field of interest, ξ, into a point-density field
  If desired, smooth point-density field
  Refine mesh based on point-density field
  Converge refined mesh to an SCVT as described in Chapter 2
  Initialize simulation with newly converged SCVT mesh at t = 0
  Iterate simulation for time of ∆t
  Compute error norms
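The single time step framework can be sketched as a small driver that composes the individual steps. Every callable passed in below is a hypothetical stand-in for a component described in this chapter (the shallow-water model, the density mapping, the smoothing operator, the refinement procedure, the SCVT generator, and the error norm computation), and the sketch assumes, following the surrounding text, that the reference simulation is run once before refinement and once more on the new mesh.

```python
def single_step_amr(mesh, run_model, to_density, smooth, refine,
                    converge_scvt, error_norms, dt, n_smooth=0):
    """One AMR step on an SCVT; all callables are hypothetical placeholders."""
    state = run_model(mesh, dt)            # reference simulation for time dt
    rho = to_density(state)                # field of interest -> point density
    for _ in range(n_smooth):              # optional Laplacian smoothing
        rho = smooth(mesh, rho)
    points = refine(mesh, rho)             # add points based on the density
    new_mesh = converge_scvt(points, rho)  # converge point set to an SCVT
    new_state = run_model(new_mesh, dt)    # restart from t = 0 on new mesh
    return error_norms(new_state)
```

Passing the components in as arguments keeps the sketch independent of any particular model or grid generator interface.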


[Figure 6.1, panels: (a) No smoothing; (b) 16 smoothings; (c) 64 smoothings; (d) 128 smoothings]

Figure 6.1: Density field obtained after one simulation day using the relative vorticity field from shallow-water test case 5, on a x1 2562 generator grid, corresponding to the first four steps in Algorithm 3. Figure 6.1(a) has no smoothings applied, Figure 6.1(b) has 16 smoothings applied, Figure 6.1(c) has 64 smoothings applied, and Figure 6.1(d) has 128 smoothings applied. The smoothing operator is defined in (6.2). Red represents the minimum, whereas blue represents the maximum. To show transitions, color represents log2(ρ^{1/4}).

[Figure 6.2, panels: (a) n = 1; (b) n = 2; (c) n = 3]

Figure 6.2: Three triangles with subdivision based on density values. Figure 6.2(a) shows a triangle whose density value is 1^4, providing no divisions. Figure 6.2(b) shows a triangle whose density value is 2^4, providing one division. Figure 6.2(c) shows a triangle whose density value is 4^4, providing two divisions.

CHAPTER 7

ADAPTIVE MESH REFINEMENT RESULTS

To explore the potential of the AMR framework introduced in Chapter 6, results

are presented in this chapter. To begin, a 642 generator quasi-uniform mesh is used

as a reference mesh. After the 642 results are presented, a 2562 generator quasi-uniform
mesh is used to produce another set of AMR grids. These point sets are

chosen due to their relatively small number of points, and large grid spacing. All of

the results presented are computed using shallow-water test case number 5 involving

geostrophic flow over an isolated mountain, implemented inside MPAS. Test case 5

was previously defined in Section 4.2.2. Error norms are computed using a T511 high

resolution spectral element solution, as was used for the shallow-water test case 5

results in Chapter 5.

7.1 642 Point Suite

To begin, a suite of results is presented based on a 642 grid cell quasi-uniform grid
with roughly 960 km grid spacing. Using this quasi-uniform reference grid, shallow-water
test case 5, as defined in Section 4.2.2, is simulated for one day. After one

simulation day, the relative vorticity field is mapped into a density field over the

mesh using (6.1). Delaunay triangles are then refined based on their density values.

The maximum density value is constrained using γ = 4. This provides four levels


of refinement within a mesh, reducing the 960 km grid spacing to 120 km in areas

with extreme relative vorticity. Delaunay triangles are refined in the aforementioned

manner, maintaining a fixed coarse grid resolution.

Figure 7.1 shows four grids that make up the 642 grid cell suite. Figures 7.1(a),

7.1(b), 7.1(c), and 7.1(d) show 0, 16, 64, and 128 iterations of Laplacian smoothing

on the density fields respectively. Color in these figures represents cell area, with red

representing the smallest area, and purple representing the largest area.

[Figure 7.1, panels: (a) Unsmoothed; (b) 16 Smoothings; (c) 64 Smoothings; (d) 128 Smoothings]

Figure 7.1: AMR grids based on a 642 grid cell quasi-uniform grid. Color represents cell area, where red is the minimum area and purple is the maximum area. Presented are grids with 0, 16, 64, and 128 iterations of Laplacian smoothing applied.

Before exploring the results obtained from each simulation, reference data is shown

for a qualitative comparison. Figure 7.2 shows the thickness, potential vorticity, and

relative vorticity fields after one simulation day on the 642 quasi-uniform mesh. Again

the relative vorticity field shown in Figure 7.2(c) was scaled to be the density field

used to generate the meshes shown in Figure 7.1.

[Figure 7.2, panels: (a) Thickness; (b) Potential Vorticity; (c) Relative Vorticity]

Figure 7.2: Reference data fields for the 642 quasi-uniform mesh. Shallow-water test case 5 was simulated for 1 day. Plotted in Figure 7.2(a) is the fluid thickness field, in Figure 7.2(b) the potential vorticity field, and in Figure 7.2(c) the relative vorticity field.

The output fields from the four AMR grids after one simulation day using shallow-

water test case 5 are presented in Figures 7.3, 7.4, and 7.5. Figure 7.3 shows the

thickness fields for all four simulations, which appear qualitatively equivalent to the

reference simulation. Figure 7.4 shows the potential vorticity fields for all four
simulations, which also appear qualitatively equivalent to the reference simulation. Figure
7.5 shows the relative vorticity fields for all four simulations. The relative vorticity
fields appear to become noisier as the number of smoothings applied to the
mesh is increased.

[Figure 7.3, panels: (a) Unsmoothed; (b) 16 Smoothings; (c) 64 Smoothings; (d) 128 Smoothings]

Figure 7.3: Thickness fields from the 642 suite of AMR meshes. Figure 7.3(a) shows the thickness field from an unsmoothed AMR mesh. Figure 7.3(b) shows the thickness field from a mesh with 16 smoothings. Figure 7.3(c) shows the thickness field from a mesh with 64 smoothings. Figure 7.3(d) shows the thickness field from a mesh with 128 smoothings.

As was seen in Section 5.2.1, all conservation properties still hold on these AMR
meshes. In order to explore the effect each AMR mesh has on the simulation, error
norms are computed as was done in Section 5.2.1. Table 7.1 lists the L2 and L∞
norms of the error in the thickness field of each AMR simulation at the end of one
simulation day.

Table 7.1: Error norms associated with the suite of AMR meshes based on the 642 grid point reference mesh. Presented are L2 and L∞ norms of the error in the thickness field, compared to a T511 reference simulation.

Grid              L2              L∞              % Irregular Cells
Reference         7.45 × 10^-4    4.83 × 10^-3    1.86%
Unsmoothed        9.00 × 10^-4    8.67 × 10^-3    7.56%
16 Smoothings     8.16 × 10^-4    6.04 × 10^-3    9.36%
64 Smoothings     9.58 × 10^-4    5.55 × 10^-3    10.83%
128 Smoothings    1.45 × 10^-3    8.93 × 10^-3    7.73%
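Norms of the kind reported in Table 7.1 could be evaluated along the following lines. This is a minimal sketch assuming area-weighted, normalized definitions of the sort commonly used with the standard shallow-water test cases, and assuming the reference solution has already been interpolated to the cell centers; the exact definitions used in Section 5.2.1 are not restated here, and the function name and sample values are hypothetical.

```python
import math

def normalized_error_norms(h, h_ref, area):
    """Normalized L2 and L-infinity errors of h against a reference h_ref.

    Assumption: L2 is area weighted and both norms are normalized by the
    corresponding norm of the reference field.
    """
    diff_sq = sum(a * (x - r) ** 2 for x, r, a in zip(h, h_ref, area))
    ref_sq = sum(a * r ** 2 for r, a in zip(h_ref, area))
    l2 = math.sqrt(diff_sq) / math.sqrt(ref_sq)
    linf = (max(abs(x - r) for x, r in zip(h, h_ref))
            / max(abs(r) for r in h_ref))
    return l2, linf

# Hypothetical two-cell example with equal cell areas.
l2, linf = normalized_error_norms([3.0, 9.0], [3.0, 4.0], [1.0, 1.0])
```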

As can be seen in Table 7.1, all AMR meshes have slightly higher error than the
reference mesh. Also, applying more iterations of Laplacian smoothing does not
appear to improve the error norms much, if at all. Most of the error in these meshes comes
from the addition of pentagons and heptagons, or Voronoi cells with five and seven
sides respectively. These irregular cells cause distortion in the area of the mesh
surrounding them. Although the reference mesh has the minimum of 12 pentagons, each
of the other four meshes has additional, unneeded pentagons, each paired with an
additional heptagon. Although it is an incredibly difficult problem, removing these extra
pentagons and heptagons may potentially improve the error norms, and help this AMR
scheme be more useful.



Figure 7.4: Potential vorticity fields from the 642 suite of AMR meshes. Figure 7.4(a) shows the potential vorticity field from an unsmoothed AMR mesh. Figure 7.4(b) shows the potential vorticity field from a mesh with 16 smoothings. Figure 7.4(c) shows the potential vorticity field from a mesh with 64 smoothings. Figure 7.4(d) shows the potential vorticity field from a mesh with 128 smoothings.



Figure 7.5: Relative vorticity fields from the 642 suite of AMR meshes. Figure 7.5(a) shows the relative vorticity field from an unsmoothed AMR mesh. Figure 7.5(b) shows the relative vorticity field from a mesh with 16 smoothings. Figure 7.5(c) shows the relative vorticity field from a mesh with 64 smoothings. Figure 7.5(d) shows the relative vorticity field from a mesh with 128 smoothings.

7.2 2562 Point Suite

As listed in Table 5.1, a 2562 generator quasi-uniform SCVT has a grid spacing
of roughly 480 km. Shallow-water test case 5 is simulated for one day on the
quasi-uniform mesh, after which the relative vorticity field, ξ, is extracted and converted
into a density field using (6.1). As was the case for the 642 grid point suite of meshes,
γ = 4 is chosen to give four levels of refinement between the grid resolutions of the
finest and coarsest grid regions as defined in (2.3). The factor 4 is chosen because it
allows a large variation in grid spacing between the finest and coarsest grid regions
without adding a significant number of points.

Figure 7.6 shows the four meshes created as part of the 2562 grid point suite

of AMR meshes. The four meshes presented consist of varying levels of Laplacian

smoothing, with 0, 16, 64, and 128 applications of Laplacian smoothing. Each of

these four meshes is colored by cell area, where red represents the smallest value, and

purple represents the largest value.

Figure 7.7 plots the thickness, relative vorticity, and potential vorticity fields on

the quasi-uniform 2562 generator mesh after one simulation day. Figure 7.7 is pro-

vided as a reference for data presented on AMR grids based on the quasi-uniform

2562 generator mesh.

Figures 7.8, 7.9, and 7.10 show the thickness, potential vorticity, and relative

vorticity fields for all four AMR simulations based on the 2562 grid point reference

mesh. All simulations are run for one simulation day using shallow-water test case 5.

As was seen in Section 7.1, an increase in the number of times Laplacian smoothing

is applied to the mesh appears to increase the overall noise in the simulation. While

the thickness and potential vorticity fields appear qualitatively similar to the reference

simulation, the relative vorticity field appears to have significantly more noise. In

order to determine the effect of this noise on the final simulations, the error norms

are presented relative to a T511 reference simulation in Table 7.2.


Table 7.2 shows the error norms from all 5 simulations to be essentially equivalent
in the L2 norm, and not significantly different in the L∞ norm. As was the
case in Section 7.1, all of the AMR grids have more pentagons and heptagons than
the reference mesh, and removing these may improve the error norm for the AMR
simulations. Also, the grids presented as part of these results have distinct boundaries
between levels of refinement in the final SCVT, which is evident from Figure
7.6. Smoothing out this boundary in the final density function may also improve the
simulation results.

In contrast to the results presented in Table 7.1, Table 7.2 shows a slight decrease
in both error norms (excluding the 16 smoothings case) as more smoothings are
applied. As the number of points in the reference mesh increases, the resulting density
function can capture more of the dynamics of the relative vorticity field, allowing for
grids with smoother transition regions. This trend may continue to higher resolution
reference grids.

Table 7.2: Error norms for AMR grids based on the 2562 grid point reference mesh. L2 and L∞ norms are computed with the thickness field relative to a T511 simulation.

Grid              L2              L∞              % Irregular Cells
Reference         3.32 × 10^-4    2.95 × 10^-3    0.046%
Unsmoothed        4.97 × 10^-4    5.91 × 10^-3    4.40%
16 Smoothings     4.35 × 10^-4    8.33 × 10^-3    4.52%
64 Smoothings     4.22 × 10^-4    5.63 × 10^-3    5.34%
128 Smoothings    4.20 × 10^-4    5.32 × 10^-3    5.31%

[Figure 7.6, panels: (a) Unsmoothed; (b) 16 Smoothings; (c) 64 Smoothings; (d) 128 Smoothings]

Figure 7.6: AMR grids based on a 2562 grid cell quasi-uniform grid. Color represents cell area, where red is the minimum area and purple is the maximum area. Presented are grids with 0, 16, 64, and 128 iterations of Laplacian smoothing applied.

[Figure 7.7, panels: (a) Thickness; (b) Potential Vorticity; (c) Relative Vorticity]

Figure 7.7: Reference data fields for the 2562 quasi-uniform mesh. Shallow-water test case 5 was simulated for 1 day. Plotted in Figure 7.7(a) is the fluid thickness field, in Figure 7.7(b) the potential vorticity field, and in Figure 7.7(c) the relative vorticity field.



Figure 7.8: Thickness fields from the 2562 suite of AMR meshes. Figure 7.8(a) shows the thickness field from an unsmoothed AMR mesh. Figure 7.8(b) shows the thickness field from a mesh with 16 smoothings. Figure 7.8(c) shows the thickness field from a mesh with 64 smoothings. Figure 7.8(d) shows the thickness field from a mesh with 128 smoothings.



Figure 7.9: Potential vorticity fields from the 2562 suite of AMR meshes. Figure 7.9(a) shows the potential vorticity field from an unsmoothed AMR mesh. Figure 7.9(b) shows the potential vorticity field from a mesh with 16 smoothings. Figure 7.9(c) shows the potential vorticity field from a mesh with 64 smoothings. Figure 7.9(d) shows the potential vorticity field from a mesh with 128 smoothings.



Figure 7.10: Relative vorticity fields from the 2562 suite of AMR meshes. Figure 7.10(a) shows the relative vorticity field from an unsmoothed AMR mesh. Figure 7.10(b) shows the relative vorticity field from a mesh with 16 smoothings. Figure 7.10(c) shows the relative vorticity field from a mesh with 64 smoothings. Figure 7.10(d) shows the relative vorticity field from a mesh with 128 smoothings.

CHAPTER 8

DISCUSSION

Presented in this dissertation was a broad scope of research, including parallel
spherical grid generation, shallow-water simulations, and adaptive mesh refinement.
Beginning with Chapters 2 and 3, a new algorithm was presented allowing the parallel
computation of spherical Delaunay triangulations and spherical centroidal Voronoi
tessellations. This new algorithm combines existing techniques in computational
geometry to manipulate the point set, allowing spherical triangulations to be computed
in the simpler geometry of the plane. Existing spherical triangulation algorithms
are not easily parallelizable, and tend to scale poorly with the number of
points. The new algorithm, MPI-SCVT, was compared against a well-known
spherical triangulation algorithm, STRIPACK, and showed a respectable speed
up. As MPI-SCVT has two algorithms used to compute the triangulation, two values
of speed up are presented. A more direct comparison to STRIPACK uses the full
triangulation of all points in the set, where MPI-SCVT achieved a speed up
of roughly 37 when using 42 processors. For a comparison more applicable to SCVT
generators, the regional triangulation from MPI-SCVT achieved a speed up of 4096
over STRIPACK when using 42 processors.

MPI-SCVT was then used to generate variable resolution meshes for
a new shallow-water model from Los Alamos National Laboratory and the National
Center for Atmospheric Research. This new model, known as MPAS, provides
a numerical method that is capable of simulating flow on arbitrary Voronoi meshes.
The shallow-water model was explored using a suite of multi-resolution meshes to
determine the effect of variable resolution meshes on the numerical method. It was
shown that MPAS conserves mass, total energy, and potential vorticity, while
dissipating potential enstrophy, when using variable resolution meshes. These results
are presented using shallow-water test case 5 from [39]. Also presented are L2 error
norms of the fluid thickness field for shallow-water test cases 2 and 5. Test case 5
shows that the coarse grid resolution controls the global error norm, which can also be
seen in previous results [5, 29, 36]. Test case 2 shows that the numerical method currently
has an issue with its rate of convergence at high resolutions, which is a focus of
ongoing research. The shallow-water model is also explored using a barotropic instability
test case as defined in [11]. The barotropic instability provides a system suitable for
exploring the propagation of waves through mesh transition regions. In general,
moderately varying meshes had a minor effect on the overall simulation; only highly
varying meshes showed a significant change in results.
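
The L2 error norms mentioned above can be illustrated with a minimal sketch. It assumes the normalized, area-weighted form used with the standard shallow-water test set [39], and it assumes that the computed and reference thickness fields have already been interpolated to the same cells. The function name `l2_error_norm` and the sample values are hypothetical and are not taken from the dissertation's code or results.

```python
import math


def l2_error_norm(h, h_ref, areas):
    """Normalized, area-weighted L2 error of thickness h against h_ref.

    Computes sqrt(sum(a * (h - h_ref)^2) / sum(a * h_ref^2)), where a is the
    cell area. All three sequences must describe the same cells.
    """
    if not (len(h) == len(h_ref) == len(areas)):
        raise ValueError("h, h_ref, and areas must have the same length")
    numerator = sum(a * (x - r) ** 2 for x, r, a in zip(h, h_ref, areas))
    denominator = sum(a * r ** 2 for r, a in zip(h_ref, areas))
    return math.sqrt(numerator / denominator)


# Hypothetical three-cell example with unequal cell areas.
areas = [1.0, 2.0, 1.0]
h_ref = [1000.0, 1000.0, 1000.0]
h = [1001.0, 1000.0, 999.0]
print(l2_error_norm(h, h_ref, areas))
```

Weighting by cell area matters on variable resolution meshes, because otherwise the many small cells in a refined region would dominate the norm.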

Finally, an adaptive mesh refinement (AMR) framework was presented for
simulations using spherical centroidal Voronoi tessellations. Previous efforts to perform
AMR on spherical grids have required structured grids, which provide the benefit of
having fixed grid points. These fixed grid points allow cells to be coarsened and
refined at will, whereas Voronoi tessellations do not offer this ability. This new
framework needs to be coupled with tools for remapping data on SCVTs; however,
these tools have yet to be developed. Using this framework, two reference grid sizes
are used for a suite of 8 AMR simulations. All simulations use shallow-water test case
5 from [39] because of its dynamic behavior. Quasi-uniform grids containing either
2562 or 642 generators were used as reference grids. Reference grids were used
to simulate test case 5 for one day, after which the output was used to refine, smooth,
and generate density fields for new grids. Using this newly refined point set and
density field, SCVTs are generated and simulated for one day again. L2 error norms
of the thickness field are presented relative to the T511 reference simulation used for
the multi-resolution shallow-water simulations.
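
The refine, smooth, and generate-density step described above can be sketched as follows. This is only an illustration under stated assumptions, not the framework's actual implementation: the per-cell refinement flags, the neighbor-averaging smoother, and the names `smooth` and `density_from_flags` are all hypothetical. It also assumes the commonly used relation for centroidal Voronoi tessellations on a surface, in which local grid spacing scales with density to the power -1/4, so that halving the spacing requires a density 16 times larger.

```python
def smooth(values, neighbors, passes=1):
    """Average each cell's value with its neighbors, repeated `passes` times."""
    for _ in range(passes):
        values = [
            (values[i] + sum(values[j] for j in neighbors[i]))
            / (1 + len(neighbors[i]))
            for i in range(len(values))
        ]
    return values


def density_from_flags(flags, neighbors, spacing_ratio=2.0, passes=1):
    """Build a smoothed density field from per-cell refinement flags.

    Flagged cells request a grid spacing `spacing_ratio` times smaller than
    unflagged cells, i.e. a density `spacing_ratio ** 4` times larger
    (assumed fourth-power relation between density and spacing).
    """
    raw = [spacing_ratio ** 4 if flagged else 1.0 for flagged in flags]
    return smooth(raw, neighbors, passes)


# Hypothetical ring of six cells; only cell 2 is flagged for refinement.
neighbors = [[5, 1], [0, 2], [1, 3], [2, 4], [3, 5], [4, 0]]
flags = [False, False, True, False, False, False]
density = density_from_flags(flags, neighbors)
print(density)
```

Smoothing spreads the high-density region over neighboring cells, which is one way to avoid an abrupt mesh transition; the number of smoothing passes would be a tuning choice.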

These 8 AMR simulations help to solidify the finding that the coarse mesh
resolution controls the global error. Although some noise appears to be introduced in
the simulation by irregular Voronoi cells, the global error norm remains close to
the reference global error norm. Future work aimed at reducing the number of
defects in an SCVT mesh may reduce the global error. Although these AMR
simulations do not appear to have the same benefit that traditional AMR simulations do,
they provide another method of generating multi-resolution meshes suitable for specific
simulations. Future work combining this technique into a full AMR framework and
creating grids with less local distortion may prove to be a useful endeavor.

8.1 Future Work

As this dissertation covers a broad range of topics, there are many opportunities
for future work. With regard to the parallel grid generator, future work
might include:

• Parallel Limited Area Meshing

• Parallel Planar Meshing

Future work with regard to MPAS might include:

• Variable Resolution Meshes Simulated in MPAS Ocean Model

• Exploration of MPAS Shallow-water Model in more Test Cases

Future work involving AMR on SCVT meshes might include:

• Reduction of Pentagon-Septagon pairs in SCVT Meshes

• Coupling of Re-mapping Tools With Described AMR Framework

• Full AMR Simulations

BIBLIOGRAPHY

[1] A. Arakawa. Computational design for long-term numerical integration of the equations of fluid motion: Two-dimensional incompressible flow. Journal of Computational Physics, 1:119–143, 1966.

[2] G. Boccaletti, R. Ferrari, and B. Fox-Kemper. Mixed layer instabilities and restratification. Journal of Physical Oceanography, 37:2228–2250, 2007.

[3] P.L. Bowers, W.E. Diets, and S.L. Keeling. Fast algorithms for generating Delaunay interpolation elements for domain decomposition. Unpublished: http://www.math.fsu.edu/ aluffi/archive/paper77.ps.gz, 1998.

[4] J. Campin, C. Hill, H. Jones, and J. Marshall. Super-parameterization in ocean modelling: Application to deep convection. Ocean Modelling, 2010. Revised.

[5] C. Chen, F. Xian, and X. Li. An adaptive multimoment global model on a cubed sphere. Monthly Weather Review, 139:523–548, 2011.

[6] P. Cignoni, C. Montani, and R. Scopigno. DeWall: A fast divide and conquer Delaunay triangulation algorithm in E^d. Computer-Aided Design, 30:333–341, 1998.

[7] Q. Du, M. Emelianenko, and L. Ju. Convergence of the Lloyd algorithm for computing centroidal Voronoi tessellations. SIAM Journal on Numerical Analysis, 44:102–119, 2006.

[8] Q. Du, V. Faber, and M. Gunzburger. Centroidal Voronoi tessellations: applications and algorithms. SIAM Review, 41:637–676, 1999.

[9] M. Fox-Rabinovitz et al. Variable resolution general circulation models: Stretched-grid model intercomparison project (SGMIP). Journal of Geophysical Research, 111, 2006.

[10] M. Fox-Rabinovitz, G. Stenchikov, M. Suarez, and L. Takacs. A finite-difference GCM dynamical core with a variable-resolution stretched grid. Monthly Weather Review, 125:2943–2968, 1997.

[11] J. Galewsky. An initial-value problem for testing numerical models of the global shallow-water equations. Tellus, 56A:429–440, 2004.

[12] M. Giorgi and L. Mearns. Approaches to the simulation of regional climate change: a review. Reviews of Geophysics, 29:191–216, 1991.

[13] W. Grabowski. Coupling cloud processes with the large-scale dynamics using the cloud-resolving convection parameterization. Journal of Atmospheric Sciences, 58:978–997, 2010.

[14] R. Heikes and D. Randall. Numerical integration of the shallow-water equations on a twisted icosahedral grid. Part I: Basic design and results of tests. Monthly Weather Review, 123:1862–1880, 1995.

[15] S. Ii and F. Xiao. A global shallow water model using high order multi-moment constrained finite volume method and icosahedral grid. Journal of Computational Physics, 229:1774–1796, 2010.

[16] P.W. Jones. First- and second-order conservative remapping schemes for grids in spherical coordinates. Monthly Weather Review, 127:2204–2210, 1998.

[17] S.P. Lloyd. Least squares quantization in PCM. IEEE Transactions on Information Theory, 28:129–137, 1982.

[18] J.L. McClean et al. A prototype two-decade fully-coupled fine-resolution CCSM simulation. Ocean Modelling, 2011. In Press.

[19] J.L. McGregor. Regional climate modelling. Meteorology and Atmospheric Physics, 63:105–117, 1997.

[20] N. Metropolis and S. Ulam. The Monte Carlo method. Journal of the American Statistical Association, 44:335–341, 1949.

[21] A. Okabe, B. Boots, K. Sugihara, and S.N. Chiu. Spatial Tessellations - Concepts and Applications of Voronoi Diagrams. John Wiley, second edition, 2000.

[22] D. Randall and S. Bony. Climate models and their evaluation. IPCC WG1 Fourth Assessment Report, 2007.

[23] R.J. Renka. Algorithm 772: STRIPACK: Delaunay triangulation and Voronoi diagram on the surface of a sphere. ACM Transactions on Mathematical Software, 23:416–434, 1997.

[24] T. Ringler et al. A unified approach to energy conservation and potential vorticity dynamics for arbitrarily-structured C-grids. Journal of Computational Physics, 229:3065–3090, 2010.

[25] T.D. Ringler et al. Exploring a multi-resolution modeling approach within the shallow-water equations. Monthly Weather Review, 2011. Accepted.

[26] A. Saalfeld. Delaunay triangulations and stereographic projections. Cartography and Geographic Information Science, 26:289–296, 1999.

[27] R. Sadourny and C. Basdevant. Parameterization of subgrid scale barotropic and baroclinic eddies in quasi-geostrophic models: Anticipated potential vorticity method. Journal of the Atmospheric Sciences, 42:1353–1363, 1985.

[28] J.R. Shewchuk. Triangle: Engineering a 2D quality mesh generator and Delaunay triangulator. Applied Computational Geometry: Towards Geometric Engineering, 1148:203–222, 1996.

[29] A. St-Cyr, C. Jablonowski, J.M. Dennis, H. Tofu, and S.J. Thomas. A comparison of two shallow-water models with nonconforming adaptive grids. Monthly Weather Review, 136:1898–1922, 2008.

[30] P.N. Swarztrauber. Spectral transform methods for solving the shallow water equations on the sphere. Monthly Weather Review, 124:730–744, 1996.

[31] J. Thuburn, T. Ringler, W. Skamarock, and J. Klemp. Numerical representation of geostrophic modes on arbitrarily structured C-grids. Journal of Computational Physics, 228:8321–8335, 2009.

[32] H. Tomita et al. A global cloud-resolving simulation: Preliminary results from an aqua planet experiment. Geophysical Research Letters, 32, 2005.

[33] G. Voronoi. Nouvelles applications des parametres continus a la theorie des formes quadratiques. J. Reine Angew. Math., 134:198–287, 1908.

[34] Y. Wang, L.R. Leung, J.L. McGregor, D.K. Lee, W.C. Wang, Y. Ding, and F. Kimura. Regional climate modeling: progress, challenges, and prospects. Journal of the Meteorological Society of Japan, 82:1599–1628, 2010.

[35] H. Weller and H. Weller. A high-order arbitrarily unstructured finite-volume model of the global atmosphere: Tests solving the shallow-water equations. International Journal for Numerical Methods in Fluids, 56:1589–1596, 2008.

[36] H. Weller, H.G. Weller, and A. Fournier. Voronoi, Delaunay, and block-structured mesh refinement for solution of the shallow-water equations on the sphere. Monthly Weather Review, 137:4208–4224, 2009.

[37] D.L. Williamson. Climate simulations with a spectral, semi-Lagrangian model with linear grids. Numerical Methods in Atmospheric and Oceanic Modelling. The André J. Robert Memorial Volume, pages 279–292, 1997.

[38] D.L. Williamson. The evolution of dynamical cores for global atmospheric models. Journal of the Meteorological Society of Japan, 85B:241–269, 2007.

[39] D.L. Williamson et al. A standard test set for numerical approximations to the shallow water equations in spherical geometry. Journal of Computational Physics, 102:211–224, 1992.

BIOGRAPHICAL SKETCH

Douglas Jacobsen performed both his Ph.D. and M.S. work under the advisement of
Prof. Max Gunzburger at Florida State University in the Department of Scientific
Computing. He entered the program in the fall of 2007 after finishing his bachelor’s
degree in Computational Physics at Oregon State University, where he studied the
effect of material defects on hysteresis curves in ferromagnetic materials.

Douglas’ master’s research included exploring the effect of vertical grid structures
and mixing strategies on North Atlantic overflow simulations.

Douglas’ current research interests include high performance computing (including
GPGPU computing), spherical grid generation specifically related to SCVTs, and
geophysical fluid dynamics.
