Center for Uncertainty Quantification

Likelihood Approximation with Hierarchical Matrices for Large Spatial Datasets

A. Litvinenko, Y. Sun, M. Genton, D. Keyes, CEMSE, KAUST

HIERARCHICAL LIKELIHOOD APPROXIMATION

Suppose we observe a mean-zero, stationary and isotropic Gaussian process $Z$ with a Matérn covariance at $n$ irregularly spaced locations. Let $\mathbf{Z} = (Z(s_1), \ldots, Z(s_n))^T$; then $\mathbf{Z} \sim \mathcal{N}(0, C(\theta))$, where $\theta \in \mathbb{R}^q$ is an unknown parameter vector of interest,

$$C_{ij}(\theta) = \mathrm{cov}(Z(s_i), Z(s_j)) = C(\|s_i - s_j\|, \theta),$$

and

$$C(r) := C_\theta(r) = \frac{2\sigma^2}{\Gamma(\nu)} \left(\frac{r}{2\ell}\right)^{\nu} K_\nu\!\left(\frac{r}{\ell}\right), \qquad \theta = (\sigma^2, \nu, \ell)^T,$$

is the Matérn covariance function. The MLE of $\theta$ is obtained by maximizing the Gaussian log-likelihood function:

$$\mathcal{L}(\theta) = -\frac{n}{2}\log(2\pi) - \frac{1}{2}\log|C(\theta)| - \frac{1}{2}\mathbf{Z}^T C(\theta)^{-1}\mathbf{Z}.$$
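The dense evaluation of this log-likelihood can be sketched as follows. This is a minimal illustration, not the poster's implementation: the function names `matern_cov` and `log_likelihood` are hypothetical, and the covariance uses the parametrization written above, for which $\nu = 0.5$ reduces to $\sigma^2 \exp(-r/\ell)$. The Cholesky factorization is the $O(n^3)$ step that the H-matrix format is meant to replace.

```python
import numpy as np
from scipy.special import gamma, kv
from scipy.linalg import cho_factor, cho_solve


def matern_cov(r, sigma2, nu, ell):
    """Matérn covariance in the parametrization used above:
    C(r) = 2*sigma2/Gamma(nu) * (r/(2*ell))**nu * K_nu(r/ell)."""
    r = np.asarray(r, dtype=float)
    out = np.full(r.shape, sigma2, dtype=float)   # limit for r -> 0 is sigma2
    pos = r > 0
    x = r[pos] / ell
    out[pos] = 2.0 * sigma2 / gamma(nu) * (x / 2.0) ** nu * kv(nu, x)
    return out


def log_likelihood(theta, locs, z):
    """Dense Gaussian log-likelihood for theta = (sigma2, nu, ell)."""
    sigma2, nu, ell = theta
    n = len(z)
    # Pairwise Euclidean distances between the observation locations.
    dist = np.linalg.norm(locs[:, None, :] - locs[None, :, :], axis=-1)
    C = matern_cov(dist, sigma2, nu, ell)
    factor = cho_factor(C, lower=True)            # O(n^3) dense Cholesky
    logdet = 2.0 * np.sum(np.log(np.diag(factor[0])))
    quad = z @ cho_solve(factor, z)               # z^T C^{-1} z
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)


# Hypothetical usage on a small synthetic data set.
rng = np.random.default_rng(0)
locs = rng.random((50, 2))
z = rng.standard_normal(50)
print(log_likelihood((1.0, 0.5, 0.1), locs, z))
```

The log-determinant is read off the diagonal of the Cholesky factor, so the determinant itself is never formed and cannot overflow.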

On each iteration of a maximization algorithm we have a new matrix $C$. For a given $\theta$ the Cholesky factorization requires $O(n^3)$ FLOPs. We approximate $C \approx \tilde{C}$ in the H-matrix format with log-linear computational cost and storage $O(kn \log n)$, where the rank $k \ll n$ is a small integer.

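Theorem 1. Let $\rho(\tilde{C}^{-1}C - I) < \varepsilon < 1$. Then $\left|\log|C| - \log|\tilde{C}|\right| \le -n \log(1 - \varepsilon)$. Let $\|C^{-1}\| \le c_1$; then

$$|\tilde{\mathcal{L}}(\theta; k) - \mathcal{L}(\theta)| = \frac{1}{2}\log\frac{|C|}{|\tilde{C}|} - \frac{1}{2}\mathbf{Z}^T\left(C^{-1} - \tilde{C}^{-1}\right)\mathbf{Z} \le c_0^2 \cdot c_1 \cdot \varepsilon + n \log(1 - \varepsilon).$$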
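The log-determinant part of the theorem can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the helper `logdet_gap_and_bound` is hypothetical, the test matrix is an exponential covariance (Matérn with $\nu = 0.5$) on random points, and a small symmetric perturbation stands in for the H-matrix approximation $\tilde{C}$. The bound is evaluated at $\varepsilon = \rho$, the limiting case of the assumption $\rho < \varepsilon$.

```python
import numpy as np


def logdet_gap_and_bound(C, C_tilde):
    """Return (rho, gap, bound) with
    rho   = spectral radius of C_tilde^{-1} C - I,
    gap   = | log|C| - log|C_tilde| |,
    bound = -n * log(1 - rho), set to infinity when rho >= 1."""
    n = C.shape[0]
    M = np.linalg.solve(C_tilde, C) - np.eye(n)
    rho = np.max(np.abs(np.linalg.eigvals(M)))
    gap = abs(np.linalg.slogdet(C)[1] - np.linalg.slogdet(C_tilde)[1])
    bound = -n * np.log1p(-rho) if rho < 1 else np.inf
    return rho, gap, bound


# Hypothetical test problem: exponential covariance on 100 random points.
rng = np.random.default_rng(0)
locs = rng.random((100, 2))
dist = np.linalg.norm(locs[:, None, :] - locs[None, :, :], axis=-1)
C = np.exp(-dist / 0.1)

# Small symmetric perturbation used as a stand-in for an H-matrix approximation.
E = rng.standard_normal(C.shape)
C_tilde = C + 1e-6 * (E + E.T) / 2

rho, gap, bound = logdet_gap_and_bound(C, C_tilde)
print(f"rho = {rho:.2e}, gap = {gap:.2e}, bound = {bound:.2e}")
```

The bound follows because $\log|C| - \log|\tilde{C}| = \sum_i \log(1 + \lambda_i)$, where the $\lambda_i$ are the eigenvalues of $\tilde{C}^{-1}C - I$ and each satisfies $|\lambda_i| \le \rho$.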

[Figure: box plots of the covariance length (vertical axis, ticks from 0.02 to 0.06) for H-matrix ranks 3, 7 and 9 (horizontal axis).]

Box plots for different H-matrix ranks $k = \{3, 7, 9\}$, $\ell = 0.0334$.

$\nu = 0.5$, $n = 66049$, rank $k = 16$, $\sigma^2 = 1$.

HIERARCHICAL MATRICES (HACKBUSCH '99)

Advantages of approximating $C$ by $\tilde{C}$: the H-matrix approximation is cheap; storage and the matrix-vector product cost $O(kn \log n)$; LU factorization and inversion cost $O(k^2 n \log^2 n)$; efficient parallel implementations exist.

(left) H-matrix approximation $\tilde{C} \in \mathbb{R}^{n \times n}$, $n = 16641$, of the discretised Matérn covariance function on the unit square. The biggest dense (dark) block is in $\mathbb{R}^{32 \times 32}$, maximal rank $k = 13$, $\nu = 0.5$, $\rho = 0.1$, $\sigma = 1$; (middle) H-Cholesky factor $\tilde{L}$, $\tilde{C} = \tilde{L}\tilde{L}^T$; (right) precision matrix $\tilde{C}^{-1}$.
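The reason the format is cheap is that covariance blocks coupling well-separated point clusters are numerically low-rank. The sketch below illustrates this for a single block using a truncated SVD. It is an assumption-laden toy: the cluster layout and the helper `truncated_block` are hypothetical, and a real H-matrix library would build the block partition hierarchically and typically use a cheaper compression than a full SVD.

```python
import numpy as np


def truncated_block(block, k):
    """Best rank-k approximation of a dense block, returned as factors (U, V)
    with block ~ U @ V.T, so storage drops from m*n to k*(m+n) numbers."""
    U, s, Vt = np.linalg.svd(block, full_matrices=False)
    return U[:, :k] * s[:k], Vt[:k].T


rng = np.random.default_rng(1)
# Two well-separated point clusters in the unit square (hypothetical layout).
left = rng.random((300, 2)) * [0.3, 1.0]
right = rng.random((300, 2)) * [0.3, 1.0] + [0.7, 0.0]
dist = np.linalg.norm(left[:, None, :] - right[None, :, :], axis=-1)
block = np.exp(-dist / 0.1)   # Matérn with nu = 0.5, ell = 0.1, sigma2 = 1

for k in (3, 7, 9):
    U, V = truncated_block(block, k)
    rel_err = np.linalg.norm(block - U @ V.T, 2) / np.linalg.norm(block, 2)
    print(f"k = {k}: relative spectral error {rel_err:.2e}, "
          f"storage {k * (U.shape[0] + V.shape[0])} vs {block.size} numbers")
```

By the Eckart-Young theorem the spectral error of the rank-$k$ truncation equals the $(k+1)$-th singular value of the block, so the printed errors show directly how fast the singular values decay.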

NUMERICAL EXAMPLES

H-matrix approximation, $\nu = 0.5$, domain $G = [0, 1]^2$, $\|\tilde{C}\|_2 = \{212, 568\}$ for $\ell = \{0.25, 0.75\}$, $n = 16049$.

k     KLD                     ‖C − C̃‖₂                ‖C C̃⁻¹ − I‖₂
      ℓ = 0.25    ℓ = 0.75    ℓ = 0.25    ℓ = 0.75    ℓ = 0.25    ℓ = 0.75
10    2.6e-3      0.2         7.7e-4      7.0e-4      6.0e-2      3.1
50    3.4e-13     5e-12       2.0e-13     2.4e-13     4e-11       2.7e-9
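The three error measures in this table can be computed for small dense matrices as sketched below. Two assumptions are made here: KLD is taken to be the Kullback-Leibler divergence from $\mathcal{N}(0, C)$ to $\mathcal{N}(0, \tilde{C})$, namely $\frac{1}{2}\left[\mathrm{tr}(\tilde{C}^{-1}C) - n - \log(|C|/|\tilde{C}|)\right]$ (the poster does not state the direction), and the function name `approximation_diagnostics` is hypothetical.

```python
import numpy as np


def approximation_diagnostics(C, C_tilde):
    """Return (KLD, ||C - C_tilde||_2, ||C C_tilde^{-1} - I||_2) for dense inputs."""
    n = C.shape[0]
    X = np.linalg.solve(C_tilde, C)                       # C_tilde^{-1} C
    logdet_ratio = np.linalg.slogdet(C)[1] - np.linalg.slogdet(C_tilde)[1]
    kld = 0.5 * (np.trace(X) - n - logdet_ratio)          # assumed KLD direction
    abs_err = np.linalg.norm(C - C_tilde, 2)
    inv_err = np.linalg.norm(C @ np.linalg.inv(C_tilde) - np.eye(n), 2)
    return kld, abs_err, inv_err


# Hypothetical usage: a 2 x 2 covariance and a slightly perturbed copy.
C = np.array([[1.0, 0.5], [0.5, 1.0]])
C_tilde = C + 1e-3 * np.eye(2)
print(approximation_diagnostics(C, C_tilde))
```

The relative measure $\|C\tilde{C}^{-1} - I\|_2$ is the most sensitive of the three because it is amplified by the conditioning of $\tilde{C}$, which is consistent with it being the largest entry in each row of the table.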

Computing time and number of iterations for maximization of the log-likelihood $\tilde{\mathcal{L}}(\theta; k)$, $n = 66049$.

k       size, GB   C̃ set-up time, s   compute L̃, s   maximizing, s   # iters
10      1          7                   115             1994            13
20      1.7        11                  370             5445            9
dense   38         42                  657             ∞               -
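The maximization loop whose cost is reported above can be sketched on a small dense problem. This is a toy under explicit assumptions, not the poster's solver: only the covariance length $\ell$ is estimated, $\nu = 0.5$ and $\sigma^2 = 1$ are held fixed, the data are synthetic, and SciPy's bounded scalar minimizer stands in for whatever optimizer was actually used. Every optimizer iteration assembles and factorizes a new covariance matrix, which is where an H-matrix factorization would replace the dense Cholesky.

```python
import numpy as np
from scipy.optimize import minimize_scalar


def neg_log_lik(ell, dist, z):
    """Negative Gaussian log-likelihood for C = exp(-dist / ell),
    i.e. Matérn with nu = 0.5 and sigma2 = 1 held fixed (an assumption)."""
    n = len(z)
    L = np.linalg.cholesky(np.exp(-dist / ell))   # new factorization per call
    w = np.linalg.solve(L, z)                     # w @ w equals z^T C^{-1} z
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    return 0.5 * (n * np.log(2.0 * np.pi) + logdet + w @ w)


# Synthetic data drawn from the model with a known covariance length.
rng = np.random.default_rng(0)
true_ell = 0.1
locs = rng.random((200, 2))
dist = np.linalg.norm(locs[:, None, :] - locs[None, :, :], axis=-1)
z = np.linalg.cholesky(np.exp(-dist / true_ell)) @ rng.standard_normal(200)

res = minimize_scalar(neg_log_lik, bounds=(0.01, 1.0), args=(dist, z),
                      method="bounded")
print("estimated ell:", res.x, "iterations:", res.nit)
```

Estimating all of $\theta = (\sigma^2, \nu, \ell)^T$ would need a multivariate optimizer and a Matérn evaluation with Bessel functions; the one-parameter case is kept here only to make the per-iteration cost structure visible.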

Moisture data. We used adaptive-rank arithmetic with $\varepsilon = 10^{-4}$ for each block of $\tilde{C}$ and $\varepsilon = 10^{-8}$ for each block of $\tilde{C}^{-1}$. The number of processing cores is 40.

         compute C̃                               L̃L̃ᵀ                                              inverse
n        compr. rate, %   time, sec.   size, MB   time, sec.   size, MB   ‖I − (L̃L̃ᵀ)⁻¹C‖₂   time, sec.   size, MB   ‖I − C̃⁻¹C‖₂
10000    14               0.9          106        4.1          109        7.7e-6               44           230        7.8e-5
30000    7.5              4.3          515        25           557        1.1e-3               316          1168       1.1e-1

$n = 512\mathrm{K}$, accuracy inside each block $10^{-8}$, matrix set-up 261 sec., compression rate 0.02% (0.4 GB against 2006 GB). H-LU is done in 843 sec. and required 5.8 GB RAM; the inversion LU error is $2 \cdot 10^{-3}$.

[Figure, right panel: box plots of $\nu$ (vertical axis, ticks from 0.2 to 1) against the number of measurements (horizontal axis: 1000, 2000, 4000, 8000, 16000, 32000).]

(left) with nuggets $\{0.01, 0.005, 0.001\}$ for Gaussian covariance, $n = 2000$, $k = 14$, $\sigma^2 = 1$; (center) zoom of the middle figure; (right) box plots for $\nu$ vs. number of locations $n$.

REFERENCES AND ACKNOWLEDGEMENTS

[1] B. N. KHOROMSKIJ, A. LITVINENKO, H. G. MATTHIES, Application of hierarchical matrices for computing the Karhunen-Loève expansion, Computing, Vol. 84, Issue 1-2, pp. 49-67, 2008.

[2] Y. SUN, M. STEIN, Statistically and computationally efficient estimating equations for large spatial datasets, JCGS, 2016.

[3] J. CASTRILLON-CANDAS, M. GENTON, R. YOKOTA, Multi-Level Restricted Maximum Likelihood Covariance Estimation and Kriging for Large Non-Gridded Spatial Datasets, Spatial Statistics, 2015.

[4] W. NOWAK, A. LITVINENKO, Kriging and spatial design accelerated by orders of magnitude: combining low-rank covariance approximations with FFT-techniques, J. Mathematical Geosciences, Vol. 45, N4, pp. 411-435, 2013.

Work supported by SRI-UQ and ECRC, KAUST.
