a scalable parallel eigensolver for large-scale simulations on … · 2020. 9. 2. · tetsuya...

36
December 7, 2015 Tetsuya Sakurai @CASTS A Scalable Parallel Eigensolver for Large-scale Simulations on Next-generation Computing Environments Tetsuya Sakurai (University of Tsukuba, JST CREST) Joint work with Akira Imakura, Yasunori Futamura, Xiucai Ye, Yasuyuki Maeda, Takahiro Yano, Yuto Inoue

Upload: others

Post on 21-Jan-2021

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

A Scalable Parallel Eigensolver for Large-scale Simulations on Next-generation Computing Environments

Tetsuya Sakurai (University of Tsukuba, JST CREST) Joint work with Akira Imakura, Yasunori Futamura, Xiucai Ye, Yasuyuki Maeda, Takahiro Yano, Yuto Inoue

Page 2: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Contents n  Introduction

Ø Approach for parallel scalability n Quadrature-type Eigensolver

Ø Sakurai-Sugiura method (SSM) Ø Software

n Estimation of Eigenvalue Density Ø Stochastic estimation of eigenvalue count

n Numerical Examples n Conclusions

Page 3: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Large-Scale Eigenvalue Problems n Partial Eigenvalue Problems in Computer Simulations

Ø Find eigenvalues in a given interval e.g. around HOMO-LUMO band gap

Ø Large-scale problems Ø Matrices are given explicitly / implicitly

n Highly Parallel Computing Environment Ø Large-scale supercomputers Ø Distributed computing nodes -  Reduce communications between nodes

Page 4: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Partial Eigenvalue Problems n We consider an eigenvalue problem

where is a matrix valued function, and is a nonzero vector of dimension . Ex. Ø Find all eigenvalues located inside Γ -  Interior eigenvalue problems -  Large-scale sparse matrices

Re

Imλ1

λ2λ3

Γ

G

T (λ)x = 0,

T (λ) x

n

Standard EP (SEP):Generalized EP (GEP):Nonlinear EP (NEP):

In this talk, we mainly explain GEP:Ax = λBx

T (λ) = A − λI

T (λ) = A − λB

T (λ) = A0 + λA1 + λ2A2

T (λ) = A0 + λA1 + eλA2

Page 5: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

ai+1 = bi ai + ci ai−1

IN =

N∑

j=1

wjfj

Our Approach for Parallel Scalability n Avoid recurrence calculations in eigenvalue computation

Ø Algorithms described by recurrence relations: (ex. Krylov methods):

Ø Algorithms without recurrence calculations:

(ex. Numerical quadrature):

ai−1 ai ai+1

f1 f2 fN−1 fN. . .

. . . an

Page 6: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Quadrature-type Eigensolver

Page 7: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Basis of Our Approach n Spectral decomposition of

is given by , where is an eigenvalue and is a projection with respect to . (for simplicity, we consider the case that is simple) By using the residue theorem, we obtain the following relation:

λi Pi

λi

λi

(zB −A)−1B

PΓ =1

2πi

Γ

(zB −A)−1Bdz =

λi∈G

Pi

Re

Im

λ1

λ2λ3

Γ

G

Page 8: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Approximation by N-point Numerical Quadrature n Contour integral is approximated by numerical quadrature

n Apply for vectors :

Systems of linear equations at shift points

: quadrature point : quadrature weightwj

zj

PΓ =∑

λi∈G

Pi =1

2πi

Γ

(zB −A)−1Bdz ≈

N∑

j=1

wj(zjB −A)−1B

z1, . . . , zN

PΓ =∑

λi∈G

Pi =1

2πi

Γ

(zB −A)−1Bdz ≈

N∑

j=1

wj(zjB −A)−1B

V = [v1, . . . ,vL]

PΓV ≈

N∑

j=1

wj(zjB − A)−1

BV

(zjB − A)Yj = BV, j = 1, . . . , N

Page 9: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Extension of Subspace with Higher Order Complex Moments n Obtain various linear combinations of projections using

complex moments:

n Apply for vectors :

n Eigenpairs are extracted from Ø Hankel type, Rayleigh-Ritz type, etc.

P(k)

Γ=

λi∈G

λki Pi =

1

2πi

Γ

zk(zB −A)−1

Bdz, k = 0, 1, . . .

P(k)Γ V ≈ Sk =

N∑

j=1

wjzkj (zjB −A)−1BV, k = 0, 1, . . . ,M − 1

V = [v1, . . . ,vL]

PΓ =∑

λi∈G

Pi =1

2πi

Γ

(zB −A)−1Bdz ≈

N∑

j=1

wj(zjB −A)−1B

S = [S0, S1, . . . , SM−1]L×M

Page 10: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Quadrature-type Methods n Using Numerical Quadrature

Ø GEPs (Generalized Eigenvalue Problems) -  SS-H [S/Sugiura'2003]

•  Grid RPC(remote procedure call) (2004) - Coarse-grained communication, Hierarchical structure

-  SS-RR [S/Tadano'2007] -  Block SS [Ikegami/S/Nagashima'2008] -  FEAST [Polizzi'2009] -  SS-Arnoldi [Imakura/Du/S'2014]

Ø NEPs (Nonlinear Eigenvalue Problems) -  SS-H [Asakura/S et al.’2009] -  Beyn's method [Beyn’2012] -  SS-RR [Yokota/S’2013] -  SS-Arnoldi [Imakura/Du/S'2015]

Re

Imλ1

λ2λ3

Γ

Page 11: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

SS method with Rayleigh-Ritz procedure n Algorithm of SS-RR

1: Solve linear systems

2: Construct a subspace

3: Extract eigenpairs

0: Set contour path and parameters Most time consuming part

Appropriate path and parameters

Compute low-rank approx. s.t. S = UΣWT≈ U1Σ1W

T1

Compute Ritz paris with A = UH1 AU1 and B = UH

1 BU1

Page 12: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Hierarchical Parallel Structure n Computing resources are assigned according to a hierarchical

structure of the algorithm

Γ1 Γ2 Γ3

z1

z2

zN

AX = B

Contour paths

Quadrature points

Linear solvers

Hierarchical structure of the algorithm � Hierarchical structure of a machine �

Page 13: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Our Approach for Parallel Scalability

A B

Subspace

S

Constructbasisbyitera1veprocess

Eigenpairs

Matrices

Difficultforparallelscalability

Constructbasisbynumericalquadrature

Contourintegral-basedeigensolver

Conven1onalmethods

SSmethod

Page 14: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Software n  Published software

Ø z-Pares -  Fortran95, MPI -  For large-scale distributed parallel computing

Ø CISS -  SLEPc / PETSc -  For evaluating efficiency of the algorithm in distributed

parallel computing Ø SSEIG -  MATLAB -  For evaluating efficiency of the algorithm

n Available: -  http://zpares.cs.tsukuba.ac.jp/�

Page 15: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Stochastic Estimation of Eigenvalue Distribution

Page 16: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Eigenvalue Count in a Given Domain n The number of eigenvalues in Γ is

given by �

Ø Approximate contour integral of a trace of inverse matrix by1), 2) where are L sample vectors.

Re

Im

λ1 λ2 λ3

Γm

m =1

2πi

∮Γtr((zB −A)−1B)dz

m ≈

N∑j=1

wj

(1

L

L∑ℓ=1

vTℓ (zjB −A)−1Bvℓ

)

1) Futamura, H. Tadano, and T. Sakurai, JSIAM Letters 2, 127-130 (2010). 2) Y. Maeda, Y. Futamura, A. Imakura and T. Sakurai, JSIAM Letters 7, 53-56 (2015).

v1, . . . ,vL

Page 17: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Estimation of Eigenvalue Density n Estimate the eigenvalue density

Ø Put sub-intervals on a target interval

Re

Im Γ

Γ1

m1 mj

Γj. . .

. . .

. . .

. . .

m

Page 18: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Numerical Examples

Page 19: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

K-Computer (AICS) n The K Computer (AICS, RIKEN)

Ø Number of nodes : 88,128 (705,024 cores) Ø Performance : 10.51 PFLOPS (Peak 11.28 PFLOPS) Ø CPU: SPARC64 VIIIfx 8 cores 2.0GHz (128GFLOPS) Ø Memory: 16GB/node (total memory: 1.41 PB) Ø Memory Bandwidth: 64GB/s

Page 20: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Supercomputers in Tsukuba n HA-PACS System @Univ. Tsukuba

-  CPU: Intel E5-2670 2.56GHz x2 -  GPU: NVIDIA M2090 x4 -  Number of nodes : 268 -  Peak performance : 802 TFLOPS

(CPU: 89 TFLOPS, GPU: 713 TFLOPS)

n COMA (PACS-IX) System @Univ. Tsukuba -  CPU: Intel E5-2670v2 2.5GHz x2 -  MIC: Intel Xeon Phi 7110P 61core x2 -  Number of nodes : 393 -  Peak performance : 1.001 PFLOPS

(CPU: 157 TFLOPS, MIC: 844 TFLOPS)

Page 21: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Numerical Example: Transmission Design n Application: Automatic transmission design

Ø Matrices are derived by FEM Ø Frequency range: 0 – 6,000 Hz -  5,435,438 Nodes -  16,747,926 DOF -  916 eigenpairs are calculated

Ø Test environment: COMA @Univ. of Tsukuba Ø Solvers -  Eigensolver: z-Pares (SS method)

•  #contour paths: 4 •  #quadrature points: N = 16 -  Linear solver: MUMPS (sparse direct solver)

Page 22: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

0

2000

4000

6000

8000

10000

12000

12 24 36 48 72 144 288

Time(sec.)

Number of modes

OthersLinear solve

Numerical Example: Transmission Design n Matrix dimension: 16,747,926, 916 eigenpairs in [0, 6000] Hz n  #contour paths: 4, #quadrature points: N = 16

11,013 sec

286 sec

12 24 36 48 72 144 288

Tim

e(se

c.)

12,000

10,000

8,000

6,000

4,000

2,000

0

Number of nodes

Page 23: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Numerical Example: Density Functional Theory n Band structure calculation with real space density functional

theory (RSDFT by Iwata et al.) Ø An interior standard eigenvalue problem (SEP) -  Eigenvalues around the band gap -  Matrices with several wave numbers:

n Material: Silicon nanowire 9,924 atoms Matrix dim.: 8,719,488 (only mat-vec operation is given)

n Test environment: the K Computer (AICS, Japan) Ø 3,456 nodes (27,648 cores)

n  Linear solver: Shifted Block CGrQ

A(k)u = λu, k = 0,∆k, 2∆k, . . .

Page 24: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Numerical Example : Density Functional Theory n Band structure calculation for Silicon Nanowire 9,924 atoms

Ø 21 problems are solved Ø Four intervals are set around the band gap Ø 3,456 nodes of the K Computer (about 12 hours)

Page 25: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Numerical Example: Order-N DFT n Order-N DFT code CONQUEST

Ø SiGe hut cluster with 200,000 atoms by Nakata and Miyazaki -  Matrix size : 778,292, NNZ : 13,247,248 -  GEP: 223 eigenpairs around HOMO-LUMO are computed

Ø Test environment : COMA@Univ. of Tsukuba Ø Linear solver : MUMPS (sparse direct solver)

GermaniumSilicon

hut cluster

Page 26: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

1 2 4 8

16 32 64

4 8 16 32 64 128 256 Number of nodes

Spe

edup

ratio

2,245 sec

90 sec

Numerical Example: Order-N DFT

Application to order-N DFT code CONQUEST

Page 27: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Numerical Example: Lattice QCD n O(a)-improved Wilson-Dirac operator

Matrix dim.: 1,019,215,872 (non-Hermitian) Test environment: 16,384 nodes of the K-Computer Linear solver: BiCGStab

-0.04 -0.02 0 0.02 0.04Re(λ)

-0.04

-0.02

0

0.02

0.04

Im(λ

)

SS methodAnalytical

964,Free,κ=0.125

-0.03 -0.02 -0.01 0 0.01 0.02 0.03Re(λ)

-0.03

-0.02

-0.01

0

0.01

0.02

0.03

Im(λ

)

SS method

964,Quenched,κ=0.1425,cSW=1.6

Multiplicity: 576

144288

48

192

12

144Quadrature points

48

k = 0.125 k = 0.1425

(Suno, Kuramashi, S, et al.)

Page 28: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Numerical Example: MIC/GPU n  Linear solver is replaced for MIC/GPU Ø  Test matrix : GEP, real symmetric, dense (n=10,000) -  5% of eigenpairs are calculated

Ø  CPU: COMA@Univ. Tsukuba Ø  MIC: COMA@Univ. Tsukuba Ø  GPU: HA-PACS@Univ. Tsukuba

0 50

100 150 200 250 300

1 2 4 8 16 # node

MIC GPU

0 50

100 150 200 250 300

1 2 4 8 16

CPU

Cal

cula

tion

time

[sec

.]

# node

300

250

200

150

100

50

0 1 2 4 8 16

# GPU

Page 29: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Numerical Example: Nonlinear Eigenvalue Problem n Test problem:

Simulation of the international linear collider

where t = 1, σ1 = 0. n Test environment:

Cray-XT4 at NERSC @Berkeley n  Linear solver: SuperLU_DIST

http://www.linearcollider.org/

T (�) = K � �2M + it�

j=1

��2 � �2

j Wj ,

T (�)x = 0

Page 30: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Numerical Example: Nonlinear Eigenvalue Problem n Strong scalability for nonlinear (NEP) case

Ø Matrix size: 2,738,556

n Two contour paths are located. Ø The number of quadrature points is N = 32. Ø 64 linear systems are solved in total.

#cores 256 512 1024 2048

time(sec.) 2513 1273 661 334

speedup - 1.97 3.80 7.30

[Yamazaki et al. '2013]

Page 31: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Numerical Example: Eigenvalue Density n RSDFT SiNW 9,924 atoms

Eigenvalue density at initial status Ø #quadrature points: N = 8 Ø Linear solver: Shifted Block CGrQ

−0.5 −0.45 −0.4 −0.35 −0.3 −0.25 −0.2 −0.15 −0.1 −0.05 0

0

5

10

15

20

25

30

35

40

45

Eigenvalue density (1,000 intervals)

-0.5 -0.25 00

20

40

estimateactual

Page 32: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Numerical Example: Eigenvalue Density n RSDFT SiNW 9,924 atoms

Accumulated total from left

−0.5 −0.4 −0.3 −0.2 −0.1 0

0

2000

4000

6000

8000

10000

12000

14000

16000

18000

20000

eigenvalue

Num

ber o

f eig

enva

lue

EstimationExact

stochastic estimationactual value

Eigenvalue density (1,000 intervals)

Page 33: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Numerical Example: Eigenvalue Density n RSDFT SiNW 107,292 atoms

Eigenvalue density at initial status Ø Matrix dimension: 64,700,000 Ø 10,800 nodes of the K-Computer, 11,890 sec

ï0.8 ï0.7 ï0.6 ï0.5 ï0.4 ï0.3 ï0.2 ï0.1 0 0.1 0.20

100

200

300

400

500

600

700

800

900

1000

eigenvalue

Num

ber o

f eig

enva

lue

About 200,000 eigenvalues exist in the interval

Eigenvalue density (1,000 intervals)

Page 34: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Numerical Example: Shell Model Nuclear Level Density n Shell Model Code: Kshell (Shimizu, et al.)

Ø Spin-dependent level density of 58Ni* -  Matrix dim.: 15 billion -  2,304 nodes of the K-Computer, 24 hours

Exp.: Y. Kalmykov, et al. (2007)

J2π = 2−

J2π = 2+

*Shimizu, et al., Phys. Lett. B (accepted)

Page 35: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

Conclusions n Scalable Algorithm: SS method

Ø Numerical quadrature with higher order moments

n Software Ø z-Pares in Fortran95, CISS in SLEPc, sseig in MATLAB

n Stochastic estimation of eigenvalue density

Ø Uses numerical quadrature

n Future work Ø Efficient linear solvers and preconditioners Ø Linear algebra kernels such as matrix-vector product,

orthogonalization, SVD, etc. Ø Apply for various applications

Page 36: A Scalable Parallel Eigensolver for Large-scale Simulations on … · 2020. 9. 2. · Tetsuya Sakurai @CASTS December 7, 2015 A Scalable Parallel Eigensolver for Large-scale Simulations

December 7, 2015 Tetsuya Sakurai @CASTS

References [1] T. Sakurai, H. Sugiura, A projection method for generalized eigenvalue problems using numerical

integration, J. Comput. Appl. Math. 159, 119-128, 2003. [2] T. Sakurai, H. Tadano, CIRR: a Rayleigh-Ritz type method with contour integral for generalized eigenvalue

problems, Hokkaido Math. J. 36, 745-757, 2007. [3] I. Ikegami, T. Sakurai, U. Nagashima, A filter diagonalization for generalized eigenvalue problems based on

the Sakurai-Sugiura projection method,CS-TR-08-13, Tsukuba, 2008. [4] J. Asakura, T. Sakurai, H. Tadano, T. Ikegami, K. Kimura, A numerical method for nonlinear eigenvalue

problems using contour integrals, JSIAM Lett. 1, 52-55, 2009. [5] T. Sakurai, J. Asakura, H. Tadano, T. Ikegami, Error analysis for a matrix pencil of Hankel matrices with

perturbed complex moments, JSIAM Lett. 1, 76-79, 2009. [6] Y. Futamura, H. Tadano, and T. Sakurai, Parallel stochastic estimation method of eigenvalue distribution,

JSIAM Letters 2, 127-130, 2010. [7] Y. Futamura, T. Sakurai, S. Furuya and J. Iwata, Efficient algorithm for linear systems arising in solutions of

eigenproblems and its application to electronic-structure calculations, LNCS 7851, 226-235, 2012. [8] I. Yamazaki, T. Ikegami, H. Tadano, T. Sakurai, Performance comparison of parallel eigensolvers based on

a contour integral method and a Lanczos method, Parallel Computing 39, 280–290, 2013. [9] T. Yano, Y. Futamura, T. Sakurai, Multi-GPU scalable implementation of a contour-integral-based

eigensolver for real symmetric dense generalized eigenvalue problems, Proc. 8th International Conference on P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC-2013), pp. 121-127 (2013).

[10] A. Imakura, L. Du, T. Sakurai, A block Arnoldi-type contour integral spectral projection method for solving generalized eigenvalue problems, Appl. Math. Lett., 32, 22-27, 2014.

[11] A. Imakura, L. Du and T. Sakurai, Error bounds of Rayleigh--Ritz type contour integral-based eigensolver for solving generalized eigenvalue problems, Numerical Algorithms, (accepted).

[12] Y. Maeda, Y. Futamura, A. Imakura and T. Sakurai, Filter analysis for the stochastic estimation of eigenvalue counts, JSIAM Letters 7, 53-56 (2015).