G1GC from the CPU's Perspective
![Page 1: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/1.jpg)
2017.11.18
数村 憲治
G1GC from the CPU's Perspective
Copyright 2017 FUJITSU LIMITED
JJUG CCC 2017 Fall
C-7
![Page 2: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/2.jpg)
Agenda
Motivation
History of GC
PA Analysis
JIT and GC
Closing
![Page 3: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/3.jpg)
About Me
Professional in Java Core Tech
JCP Executive Committee
JSR382 Configuration API Expert Group
Java online courses
https://directshop.fom.fujitsu.com/shop/commodity_param/ctc/el_middleit/shc/0/cmc/ASP03737
https://directshop.fom.fujitsu.com/shop/commodity_param/ctc/el_middleit/shc/0/cmc/ASP03738
@kkzr
![Page 4: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/4.jpg)
![Page 5: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/5.jpg)
GC Trends
[Figure: Google Trends chart (vertical axis 0 to 60) comparing the search terms "g1 gc", "parallel gc", and "shenandoah gc"]
https://trends.google.com
![Page 6: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/6.jpg)
G1GC
G1GC is the default GC in JDK 9.
It aims to deliver throughput and short pause times at the same time.
"It attempts to meet garbage collection pause-time goals with high probability while achieving high throughput with little need for configuration."
HotSpot Virtual Machine Garbage Collection Tuning Guide
https://docs.oracle.com/javase/9/gctuning/garbage-first-garbage-collector.htm
![Page 7: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/7.jpg)
SPECjbb2015
SPECjbb2015 reports two metrics: throughput (max-jOPS) and response (critical-jOPS).
G1GC is not used in SPECjbb2015 results.
World record for critical-jOPS (as of November 2017):
https://www.spec.org/jbb2015/results/res2017q4/jbb2015-20171011-00259.html
![Page 8: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/8.jpg)
Support for Large Memory
"The Garbage-First (G1) garbage collector is targeted for multiprocessor machines with a large amount of memory."
HotSpot Virtual Machine Garbage Collection Tuning Guide
https://docs.oracle.com/javase/9/gctuning/garbage-first-garbage-collector.htm
G1 targets machines with a large amount of memory.
Taking that at its word, we compare performance on a machine with a large amount of memory.
![Page 9: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/9.jpg)
Source
public class GCTest extends Thread
{
    static final int N = 7000000;   // slots per thread
    static final int M = 4;         // number of threads
    static GCTest[] g;
    Object[] objs = new Object[N];

    public static void main(String ... arg) throws Exception
    {
        g = new GCTest[M];
        for (int i = 0 ; i < M ; ++i)
            g[i] = new GCTest();

        // Warm-up: fills every slot once, so run() allocates nothing afterwards.
        System.out.println("warm up ...");
        for (int i = 0 ; i < N ; ++i)
            for (int j = 0 ; j < M ; ++j)
                g[j].doIt(i);

        System.out.println("start");

        for (int j = 0 ; j < M ; ++j)
            g[j].start();
    }
    public void run() {
        long start = System.currentTimeMillis();
        for (int j = 0 ; j < 60 ; ++j)
            for (int i = 0 ; i < N ; ++i)
                doIt(i);
        long end = System.currentTimeMillis();
        System.out.println("time: " + (end-start) + "ms");
    }

    void doIt(int i) {
        if (objs[i] == null)
            objs[i] = new X();
        else
            // Reference store only: copies a reference held by another thread's array.
            objs[i] = g[i%M].objs[i];
    }

    static class X {
        byte[] b = new byte[128];
    }
}
![Page 10: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/10.jpg)
Demo
![Page 11: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/11.jpg)
Results

| | G1GC | ParallelGC |
| --- | --- | --- |
| Xms/Xmx | 96 GB / 96 GB | 96 GB / 96 GB |
| Program | same | same |
| GC occurrences | none | none |
| Execution time | 16 s | 2.5 s |
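The comparison rests on two facts: which collector was running, and that no collection happened during the measurement. A minimal sketch of checking both from inside the JVM with the standard `java.lang.management` API follows. The class name `GcActivityCheck` and the helper `totalCollections` are hypothetical names chosen for illustration, not part of the original test program.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcActivityCheck {
    /** Sums the collection counts of all collectors; undefined counts (-1) are skipped. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // The bean names reveal which collector the JVM selected.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount() + " collections");
        }
        System.out.println("total collections so far: " + totalCollections());
    }
}
```

Calling `totalCollections()` before and after a timed section and comparing the two values is one way to confirm that a run was free of GC activity.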
![Page 12: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/12.jpg)
![Page 13: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/13.jpg)
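Time Spent Freeing Memory
[Figure: execution-time distribution for C/C++ and for Java with the serial GC, split into application processing and memory-freeing work]
The total time spent freeing memory is probably about the same.
In a multicore environment, throughput becomes a problem in addition to total pause time.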
![Page 14: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/14.jpg)
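Two Families of GC for Multicore Environments
[Figure: timelines of application processing and memory-freeing work for C/C++ and for the two Java GC families]
Parallel GC: runs GC intensively across multiple cores.
Concurrent GC / G1GC: runs GC in the background on cores dedicated to GC.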
![Page 15: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/15.jpg)
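GC Comparison

| | Serial | Parallel | Concurrent | G1 |
| --- | --- | --- | --- | --- |
| Application pause time (young generation) | long | short | short | short |
| Application pause time (old generation) | long | short | very short | very short |
| GC execution time (impact on application processing) | long | short | long | long |
| Use case | client | throughput-oriented | response-oriented | throughput and response |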
![Page 16: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/16.jpg)
![Page 17: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/17.jpg)
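PA
CPU performance statistics, such as the number of load instructions executed or the number of branch mispredictions
Useful for analysis when CPU utilization is high
Collection tools: cpustat and cputrack on Solaris, perf on Linux, and so on
Developer Studio: maps JIT-compiled code to Java methods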
![Page 18: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/18.jpg)
Developer Studio
http://www.oracle.com/technetwork/jp/server-storage/developerstudio/overview/index.html
Collect data with the collect command:
% collect -h cycle_counts,on,effective_instruction_counts,on -j on java -Xmx96g -Xms96g GCTest
Visualize it with the er_print command:
% cat scr
outfile result.txt
viewmode machine
metrics e+cycle_counts:e+effective_instruction_counts
func
% er_print -script scr test.1.er
![Page 19: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/19.jpg)
Function List
Cycles Instructions
8.4E+10 1.2E+11 <Total>
8.3E+10 1.2E+11 GCTest.run()
9.3E+8 1.1E+9 GCTest.run()
1.9E+8 4.6E+7 GCTest.run()
6.4E+7 1.5E+8 Interpreter
6.4E+7 3.8E+7 GCTest.doIt(int)
3.2E+7 0 GCTest.run()
0 0 <Unknown>
0 0 AbstractCompiler::nsic_available‥
0 0 AddPNode::Ideal(PhaseGVN*,bool)
0 0 AddPNode::bottom_type()const
0 0 AdvancedThresholdPolicy::common‥
0 0 AdvancedThresholdPolicy::method‥
‥
![Page 20: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/20.jpg)
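PA Analysis

| | Cycles | Instructions | CPI | IPC | (Execution time) |
| --- | --- | --- | --- | --- | --- |
| Parallel | 8.4E+10 | 1.2E+11 | 0.7 | 1.4 | (2.5 s) |
| G1 | 7.7E+11 | 4.2E+11 | 1.8 | 0.5 | (16 s) |

With G1, the instruction count is roughly 3 times higher.
With G1, the cycle count is roughly 9 times higher.
Why does the instruction count grow when the program is the same?
Why does the cycle count grow by more than the instruction count?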
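The CPI and IPC columns follow directly from the two counters: CPI is cycles divided by instructions, and IPC is its inverse. The sketch below recomputes them from the cycle and instruction counts in the table. The class name `CpiIpc` and its method names are hypothetical, chosen only for this illustration.

```java
import java.util.Locale;

final class CpiIpc {
    /** Cycles per instruction. */
    static double cpi(double cycles, double instructions) {
        return cycles / instructions;
    }

    /** Instructions per cycle, the inverse of CPI. */
    static double ipc(double cycles, double instructions) {
        return instructions / cycles;
    }

    public static void main(String[] args) {
        // Counter values taken from the PA analysis table.
        double parallelCycles = 8.4e10, parallelInsns = 1.2e11;
        double g1Cycles = 7.7e11, g1Insns = 4.2e11;

        System.out.println(String.format(Locale.ROOT, "Parallel: CPI %.1f, IPC %.1f",
                cpi(parallelCycles, parallelInsns), ipc(parallelCycles, parallelInsns)));
        System.out.println(String.format(Locale.ROOT, "G1:       CPI %.1f, IPC %.1f",
                cpi(g1Cycles, g1Insns), ipc(g1Cycles, g1Insns)));
        System.out.println(String.format(Locale.ROOT, "G1/Parallel: instructions x%.1f, cycles x%.1f",
                g1Insns / parallelInsns, g1Cycles / parallelCycles));
    }
}
```

The ratio line makes the two questions concrete: the instruction count rises by a factor of about 3.5 while the cycle count rises by about 9.2, so each instruction also became more expensive on average.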
![Page 21: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/21.jpg)
PA Analysis (Cycle Distribution)
[Figure: stacked bar chart, 0% to 100%, of the cycle distribution for Parallel vs. G1. Categories: commit, waiting on computation, branch execution, branch misprediction, L2$ miss, L1$ miss, other. Data labels as extracted, without bar or category assignment: 49.2%, 22.6%, 19.5%, 8.8%, 8.0%, 5.3%, 1.2%, 2.7%, 0.7%, 16.3%, 21.6%, 42.0%, 2.3%]
![Page 22: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/22.jpg)
Function List (ParallelGC)
Cycles Instructions
8.4E+10 1.2E+11 <Total>
8.3E+10 1.2E+11 GCTest.run()
9.3E+8 1.1E+9 GCTest.run()
1.9E+8 4.6E+7 GCTest.run()
6.4E+7 1.5E+8 Interpreter
6.4E+7 3.8E+7 GCTest.doIt(int)
3.2E+7 0 GCTest.run()
0 0 <Unknown>
0 0 AbstractCompiler::nsic_available‥
0 0 AddPNode::Ideal(PhaseGVN*,bool)
0 0 AddPNode::bottom_type()const
0 0 AdvancedThresholdPolicy::common‥
0 0 AdvancedThresholdPolicy::method‥
‥
![Page 23: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/23.jpg)
Function List (G1GC)
Cycles Instructions
7.7E+11 4.2E+11 <Total>
3.7E+11 2.2E+11 GCTest.run()
1.2E+11 8.1E+10 OtherRegionsTable::add_refere‥‥
9.5E+10 1.7E+10 ObjArrayKlass::oop_oop_iterate‥‥
7.2E+10 8.8E+10 G1UpdateRSOrPushRefOopClosure‥‥
5.0E+10 5.2E+9 G1RemSet::refine_card(signed ‥‥
5.0E+10 2.1E+9 G1HotCardCache::insert(signed ‥‥
4.0E+9 2.2E+9 GCTest.run()
1.9E+9 2.8E+9 HeapRegion::oops_on_card_seq_ite‥‥
1.4E+9 1.9E+9 RefineCardTableEntryClosure::do‥‥
1.1E+9 4.0E+8 DirtyCardQueueSet::apply_closure‥‥
9.0E+8 9.5E+8 ObjArrayKlass::oop_oop_iterate‥‥
5.1E+8 5.2E+8 G1CardCounts::add_card_count(‥‥
![Page 24: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/24.jpg)
PA Analysis (GCTest.run)
[Figure: two bar charts comparing Total and GCTest.run() for Parallel and G1]
Cycles: Parallel Total 8.4E+10, GCTest.run() 8.3E+10; G1 Total 7.7E+11, GCTest.run() 3.7E+11
Instructions: Parallel Total 1.2E+11, GCTest.run() 1.2E+11; G1 Total 4.2E+11, GCTest.run() 2.2E+11
![Page 25: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/25.jpg)
Call Tree (ParallelGC)
Cycles Instructions
8.4E+10 1.2E+11 +-<Total>
8.4E+10 1.2E+11 +-_lwp_start
8.4E+10 1.2E+11 | +-thread_native_entry
8.4E+10 1.2E+11 | | +-JavaThread::run()
8.4E+10 1.2E+11 | | | +-JavaThread::thread_main_‥‥
8.4E+10 1.2E+11 | | | +-thread_entry(JavaThread‥‥
8.4E+10 1.2E+11 | | | | +-JavaCalls::call_virtua‥
8.4E+10 1.2E+11 | | | | +-JavaCalls::call_virtu‥
8.4E+10 1.2E+11 | | | | +-JavaCalls::call_hel‥
8.4E+10 1.2E+11 | | | | +-call_stub
8.3E+10 1.2E+11 | | | | +-GCTest.run()
9.3E+8 1.1E+9 | | | | +-GCTest.run()
2.2E+8 6.9E+7 | | | | +-GCTest.run()
3.2E+7 2.3E+7 | | | | | +-GCTest.doIt(in‥
6.4E+7 1.7E+8 | | | | +-Interpreter
![Page 26: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/26.jpg)
Call Tree (G1GC) (1/2)
Cycles Instructions
7.7E+11 4.2E+11 +-<Total>
7.7E+11 4.2E+11 +-_lwp_start
7.7E+11 4.2E+11 | +-thread_native_entry
3.9E+11 2.0E+11 | | +-ConcurrentGCThread::run()
3.9E+11 2.0E+11 | | | +-ConcurrentG1RefineThread::r‥
3.9E+11 2.0E+11 | | | | +-DirtyCardQueueSet::appl‥‥
3.9E+11 2.0E+11 | | | | | +-RefineCardTableEntryClo‥
3.9E+11 2.0E+11 | | | | | | +-G1RemSet::refine_ca‥‥
2.9E+11 1.9E+11 | | | | | | | +-HeapRegion::oops‥‥
2.9E+11 1.9E+11 | | | | | | | | +-ObjArrayKlass::‥
1.2E+11 8.1E+10 | | | | | | | | | +-OtherRegionsTa‥
・・・
![Page 27: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/27.jpg)
Call Tree (G1GC) (2/2)
Cycles Instructions
・・・
3.7E+11 2.2E+11 | | +-JavaThread::run()
3.7E+11 2.2E+11 | | | +-JavaThread::thread_main_in‥‥
3.7E+11 2.2E+11 | | | +-thread_entry(JavaThread*,‥‥
3.7E+11 2.2E+11 | | | | +-JavaCalls::call_virtual‥‥
3.7E+11 2.2E+11 | | | | +-JavaCalls::call_virtual‥
3.7E+11 2.2E+11 | | | | +-JavaCalls::call_help‥‥
3.7E+11 2.2E+11 | | | | +-call_stub
3.7E+11 2.2E+11 | | | | +-GCTest.run()
4.5E+8 3.1E+7 | | | | | +-PtrQueue::enqueue‥
2.2E+8 0 | | | | | | +-PtrQueueSet::all‥
9.6E+7 0 | | | | | | | +-Monitor::lock‥
9.6E+7 0 | | | | | | | | +-Monitor::ILo‥
9.6E+7 0 | | | | | | | | +-Monitor::Tr‥
3.2E+7 0 | | | | | | | +-Monitor::IUnlo‥
![Page 28: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/28.jpg)
CPU Utilization (with Parallel)
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 0 0 25 221 1 14 0 4 0 0 1 0 2 0 98
…
8 0 0 3 10 0 6 0 3 0 0 0 0 0 0 100
9 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
10 0 0 0 7 0 0 4 0 0 0 0 100 0 0 0
…
29 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
30 0 0 0 5 0 0 4 0 0 0 0 100 0 0 0
31 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
…
44 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
45 0 0 0 5 0 0 4 0 0 0 0 100 0 0 0
46 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
47 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
48 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
49 0 0 0 7 0 0 4 0 0 0 0 100 0 0 0
50 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
51 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
…
![Page 29: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/29.jpg)
CPU Utilization (with G1)
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 0 0 50 213 0 0 7 0 0 0 0 99 1 0 0
1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
…
5 0 0 0 7 0 0 5 0 0 0 0 99 1 0 0
…
17 0 0 1 6 1 2 0 0 0 0 516 1 0 0 99
…
22 0 0 0 7 0 0 5 0 0 0 0 100 0 0 0
…
26 0 0 128 17 0 18 7 0 0 0 137 99 0 0 1
…
45 0 0 0 6 0 0 5 0 0 0 0 100 0 0 0
…
49 0 0 0 7 0 0 5 0 0 0 0 100 0 0 0
…
58 0 0 9 10 2 0 7 0 0 0 9 100 0 0 0
59 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
60 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
61 0 0 1 135 0 255 4 1 0 0 127 48 0 0 52
62 0 0 0 6 0 0 5 0 0 0 0 100 0 0 0
…
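One way to read the two per-CPU listings is to count how many CPUs are saturated in user mode: the test program has four application threads, so any busy CPUs beyond four point to extra JVM threads doing work. The sketch below does that count for rows in the 16-column format shown in the listings. The class name `MpstatBusyCpus`, the method `busyCpus`, and the 90% threshold are assumptions made for this illustration; only a few sample rows from the G1 listing are embedded.

```java
import java.util.ArrayList;
import java.util.List;

final class MpstatBusyCpus {
    // Position of the "usr" column in the header: CPU minf mjf ... syscl usr sys wt idl
    static final int USR_COLUMN = 12;
    static final int COLUMN_COUNT = 16;

    /** Returns the IDs of CPUs whose usr percentage is at least usrThreshold. */
    static List<Integer> busyCpus(List<String> rows, int usrThreshold) {
        List<Integer> busy = new ArrayList<>();
        for (String row : rows) {
            String[] fields = row.trim().split("\\s+");
            // Skip the header line, elision markers, and anything else that is not a data row.
            if (fields.length != COLUMN_COUNT || !fields[0].matches("\\d+")) {
                continue;
            }
            if (Integer.parseInt(fields[USR_COLUMN]) >= usrThreshold) {
                busy.add(Integer.parseInt(fields[0]));
            }
        }
        return busy;
    }

    public static void main(String[] args) {
        List<String> g1Rows = List.of(
                "CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl",
                "0 0 0 50 213 0 0 7 0 0 0 0 99 1 0 0",
                "1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100",
                "…",
                "61 0 0 1 135 0 255 4 1 0 0 127 48 0 0 52",
                "62 0 0 0 6 0 0 5 0 0 0 0 100 0 0 0");
        System.out.println("busy CPUs: " + busyCpus(g1Rows, 90));
    }
}
```

Applied to the full listings, the rows shown for Parallel contain four CPUs at 100% usr (10, 30, 45, 49), while the rows shown for G1 contain eight at 99% or 100% (0, 5, 22, 26, 45, 49, 58, 62).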
![Page 30: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/30.jpg)
Assembler (G1GC) (GCTest.run)
Cycles Instructions L1$ misses
0 0 0 [34] 3fc: mov 32, %g1
3.2E+7 0 0 [34] 400: cmp %l7, %g1
3.2E+7 3.1E+7 0 [34] 404: be,pn %icc,0x2f0
3.2E+7 0 0 [34] 408: nop
0 6.2E+7 0 [34] 40c: ldx [%g2 + 904], %l2
0 0 0 [34] 410: ldx [%g2 + 896], %l7
0 3.1E+7 0 [34] 414: membar #StoreLoad
8.0E+8 8.3E+8 5.5E+8 [34] 418: ldsb [%o0], %g1
3.2E+7 0 0 [34] 41c: cmp %g1, 0
2.9E+8 3.1E+7 2.5E+8 [34] 420: be,pn %icc,0x2f0
6.4E+7 0 6.2E+7 [34] 424: nop
0 0 0 [34] 428: cmp %l2, 0
3.2E+7 0 0 [34] 42c: bne,pn %xcc,0x44c
0 0 0 [34] 430: clrb [%o0]
0 0 0 [34] 434: mov %g2, %o1
0 0 0 [34] 438: call 0x1542cce0 ! (Unable to (‥
0 0 0 [34] 43c: mov %g2, %l7
![Page 31: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/31.jpg)
Memory Barriers
JSR-133 Cookbook
http://gee.cs.oswego.edu/dl/jmm/cookbook.html
Four kinds of memory barrier: LoadLoad / StoreStore / LoadStore / StoreLoad
The details differ slightly from CPU to CPU, but the general concept is much the same.
From the CPU's point of view, StoreLoad is the most expensive.
"The sequence: Store1; StoreLoad; Load2 ensures that Store1's data are made visible to other processors (i.e., flushed to main memory) before data accessed by Load2 and all subsequent load instructions are loaded."
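The Store1; StoreLoad; Load2 sequence can be tried out in Java with `VarHandle.fullFence()` (available since Java 9), which orders earlier loads and stores against later ones and therefore covers the StoreLoad case. The sketch below is a Dekker-style experiment on plain fields: each thread stores to its own flag, fences, then loads the other thread's flag. The class name `StoreLoadSketch` and its members are hypothetical, and the experiment is an illustration of the ordering rule rather than a proof of it.

```java
import java.lang.invoke.VarHandle;

final class StoreLoadSketch {
    // Plain (non-volatile) fields on purpose: the only ordering comes from the explicit fence.
    int x, y;
    int r1, r2;

    /** Runs one round and reports whether both threads read 0 from the other's flag. */
    static boolean bothSawZero() {
        StoreLoadSketch s = new StoreLoadSketch();
        Thread a = new Thread(() -> {
            s.x = 1;                // Store1
            VarHandle.fullFence();  // full fence, includes StoreLoad ordering
            s.r1 = s.y;             // Load2
        });
        Thread b = new Thread(() -> {
            s.y = 1;
            VarHandle.fullFence();
            s.r2 = s.x;
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return s.r1 == 0 && s.r2 == 0;
    }

    public static void main(String[] args) {
        int violations = 0;
        for (int i = 0; i < 1_000; i++) {
            if (bothSawZero()) {
                violations++;
            }
        }
        System.out.println("rounds where both loads saw 0: " + violations);
    }
}
```

With the fences in place, a round in which both loads see 0 should not occur, because at least one store must be visible before the other thread's load. Removing the fences permits that outcome, although it may be hard to observe when the threads are this short-lived.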
![Page 32: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/32.jpg)
![Page 33: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/33.jpg)
hsdis
HotSpot disassembler
https://github.com/AdoptOpenJDK/jitwatch/wiki/Building-hsdis
https://wiki.openjdk.java.net/display/HotSpot/PrintAssembly
Usage (shown for SPARC; other platforms are similar)
Build hsdis-sparcv9.so
Copy hsdis-sparcv9.so to ${JDK}/lib/server, or add its location to LD_LIBRARY_PATH
Run the java command with the following options:
/export/home/JDK/jdk-9.0.1/bin/java -Xms96g -Xmx96g -verbose:gc -XX:+UnlockDiagnosticVMOptions '-XX:CompileCommand=print,GCTest.run()' GCTest |& tee grun.asm
![Page 34: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/34.jpg)
Assembler: hsdis (GCTest.run)
CompileCommand: print GCTest.run()
Java HotSpot(TM) 64-Bit Server VM warning: printing of assembly code is enabled; turning on DebugNonSafepoints to gain additional output
warm up ...
start
Compiled method (c1) 8386 147 % 3 GCTest::run @ 15 (78 bytes)
total in heap [0xffffffff5cc42b90,0xffffffff5cc44440] = 6320
relocation [0xffffffff5cc42d00,0xffffffff5cc42ef8] = 504
main code [0xffffffff5cc42f00,0xffffffff5cc43ac0] = 3008
stub code [0xffffffff5cc43ac0,0xffffffff5cc43d48] = 648
oops [0xffffffff5cc43d48,0xffffffff5cc43d78] = 48
metadata [0xffffffff5cc43d78,0xffffffff5cc43e28] = 176
scopes data [0xffffffff5cc43e28,0xffffffff5cc44040] = 536
scopes pcs [0xffffffff5cc44040,0xffffffff5cc443d0] = 912
dependencies [0xffffffff5cc443d0,0xffffffff5cc443d8] = 8
handler table [0xffffffff5cc443d8,0xffffffff5cc44420] = 72
nul chk table [0xffffffff5cc44420,0xffffffff5cc44440] = 32
Loaded disassembler from /export/home/JDK/jdk-9.0.1/lib/server/hsdis-sparcv9.so
----------------------------------------------------------------------
GCTest.run()V [0xffffffff5cc42f00, 0xffffffff5cc43d48] 3656 bytes
[Disassembling for mach='sparc:v9b']
[Entry Point]
[Constants]
# {method} {0xffffffe702c007e0} 'run' '()V' in 'GCTest'
0xffffffff5cc42f00: ldx [ %o0 + 8 ], %g3
0xffffffff5cc42f04: cmp %g3, %g5
0xffffffff5cc42f08: be %xcc, 0xffffffff5cc42f40
0xffffffff5cc42f0c: nop
0xffffffff5cc42f10: sethi %hi(0xa3fb8400), %g3
0xffffffff5cc42f14: xor %g3, -1024, %g3
![Page 35: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/35.jpg)
Assembler: hsdis (GCTest.run)
0xffffffff63c0bdc0: ldsb [ %o0 ], %l2
0xffffffff63c0bdc4: mov 0x20, %l7
0xffffffff63c0bdc8: cmp %l2, %l7
0xffffffff63c0bdcc: be,pn %icc, 0xffffffff63c0be24
0xffffffff63c0bdd0: nop
0xffffffff63c0bdd4: ldx [ %g2 + 0x388 ], %l7
0xffffffff63c0bdd8: ldx [ %g2 + 0x380 ], %g1
0xffffffff63c0bddc: membar #StoreLoad
0xffffffff63c0bde0: ldsb [ %o0 ], %l2
0xffffffff63c0bde4: cmp %l2, 0
0xffffffff63c0bde8: be,pn %icc, 0xffffffff63c0be24
0xffffffff63c0bdec: nop
0xffffffff63c0bdf0: cmp %l7, 0
0xffffffff63c0bdf4: bne,pn %xcc, 0xffffffff63c0be14
0xffffffff63c0bdf8: clrb [ %o0 ]
0xffffffff63c0bdfc: mov %g2, %o1
0xffffffff63c0be00: call 0xffffffff75c4f0e0 ; {runtime_call void SharedRuntime::g1_wb_post(void*,JavaThread*)}
0xffffffff63c0be04: mov %g2, %l7
・・・
0xffffffff63c0beb0: clrb [ %o0 ] ;*putfield b {reexecute=0 rethrow=0 return_oop=0}
; - G$X::<init>@10 (line 47)
; - G::doIt@18 (line 41)
; - G::run@26 (line 34)
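The instruction sequence around `membar #StoreLoad` can be read as a small filter that runs after every reference store. The Java sketch below models that filter on a simulated card table: skip young cards (the `0x20` compare), issue a full fence, skip cards that are already dirty (the compare with 0), otherwise dirty the card (`clrb`) and hand it to a queue. This is an illustrative model, not HotSpot code. The class `G1PostBarrierSketch`, the 512-byte card size, and the clean-card value are assumptions, and the single `enqueue` step collapses what appears in the listing as a check of the thread-local queue index followed by either an inline path or the call to `SharedRuntime::g1_wb_post`.

```java
import java.lang.invoke.VarHandle;
import java.util.ArrayDeque;
import java.util.Arrays;

final class G1PostBarrierSketch {
    static final int CARD_SHIFT = 9;       // assumption: 512-byte cards
    static final byte YOUNG_CARD = 0x20;   // value compared against in the listing
    static final byte DIRTY_CARD = 0;      // value written by clrb in the listing
    static final byte CLEAN_CARD = -1;     // assumption for this model

    final byte[] cardTable;
    final ArrayDeque<Integer> dirtyCardQueue = new ArrayDeque<>();

    G1PostBarrierSketch(int heapBytes) {
        cardTable = new byte[(heapBytes >>> CARD_SHIFT) + 1];
        Arrays.fill(cardTable, CLEAN_CARD);
    }

    /** Models the work done after a reference has been stored at storeAddress. */
    void postBarrier(int storeAddress) {
        int card = storeAddress >>> CARD_SHIFT;   // divide the address by the card size
        if (cardTable[card] == YOUNG_CARD) {
            return;                               // stores into young regions need no tracking
        }
        VarHandle.fullFence();                    // stands in for membar #StoreLoad
        if (cardTable[card] == DIRTY_CARD) {
            return;                               // card is already queued
        }
        cardTable[card] = DIRTY_CARD;             // clrb [card]
        dirtyCardQueue.add(card);                 // hand the card over for later processing
    }

    public static void main(String[] args) {
        G1PostBarrierSketch heap = new G1PostBarrierSketch(4096);
        heap.postBarrier(0);
        heap.postBarrier(100);   // same card as address 0, filtered out as already dirty
        heap.postBarrier(512);   // next card
        System.out.println("queued cards: " + heap.dirtyCardQueue);
    }
}
```

The model shows why the second card load sits on the hot path: every store that is not filtered by the young-card check pays for the fence and then re-reads the card, which is the load carrying most of the cycles and L1$ misses in the profile.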
![Page 36: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/36.jpg)
HotSpot (runtime) Source
// G1 write-barrier post: executed after a pointer store.
JRT_LEAF(void, SharedRuntime::g1_wb_post(void* card_addr,
JavaThread* thread))
thread->dirty_card_queue().enqueue(card_addr);
JRT_END
share/vm/runtime/sharedRuntime.cpp
Where in the JIT is the code that calls g1_wb_post generated?
Look around share/vm/opto.
![Page 37: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/37.jpg)
HotSpot (JIT) Source
Node* GraphKit::store_oop(Node* ctl,
・・・・
  Node* store = store_to_memory(control(), adr, val, bt, adr_idx,
                                mo, mismatched);
  post_barrier(control(), store, obj, adr, adr_idx, val, bt,
               use_precise);
  return store;
share/vm/opto/graphKit.cpp
Code corresponding to putfield
![Page 38: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/38.jpg)
HotSpot (JIT) Source
void GraphKit::post_barrier(Node* ctl,
・・・・
  BarrierSet* bs = Universe::heap()->barrier_set();
  set_control(ctl);
  switch (bs->kind()) {
    case BarrierSet::G1SATBCTLogging:
      g1_write_barrier_post(store, obj, adr, adr_idx, val, bt,
                            use_precise);
      break;
share/vm/opto/graphKit.cpp
![Page 39: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/39.jpg)
HotSpot (JIT/G1GC) Source
void GraphKit::g1_write_barrier_post(Node* oop_store,
・・・・
  // Offsets into the thread
  const int index_offset = in_bytes(JavaThread::dirty_card_queue_offset() +
                                    DirtyCardQueue::byte_offset_of_index());
  const int buffer_offset = in_bytes(JavaThread::dirty_card_queue_offset() +
                                     DirtyCardQueue::byte_offset_of_buf());
  // Pointers into the thread
  Node* buffer_adr = __ AddP(no_base, tls, __ ConX(buffer_offset));
  Node* index_adr = __ AddP(no_base, tls, __ ConX(index_offset));
  // Now some values
  // Use ctrl to avoid hoisting these values past a safepoint, which could
  // potentially reset these fields in the JavaThread.
  Node* index = __ load(__ ctrl(), index_adr, TypeX_X, TypeX_X->basic_type(), Compile::AliasIdxRaw);
  Node* buffer = __ load(__ ctrl(), buffer_adr, TypeRawPtr::NOTNULL, T_ADDRESS, Compile::AliasIdxRaw);
  // Convert the store obj pointer to an int prior to doing math on it
  // Must use ctrl to prevent "integerized oop" existing across safepoint
  Node* cast = __ CastPX(__ ctrl(), adr);
  // Divide pointer by card size
  Node* card_offset = __ URShiftX( cast, __ ConI(CardTableModRefBS::card_shift) );
  // Combine card table base and card offset
  Node* card_adr = __ AddP(no_base, byte_map_base_node(), card_offset );
  __ if_then(card_val, BoolTest::ne, young_card); {
    sync_kit(ideal);
    // Use Op_MemBarVolatile to achieve the effect of a StoreLoad barrier.
    insert_mem_bar(Op_MemBarVolatile, oop_store);
    __ sync_kit(this);
share/vm/opto/graphKit.cpp
![Page 40: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/40.jpg)
HotSpot (JIT/GC) Source
void GraphKit::g1_mark_card(IdealKit& ideal,
・・・・
  __ make_leaf_call(tf, CAST_FROM_FN_PTR(address,
                    SharedRuntime::g1_wb_post),
                    "g1_wb_post", card_adr, __ thread());
share/vm/opto/graphKit.cpp
![Page 41: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/41.jpg)
Mark & Evacuation
[Figure: in the Mark phase, objects in region 1 are traced from the root set and the rest is garbage; in the Evacuation phase, the marked objects of region 1 are copied to a free region]
![Page 42: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/42.jpg)
Concurrent Marking
[Figure: the Mark and Evacuation diagram with the application running at the same time and a second region, region 2; an object in region 1 is wrongly treated as garbage]
![Page 43: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/43.jpg)
Store Barrier
[Figure: the same diagram with the application and region 2; the Remembered Set is treated as part of the root set when marking and copying region 1]
![Page 44: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/44.jpg)
Refinement Threads
Update the Remembered Set asynchronously
+-<Total>
+-_lwp_start
| +-thread_native_entry
| | +-ConcurrentGCThread::run()
| | | +-ConcurrentG1RefineThread::run_service()
| | | | +-DirtyCardQueueSet::apply_closure_to‥‥
| | | | | +-RefineCardTableEntryClosure::do_‥‥
| | | | | | +-G1RemSet::refine_card(signed‥‥
// G1 write-barrier post: executed after a pointer store.
JRT_LEAF(void, SharedRuntime::g1_wb_post(void* card_addr,
                                         JavaThread* thread))
  thread->dirty_card_queue().enqueue(card_addr);
JRT_END
![Page 45: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/45.jpg)
Refinement Threads
Application thread: updates a reference, then writes to the queue.
Refinement thread: reads from the queue and updates the Remembered Set.
The two sides run asynchronously.
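The division of labor between the two kinds of thread is a producer/consumer hand-off: the application thread only records which card changed, and a separate thread turns those records into Remembered Set updates later. The sketch below models that hand-off with standard `java.util.concurrent` classes. It is a simplified illustration, not how HotSpot implements it: the class `RefinementSketch`, the `STOP` marker, and the use of a single set of card numbers as the "remembered set" are all assumptions.

```java
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

final class RefinementSketch {
    static final int STOP = -1;   // hypothetical end-of-work marker for this model

    final BlockingQueue<Integer> dirtyCards = new LinkedBlockingQueue<>();
    final Set<Integer> rememberedSet = ConcurrentHashMap.newKeySet();

    /** Application side: after a reference update, only enqueue the affected card. */
    void recordReferenceUpdate(int card) {
        dirtyCards.add(card);
    }

    /** Refinement side: drain the queue and update the remembered set. */
    void refineUntilStopped() {
        try {
            for (int card = dirtyCards.take(); card != STOP; card = dirtyCards.take()) {
                rememberedSet.add(card);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Runs one application/refinement pair and returns the remembered set size. */
    static int run(int updates, int distinctCards) {
        RefinementSketch s = new RefinementSketch();
        Thread refinement = new Thread(s::refineUntilStopped, "refinement");
        refinement.start();
        for (int i = 0; i < updates; i++) {
            s.recordReferenceUpdate(i % distinctCards);
        }
        s.recordReferenceUpdate(STOP);
        try {
            refinement.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return s.rememberedSet.size();
    }

    public static void main(String[] args) {
        System.out.println("remembered set size: " + run(1_000, 16));
    }
}
```

The application side stays cheap because it does a single enqueue per update, while the cost of maintaining the set moves to the refinement thread, which is the CPU time that shows up under `ConcurrentG1RefineThread` in the call tree.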
![Page 46: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/46.jpg)
![Page 47: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/47.jpg)
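Wrap Up
Conclusions
Looking at GC processing alone is not enough to evaluate a GC.
The impact on the running application matters as well.
The two barriers
Choose a GC by measuring on real machines, not by preconception.
Example: SPECjbb2015
When the cause is unclear, PA may offer a hint.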
![Page 48: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/48.jpg)
Q/A
![Page 49: CPUから見たG1GC](https://reader033.vdocuments.mx/reader033/viewer/2022052117/5a6486df7f8b9a2c568b53c1/html5/thumbnails/49.jpg)
Copyright 2010 FUJITSU LIMITED