![Page 1: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/1.jpg)
Recommender system for materials discovery
Isao Tanaka 1,2,3,4
1 Department of Materials Science and Engineering, Kyoto University, JAPAN
2 Elements Strategy Initiative for Structural Materials, Kyoto University, JAPAN
3 Center for Materials Research by Information Integration, NIMS, JAPAN
4 Nanostructure Research Laboratory, Japan Fine Ceramics Center, JAPAN
Big Data Summer, Platja d’Aro, Spain, September 9 - 13, 2019
![Page 2: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/2.jpg)
Inorganic Crystal Structure Database (ICSD)
187,000 crystal structures; 82,000 structures excluding duplicates, incompletes, etc.
World's largest database for known inorganic crystals.
Many systems are as yet unexplored!

| Number of chemical elements | Number of chemical combinations (only for simple composition ratios) |
|---|---|
| 1 | ~100 |
| 2 | ~100,000 |
| 3 | ~10,000,000 |
| 4 | ~1,000,000,000 (1 billion) |
![Page 3: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/3.jpg)
Vast chemistry space to explore
Simple chemical combinations AaBbCcDd (a, b, c, d < 10): ~1B
ICSD: ~82k
experimental database for crystal structure
![Page 4: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/4.jpg)
Vast chemistry space to explore
Simple chemical combinations AaBbCcDd (a, b, c, d < 10): ~1B
thermodynamically unstable compounds
thermodynamically (meta)stable compounds
ICSD: ~82k
experimental database for crystal structure
![Page 5: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/5.jpg)
Discovery of a novel Sn(II)-based oxide for daylight-driven photocatalyst
DFT calcs + Experiments
Hiroyuki Hayashi, Shota Katayama, Takahiro Komura, Yoyo Hinuma, Tomoyasu Yokoyama, Kou Mibu, Fumiyasu Oba and IT
Hiroyuki Hayashi
Advanced Science 9, (2016) 1600246
![Page 6: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/6.jpg)
Target compounds of interest: Sn(II)-M oxides
Sn(II)-M-O: SnO-MOq/2 pseudobinary systems

| q | M | Known compounds |
|---|---|---|
| 4 | Ti, Zr, Hf | SnTiO3, Sn2TiO4 |
| 5 | V, Nb, Ta | SnNb2O6, Sn2Nb2O7, SnTa2O6, Sn2Ta2O7, SnTa4O11 |
| 6 | Cr, Mo, W | SnWO4, Sn2WO5, Sn3WO6 |

Only 10 compounds are known.
4A – 6A transition metal oxides: widely used for photocatalysts (e.g. TiO2, WO3, NaTaO3, TaON, …); wide band gaps.
Sn(II) oxides: narrow band gaps.
Reported high visible-light photocatalytic activity.
![Page 7: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/7.jpg)
Inorganic Crystal Structure Database (ICSD)
World's largest database for known inorganic crystals.
177,000 crystal structures; 82,000 structures excluding duplicates, incompletes, etc.
9,100 structure prototypes (e.g. rock-salt, perovskite, ...)

| Number of chemical elements | Number of structure prototypes in ICSD |
|---|---|
| 1 | 120 |
| 2 | 1,700 |
| 3 | 4,700 |
| 4 | 4,300 |
![Page 8: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/8.jpg)
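Hypothetical compounds with prototype structures
ICSD prototype: NdYbS3 type
Hypothetical compounds: NdYbS3-type SnTiO3, NdYbS3-type TiSnO3

Number of hypothetical compounds (rows and columns: formal ionic charge):

| | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 1 | 154 | 122 | 359 | 209 | 438 | 251 |
| 2 | | 454 | 258 | 663 | 220 | 409 |
| 3 | | | 500 | 184 | 297 | 109 |
| 4 | | | | 444 | 52 | 149 |
| 5 | | | | | 72 | 45 |
| 6 | | | | | | 78 |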
![Page 9: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/9.jpg)
SnO-WO3 pseudo binary system
SnO WO3
Convex hull
Included in ICSD
Formation energy by DFT calcs
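The convex-hull construction named on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: x is taken as the fraction of the second end member (x = 0 for SnO, x = 1 for the other oxide), formation energies are taken relative to the end members, and the function names and every energy value below are hypothetical, not data from the talk.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (composition x, formation energy)


def _cross(o: Point, a: Point, b: Point) -> float:
    """z-component of (a - o) x (b - o); > 0 means a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def lower_hull(points: List[Point]) -> List[Point]:
    """Lower convex hull of (x, E) points (Andrew's monotone chain)."""
    lowest: Dict[float, float] = {}
    for x, e in points:  # keep only the lowest-energy structure per composition
        lowest[x] = min(e, lowest.get(x, e))
    hull: List[Point] = []
    for p in sorted(lowest.items()):
        # drop the previous vertex while it lies on or above the new edge
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


def energy_above_hull(hull: List[Point], point: Point) -> float:
    """Vertical distance from a point to the hull (0 for hull vertices)."""
    x, e = point
    for (x0, e0), (x1, e1) in zip(hull, hull[1:]):
        if x0 <= x <= x1:
            return e - (e0 + (e1 - e0) * (x - x0) / (x1 - x0))
    raise ValueError("composition outside the hull range")


# Hypothetical formation energies (arbitrary units), including two
# candidate structures at x = 0.5.
candidates = [(0.0, 0.0), (0.25, -0.05), (0.5, -0.20),
              (0.5, -0.10), (0.75, -0.12), (1.0, 0.0)]
hull = lower_hull(candidates)
for c in candidates:
    print(c, round(energy_above_hull(hull, c), 3))
```

Candidates with zero energy above the hull are the ones such a screening would treat as thermodynamically stable; the remaining ones sit above the hull by the printed amount.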
![Page 10: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/10.jpg)
SnO MoO3
Convex hull
as‐yet‐unknown
SnO-MoO3 pseudo binary system
Formation energy by DFT calcs
![Page 11: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/11.jpg)
Convex hull of SnO-MOq/2 pseudo binary systems
Formation energy by DFT calcs
Reported oxides in ICSD (red characters) are located on the convex hull.
Band gap screening
![Page 12: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/12.jpg)
Band gap
Band gap of actual photocatalysts ≥ 2 eV (GGA)
Band gap ranges shown: 0 ~ 1 eV, 1 ~ 2 eV, 2 ~ 3 eV, over 3 eV
• SnO‐Ta2O5
• SnO‐WO3
• SnO‐MoO3
![Page 13: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/13.jpg)
Synthesis of SnMoO4
Mixture of SnCl2 and K2MoO4 powders
1 hour annealing in Ar gas
Washed and dried
Experimental results
![Page 14: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/14.jpg)
Crystal structure of SnMoO4
Newly discovered compound
498 K-synthesized sample
Space group type: P2₁3 (cubic)
Lattice constant: a = 7.26 Å
[Figure: crystal structure showing Sn, O and Mo atoms, with axes a, b, c]
Trigonal prism, which is characteristic of Sn(II)
![Page 15: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/15.jpg)
Degradation of methylene blue under simulated day-light
Newly-discovered SnMoO4 powder exhibits clear photocatalytic activity.
Photocatalytic activity of SnMoO4
![Page 16: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/16.jpg)
Vast chemistry space to explore
Simple chemical combinations AaBbCcDd (a, b, c, d < 10): ~1B
thermodynamically unstable compounds
thermodynamically (meta)stable compounds
ICSD: ~82k
experimental database for crystal structure
![Page 17: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/17.jpg)
Recommender system for discovery of CRC (Chemically Relevant Composition)
using ICSD database
A. Seko, H. Hayashi, H. Kashima and IT
A. Seko, H. Hayashi, H. Kashima, I. Tanaka, Phys. Rev. Mater. 2, 013805 (2018).
A. Seko, H. Hayashi, and I. Tanaka, J. Chem. Phys. 148, 241719 (2018).
![Page 18: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/18.jpg)
“Recommender system” in E-commerce
Amazon.com
A system that can suggest items to customers, which is sometimes useful.
= Recommendation
Netflix.com
![Page 19: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/19.jpg)
A2X-BX pseudo-binary (A1+, B2+, X2-)
CRC (Chemically Relevant Composition):
• 7A2X·1BX (A14B1X8)
• 3A2X·1BX (A6B1X4)
• 1A2X·1BX (A2B1X2)
non-CRC:
• 5A2X·3BX (A10B3X8)
• 3A2X·5BX (A6B5X8)
[Figure: formation energy versus composition between A2X and BX, with the convex hull; the three CRC compositions and the two non-CRC compositions are marked]
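The composition labels on this slide follow simple bookkeeping: n units of A2X plus m units of BX give A(2n) B(m) X(n+m). A minimal Python sketch of that arithmetic is given below; the function name and the reduction to the smallest integer formula are assumptions made for illustration.

```python
from math import gcd


def pseudo_binary_formula(n: int, m: int) -> str:
    """Formula of n*A2X + m*BX, assuming A is 1+, B is 2+ and X is 2-."""
    a, b, x = 2 * n, m, n + m
    # Charge neutrality holds for every n, m: (+1)*a + (+2)*b + (-2)*x == 0.
    assert a + 2 * b - 2 * x == 0
    g = gcd(gcd(a, b), x)  # reduce to the smallest integer formula
    return f"A{a // g}B{b // g}X{x // g}"


for n, m in [(7, 1), (3, 1), (1, 1), (5, 3), (3, 5)]:
    print(f"{n}A2X·{m}BX ->", pseudo_binary_formula(n, m))
```

All five labels on the slide are reproduced by this rule, which is a quick way to check a candidate composition before it is placed on the A2X-BX axis.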
![Page 20: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/20.jpg)
Rating matrix used for recommender system
[Figure: rating matrix with axes "Customer" and "Item"; labels A to J on one axis and 1 to 7 on the other, shown in the original order and reordered as A, C, H, B, F, J, D, G, E, I and 1, 4, 3, 5, 7, 2, 6]
Underlying assumption: a low-rank structure of rating matrix.
⇒ Application to discover new Chemically Relevant Composition (CRC)
![Page 21: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/21.jpg)
Candidate chemical compositions
Ternary: AaBbXx, max(a, b, x) = 8, N = 7.4 × 10^6
Quaternary: AaBbCcXx, max(a, b, c, x) = 20, N = 1.2 × 10^9
Quinary: AaBbCcDdXx, max(a, b, c, d, x) = 20, N = 2.3 × 10^10
![Page 22: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/22.jpg)
Number of entry compounds in three databases
[Figure: number of entry compounds for ternary, quaternary and quinary systems in SpringerMaterials, ICDD and ICSD; legend: Training, Test]
![Page 23: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/23.jpg)
Matrix factorization (SCIKIT-LEARN)
Non-negative Matrix Factorization (r : given rank)
Singular Value Decomposition (r : given rank)
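To make the rank-r idea concrete, here is a minimal NumPy sketch of both factorizations applied to a toy rating matrix. It is not the scikit-learn code behind the slides: the SVD part uses `numpy.linalg.svd`, the NMF part is a stand-in using Lee-Seung multiplicative updates, and the function names and the 6 × 6 toy matrix are hypothetical.

```python
import numpy as np


def svd_ratings(R: np.ndarray, r: int) -> np.ndarray:
    """Rank-r reconstruction of R from its truncated SVD."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]


def nmf_ratings(R: np.ndarray, r: int, n_iter: int = 500,
                seed: int = 0, eps: float = 1e-9) -> np.ndarray:
    """Rank-r reconstruction W @ H with W, H >= 0 (multiplicative updates)."""
    rng = np.random.default_rng(seed)
    W = rng.random((R.shape[0], r)) + eps
    H = rng.random((r, R.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ R) / (W.T @ W @ H + eps)
        W *= (R @ H.T) / (W @ H @ H.T + eps)
    return W @ H


def top_unobserved(R: np.ndarray, R_hat: np.ndarray, k: int):
    """Indices of the k highest predicted ratings among entries with R == 0."""
    scores = np.where(R > 0, -np.inf, R_hat)
    order = np.argsort(scores, axis=None)[::-1][:k]
    return [tuple(int(i) for i in np.unravel_index(f, R.shape)) for f in order]


# Toy rating matrix: two blocks of "known" entries, one entry hidden at (0, 2).
R = np.zeros((6, 6))
R[:3, :3] = 1.0
R[3:, 3:] = 1.0
R[0, 2] = 0.0

print(top_unobserved(R, svd_ratings(R, 2), 1))
print(top_unobserved(R, nmf_ratings(R, 2), 1))
```

The hidden entry is recovered because the low-rank reconstruction fills it in from the rows and columns that share its block, which is the low-rank assumption stated for the rating matrix.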
![Page 24: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/24.jpg)
Matrix representation of ternary composition
Type 1, Type 2, Type 3
Example of Rating Matrix (Type 1)
![Page 25: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/25.jpg)
Validation of CRC prediction by a recommender system for ternary compounds using Tucker decomposition
Ternary # Elements: 7,405,200
[Figure: number of correct answers included in ICDD & SpMat]
TOP 100 compositions with high predicted rating: discovery rate > 45% !!
TOP 3,000 compositions with high predicted rating: discovery rate > 21% !!
Dependence on rank is weak. SVD performs slightly better than NMF. Type 2 representation works best.
![Page 26: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/26.jpg)
Tensor representation of binary composition
[Figure: third-order tensor with dimensions 66 × 10 × 170]
Binary: # Elements: 66 × 10 × 170 = 112,200
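The element count quoted for the binary tensor is a plain product of its three dimensions, which can be checked directly. The sketch below also notes that the ternary count quoted on other slides (7,405,200) equals 66 × 66 × 10 × 170; reading that as a fourth-order tensor with one more axis of length 66 is an assumption, since the transcript does not state what each axis means.

```python
from math import prod

binary_shape = (66, 10, 170)        # dimensions shown on the slide
ternary_shape = (66, 66, 10, 170)   # assumption: one extra axis of length 66

n_binary = prod(binary_shape)
n_ternary = prod(ternary_shape)

print(n_binary)    # number of elements in the binary tensor
print(n_ternary)   # number of elements under the assumed ternary shape

# Memory for a dense float64 array of each size, in megabytes.
print(n_binary * 8 / 1e6, n_ternary * 8 / 1e6)
```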
![Page 27: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/27.jpg)
Tensor factorization (SCIKIT-TENSOR)
CP decomposition (canonical polyadic): F. L. Hitchcock, Stud. Appl. Math. 6, 164 (1927).
Tucker decomposition (higher order singular value decomposition, HO-SVD): L. R. Tucker, Psychometrika 31, 279 (1966).
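As a rough illustration of the HO-SVD route to a Tucker decomposition, the following NumPy sketch computes one factor matrix per mode from the SVD of the mode unfolding and then projects the tensor onto those factors. It does not use scikit-tensor; the helper names, the ranks and the random test tensor are hypothetical.

```python
import numpy as np


def unfold(T: np.ndarray, mode: int) -> np.ndarray:
    """Mode-n unfolding: rows indexed by `mode`, columns by all other modes."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)


def mode_dot(T: np.ndarray, M: np.ndarray, mode: int) -> np.ndarray:
    """Multiply tensor T by matrix M along axis `mode`."""
    return np.moveaxis(np.tensordot(M, T, axes=(1, mode)), 0, mode)


def hosvd(T: np.ndarray, ranks):
    """Truncated HO-SVD: returns (core, factors) with factors[n] of shape (I_n, r_n)."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])  # leading r left singular vectors of this mode
    core = T
    for mode, U in enumerate(factors):
        core = mode_dot(core, U.T, mode)
    return core, factors


def reconstruct(core: np.ndarray, factors) -> np.ndarray:
    """Rebuild the (approximate) tensor from the core and factor matrices."""
    T = core
    for mode, U in enumerate(factors):
        T = mode_dot(T, U, mode)
    return T


rng = np.random.default_rng(0)
T = rng.random((4, 3, 5))
core, factors = hosvd(T, ranks=(2, 2, 2))
T_hat = reconstruct(core, factors)
print(core.shape, float(np.linalg.norm(T - T_hat)))
```

In a recommender setting the entries of the reconstructed tensor play the role of predicted ratings, and the ranks control how strongly the tensor is compressed.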
![Page 28: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/28.jpg)
Tensor factorization
Validation of CRC prediction by a recommender system for ternary compounds using Tucker decomposition
Ternary # Elements: 7,405,200
[Figure: number of correct answers included in ICDD & SpMat]
![Page 29: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/29.jpg)
Validation of CRC prediction by a recommender system for ternary compounds using Tucker decomposition
Ternary # Elements: 7,405,200
[Figure: number of correct answers included in ICDD & SpMat]
TOP 100 compositions with high predicted rating: discovery rate > 59% !!
TOP 3,000 compositions with high predicted rating: discovery rate > 25% !!
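The discovery rate on these slides can be read as the fraction of the top-N ranked compositions that appear in the validation databases (ICDD & SpMat). A minimal Python sketch of that bookkeeping follows; this reading, the function name and the placeholder composition labels are assumptions.

```python
from typing import Iterable, Sequence


def discovery_rate(ranked: Sequence[str], known: Iterable[str], top_n: int) -> float:
    """Fraction of the top_n highest-rated compositions found in `known`."""
    top = ranked[:top_n]
    if not top:
        raise ValueError("no compositions to evaluate")
    known_set = set(known)
    hits = sum(1 for composition in top if composition in known_set)
    return hits / len(top)


# Placeholder labels, sorted by decreasing predicted rating.
ranked = ["c1", "c2", "c3", "c4", "c5", "c6"]
# Placeholder set of compositions present in the validation databases.
known = {"c1", "c3", "c9"}

print(discovery_rate(ranked, known, top_n=4))
```

With this definition, 59 hits among the top 100 predictions would correspond to a rate of 0.59.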
![Page 30: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/30.jpg)
Validation of CRC prediction by a recommender system for ternary compounds using Tucker decomposition
Ternary # Elements: 7,405,200
[Figure: number of correct answers included in ICDD & SpMat; axis mark at 3000]
![Page 31: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/31.jpg)
Results for quaternary/quinary systems
[Figure: number of correct answers included in ICDD & SpMat, for TOP100 and TOP3000; values marked: 59%, 52%, 15%]
Discovery rate > 15% even for quinary systems with TOP100 high predicted rating.
![Page 32: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/32.jpg)
Further validation by first principles calculations for pseudo-binary compounds with high predicted rating
Rb3InO3: predicted rating 0.64
RbInO2: predicted rating 1.01
![Page 33: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/33.jpg)
Further validation by first principles calculations for TOP 27 pseudo-binary compounds with high predicted rating
23 among 27 compositions (85%) are thermodynamically stable by DFT!
![Page 34: Isao Tanaka1,2,3,4 · Recommender system for materials discovery Big Data Summer Platja d’Aro, Spain, September 9 - 13, 2019. ... simulated day-light Newly-discovered SnMoO4powder](https://reader036.vdocuments.mx/reader036/viewer/2022081402/5f0e4e317e708231d43e98dd/html5/thumbnails/34.jpg)
Systematic discovery of as-yet-unknown CRC (chemically relevant composition)
• Use of tensor-based recommender system ONLY with inorganic crystal database, ICSD.
• Rating prediction with neither descriptors nor DFT results.
• Validation by two other databases, ICDD-PDF & Springer Materials. Discovery rate is 59/52/15% for TOP 100 ternary/quaternary/quinary CRC.
• Validation by DFT calculations. Among TOP 27 ternary (pseudo-binary oxides), 85% are thermodynamically stable.