Download - Deep-Q network for truss topology optimization with stress

Deep-Q network for truss topology optimization with stress constraints

Kazuki Hayashi (Kyoto University)Makoto Ohsaki (Kyoto University)

Ingredients of RL

• ：agent

1. status 2. action 3. reward 4. environment

• Objective is to maximize total reward

2

( )1,Q as

・・・・

・・

・・・

・・・

( )2,Q as

( ), nQ as

1s

2s

ms

1 if meet constraint1 else+−

Python

Neural network

s a r ( , ) { ', }F s a s r=

Problem is NOT always pixel-wise

CNN→ How to apply RL to discrete structures?

available NOT available

continuous truss frame

Structure2Vec (Hanjun et al., 2016)

• Reinforcement learning + graph embedding• Train for MVC and TSP with 50-100 nodes→ Apply the trained model to 1000-1200 nodes→ Solution comparable to CPLEX solver’s one

1 hour12 seconds

Minimum vertex cover (MVC) Traveling salesman problem(TSP)

2

2

4

12

3

2

Minimize (no. of nodes selected) Minimize (total path)

Graph embedding

• Convolutional NN : :• Graph embedding :

Whole graph Node embedding

=

Edge embedding

Image → Feature VectorGraph → Feature Vector

We formulate a new methodfor truss topology optimization

ex ) Structure2Vec(Hanjun et al., 2016)

Edge embedding

• : feature vector of edge eeµ

e e

( )tuµ( )t

uµ

( )tuµ ( )t

uµ

ew

,2ex,1ex

( 1)teµ+

( ),

2( ) ( )

2 2( 1)

1 31 1

,4relu relu relu( )e i

t tu

te

i ive i

ueew xµ µ θ θµθ θ+

= =∈ℵ

← + − +

∑ ∑ ∑

Input features x and w

• : feature vector of edge eeµ

( ),

2( ) ( )

2 2( 1)

1 31 1

,4relu relu relu( )e i

t tu

te

i ive i

ueew xµ µ θ θµθ θ+

= =∈ℵ

← + − +

∑ ∑ ∑

e

( )tuµ( )t

uµ

( )tuµ ( )t

uµ

ew

,2ex,1ex 3

1) Binary feature if node is pin-supported2) Load in x direction3) Load in y direction

5

1) Absolute value of cosine direction2) Absolute value of sine direction3) Member length4) Binary feature if 5) Axial stress over admissible stress

20

Embedded edge features

Q-value using embedded value

• : value to remove edge e in current state s( ),Q s e

T ( ) ( )5 6 7( ( ), ) relu[ , ]T T

u eu V

Q s eµ θ θ µ θ µ∈

= ∑

( )Teµ

( )Tµ( )Tµ

( )Tvµ ( )Tµ

e e

QQ

QQ

( ),Q s e

Q-learning

• Q-learning (Watkins, 1989)

→

s(μ) : expressed by edge embeddinga : remove an edge e withr : +1 (if satisfy constraint) or -1 (else)

( ) ( ) ( ) ( )( )2

1 7m 'inim mize { x, , } ,a ',a

r Q Q aaF θ θ γ+= −s ss

( ) ( ) ( ) ( ) ( )( )' max ' ,,, ,a

Q aQ r aa Qa Qγα← + −+ss ss s

estimated total rewardat current state

estimated total rewardat next state

observed reward +

max ( , )e

Q a e=s

Training modelmaximizesubject to 2.0 ( 1, , )

16.0 ( 1, , )i

j p

number of edges eliminatedi m

u u j n

σ σ≤ = =

≤ = =

Continue until violating constraints

4

assign very small cross-section

4

Choose 1 load randomly

Training result

Agent learned how to remove unnecessary members

Validation model Best scored topology

number of training episodes

validation score= Total rewards

Applicability to another model

12

3 s.t. 2.0, 16.0uσ = =

Trained agent can be used without re-training

1

Applicability to another models.t. 4.0, 50.0uσ = =

1

1

6

6

Conclusion

• Graph embedding method is introduced to express features of truss members

• Conducted Q-learning based on the embedded features

• Trained agent is applicable to any topology and geometry of trusses

Email: [email protected]

vvx

( , )w v u

( )tuµ

( )tuµ

( , )w v u

( , )w v u

( )tuµ

Download - Deep-Q network for truss topology optimization with stress

Top Related