Task and Motion Policy Synthesis as Liveness Games

Yue Wang

Department of Computer Science, Rice University

May 9, 2016

Joint work with Neil T. Dantam, Swarat Chaudhuri, and Lydia E. Kavraki

Yue Wang, Neil T. Dantam, Swarat Chaudhuri, Lydia E. Kavraki (Rice University): Task and Motion Policy Synthesis as Liveness Games

Motivation

Industrial Robots (picture from robots.co)

• Highly structured environment

• Pre-computed plan

Personal Robots (picture from robohow.eu)

• Unstructured environment

• ?

Example — Kitchen Scenario

• Task

▪ avoid collisions

▪ eventually pick up an object

• Assumptions

▪ perfect sensing of current state

▪ deterministic actions

• A pre-computed plan does not work

• Need a policy

Problem: Given (1) Task Specification, (2) Geometric description of Robot and Env, and (3) Discrete abstraction of Robot and Env actions, automatically synthesize a policy that accomplishes the task.

Challenges

• Uncontrollable agents

▪ What is the proper model? → Games between Robot and Env

• Policy over large state space

▪ How to efficiently synthesize the policy? → Policy Synthesis Algorithm

▪ Integration of task and motion planning [e.g., Bhatia et al. ’11; Kaelbling and Lozano-Perez ’11; Srivastava et al. ’14; He et al. ’15]

- Information from continuous geometry

Related Work

• Static, deterministic domain

▪ Task and Motion Planning (TMP) [e.g., Bhatia et al. ’11; Kaelbling and Lozano-Perez ’11; Srivastava et al. ’14; He et al. ’15]

• Uncertain domain, stochastic

▪ MDP [e.g., Lahijanian et al. ’10, ’12; Ding et al. ’11; Wolff et al. ’12; Luna et al. ’14]

▪ POMDP [e.g., Grady et al. ’13, ’15; Kurniawati et al. ’08; Somani et al. ’13]

▪ Planning in belief space [e.g., Kaelbling and Lozano-Perez ’13; Levin et al. ’13; Wong et al. ’13; Hadfield-Menell et al. ’15]

• Uncertain domain, adversarial (worst case)

▪ Reactive Synthesis [e.g., Kress-Gazit et al. ’09; Decastro and Kress-Gazit ’15; Wongpiromsarn et al. ’10; Ulusoy et al. ’13; Alur, Moarref, and Topcu ’15]

▪ Our Problem

• Second axis of the comparison: Differential Dynamics, Mobile Manipulation (High DOF)

Ideas we extend

• Program Synthesis

▪ Syntax guided synthesis (SyGuS) [Alur et al. ’13]

▪ Counterexample guided inductive synthesis (CEGIS) [Solar-Lezama et al. ’06]

▪ Satisfiability Modulo Theories (SMT) [De Moura and Bjørner ’08]

- efficiently handle quantitative constraints

• Games

▪ de Alfaro and Henzinger ’00; Alur, Henzinger, and Kupferman ’02

▪ Solving infinite games [Beyene et al. 2014]

▪ Liveness Games: eventually reach a certain state

Liveness Game structure

• Game state space

▪ Robot states × Env states

• Game transitions

▪ valid moves for Robot and Env

• Winning condition

▪ Defined using a set dst of goal states

- A winning play must eventually visit some state s ∈ dst.

Policy: select a proper action for the robot in every state
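The game structure above can be made concrete with a small explicit-state sketch. It uses the classical backward fixpoint for turn-based reach-the-goal games, which is a textbook method and not the synthesis algorithm presented in this talk. The toy kitchen graph, the turn order (robot moves, then the chef replies), and the rule that the chef never walks into the robot are all assumptions made for illustration.

```python
from itertools import product

def solve_liveness_game(states, robot_actions, env_responses, dst):
    """Classical backward fixpoint for a turn-based reach-the-goal game.

    Returns (win, policy): `win` is the set of states from which the robot can
    force a visit to `dst`; `policy` maps each winning non-goal state to the
    intermediate state the robot should move to.
    """
    win = set(dst)
    policy = {}
    changed = True
    while changed:
        changed = False
        for s in states:
            if s in win:
                continue
            for m in robot_actions(s):
                replies = list(env_responses(m))
                # The move is good if it reaches dst at once, or if every
                # environment reply leads to an already-winning state.
                if m in dst or (replies and all(t in win for t in replies)):
                    win.add(s)
                    policy[s] = m
                    changed = True
                    break
    return win, policy

# Hypothetical toy kitchen: the robot goes from A to D via B or C,
# while one chef moves between B and C.
ROBOT_ADJ = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": []}
CHEF_ADJ = {"B": ["C"], "C": ["B"]}

def robot_actions(state):
    robot, chef = state
    # Move to a free neighbouring cell, or stay put.
    moves = [cell for cell in ROBOT_ADJ[robot] if cell != chef] + [robot]
    return [(cell, chef) for cell in moves]

def env_responses(state):
    robot, chef = state
    # Assumption: the chef may block the robot but never walks into it.
    moves = [chef] + [cell for cell in CHEF_ADJ[chef] if cell != robot]
    return [(robot, cell) for cell in moves]

# Game state space: Robot states x Env states, minus collision states.
states = [(r, c) for r, c in product(ROBOT_ADJ, CHEF_ADJ) if r != c]
dst = {s for s in states if s[0] == "D"}  # robot has reached the object at D
win, policy = solve_liveness_game(states, robot_actions, env_responses, dst)
print(policy[("A", "B")])  # ('C', 'B'): go around the chef
```

Here collision avoidance is folded into the transition constraints, so the only winning condition left is the liveness one. Explicit enumeration like this stops scaling once the product state space grows, which is the motivation for the symbolic approach described in the talk.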

Policy Synthesis as Games

Input → Game Structure

• Geometric description of Robot and Env → Placement Graph [Nedunuri et al. 2014] → Game state space

• Discrete abstraction of Robot and Env actions → Constraints on system transitions → Game transitions

• Liveness Task Specification → Liveness winning condition

• Construct a policy → Find a winning strategy

Policy Synthesis Algorithm

• Iteratively generate a candidate policy and verify its correctness

▪ Counterexample guided [Solar-Lezama et al. ’06]

• Apply a heuristic to generalize failures
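The generate-and-verify loop can be sketched on a toy game as follows. This is only an illustration of the counterexample-guided pattern, not the implementation from the talk: plain enumeration stands in for a solver-based candidate generator, the bounded horizon is an assumption, and the toy kitchen graph and all function names are hypothetical. A failed candidate is generalized in the simplest sound way: any policy that repeats the decisions made along the losing play is skipped, because the environment could replay the same moves against it.

```python
from itertools import product

# Hypothetical toy kitchen: robot goes from A to D via B or C; a chef moves between B and C.
ROBOT_ADJ = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": []}
CHEF_ADJ = {"B": ["C"], "C": ["B"]}

def robot_actions(state):
    robot, chef = state
    moves = [cell for cell in ROBOT_ADJ[robot] if cell != chef] + [robot]
    return [(cell, chef) for cell in moves]

def env_responses(state):
    robot, chef = state
    # Assumption: the chef may block the robot but never walks into it.
    moves = [chef] + [cell for cell in CHEF_ADJ[chef] if cell != robot]
    return [(robot, cell) for cell in moves]

def find_counterexample(policy, init, env_responses, dst, horizon):
    """Verifier: return the robot decisions along a losing play, or None."""
    stack = [(init, [])]
    while stack:
        s, decisions = stack.pop()
        if s in dst:
            continue  # this play already visited a goal state
        if len(decisions) == horizon:
            return decisions  # environment kept the robot away from dst
        m = policy[s]
        decisions = decisions + [(s, m)]
        if m in dst:
            continue
        replies = list(env_responses(m))
        if not replies:
            return decisions  # play is stuck without reaching dst
        stack.extend((t, decisions) for t in replies)
    return None

def cegis(states, robot_actions, env_responses, dst, init, horizon):
    """Propose candidate policies until one passes verification."""
    choices = {s: list(robot_actions(s)) for s in states if s not in dst}
    keys = sorted(choices)
    learned = []  # each entry: decisions that together admit a losing play
    for combo in product(*(choices[k] for k in keys)):
        policy = dict(zip(keys, combo))
        # Skip candidates that repeat a known losing set of decisions.
        if any(all(policy[s] == m for s, m in bad.items()) for bad in learned):
            continue
        cex = find_counterexample(policy, init, env_responses, dst, horizon)
        if cex is None:
            return policy, len(learned)
        learned.append(dict(cex))
    return None, len(learned)

states = [(r, c) for r, c in product(ROBOT_ADJ, CHEF_ADJ) if r != c]
dst = {s for s in states if s[0] == "D"}
policy, rejected = cegis(states, robot_actions, env_responses, dst, ("A", "B"), horizon=4)
print(policy[("A", "B")], "after", rejected, "rejected candidate(s)")
```

The stronger the generalization step, the more candidates each counterexample rules out, which is where a domain-specific heuristic can pay off.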

Geometric-Based Generalization

• Generalize the counterexample to a set of similar examples:

▪ Exploit geometric structure

▪ Reduce the number of iterations needed, improving efficiency

[Figure: a single counterexample is generalized to a counterexample set]
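As a rough illustration of what a geometry-based generalization might look like, the sketch below widens one failing case to a set. The slides do not spell out the heuristic, so the criterion here is an assumption: if a robot motion fails because a chef stands at one placement, the same motion is taken to fail for every placement within a clearance distance of the motion segment. The placement names, coordinates, and the `clearance` parameter are hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment from a to b (2D)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, then clamp to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def generalize_counterexample(motion, placements, clearance):
    """Return every chef placement that would block the same robot motion."""
    start, end = motion
    return {
        name
        for name, position in placements.items()
        if point_segment_distance(position, start, end) < clearance
    }

# One observed failure: the robot motion (0,0) -> (4,0) collides with a chef at p1.
placements = {"p1": (2.0, 0.3), "p2": (1.0, 0.4), "p3": (2.0, 3.0), "p4": (5.0, 0.2)}
blocked = generalize_counterexample(((0.0, 0.0), (4.0, 0.0)), placements, clearance=0.6)
print(sorted(blocked))  # ['p1', 'p2']
```

A single geometric check thus excludes a whole set of game states at once, instead of rediscovering each one in a separate iteration.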

Experiments

Kitchen Scenario

• Kitchen environment

▪ 2 chefs moving within the blue region

▪ experiments increase the size of the blue region (FoodPrep Region)

• Task requirements:

▪ avoid collisions

▪ eventually pick up an object

• Comparison with the GR(1) synthesizer [Piterman, Pnueli, and Saar 2006]

▪ back-end solver of LTLMoP [Finucane, Jing, and Kress-Gazit ’10]

Results

• On the tested benchmark, our method scales better than the GR(1) synthesizer on large problems

• Generalization gives an order-of-magnitude speedup

Performance with quantitative constraints — energy limits

• Still scales well

• About one time slower than without the constraints

Conclusion

• Game model for policy synthesis in adversarial domains

• Algorithm for solving liveness games

▪ utilizes geometric information (generalization)

▪ efficiently handles quantitative constraints, e.g., energy limits

• Future extensions:

▪ other sources of uncertainty, such as sensor noise

▪ additional generalization heuristics for broader domains

Thank you!

Questions?