
Task and Motion Policy Synthesis as Liveness Games

Yue Wang

Department of Computer Science, Rice University

May 9, 2016

Joint work with Neil T. Dantam, Swarat Chaudhuri, and Lydia E. Kavraki


Motivation

• Industrial Robots (picture from robots.co)

▪ Highly structured environment

▪ Pre-computed plan

• Personal Robots (picture from robohow.eu)

▪ Unstructured environment

▪ Pre-computed plan?



Example: Kitchen Scenario

• Task

▪ avoid collisions

▪ eventually pick up an object

• Assumptions

▪ perfect sensing of current state

▪ deterministic actions

• A pre-computed plan does not work

• Need a policy

Problem: Given (1) a task specification, (2) a geometric description of Robot and Env, and (3) a discrete abstraction of Robot and Env actions, automatically synthesize a policy that accomplishes the task.


Challenges

• Uncontrollable agents

▪ What is the proper model?

• Policy over large state space

▪ How to efficiently synthesize the policy?

▪ Integration of task and motion planning [e.g., Bhatia et al. ’11; Kaelbling and Lozano-Perez ’11; Srivastava et al. ’14; He et al. ’15]

- Information from continuous geometry

• Approach

▪ Games between Robot and Env

▪ Policy synthesis algorithm








Related Work

• Static, deterministic domain

▪ Task and Motion Planning (TMP) [e.g., Bhatia et al. ’11; Kaelbling and Lozano-Perez ’11; Srivastava et al. ’14; He et al. ’15]

• Uncertain domain: stochastic

▪ MDP [e.g., Lahijanian et al. ’10 ’12; Ding et al. ’11; Wolff et al. ’12; Luna et al. ’14]

▪ POMDP [e.g., Grady et al. ’13 ’15; Kurniawati et al. ’08; Somani et al. ’13]

▪ Planning in belief space [e.g., Kaelbling and Lozano-Perez ’13; Levin et al. ’13; Wong et al. ’13; Hadfield-Menell et al. ’15]

• Uncertain domain: adversarial (worst case)

▪ Reactive Synthesis [e.g., Kress-Gazit et al. ’09; Decastro and Kress-Gazit ’15; Wongpiromsarn et al. ’10; Ulusoy et al. ’13; Alur, Moarref, and Topcu ’15]

▪ Our Problem

• System classes compared: Differential Dynamics; Mobile Manipulation (High DOF)


Ideas we extend

• Program Synthesis

▪ Syntax guided synthesis (SyGuS) [Alur et al. ’13]

▪ Counterexample guided inductive synthesis (CEGIS) [Solar-Lezama et al. ’06]

▪ Satisfiability Modulo Theories (SMT) [De Moura and Bjørner ’08]

- efficiently handle quantitative constraints

• Games

▪ [de Alfaro and Henzinger ’00; Alur, Henzinger, and Kupferman ’02]

▪ Solving infinite games [Beyene et al. ’14]

▪ Liveness Games: eventually reach a certain state


Liveness Game structure

• Game state space

▪ Robot states × Env states

• Game transitions

▪ valid moves for Robot and Env

• Winning condition

▪ Defined using a set dst of goal states

- A winning play should eventually visit a state s ∈ dst.

Policy: select a proper action for the robot in every state
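
To make the game structure concrete, here is a minimal sketch of how a liveness (reachability) game on a small explicit graph can be solved with the textbook backward attractor computation. It is not the synthesis algorithm presented in this talk. It assumes a turn-based game in which every state is owned by either Robot or Env, and the names `solve_reachability`, `owner`, and `moves`, as well as the five-state example, are hypothetical.

```python
from typing import Dict, Hashable, List, Set, Tuple

State = Hashable


def solve_reachability(
    owner: Dict[State, str],          # "robot" or "env": who moves in each state
    moves: Dict[State, List[State]],  # valid successor states of each state
    dst: Set[State],                  # goal states
) -> Tuple[Set[State], Dict[State, State]]:
    """Return Robot's winning states and a policy on Robot-owned winning states."""
    win: Set[State] = set(dst)
    policy: Dict[State, State] = {}
    changed = True
    while changed:  # grow the winning set until a fixed point is reached
        changed = False
        for s in owner:
            if s in win:
                continue
            succ = moves.get(s, [])
            if owner[s] == "robot":
                # Robot needs only one move that leads into the winning set.
                good = [t for t in succ if t in win]
                if good:
                    win.add(s)
                    policy[s] = good[0]
                    changed = True
            elif succ and all(t in win for t in succ):
                # Env state: every Env move must stay inside the winning set.
                win.add(s)
                changed = True
    return win, policy


# Hypothetical example: Robot moves in r0, r2, goal, trap; Env moves in e1.
owner = {"r0": "robot", "e1": "env", "r2": "robot", "goal": "robot", "trap": "robot"}
moves = {
    "r0": ["e1", "trap"],
    "e1": ["r2", "goal"],
    "r2": ["goal"],
    "goal": [],
    "trap": ["trap"],
}
win, policy = solve_reachability(owner, moves, {"goal"})
print(sorted(win))
print(policy)
```

Each state is added to the winning set only after the successor chosen for it is already winning, so following `policy` moves strictly closer to dst whatever Env does. Explicit enumeration like this does not scale to the large state spaces discussed earlier, which is the motivation for a symbolic, counterexample-guided approach.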


Policy Synthesis as Games

Input → Game Structure

• Geometric description of Robot and Env → Placement Graph [Nedunuri et al. ’14] → Game state space

• Discrete abstraction of Robot and Env actions → Constraints on system transitions → Game transitions

• Liveness Task Specification → Liveness winning condition

• Construct a policy → Find a winning strategy


Policy Synthesis Algorithm

• Iteratively generate a candidate policy and verify its correctness

▪ Counterexample guided [Solar-Lezama et al. ’06]

• Apply a heuristic to generalize failures
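
The generate-and-verify loop can be sketched generically as follows. This is only an illustration of the counterexample-guided pattern over a finite candidate space, not the algorithm from the talk, which works on game structures with an SMT back end. The function `cegis`, the callbacks `verify` and `refutes`, and the toy "crossing time" problem are all assumptions made for this example.

```python
from typing import Callable, FrozenSet, Iterable, List, Optional, TypeVar

C = TypeVar("C")  # candidate type
X = TypeVar("X")  # counterexample type


def cegis(
    candidates: Iterable[C],
    verify: Callable[[C], Optional[X]],   # returns a counterexample, or None if correct
    refutes: Callable[[X, C], bool],      # does a stored counterexample rule out a candidate?
) -> Optional[C]:
    """Counterexample-guided search over a finite candidate space."""
    counterexamples: List[X] = []
    for cand in candidates:
        # Skip candidates already ruled out by an earlier failure.
        if any(refutes(cex, cand) for cex in counterexamples):
            continue
        cex = verify(cand)
        if cex is None:
            return cand
        counterexamples.append(cex)
    return None


# Toy problem (hypothetical): pick a time step at which the robot can cross a
# cell that a chef may occupy. Each behaviour lists the steps the chef is there.
CHEF_BEHAVIOURS: List[FrozenSet[int]] = [
    frozenset({0, 1}),
    frozenset({1, 2, 3}),
    frozenset({5}),
]
verify_log: List[int] = []  # candidates that actually reached the verifier


def verify(t: int) -> Optional[FrozenSet[int]]:
    verify_log.append(t)
    for busy in CHEF_BEHAVIOURS:
        if t in busy:
            return busy  # the whole busy interval serves as the counterexample
    return None


def refutes(busy: FrozenSet[int], t: int) -> bool:
    return t in busy


crossing_time = cegis(range(6), verify, refutes)
print(crossing_time, verify_log)
```

Because each counterexample here is a set of time steps rather than a single failing step, one failed check eliminates several candidates before they reach the verifier. That is the effect generalizing failures aims for.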


Geometric-Based Generalization

• Generalize the counterexample to a set of similar examples:

▪ Explore geometric structure

▪ Reduce the number of iterations needed, improving efficiency

[Figure: a single counterexample is generalized to a counterexample set]
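
One way such a generalization could look is sketched below. The slides do not spell out the heuristic, so everything here is an assumption: the robot motion is modelled as a straight segment, an Env agent is a point on a grid, a collision is defined by a clearance distance, and the names `dist_point_segment` and `generalize` are hypothetical.

```python
import math
from typing import List, Set, Tuple

Point = Tuple[float, float]
Cell = Tuple[int, int]


def dist_point_segment(p: Point, a: Point, b: Point) -> float:
    """Euclidean distance from point p to the segment from a to b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    denom = dx * dx + dy * dy
    # Parameter of the closest point on the segment, clamped to [0, 1].
    t = 0.0 if denom == 0 else max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / denom))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def generalize(
    cex: Cell,
    motion: Tuple[Point, Point],
    cells: List[Cell],
    clearance: float,
) -> Set[Cell]:
    """Expand one colliding Env position to every position that collides the same way."""
    a, b = motion
    if dist_point_segment(cex, a, b) > clearance:
        return {cex}  # not a geometric collision: nothing to generalize
    return {c for c in cells if dist_point_segment(c, a, b) <= clearance}


# Hypothetical 5 x 3 grid; the robot moves along the bottom row.
cells = [(x, y) for x in range(5) for y in range(3)]
motion = ((0.0, 0.0), (4.0, 0.0))
cex_set = generalize((2, 1), motion, cells, clearance=1.0)
print(sorted(cex_set))
```

A single counterexample at cell (2, 1) rules out all ten Env positions in the two rows nearest the motion in one iteration, instead of the verifier discovering them one at a time.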


Experiments

Kitchen Scenario

• Kitchen environment

▪ 2 chefs moving within the blue region

▪ increasing the size of the blue region (FoodPrep Region)

• Task requirements:

• Comparison with the GR(1) synthesizer [Piterman, Pnueli, and Sa’ar ’06]

▪ back-end solver of LTLMoP [Finucane, Jing, and Kress-Gazit ’10]


Results


• In the tested benchmark, our method scales better for large problems

• Generalization gives order-of-magnitude speedup


Performance with quantitative constraints: energy limits

• Still scales well

• About one-time slower


Conclusion


• Game model for policy synthesis in adversarial domains

• Algorithm for solving liveness games

▪ utilize geometric information (generalization)

▪ efficiently handle quantitative constraints, e.g., energy limits

• Future extensions:

▪ other sources of uncertainty, such as sensor noise

▪ investigate additional generalization heuristics for broader domains



Thank you!

Questions?
