
MACHINE LEARNING AND ROBOTICS

Lisa Lyons

10/22/08

OUTLINE

Machine Learning Basics and Terminology

An Example: DARPA Grand/Urban Challenge

Multi-Agent Systems

Netflix Challenge (if time permits)

INTRODUCTION

Machine learning is commonly associated with robotics

When some think of robots, they think of machines like WALL-E (pictured): human-like, with feelings, and capable of complex tasks

Goals for machine learning in robotics aren’t usually this advanced, but some think we’re getting there

Next three slides outline some goals that motivate researchers to continue work in this area

HOUSEHOLD ROBOT TO ASSIST HANDICAPPED

Could come preprogrammed with general procedures and behaviors

Needs to be able to learn to recognize objects and obstacles and maybe even its owner (face recognition?)

Also needs to be able to manipulate objects without breaking them

May not always have all information about its environment (poor lighting, obscured objects)

FLEXIBLE MANUFACTURING ROBOT

Configurable robot that could manufacture multiple items

Must learn to manipulate new types of parts without damaging them

LEARNING SPOKEN DIALOG SYSTEM FOR REPAIRS

Given some initial information about a system, a robot could converse with a human and help to repair it

Speech understanding is a very hard problem in itself

MACHINE LEARNING BASICS AND TERMINOLOGY

With applications and examples in robotics

LEARNING ASSOCIATIONS

Association Rule – probability that an event will happen given another event already has (P(Y|X))
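As a minimal sketch, P(Y|X) can be estimated by counting how often Y appears among the observations that contain X. The observation log, the object names, and the `confidence` helper below are hypothetical and only illustrate the idea.

```python
def confidence(transactions, x, y):
    """Estimate P(y | x): the fraction of transactions containing x that also contain y."""
    with_x = [t for t in transactions if x in t]
    if not with_x:
        return None  # x was never observed, so the estimate is undefined
    return sum(1 for t in with_x if y in t) / len(with_x)


# Hypothetical observations from a household robot: objects seen together.
observations = [
    {"cup", "table"},
    {"cup", "table", "plate"},
    {"cup", "shelf"},
    {"plate", "table"},
]

# Two of the three observations containing "cup" also contain "table".
p_table_given_cup = confidence(observations, "cup", "table")
```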

CLASSIFICATION

Classification – model where input is assigned to a class based on some data

Prediction – assuming a future scenario is similar to a past one, using past data to decide what this scenario would look like

Pattern Recognition – a method used to make predictions (e.g. Face Recognition, Speech Recognition)

Knowledge Extraction – learning a rule from data

Outlier Detection – finding exceptions to the rules

REGRESSION

Linear regression is an example

Both Classification and Regression are “Supervised Learning” strategies where the goal is to find a mapping from input to output

Example: Navigation of autonomous car

Training Data: actions of human drivers in various situations

Input: data from sensors (like GPS or video)

Output: angle to turn steering wheel
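A rough sketch of the supervised mapping from input to output, using one-variable least-squares linear regression. The training pairs are made up: the input is assumed to be a lateral offset from the lane center (in meters) and the output the steering angle a human driver chose (in degrees). A real system would use many more inputs.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data recorded from a human driver.
offsets_m = [-1.0, 0.0, 1.0, 2.0]      # input: lateral offset
angles_deg = [-5.0, 0.0, 5.0, 10.0]    # output: steering angle

slope, intercept = fit_line(offsets_m, angles_deg)


def predict_angle(offset_m):
    """Apply the learned mapping to a new sensor reading."""
    return slope * offset_m + intercept
```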

UNSUPERVISED LEARNING

Only have input

Want to find regularities in the input

Density Estimation: finding patterns in the input space

Clustering: find groupings in the input
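To illustrate clustering, here is a small sketch of k-means with two clusters on one-dimensional data. The choice of k-means, the fixed k = 2, the min/max initialization, and the range readings are all assumptions made for this example.

```python
def kmeans_1d(values, iterations=10):
    """Two-cluster k-means on scalars; centers start at the min and max value."""
    centers = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        labels = [
            0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            for v in values
        ]
        # Update step: each center moves to the mean of its members.
        for k in (0, 1):
            members = [v for v, label in zip(values, labels) if label == k]
            if members:
                centers[k] = sum(members) / len(members)
    return centers, labels


# Hypothetical range readings (meters): a near object and a far wall.
readings = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, labels = kmeans_1d(readings)
```

No labels are supplied; the two groups emerge from the input alone.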

REINFORCEMENT LEARNING

Policy: generating correct actions to reach the goal

Learn from past good policies

Example: robot navigating unknown environment in search of a goal

Some data may be missing

May be multiple agents in the system
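A minimal sketch of learning a policy, using tabular Q-learning on a hypothetical five-cell corridor with the goal in the last cell. The corridor, the reward of 1 at the goal, and the parameter values are assumptions. Random exploration is replaced by an exhaustive sweep over state-action pairs so the example is deterministic.

```python
N_STATES = 5            # corridor cells 0..4
GOAL = N_STATES - 1     # the goal is the last cell
ACTIONS = (-1, +1)      # move left or move right


def step(state, action):
    """Environment model: walls clamp the move; reaching the goal pays 1."""
    nxt = min(max(state + action, 0), GOAL)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward


def q_learning(sweeps=200, alpha=0.5, gamma=0.9):
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(sweeps):
        for s in range(GOAL):                 # the goal state is terminal
            for a in ACTIONS:
                nxt, r = step(s, a)
                future = 0.0 if nxt == GOAL else max(q[(nxt, b)] for b in ACTIONS)
                # Standard Q-learning update toward r + gamma * max_a' Q(s', a').
                q[(s, a)] += alpha * (r + gamma * future - q[(s, a)])
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-goal state."""
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]


q_table = q_learning()
policy = greedy_policy(q_table)   # expected to move right in every cell
```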

POSSIBLE APPLICATIONS

Exploring a world

Learning object properties

Learning to interact with the world and with objects

Optimizing actions

Recognizing states in world model

Monitoring actions to ensure correctness

Recognizing and repairing errors

Planning

Learning action rules

Deciding actions based on tasks

WHAT WE EXPECT ROBOTS TO DO

Be able to react promptly and correctly to changes in environment or internal state

Work in situations where information about the environment is imperfect or incomplete

Learn through their experience and human guidance

Respond quickly to human interaction

Unfortunately, these are very high expectations which don’t always correlate very well with machine learning techniques

DIFFERENCES BETWEEN OTHER TYPES OF MACHINE LEARNING AND ROBOTICS

Other ML Applications:

Planning can frequently be done offline

Actions usually deterministic

No major time constraints

Robotics:

Often require simultaneous planning and execution (online)

Actions could be nondeterministic depending on data (or lack thereof)

Real-time often required

AN EXAMPLE: DARPA GRAND/URBAN CHALLENGE

THE CHALLENGE

Defense Advanced Research Projects Agency (DARPA)

Goal: to build a vehicle capable of traversing unrehearsed off-road terrain

Started in 2003

142 mile course through Mojave

No one made it through more than 5% of the course in 2004 race

In 2005, 195 teams registered, 23 teams raced, 5 teams finished

THE RULES

Must traverse a desert course up to 175 miles long in under 10 h

Course kept secret until 2h before the race

Must follow speed limits for specific areas of the course to protect infrastructure and ecology

If a faster vehicle needs to overtake a slower one, the slower one is paused so that vehicles don’t have to handle dynamic passing

Teams given data on the course 2h before race so that no global path planning was required

A DARPA GRAND CHALLENGE VEHICLE CRASHING

A DARPA GRAND CHALLENGE VEHICLE THAT DID NOT CRASH

…namely Stanley, the winner of the 2005 challenge

TERRAIN MAPPING AND OBSTACLE DETECTION

Data from 5 laser scanners mounted on top of the car is used to generate a point cloud of what’s in front of the car

Classification problem: each cell is Drivable, Occupied, or Unknown

Area in front of vehicle is represented as a grid

Stanley’s system finds the probability that ∆h > δ, where ∆h is the observed height difference between nearby points in a certain cell

If this probability is higher than some threshold α, the system defines the cell as occupied
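The test above can be sketched as follows, assuming ∆h is a noisy measurement with Gaussian error. The Gaussian noise model, the function names, and the numeric values are assumptions for illustration, not Stanley's actual implementation.

```python
import math


def prob_exceeds(mean_dh, sigma, delta):
    """P(dh > delta) when dh ~ Normal(mean_dh, sigma^2) (assumed noise model)."""
    z = (delta - mean_dh) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 - math.erf(z))


def classify_cell(mean_dh, sigma, delta, alpha):
    """Label one grid cell; mean_dh is None when no laser points fell in it."""
    if mean_dh is None:
        return "unknown"
    if prob_exceeds(mean_dh, sigma, delta) > alpha:
        return "occupied"
    return "drivable"


# Hypothetical values: 15 cm height threshold, 5% probability threshold.
DELTA, ALPHA = 0.15, 0.05
rock = classify_cell(0.30, 0.05, DELTA, ALPHA)   # large measured ∆h
flat = classify_cell(0.00, 0.05, DELTA, ALPHA)   # no measured ∆h
empty = classify_cell(None, 0.05, DELTA, ALPHA)  # no data in the cell
```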

(CONT.)

A discriminative learning algorithm is used to tune the parameters

Data is taken as a human driver drives through a mapped terrain avoiding obstacles (supervised learning)

Algorithm uses coordinate ascent to determine δ and α
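A minimal sketch of coordinate ascent: one parameter is varied at a time while the others are held fixed, and a change is kept only if the score improves. The candidate grids and the toy scoring function are hypothetical. In the setting above, the score would instead measure agreement with the terrain the human driver actually drove over.

```python
def coordinate_ascent(objective, start, candidates, sweeps=5):
    """Maximize objective(params) by searching one parameter at a time."""
    params = dict(start)
    best = objective(params)
    for _ in range(sweeps):
        improved = False
        for name, values in candidates.items():
            for value in values:
                trial = dict(params)
                trial[name] = value          # change a single coordinate
                score = objective(trial)
                if score > best:
                    params, best, improved = trial, score, True
        if not improved:
            break                            # no coordinate can improve further
    return params, best


def toy_score(p):
    """Stand-in objective that peaks at delta = 0.2, alpha = 0.1 (made up)."""
    return -((p["delta"] - 0.2) ** 2 + (p["alpha"] - 0.1) ** 2)


tuned, score = coordinate_ascent(
    toy_score,
    start={"delta": 0.1, "alpha": 0.05},
    candidates={"delta": [0.1, 0.2, 0.3], "alpha": [0.05, 0.1, 0.2]},
)
```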

COMPUTER VISION ASPECT

Lasers only make it safe for car to drive < 25 mph

Needs to go faster to satisfy time constraint

Color camera is used for long-range obstacle detection

Still the same classification problem

Now there are more factors to consider – lighting, material, dust on lens

Stanley takes adaptive approach

VISION ALGORITHM

1. Take out the sky

2. Map a quadrilateral on camera video corresponding with laser sensor boundaries

3. As long as this region is deemed drivable, use the pixels in the quad as a training set for the concept of drivable surface

4. Maintain Gaussians that model the color of drivable terrain

5. Adapt by adjusting previous Gaussians and/or throwing them out and adding new ones

Adjustment allows for slow adjustment to lighting conditions

Replacement allows for rapid change in color of the road

6. Label regions as drivable if their pixel values are near one or more of the Gaussians and they are connected to laser quadrilateral
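Steps 4 to 6 can be sketched roughly as below. This is a simplified illustration, not Stanley's code: it assumes diagonal covariances, a count-weighted merge, a fixed distance threshold, and a cap on the number of Gaussians, and it omits the connectivity check against the laser quadrilateral.

```python
from dataclasses import dataclass


@dataclass
class ColorGaussian:
    mean: tuple   # (r, g, b)
    var: tuple    # per-channel variance (diagonal covariance is an assumption)
    count: int    # number of training pixels behind this model


def sq_distance(color, g):
    """Squared Mahalanobis distance under a diagonal covariance."""
    return sum((c - m) ** 2 / v for c, m, v in zip(color, g.mean, g.var))


def looks_drivable(pixel, models, max_sq_dist=9.0):
    """A pixel matches if it is near at least one learned Gaussian."""
    return any(sq_distance(pixel, g) <= max_sq_dist for g in models)


def blend(a, b, weight_a, weight_b):
    total = weight_a + weight_b
    return tuple((x * weight_a + y * weight_b) / total for x, y in zip(a, b))


def adapt(models, new, max_models=3, max_sq_dist=9.0):
    """Adjust a nearby Gaussian (slow change) or add/replace one (rapid change)."""
    for i, g in enumerate(models):
        if sq_distance(new.mean, g) <= max_sq_dist:
            models[i] = ColorGaussian(
                blend(g.mean, new.mean, g.count, new.count),
                blend(g.var, new.var, g.count, new.count),
                g.count + new.count,
            )
            return models
    if len(models) < max_models:
        models.append(new)
    else:
        weakest = min(range(len(models)), key=lambda i: models[i].count)
        models[weakest] = new                # throw out the least-supported model
    return models


# Hypothetical training: gray road, slightly brighter road, then a sandy surface.
models = [ColorGaussian((100.0, 100.0, 100.0), (25.0, 25.0, 25.0), 100)]
adapt(models, ColorGaussian((104.0, 104.0, 104.0), (25.0, 25.0, 25.0), 100))
sand_before = looks_drivable((200, 180, 90), models)
adapt(models, ColorGaussian((200.0, 180.0, 90.0), (25.0, 25.0, 25.0), 100))
sand_after = looks_drivable((200, 180, 90), models)
```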

ROAD BOUNDARIES

Best way to avoid obstacles on a desert road is to find road boundaries and drive down the middle

Uses low-pass one-dimensional Kalman Filters to determine road boundary on both sides of vehicle

Small obstacles don’t really affect the boundary found

Large obstacles over time have a stronger effect
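The low-pass behavior can be illustrated with a scalar Kalman filter tracking the lateral position of one road boundary. The noise values and measurements are made up. With small process noise relative to measurement noise, a single outlier barely moves the estimate, while a persistent shift gradually pulls it over.

```python
def kalman_1d(measurements, x0, p0=1.0, q=0.01, r=1.0):
    """Scalar Kalman filter with a constant-position model.

    x0, p0: initial estimate and variance; q: process noise; r: measurement noise.
    Returns the estimate after each measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p += q                    # predict: uncertainty grows slightly
        k = p / (p + r)           # Kalman gain
        x += k * (z - x)          # correct toward the measurement
        p *= (1.0 - k)            # uncertainty shrinks after the update
        estimates.append(x)
    return estimates


# Boundary measured 3 m to the side, with one brief outlier at 5 m.
blip = kalman_1d([3.0] * 10 + [5.0] + [3.0] * 10, x0=3.0)

# Boundary measured at 3 m, then consistently at 5 m.
shift = kalman_1d([3.0] * 10 + [5.0] * 50, x0=3.0)
```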

SLOPE AND RUGGEDNESS

If terrain becomes too rugged or steep, vehicle must slow down to maintain control

Slope is found from vehicle’s pitch estimate

Ruggedness is determined by taking data from vehicle’s z accelerometer with gravity and vehicle vibration filtered out

PATH PLANNING

No global planning necessary

Coordinate system used is base trajectory + lateral offset

Base trajectory is smoothed version of driving corridor on the map given to contestants before the race

PATH SMOOTHING

Base trajectory computed in 4 steps:

1. Points are added to the map in proportion to local curvature

2. Least-squares optimization is used to adjust trajectories for smoothing

3. Cubic spline interpolation is used to find a path that can be resampled efficiently

4. Calculate the speed limit
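One plausible way to do step 4, shown only as a sketch: bound speed so that lateral acceleration v²·κ stays under a limit, then cap it at the posted limit for that part of the course. The slides do not state how Stanley computed its limit, so the function name and numbers here are assumptions.

```python
import math


def curvature_speed_limit(curvature, max_lateral_accel, posted_limit):
    """Speed (m/s) keeping v^2 * |curvature| <= max_lateral_accel, capped at posted_limit.

    curvature is in 1/m; max_lateral_accel is in m/s^2.
    """
    if curvature == 0:
        return posted_limit            # straight segment: only the posted limit applies
    return min(posted_limit, math.sqrt(max_lateral_accel / abs(curvature)))


# Hypothetical segment: 10 m turning radius, 2.5 m/s^2 lateral limit.
v_turn = curvature_speed_limit(0.1, 2.5, posted_limit=20.0)
v_straight = curvature_speed_limit(0.0, 2.5, posted_limit=20.0)
```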

ONLINE PATH PLANNING

Determines the actual trajectory of vehicle during race

Search algorithm that minimizes a linear combination of continuous cost functions

Subject to dynamic and kinematic constraints: max lateral acceleration, max steering angle, max steering rate, max acceleration

Penalize hitting obstacles, leaving corridor, leaving center of road
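The planner described above can be sketched as a search over candidate trajectories: infeasible candidates are discarded and the rest are scored by a weighted sum of penalties. The candidate fields, weights, and limits are hypothetical, and only two of the four constraints are checked to keep the example short.

```python
def plan_cost(candidate, weights):
    """Linear combination of the penalty terms for one candidate trajectory."""
    return (
        weights["obstacle"] * candidate["obstacle_overlap"]
        + weights["corridor"] * candidate["outside_corridor"]
        + weights["center"] * abs(candidate["offset_from_center"])
    )


def is_feasible(candidate, limits):
    """Kinematic and dynamic constraints (a subset, for illustration)."""
    return (
        abs(candidate["lateral_accel"]) <= limits["max_lateral_accel"]
        and abs(candidate["steering_angle"]) <= limits["max_steering_angle"]
    )


def choose_trajectory(candidates, weights, limits):
    feasible = [c for c in candidates if is_feasible(c, limits)]
    if not feasible:
        return None
    return min(feasible, key=lambda c: plan_cost(c, weights))


# Hypothetical candidates at different lateral offsets from the base trajectory.
candidates = [
    {"name": "A", "offset_from_center": 0.0, "obstacle_overlap": 1.0,
     "outside_corridor": 0.0, "lateral_accel": 1.0, "steering_angle": 0.1},
    {"name": "B", "offset_from_center": 1.0, "obstacle_overlap": 0.0,
     "outside_corridor": 0.0, "lateral_accel": 1.5, "steering_angle": 0.2},
    {"name": "C", "offset_from_center": 0.5, "obstacle_overlap": 0.0,
     "outside_corridor": 0.0, "lateral_accel": 9.0, "steering_angle": 0.2},
]
weights = {"obstacle": 100.0, "corridor": 10.0, "center": 1.0}
limits = {"max_lateral_accel": 3.0, "max_steering_angle": 0.5}

best = choose_trajectory(candidates, weights, limits)
```

Here candidate C is cheapest but violates the lateral acceleration limit, and A stays centered but hits an obstacle, so B is selected.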

MULTI-AGENT SYSTEMS

RECURSIVE MODELING METHOD (RMM)

Agents model the belief states of other agents

Bayesian methods implemented

Useful in homogeneous non-communicating Multi-Agent Systems (MAS)

Has to be cut off at some point (don’t want a situation where agent A thinks that agent B thinks that agent A thinks that…)

Agents can affect other agents by affecting the environment to produce a desired reaction
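As a rough illustration of the Bayesian part, the sketch below shows a single-level belief update: agent A revises its belief about which model describes agent B after observing one of B's actions. It is not the full recursive method. The model names, actions, and probabilities are invented for the example.

```python
def update_belief(prior, likelihood, observed_action):
    """Bayes' rule: P(model | action) is proportional to P(action | model) * P(model)."""
    unnormalized = {
        model: prior[model] * likelihood[model][observed_action]
        for model in prior
    }
    total = sum(unnormalized.values())
    if total == 0:
        return dict(prior)      # the observation is impossible under every model
    return {model: p / total for model, p in unnormalized.items()}


# Hypothetical models agent A holds of agent B, and how likely each action is under them.
prior = {"heading_to_goal": 0.5, "wandering": 0.5}
likelihood = {
    "heading_to_goal": {"move_toward_goal": 0.8, "move_away": 0.2},
    "wandering": {"move_toward_goal": 0.2, "move_away": 0.8},
}

posterior = update_belief(prior, likelihood, "move_toward_goal")
```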

HETEROGENEOUS NON-COMMUNICATING MAS

Competitive and cooperative learning possible

Competitive learning more difficult because agents may end up in “arms race”

Credit-assignment problem: can’t tell if agent benefitted because its actions were good or if opponent’s actions were bad

Experts and observers have proven useful

Different agents may be given different roles to reach the goal

Supervised learning to “teach” each agent how to do its part

COMMUNICATION

Allowing agents to communicate can lead to deeper levels of planning since agents know (or think they know) the beliefs of others

Could allow one agent to “train” another to follow its actions using reinforcement learning

Negotiations

Commitment

Autonomous robots could understand their position in an environment by querying other robots for their believed positions and making a guess based on that (Markov localization, SLAM)

NETFLIX CHALLENGE

(if time permits)

REFERENCES

Alpaydin, E. Introduction to Machine Learning. Cambridge, Mass.: MIT Press, 2004.

Kreuziger, J. “Application of Machine Learning to Robotics – An Analysis.” In Proceedings of the Second International Conference on Automation, Robotics, and Computer Vision (ICARCV '92). 1992.

Mitchell et al. “Machine Learning.” Annu. Rev. Comput. Sci. 1990. 4:417-33.

Stone, P. and Veloso, M. “Multiagent Systems: A Survey from a Machine Learning Perspective.” Autonomous Robots 8, 345-383, 2000.

Thrun et al. “Stanley: The Robot that Won the DARPA Grand Challenge.” Journal of Field Robotics 23(9), 661-692, 2006.
