Research at the Decision Making Lab
Fabio Cozman, Universidade de São Paulo



TRANSCRIPT

Page 1: Research at the Decision Making Lab

Fabio Cozman, Universidade de São Paulo

Page 2

Decision Making Lab (2002)

Page 3

Research tree

Robotics (a bit)

Bayes nets
– Anytime, anyspace (embedded systems)
– Classification
– Applications: medical decisions
– MCMC algorithms: inference & testing

Sets of probabilities
– Algorithms: independence
– Applications: MDPs, robustness analysis, auctions

Page 4

Some (bio)robotics

Page 5

Bayesian networks

Page 6

Decisions in medical domains (with the University Hospital)

Idea: To improve decisions at medical posts in urban, poor areas

We are building networks that represent cardiac arrest — can be caused by stress, cardiac problems, respiratory problems, etc

– Supported by FAPESP

Page 7

The HU-network

Page 8

A better interface for teaching

Page 9

Embedded Bayesian networks

Challenge: to implement inference algorithms compactly and efficiently

Real challenge: to develop anytime anyspace inference algorithms

Idea: decompose networks, apply several algorithms (UAI2002 workshop on RT)

– Supported by HP Labs

Page 10

Decomposing networks

How to decompose and assign algorithms to meet space and time constraints with reasonable accuracy

Page 11

Application: Failure analysis in car-wash systems

Page 12

The car-wash network

Page 13

Generating random networks

Problem is easy to state, hard to solve: critical properties of DAGs are not known

Method based on MCMC simulation, with constraints on induced width and degree

– Supported by FAPESP
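The slide names the method but not its details. As a hypothetical sketch (function names and parameters are mine, and the induced-width constraint from the slide is omitted for brevity), an MCMC-style walk over DAGs can repeatedly propose adding or deleting a single arc, rejecting proposals that create a cycle or violate a degree cap:

```python
import random

def is_acyclic(n, edges):
    # Kahn's algorithm: the graph is a DAG iff every node can be
    # removed in topological order.
    indegree = [0] * n
    for _, v in edges:
        indegree[v] += 1
    frontier = [v for v in range(n) if indegree[v] == 0]
    removed = 0
    while frontier:
        u = frontier.pop()
        removed += 1
        for a, b in edges:
            if a == u:
                indegree[b] -= 1
                if indegree[b] == 0:
                    frontier.append(b)
    return removed == n

def random_dag(n, steps=2000, max_degree=3, seed=42):
    # Random walk over DAGs: flip one arc per step, keeping only
    # moves that preserve acyclicity and the degree bound.
    rng = random.Random(seed)
    edges = set()
    for _ in range(steps):
        u, v = rng.sample(range(n), 2)
        if (u, v) in edges:
            edges.discard((u, v))     # deleting an arc is always safe
            continue
        candidate = edges | {(u, v)}
        degree = lambda x: sum(1 for a, b in candidate if x in (a, b))
        if degree(u) <= max_degree and degree(v) <= max_degree \
                and is_acyclic(n, candidate):
            edges = candidate
    return edges
```

Note that this naive walk is not a calibrated Metropolis-Hastings sampler: sampling uniformly over constrained DAGs requires carefully chosen acceptance probabilities, which is exactly why the slide calls the problem hard to solve.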

Page 14

Research tree (again)

Biorobotics (a bit of it)

Bayes nets
– Anytime, anyspace (embedded systems)
– Classification
– Applications: medical decisions
– MCMC algorithms: inference & testing

Sets of probabilities
– Algorithms: independence
– Applications: MDPs, robustness analysis, auctions

Page 15

Bayesian network classifiers

Goal is to use probabilistic models for classification – to “learn” classifiers using labeled and unlabeled data

Work with Ira Cohen, Alex Bronstein and Marsha Duro (UIUC and HP Labs)

Page 16

Using Bayesian networks to learn from labeled and unlabeled data

Suppose we want to classify events based on observations; we have recorded data that are sometimes labeled and sometimes unlabeled

What is the value of unlabeled data?

Page 17

The Naïve Bayes classifier

A Bayesian-network-like classifier with excellent credentials:

Use Bayes rule to get the classification:

p(Class | attributes) ∝ p(Class) · ∏_{i=1…N} p(attribute_i | Class)

[Diagram: the Class node points to each of Attribute 1, Attribute 2, …, Attribute N]
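A minimal count-based sketch of this rule (the tiny dataset and the add-one smoothing are my own illustration, not from the talk):

```python
from collections import Counter, defaultdict

def train_naive_bayes(records):
    # records: list of (class label, tuple of attribute values)
    class_counts = Counter(label for label, _ in records)
    attr_counts = defaultdict(Counter)   # (label, position) -> value counts
    for label, attrs in records:
        for i, value in enumerate(attrs):
            attr_counts[(label, i)][value] += 1
    return class_counts, attr_counts

def classify(model, attrs):
    class_counts, attr_counts = model
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, count in class_counts.items():
        score = count / total                       # p(Class)
        for i, value in enumerate(attrs):
            seen = attr_counts[(label, i)]
            # add-one smoothing so unseen values do not zero the product
            score *= (seen[value] + 1) / (count + len(seen) + 1)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

data = [
    ("American",  ("baseball", "hamburger")),
    ("Brazilian", ("soccer",   "rice and beans")),
    ("American",  ("golf",     "apple pie")),
]
model = train_naive_bayes(data)
```

Here `classify(model, ("soccer", "rice and beans"))` picks `"Brazilian"`, since both attribute values were only ever observed with that class.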

Page 18

The TAN classifier

[Diagram: the Class node points to every attribute X1, X2, X3, …, XN; in a TAN (tree-augmented naïve Bayes), the attributes are additionally connected by tree edges among themselves]

Page 19

Now, let’s consider unlabeled data

Our database:
– American | baseball | hamburger
– Brazilian | soccer | rice and beans
– American | golf | apple pie
– ? | saloon soccer | rice and beans
– ? | golf | rice and beans

Question: How to use the unlabeled data?
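One standard answer to this question (not necessarily the exact procedure in the talk) is expectation-maximization: fit the classifier on the labeled records, then alternate between filling in soft labels for the unlabeled records and re-estimating the parameters from the combined counts. A toy sketch for two classes and a single binary attribute, with names of my own choosing:

```python
def em_naive_bayes(labeled, unlabeled, iters=20):
    # labeled: list of (label in {0,1}, x in {0,1}); unlabeled: list of x
    # parameters: prior = p(C=1), theta[c] = p(x=1 | C=c)
    prior = sum(label for label, _ in labeled) / len(labeled)
    theta = [0.5, 0.5]
    for c in (0, 1):
        xs = [x for label, x in labeled if label == c]
        theta[c] = (sum(xs) + 1) / (len(xs) + 2)      # Laplace smoothing
    for _ in range(iters):
        # E-step: responsibility p(C=1 | x) for each unlabeled record
        resp = []
        for x in unlabeled:
            lik1 = prior * (theta[1] if x else 1 - theta[1])
            lik0 = (1 - prior) * (theta[0] if x else 1 - theta[0])
            resp.append(lik1 / (lik1 + lik0))
        # M-step: labeled counts plus fractional counts from the E-step
        n = len(labeled) + len(unlabeled)
        n1 = sum(label for label, _ in labeled) + sum(resp)
        s1 = sum(x for label, x in labeled if label == 1) \
            + sum(r * x for r, x in zip(resp, unlabeled))
        s0 = sum(x for label, x in labeled if label == 0) \
            + sum((1 - r) * x for r, x in zip(resp, unlabeled))
        prior = n1 / n
        theta = [(s0 + 1) / (n - n1 + 2), (s1 + 1) / (n1 + 2)]
    return prior, theta
```

The fractional counts are where the unlabeled records enter the estimate; the next slides show why this can either help or hurt.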

Page 20

Unlabeled data can help…

Learning a Naïve Bayes for data generated from a Naïve Bayes model (10 attributes):

[Plot: probability of error (y axis, about 0.06 to 0.11) versus number of unlabeled records (x axis, log scale, 10^0 to 10^4), with one curve each for 30, 300, and 3000 labeled records]

Page 21

… but unlabeled data may degrade performance!

Surprising fact: more data may not help; more data may hurt

Page 22

Some math: asymptotic analysis

Asymptotic bias:

Variance decreases with more data
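The formula on this slide did not survive extraction. In the standard misspecification analysis (my reconstruction of the usual argument, not necessarily the talk's exact notation), the maximum-likelihood estimate converges to the parameter value closest to the true distribution in Kullback-Leibler divergence:

```latex
\hat{\theta}_n \;\longrightarrow\; \theta^{*}
   \;=\; \arg\min_{\theta}\; D\!\left( p(c,\mathbf{x}) \,\middle\|\, p(c,\mathbf{x}\mid\theta) \right).
```

If the model family contains the true distribution, labeled-only and labeled-plus-unlabeled estimation share the same limit and the asymptotic bias vanishes. If the model is misspecified, the two limits generally differ, so unlabeled data can move the limit toward a point with larger classification error, even though the variance of the estimate still decreases with more data.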

Page 23

A very simple example

Consider the following situation:

[Diagram: two structures over Class, X, and Y: the "Real" model, in which X and Y are dependent given Class, and the "Assumed" model, a naïve Bayes in which X and Y are independent given Class]

X and Y are Gaussian given Class

Page 24

Effect of unlabeled data – a different perspective

[Plot: classification error (y axis, about 0.06 to 0.2) versus number of records (x axis, log scale, 10^1 to 10^5); curves for 0%, 50%, and 99% unlabeled records, each shown for training on the complete data and, for comparison, on the labeled records only]

Page 25

Searching for structures

Previous tests suggest that we should pay attention to modeling assumptions when dealing with unlabeled data

In the context of Bayesian network classifiers, we must look for structures

This is not easy; worse, existing algorithms do not focus on classification

Page 26

Stochastic Structure Search (SSS)

Idea: search for structures using classification error

Hard: search space is too messy

Solution: Metropolis-Hastings sampling with the underlying measure proportional to 1/p_error (the inverse of the classification error)
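The slide gives only the idea; a generic sketch of such a search, where `error_fn` (the estimated classification error of a structure) and `propose_fn` (a random local modification of a structure, assumed symmetric) are hypothetical stand-ins supplied by the caller:

```python
import random

def sss(initial, error_fn, propose_fn, steps=200, seed=0):
    """Metropolis-Hastings search over classifier structures.

    Target measure proportional to 1/error, so low-error structures
    are visited more often; the best structure seen is tracked.
    """
    rng = random.Random(seed)
    current, best = initial, initial
    err = error_fn(initial)
    best_err = err
    for _ in range(steps):
        cand = propose_fn(current, rng)
        cand_err = error_fn(cand)
        # acceptance ratio for a target proportional to 1/error
        # (symmetric proposal assumed)
        if cand_err <= 0 or rng.random() < min(1.0, err / cand_err):
            current, err = cand, cand_err
            if err < best_err:
                best, best_err = current, err
    return best, best_err
```

A toy run where "structures" are integers and the error is minimized at 7 drifts toward that point, because moves that lower the error are always accepted while moves that raise it are accepted only with probability err/cand_err.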

Page 27

Some classification results

Page 28

Some words on unlabeled data

Unlabeled data can improve performance, but it can also degrade performance – a really hard problem!

Current understanding of this problem is shaky – people think outliers or mismatches between labeled and unlabeled data cause the problem

Page 29

Research tree (once again)

Biorobotics (a bit of it)

Bayes nets
– Anytime, anyspace (embedded systems)
– Classification
– Applications: medical decisions
– MCMC algorithms: inference & testing

Sets of probabilities
– Algorithms: independence
– Applications: MDPs, robustness analysis, auctions

Page 30

Sets of probabilities

Instead of "the probability of rain is 0.2", say "the probability of rain is [0.1, 0.3]"

Instead of "the expected value of the stock is 10", admit "the expected value of the stock is [0, 1000]"

Page 31

An example

Consider probability distributions over three values, with probabilities p(1), p(2), p(3)

[Diagram: a set of probabilities shown as a region of the simplex of all distributions (p(1), p(2), p(3))]

Page 32

Why?

More realistic and quite expressive as representation language

Excellent tool for:
– robustness/sensitivity analysis
– modeling incomplete beliefs (probabilistic logic)
– group decision-making
– analysis of economic interactions (for example, to study arbitrage and design auctions)

Page 33

What we have been doing

Trying to formalize and apply “interval” reasoning, particularly independence

Building algorithms for manipulation of these intervals and sets
– To deal with independence and networks
– JavaBayes is the only available software that can deal with this (to some extent!)

Page 34

Credal networks

Using graphical models to represent sets of joint probabilities

Question: what do the networks represent?

Several open questions and need for algorithms

[Diagram: example network with nodes "Family In?", "Dog Sick?", "Lights On?", "Dog Out?", "Dog Barking?"]

Page 35

Concluding

To summarize, we want to understand how to use probabilities in AI, and then we add a bit of robotics

Support from FAPESP and HP Labs has been generous

Visit the lab on your next trip to São Paulo