the bycatch of bayes nets kerrie mengersen qut australia

Post on 26-Dec-2015


The bycatch of Bayes Nets

Kerrie Mengersen QUT

Australia

 

Australian Research Council Centre of Excellence

Mathematical & Statistical Frontiers:

Big Data, Big Models, New Insights

7 year horizon

6 Universities

7 Partner Organisations

18 CIs, 8 PIs, 23 AIs, 18 RAs, 40 PhDs

Bayesian Research and Applications Group (BRAG)

Our vision: To engage in world-class, relevant fundamental and collaborative statistical research, training and application through Bayesian (and other) modelling + fast computation + translation

Bayesian stats + food security

• Process modelling for plant biosecurity
• Conservation
• Surveillance design
• “Intelli-sensing”, e.g. satellite data and UAVs


Spiralling Whitefly (Aleurodicus dispersus)

Countries where spiralling whitefly has been detected. Administrative regions within some countries are shown when documented. Source (CABI 2004, Monteiro et al. 2005, CABI 2006). Personal communications (J.H. Martin, 2008, B.M. Waterhouse, 2008)

The Problem
• Major tropical plant pest
• Lives on 100+ hosts
• Restricts market access to other states

Information
• Literature: characteristics, growth, spread
• Detectability (inspectors)
• Surveillance data (> 30 000 records)

Scope of modelling
• Local, district and statewide

• Data Model: Pr(data | incursion process and data parameters)
– How data is observed given underlying pest extent

• Process Model: Pr(incursion process | process parameters)
– Potential extent given epidemiology / ecology

• Parameter Model: Pr(data and process parameters)
– Prior distribution to describe uncertainty in detectability, exposure, growth …

• The posterior distribution of the incursion process (and parameters) is related to the prior distribution and data by:

Pr(process, parameters | data) ∝ Pr(data | process, parameters) × Pr(process | parameters) × Pr(parameters)

Hierarchical Bayesian model
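A minimal sketch of how the three model levels multiply together, assuming a deliberately tiny toy problem rather than the whitefly model itself. The "process" is whether the pest is present (hypothetical prior probability `psi`), the "data" are detections in repeated inspections, and the "parameter" is inspector detectability with a hypothetical three-point prior. The function name, grid and all numbers are illustrative assumptions.

```python
from math import comb


def joint_posterior(n_inspections, n_detections, psi, p_grid, p_prior):
    """Posterior over (pest present?, detectability) on a small grid."""
    joint = {}
    for present in (True, False):
        for p, prior_weight in zip(p_grid, p_prior):
            # Process model: Pr(process | parameters)
            pr_process = psi if present else 1.0 - psi
            # Data model: Pr(data | process, parameters)
            if present:
                pr_data = (comb(n_inspections, n_detections)
                           * p ** n_detections
                           * (1.0 - p) ** (n_inspections - n_detections))
            else:
                # Assumption: no false positives when the pest is absent.
                pr_data = 1.0 if n_detections == 0 else 0.0
            # Parameter model: Pr(parameters), multiplied in last
            joint[(present, p)] = pr_data * pr_process * prior_weight
    total = sum(joint.values())  # normalising constant
    return {key: value / total for key, value in joint.items()}


# Hypothetical inputs: 5 inspections, no detections.
post = joint_posterior(5, 0, psi=0.2,
                       p_grid=[0.1, 0.3, 0.5], p_prior=[0.25, 0.5, 0.25])
pr_present = sum(v for (present, _), v in post.items() if present)
print(round(pr_present, 4))
```

With these made-up inputs, five clean inspections lower the probability of presence from the prior 0.2 to roughly 0.056, which is the kind of "area freedom" statement the posterior supports.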

Early Warning Surveillance

Priors

Surveillance data

Posterior learning
• Modest reduction in area freedom
• Large reduction in estimated extent
• Residual “risk” maps to target surveillance

Invasion Parameter Estimates

Useful for local management

Observation parameter estimates

Also learn about:

• Host suitability

• Inspector efficiency

Conservation and food security

Modelling complex systems

[Figure: components of a complex system: Economic, Human impact, Govt, Biology, Unknowns, External factors, Social, Environment]

“There's so much talk about the system. And so little understanding.”

Robert Pirsig

Zen and the Art of Motorcycle Maintenance

“Move away from indicators reported separately towards methods based on understanding complexity and emergence.”

Tony Morton

Systems Models

Bayesian Networks

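[Figure: example Bayesian network with nodes E, F and G]

Conditional probability table for G given E and F:

E    F       G = normal  G = high
yes  low     0.4         0.6
yes  medium  0.2         0.8
yes  high    0.1         0.9
no   low     0.5         0.5
no   medium  0.6         0.4
no   high    0.4         0.6

Probability table for F:

F       Probability
low     0.7
medium  0.2
high    0.1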
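A minimal sketch of how the probabilities listed above combine, assuming F has no parents and is independent of E (the slide gives no table for E, so the calculation conditions on E instead). The dictionary and function names are illustrative.

```python
# P(F), copied from the slide
p_f = {"low": 0.7, "medium": 0.2, "high": 0.1}

# P(G = high | E, F), copied from the slide
p_g_high = {
    ("yes", "low"): 0.6, ("yes", "medium"): 0.8, ("yes", "high"): 0.9,
    ("no", "low"): 0.5, ("no", "medium"): 0.4, ("no", "high"): 0.6,
}


def prob_g_high_given_e(e):
    """Predictive query: sum F out of P(G = high | E = e, F) P(F)."""
    return sum(p_f[f] * p_g_high[(e, f)] for f in p_f)


def posterior_f_given_e_and_g_high(e):
    """Diagnostic query: Bayes' rule for P(F | E = e, G = high)."""
    evidence = prob_g_high_given_e(e)
    return {f: p_f[f] * p_g_high[(e, f)] / evidence for f in p_f}


print(prob_g_high_given_e("yes"))   # about 0.67
print(prob_g_high_given_e("no"))    # about 0.49
print(posterior_f_given_e_and_g_high("yes"))
```

The first query is a simple "what if E = yes?" scenario; the second runs the network backwards, updating belief about F after observing G.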

• Be able to model the system
– Include many diverse factors and their interactions
– Bring together disparate knowledge, including data, model outputs, expert information, etc.
– Include costs, benefits, utility

• Use the model to:
– Identify key drivers
– Explore scenarios of change (“what if…?”)
– Identify critical control points
– Suggest optimal strategies for improved outcomes
– Understand impact of management and policy decisions

Why BNs?

Systems models (BNs) related to food security

• Conservation
• Water quality
• Recycled water and health
• Dairy sustainability
• Plant biosecurity risk

Indicator    Category                               Farm  Factory  Market  Rating
Economic     Commodity prices                       0.8   0.6      0.6     0.7
             Legal and administrative environment   0.0   0.1      0.8     0.3
             Access to capital and labour           0.6   0.8      0.8     0.7
             Profitability                          0.3   0.8      0.7     0.6
             Workforce capability                   0.1   0.8      0.6     0.5
             Economic sustainability rating         0.4   0.6      0.7     0.6
Social       Lifestyle and community                0.0   0.5      0.1     0.3
             Health and well being                  0.6   0.9      0.8     0.7
             Value and contribution*                0.1   0.6      0.6
             Product, safety and production         0.8   0.0      0.0     0.1
             Social relevance                       0.6   0.8      0.8     0.7
             Social sustainability rating           0.4   0.6      0.3     0.5
Environment  Energy, effluent and water             0.6   0.2      0.2     0.3
             Materials, suppliers and transport*    0.2   0.8      0.6
             Products and services                  0.8   0.8      0.2     0.6
             Biodiversity                           0.2   0.8      0.6     0.6
             Compliance                             0.2   0.0      0.0     0.1
             Environment sustainability rating      0.4   0.5      0.2     0.4
Dairy Industry sustainability rating                0.4   0.6      0.4     0.5

* Only three values appear for this row; they are shown left-aligned and their column assignment is uncertain.

Study 1: viability of wild cheetah population in Namibia

Human Factors Subnetwork

Biological Factors Subnetwork

Ecological Factors Subnetwork

Combined “Object Oriented” BN (OOBN)

Study 2: Sustainability scorecard

Measuring the complex interactions of sustainability

Collaboration with Dairy Australia

Aim: to develop a sustainability scorecard to measure Triple Bottom Line (TBL – economic, social and environmental) performance of agricultural systems.

– Key Dairy Stakeholder Review

– 2009 Dairy Sustainability Project

– 2011 Materiality Survey (NetBalance)

– 2007/08 Australian Dairy Manufacturing Industry Sustainability Report (DMSC)

– Stakeholder TBL reports

Vital Capital Survey, SAFE framework, DairySAT, Fonterra Sustainability Indicators, Unilever Sustainable Code, Nestle, Lactalis / Parmalat / Pauls, Danone Sustainability Report, Dutch Dairy Farming, RISE, GRI

Sustainability Measurement Review

Dairy Scorecard – Conceptual BN

Social Farm

Economic Farm

Environmental Farm

Measurement of indicator

Initial Sustainability at the Farm

Using the quantified BN submodels & putting them together gives the initial predictive scores for sustainability at the farm level

• Now able to ask questions of the model, e.g.

1. If we improve social sustainability, how will it affect overall sustainability at the farm level?

What if …..?

High: 20% → 39%, Medium: 48% → 33%, Low: 32% → 28%

What if …?

2. If we improve sustainability at the farm level, what is the effect on the TBL?

H, M, L: 25%, 39%, 26% → 70%, 18%, 12%

H, M, L: 25%, 62%, 13% → 48%, 48%, 4%

H, M, L: 5%, 51%, 43% → 13%, 60%, 27%

Economic

Social

Environmental

Sustainability scorecard (the scorecard table shown earlier under “Systems models (BNs) related to food security”)

Study 3: Water quality

Initiation of lyngbya in Moreton Bay

The policy questions

What is the overall scientific consensus about the drivers of lyngbya?

What management actions should be taken to reduce lyngbya in Moreton Bay, Australia?

INITIATION MODEL

[Figure: BN for bloom initiation. Nodes and state probabilities (%) as displayed, with the “±” summary shown on some nodes:]

• Temperature: Low 49.5, High 50.5 (19.6 ± 9)
• Light Quantity: Optimal 20.0, SubOptimal 80.0
• Light Quality: Poor 10.0, Borderline 40.0, High 50.0
• Wind direction: North 21.0, SE 24.0, Other 55.0
• Wind Speed: Low 59.9, High 40.1
• Ground Water Amount: Low 73.1, High 26.9
• Rain - Present: Low 62.0, Medium 26.0, High 12.0 (142 ± 190)
• Dissolved Fe Concentration: Low 56.7, High 43.3
• Dissolved P Concentration: Low 62.1, High 37.9 (199 ± 300)
• Dissolved N Concentration: Low 49.6, High 50.4
• Dissolved Organics: Low 51.0, High 49.0
• Sediment Nutrient Climate: NonReducing 58.4, Reducing 41.6
• Avail nutrient pool (dissolved): Enough 33.6, Not enough 66.4
• Land Run-off Load: Low 51.6, High 48.4
• Tide: Spring 50.0, Neap 50.0
• Bottom Current Climate: Low 48.0, High 52.0
• Turbidity: Low 45.4, High 54.6
• Light Climate: Inadequate 71.3, Adequate 28.7 (20.7 ± 12)
• Point Sources: Low 26.3, Medium 30.1, High 43.7
• No. of previous dry days: Low 10.0, Medium 50.0, High 40.0 (75.6 ± 110)
• Air: Low 57.4, High 42.6
• Particulates (Nutr): Low 45.1, High 54.9 (2.8 ± 3.3)
• Bloom Initiation: No 76.4, Yes 23.6

Most influential factors

1. Available Nutrient Pool

2. Bottom Current Climate

3. Sediment Nutrients

4. Dissolved Iron

5. Dissolved Phosphorus

6. Light

7. Temperature

MANAGEMENT ACTIONS

“What-if” scenarios

Factor                      Change in P(Bloom) (%)
Available Nutrient Pool     77 (3% - 80%)
Bottom Current Climate      28 (15% - 43%)
Sediment Nutrient Climate   17 (21% - 38%)
Dissolved Fe                16 (21% - 37%)
Dissolved P                 15 (23% - 38%)
Light Climate               14 (18% - 32%)
Temperature                 14 (21% - 35%)
Dissolved N                 13 (22% - 35%)
Rain – present              10 (25% - 35%)
Light Quantity               9 (21% - 30%)
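A short sketch of the arithmetic behind the “what-if” table: each change is the gap between the lowest and highest P(Bloom) reached when one factor is set to each of its states in turn. The ranges are copied from the slide; the variable and function names are illustrative, and the BN itself is not reproduced here.

```python
# (lowest, highest) P(Bloom) in percent for each factor, copied from the slide
bloom_range = {
    "Available Nutrient Pool": (3, 80),
    "Bottom Current Climate": (15, 43),
    "Sediment Nutrient Climate": (21, 38),
    "Dissolved Fe": (21, 37),
    "Dissolved P": (23, 38),
    "Light Climate": (18, 32),
    "Temperature": (21, 35),
    "Dissolved N": (22, 35),
    "Rain - present": (25, 35),
    "Light Quantity": (21, 30),
}


def rank_by_swing(ranges):
    """Return (factor, swing) pairs, largest swing in P(Bloom) first."""
    swings = {name: high - low for name, (low, high) in ranges.items()}
    return sorted(swings.items(), key=lambda item: item[1], reverse=True)


ranking = rank_by_swing(bloom_range)
for factor, swing in ranking:
    print(f"{factor:28s} {swing:3d}")
```

Ranking by swing is one simple way to order factors by influence; it reproduces the order of the table, with the available nutrient pool far ahead of the rest.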

From Science to Management

Study 4: Recycled Water and Health Handbook

Study 5: “Beyond Compliance”

An integrated approach to pest risk management

STDF – WTO funded project

5 SEA partners + OC: + QUT

Mumford et al.


• Production Chain

• Decision Support

• Control Point BN (CP-BN)

1: Production chain

Exporting Malaysian jackfruit to China

Decision support spreadsheet

Key factors, each with a score and an uncertainty rating:

• A2.01 Overall rating - Entry: Unlikely (uncertainty: Low)
• A2.02 Overall rating - Establishment: Moderately unlikely (uncertainty: Low)
• A2.03 Overall rating - Spread: Moderate (uncertainty: Low)
• A2.04 Overall rating - Impact: Minor (uncertainty: Low)
• A2.05 How easy is it to detect the key organisms on the commodity / pathway? Easy (uncertainty: Medium)
• A2.06 How easy is it to identify the key organisms? With some difficulty (uncertainty: Medium)
• A2.07 How well organised is the sector at risk in the importing country? Mod. well organised (uncertainty: Medium)
• A2.08 What is the estimated prevalence of the pest in the area where commodity is cultivated? High (uncertainty: Low)

Decision support spreadsheet

Risk management measures available (automatically read in from Table B2), each rated on:

Efficacy: 1.1 a) What is its potential contribution to risk reduction? 1.1 b) Uncertainty (with graphic)
Verification: 1.2 a) The measure can be verified? 1.2 b) Uncertainty (with graphic)

Ratings are listed as 1.1 a) / 1.1 b) / 1.2 a) / 1.2 b):

• Sterile insect technique (SIT): Very high / Low / Very easy / Very low
• Pesticides spray program: High / Medium / Easy / Low
• Male annihilation, utilizing the attraction of males to methyl eugenol baits: High / Low / With some difficulty / Low
• Culling of over-crowded and disease infested fruits: High / Low / Easy / Low
• Bagging of fruits 14 days after fruit set: Very high / Low / Easy / Low
• (measure name not shown): High / Low
• (measure name not shown): High / Low

[Figure: bar charts on a 0 to 1 scale, one pair per risk management measure; efficacy panels use categories VH, H, M, L, VL and verification panels use categories VE, E, SD, D, VD]

CP-BN

Economics add-on

• The final target node gives the probability of infestation at the point of export. This must be sufficiently low to comply with the requirements of the dragon fruit importer concerned.

• We also need to include the equally important issues of loss to fruit production due to this infestation, and costs of control or preventive measures

• That is, what is the net value of the crop?
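A minimal sketch of the net value calculation that utility nodes add to the BN, assuming the simplest possible form: expected net value = gross crop value minus expected loss from infestation minus the cost of control. Every number, scenario name and the acceptance threshold below is hypothetical; in the CP-BN the infestation probability would come from the network's final target node.

```python
def expected_net_value(p_infested, gross_value, loss_fraction, control_cost):
    """Expected net crop value under one control scenario.

    p_infested    probability of infestation at export (the BN target node)
    gross_value   crop value with no infestation and no control costs
    loss_fraction share of value lost if infested (hypothetical)
    control_cost  cost of the control measures applied (hypothetical)
    """
    expected_loss = p_infested * loss_fraction * gross_value
    return gross_value - expected_loss - control_cost


# Hypothetical scenarios: (probability of infestation, cost of control)
scenarios = {
    "no control": (0.40, 0.0),
    "fruit bagging": (0.05, 800.0),
}
MAX_ACCEPTABLE_P = 0.10  # hypothetical importer requirement

results = {}
for name, (p_infested, cost) in scenarios.items():
    value = expected_net_value(p_infested, gross_value=10_000.0,
                               loss_fraction=0.5, control_cost=cost)
    results[name] = (value, p_infested <= MAX_ACCEPTABLE_P)
    print(name, round(value, 2), "compliant:", results[name][1])
```

In this made-up comparison the control measure pays for itself: it costs 800 but cuts the expected loss by more than that, and it also brings the infestation probability under the assumed importer threshold.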

Economics: adding costs via utility nodes

Economics: adding losses via utility nodes

J. Holt, A. W. Leach, S. Johnson, D. M. Tu, D. T. Nhu, N. T. Anh, L. N. Quang, M. M. Quinlan, P. J. L. Whittle, K. Mengersen and J. D. Mumford (in prep.) Bayesian networks to compare pest control interventions on commodities along agricultural production chains.

Methods Questions


1. How to elicit information from experts?

2. How to combine information from multiple experts?

3. How to assess the validity and reliability of a BN?

4. How to incorporate uncertainty into BNs?

5. How to combine BNs?

1. Eliciting expert information

• Train experts prior to elicitation
• Elicit using “outside-in” method
– Extrema: absolute lower and upper limits
– Quantiles: realistic limits (L, U) + uncertainty/sureness around these bounds
– Mode: most plausible value
• Record as count, percentage or multiplicative factor
• Encode via least squares as normal, lognormal, extended beta etc.
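A minimal sketch of the encoding step for the normal case only, assuming the expert's realistic limits (L, U) are read as a central interval with the stated sureness and that the mode is the mean. The function name, the equal weighting of the three elicited values and the example numbers are all assumptions; the lognormal and extended beta cases would need a numerical optimiser instead of this closed-form solve.

```python
from statistics import NormalDist


def encode_normal(lower, upper, mode, sureness=0.90):
    """Least-squares fit of Normal(mu, sigma) to an elicited interval and mode.

    The model says: lower = mu - z*sigma, upper = mu + z*sigma, mode = mu,
    where z is the standard normal quantile for the stated sureness.
    """
    z = NormalDist().inv_cdf(0.5 + sureness / 2.0)
    # Each row: (coefficient of mu, coefficient of sigma, elicited value)
    rows = [(1.0, -z, lower), (1.0, z, upper), (1.0, 0.0, mode)]
    # Normal equations for a two-parameter linear least-squares problem
    s11 = sum(a * a for a, _, _ in rows)
    s12 = sum(a * b for a, b, _ in rows)
    s22 = sum(b * b for _, b, _ in rows)
    t1 = sum(a * y for a, _, y in rows)
    t2 = sum(b * y for _, b, y in rows)
    det = s11 * s22 - s12 * s12
    mu = (t1 * s22 - t2 * s12) / det
    sigma = (s11 * t2 - s12 * t1) / det
    return mu, sigma


# Hypothetical elicitation: "between 10 and 30, most likely 20, 90% sure"
mu, sigma = encode_normal(lower=10.0, upper=30.0, mode=20.0, sureness=0.90)
print(round(mu, 3), round(sigma, 3))
```

When the elicited mode is off-centre, the three values cannot all be matched by a normal, and the least-squares fit compromises between them; a large residual is a hint to try a skewed family such as the lognormal.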

2. Combining expert judgements

– Delphi method
– Pooling
– Modelling

Pooling

1. Average expert opinions for each node and propagate the averages through the network

2. Average after transforming probability to log odds

3. Propagate the opinions through the network for each expert and average the outputs for each expert

Average = linear or geometric, weighted or unweighted

Add a random effect for between-expert deviations
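A small sketch contrasting the three pooling options, assuming a toy two-node network A → B and two hypothetical experts. All numbers and function names are illustrative; equal weights are used unless weights are supplied.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear_pool(probs, weights=None):
    """Option 1: (weighted) arithmetic average of probabilities."""
    weights = weights or [1.0] * len(probs)
    return sum(w * p for w, p in zip(weights, probs)) / sum(weights)


def log_odds_pool(probs, weights=None):
    """Option 2: average on the log-odds scale, then transform back."""
    weights = weights or [1.0] * len(probs)
    mean = sum(w * logit(p) for w, p in zip(weights, probs)) / sum(weights)
    return inv_logit(mean)


def prob_b(p_a, p_b_given_a, p_b_given_not_a):
    """Propagate through the toy network A -> B."""
    return p_a * p_b_given_a + (1.0 - p_a) * p_b_given_not_a


# Hypothetical expert inputs: (P(A), P(B | A), P(B | not A))
experts = [(0.2, 0.9, 0.1), (0.8, 0.5, 0.3)]

# Option 1 applied node by node, then propagated
pooled_inputs = prob_b(*(linear_pool(col) for col in zip(*experts)))
# Option 3: propagate per expert, then average the outputs
pooled_outputs = linear_pool([prob_b(*e) for e in experts])

print(linear_pool([0.2, 0.4, 0.6]), log_odds_pool([0.2, 0.4, 0.6]))
print(pooled_inputs, pooled_outputs)
```

Averaging inputs and averaging outputs give different answers here (about 0.45 versus 0.36) because propagation is not linear in the inputs, which is why the choice of pooling point matters.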

Modelling

• Random effects model
• Measurement error model
• Item response model

• Can obtain estimates of combined probabilities, node differences, expert differences

[Equation labels: probability in node l; overall; node effect; expert effect]
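One possible way to write the random effects model, assuming a logit link and normally distributed expert effects. The slide names only the terms (probability in node l, overall, node effect, expert effect); the link function, the symbols and the distributional assumption below are illustrative choices.

```latex
\[
  \operatorname{logit}(p_{le}) = \mu + \alpha_l + \beta_e ,
  \qquad \beta_e \sim \mathrm{N}(0, \sigma_\beta^2)
\]
% p_{le}   : probability elicited for node l from expert e
% \mu      : overall level
% \alpha_l : node effect
% \beta_e  : expert effect (between-expert deviation)
```

Under this form, the combined probability for node l is obtained by back-transforming \(\mu + \alpha_l\), and the estimated \(\beta_e\) show how far each expert sits from the group.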

3. Validity and reliability of a BN


Psychometric approach
• Nomological: sits well within current academic thought
• Face: valid representation of the underlying system
• Content: includes all potentially relevant factors
• Concurrent: related measures in time/space vary similarly
• Convergent: theoretically related measures match
• Discriminant: theoretically unrelated measures are different

Pitchforth, 2013

4. Incorporating uncertainty

• Add prior distributions to nodes
• Propagate populations through the BN

(Donald et al. ANZJS 2015)


Prob. gastroenteritis (95% CI) = 0.030 (0.026, 0.034)
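A minimal Monte Carlo sketch of the idea of putting priors on node probabilities and propagating the draws to get an interval on the output. It assumes a toy two-node network (exposure → illness) with hypothetical Beta priors; it is not the model or the method of Donald et al., and the numbers are not theirs.

```python
import random


def propagate_uncertainty(n_draws=5000, seed=1):
    """Sample node probabilities from Beta priors and propagate each draw."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_draws):
        # Hypothetical priors on the network's probabilities
        p_exposed = rng.betavariate(20, 80)
        p_ill_if_exposed = rng.betavariate(10, 90)
        p_ill_if_unexposed = rng.betavariate(1, 99)
        # Propagate this draw through the toy network
        p_ill = (p_exposed * p_ill_if_exposed
                 + (1.0 - p_exposed) * p_ill_if_unexposed)
        outputs.append(p_ill)
    outputs.sort()
    mean = sum(outputs) / n_draws
    lower = outputs[int(0.025 * n_draws)]       # 2.5th percentile
    upper = outputs[int(0.975 * n_draws) - 1]   # 97.5th percentile
    return mean, lower, upper


mean, lower, upper = propagate_uncertainty()
print(f"Prob. illness (95% interval) = {mean:.3f} ({lower:.3f}, {upper:.3f})")
```

The output has the same shape as the result on the slide: a point estimate with an interval, where the interval reflects uncertainty in the network's probabilities rather than sampling variation in new data.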

5. Combining BNs

Many perspectives = many potential models

How to combine outputs?

Model averaging approach
– Obtain an estimate of goodness of fit for each BN
– Generate probabilities or ‘data’ from each BN
– Obtain a weighted average of the desired measures
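A minimal sketch of the model averaging steps, assuming goodness of fit is summarised by a log-likelihood-type score for each BN and that weights are proportional to the exponentiated score. The slides do not say which fit measure or weighting is used, so both are assumptions, as are the numbers.

```python
import math


def model_average(estimates, log_fits):
    """Weighted average of one output measure across several BNs.

    estimates  the desired measure from each BN (e.g. a probability)
    log_fits   a log-scale goodness-of-fit score for each BN (assumed)
    """
    best = max(log_fits)
    # Subtract the best score before exponentiating for numerical stability
    raw = [math.exp(score - best) for score in log_fits]
    total = sum(raw)
    weights = [r / total for r in raw]
    average = sum(w * e for w, e in zip(weights, estimates))
    return average, weights


# Hypothetical outputs from three candidate BNs
average, weights = model_average(estimates=[0.20, 0.30, 0.25],
                                 log_fits=[-10.0, -11.0, -12.0])
print(round(average, 3), [round(w, 3) for w in weights])
```

This combines outputs only; it leaves the harder question of how to combine the network structures themselves untouched.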

How to combine structures?

TBC…

Conclusion: Why BNs? Because sometimes the solutions are not where we are looking
