Neural Networks and Logistic Regression Lucila Ohno-Machado Decision Systems Group Brigham and Women’s Hospital Department of Radiology

Post on 18-Dec-2015


Page 1

Neural Networks and Logistic Regression

Lucila Ohno-Machado

Decision Systems Group

Brigham and Women’s Hospital

Department of Radiology

Page 2

[Figure labels: STOP, Coronary Disease, Neural Net]

Page 3

Outline

• Examples, neuroscience analogy

• Perceptrons, MLPs: How they work

• How the networks learn from examples

• Backpropagation algorithm

• Learning parameters

• Overfitting

Page 4

Examples in Medical Pattern Recognition

Diagnosis

• Protein Structure Prediction

• Diagnosis of Giant Cell Arteritis

• Diagnosis of Myocardial Infarction

• Interpretation of ECGs

• Interpretation of PET scans, Chest X-rays

Prognosis

• Prognosis of Breast Cancer

• Outcomes After Spinal Cord Injury

Page 5

Myocardial Infarction Network

[Figure: network diagram]

Inputs: Male, Age, Smoker, ECG: ST Elevation, Pain Intensity, Pain Duration

Input values shown (order as extracted): 1, 1, 2, 1, 50, 4

Output: Myocardial Infarction = 0.8, the “probability” of MI

Page 6

Abdominal Pain Perceptron

Inputs: Male, Age, Temp, WBC, Pain Intensity, Pain Duration

Input values shown (order as extracted): 37, 10, 1, 1, 20, 1

Adjustable weights link the inputs to the outputs.

Outputs: Appendicitis, Diverticulitis, Perforated Duodenal Ulcer, Non-specific Pain, Cholecystitis, Small Bowel Obstruction, Pancreatitis

Output values shown: 0 1 0 0 0 0 0

Page 7

Biological Analogy

[Figure: a biological neuron (dendrites, axon, synapses) next to an artificial network, in which nodes stand for neurons and weights (positive or negative) stand for synapses]

Page 8

[Figure: two-bit input patterns (00, 01, 10, 11) enter at the input layer and leave the output layer as sorted patterns]

Page 9

Perceptrons

Input units: Cough, Headache

Output units: No disease, Pneumonia, Flu, Meningitis

Weights connect the input units to the output units.

error = what we got - what we wanted

Delta rule: change weights to decrease the error

Page 10

Perceptrons

[Figure: input units i connected to output units j]

Input to unit i: $a_i$ = measured value of variable i

Input to unit j: $a_j = \sum_i w_{ij} a_i$

Output of unit j: $o_j = \dfrac{1}{1 + e^{-(a_j + \theta_j)}}$

Page 11

AND

input  output
00     0
01     0
10     0
11     1

[Figure: inputs x1 and x2 connected to output y by weights w1 and w2]

f(x1w1 + x2w2) = y

f(0w1 + 0w2) = 0
f(0w1 + 1w2) = 0
f(1w1 + 0w2) = 0
f(1w1 + 1w2) = 1

$\theta = 0.5$

$f(a) = \begin{cases} 1, & \text{for } a > \theta \\ 0, & \text{for } a \le \theta \end{cases}$

some possible values for w1 and w2

w1    w2
0.20  0.35
0.20  0.40
0.25  0.30
0.40  0.20
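The weight pairs on this slide can be checked mechanically. The sketch below is a minimal Python check, assuming the step function f and the threshold of 0.5 given on the slide; the helper names (`step`, `unit`, `computes_and`) are illustrative and do not come from the original deck.

```python
THETA = 0.5  # threshold from the slide

def step(a, theta=THETA):
    """Threshold activation: 1 if a > theta, otherwise 0."""
    return 1 if a > theta else 0

def unit(x1, x2, w1, w2):
    """Single output unit: f(x1*w1 + x2*w2)."""
    return step(x1 * w1 + x2 * w2)

AND_TABLE = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}

# (w1, w2) pairs read column by column from the slide.
WEIGHT_PAIRS = [(0.20, 0.35), (0.20, 0.40), (0.25, 0.30), (0.40, 0.20)]

def computes_and(w1, w2):
    """True if the unit reproduces the whole AND truth table."""
    return all(unit(x1, x2, w1, w2) == y for (x1, x2), y in AND_TABLE.items())

results = [computes_and(w1, w2) for w1, w2 in WEIGHT_PAIRS]
print(results)  # [True, True, True, True]
```

Each pair works because every single weight stays at or below the threshold while the sum of the two exceeds it.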

Page 12

XOR

input  output
00     0
01     1
10     1
11     0

[Figure: inputs x1 and x2 connected to output y by weights w1 and w2]

f(x1w1 + x2w2) = y

f(0w1 + 0w2) = 0
f(0w1 + 1w2) = 1
f(1w1 + 0w2) = 1
f(1w1 + 1w2) = 0

$\theta = 0.5$

$f(a) = \begin{cases} 1, & \text{for } a > \theta \\ 0, & \text{for } a \le \theta \end{cases}$

some possible values for w1 and w2

w1    w2
(none: the table requires w1 > 0.5 and w2 > 0.5, yet w1 + w2 ≤ 0.5)
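The empty weight table for XOR can be illustrated with a brute-force scan. This is only a sketch: the grid of candidate weights (from -2 to 2 in steps of 0.05) is an arbitrary choice made here, and a scan is an illustration rather than a proof. The proof is that the table needs w1 > 0.5 and w2 > 0.5 together with w1 + w2 ≤ 0.5.

```python
THETA = 0.5  # threshold from the slide

def step(a, theta=THETA):
    """Threshold activation: 1 if a > theta, otherwise 0."""
    return 1 if a > theta else 0

XOR_TABLE = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

def computes_xor(w1, w2):
    """True if a single unit f(x1*w1 + x2*w2) reproduces the XOR table."""
    return all(step(x1 * w1 + x2 * w2) == y for (x1, x2), y in XOR_TABLE.items())

# Candidate weights from -2.00 to 2.00 in steps of 0.05 (illustrative grid).
grid = [i / 20 for i in range(-40, 41)]
solutions = [(w1, w2) for w1 in grid for w2 in grid if computes_xor(w1, w2)]
print(len(solutions))  # 0
```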

Page 13

XOR

input  output
00     0
01     1
10     1
11     0

[Figure: inputs x1 and x2, a hidden unit z, and an output unit y, connected by weights w1 to w5]

$\theta = 0.5$ for y and for z

$f(a) = \begin{cases} 1, & \text{for } a > \theta \\ 0, & \text{for } a \le \theta \end{cases}$

f(w1, w2, w3, w4, w5)

a possible set of values for the ws

(w1, w2, w3, w4, w5) = (0.3, 0.3, 1, 1, -2)
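The weight set (0.3, 0.3, 1, 1, -2) can be verified with a few lines of Python. The wiring below is an assumption, since the figure did not survive extraction: w1 and w2 feed the hidden unit z, w3 and w4 connect x1 and x2 directly to y, and w5 connects z to y. Under the opposite assignment (w3, w4 into z) these values do not produce XOR.

```python
THETA = 0.5  # threshold used for both z and y

def step(a, theta=THETA):
    """Threshold activation: 1 if a > theta, otherwise 0."""
    return 1 if a > theta else 0

def xor_net(x1, x2, weights=(0.3, 0.3, 1, 1, -2)):
    """One hidden unit z plus direct input-to-output links (assumed wiring)."""
    w1, w2, w3, w4, w5 = weights
    z = step(x1 * w1 + x2 * w2)              # z fires only for input 11
    return step(x1 * w3 + x2 * w4 + z * w5)  # z then cancels the 11 case

outputs = [xor_net(x1, x2) for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(outputs)  # [0, 1, 1, 0]
```

The hidden unit behaves like an AND detector whose strongly negative weight switches the output off when both inputs are on.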

Page 14

XOR

input  output
00     0
01     1
10     1
11     0

[Figure: inputs connected to two hidden units by weights w1 to w4; hidden units connected to the output by weights w5 and w6]

$\theta = 0.5$ for all units

$f(a) = \begin{cases} 1, & \text{for } a > \theta \\ 0, & \text{for } a \le \theta \end{cases}$

f(w1, w2, w3, w4, w5, w6)

a possible set of values for the ws

(w1, w2, w3, w4, w5, w6) = (0.6, -0.6, -0.7, 0.8, 1, 1)
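The six-weight solution can be checked the same way. The wiring is again an assumption made for this sketch: w1 and w2 feed the first hidden unit from x1 and x2, w3 and w4 feed the second hidden unit from x1 and x2, and w5 and w6 connect the two hidden units to the output.

```python
THETA = 0.5  # threshold for all units, as stated on the slide

def step(a, theta=THETA):
    """Threshold activation: 1 if a > theta, otherwise 0."""
    return 1 if a > theta else 0

def xor_net(x1, x2, weights=(0.6, -0.6, -0.7, 0.8, 1, 1)):
    """Two hidden units, no direct input-to-output links (assumed wiring)."""
    w1, w2, w3, w4, w5, w6 = weights
    h1 = step(x1 * w1 + x2 * w2)  # fires only for input 10
    h2 = step(x1 * w3 + x2 * w4)  # fires only for input 01
    return step(h1 * w5 + h2 * w6)

outputs = [xor_net(x1, x2) for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(outputs)  # [0, 1, 1, 0]
```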

Page 15

Linear Separation

[Figure: the combinations of cough and headache (00, 10, 01, 11) plotted on two axes and labeled No disease, Pneumonia, Flu, and Meningitis, with a line separating No treatment from Treatment; a second panel shows three-variable patterns (000, 100, 010, 101, 111, 011, 110)]

Page 16

Linear Regression: $Y = a(X) + b$

Logistic Discriminant: $Y = \dfrac{1}{1 + e^{-(a(X) + b)}}$

Page 17

Abdominal Pain

Inputs: Male, Age, Temp, WBC, Pain Intensity, Pain Duration

Input values shown (order as extracted): 37, 10, 1, 1, 20, 1

Adjustable weights link the inputs to the outputs.

Outputs: Appendicitis, Diverticulitis, Perforated Duodenal Ulcer, Non-specific Pain, Cholecystitis, Small Bowel Obstruction, Pancreatitis

Output values shown: 0 1 0 0 0 0 0

Page 18

Multilayered Perceptrons

Perceptron (input units i, output units j)

Input to unit i: $a_i$ = measured value of variable i

Input to unit j: $a_j = \sum_i w_{ij} a_i$

Output of unit j: $o_j = \dfrac{1}{1 + e^{-(a_j + \theta_j)}}$

Multilayered perceptron (input units i, hidden units j, output units k)

Input to unit k: $a_k = \sum_j w_{jk} o_j$

Output of unit k: $o_k = \dfrac{1}{1 + e^{-(a_k + \theta_k)}}$
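The two layers of formulas translate directly into a forward pass. The following is a minimal sketch in plain Python; the function names and all numeric weights are hypothetical values chosen for illustration, not values from the slides.

```python
import math

def logistic(a, theta=0.0):
    """Unit output o = 1 / (1 + exp(-(a + theta)))."""
    return 1.0 / (1.0 + math.exp(-(a + theta)))

def layer_outputs(inputs, weights, thetas):
    """weights[i][j] connects input i to unit j; returns one output per unit."""
    outputs = []
    for j, theta in enumerate(thetas):
        a_j = sum(inputs[i] * weights[i][j] for i in range(len(inputs)))
        outputs.append(logistic(a_j, theta))
    return outputs

def mlp_forward(a_i, w_ij, theta_j, w_jk, theta_k):
    """Input units i -> hidden units j -> output units k."""
    o_j = layer_outputs(a_i, w_ij, theta_j)
    o_k = layer_outputs(o_j, w_jk, theta_k)
    return o_j, o_k

# Hypothetical example: 3 inputs, 2 hidden units, 1 output unit.
a_i = [1.0, 0.0, 1.0]
w_ij = [[0.5, -0.3], [0.8, 0.2], [-0.4, 0.1]]
theta_j = [0.0, 0.1]
w_jk = [[0.7], [-0.6]]
theta_k = [0.05]

o_j, o_k = mlp_forward(a_i, w_ij, theta_j, w_jk, theta_k)
print(o_j, o_k)
```

With a single layer (calling `layer_outputs` once) the same code is the perceptron of the first three formulas.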

Page 19

Regression vs. Neural Networks

Neural network: inputs X1, X2, X3; hidden units labeled “X1”, “X2”, “X1X3”, “X1X2X3”; output Y

Regression: terms X1, X2, X3, X1X2, X1X3, X2X3, X1X2X3; output Y

($2^3 - 1$) possible combinations

Y = a(X1) + b(X2) + c(X3) + d(X1X2) + ...
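The count of possible regression terms can be enumerated directly. This small sketch lists every non-empty combination of the three variables; the function name `interaction_terms` is made up for the example.

```python
from itertools import combinations

def interaction_terms(names):
    """All non-empty products of the given variables: main effects and interactions."""
    terms = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            terms.append("".join(combo))
    return terms

terms = interaction_terms(["X1", "X2", "X3"])
print(terms)       # ['X1', 'X2', 'X3', 'X1X2', 'X1X3', 'X2X3', 'X1X2X3']
print(len(terms))  # 7, which is 2**3 - 1
```

In a regression model each of these terms has to be specified by the analyst, whereas hidden units can come to represent such combinations during training.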

Page 20

Logistic Regression

• One independent variable

$f(x) = \dfrac{1}{1 + e^{-(ax + \text{cte})}}$

• Two

$f(x) = \dfrac{1}{1 + e^{-(ax_1 + bx_2 + \text{cte})}}$

Here cte is a constant.

[Figure: f(x) plotted against x, an S-shaped curve between 0 and 1]

Page 21

Logistic function

$p = \dfrac{1}{1 + e^{-(ax + \text{cte})}}$

$\log \dfrac{p}{1 - p} = ax + \text{cte}$ (linear)

[Figure: $\log \frac{p}{1-p}$ plotted against x, a straight line with slope a]

Page 22

Logistic function

$p = \dfrac{1}{1 + e^{-(ax + \text{cte})}}$

$\log \dfrac{p}{1 - p} = ax + \text{cte}$ (linear)

a is the change in the log odds for 1 unit of increase in x; $e^{a}$ is the corresponding odds ratio
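Because log(p/(1-p)) = ax + cte is linear in x, raising x by one unit adds a to the log odds and multiplies the odds by e^a. The sketch below checks this numerically; the values of `a` and `cte` are hypothetical.

```python
import math

def logistic(x, a, cte):
    """p = 1 / (1 + exp(-(a*x + cte)))."""
    return 1.0 / (1.0 + math.exp(-(a * x + cte)))

def odds(p):
    """Odds corresponding to probability p."""
    return p / (1.0 - p)

a, cte = 0.7, -2.0  # hypothetical coefficient and constant

# Log odds at x = 0, 1, 2: consecutive values differ by exactly a.
log_odds = [math.log(odds(logistic(x, a, cte))) for x in (0.0, 1.0, 2.0)]
print(log_odds)

# Odds ratio for a one-unit increase in x equals exp(a).
odds_ratio = odds(logistic(1.0, a, cte)) / odds(logistic(0.0, a, cte))
print(odds_ratio, math.exp(a))
```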

Page 23

Jargon Pseudo-Correspondence

• Independent variable = input variable

• Dependent variable = output variable

• Coefficients = “weights”

• Estimates = “targets”

• Cycles = epoch

Page 24

Logistic Regression Model

Inputs (independent variables x1, x2, x3): Age = 34, Gender = 1, Stage = 4

Coefficients (a, b, c): 5, 8, 4

Output (dependent variable p, the prediction): 0.6, the “probability of being alive”

Page 25

Σ is the sum of inputs * weights

Inputs (independent variables): Age = 34, Gender = 1, Stage = 4

Coefficients: 5, 8, 4

Output: prediction

Page 26

Logistic function

Inputs (independent variables): Age = 34, Gender = 1, Stage = 4

Coefficients: .5, .8, .4

Output (prediction): 0.6, the “probability of being alive”

$p = \dfrac{1}{1 + e^{-(\Sigma + \text{cte})}}$

Page 27

Activation Functions...

• Linear

• Threshold or step function

• Logistic, sigmoid, “squash”

• Hyperbolic tangent
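The four activation functions listed above can be written in a few lines. This is a minimal sketch using only the standard library; the default threshold of 0 in `threshold` is an assumption made for the example.

```python
import math

def linear(a):
    """Linear (identity) activation."""
    return a

def threshold(a, theta=0.0):
    """Step function: 1 above the threshold theta, otherwise 0."""
    return 1.0 if a > theta else 0.0

def logistic(a):
    """Logistic (sigmoid, "squash") function with range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))

def hyperbolic_tangent(a):
    """Hyperbolic tangent with range (-1, 1)."""
    return math.tanh(a)

for a in (-2.0, 0.0, 2.0):
    print(a, linear(a), threshold(a), logistic(a), hyperbolic_tangent(a))
```

The last two are closely related: tanh(a) = 2 * logistic(2a) - 1, so the hyperbolic tangent is a rescaled logistic curve centered at zero.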

Page 28

Neural Network Model

Inputs (independent variables): Age = 34, Gender = 2, Stage = 4

Weights and hidden-layer values shown: .6, .5, .8, .2, .1, .3, .7, .2, .4, .2

Output (dependent variable, the prediction): 0.6, the “probability of being alive”

Page 29

“Combined logistic models”

Inputs (independent variables): Age = 34, Gender = 2, Stage = 4

Weights and hidden-layer values shown: .6, .5, .8, .1, .7

Output (dependent variable, the prediction): 0.6, the “probability of being alive”

Page 30

Inputs (independent variables): Age = 34, Gender = 2, Stage = 4

Weights and hidden-layer values shown: .5, .8, .2, .3, .2

Output (dependent variable, the prediction): 0.6, the “probability of being alive”

Page 31

Inputs (independent variables): Age = 34, Gender = 1, Stage = 4

Weights and hidden-layer values shown: .6, .5, .8, .2, .1, .3, .7, .2

Output (dependent variable, the prediction): 0.6, the “probability of being alive”

Page 32

Not really, no target for hidden units...

Inputs (independent variables): Age = 34, Gender = 2, Stage = 4

Weights and hidden-layer values shown: .6, .5, .8, .2, .1, .3, .7, .2, .4, .2

Output (dependent variable, the prediction): 0.6, the “probability of being alive”

Page 33

Perceptrons

Input units: Cough, Headache

Output units: No disease, Pneumonia, Flu, Meningitis

Weights connect the input units to the output units.

error = what we got - what we wanted

Delta rule: change weights to decrease the error

Page 34

Hidden Units and Backpropagation

[Figure: input units, hidden units, and output units]

error = what we got - what we wanted

Delta rule (weights into the output units)

Delta rule (weights into the hidden units), with the error passed back by backpropagation
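Hidden units have no target of their own, so their share of the error has to be derived from the output error. The sketch below shows one way to do this for a tiny network with two inputs, two hidden units, and one output. It is an illustration under stated assumptions, not the deck's implementation: logistic units without bias terms, error E = 0.5 * (o - t)^2, and hypothetical starting weights.

```python
import math

def logistic(a):
    return 1.0 / (1.0 + math.exp(-a))

def forward(x, w_hid, w_out):
    """w_hid[j] holds the weights into hidden unit j; w_out[j] links hidden j to the output."""
    hidden = [logistic(sum(w * xi for w, xi in zip(w_j, x))) for w_j in w_hid]
    output = logistic(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output

def error(x, t, w_hid, w_out):
    """E = 0.5 * (what we got - what we wanted)^2."""
    _, o = forward(x, w_hid, w_out)
    return 0.5 * (o - t) ** 2

def backprop(x, t, w_hid, w_out):
    """Gradients of E with respect to every weight."""
    hidden, o = forward(x, w_hid, w_out)
    delta_out = (o - t) * o * (1.0 - o)          # error signal at the output unit
    grad_out = [delta_out * h for h in hidden]
    # Each hidden unit receives the output error scaled by its outgoing weight.
    delta_hid = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                 for j in range(len(hidden))]
    grad_hid = [[d * xi for xi in x] for d in delta_hid]
    return grad_hid, grad_out

def train_step(x, t, w_hid, w_out, rate=0.5):
    """One gradient-descent update of all weights."""
    grad_hid, grad_out = backprop(x, t, w_hid, w_out)
    new_hid = [[w - rate * g for w, g in zip(w_j, g_j)]
               for w_j, g_j in zip(w_hid, grad_hid)]
    new_out = [w - rate * g for w, g in zip(w_out, grad_out)]
    return new_hid, new_out

# Hypothetical training case and starting weights.
x, t = [1.0, 0.0], 1.0
w_hid = [[0.1, -0.2], [0.4, 0.3]]
w_out = [0.5, -0.6]

grad_hid, grad_out = backprop(x, t, w_hid, w_out)
new_hid, new_out = train_step(x, t, w_hid, w_out)
print(error(x, t, w_hid, w_out), error(x, t, new_hid, new_out))
```

The second printed error is smaller than the first, which is the delta rule's stated goal of changing weights to decrease the error.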

Page 35

Error Functions

• Mean Squared Error (for most problems)

$\sum (t - o)^2 / n$

• Cross Entropy Error (for dichotomous or binary outcomes)

$-\sum \left[ t \ln o + (1 - t) \ln (1 - o) \right]$

Here t is the target, o is the network output, and the sums run over the n cases.
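Both error functions are short to compute. The sketch below assumes `targets` holds the wanted values t and `outputs` the network outputs o; the cross entropy is written with a leading minus sign so that, like the squared error, smaller is better. For cross entropy every output must lie strictly between 0 and 1.

```python
import math

def mean_squared_error(targets, outputs):
    """Sum of (t - o)^2 over all cases, divided by the number of cases n."""
    n = len(targets)
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / n

def cross_entropy_error(targets, outputs):
    """Negative sum of t*ln(o) + (1 - t)*ln(1 - o); requires 0 < o < 1."""
    return -sum(t * math.log(o) + (1 - t) * math.log(1 - o)
                for t, o in zip(targets, outputs))

targets = [1, 0, 1, 0]              # hypothetical binary outcomes
good = [0.9, 0.1, 0.8, 0.2]         # outputs close to the targets
poor = [0.6, 0.5, 0.4, 0.7]         # outputs far from the targets

print(mean_squared_error(targets, good), mean_squared_error(targets, poor))
print(cross_entropy_error(targets, good), cross_entropy_error(targets, poor))
```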

Page 36

Minimizing the Error

[Figure: error surface plotted against a weight w. At w initial (initial error) the derivative is negative, so w receives a positive change; training ends at w trained (final error), a local minimum]

Page 37

Numerical Methods

$a x^3 + b x^2 + c x + d = 0$

[Figure: y plotted against x for the cubic; a 1st pair of guessed roots gives function values of opposite sign (+ and -), and a 2nd pair of guessed roots narrows the interval]
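One reading of the figure is the bisection method: start from a pair of guesses whose function values have opposite signs, then keep halving the interval. The sketch below follows that reading. The cubic used, x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3), is a hypothetical example chosen because its roots are known.

```python
def cubic(x, a, b, c, d):
    """a*x^3 + b*x^2 + c*x + d."""
    return a * x**3 + b * x**2 + c * x + d

def bisect_root(f, lo, hi, tol=1e-10, max_iter=200):
    """Find a root of f in [lo, hi], given that f(lo) and f(hi) differ in sign."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0:
        return lo
    if f_hi == 0:
        return hi
    if f_lo * f_hi > 0:
        raise ValueError("the two guesses must give values of opposite sign")
    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        f_mid = f(mid)
        if f_mid == 0 or (hi - lo) / 2.0 < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid               # sign change is in the left half
        else:
            lo, f_lo = mid, f_mid  # sign change is in the right half
    return (lo + hi) / 2.0

f = lambda x: cubic(x, 1, -6, 11, -6)
root = bisect_root(f, 2.5, 4.0)  # f(2.5) < 0 and f(4.0) > 0
print(root)
```

Iterative weight training has the same flavor: there is no closed-form answer, so a guess is refined step by step.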

Page 38

Gradient descent

[Figure: error curve with a local minimum and a global minimum]
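The difference between a local and a global minimum is easy to reproduce in one dimension. The sketch below runs gradient descent on a made-up error curve, E(w) = w^4 - 3w^2 + w, which has a shallow minimum near w = 1.13 and a deeper one near w = -1.30. The curve, learning rate, and step count are assumptions chosen for illustration.

```python
def error(w):
    """Hypothetical error curve with one local and one global minimum."""
    return w**4 - 3 * w**2 + w

def error_derivative(w):
    return 4 * w**3 - 6 * w + 1

def gradient_descent(w_initial, rate=0.01, steps=2000):
    """Repeatedly move w against the sign of the derivative."""
    w = w_initial
    for _ in range(steps):
        w -= rate * error_derivative(w)
    return w

w_from_right = gradient_descent(2.0)   # settles in the local minimum
w_from_left = gradient_descent(-2.0)   # settles in the global minimum
print(w_from_right, error(w_from_right))
print(w_from_left, error(w_from_left))
```

Which minimum is reached depends only on the starting weight, which is why networks are often trained from several random initializations.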

Page 39

Overfitting

[Figure: Overfitted Model compared with the Real Distribution]

Page 40

Overfitting

[Figure: tss (total sum of squares error) plotted against epochs, one curve per data set]

tss a: a = test set

tss b: b = training set

Overfitted model: the region where tss b keeps falling while tss a rises

Stopping criterion: min (Δtss)
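A stopping criterion of this kind can be sketched as a scan over the error recorded on the set that is not used for fitting. The function below is one possible rule, not the one from the slide: it returns the epoch with the lowest held-out error and gives up after `patience` epochs without improvement. The error values are invented for the example.

```python
def early_stopping_epoch(heldout_errors, patience=3):
    """Index of the best epoch, scanning until `patience` epochs pass with no improvement."""
    best_epoch, best_error = 0, float("inf")
    for epoch, err in enumerate(heldout_errors):
        if err < best_error:
            best_epoch, best_error = epoch, err
        elif epoch - best_epoch >= patience:
            break  # held-out error has stopped improving
    return best_epoch

# Hypothetical error curves: training error keeps falling, held-out error turns upward.
training_tss = [10.0, 8.0, 6.0, 5.0, 4.0, 3.0, 2.5, 2.0]
heldout_tss = [11.0, 9.0, 7.5, 7.0, 7.2, 7.6, 8.1, 9.0]

stop_at = early_stopping_epoch(heldout_tss)
print(stop_at)  # 3
```

Weights saved at the returned epoch are kept; continuing past it only fits noise in the training set.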

Page 41

Overfitting in Neural Nets

[Figure, first panel: CHD plotted against age, comparing the overfitted model with the “real” model]

[Figure, second panel: error plotted against training cycles for the training set and the holdout set; the overfitted model corresponds to the cycles where holdout error rises while training error keeps falling]

Page 42

Parameter Estimation

Logistic regression

• It models “just” one function

– Maximum likelihood

– Fast

– Optimizations: Fisher, Newton-Raphson

Neural network

• It models several functions

– Backpropagation

– Iterative

– Slow

– Optimizations: Quickprop, Scaled conjugate g.d., Adaptive learning rate

Page 43

What do you want? Insight versus prediction

Insight into the model

• Explain importance of each variable

• Assess model fit to existing data

Accurate predictions

• Make a good estimate of the “real” probability

• Assess model prediction in new data

Page 44

Model Selection: Finding influential variables

Logistic

• Forward

• Backward

• Stepwise

• Arbitrary

• All combinations

• Relative risk

Neural Network

• Weight elimination

• Automatic Relevance Determination

• “Relevance”

Page 45

Regression Diagnostics: Finding influential observations

Logistic

• Analysis of residuals

• Cook’s distance

• Deviance

• Difference in coefficients when case is left out

Neural Network

• Ad-hoc

Page 46

How accurate are predictions?

• Construct training and test sets or bootstrap to assess “unbiased” error

• Assess

– Discrimination

• How model “separates” alive and dead

– Calibration

• How close the estimates are to the “real” probability

Page 47

“Unbiased” Evaluation: Training and Test Sets

• Training set is used to build the model (may include holdout set to control for overfitting)

• Test set left aside for evaluation purposes

• Ideal: yet another validation data set, from different source to test if model generalizes to other settings

Page 48

Small sets: Cross-validation

• Several training and test set pairs are created so that the union of all test sets corresponds exactly to the original set

• Results from the different models are pooled and overall performance is estimated

• “Leave-n-out”

• Jackknife
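The first bullet's requirement, that the test sets together cover the original set exactly, can be met by dealing case indices into k folds. The sketch below is a minimal version without shuffling (real use would normally randomize the order first); the function name `k_fold_splits` is made up for the example. Setting k equal to the number of cases gives leave-one-out.

```python
def k_fold_splits(n_cases, k):
    """Return k (train_indices, test_indices) pairs whose test sets partition range(n_cases)."""
    indices = list(range(n_cases))
    folds = [indices[i::k] for i in range(k)]  # deal cases into k folds
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits

splits = k_fold_splits(10, 5)
for train, test in splits:
    print(sorted(train), test)

# Every case appears in exactly one test set.
all_test = sorted(idx for _, test in splits for idx in test)
print(all_test == list(range(10)))  # True
```

A model is trained on each `train` list and scored on the matching `test` list, and the scores are then pooled.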

Page 49

ECG Interpretation

Inputs: R-R interval, S-T elevation, P-R interval, QRS duration, AVF lead, QRS amplitude

Outputs: SV tachycardia, Ventricular tachycardia, LV hypertrophy, RV hypertrophy, Myocardial infarction

Page 50

Thyroid Diseases

[Figure, first network: patient data (TSH, T4U, clinical finding 1, ...) feed a hidden layer (5 or 10 units), which outputs partial diagnoses: Normal, Hyperthyroidism, Hypothyroidism, Other conditions]

Patients who will be evaluated further

[Figure, second network: patient data (TSH, T4U, clinical finding 1, ...) plus additional input (T3, TT4, TBG, ...) feed a hidden layer (5 or 10 units), which outputs final diagnoses: Normal, Primary hypothyroidism, Compensated hypothyroidism, Secondary hypothyroidism, Hypothyroidism, Other conditions]

Page 51

Time Series

Input units (independent variables): $X_n$, $X_{n+1}$

Hidden units

Output units (dependent variables): $Y = X_{n+2}$

Weights (estimated parameters)
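Turning a series into training cases for this layout means sliding a window along it: two consecutive values become the inputs and the next value becomes the target. The sketch below shows that step only; the function name and the sample series are hypothetical.

```python
def make_windows(series, n_inputs=2):
    """Pair each run of n_inputs consecutive values with the value that follows it."""
    cases = []
    for n in range(len(series) - n_inputs):
        inputs = series[n:n + n_inputs]   # X_n, X_{n+1}
        target = series[n + n_inputs]     # Y = X_{n+2}
        cases.append((inputs, target))
    return cases

series = [1, 2, 3, 5, 8]
cases = make_windows(series)
print(cases)  # [([1, 2], 3), ([2, 3], 5), ([3, 5], 8)]
```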

Page 52

Time Series

Input units (independent variables): $X_n$, $X_{n+1}$

Hidden units

Output units (dependent variables): $Y = X_{n+2}$

Weights (estimated parameters)

Page 53

Evaluation

[Figure: after randomization of cases, the data are divided into Training, Test, and Validation sets; the stages shown are model development, model enhancement, and model evaluation]

[Table: predicted class (“A”, “B”) against actual class (A, B); agreement is OK, and the two kinds of disagreement are Type I and Type II errors]

Page 54

Evaluation: Area Under ROCs

[Figure: data are passed to the models (including a neural network); each model produces an ROC curve plotted against 1 - Specificity, and the areas under the ROCs are compared]

Page 55

ROC Analysis: Variations

ROC

Area under ROC

Slope and Intercept

Confidence interval

Wilcoxon statistic
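The area under the ROC curve and the Wilcoxon statistic are linked: the area equals the fraction of (positive, negative) case pairs in which the positive case receives the higher score, with ties counted as one half. The sketch below computes the area that way; the scores are invented for the example.

```python
def auc_wilcoxon(scores_positive, scores_negative):
    """Area under the ROC curve as the proportion of correctly ranked pairs."""
    wins = 0.0
    for sp in scores_positive:
        for sn in scores_negative:
            if sp > sn:
                wins += 1.0   # positive case ranked above negative case
            elif sp == sn:
                wins += 0.5   # ties count as half
    return wins / (len(scores_positive) * len(scores_negative))

# Hypothetical model outputs for cases with and without the outcome.
with_outcome = [0.9, 0.8, 0.4]
without_outcome = [0.7, 0.3, 0.2]

auc = auc_wilcoxon(with_outcome, without_outcome)
print(auc)  # 8 of the 9 pairs are ranked correctly
```

An area of 0.5 means the model ranks cases no better than chance, and 1.0 means perfect discrimination.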

Page 56

Expert Systems and Neural Nets

[Figure: Expert Systems and Neural Networks compared along an axis labeled # Examples]

Page 57

Model Comparison (personal biases)

                       Modeling Effort   Examples Needed   Explanation Provided
Rule-based Exp. Syst.  high              low               high
Bayesian Nets          high              low               moderate
Classification Trees   low               high              “high”
Neural Nets            low               high              low
Regression Models      high              moderate          moderate

Page 58

Conclusion

Neural Networks are

• mathematical models that resemble nonlinear regression models, but are also useful to model nonlinearly separable spaces

• “knowledge acquisition tools” that learn from examples

Neural Networks in Medicine are used for:

– pattern recognition (images, diseases, etc.)

– exploratory analysis, control

– predictive models

Page 59

Conclusion

• No final indication for using either logistic regression or neural network

• Try both, select best

• Make unbiased evaluation

• Compare statistically
