[fdd 2017] pablo ribalta - machine learning done right

Machine learning done right

An approach to successfully building AI products

Pablo Ribalta Lorenzo

R&D Lead Engineer

[email protected]


Building a Machine Learning product

Choosing your metric

Building your dataset

Tuning your parameters

Comparing your results


Choosing your metric


Figure: an MRI scan next to its manual segmentation (the doctor's prediction), and an MRI scan next to its automatic segmentation, which is still unknown (?).

Figure: training takes an MRI scan and its ground truth; the ML-system prediction is then compared with the ground truth (ML-system prediction vs ground truth).

Approach #0: Pixelwise comparison

Approach #1: Exploiting confusion matrices

Figure: confusion matrix of true positives (TP), false positives (FP), false negatives (FN) and true negatives (TN). The selected elements are TP + FP; the relevant elements are TP + FN.

$$\text{Precision} = \frac{TP}{TP + FP} \in [0, 1]$$

What is our tendency to oversegment?

$$\text{Recall} = \frac{TP}{TP + FN} \in [0, 1]$$

What is our tendency to miss items?


Ultimate goal: Single metric


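$$F_1\ \text{score} = 2 \cdot \frac{1}{\frac{1}{\text{recall}} + \frac{1}{\text{precision}}} = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}} \in [0, 1]$$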
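As a minimal sketch of the metrics above, the following computes precision, recall and F1 for two binary masks. It assumes the masks are flattened into equally long lists of 0/1 pixels; the function names and the tiny example masks are illustrative, not taken from the talk.

```python
def confusion_counts(prediction, ground_truth):
    """Count TP, FP, FN, TN for two flat binary masks of equal length."""
    if len(prediction) != len(ground_truth):
        raise ValueError("masks must have the same number of pixels")
    tp = fp = fn = tn = 0
    for p, g in zip(prediction, ground_truth):
        if p and g:
            tp += 1
        elif p and not g:
            fp += 1
        elif not p and g:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(prediction, ground_truth):
    """Return (precision, recall, F1); each is 0.0 when its denominator is 0."""
    tp, fp, fn, _ = confusion_counts(prediction, ground_truth)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative 8-pixel masks (made up for this example).
ground_truth = [0, 1, 1, 1, 0, 0, 0, 0]
prediction = [0, 1, 1, 0, 1, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(prediction, ground_truth)
```

Here one lesion pixel is missed and one background pixel is wrongly selected, so precision and recall are both 2/3, and so is their harmonic mean.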


Choosing your metric: Summary


• Like business requirements, choosing a good metric comes as a result of understanding the needs and expectations of the model’s users

• A model can be excellent in one metric, but very poor in others

• Train using the metric you plan on judging the model with
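The second point can be illustrated with a small sketch, assuming a heavily imbalanced mask in which only 5 of 100 pixels belong to the lesion. The mask, the "predict everything as background" model and the function names are made up for illustration.

```python
def pixel_accuracy(prediction, ground_truth):
    """Fraction of pixels on which the two flat binary masks agree."""
    matches = sum(1 for p, g in zip(prediction, ground_truth) if p == g)
    return matches / len(ground_truth)


def f1_score(prediction, ground_truth):
    """F1 written as 2TP / (2TP + FP + FN); 0.0 if the denominator is 0."""
    pairs = list(zip(prediction, ground_truth))
    tp = sum(1 for p, g in pairs if p == 1 and g == 1)
    fp = sum(1 for p, g in pairs if p == 1 and g == 0)
    fn = sum(1 for p, g in pairs if p == 0 and g == 1)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical mask: 5 lesion pixels, 95 background pixels.
ground_truth = [1] * 5 + [0] * 95
# A degenerate model that never selects anything.
all_background = [0] * 100

accuracy = pixel_accuracy(all_background, ground_truth)
f1 = f1_score(all_background, ground_truth)
```

The degenerate model agrees with the ground truth on 95 of 100 pixels, yet its F1 score is 0 because it finds no lesion pixel at all.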


Building your dataset



How much data can we collect?



Dealing with data scarcity


Medical records: only a few patients


Secret sauce: Data augmentation

Figure: an original image and its deformed variants, combining rotation (0°, 45°, 90°) with horizontal and vertical flips.
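A minimal sketch of this kind of augmentation, under simplifying assumptions: images are small 2D lists, only 0° and 90° rotations are covered (a 45° rotation needs interpolation and is left out), and all helper names are illustrative. For segmentation, the same transform is applied to the image and to its mask so that the two stay aligned.

```python
def identity(image):
    """Return a copy of the 2D list unchanged."""
    return [row[:] for row in image]


def rotate90(image):
    """Rotate a 2D list by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*image)][::-1]


def horizontal_flip(image):
    """Mirror a 2D list left to right."""
    return [row[::-1] for row in image]


def vertical_flip(image):
    """Mirror a 2D list top to bottom."""
    return [row[:] for row in image[::-1]]


def augment(image, mask):
    """Return (image, mask) pairs for 2 rotations x 3 flip settings."""
    rotations = (identity, rotate90)
    flips = (identity, horizontal_flip, vertical_flip)
    variants = []
    for rotate in rotations:
        for flip in flips:
            # Apply the identical transform to the image and its mask.
            variants.append((flip(rotate(image)), flip(rotate(mask))))
    return variants


# Tiny made-up image and mask, only to show the shapes involved.
image = [[1, 2], [3, 4]]
mask = [[0, 1], [0, 0]]
variants = augment(image, mask)
```

One original sample becomes six here; the first variant is the untouched original.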


Building your dataset: Summary


• There are many approaches to augmenting data

• We must ensure that our dataset is balanced and correctly describes the data’s statistical distribution

• Although not mentioned, splitting a dataset into Training, Validation and Test sets is fundamental for correct training and evaluation of the results
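The split in the last point could be sketched as follows. This is an assumption-laden example, not the procedure used in the talk: the patient identifiers, the 70/15/15 proportions and the function name are hypothetical. Splitting by patient rather than by image is one common way to keep scans of the same person out of both training and test sets.

```python
import random


def split_dataset(items, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle the items reproducibly and split them into train/val/test."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Hypothetical patient identifiers.
patients = [f"patient_{i:02d}" for i in range(20)]
train, val, test = split_dataset(patients)
```

The fixed seed makes the split reproducible, which matters when results are later compared across experiments.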


Tuning your model



Hyperparameter optimisation




Automatic hyper-parameter selection: Particle Swarm Optimization



• Pablo Ribalta Lorenzo, Jakub Nalepa, Michal Kawulok, Luciano Sanchez Ramos, and José Ranilla Pastor. 2017. Particle swarm optimization for hyper-parameter selection in deep neural networks. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO '17). ACM, New York, NY, USA, 481-488.

• Pablo Ribalta Lorenzo, Jakub Nalepa, Luciano Sanchez Ramos, and José Ranilla Pastor. 2017. Hyper-parameter selection in deep neural networks using parallel particle swarm optimization. In Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO '17). ACM, New York, NY, USA, 1864-1871.

When possible, go automatic
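To give a feel for the idea, here is a minimal, generic particle swarm optimiser. It is a textbook-style sketch and not the algorithm from the papers cited above; the inertia and acceleration values, the two hyper-parameters (log10 of the learning rate and dropout) and the toy objective standing in for a validation loss are all assumptions made for illustration.

```python
import random


def particle_swarm_minimise(objective, bounds, n_particles=20, n_iterations=100,
                            inertia=0.7, cognitive=1.5, social=1.5, seed=0):
    """Minimise objective(position) inside box bounds with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_best_loss = [objective(p) for p in positions]
    best = min(range(n_particles), key=personal_best_loss.__getitem__)
    global_best = personal_best[best][:]
    global_best_loss = personal_best_loss[best]

    for _ in range(n_iterations):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Pull towards the particle's own best and the swarm's best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + cognitive * r1 * (personal_best[i][d] - positions[i][d])
                    + social * r2 * (global_best[d] - positions[i][d])
                )
                # Move, then clamp the position to the search box.
                moved = positions[i][d] + velocities[i][d]
                positions[i][d] = min(hi, max(lo, moved))
            loss = objective(positions[i])
            if loss < personal_best_loss[i]:
                personal_best[i] = positions[i][:]
                personal_best_loss[i] = loss
                if loss < global_best_loss:
                    global_best = positions[i][:]
                    global_best_loss = loss
    return global_best, global_best_loss


def toy_validation_loss(params):
    """Hypothetical stand-in for 'train a network, return validation loss'."""
    log_learning_rate, dropout = params
    return (log_learning_rate + 3.0) ** 2 + (dropout - 0.5) ** 2


# Search box: log10(learning rate) in [-5, -1], dropout in [0, 0.9].
best_params, best_loss = particle_swarm_minimise(
    toy_validation_loss, bounds=[(-5.0, -1.0), (0.0, 0.9)]
)
```

In a real project each call to the objective would train and validate a model, so the number of particles and iterations is usually limited by the compute budget.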


Tuning your model: Summary


• Hyper-parameter optimization is probably the most time-consuming aspect of building a Machine Learning product

• We need to be confident that our selected settings will translate well to the majority of cases

• Use automatic approaches when possible


Comparing your results



Surpassing human performance in medical classification





• Typical human performance: 3% error

• Typical doctor performance: 1% error

• Experienced doctor performance: 0.7% error

• Team of experienced doctors performance: 0.5% error

What is human performance?

Figure: four panels with F1 score = 0.817, 0.845, 0.545 and 0.801.


Comparing with the state of the art


• Superpixel segmentation algorithm





3x state-of-the-art performance for single-stage lesions

2x state-of-the-art performance for multiple-stage lesions


Comparing your results: Summary


• Comparing with human performance is hard, and most of the time the comparison can be misleading

• We have to strive for statistically significant results across different subsets of our data

• Comparing with the state of the art is always a good idea, but we must ensure a fair comparison
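One possible way to check significance across subsets is a paired sign-flip permutation test, sketched below. The choice of test, the function name and the per-subset scores are assumptions for illustration only; the scores are made-up numbers, not results from the talk.

```python
from itertools import product


def paired_sign_flip_test(scores_a, scores_b):
    """Two-sided exact sign-flip test on paired per-subset scores.

    Under the null hypothesis the sign of each paired difference is
    arbitrary, so every sign pattern is enumerated and the p-value is the
    share of patterns at least as extreme as the observed one.
    """
    differences = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(differences))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(differences)):
        total += 1
        statistic = abs(sum(s * d for s, d in zip(signs, differences)))
        if statistic >= observed - 1e-12:  # tolerance for rounding error
            extreme += 1
    return extreme / total


# Made-up F1 scores of two models on the same 8 data subsets.
model_a = [0.82, 0.79, 0.85, 0.80, 0.77, 0.83, 0.81, 0.78]
model_b = [0.80, 0.75, 0.84, 0.78, 0.76, 0.80, 0.79, 0.74]
p_value = paired_sign_flip_test(model_a, model_b)
```

Because both models are scored on the same subsets, the test works on paired differences, which is part of what makes the comparison fair. The full enumeration is only practical for a small number of subsets, since it grows as 2^n.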


About us



ECONIB in numbers


• 18 months ongoing

• 8 publications

• Featured in social media

• Healthcare and research partnership

• NVIDIA Inception member

• Still more research in progress


Conclusions


• Building ML products is possible with a rigorous scientific approach

• Maximising the performance of our model is a nuanced process that requires a thorough understanding of the problem and the theory behind it

• It is not only about the model, but also about what’s around it


Machine learning done right: An approach to building successful ML projects

[email protected]

www.future-processing.pl
