

A Novel Active Learning Approach for Constructing High-Fidelity Mobility Maps

Quad Members: Gary Marple¹, Dave Mechergui², Tamer Wasfy³, Shravan Veerapaneni¹, Paramsothy Jayakumar²

¹University of Michigan's Department of Mathematics, ²U.S. Army TARDEC, ³Advanced Science and Automation Corporation

Motivation

• Constructing mobility maps requires thousands of physics-based simulations, each of which can take weeks using high-performance computing [1].

• Trained machine learning (ML) classifiers can quickly generate mobility maps [2].

• According to PAC learning theory, data that can be separated by a classifier is expected to require up to O(1/ε) randomly selected points (simulations) [3] to train the classifier with error less than ε.

Objective

• Develop a sampling technique that will train ML classifiers using far fewer simulations.
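To make the O(1/ε) scaling in the Motivation concrete, the sketch below tabulates how the suggested number of randomly chosen simulations grows as the target error shrinks. The function name `pac_sample_scale` and its `constant` argument are hypothetical: the bound hides a constant (and possibly logarithmic factors) that the poster does not state, so the numbers indicate only the trend, not an actual simulation budget.

```python
import math


def pac_sample_scale(epsilon: float, constant: float = 1.0) -> int:
    """Number of random samples suggested by a bound of the form constant / epsilon.

    `constant` is a hypothetical placeholder for the factor hidden by the O(.) notation.
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    return math.ceil(constant / epsilon)


# Halving the target error roughly doubles the number of simulations required.
for eps in (0.10, 0.05, 0.01):
    print(f"target error {eps:.2f}: about {pac_sample_scale(eps)} x (hidden constant) points")
```

Because each simulation can take weeks, even this linear growth in 1/ε motivates choosing the training points more carefully than at random.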


Automotive Research Center

Approach

Ensemble of Classifiers

• The ensemble consists of n classifiers.

• Each classifier is trained on a randomly chosen subset of the labeled data.

• Each classifier predicts the label of each point in the unlabeled pool. If no clear majority of the classifiers agrees on the label of a point, that point is queried.
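The query rule described above can be sketched as a vote-margin computation. This is a minimal illustration, not the authors' implementation: the helper names `vote_margin` and `select_queries` are hypothetical, and ranking points by the gap between the two most common predicted labels is one assumed way to formalize "the classifiers cannot agree".

```python
from collections import Counter
from typing import List, Sequence


def vote_margin(votes: Sequence[int]) -> int:
    """Gap between the two largest vote counts; 0 means the committee is tied."""
    if not votes:
        raise ValueError("each pool point needs at least one vote")
    counts = Counter(votes).most_common(2)
    top = counts[0][1]
    runner_up = counts[1][1] if len(counts) > 1 else 0
    return top - runner_up


def select_queries(committee_votes: Sequence[Sequence[int]], batch_size: int = 1) -> List[int]:
    """Indices of the pool points with the smallest vote margin.

    committee_votes[i] holds the n labels predicted for unlabeled point i.
    """
    order = sorted(range(len(committee_votes)),
                   key=lambda i: vote_margin(committee_votes[i]))
    return order[:batch_size]


# Three pool points, five committee members each (made-up votes for illustration).
votes = [
    [1, 1, 1, 1, 1],  # unanimous
    [1, 0, 1, 0, 1],  # closely split
    [0, 0, 0, 0, 1],  # clear majority
]
print(select_queries(votes, batch_size=1))
```

Points with a small margin lie near the committee's current decision boundary, which is where a new simulation is expected to be most informative.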

Active Learning Paradigm for Terramechanics

1. Generate Data Points for Unlabeled Pool
2. Randomly Choose a Few Points From Unlabeled Pool
3. Run Simulations to Obtain Labels (Speed-Made-Good)
4. Add Labeled Point(s) to Labeled Pool
5. Train Ensemble Using Labeled Pool
6. Use QBag [4] to Select Informative Point(s) from Unlabeled Pool, then return to step 3
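The loop formed by the flowchart steps above can be sketched end to end as follows. Everything here is an assumption made for illustration: `toy_oracle` is a hypothetical stand-in for the physics-based simulation and returns a binary label, a 1-nearest-neighbor rule replaces the MLP and SVM classifiers, each committee member is trained on a bootstrap resample (sampling with replacement, as in bagging), and one point is queried per round.

```python
import random
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Labeled = Tuple[Point, int]


def toy_oracle(x: Point) -> int:
    """Hypothetical stand-in for a physics-based simulation (binary label)."""
    return 1 if x[0] + x[1] > 1.0 else 0


def nearest_neighbor_label(train: List[Labeled], x: Point) -> int:
    """Stand-in base classifier: label of the closest training point."""
    closest = min(train, key=lambda item: (item[0][0] - x[0]) ** 2 + (item[0][1] - x[1]) ** 2)
    return closest[1]


def active_learning(pool: List[Point], oracle: Callable[[Point], int],
                    n_initial: int = 5, n_queries: int = 20,
                    n_classifiers: int = 20, seed: int = 0) -> List[Labeled]:
    rng = random.Random(seed)
    unlabeled = list(pool)
    rng.shuffle(unlabeled)
    # Randomly choose a few points and label them with the oracle.
    labeled = [(x, oracle(x)) for x in unlabeled[:n_initial]]
    unlabeled = unlabeled[n_initial:]
    for _ in range(n_queries):
        if not unlabeled:
            break
        # Train the ensemble: one bootstrap resample of the labeled pool per member.
        bags = [[rng.choice(labeled) for _ in labeled] for _ in range(n_classifiers)]

        def margin(x: Point) -> int:
            # Distance of the committee vote from an even split (0 means a tie).
            ones = sum(nearest_neighbor_label(bag, x) for bag in bags)
            return abs(2 * ones - n_classifiers)

        # Query the point the committee disagrees on most, then label it.
        query = min(unlabeled, key=margin)
        unlabeled.remove(query)
        labeled.append((query, oracle(query)))
    return labeled


pool_rng = random.Random(42)
pool = [(pool_rng.random(), pool_rng.random()) for _ in range(200)]
labeled = active_learning(pool, toy_oracle)
print(len(labeled), "labeled points")
```

In a real study the oracle call is the expensive step, so `n_queries` directly controls the simulation budget; selecting several points per round would need an additional rule to avoid near-duplicate queries.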

[Figure panels: Physics-based Simulation; Test Function; Classifier Errors; Mobility Map]

• We constructed a test function using data from 528 physics-based simulations.

• QBag can significantly reduce the number of simulations needed to train multilayer perceptron (MLP) and SVM classifiers.

• With 100 points, the MLP that used QBag had comparable accuracies to the MLP that was trained with 300 randomly chosen points.

Future Work

• Increase the size of the feature space to include more variables that affect mobility, such as the soil density, cohesion, and friction.

• Investigate tools for reducing redundancy when sampling in batches.

• Use simulations to label data points that are selected by the active learner.

References

1. McCullough, Michael, et al. "The Next Generation NATO Reference mobility model development." Journal of Terramechanics (2017).
2. Jayakumar, Paramsothy and Dave Mechergui. "Efficient Generation of Accurate Mobility Maps Using Machine Learning Algorithms." Under review (2018).
3. Freund, Yoav, et al. "Selective Sampling Using the Query by Committee Algorithm." Machine Learning (1997).
4. Abe, Naoki and Hiroshi Mamitsuka. "Query learning strategies using boosting and bagging." Machine learning: proceedings of the fifteenth international conference (ICML’98). Vol. 1. Morgan Kaufmann Pub, 1998.

Results

[Figure: Active Learning vs. Random Sampling. Panel labels: 200 pts, n=20, 99.2% Accuracy; 200 pts, n=20, 95.3% Accuracy; 100 pts, n=20, 91.8% Accuracy; 100 pts, n=20, 96.9% Accuracy. Annotation: 3% Error.]

DISTRIBUTION STATEMENT A. Approved for public release; distribution unlimited. OPSEC# 1251