

A THEORETICAL FRAMEWORK FOR ROBUSTNESS OF (DEEP) CLASSIFIERS UNDER ADVERSARIAL EXAMPLES

Beilun Wang, Ji Gao and Yanjun Qi
Department of Computer Science, University of Virginia

Problem Setting:

Define Adversarial Examples:

Towards Principled Solutions (for DNNs):

Our theorems suggest a list of possible solutions that may improve the robustness of DNN classifiers against adversarial samples. Options include (1) learning a better $g_1$; (2) modifying unnecessary features (see Poster DeepMask, Tuesday Morning, W18).

• For (1), an alternative method for hardening DNN models is to minimize a loss function $L_{f_1}(x, x')$ so that, when $d_2(g_2(x), g_2(x')) < \epsilon$ (approximated by $(X, \|\cdot\|)$), the loss $L_{f_1}(x, x')$ is small. The poster's table compares existing hardening solutions that use this method.
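The hardening idea in this bullet can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the poster: the toy linear-softmax classifier standing in for the DNN, the squared difference of predicted probabilities used as the pair loss, the weight `lam`, and all function names. Following the poster's remark that closeness is approximated by $(X, \|\cdot\|)$, the neighbor x' is drawn from a Euclidean ball of radius `eps` around x.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, bias, x):
    """Toy linear-softmax classifier (hypothetical stand-in for the DNN)."""
    scores = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, bias)]
    return softmax(scores)


def sample_neighbor(x, eps, rng):
    """Draw x' with ||x - x'||_2 < eps (assumed proxy for closeness)."""
    direction = [rng.gauss(0.0, 1.0) for _ in x]
    norm = math.sqrt(sum(d * d for d in direction))
    radius = eps * rng.random()  # rng.random() lies in [0, 1)
    return [xi + radius * d / norm for xi, d in zip(x, direction)]


def pair_loss(weights, bias, x, x_prime):
    """Assumed pair loss: squared distance between the two predictions."""
    p = predict_proba(weights, bias, x)
    q = predict_proba(weights, bias, x_prime)
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def cross_entropy(weights, bias, x, y):
    """Standard classification loss for one labeled example."""
    return -math.log(predict_proba(weights, bias, x)[y])


def hardened_objective(weights, bias, x, y, eps, lam, rng):
    """Classification loss plus a penalty that is small only when the
    prediction barely changes between x and a nearby x'."""
    x_prime = sample_neighbor(x, eps, rng)
    return (cross_entropy(weights, bias, x, y)
            + lam * pair_loss(weights, bias, x, x_prime))


# Illustrative values only: 3 classes, 2 features.
rng = random.Random(0)
weights = [[0.5, -0.2], [-0.3, 0.8], [0.1, 0.1]]
bias = [0.0, 0.0, 0.0]
x, y = [1.0, 2.0], 1
x_prime = sample_neighbor(x, 0.1, rng)
objective = hardened_objective(weights, bias, x, y, 0.1, 1.0, rng)
```

Minimizing `hardened_objective` over the weights would push the classifier to give nearly the same output for x and its neighbors. Other pair losses, such as a KL divergence between the two predictions, would fit the same template.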

Experiment Evaluation

Define $(\delta_2, \eta)$-Strong-robustness:

Why DNN model is not strong-robust.

Why a classifier is vulnerable to adversarial samples.

Sufficient Condition for Strong-robustness:

Strong-robustness for D.

Experimental Evaluation:

Towards Principled Understanding