

DLIME: A Deterministic Local Interpretable Model-Agnostic Explanations Approach for Computer-Aided Diagnosis Systems

Muhammad Rehman Zafar
Ryerson University
Toronto, Canada

[email protected]

Naimul Mefraz Khan
Ryerson University
Toronto, Canada

[email protected]

ABSTRACT
Local Interpretable Model-Agnostic Explanations (LIME) is a popular technique used to increase the interpretability and explainability of black box Machine Learning (ML) algorithms. LIME typically generates an explanation for a single prediction by any ML model by learning a simpler interpretable model (e.g. linear classifier) around the prediction through generating simulated data around the instance by random perturbation, and obtaining feature importance through applying some form of feature selection. While LIME and similar local algorithms have gained popularity due to their simplicity, the random perturbation and feature selection methods result in instability in the generated explanations, where for the same prediction, different explanations can be generated. This is a critical issue that can prevent deployment of LIME in a Computer-Aided Diagnosis (CAD) system, where stability is of utmost importance to earn the trust of medical professionals. In this paper, we propose a deterministic version of LIME. Instead of random perturbation, we utilize agglomerative Hierarchical Clustering (HC) to group the training data together and K-Nearest Neighbour (KNN) to select the relevant cluster of the new instance that is being explained. After finding the relevant cluster, a linear model is trained over the selected cluster to generate the explanations. Experimental results on three different medical datasets show the superiority of Deterministic Local Interpretable Model-Agnostic Explanations (DLIME), where we quantitatively determine the stability of DLIME compared to LIME utilizing the Jaccard similarity among multiple generated explanations.

KEYWORDS
Explainable AI (XAI), Interpretable Machine Learning, Explanation, Model Agnostic, LIME, Healthcare, Deterministic

ACM Reference Format:
Muhammad Rehman Zafar and Naimul Mefraz Khan. 2019. DLIME: A Deterministic Local Interpretable Model-Agnostic Explanations Approach for Computer-Aided Diagnosis Systems. In Proceedings of Anchorage '19: ACM SIGKDD Workshop on Explainable AI/ML (XAI) for Accountability, Fairness, and Transparency (Anchorage '19). ACM, New York, NY, USA, 6 pages.

Anchorage '19, August 04–08, 2019, Anchorage, AK
© 2019 Copyright held by the owner/author(s). Publication rights licensed to ACM. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in Proceedings of Anchorage '19: ACM SIGKDD Workshop on Explainable AI/ML (XAI) for Accountability, Fairness, and Transparency (Anchorage '19).

1 INTRODUCTION
AI and ML have become key elements in medical imaging and precision medicine over the past few decades. However, the most widely used ML models are opaque. In the health sector, a binary "yes" or "no" answer is sometimes not sufficient, and questions like "how" or "where" something occurred are more significant. To achieve transparency, quite a few interpretable and explainable models have been proposed in recent literature. These approaches can be grouped based on different criteria [10, 20, 23], such as i) model agnostic or model specific, ii) local, global or example based, iii) intrinsic or post-hoc, and iv) perturbation or saliency based.

Among them, model agnostic approaches are quite popular in practice, where the target is to design a separate algorithm that can explain the decision making process of any ML model. LIME [25] is a well-known model agnostic algorithm. LIME is an instance-based explainer, which generates simulated data points around an instance through random perturbation, and provides explanations by fitting a sparse linear model over predicted responses from the perturbed points. The explanations of LIME are locally faithful to an instance regardless of classifier type. The process of perturbing the points randomly makes LIME a non-deterministic approach, lacking "stability", a desirable property for an interpretable model, especially in CAD systems.

In this paper, a Deterministic Local Interpretable Model-Agnostic Explanations (DLIME) framework is proposed. DLIME uses Hierarchical Clustering (HC) to partition the dataset into different groups instead of randomly perturbing the data points around the instance. The adoption of HC is based on its deterministic characteristic and simplicity of implementation. Also, HC does not require prior knowledge of the number of clusters, and its output is a hierarchy, which is more useful than the unstructured set of clusters returned by flat clustering such as K-means [19]. Once the cluster membership of each training data point is determined, for a new test instance, a KNN classifier is used to find the closest similar data points. After that, all the data points belonging to the predominant cluster are used to train a linear regression model to generate the explanations. Utilizing HC and KNN to generate the explanations instead of random perturbation results in consistent explanations for the same instance, which is not the case for LIME. We demonstrate this behavior through experiments on three benchmark datasets from the UCI repository, where we show both qualitatively and quantitatively how the explanations generated by DLIME are consistent and stable, as opposed to LIME.

2 RELATEDWORKFor brevity, we restrict our literature review to locally interpretablemodels, which encourage understanding of learned relationship be-tween input variable and target variable over small regions. Local in-terpretability is usually applied to justify the individual predictionsmade by a classifier for an instance by generating explanations.



LIME [25] is one of the first locally interpretable models, which generates simulated data points around an instance through random perturbation, and provides explanations by fitting a sparse linear model over predicted responses from the perturbed points. In [26], LIME was extended using decision rules. In the same vein, leave-one-covariate-out (LOCO) [16] is another popular technique for generating local explanation models that offer local variable importance measures. In [11], the authors proposed an approach to partition the dataset using K-means instead of perturbing the data points around an instance being explained. The authors of [13] proposed partitioning the dataset using a supervised tree-based approach. The authors of [15] used LIME in precision medicine and discussed the importance of interpretability for understanding the contribution of important features in decision making.

The authors of [27] proposed a method to decompose the predictions of a classifier into the individual contributions of each feature. This methodology is based on computing the difference between the original predictions and the predictions made after eliminating a set of features. In [17], the authors demonstrated the equivalence among various local interpretable models [4, 8, 28] and also introduced a game-theory-based approach named SHAP (SHapley Additive exPlanations) to explain models. Baehrens et al. [1] proposed an approach that yields local explanations using local gradients, which depict the movement a data point would need to make to change its expected label. A similar approach was used in [29–32] to explain and understand the behaviour of image classification models.

One of the issues of the existing locally interpretable models is lack of "stability". In [9], this is defined as "explanation level uncertainty", where the authors show that explanations generated by different locally interpretable models have an amount of uncertainty associated with them due to the simplification of the black box model. In this paper, we address this issue at a more granular level. The basic question that we want to answer is: can explanations generated by a locally interpretable model provide consistent results for the same instance? As we will see in the experimental results section, due to the random nature of perturbation in LIME, the generated explanations for the same instance can be different, with different selected features and feature weights. This can reduce the healthcare practitioner's trust in the ML model. Hence, our target is to increase the stability of the interpretable model. Stability in our work specifically refers to the intensional stability of the feature selection method, which can be measured by the variability in the set of features selected [14, 21]. Measures of stability include average Jaccard similarity and Pearson's correlation among all pairs of feature subsets selected from different training sets generated using cross-validation, jackknife or bootstrap.

3 METHODOLOGY
Before explaining DLIME, we briefly describe the LIME framework. LIME is a surrogate model that is used to explain the predictions of an opaque model individually. The objective of LIME is to train surrogate models locally and explain individual predictions. Figure 1 shows a high level block diagram of LIME. It generates a synthetic dataset by randomly perturbing samples around an instance drawn from a normal distribution, and gathers corresponding predictions using the opaque model to be explained. Then, on this perturbed dataset, LIME trains an interpretable model, e.g. linear regression. Linear regression models the relationship between a dependent variable such as Y and multiple independent attributes such as X by utilizing a regression line Y = a + bX, where "a" is the intercept and "b" is the slope of the line. This equation can be used to predict the value of the target variable from given predictor variables. In addition, LIME takes as input the number of important features to be used to generate the explanation, denoted by K. The lower the value of K, the easier it is to understand the model. There are several approaches to select the K important features, such as i) backward or forward selection of features and ii) highest weights of the linear regression coefficients. LIME uses the forward feature selection method for small datasets which have fewer than 6 attributes, and the highest weights approach for higher dimensional datasets.

Figure 1: A block diagram of the LIME framework (components: test instance, random data perturbation, feature selection, weighting of the new samples, training of a weighted interpretable model, and an interpretable representation presented to the user).
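As a concrete illustration of the "highest weights" strategy mentioned above, the following minimal sketch fits a simple linear surrogate on already-perturbed data and ranks features by coefficient magnitude. The function name, the use of Ridge regression, and the inputs are illustrative assumptions, not LIME's internal code.

```python
# Illustrative sketch (not LIME's implementation): select the K most important
# features by the magnitude of linear-surrogate coefficients.
import numpy as np
from sklearn.linear_model import Ridge

def top_k_features(X_perturbed, y_predicted, feature_names, K=5):
    """Fit a linear surrogate on perturbed samples and their opaque-model
    predictions, then return the K features with the largest |coefficient|."""
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(X_perturbed, y_predicted)
    ranked = np.argsort(np.abs(surrogate.coef_))[::-1][:K]
    return [(feature_names[i], float(surrogate.coef_[i])) for i in ranked]
```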

As discussed before, a big problem with LIME is the "instability" of generated explanations due to the random sampling process. Because of the randomness, the outcome of LIME is different when the sampling process is repeated multiple times, as shown in our experiments. The process of perturbing the points randomly makes LIME a non-deterministic approach, lacking "stability", a desirable property for an interpretable model, especially in CAD systems.

3.1 DLIME
In this section, we present Deterministic Local Interpretable Model-Agnostic Explanations (DLIME), where the target is to generate consistent explanations for a test instance. Figure 2 shows the block diagram of DLIME.

Figure 2: A block diagram of the DLIME framework (components: input data, hierarchical clustering, test instance, cluster label prediction using KNN, cluster selection, weighting of the new samples, feature selection, training of a weighted interpretable model, and an interpretable representation presented to the user).

The key idea behind DLIME is to utilize HC to partition the training dataset into different clusters. Then, to generate a set of samples and corresponding predictions (similar to LIME), instead of random perturbation, KNN is first used to find the closest neighbors to the test instance. The samples with the majority cluster label among the closest neighbors are used as the set of samples to train the linear regression model that generates the explanations. The different components of the proposed method are explained further below.

3.1.1 Hierarchical Clustering (HC). HC is one of the most widely used unsupervised ML approaches; it generates a binary tree with cluster memberships from a set of data points. The leaves of the tree represent data points, and the nodes represent nested clusters of different sizes. There are two main approaches to hierarchical clustering: divisive and agglomerative clustering. Agglomerative clustering follows a bottom-up approach that iteratively merges the most similar clusters, while divisive clustering follows a top-down approach that recursively splits clusters. This study uses the traditional agglomerative approach for HC as discussed in [7]. Initially it considers every data point as a cluster, i.e. it starts with N clusters, and merges the most similar groups iteratively until all groups belong to one cluster. HC uses the Euclidean distance between closest data points or cluster means to compute the similarity or dissimilarity of neighbouring clusters [12].
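A minimal sketch of this deterministic agglomerative step is shown below, assuming the standard SciPy implementation. The paper states that Euclidean distance is used; the specific linkage strategy ('average') and the placeholder data are our assumptions.

```python
# Deterministic agglomerative clustering of the training data with SciPy.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X_train = np.random.rand(100, 10)  # placeholder for a real training set

# Build the full merge tree bottom-up; the result is deterministic for fixed input.
Z = linkage(X_train, method='average', metric='euclidean')

# Cut the tree into C clusters (C = 2 for the binary datasets used in the paper).
cluster_labels = fcluster(Z, t=2, criterion='maxclust')
```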

One important step in using HC for DLIME is determining the appropriate number of clusters C. HC is in general visualized as a dendrogram. In a dendrogram, each horizontal line represents a merge, and its y-coordinate is the distance at which the two clusters were merged. A vertical line shows the gap between two successive merges. By cutting the dendrogram where the gap between two successive merges is largest, we can determine the value of C. As we can see in Figure 3, the largest gap between two clusters is at level 2 for the breast cancer dataset. Since all the datasets we have used for the experiments are binary, C = 2 was always the choice from the dendrogram. For multi-class problems, the value of C may change.
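The largest-gap heuristic described above can be expressed in a few lines, assuming the linkage matrix Z from the previous sketch; this is an illustrative reading of the procedure, not the authors' code.

```python
# Choose C by the largest gap between successive merge heights in the dendrogram.
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram

heights = Z[:, 2]                    # merge distances, in increasing order
gaps = np.diff(heights)              # gap between successive merges
cut_index = int(np.argmax(gaps))     # largest gap occurs after this merge
C = len(X_train) - (cut_index + 1)   # number of clusters if we cut in that gap

dendrogram(Z)                        # visual check, as in Figure 3
plt.show()
print("Suggested number of clusters C =", C)
```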

Figure 3: Dendrogram of the breast cancer dataset (leaves are training-instance indices; the vertical axis shows the merge distance, up to roughly 17500).

3.1.2 K-Nearest Neighbor (KNN). Similar to other clustering approaches, HC does not assign labels to new instances. Therefore, a KNN classifier is trained over the training dataset to find the indices of the neighbors and predict the label of a new instance. KNN is a simple classification model based on Euclidean distance [3]. It computes the distances between the training and test sets. Let x_i be an instance that belongs to a training dataset D_train of size n, where i is in the range 1, 2, ..., n. Each instance x_i has m features (x_i1, x_i2, ..., x_im). In KNN, the Euclidean distance between a new instance x and a training instance x_i is computed. After computing the distances, the indices of the k smallest distances identify the k-nearest neighbors. Finally, the test instance is assigned the majority cluster label among its k-nearest neighbors. After that, all the data points belonging to the predominant cluster are used to train a linear regression model to generate the explanations. In our experiments, k = 1 was used, as higher values of k did not make a difference in the results.
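A hedged sketch of this cluster-assignment step follows, reusing the cluster labels from the HC sketch above: a 1-NN classifier predicts which cluster a new instance belongs to, and that cluster becomes the training set for the local surrogate. The variable names are illustrative.

```python
# Assign a test instance to a cluster with 1-NN and select that cluster's data.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

knn = KNeighborsClassifier(n_neighbors=1)        # k = 1, as in the experiments
knn.fit(X_train, cluster_labels)

x = X_train[0]                                   # stand-in for a test instance
predicted_cluster = knn.predict(x.reshape(1, -1))[0]
local_data = X_train[cluster_labels == predicted_cluster]
```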

Utilizing the notations introduced above, Algorithm 1 formally presents the proposed DLIME framework.

Algorithm 1: Deterministic Local Interpretable Model-Agnostic Explanations

Input: Dataset D_train, instance x, length of explanation K

  Initialize Y ← {}
  Initialize clusters: for i in 1 ... N do C_i ← {i} end
  Initialize the set of clusters available for merging: S ← {1, ..., N}
  while clusters are available for merging do
      Pick the two most similar clusters with minimum distance d: (j, k) ← argmin_{(j,k) ∈ S} d(j, k)
      Create new cluster C_l ← C_j ∪ C_k
      Mark j and k unavailable to merge
      if C_l is not one of the initial clusters C_i, i in 1 ... N then
          Mark l as available, S ← S ∪ {l}
      end
      foreach i ∈ S do
          Update the similarity matrix by computing distance d(i, l)
      end
  end
  for i in 1 ... n do
      d(x_i, x) ← sqrt((x_i1 − x_1)^2 + · · · + (x_im − x_m)^2)
  end
  ind ← indices of the k smallest distances d(x_i, x)
  y ← majority cluster label among the instances in ind
  ns ← subset of D_train with cluster label y
  foreach instance in ns do
      Y ← pairwise distance of each instance in cluster ns to the original instance x
  end
  ω ← LinearRegression(ns, Y, K)
  return ω
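The following compact sketch ties the steps of Algorithm 1 together end to end. It is our reading of the paper, not the authors' released implementation (see the GitHub repository cited in Section 5 for that); the linkage method, the distance-based sample weights, and the function signature are assumptions.

```python
# End-to-end sketch of the DLIME procedure described in Algorithm 1.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LinearRegression

def dlime_explain(X_train, predict_fn, x, feature_names, K=5, n_clusters=2):
    # 1. Deterministically partition the training data with agglomerative HC.
    Z = linkage(X_train, method='average', metric='euclidean')
    cluster_labels = fcluster(Z, t=n_clusters, criterion='maxclust')

    # 2. Assign the test instance to a cluster with a 1-NN classifier.
    knn = KNeighborsClassifier(n_neighbors=1).fit(X_train, cluster_labels)
    cluster = knn.predict(x.reshape(1, -1))[0]
    subset = X_train[cluster_labels == cluster]

    # 3. Weight cluster members by their distance to x ("weight the new samples").
    distances = np.linalg.norm(subset - x, axis=1)
    weights = np.exp(-distances)               # kernel choice is an assumption

    # 4. Fit a linear surrogate on the opaque model's outputs for that cluster.
    y_opaque = predict_fn(subset)              # e.g. probability of the class of interest
    surrogate = LinearRegression().fit(subset, y_opaque, sample_weight=weights)

    # 5. Report the K features with the largest absolute coefficients.
    ranked = np.argsort(np.abs(surrogate.coef_))[::-1][:K]
    return [(feature_names[i], float(surrogate.coef_[i])) for i in ranked]
```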

4 EXPERIMENTS
4.1 Dataset
To conduct the experiments we have used the following three healthcare datasets from the UCI repository [6].

4.1.1 Breast Cancer. The first case study is demonstrated by using the widely adopted Breast Cancer Wisconsin (Original) dataset. The dataset consists of 699 observations and 11 features [18].


4.1.2 Liver Patients. The second case study is demonstrated by using an Indian liver patient dataset. The dataset consists of 583 observations and 11 features [24].

4.1.3 Hepatitis Patients. The third case study is demonstrated by using a hepatitis patients dataset. The dataset consists of 155 observations and 20 features, of which only 80 observations were used to conduct the experiment after removing those with missing values [5].

4.2 Opaque Models
We trained two opaque models from the scikit-learn package [22] over the three datasets: Random Forest (RF) and Neural Networks (NN), which are difficult to interpret without an explainable model.

4.2.1 Random Forest. Random Forest (RF) is a supervised machine learning algorithm that can be used for both regression and classification [2]. RF constructs many decision trees and aggregates their predictions to best fit the input data. The performance of RF depends on the correlation between trees and the strength of each individual tree: if the correlation between two trees is high, the error will also be high, while trees with low error rates make strong classifiers.

4.2.2 Neural Network. A Neural Network (NN) is a biologically inspired model, composed of a large number of highly interconnected neurons working concurrently to solve particular problems. There are different types of NNs, among which we picked the popular feed-forward artificial NN. It typically has multiple layers and is trained with backpropagation. The particular NN we utilized has two hidden layers: the first hidden layer has 5 hidden units and the second hidden layer has 2 hidden units.

For both opaque models, 80% of the data is used for training and the remaining 20% is used for evaluation. It is worth mentioning that both the NN and RF models scored over 90% accuracy on each dataset, which is reasonable. Their performance could be improved by hyperparameter tuning; however, our aim is to produce deterministic explanations, therefore we have not spent additional effort to further tune these models.
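A hedged sketch of this setup is shown below: an 80/20 split and the two opaque scikit-learn models (RF and a feed-forward NN with hidden layers of 5 and 2 units). scikit-learn's built-in breast cancer data is used as a convenient stand-in; the paper uses the UCI versions of the datasets, and the remaining hyperparameters here are assumptions.

```python
# Train the two opaque models on an 80/20 split (stand-in data from scikit-learn).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

data = load_breast_cancer()
feature_names = list(data.feature_names)
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=0)

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
nn = MLPClassifier(hidden_layer_sizes=(5, 2), max_iter=2000,
                   random_state=0).fit(X_train, y_train)

print("RF accuracy:", rf.score(X_test, y_test))
print("NN accuracy:", nn.score(X_test, y_test))
```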

After training these opaque models, both LIME (default Python implementation) and DLIME were used to generate explanations for a randomly selected test instance. For the same instance, 10 iterations of both algorithms were executed to determine stability.
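A hedged sketch of this stability protocol follows: explanations are generated 10 times for the same test instance and only the selected feature names are kept for comparison. It assumes the `lime` Python package and the `dlime_explain` and model sketches shown earlier; the exact explainer settings are assumptions.

```python
# Repeat explanation generation 10 times and collect the selected feature sets.
from lime.lime_tabular import LimeTabularExplainer

x = X_test[0]                                      # one selected test instance
predict_fn = lambda s: nn.predict_proba(s)[:, 1]   # probability of the positive class

lime_explainer = LimeTabularExplainer(X_train, feature_names=feature_names)

lime_runs, dlime_runs = [], []
for _ in range(10):
    exp = lime_explainer.explain_instance(x, nn.predict_proba, num_features=10)
    lime_runs.append({name for name, _ in exp.as_list()})
    dlime_runs.append({name for name, _ in
                       dlime_explain(X_train, predict_fn, x, feature_names, K=10)})
```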

4.3 Results
Fig. 4 shows the results of two iterations of explanations generated by DLIME and LIME for a randomly selected test instance with the trained NN on the breast cancer dataset. On the left hand side, Fig. 4 (a) and Fig. 4 (c) are the explanations generated by DLIME, and on the right hand side, Fig. 4 (b) and Fig. 4 (d) are the explanations generated by LIME. The red bars in Fig. 4 (a) show the negative coefficients and the green bars show the positive coefficients of the linear regression model. Positive coefficients indicate a positive correlation between the dependent and independent attributes, while negative coefficients indicate a negative correlation.

As we can see, LIME produces different explanations for the same test instance. The yellow highlighted attributes in Fig. 4 (b) are different from those in Fig. 4 (d). On the other hand, the explanations generated with DLIME are deterministic and stable, as shown in Fig. 4 (a) and Fig. 4 (c). Here, it is worth mentioning that the DLIME and LIME frameworks may generate different explanations. LIME uses 5000 randomly perturbed data points around an instance to generate the explanations. On the other hand, DLIME uses only clusters from the original dataset. For the DLIME linear regression, the dataset has fewer data points, since it is limited by the size of the cluster. Therefore, the explanations generated with LIME and DLIME can be different. However, DLIME explanations are stable, while those generated by LIME are not.

Table 1: Average J_distance after 10 iterations

Dataset              Opaque Model   DLIME   LIME
Breast Cancer        RF             0        9.43%
Breast Cancer        NN             0       57.95%
Liver Patients       RF             0       17.87%
Liver Patients       NN             0       55.00%
Hepatitis Patients   RF             0       16.46%
Hepatitis Patients   NN             0       39.04%

To further quantify the stability of the explanations, we have used the Jaccard coefficient. The Jaccard coefficient is a similarity and diversity measure between finite sets. It computes the similarity between two sets as the number of elements in their intersection divided by the number of elements in their union. The mathematical notation is given in equation 1.

J(S1, S2) = |S1 ∩ S2| / |S1 ∪ S2|     (1)

where S1 and S2 are the two sets of explanations. J(S1, S2) = 1 means S1 and S2 are highly similar sets, while J(S1, S2) = 0 when |S1 ∩ S2| = 0, implying that S1 and S2 are completely dissimilar.

Based on the Jaccard coefficient, the Jaccard distance is defined in equation 2.

J_distance = 1 − J(S1, S2)     (2)

To quantify the stability of DLIME and LIME, J_distance is computed among the explanations generated over 10 iterations. Fig. 4 (e) and Fig. 4 (f) show the J_distance as a 10 × 10 matrix. The diagonal of this matrix is 0, which is the J_distance of each explanation with itself, while the lower and upper triangles show the J_distance between different explanations (both triangles carry the same information). It can be observed that the J_distance in Fig. 4 (e) is 0 everywhere, which means the explanations generated with DLIME are deterministic and stable on each iteration. For LIME, however, the J_distance contains significant values, further demonstrating the instability of LIME.
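A hedged sketch of this stability measure (equations 1 and 2) is given below, applied to the feature sets collected in the earlier snippet to produce the 10 × 10 distance matrices of Fig. 4 (e) and (f) and an average as in Table 1; averaging over the upper triangle is our assumption about how the mean is taken.

```python
# Pairwise Jaccard distance between explanation feature sets, and its average.
import numpy as np

def jaccard_distance(s1, s2):
    """J_distance = 1 - |S1 ∩ S2| / |S1 ∪ S2| for two sets of selected features."""
    return 1.0 - len(s1 & s2) / len(s1 | s2)

def distance_matrix(runs):
    n = len(runs)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            D[i, j] = jaccard_distance(runs[i], runs[j])
    return D

def average_pairwise(D):
    # Mean over distinct pairs (upper triangle, excluding the zero diagonal).
    return D[np.triu_indices(D.shape[0], k=1)].mean()

print("DLIME average J_distance:", average_pairwise(distance_matrix(dlime_runs)))
print("LIME  average J_distance:", average_pairwise(distance_matrix(lime_runs)))
```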

Finally, Table 1 lists the average J_distance obtained for DLIME and LIME utilizing the two opaque models on the three datasets (10 iterations for one randomly selected test instance). As we can see, in every scenario the J_distance for DLIME is zero, while for LIME it contains significant values, demonstrating the stability of DLIME when compared with LIME.

Figure 4: Explanations generated by DLIME and LIME, and respective Jaccard distances over 10 iterations. Panels: (a) Iteration 1: explanations generated with DLIME for NN; (b) Iteration 1: explanations generated with LIME for NN; (c) Iteration 2: explanations generated with DLIME for NN; (d) Iteration 2: explanations generated with LIME for NN; (e) Jaccard distance of DLIME explanations; (f) Jaccard distance of LIME explanations. Each explanation panel shows the local explanation for the class "malignant".

5 CONCLUSION
In this paper, we propose a deterministic approach to explain the decisions of black box models. Instead of random perturbation, DLIME uses HC to group similar data points into local regions, and utilizes KNN to find the cluster of data points that is most similar to a test instance. Therefore, it can produce deterministic explanations for a single instance. On the other hand, LIME and other similar model agnostic approaches based on random perturbation may keep changing their explanations on each iteration, creating distrust, particularly in the medical domain where consistency is highly required. We have evaluated DLIME on three healthcare datasets. The experiments clearly demonstrate that the explanations generated with DLIME are stable on each iteration, while LIME generates unstable explanations.

Since DLIME depends on hierarchical clustering to find similar data points, the number of samples in a dataset may affect the quality of the clusters and, consequently, the accuracy of the local predictions. In future work, we plan to investigate how to solve this issue while keeping the model explanations stable. We also plan to experiment with other data types, such as images and text.

In keeping with the spirit of reproducible research, all datasets and source code can be accessed through the GitHub repository https://github.com/rehmanzafar/dlime_experiments.git.

REFERENCES
[1] David Baehrens, Timon Schroeter, Stefan Harmeling, Motoaki Kawanabe, Katja Hansen, and Klaus-Robert Müller. 2010. How to explain individual classification decisions. Journal of Machine Learning Research 11, Jun (2010), 1803–1831.
[2] Gérard Biau and Erwan Scornet. 2016. A random forest guided tour. Test 25, 2 (2016), 197–227.
[3] T. Cover and P. Hart. 1967. Nearest neighbor pattern classification. IEEE Transactions on Information Theory 13, 1 (January 1967), 21–27. https://doi.org/10.1109/TIT.1967.1053964
[4] Piotr Dabkowski and Yarin Gal. 2017. Real Time Image Saliency for Black Box Classifiers. In Advances in Neural Information Processing Systems 30, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.). Curran Associates, Inc., 6967–6976.
[5] Persi Diaconis and Bradley Efron. 1983. Computer-intensive methods in statistics. Scientific American 248, 5 (1983), 116–131.
[6] Dheeru Dua and Casey Graff. 2017. UCI Machine Learning Repository. http://archive.ics.uci.edu/ml
[7] Richard O Duda, Peter E Hart, et al. 1973. Pattern classification and scene analysis. Vol. 3. Wiley, New York.
[8] Ruth C. Fong and Andrea Vedaldi. 2017. Interpretable Explanations of Black Boxes by Meaningful Perturbation. In The IEEE International Conference on Computer Vision (ICCV).
[9] Alicja Gosiewska and Przemyslaw Biecek. 2019. iBreakDown: Uncertainty of Model Explanations for Non-additive Predictive Models. arXiv preprint arXiv:1903.11420 (2019).
[10] Riccardo Guidotti and Salvatore Ruggieri. 2018. Assessing the Stability of Interpretable Models. arXiv preprint arXiv:1810.09352 (2018).
[11] Patrick Hall, Navdeep Gill, Megan Kurka, and Wen Phan. 2017. Machine Learning Interpretability with H2O Driverless AI.
[12] Katherine A Heller and Zoubin Ghahramani. 2005. Bayesian hierarchical clustering. In Proceedings of the 22nd International Conference on Machine Learning. ACM, 297–304.
[13] Linwei Hu, Jie Chen, Vijayan N Nair, and Agus Sudjianto. 2018. Locally interpretable models and effects based on supervised partitioning (LIME-SUP). arXiv preprint arXiv:1806.00663 (2018).
[14] Alexandros Kalousis, Julien Prados, and Melanie Hilario. 2007. Stability of Feature Selection Algorithms: A Study on High-dimensional Spaces. Knowledge and Information Systems 12, 1 (May 2007), 95–116.
[15] Gajendra Jung Katuwal and Robert Chen. 2016. Machine learning model interpretability for precision medicine. arXiv preprint arXiv:1610.09045 (2016).
[16] Jing Lei, Max G'Sell, Alessandro Rinaldo, Ryan J Tibshirani, and Larry Wasserman. 2018. Distribution-free predictive inference for regression. J. Amer. Statist. Assoc. 113, 523 (2018), 1094–1111.
[17] Scott M Lundberg and Su-In Lee. 2017. A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems 30, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.). Curran Associates, Inc., 4765–4774.
[18] Olvi L Mangasarian, W Nick Street, and William H Wolberg. 1995. Breast cancer diagnosis and prognosis via linear programming. Operations Research 43, 4 (1995), 570–577.
[19] Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze. 2008. Introduction to Information Retrieval. Cambridge University Press, New York, NY, USA.
[20] Christoph Molnar. 2019. Interpretable Machine Learning. Online. https://christophm.github.io/interpretable-ml-book/.
[21] Sarah Nogueira and Gavin Brown. 2016. Measuring the Stability of Feature Selection. In European Conference Proceedings, Part I, ECML PKDD 2016, Riva del Garda, Italy, September 19-23, 2016 (Lecture Notes in Artificial Intelligence). Springer, 442–457.
[22] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. 2011. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research 12 (2011), 2825–2830.
[23] Gregory Plumb, Denali Molitor, and Ameet S Talwalkar. 2018. Model Agnostic Supervised Local Explanations. In Advances in Neural Information Processing Systems. 2520–2529.
[24] Bendi Venkata Ramana, M Surendra Prasad Babu, and N B Venkateswarlu. 2011. A critical study of selected classification algorithms for liver disease diagnosis. International Journal of Database Management Systems 3, 2 (2011), 101–114.
[25] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. "Why should I trust you?": Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1135–1144.
[26] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2018. Anchors: High-precision model-agnostic explanations. In Thirty-Second AAAI Conference on Artificial Intelligence.
[27] M. Robnik-Šikonja and I. Kononenko. 2008. Explaining Classifications For Individual Instances. IEEE Transactions on Knowledge and Data Engineering 20, 5 (May 2008), 589–600.
[28] Andrew Slavin Ross, Michael C. Hughes, and Finale Doshi-Velez. 2017. Right for the Right Reasons: Training Differentiable Models by Constraining their Explanations. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-17). 2662–2670. https://doi.org/10.24963/ijcai.2017/371
[29] Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. 2013. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps. CoRR abs/1312.6034 (2013).
[30] Mukund Sundararajan, Ankur Taly, and Qiqi Yan. 2017. Axiomatic Attribution for Deep Networks. In Proceedings of the 34th International Conference on Machine Learning (ICML '17). JMLR.org, 3319–3328.
[31] Matthew D. Zeiler and Rob Fergus. 2014. Visualizing and Understanding Convolutional Networks. In Computer Vision – ECCV 2014, David Fleet, Tomas Pajdla, Bernt Schiele, and Tinne Tuytelaars (Eds.). Springer International Publishing, 818–833.
[32] Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. 2016. Learning Deep Features for Discriminative Localization. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).