
Page 1:

AI Ethics and Trust: Explainable AI

Part I – overview; Part II – case study of the Random Forest Explainability (RFEX) method

Prof. D. Petkovic https://cs.sfsu.edu/people/faculty/dragutin-petkovic

with help from A. Alavi, D. Cai, S. Barlaskar, J. Yang

San Francisco State University, USA April 2020

04/25/20 FINAL

Copyright: Dragutin Petkovic unless noted otherwise

Page 2:

About the speaker

• Prof. D. Petkovic obtained his Ph.D. at UC Irvine, in the area of biomedical image processing. He spent over 15 years at IBM Almaden Research Center as a scientist and in various management roles. His contributions ranged from the use of computer vision for inspection to multimedia and content management systems. He is the founder of IBM's well-known QBIC (query by image content) project, which significantly influenced the content-based retrieval field. Dr. Petkovic received numerous IBM awards for his work and became an IEEE Fellow in 1998 and IEEE Life Fellow in 2018 for leadership in the content-based retrieval area. Dr. Petkovic also held various technical management roles in Silicon Valley startups. In 2003 Dr. Petkovic joined the SFSU Computer Science Department as Chair, and in 2005 he founded the SFSU Center for Computing for Life Sciences. Currently, Dr. Petkovic is the Associate Chair of the SFSU Department of Computer Science and Director of the Center for Computing for Life Sciences. He led the establishment of the SFSU Graduate Certificate in AI Ethics, jointly with the SFSU Schools of Business and Philosophy. Prof. Petkovic's research and teaching interests include Machine Learning with an emphasis on Explainability and Ethics, teaching methods for global SW engineering and engineering teamwork, and the design and development of easy-to-use systems.

Page 3:

Outline

Part I
• Motivation – why AI Explainability
• Example cases where AI Explainability was important
• Governance, legal, societal and ethical issues related to AI Explainability
• Some definitions: Explainability vs. Transparency; causality vs. correlation; Model and Sample explainability
• User driven approach to AI Explainability
• What has been done today in Classic AI and Deep Learning CNN explainability
– Brief overview of the well known LIME explainer (Local Interpretable Model-Agnostic Explanations)
– Briefly on explainers for deep learning and CNN
• Some thoughts for the future

Part II
• Case study: SFSU Random Forest Explainability method – RFEX (Petkovic et al.)

Page 4:

AI will have a wide impact on society: huge investments, aggressive R&D and deployments

• Drug development, medical diagnostics, health care delivery

• Autonomous cars

• Loan and credit approvals

• Recruiting

• Rental approval

• Policing and crime prevention

• News and information filtering

• Military applications

• Control of society?

It is happening FAST (too fast?)

Page 5:

BUT ….

Page 6:

Case 1: Wrong decisions in critical medical application

• An AI system was trained to predict which patients with pneumonia must be kept in the ER

• It predicted well, except in a critical case: patients with asthma

• Reason: the training DB did not contain the right data – patients with asthma had been taken care of well by other interventions not recorded in the training DB

– “Can AI be Taught to Explain Itself”, NY Times, November 21 2017

Page 7:

Case 2: “Leakage” in Training Data

• An AI system correctly predicted patient disease based on a number of measurements and tests. All good, right?

• The AI actually used (encoded) info in the training database reflecting which hospital the patients were taken to, which implicitly contained info about what disease they had

• A production system would fail
– S. Kaufman, S. Rosset, C. Perlich: "Leakage in Data Mining: Formulation, Detection, and Avoidance", ACM Transactions on Knowledge Discovery from Data 6(4):1-21, December 2012

Page 8:

12/19/19

Case 3: AI can be biased!

But face recognition is intended to help policing and law enforcement

Page 9:

Case 4: Even worse: AI can be fooled…

Face recognition

Street signs modified at strategic places

Arresting the wrong person?

Page 10:

Case 5: AI is not there yet in the health area – Scientific American, December 2019

Page 11:

Case 5: “Artificial Intelligence Is Rushing Into Patient Care—And Could Raise Risks” – Scientific American

12/24/19

• "Systems developed in one hospital often flop when deployed in a different facility," Cho said. Software used in the care of millions of Americans has been shown to discriminate against minorities. And AI systems sometimes learn to make predictions based on factors that have less to do with disease than the brand of MRI machine used, the time a blood test is taken or whether a patient was visited by a chaplain. In one case, AI software incorrectly concluded that people with pneumonia were less likely to die if they had asthma, an error that could have led doctors to deprive asthma patients of the extra care they need."

• “Doctors at New York’s Mount Sinai Hospital hoped AI could help them use chest X-rays to predict which patients were at high risk of pneumonia. Although the system made accurate predictions from X-rays shot at Mount Sinai, the technology flopped when tested on images taken at other hospitals. Eventually, researchers realized the computer had merely learned to tell the difference between that hospital’s portable chest X-rays—taken at a patient’s bedside—with those taken in the radiology department. Doctors tend to use portable chest X-rays for patients too sick to leave their room, so it’s not surprising that these patients had a greater risk of lung infection.”

Page 12:

Awareness and concerns are growing in the general public too. The genie is out! (i.e., it is not only up to scientists any more)

Future AI: Terminator (evil) or Star Trek (a force for good)?

Page 13:

These concerns raised awareness and gave birth to AI Ethics

• Emergence of AI Ethics as a major topic (academia, industry, government)

• AI Explainability is integral component of AI Ethics

• Discovery and analysis of problems in all presented case studies required application of AI explainability

– NOTE: Most produced seemingly accurate classifications; only upon applying explainability were the problems found

Page 14:

Principles of "ethical" AI Development and Deployment are Emerging at the Highest Levels

• New EU General Data Protection Regulation (GDPR), effective May 2018 – includes strong data privacy and a "right to know" how algorithms work (Recital 71)
– https://www.privacy-regulation.eu/en/r71.htm
• Asilomar 23 AI Principles, adopted by the CA legislature
– https://futureoflife.org/ai-principles/
• G20 AI Principles (OECD – Organization for Economic Co-operation and Development)
– https://www.oecd.org/going-digital/ai/principles/
• ACM Policy on Algorithmic Transparency and Accountability
– https://www.acm.org/binaries/content/assets/public-policy/2017_usacm_statement_algorithms.pdf

Page 15:

Also in industry (in US…)

• Google Responsible AI Practices

– https://ai.google/responsibilities/responsible-ai-practices/

• Microsoft AI Principles

– https://www.microsoft.com/en-us/AI/our-approach-to-ai

• Facebook investing in AI Ethics Institute in Munich
– https://interestingengineering.com/facebook-invests-75-million-to-launch-ai-ethics-institute

Page 16:

Explainability and transparency are part of all AI Ethics policies and recommendations

Page 17:

GDPR – EU (Recital 71)

About automated decisions:

"In any case, such processing should be subject to suitable safeguards, which should include specific information to the data subject and the right to obtain human intervention, to express his or her point of view, to obtain an explanation of the decision reached after such assessment and to challenge the decision."

Page 18:

Asilomar AI Principles – Ethics and Values

• Safety:

• Failure Transparency: If an AI system causes harm, it should be possible to ascertain why.

• Judicial Transparency: Any involvement by an autonomous system in judicial decision-making should provide a satisfactory explanation auditable by a competent human authority.

• Responsibility:

• Value Alignment:

• Human Values:

• Personal Privacy:

• Liberty and Privacy:

• Shared Benefit:

• Shared Prosperity:

• Human Control:

• Non-subversion:

• AI Arms Race:

Page 19:

3 of the 5 important OECD AI principles

• ETHICS AND CONTROL: AI systems should be designed in a way that respects the rule of law, human rights, democratic values and diversity, and they should include appropriate safeguards – for example, enabling human intervention where necessary – to ensure a fair and just society.

• EXPLAINABILITY/TRANSPARENCY: There should be transparency and responsible disclosure around AI systems to ensure that people understand AI-based outcomes and can challenge them.

• SAFETY/SECURITY: AI systems must function in a robust, secure and safe way throughout their life cycles and potential risks should be continually assessed and managed.

Page 20:

ACM Principles for Algorithmic Transparency and Accountability

• Awareness:

• Access and redress:

• Accountability:

• Explanation: Systems and institutions that use algorithmic decision-making are encouraged to produce explanations regarding both the procedures followed by the algorithm and the specific decisions that are made. This is particularly important in public policy contexts.

• Data Provenance: A description of the way in which the training data was collected should be maintained by the builders of the algorithms, accompanied by an exploration of the potential biases induced by the human or algorithmic data-gathering process. Public scrutiny of the data provides maximum opportunity for corrections. However, concerns over privacy, protecting trade secrets, or revelation of analytics that might allow malicious actors to game the system can justify restricting access to qualified and authorized individuals.

• Auditability:

• Validation and Testing


Data provenance is also critical

Page 21:

Microsoft AI Principles

• Fairness - AI systems should treat all people fairly

• Inclusiveness - AI systems should empower everyone and engage people

• Reliability & Safety - AI systems should perform reliably and safely

• Transparency - AI systems should be understandable

• Privacy & Security - AI systems should be secure and respect privacy

• Accountability - AI systems should have algorithmic accountability

Page 22:

Let us review a use case

Page 23:

In the business world: Mary is deciding whether to adopt an AI-based diagnostic method

Page 24:

Mary has to make a decision based on the current state of the art of presenting AI accuracy evaluation data:
• AI algorithm used
• Info about the training DB
• Information about the specific SW used
• Accuracy and the methods used to estimate it

Mary's decision is critical for patients' well-being and for the company. Mary has legal responsibilities. She is also under pressure to make a profit.

Page 25:

Mary’s AI vendor uses state of the art methods to estimate AI accuracy:

• Algorithm used: Random Forest

• SW used: R toolkit

• Training Data: 1000000000 samples with ground truth, each with 155 features; data well balanced; no missing data

• Accuracy: Used grid search for NTREE, MTRY, CUTOFF RF parameters to achieve best F1 score of 0.9

• All good, right?
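The numbers above can hide every problem discussed later, so it helps to see what such an evaluation typically looks like in code. Below is a minimal, hypothetical sketch of this kind of parameter grid search, assuming scikit-learn in Python, binary 0/1 labels and data with at least as many features as the largest MTRY value; NTREE and MTRY map to n_estimators and max_features, and the R-style CUTOFF is emulated by thresholding predicted probabilities. This is not the vendor's actual code.

```python
# Hypothetical sketch of this kind of vendor evaluation, using scikit-learn.
# NTREE -> n_estimators, MTRY -> max_features; the R-style CUTOFF is emulated
# by thresholding predicted probabilities. Assumes binary 0/1 labels.
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

def grid_search_rf(X, y, ntree_grid=(500, 1000), mtry_grid=(5, 12, 50),
                   cutoff_grid=(0.3, 0.5, 0.7)):
    """Return (best F1, best parameter dict) over the NTREE/MTRY/CUTOFF grid."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
    best_f1, best_params = 0.0, None
    for ntree in ntree_grid:
        for mtry in mtry_grid:
            rf = RandomForestClassifier(n_estimators=ntree, max_features=mtry,
                                        random_state=0).fit(X_tr, y_tr)
            prob_pos = rf.predict_proba(X_te)[:, 1]        # probability of class 1
            for cutoff in cutoff_grid:
                f1 = f1_score(y_te, (prob_pos >= cutoff).astype(int))
                if f1 > best_f1:
                    best_f1, best_params = f1, dict(ntree=ntree, mtry=mtry, cutoff=cutoff)
    return best_f1, best_params
```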

Page 26:

Results and methods look good! TO TRUST OR NOT TO TRUST?

Page 27:

What could go wrong even if results are good?

• Bias in training database

• Algorithm Bias

• Algorithm + database interaction

• Good decisions achieved for the wrong reasons, e.g. by using "wrong" features like patient ID – this would not work on real data since those features would not be available or reliable

• Poor evaluation procedures

• SW bugs

• Business pressure to make profits

• Missing critical thinking & ethical considerations: risk / harm / implicit bias

Page 28:

AI Explainability Issues/Questions/Challenges

• What are the value and benefits of AI Explainability
• How to achieve it (hard for some AI methods like CNN)
– Develop new interpretable algorithms
– Develop explainers to explain the existing algorithms
– Best practices and processes
• It is NOT only about algorithms but more and more about DATA used to train ML
• How to leverage "user in the loop" and human intervention
• Ultimate goal: How to make AI Explainability effective and usable for users who are non-AI experts:
– Domain experts who are non-AI experts
– Decision makers, managers, executives
– Legal professionals
– Politicians
– Public

Page 29:

Benefits of AI Explainability
• Increased user trust (especially with sample explainability)
• Better testing, audit and verification of AI systems (e.g. finding what factors influence an AI decision can point to problems)
– Can be used in conjunction with black box AI to verify it and establish confidence
• Training data editing, curation and quality control (e.g. of the supervision process – assigning labels or class assignments), e.g. managing of "outliers" – key to better AI development
• Reduction of cost (e.g. invest in extracting only a small number of critical features)
• Better maintenance – changes depending on feedback from production usage
• Possibly gaining new knowledge by finding interesting patterns from explanations
– Example: D. Petkovic, S. Barlaskar, J. Yang, R. Todtenhoefer: "From Explaining How Random Forest Classifier Predicts Learning of Software Engineering Teamwork to Guidance for Educators", Frontiers in Education (FIE) 2018, October 2018, San Jose CA

Page 30:

Note on “black box” AI (one does not know how it works but it works)

• “Black box” AI solution is of value too

• Shows ultimate accuracy and may be used in parallel with explainable systems

• We have many “solutions to problems” that work although we do not know why

• Explainable version of AI system (with maybe less accuracy) may serve to validate the black box solution
– E. Holm: "In defense of black box", Science, 05 Apr 2019: Vol. 364, Issue 6435, pp. 26-27

Page 31:

So… should we…
• Develop explainers that explain current AI algorithms

OR

• Develop and use inherently explainable AI
– Excellent and provoking reading by C. Rudin, Duke Univ.: "Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead", Sept 2019
• https://arxiv.org/abs/1811.10154

Page 32:

Discussion and some thinking
• Q1: Do you think AI explainability is important?
• Q2: Would you adopt:
– a 99% accurate black box AI system
– OR a 95% accurate AI system that is explainable?
– Would you settle for an 85% accurate system that is explainable?

Try answering while considering whether:
1. You are a researcher publishing papers, OR
2. You are a business professional whose career depends on adopting viable AI systems
3. The AI system you need to decide about impacts people's lives and livelihoods
4. Your decision is legally binding
(Try to think as 2, 3 and 4)

Page 33:

Author believes explainability is very important

• Critical in fields that impact people (health, policing, law, “AI governance”..) – But maybe less so in some fields like target marketing etc.

• Recent workshops by the author (in biomedical area) point to critical importance of explainability
– Petkovic D, Kobzik L, Re C. "Machine learning and deep analytics for biocomputing: call for better explainability". Pacific Symposium on Biocomputing, Hawaii, January 2018;23:623-7.
– Petkovic D, Kobzik L, Ganaghan R: "AI Ethics and Values in Biomedicine – Technical Challenges and Solutions", Pacific Symposium on Biocomputing, Hawaii, January 3-7, 2020

Page 34:

Let us now look at some definitions

Page 35:

AI Explainability

• AI Explainability: Ability to explain how (not why) AI systems work (e.g. make their decisions)

• AI agnostic – explainer not tied to a particular AI method; OR

• AI specific – explainer tied to specific AI alg

• Direct: uses the same AI method as the original black box

• Indirect: Uses different methods (e.g. approximations, different AI alg) than original

Page 36:

AI Model vs. Sample Explainability

• AI Model explainability: gives an idea of how AI works on the totality of data (e.g. whole training database) – global measure

• AI Sample explainability: how AI decided on specific sample

– Critical for non-AI expert user trust in the system

• Dzindolet M, Peterson S, Pomranky R, Pierce L, Beck H. “The role of trust in automation reliance”. International J Human-Computer Studies. 2003;58(6):697-718.

Page 37:

Interpretability vs. Explainability (some confusion in the literature)

• Interpretability – broader concept – hard! – Can we reason and interpret cause and effect

• Explainability: ability to explain (mechanically) how the classifier made its decisions

• Reproducibility and transparency: Process of documenting and publishing ALL data used in the ML analysis so that others can reproduce it and verify

– Transparent and reproducible system is not necessarily explainable

• Correlation vs. causality – AI explainers simply look at correlations (HOW data is classified) – whatever is in the data. They cannot make causal inferences, e.g. WHY data patterns exist

Page 38:


Confusion matrix (rows = actual class, columns = predicted class):

            e1    non e1   class.error
e1          297      2       0.67%
non e1        1    571       0.17%

OOB error: 0.34%
F1: 99.5%

ntree range: 500, 1000, 10000, 100000                         Best ntree: 1000
mtry range: 1, 2, 5, 8, 10, 12, 25, 50                        Best mtry: 50
cutoff range: 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9     Best cutoff: (0.7 for + class, 0.3 for - class)

Example of ML experiment – all data recorded so the experiment is transparent and reproducible, but it is not explainable

Yang J, Petkovic D: “Application of Improved Random Forest Explainability (Rfex 2.0) on Data from JCV Institute LaJolla, California“, SFSU technical report TR 19.01, 06/16/2019
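As a quick worked check of how the reported numbers follow from the confusion matrix above (assuming "e1" is treated as the positive class):

```python
# Recomputing the reported numbers from the confusion matrix above
# (assuming "e1" is the positive class; rows = actual, columns = predicted).
tp, fn = 297, 2          # actual e1: predicted e1 / predicted non-e1
fp, tn = 1, 571          # actual non-e1: predicted e1 / predicted non-e1

e1_error = fn / (tp + fn)                          # 2/299  ~ 0.0067 -> 0.67%
non_e1_error = fp / (fp + tn)                      # 1/572  ~ 0.0017 -> 0.17%
oob_error = (fn + fp) / (tp + fn + fp + tn)        # 3/871  ~ 0.0034 -> 0.34%

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall) # ~ 0.995 -> 99.5%
print(e1_error, non_e1_error, oob_error, f1)
```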

Page 39:

Technical approaches to Explainability

Two categories of approaches:
• For traditional AI (trained algorithms on well-structured tabular data with distinct features, such as decision trees, Random Forest, Support Vector Machines, KNN etc.)
• For deep learning and convolutional neural networks (harder!!!!)
• KEY COMMON IDEA: based on what features/factors/image regions/signal segments contribute most to the AI decision. Reduce the problem space (large number of features/image regions) to a manageable smaller problem (a few key features, a few key image regions)
– Leverage some form of feature ranking/importance, or ranking of image regions w.r.t. classification power
– Apply forms of sensitivity analysis (tradeoffs of accuracy vs. features used)

Page 40:

Explaining traditional AI: Leverage some form of feature ranking (like in Random Forest)

[Figure: bar chart of importance score per feature]
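A minimal sketch of how such a ranked feature-importance list can be produced, assuming scikit-learn; `rf` (a fitted RandomForestClassifier) and `feature_names` are assumed stand-ins, not taken from the original slide.

```python
# Minimal sketch of producing a ranked feature-importance list with
# scikit-learn; `rf` (a fitted RandomForestClassifier) and `feature_names`
# are assumed to exist.
import numpy as np

def ranked_importances(rf, feature_names, top_k=10):
    """Return the top_k (feature name, importance score) pairs, highest first."""
    order = np.argsort(rf.feature_importances_)[::-1][:top_k]
    return [(feature_names[i], float(rf.feature_importances_[i])) for i in order]

# Example use (assuming rf and feature_names are defined):
# for name, score in ranked_importances(rf, feature_names):
#     print(f"{name:25s} {score:.3f}")
```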

Page 41:

Explaining CNN – leverage image salient regions (regions participating in classification)

https://dspace.library.uu.nl/handle/1874/380741

It was originally thought the AI worked well – results looked great – but upon further explanation using salient image regions it was discovered that the AI made decisions based on the background (e.g. snow), not the dog's head

Page 42:

BUT AI Explainability is not only a technical issue – it needs to be user centered

• Most of the attempts for explainability developed by AI experts for AI Experts – highly technical and complex

• They are hard to understand by the key constituency: adopters, domain experts, managers etc. who are often not AI experts (remember Mary from our use case?)

• Need “user centered approach” – ask what domain users and adopters need to know from AI system in order to trust it, adopt it, and be able to use it effectively in their profession

• Create explainability formats and interfaces that are easy to understand and familiar for target users

• Allow human oversight and intervention

Page 43:

User Centered Approach to AI Explainability – Start with Personae – who are the key users

• Who users are and why they need it

– General public affected by AI “governance” (rental control, policing, loan approvals)

– Adopters in industry, government – risk/benefit analysis, deployment decisions

– Legal, government, law enforcement: enforcement of law, rules and standards

– Also AI Developers – maintenance, QA, improvements

• Skills

– Except AI developers, mostly low to zero knowledge of AI algorithms and tools

– Basic knowledge of computing: office and basic science apps like spreadsheets

– Can interpret basic info: tables, lists, forms, very basic statistics

– Deep Domain knowledge

• Concerns, pain points

– Many fear and feel intimidated by AI

– Hard time learning how AI works

– Fear of critical decisions being based on something they do not understand

– Lack of trust

Page 44:

NEEDED: User driven design of AI Explainability – what would AI users and adopters need from AI explainability?

• Transparency and reproducibility: full information about ALL setups, parameters and data in the AI pipeline

• Full information about the database used for training and testing: complete understanding of the data used (statistics, demography, errors, biases, ground truth…)

• Answers to a number of basic explainability questions

• Ease of use: All the data needs to be presented in formats easy to understand and act upon and familiar for intended users (tables, simple graphs, lists)

Page 45:

Example explainers for traditional AI systems

• LIME: Local Interpretable Model-Agnostic Explanations (also works on image classification explanation using extracted features from image regions)

– Ribeiro M, Singh S, Guestrin C. "Why Should I Trust You? Explaining the Predictions of Any Classifier”, arXiv. 2016;arXiv:1602.04938.

– Ribeiro M, Singh S, Guestrin C.:”Nothing Else Matters: Model-Agnostic Explanations by Identifying Prediction Invariance”, 30th Conf. of Neural Information Processing Systems (NIPS 2016), Barcelona, Spain 2016

– https://www.oreilly.com/content/introduction-to-local-interpretable-model-agnostic-explanations-lime/

• RFEX 2.0 – Random Forest Explainer – (Part II of the lecture)

– D. Petkovic, A. Alavi, D. Cai, J. Yang, S. Barlaskar: “RFEX – Simple random Forest Model and Sample Explainer for non-ML experts”, Biorxiv https://www.biorxiv.org/content/10.1101/819078v1; posted 10/25/19

– Barlaskar S, Petkovic D: "Applying Improved Random Forest Explainability (RFEX 2.0) on synthetic data", SFSU TR 18.01, 11/27/2018; with related toolkit at https://www.youtube.com/watch?v=neSVxbxxiCE

– Petkovic D, Altman R, Wong M, Vigil A.: "Improving the explainability of Random Forest classifier - user centered approach". Pacific Symposium on Biocomputing. 2018;23:204-15. (RFEX 1.0)

Page 46:

LIME: Local Interpretable Model-Agnostic Explanations

• LIME is an agnostic explainer (works on all classifiers, including images/signals/text, provided basic features are extracted)

• It is an indirect explainer: it explains an unknown black box AI by approximating it locally with a linear classifier

• Original LIME is a Sample explainer – explains the AI model only for local regions around samples chosen by the user
– Ribeiro M, Singh S, Guestrin C. "Why Should I Trust You? Explaining the Predictions of Any Classifier", arXiv. 2016;arXiv:1602.04938.

• Global (Model) explanations attempted by aLIME (anchor LIME)
– Ribeiro M, Singh S, Guestrin C.: "Nothing Else Matters: Model-Agnostic Explanations by Identifying Prediction Invariance", 30th Conf. on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain, 2016

• Developed at the CS department, University of Washington
• Toolkit available, well known and well documented
– https://github.com/marcotcr/lime/blob/master/README.md

Page 47:

LIME – how it works (high level)

• The original black box AI works on features which may not be interpretable (complex functions based on pixels, words, signals)
• LIME uses an interpretable representation instead of the original features
– For text: a binary vector of desired length indicating presence (1) or absence (0) of words
– For images: a binary vector of desired length indicating presence or absence of patches of pixels (super-pixels)
• Interpretable representations of samples around test sample X are perturbed by random choice of components in the interpretable representation, forming a training set Z for a local linear classifier (weighted by proximity to X)
– The original AI model is used as a black box to get a prediction for each perturbed interpretable sample as ground truth – the sample class label
• A linear classifier is used to find the best classification of the set Z (perturbed interpretable samples) around X; the most important features of this linear classifier are LIME's explanations of the original AI model (a from-scratch illustrative sketch follows below)
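The sketch referenced above: an illustrative from-scratch version of the loop just described, not the official LIME implementation. It perturbs a binary interpretable representation of sample X, queries the black box, and fits a proximity-weighted linear model whose largest coefficients act as the local explanation; for simplicity it assumes the black box can score the interpretable representation directly.

```python
# Illustrative from-scratch sketch of the LIME idea (NOT the official package).
# x_interp: binary interpretable representation of sample X.
# predict_proba: the black-box model; assumed to accept the interpretable
# representation directly (a simplification).
import numpy as np
from sklearn.linear_model import Ridge

def lime_like_explain(x_interp, predict_proba, n_samples=1000, kernel_width=0.75):
    x_interp = np.asarray(x_interp)
    d = len(x_interp)
    Z = np.random.randint(0, 2, size=(n_samples, d))      # random on/off perturbations
    Z[0] = x_interp                                        # keep the original sample
    y = predict_proba(Z)[:, 1]                             # black-box probability of class 1
    dist = np.linalg.norm(Z - x_interp, axis=1) / np.sqrt(d)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)     # closer samples weigh more
    local = Ridge(alpha=1.0).fit(Z, y, sample_weight=weights)
    # Largest-magnitude coefficients = most important interpretable features for X
    return sorted(enumerate(local.coef_), key=lambda t: abs(t[1]), reverse=True)
```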

Page 48:


LIME – intuitive simple explanation
https://www.oreilly.com/content/introduction-to-local-interpretable-model-agnostic-explanations-lime/

[Figure] The original AI black box system has complex decision surfaces; X is the sample being explained.
LIME creates a set of interpretable samples Z in the vicinity of X, derived by random perturbation of the presence or absence of interpretable features, each getting a prediction from the original AI model.
LIME uses a linear classifier W around X to classify the samples Z (weighted by vicinity to X).
The features with the most discriminatory power in this linear classifier (e.g. those with the largest coefficients in W) are LIME's local explanations of the model for sample X.
This is in effect random exploration in the vicinity of X.

Page 49:

LIME example on explaining text classification done by Random Forest

• Data from text benchmark 20 newsgroups dataset – (http://qwone.com/~jason/20Newsgroups/)

• LIME task: Explain Random Forest (RF) classification of text into topic of Christianity vs. Atheism – RF achieved accuracy of 92.4%

• RF achieved high accuracy, but was it due to the right reasons? Need explainability to check this

• Interpretable representation = binary vector of length K denoting presence or absence of certain words 1= presence; 0 = absence.

• Test sample interpretable representation of K words is randomly perturbed by omitting some words to form training set Z for local linear classifier… class label obtained by original black box model

• https://www.oreilly.com/content/introduction-to-local-interpretable-model-agnostic-explanations-lime/
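A hedged sketch of this text experiment using the `lime` Python package and scikit-learn, broadly following the O'Reilly tutorial cited above; the vectorizer and parameters are assumptions and may differ from the original setup.

```python
# Hedged sketch: RF on the Christianity vs. Atheism subset of 20 newsgroups,
# explained with LIME's text explainer.
from sklearn.datasets import fetch_20newsgroups
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from lime.lime_text import LimeTextExplainer

cats = ['alt.atheism', 'soc.religion.christian']
train = fetch_20newsgroups(subset='train', categories=cats)
test = fetch_20newsgroups(subset='test', categories=cats)

pipe = make_pipeline(TfidfVectorizer(lowercase=False),
                     RandomForestClassifier(n_estimators=500, random_state=0))
pipe.fit(train.data, train.target)

explainer = LimeTextExplainer(class_names=['atheism', 'christian'])
exp = explainer.explain_instance(test.data[0], pipe.predict_proba, num_features=6)
print(exp.as_list())   # words that pushed this particular prediction
```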

Page 50:


Test instance (e-mail) predicted correctly BUT for the wrong reason https://www.oreilly.com/content/introduction-to-local-interpretable-model-agnostic-explanations-lime/

Turns out that the word "Posting" (in email header) appears in 21.6% of the examples in the training set, but only two times in the class 'Christianity'. Similarly in test set. Hence, RF classifier predicted “Atheism” which is statistically correct but in fact wrong.

LIME shows best local predictors for the test sample e-mail

This would NOT work on real data!!!! Explainability is critical even if classifier decisions seem correct; also make sure the training DB is reflective of the problem

Page 51:

LIME example on explaining images classified by Google Inception network

• Explaining the Google Inception network for interpreting images. Used test images as samples
– Google Inception network: "Going Deeper with Convolutions", CVPR 2015, Christian Szegedy et al.

• LIME bases its explanations on parts of images or super-pixels (NOT on raw images) that are most descriptive of a certain image class (dog, guitar…)

• Interpretable representation = presence or absence (grayed out) of certain image patches (super-pixels)

• Test samples Z = randomly perturbed test samples formed by omitting certain super-pixels, which form the training set for the local linear classifier

• https://www.oreilly.com/content/introduction-to-local-interpretable-model-agnostic-explanations-lime/
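A hedged sketch of image explanation with the `lime` package; `predict_fn` and the random `image` below are placeholders standing in for the real Inception pipeline (preprocessing plus prediction) and a real photo, not the original experiment's code.

```python
# Hedged sketch with the `lime` image explainer; predict_fn and image are
# placeholders, not the original Inception pipeline.
import numpy as np
from lime import lime_image
from skimage.segmentation import mark_boundaries

def predict_fn(images):
    # Placeholder: the real function would preprocess and run the CNN,
    # returning an (n_images, n_classes) array of class probabilities.
    return np.ones((len(images), 1000)) / 1000.0

image = np.random.rand(224, 224, 3)          # stand-in for a real RGB test image

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(image, predict_fn, top_labels=5,
                                         hide_color=0, num_samples=100)
img, mask = explanation.get_image_and_mask(explanation.top_labels[0],
                                           positive_only=True, num_features=5,
                                           hide_rest=True)
overlay = mark_boundaries(img, mask)         # super-pixels that drove the prediction
```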

Page 52:


LIME explanation of image classification https://www.oreilly.com/content/introduction-to-local-interpretable-model-agnostic-explanations-lime/

Super-pixels

Wrong classification, but pretty close, using the correct image super-pixels. Knowing this builds user trust.

Page 53:

LIME explanation of image classification (2) https://www.oreilly.com/content/introduction-to-local-interpretable-model-agnostic-explanations-lime

[Figure: explaining a "frog" classification – super-pixel perturbations form the set Z, class probabilities/labels are obtained from the original black box model, and a local linear model is fit]

Page 54:

More on LIME for global predictions

• Technical details on the original LIME in:
– Ribeiro M, Singh S, Guestrin C. "Why Should I Trust You? Explaining the Predictions of Any Classifier", arXiv. 2016;arXiv:1602.04938.
• BUT LIME only explains local decisions around test samples
– So how many samples do I need to explain the whole AI model globally?
• To answer the above, the LIME authors developed aLIME – "anchored LIME" – which:
– outputs IF-THEN-ELSE rules as predictors
– identifies anchor rules which stay relevant for most predictions (with high probability), helping explain the AI model globally
– Ribeiro M, Singh S, Guestrin C.: "Nothing Else Matters: Model-Agnostic Explanations by Identifying Prediction Invariance", 30th Conf. on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain, 2016

Page 55:

Deep Learning and Convolutional Neural Networks (CNN) explainability

• Hugely popular and powerful

BUT

• Very hard to explain

• Some good readings
– W. Samek et al: "Explainable Artificial Intelligence: Understanding, Visualizing and Interpreting Deep Learning Models", ITU Journal ICT Discoveries, Special Issue No. 1, 13 Oct 2017
– Q. Zhang et al: "Visual Interpretability for Deep Learning: a Survey", 2018, https://arxiv.org/abs/1802.00614
– TechTalks: "Explainable AI: Interpreting the neuron soup of deep learning", Oct 2018, https://bdtechtalks.com/2018/10/15/kate-saenko-explainable-ai-deep-learning-rise/

Page 56:

Convolutional Neural Networks

https://towardsdatascience.com/traffic-sign-detection-using-convolutional-neural-network-660fb32fe90e

ReLU refers to the Rectified Linear Unit, a commonly deployed activation function for the outputs of CNN neurons: ReLU(x) = max(0, x) (clip negative values) – used to detect basic features. The pooling layer is usually placed after the convolutional layer to reduce the spatial dimensions (i.e. it merges more pixels into fewer).

[Figure: CNN architecture diagram – many stacked layers leading to the final decisions]
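A minimal PyTorch sketch of the convolution, ReLU and pooling pattern described above (a toy illustrative architecture, not the network from the linked article):

```python
# Minimal PyTorch sketch of the conv -> ReLU -> pooling pattern (toy example).
import torch
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # detect basic local features
    nn.ReLU(),                                   # ReLU(x) = max(0, x): clip negative values
    nn.MaxPool2d(2),                             # pooling: merge pixels, halve spatial size
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 10),                   # final decision over 10 classes
)

scores = cnn(torch.randn(1, 3, 32, 32))          # one 32x32 RGB image -> 10 class scores
```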

Page 57:

CNN usage

• CNNs are best suited for unstructured data like images, signals and text, where not only the values of each item (e.g. pixel, word, signal sample) but also the local relationships between items (pixels, words, signal samples) matter

• Note that other traditional classifiers like Random Forest can also be used on image features derived from raw images

• LIME can be used as explainer of CNNs if its inputs are image regions, not raw pixels

Page 58:

Explainability approaches for deep learning and Convolutional Neural Networks (CNN)

• Deep learning and CNNs produced amazing results, see for example excellent classification results for Google Inception network for image classification of a very large set of images
– Google Inception network: "Going Deeper with Convolutions", CVPR 2015, Christian Szegedy et al.

• CNNs are trained by using large set of training data with known labels, then optimizing (millions of) weights and connections to minimize overall classification error

• CNNs are very accurate, trained by (large set of) examples but inherently very hard to explain – “black boxes”!

Page 59:

http://henrysprojects.net/projects/conv-net.html

CNN example

[Figure: CNN example – raw images in, decision out]

Page 60:

BUT: Try to explain a trained CNN – it is hard!!!

https://towardsdatascience.com/the-most-intuitive-and-easiest-guide-for-convolutional-neural-network-3607be47480

CNNs have millions of connections/weights and hundreds of images from the layers (on the order of 1000s × 1000s of weights and 100s × 100s intermediate images) – too many numbers and images, unintuitive for humans to deal with

Humans can not interpret numbers related to neural weights or “images” of intermediate layers – they may have no symbolic or conceptual meaning (unlike in decision trees).

Page 61:

http://henrysprojects.net/projects/conv-net.html

If we open the CNN "box", what can we use for explainability?

Inside the CNN: complex images and numbers – a very large number of them!!! Very hard to interpret

Page 62:

So, how to explain CNNs?

• At a high level: apply an approach similar to LIME and Random Forest variable importance ranking (a minimal occlusion-style sketch follows below):
– Perturb the input of the trained CNN
– Observe the changes in classification
– Assign higher importance to inputs whose change produces a larger change (drop) in classification accuracy
– Pixels producing the largest change in classification form salient images or a heat map
• Can be used to check which pixels contributed to the decision ("where the CNN was looking"), which is good BUT may not be enough to explain how the image was classified
• It can also detect anomalies, e.g. wrong pixels producing a seemingly correct classification – a common problem, and critical for a good audit
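The occlusion-style sketch referenced above, in PyTorch; `model` is assumed to be any image classifier returning class scores (illustrative only, not a published method's reference code):

```python
# Minimal occlusion-style perturbation sketch: slide a blank patch over the
# image and record how much the target class score drops.
import torch

def occlusion_heatmap(model, image, target_class, patch=8, stride=8, fill=0.0):
    """image: (3, H, W) tensor; returns a coarse map of score drops per patch."""
    model.eval()
    with torch.no_grad():
        base = model(image.unsqueeze(0))[0, target_class].item()
        _, H, W = image.shape
        heat = torch.zeros(H // stride, W // stride)
        for i, y in enumerate(range(0, H - patch + 1, stride)):
            for j, x in enumerate(range(0, W - patch + 1, stride)):
                occluded = image.clone()
                occluded[:, y:y + patch, x:x + patch] = fill          # hide one patch
                score = model(occluded.unsqueeze(0))[0, target_class].item()
                heat[i, j] = base - score                             # big drop = important
    return heat
```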

Page 63:


From TechTalks:” Explainable AI: Interpreting the neuron soup of deep learning”, Oct 2018 https://bdtechtalks.com/2018/10/15/kate-saenko-explainable-ai-deep-learning-rise/

Prof. Saenko's RISE method:
1. Random masks are overlaid over test images fed to the trained CNN
2. Classification is performed and repeated for many combinations of random masks
3. Observe which parts of an image cause the biggest drop in classification accuracy → heat maps or salient images

[Figure: test images and random masks fed to the tested CNN; the heat map shows the part of the image (red) contributing most to the classification]

Page 64:

From: W. Samek et al: "Explainable Artificial Intelligence: Understanding, Visualizing and Interpreting Deep Learning Models", ITU Journal ICT Discoveries, Special Issue No. 1, 13 Oct 2017

Two approaches to explaining CNNs

Excellent justification for explainability

Page 65:

From: W. Samek et al: "Explainable Artificial Intelligence: Understanding, Visualizing and Interpreting Deep Learning Models", ITU Journal ICT Discoveries, Special Issue No. 1, 13 Oct 2017

Approaches to CNN Explainability

• Sensitivity analysis (SA): the most relevant image regions (salient or heat maps) are those whose change causes the biggest change in the output. Uses locally evaluated gradients. Similar to the previous approach, but may not fully answer explainability

• Layer-wise relevance propagation (LRP): Redistribute predictions from the last (output) CNN layer backward until reaching input pixels. Keep the total relevance constant at each CNN layer during the redistribution. Input pixels with highest redistributed relevance form saliency pixels or heat map
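A minimal sketch of the sensitivity-analysis idea (the gradient-based variant, not LRP), assuming a differentiable PyTorch classifier; the saliency of a pixel is the magnitude of the locally evaluated gradient of the class score with respect to that pixel:

```python
# Minimal gradient-based sensitivity-analysis sketch (illustrative).
import torch

def gradient_saliency(model, image, target_class):
    """image: (3, H, W) tensor; returns an (H, W) saliency map."""
    model.eval()
    x = image.unsqueeze(0).clone().requires_grad_(True)
    score = model(x)[0, target_class]
    score.backward()                              # d(score) / d(pixels)
    return x.grad[0].abs().max(dim=0).values      # strongest per-pixel gradient
```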

Page 66:

Example of the use of CNN explanations to verify/audit/debug image classification

https://dspace.library.uu.nl/handle/1874/380741

It was originally thought the AI worked well – results looked great – but upon further explanation using salient image regions it was discovered that the AI made decisions based on the background (e.g. snow), not the dog's head – great for auditing!

Page 67:

Adversarial AI – attempts to circumvent or defeat AI systems


Use salient regions from explainable CNNs to find key places to corrupt an image and cause a wrong classification

Page 68:

PART II: RFEX 2.0: SIMPLE RANDOM FOREST MODEL AND SAMPLE EXPLAINER FOR NON-MACHINE LEARNING EXPERTS

Page 69:

PART II: RFEX 2.0: SIMPLE RANDOM FOREST MODEL AND SAMPLE EXPLAINER FOR NON-MACHINE LEARNING EXPERTS

Prof. D. Petkovic, A. Alavi, D. Cai, S. Barlaskar, J. Yang

Computer Science Department, San Francisco State University

Work has been partially supported by NIH grant R01 LM005652 and by SFSU COSE Computing for Life Sciences.

D. Petkovic, A. Alavi, D. Cai, J. Yang, S. Barlaskar: “RFEX – Simple random Forest Model and Sample Explainer for non-ML experts”, Biorxiv https://www.biorxiv.org/content/10.1101/819078v1; posted 10/25/19 (Also presented as a poster at Pacific Symposium on Biocomputing, PSB 2020, January 2020, Hawaii, USA)

Petkovic D, Altman R, Wong M, Vigil A.: “Improving the explainability of Random Forest classifier - user centered approach”. Pacific Symposium on Biocomputing. 2018;23:204-15.

Tutorial: Barlaskar S, Petkovic D: "Applying Improved Random Forest Explainability (RFEX 2.0) on synthetic data", SFSU TR 18.01, 11/27/2018; with related toolkit at https://www.youtube.com/watch?v=neSVxbxxiCE

Page 70:

Quick overview of Random Forest

Page 71:

Random Forest (RF) Classifiers
• One of the most powerful and widely used, both for classification and regression
• RF uses a large number of slightly different decision trees (1000s) working together and voting for a class decision
• Incorporates CART for training/growing each tree (no pruning) – but makes sure trees are not too similar!
• Invented at UC Berkeley by Breiman, 2001
– Breiman L, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5-32, 2001
• Best for problems with features that are structured as feature vectors with distinct features in tabular format. Not well suited for raw text, images and signals (but can work on extracted features arranged in tabular format)
• Can use numerical and categorical features
• Good at ignoring useless features; deals with missing values
• Fast in training and at run time
• Excellent tools exist (R, SciKit etc.)
• RF has inherent explainability potential since it is tree based and offers feature ranking

Page 72:

Random Forest overview
• Uses a combination of many (NTREE) CART-generated decision trees (1000s) which then vote for the correct decision (e.g. majority vote, or using a CUTOFF threshold)
• Each tree is trained from a slightly different random bootstrap version of the training data → each tree "sees" a slightly different training set
• Each tree is trained using modified CART (at each node it tries a random choice of MTRY features, with replacement); no pruning → each tree is slightly different
• RF optimizing parameters: NTREE, MTRY, CUTOFF (a parameter-mapping sketch follows below)
• RF has its own accuracy measure (Out of Bag error – OOB) obtained by testing RF on the training data, BUT such that testing in each tree is done only with samples not used in training that tree → RF has built-in cross validation (test samples differ from training samples), so no need for a separate CV
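The parameter-mapping sketch referenced above, assuming scikit-learn (NTREE → n_estimators, MTRY → max_features, OOB → oob_score_); the class-specific CUTOFF of the R randomForest package has no direct scikit-learn equivalent and is emulated by thresholding predicted probabilities. `X_train`, `y_train` and `X_test` are placeholders for your own data.

```python
# Hedged mapping of the RF parameters above onto scikit-learn.
from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier(
    n_estimators=1000,   # NTREE: number of bootstrapped trees
    max_features=50,     # MTRY: features tried at each split
    oob_score=True,      # built-in out-of-bag accuracy estimate
    random_state=0,
)
# rf.fit(X_train, y_train)                            # your tabular training data
# print("OOB accuracy:", rf.oob_score_)               # OOB error = 1 - rf.oob_score_
# positive = rf.predict_proba(X_test)[:, 1] >= 0.7    # emulated CUTOFF of 0.7
```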

Page 73:

Page 74:

RF even offers feature ranking – critical for explainability

• GINI based: relates to features which cause, on average (over all trees), the greatest increase in "purity" in the subsequent split – they are important
• MDA – Mean Decrease in Accuracy: modify a feature's value randomly and record the average drop in accuracy over all trees. Choose the features whose perturbation causes the biggest drop in average accuracy as more important
– Can also be class specific (MDA+ and MDA-) – can produce different rankings, especially if the training data is unbalanced
• GINI and MDA values can be used to rank features by importance – IMPORTANT for explainability (a small sketch follows below)
• Available in RF SW tools and packages
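The sketch referenced above, assuming scikit-learn on toy data: Gini importance comes from `feature_importances_`, and an MDA-style ranking can be approximated with permutation importance (shuffle one feature on held-out data and measure the drop in accuracy).

```python
# Gini importance vs. an MDA-style permutation importance (toy data sketch).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=20, random_state=0)   # toy data
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)
rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)

gini_ranking = rf.feature_importances_                        # Gini-based importance
perm = permutation_importance(rf, X_val, y_val, n_repeats=10, random_state=0)
mda_ranking = perm.importances_mean                           # mean drop in accuracy
```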

Page 75:

RF Explainability – work of others

Page 76:

76

D. Delen: “A comparative analysis of machine learning techniques for student retention management”, Decision Support Systems, Volume 49, Issue 4, November 2010, Pages 498–506

This work simply uses the ranked list of features (but no other type of information). It helps, but much more can be done (see RFEX later)

Copyright: Dragutin Petkovic unless noted otherwise


Copyright: Dragutin Petkovic unless noted otherwise

[Figure: one of the trees in the RF – a small CART decision tree that splits on features F1 and F2 using thresholds a, b, c, d; terminal nodes contain either class X samples (X1–X3) or class O samples (O1–O8)]

Can we try extracting rules from the forest of trees?

Extracted rule from this tree: IF F1 >= a AND F2 >= c AND F1 <= b AND F2 <= d THEN class X ELSE class O

77


RF Explainability by extracting rules from the trained ensemble of trees
• From the ensemble of trees, extract rules (100,000s…), then use optimization to choose a "best" and relatively small set (10s, 100s) to help explainability/transparency
• Paper with a good overview, using hill climbing to extract summary rules
– Morteza Mashayekhi, Robin Gras: "Rule Extraction from Random Forest: the RF+HC Methods", Advances in Artificial Intelligence, Volume 9091 of Lecture Notes in Computer Science, pp. 223–237, 29 April 2015
• Paper in a biological context
– Liu, S., Dissanayake, S., Patel, S., Dang, X., Mlsna, T., Chen, Y., & Wilkins, D. (2014). Learning accurate and interpretable models based on regularized random forests regression. BMC Systems Biology, 8(Suppl 3), S5. http://doi.org/10.1186/1752-0509-8-S3-S5
• SW Tools (see also the sketch below)
– Accessing trees in the R package
  • http://stackoverflow.com/questions/14996619/random-forest-output-interpretation
– Rule extraction SW "inTrees" and related paper
  • H. Deng: "Interpreting Tree Ensembles with inTrees", 2014, http://arxiv.org/pdf/1408.5456.pdf
• Problem: still too many rules (100s), each with multiple conditions → hard to use for explainability
78

Copyright: Dragutin Petkovic unless noted otherwise
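As a minimal illustration (not one of the cited methods), the raw rules of individual trees in a scikit-learn forest can be printed with export_text; the rule-extraction papers above then select and prune such rule sets down to a small summary set.

    from sklearn.tree import export_text

    # Print a truncated view of the decision rules of the first few trees in the
    # trained forest 'rf' from the earlier sketch; each root-to-leaf path is one rule
    for i, tree in enumerate(rf.estimators_[:3]):
        print(f"--- tree {i} ---")
        print(export_text(tree, max_depth=3))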


RFEX 2.0: SIMPLE RANDOM FOREST MODEL AND SAMPLE EXPLAINER FOR NON-MACHINE LEARNING EXPERTS (PETKOVIC ET AL)

Abstract:
• We present a novel Random Forest [1] (RF) Explainer (RFEX 2.0) offering integrated model and novel sample explainability
• RFEX 2.0 is designed in a user-centric way with non-AI experts in mind, and with simplicity and familiarity, e.g. providing a one-page tabular output and measures familiar to most users
• We demonstrate RFEX in a case study from our collaboration with the J. Craig Venter Institute (JCVI)
– https://www.biorxiv.org/content/10.1101/819078v1
– https://cs.sfsu.edu/sites/default/files/technical-reports/RFEX%202%20JCVI_Jizhou%20Petkovic%20%2006-16-19_0.pdf

79 Copyright: Dragutin Petkovic unless noted otherwise


Case study data and ground truth

JCVI and Allen Institute for Brain Science data and "ground truth": The matrix of gene expression values from single nuclei samples derived from human middle temporal gyrus (MTG) layer 1 forms the features for RF classification (608 of them), grouped into 16 different cell type clusters which constitute the 16 classes for RF classification. Training database for the e1 cluster: 299 + samples, 572 – samples, 608 features, no missing data; all features are numerical. Ground truth was established independently: the e1 cluster is defined as "A human middle temporal gyrus cortical layer 1 excitatory neuron that selectively expresses TESPA1, LINC00507 and SLC17A7 mRNAs, and lacks expression of KCNIP1 mRNA". Aevermann B., Novotny M., Bakken T., Miller J., Diehl A., Osumi-Sutherland D., Lasken R., Lein E., Scheuermann R.: "Cell type discovery using single cell transcriptomics: implications for ontological representation", Human Molecular Genetics 27(R1): R40–R47, March 2018

80 Copyright: Dragutin Petkovic unless noted otherwise


Case Study – Application of RFEX 2.0 to human nervous system cell type clusters from gene expressions, using data from the J. Craig Venter Institute and the Allen Institute for Brain Science

Goals:
• Investigate whether the improved RFEX 2.0 Model Summary reflects the published "ground truth" information on key gene markers
• Are the key gene markers established by biological analysis consistent with the key "explainability information"?
• Can this be easily observed by non-AI experts from the RFEX 2.0 Model explainer tabular output?
• Can the novel RFEX 2.0 Sample explainer be used to identify specific samples and features that are possibly "out of range" and may need to be removed from the training set?
• Can this be easily observed by non-AI experts from the RFEX 2.0 Sample explainer tabular output?

81 Copyright: Dragutin Petkovic unless noted otherwise


Explainability questions users might ask guided the user-centric RFEX design
(Based on info in research papers and our informal surveys; the focus is on users who are not AI experts but are domain experts)

Global (Model based) explainability questions
– Which features (preferably a small, manageable number) are most important for predicting the class of interest? (This question is critical in reducing the complexity of the problem space.)
– What is the tradeoff between using a subset of features and the related accuracy?
– What are the basic statistics/ranges and the separation of feature values between the + and – class?
– Which groups of features work well together?

Local (Sample based) explainability questions
– Why was my sample classified incorrectly or correctly?
– Which features or factors contributed to the incorrect classification of my sample?
– How can I identify outliers among data samples and features?

Copyright: Dragutin Petkovic unless noted otherwise

82


RFEX approach modeled on typical medical tests

[Diagram: the MODEL explainer is based on global info from the population; the SAMPLE explainer shows how tests from a patient relate to that global model. A patient's (sample) test and diagnosis are explained by relating the patient's data to the global model.]
• AI Model = medical knowledge
• AI Model features = specific medical tests, with test ranges for healthy patients
• Patient testing = patient's sample values

83 Copyright: Dragutin Petkovic unless noted otherwise


RFEX 2.0 approach

[Block diagram: Training Data → standard Random Forest training and classification (accuracy: F1, OOB, confusion matrices; parameters ntree, mtry, cutoff). The trained RF feeds the RFEX Model Explainer, which produces a one-page RFEX Model Summary Report Table, and the RFEX Sample Explainer, which takes a user sample and produces a one-page RFEX Sample Summary Report Table. Both reports feed human interpretation and oversight.]

84 Copyright: Dragutin Petkovic unless noted otherwise


RFEX approach
RFEX Model Explainer:
– Perform standard RF training to generate the trained RF and establish base accuracy
– Perform a pipeline of six analysis steps that process the trained RF and extract the information for the RFEX Model Explainer
– Present the extracted information in a one-page, easy to use and familiar tabular format

RFEX Sample Explainer:
– Perform a pipeline of four steps to extract the information for the RFEX Sample Explainer
– Present the information using the same ranked features as the RFEX Model Explainer → relates the sample to the model
– Present the extracted information in a one-page, easy to use and familiar tabular format

• Both are easy to implement as a set of steps in an AI analysis pipeline (e.g. a Jupyter notebook with R or SciKit)
• Barlaskar S, Petkovic D: "Applying Improved Random Forest Explainability (RFEX 2.0) on synthetic data", SFSU TR 18.01, 11/27/2018; with a related toolkit at https://www.youtube.com/watch?v=neSVxbxxiCE

Copyright: Dragutin Petkovic unless noted otherwise

85


RFEX 2.0 MODEL Explainer

First step: get the base accuracy in predicting the e1 cluster achieved by standard RF training:
F1 = 0.995 (for ntree = 1000, mtry = 50 and cutoff (0.3, 0.7)), where F1 = 2 * (recall * precision) / (recall + precision)

RFEX 2.0 Model tabular summary extracted from the trained RF – one page (see next slide). Table columns include:
• Feature Rank (MDA value): we rank features by their predictive power using Mean Decrease in Accuracy (MDA), measured from the trained RF classifier
• Cumulative F1 score: we provide tradeoffs between using subsets of top ranked features (up to the topK) and RF accuracy by computing a "cumulative F1 score" for the top ranked 2, top ranked 3, … top ranked topK features, re-training the RF for each combination
• Basic stats of + and – class samples for each feature: AV/SD; [MIN, MAX]
• Cohen Distance between + and – class samples: ABS(AV+ – AV–)/SD
• Cliques of N features: most predictive groups of N features (clique of N) from the topK features, N usually 2 or 3 (exhaustive test)

This is an effective idea to reduce the feature space – most times by over 90%

86 Copyright: Dragutin Petkovic unless noted otherwise


RFEX key idea: feature reduction
• Feature Rank (MDA value): we rank features by their predictive power using Mean Decrease in Accuracy (MDA) measured from the trained RF classifier. NOTE: we recommend using class-specific MDA ranking (supported in the R toolkit), especially in the case of unbalanced data – rankings for the + and – class are often different
• Cumulative F1 score: provides tradeoffs between using subsets of top ranked features (up to the topK) and RF accuracy by computing a "cumulative F1 score" for the top ranked 2, top ranked 3, … top ranked topK features (re-training the RF for each) → a simple and key idea for reducing a large feature space to a handful that fits on one page. Over 90% reduction for all applications we tried
• Best combinations of N features – clique of N: now that we are down to 10 or so features, we can do many things using exhaustive search, like finding the best combinations of N features (cliques of N). E.g. 10 choose 3 = 120, so it is feasible to try all combinations (see the sketch below)
• Feature reduction allows the RFEX info to be put on one page for ease of use
Copyright: Dragutin Petkovic unless noted otherwise 87
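A minimal sketch of the cumulative F1 and clique-of-N computations described above, assuming the rf/X/y objects and the MDA-style ranking mda_rank from the earlier sketches (the RFEX toolkit itself is the one referenced on these slides):

    from itertools import combinations
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import f1_score

    topK = 8
    top_features = mda_rank[:topK]   # most important feature first

    # Cumulative F1: retrain on the top-2, top-3, ..., top-K ranked features
    for k in range(2, topK + 1):
        cols = top_features[:k]
        m = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_train[:, cols], y_train)
        print(f"top {k} features -> cumulative F1 = {f1_score(y_test, m.predict(X_test[:, cols])):.3f}")

    # Cliques of N: exhaustive search over all combinations of N of the top-K features
    N = 3
    clique_scores = []
    for combo in combinations(top_features, N):
        cols = list(combo)
        m = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train[:, cols], y_train)
        clique_scores.append((f1_score(y_test, m.predict(X_test[:, cols])), combo))
    print("Best cliques of", N, ":", sorted(clique_scores, reverse=True)[:5])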


Cohen Distance for measuring the separation of the + and – class populations for each feature
• Feature class separation: to indicate the separation of feature values for the + and – class for each of the topK features (again for easy visual analysis), we use the Cohen Distance between the feature values of the two populations (+ and – class), defined as

Cohen Distance = ABS(AV+ – AV–)/SD    (1)

– where AV+ is the average of the feature values for the positive class as measured from the training DB, AV– is the average of the feature values for the negative class, and SD is the (larger or smaller) standard deviation of the two feature value populations. As noted in the reference below, a Cohen Distance of less than 0.2 denotes a small separation, from 0.2 to 0.5 a medium, from 0.5 to 0.8 a large, and above 1.3 a very large separation. Stats are obtained from the training DB (see the sketch below).
• Cohen Distance is better than a p-value for measuring the independence of two populations → it gives a measure of separation and not just a yes/no answer
– Solla F, Tran A, Bertoncelli D, Musoff C, Bertoncelli CM: "Why a P-Value is Not Enough", Clin Spine Surg. 2018 Nov;31(9):385–388
Copyright: Dragutin Petkovic unless noted otherwise 88
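A minimal sketch of the per-feature Cohen Distance, using the training split and top_features from the earlier sketches; taking the larger of the two class SDs is an assumption of this sketch (the slide allows either):

    import numpy as np

    def cohen_distance(x_pos, x_neg):
        # ABS(AV+ - AV-)/SD, with SD taken here as the larger of the two class SDs
        sd = max(x_pos.std(), x_neg.std())
        return abs(x_pos.mean() - x_neg.mean()) / sd if sd > 0 else np.inf

    # Cohen Distance for each of the top-K ranked features, from the training data
    for f in top_features:
        d = cohen_distance(X_train[y_train == 1, f], X_train[y_train == 0, f])
        print(f"feature {f}: Cohen Distance = {d:.2f}")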


RFEX Model Summary Table vs. Explainability Questions

| Feature index | Feature name | MDA value | Cumulative F1 score | AV/SD, [MIN,MAX] for e1 class | AV/SD, [MIN,MAX] for non-e1 class | Cohen Distance | Top 10 cliques of 3 features |
| 1 | TESPA1 | 18.8 | N/A | 363.5/266, [0,1836] | 3.9/29, [0,423] | 1.35 | [TESPA1, SLC17A7, ZNF536]; [TESPA1, KCNIP1, TBR1]; [TESPA1, SLC17A7, TBR1]; [TESPA1, ZNF536, TBR1]; [TESPA1, SLC17A7, PROX1]; [TESPA1, GAD2.1, TBR1]; [TESPA1, GAD2, TBR1]; [TESPA1, ADARB2.AS1, TBR1]; [TESPA1, PROX1, TBR1]; [TESPA1, TBR1, PTCHD4] |
| 2 | LINC00507 | 16.5 | 0.980 | 234/203, [0,1646] | 1.8/15, [0,288] | 1.14 | |
| 3 | SLC17A7 | 16.3 | 0.9816 | 82.8/77, [0,498] | 1.1/11, [0,246] | 1.06 | |
| 4 | LINC00508 | 12.8 | 0.9799 | 97.4/108, [0,727] | 0.6/4.8, [0,96] | 0.9 | |
| 5 | KCNIP1 | 12.7 | 0.9866 | 1.0/2.8, [0,43] | 310.6/377, [0,2743] | 0.82 | |
| 6 | NPTX1 | 12.5 | 0.9901 | 142/176, [0,1241] | 3/21, [0,301] | 0.79 | |
| 7 | TBR1 | 12.3 | 0.9917 | 34/57, [0,413] | 0.4/4.1, [0,67] | 0.59 | |
| 8 | SFTA1P | 12.1 | 0.9917 | 108/119, [0,629] | 1.1/15, [0,234] | 0.9 | |

89 Copyright: Dragutin Petkovic unless noted otherwise

Callouts on the table columns: MDA value → feature importance; Cumulative F1 score → tradeoff between features used and accuracy; AV/SD and [MIN,MAX] → basic feature stats for easy viewing; Cohen Distance → feature class separation; cliques → which features work well together. The top 5 features (out of 608) carry most of the prediction (0.986 vs. 0.99).


RFEX Model Summary data for the JCVI data – e1 cluster: base RF accuracy is F1 = 0.995, for ntree = 1000, mtry = 50 and cutoff (0.3, 0.7)

| Feature index | Feature name | MDA value | Cumulative F1 score | AV/SD, [MIN,MAX] for e1 class | AV/SD, [MIN,MAX] for non-e1 class | Cohen Distance | Top 10 cliques of 3 features |
| 1 | TESPA1 | 18.8 | N/A | 363.5/266, [0,1836] | 3.9/29, [0,423] | 1.35 | [TESPA1, SLC17A7, ZNF536]; [TESPA1, KCNIP1, TBR1]; [TESPA1, SLC17A7, TBR1]; [TESPA1, ZNF536, TBR1]; [TESPA1, SLC17A7, PROX1]; [TESPA1, GAD2.1, TBR1]; [TESPA1, GAD2, TBR1]; [TESPA1, ADARB2.AS1, TBR1]; [TESPA1, PROX1, TBR1]; [TESPA1, TBR1, PTCHD4] |
| 2 | LINC00507 | 16.5 | 0.980 | 234/203, [0,1646] | 1.8/15, [0,288] | 1.14 | |
| 3 | SLC17A7 | 16.3 | 0.9816 | 82.8/77, [0,498] | 1.1/11, [0,246] | 1.06 | |
| 4 | LINC00508 | 12.8 | 0.9799 | 97.4/108, [0,727] | 0.6/4.8, [0,96] | 0.9 | |
| 5 | KCNIP1 | 12.7 | 0.9866 | 1.0/2.8, [0,43] | 310.6/377, [0,2743] | 0.82 | |
| 6 | NPTX1 | 12.5 | 0.9901 | 142/176, [0,1241] | 3/21, [0,301] | 0.79 | |
| 7 | TBR1 | 12.3 | 0.9917 | 34/57, [0,413] | 0.4/4.1, [0,67] | 0.59 | |
| 8 | SFTA1P | 12.1 | 0.9917 | 108/119, [0,629] | 1.1/15, [0,234] | 0.9 | |

Callouts: the high expression of the top-ranked genes and the low expression of KCNIP1 in the e1 class are visible directly in the AV/SD columns. NEW insight: TBR1 appears in many cliques. The key "ground truth" gene markers are ranked 1, 2, 3 and 5 and do most of the prediction.

90 Copyright: Dragutin Petkovic unless noted otherwise


Discussion of the RFEX 2.0 Model Explainer
The RFEX Model Explainer in the previous slide indeed provided correct and visually easy to interpret explanations of the RF classification, verifying the ground truth explanation of the e1 cluster, as follows:
1. By observing the feature ranking in the RFEX Model Summary Table, one easily confirms that the 4 key defining genes for the e1 cluster are among the top 5 ranked ones
2. Tradeoffs of explainability and accuracy: by looking at the cumulative F1 score (column 4), one easily observes that the top 5 features (out of 608) carry most of the prediction (0.986 vs. 0.99). Hence over 90% reduction in features for a minimal loss of accuracy
3. One can easily confirm from the table the correct levels of gene expression of these key genes, e.g. the top 3 ranked genes show high expression, indicated by high AV for the + class feature values and low AV for the – class, and KCNIP1 shows low expression, indicated by low AV for the + class vs. high AV for the – class → all of which correlates well with the ground truth
4. Furthermore, the high Cohen Distance values confirm that all highly ranked features show good separation between the + and – classes, and notably this separation is highest for the highest ranked features → increases user confidence
5. Features working together: the most predictive feature groups (cliques) of 3 (last column in the table) are dominated by the top ranked "key deciding" TESPA1 gene, but also show strong participation of the TBR1 gene – information previously unknown to our JCVI collaborators, thus possibly offering new insights from this kind of explainability analysis

91 Copyright: Dragutin Petkovic unless noted otherwise


We have not done a formal usability study for RFEX 2.0, but we demonstrated increased user trust for the similar RFEX 1.0 format in a usability test (13 users, 2018)

Petkovic D, Altman R, Wong M, Vigil A.: “Improving the explainability of Random Forest classifier - user centered approach”. Pacific Symposium on Biocomputing. 2018;23:204-15.

92 Copyright: Dragutin Petkovic unless noted otherwise


Questions?

93 Copyright: Dragutin Petkovic unless noted otherwise


Novel RFEX 2.0 SAMPLE Explainer
Integrated with the RFEX 2.0 Model Explainer – same feature list and same ranking of features
• First, some basic information:
– Sample correct classification (TRUE/FALSE)
– Sample classification confidence level measured by the Voting Fraction: the % of the NTREE trees in the RF forest voting the correct class for this sample
• RFEX Sample tabular summary – one page (see next slide). Columns:
– Feature MDA rank (same as in the Model Explainer, for consistency)
– Feature name
– Feature MDA rank value
– Feature stats for all + samples: AV/SD, [Min, Max] (computed from the training DB)
– Feature stats for all – samples: AV/SD, [Min, Max] (computed from the training DB)
– Feature value of the tested sample
– Sample Cohen Distance to the + class AV
– Sample Cohen Distance to the – class AV (see the sketch below)
– K Nearest Neighbor ratio to the correct class, defined as the fraction of the K training samples nearest to the sample's feature value that belong to the correct class
94 Copyright: Dragutin Petkovic unless noted otherwise
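A minimal sketch (assuming the objects from the earlier sketches) of the Sample Cohen Distance columns; the formula ABS(sample value − class AV)/class SD reproduces the values shown in the table on the next slide (e.g. |363.5 − 125.5|/266 ≈ 0.89 for TESPA1):

    def sample_cohen_distance(sample_value, class_values):
        # ABS(sample value - class AV) / class SD
        return abs(sample_value - class_values.mean()) / class_values.std()

    sample, correct_class = X_test[0], y_test[0]   # one tested sample (placeholder)
    for f in top_features:
        d_pos = sample_cohen_distance(sample[f], X_train[y_train == 1, f])
        d_neg = sample_cohen_distance(sample[f], X_train[y_train == 0, f])
        print(f"feature {f}: Sample CohenD to + class = {d_pos:.2f}, to - class = {d_neg:.2f}")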


RFEX 2.0 SAMPLE summary table – e1 + sample

| Feature Rank | Feature name | Feature MDA ranking | AV/SD, [Min,Max] for e1 (+) class | AV/SD, [Min,Max] for non-e1 (-) class | Feature value of tested sample | Sample Cohen Distance to e1 (+) class | Sample Cohen Distance to non-e1 (-) class | K Nearest Neighbor ratio |
| 1 | TESPA1 | 18.8 | 363.5/266, [0,1836] | 3.9/29, [0,423] | 125.5 | 0.89 | 4.2 | 57/60 |
| 2 | LINC00507 | 16.5 | 234/203, [0,1646] | 1.8/15, [0,288] | 68.4 | 0.82 | 4.4 | 56/60 |
| 3 | SLC17A7 | 16.3 | 82.8/77, [0,498] | 1.1/11, [0,246] | 209.9 | 1.65 | 18.9 | 59/60 |
| 4 | LINC00508 | 12.8 | 97.4/108, [0,727] | 0.6/4.8, [0,96] | 14.1 | 0.77 | 2.8 | 49/60 |
| 5 | KCNIP1 | 12.7 | 1.0/2.8, [0,43] | 310.6/377, [0,2743] | 0 | 0.36 | 0.82 | 57/60 |
| 6 | NPTX1 | 12.5 | 142/176, [0,1241] | 3/21, [0,301] | 214.9 | 0.41 | 10.1 | 56/60 |
| 7 | TBR1 | 12.3 | 34/57, [0,413] | 0.4/4.1, [0,67] | 81.3 | 0.83 | 19.7 | 58/60 |
| 8 | SFTA1P | 12.1 | 108/119, [0,629] | 1.1/15, [0,234] | 126.5 | 0.16 | 8.36 | 56/60 |

95 Copyright: Dragutin Petkovic unless noted otherwise

Callouts: the table uses the same feature ordering and stats as the Model explainer; the two Cohen Distance columns measure the distance from the tested sample to each class population; the last column gives the neighbor stats.


K Nearest Neighbor Ratio measure for each feature
• The K Nearest Neighbor Ratio (KNN) is defined, for each feature, as the fraction of the K training samples nearest to the tested sample's feature value that belong to the correct class (as given in the training DB)
– KNN > 0.5 means most neighboring samples are of the correct class → higher confidence
• KNN complements the Sample Cohen Distances in that it is more local and rank based, as well as non-parametric. For K, we recommend 20% of the number of samples of the smaller class (see the sketch below).

Copyright: Dragutin Petkovic unless noted otherwise

96
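A minimal sketch of the per-feature KNN ratio, under the same assumptions as the earlier sketches and with K set to 20% of the smaller class, as recommended above:

    import numpy as np

    def knn_ratio(x_train_feature, y_train, sample_value, correct_class, k):
        # K training samples nearest to the sample's value of this single feature,
        # then the fraction of those neighbors that belong to the correct class
        nearest = np.argsort(np.abs(x_train_feature - sample_value))[:k]
        return np.mean(y_train[nearest] == correct_class)

    k = int(0.2 * min(np.sum(y_train == 1), np.sum(y_train == 0)))  # 20% of smaller class
    sample, correct_class = X_test[0], y_test[0]                    # one tested sample
    for f in top_features:
        r = knn_ratio(X_train[:, f], y_train, sample[f], correct_class, k)
        print(f"feature {f}: KNN ratio = {r:.2f}")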


Intuitive and simple rules to identify potential outliers in the training DB
A candidate outlier or "problematic" sample merits further review:
• Usually a few isolated samples, "far away from the main clusters"
• May be a correctly classified sample that sits at the "border" (e.g. voted for by an NTREE fraction close to the voting CUTOFF), OR an incorrectly classified sample "at the border"
• Intuition (confirmed experimentally): outliers have less influence on the GINI purity measure used by CART to split nodes in each tree, hence fewer tree branches are "tuned" to them, hence fewer trees vote for them → a low NTREE voting fraction is an indicator of a possible outlier
• Rule for identifying potential outlier samples (see the sketch below):
– The sample received a low % of votes from all NTREE trees for the correct class (e.g. in the bottom 10% of votes over all samples, OR very close to the CUTOFF voting threshold)
• Rule for identifying potential outlier features of a sample:
– ((Sample CohenD to CorrectClass) > (Sample CohenD to IncorrectClass)) OR (KNN < 50%)
Outlier identification can be used to QA and edit the training data:
– Remove samples with wrong ground truth
– Identify erroneous or noisy features
Copyright: Dragutin Petkovic unless noted otherwise 97
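A minimal sketch of the voting-fraction rule, assuming binary 0/1 labels and the rf object from the earlier sketches; the per-feature rule reuses sample_cohen_distance and knn_ratio from the sketches above:

    import numpy as np

    # Voting fraction: share of the NTREE trees voting the correct class per training sample
    votes = np.stack([tree.predict(X_train) for tree in rf.estimators_])  # (NTREE, n_samples)
    voting_fraction = (votes == y_train).mean(axis=0)

    # Rule for potential outlier samples: bottom 10% of voting fraction (or near the CUTOFF)
    candidates = np.where(voting_fraction <= np.quantile(voting_fraction, 0.10))[0]
    print("Candidate outlier samples:", candidates)

    # Rule for potential outlier features of a flagged sample, feature by feature:
    #   (sample_cohen_distance to the correct class > sample_cohen_distance to the incorrect class)
    #   OR (knn_ratio < 0.5)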


RFEX 2.0 SAMPLE Explainer summary for a "good" sample
"Good" + sample – received a HIGH VOTING FRACTION of 100% (all NTREE trees voted correctly); all features are in the expected "range" (shown in GREEN on the original slide): every feature value is closer to the AV of the + class than to the AV of the – class, AND all KNN fractions are > 50%.

| Feature Rank | Feature name | Feature MDA ranking | AV/SD, [Min,Max] for e1 (+) class | AV/SD, [Min,Max] for non-e1 (-) class | Feature value of tested sample | Sample Cohen Distance to e1 (+) class | Sample Cohen Distance to non-e1 (-) class | K Nearest Neighbor ratio |
| 1 | TESPA1 | 18.8 | 363.5/266, [0,1836] | 3.9/29, [0,423] | 125.5 | 0.89 | 4.2 | 57/60 |
| 2 | LINC00507 | 16.5 | 234/203, [0,1646] | 1.8/15, [0,288] | 68.4 | 0.82 | 4.4 | 56/60 |
| 3 | SLC17A7 | 16.3 | 82.8/77, [0,498] | 1.1/11, [0,246] | 209.9 | 1.65 | 18.9 | 59/60 |
| 4 | LINC00508 | 12.8 | 97.4/108, [0,727] | 0.6/4.8, [0,96] | 14.1 | 0.77 | 2.8 | 49/60 |
| 5 | KCNIP1 | 12.7 | 1.0/2.8, [0,43] | 310.6/377, [0,2743] | 0 | 0.36 | 0.82 | 57/60 |
| 6 | NPTX1 | 12.5 | 142/176, [0,1241] | 3/21, [0,301] | 214.9 | 0.41 | 10.1 | 56/60 |
| 7 | TBR1 | 12.3 | 34/57, [0,413] | 0.4/4.1, [0,67] | 81.3 | 0.83 | 19.7 | 58/60 |
| 8 | SFTA1P | 12.1 | 108/119, [0,629] | 1.1/15, [0,234] | 126.5 | 0.16 | 8.36 | 56/60 |

98 Copyright: Dragutin Petkovic unless noted otherwise


RFEX 2.0 SAMPLE Explainer summary for a "problematic" sample
Marginal problematic + sample – correctly classified, but received a LOW VOTING FRACTION of 80%. Outlier features (shown in RED on the original slide) are those out of range: ((CohenD to + class) > (CohenD to – class)) OR (KNN < 50%).

| Feature Rank | Feature name | Feature MDA ranking | AV/SD, [Min,Max] for e1 (+) class | AV/SD, [Min,Max] for non-e1 (-) class | Feature value of tested sample | Sample Cohen Distance to e1 (+) class | Sample Cohen Distance to non-e1 (-) class | K Nearest Neighbor ratio |
| 1 | TESPA1 | 18.8 | 363.5/266, [0,1836] | 3.9/29, [0,423] | 601 | 0.89 | 20.6 | 60/60 |
| 2 | LINC00507 | 16.5 | 234/203, [0,1646] | 1.8/15, [0,288] | 2 | 1.14 | 0.01 | 4/60 |
| 3 | SLC17A7 | 16.3 | 82.8/77, [0,498] | 1.1/11, [0,246] | 1 | 1.06 | 0.009 | 11/60 |
| 4 | LINC00508 | 12.8 | 97.4/108, [0,727] | 0.6/4.8, [0,96] | 1 | 0.89 | 0.08 | 31/60 |
| 5 | KCNIP1 | 12.7 | 1.0/2.8, [0,43] | 310.6/377, [0,2743] | 1 | 0 | 0.82 | 51/60 |
| 6 | NPTX1 | 12.5 | 142/176, [0,1241] | 3/21, [0,301] | 78 | 0.36 | 3.57 | 58/60 |
| 7 | TBR1 | 12.3 | 34/57, [0,413] | 0.4/4.1, [0,67] | 0 | 0.6 | 0.1 | 31/60 |
| 8 | SFTA1P | 12.1 | 108/119, [0,629] | 1.1/15, [0,234] | 2 | 0.89 | 0.06 | 9/60 |

99

Copyright: Dragutin Petkovic unless noted otherwise


RFEX explainers with categorical features and regression

RFEX and categorical variables
• So far we have shown only numerical features (values represent a measured value of a variable, with a natural ordering)
• If one identifies categorical features (values are distinct categories) that need to be shown in the RFEX summary tables, then instead of the feature statistics columns (AV, SD, MIN, MAX) one can list the top 2–3 categories and their % presence in the + and – class samples of the training data, and instead of the Cohen Distance one should use a more appropriate distance measure, which may be application driven. The KNN measure can reflect the % of neighbors of those categories present in the correct class in the training data

RFEX for RF regression
• If RF is used for regression, then instead of the cumulative F1 score one can use RMS or similar error measures computed for the top 2, top 3, … topK ranked features (see the sketch below)

Copyright: Dragutin Petkovic unless noted otherwise

100
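A minimal sketch of the regression variant, assuming hypothetical continuous targets y_reg_train/y_reg_test and the feature ranking from the earlier sketches; RMSE simply replaces the cumulative F1 score:

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.metrics import mean_squared_error

    # Cumulative RMSE for the top-2, top-3, ..., top-K ranked features
    for k in range(2, topK + 1):
        cols = top_features[:k]
        m = RandomForestRegressor(n_estimators=500, random_state=0).fit(X_train[:, cols], y_reg_train)
        rmse = np.sqrt(mean_squared_error(y_reg_test, m.predict(X_test[:, cols])))
        print(f"top {k} features -> cumulative RMSE = {rmse:.3f}")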


RFEX vs. the LIME explainer
(LIME: Ribeiro M, Singh S, Guestrin C: ""Why Should I Trust You?" Explaining the Predictions of Any Classifier", arXiv:1602.04938, 2016)
• RFEX is RF specific, LIME is AI agnostic
• RFEX is a direct explainer – it uses the RF itself to explain the RF; LIME is indirect – it uses a local linear-classifier approximation
• RFEX provides a global MODEL explanation using all the data, and a local SAMPLE explanation depicting how the sample relates to the global MODEL. LIME only provides a local SAMPLE explanation, which may or may not relate to the MODEL (it may exclude information about how the sample relates to the globally most important features)
• RFEX offers tradeoffs between accuracy and the subsets of features used, LIME does not
• RFEX offers information about the most descriptive groups of features (e.g. cliques), LIME does not
• RFEX offers a means to identify "marginal samples" in the training data, e.g. those classified with low confidence, as well as marginal sample features
• RFEX offers a familiar, easy to read, one-page tabular format with basic feature statistics, modeled on common medical test formats
Copyright: Dragutin Petkovic unless noted otherwise 101


Acknowledgements

• We are grateful to Dr. R. Scheuermann and B. Aevermann from JCVI for the data for our case study and their feedback. Work has been partially supported by NIH grant R01 LM005652 and by SFSU COSE Computing for Life Sciences

Copyright: Dragutin Petkovic unless noted otherwise

102


Questions?

Copyright: Dragutin Petkovic unless noted otherwise

103


For some further thinking
• Is explainability important? In some fields (which?) more than in others (marketing vs. health)?
• Should we try to explain current AI algorithms, or develop and use inherently more explainable AI algorithms?
• Use explainable algorithms to audit "black boxes" indirectly?
• Should we enforce that all published and deployed AI systems be explainable?
• If all commercial AI systems are explainable, how can companies make money?
– It seems it is the DATA that is the key, more than the AI algorithms?
• Would explainable AI be easier to hack and defeat?
• Extend the RFEX concept to non-RF AI – it could work as long as there is some form of feature ranking
Copyright: Dragutin Petkovic unless noted otherwise 104


Thank you!

Copyright: Dragutin Petkovic unless noted otherwise

105