getting under the skin of an in silico approach to ... under the sآ getting under the skin of an...
Post on 18-Jun-2020
Embed Size (px)
Getting under the skin of an in silico approach to predicting dermal sensitisation Donna S. Macmillan & Martyn L. Chilton Granary Wharf House, 2 Canal Wharf, Leeds, LS11 5PS
Derek Nexus is an expert toxicity prediction tool established by Lhasa Limited in 1983. It uses structure-activity relationships and reasoning rules developed by Lhasa experts to predict over 70 different toxicity endpoints including genotoxicity, carcinogenicity, skin sensitisation, HERG channel inhibition, hepatotoxicity and irritation. Skin sensitisation has been a key endpoint over the past few years, in part due to the implementation of EU regulation 1223/20091 which prohibits the sale and marketing of any cosmetics and cosmetic ingredients which have been tested on animals, alongside REACH2 and CLP3 which state that non-animal methods must be exhausted prior to considering the use of animal tests. The use of in silico tools such as Derek Nexus is increasingly popular due to their rapid toxicity assessment of chemicals, transparent predictions and access to the wealth of toxicity data and mechanistic information provided.
Skin sensitisation alerts - The performance of the Derek Nexus skin sensitisation alerts from 2014-2018 was assessed using an in-house dataset of 1243 sensitisers and 1300 non-sensitisers (n = 2543) with murine local lymph node assay (LLNA) and/or guinea pig data (conservative call applied when results for both were present). Chemicals activating an alert with a likelihood of equivocal or above were classified as sensitisers. Chemicals activating an alert with a likelihood of improbable or non-alerting chemicals were classified as non-sensitisers. The following metrics were calculated: sensitivity (Se), (TP/[TP+FN]*100); specificity (Sp), (TN/[TN+FP]*100); positive predictivity (PP), (TP/[TP+FP]*100); negative predictivity (NP), (TN/[TN+FN]*100); accuracy (Acc), (TP+TN/[TN+TP+FP+FN]*100).
EC3 prediction model – a k-Nearest Neighbours model was developed based on a curated in-house dataset of over 1000 publicly available, LLNA studies. The model was validated using 103 previously unseen chemicals with LLNA data4.
Integrated testing strategy (ITS) - a previously published dataset of 213 compounds with LLNA data and in chemico/in vitro data (DPRA, n = 194; KeratinoSens, n = 187; LuSens, n = 78; h-CLAT, n = 166; U-SENS, n = 149) was used, with Derek predictions and physicochemical parameters, to develop a decision tree for ITS-15 and ITS-2.
Skin sensitisation alerts in Derek currently perform with an accuracy of 76%, a sensitivity of 80% and a specificity of 72%. The alerts are continually being improved through the refinement of existing alerts and the development of new alerts. Furthermore, new functionality such as the EC3 prediction model can aid users in evaluating potency, and incorporating Derek in an ITS can increase the accuracy compared to other ITS. Overall, Derek can be used to contribute to the replacement, reduction and refinement of the use of animals in research.
Skin sensitisation alerts Derek alerts use structure-activity relationships (SAR) created by Lhasa experts to predict the toxicity of a given chemical. The predictions are supported by a graphical explanation of the SAR, mechanistic rationale, toxicity data of known compounds within the SAR and key references. Public, proprietary and regulatory data are used to build the alerts thereby providing extensive coverage of chemical space. The number and performance of the skin sensitisation alerts has increased steadily from 2014-2018 (Figure 1). The alerts are reviewed regularly, particularly if new public data is sourced, leading to alert refinement and/or the development of new alerts e.g. alerts 867, 878-879 and 882 (Figure 2). When proprietary data is donated, the data are anonymised and used to expand alert coverage, allowing all Derek members to benefit from improved alert predictivity and/or new alerts.
EC3 prediction model validation Moving from qualitative predictions of sensitiser/non-sensitiser to quantitative predictions is a challenging but important area of research. Given the desire and/or regulatory requirement(s) to reduce, refine and replace animal testing, a methodology to quantitatively predict LLNA EC3 values was explored. A k-Nearest Neighbours model, using the weighted mean of EC3s from up to 10 similar compounds in the same mechanistic domain was developed4 (Figure 3).
Query chemical with unknown skin sensitisation potential
Outcome - Detailed overview of skin sensitisation potential of the query chemical including alerting features, toxicity data, mechanistic rationale, EC3 prediction, and how Derek can be used effectively in an ITS
Comparison of integrated testing strategies (ITS)/defined approaches (DA)
It is generally accepted that no in chemico or in vitro assay can be used as a standalone method to replace animal models for the prediction of skin sensitisation potential. The focus has instead turned to combining multiple assays and/or molecular descriptors to derive a more accurate assessment of hazard or risk. These ITS are also known as DA and are key elements within integrated approaches to testing and assessment (IATA) for skin sensitisation, used for regulatory decision-making.
Lhasa’s first approach (ITS-1) used a Derek prediction, assigned the test chemical as in or out of the applicability domain of the in chemico/in vitro assay (based on physicochemical parameters), alongside up to two in chemico/in vitro assays to predict sensitiser/non-sensitiser5 (S/NS). Lhasa’s subsequent, as yet unpublished, ITS-2 is similar to ITS-1 (Figure 6) but (1) utilises even more information from Derek, including alert likelihood and the new negative prediction functionality and (2) additional reactivity properties are also considered when defining the applicability domain. Finally, (3) the hazard (S/NS) is predicted alongside an estimate of GHS category, based on data in the EC3 prediction model. ITS-2 improves significantly upon the specificity of ITS-1, with only a minor reduction in sensitivity (Table 2).
Alert 439 Alert 867
KB n PP (%) n PP (%)
Derek 2018 (unreleased) 101 58 9 100
Derek 2015 125 56 N/A N/A
Derek 2014 79 57 N/A N/A
categories compared to ECETOC’s 4. The majority, 66/103 (64%) were correctly assigned to their GHS category. 26/103 (25%) were predicted as more potent, and 11/103 (11%) were predicted as less potent (Figure 5). As many industries only require distinction between strong and weak sensitisers, a model which correctly assigns the GHS category may be sufficient for most use cases.
Figure 4. GHS and ECETOC categories relative to EC3 values.
Figure 5. Validation of EC3 prediction model using an unseen dataset.
Figure 1. Improvement in Derek performance metrics between 2014-2018.
Table 1. Refinement of alert 439 by separating out vinylic or allylic anisoles.
ITS/DA Acc(%) Se (%)
Sp (%) n Information used
Lhasa ITS-15 86 96 57 213 Physchem properties, in silico tools, in chemico/in vitro assays
Lhasa ITS-26 88 91 80 213 Physchem properties, in silico tools, in chemico/in vitro assays
Urbisch et al7 79 82 72 180 In chemico/in vitro assays
Van der Veen et al8 83 93 64 41 Bayesian QSAR, in chemico/in vitro assays and gene signature
Patlewicz et al9 74 74 74 100 Physchem properties, in silico tools
Takenouchi et al10 84 89 70 139 In silico tools, in chemico/in vitroassays
Table 2. Comparison of ITS-1 and ITS-2 against other published ITS/DA.
Figure 6. General workflow of ITS-1 and ITS-2. New features in ITS-2 are highlighted in bold.
Case study: Refinement of substituted phenol alert
References: (1) European Union. (2013) Off. J. Eur. Union 56, 34–66. (2) http://www.hse.gov.uk/reach/ (3) https://echa.europa.eu/testing-clp (4) Canipa et al (2017) J. Appl. Toxicol. 37, 985–995. (5) Macmillan et al (2016) Regul. Toxicol. Pharmacol. 76, 30–38. (6) Chilton & Macmillan, unpublished. (7) Urbisch et al (2015) Regul. Toxicol. Pharmacol. 71, 337–351. (8) van der Veen et al (2014) Regul. Toxicol. Pharmacol. 69, 371–379. (9) Patlewicz et al (2014) Regul. Toxicol. Pharmacol. 69, 529–545. (10) Takenouchi et al (2015) J. Appl. Toxicol. 35, 1318–1332.
Figure 3. Methodology for predicting skin sensitisation potency based on a k-NN model.
The EC3 prediction model was evaluated using an external validation dataset (n = 103), consisting of LLNA data donated to Lhasa Limited by a number of members for the specific purpose of providing a test set of unseen compounds. The experimental EC3 values in this dataset were compared to the EC3s predicted by the model and were judged as correct when the experimental (Exp) and predicted (Pred) values fell within the same GHS (2 classes) or ECETOC (4 classes) category (Figure 4). 37/103 (36%) were assigned to the correct ECETOC category, 47/103 (46%) were predicted as more potent, and the remaining 19/103 (19%) were predicted as less potent. GHS categories may be less challenging to predict than ECETOC categories as there are only 2 possible
The scope of alert 439, substituted phenol or precursor, was investigated for any improvements to increase predictivity. It was found that substituted phenols produce mixed positive and negative toxicity data in mouse, guinea pig and human assays, whereas a specific sub-class of vinylic or allylic anisoles is consistently positive in the same assays. Consequently, a new alert, with excellent predictivity, was created for these anisoles (alert 867) and alert 439 refined to exclude these, leading to a slight improvement in performance (Table 1).