covid19 drug repository: text-mining the literature in

9
Published online 9 November 2020 Nucleic Acids Research, 2021, Vol. 49, Database issue D1113–D1121 doi: 10.1093/nar/gkaa969 COVID19 Drug Repository: text-mining the literature in search of putative COVID19 therapeutics Dmitry Tworowski , Alessandro Gorohovski , Sumit Mukherjee, Gon Carmi, Eliad Levy, Rajesh Detroja, Sunanda Biswas Mukherjee and Milana Frenkel-Morgenstern * Laboratory of Cancer Genomics and Biocomputing of Complex Diseases, Azrieli Faculty of Medicine, Bar-Ilan University, Henrietta Szold 8, Safed 13195, Israel Received August 25, 2020; Revised October 07, 2020; Editorial Decision October 08, 2020; Accepted November 04, 2020 ABSTRACT The recent outbreak of COVID-19 has generated an enormous amount of Big Data. To date, the COVID- 19 Open Research Dataset (CORD-19), lists 130,000 articles from the WHO COVID-19 database, PubMed Central, medRxiv, and bioRxiv, as collected by Se- mantic Scholar. According to LitCovid (11 August 2020), 40,300 COVID19-related articles are currently listed in PubMed. It has been shown in clinical set- tings that the analysis of past research results and the mining of available data can provide novel oppor- tunities for the successful application of currently ap- proved therapeutics and their combinations for the treatment of conditions caused by a novel SARS- CoV-2 infection. As such, effective responses to the pandemic require the development of efficient ap- plications, methods and algorithms for data naviga- tion, text-mining, clustering, classification, analysis, and reasoning. Thus, our COVID19 Drug Repository represents a modular platform for drug data navi- gation and analysis, with an emphasis on COVID- 19-related information currently being reported. The COVID19 Drug Repository enables users to focus on different levels of complexity, starting from general information about (FDA-) approved drugs, PubMed references, clinical trials, recipes as well as the de- scriptions of molecular mechanisms of drugs’ action. Our COVID19 drug repository provide a most updated world-wide collection of drugs that has been repur- posed for COVID19 treatments around the world. INTRODUCTION The COVID-19 pandemic outbreak has triggered imme- diate reactions from the medical and scientific commu- nities, and has resulted in an explosive growth of novel data regarding possible therapies or therapeutic oppor- tunities (1,2). The COVID-19 data portal (https://www. covid19dataportal.org/) established by the European Com- mission in April, 2020 has facilitated the exchange and shar- ing of COVID-19 research data. One of the first open initia- tives realized with creation of this portal was the develop- ment of the COVID-19 Open Research Dataset (CORD- 19) (2). The CORD-19 (https://www.semanticscholar.org/ cord19) currently lists 130,000 articles from the WHO COVID-19 database, PubMed Central, medRxiv, and bioRxiv, as collected by Semantic Scholar. Another com- prehensive list of COVID-19 databases and journals can be found on the Centres for Disease Control (CDC) library webpage: https://www.cdc.gov/library/researchguides/ 2019novelcoronavirus/databasesjournals.html. According to recent records from LitCovid resource (1), 40 300 COVID19-related articles have been currently listed in PubMed (1). The rapid accumulation of COVID-19 lit- erature requires novel tools for the data collection and or- ganization with efficient navigation capabilities. Such nav- igation capabilities are based on the literature-based dis- covery (LBD) concept (3) and can be achieved by imple- menting text-mining, clustering, and classification meth- ods (1,4–8). Available text and data-mining tools, such as those found at LitCovid (1), PubTator (4,9,10), the iSearch platform (https://icite.od.nih.gov/covid19/search/), Neural- Covidex (https://covidex.ai/) (7), the COVID-19 Data Portal (https://www.covid19dataportal.org/), Carrot/Lingo (https://search.carrot2.org/#/web)(11) and ProtFus (12), efficiently extract target information across articles and other text sources. Using the mentioned tools for text- mining, we have created the COVID-19 Drug Repository. The goal of our COVID-19 Drug Repository was to au- tomatically collect data on drugs used against COVID- 19 around the world and build a structured repository that includes drug descriptions, side effects and available publications. The repository also contains medicine- and pharmacology-oriented data, including annotated informa- tion on (FDA-)approved drugs, therapeutic agents (exper- imental drugs), and drug-like synthetic or natural chemi- * To whom correspondence should be addressed. Tel: +972 722644901; Email: [email protected] The authors wish it to be known that, in their opinion, the first two authors should be regarded as Joint First Authors. C The Author(s) 2020. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Upload: others

Post on 28-May-2022

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: COVID19 Drug Repository: text-mining the literature in

Published online 9 November 2020 Nucleic Acids Research 2021 Vol 49 Database issue D1113ndashD1121doi 101093nargkaa969

COVID19 Drug Repository text-mining the literature insearch of putative COVID19 therapeuticsDmitry Tworowskidagger Alessandro Gorohovski dagger Sumit Mukherjee Gon Carmi Eliad LevyRajesh Detroja Sunanda Biswas Mukherjee and Milana Frenkel-Morgenstern

Laboratory of Cancer Genomics and Biocomputing of Complex Diseases Azrieli Faculty of Medicine Bar-IlanUniversity Henrietta Szold 8 Safed 13195 Israel

Received August 25 2020 Revised October 07 2020 Editorial Decision October 08 2020 Accepted November 04 2020

ABSTRACT

The recent outbreak of COVID-19 has generated anenormous amount of Big Data To date the COVID-19 Open Research Dataset (CORD-19) lists sim130000articles from the WHO COVID-19 database PubMedCentral medRxiv and bioRxiv as collected by Se-mantic Scholar According to LitCovid (11 August2020) sim40300 COVID19-related articles are currentlylisted in PubMed It has been shown in clinical set-tings that the analysis of past research results andthe mining of available data can provide novel oppor-tunities for the successful application of currently ap-proved therapeutics and their combinations for thetreatment of conditions caused by a novel SARS-CoV-2 infection As such effective responses to thepandemic require the development of efficient ap-plications methods and algorithms for data naviga-tion text-mining clustering classification analysisand reasoning Thus our COVID19 Drug Repositoryrepresents a modular platform for drug data navi-gation and analysis with an emphasis on COVID-19-related information currently being reported TheCOVID19 Drug Repository enables users to focus ondifferent levels of complexity starting from generalinformation about (FDA-) approved drugs PubMedreferences clinical trials recipes as well as the de-scriptions of molecular mechanisms of drugsrsquo actionOur COVID19 drug repository provide a most updatedworld-wide collection of drugs that has been repur-posed for COVID19 treatments around the world

INTRODUCTION

The COVID-19 pandemic outbreak has triggered imme-diate reactions from the medical and scientific commu-nities and has resulted in an explosive growth of noveldata regarding possible therapies or therapeutic oppor-

tunities (12) The COVID-19 data portal (httpswwwcovid19dataportalorg) established by the European Com-mission in April 2020 has facilitated the exchange and shar-ing of COVID-19 research data One of the first open initia-tives realized with creation of this portal was the develop-ment of the COVID-19 Open Research Dataset (CORD-19) (2) The CORD-19 (httpswwwsemanticscholarorgcord19) currently lists sim130000 articles from the WHOCOVID-19 database PubMed Central medRxiv andbioRxiv as collected by Semantic Scholar Another com-prehensive list of COVID-19 databases and journals can befound on the Centres for Disease Control (CDC) librarywebpage

httpswwwcdcgovlibraryresearchguides2019novelcoronavirusdatabasesjournalshtml

According to recent records from LitCovid resource (1)40 300 COVID19-related articles have been currently listedin PubMed (1) The rapid accumulation of COVID-19 lit-erature requires novel tools for the data collection and or-ganization with efficient navigation capabilities Such nav-igation capabilities are based on the literature-based dis-covery (LBD) concept (3) and can be achieved by imple-menting text-mining clustering and classification meth-ods (14ndash8) Available text and data-mining tools such asthose found at LitCovid (1) PubTator (4910) the iSearchplatform (httpsiciteodnihgovcovid19search) Neural-Covidex (httpscovidexai) (7) the COVID-19 DataPortal (httpswwwcovid19dataportalorg) CarrotLingo(httpssearchcarrot2orgweb) (11) and ProtFus (12)efficiently extract target information across articles andother text sources Using the mentioned tools for text-mining we have created the COVID-19 Drug Repository

The goal of our COVID-19 Drug Repository was to au-tomatically collect data on drugs used against COVID-19 around the world and build a structured repositorythat includes drug descriptions side effects and availablepublications The repository also contains medicine- andpharmacology-oriented data including annotated informa-tion on (FDA-)approved drugs therapeutic agents (exper-imental drugs) and drug-like synthetic or natural chemi-

To whom correspondence should be addressed Tel +972 722644901 Email milanamorgensternbiuacildaggerThe authors wish it to be known that in their opinion the first two authors should be regarded as Joint First Authors

Ccopy The Author(s) 2020 Published by Oxford University Press on behalf of Nucleic Acids ResearchThis is an Open Access article distributed under the terms of the Creative Commons Attribution License (httpcreativecommonsorglicensesby40) whichpermits unrestricted reuse distribution and reproduction in any medium provided the original work is properly cited

D1114 Nucleic Acids Research 2021 Vol 49 Database issue

Figure 1 The structuremap of the COVID-19 Drugs Repository

cal substances The data was collected and integrated bymethods developed for the lsquoomicsrsquo field (13) in particularchemogenomics (ie chemical genomics) (14ndash17) pharma-cogenomics (18ndash27) genomics and personomics (28ndash30)In addition we made use of a number of chemogenomics(31ndash33) and pharmacogenomics (34ndash36) approaches thatfocused on the repositioning (ie repurposing) of FDAapproved drugs and clinical trials in the treatment ofCOVID-19 All the data collected in the COVID-19 DrugRepository are designed for use by researchers and clin-icians in the field The information cannot be used forself-medication

RESULTS

The COVID-19 Drug Repository structure and technical de-scription

COVID-19 Drug Repository is an open-source modularplatform built on the MySQL server platform comprising15 curated tables The structure of the database presentingthe logical relations between these tables and the data col-lection process are shown in Figures 1 and 2 respectivelyTo ensure consistency between the drug syn drug recipecovid salt drug link drug pubmed Clinicaltrial text mining

and covid drug tables the insertionupdatedeletion of rowsis linked to the covid drug table Each covid drug entryis linked to 15 data fields corresponding to drug dataand a target Most of the data fields (ie ACC id UNIICAS1 CAS2 CAS3 and PubChem cid) are hyperlinked toother databases (ie DrugBank (3738) ClinicalTrialsgovPubChem (39) IUPHARBPC (40) and Chemical Ab-stracts Service (4142) (Figure 1 Figure 2 Table 1) Eachcovid recipe entry is associated with 12 data fields includ-ing drug formulation (recipe) citing the country of manu-facture FDA-approved drugs guidelines etc The COVID-19 Drug Repository supports text query inputs using thesearch box on the homepage (Figure 3) The MyISAMengine (43) was implemented to support the FULLTEXTsearch functionality with the lsquoutf8rsquo DEFAULT CHARSETDetailed instructions on the browsing and search toolsfound in the database were provided below and can also befound on the database homepage (under lsquoHelprsquo option) Fi-nally the database update process is semi-automatic as fol-lows (a) selection of potential COVID19 therapeutic sub-stances found in research articles is manual (b) updatingand adding new records to COVID19 Drug Repositorydatabase is fully automated using Perl scripts (c) hyperlinksto PubMed and other sources and maps are generated au-tomatically by python scripts

Nucleic Acids Research 2021 Vol 49 Database issue D1115

Figure 2 The data collection workflow implemented in the COVID-19 Drugs Repository

The database search query syntax

The database search is case-insensitive Simple queries caninclude either the full or partial drug name (lsquoDrug searchrsquobox) Advanced queries can be constructed by combiningidentifiers of the various databases (Table 1) andor con-cepts (eg compound class viral target etc) The lsquocom-pound classrsquo for example includes the following termsAntibody | Metabolite | Natural product | Inorganic | Pep-tide | Synthetic organic while the lsquoviral targetrsquo categorycomprises the acronym of viral name BCV | BtCoV-(HKU3 | HKU5 | SHC014) | HcoV-(229E | NL63 | OC43)| (MERS|SARS)-CoV(-2) | CcoV | FIPV | PEDV | SARS |TGEV | IBV | MHV A query combination can be builtusing delimiters and plus + []amp[][][] (the [] indicates that the lsquospacersquo delimiter maybe used in the query string) for example

Query Oseltamivir Ritonavir DB0008934PMID32145363 CID37542 118390ndash30-0

The search results page shows all relevant instances as-sociated with a query drug The HTML page is generatedwith hyperlinks to all databases (Table 1) associated withthe drug of interest Alternatively the information can alsobe accessed by selecting a drug name from the menu in theselection field (see the Database lsquoHelprsquo page httpcovid19mdbiuacil) The Treatment Options section enablesaccess to a detailed description of the query drug

COVID19 Drug Repository web interface

On the menu bar there are seven buttons (BROWSE COL-LECTIONS DOWNLOADS NEWS HELP ABOUT USand CONTACT on the right and the lsquoCancer Genomicsamp BioComputing of complex Diseases labrsquo logo on theleft) that users can visit with a single click Clicking theBROWSE button returns the user to the homepage (Fig-

ure 3) which displays introductory and technical infor-mation about the Repository Clicking the lsquoVIEW ALLDRUGSrsquo button or on the lsquoCOLLECTIONSrsquo submenuitems (eg lsquoAnti-viral drugsrsquo) brings the user to the drugs-listing webpage (Figure 3) where one can see all query drugspresented there By entering the first letters of a query drugname in the lsquoDrug searchrsquo field (Figure 3) users can selecttheir drug of interest from the dropdown menu (Figure 3)

Features and functionality

The Repository is a COVID19-targeted collection (short-list) of sim460 items representing 184 approved drugs 384investigated therapeutic agents and 76 drug-like syntheticor natural chemical substances The main focus of therepository is cross-referencing to PubMed articles linkingthese drugs with multiple research sources mapping as-sociations between the drugs and COVID19-related con-cepts textdata-mining clustering and visualization Fur-thermore the Biopython collection of modules (44) inparticular the Entrez BioEntrez module (44) was imple-mented into our database framework so as to enable fastdata retrieval via efficient command-line interactions withall NCBI resources and databases (45) sub-divided into sixcategories Literature Genes Proteins Genomes Geneticsand Chemicals A toolset of Perl and Python scripts hadbeen created for specific tasks such as (a) automatic gen-eration of links between the Repository and other sources(b) collection of datareferences and creation of work tablesand (c) mapping associationsconcepts with visualizationoptions being realized via external tools

COVID19 drugs search strategy

Recent information on approved drugs and therapeuticcombinations thereof considered useful for the treatment

D1116 Nucleic Acids Research 2021 Vol 49 Database issue

Table 1 Databases linked with the COVID-19 Drugs Repository

Database Identifier Databaseresource Links

ACC id () COVID19 Drug Repository httpcovid19mdbiuacilUNII ID Unique Ingredient Identifier httpswwwfdagovindustryfda-resources-data-standardsfdas-global-

substance-registration-systemCAS ID () Chemical Abstracts Service httpswwwcasorgPubChem CID PubChem Compound httpspubchemncbinlmnihgovPubChem SID PubChem Substance httpspubchemncbinlmnihgovPubMed ID PubMed httpspubmedncbinlmnihgovDrugBank ID DrugBank httpswwwdrugbankcaIUPHAR ID IUPHARBPS httpsiupharorgClinical Trials ID ClinicalTrialsgov httpsclinicaltrialsgov

of SARS-CoV-2 infection has been reported at Clinical-Trialsgov and PubMed Experimental therapeutic agentsdrug-like synthetic or natural chemical substances have alsobeen studied in the context of COVID-19 as reported in thePubMed database To extend the coverage of our databasewe reviewed articles from PubMed to extract COVID-19-related concepts and keywords Specifically using lsquodrugrepositioningrsquo lsquodrug repurposingrsquo lsquoCOVID19rsquo and lsquocoro-navirus infectionsdrug therapyrsquo as query keywords wereached PubMed publications with titles andor abstractscontaining combinations of these keywords Next we cre-ated another pattern of queries to search target informa-tion at PubMed and using Google and Google Scholar Thepattern was expressed as a pseudo-Regular Expression (46)where text in uppercase letters denotes variable nameslsquo((DRUG NAME)|(DRUG ALIAS)) ((sars)|(mers)|(corona covid)) (patients)rsquoExamples of the multi-step search queries are as followslsquocamostat corona covidrsquo rarr [output 1 informationrefere

nces] rarr [therapeutic agents]lsquocamostat sarsrsquo rarr [output 2 informationreferences] rarr [therapeutic agents]lsquocamostat sars patientsrsquo rarr [output 3 informationreferences] rarr [therapeutic agents]lsquoONO-3403 sars patientsrsquo rarr [output 4 informationreferences] rarr [therapeutic agents]

Using this search strategy we found sim100 substanceswith activities associated with COVID-19 The queries andthe patterns used and the information obtained daily arethe main sources for Repository updates

All lsquoactiversquo chemical entitiessubstances (ie those withdemonstrated or proposed therapeutic potential) were col-lected and linked via their identifiers in CAS PubChemIUPHAR etc Furthermore we adopted the web-basedtext clustering engine Carrot2 (47) for visualization ofpair-wise lsquodrug-COVID-19 conceptrsquo associations found inPubMed abstracts for each pair

In this version of our database COVID19 drugs weremapped to a dictionary of 21 terms related to conceptsof lsquoCOVID-19rsquo eg lsquoviral infectionsrsquo lsquorespiratory diseasesrsquolsquoinflammatory cellrsquo lsquocoronavirus pneumoniarsquo etc (Sup-plemental Table S1) These terms are the most frequentwordscombinations clustered around the central wordssuch as lsquovirusrsquo lsquoinfectionrsquo lsquoinflammationrsquo lsquopneumoniarsquolsquolungsrsquo To create the conceptsrsquo dictionary a variety ofclusters were generated by experimentation with differ-ent hierarchical clustering algorithms applied to the col-lection of PubMed titlesabstracts Links to all PubMed

abstracts associated with these lsquodrug-conceptrsquo pairs weregenerated and enumerated using Python scripts PubMedsearch queries were created according to PubMed querysyntax (48) and MeSH terms (49) The automatically gen-erated tables (available in the lsquoDOWNLOADSrsquo section) listthe number of retrieved PubMed publications correspond-ing to each lsquodrug-COVID-19 conceptrsquo pair These num-bers are hyperlinked with the corresponding PubMed pub-lications Links to references and the Carrot2 text cluster-ing and visualization tool can be updated on a regular ba-sis Such updates are necessary as the web and PubMeddatabase are constantly expanding with new references andsites appearing daily All desired data can be downloadedfrom the COVID19 Drug Repository website (link) as Exceltables containing the list of keywords (ie the lsquodictionaryrsquo)used for text-mining and mapping In subsequent versionsof the database users will be able to modify the list or intro-duce additional concepts With this simple mapping toolone can discover and visualize new concepts and associa-tions that would not otherwise be found

Drug repurposing sources

Currently there are 384 mapped drug names mentionedin 960 COVID-19 clinical studies (data retrieved on Au-gust 15 2020) with at least 1 drug intervention (Table2) None of these drugs are novel Rather they exemplifya lsquodrug repurposingrepositioningrsquo approach (263450ndash52) Recently numerous COVID19-specific web pages andchemical libraries have been created by different researchorganizations (CAS IUPHAR (5354) ChEMBL Open-Data Portal (httpsopendatancatsnihgovcovid19indexhtml) etc) and companies (MedChem Express) and usedfor the high throughput screening against SARS-CoV-2 in-fection (55ndash63) All these molecular libraries and collections(Figure 2 Table 2) are being used in our data collection pro-cess (Figure 2) and listed in the Repository web page (lsquoUse-ful Linksrsquo)

Drug-gene interaction networks

The BiopythonEntrez-based Python command-line script(as discussed in the Features and functionality section) wascreated to access the NCBI Gene database (45) and to au-tomatically retrieve human or microbial (and in particularviral) genes associated with a given list of drugs or chemicalsubstances The output (Figure 4) provides a list of geneswith a short description of the biological role associatedwith each gene product in the output list

Nucleic Acids Research 2021 Vol 49 Database issue D1117

Figure 3 Web interface of the COVID19 Drug Repository Selection of the drug of interest (eg Favipiravir) from the dropdown menulist

Those genes associated with a set of drugs can be anal-ysed clustered or served as input for building lsquodrug-genersquonetworks and then visualized using external programs Asa working example protein-protein association networkswere built for output gene sets using the STRINGv11database (64) Moreover the application programming in-terface (API) implemented in the STRINGv11 databaseenables efficient interaction of external databases with theSTRING visualization and analysis tools (64) For exam-ple visualization of the set of genes associated with the

PDE5A inhibitor sildenafil a vasodilator agent revealedother interesting targets (Figure 4) such as the enzymePDE6G Both enzymes are active in the lungs (6566)Further network analysis of available data showed thatPDE5APDE6G inhibition by sildenafil in lung blood ves-sels can trigger different anti-inflammatory pathways

To build a network (Figure 5) we extracted additionalinformation from the literature and external databases Inthe PDE5A and PDE6G protein expression summaries ob-tained from the Human Protein Atlas (67) the PGE6G

D1118 Nucleic Acids Research 2021 Vol 49 Database issue

Table 2 COVID19-specific collections and chemical libraries

ResourceDataset (title catalogue orcategory) Drugs active agents Links

ClinicalTrialsgov COVID-19 studies bymapped drug intervention

384 httpsclinicaltrialsgovct2covid viewdrugs

CAS COVID-19 AntiviralCandidate CompoundsDataset

sim50000 httpswwwcasorgcovid-19-antiviral-compounds-dataset

IUPHAR Ligands relevant toSARS-CoV-2(COVID-19

64 items httpswwwguidetopharmacologyorgGRACCoronavirusForward

ChEMBL ChEMBL 27 SARS CoV-2Release 8 datasets

133 selected(IC50EC50 betterthan 10 M)

httpchemblblogspotcom202005chembl27-sars-cov-2-releasehtml

StructuralBioinformatics andNetwork BiologyGroup (SBNB)

Drugs from sthe COVID19literature

307 httpssbnbirbbarcelonaorgcovid19

OpenData Portal NCATS Anti-infectivesCollection

740 httpsopendatancatsnihgovcovid19indexhtml

MedChemExpress SARS-CoV List of Drugs 76 httpswwwmedchemexpresscomTargetsSARS-CoVhtml

MedChemExpress Anti-Virus CompoundLibrary

512 httpswwwmedchemexpresscomscreeningAnti-virus Compound Libraryhtml

MedChemExpress Anti-COVID-19 CompoundLibrary

1552 httpswwwmedchemexpresscomscreeninganti-covid-19-compound-libraryhtml

Figure 4 Example output of the DruGeNetwork python script list of genes (column lsquo1rsquo) with a short descriptionannotation of the underlying biologicalmechanism associated with each gene (column lsquo1rsquo) The output is generated for the query keywords sent to the NCBI Gene resources ldquosildenafilrdquo ANDldquoHomo sapiensrdquo[porgn] AND alive[prop]

gene is categorized as lsquoGroup enrichedrsquo in natural killer(NK) cells according to consensus transcriptomics dataNK cells acting as cytotoxic lymphocytes are involved ininnate immune system regulation including rapid cytokineproduction in the presence of virus-infected cells (67) TheNK-mediated antiviral immune response is associated withthe NCR1 gene that encodes the natural cytotoxicity re-ceptor 1 (68) In the next step both the PDE6 and NCR1genes were detected in the Chronic Obstructive PulmonaryDisease (COPD)-related Gene Set using Harmonizome onthe collection of lsquoomicsrsquo Big Data sets (69) This geneset was deposited in the GEO Signatures of DifferentiallyExpressed Genes for Diseases under the name lsquoCOPD-Chronic Obstructive Pulmonary Disease Muscle-Striated(Skeletal)-Diaphragm (MMHCC) GSE47rsquo The data showthat the expression of the PDE6G gene is significantly in-creased whereas decreased expression was reported for theNCR1 gene

Therefore in the context of drug repurposing strate-gies it is reasonable to expect that sildenafil will be use-

ful for the treatment of COVID19 complications Accord-ingly two recent clinical studies (ClinicalTrialsgov identi-fiers NCT04304313 and NCT04489446) were initiated tostudy the efficacy and safety of sildenafil in patients withCOVID-19 (NCT04304313) and to assess the role of silde-nafil in improving oxygenation among hospitalised patients(NCT04489446)

Identification of drug target genes

We extracted target gene information for each putativeCOVID-19 drug from the Therapeutic Target Database(70) and by text-mining of the literature at PubMed Tounderstand the expression profile of drug target genes iden-tified in this manner in the COVID-19 infection we per-formed transcriptome analysis of infected bronchial epithe-lial cells For this we retrieved raw RNA-sequencing datafor SARS-CoV-2-infected bronchial epithelial cells from thesequence read archive (SRA) database under accession noPRJNA615032 (71) The FASTQ files were mapped and

Nucleic Acids Research 2021 Vol 49 Database issue D1119

Figure 5 The Drug-Gene local Network built using the output for sildenafil (A) Further functional enrichment reveals the group of genes involved in theinnate immunity and inflammatory processes Moreover VDR (receptors activated by calcitriol) and the HIF1A-STAT3 path suppressed by resveratrolare also involved in the local anti-inflammatory pharmacological network thereby suggesting the ldquosildenafil-resveratrol-vitamin D3rdquo drug combinationto treat the COVID-19 complications

aligned to the hg38 reference genome using STAR (72)Differentially expressed genes were identified using edgeR(73) with parameters set at 20-fold change and lt005 P-value cut-off We thus found target genes for 41 drugs fromour database (eg sildenafil as discussed in previous para-graph) which are significantly differentially expressed dur-ing COVID-19 infection These drug-gene pairs are given inthe Supplemental Table S2

COVID19 Drug Repository server hardware and software re-quirements

The COVID19 Drug Repository server was built on anApache web server and deployed on the RedHat EnterpriseLinux (RHEL) 74 server of an Intel(R) Xeon(R) CPU E5ndash2620 v2 210GHz and 32GB RAM unit The COVID19Drug Repository has a code base and infrastructure simi-lar to that of the ChiTaRS database (7475) The COVID19Drug Repository website is compatible with modern webbrowsers (such as Chrome Firefox Microsoft Edge Operaand Safari) provided that JavaScript is enabled We recom-mend using the latest release version of these web browsersfor optimal rendering

Downloads

The COVID19 Drug Repository not only provides extendedlsquoSearchrsquo options but also offers the possibility to downloadall database tables and data sets in a user-friendly mannerThe repository is available at httpcovid19mdbiuacil

CONCLUSIONS AND FUTURE PLANS

The COVID19 Drug Repository maps data from chemoge-nomics and pharmacogenomics studies and provides viraland human genomics and proteomics information on ap-proved drugs and other therapeutics The database enablesthe user to focus on different levels of complexity startingfrom general information clinical trials and formulationsand increasing the resolution to the level of molecular mech-anisms of drug action Therefore the database can serve as anavigation and recommendation tool both for research andfor healthcare purposes Future plans include the followingadditions to the database (a) continuous updating with newdata on approved drugs experimental drugs and drug-likesynthetic or natural chemical substances (b) automatic ma-chine learning and text-mining-based annotation and visu-alization of lsquoMode of Actionrsquo (MoA) data as well as lsquoDrug-Genersquo and lsquoDrug-Symptomrsquo networks and (c) incorpora-tion of the DrugsNGS analysis tools (lsquotranscriptomicsrsquo) toaccelerate the translation of knowledge for use as personal-ized medicine for COVID19 patients

DATA AVAILABILITY

httpcovid19mdbiuacil

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online

D1120 Nucleic Acids Research 2021 Vol 49 Database issue

ACKNOWLEDGEMENTS

EL has been working as a volunteer at the Frenkel-Morgensternrsquos lab in the COVID-19 related project

FUNDING

MF-M is supported by the Kamin grant of Israel In-novation Authority (66824 2019-2021) and COVID-19Data Science Institute (DSI) grant Bar-Ilan University(247017 2020) SM is supported by the PBC fellowshipfor outstanding postdoctoral researchers from India by theCouncil of Higher Education Israel (VaTaT 104-2019 atM F-M lab)Conflict of interest statement Authors declare no conflict ofinterest

REFERENCES1 ChenQ AllotA and LuZ (2020) Keep up with the latest

coronavirus research Nature 579 1932 Lu WangL LoK ChandrasekharY ReasR YangJ EideD

FunkK KinneyR LiuZ MerrillW et al (2020) CORD-19 TheCovid-19 Open Research Dataset arXiv doihttpsarxivorgabs200410706v2 10 July 2020 preprint not peerreviewed

3 SwansonDR (2008) In BruzaP and WeeberM (eds)Literature-based Discovery Springer Berlin Heidelberg BerlinHeidelberg pp 3ndash11

4 WeiCH KaoHY and LuZ (2013) PubTator a web-based textmining tool for assisting biocuration Nucleic Acids Res 41W518ndashW522

5 RobertsK AlamT BedrickS Demner-FushmanD LoKSoboroffI VoorheesE WangLL and HershWR (2020)TREC-COVID rationale and Structure of an Information RetrievalShared Task for COVID-19 J Am Med Inform Assoc 271431ndash1436

6 CrichtonG BakerS GuoY and KorhonenA (2020) Neuralnetworks for open and closed Literature-based Discovery PLoS One15 e0232891

7 ZhangE GuptaN NogueiraR ChoK and LinJ (2020) Rapidlydeploying a neural search engine for the covid-19 open researchdataset Preliminary thoughts and lessons learned arXiv doihttpsarxivorgabs200405125 10 April 2020 preprint not peerreviewed

8 TarasovaO IvanovS FilimonovDA and PoroikovV (2020) Dataand text mining help identify key proteins involved in the molecularmechanisms shared by SARS-CoV-2 and HIV-1 Molecules 25 2944

9 WeiCH HarrisBR LiD BerardiniTZ HualaE KaoHYand LuZ (2012) Accelerating literature curation with text-miningtools a case study of using PubTator to curate genes in PubMedabstracts Database (Oxford) 2012 bas041

10 WeiCH AllotA LeamanR and LuZ (2019) PubTator centralautomated concept annotation for biomedical full text articlesNucleic Acids Res 47 W587ndashW593

11 OsinskiS and WeissD (2004) Conceptual Clustering Using LingoAlgorithm Evaluation on Open Directory Project Data InKłopotekMA WierzchonST and TrojanowskiK (eds) IntelligentInformation Processing and Web Mining Advances in Soft ComputingSpringer Berlin Heidelberg Vol 25 pp 369ndash377

12 TagoreS GorohovskiA JensenLJ and Frenkel-MorgensternM(2019) ProtFus a comprehensive method characterizingprotein-protein interactions of fusion proteins PLoS Comput Biol15 e1007239

13 MeyerUA ZangerUM and SchwabM (2013) Omics and drugresponse Annu Rev Pharmacol Toxicol 53 475ndash502

14 ZhengXF and ChanTF (2002) Chemical genomics a systematicapproach in biological research and drug discovery Curr Issues MolBiol 4 33ndash43

15 EngelbergA (2004) Iconix Pharmaceuticals Incndashremoving barriersto efficient drug discovery through chemogenomicsPharmacogenomics 5 741ndash744

16 MagarinosMP CarmonaSJ CrowtherGJ RalphSARoosDS ShanmugamD Van VoorhisWC and AgueroF (2012)TDR targets a chemogenomics resource for neglected diseasesNucleic Acids Res 40 D1118ndashD1127

17 WangL MaC WipfP LiuH SuW and XieXQ (2013)TargetHunter an in silico target identification tool for predictingtherapeutic potential of small organic molecules based onchemogenomic database AAPS J 15 395ndash406

18 TributO LessardY ReymannJM AllainH and Bentue-FerrerD(2002) Pharmacogenomics Med Sci Monit 8 RA152ndashRA163

19 HoodE (2003) Pharmacogenomics the promise of personalizedmedicine Environ Health Perspect 111 A581ndashA589

20 WeinshilboumR and WangL (2004) Pharmacogenomics bench tobedside Nat Rev Drug Discov 3 739ndash748

21 ServiceRF (2005) Pharmacogenomics Going from genome to pillScience 308 1858ndash1860

22 BlakeCA and SobelBE (2008) Progress in pharmacogenomics andits promise for medicine Exp Biol Med (Maywood) 2331482ndash1483

23 WangL (2010) Pharmacogenomics a systems approach WileyInterdiscip Rev Syst Biol Med 2 3ndash22

24 BurtT and DhillonS (2013) Pharmacogenomics in early-phaseclinical development Pharmacogenomics 14 1085ndash1097

25 MusaA TripathiS DehmerM Yli-HarjaO KauffmanSA andEmmert-StreibF (2019) Systems pharmacogenomic landscape ofdrug similarities from LINCS data Drug Association Networks SciRep 9 7849

26 KalamaraA TobalinaL and Saez-RodriguezJ (2018) How to findthe right drug for each patient Advances and challenges inpharmacogenomics Curr Opin Syst Biol 10 53ndash62

27 KirkRJ HungJL HornerSR and PerezJT (2008) Implicationsof pharmacogenomics for drug development Exp Biol Med(Maywood) 233 1484ndash1497

28 Sharifi-NoghabiH ZolotarevaO CollinsCC and EsterM (2019)MOLI multi-omics late integration with deep neural networks fordrug response prediction Bioinformatics 35 i501ndashi509

29 PulleyJM RhoadsJP JeromeRN ChallaAP ErregerKBJolyMM LavieriRR PerryKE ZaleskiNM Shirey-RiceJKet al (2020) Using what we already have uncovering new drugrepurposing strategies in existing omics data Annu Rev PharmacolToxicol 60 333ndash352

30 SindelarRD (2013) Genomics other ldquoOmicrdquo technologiespersonalized medicine and additional biotechnology-relatedtechniques In CrommelinDJA SindelarRD and MeibohmB(eds) Pharmaceutical Biotechnology Fundamentalsand ApplicationsSpringer International Publishing NY pp 179ndash221

31 Uran LandaburuL BerensteinAJ VidelaS MaruPShanmugamD ChernomoretzA and AgueroF (2020) TDRTargets 6 driving drug discovery for human pathogens throughintensive chemogenomic data integration Nucleic Acids Res 48D992ndashD1005

32 FengZ ChenM LiangT ShenM ChenH and XieXQ (2020)Virus-CKB an integrated bioinformatics platform and analysisresource for COVID-19 research Brief Bioinform bbaa155

33 Duran-FrigolaM PaulsE Guitart-PlaO BertoniM AlcaldeVAmatD Juan-BlancoT and AloyP (2020) Extending thesmall-molecule similarity principle to all levels of biology with thechemical checker Nature Biotechnology 38 1087ndash1096

34 SinghN DecrolyE KhatibAM and VilloutreixBO (2020)Structure-based drug repositioning over the human TMPRSS2protease domain search for chemical probes able to repressSARS-CoV-2 Spike protein cleavages Eur J Pharm Sci 153105495

35 GordonDE JangGM BouhaddouM XuJ ObernierKWhiteKM OrsquoMearaMJ RezeljVV GuoJZ SwaneyDL et al(2020) A SARS-CoV-2 protein interaction map reveals targets fordrug repurposing Nature 583 459ndash468

36 TakahashiT LuzumJA NicolMR and JacobsonPA (2020)Pharmacogenomics of COVID-19 therapies NPJ Genom Med 5 35

37 WishartDS KnoxC GuoAC ShrivastavaS HassanaliMStothardP ChangZ and WoolseyJ (2006) DrugBank acomprehensive resource for in silico drug discovery and explorationNucleic Acids Res 34 D668ndashD672

38 WishartDS FeunangYD GuoAC LoEJ MarcuAGrantJR SajedT JohnsonD LiC SayeedaZ et al (2018)

Nucleic Acids Research 2021 Vol 49 Database issue D1121

DrugBank 50 a major update to the DrugBank database for 2018Nucleic Acids Res 46 D1074ndashD1082

39 KimS ChenJ ChengT GindulyteA HeJ HeS LiQShoemakerBA ThiessenPA YuB et al (2019) PubChem 2019update improved access to chemical data Nucleic Acids Res 47D1102ndashD1109

40 ArmstrongJF FaccendaE HardingSD PawsonAJSouthanC SharmanJL CampoB CavanaghDRAlexanderSPH DavenportAP et al (2020) The IUPHARBPSGuide to PHARMACOLOGY in 2020 extendingimmunopharmacology content and introducing the IUPHARMMVGuide to MALARIA PHARMACOLOGY Nucleic Acids Res 48D1006ndashD1021

41 HuffenbergerMA and WigingtonRL (1975) Chemical AbstractsService approach to management of large data bases J Chem InfComput Sci 15 43ndash47

42 WeisgerberDW (1997) Chemical Abstracts Service ChemicalRegistry System history scope and impacts J Am Soc Inf Sci 48349ndash360

43 WideniusM AxmarkD and ArnoK (2002) In MySQL ReferenceManual Documentation From the Source OrsquoReilly Media Inc

44 CockPJ AntaoT ChangJT ChapmanBA CoxCJ DalkeAFriedbergI HamelryckT KauffF WilczynskiB et al (2009)Biopython freely available Python tools for computational molecularbiology and bioinformatics Bioinformatics 25 1422ndash1423

45 CoordinatorsNR (2016) Database resources of the National Centerfor Biotechnology Information Nucleic Acids Res 44 D7ndashD19

46 ThompsonK (1968) Programming techniques regular expressionsearch algorithm Commun ACM 11 419ndash422

47 StefanowskiJ and WeissD (2003) Carrot2 and Language Propertiesin Web Search Results Clustering In MenasalvasE SegoviaJ andSzczepaniakPS (eds) Advances in Web Intelligence Springer BerlinHeidelberg pp 240ndash249

48 SayersE (2010) E-utilities Quick Start EntrezProgramming UtilitiesHelp [Internet] Bethesda MD National Center for BiotechnologyInformation (US) httpwwwncbinlmnihgovbooksNBK25500

49 LipscombCE (2000) Medical subject headings (MeSH) Bull MedLibr Assoc 88 265

50 SahuNU and KharkarPS (2016) Computational DrugRepositioning a lateral approach to traditional drug discovery CurrTop Med Chem 16 2069ndash2077

51 SamE and AthriP (2019) Web-based drug repurposing tools asurvey Brief Bioinform 20 299ndash316

52 DonnerY KazmierczakS and FortneyK (2018) Drug repurposingusing deep embeddings of gene expression profiles Mol Pharm 154314ndash4325

53 AlexanderSPH ArmstrongJF DavenportAP DaviesJAFaccendaE HardingSD Levi-SchafferF MaguireJJPawsonAJ SouthanC et al (2020) A rational roadmap forSARS-CoV-2COVID-19 pharmacotherapeutic research anddevelopment IUPHAR Review 29 Br J Pharmacol 1774942ndash4966

54 FaccendaE ArmstrongJ DavenprotA HardingS PawsonASouthanC and DaviesJ (2020) Coronavirus InformationIUPHARBPS Guide to Pharmacologyhttpswwwguidetopharmacologyorgcoronavirusjsp

55 JeonS KoM LeeJ ChoiI ByunSY ParkS ShumD andKimS (2020) Identification of antiviral drug candidates againstSARS-CoV-2 from FDA-approved drugs Antimicrob AgentsChemother 64 e00819-20

56 RivaL YuanS YinX Martin-SanchoL MatsunagaNBurgstaller-MuehlbacherS PacheL De JesusPP HullMVChangM et al (2020) A large-scale drug repositioning survey forSARS-CoV-2 antivirals bioRxiv doihttpsdoiorg10110120200416044016 17 April 2020 preprintnot peer reviewed

57 BrimacombeKR ZhaoT EastmanRT HuX WangKBackusM BaljinnyamB ChenCZ ChenL EicherT et al (2020)An OpenData portal to share COVID-19 drug repurposing data inreal time bioRxiv doi httpsdoiorg10110120200604135046 05June 2020 preprint not peer reviewed

58 MirabelliC WotringJW ZhangCJ McCartySM FursmidtRFrumT KadambiNS AminAT OrsquoMearaTR PrettoCD et al(2020) Morphological cell profiling of SARS-CoV-2 infectionidentifies drug repurposing candidates for COVID-19 bioRxiv doi

httpsdoiorg10110120200527117184 27 May 2020 preprintnot peer reviewed

59 TangH AbouleilaY SiL Ortega-PrietoAM MummeryCLIngberDE and MashaghiA (2020) Human organs-on-chips forvirology Trends Microbiol 28 934ndash946

60 WestonS ColemanCM HauptR LogueJ MatthewsK LiYReyesHM WeissSR and FriemanMB (2020) Broadanti-coronaviral activity of FDA approved drugs againstSARS-CoV-2 J Virol 94 e01218-20

61 TouretF GillesM BarralK NougairedeA van HeldenJDecrolyE de LamballerieX and CoutardB (2020) In vitroscreening of a FDA approved chemical library reveals potentialinhibitors of SARS-CoV-2 replication Sci Rep 10 13093

62 EllingerB BojkovaD ZalianiA CinatlJ ClaussenCWesthausS ReinshagenJ KuzikovM WolfM and GeisslingerG(2020) Identification of inhibitors of SARS-CoV-2 in-vitro cellulartoxicity in human (Caco-2) cells using a large scale drug repurposingcollection ResearchSquare doi1021203rs3rs-23951v1

63 HeiserK McLeanPF DavisCT FogelsonB GordonHBJacobsonP HurstB MillerB AlfaRW EarnshawBA et al(2020) Identification of potential treatments for COVID-19 throughartificial intelligence-enabled phenomic analysis of human cellsinfected with SARS-CoV-2 bioRxiv doihttpsdoiorg10110120200421054387 23 April 2020 preprintnot peer reviewed

64 SzklarczykD GableAL LyonD JungeA WyderSHuerta-CepasJ SimonovicM DonchevaNT MorrisJH BorkPet al (2019) STRING v11 protein-protein association networks withincreased coverage supporting functional discovery in genome-wideexperimental datasets Nucleic Acids Res 47 D607ndashD613

65 NikolovaS GuentherA SavaiR WeissmannN GhofraniHAKonigshoffM EickelbergO KlepetkoW VoswinckelRSeegerW et al (2010) Phosphodiesterase 6 subunits are expressedand altered in idiopathic pulmonary fibrosis Respir Res 11 146

66 HemnesAR ZaimanA and ChampionHC (2008) PDE5Ainhibition attenuates bleomycin-induced pulmonary fibrosis andpulmonary hypertension through inhibition of ROS generation andRhoARho kinase activation Am J Physiol Lung Cell MolPhysiol 294 L24ndashL33

67 UhlenM FagerbergL HallstromBM LindskogC OksvoldPMardinogluA SivertssonA KampfC SjostedtE AsplundAet al (2015) Proteomics Tissue-based map of the human proteomeScience 347 1260419

68 PessinoA SivoriS BottinoC MalaspinaA MorelliLMorettaL BiassoniR and MorettaA (1998) Molecular cloning ofNKp46 a novel member of the immunoglobulin superfamily involvedin triggering of natural cytotoxicity J Exp Med 188 953ndash960

69 RouillardAD GundersenGW FernandezNF WangZMonteiroCD McDermottMG and MarsquoayanA (2016) Theharmonizome a collection of processed datasets gathered to serveand mine knowledge about genes and proteins Database (Oxford)2016 baw100

70 WangY ZhangS LiF ZhouY ZhangY WangZ ZhangRZhuJ RenY TanY et al (2020) Therapeutic target database 2020enriched resource for facilitating research and early development oftargeted therapeutics Nucleic Acids Res 48 D1031ndashD1041

71 Blanco-MeloD Nilsson-PayantBE LiuWC UhlSHoaglandD MoslashllerR JordanTX OishiK PanisM SachsDet al (2020) Imbalanced host response to SARS-CoV-2 drivesdevelopment of COVID-19 Cell 181 1036ndash1045

72 DobinA DavisCA SchlesingerF DrenkowJ ZaleskiC JhaSBatutP ChaissonM and GingerasTR (2013) STAR ultrafastuniversal RNA-seq aligner Bioinformatics 29 15ndash21

73 RobinsonMD McCarthyDJ and SmythGK (2010) edgeR aBioconductor package for differential expression analysis of digitalgene expression data Bioinformatics 26 139ndash140

74 BalamuraliD GorohovskiA DetrojaR PalandeVRaviv-ShayD and Frenkel-MorgensternM (2019) ChiTaRS 50 thecomprehensive database of chimeric transcripts matched withdruggable fusions and 3D chromatin maps Nucleic Acids Res 48D825ndashD834

75 Frenkel-MorgensternM GorohovskiA LacroixV RogersMIbanezK BoullosaC Andres LeonE Ben-HurA andValenciaA (2013) ChiTaRS a database of human mouse and fruitfly chimeric transcripts and RNA-sequencing data Nucleic AcidsRes 41 D142

Page 2: COVID19 Drug Repository: text-mining the literature in

D1114 Nucleic Acids Research 2021 Vol 49 Database issue

Figure 1 The structuremap of the COVID-19 Drugs Repository

cal substances The data was collected and integrated bymethods developed for the lsquoomicsrsquo field (13) in particularchemogenomics (ie chemical genomics) (14ndash17) pharma-cogenomics (18ndash27) genomics and personomics (28ndash30)In addition we made use of a number of chemogenomics(31ndash33) and pharmacogenomics (34ndash36) approaches thatfocused on the repositioning (ie repurposing) of FDAapproved drugs and clinical trials in the treatment ofCOVID-19 All the data collected in the COVID-19 DrugRepository are designed for use by researchers and clin-icians in the field The information cannot be used forself-medication

RESULTS

The COVID-19 Drug Repository structure and technical de-scription

COVID-19 Drug Repository is an open-source modularplatform built on the MySQL server platform comprising15 curated tables The structure of the database presentingthe logical relations between these tables and the data col-lection process are shown in Figures 1 and 2 respectivelyTo ensure consistency between the drug syn drug recipecovid salt drug link drug pubmed Clinicaltrial text mining

and covid drug tables the insertionupdatedeletion of rowsis linked to the covid drug table Each covid drug entryis linked to 15 data fields corresponding to drug dataand a target Most of the data fields (ie ACC id UNIICAS1 CAS2 CAS3 and PubChem cid) are hyperlinked toother databases (ie DrugBank (3738) ClinicalTrialsgovPubChem (39) IUPHARBPC (40) and Chemical Ab-stracts Service (4142) (Figure 1 Figure 2 Table 1) Eachcovid recipe entry is associated with 12 data fields includ-ing drug formulation (recipe) citing the country of manu-facture FDA-approved drugs guidelines etc The COVID-19 Drug Repository supports text query inputs using thesearch box on the homepage (Figure 3) The MyISAMengine (43) was implemented to support the FULLTEXTsearch functionality with the lsquoutf8rsquo DEFAULT CHARSETDetailed instructions on the browsing and search toolsfound in the database were provided below and can also befound on the database homepage (under lsquoHelprsquo option) Fi-nally the database update process is semi-automatic as fol-lows (a) selection of potential COVID19 therapeutic sub-stances found in research articles is manual (b) updatingand adding new records to COVID19 Drug Repositorydatabase is fully automated using Perl scripts (c) hyperlinksto PubMed and other sources and maps are generated au-tomatically by python scripts

Nucleic Acids Research 2021 Vol 49 Database issue D1115

Figure 2 The data collection workflow implemented in the COVID-19 Drugs Repository

The database search query syntax

The database search is case-insensitive Simple queries caninclude either the full or partial drug name (lsquoDrug searchrsquobox) Advanced queries can be constructed by combiningidentifiers of the various databases (Table 1) andor con-cepts (eg compound class viral target etc) The lsquocom-pound classrsquo for example includes the following termsAntibody | Metabolite | Natural product | Inorganic | Pep-tide | Synthetic organic while the lsquoviral targetrsquo categorycomprises the acronym of viral name BCV | BtCoV-(HKU3 | HKU5 | SHC014) | HcoV-(229E | NL63 | OC43)| (MERS|SARS)-CoV(-2) | CcoV | FIPV | PEDV | SARS |TGEV | IBV | MHV A query combination can be builtusing delimiters and plus + []amp[][][] (the [] indicates that the lsquospacersquo delimiter maybe used in the query string) for example

Query Oseltamivir Ritonavir DB0008934PMID32145363 CID37542 118390ndash30-0

The search results page shows all relevant instances as-sociated with a query drug The HTML page is generatedwith hyperlinks to all databases (Table 1) associated withthe drug of interest Alternatively the information can alsobe accessed by selecting a drug name from the menu in theselection field (see the Database lsquoHelprsquo page httpcovid19mdbiuacil) The Treatment Options section enablesaccess to a detailed description of the query drug

COVID19 Drug Repository web interface

On the menu bar there are seven buttons (BROWSE COL-LECTIONS DOWNLOADS NEWS HELP ABOUT USand CONTACT on the right and the lsquoCancer Genomicsamp BioComputing of complex Diseases labrsquo logo on theleft) that users can visit with a single click Clicking theBROWSE button returns the user to the homepage (Fig-

ure 3) which displays introductory and technical infor-mation about the Repository Clicking the lsquoVIEW ALLDRUGSrsquo button or on the lsquoCOLLECTIONSrsquo submenuitems (eg lsquoAnti-viral drugsrsquo) brings the user to the drugs-listing webpage (Figure 3) where one can see all query drugspresented there By entering the first letters of a query drugname in the lsquoDrug searchrsquo field (Figure 3) users can selecttheir drug of interest from the dropdown menu (Figure 3)

Features and functionality

The Repository is a COVID19-targeted collection (short-list) of sim460 items representing 184 approved drugs 384investigated therapeutic agents and 76 drug-like syntheticor natural chemical substances The main focus of therepository is cross-referencing to PubMed articles linkingthese drugs with multiple research sources mapping as-sociations between the drugs and COVID19-related con-cepts textdata-mining clustering and visualization Fur-thermore the Biopython collection of modules (44) inparticular the Entrez BioEntrez module (44) was imple-mented into our database framework so as to enable fastdata retrieval via efficient command-line interactions withall NCBI resources and databases (45) sub-divided into sixcategories Literature Genes Proteins Genomes Geneticsand Chemicals A toolset of Perl and Python scripts hadbeen created for specific tasks such as (a) automatic gen-eration of links between the Repository and other sources(b) collection of datareferences and creation of work tablesand (c) mapping associationsconcepts with visualizationoptions being realized via external tools

COVID19 drugs search strategy

Recent information on approved drugs and therapeuticcombinations thereof considered useful for the treatment

D1116 Nucleic Acids Research 2021 Vol 49 Database issue

Table 1 Databases linked with the COVID-19 Drugs Repository

Database Identifier Databaseresource Links

ACC id () COVID19 Drug Repository httpcovid19mdbiuacilUNII ID Unique Ingredient Identifier httpswwwfdagovindustryfda-resources-data-standardsfdas-global-

substance-registration-systemCAS ID () Chemical Abstracts Service httpswwwcasorgPubChem CID PubChem Compound httpspubchemncbinlmnihgovPubChem SID PubChem Substance httpspubchemncbinlmnihgovPubMed ID PubMed httpspubmedncbinlmnihgovDrugBank ID DrugBank httpswwwdrugbankcaIUPHAR ID IUPHARBPS httpsiupharorgClinical Trials ID ClinicalTrialsgov httpsclinicaltrialsgov

of SARS-CoV-2 infection has been reported at Clinical-Trialsgov and PubMed Experimental therapeutic agentsdrug-like synthetic or natural chemical substances have alsobeen studied in the context of COVID-19 as reported in thePubMed database To extend the coverage of our databasewe reviewed articles from PubMed to extract COVID-19-related concepts and keywords Specifically using lsquodrugrepositioningrsquo lsquodrug repurposingrsquo lsquoCOVID19rsquo and lsquocoro-navirus infectionsdrug therapyrsquo as query keywords wereached PubMed publications with titles andor abstractscontaining combinations of these keywords Next we cre-ated another pattern of queries to search target informa-tion at PubMed and using Google and Google Scholar Thepattern was expressed as a pseudo-Regular Expression (46)where text in uppercase letters denotes variable nameslsquo((DRUG NAME)|(DRUG ALIAS)) ((sars)|(mers)|(corona covid)) (patients)rsquoExamples of the multi-step search queries are as followslsquocamostat corona covidrsquo rarr [output 1 informationrefere

nces] rarr [therapeutic agents]lsquocamostat sarsrsquo rarr [output 2 informationreferences] rarr [therapeutic agents]lsquocamostat sars patientsrsquo rarr [output 3 informationreferences] rarr [therapeutic agents]lsquoONO-3403 sars patientsrsquo rarr [output 4 informationreferences] rarr [therapeutic agents]

Using this search strategy we found sim100 substanceswith activities associated with COVID-19 The queries andthe patterns used and the information obtained daily arethe main sources for Repository updates

All lsquoactiversquo chemical entitiessubstances (ie those withdemonstrated or proposed therapeutic potential) were col-lected and linked via their identifiers in CAS PubChemIUPHAR etc Furthermore we adopted the web-basedtext clustering engine Carrot2 (47) for visualization ofpair-wise lsquodrug-COVID-19 conceptrsquo associations found inPubMed abstracts for each pair

In this version of our database COVID19 drugs weremapped to a dictionary of 21 terms related to conceptsof lsquoCOVID-19rsquo eg lsquoviral infectionsrsquo lsquorespiratory diseasesrsquolsquoinflammatory cellrsquo lsquocoronavirus pneumoniarsquo etc (Sup-plemental Table S1) These terms are the most frequentwordscombinations clustered around the central wordssuch as lsquovirusrsquo lsquoinfectionrsquo lsquoinflammationrsquo lsquopneumoniarsquolsquolungsrsquo To create the conceptsrsquo dictionary a variety ofclusters were generated by experimentation with differ-ent hierarchical clustering algorithms applied to the col-lection of PubMed titlesabstracts Links to all PubMed

abstracts associated with these lsquodrug-conceptrsquo pairs weregenerated and enumerated using Python scripts PubMedsearch queries were created according to PubMed querysyntax (48) and MeSH terms (49) The automatically gen-erated tables (available in the lsquoDOWNLOADSrsquo section) listthe number of retrieved PubMed publications correspond-ing to each lsquodrug-COVID-19 conceptrsquo pair These num-bers are hyperlinked with the corresponding PubMed pub-lications Links to references and the Carrot2 text cluster-ing and visualization tool can be updated on a regular ba-sis Such updates are necessary as the web and PubMeddatabase are constantly expanding with new references andsites appearing daily All desired data can be downloadedfrom the COVID19 Drug Repository website (link) as Exceltables containing the list of keywords (ie the lsquodictionaryrsquo)used for text-mining and mapping In subsequent versionsof the database users will be able to modify the list or intro-duce additional concepts With this simple mapping toolone can discover and visualize new concepts and associa-tions that would not otherwise be found

Drug repurposing sources

Currently there are 384 mapped drug names mentionedin 960 COVID-19 clinical studies (data retrieved on Au-gust 15 2020) with at least 1 drug intervention (Table2) None of these drugs are novel Rather they exemplifya lsquodrug repurposingrepositioningrsquo approach (263450ndash52) Recently numerous COVID19-specific web pages andchemical libraries have been created by different researchorganizations (CAS IUPHAR (5354) ChEMBL Open-Data Portal (httpsopendatancatsnihgovcovid19indexhtml) etc) and companies (MedChem Express) and usedfor the high throughput screening against SARS-CoV-2 in-fection (55ndash63) All these molecular libraries and collections(Figure 2 Table 2) are being used in our data collection pro-cess (Figure 2) and listed in the Repository web page (lsquoUse-ful Linksrsquo)

Drug-gene interaction networks

The BiopythonEntrez-based Python command-line script(as discussed in the Features and functionality section) wascreated to access the NCBI Gene database (45) and to au-tomatically retrieve human or microbial (and in particularviral) genes associated with a given list of drugs or chemicalsubstances The output (Figure 4) provides a list of geneswith a short description of the biological role associatedwith each gene product in the output list

Nucleic Acids Research 2021 Vol 49 Database issue D1117

Figure 3 Web interface of the COVID19 Drug Repository Selection of the drug of interest (eg Favipiravir) from the dropdown menulist

Those genes associated with a set of drugs can be anal-ysed clustered or served as input for building lsquodrug-genersquonetworks and then visualized using external programs Asa working example protein-protein association networkswere built for output gene sets using the STRINGv11database (64) Moreover the application programming in-terface (API) implemented in the STRINGv11 databaseenables efficient interaction of external databases with theSTRING visualization and analysis tools (64) For exam-ple visualization of the set of genes associated with the

PDE5A inhibitor sildenafil a vasodilator agent revealedother interesting targets (Figure 4) such as the enzymePDE6G Both enzymes are active in the lungs (6566)Further network analysis of available data showed thatPDE5APDE6G inhibition by sildenafil in lung blood ves-sels can trigger different anti-inflammatory pathways

To build a network (Figure 5) we extracted additionalinformation from the literature and external databases Inthe PDE5A and PDE6G protein expression summaries ob-tained from the Human Protein Atlas (67) the PGE6G

D1118 Nucleic Acids Research 2021 Vol 49 Database issue

Table 2 COVID19-specific collections and chemical libraries

ResourceDataset (title catalogue orcategory) Drugs active agents Links

ClinicalTrialsgov COVID-19 studies bymapped drug intervention

384 httpsclinicaltrialsgovct2covid viewdrugs

CAS COVID-19 AntiviralCandidate CompoundsDataset

sim50000 httpswwwcasorgcovid-19-antiviral-compounds-dataset

IUPHAR Ligands relevant toSARS-CoV-2(COVID-19

64 items httpswwwguidetopharmacologyorgGRACCoronavirusForward

ChEMBL ChEMBL 27 SARS CoV-2Release 8 datasets

133 selected(IC50EC50 betterthan 10 M)

httpchemblblogspotcom202005chembl27-sars-cov-2-releasehtml

StructuralBioinformatics andNetwork BiologyGroup (SBNB)

Drugs from sthe COVID19literature

307 httpssbnbirbbarcelonaorgcovid19

OpenData Portal NCATS Anti-infectivesCollection

740 httpsopendatancatsnihgovcovid19indexhtml

MedChemExpress SARS-CoV List of Drugs 76 httpswwwmedchemexpresscomTargetsSARS-CoVhtml

MedChemExpress Anti-Virus CompoundLibrary

512 httpswwwmedchemexpresscomscreeningAnti-virus Compound Libraryhtml

MedChemExpress Anti-COVID-19 CompoundLibrary

1552 httpswwwmedchemexpresscomscreeninganti-covid-19-compound-libraryhtml

Figure 4 Example output of the DruGeNetwork python script list of genes (column lsquo1rsquo) with a short descriptionannotation of the underlying biologicalmechanism associated with each gene (column lsquo1rsquo) The output is generated for the query keywords sent to the NCBI Gene resources ldquosildenafilrdquo ANDldquoHomo sapiensrdquo[porgn] AND alive[prop]

gene is categorized as lsquoGroup enrichedrsquo in natural killer(NK) cells according to consensus transcriptomics dataNK cells acting as cytotoxic lymphocytes are involved ininnate immune system regulation including rapid cytokineproduction in the presence of virus-infected cells (67) TheNK-mediated antiviral immune response is associated withthe NCR1 gene that encodes the natural cytotoxicity re-ceptor 1 (68) In the next step both the PDE6 and NCR1genes were detected in the Chronic Obstructive PulmonaryDisease (COPD)-related Gene Set using Harmonizome onthe collection of lsquoomicsrsquo Big Data sets (69) This geneset was deposited in the GEO Signatures of DifferentiallyExpressed Genes for Diseases under the name lsquoCOPD-Chronic Obstructive Pulmonary Disease Muscle-Striated(Skeletal)-Diaphragm (MMHCC) GSE47rsquo The data showthat the expression of the PDE6G gene is significantly in-creased whereas decreased expression was reported for theNCR1 gene

Therefore in the context of drug repurposing strate-gies it is reasonable to expect that sildenafil will be use-

ful for the treatment of COVID19 complications Accord-ingly two recent clinical studies (ClinicalTrialsgov identi-fiers NCT04304313 and NCT04489446) were initiated tostudy the efficacy and safety of sildenafil in patients withCOVID-19 (NCT04304313) and to assess the role of silde-nafil in improving oxygenation among hospitalised patients(NCT04489446)

Identification of drug target genes

We extracted target gene information for each putativeCOVID-19 drug from the Therapeutic Target Database(70) and by text-mining of the literature at PubMed Tounderstand the expression profile of drug target genes iden-tified in this manner in the COVID-19 infection we per-formed transcriptome analysis of infected bronchial epithe-lial cells For this we retrieved raw RNA-sequencing datafor SARS-CoV-2-infected bronchial epithelial cells from thesequence read archive (SRA) database under accession noPRJNA615032 (71) The FASTQ files were mapped and

Nucleic Acids Research 2021 Vol 49 Database issue D1119

Figure 5 The Drug-Gene local Network built using the output for sildenafil (A) Further functional enrichment reveals the group of genes involved in theinnate immunity and inflammatory processes Moreover VDR (receptors activated by calcitriol) and the HIF1A-STAT3 path suppressed by resveratrolare also involved in the local anti-inflammatory pharmacological network thereby suggesting the ldquosildenafil-resveratrol-vitamin D3rdquo drug combinationto treat the COVID-19 complications

aligned to the hg38 reference genome using STAR (72)Differentially expressed genes were identified using edgeR(73) with parameters set at 20-fold change and lt005 P-value cut-off We thus found target genes for 41 drugs fromour database (eg sildenafil as discussed in previous para-graph) which are significantly differentially expressed dur-ing COVID-19 infection These drug-gene pairs are given inthe Supplemental Table S2

COVID19 Drug Repository server hardware and software re-quirements

The COVID19 Drug Repository server was built on anApache web server and deployed on the RedHat EnterpriseLinux (RHEL) 74 server of an Intel(R) Xeon(R) CPU E5ndash2620 v2 210GHz and 32GB RAM unit The COVID19Drug Repository has a code base and infrastructure simi-lar to that of the ChiTaRS database (7475) The COVID19Drug Repository website is compatible with modern webbrowsers (such as Chrome Firefox Microsoft Edge Operaand Safari) provided that JavaScript is enabled We recom-mend using the latest release version of these web browsersfor optimal rendering

Downloads

The COVID19 Drug Repository not only provides extendedlsquoSearchrsquo options but also offers the possibility to downloadall database tables and data sets in a user-friendly mannerThe repository is available at httpcovid19mdbiuacil

CONCLUSIONS AND FUTURE PLANS

The COVID19 Drug Repository maps data from chemoge-nomics and pharmacogenomics studies and provides viraland human genomics and proteomics information on ap-proved drugs and other therapeutics The database enablesthe user to focus on different levels of complexity startingfrom general information clinical trials and formulationsand increasing the resolution to the level of molecular mech-anisms of drug action Therefore the database can serve as anavigation and recommendation tool both for research andfor healthcare purposes Future plans include the followingadditions to the database (a) continuous updating with newdata on approved drugs experimental drugs and drug-likesynthetic or natural chemical substances (b) automatic ma-chine learning and text-mining-based annotation and visu-alization of lsquoMode of Actionrsquo (MoA) data as well as lsquoDrug-Genersquo and lsquoDrug-Symptomrsquo networks and (c) incorpora-tion of the DrugsNGS analysis tools (lsquotranscriptomicsrsquo) toaccelerate the translation of knowledge for use as personal-ized medicine for COVID19 patients

DATA AVAILABILITY

httpcovid19mdbiuacil

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online

D1120 Nucleic Acids Research 2021 Vol 49 Database issue

ACKNOWLEDGEMENTS

EL has been working as a volunteer at the Frenkel-Morgensternrsquos lab in the COVID-19 related project

FUNDING

MF-M is supported by the Kamin grant of Israel In-novation Authority (66824 2019-2021) and COVID-19Data Science Institute (DSI) grant Bar-Ilan University(247017 2020) SM is supported by the PBC fellowshipfor outstanding postdoctoral researchers from India by theCouncil of Higher Education Israel (VaTaT 104-2019 atM F-M lab)Conflict of interest statement Authors declare no conflict ofinterest

REFERENCES1 ChenQ AllotA and LuZ (2020) Keep up with the latest

coronavirus research Nature 579 1932 Lu WangL LoK ChandrasekharY ReasR YangJ EideD

FunkK KinneyR LiuZ MerrillW et al (2020) CORD-19 TheCovid-19 Open Research Dataset arXiv doihttpsarxivorgabs200410706v2 10 July 2020 preprint not peerreviewed

3 SwansonDR (2008) In BruzaP and WeeberM (eds)Literature-based Discovery Springer Berlin Heidelberg BerlinHeidelberg pp 3ndash11

4 WeiCH KaoHY and LuZ (2013) PubTator a web-based textmining tool for assisting biocuration Nucleic Acids Res 41W518ndashW522

5 RobertsK AlamT BedrickS Demner-FushmanD LoKSoboroffI VoorheesE WangLL and HershWR (2020)TREC-COVID rationale and Structure of an Information RetrievalShared Task for COVID-19 J Am Med Inform Assoc 271431ndash1436

6 CrichtonG BakerS GuoY and KorhonenA (2020) Neuralnetworks for open and closed Literature-based Discovery PLoS One15 e0232891

7 ZhangE GuptaN NogueiraR ChoK and LinJ (2020) Rapidlydeploying a neural search engine for the covid-19 open researchdataset Preliminary thoughts and lessons learned arXiv doihttpsarxivorgabs200405125 10 April 2020 preprint not peerreviewed

8 TarasovaO IvanovS FilimonovDA and PoroikovV (2020) Dataand text mining help identify key proteins involved in the molecularmechanisms shared by SARS-CoV-2 and HIV-1 Molecules 25 2944

9 WeiCH HarrisBR LiD BerardiniTZ HualaE KaoHYand LuZ (2012) Accelerating literature curation with text-miningtools a case study of using PubTator to curate genes in PubMedabstracts Database (Oxford) 2012 bas041

10 WeiCH AllotA LeamanR and LuZ (2019) PubTator centralautomated concept annotation for biomedical full text articlesNucleic Acids Res 47 W587ndashW593

11 OsinskiS and WeissD (2004) Conceptual Clustering Using LingoAlgorithm Evaluation on Open Directory Project Data InKłopotekMA WierzchonST and TrojanowskiK (eds) IntelligentInformation Processing and Web Mining Advances in Soft ComputingSpringer Berlin Heidelberg Vol 25 pp 369ndash377

12 TagoreS GorohovskiA JensenLJ and Frenkel-MorgensternM(2019) ProtFus a comprehensive method characterizingprotein-protein interactions of fusion proteins PLoS Comput Biol15 e1007239

13 MeyerUA ZangerUM and SchwabM (2013) Omics and drugresponse Annu Rev Pharmacol Toxicol 53 475ndash502

14 ZhengXF and ChanTF (2002) Chemical genomics a systematicapproach in biological research and drug discovery Curr Issues MolBiol 4 33ndash43

15 EngelbergA (2004) Iconix Pharmaceuticals Incndashremoving barriersto efficient drug discovery through chemogenomicsPharmacogenomics 5 741ndash744

16 MagarinosMP CarmonaSJ CrowtherGJ RalphSARoosDS ShanmugamD Van VoorhisWC and AgueroF (2012)TDR targets a chemogenomics resource for neglected diseasesNucleic Acids Res 40 D1118ndashD1127

17 WangL MaC WipfP LiuH SuW and XieXQ (2013)TargetHunter an in silico target identification tool for predictingtherapeutic potential of small organic molecules based onchemogenomic database AAPS J 15 395ndash406

18 TributO LessardY ReymannJM AllainH and Bentue-FerrerD(2002) Pharmacogenomics Med Sci Monit 8 RA152ndashRA163

19 HoodE (2003) Pharmacogenomics the promise of personalizedmedicine Environ Health Perspect 111 A581ndashA589

20 WeinshilboumR and WangL (2004) Pharmacogenomics bench tobedside Nat Rev Drug Discov 3 739ndash748

21 ServiceRF (2005) Pharmacogenomics Going from genome to pillScience 308 1858ndash1860

22 BlakeCA and SobelBE (2008) Progress in pharmacogenomics andits promise for medicine Exp Biol Med (Maywood) 2331482ndash1483

23 WangL (2010) Pharmacogenomics a systems approach WileyInterdiscip Rev Syst Biol Med 2 3ndash22

24 BurtT and DhillonS (2013) Pharmacogenomics in early-phaseclinical development Pharmacogenomics 14 1085ndash1097

25 MusaA TripathiS DehmerM Yli-HarjaO KauffmanSA andEmmert-StreibF (2019) Systems pharmacogenomic landscape ofdrug similarities from LINCS data Drug Association Networks SciRep 9 7849

26 KalamaraA TobalinaL and Saez-RodriguezJ (2018) How to findthe right drug for each patient Advances and challenges inpharmacogenomics Curr Opin Syst Biol 10 53ndash62

27 KirkRJ HungJL HornerSR and PerezJT (2008) Implicationsof pharmacogenomics for drug development Exp Biol Med(Maywood) 233 1484ndash1497

28 Sharifi-NoghabiH ZolotarevaO CollinsCC and EsterM (2019)MOLI multi-omics late integration with deep neural networks fordrug response prediction Bioinformatics 35 i501ndashi509

29 PulleyJM RhoadsJP JeromeRN ChallaAP ErregerKBJolyMM LavieriRR PerryKE ZaleskiNM Shirey-RiceJKet al (2020) Using what we already have uncovering new drugrepurposing strategies in existing omics data Annu Rev PharmacolToxicol 60 333ndash352

30 SindelarRD (2013) Genomics other ldquoOmicrdquo technologiespersonalized medicine and additional biotechnology-relatedtechniques In CrommelinDJA SindelarRD and MeibohmB(eds) Pharmaceutical Biotechnology Fundamentalsand ApplicationsSpringer International Publishing NY pp 179ndash221

31 Uran LandaburuL BerensteinAJ VidelaS MaruPShanmugamD ChernomoretzA and AgueroF (2020) TDRTargets 6 driving drug discovery for human pathogens throughintensive chemogenomic data integration Nucleic Acids Res 48D992ndashD1005

32 FengZ ChenM LiangT ShenM ChenH and XieXQ (2020)Virus-CKB an integrated bioinformatics platform and analysisresource for COVID-19 research Brief Bioinform bbaa155

33 Duran-FrigolaM PaulsE Guitart-PlaO BertoniM AlcaldeVAmatD Juan-BlancoT and AloyP (2020) Extending thesmall-molecule similarity principle to all levels of biology with thechemical checker Nature Biotechnology 38 1087ndash1096

34 SinghN DecrolyE KhatibAM and VilloutreixBO (2020)Structure-based drug repositioning over the human TMPRSS2protease domain search for chemical probes able to repressSARS-CoV-2 Spike protein cleavages Eur J Pharm Sci 153105495

35 GordonDE JangGM BouhaddouM XuJ ObernierKWhiteKM OrsquoMearaMJ RezeljVV GuoJZ SwaneyDL et al(2020) A SARS-CoV-2 protein interaction map reveals targets fordrug repurposing Nature 583 459ndash468

36 TakahashiT LuzumJA NicolMR and JacobsonPA (2020)Pharmacogenomics of COVID-19 therapies NPJ Genom Med 5 35

37 WishartDS KnoxC GuoAC ShrivastavaS HassanaliMStothardP ChangZ and WoolseyJ (2006) DrugBank acomprehensive resource for in silico drug discovery and explorationNucleic Acids Res 34 D668ndashD672

38 WishartDS FeunangYD GuoAC LoEJ MarcuAGrantJR SajedT JohnsonD LiC SayeedaZ et al (2018)

Nucleic Acids Research 2021 Vol 49 Database issue D1121

DrugBank 50 a major update to the DrugBank database for 2018Nucleic Acids Res 46 D1074ndashD1082

39 KimS ChenJ ChengT GindulyteA HeJ HeS LiQShoemakerBA ThiessenPA YuB et al (2019) PubChem 2019update improved access to chemical data Nucleic Acids Res 47D1102ndashD1109

40 ArmstrongJF FaccendaE HardingSD PawsonAJSouthanC SharmanJL CampoB CavanaghDRAlexanderSPH DavenportAP et al (2020) The IUPHARBPSGuide to PHARMACOLOGY in 2020 extendingimmunopharmacology content and introducing the IUPHARMMVGuide to MALARIA PHARMACOLOGY Nucleic Acids Res 48D1006ndashD1021

41 HuffenbergerMA and WigingtonRL (1975) Chemical AbstractsService approach to management of large data bases J Chem InfComput Sci 15 43ndash47

42 WeisgerberDW (1997) Chemical Abstracts Service ChemicalRegistry System history scope and impacts J Am Soc Inf Sci 48349ndash360

43 WideniusM AxmarkD and ArnoK (2002) In MySQL ReferenceManual Documentation From the Source OrsquoReilly Media Inc

44 CockPJ AntaoT ChangJT ChapmanBA CoxCJ DalkeAFriedbergI HamelryckT KauffF WilczynskiB et al (2009)Biopython freely available Python tools for computational molecularbiology and bioinformatics Bioinformatics 25 1422ndash1423

45 CoordinatorsNR (2016) Database resources of the National Centerfor Biotechnology Information Nucleic Acids Res 44 D7ndashD19

46 ThompsonK (1968) Programming techniques regular expressionsearch algorithm Commun ACM 11 419ndash422

47 StefanowskiJ and WeissD (2003) Carrot2 and Language Propertiesin Web Search Results Clustering In MenasalvasE SegoviaJ andSzczepaniakPS (eds) Advances in Web Intelligence Springer BerlinHeidelberg pp 240ndash249

48 SayersE (2010) E-utilities Quick Start EntrezProgramming UtilitiesHelp [Internet] Bethesda MD National Center for BiotechnologyInformation (US) httpwwwncbinlmnihgovbooksNBK25500

49 LipscombCE (2000) Medical subject headings (MeSH) Bull MedLibr Assoc 88 265

50 SahuNU and KharkarPS (2016) Computational DrugRepositioning a lateral approach to traditional drug discovery CurrTop Med Chem 16 2069ndash2077

51 SamE and AthriP (2019) Web-based drug repurposing tools asurvey Brief Bioinform 20 299ndash316

52 DonnerY KazmierczakS and FortneyK (2018) Drug repurposingusing deep embeddings of gene expression profiles Mol Pharm 154314ndash4325

53 AlexanderSPH ArmstrongJF DavenportAP DaviesJAFaccendaE HardingSD Levi-SchafferF MaguireJJPawsonAJ SouthanC et al (2020) A rational roadmap forSARS-CoV-2COVID-19 pharmacotherapeutic research anddevelopment IUPHAR Review 29 Br J Pharmacol 1774942ndash4966

54 FaccendaE ArmstrongJ DavenprotA HardingS PawsonASouthanC and DaviesJ (2020) Coronavirus InformationIUPHARBPS Guide to Pharmacologyhttpswwwguidetopharmacologyorgcoronavirusjsp

55 JeonS KoM LeeJ ChoiI ByunSY ParkS ShumD andKimS (2020) Identification of antiviral drug candidates againstSARS-CoV-2 from FDA-approved drugs Antimicrob AgentsChemother 64 e00819-20

56 RivaL YuanS YinX Martin-SanchoL MatsunagaNBurgstaller-MuehlbacherS PacheL De JesusPP HullMVChangM et al (2020) A large-scale drug repositioning survey forSARS-CoV-2 antivirals bioRxiv doihttpsdoiorg10110120200416044016 17 April 2020 preprintnot peer reviewed

57 BrimacombeKR ZhaoT EastmanRT HuX WangKBackusM BaljinnyamB ChenCZ ChenL EicherT et al (2020)An OpenData portal to share COVID-19 drug repurposing data inreal time bioRxiv doi httpsdoiorg10110120200604135046 05June 2020 preprint not peer reviewed

58 MirabelliC WotringJW ZhangCJ McCartySM FursmidtRFrumT KadambiNS AminAT OrsquoMearaTR PrettoCD et al(2020) Morphological cell profiling of SARS-CoV-2 infectionidentifies drug repurposing candidates for COVID-19 bioRxiv doi

httpsdoiorg10110120200527117184 27 May 2020 preprintnot peer reviewed

59 TangH AbouleilaY SiL Ortega-PrietoAM MummeryCLIngberDE and MashaghiA (2020) Human organs-on-chips forvirology Trends Microbiol 28 934ndash946

60 WestonS ColemanCM HauptR LogueJ MatthewsK LiYReyesHM WeissSR and FriemanMB (2020) Broadanti-coronaviral activity of FDA approved drugs againstSARS-CoV-2 J Virol 94 e01218-20

61 TouretF GillesM BarralK NougairedeA van HeldenJDecrolyE de LamballerieX and CoutardB (2020) In vitroscreening of a FDA approved chemical library reveals potentialinhibitors of SARS-CoV-2 replication Sci Rep 10 13093

62 EllingerB BojkovaD ZalianiA CinatlJ ClaussenCWesthausS ReinshagenJ KuzikovM WolfM and GeisslingerG(2020) Identification of inhibitors of SARS-CoV-2 in-vitro cellulartoxicity in human (Caco-2) cells using a large scale drug repurposingcollection ResearchSquare doi1021203rs3rs-23951v1

63 HeiserK McLeanPF DavisCT FogelsonB GordonHBJacobsonP HurstB MillerB AlfaRW EarnshawBA et al(2020) Identification of potential treatments for COVID-19 throughartificial intelligence-enabled phenomic analysis of human cellsinfected with SARS-CoV-2 bioRxiv doihttpsdoiorg10110120200421054387 23 April 2020 preprintnot peer reviewed

64 SzklarczykD GableAL LyonD JungeA WyderSHuerta-CepasJ SimonovicM DonchevaNT MorrisJH BorkPet al (2019) STRING v11 protein-protein association networks withincreased coverage supporting functional discovery in genome-wideexperimental datasets Nucleic Acids Res 47 D607ndashD613

65 NikolovaS GuentherA SavaiR WeissmannN GhofraniHAKonigshoffM EickelbergO KlepetkoW VoswinckelRSeegerW et al (2010) Phosphodiesterase 6 subunits are expressedand altered in idiopathic pulmonary fibrosis Respir Res 11 146

66 HemnesAR ZaimanA and ChampionHC (2008) PDE5Ainhibition attenuates bleomycin-induced pulmonary fibrosis andpulmonary hypertension through inhibition of ROS generation andRhoARho kinase activation Am J Physiol Lung Cell MolPhysiol 294 L24ndashL33

67 UhlenM FagerbergL HallstromBM LindskogC OksvoldPMardinogluA SivertssonA KampfC SjostedtE AsplundAet al (2015) Proteomics Tissue-based map of the human proteomeScience 347 1260419

68 PessinoA SivoriS BottinoC MalaspinaA MorelliLMorettaL BiassoniR and MorettaA (1998) Molecular cloning ofNKp46 a novel member of the immunoglobulin superfamily involvedin triggering of natural cytotoxicity J Exp Med 188 953ndash960

69 RouillardAD GundersenGW FernandezNF WangZMonteiroCD McDermottMG and MarsquoayanA (2016) Theharmonizome a collection of processed datasets gathered to serveand mine knowledge about genes and proteins Database (Oxford)2016 baw100

70 WangY ZhangS LiF ZhouY ZhangY WangZ ZhangRZhuJ RenY TanY et al (2020) Therapeutic target database 2020enriched resource for facilitating research and early development oftargeted therapeutics Nucleic Acids Res 48 D1031ndashD1041

71 Blanco-MeloD Nilsson-PayantBE LiuWC UhlSHoaglandD MoslashllerR JordanTX OishiK PanisM SachsDet al (2020) Imbalanced host response to SARS-CoV-2 drivesdevelopment of COVID-19 Cell 181 1036ndash1045

72 DobinA DavisCA SchlesingerF DrenkowJ ZaleskiC JhaSBatutP ChaissonM and GingerasTR (2013) STAR ultrafastuniversal RNA-seq aligner Bioinformatics 29 15ndash21

73 RobinsonMD McCarthyDJ and SmythGK (2010) edgeR aBioconductor package for differential expression analysis of digitalgene expression data Bioinformatics 26 139ndash140

74 BalamuraliD GorohovskiA DetrojaR PalandeVRaviv-ShayD and Frenkel-MorgensternM (2019) ChiTaRS 50 thecomprehensive database of chimeric transcripts matched withdruggable fusions and 3D chromatin maps Nucleic Acids Res 48D825ndashD834

75 Frenkel-MorgensternM GorohovskiA LacroixV RogersMIbanezK BoullosaC Andres LeonE Ben-HurA andValenciaA (2013) ChiTaRS a database of human mouse and fruitfly chimeric transcripts and RNA-sequencing data Nucleic AcidsRes 41 D142

Page 3: COVID19 Drug Repository: text-mining the literature in

Nucleic Acids Research 2021 Vol 49 Database issue D1115

Figure 2 The data collection workflow implemented in the COVID-19 Drugs Repository

The database search query syntax

The database search is case-insensitive Simple queries caninclude either the full or partial drug name (lsquoDrug searchrsquobox) Advanced queries can be constructed by combiningidentifiers of the various databases (Table 1) andor con-cepts (eg compound class viral target etc) The lsquocom-pound classrsquo for example includes the following termsAntibody | Metabolite | Natural product | Inorganic | Pep-tide | Synthetic organic while the lsquoviral targetrsquo categorycomprises the acronym of viral name BCV | BtCoV-(HKU3 | HKU5 | SHC014) | HcoV-(229E | NL63 | OC43)| (MERS|SARS)-CoV(-2) | CcoV | FIPV | PEDV | SARS |TGEV | IBV | MHV A query combination can be builtusing delimiters and plus + []amp[][][] (the [] indicates that the lsquospacersquo delimiter maybe used in the query string) for example

Query Oseltamivir Ritonavir DB0008934PMID32145363 CID37542 118390ndash30-0

The search results page shows all relevant instances as-sociated with a query drug The HTML page is generatedwith hyperlinks to all databases (Table 1) associated withthe drug of interest Alternatively the information can alsobe accessed by selecting a drug name from the menu in theselection field (see the Database lsquoHelprsquo page httpcovid19mdbiuacil) The Treatment Options section enablesaccess to a detailed description of the query drug

COVID19 Drug Repository web interface

On the menu bar there are seven buttons (BROWSE COL-LECTIONS DOWNLOADS NEWS HELP ABOUT USand CONTACT on the right and the lsquoCancer Genomicsamp BioComputing of complex Diseases labrsquo logo on theleft) that users can visit with a single click Clicking theBROWSE button returns the user to the homepage (Fig-

ure 3) which displays introductory and technical infor-mation about the Repository Clicking the lsquoVIEW ALLDRUGSrsquo button or on the lsquoCOLLECTIONSrsquo submenuitems (eg lsquoAnti-viral drugsrsquo) brings the user to the drugs-listing webpage (Figure 3) where one can see all query drugspresented there By entering the first letters of a query drugname in the lsquoDrug searchrsquo field (Figure 3) users can selecttheir drug of interest from the dropdown menu (Figure 3)

Features and functionality

The Repository is a COVID19-targeted collection (short-list) of sim460 items representing 184 approved drugs 384investigated therapeutic agents and 76 drug-like syntheticor natural chemical substances The main focus of therepository is cross-referencing to PubMed articles linkingthese drugs with multiple research sources mapping as-sociations between the drugs and COVID19-related con-cepts textdata-mining clustering and visualization Fur-thermore the Biopython collection of modules (44) inparticular the Entrez BioEntrez module (44) was imple-mented into our database framework so as to enable fastdata retrieval via efficient command-line interactions withall NCBI resources and databases (45) sub-divided into sixcategories Literature Genes Proteins Genomes Geneticsand Chemicals A toolset of Perl and Python scripts hadbeen created for specific tasks such as (a) automatic gen-eration of links between the Repository and other sources(b) collection of datareferences and creation of work tablesand (c) mapping associationsconcepts with visualizationoptions being realized via external tools

COVID19 drugs search strategy

Recent information on approved drugs and therapeuticcombinations thereof considered useful for the treatment

D1116 Nucleic Acids Research 2021 Vol 49 Database issue

Table 1 Databases linked with the COVID-19 Drugs Repository

Database Identifier Databaseresource Links

ACC id () COVID19 Drug Repository httpcovid19mdbiuacilUNII ID Unique Ingredient Identifier httpswwwfdagovindustryfda-resources-data-standardsfdas-global-

substance-registration-systemCAS ID () Chemical Abstracts Service httpswwwcasorgPubChem CID PubChem Compound httpspubchemncbinlmnihgovPubChem SID PubChem Substance httpspubchemncbinlmnihgovPubMed ID PubMed httpspubmedncbinlmnihgovDrugBank ID DrugBank httpswwwdrugbankcaIUPHAR ID IUPHARBPS httpsiupharorgClinical Trials ID ClinicalTrialsgov httpsclinicaltrialsgov

of SARS-CoV-2 infection has been reported at Clinical-Trialsgov and PubMed Experimental therapeutic agentsdrug-like synthetic or natural chemical substances have alsobeen studied in the context of COVID-19 as reported in thePubMed database To extend the coverage of our databasewe reviewed articles from PubMed to extract COVID-19-related concepts and keywords Specifically using lsquodrugrepositioningrsquo lsquodrug repurposingrsquo lsquoCOVID19rsquo and lsquocoro-navirus infectionsdrug therapyrsquo as query keywords wereached PubMed publications with titles andor abstractscontaining combinations of these keywords Next we cre-ated another pattern of queries to search target informa-tion at PubMed and using Google and Google Scholar Thepattern was expressed as a pseudo-Regular Expression (46)where text in uppercase letters denotes variable nameslsquo((DRUG NAME)|(DRUG ALIAS)) ((sars)|(mers)|(corona covid)) (patients)rsquoExamples of the multi-step search queries are as followslsquocamostat corona covidrsquo rarr [output 1 informationrefere

nces] rarr [therapeutic agents]lsquocamostat sarsrsquo rarr [output 2 informationreferences] rarr [therapeutic agents]lsquocamostat sars patientsrsquo rarr [output 3 informationreferences] rarr [therapeutic agents]lsquoONO-3403 sars patientsrsquo rarr [output 4 informationreferences] rarr [therapeutic agents]

Using this search strategy we found sim100 substanceswith activities associated with COVID-19 The queries andthe patterns used and the information obtained daily arethe main sources for Repository updates

All lsquoactiversquo chemical entitiessubstances (ie those withdemonstrated or proposed therapeutic potential) were col-lected and linked via their identifiers in CAS PubChemIUPHAR etc Furthermore we adopted the web-basedtext clustering engine Carrot2 (47) for visualization ofpair-wise lsquodrug-COVID-19 conceptrsquo associations found inPubMed abstracts for each pair

In this version of our database COVID19 drugs weremapped to a dictionary of 21 terms related to conceptsof lsquoCOVID-19rsquo eg lsquoviral infectionsrsquo lsquorespiratory diseasesrsquolsquoinflammatory cellrsquo lsquocoronavirus pneumoniarsquo etc (Sup-plemental Table S1) These terms are the most frequentwordscombinations clustered around the central wordssuch as lsquovirusrsquo lsquoinfectionrsquo lsquoinflammationrsquo lsquopneumoniarsquolsquolungsrsquo To create the conceptsrsquo dictionary a variety ofclusters were generated by experimentation with differ-ent hierarchical clustering algorithms applied to the col-lection of PubMed titlesabstracts Links to all PubMed

abstracts associated with these lsquodrug-conceptrsquo pairs weregenerated and enumerated using Python scripts PubMedsearch queries were created according to PubMed querysyntax (48) and MeSH terms (49) The automatically gen-erated tables (available in the lsquoDOWNLOADSrsquo section) listthe number of retrieved PubMed publications correspond-ing to each lsquodrug-COVID-19 conceptrsquo pair These num-bers are hyperlinked with the corresponding PubMed pub-lications Links to references and the Carrot2 text cluster-ing and visualization tool can be updated on a regular ba-sis Such updates are necessary as the web and PubMeddatabase are constantly expanding with new references andsites appearing daily All desired data can be downloadedfrom the COVID19 Drug Repository website (link) as Exceltables containing the list of keywords (ie the lsquodictionaryrsquo)used for text-mining and mapping In subsequent versionsof the database users will be able to modify the list or intro-duce additional concepts With this simple mapping toolone can discover and visualize new concepts and associa-tions that would not otherwise be found

Drug repurposing sources

Currently there are 384 mapped drug names mentionedin 960 COVID-19 clinical studies (data retrieved on Au-gust 15 2020) with at least 1 drug intervention (Table2) None of these drugs are novel Rather they exemplifya lsquodrug repurposingrepositioningrsquo approach (263450ndash52) Recently numerous COVID19-specific web pages andchemical libraries have been created by different researchorganizations (CAS IUPHAR (5354) ChEMBL Open-Data Portal (httpsopendatancatsnihgovcovid19indexhtml) etc) and companies (MedChem Express) and usedfor the high throughput screening against SARS-CoV-2 in-fection (55ndash63) All these molecular libraries and collections(Figure 2 Table 2) are being used in our data collection pro-cess (Figure 2) and listed in the Repository web page (lsquoUse-ful Linksrsquo)

Drug-gene interaction networks

The BiopythonEntrez-based Python command-line script(as discussed in the Features and functionality section) wascreated to access the NCBI Gene database (45) and to au-tomatically retrieve human or microbial (and in particularviral) genes associated with a given list of drugs or chemicalsubstances The output (Figure 4) provides a list of geneswith a short description of the biological role associatedwith each gene product in the output list

Nucleic Acids Research 2021 Vol 49 Database issue D1117

Figure 3 Web interface of the COVID19 Drug Repository Selection of the drug of interest (eg Favipiravir) from the dropdown menulist

Those genes associated with a set of drugs can be anal-ysed clustered or served as input for building lsquodrug-genersquonetworks and then visualized using external programs Asa working example protein-protein association networkswere built for output gene sets using the STRINGv11database (64) Moreover the application programming in-terface (API) implemented in the STRINGv11 databaseenables efficient interaction of external databases with theSTRING visualization and analysis tools (64) For exam-ple visualization of the set of genes associated with the

PDE5A inhibitor sildenafil a vasodilator agent revealedother interesting targets (Figure 4) such as the enzymePDE6G Both enzymes are active in the lungs (6566)Further network analysis of available data showed thatPDE5APDE6G inhibition by sildenafil in lung blood ves-sels can trigger different anti-inflammatory pathways

To build a network (Figure 5) we extracted additionalinformation from the literature and external databases Inthe PDE5A and PDE6G protein expression summaries ob-tained from the Human Protein Atlas (67) the PGE6G

D1118 Nucleic Acids Research 2021 Vol 49 Database issue

Table 2 COVID19-specific collections and chemical libraries

ResourceDataset (title catalogue orcategory) Drugs active agents Links

ClinicalTrialsgov COVID-19 studies bymapped drug intervention

384 httpsclinicaltrialsgovct2covid viewdrugs

CAS COVID-19 AntiviralCandidate CompoundsDataset

sim50000 httpswwwcasorgcovid-19-antiviral-compounds-dataset

IUPHAR Ligands relevant toSARS-CoV-2(COVID-19

64 items httpswwwguidetopharmacologyorgGRACCoronavirusForward

ChEMBL ChEMBL 27 SARS CoV-2Release 8 datasets

133 selected(IC50EC50 betterthan 10 M)

httpchemblblogspotcom202005chembl27-sars-cov-2-releasehtml

StructuralBioinformatics andNetwork BiologyGroup (SBNB)

Drugs from sthe COVID19literature

307 httpssbnbirbbarcelonaorgcovid19

OpenData Portal NCATS Anti-infectivesCollection

740 httpsopendatancatsnihgovcovid19indexhtml

MedChemExpress SARS-CoV List of Drugs 76 httpswwwmedchemexpresscomTargetsSARS-CoVhtml

MedChemExpress Anti-Virus CompoundLibrary

512 httpswwwmedchemexpresscomscreeningAnti-virus Compound Libraryhtml

MedChemExpress Anti-COVID-19 CompoundLibrary

1552 httpswwwmedchemexpresscomscreeninganti-covid-19-compound-libraryhtml

Figure 4 Example output of the DruGeNetwork python script list of genes (column lsquo1rsquo) with a short descriptionannotation of the underlying biologicalmechanism associated with each gene (column lsquo1rsquo) The output is generated for the query keywords sent to the NCBI Gene resources ldquosildenafilrdquo ANDldquoHomo sapiensrdquo[porgn] AND alive[prop]

gene is categorized as lsquoGroup enrichedrsquo in natural killer(NK) cells according to consensus transcriptomics dataNK cells acting as cytotoxic lymphocytes are involved ininnate immune system regulation including rapid cytokineproduction in the presence of virus-infected cells (67) TheNK-mediated antiviral immune response is associated withthe NCR1 gene that encodes the natural cytotoxicity re-ceptor 1 (68) In the next step both the PDE6 and NCR1genes were detected in the Chronic Obstructive PulmonaryDisease (COPD)-related Gene Set using Harmonizome onthe collection of lsquoomicsrsquo Big Data sets (69) This geneset was deposited in the GEO Signatures of DifferentiallyExpressed Genes for Diseases under the name lsquoCOPD-Chronic Obstructive Pulmonary Disease Muscle-Striated(Skeletal)-Diaphragm (MMHCC) GSE47rsquo The data showthat the expression of the PDE6G gene is significantly in-creased whereas decreased expression was reported for theNCR1 gene

Therefore in the context of drug repurposing strate-gies it is reasonable to expect that sildenafil will be use-

ful for the treatment of COVID19 complications Accord-ingly two recent clinical studies (ClinicalTrialsgov identi-fiers NCT04304313 and NCT04489446) were initiated tostudy the efficacy and safety of sildenafil in patients withCOVID-19 (NCT04304313) and to assess the role of silde-nafil in improving oxygenation among hospitalised patients(NCT04489446)

Identification of drug target genes

We extracted target gene information for each putativeCOVID-19 drug from the Therapeutic Target Database(70) and by text-mining of the literature at PubMed Tounderstand the expression profile of drug target genes iden-tified in this manner in the COVID-19 infection we per-formed transcriptome analysis of infected bronchial epithe-lial cells For this we retrieved raw RNA-sequencing datafor SARS-CoV-2-infected bronchial epithelial cells from thesequence read archive (SRA) database under accession noPRJNA615032 (71) The FASTQ files were mapped and

Nucleic Acids Research 2021 Vol 49 Database issue D1119

Figure 5 The Drug-Gene local Network built using the output for sildenafil (A) Further functional enrichment reveals the group of genes involved in theinnate immunity and inflammatory processes Moreover VDR (receptors activated by calcitriol) and the HIF1A-STAT3 path suppressed by resveratrolare also involved in the local anti-inflammatory pharmacological network thereby suggesting the ldquosildenafil-resveratrol-vitamin D3rdquo drug combinationto treat the COVID-19 complications

aligned to the hg38 reference genome using STAR (72)Differentially expressed genes were identified using edgeR(73) with parameters set at 20-fold change and lt005 P-value cut-off We thus found target genes for 41 drugs fromour database (eg sildenafil as discussed in previous para-graph) which are significantly differentially expressed dur-ing COVID-19 infection These drug-gene pairs are given inthe Supplemental Table S2

COVID19 Drug Repository server hardware and software re-quirements

The COVID19 Drug Repository server was built on anApache web server and deployed on the RedHat EnterpriseLinux (RHEL) 74 server of an Intel(R) Xeon(R) CPU E5ndash2620 v2 210GHz and 32GB RAM unit The COVID19Drug Repository has a code base and infrastructure simi-lar to that of the ChiTaRS database (7475) The COVID19Drug Repository website is compatible with modern webbrowsers (such as Chrome Firefox Microsoft Edge Operaand Safari) provided that JavaScript is enabled We recom-mend using the latest release version of these web browsersfor optimal rendering

Downloads

The COVID19 Drug Repository not only provides extendedlsquoSearchrsquo options but also offers the possibility to downloadall database tables and data sets in a user-friendly mannerThe repository is available at httpcovid19mdbiuacil

CONCLUSIONS AND FUTURE PLANS

The COVID19 Drug Repository maps data from chemoge-nomics and pharmacogenomics studies and provides viraland human genomics and proteomics information on ap-proved drugs and other therapeutics The database enablesthe user to focus on different levels of complexity startingfrom general information clinical trials and formulationsand increasing the resolution to the level of molecular mech-anisms of drug action Therefore the database can serve as anavigation and recommendation tool both for research andfor healthcare purposes Future plans include the followingadditions to the database (a) continuous updating with newdata on approved drugs experimental drugs and drug-likesynthetic or natural chemical substances (b) automatic ma-chine learning and text-mining-based annotation and visu-alization of lsquoMode of Actionrsquo (MoA) data as well as lsquoDrug-Genersquo and lsquoDrug-Symptomrsquo networks and (c) incorpora-tion of the DrugsNGS analysis tools (lsquotranscriptomicsrsquo) toaccelerate the translation of knowledge for use as personal-ized medicine for COVID19 patients

DATA AVAILABILITY

httpcovid19mdbiuacil

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online

D1120 Nucleic Acids Research 2021 Vol 49 Database issue

ACKNOWLEDGEMENTS

EL has been working as a volunteer at the Frenkel-Morgensternrsquos lab in the COVID-19 related project

FUNDING

MF-M is supported by the Kamin grant of Israel In-novation Authority (66824 2019-2021) and COVID-19Data Science Institute (DSI) grant Bar-Ilan University(247017 2020) SM is supported by the PBC fellowshipfor outstanding postdoctoral researchers from India by theCouncil of Higher Education Israel (VaTaT 104-2019 atM F-M lab)Conflict of interest statement Authors declare no conflict ofinterest

REFERENCES1 ChenQ AllotA and LuZ (2020) Keep up with the latest

coronavirus research Nature 579 1932 Lu WangL LoK ChandrasekharY ReasR YangJ EideD

FunkK KinneyR LiuZ MerrillW et al (2020) CORD-19 TheCovid-19 Open Research Dataset arXiv doihttpsarxivorgabs200410706v2 10 July 2020 preprint not peerreviewed

3 SwansonDR (2008) In BruzaP and WeeberM (eds)Literature-based Discovery Springer Berlin Heidelberg BerlinHeidelberg pp 3ndash11

4 WeiCH KaoHY and LuZ (2013) PubTator a web-based textmining tool for assisting biocuration Nucleic Acids Res 41W518ndashW522

5 RobertsK AlamT BedrickS Demner-FushmanD LoKSoboroffI VoorheesE WangLL and HershWR (2020)TREC-COVID rationale and Structure of an Information RetrievalShared Task for COVID-19 J Am Med Inform Assoc 271431ndash1436

6 CrichtonG BakerS GuoY and KorhonenA (2020) Neuralnetworks for open and closed Literature-based Discovery PLoS One15 e0232891

7 ZhangE GuptaN NogueiraR ChoK and LinJ (2020) Rapidlydeploying a neural search engine for the covid-19 open researchdataset Preliminary thoughts and lessons learned arXiv doihttpsarxivorgabs200405125 10 April 2020 preprint not peerreviewed

8 TarasovaO IvanovS FilimonovDA and PoroikovV (2020) Dataand text mining help identify key proteins involved in the molecularmechanisms shared by SARS-CoV-2 and HIV-1 Molecules 25 2944

9 WeiCH HarrisBR LiD BerardiniTZ HualaE KaoHYand LuZ (2012) Accelerating literature curation with text-miningtools a case study of using PubTator to curate genes in PubMedabstracts Database (Oxford) 2012 bas041

10 WeiCH AllotA LeamanR and LuZ (2019) PubTator centralautomated concept annotation for biomedical full text articlesNucleic Acids Res 47 W587ndashW593

11 OsinskiS and WeissD (2004) Conceptual Clustering Using LingoAlgorithm Evaluation on Open Directory Project Data InKłopotekMA WierzchonST and TrojanowskiK (eds) IntelligentInformation Processing and Web Mining Advances in Soft ComputingSpringer Berlin Heidelberg Vol 25 pp 369ndash377

12 TagoreS GorohovskiA JensenLJ and Frenkel-MorgensternM(2019) ProtFus a comprehensive method characterizingprotein-protein interactions of fusion proteins PLoS Comput Biol15 e1007239

13 MeyerUA ZangerUM and SchwabM (2013) Omics and drugresponse Annu Rev Pharmacol Toxicol 53 475ndash502

14 ZhengXF and ChanTF (2002) Chemical genomics a systematicapproach in biological research and drug discovery Curr Issues MolBiol 4 33ndash43

15 EngelbergA (2004) Iconix Pharmaceuticals Incndashremoving barriersto efficient drug discovery through chemogenomicsPharmacogenomics 5 741ndash744

16 MagarinosMP CarmonaSJ CrowtherGJ RalphSARoosDS ShanmugamD Van VoorhisWC and AgueroF (2012)TDR targets a chemogenomics resource for neglected diseasesNucleic Acids Res 40 D1118ndashD1127

17 WangL MaC WipfP LiuH SuW and XieXQ (2013)TargetHunter an in silico target identification tool for predictingtherapeutic potential of small organic molecules based onchemogenomic database AAPS J 15 395ndash406

18 TributO LessardY ReymannJM AllainH and Bentue-FerrerD(2002) Pharmacogenomics Med Sci Monit 8 RA152ndashRA163

19 HoodE (2003) Pharmacogenomics the promise of personalizedmedicine Environ Health Perspect 111 A581ndashA589

20 WeinshilboumR and WangL (2004) Pharmacogenomics bench tobedside Nat Rev Drug Discov 3 739ndash748

21 ServiceRF (2005) Pharmacogenomics Going from genome to pillScience 308 1858ndash1860

22 BlakeCA and SobelBE (2008) Progress in pharmacogenomics andits promise for medicine Exp Biol Med (Maywood) 2331482ndash1483

23 WangL (2010) Pharmacogenomics a systems approach WileyInterdiscip Rev Syst Biol Med 2 3ndash22

24 BurtT and DhillonS (2013) Pharmacogenomics in early-phaseclinical development Pharmacogenomics 14 1085ndash1097

25 MusaA TripathiS DehmerM Yli-HarjaO KauffmanSA andEmmert-StreibF (2019) Systems pharmacogenomic landscape ofdrug similarities from LINCS data Drug Association Networks SciRep 9 7849

26 KalamaraA TobalinaL and Saez-RodriguezJ (2018) How to findthe right drug for each patient Advances and challenges inpharmacogenomics Curr Opin Syst Biol 10 53ndash62

27 KirkRJ HungJL HornerSR and PerezJT (2008) Implicationsof pharmacogenomics for drug development Exp Biol Med(Maywood) 233 1484ndash1497

28 Sharifi-NoghabiH ZolotarevaO CollinsCC and EsterM (2019)MOLI multi-omics late integration with deep neural networks fordrug response prediction Bioinformatics 35 i501ndashi509

29 PulleyJM RhoadsJP JeromeRN ChallaAP ErregerKBJolyMM LavieriRR PerryKE ZaleskiNM Shirey-RiceJKet al (2020) Using what we already have uncovering new drugrepurposing strategies in existing omics data Annu Rev PharmacolToxicol 60 333ndash352

30 SindelarRD (2013) Genomics other ldquoOmicrdquo technologiespersonalized medicine and additional biotechnology-relatedtechniques In CrommelinDJA SindelarRD and MeibohmB(eds) Pharmaceutical Biotechnology Fundamentalsand ApplicationsSpringer International Publishing NY pp 179ndash221

31 Uran LandaburuL BerensteinAJ VidelaS MaruPShanmugamD ChernomoretzA and AgueroF (2020) TDRTargets 6 driving drug discovery for human pathogens throughintensive chemogenomic data integration Nucleic Acids Res 48D992ndashD1005

32 FengZ ChenM LiangT ShenM ChenH and XieXQ (2020)Virus-CKB an integrated bioinformatics platform and analysisresource for COVID-19 research Brief Bioinform bbaa155

33 Duran-FrigolaM PaulsE Guitart-PlaO BertoniM AlcaldeVAmatD Juan-BlancoT and AloyP (2020) Extending thesmall-molecule similarity principle to all levels of biology with thechemical checker Nature Biotechnology 38 1087ndash1096

34 SinghN DecrolyE KhatibAM and VilloutreixBO (2020)Structure-based drug repositioning over the human TMPRSS2protease domain search for chemical probes able to repressSARS-CoV-2 Spike protein cleavages Eur J Pharm Sci 153105495

35 GordonDE JangGM BouhaddouM XuJ ObernierKWhiteKM OrsquoMearaMJ RezeljVV GuoJZ SwaneyDL et al(2020) A SARS-CoV-2 protein interaction map reveals targets fordrug repurposing Nature 583 459ndash468

36 TakahashiT LuzumJA NicolMR and JacobsonPA (2020)Pharmacogenomics of COVID-19 therapies NPJ Genom Med 5 35

37 WishartDS KnoxC GuoAC ShrivastavaS HassanaliMStothardP ChangZ and WoolseyJ (2006) DrugBank acomprehensive resource for in silico drug discovery and explorationNucleic Acids Res 34 D668ndashD672

38 WishartDS FeunangYD GuoAC LoEJ MarcuAGrantJR SajedT JohnsonD LiC SayeedaZ et al (2018)

Nucleic Acids Research 2021 Vol 49 Database issue D1121

DrugBank 50 a major update to the DrugBank database for 2018Nucleic Acids Res 46 D1074ndashD1082

39 KimS ChenJ ChengT GindulyteA HeJ HeS LiQShoemakerBA ThiessenPA YuB et al (2019) PubChem 2019update improved access to chemical data Nucleic Acids Res 47D1102ndashD1109

40 ArmstrongJF FaccendaE HardingSD PawsonAJSouthanC SharmanJL CampoB CavanaghDRAlexanderSPH DavenportAP et al (2020) The IUPHARBPSGuide to PHARMACOLOGY in 2020 extendingimmunopharmacology content and introducing the IUPHARMMVGuide to MALARIA PHARMACOLOGY Nucleic Acids Res 48D1006ndashD1021

41 HuffenbergerMA and WigingtonRL (1975) Chemical AbstractsService approach to management of large data bases J Chem InfComput Sci 15 43ndash47

42 WeisgerberDW (1997) Chemical Abstracts Service ChemicalRegistry System history scope and impacts J Am Soc Inf Sci 48349ndash360

43 WideniusM AxmarkD and ArnoK (2002) In MySQL ReferenceManual Documentation From the Source OrsquoReilly Media Inc

44 CockPJ AntaoT ChangJT ChapmanBA CoxCJ DalkeAFriedbergI HamelryckT KauffF WilczynskiB et al (2009)Biopython freely available Python tools for computational molecularbiology and bioinformatics Bioinformatics 25 1422ndash1423

45 CoordinatorsNR (2016) Database resources of the National Centerfor Biotechnology Information Nucleic Acids Res 44 D7ndashD19

46 ThompsonK (1968) Programming techniques regular expressionsearch algorithm Commun ACM 11 419ndash422

47 StefanowskiJ and WeissD (2003) Carrot2 and Language Propertiesin Web Search Results Clustering In MenasalvasE SegoviaJ andSzczepaniakPS (eds) Advances in Web Intelligence Springer BerlinHeidelberg pp 240ndash249

48 SayersE (2010) E-utilities Quick Start EntrezProgramming UtilitiesHelp [Internet] Bethesda MD National Center for BiotechnologyInformation (US) httpwwwncbinlmnihgovbooksNBK25500

49 LipscombCE (2000) Medical subject headings (MeSH) Bull MedLibr Assoc 88 265

50 SahuNU and KharkarPS (2016) Computational DrugRepositioning a lateral approach to traditional drug discovery CurrTop Med Chem 16 2069ndash2077

51 SamE and AthriP (2019) Web-based drug repurposing tools asurvey Brief Bioinform 20 299ndash316

52 DonnerY KazmierczakS and FortneyK (2018) Drug repurposingusing deep embeddings of gene expression profiles Mol Pharm 154314ndash4325

53 AlexanderSPH ArmstrongJF DavenportAP DaviesJAFaccendaE HardingSD Levi-SchafferF MaguireJJPawsonAJ SouthanC et al (2020) A rational roadmap forSARS-CoV-2COVID-19 pharmacotherapeutic research anddevelopment IUPHAR Review 29 Br J Pharmacol 1774942ndash4966

54 FaccendaE ArmstrongJ DavenprotA HardingS PawsonASouthanC and DaviesJ (2020) Coronavirus InformationIUPHARBPS Guide to Pharmacologyhttpswwwguidetopharmacologyorgcoronavirusjsp

55 JeonS KoM LeeJ ChoiI ByunSY ParkS ShumD andKimS (2020) Identification of antiviral drug candidates againstSARS-CoV-2 from FDA-approved drugs Antimicrob AgentsChemother 64 e00819-20

56 RivaL YuanS YinX Martin-SanchoL MatsunagaNBurgstaller-MuehlbacherS PacheL De JesusPP HullMVChangM et al (2020) A large-scale drug repositioning survey forSARS-CoV-2 antivirals bioRxiv doihttpsdoiorg10110120200416044016 17 April 2020 preprintnot peer reviewed

57 BrimacombeKR ZhaoT EastmanRT HuX WangKBackusM BaljinnyamB ChenCZ ChenL EicherT et al (2020)An OpenData portal to share COVID-19 drug repurposing data inreal time bioRxiv doi httpsdoiorg10110120200604135046 05June 2020 preprint not peer reviewed

58 MirabelliC WotringJW ZhangCJ McCartySM FursmidtRFrumT KadambiNS AminAT OrsquoMearaTR PrettoCD et al(2020) Morphological cell profiling of SARS-CoV-2 infectionidentifies drug repurposing candidates for COVID-19 bioRxiv doi

httpsdoiorg10110120200527117184 27 May 2020 preprintnot peer reviewed

59 TangH AbouleilaY SiL Ortega-PrietoAM MummeryCLIngberDE and MashaghiA (2020) Human organs-on-chips forvirology Trends Microbiol 28 934ndash946

60 WestonS ColemanCM HauptR LogueJ MatthewsK LiYReyesHM WeissSR and FriemanMB (2020) Broadanti-coronaviral activity of FDA approved drugs againstSARS-CoV-2 J Virol 94 e01218-20

61 TouretF GillesM BarralK NougairedeA van HeldenJDecrolyE de LamballerieX and CoutardB (2020) In vitroscreening of a FDA approved chemical library reveals potentialinhibitors of SARS-CoV-2 replication Sci Rep 10 13093

62 EllingerB BojkovaD ZalianiA CinatlJ ClaussenCWesthausS ReinshagenJ KuzikovM WolfM and GeisslingerG(2020) Identification of inhibitors of SARS-CoV-2 in-vitro cellulartoxicity in human (Caco-2) cells using a large scale drug repurposingcollection ResearchSquare doi1021203rs3rs-23951v1

63 HeiserK McLeanPF DavisCT FogelsonB GordonHBJacobsonP HurstB MillerB AlfaRW EarnshawBA et al(2020) Identification of potential treatments for COVID-19 throughartificial intelligence-enabled phenomic analysis of human cellsinfected with SARS-CoV-2 bioRxiv doihttpsdoiorg10110120200421054387 23 April 2020 preprintnot peer reviewed

64 SzklarczykD GableAL LyonD JungeA WyderSHuerta-CepasJ SimonovicM DonchevaNT MorrisJH BorkPet al (2019) STRING v11 protein-protein association networks withincreased coverage supporting functional discovery in genome-wideexperimental datasets Nucleic Acids Res 47 D607ndashD613

65 NikolovaS GuentherA SavaiR WeissmannN GhofraniHAKonigshoffM EickelbergO KlepetkoW VoswinckelRSeegerW et al (2010) Phosphodiesterase 6 subunits are expressedand altered in idiopathic pulmonary fibrosis Respir Res 11 146

66 HemnesAR ZaimanA and ChampionHC (2008) PDE5Ainhibition attenuates bleomycin-induced pulmonary fibrosis andpulmonary hypertension through inhibition of ROS generation andRhoARho kinase activation Am J Physiol Lung Cell MolPhysiol 294 L24ndashL33

67 UhlenM FagerbergL HallstromBM LindskogC OksvoldPMardinogluA SivertssonA KampfC SjostedtE AsplundAet al (2015) Proteomics Tissue-based map of the human proteomeScience 347 1260419

68 PessinoA SivoriS BottinoC MalaspinaA MorelliLMorettaL BiassoniR and MorettaA (1998) Molecular cloning ofNKp46 a novel member of the immunoglobulin superfamily involvedin triggering of natural cytotoxicity J Exp Med 188 953ndash960

69 RouillardAD GundersenGW FernandezNF WangZMonteiroCD McDermottMG and MarsquoayanA (2016) Theharmonizome a collection of processed datasets gathered to serveand mine knowledge about genes and proteins Database (Oxford)2016 baw100

70 WangY ZhangS LiF ZhouY ZhangY WangZ ZhangRZhuJ RenY TanY et al (2020) Therapeutic target database 2020enriched resource for facilitating research and early development oftargeted therapeutics Nucleic Acids Res 48 D1031ndashD1041

71 Blanco-MeloD Nilsson-PayantBE LiuWC UhlSHoaglandD MoslashllerR JordanTX OishiK PanisM SachsDet al (2020) Imbalanced host response to SARS-CoV-2 drivesdevelopment of COVID-19 Cell 181 1036ndash1045

72 DobinA DavisCA SchlesingerF DrenkowJ ZaleskiC JhaSBatutP ChaissonM and GingerasTR (2013) STAR ultrafastuniversal RNA-seq aligner Bioinformatics 29 15ndash21

73 RobinsonMD McCarthyDJ and SmythGK (2010) edgeR aBioconductor package for differential expression analysis of digitalgene expression data Bioinformatics 26 139ndash140

74 BalamuraliD GorohovskiA DetrojaR PalandeVRaviv-ShayD and Frenkel-MorgensternM (2019) ChiTaRS 50 thecomprehensive database of chimeric transcripts matched withdruggable fusions and 3D chromatin maps Nucleic Acids Res 48D825ndashD834

75 Frenkel-MorgensternM GorohovskiA LacroixV RogersMIbanezK BoullosaC Andres LeonE Ben-HurA andValenciaA (2013) ChiTaRS a database of human mouse and fruitfly chimeric transcripts and RNA-sequencing data Nucleic AcidsRes 41 D142

Page 4: COVID19 Drug Repository: text-mining the literature in

D1116 Nucleic Acids Research 2021 Vol 49 Database issue

Table 1 Databases linked with the COVID-19 Drugs Repository

Database Identifier Databaseresource Links

ACC id () COVID19 Drug Repository httpcovid19mdbiuacilUNII ID Unique Ingredient Identifier httpswwwfdagovindustryfda-resources-data-standardsfdas-global-

substance-registration-systemCAS ID () Chemical Abstracts Service httpswwwcasorgPubChem CID PubChem Compound httpspubchemncbinlmnihgovPubChem SID PubChem Substance httpspubchemncbinlmnihgovPubMed ID PubMed httpspubmedncbinlmnihgovDrugBank ID DrugBank httpswwwdrugbankcaIUPHAR ID IUPHARBPS httpsiupharorgClinical Trials ID ClinicalTrialsgov httpsclinicaltrialsgov

of SARS-CoV-2 infection has been reported at Clinical-Trialsgov and PubMed Experimental therapeutic agentsdrug-like synthetic or natural chemical substances have alsobeen studied in the context of COVID-19 as reported in thePubMed database To extend the coverage of our databasewe reviewed articles from PubMed to extract COVID-19-related concepts and keywords Specifically using lsquodrugrepositioningrsquo lsquodrug repurposingrsquo lsquoCOVID19rsquo and lsquocoro-navirus infectionsdrug therapyrsquo as query keywords wereached PubMed publications with titles andor abstractscontaining combinations of these keywords Next we cre-ated another pattern of queries to search target informa-tion at PubMed and using Google and Google Scholar Thepattern was expressed as a pseudo-Regular Expression (46)where text in uppercase letters denotes variable nameslsquo((DRUG NAME)|(DRUG ALIAS)) ((sars)|(mers)|(corona covid)) (patients)rsquoExamples of the multi-step search queries are as followslsquocamostat corona covidrsquo rarr [output 1 informationrefere

nces] rarr [therapeutic agents]lsquocamostat sarsrsquo rarr [output 2 informationreferences] rarr [therapeutic agents]lsquocamostat sars patientsrsquo rarr [output 3 informationreferences] rarr [therapeutic agents]lsquoONO-3403 sars patientsrsquo rarr [output 4 informationreferences] rarr [therapeutic agents]

Using this search strategy we found sim100 substanceswith activities associated with COVID-19 The queries andthe patterns used and the information obtained daily arethe main sources for Repository updates

All lsquoactiversquo chemical entitiessubstances (ie those withdemonstrated or proposed therapeutic potential) were col-lected and linked via their identifiers in CAS PubChemIUPHAR etc Furthermore we adopted the web-basedtext clustering engine Carrot2 (47) for visualization ofpair-wise lsquodrug-COVID-19 conceptrsquo associations found inPubMed abstracts for each pair

In this version of our database COVID19 drugs weremapped to a dictionary of 21 terms related to conceptsof lsquoCOVID-19rsquo eg lsquoviral infectionsrsquo lsquorespiratory diseasesrsquolsquoinflammatory cellrsquo lsquocoronavirus pneumoniarsquo etc (Sup-plemental Table S1) These terms are the most frequentwordscombinations clustered around the central wordssuch as lsquovirusrsquo lsquoinfectionrsquo lsquoinflammationrsquo lsquopneumoniarsquolsquolungsrsquo To create the conceptsrsquo dictionary a variety ofclusters were generated by experimentation with differ-ent hierarchical clustering algorithms applied to the col-lection of PubMed titlesabstracts Links to all PubMed

abstracts associated with these lsquodrug-conceptrsquo pairs weregenerated and enumerated using Python scripts PubMedsearch queries were created according to PubMed querysyntax (48) and MeSH terms (49) The automatically gen-erated tables (available in the lsquoDOWNLOADSrsquo section) listthe number of retrieved PubMed publications correspond-ing to each lsquodrug-COVID-19 conceptrsquo pair These num-bers are hyperlinked with the corresponding PubMed pub-lications Links to references and the Carrot2 text cluster-ing and visualization tool can be updated on a regular ba-sis Such updates are necessary as the web and PubMeddatabase are constantly expanding with new references andsites appearing daily All desired data can be downloadedfrom the COVID19 Drug Repository website (link) as Exceltables containing the list of keywords (ie the lsquodictionaryrsquo)used for text-mining and mapping In subsequent versionsof the database users will be able to modify the list or intro-duce additional concepts With this simple mapping toolone can discover and visualize new concepts and associa-tions that would not otherwise be found

Drug repurposing sources

Currently there are 384 mapped drug names mentionedin 960 COVID-19 clinical studies (data retrieved on Au-gust 15 2020) with at least 1 drug intervention (Table2) None of these drugs are novel Rather they exemplifya lsquodrug repurposingrepositioningrsquo approach (263450ndash52) Recently numerous COVID19-specific web pages andchemical libraries have been created by different researchorganizations (CAS IUPHAR (5354) ChEMBL Open-Data Portal (httpsopendatancatsnihgovcovid19indexhtml) etc) and companies (MedChem Express) and usedfor the high throughput screening against SARS-CoV-2 in-fection (55ndash63) All these molecular libraries and collections(Figure 2 Table 2) are being used in our data collection pro-cess (Figure 2) and listed in the Repository web page (lsquoUse-ful Linksrsquo)

Drug-gene interaction networks

The BiopythonEntrez-based Python command-line script(as discussed in the Features and functionality section) wascreated to access the NCBI Gene database (45) and to au-tomatically retrieve human or microbial (and in particularviral) genes associated with a given list of drugs or chemicalsubstances The output (Figure 4) provides a list of geneswith a short description of the biological role associatedwith each gene product in the output list

Nucleic Acids Research 2021 Vol 49 Database issue D1117

Figure 3 Web interface of the COVID19 Drug Repository Selection of the drug of interest (eg Favipiravir) from the dropdown menulist

Those genes associated with a set of drugs can be anal-ysed clustered or served as input for building lsquodrug-genersquonetworks and then visualized using external programs Asa working example protein-protein association networkswere built for output gene sets using the STRINGv11database (64) Moreover the application programming in-terface (API) implemented in the STRINGv11 databaseenables efficient interaction of external databases with theSTRING visualization and analysis tools (64) For exam-ple visualization of the set of genes associated with the

PDE5A inhibitor sildenafil a vasodilator agent revealedother interesting targets (Figure 4) such as the enzymePDE6G Both enzymes are active in the lungs (6566)Further network analysis of available data showed thatPDE5APDE6G inhibition by sildenafil in lung blood ves-sels can trigger different anti-inflammatory pathways

To build a network (Figure 5) we extracted additionalinformation from the literature and external databases Inthe PDE5A and PDE6G protein expression summaries ob-tained from the Human Protein Atlas (67) the PGE6G

D1118 Nucleic Acids Research 2021 Vol 49 Database issue

Table 2 COVID19-specific collections and chemical libraries

ResourceDataset (title catalogue orcategory) Drugs active agents Links

ClinicalTrialsgov COVID-19 studies bymapped drug intervention

384 httpsclinicaltrialsgovct2covid viewdrugs

CAS COVID-19 AntiviralCandidate CompoundsDataset

sim50000 httpswwwcasorgcovid-19-antiviral-compounds-dataset

IUPHAR Ligands relevant toSARS-CoV-2(COVID-19

64 items httpswwwguidetopharmacologyorgGRACCoronavirusForward

ChEMBL ChEMBL 27 SARS CoV-2Release 8 datasets

133 selected(IC50EC50 betterthan 10 M)

httpchemblblogspotcom202005chembl27-sars-cov-2-releasehtml

StructuralBioinformatics andNetwork BiologyGroup (SBNB)

Drugs from sthe COVID19literature

307 httpssbnbirbbarcelonaorgcovid19

OpenData Portal NCATS Anti-infectivesCollection

740 httpsopendatancatsnihgovcovid19indexhtml

MedChemExpress SARS-CoV List of Drugs 76 httpswwwmedchemexpresscomTargetsSARS-CoVhtml

MedChemExpress Anti-Virus CompoundLibrary

512 httpswwwmedchemexpresscomscreeningAnti-virus Compound Libraryhtml

MedChemExpress Anti-COVID-19 CompoundLibrary

1552 httpswwwmedchemexpresscomscreeninganti-covid-19-compound-libraryhtml

Figure 4 Example output of the DruGeNetwork python script list of genes (column lsquo1rsquo) with a short descriptionannotation of the underlying biologicalmechanism associated with each gene (column lsquo1rsquo) The output is generated for the query keywords sent to the NCBI Gene resources ldquosildenafilrdquo ANDldquoHomo sapiensrdquo[porgn] AND alive[prop]

gene is categorized as lsquoGroup enrichedrsquo in natural killer(NK) cells according to consensus transcriptomics dataNK cells acting as cytotoxic lymphocytes are involved ininnate immune system regulation including rapid cytokineproduction in the presence of virus-infected cells (67) TheNK-mediated antiviral immune response is associated withthe NCR1 gene that encodes the natural cytotoxicity re-ceptor 1 (68) In the next step both the PDE6 and NCR1genes were detected in the Chronic Obstructive PulmonaryDisease (COPD)-related Gene Set using Harmonizome onthe collection of lsquoomicsrsquo Big Data sets (69) This geneset was deposited in the GEO Signatures of DifferentiallyExpressed Genes for Diseases under the name lsquoCOPD-Chronic Obstructive Pulmonary Disease Muscle-Striated(Skeletal)-Diaphragm (MMHCC) GSE47rsquo The data showthat the expression of the PDE6G gene is significantly in-creased whereas decreased expression was reported for theNCR1 gene

Therefore in the context of drug repurposing strate-gies it is reasonable to expect that sildenafil will be use-

ful for the treatment of COVID19 complications Accord-ingly two recent clinical studies (ClinicalTrialsgov identi-fiers NCT04304313 and NCT04489446) were initiated tostudy the efficacy and safety of sildenafil in patients withCOVID-19 (NCT04304313) and to assess the role of silde-nafil in improving oxygenation among hospitalised patients(NCT04489446)

Identification of drug target genes

We extracted target gene information for each putativeCOVID-19 drug from the Therapeutic Target Database(70) and by text-mining of the literature at PubMed Tounderstand the expression profile of drug target genes iden-tified in this manner in the COVID-19 infection we per-formed transcriptome analysis of infected bronchial epithe-lial cells For this we retrieved raw RNA-sequencing datafor SARS-CoV-2-infected bronchial epithelial cells from thesequence read archive (SRA) database under accession noPRJNA615032 (71) The FASTQ files were mapped and

Nucleic Acids Research 2021 Vol 49 Database issue D1119

Figure 5 The Drug-Gene local Network built using the output for sildenafil (A) Further functional enrichment reveals the group of genes involved in theinnate immunity and inflammatory processes Moreover VDR (receptors activated by calcitriol) and the HIF1A-STAT3 path suppressed by resveratrolare also involved in the local anti-inflammatory pharmacological network thereby suggesting the ldquosildenafil-resveratrol-vitamin D3rdquo drug combinationto treat the COVID-19 complications

aligned to the hg38 reference genome using STAR (72)Differentially expressed genes were identified using edgeR(73) with parameters set at 20-fold change and lt005 P-value cut-off We thus found target genes for 41 drugs fromour database (eg sildenafil as discussed in previous para-graph) which are significantly differentially expressed dur-ing COVID-19 infection These drug-gene pairs are given inthe Supplemental Table S2

COVID19 Drug Repository server hardware and software re-quirements

The COVID19 Drug Repository server was built on anApache web server and deployed on the RedHat EnterpriseLinux (RHEL) 74 server of an Intel(R) Xeon(R) CPU E5ndash2620 v2 210GHz and 32GB RAM unit The COVID19Drug Repository has a code base and infrastructure simi-lar to that of the ChiTaRS database (7475) The COVID19Drug Repository website is compatible with modern webbrowsers (such as Chrome Firefox Microsoft Edge Operaand Safari) provided that JavaScript is enabled We recom-mend using the latest release version of these web browsersfor optimal rendering

Downloads

The COVID19 Drug Repository not only provides extendedlsquoSearchrsquo options but also offers the possibility to downloadall database tables and data sets in a user-friendly mannerThe repository is available at httpcovid19mdbiuacil

CONCLUSIONS AND FUTURE PLANS

The COVID19 Drug Repository maps data from chemoge-nomics and pharmacogenomics studies and provides viraland human genomics and proteomics information on ap-proved drugs and other therapeutics The database enablesthe user to focus on different levels of complexity startingfrom general information clinical trials and formulationsand increasing the resolution to the level of molecular mech-anisms of drug action Therefore the database can serve as anavigation and recommendation tool both for research andfor healthcare purposes Future plans include the followingadditions to the database (a) continuous updating with newdata on approved drugs experimental drugs and drug-likesynthetic or natural chemical substances (b) automatic ma-chine learning and text-mining-based annotation and visu-alization of lsquoMode of Actionrsquo (MoA) data as well as lsquoDrug-Genersquo and lsquoDrug-Symptomrsquo networks and (c) incorpora-tion of the DrugsNGS analysis tools (lsquotranscriptomicsrsquo) toaccelerate the translation of knowledge for use as personal-ized medicine for COVID19 patients

DATA AVAILABILITY

httpcovid19mdbiuacil

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online

D1120 Nucleic Acids Research 2021 Vol 49 Database issue

ACKNOWLEDGEMENTS

EL has been working as a volunteer at the Frenkel-Morgensternrsquos lab in the COVID-19 related project

FUNDING

MF-M is supported by the Kamin grant of Israel In-novation Authority (66824 2019-2021) and COVID-19Data Science Institute (DSI) grant Bar-Ilan University(247017 2020) SM is supported by the PBC fellowshipfor outstanding postdoctoral researchers from India by theCouncil of Higher Education Israel (VaTaT 104-2019 atM F-M lab)Conflict of interest statement Authors declare no conflict ofinterest

REFERENCES1 ChenQ AllotA and LuZ (2020) Keep up with the latest

coronavirus research Nature 579 1932 Lu WangL LoK ChandrasekharY ReasR YangJ EideD

FunkK KinneyR LiuZ MerrillW et al (2020) CORD-19 TheCovid-19 Open Research Dataset arXiv doihttpsarxivorgabs200410706v2 10 July 2020 preprint not peerreviewed

3 SwansonDR (2008) In BruzaP and WeeberM (eds)Literature-based Discovery Springer Berlin Heidelberg BerlinHeidelberg pp 3ndash11

4 WeiCH KaoHY and LuZ (2013) PubTator a web-based textmining tool for assisting biocuration Nucleic Acids Res 41W518ndashW522

5 RobertsK AlamT BedrickS Demner-FushmanD LoKSoboroffI VoorheesE WangLL and HershWR (2020)TREC-COVID rationale and Structure of an Information RetrievalShared Task for COVID-19 J Am Med Inform Assoc 271431ndash1436

6 CrichtonG BakerS GuoY and KorhonenA (2020) Neuralnetworks for open and closed Literature-based Discovery PLoS One15 e0232891

7 ZhangE GuptaN NogueiraR ChoK and LinJ (2020) Rapidlydeploying a neural search engine for the covid-19 open researchdataset Preliminary thoughts and lessons learned arXiv doihttpsarxivorgabs200405125 10 April 2020 preprint not peerreviewed

8 TarasovaO IvanovS FilimonovDA and PoroikovV (2020) Dataand text mining help identify key proteins involved in the molecularmechanisms shared by SARS-CoV-2 and HIV-1 Molecules 25 2944

9 WeiCH HarrisBR LiD BerardiniTZ HualaE KaoHYand LuZ (2012) Accelerating literature curation with text-miningtools a case study of using PubTator to curate genes in PubMedabstracts Database (Oxford) 2012 bas041

10 WeiCH AllotA LeamanR and LuZ (2019) PubTator centralautomated concept annotation for biomedical full text articlesNucleic Acids Res 47 W587ndashW593

11 OsinskiS and WeissD (2004) Conceptual Clustering Using LingoAlgorithm Evaluation on Open Directory Project Data InKłopotekMA WierzchonST and TrojanowskiK (eds) IntelligentInformation Processing and Web Mining Advances in Soft ComputingSpringer Berlin Heidelberg Vol 25 pp 369ndash377

12 TagoreS GorohovskiA JensenLJ and Frenkel-MorgensternM(2019) ProtFus a comprehensive method characterizingprotein-protein interactions of fusion proteins PLoS Comput Biol15 e1007239

13 MeyerUA ZangerUM and SchwabM (2013) Omics and drugresponse Annu Rev Pharmacol Toxicol 53 475ndash502

14 ZhengXF and ChanTF (2002) Chemical genomics a systematicapproach in biological research and drug discovery Curr Issues MolBiol 4 33ndash43

15 EngelbergA (2004) Iconix Pharmaceuticals Incndashremoving barriersto efficient drug discovery through chemogenomicsPharmacogenomics 5 741ndash744

16 MagarinosMP CarmonaSJ CrowtherGJ RalphSARoosDS ShanmugamD Van VoorhisWC and AgueroF (2012)TDR targets a chemogenomics resource for neglected diseasesNucleic Acids Res 40 D1118ndashD1127

17 WangL MaC WipfP LiuH SuW and XieXQ (2013)TargetHunter an in silico target identification tool for predictingtherapeutic potential of small organic molecules based onchemogenomic database AAPS J 15 395ndash406

18 TributO LessardY ReymannJM AllainH and Bentue-FerrerD(2002) Pharmacogenomics Med Sci Monit 8 RA152ndashRA163

19 HoodE (2003) Pharmacogenomics the promise of personalizedmedicine Environ Health Perspect 111 A581ndashA589

20 WeinshilboumR and WangL (2004) Pharmacogenomics bench tobedside Nat Rev Drug Discov 3 739ndash748

21 ServiceRF (2005) Pharmacogenomics Going from genome to pillScience 308 1858ndash1860

22 BlakeCA and SobelBE (2008) Progress in pharmacogenomics andits promise for medicine Exp Biol Med (Maywood) 2331482ndash1483

23 WangL (2010) Pharmacogenomics a systems approach WileyInterdiscip Rev Syst Biol Med 2 3ndash22

24 BurtT and DhillonS (2013) Pharmacogenomics in early-phaseclinical development Pharmacogenomics 14 1085ndash1097

25 MusaA TripathiS DehmerM Yli-HarjaO KauffmanSA andEmmert-StreibF (2019) Systems pharmacogenomic landscape ofdrug similarities from LINCS data Drug Association Networks SciRep 9 7849

26 KalamaraA TobalinaL and Saez-RodriguezJ (2018) How to findthe right drug for each patient Advances and challenges inpharmacogenomics Curr Opin Syst Biol 10 53ndash62

27 KirkRJ HungJL HornerSR and PerezJT (2008) Implicationsof pharmacogenomics for drug development Exp Biol Med(Maywood) 233 1484ndash1497

28 Sharifi-NoghabiH ZolotarevaO CollinsCC and EsterM (2019)MOLI multi-omics late integration with deep neural networks fordrug response prediction Bioinformatics 35 i501ndashi509

29 PulleyJM RhoadsJP JeromeRN ChallaAP ErregerKBJolyMM LavieriRR PerryKE ZaleskiNM Shirey-RiceJKet al (2020) Using what we already have uncovering new drugrepurposing strategies in existing omics data Annu Rev PharmacolToxicol 60 333ndash352

30 SindelarRD (2013) Genomics other ldquoOmicrdquo technologiespersonalized medicine and additional biotechnology-relatedtechniques In CrommelinDJA SindelarRD and MeibohmB(eds) Pharmaceutical Biotechnology Fundamentalsand ApplicationsSpringer International Publishing NY pp 179ndash221

31 Uran LandaburuL BerensteinAJ VidelaS MaruPShanmugamD ChernomoretzA and AgueroF (2020) TDRTargets 6 driving drug discovery for human pathogens throughintensive chemogenomic data integration Nucleic Acids Res 48D992ndashD1005

32 FengZ ChenM LiangT ShenM ChenH and XieXQ (2020)Virus-CKB an integrated bioinformatics platform and analysisresource for COVID-19 research Brief Bioinform bbaa155

33 Duran-FrigolaM PaulsE Guitart-PlaO BertoniM AlcaldeVAmatD Juan-BlancoT and AloyP (2020) Extending thesmall-molecule similarity principle to all levels of biology with thechemical checker Nature Biotechnology 38 1087ndash1096

34 SinghN DecrolyE KhatibAM and VilloutreixBO (2020)Structure-based drug repositioning over the human TMPRSS2protease domain search for chemical probes able to repressSARS-CoV-2 Spike protein cleavages Eur J Pharm Sci 153105495

35 GordonDE JangGM BouhaddouM XuJ ObernierKWhiteKM OrsquoMearaMJ RezeljVV GuoJZ SwaneyDL et al(2020) A SARS-CoV-2 protein interaction map reveals targets fordrug repurposing Nature 583 459ndash468

36 TakahashiT LuzumJA NicolMR and JacobsonPA (2020)Pharmacogenomics of COVID-19 therapies NPJ Genom Med 5 35

37 WishartDS KnoxC GuoAC ShrivastavaS HassanaliMStothardP ChangZ and WoolseyJ (2006) DrugBank acomprehensive resource for in silico drug discovery and explorationNucleic Acids Res 34 D668ndashD672

38 WishartDS FeunangYD GuoAC LoEJ MarcuAGrantJR SajedT JohnsonD LiC SayeedaZ et al (2018)

Nucleic Acids Research 2021 Vol 49 Database issue D1121

DrugBank 50 a major update to the DrugBank database for 2018Nucleic Acids Res 46 D1074ndashD1082

39 KimS ChenJ ChengT GindulyteA HeJ HeS LiQShoemakerBA ThiessenPA YuB et al (2019) PubChem 2019update improved access to chemical data Nucleic Acids Res 47D1102ndashD1109

40 ArmstrongJF FaccendaE HardingSD PawsonAJSouthanC SharmanJL CampoB CavanaghDRAlexanderSPH DavenportAP et al (2020) The IUPHARBPSGuide to PHARMACOLOGY in 2020 extendingimmunopharmacology content and introducing the IUPHARMMVGuide to MALARIA PHARMACOLOGY Nucleic Acids Res 48D1006ndashD1021

41 HuffenbergerMA and WigingtonRL (1975) Chemical AbstractsService approach to management of large data bases J Chem InfComput Sci 15 43ndash47

42 WeisgerberDW (1997) Chemical Abstracts Service ChemicalRegistry System history scope and impacts J Am Soc Inf Sci 48349ndash360

43 WideniusM AxmarkD and ArnoK (2002) In MySQL ReferenceManual Documentation From the Source OrsquoReilly Media Inc

44 CockPJ AntaoT ChangJT ChapmanBA CoxCJ DalkeAFriedbergI HamelryckT KauffF WilczynskiB et al (2009)Biopython freely available Python tools for computational molecularbiology and bioinformatics Bioinformatics 25 1422ndash1423

45 CoordinatorsNR (2016) Database resources of the National Centerfor Biotechnology Information Nucleic Acids Res 44 D7ndashD19

46 ThompsonK (1968) Programming techniques regular expressionsearch algorithm Commun ACM 11 419ndash422

47 StefanowskiJ and WeissD (2003) Carrot2 and Language Propertiesin Web Search Results Clustering In MenasalvasE SegoviaJ andSzczepaniakPS (eds) Advances in Web Intelligence Springer BerlinHeidelberg pp 240ndash249

48 SayersE (2010) E-utilities Quick Start EntrezProgramming UtilitiesHelp [Internet] Bethesda MD National Center for BiotechnologyInformation (US) httpwwwncbinlmnihgovbooksNBK25500

49 LipscombCE (2000) Medical subject headings (MeSH) Bull MedLibr Assoc 88 265

50 SahuNU and KharkarPS (2016) Computational DrugRepositioning a lateral approach to traditional drug discovery CurrTop Med Chem 16 2069ndash2077

51 SamE and AthriP (2019) Web-based drug repurposing tools asurvey Brief Bioinform 20 299ndash316

52 DonnerY KazmierczakS and FortneyK (2018) Drug repurposingusing deep embeddings of gene expression profiles Mol Pharm 154314ndash4325

53 AlexanderSPH ArmstrongJF DavenportAP DaviesJAFaccendaE HardingSD Levi-SchafferF MaguireJJPawsonAJ SouthanC et al (2020) A rational roadmap forSARS-CoV-2COVID-19 pharmacotherapeutic research anddevelopment IUPHAR Review 29 Br J Pharmacol 1774942ndash4966

54 FaccendaE ArmstrongJ DavenprotA HardingS PawsonASouthanC and DaviesJ (2020) Coronavirus InformationIUPHARBPS Guide to Pharmacologyhttpswwwguidetopharmacologyorgcoronavirusjsp

55 JeonS KoM LeeJ ChoiI ByunSY ParkS ShumD andKimS (2020) Identification of antiviral drug candidates againstSARS-CoV-2 from FDA-approved drugs Antimicrob AgentsChemother 64 e00819-20

56 RivaL YuanS YinX Martin-SanchoL MatsunagaNBurgstaller-MuehlbacherS PacheL De JesusPP HullMVChangM et al (2020) A large-scale drug repositioning survey forSARS-CoV-2 antivirals bioRxiv doihttpsdoiorg10110120200416044016 17 April 2020 preprintnot peer reviewed

57 BrimacombeKR ZhaoT EastmanRT HuX WangKBackusM BaljinnyamB ChenCZ ChenL EicherT et al (2020)An OpenData portal to share COVID-19 drug repurposing data inreal time bioRxiv doi httpsdoiorg10110120200604135046 05June 2020 preprint not peer reviewed

58 MirabelliC WotringJW ZhangCJ McCartySM FursmidtRFrumT KadambiNS AminAT OrsquoMearaTR PrettoCD et al(2020) Morphological cell profiling of SARS-CoV-2 infectionidentifies drug repurposing candidates for COVID-19 bioRxiv doi

httpsdoiorg10110120200527117184 27 May 2020 preprintnot peer reviewed

59 TangH AbouleilaY SiL Ortega-PrietoAM MummeryCLIngberDE and MashaghiA (2020) Human organs-on-chips forvirology Trends Microbiol 28 934ndash946

60 WestonS ColemanCM HauptR LogueJ MatthewsK LiYReyesHM WeissSR and FriemanMB (2020) Broadanti-coronaviral activity of FDA approved drugs againstSARS-CoV-2 J Virol 94 e01218-20

61 TouretF GillesM BarralK NougairedeA van HeldenJDecrolyE de LamballerieX and CoutardB (2020) In vitroscreening of a FDA approved chemical library reveals potentialinhibitors of SARS-CoV-2 replication Sci Rep 10 13093

62 EllingerB BojkovaD ZalianiA CinatlJ ClaussenCWesthausS ReinshagenJ KuzikovM WolfM and GeisslingerG(2020) Identification of inhibitors of SARS-CoV-2 in-vitro cellulartoxicity in human (Caco-2) cells using a large scale drug repurposingcollection ResearchSquare doi1021203rs3rs-23951v1

63 HeiserK McLeanPF DavisCT FogelsonB GordonHBJacobsonP HurstB MillerB AlfaRW EarnshawBA et al(2020) Identification of potential treatments for COVID-19 throughartificial intelligence-enabled phenomic analysis of human cellsinfected with SARS-CoV-2 bioRxiv doihttpsdoiorg10110120200421054387 23 April 2020 preprintnot peer reviewed

64 SzklarczykD GableAL LyonD JungeA WyderSHuerta-CepasJ SimonovicM DonchevaNT MorrisJH BorkPet al (2019) STRING v11 protein-protein association networks withincreased coverage supporting functional discovery in genome-wideexperimental datasets Nucleic Acids Res 47 D607ndashD613

65 NikolovaS GuentherA SavaiR WeissmannN GhofraniHAKonigshoffM EickelbergO KlepetkoW VoswinckelRSeegerW et al (2010) Phosphodiesterase 6 subunits are expressedand altered in idiopathic pulmonary fibrosis Respir Res 11 146

66 HemnesAR ZaimanA and ChampionHC (2008) PDE5Ainhibition attenuates bleomycin-induced pulmonary fibrosis andpulmonary hypertension through inhibition of ROS generation andRhoARho kinase activation Am J Physiol Lung Cell MolPhysiol 294 L24ndashL33

67 UhlenM FagerbergL HallstromBM LindskogC OksvoldPMardinogluA SivertssonA KampfC SjostedtE AsplundAet al (2015) Proteomics Tissue-based map of the human proteomeScience 347 1260419

68 PessinoA SivoriS BottinoC MalaspinaA MorelliLMorettaL BiassoniR and MorettaA (1998) Molecular cloning ofNKp46 a novel member of the immunoglobulin superfamily involvedin triggering of natural cytotoxicity J Exp Med 188 953ndash960

69 RouillardAD GundersenGW FernandezNF WangZMonteiroCD McDermottMG and MarsquoayanA (2016) Theharmonizome a collection of processed datasets gathered to serveand mine knowledge about genes and proteins Database (Oxford)2016 baw100

70 WangY ZhangS LiF ZhouY ZhangY WangZ ZhangRZhuJ RenY TanY et al (2020) Therapeutic target database 2020enriched resource for facilitating research and early development oftargeted therapeutics Nucleic Acids Res 48 D1031ndashD1041

71 Blanco-MeloD Nilsson-PayantBE LiuWC UhlSHoaglandD MoslashllerR JordanTX OishiK PanisM SachsDet al (2020) Imbalanced host response to SARS-CoV-2 drivesdevelopment of COVID-19 Cell 181 1036ndash1045

72 DobinA DavisCA SchlesingerF DrenkowJ ZaleskiC JhaSBatutP ChaissonM and GingerasTR (2013) STAR ultrafastuniversal RNA-seq aligner Bioinformatics 29 15ndash21

73 RobinsonMD McCarthyDJ and SmythGK (2010) edgeR aBioconductor package for differential expression analysis of digitalgene expression data Bioinformatics 26 139ndash140

74 BalamuraliD GorohovskiA DetrojaR PalandeVRaviv-ShayD and Frenkel-MorgensternM (2019) ChiTaRS 50 thecomprehensive database of chimeric transcripts matched withdruggable fusions and 3D chromatin maps Nucleic Acids Res 48D825ndashD834

75 Frenkel-MorgensternM GorohovskiA LacroixV RogersMIbanezK BoullosaC Andres LeonE Ben-HurA andValenciaA (2013) ChiTaRS a database of human mouse and fruitfly chimeric transcripts and RNA-sequencing data Nucleic AcidsRes 41 D142

Page 5: COVID19 Drug Repository: text-mining the literature in

Nucleic Acids Research 2021 Vol 49 Database issue D1117

Figure 3 Web interface of the COVID19 Drug Repository Selection of the drug of interest (eg Favipiravir) from the dropdown menulist

Those genes associated with a set of drugs can be anal-ysed clustered or served as input for building lsquodrug-genersquonetworks and then visualized using external programs Asa working example protein-protein association networkswere built for output gene sets using the STRINGv11database (64) Moreover the application programming in-terface (API) implemented in the STRINGv11 databaseenables efficient interaction of external databases with theSTRING visualization and analysis tools (64) For exam-ple visualization of the set of genes associated with the

PDE5A inhibitor sildenafil a vasodilator agent revealedother interesting targets (Figure 4) such as the enzymePDE6G Both enzymes are active in the lungs (6566)Further network analysis of available data showed thatPDE5APDE6G inhibition by sildenafil in lung blood ves-sels can trigger different anti-inflammatory pathways

To build a network (Figure 5) we extracted additionalinformation from the literature and external databases Inthe PDE5A and PDE6G protein expression summaries ob-tained from the Human Protein Atlas (67) the PGE6G

D1118 Nucleic Acids Research 2021 Vol 49 Database issue

Table 2 COVID19-specific collections and chemical libraries

ResourceDataset (title catalogue orcategory) Drugs active agents Links

ClinicalTrialsgov COVID-19 studies bymapped drug intervention

384 httpsclinicaltrialsgovct2covid viewdrugs

CAS COVID-19 AntiviralCandidate CompoundsDataset

sim50000 httpswwwcasorgcovid-19-antiviral-compounds-dataset

IUPHAR Ligands relevant toSARS-CoV-2(COVID-19

64 items httpswwwguidetopharmacologyorgGRACCoronavirusForward

ChEMBL ChEMBL 27 SARS CoV-2Release 8 datasets

133 selected(IC50EC50 betterthan 10 M)

httpchemblblogspotcom202005chembl27-sars-cov-2-releasehtml

StructuralBioinformatics andNetwork BiologyGroup (SBNB)

Drugs from sthe COVID19literature

307 httpssbnbirbbarcelonaorgcovid19

OpenData Portal NCATS Anti-infectivesCollection

740 httpsopendatancatsnihgovcovid19indexhtml

MedChemExpress SARS-CoV List of Drugs 76 httpswwwmedchemexpresscomTargetsSARS-CoVhtml

MedChemExpress Anti-Virus CompoundLibrary

512 httpswwwmedchemexpresscomscreeningAnti-virus Compound Libraryhtml

MedChemExpress Anti-COVID-19 CompoundLibrary

1552 httpswwwmedchemexpresscomscreeninganti-covid-19-compound-libraryhtml

Figure 4 Example output of the DruGeNetwork python script list of genes (column lsquo1rsquo) with a short descriptionannotation of the underlying biologicalmechanism associated with each gene (column lsquo1rsquo) The output is generated for the query keywords sent to the NCBI Gene resources ldquosildenafilrdquo ANDldquoHomo sapiensrdquo[porgn] AND alive[prop]

gene is categorized as lsquoGroup enrichedrsquo in natural killer(NK) cells according to consensus transcriptomics dataNK cells acting as cytotoxic lymphocytes are involved ininnate immune system regulation including rapid cytokineproduction in the presence of virus-infected cells (67) TheNK-mediated antiviral immune response is associated withthe NCR1 gene that encodes the natural cytotoxicity re-ceptor 1 (68) In the next step both the PDE6 and NCR1genes were detected in the Chronic Obstructive PulmonaryDisease (COPD)-related Gene Set using Harmonizome onthe collection of lsquoomicsrsquo Big Data sets (69) This geneset was deposited in the GEO Signatures of DifferentiallyExpressed Genes for Diseases under the name lsquoCOPD-Chronic Obstructive Pulmonary Disease Muscle-Striated(Skeletal)-Diaphragm (MMHCC) GSE47rsquo The data showthat the expression of the PDE6G gene is significantly in-creased whereas decreased expression was reported for theNCR1 gene

Therefore in the context of drug repurposing strate-gies it is reasonable to expect that sildenafil will be use-

ful for the treatment of COVID19 complications Accord-ingly two recent clinical studies (ClinicalTrialsgov identi-fiers NCT04304313 and NCT04489446) were initiated tostudy the efficacy and safety of sildenafil in patients withCOVID-19 (NCT04304313) and to assess the role of silde-nafil in improving oxygenation among hospitalised patients(NCT04489446)

Identification of drug target genes

We extracted target gene information for each putativeCOVID-19 drug from the Therapeutic Target Database(70) and by text-mining of the literature at PubMed Tounderstand the expression profile of drug target genes iden-tified in this manner in the COVID-19 infection we per-formed transcriptome analysis of infected bronchial epithe-lial cells For this we retrieved raw RNA-sequencing datafor SARS-CoV-2-infected bronchial epithelial cells from thesequence read archive (SRA) database under accession noPRJNA615032 (71) The FASTQ files were mapped and

Nucleic Acids Research 2021 Vol 49 Database issue D1119

Figure 5 The Drug-Gene local Network built using the output for sildenafil (A) Further functional enrichment reveals the group of genes involved in theinnate immunity and inflammatory processes Moreover VDR (receptors activated by calcitriol) and the HIF1A-STAT3 path suppressed by resveratrolare also involved in the local anti-inflammatory pharmacological network thereby suggesting the ldquosildenafil-resveratrol-vitamin D3rdquo drug combinationto treat the COVID-19 complications

aligned to the hg38 reference genome using STAR (72)Differentially expressed genes were identified using edgeR(73) with parameters set at 20-fold change and lt005 P-value cut-off We thus found target genes for 41 drugs fromour database (eg sildenafil as discussed in previous para-graph) which are significantly differentially expressed dur-ing COVID-19 infection These drug-gene pairs are given inthe Supplemental Table S2

COVID19 Drug Repository server hardware and software re-quirements

The COVID19 Drug Repository server was built on anApache web server and deployed on the RedHat EnterpriseLinux (RHEL) 74 server of an Intel(R) Xeon(R) CPU E5ndash2620 v2 210GHz and 32GB RAM unit The COVID19Drug Repository has a code base and infrastructure simi-lar to that of the ChiTaRS database (7475) The COVID19Drug Repository website is compatible with modern webbrowsers (such as Chrome Firefox Microsoft Edge Operaand Safari) provided that JavaScript is enabled We recom-mend using the latest release version of these web browsersfor optimal rendering

Downloads

The COVID19 Drug Repository not only provides extendedlsquoSearchrsquo options but also offers the possibility to downloadall database tables and data sets in a user-friendly mannerThe repository is available at httpcovid19mdbiuacil

CONCLUSIONS AND FUTURE PLANS

The COVID19 Drug Repository maps data from chemoge-nomics and pharmacogenomics studies and provides viraland human genomics and proteomics information on ap-proved drugs and other therapeutics The database enablesthe user to focus on different levels of complexity startingfrom general information clinical trials and formulationsand increasing the resolution to the level of molecular mech-anisms of drug action Therefore the database can serve as anavigation and recommendation tool both for research andfor healthcare purposes Future plans include the followingadditions to the database (a) continuous updating with newdata on approved drugs experimental drugs and drug-likesynthetic or natural chemical substances (b) automatic ma-chine learning and text-mining-based annotation and visu-alization of lsquoMode of Actionrsquo (MoA) data as well as lsquoDrug-Genersquo and lsquoDrug-Symptomrsquo networks and (c) incorpora-tion of the DrugsNGS analysis tools (lsquotranscriptomicsrsquo) toaccelerate the translation of knowledge for use as personal-ized medicine for COVID19 patients

DATA AVAILABILITY

httpcovid19mdbiuacil

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online

D1120 Nucleic Acids Research 2021 Vol 49 Database issue

ACKNOWLEDGEMENTS

EL has been working as a volunteer at the Frenkel-Morgensternrsquos lab in the COVID-19 related project

FUNDING

MF-M is supported by the Kamin grant of Israel In-novation Authority (66824 2019-2021) and COVID-19Data Science Institute (DSI) grant Bar-Ilan University(247017 2020) SM is supported by the PBC fellowshipfor outstanding postdoctoral researchers from India by theCouncil of Higher Education Israel (VaTaT 104-2019 atM F-M lab)Conflict of interest statement Authors declare no conflict ofinterest

REFERENCES1 ChenQ AllotA and LuZ (2020) Keep up with the latest

coronavirus research Nature 579 1932 Lu WangL LoK ChandrasekharY ReasR YangJ EideD

FunkK KinneyR LiuZ MerrillW et al (2020) CORD-19 TheCovid-19 Open Research Dataset arXiv doihttpsarxivorgabs200410706v2 10 July 2020 preprint not peerreviewed

3 SwansonDR (2008) In BruzaP and WeeberM (eds)Literature-based Discovery Springer Berlin Heidelberg BerlinHeidelberg pp 3ndash11

4 WeiCH KaoHY and LuZ (2013) PubTator a web-based textmining tool for assisting biocuration Nucleic Acids Res 41W518ndashW522

5 RobertsK AlamT BedrickS Demner-FushmanD LoKSoboroffI VoorheesE WangLL and HershWR (2020)TREC-COVID rationale and Structure of an Information RetrievalShared Task for COVID-19 J Am Med Inform Assoc 271431ndash1436

6 CrichtonG BakerS GuoY and KorhonenA (2020) Neuralnetworks for open and closed Literature-based Discovery PLoS One15 e0232891

7 ZhangE GuptaN NogueiraR ChoK and LinJ (2020) Rapidlydeploying a neural search engine for the covid-19 open researchdataset Preliminary thoughts and lessons learned arXiv doihttpsarxivorgabs200405125 10 April 2020 preprint not peerreviewed

8 TarasovaO IvanovS FilimonovDA and PoroikovV (2020) Dataand text mining help identify key proteins involved in the molecularmechanisms shared by SARS-CoV-2 and HIV-1 Molecules 25 2944

9 WeiCH HarrisBR LiD BerardiniTZ HualaE KaoHYand LuZ (2012) Accelerating literature curation with text-miningtools a case study of using PubTator to curate genes in PubMedabstracts Database (Oxford) 2012 bas041

10 WeiCH AllotA LeamanR and LuZ (2019) PubTator centralautomated concept annotation for biomedical full text articlesNucleic Acids Res 47 W587ndashW593

11 OsinskiS and WeissD (2004) Conceptual Clustering Using LingoAlgorithm Evaluation on Open Directory Project Data InKłopotekMA WierzchonST and TrojanowskiK (eds) IntelligentInformation Processing and Web Mining Advances in Soft ComputingSpringer Berlin Heidelberg Vol 25 pp 369ndash377

12 TagoreS GorohovskiA JensenLJ and Frenkel-MorgensternM(2019) ProtFus a comprehensive method characterizingprotein-protein interactions of fusion proteins PLoS Comput Biol15 e1007239

13 MeyerUA ZangerUM and SchwabM (2013) Omics and drugresponse Annu Rev Pharmacol Toxicol 53 475ndash502

14 ZhengXF and ChanTF (2002) Chemical genomics a systematicapproach in biological research and drug discovery Curr Issues MolBiol 4 33ndash43

15 EngelbergA (2004) Iconix Pharmaceuticals Incndashremoving barriersto efficient drug discovery through chemogenomicsPharmacogenomics 5 741ndash744

16 MagarinosMP CarmonaSJ CrowtherGJ RalphSARoosDS ShanmugamD Van VoorhisWC and AgueroF (2012)TDR targets a chemogenomics resource for neglected diseasesNucleic Acids Res 40 D1118ndashD1127

17 WangL MaC WipfP LiuH SuW and XieXQ (2013)TargetHunter an in silico target identification tool for predictingtherapeutic potential of small organic molecules based onchemogenomic database AAPS J 15 395ndash406

18 TributO LessardY ReymannJM AllainH and Bentue-FerrerD(2002) Pharmacogenomics Med Sci Monit 8 RA152ndashRA163

19 HoodE (2003) Pharmacogenomics the promise of personalizedmedicine Environ Health Perspect 111 A581ndashA589

20 WeinshilboumR and WangL (2004) Pharmacogenomics bench tobedside Nat Rev Drug Discov 3 739ndash748

21 ServiceRF (2005) Pharmacogenomics Going from genome to pillScience 308 1858ndash1860

22 BlakeCA and SobelBE (2008) Progress in pharmacogenomics andits promise for medicine Exp Biol Med (Maywood) 2331482ndash1483

23 WangL (2010) Pharmacogenomics a systems approach WileyInterdiscip Rev Syst Biol Med 2 3ndash22

24 BurtT and DhillonS (2013) Pharmacogenomics in early-phaseclinical development Pharmacogenomics 14 1085ndash1097

25 MusaA TripathiS DehmerM Yli-HarjaO KauffmanSA andEmmert-StreibF (2019) Systems pharmacogenomic landscape ofdrug similarities from LINCS data Drug Association Networks SciRep 9 7849

26 KalamaraA TobalinaL and Saez-RodriguezJ (2018) How to findthe right drug for each patient Advances and challenges inpharmacogenomics Curr Opin Syst Biol 10 53ndash62

27 KirkRJ HungJL HornerSR and PerezJT (2008) Implicationsof pharmacogenomics for drug development Exp Biol Med(Maywood) 233 1484ndash1497

28 Sharifi-NoghabiH ZolotarevaO CollinsCC and EsterM (2019)MOLI multi-omics late integration with deep neural networks fordrug response prediction Bioinformatics 35 i501ndashi509

29 PulleyJM RhoadsJP JeromeRN ChallaAP ErregerKBJolyMM LavieriRR PerryKE ZaleskiNM Shirey-RiceJKet al (2020) Using what we already have uncovering new drugrepurposing strategies in existing omics data Annu Rev PharmacolToxicol 60 333ndash352

30 SindelarRD (2013) Genomics other ldquoOmicrdquo technologiespersonalized medicine and additional biotechnology-relatedtechniques In CrommelinDJA SindelarRD and MeibohmB(eds) Pharmaceutical Biotechnology Fundamentalsand ApplicationsSpringer International Publishing NY pp 179ndash221

31 Uran LandaburuL BerensteinAJ VidelaS MaruPShanmugamD ChernomoretzA and AgueroF (2020) TDRTargets 6 driving drug discovery for human pathogens throughintensive chemogenomic data integration Nucleic Acids Res 48D992ndashD1005

32 FengZ ChenM LiangT ShenM ChenH and XieXQ (2020)Virus-CKB an integrated bioinformatics platform and analysisresource for COVID-19 research Brief Bioinform bbaa155

33 Duran-FrigolaM PaulsE Guitart-PlaO BertoniM AlcaldeVAmatD Juan-BlancoT and AloyP (2020) Extending thesmall-molecule similarity principle to all levels of biology with thechemical checker Nature Biotechnology 38 1087ndash1096

34 SinghN DecrolyE KhatibAM and VilloutreixBO (2020)Structure-based drug repositioning over the human TMPRSS2protease domain search for chemical probes able to repressSARS-CoV-2 Spike protein cleavages Eur J Pharm Sci 153105495

35 GordonDE JangGM BouhaddouM XuJ ObernierKWhiteKM OrsquoMearaMJ RezeljVV GuoJZ SwaneyDL et al(2020) A SARS-CoV-2 protein interaction map reveals targets fordrug repurposing Nature 583 459ndash468

36 TakahashiT LuzumJA NicolMR and JacobsonPA (2020)Pharmacogenomics of COVID-19 therapies NPJ Genom Med 5 35

37 WishartDS KnoxC GuoAC ShrivastavaS HassanaliMStothardP ChangZ and WoolseyJ (2006) DrugBank acomprehensive resource for in silico drug discovery and explorationNucleic Acids Res 34 D668ndashD672

38 WishartDS FeunangYD GuoAC LoEJ MarcuAGrantJR SajedT JohnsonD LiC SayeedaZ et al (2018)

Nucleic Acids Research 2021 Vol 49 Database issue D1121

DrugBank 50 a major update to the DrugBank database for 2018Nucleic Acids Res 46 D1074ndashD1082

39 KimS ChenJ ChengT GindulyteA HeJ HeS LiQShoemakerBA ThiessenPA YuB et al (2019) PubChem 2019update improved access to chemical data Nucleic Acids Res 47D1102ndashD1109

40 ArmstrongJF FaccendaE HardingSD PawsonAJSouthanC SharmanJL CampoB CavanaghDRAlexanderSPH DavenportAP et al (2020) The IUPHARBPSGuide to PHARMACOLOGY in 2020 extendingimmunopharmacology content and introducing the IUPHARMMVGuide to MALARIA PHARMACOLOGY Nucleic Acids Res 48D1006ndashD1021

41 HuffenbergerMA and WigingtonRL (1975) Chemical AbstractsService approach to management of large data bases J Chem InfComput Sci 15 43ndash47

42 WeisgerberDW (1997) Chemical Abstracts Service ChemicalRegistry System history scope and impacts J Am Soc Inf Sci 48349ndash360

43 WideniusM AxmarkD and ArnoK (2002) In MySQL ReferenceManual Documentation From the Source OrsquoReilly Media Inc

44 CockPJ AntaoT ChangJT ChapmanBA CoxCJ DalkeAFriedbergI HamelryckT KauffF WilczynskiB et al (2009)Biopython freely available Python tools for computational molecularbiology and bioinformatics Bioinformatics 25 1422ndash1423

45 CoordinatorsNR (2016) Database resources of the National Centerfor Biotechnology Information Nucleic Acids Res 44 D7ndashD19

46 ThompsonK (1968) Programming techniques regular expressionsearch algorithm Commun ACM 11 419ndash422

47 StefanowskiJ and WeissD (2003) Carrot2 and Language Propertiesin Web Search Results Clustering In MenasalvasE SegoviaJ andSzczepaniakPS (eds) Advances in Web Intelligence Springer BerlinHeidelberg pp 240ndash249

48 SayersE (2010) E-utilities Quick Start EntrezProgramming UtilitiesHelp [Internet] Bethesda MD National Center for BiotechnologyInformation (US) httpwwwncbinlmnihgovbooksNBK25500

49 LipscombCE (2000) Medical subject headings (MeSH) Bull MedLibr Assoc 88 265

50 SahuNU and KharkarPS (2016) Computational DrugRepositioning a lateral approach to traditional drug discovery CurrTop Med Chem 16 2069ndash2077

51 SamE and AthriP (2019) Web-based drug repurposing tools asurvey Brief Bioinform 20 299ndash316

52 DonnerY KazmierczakS and FortneyK (2018) Drug repurposingusing deep embeddings of gene expression profiles Mol Pharm 154314ndash4325

53 AlexanderSPH ArmstrongJF DavenportAP DaviesJAFaccendaE HardingSD Levi-SchafferF MaguireJJPawsonAJ SouthanC et al (2020) A rational roadmap forSARS-CoV-2COVID-19 pharmacotherapeutic research anddevelopment IUPHAR Review 29 Br J Pharmacol 1774942ndash4966

54 FaccendaE ArmstrongJ DavenprotA HardingS PawsonASouthanC and DaviesJ (2020) Coronavirus InformationIUPHARBPS Guide to Pharmacologyhttpswwwguidetopharmacologyorgcoronavirusjsp

55 JeonS KoM LeeJ ChoiI ByunSY ParkS ShumD andKimS (2020) Identification of antiviral drug candidates againstSARS-CoV-2 from FDA-approved drugs Antimicrob AgentsChemother 64 e00819-20

56 RivaL YuanS YinX Martin-SanchoL MatsunagaNBurgstaller-MuehlbacherS PacheL De JesusPP HullMVChangM et al (2020) A large-scale drug repositioning survey forSARS-CoV-2 antivirals bioRxiv doihttpsdoiorg10110120200416044016 17 April 2020 preprintnot peer reviewed

57 BrimacombeKR ZhaoT EastmanRT HuX WangKBackusM BaljinnyamB ChenCZ ChenL EicherT et al (2020)An OpenData portal to share COVID-19 drug repurposing data inreal time bioRxiv doi httpsdoiorg10110120200604135046 05June 2020 preprint not peer reviewed

58 MirabelliC WotringJW ZhangCJ McCartySM FursmidtRFrumT KadambiNS AminAT OrsquoMearaTR PrettoCD et al(2020) Morphological cell profiling of SARS-CoV-2 infectionidentifies drug repurposing candidates for COVID-19 bioRxiv doi

httpsdoiorg10110120200527117184 27 May 2020 preprintnot peer reviewed

59 TangH AbouleilaY SiL Ortega-PrietoAM MummeryCLIngberDE and MashaghiA (2020) Human organs-on-chips forvirology Trends Microbiol 28 934ndash946

60 WestonS ColemanCM HauptR LogueJ MatthewsK LiYReyesHM WeissSR and FriemanMB (2020) Broadanti-coronaviral activity of FDA approved drugs againstSARS-CoV-2 J Virol 94 e01218-20

61 TouretF GillesM BarralK NougairedeA van HeldenJDecrolyE de LamballerieX and CoutardB (2020) In vitroscreening of a FDA approved chemical library reveals potentialinhibitors of SARS-CoV-2 replication Sci Rep 10 13093

62 EllingerB BojkovaD ZalianiA CinatlJ ClaussenCWesthausS ReinshagenJ KuzikovM WolfM and GeisslingerG(2020) Identification of inhibitors of SARS-CoV-2 in-vitro cellulartoxicity in human (Caco-2) cells using a large scale drug repurposingcollection ResearchSquare doi1021203rs3rs-23951v1

63 HeiserK McLeanPF DavisCT FogelsonB GordonHBJacobsonP HurstB MillerB AlfaRW EarnshawBA et al(2020) Identification of potential treatments for COVID-19 throughartificial intelligence-enabled phenomic analysis of human cellsinfected with SARS-CoV-2 bioRxiv doihttpsdoiorg10110120200421054387 23 April 2020 preprintnot peer reviewed

64 SzklarczykD GableAL LyonD JungeA WyderSHuerta-CepasJ SimonovicM DonchevaNT MorrisJH BorkPet al (2019) STRING v11 protein-protein association networks withincreased coverage supporting functional discovery in genome-wideexperimental datasets Nucleic Acids Res 47 D607ndashD613

65 NikolovaS GuentherA SavaiR WeissmannN GhofraniHAKonigshoffM EickelbergO KlepetkoW VoswinckelRSeegerW et al (2010) Phosphodiesterase 6 subunits are expressedand altered in idiopathic pulmonary fibrosis Respir Res 11 146

66 HemnesAR ZaimanA and ChampionHC (2008) PDE5Ainhibition attenuates bleomycin-induced pulmonary fibrosis andpulmonary hypertension through inhibition of ROS generation andRhoARho kinase activation Am J Physiol Lung Cell MolPhysiol 294 L24ndashL33

67 UhlenM FagerbergL HallstromBM LindskogC OksvoldPMardinogluA SivertssonA KampfC SjostedtE AsplundAet al (2015) Proteomics Tissue-based map of the human proteomeScience 347 1260419

68 PessinoA SivoriS BottinoC MalaspinaA MorelliLMorettaL BiassoniR and MorettaA (1998) Molecular cloning ofNKp46 a novel member of the immunoglobulin superfamily involvedin triggering of natural cytotoxicity J Exp Med 188 953ndash960

69 RouillardAD GundersenGW FernandezNF WangZMonteiroCD McDermottMG and MarsquoayanA (2016) Theharmonizome a collection of processed datasets gathered to serveand mine knowledge about genes and proteins Database (Oxford)2016 baw100

70 WangY ZhangS LiF ZhouY ZhangY WangZ ZhangRZhuJ RenY TanY et al (2020) Therapeutic target database 2020enriched resource for facilitating research and early development oftargeted therapeutics Nucleic Acids Res 48 D1031ndashD1041

71 Blanco-MeloD Nilsson-PayantBE LiuWC UhlSHoaglandD MoslashllerR JordanTX OishiK PanisM SachsDet al (2020) Imbalanced host response to SARS-CoV-2 drivesdevelopment of COVID-19 Cell 181 1036ndash1045

72 DobinA DavisCA SchlesingerF DrenkowJ ZaleskiC JhaSBatutP ChaissonM and GingerasTR (2013) STAR ultrafastuniversal RNA-seq aligner Bioinformatics 29 15ndash21

73 RobinsonMD McCarthyDJ and SmythGK (2010) edgeR aBioconductor package for differential expression analysis of digitalgene expression data Bioinformatics 26 139ndash140

74 BalamuraliD GorohovskiA DetrojaR PalandeVRaviv-ShayD and Frenkel-MorgensternM (2019) ChiTaRS 50 thecomprehensive database of chimeric transcripts matched withdruggable fusions and 3D chromatin maps Nucleic Acids Res 48D825ndashD834

75 Frenkel-MorgensternM GorohovskiA LacroixV RogersMIbanezK BoullosaC Andres LeonE Ben-HurA andValenciaA (2013) ChiTaRS a database of human mouse and fruitfly chimeric transcripts and RNA-sequencing data Nucleic AcidsRes 41 D142

Page 6: COVID19 Drug Repository: text-mining the literature in

D1118 Nucleic Acids Research 2021 Vol 49 Database issue

Table 2 COVID19-specific collections and chemical libraries

ResourceDataset (title catalogue orcategory) Drugs active agents Links

ClinicalTrialsgov COVID-19 studies bymapped drug intervention

384 httpsclinicaltrialsgovct2covid viewdrugs

CAS COVID-19 AntiviralCandidate CompoundsDataset

sim50000 httpswwwcasorgcovid-19-antiviral-compounds-dataset

IUPHAR Ligands relevant toSARS-CoV-2(COVID-19

64 items httpswwwguidetopharmacologyorgGRACCoronavirusForward

ChEMBL ChEMBL 27 SARS CoV-2Release 8 datasets

133 selected(IC50EC50 betterthan 10 M)

httpchemblblogspotcom202005chembl27-sars-cov-2-releasehtml

StructuralBioinformatics andNetwork BiologyGroup (SBNB)

Drugs from sthe COVID19literature

307 httpssbnbirbbarcelonaorgcovid19

OpenData Portal NCATS Anti-infectivesCollection

740 httpsopendatancatsnihgovcovid19indexhtml

MedChemExpress SARS-CoV List of Drugs 76 httpswwwmedchemexpresscomTargetsSARS-CoVhtml

MedChemExpress Anti-Virus CompoundLibrary

512 httpswwwmedchemexpresscomscreeningAnti-virus Compound Libraryhtml

MedChemExpress Anti-COVID-19 CompoundLibrary

1552 httpswwwmedchemexpresscomscreeninganti-covid-19-compound-libraryhtml

Figure 4 Example output of the DruGeNetwork python script list of genes (column lsquo1rsquo) with a short descriptionannotation of the underlying biologicalmechanism associated with each gene (column lsquo1rsquo) The output is generated for the query keywords sent to the NCBI Gene resources ldquosildenafilrdquo ANDldquoHomo sapiensrdquo[porgn] AND alive[prop]

gene is categorized as lsquoGroup enrichedrsquo in natural killer(NK) cells according to consensus transcriptomics dataNK cells acting as cytotoxic lymphocytes are involved ininnate immune system regulation including rapid cytokineproduction in the presence of virus-infected cells (67) TheNK-mediated antiviral immune response is associated withthe NCR1 gene that encodes the natural cytotoxicity re-ceptor 1 (68) In the next step both the PDE6 and NCR1genes were detected in the Chronic Obstructive PulmonaryDisease (COPD)-related Gene Set using Harmonizome onthe collection of lsquoomicsrsquo Big Data sets (69) This geneset was deposited in the GEO Signatures of DifferentiallyExpressed Genes for Diseases under the name lsquoCOPD-Chronic Obstructive Pulmonary Disease Muscle-Striated(Skeletal)-Diaphragm (MMHCC) GSE47rsquo The data showthat the expression of the PDE6G gene is significantly in-creased whereas decreased expression was reported for theNCR1 gene

Therefore in the context of drug repurposing strate-gies it is reasonable to expect that sildenafil will be use-

ful for the treatment of COVID19 complications Accord-ingly two recent clinical studies (ClinicalTrialsgov identi-fiers NCT04304313 and NCT04489446) were initiated tostudy the efficacy and safety of sildenafil in patients withCOVID-19 (NCT04304313) and to assess the role of silde-nafil in improving oxygenation among hospitalised patients(NCT04489446)

Identification of drug target genes

We extracted target gene information for each putativeCOVID-19 drug from the Therapeutic Target Database(70) and by text-mining of the literature at PubMed Tounderstand the expression profile of drug target genes iden-tified in this manner in the COVID-19 infection we per-formed transcriptome analysis of infected bronchial epithe-lial cells For this we retrieved raw RNA-sequencing datafor SARS-CoV-2-infected bronchial epithelial cells from thesequence read archive (SRA) database under accession noPRJNA615032 (71) The FASTQ files were mapped and

Nucleic Acids Research 2021 Vol 49 Database issue D1119

Figure 5 The Drug-Gene local Network built using the output for sildenafil (A) Further functional enrichment reveals the group of genes involved in theinnate immunity and inflammatory processes Moreover VDR (receptors activated by calcitriol) and the HIF1A-STAT3 path suppressed by resveratrolare also involved in the local anti-inflammatory pharmacological network thereby suggesting the ldquosildenafil-resveratrol-vitamin D3rdquo drug combinationto treat the COVID-19 complications

aligned to the hg38 reference genome using STAR (72)Differentially expressed genes were identified using edgeR(73) with parameters set at 20-fold change and lt005 P-value cut-off We thus found target genes for 41 drugs fromour database (eg sildenafil as discussed in previous para-graph) which are significantly differentially expressed dur-ing COVID-19 infection These drug-gene pairs are given inthe Supplemental Table S2

COVID19 Drug Repository server hardware and software re-quirements

The COVID19 Drug Repository server was built on anApache web server and deployed on the RedHat EnterpriseLinux (RHEL) 74 server of an Intel(R) Xeon(R) CPU E5ndash2620 v2 210GHz and 32GB RAM unit The COVID19Drug Repository has a code base and infrastructure simi-lar to that of the ChiTaRS database (7475) The COVID19Drug Repository website is compatible with modern webbrowsers (such as Chrome Firefox Microsoft Edge Operaand Safari) provided that JavaScript is enabled We recom-mend using the latest release version of these web browsersfor optimal rendering

Downloads

The COVID19 Drug Repository not only provides extendedlsquoSearchrsquo options but also offers the possibility to downloadall database tables and data sets in a user-friendly mannerThe repository is available at httpcovid19mdbiuacil

CONCLUSIONS AND FUTURE PLANS

The COVID19 Drug Repository maps data from chemoge-nomics and pharmacogenomics studies and provides viraland human genomics and proteomics information on ap-proved drugs and other therapeutics The database enablesthe user to focus on different levels of complexity startingfrom general information clinical trials and formulationsand increasing the resolution to the level of molecular mech-anisms of drug action Therefore the database can serve as anavigation and recommendation tool both for research andfor healthcare purposes Future plans include the followingadditions to the database (a) continuous updating with newdata on approved drugs experimental drugs and drug-likesynthetic or natural chemical substances (b) automatic ma-chine learning and text-mining-based annotation and visu-alization of lsquoMode of Actionrsquo (MoA) data as well as lsquoDrug-Genersquo and lsquoDrug-Symptomrsquo networks and (c) incorpora-tion of the DrugsNGS analysis tools (lsquotranscriptomicsrsquo) toaccelerate the translation of knowledge for use as personal-ized medicine for COVID19 patients

DATA AVAILABILITY

httpcovid19mdbiuacil

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online

D1120 Nucleic Acids Research 2021 Vol 49 Database issue

ACKNOWLEDGEMENTS

EL has been working as a volunteer at the Frenkel-Morgensternrsquos lab in the COVID-19 related project

FUNDING

MF-M is supported by the Kamin grant of Israel In-novation Authority (66824 2019-2021) and COVID-19Data Science Institute (DSI) grant Bar-Ilan University(247017 2020) SM is supported by the PBC fellowshipfor outstanding postdoctoral researchers from India by theCouncil of Higher Education Israel (VaTaT 104-2019 atM F-M lab)Conflict of interest statement Authors declare no conflict ofinterest

REFERENCES1 ChenQ AllotA and LuZ (2020) Keep up with the latest

coronavirus research Nature 579 1932 Lu WangL LoK ChandrasekharY ReasR YangJ EideD

FunkK KinneyR LiuZ MerrillW et al (2020) CORD-19 TheCovid-19 Open Research Dataset arXiv doihttpsarxivorgabs200410706v2 10 July 2020 preprint not peerreviewed

3 SwansonDR (2008) In BruzaP and WeeberM (eds)Literature-based Discovery Springer Berlin Heidelberg BerlinHeidelberg pp 3ndash11

4 WeiCH KaoHY and LuZ (2013) PubTator a web-based textmining tool for assisting biocuration Nucleic Acids Res 41W518ndashW522

5 RobertsK AlamT BedrickS Demner-FushmanD LoKSoboroffI VoorheesE WangLL and HershWR (2020)TREC-COVID rationale and Structure of an Information RetrievalShared Task for COVID-19 J Am Med Inform Assoc 271431ndash1436

6 CrichtonG BakerS GuoY and KorhonenA (2020) Neuralnetworks for open and closed Literature-based Discovery PLoS One15 e0232891

7 ZhangE GuptaN NogueiraR ChoK and LinJ (2020) Rapidlydeploying a neural search engine for the covid-19 open researchdataset Preliminary thoughts and lessons learned arXiv doihttpsarxivorgabs200405125 10 April 2020 preprint not peerreviewed

8 TarasovaO IvanovS FilimonovDA and PoroikovV (2020) Dataand text mining help identify key proteins involved in the molecularmechanisms shared by SARS-CoV-2 and HIV-1 Molecules 25 2944

9 WeiCH HarrisBR LiD BerardiniTZ HualaE KaoHYand LuZ (2012) Accelerating literature curation with text-miningtools a case study of using PubTator to curate genes in PubMedabstracts Database (Oxford) 2012 bas041

10 WeiCH AllotA LeamanR and LuZ (2019) PubTator centralautomated concept annotation for biomedical full text articlesNucleic Acids Res 47 W587ndashW593

11 OsinskiS and WeissD (2004) Conceptual Clustering Using LingoAlgorithm Evaluation on Open Directory Project Data InKłopotekMA WierzchonST and TrojanowskiK (eds) IntelligentInformation Processing and Web Mining Advances in Soft ComputingSpringer Berlin Heidelberg Vol 25 pp 369ndash377

12 TagoreS GorohovskiA JensenLJ and Frenkel-MorgensternM(2019) ProtFus a comprehensive method characterizingprotein-protein interactions of fusion proteins PLoS Comput Biol15 e1007239

13 MeyerUA ZangerUM and SchwabM (2013) Omics and drugresponse Annu Rev Pharmacol Toxicol 53 475ndash502

14 ZhengXF and ChanTF (2002) Chemical genomics a systematicapproach in biological research and drug discovery Curr Issues MolBiol 4 33ndash43

15 EngelbergA (2004) Iconix Pharmaceuticals Incndashremoving barriersto efficient drug discovery through chemogenomicsPharmacogenomics 5 741ndash744

16 MagarinosMP CarmonaSJ CrowtherGJ RalphSARoosDS ShanmugamD Van VoorhisWC and AgueroF (2012)TDR targets a chemogenomics resource for neglected diseasesNucleic Acids Res 40 D1118ndashD1127

17 WangL MaC WipfP LiuH SuW and XieXQ (2013)TargetHunter an in silico target identification tool for predictingtherapeutic potential of small organic molecules based onchemogenomic database AAPS J 15 395ndash406

18 TributO LessardY ReymannJM AllainH and Bentue-FerrerD(2002) Pharmacogenomics Med Sci Monit 8 RA152ndashRA163

19 HoodE (2003) Pharmacogenomics the promise of personalizedmedicine Environ Health Perspect 111 A581ndashA589

20 WeinshilboumR and WangL (2004) Pharmacogenomics bench tobedside Nat Rev Drug Discov 3 739ndash748

21 ServiceRF (2005) Pharmacogenomics Going from genome to pillScience 308 1858ndash1860

22 BlakeCA and SobelBE (2008) Progress in pharmacogenomics andits promise for medicine Exp Biol Med (Maywood) 2331482ndash1483

23 WangL (2010) Pharmacogenomics a systems approach WileyInterdiscip Rev Syst Biol Med 2 3ndash22

24 BurtT and DhillonS (2013) Pharmacogenomics in early-phaseclinical development Pharmacogenomics 14 1085ndash1097

25 MusaA TripathiS DehmerM Yli-HarjaO KauffmanSA andEmmert-StreibF (2019) Systems pharmacogenomic landscape ofdrug similarities from LINCS data Drug Association Networks SciRep 9 7849

26 KalamaraA TobalinaL and Saez-RodriguezJ (2018) How to findthe right drug for each patient Advances and challenges inpharmacogenomics Curr Opin Syst Biol 10 53ndash62

27 KirkRJ HungJL HornerSR and PerezJT (2008) Implicationsof pharmacogenomics for drug development Exp Biol Med(Maywood) 233 1484ndash1497

28 Sharifi-NoghabiH ZolotarevaO CollinsCC and EsterM (2019)MOLI multi-omics late integration with deep neural networks fordrug response prediction Bioinformatics 35 i501ndashi509

29 PulleyJM RhoadsJP JeromeRN ChallaAP ErregerKBJolyMM LavieriRR PerryKE ZaleskiNM Shirey-RiceJKet al (2020) Using what we already have uncovering new drugrepurposing strategies in existing omics data Annu Rev PharmacolToxicol 60 333ndash352

30 SindelarRD (2013) Genomics other ldquoOmicrdquo technologiespersonalized medicine and additional biotechnology-relatedtechniques In CrommelinDJA SindelarRD and MeibohmB(eds) Pharmaceutical Biotechnology Fundamentalsand ApplicationsSpringer International Publishing NY pp 179ndash221

31 Uran LandaburuL BerensteinAJ VidelaS MaruPShanmugamD ChernomoretzA and AgueroF (2020) TDRTargets 6 driving drug discovery for human pathogens throughintensive chemogenomic data integration Nucleic Acids Res 48D992ndashD1005

32 FengZ ChenM LiangT ShenM ChenH and XieXQ (2020)Virus-CKB an integrated bioinformatics platform and analysisresource for COVID-19 research Brief Bioinform bbaa155

33 Duran-FrigolaM PaulsE Guitart-PlaO BertoniM AlcaldeVAmatD Juan-BlancoT and AloyP (2020) Extending thesmall-molecule similarity principle to all levels of biology with thechemical checker Nature Biotechnology 38 1087ndash1096

34 SinghN DecrolyE KhatibAM and VilloutreixBO (2020)Structure-based drug repositioning over the human TMPRSS2protease domain search for chemical probes able to repressSARS-CoV-2 Spike protein cleavages Eur J Pharm Sci 153105495

35 GordonDE JangGM BouhaddouM XuJ ObernierKWhiteKM OrsquoMearaMJ RezeljVV GuoJZ SwaneyDL et al(2020) A SARS-CoV-2 protein interaction map reveals targets fordrug repurposing Nature 583 459ndash468

36 TakahashiT LuzumJA NicolMR and JacobsonPA (2020)Pharmacogenomics of COVID-19 therapies NPJ Genom Med 5 35

37 WishartDS KnoxC GuoAC ShrivastavaS HassanaliMStothardP ChangZ and WoolseyJ (2006) DrugBank acomprehensive resource for in silico drug discovery and explorationNucleic Acids Res 34 D668ndashD672

38 WishartDS FeunangYD GuoAC LoEJ MarcuAGrantJR SajedT JohnsonD LiC SayeedaZ et al (2018)

Nucleic Acids Research 2021 Vol 49 Database issue D1121

DrugBank 50 a major update to the DrugBank database for 2018Nucleic Acids Res 46 D1074ndashD1082

39 KimS ChenJ ChengT GindulyteA HeJ HeS LiQShoemakerBA ThiessenPA YuB et al (2019) PubChem 2019update improved access to chemical data Nucleic Acids Res 47D1102ndashD1109

40 ArmstrongJF FaccendaE HardingSD PawsonAJSouthanC SharmanJL CampoB CavanaghDRAlexanderSPH DavenportAP et al (2020) The IUPHARBPSGuide to PHARMACOLOGY in 2020 extendingimmunopharmacology content and introducing the IUPHARMMVGuide to MALARIA PHARMACOLOGY Nucleic Acids Res 48D1006ndashD1021

41 HuffenbergerMA and WigingtonRL (1975) Chemical AbstractsService approach to management of large data bases J Chem InfComput Sci 15 43ndash47

42 WeisgerberDW (1997) Chemical Abstracts Service ChemicalRegistry System history scope and impacts J Am Soc Inf Sci 48349ndash360

43 WideniusM AxmarkD and ArnoK (2002) In MySQL ReferenceManual Documentation From the Source OrsquoReilly Media Inc

44 CockPJ AntaoT ChangJT ChapmanBA CoxCJ DalkeAFriedbergI HamelryckT KauffF WilczynskiB et al (2009)Biopython freely available Python tools for computational molecularbiology and bioinformatics Bioinformatics 25 1422ndash1423

45 CoordinatorsNR (2016) Database resources of the National Centerfor Biotechnology Information Nucleic Acids Res 44 D7ndashD19

46 ThompsonK (1968) Programming techniques regular expressionsearch algorithm Commun ACM 11 419ndash422

47 StefanowskiJ and WeissD (2003) Carrot2 and Language Propertiesin Web Search Results Clustering In MenasalvasE SegoviaJ andSzczepaniakPS (eds) Advances in Web Intelligence Springer BerlinHeidelberg pp 240ndash249

48 SayersE (2010) E-utilities Quick Start EntrezProgramming UtilitiesHelp [Internet] Bethesda MD National Center for BiotechnologyInformation (US) httpwwwncbinlmnihgovbooksNBK25500

49 LipscombCE (2000) Medical subject headings (MeSH) Bull MedLibr Assoc 88 265

50 SahuNU and KharkarPS (2016) Computational DrugRepositioning a lateral approach to traditional drug discovery CurrTop Med Chem 16 2069ndash2077

51 SamE and AthriP (2019) Web-based drug repurposing tools asurvey Brief Bioinform 20 299ndash316

52 DonnerY KazmierczakS and FortneyK (2018) Drug repurposingusing deep embeddings of gene expression profiles Mol Pharm 154314ndash4325

53 AlexanderSPH ArmstrongJF DavenportAP DaviesJAFaccendaE HardingSD Levi-SchafferF MaguireJJPawsonAJ SouthanC et al (2020) A rational roadmap forSARS-CoV-2COVID-19 pharmacotherapeutic research anddevelopment IUPHAR Review 29 Br J Pharmacol 1774942ndash4966

54 FaccendaE ArmstrongJ DavenprotA HardingS PawsonASouthanC and DaviesJ (2020) Coronavirus InformationIUPHARBPS Guide to Pharmacologyhttpswwwguidetopharmacologyorgcoronavirusjsp

55 JeonS KoM LeeJ ChoiI ByunSY ParkS ShumD andKimS (2020) Identification of antiviral drug candidates againstSARS-CoV-2 from FDA-approved drugs Antimicrob AgentsChemother 64 e00819-20

56 RivaL YuanS YinX Martin-SanchoL MatsunagaNBurgstaller-MuehlbacherS PacheL De JesusPP HullMVChangM et al (2020) A large-scale drug repositioning survey forSARS-CoV-2 antivirals bioRxiv doihttpsdoiorg10110120200416044016 17 April 2020 preprintnot peer reviewed

57 BrimacombeKR ZhaoT EastmanRT HuX WangKBackusM BaljinnyamB ChenCZ ChenL EicherT et al (2020)An OpenData portal to share COVID-19 drug repurposing data inreal time bioRxiv doi httpsdoiorg10110120200604135046 05June 2020 preprint not peer reviewed

58 MirabelliC WotringJW ZhangCJ McCartySM FursmidtRFrumT KadambiNS AminAT OrsquoMearaTR PrettoCD et al(2020) Morphological cell profiling of SARS-CoV-2 infectionidentifies drug repurposing candidates for COVID-19 bioRxiv doi

httpsdoiorg10110120200527117184 27 May 2020 preprintnot peer reviewed

59 TangH AbouleilaY SiL Ortega-PrietoAM MummeryCLIngberDE and MashaghiA (2020) Human organs-on-chips forvirology Trends Microbiol 28 934ndash946

60 WestonS ColemanCM HauptR LogueJ MatthewsK LiYReyesHM WeissSR and FriemanMB (2020) Broadanti-coronaviral activity of FDA approved drugs againstSARS-CoV-2 J Virol 94 e01218-20

61 TouretF GillesM BarralK NougairedeA van HeldenJDecrolyE de LamballerieX and CoutardB (2020) In vitroscreening of a FDA approved chemical library reveals potentialinhibitors of SARS-CoV-2 replication Sci Rep 10 13093

62 EllingerB BojkovaD ZalianiA CinatlJ ClaussenCWesthausS ReinshagenJ KuzikovM WolfM and GeisslingerG(2020) Identification of inhibitors of SARS-CoV-2 in-vitro cellulartoxicity in human (Caco-2) cells using a large scale drug repurposingcollection ResearchSquare doi1021203rs3rs-23951v1

63 HeiserK McLeanPF DavisCT FogelsonB GordonHBJacobsonP HurstB MillerB AlfaRW EarnshawBA et al(2020) Identification of potential treatments for COVID-19 throughartificial intelligence-enabled phenomic analysis of human cellsinfected with SARS-CoV-2 bioRxiv doihttpsdoiorg10110120200421054387 23 April 2020 preprintnot peer reviewed

64 SzklarczykD GableAL LyonD JungeA WyderSHuerta-CepasJ SimonovicM DonchevaNT MorrisJH BorkPet al (2019) STRING v11 protein-protein association networks withincreased coverage supporting functional discovery in genome-wideexperimental datasets Nucleic Acids Res 47 D607ndashD613

65 NikolovaS GuentherA SavaiR WeissmannN GhofraniHAKonigshoffM EickelbergO KlepetkoW VoswinckelRSeegerW et al (2010) Phosphodiesterase 6 subunits are expressedand altered in idiopathic pulmonary fibrosis Respir Res 11 146

66 HemnesAR ZaimanA and ChampionHC (2008) PDE5Ainhibition attenuates bleomycin-induced pulmonary fibrosis andpulmonary hypertension through inhibition of ROS generation andRhoARho kinase activation Am J Physiol Lung Cell MolPhysiol 294 L24ndashL33

67 UhlenM FagerbergL HallstromBM LindskogC OksvoldPMardinogluA SivertssonA KampfC SjostedtE AsplundAet al (2015) Proteomics Tissue-based map of the human proteomeScience 347 1260419

68 PessinoA SivoriS BottinoC MalaspinaA MorelliLMorettaL BiassoniR and MorettaA (1998) Molecular cloning ofNKp46 a novel member of the immunoglobulin superfamily involvedin triggering of natural cytotoxicity J Exp Med 188 953ndash960

69 RouillardAD GundersenGW FernandezNF WangZMonteiroCD McDermottMG and MarsquoayanA (2016) Theharmonizome a collection of processed datasets gathered to serveand mine knowledge about genes and proteins Database (Oxford)2016 baw100

70 WangY ZhangS LiF ZhouY ZhangY WangZ ZhangRZhuJ RenY TanY et al (2020) Therapeutic target database 2020enriched resource for facilitating research and early development oftargeted therapeutics Nucleic Acids Res 48 D1031ndashD1041

71 Blanco-MeloD Nilsson-PayantBE LiuWC UhlSHoaglandD MoslashllerR JordanTX OishiK PanisM SachsDet al (2020) Imbalanced host response to SARS-CoV-2 drivesdevelopment of COVID-19 Cell 181 1036ndash1045

72 DobinA DavisCA SchlesingerF DrenkowJ ZaleskiC JhaSBatutP ChaissonM and GingerasTR (2013) STAR ultrafastuniversal RNA-seq aligner Bioinformatics 29 15ndash21

73 RobinsonMD McCarthyDJ and SmythGK (2010) edgeR aBioconductor package for differential expression analysis of digitalgene expression data Bioinformatics 26 139ndash140

74 BalamuraliD GorohovskiA DetrojaR PalandeVRaviv-ShayD and Frenkel-MorgensternM (2019) ChiTaRS 50 thecomprehensive database of chimeric transcripts matched withdruggable fusions and 3D chromatin maps Nucleic Acids Res 48D825ndashD834

75 Frenkel-MorgensternM GorohovskiA LacroixV RogersMIbanezK BoullosaC Andres LeonE Ben-HurA andValenciaA (2013) ChiTaRS a database of human mouse and fruitfly chimeric transcripts and RNA-sequencing data Nucleic AcidsRes 41 D142

Page 7: COVID19 Drug Repository: text-mining the literature in

Nucleic Acids Research 2021 Vol 49 Database issue D1119

Figure 5 The Drug-Gene local Network built using the output for sildenafil (A) Further functional enrichment reveals the group of genes involved in theinnate immunity and inflammatory processes Moreover VDR (receptors activated by calcitriol) and the HIF1A-STAT3 path suppressed by resveratrolare also involved in the local anti-inflammatory pharmacological network thereby suggesting the ldquosildenafil-resveratrol-vitamin D3rdquo drug combinationto treat the COVID-19 complications

aligned to the hg38 reference genome using STAR (72)Differentially expressed genes were identified using edgeR(73) with parameters set at 20-fold change and lt005 P-value cut-off We thus found target genes for 41 drugs fromour database (eg sildenafil as discussed in previous para-graph) which are significantly differentially expressed dur-ing COVID-19 infection These drug-gene pairs are given inthe Supplemental Table S2

COVID19 Drug Repository server hardware and software re-quirements

The COVID19 Drug Repository server was built on anApache web server and deployed on the RedHat EnterpriseLinux (RHEL) 74 server of an Intel(R) Xeon(R) CPU E5ndash2620 v2 210GHz and 32GB RAM unit The COVID19Drug Repository has a code base and infrastructure simi-lar to that of the ChiTaRS database (7475) The COVID19Drug Repository website is compatible with modern webbrowsers (such as Chrome Firefox Microsoft Edge Operaand Safari) provided that JavaScript is enabled We recom-mend using the latest release version of these web browsersfor optimal rendering

Downloads

The COVID19 Drug Repository not only provides extendedlsquoSearchrsquo options but also offers the possibility to downloadall database tables and data sets in a user-friendly mannerThe repository is available at httpcovid19mdbiuacil

CONCLUSIONS AND FUTURE PLANS

The COVID19 Drug Repository maps data from chemoge-nomics and pharmacogenomics studies and provides viraland human genomics and proteomics information on ap-proved drugs and other therapeutics The database enablesthe user to focus on different levels of complexity startingfrom general information clinical trials and formulationsand increasing the resolution to the level of molecular mech-anisms of drug action Therefore the database can serve as anavigation and recommendation tool both for research andfor healthcare purposes Future plans include the followingadditions to the database (a) continuous updating with newdata on approved drugs experimental drugs and drug-likesynthetic or natural chemical substances (b) automatic ma-chine learning and text-mining-based annotation and visu-alization of lsquoMode of Actionrsquo (MoA) data as well as lsquoDrug-Genersquo and lsquoDrug-Symptomrsquo networks and (c) incorpora-tion of the DrugsNGS analysis tools (lsquotranscriptomicsrsquo) toaccelerate the translation of knowledge for use as personal-ized medicine for COVID19 patients

DATA AVAILABILITY

httpcovid19mdbiuacil

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online

D1120 Nucleic Acids Research 2021 Vol 49 Database issue

ACKNOWLEDGEMENTS

EL has been working as a volunteer at the Frenkel-Morgensternrsquos lab in the COVID-19 related project

FUNDING

MF-M is supported by the Kamin grant of Israel In-novation Authority (66824 2019-2021) and COVID-19Data Science Institute (DSI) grant Bar-Ilan University(247017 2020) SM is supported by the PBC fellowshipfor outstanding postdoctoral researchers from India by theCouncil of Higher Education Israel (VaTaT 104-2019 atM F-M lab)Conflict of interest statement Authors declare no conflict ofinterest

REFERENCES1 ChenQ AllotA and LuZ (2020) Keep up with the latest

coronavirus research Nature 579 1932 Lu WangL LoK ChandrasekharY ReasR YangJ EideD

FunkK KinneyR LiuZ MerrillW et al (2020) CORD-19 TheCovid-19 Open Research Dataset arXiv doihttpsarxivorgabs200410706v2 10 July 2020 preprint not peerreviewed

3 SwansonDR (2008) In BruzaP and WeeberM (eds)Literature-based Discovery Springer Berlin Heidelberg BerlinHeidelberg pp 3ndash11

4 WeiCH KaoHY and LuZ (2013) PubTator a web-based textmining tool for assisting biocuration Nucleic Acids Res 41W518ndashW522

5 RobertsK AlamT BedrickS Demner-FushmanD LoKSoboroffI VoorheesE WangLL and HershWR (2020)TREC-COVID rationale and Structure of an Information RetrievalShared Task for COVID-19 J Am Med Inform Assoc 271431ndash1436

6 CrichtonG BakerS GuoY and KorhonenA (2020) Neuralnetworks for open and closed Literature-based Discovery PLoS One15 e0232891

7 ZhangE GuptaN NogueiraR ChoK and LinJ (2020) Rapidlydeploying a neural search engine for the covid-19 open researchdataset Preliminary thoughts and lessons learned arXiv doihttpsarxivorgabs200405125 10 April 2020 preprint not peerreviewed

8 TarasovaO IvanovS FilimonovDA and PoroikovV (2020) Dataand text mining help identify key proteins involved in the molecularmechanisms shared by SARS-CoV-2 and HIV-1 Molecules 25 2944

9 WeiCH HarrisBR LiD BerardiniTZ HualaE KaoHYand LuZ (2012) Accelerating literature curation with text-miningtools a case study of using PubTator to curate genes in PubMedabstracts Database (Oxford) 2012 bas041

10 WeiCH AllotA LeamanR and LuZ (2019) PubTator centralautomated concept annotation for biomedical full text articlesNucleic Acids Res 47 W587ndashW593

11 OsinskiS and WeissD (2004) Conceptual Clustering Using LingoAlgorithm Evaluation on Open Directory Project Data InKłopotekMA WierzchonST and TrojanowskiK (eds) IntelligentInformation Processing and Web Mining Advances in Soft ComputingSpringer Berlin Heidelberg Vol 25 pp 369ndash377

12 TagoreS GorohovskiA JensenLJ and Frenkel-MorgensternM(2019) ProtFus a comprehensive method characterizingprotein-protein interactions of fusion proteins PLoS Comput Biol15 e1007239

13 MeyerUA ZangerUM and SchwabM (2013) Omics and drugresponse Annu Rev Pharmacol Toxicol 53 475ndash502

14 ZhengXF and ChanTF (2002) Chemical genomics a systematicapproach in biological research and drug discovery Curr Issues MolBiol 4 33ndash43

15 EngelbergA (2004) Iconix Pharmaceuticals Incndashremoving barriersto efficient drug discovery through chemogenomicsPharmacogenomics 5 741ndash744

16 MagarinosMP CarmonaSJ CrowtherGJ RalphSARoosDS ShanmugamD Van VoorhisWC and AgueroF (2012)TDR targets a chemogenomics resource for neglected diseasesNucleic Acids Res 40 D1118ndashD1127

17 WangL MaC WipfP LiuH SuW and XieXQ (2013)TargetHunter an in silico target identification tool for predictingtherapeutic potential of small organic molecules based onchemogenomic database AAPS J 15 395ndash406

18 TributO LessardY ReymannJM AllainH and Bentue-FerrerD(2002) Pharmacogenomics Med Sci Monit 8 RA152ndashRA163

19 HoodE (2003) Pharmacogenomics the promise of personalizedmedicine Environ Health Perspect 111 A581ndashA589

20 WeinshilboumR and WangL (2004) Pharmacogenomics bench tobedside Nat Rev Drug Discov 3 739ndash748

21 ServiceRF (2005) Pharmacogenomics Going from genome to pillScience 308 1858ndash1860

22 BlakeCA and SobelBE (2008) Progress in pharmacogenomics andits promise for medicine Exp Biol Med (Maywood) 2331482ndash1483

23 WangL (2010) Pharmacogenomics a systems approach WileyInterdiscip Rev Syst Biol Med 2 3ndash22

24 BurtT and DhillonS (2013) Pharmacogenomics in early-phaseclinical development Pharmacogenomics 14 1085ndash1097

25 MusaA TripathiS DehmerM Yli-HarjaO KauffmanSA andEmmert-StreibF (2019) Systems pharmacogenomic landscape ofdrug similarities from LINCS data Drug Association Networks SciRep 9 7849

26 KalamaraA TobalinaL and Saez-RodriguezJ (2018) How to findthe right drug for each patient Advances and challenges inpharmacogenomics Curr Opin Syst Biol 10 53ndash62

27 KirkRJ HungJL HornerSR and PerezJT (2008) Implicationsof pharmacogenomics for drug development Exp Biol Med(Maywood) 233 1484ndash1497

28 Sharifi-NoghabiH ZolotarevaO CollinsCC and EsterM (2019)MOLI multi-omics late integration with deep neural networks fordrug response prediction Bioinformatics 35 i501ndashi509

29 PulleyJM RhoadsJP JeromeRN ChallaAP ErregerKBJolyMM LavieriRR PerryKE ZaleskiNM Shirey-RiceJKet al (2020) Using what we already have uncovering new drugrepurposing strategies in existing omics data Annu Rev PharmacolToxicol 60 333ndash352

30 SindelarRD (2013) Genomics other ldquoOmicrdquo technologiespersonalized medicine and additional biotechnology-relatedtechniques In CrommelinDJA SindelarRD and MeibohmB(eds) Pharmaceutical Biotechnology Fundamentalsand ApplicationsSpringer International Publishing NY pp 179ndash221

31 Uran LandaburuL BerensteinAJ VidelaS MaruPShanmugamD ChernomoretzA and AgueroF (2020) TDRTargets 6 driving drug discovery for human pathogens throughintensive chemogenomic data integration Nucleic Acids Res 48D992ndashD1005

32 FengZ ChenM LiangT ShenM ChenH and XieXQ (2020)Virus-CKB an integrated bioinformatics platform and analysisresource for COVID-19 research Brief Bioinform bbaa155

33 Duran-FrigolaM PaulsE Guitart-PlaO BertoniM AlcaldeVAmatD Juan-BlancoT and AloyP (2020) Extending thesmall-molecule similarity principle to all levels of biology with thechemical checker Nature Biotechnology 38 1087ndash1096

34 SinghN DecrolyE KhatibAM and VilloutreixBO (2020)Structure-based drug repositioning over the human TMPRSS2protease domain search for chemical probes able to repressSARS-CoV-2 Spike protein cleavages Eur J Pharm Sci 153105495

35 GordonDE JangGM BouhaddouM XuJ ObernierKWhiteKM OrsquoMearaMJ RezeljVV GuoJZ SwaneyDL et al(2020) A SARS-CoV-2 protein interaction map reveals targets fordrug repurposing Nature 583 459ndash468

36 TakahashiT LuzumJA NicolMR and JacobsonPA (2020)Pharmacogenomics of COVID-19 therapies NPJ Genom Med 5 35

37 WishartDS KnoxC GuoAC ShrivastavaS HassanaliMStothardP ChangZ and WoolseyJ (2006) DrugBank acomprehensive resource for in silico drug discovery and explorationNucleic Acids Res 34 D668ndashD672

38 WishartDS FeunangYD GuoAC LoEJ MarcuAGrantJR SajedT JohnsonD LiC SayeedaZ et al (2018)

Nucleic Acids Research 2021 Vol 49 Database issue D1121

DrugBank 50 a major update to the DrugBank database for 2018Nucleic Acids Res 46 D1074ndashD1082

39 KimS ChenJ ChengT GindulyteA HeJ HeS LiQShoemakerBA ThiessenPA YuB et al (2019) PubChem 2019update improved access to chemical data Nucleic Acids Res 47D1102ndashD1109

40 ArmstrongJF FaccendaE HardingSD PawsonAJSouthanC SharmanJL CampoB CavanaghDRAlexanderSPH DavenportAP et al (2020) The IUPHARBPSGuide to PHARMACOLOGY in 2020 extendingimmunopharmacology content and introducing the IUPHARMMVGuide to MALARIA PHARMACOLOGY Nucleic Acids Res 48D1006ndashD1021

41 HuffenbergerMA and WigingtonRL (1975) Chemical AbstractsService approach to management of large data bases J Chem InfComput Sci 15 43ndash47

42 WeisgerberDW (1997) Chemical Abstracts Service ChemicalRegistry System history scope and impacts J Am Soc Inf Sci 48349ndash360

43 WideniusM AxmarkD and ArnoK (2002) In MySQL ReferenceManual Documentation From the Source OrsquoReilly Media Inc

44 CockPJ AntaoT ChangJT ChapmanBA CoxCJ DalkeAFriedbergI HamelryckT KauffF WilczynskiB et al (2009)Biopython freely available Python tools for computational molecularbiology and bioinformatics Bioinformatics 25 1422ndash1423

45 CoordinatorsNR (2016) Database resources of the National Centerfor Biotechnology Information Nucleic Acids Res 44 D7ndashD19

46 ThompsonK (1968) Programming techniques regular expressionsearch algorithm Commun ACM 11 419ndash422

47 StefanowskiJ and WeissD (2003) Carrot2 and Language Propertiesin Web Search Results Clustering In MenasalvasE SegoviaJ andSzczepaniakPS (eds) Advances in Web Intelligence Springer BerlinHeidelberg pp 240ndash249

48 SayersE (2010) E-utilities Quick Start EntrezProgramming UtilitiesHelp [Internet] Bethesda MD National Center for BiotechnologyInformation (US) httpwwwncbinlmnihgovbooksNBK25500

49 LipscombCE (2000) Medical subject headings (MeSH) Bull MedLibr Assoc 88 265

50 SahuNU and KharkarPS (2016) Computational DrugRepositioning a lateral approach to traditional drug discovery CurrTop Med Chem 16 2069ndash2077

51 SamE and AthriP (2019) Web-based drug repurposing tools asurvey Brief Bioinform 20 299ndash316

52 DonnerY KazmierczakS and FortneyK (2018) Drug repurposingusing deep embeddings of gene expression profiles Mol Pharm 154314ndash4325

53 AlexanderSPH ArmstrongJF DavenportAP DaviesJAFaccendaE HardingSD Levi-SchafferF MaguireJJPawsonAJ SouthanC et al (2020) A rational roadmap forSARS-CoV-2COVID-19 pharmacotherapeutic research anddevelopment IUPHAR Review 29 Br J Pharmacol 1774942ndash4966

54 FaccendaE ArmstrongJ DavenprotA HardingS PawsonASouthanC and DaviesJ (2020) Coronavirus InformationIUPHARBPS Guide to Pharmacologyhttpswwwguidetopharmacologyorgcoronavirusjsp

55 JeonS KoM LeeJ ChoiI ByunSY ParkS ShumD andKimS (2020) Identification of antiviral drug candidates againstSARS-CoV-2 from FDA-approved drugs Antimicrob AgentsChemother 64 e00819-20

56 RivaL YuanS YinX Martin-SanchoL MatsunagaNBurgstaller-MuehlbacherS PacheL De JesusPP HullMVChangM et al (2020) A large-scale drug repositioning survey forSARS-CoV-2 antivirals bioRxiv doihttpsdoiorg10110120200416044016 17 April 2020 preprintnot peer reviewed

57 BrimacombeKR ZhaoT EastmanRT HuX WangKBackusM BaljinnyamB ChenCZ ChenL EicherT et al (2020)An OpenData portal to share COVID-19 drug repurposing data inreal time bioRxiv doi httpsdoiorg10110120200604135046 05June 2020 preprint not peer reviewed

58 MirabelliC WotringJW ZhangCJ McCartySM FursmidtRFrumT KadambiNS AminAT OrsquoMearaTR PrettoCD et al(2020) Morphological cell profiling of SARS-CoV-2 infectionidentifies drug repurposing candidates for COVID-19 bioRxiv doi

httpsdoiorg10110120200527117184 27 May 2020 preprintnot peer reviewed

59 TangH AbouleilaY SiL Ortega-PrietoAM MummeryCLIngberDE and MashaghiA (2020) Human organs-on-chips forvirology Trends Microbiol 28 934ndash946

60 WestonS ColemanCM HauptR LogueJ MatthewsK LiYReyesHM WeissSR and FriemanMB (2020) Broadanti-coronaviral activity of FDA approved drugs againstSARS-CoV-2 J Virol 94 e01218-20

61 TouretF GillesM BarralK NougairedeA van HeldenJDecrolyE de LamballerieX and CoutardB (2020) In vitroscreening of a FDA approved chemical library reveals potentialinhibitors of SARS-CoV-2 replication Sci Rep 10 13093

62 EllingerB BojkovaD ZalianiA CinatlJ ClaussenCWesthausS ReinshagenJ KuzikovM WolfM and GeisslingerG(2020) Identification of inhibitors of SARS-CoV-2 in-vitro cellulartoxicity in human (Caco-2) cells using a large scale drug repurposingcollection ResearchSquare doi1021203rs3rs-23951v1

63 HeiserK McLeanPF DavisCT FogelsonB GordonHBJacobsonP HurstB MillerB AlfaRW EarnshawBA et al(2020) Identification of potential treatments for COVID-19 throughartificial intelligence-enabled phenomic analysis of human cellsinfected with SARS-CoV-2 bioRxiv doihttpsdoiorg10110120200421054387 23 April 2020 preprintnot peer reviewed

64 SzklarczykD GableAL LyonD JungeA WyderSHuerta-CepasJ SimonovicM DonchevaNT MorrisJH BorkPet al (2019) STRING v11 protein-protein association networks withincreased coverage supporting functional discovery in genome-wideexperimental datasets Nucleic Acids Res 47 D607ndashD613

65 NikolovaS GuentherA SavaiR WeissmannN GhofraniHAKonigshoffM EickelbergO KlepetkoW VoswinckelRSeegerW et al (2010) Phosphodiesterase 6 subunits are expressedand altered in idiopathic pulmonary fibrosis Respir Res 11 146

66 HemnesAR ZaimanA and ChampionHC (2008) PDE5Ainhibition attenuates bleomycin-induced pulmonary fibrosis andpulmonary hypertension through inhibition of ROS generation andRhoARho kinase activation Am J Physiol Lung Cell MolPhysiol 294 L24ndashL33

67 UhlenM FagerbergL HallstromBM LindskogC OksvoldPMardinogluA SivertssonA KampfC SjostedtE AsplundAet al (2015) Proteomics Tissue-based map of the human proteomeScience 347 1260419

68 PessinoA SivoriS BottinoC MalaspinaA MorelliLMorettaL BiassoniR and MorettaA (1998) Molecular cloning ofNKp46 a novel member of the immunoglobulin superfamily involvedin triggering of natural cytotoxicity J Exp Med 188 953ndash960

69 RouillardAD GundersenGW FernandezNF WangZMonteiroCD McDermottMG and MarsquoayanA (2016) Theharmonizome a collection of processed datasets gathered to serveand mine knowledge about genes and proteins Database (Oxford)2016 baw100

70 WangY ZhangS LiF ZhouY ZhangY WangZ ZhangRZhuJ RenY TanY et al (2020) Therapeutic target database 2020enriched resource for facilitating research and early development oftargeted therapeutics Nucleic Acids Res 48 D1031ndashD1041

71 Blanco-MeloD Nilsson-PayantBE LiuWC UhlSHoaglandD MoslashllerR JordanTX OishiK PanisM SachsDet al (2020) Imbalanced host response to SARS-CoV-2 drivesdevelopment of COVID-19 Cell 181 1036ndash1045

72 DobinA DavisCA SchlesingerF DrenkowJ ZaleskiC JhaSBatutP ChaissonM and GingerasTR (2013) STAR ultrafastuniversal RNA-seq aligner Bioinformatics 29 15ndash21

73 RobinsonMD McCarthyDJ and SmythGK (2010) edgeR aBioconductor package for differential expression analysis of digitalgene expression data Bioinformatics 26 139ndash140

74 BalamuraliD GorohovskiA DetrojaR PalandeVRaviv-ShayD and Frenkel-MorgensternM (2019) ChiTaRS 50 thecomprehensive database of chimeric transcripts matched withdruggable fusions and 3D chromatin maps Nucleic Acids Res 48D825ndashD834

75 Frenkel-MorgensternM GorohovskiA LacroixV RogersMIbanezK BoullosaC Andres LeonE Ben-HurA andValenciaA (2013) ChiTaRS a database of human mouse and fruitfly chimeric transcripts and RNA-sequencing data Nucleic AcidsRes 41 D142

Page 8: COVID19 Drug Repository: text-mining the literature in

D1120 Nucleic Acids Research 2021 Vol 49 Database issue

ACKNOWLEDGEMENTS

EL has been working as a volunteer at the Frenkel-Morgensternrsquos lab in the COVID-19 related project

FUNDING

MF-M is supported by the Kamin grant of Israel In-novation Authority (66824 2019-2021) and COVID-19Data Science Institute (DSI) grant Bar-Ilan University(247017 2020) SM is supported by the PBC fellowshipfor outstanding postdoctoral researchers from India by theCouncil of Higher Education Israel (VaTaT 104-2019 atM F-M lab)Conflict of interest statement Authors declare no conflict ofinterest

REFERENCES1 ChenQ AllotA and LuZ (2020) Keep up with the latest

coronavirus research Nature 579 1932 Lu WangL LoK ChandrasekharY ReasR YangJ EideD

FunkK KinneyR LiuZ MerrillW et al (2020) CORD-19 TheCovid-19 Open Research Dataset arXiv doihttpsarxivorgabs200410706v2 10 July 2020 preprint not peerreviewed

3 SwansonDR (2008) In BruzaP and WeeberM (eds)Literature-based Discovery Springer Berlin Heidelberg BerlinHeidelberg pp 3ndash11

4 WeiCH KaoHY and LuZ (2013) PubTator a web-based textmining tool for assisting biocuration Nucleic Acids Res 41W518ndashW522

5 RobertsK AlamT BedrickS Demner-FushmanD LoKSoboroffI VoorheesE WangLL and HershWR (2020)TREC-COVID rationale and Structure of an Information RetrievalShared Task for COVID-19 J Am Med Inform Assoc 271431ndash1436

6 CrichtonG BakerS GuoY and KorhonenA (2020) Neuralnetworks for open and closed Literature-based Discovery PLoS One15 e0232891

7 ZhangE GuptaN NogueiraR ChoK and LinJ (2020) Rapidlydeploying a neural search engine for the covid-19 open researchdataset Preliminary thoughts and lessons learned arXiv doihttpsarxivorgabs200405125 10 April 2020 preprint not peerreviewed

8 TarasovaO IvanovS FilimonovDA and PoroikovV (2020) Dataand text mining help identify key proteins involved in the molecularmechanisms shared by SARS-CoV-2 and HIV-1 Molecules 25 2944

9 WeiCH HarrisBR LiD BerardiniTZ HualaE KaoHYand LuZ (2012) Accelerating literature curation with text-miningtools a case study of using PubTator to curate genes in PubMedabstracts Database (Oxford) 2012 bas041

10 WeiCH AllotA LeamanR and LuZ (2019) PubTator centralautomated concept annotation for biomedical full text articlesNucleic Acids Res 47 W587ndashW593

11 OsinskiS and WeissD (2004) Conceptual Clustering Using LingoAlgorithm Evaluation on Open Directory Project Data InKłopotekMA WierzchonST and TrojanowskiK (eds) IntelligentInformation Processing and Web Mining Advances in Soft ComputingSpringer Berlin Heidelberg Vol 25 pp 369ndash377

12 TagoreS GorohovskiA JensenLJ and Frenkel-MorgensternM(2019) ProtFus a comprehensive method characterizingprotein-protein interactions of fusion proteins PLoS Comput Biol15 e1007239

13 MeyerUA ZangerUM and SchwabM (2013) Omics and drugresponse Annu Rev Pharmacol Toxicol 53 475ndash502

14 ZhengXF and ChanTF (2002) Chemical genomics a systematicapproach in biological research and drug discovery Curr Issues MolBiol 4 33ndash43

15 EngelbergA (2004) Iconix Pharmaceuticals Incndashremoving barriersto efficient drug discovery through chemogenomicsPharmacogenomics 5 741ndash744

16 MagarinosMP CarmonaSJ CrowtherGJ RalphSARoosDS ShanmugamD Van VoorhisWC and AgueroF (2012)TDR targets a chemogenomics resource for neglected diseasesNucleic Acids Res 40 D1118ndashD1127

17 WangL MaC WipfP LiuH SuW and XieXQ (2013)TargetHunter an in silico target identification tool for predictingtherapeutic potential of small organic molecules based onchemogenomic database AAPS J 15 395ndash406

18 TributO LessardY ReymannJM AllainH and Bentue-FerrerD(2002) Pharmacogenomics Med Sci Monit 8 RA152ndashRA163

19 HoodE (2003) Pharmacogenomics the promise of personalizedmedicine Environ Health Perspect 111 A581ndashA589

20 WeinshilboumR and WangL (2004) Pharmacogenomics bench tobedside Nat Rev Drug Discov 3 739ndash748

21 ServiceRF (2005) Pharmacogenomics Going from genome to pillScience 308 1858ndash1860

22 BlakeCA and SobelBE (2008) Progress in pharmacogenomics andits promise for medicine Exp Biol Med (Maywood) 2331482ndash1483

23 WangL (2010) Pharmacogenomics a systems approach WileyInterdiscip Rev Syst Biol Med 2 3ndash22

24 BurtT and DhillonS (2013) Pharmacogenomics in early-phaseclinical development Pharmacogenomics 14 1085ndash1097

25 MusaA TripathiS DehmerM Yli-HarjaO KauffmanSA andEmmert-StreibF (2019) Systems pharmacogenomic landscape ofdrug similarities from LINCS data Drug Association Networks SciRep 9 7849

26 KalamaraA TobalinaL and Saez-RodriguezJ (2018) How to findthe right drug for each patient Advances and challenges inpharmacogenomics Curr Opin Syst Biol 10 53ndash62

27 KirkRJ HungJL HornerSR and PerezJT (2008) Implicationsof pharmacogenomics for drug development Exp Biol Med(Maywood) 233 1484ndash1497

28 Sharifi-NoghabiH ZolotarevaO CollinsCC and EsterM (2019)MOLI multi-omics late integration with deep neural networks fordrug response prediction Bioinformatics 35 i501ndashi509

29 PulleyJM RhoadsJP JeromeRN ChallaAP ErregerKBJolyMM LavieriRR PerryKE ZaleskiNM Shirey-RiceJKet al (2020) Using what we already have uncovering new drugrepurposing strategies in existing omics data Annu Rev PharmacolToxicol 60 333ndash352

30 SindelarRD (2013) Genomics other ldquoOmicrdquo technologiespersonalized medicine and additional biotechnology-relatedtechniques In CrommelinDJA SindelarRD and MeibohmB(eds) Pharmaceutical Biotechnology Fundamentalsand ApplicationsSpringer International Publishing NY pp 179ndash221

31 Uran LandaburuL BerensteinAJ VidelaS MaruPShanmugamD ChernomoretzA and AgueroF (2020) TDRTargets 6 driving drug discovery for human pathogens throughintensive chemogenomic data integration Nucleic Acids Res 48D992ndashD1005

32 FengZ ChenM LiangT ShenM ChenH and XieXQ (2020)Virus-CKB an integrated bioinformatics platform and analysisresource for COVID-19 research Brief Bioinform bbaa155

33 Duran-FrigolaM PaulsE Guitart-PlaO BertoniM AlcaldeVAmatD Juan-BlancoT and AloyP (2020) Extending thesmall-molecule similarity principle to all levels of biology with thechemical checker Nature Biotechnology 38 1087ndash1096

34 SinghN DecrolyE KhatibAM and VilloutreixBO (2020)Structure-based drug repositioning over the human TMPRSS2protease domain search for chemical probes able to repressSARS-CoV-2 Spike protein cleavages Eur J Pharm Sci 153105495

35 GordonDE JangGM BouhaddouM XuJ ObernierKWhiteKM OrsquoMearaMJ RezeljVV GuoJZ SwaneyDL et al(2020) A SARS-CoV-2 protein interaction map reveals targets fordrug repurposing Nature 583 459ndash468

36 TakahashiT LuzumJA NicolMR and JacobsonPA (2020)Pharmacogenomics of COVID-19 therapies NPJ Genom Med 5 35

37 WishartDS KnoxC GuoAC ShrivastavaS HassanaliMStothardP ChangZ and WoolseyJ (2006) DrugBank acomprehensive resource for in silico drug discovery and explorationNucleic Acids Res 34 D668ndashD672

38 WishartDS FeunangYD GuoAC LoEJ MarcuAGrantJR SajedT JohnsonD LiC SayeedaZ et al (2018)

Nucleic Acids Research 2021 Vol 49 Database issue D1121

DrugBank 50 a major update to the DrugBank database for 2018Nucleic Acids Res 46 D1074ndashD1082

39 KimS ChenJ ChengT GindulyteA HeJ HeS LiQShoemakerBA ThiessenPA YuB et al (2019) PubChem 2019update improved access to chemical data Nucleic Acids Res 47D1102ndashD1109

40 ArmstrongJF FaccendaE HardingSD PawsonAJSouthanC SharmanJL CampoB CavanaghDRAlexanderSPH DavenportAP et al (2020) The IUPHARBPSGuide to PHARMACOLOGY in 2020 extendingimmunopharmacology content and introducing the IUPHARMMVGuide to MALARIA PHARMACOLOGY Nucleic Acids Res 48D1006ndashD1021

41 HuffenbergerMA and WigingtonRL (1975) Chemical AbstractsService approach to management of large data bases J Chem InfComput Sci 15 43ndash47

42 WeisgerberDW (1997) Chemical Abstracts Service ChemicalRegistry System history scope and impacts J Am Soc Inf Sci 48349ndash360

43 WideniusM AxmarkD and ArnoK (2002) In MySQL ReferenceManual Documentation From the Source OrsquoReilly Media Inc

44 CockPJ AntaoT ChangJT ChapmanBA CoxCJ DalkeAFriedbergI HamelryckT KauffF WilczynskiB et al (2009)Biopython freely available Python tools for computational molecularbiology and bioinformatics Bioinformatics 25 1422ndash1423

45 CoordinatorsNR (2016) Database resources of the National Centerfor Biotechnology Information Nucleic Acids Res 44 D7ndashD19

46 ThompsonK (1968) Programming techniques regular expressionsearch algorithm Commun ACM 11 419ndash422

47 StefanowskiJ and WeissD (2003) Carrot2 and Language Propertiesin Web Search Results Clustering In MenasalvasE SegoviaJ andSzczepaniakPS (eds) Advances in Web Intelligence Springer BerlinHeidelberg pp 240ndash249

48 SayersE (2010) E-utilities Quick Start EntrezProgramming UtilitiesHelp [Internet] Bethesda MD National Center for BiotechnologyInformation (US) httpwwwncbinlmnihgovbooksNBK25500

49 LipscombCE (2000) Medical subject headings (MeSH) Bull MedLibr Assoc 88 265

50 SahuNU and KharkarPS (2016) Computational DrugRepositioning a lateral approach to traditional drug discovery CurrTop Med Chem 16 2069ndash2077

51 SamE and AthriP (2019) Web-based drug repurposing tools asurvey Brief Bioinform 20 299ndash316

52 DonnerY KazmierczakS and FortneyK (2018) Drug repurposingusing deep embeddings of gene expression profiles Mol Pharm 154314ndash4325

53 AlexanderSPH ArmstrongJF DavenportAP DaviesJAFaccendaE HardingSD Levi-SchafferF MaguireJJPawsonAJ SouthanC et al (2020) A rational roadmap forSARS-CoV-2COVID-19 pharmacotherapeutic research anddevelopment IUPHAR Review 29 Br J Pharmacol 1774942ndash4966

54 FaccendaE ArmstrongJ DavenprotA HardingS PawsonASouthanC and DaviesJ (2020) Coronavirus InformationIUPHARBPS Guide to Pharmacologyhttpswwwguidetopharmacologyorgcoronavirusjsp

55 JeonS KoM LeeJ ChoiI ByunSY ParkS ShumD andKimS (2020) Identification of antiviral drug candidates againstSARS-CoV-2 from FDA-approved drugs Antimicrob AgentsChemother 64 e00819-20

56 RivaL YuanS YinX Martin-SanchoL MatsunagaNBurgstaller-MuehlbacherS PacheL De JesusPP HullMVChangM et al (2020) A large-scale drug repositioning survey forSARS-CoV-2 antivirals bioRxiv doihttpsdoiorg10110120200416044016 17 April 2020 preprintnot peer reviewed

57 BrimacombeKR ZhaoT EastmanRT HuX WangKBackusM BaljinnyamB ChenCZ ChenL EicherT et al (2020)An OpenData portal to share COVID-19 drug repurposing data inreal time bioRxiv doi httpsdoiorg10110120200604135046 05June 2020 preprint not peer reviewed

58 MirabelliC WotringJW ZhangCJ McCartySM FursmidtRFrumT KadambiNS AminAT OrsquoMearaTR PrettoCD et al(2020) Morphological cell profiling of SARS-CoV-2 infectionidentifies drug repurposing candidates for COVID-19 bioRxiv doi

httpsdoiorg10110120200527117184 27 May 2020 preprintnot peer reviewed

59 TangH AbouleilaY SiL Ortega-PrietoAM MummeryCLIngberDE and MashaghiA (2020) Human organs-on-chips forvirology Trends Microbiol 28 934ndash946

60 WestonS ColemanCM HauptR LogueJ MatthewsK LiYReyesHM WeissSR and FriemanMB (2020) Broadanti-coronaviral activity of FDA approved drugs againstSARS-CoV-2 J Virol 94 e01218-20

61 TouretF GillesM BarralK NougairedeA van HeldenJDecrolyE de LamballerieX and CoutardB (2020) In vitroscreening of a FDA approved chemical library reveals potentialinhibitors of SARS-CoV-2 replication Sci Rep 10 13093

62 EllingerB BojkovaD ZalianiA CinatlJ ClaussenCWesthausS ReinshagenJ KuzikovM WolfM and GeisslingerG(2020) Identification of inhibitors of SARS-CoV-2 in-vitro cellulartoxicity in human (Caco-2) cells using a large scale drug repurposingcollection ResearchSquare doi1021203rs3rs-23951v1

63 HeiserK McLeanPF DavisCT FogelsonB GordonHBJacobsonP HurstB MillerB AlfaRW EarnshawBA et al(2020) Identification of potential treatments for COVID-19 throughartificial intelligence-enabled phenomic analysis of human cellsinfected with SARS-CoV-2 bioRxiv doihttpsdoiorg10110120200421054387 23 April 2020 preprintnot peer reviewed

64 SzklarczykD GableAL LyonD JungeA WyderSHuerta-CepasJ SimonovicM DonchevaNT MorrisJH BorkPet al (2019) STRING v11 protein-protein association networks withincreased coverage supporting functional discovery in genome-wideexperimental datasets Nucleic Acids Res 47 D607ndashD613

65 NikolovaS GuentherA SavaiR WeissmannN GhofraniHAKonigshoffM EickelbergO KlepetkoW VoswinckelRSeegerW et al (2010) Phosphodiesterase 6 subunits are expressedand altered in idiopathic pulmonary fibrosis Respir Res 11 146

66 HemnesAR ZaimanA and ChampionHC (2008) PDE5Ainhibition attenuates bleomycin-induced pulmonary fibrosis andpulmonary hypertension through inhibition of ROS generation andRhoARho kinase activation Am J Physiol Lung Cell MolPhysiol 294 L24ndashL33

67 UhlenM FagerbergL HallstromBM LindskogC OksvoldPMardinogluA SivertssonA KampfC SjostedtE AsplundAet al (2015) Proteomics Tissue-based map of the human proteomeScience 347 1260419

68 PessinoA SivoriS BottinoC MalaspinaA MorelliLMorettaL BiassoniR and MorettaA (1998) Molecular cloning ofNKp46 a novel member of the immunoglobulin superfamily involvedin triggering of natural cytotoxicity J Exp Med 188 953ndash960

69 RouillardAD GundersenGW FernandezNF WangZMonteiroCD McDermottMG and MarsquoayanA (2016) Theharmonizome a collection of processed datasets gathered to serveand mine knowledge about genes and proteins Database (Oxford)2016 baw100

70 WangY ZhangS LiF ZhouY ZhangY WangZ ZhangRZhuJ RenY TanY et al (2020) Therapeutic target database 2020enriched resource for facilitating research and early development oftargeted therapeutics Nucleic Acids Res 48 D1031ndashD1041

71 Blanco-MeloD Nilsson-PayantBE LiuWC UhlSHoaglandD MoslashllerR JordanTX OishiK PanisM SachsDet al (2020) Imbalanced host response to SARS-CoV-2 drivesdevelopment of COVID-19 Cell 181 1036ndash1045

72 DobinA DavisCA SchlesingerF DrenkowJ ZaleskiC JhaSBatutP ChaissonM and GingerasTR (2013) STAR ultrafastuniversal RNA-seq aligner Bioinformatics 29 15ndash21

73 RobinsonMD McCarthyDJ and SmythGK (2010) edgeR aBioconductor package for differential expression analysis of digitalgene expression data Bioinformatics 26 139ndash140

74 BalamuraliD GorohovskiA DetrojaR PalandeVRaviv-ShayD and Frenkel-MorgensternM (2019) ChiTaRS 50 thecomprehensive database of chimeric transcripts matched withdruggable fusions and 3D chromatin maps Nucleic Acids Res 48D825ndashD834

75 Frenkel-MorgensternM GorohovskiA LacroixV RogersMIbanezK BoullosaC Andres LeonE Ben-HurA andValenciaA (2013) ChiTaRS a database of human mouse and fruitfly chimeric transcripts and RNA-sequencing data Nucleic AcidsRes 41 D142

Page 9: COVID19 Drug Repository: text-mining the literature in

Nucleic Acids Research 2021 Vol 49 Database issue D1121

DrugBank 50 a major update to the DrugBank database for 2018Nucleic Acids Res 46 D1074ndashD1082

39 KimS ChenJ ChengT GindulyteA HeJ HeS LiQShoemakerBA ThiessenPA YuB et al (2019) PubChem 2019update improved access to chemical data Nucleic Acids Res 47D1102ndashD1109

40 ArmstrongJF FaccendaE HardingSD PawsonAJSouthanC SharmanJL CampoB CavanaghDRAlexanderSPH DavenportAP et al (2020) The IUPHARBPSGuide to PHARMACOLOGY in 2020 extendingimmunopharmacology content and introducing the IUPHARMMVGuide to MALARIA PHARMACOLOGY Nucleic Acids Res 48D1006ndashD1021

41 HuffenbergerMA and WigingtonRL (1975) Chemical AbstractsService approach to management of large data bases J Chem InfComput Sci 15 43ndash47

42 WeisgerberDW (1997) Chemical Abstracts Service ChemicalRegistry System history scope and impacts J Am Soc Inf Sci 48349ndash360

43 WideniusM AxmarkD and ArnoK (2002) In MySQL ReferenceManual Documentation From the Source OrsquoReilly Media Inc

44 CockPJ AntaoT ChangJT ChapmanBA CoxCJ DalkeAFriedbergI HamelryckT KauffF WilczynskiB et al (2009)Biopython freely available Python tools for computational molecularbiology and bioinformatics Bioinformatics 25 1422ndash1423

45 CoordinatorsNR (2016) Database resources of the National Centerfor Biotechnology Information Nucleic Acids Res 44 D7ndashD19

46 ThompsonK (1968) Programming techniques regular expressionsearch algorithm Commun ACM 11 419ndash422

47 StefanowskiJ and WeissD (2003) Carrot2 and Language Propertiesin Web Search Results Clustering In MenasalvasE SegoviaJ andSzczepaniakPS (eds) Advances in Web Intelligence Springer BerlinHeidelberg pp 240ndash249

48 SayersE (2010) E-utilities Quick Start EntrezProgramming UtilitiesHelp [Internet] Bethesda MD National Center for BiotechnologyInformation (US) httpwwwncbinlmnihgovbooksNBK25500

49 LipscombCE (2000) Medical subject headings (MeSH) Bull MedLibr Assoc 88 265

50 SahuNU and KharkarPS (2016) Computational DrugRepositioning a lateral approach to traditional drug discovery CurrTop Med Chem 16 2069ndash2077

51 SamE and AthriP (2019) Web-based drug repurposing tools asurvey Brief Bioinform 20 299ndash316

52 DonnerY KazmierczakS and FortneyK (2018) Drug repurposingusing deep embeddings of gene expression profiles Mol Pharm 154314ndash4325

53 AlexanderSPH ArmstrongJF DavenportAP DaviesJAFaccendaE HardingSD Levi-SchafferF MaguireJJPawsonAJ SouthanC et al (2020) A rational roadmap forSARS-CoV-2COVID-19 pharmacotherapeutic research anddevelopment IUPHAR Review 29 Br J Pharmacol 1774942ndash4966

54 FaccendaE ArmstrongJ DavenprotA HardingS PawsonASouthanC and DaviesJ (2020) Coronavirus InformationIUPHARBPS Guide to Pharmacologyhttpswwwguidetopharmacologyorgcoronavirusjsp

55 JeonS KoM LeeJ ChoiI ByunSY ParkS ShumD andKimS (2020) Identification of antiviral drug candidates againstSARS-CoV-2 from FDA-approved drugs Antimicrob AgentsChemother 64 e00819-20

56 RivaL YuanS YinX Martin-SanchoL MatsunagaNBurgstaller-MuehlbacherS PacheL De JesusPP HullMVChangM et al (2020) A large-scale drug repositioning survey forSARS-CoV-2 antivirals bioRxiv doihttpsdoiorg10110120200416044016 17 April 2020 preprintnot peer reviewed

57 BrimacombeKR ZhaoT EastmanRT HuX WangKBackusM BaljinnyamB ChenCZ ChenL EicherT et al (2020)An OpenData portal to share COVID-19 drug repurposing data inreal time bioRxiv doi httpsdoiorg10110120200604135046 05June 2020 preprint not peer reviewed

58 MirabelliC WotringJW ZhangCJ McCartySM FursmidtRFrumT KadambiNS AminAT OrsquoMearaTR PrettoCD et al(2020) Morphological cell profiling of SARS-CoV-2 infectionidentifies drug repurposing candidates for COVID-19 bioRxiv doi

httpsdoiorg10110120200527117184 27 May 2020 preprintnot peer reviewed

59 TangH AbouleilaY SiL Ortega-PrietoAM MummeryCLIngberDE and MashaghiA (2020) Human organs-on-chips forvirology Trends Microbiol 28 934ndash946

60 WestonS ColemanCM HauptR LogueJ MatthewsK LiYReyesHM WeissSR and FriemanMB (2020) Broadanti-coronaviral activity of FDA approved drugs againstSARS-CoV-2 J Virol 94 e01218-20

61 TouretF GillesM BarralK NougairedeA van HeldenJDecrolyE de LamballerieX and CoutardB (2020) In vitroscreening of a FDA approved chemical library reveals potentialinhibitors of SARS-CoV-2 replication Sci Rep 10 13093

62 EllingerB BojkovaD ZalianiA CinatlJ ClaussenCWesthausS ReinshagenJ KuzikovM WolfM and GeisslingerG(2020) Identification of inhibitors of SARS-CoV-2 in-vitro cellulartoxicity in human (Caco-2) cells using a large scale drug repurposingcollection ResearchSquare doi1021203rs3rs-23951v1

63 HeiserK McLeanPF DavisCT FogelsonB GordonHBJacobsonP HurstB MillerB AlfaRW EarnshawBA et al(2020) Identification of potential treatments for COVID-19 throughartificial intelligence-enabled phenomic analysis of human cellsinfected with SARS-CoV-2 bioRxiv doihttpsdoiorg10110120200421054387 23 April 2020 preprintnot peer reviewed

64 SzklarczykD GableAL LyonD JungeA WyderSHuerta-CepasJ SimonovicM DonchevaNT MorrisJH BorkPet al (2019) STRING v11 protein-protein association networks withincreased coverage supporting functional discovery in genome-wideexperimental datasets Nucleic Acids Res 47 D607ndashD613

65 NikolovaS GuentherA SavaiR WeissmannN GhofraniHAKonigshoffM EickelbergO KlepetkoW VoswinckelRSeegerW et al (2010) Phosphodiesterase 6 subunits are expressedand altered in idiopathic pulmonary fibrosis Respir Res 11 146

66 HemnesAR ZaimanA and ChampionHC (2008) PDE5Ainhibition attenuates bleomycin-induced pulmonary fibrosis andpulmonary hypertension through inhibition of ROS generation andRhoARho kinase activation Am J Physiol Lung Cell MolPhysiol 294 L24ndashL33

67 UhlenM FagerbergL HallstromBM LindskogC OksvoldPMardinogluA SivertssonA KampfC SjostedtE AsplundAet al (2015) Proteomics Tissue-based map of the human proteomeScience 347 1260419

68 PessinoA SivoriS BottinoC MalaspinaA MorelliLMorettaL BiassoniR and MorettaA (1998) Molecular cloning ofNKp46 a novel member of the immunoglobulin superfamily involvedin triggering of natural cytotoxicity J Exp Med 188 953ndash960

69 RouillardAD GundersenGW FernandezNF WangZMonteiroCD McDermottMG and MarsquoayanA (2016) Theharmonizome a collection of processed datasets gathered to serveand mine knowledge about genes and proteins Database (Oxford)2016 baw100

70 WangY ZhangS LiF ZhouY ZhangY WangZ ZhangRZhuJ RenY TanY et al (2020) Therapeutic target database 2020enriched resource for facilitating research and early development oftargeted therapeutics Nucleic Acids Res 48 D1031ndashD1041

71 Blanco-MeloD Nilsson-PayantBE LiuWC UhlSHoaglandD MoslashllerR JordanTX OishiK PanisM SachsDet al (2020) Imbalanced host response to SARS-CoV-2 drivesdevelopment of COVID-19 Cell 181 1036ndash1045

72 DobinA DavisCA SchlesingerF DrenkowJ ZaleskiC JhaSBatutP ChaissonM and GingerasTR (2013) STAR ultrafastuniversal RNA-seq aligner Bioinformatics 29 15ndash21

73 RobinsonMD McCarthyDJ and SmythGK (2010) edgeR aBioconductor package for differential expression analysis of digitalgene expression data Bioinformatics 26 139ndash140

74 BalamuraliD GorohovskiA DetrojaR PalandeVRaviv-ShayD and Frenkel-MorgensternM (2019) ChiTaRS 50 thecomprehensive database of chimeric transcripts matched withdruggable fusions and 3D chromatin maps Nucleic Acids Res 48D825ndashD834

75 Frenkel-MorgensternM GorohovskiA LacroixV RogersMIbanezK BoullosaC Andres LeonE Ben-HurA andValenciaA (2013) ChiTaRS a database of human mouse and fruitfly chimeric transcripts and RNA-sequencing data Nucleic AcidsRes 41 D142