

Bioinformatics

ANAP: An Integrated Knowledge Base for Arabidopsis Protein Interaction Network Analysis 1[C][W][OA]

Congmao Wang, Alex Marshall, Dabing Zhang, and Zoe A. Wilson*

State Key Laboratory of Hybrid Rice, School of Life Sciences and Biotechnology (C.W., D.Z.), and Bio-X Center (D.Z.), Shanghai Jiao Tong University, Shanghai 200240, China; and School of Biosciences, University of Nottingham, Sutton Bonington Campus, Loughborough, Leicestershire LE12 5RD, United Kingdom (C.W., A.M., Z.A.W.)

Protein interactions are fundamental to the molecular processes occurring within an organism and can be utilized in network biology to help organize, simplify, and understand biological complexity. Currently, there are more than 10 publicly available Arabidopsis (Arabidopsis thaliana) protein interaction databases. However, there are limitations with these databases, including different types of interaction evidence, a lack of defined standards for protein identifiers, differing levels of information, and, critically, a lack of integration between them. In this paper, we present an interactive bioinformatics Web tool, ANAP (Arabidopsis Network Analysis Pipeline), which serves to effectively integrate the different data sets and maximize access to available data. ANAP has been developed for Arabidopsis protein interaction integration and network-based study to facilitate functional protein network analysis. ANAP integrates 11 Arabidopsis protein interaction databases, comprising 201,699 unique protein interaction pairs, 15,208 identifiers (including 11,931 The Arabidopsis Information Resource Arabidopsis Genome Initiative codes), 89 interaction detection methods, 73 species that interact with Arabidopsis, and 6,161 references. ANAP can be used as a knowledge base for constructing protein interaction networks based on user input and supports both direct and indirect interaction analysis. It has an intuitive graphical interface allowing easy network visualization and provides extensive detailed evidence for each interaction. In addition, ANAP displays the gene and protein annotation in the generated interactive network with links to The Arabidopsis Information Resource, the AtGenExpress Visualization Tool, the Arabidopsis 1,001 Genomes GBrowse, the Protein Knowledgebase, the Kyoto Encyclopedia of Genes and Genomes, and the Ensembl Genome Browser to significantly aid functional network analysis. The tool is available open access at http://gmdd.shgmo.org/Computational-Biology/ANAP.

Protein interaction networks can provide a global view of cellular processes, thus facilitating the study of complex, dynamic biological systems (Jansen et al., 2003). Interactions between proteins can be direct physical interactions or indirect ones, which may involve intermediate molecules that facilitate the interaction. For example, if proteins A and B, and also B and C, have direct interactions, then A and C interact indirectly. These interactions are key to cellular events associated with protein localization, translation rates, gene regulation, and posttranslational modifications (Bork et al., 2004). The development of full-genome- and proteomics-based technologies, such as next-generation sequencing, transcriptomics, and high-throughput yeast two-hybrid screening, has generated huge amounts of biological data. To capitalize upon these data for functional biological studies, this information needs to be analyzed, effectively integrated, and stored to facilitate rapid searching and in-depth analysis.

Large-scale protein interaction data sets have been generated for a number of model organisms, including Saccharomyces cerevisiae (Schwikowski et al., 2000; Uetz et al., 2000), Drosophila melanogaster (Giot et al., 2003), Caenorhabditis elegans (Li et al., 2004), and the human protein interactome (Rual et al., 2005). These data sets, and many others, have hugely increased the amount of available protein interaction data over the past 10 years, but currently, they are all collated into different protein interaction databases (Arabidopsis Interactome Mapping Consortium, 2011). To date, a significant amount of protein interaction data have been generated for Arabidopsis (Arabidopsis thaliana); however, these data have been produced using a range of methods. These data sets are stored in a variety of databases, including Agile Protein Interaction DataAnalyzer (APID; Prieto and De Las Rivas, 2006), Arabidopsis thaliana Protein Interactome Database (AtPID; Cui et al., 2008), Arabidopsis thaliana Protein Interaction Network (AtPIN; Brandao et al., 2009), the Biomolecular Interaction Network Database (BIND; Bader et al., 2003), Biological General Repository for Interaction Datasets (BioGRID; Stark et al., 2006, 2011), ChEMBL (Overington, 2009), The Database of Interacting Proteins (DIP; Xenarios et al., 2000, 2001, 2002), IntAct (Aranda et al., 2010), InteroPORC (Michaut et al., 2008), iRefIndex (Razick et al., 2008), The Molecular INTeraction database (MINT; Ceol et al., 2010), MolCon (http://www.ebi.ac.uk/Tools/webservices/psicquic/view/main.xhtml), and Search Tool for the Retrieval of Interacting Genes/Proteins (STRING; Jensen et al., 2009; Szklarczyk et al., 2011).

1 This work was supported by the Royal Society International Joint Projects/The National Natural Science Foundation of China (NSFC; China Costshare Award), the Biotechnology and Biological Sciences Research Council (to Z.A.W.), the National Basic Research Program of China (grant nos. 2009CB941500 and 2007CB108700 to D.Z.), the National Natural Science Foundation of China (grant nos. 30725022, 30830014, and 90717109 to D.Z.), the Science and Technology Commission of Shanghai Municipality, the Shanghai Leading Academic Discipline Project (grant no. B205 to D.Z.), the China Postdoctoral Science Foundation (grant no. 20090450706 to D.Z.), the Shanghai Postdoctoral Scientific Program (grant no. 09R21414300 to D.Z.), a Scholarship Award for Excellent Doctoral Students from the Ministry of Education, and the International Cooperation project of the Shanghai Science and Technology Committee (grant no. 08540702700).

* Corresponding author; e-mail [email protected].

The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantphysiol.org) is: Zoe A. Wilson ([email protected]).

[C] Some figures in this article are displayed in color online but in black and white in the print edition.

[W] The online version of this article contains Web-only data.

[OA] Open Access articles can be viewed online without a subscription.

www.plantphysiol.org/cgi/doi/10.1104/pp.111.192203

Plant Physiology®, April 2012, Vol. 158, pp. 1523–1533, www.plantphysiol.org © 2012 American Society of Plant Biologists. All Rights Reserved.

Currently, it is not easy to access these data directly and to integrate information from different sources and methodologies into biological network information. Fortunately, an excellent recent resource, PSICQUIC (for Proteomics Standards Initiative Common QUery InterfaCe; Aranda et al., 2010), has provided an interface for protein interaction databases to allow easy access to these data. The main goal of the PSICQUIC project is to provide a common query interface and implement data quality assessment across these disparate databases; it is now being successfully used by many projects, including Cytoscape, IntAct, and Reactome (http://code.google.com/p/psicquic/wiki/WhoUsesPsicquic).
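As a rough illustration of what a common query interface offers an integrator, the sketch below parses one record in the tab-delimited PSI-MI TAB 2.5 layout that PSICQUIC services can return. The column positions used here (interactors in columns 1 and 2, detection method in column 7, publication in column 9, source database in column 13) are assumed from the MITAB 2.5 convention. The sample line and the `parse_mitab_line` helper are hypothetical and are not taken from ANAP.

```python
from typing import NamedTuple


class Interaction(NamedTuple):
    """The subset of MITAB 2.5 columns used in this sketch."""
    id_a: str
    id_b: str
    method: str
    publication: str
    source_db: str


def parse_mitab_line(line: str) -> Interaction:
    """Split one tab-delimited MITAB 2.5 record and keep five fields."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 15:
        raise ValueError(f"expected at least 15 columns, got {len(cols)}")
    # Zero-based positions: 0/1 interactors, 6 method, 8 publication, 12 source.
    return Interaction(cols[0], cols[1], cols[6], cols[8], cols[12])


# Illustrative record only; "-" marks fields left empty in this example.
sample = "\t".join([
    "tair:AT1G02090", "tair:AT5G42970", "-", "-", "-", "-",
    'psi-mi:"MI:0018"(two hybrid)', "-", "pubmed:12615944",
    "taxid:3702", "taxid:3702", "-",
    'psi-mi:"MI:0469"(IntAct)', "-", "-",
])
record = parse_mitab_line(sample)
print(record.id_a, record.id_b, record.source_db)
```

Because every database answers in the same column layout, one parser of this kind can serve all sources, which is the practical benefit of a shared interface.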

There are a number of bioinformatics tools, such as ATTED (Obayashi et al., 2007), that utilize coexpression data for network analysis; however, these have their limitations, since they are based upon transcript levels and do not utilize protein data. One of the initial network analysis tools for visualizing the Arabidopsis interactome was the Arabidopsis Interaction Viewer (Geisler-Lee et al., 2007). The Arabidopsis Interaction Viewer currently contains nearly 99,466 Arabidopsis interacting proteins, which were collected from BIND, MINT, literature sources such as Arabidopsis Interactome Mapping (Arabidopsis Interactome Mapping Consortium, 2011), and some predictions generated by the authors. The Arabidopsis thaliana Protein Interaction Network also offers an online tool that integrates some of the available Arabidopsis protein interaction databases, including the Predicted Interactome for Arabidopsis (Geisler-Lee et al., 2007), Arabidopsis protein-protein interaction data curated by The Arabidopsis Information Resource (TAIR) curators (http://www.arabidopsis.org/index.jsp), BioGRID (Stark et al., 2006, 2011), and IntAct (Aranda et al., 2010).

There are many variables that have to be addressed to facilitate data integration between the large numbers of available protein interaction data sets. These include data standards, the use of single types of protein identifiers, and well-defined ontology terms. Many of these data are generated from different sources with no shared database design, often with no clearly defined standards and with different identifiers. Therefore, it is vital to develop a set of definitive standards for the collection, integration, and analysis of protein interaction data to enable the establishment of networks that utilize data from both small-scale experiments and high-throughput approaches. This is particularly important because an interaction that has been demonstrated by multiple approaches lends greater validity and robustness to the network.

To address these issues and to facilitate effective protein interaction network construction, we have developed an interactive bioinformatics Web tool entitled the Arabidopsis Network Analysis Pipeline (ANAP) for Arabidopsis network analysis. The main aims of ANAP are to integrate the currently available Arabidopsis protein interaction data sets and to provide biologists with a novel, easy-to-use, and intuitive interface that enables researchers with limited bioinformatics experience to carry out detailed, high-throughput network analysis. Protein interaction data sets were integrated and formatted from 11 public Arabidopsis protein interaction databases. At publication, ANAP contained 201,699 unique protein interaction pairs, comprising 15,208 identifiers (including 11,931 TAIR Arabidopsis Genome Initiative [AGI] codes), with 89 interaction detection methods, 73 species whose proteins interact with Arabidopsis proteins, and 6,161 references (Table I). This provides an extensive and valuable knowledge base for generating protein interaction networks from the integrated data sets, thus producing a far more detailed and reliable network than could be produced from any single protein interaction database.
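The notion of unique protein interaction pairs can be made concrete with a small sketch. It assumes, hypothetically, that each source database yields (molecule A, molecule B, database) records and that an interaction is undirected, so that A-B and B-A collapse into one pair. The paper does not describe ANAP's actual merging procedure; `merge_sources` and the toy records are illustrative only.

```python
from collections import defaultdict


def merge_sources(records):
    """Collapse records into unique unordered pairs and collect their sources.

    records: iterable of (mol_a, mol_b, database) tuples.
    Returns a dict mapping a sorted (id, id) pair to a set of database names.
    """
    merged = defaultdict(set)
    for mol_a, mol_b, database in records:
        # Normalize case and order so that A-B and B-A share one key.
        pair = tuple(sorted((mol_a.upper(), mol_b.upper())))
        merged[pair].add(database)
    return dict(merged)


# Toy records (assumed for illustration, not real database content).
records = [
    ("AT1G02090", "AT5G42970", "IntAct"),
    ("AT5G42970", "AT1G02090", "BioGRID"),
    ("AT1G22920", "AT5G42970", "STRING"),
]
merged = merge_sources(records)
for pair, sources in sorted(merged.items()):
    print(pair, sorted(sources))
```

Keeping the set of supporting databases per pair, rather than discarding duplicates, preserves the information that an interaction is reported by more than one source.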

Table I. Statistics of the integrated ANAP protein interaction source data

Category                               No.
Protein interaction databases          11
Species interacting with Arabidopsis   73
Interaction detection methods          89
References                             6,161
Unique TAIR AGI codes                  11,931
Unique molecules                       15,208
Protein interaction pairs              201,699

ANAP supports searches with either a single protein or multiple proteins. The networks generated display the various interaction detection methods and data sources in unique colors to enable effective network viewing. There are additional functions available to conduct “in-depth” protein searches, which identify the indirect interactions from the original input source protein. This is very important, as a network, or a protein interaction complex, may include indirect interactions to many other proteins. This type of approach has previously been shown to be a very useful way to recognize new interactions within a complex (Jensen et al., 2009). Each protein in the network is described using its TAIR AGI code, UniProt identifier (ID), and a short description; additionally, the full TAIR locus details can be viewed by double clicking on the protein. Direct links to five popular Arabidopsis resources (AtGenExpress Visualization Tool, Arabidopsis 1,001 Genomes GBrowse, Protein Knowledgebase, Kyoto Encyclopedia of Genes and Genomes, and Ensembl Genome Browser) are also provided. The detailed evidence of the network and each interaction can be saved in various file formats, including PNG, PDF, SVG, SIF, GRAPHML, and XGMML. The file formats SIF, GRAPHML, and XGMML are particularly useful for large networks where the user wishes to import the resulting ANAP network into Cytoscape, which is a well-established network analysis tool (Shannon et al., 2003; Kohl et al., 2011; Smoot et al., 2011). ANAP also supports the import of the resulting network into other network analysis tools, such as Network Workbench (GRAPHML; NWB Team, 2006). ANAP is a fully functional integration and analysis pipeline that will serve as an extremely valuable resource for biologists. It will enable them to capitalize upon the currently available Arabidopsis protein interaction data for effective network-based analysis, enabling greater predictions of function and selection of targets for further biological analysis.
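To show why SIF is convenient for moving a network into Cytoscape, the following minimal sketch writes edges in the Simple Interaction Format, where each line holds a source node, a relationship type, and a target node. The `to_sif` helper and the choice of the interaction detection method as the relationship label are assumptions for illustration; the exact content of the SIF files that ANAP exports is not specified here. Tabs are used as delimiters so that relationship names containing spaces stay in one field.

```python
def to_sif(edges):
    """Serialize (source, relationship, target) triples as SIF text."""
    lines = [f"{source}\t{relationship}\t{target}"
             for source, relationship, target in edges]
    return "\n".join(lines) + "\n"


# Two edges taken from the example evidence records in Table II.
edges = [
    ("AT1G02090", "two hybrid", "AT5G42970"),
    ("AT1G10840", "affinity chromatography technology", "AT5G42970"),
]
sif_text = to_sif(edges)
print(sif_text, end="")
```

SIF carries only topology and one label per edge, which is why richer formats such as GRAPHML and XGMML are offered when node and edge attributes need to travel with the network.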

RESULTS

ANAP Framework and Searches

The ANAP tool has been developed to integrate the available Arabidopsis protein interaction data that have been generated from different sources by a variety of approaches. These data are then used to generate accurate protein interaction networks, which will facilitate greater understanding of biological processes. ANAP can be used as a platform to construct protein interaction networks based on both direct and indirect interaction analysis.

ANAP has an intuitive graphical user interface that allows the user to easily construct molecular networks using single or multiple starting proteins as inputs; the results are displayed showing the proteins that interact with the initial query protein(s). Figure 1A shows the ANAP tool interface, which includes ID Mapping and a Help link. The user enters the Arabidopsis TAIR AGI code(s) or the protein UniProt ID(s) into the central search box, with the option of selecting two types of node relationship: “Source Database” and “Interaction Detection Method.” The selection of node relationship does not affect the overall network that is generated but rather the presentation of the links between the nodes; Source Database lists the database information used to generate the links, while Interaction Detection Method presents the experimental technique that has been used to generate the relationship.

Figure 1B shows the whole framework of the ANAP output, which includes several useful functions to enable the user to easily extract extensive information from the resultant network. This framework includes the network map in the center of the main panel, a “Change the Color” button underneath, network information about numbers of nodes and interactions, and a panel for searching and mapping data onto the network. There is a panel for saving the resultant network and another panel for useful information, which includes links to the supporting evidence for the interactions and a Simple Interaction Format (SIF; Cytoscape format) file containing the Source Database and Interaction Detection Method. This panel also contains a “Depth Search” button, which supports the indirect interaction search option. Moreover, there are another two panels at the bottom of the framework: one is “network filtering,” which is useful for simplifying the output of a complex network and allows users to toggle between different databases and different interaction detection methods to generate networks; the other is “upload network,” which is useful for reanalyzing the generated network while keeping the input nodes in their original positions.

Single Protein Searches

The locus AT5G42970 (which encodes subunit 4 of the COP9 signalosome [CSN] complex) was used as an example for analysis using the ANAP tool. Figure 2 shows the resulting network of 34 nodes and 130 edges, based on the direct protein interactions generated after selecting the option of Interaction Detection Method. The query protein AT5G42970 is marked in red in the center of the figure, and each associated protein is linked by a uniquely colored line, based on the interaction detection method and the rendering rules from the complete list of all interaction detection methods (Supplemental Data Set S1).

The CSN is a highly conserved protein complex that is associated with the ubiquitin-proteolytic breakdown pathway. In eukaryotes, it is formed of nine subunits, of which subunit 4 (AT5G42970) is one member (Schwechheimer and Isono, 2010). Searching ANAP with AT5G42970 identified 34 nodes; the gene identities and functions of these are shown in Figure 2B. The nine components of the CSN (COP9 subunit 4 and eight others) were all identified and are highlighted in orange (Fig. 2). To construct the network, ANAP has utilized data from multiple sources, comprising both predicted interactions and experimental evidence; the numbers of each for these proteins are shown in Figure 2B. This wide range of data provides valuable support for any interactions; for example, a large number of interactions were seen between COP9 subunit 4 and the other well-established components of the CSN. The other proteins that have been identified in the network range from those with established roles in ubiquitination pathways (Schwechheimer and Isono, 2010) to those involved in other developmental processes that are regulated by ubiquitin proteolysis. It is also very easy to go directly from the identified proteins to PubMed sources to aid in characterizing the network and to further interrogate the validity of the predicted parts of the networks. The multiple sources of data accessed by ANAP offer excellent opportunities not only to confirm known networks but also to extend them to identify novel targets. The range and depth of the data utilized for network generation, therefore, provide a valuable mechanism to assess the validity of such predictions prior to follow-up experimental analysis.

Figure 1. ANAP search page and the framework of the network result. A, Portal of the ANAP tool, which can search for single or multiple proteins using TAIR AGI code format and/or UniProt ID, based upon the node relationship of the Source Database and Interaction Detection Method. B, The framework of the ANAP result contains a map of the resultant network, a panel for searching and mapping data onto the network, options for saving and exporting the network, a panel of useful information including evidence and depth search (for indirect interaction searches), and two panels for “network filtering” and “upload network.” [See online article for color version of this figure.]

Figure 2. Network generated using the COP9 signalosome protein (AT5G42970). A, Network map based on direct protein interactions and the node relationship of the Interaction Detection Method. The query protein is marked in red, and each interaction detection method is indicated by a unique linking colored line. B, Table generated from the “evidence” from the COP9 interaction network. The table shows the numbers of interactions detected by ANAP, which of these are from experimental data and from inference-based approaches, and the total number of databases supporting the interactions. Members of the COP9 signalosome are shown in orange in both A and B.

Table II shows an example of five evidence records generated when searching using AT5G42970 based on direct protein interactions. In addition, Supplemental Data Set S2 lists the complete relevant evidence records for the AT5G42970 protein network. The user can dynamically interact with the network by using the mouse-over function on the nodes; this shows the protein’s UniProt ID, TAIR AGI code, and a short description of the predicted protein function. Additionally, there are links to the relevant locus details, which are visible when the node is double clicked. Moreover, each node in the network has a direct link to the AtGenExpress Visualization Tool, the Arabidopsis 1,001 Genomes GBrowse, the Protein Knowledgebase, the Kyoto Encyclopedia of Genes and Genomes, and the Ensembl Genome Browser. A similar feature is available for the edges in the network, which highlight the interaction method when the mouse hovers over each edge. ANAP provides the opportunity for the user to select a node(s) of interest in the resultant network and to use this to construct a new network and extract the evidence data. Furthermore, users can also search for specific protein(s) in the resultant network; such proteins are marked in blue when they are found in the resultant network and in fuchsia when they are the same as the query protein(s). Using AT5G42970, a network was constructed based on the same configuration as the network in Figure 2 by selecting the option of Source Database (Supplemental Fig. S1). The edges of each source database are indicated by a unique color based on the rendering rules for the complete list of all source databases (Supplemental Data Set S1).

There is also an added feature that allows users to easily identify the indirect interactions of the original protein using the “Depth Search” button. Supplemental Figure S2 shows the network constructed based on the indirect protein interaction data generated for AT5G42970. This approach is useful for recognizing new potential interactions in the network (Jansen et al., 2003) to assign putative functions to less well-characterized proteins and to provide more comprehensive understanding of the query protein at the system level with the help of each cluster in the constructed network.
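One way to picture an indirect interaction search is as a breadth-first traversal of the interaction graph that stops at a fixed depth. The sketch below is written under that assumption; the paper does not state how the “Depth Search” function is implemented or how deep it goes, so `neighbors_within` and the default depth of two are hypothetical.

```python
from collections import deque


def neighbors_within(edges, query, max_depth=2):
    """Return {protein: distance} for proteins within max_depth of query.

    Distance 1 marks a direct interactor; distance 2 marks a protein reached
    through one intermediate partner, i.e. an indirect interactor.
    """
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    distance = {query: 0}
    queue = deque([query])
    while queue:
        node = queue.popleft()
        if distance[node] == max_depth:
            continue  # do not expand beyond the requested depth
        for partner in adjacency.get(node, ()):
            if partner not in distance:
                distance[partner] = distance[node] + 1
                queue.append(partner)

    del distance[query]
    return distance


# A-B and B-C interact directly, so A and C interact indirectly.
edges = [("A", "B"), ("B", "C"), ("C", "D")]
result = neighbors_within(edges, "A")
print(result)
```

With the toy edges above, B is reported at distance 1 and C at distance 2, while D lies beyond the depth limit and is left out.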

Multiple Protein Searches

Increasingly, researchers are employing transcriptomics, next-generation sequencing, and other high-throughput technologies in the fields of molecular, cell, and developmental biology to decipher novel biological phenomena. By using bioinformatics-based approaches, lists of key genes can be further classified to select candidates for confirmation by biological experimentation. However, for such gene selection and functional analysis to be effective, particularly at the protein level, these data sets require supplementation and detailed analysis. Therefore, it is critical to produce protein interaction networks using multiple proteins as a way of visualizing and analyzing all the interactions simultaneously to aid in functional analysis. ANAP supports such multiple protein searches and protein interaction network construction, so that users can submit targets as TAIR AGI codes, UniProt IDs, or a combination of these identifiers. Such networks, therefore, provide valuable information establishing links between proteins, which are likely to represent functional and regulatory conservation.
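Accepting a mixture of TAIR AGI codes and UniProt IDs implies that each submitted identifier has to be recognized by type. The sketch below shows one possible approach with regular expressions. The AGI pattern (AT, a chromosome character, G, five digits) and the UniProt accession pattern are assumptions based on the published identifier formats, and `classify_identifier` is a hypothetical helper, not part of ANAP. The accession P12345 is used only as a format example.

```python
import re

# AGI locus codes such as AT5G42970: chromosome 1-5, C (chloroplast), or M (mitochondrion).
AGI_PATTERN = re.compile(r"^AT[1-5CM]G\d{5}$", re.IGNORECASE)

# UniProt accession format (6 or 10 characters).
UNIPROT_PATTERN = re.compile(
    r"^(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})$"
)


def classify_identifier(token):
    """Label a query token as 'AGI', 'UniProt', or 'unknown'."""
    token = token.strip()
    if AGI_PATTERN.match(token):
        return "AGI"
    if UNIPROT_PATTERN.match(token.upper()):
        return "UniProt"
    return "unknown"


query = ["AT1G02090", "at5g42970", "P12345", "not-an-id"]
labels = {token: classify_identifier(token) for token in query}
print(labels)
```

Checking the AGI pattern first avoids any ambiguity, and tokens that match neither pattern can be reported back to the user instead of being silently dropped.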

Figure 3 shows the network generated by searching using five proteins (AT1G02090, AT1G10840, AT1G22920, AT1G29150, and AT1G30950) from the AT5G42970 (COP9 signalosome complex) ANAP interaction network. This was constructed based on direct protein interactions, with the option of selecting based on Interaction Detection Method. Each of the query proteins is marked as a red node, and each interaction detection method is allocated a unique color. Several clusters from each query protein can be easily recognized within the network graph (Fig. 3).

Table II. Five evidence records generated when searching ANAP using protein AT5G42970 based on direct protein interactions

Ath, Arabidopsis.

Name Molecule A  Name Molecule B  Interaction Detection Method        Species Molecule A  Species Molecule B  PubMed Identifier  Source Database
AT1G02090        AT5G42970        Two hybrid                          Ath                 Ath                 12615944           IntAct
AT1G10840        AT5G42970        Affinity chromatography technology  Ath                 Ath                 15548739           BioGRID
AT1G22920        AT5G42970        Predictive text mining              Ath                 Ath                 10521526           STRING
AT1G30950        AT5G42970        Anti-tag coimmunoprecipitation      Ath                 Ath                 12724534           APID
AT4G19490        AT5G42970        Interolog mapping                   Ath                 Ath                 18508856           InteroPORC
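Evidence records of the kind shown in Table II can be summarized per interaction partner, in the spirit of the counts reported in Figure 2B. The sketch below is illustrative only: the split of detection methods into experimental and inference-based categories (`INFERRED_METHODS`) is an assumption made for this example, and `summarize_evidence` is a hypothetical helper rather than ANAP's own procedure.

```python
from collections import defaultdict

# Assumed for illustration: methods treated as inference-based, not experimental.
INFERRED_METHODS = {"predictive text mining", "interolog mapping"}


def summarize_evidence(records, query):
    """Count experimental and inferred records per partner of the query."""
    summary = defaultdict(
        lambda: {"experimental": 0, "inferred": 0, "databases": set()}
    )
    for mol_a, mol_b, method, database in records:
        if query not in (mol_a, mol_b):
            continue  # record does not involve the query protein
        partner = mol_b if mol_a == query else mol_a
        kind = "inferred" if method.lower() in INFERRED_METHODS else "experimental"
        summary[partner][kind] += 1
        summary[partner]["databases"].add(database)
    return dict(summary)


# The five example records from Table II.
records = [
    ("AT1G02090", "AT5G42970", "Two hybrid", "IntAct"),
    ("AT1G10840", "AT5G42970", "Affinity chromatography technology", "BioGRID"),
    ("AT1G22920", "AT5G42970", "Predictive text mining", "STRING"),
    ("AT1G30950", "AT5G42970", "Anti-tag coimmunoprecipitation", "APID"),
    ("AT4G19490", "AT5G42970", "Interolog mapping", "InteroPORC"),
]
summary = summarize_evidence(records, "AT5G42970")
for partner, counts in sorted(summary.items()):
    print(partner, counts["experimental"], counts["inferred"], sorted(counts["databases"]))
```

A partner supported by several experimental records from independent databases would stand out in such a summary, which is the kind of cross-source support the integrated data set is meant to expose.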

DISCUSSION

The Current Challenges of Integrating Protein Interaction Networks

Protein interaction networks can give a system-level view that is vital for the detailed analysis of complex biological systems (Jansen et al., 2003). However, providing mechanisms to integrate protein interaction data that have been generated from various sources poses significant challenges. For instance, two proteins may only interact during a certain developmental stage and/or in a specific tissue; however, most of the currently available protein interaction data do not provide temporal or spatial specificity. Furthermore, these data sets have frequently been generated in ectopic expression systems and thus may not represent the genuine interactions occurring in vivo. These limitations reduce the accuracy of the established networks, although such problems can be lessened by the successful integration of the increasing amounts of protein interaction data that have been generated by different approaches. The importance of data integration is now being fully appreciated, and there is a general emphasis toward the development of standards for large data sets with defined specific formats, which include PSI-MI (Kaiser, 2002) for protein interactions and BioPAX (Demir et al., 2010) and SBML (Hucka et al., 2003) for pathway standards. Several other approaches utilize controlled vocabularies with a defined glossary of terms for types of interactions (Cote et al., 2006) and specific protein identifiers that are constant across all the available protein interaction databases to facilitate easier integration. There is also a need for these same standards to be established in published scientific journals to further enhance the effectiveness of text mining to supplement the ANAP integrated protein interaction data set.

Figure 3. Network result generated by using a multiple protein search (five proteins searched: AT1G02090, AT1G10840, AT1G22920, AT1G29150, and AT1G30950) based on direct protein interactions and the node relationship Interaction Detection Method. Each query protein is marked in red, and each interaction detection method is indicated by a uniquely colored line. Several node clusters can be identified, with each query protein evident in the network graph. [See online article for color version of this figure.]

Interaction with Other Resources

Defining protein function is an essential requirement for effective, functional network characterization. Moreover, recent studies have shown that protein interaction networks are able to give a good prediction of protein function (Jansen et al., 2003; Sharan et al., 2007). Therefore, bridging target genes from transcriptomic data, or next-generation sequencing data, with the help of Gene Ontology term enrichment to the protein interaction network can provide added substance for network characterization (Maere et al., 2005).

Figure 4. Flow chart of the ANAP tool. The main modules in ANAP include data collection, data integration, and network visualization. Details of each module are described in the text. [See online article for color version of this figure.]

Currently, ANAP provides the function of mapping up- and down-regulated transcriptomic data, next-generation sequencing data, and other biology-based results onto the generated network. For transcriptomic mapping, the nodes are colored in red or green, which represent the up- or down-regulated genes in the network (Supplemental Fig. S3). The node can also be highlighted in blue if customized gene list data (any interesting data that users want to overlay onto the ANAP network) are mapped onto the network nodes. This makes the ANAP tool very flexible for the user to identify specific proteins and transcriptomic regulatory relationships within the network. The node is colored in fuchsia or turquoise if the mapped customized data also exist in the up- or down-regulated transcriptomics data. Moreover, ANAP provides another seven colors (olive, orange, purple, yellow, maroon, navy, and teal) in the mapping function for users to integrate data such as different subcellular localizations or other biology-based data, rather than only using this to indicate differing expression levels. Another strength of the ANAP tool is that the user can import the resultant networks into Cytoscape and other software for subsequent additional analysis (Shannon et al., 2003; Kohl et al., 2011; Smoot et al., 2011). The user can import the SIF, GRAPHML, or XGMML file generated by ANAP into Cytoscape. The Cytoscape mapping functions can then be used to integrate different resources and plugins for analyzing existing networks, inferring new networks, functional enrichment of networks, etc. ANAP networks can also be imported into other network analysis tools, such as Network Workbench (GRAPHML; NWB, 2006).
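
As an illustration of the simplest of these exchange formats, the following minimal sketch writes interaction records as SIF text, in which each line holds a source node, a relationship label, and a target node. It is not the ANAP export code; the function name to_sif and the choice of the interaction detection method as the relationship label are assumptions made for this example.

```python
from typing import Iterable, Tuple


def to_sif(edges: Iterable[Tuple[str, str, str]]) -> str:
    """Serialize (source, relation, target) triples as tab-delimited SIF lines.

    Hypothetical helper for illustration; not part of ANAP.
    """
    lines = []
    for source, relation, target in edges:
        # Replacing whitespace keeps the label a single token, even for
        # parsers that split SIF lines on any whitespace instead of tabs.
        label = "_".join(relation.split())
        lines.append(f"{source}\t{label}\t{target}\n")
    return "".join(lines)


# Two of the evidence records listed in Table II.
edges = [
    ("AT1G02090", "Two hybrid", "AT5G42970"),
    ("AT1G10840", "Affinity chromatography technology", "AT5G42970"),
]
sif_text = to_sif(edges)
```

Saving sif_text to a file with the .sif extension should give a file that Cytoscape can open through its network import function.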

CONCLUSION

In this paper, the Web-based ANAP tool has been designed and implemented for Arabidopsis protein interaction network analysis. ANAP currently integrates 201,699 unique protein interaction pairs into a tool that has a well-designed, simple-to-use, intuitive interface for biologists and whose networks can be exported to Cytoscape. Thus, it can be widely used for Arabidopsis protein interaction network construction and analysis. This is particularly valuable where large numbers of genes of interest have been selected from microarray and next-generation sequencing experiments and where only limited information is known. Case studies using single protein searches and multiple protein searches from the COP9 signalosome complex (Figs. 2 and 3; Supplemental Figs. S1 and S2) and the cytokinin regulatory pathway (ANAP user guide; Supplemental Fig. S3) have demonstrated the consistently good performance of ANAP for Arabidopsis protein interaction network analysis. The current ANAP framework provides a novel, intuitive, and easy-to-interpret tool that will greatly aid biologists in understanding plant developmental networks, allowing them to decipher their specific biological network interactions far more quickly than by using biological techniques alone. Furthermore, ANAP has been designed so that features can easily be added to extend its functionality as the tool develops. Future work is planned to extend this tool to integrate the protein interaction data with metabolic pathway data, gene coexpression data, and other types of interactions to decipher biological problems more effectively.

MATERIALS AND METHODS

Data Sets

Arabidopsis (Arabidopsis thaliana) protein interaction data sets integrated into ANAP include APID (8,014 pairs; Prieto and De Las Rivas, 2006), BIND (1,545 pairs; Bader et al., 2003), BioGRID (5,862 pairs; Stark et al., 2006, 2011), ChEMBL (54 pairs; Overington, 2009), DIP (403 pairs; Xenarios et al., 2000, 2001, 2002), IntAct (16,286 pairs; Aranda et al., 2010), InteroPORC (14,722 pairs; Michaut et al., 2008), iRefIndex (18,362 pairs; Razick et al., 2008), MINT (499 pairs; Ceol et al., 2010), MolCon (116 pairs; Aranda et al., 2010), and STRING (215,358 pairs; Jensen et al., 2009; Szklarczyk et al., 2011). All these data were collected from the PSICQUIC Registry of the European Bioinformatics Institute (Aranda et al., 2010), which is an accurate and frequently used resource for many projects, such as Bio::Homology::InterologWalk, Cytoscape, EnVision 2, IMEx Consortium, IntAct, Reactome, and Taverna (http://code.google.com/p/psicquic/wiki/WhoUsesPsicquic).

In a recent ANAP update, we also integrated 5,664 confirmed binary interactions between 2,661 proteins from the Arabidopsis Interactome Mapping Consortium (2011), which is a recently published high-throughput Arabidopsis yeast two-hybrid data set.

Access Availability

ANAP is implemented in HTML, Shell, AWK, PHP, and JavaScript with the support of Cytoscape Web, which allows the developer to embed dynamic networks into HTML (Lopes et al., 2010). The tool is open access for any use and available at http://gmdd.shgmo.org/Computational-Biology/ANAP. The top right corner of the index page includes a Help link, which is particularly useful to new users. The Help page contains a “Video Tutorial,” “Frequently Asked Questions,” and a “User Guide.” Users who have questions about using ANAP, or who have difficulty understanding its terms or concepts, should refer to the Help page.

Generally, ANAP is updated with new interaction data every 3 months. To support these updates, we have developed a semiautomatic formatting and updating program for ANAP. This has been rigorously tested with random access checks and manual checks to ensure stable and accurate integration of new data. In addition, we have established a log analysis tool to analyze access to ANAP.

Flow Chart of the ANAP Tool

The main modules in ANAP connect together data collection, data integration, and network viewing. The architecture of the ANAP pipeline is shown in the flow chart in Figure 4. We first searched the Arabidopsis protein interaction data based on the mnemonic (ARATH), taxon identifier (3702), scientific name (Arabidopsis thaliana), common name (Mouse-ear cress), and other names [Arabidopsis thaliana (L.) Heynh., Arabidopsis thaliana (thale cress), Arabidopsis thaliana, thale cress, and thale-cress] used in current protein interaction databases. The collected protein interaction data were then formatted to establish the ANAP database source (Supplemental Data Set S3); the graphical user interface was designed to support querying the protein(s) using a TAIR AGI code or UniProt ID. At the same time, the network rendering rules (Supplemental Data Set S1), based on the statistical analysis of the source data, were generated. The user can select either Source Database or Interaction Detection Method as the desired node relationship. ANAP then produces the resultant network and extracts the interaction evidence. In addition, ANAP generates query keywords that extract the connecting proteins for in-depth searching. Finally, users can interact with the network and save it in different formats, either as network maps or as network data.
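
The species search in the first step can be sketched as follows. This is a minimal illustration and not the ANAP implementation, which uses Shell and AWK; the function name is_arabidopsis and the assumption that species fields look like "taxid:3702(Arabidopsis thaliana)" are ours. The label set is taken from the names listed above.

```python
import re

# Labels for Arabidopsis listed in the text, lowercased for comparison.
ARABIDOPSIS_LABELS = {
    "arath",
    "3702",
    "arabidopsis thaliana",
    "mouse-ear cress",
    "thale cress",
    "thale-cress",
}


def is_arabidopsis(species_field: str) -> bool:
    """Return True if any part of a species field names Arabidopsis.

    Hypothetical helper; the field is split on '|', ':', '(' and ')'
    so that forms such as 'taxid:3702(arath)' and
    'Arabidopsis thaliana (L.) Heynh.' are both recognized.
    """
    tokens = re.split(r"[|:()]", species_field.lower())
    return any(token.strip() in ARABIDOPSIS_LABELS for token in tokens)
```

A record would then be retained when at least one of its two species fields passes this check, because some Arabidopsis proteins interact with proteins from other species.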

Protein Interaction Data Format

A protein interaction data format was designed to convert and then integrate the 11 Arabidopsis data sets. Initially, there were numerous issues associated with independently integrating the protein interaction data from the 11 different databases. Different programs were written to collect and format each database; however, these posed problems for subsequent automated continuous updates. In addition, each database had a different data access method, which meant that integrating all the Arabidopsis data was difficult. Furthermore, each database contained very different formats for the interaction evidence. We found an excellent recent resource named PSICQUIC (Aranda et al., 2010), which had integrated raw data from about 22 protein interaction databases. However, after searching and checking extensive random data from each of the 11 available Arabidopsis databases, we found that the data were not consistently formatted (Supplemental Data Set S3), which made them unsuitable for protein interaction network-based analysis. Taking Interaction Detection Method as an example, the raw PSICQUIC data have 149 unique methods while the formatted ANAP data have 89 unique methods, and among these 149 unique methods, many, such as “two hybrid,” “2 hybrid,” and “two-hybrid-test,” have the same ontology code, “MI:0018.”
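
One way to collapse such synonymous method names is to key them on the ontology code. The sketch below illustrates the idea under stated assumptions: the raw field is assumed to carry the code in the form psi-mi:"MI:0018"(two hybrid), the lookup table contains only the single code mentioned above, and the function name normalize_method is hypothetical. It is not the formatting program used for ANAP.

```python
import re
from typing import Dict

# Illustrative subset: one canonical name per PSI-MI ontology code.
CANONICAL_METHOD: Dict[str, str] = {"MI:0018": "two hybrid"}

MI_CODE = re.compile(r"MI:\d{4}")


def normalize_method(raw: str, canonical: Dict[str, str] = CANONICAL_METHOD) -> str:
    """Map a raw detection-method field to one name per ontology code.

    Fields without a known code are returned unchanged apart from
    surrounding whitespace.
    """
    match = MI_CODE.search(raw)
    if match and match.group(0) in canonical:
        return canonical[match.group(0)]
    return raw.strip()


raw_methods = [
    'psi-mi:"MI:0018"(two hybrid)',
    'psi-mi:"MI:0018"(2 hybrid)',
    'psi-mi:"MI:0018"(two-hybrid-test)',
]
unique_methods = {normalize_method(m) for m in raw_methods}
```

Here the three raw spellings reduce to a single method name, which is the kind of reduction that takes the raw data from 149 to 89 unique methods.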

Furthermore, the raw data collected from PSICQUIC have 17 fields (Aranda et al., 2010), but most of them are blank, including “Links Molecule A,” “Links Molecule B,” “Alt. Identifiers Molecule A,” “Alt. Identifiers Molecule B,” “Interaction Type,” “Interaction AC,” and “Confidence Value.” Based on the protein interaction network analysis and the current proteomics standard PSI-MI (Kaiser, 2002), the following seven fields were created: “Name Molecule A,” “Name Molecule B,” “Interaction Detection Method,” “PubMed Identifier,” “Species Molecule A,” “Species Molecule B,” and “Source Database” (Table II).
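
The seven-field record can be represented as a small data structure. The following sketch is illustrative only: the class name EvidenceRecord, the function parse_record, and the tab-delimited layout with fields in the order listed above are assumptions, not a description of the ANAP storage format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceRecord:
    """One formatted interaction evidence record with the seven fields."""

    name_molecule_a: str
    name_molecule_b: str
    interaction_detection_method: str
    pubmed_identifier: str
    species_molecule_a: str
    species_molecule_b: str
    source_database: str


def parse_record(line: str) -> EvidenceRecord:
    """Parse one tab-delimited line (assumed layout) into a record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, found {len(fields)}")
    return EvidenceRecord(*fields)


# First evidence record of Table II, written in the assumed layout.
record = parse_record("AT1G02090\tAT5G42970\tTwo hybrid\t12615944\tAth\tAth\tIntAct\n")
```

Rejecting lines that do not have exactly seven fields makes formatting problems in newly collected data visible at the point of conversion.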

Name Molecule A and Name Molecule B represent the TAIR AGI code, protein complex name, or small molecule name. Interaction Detection Method represents the method used to support the protein interaction, such as yeast two-hybrid, coimmunoprecipitation, or electrophoretic mobility shift assays. If the interaction has been published, then the PubMed Identifier will also be provided. Species Molecule A and Species Molecule B describe the species names of the two molecules, since some Arabidopsis proteins may interact with proteins from other species. Source Database describes which database contains the interaction data. Since the integrated protein interaction data use different types of protein identifiers, including TAIR, UniProt, GeneID, RefSeq, and Gene Symbol, it was necessary to convert all these IDs to the accepted standard TAIR AGI code for the network-based analysis. Standard mapping conversions to AGI can be carried out using a variety of tools, including DAVID (Dennis et al., 2003; Huang et al., 2009), UniProt ID Mapping (Jain et al., 2009), and bioDBnet (Mudunuri et al., 2009), but each has limitations. The UniProt ID Mapping tool autofilters repeated IDs and ignores IDs that cannot be converted to TAIR IDs, which makes it unsuitable for formatting a large amount of data. Also, DAVID is not well supported for the plant community, whereas bioDBnet offers the best method to convert IDs to the TAIR Arabidopsis AGI code. However, the conversion data were not always current with bioDBnet. Based on these findings, a combined approach was employed that first used bioDBnet and subsequently converted the unmapped IDs using data downloaded from the most recent identifier annotations from UniProt and the National Center for Biotechnology Information.
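
The combined approach can be sketched as a two-stage lookup. The example below is a minimal illustration, not the ANAP conversion program: the function name convert_ids is hypothetical, the two dictionaries stand in for a bioDBnet-derived table and for the downloaded UniProt and National Center for Biotechnology Information annotations, and the source identifiers are placeholders rather than real accessions.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def convert_ids(
    ids: Iterable[str],
    primary: Dict[str, str],
    fallback: Dict[str, str],
) -> Tuple[List[Optional[str]], List[str]]:
    """Convert identifiers to AGI codes, trying primary before fallback.

    Repeated identifiers are kept in place, and identifiers that cannot
    be converted are reported instead of being silently dropped.
    """
    converted: List[Optional[str]] = []
    unmapped: List[str] = []
    for identifier in ids:
        agi = primary.get(identifier)
        if agi is None:
            agi = fallback.get(identifier)
        if agi is None:
            unmapped.append(identifier)
        converted.append(agi)
    return converted, unmapped


# Placeholder identifiers (not real accessions) mapped to AGI codes.
primary_table = {"ID_A": "AT1G02090"}
fallback_table = {"ID_B": "AT5G42970"}
converted, unmapped = convert_ids(
    ["ID_A", "ID_B", "ID_A", "ID_C"], primary_table, fallback_table
)
```

Keeping one output entry per input identifier preserves the alignment between interaction records and their converted names, which is the behavior that was found lacking when repeated or unconvertible IDs were filtered out.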

Supplemental Data

The following materials are available in the online version of this article.

Supplemental Figure S1. ANAP network result for searching using protein AT5G42970 based on direct protein interactions and the node relationship for the Source Database.

Supplemental Figure S2. Network result of searching protein AT5G42970 based on the indirect protein interactions and the node relationship of the Interaction Detection Method.

Supplemental Figure S3. ANAP network mapping of the Cytokinin microarray data set.

Supplemental Data Set S1. Color legend showing the interaction detection methods and the database source for the ANAP data and networks.

Supplemental Data Set S2. Evidence list from the network result generated by searching using protein AT5G42970, based on direct protein interactions and the node relationship of the Interaction Detection Method.

Supplemental Data Set S3. Summary of the PSICQUIC data format, including the summary of PSICQUIC raw data and a summary of the formatted ANAP data, Interaction Detection Method, Ontology Code, Species Molecule A and B, and taxonomic ID used in the formatted ANAP data.

Received December 12, 2011; accepted February 12, 2012; published February 16, 2012.

LITERATURE CITED

Arabidopsis Interactome Mapping Consortium (2011) Evidence for network evolution in an Arabidopsis interactome map. Science 333: 601–607

Aranda B, Achuthan P, Alam-Faruque Y, Armean I, Bridge A, Derow C, Feuermann M, Ghanbarian AT, Kerrien S, Khadake J, et al (2010) The IntAct molecular interaction database in 2010. Nucleic Acids Res 38: D525–D531

Bader GD, Betel D, Hogue CW (2003) BIND: the Biomolecular Interaction Network Database. Nucleic Acids Res 31: 248–250

Bork P, Jensen LJ, von Mering C, Ramani AK, Lee I, Marcotte EM (2004) Protein interaction networks from yeast to human. Curr Opin Struct Biol 14: 292–299

Brandao MM, Dantas LL, Silva-Filho MC (2009) AtPIN: Arabidopsis thaliana Protein Interaction Network. BMC Bioinformatics 10: 454

Ceol A, Chatr Aryamontri A, Licata L, Peluso D, Briganti L, Perfetto L, Castagnoli L, Cesareni G (2010) MINT, the Molecular Interaction Database: 2009 update. Nucleic Acids Res 38: D532–D539

Cote RG, Jones P, Apweiler R, Hermjakob H (2006) The Ontology Lookup Service, a lightweight cross-platform tool for controlled vocabulary queries. BMC Bioinformatics 7: 97

Cui J, Li P, Li G, Xu F, Zhao C, Li Y, Yang Z, Wang G, Yu Q, Li Y, et al (2008) AtPID: Arabidopsis thaliana Protein Interactome Database--an integrative platform for plant systems biology. Nucleic Acids Res 36: D999–D1008

Demir E, Cary MP, Paley S, Fukuda K, Lemer C, Vastrik I, Wu G, D’Eustachio P, Schaefer C, Luciano J, et al (2010) The BioPAX community standard for pathway data sharing. Nat Biotechnol 28: 935–942

Dennis G Jr, Sherman BT, Hosack DA, Yang J, Gao W, Lane HC, Lempicki RA (April 3, 2003) DAVID: Database for Annotation, Visualization, and Integrated Discovery. Genome Biol 4: http://dx.doi.org/10.1186/gb-2003-4-5-p3

Geisler-Lee J, O’Toole N, Ammar R, Provart NJ, Millar AH, Geisler M (2007) A predicted interactome for Arabidopsis. Plant Physiol 145: 317–329

Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, Hao YL, Ooi CE, Godwin B, Vitols E, et al (2003) A protein interaction map of Drosophila melanogaster. Science 302: 1727–1736

Huang W, Sherman BT, Lempicki RA (2009) Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc 4: 44–57

Hucka M, Finney A, Sauro HM, Bolouri H, Doyle JC, Kitano H, Arkin AP, Bornstein BJ, Bray D, Cornish-Bowden A, et al (2003) The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics 19: 524–531

Jain E, Bairoch A, Duvaud S, Phan I, Redaschi N, Suzek BE, Martin MJ, McGarvey P, Gasteiger E (2009) Infrastructure for the life sciences: design and implementation of the UniProt website. BMC Bioinformatics 10: 136

Jansen R, Yu H, Greenbaum D, Kluger Y, Krogan NJ, Chung S, Emili A, Snyder M, Greenblatt JF, Gerstein M (2003) A Bayesian networks approach for predicting protein-protein interactions from genomic data. Science 302: 449–453

Jensen LJ, Kuhn M, Stark M, Chaffron S, Creevey C, Muller J, Doerks T, Julien P, Roth A, Simonovic M, et al (2009) STRING 8: a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Res 37: D412–D416

Kaiser J (2002) Proteomics: public-private group maps out initiatives. Science 296: 827

Kohl M, Wiese S, Warscheid B (2011) Cytoscape: software for visualization and analysis of biological networks. Methods Mol Biol 696: 291–303

Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, Vidalain PO, Han JD, Chesneau A, Hao T, et al (2004) A map of the interactome network of the metazoan C. elegans. Science 303: 540–543

Lopes CT, Franz M, Kazi F, Donaldson SL, Morris Q, Bader GD (2010) Cytoscape Web: an interactive Web-based network browser. Bioinformatics 26: 2347–2348

Maere S, Heymans K, Kuiper M (2005) BiNGO: a Cytoscape plugin to assess overrepresentation of Gene Ontology categories in biological networks. Bioinformatics 21: 3448–3449

Michaut M, Kerrien S, Montecchi-Palazzi L, Chauvat F, Cassier-Chauvat C, Aude JC, Legrain P, Hermjakob H (2008) InteroPORC: automated inference of highly conserved protein interaction networks. Bioinformatics 24: 1625–1631

Mudunuri U, Che A, Yi M, Stephens RM (2009) bioDBnet: the biological database network. Bioinformatics 25: 555–556

NWB Team (2006) Network Workbench Tool. Indiana University, Northeastern University, and University of Michigan. http://nwb.slis.indiana.edu

Obayashi T, Kinoshita K, Nakai K, Shibaoka M, Hayashi S, Saeki M, Shibata D, Saito K, Ohta H (2007) ATTED-II: a database of co-expressed genes and cis elements for identifying co-regulated gene groups in Arabidopsis. Nucleic Acids Res 35: D863–D869

Overington J (2009) ChEMBL: an interview with John Overington, team leader, chemogenomics at the European Bioinformatics Institute Outstation of the European Molecular Biology Laboratory (EMBL-EBI). Interview by Wendy A. Warr. J Comput Aided Mol Des 23: 195–198

Prieto C, De Las Rivas J (2006) APID: Agile Protein Interaction DataAnalyzer. Nucleic Acids Res 34: W298–W302

Razick S, Magklaras G, Donaldson IM (2008) iRefIndex: a consolidated protein interaction database with provenance. BMC Bioinformatics 9: 405

Rual JF, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, Berriz GF, Gibbons FD, Dreze M, Ayivi-Guedehoussou N, et al (2005) Towards a proteome-scale map of the human protein-protein interaction network. Nature 437: 1173–1178

Schwechheimer C, Isono E (2010) The COP9 signalosome and its role in plant development. Eur J Cell Biol 89: 157–162

Schwikowski B, Uetz P, Fields S (2000) A network of protein-protein interactions in yeast. Nat Biotechnol 18: 1257–1261

Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498–2504

Sharan R, Ulitsky I, Shamir R (2007) Network-based prediction of protein function. Mol Syst Biol 3: 88

Smoot ME, Ono K, Ruscheinski J, Wang PL, Ideker T (2011) Cytoscape 2.8: new features for data integration and network visualization. Bioinformatics 27: 431–432

Stark C, Breitkreutz BJ, Chatr-Aryamontri A, Boucher L, Oughtred R, Livstone MS, Nixon J, Van Auken K, Wang X, Shi X, et al (2011) The BioGRID Interaction Database: 2011 update. Nucleic Acids Res 39: D698–D704

Stark C, Breitkreutz BJ, Reguly T, Boucher L, Breitkreutz A, Tyers M (2006) BioGRID: a general repository for interaction datasets. Nucleic Acids Res 34: D535–D539

Szklarczyk D, Franceschini A, Kuhn M, Simonovic M, Roth A, Minguez P, Doerks T, Stark M, Muller J, Bork P, et al (2011) The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored. Nucleic Acids Res 39: D561–D568

Uetz P, Giot L, Cagney G, Mansfield TA, Judson RS, Knight JR, Lockshon D, Narayan V, Srinivasan M, Pochart P, et al (2000) A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 403: 623–627

Xenarios I, Fernandez E, Salwinski L, Duan XJ, Thompson MJ, Marcotte EM, Eisenberg D (2001) DIP: the Database of Interacting Proteins. 2001 update. Nucleic Acids Res 29: 239–241

Xenarios I, Rice DW, Salwinski L, Baron MK, Marcotte EM, Eisenberg D (2000) DIP: the database of interacting proteins. Nucleic Acids Res 28: 289–291

Xenarios I, Salwinski L, Duan XJ, Higney P, Kim SM, Eisenberg D (2002) DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions. Nucleic Acids Res 30: 303–305
