6/26/2015 function prediction of protein complexes with domain correlation ya zhang xue-wen chen...


Post on 21-Dec-2015



Function Prediction of Protein Complexes with Domain Correlation

Ya Zhang, Xue-Wen Chen

University of Kansas

Introduction

Protein complexes: groups of proteins that interact or associate with each other.

Many molecular machineries are protein complexes, e.g., the proteasome, which is responsible for the degradation of unwanted proteins and consists of more than 50 proteins.

Motivation

Previous efforts in analyzing protein complex data were largely based on the protein constitution of the complexes. Studying complexes at the protein level does not always reveal the functional linkage among complexes.

The cytoplasmic ribosomal large subunit vs. the mitochondrial ribosomal large subunit

In this study, we analyze the domain composition of protein complexes to predict the functions of protein complexes.

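Discover domain functional modules

The support of a set of domains $X = \{d_1, d_2, \ldots, d_k\}$ is the fraction of protein complexes containing $X$:

$\mathrm{supp}(X) = N_X / N$

where $N_X$ is the number of protein complexes containing $X$ and $N$ is the total number of protein complexes.

The h-confidence of $X$ is:

$\mathrm{hconf}(X) = \mathrm{supp}(X) / \max_i \mathrm{supp}(\{d_i\})$

A set of domains $X = \{d_1, d_2, \ldots, d_k\}$ is a hyperclique pattern if

$\mathrm{hconf}(X) \ge h_c$ and $\mathrm{supp}(X) \ge s$.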
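The definitions above can be made concrete with a minimal Python sketch. It assumes each protein complex is represented simply as a set of domain names; the function names (`supp`, `hconf`, `is_hyperclique`) and the toy data are illustrative and not taken from the original study.

```python
from typing import FrozenSet, Iterable, List

# Assumption: a complex is represented by the set of domain names it contains.
Complex = FrozenSet[str]


def supp(domains: Iterable[str], complexes: List[Complex]) -> float:
    """Fraction of complexes that contain every domain in `domains`."""
    x = frozenset(domains)
    return sum(1 for c in complexes if x <= c) / len(complexes)


def hconf(domains: Iterable[str], complexes: List[Complex]) -> float:
    """supp(X) divided by the largest single-domain support within X."""
    x = frozenset(domains)
    max_single = max(supp([d], complexes) for d in x)
    return supp(x, complexes) / max_single if max_single > 0 else 0.0


def is_hyperclique(domains: Iterable[str], complexes: List[Complex],
                   h_c: float, s: float) -> bool:
    """True if X meets both the support and the h-confidence threshold."""
    x = frozenset(domains)
    return supp(x, complexes) >= s and hconf(x, complexes) >= h_c


# Toy data with hypothetical domain names A to D (not from the study).
complexes = [
    frozenset({"A", "B", "C"}),
    frozenset({"A", "B"}),
    frozenset({"A", "D"}),
    frozenset({"D"}),
]

print(supp({"A", "B"}, complexes))   # 2 of 4 complexes contain both A and B
print(hconf({"A", "B"}, complexes))  # 0.5 divided by supp({A}) = 0.75
```

Because h-confidence divides by the support of the most frequent member, a set only scores high when all of its domains tend to occur together, not merely when one common domain happens to co-occur with rare ones.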

Understanding the composition and function of protein complexes is an important research focus in biological science.

Domain | cytoplasmic | mitochondrial
Ribosomal_L5 | YGR085c; YPR102c | YDR237w
Ribosomal_L23 | YOL127w | YDR405w
Ribosomal_L11 | YDR418w; YEL054c | YNL185c
Ribosomal_L2 | YFR031c-a; YIL018w | YEL050c
Ribosomal_L6 | YGL147c; YNL067w | YHR147c

The cytoplasmic ribosomal large subunit vs. the mitochondrial ribosomal large subunit:

Both function in protein synthesis.

They share no single protein but 14 domains.
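The comparison can be checked on the rows listed above with a short Python sketch. It uses only the five domains shown on the slide (the full complexes share 14), and the variable names are illustrative.

```python
# Rows copied from the slide: domain -> (cytoplasmic proteins, mitochondrial proteins).
# Only five of the 14 shared domains are listed on the slide.
table = {
    "Ribosomal_L5": ({"YGR085c", "YPR102c"}, {"YDR237w"}),
    "Ribosomal_L23": ({"YOL127w"}, {"YDR405w"}),
    "Ribosomal_L11": ({"YDR418w", "YEL054c"}, {"YNL185c"}),
    "Ribosomal_L2": ({"YFR031c-a", "YIL018w"}, {"YEL050c"}),
    "Ribosomal_L6": ({"YGL147c", "YNL067w"}, {"YHR147c"}),
}

# Protein-level view: pool the proteins of each subunit and intersect them.
cyto_proteins = set().union(*(cyto for cyto, _ in table.values()))
mito_proteins = set().union(*(mito for _, mito in table.values()))
shared_proteins = cyto_proteins & mito_proteins

# Domain-level view: a domain is shared if both subunits have a protein carrying it.
shared_domains = {d for d, (cyto, mito) in table.items() if cyto and mito}

print(len(shared_proteins))  # 0
print(len(shared_domains))   # 5
```

At the protein level the two subunits look unrelated, while at the domain level every listed domain appears in both, which is the linkage the domain-based analysis is meant to capture.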

Method

Domain functional modules

Sets of domains that perform some elementary functions in protein complexes.

Extensively shared among protein complexes.

Domain-domain network

Nodes: domains

Edges: shared membership
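A domain-domain network of this kind could be built as in the following Python sketch. It assumes that "shared membership" means two domains occur together in at least one protein complex; the slide does not spell out the exact rule, and the function name and toy data are hypothetical.

```python
from itertools import combinations
from typing import Dict, Set


def build_domain_network(complexes: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Return an adjacency map {domain: set of neighbouring domains}.

    Assumption: two domains are linked if at least one complex contains both.
    """
    adjacency: Dict[str, Set[str]] = {}
    for domains in complexes.values():
        # Register every domain as a node, even if it ends up with no edges.
        for d in domains:
            adjacency.setdefault(d, set())
        # Link every pair of domains found in the same complex.
        for a, b in combinations(sorted(domains), 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


# Toy data with hypothetical complex and domain names.
complexes = {
    "C1": {"d1", "d2", "d3"},
    "C2": {"d2", "d3"},
    "C3": {"d4"},
}
network = build_domain_network(complexes)
for domain in sorted(network):
    print(domain, sorted(network[domain]))
```

A plain dictionary of sets keeps the sketch dependency-free; a graph library would serve equally well if clique enumeration or visualisation is needed afterwards.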

High-throughput Protein Complexes

High-throughput experiments have produced a large amount of protein complex data.

Gavin, et al. (Nature, 2002): TAP (Tandem Affinity Purification)

Ho, et al. (Nature, 2002): HMS-PCI (High-throughput Mass Spectrometric Protein Complex Identification)

But the biological functions of the protein complexes are largely unknown.

Results: hypercliques

Domain Functional Modules | Protein Complexes
RNA_pol_Rpb1_1; RNA_pol_Rpb1_2; RNA_pol_Rpb1_3; RNA_pol_Rpb1_4; RNA_pol_Rpb1_5; RNA_pol_L; RNA_pol_A_bac; RNA_pol_Rpb2_1; RNA_pol_Rpb2_2; RNA_pol_Rpb2_3; RNA_pol_Rpb2_5; RNA_pol_Rpb2_6; RNA_pol_Rpb2_7; RNA_pol_Rpb5_N; RNA_pol_Rpb5_C; DNA_RNApol_7kD; RNA_pol_N; RNA_pol_Rpb8; RNA_pol_Rpb6 | RNA polymerase I; RNA polymerase II; RNA polymerase III
Ribosomal_L14; Ribosomal_L1; Ribosomal_L5; Ribosomal_L5_C; Ribosomal_L23; Ribosomal_L6; KOW; Ribosomal_L11_N; Ribosomal_L11; Ribosomal_L2; Ribosomal_L2_C; L15; Ribosomal_L3; Ribosomal_L13 | cytoplasmic ribosomal large subunit; mitochondrial ribosomal large subunit
Ribosomal_S17; Ribosomal_S9; S4; Ribosomal_S5; Ribosomal_S5_C; Ribosomal_S14; Ribosomal_S10; Ribosomal_S15; Ribosomal_S2; Ribosomal_S7; Ribosomal_S19 | mitochondrial ribosomal small subunit; cytoplasmic ribosomal small subunit
Transket_pyr; Pyr_redox_2; Pyr_redox; Biotin_lipoyl; 2-oxoacid_dh; E1_dh; GIDA; Pyr_redox_dim | Pyruvate dehydrogenase; 2-oxoglutarate dehydrogenase

Results: function prediction

ComplexID | Function (Pred.) | MIPS annotation
550.1.108 | Protein synthesis | probably protein synthesis turnover
550.1.104 | Protein synthesis | probably protein synthesis turnover
550.1.140; 550.1.142; 550.1.155 | DNA replication | probably RNA metabolism
550.1.39; 550.1.42 | oxidoreductase | probably intermediate and energy metabolism
550.2.56 | DNA replication | N/A
550.1.190; 550.1.211; 550.1.220; 550.1.205; 550.1.228 | DNA replication | probably transcription/DNA maintenance/chromatin structure

Discover domain functional modules

Discover domain functional modules as cliques in the domain-domain network.

[Figure: Protein Complexes (higher-order functions) C1, C2, C3; Domain Functional Modules (elementary functions) M1, M2]

Two criteria:

Frequency of occurrence

Correlation / association

The modules indicate the functions of the corresponding protein complexes.

Identify domain functional modules as hyperclique patterns (Xiong et al. 2005).
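As a rough illustration of how modules might be identified and tied back to complexes, here is a brute-force Python sketch. It is not the mining algorithm of Xiong et al.: it simply enumerates every candidate domain set up to a size limit and keeps those that meet both thresholds, without the pruning a dedicated miner would use. The function names, the `max_size` parameter, the threshold values, and the toy data are all assumptions made for the example.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List


def supp(x: FrozenSet[str], complexes: Dict[str, FrozenSet[str]]) -> float:
    """Fraction of complexes containing every domain in x."""
    return sum(1 for c in complexes.values() if x <= c) / len(complexes)


def hconf(x: FrozenSet[str], complexes: Dict[str, FrozenSet[str]]) -> float:
    """supp(x) divided by the largest single-domain support within x."""
    return supp(x, complexes) / max(supp(frozenset([d]), complexes) for d in x)


def hyperclique_patterns(complexes: Dict[str, FrozenSet[str]],
                         h_c: float, s: float,
                         max_size: int = 4) -> List[FrozenSet[str]]:
    """Exhaustively list domain sets of size 2..max_size meeting both thresholds."""
    domains = sorted(set().union(*complexes.values()))
    found = []
    for k in range(2, max_size + 1):
        for combo in combinations(domains, k):
            x = frozenset(combo)
            if supp(x, complexes) >= s and hconf(x, complexes) >= h_c:
                found.append(x)
    return found


def complexes_containing(module: FrozenSet[str],
                         complexes: Dict[str, FrozenSet[str]]) -> List[str]:
    """Names of the complexes that contain every domain of the module."""
    return sorted(name for name, c in complexes.items() if module <= c)


# Toy data with hypothetical complex and domain names.
complexes = {
    "C1": frozenset({"d1", "d2", "d3"}),
    "C2": frozenset({"d1", "d2", "d4"}),
    "C3": frozenset({"d3", "d4"}),
}

modules = hyperclique_patterns(complexes, h_c=0.9, s=0.5)
for m in modules:
    print(sorted(m), complexes_containing(m, complexes))
```

In this toy case the only pattern that passes is the pair d1, d2, found in C1 and C2. If that module carried a known elementary function, both complexes would be candidates for a related function, which is the kind of inference the slides describe.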