supplementary materials for - rune · pdf file supplementary materials for ... ruedi...

14
www.sciencesignaling.org/cgi/content/full/2/81/ra39/DC1 Supplementary Materials for Comparative Analysis Reveals Conserved Protein Phosphorylation Networks Implicated in Multiple Diseases Chris Soon Heng Tan, Bernd Bodenmiller, Adrian Pasculescu, Marko Jovanovic, Michael O. Hengartner, Claus Jørgensen, Gary D. Bader, Ruedi Aebersold, Tony Pawson, Rune Linding* *To whom correspondence should be addressed. E-mail: [email protected] Published 28 July 2009, Sci. Signal. 2, ra39 (2009) DOI: 10.1126/scisignal.2000316 This PDF file includes: Methods Fig. S1. Clustering analysis of core phosphorylation sites. Fig. S2. Phosphorylatable residues in disordered regions are fast evolving. Fig. S3. Phosphorylation site disorder analysis. Fig. S4. Schematic diagram of how the conserved phosphorylation propensity k of each human substrate is computed. Fig. S5. Box plot of number of human substrate relations observed as conserved in target species from randomized trials. Fig. S6. Core sites observed in activation loops of protein kinases. Table S1. Statistical significance of core sites observed between human and yeast. Table S2. Statistical significance of core sites observed between human and fly. Table S3. Statistical significance of core sites observed between human and worm. Table S4. Core sites identified in components of the β-catenin destruction complex. Table S5. Core sites identified in components of the clathrin coat of coated pits. Table S6. Core site identified in 22 cancer-associated genes. Table S7. Correlation of cancer-associated genes with conserved phosphorylation propensity k computed for individual target species. Table S8. Correlation of OMIM genes with conserved phosphorylation propensity k computed for individual target species. References Supplementary Data File Descriptions Other Supplementary Material for this manuscript includes the following: (available at www.sciencesignaling.org/cgi/content/full/2/81/ra39/DC1 ) Supplementary Data (ZIP file containing files in cys, tsv, txt, csv, and fa formats)

Upload: buihanh

Post on 06-Feb-2018

222 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Supplementary Materials for - Rune  · PDF file  Supplementary Materials for ... Ruedi Aebersold ... Total collection time was 24 h and the volume

www.sciencesignaling.org/cgi/content/full/2/81/ra39/DC1

Supplementary Materials forComparative Analysis Reveals Conserved Protein Phosphorylation

Networks Implicated in Multiple Diseases

Chris Soon Heng Tan, Bernd Bodenmiller, Adrian Pasculescu, Marko Jovanovic, Michael O. Hengartner, Claus Jørgensen, Gary D. Bader, Ruedi Aebersold, Tony

Pawson, Rune Linding*

*To whom correspondence should be addressed. E-mail: [email protected]

Published 28 July 2009, Sci. Signal. 2, ra39 (2009)DOI: 10.1126/scisignal.2000316

This PDF file includes:

MethodsFig. S1. Clustering analysis of core phosphorylation sites.Fig. S2. Phosphorylatable residues in disordered regions are fast evolving.Fig. S3. Phosphorylation site disorder analysis.Fig. S4. Schematic diagram of how the conserved phosphorylation propensity k of each human substrate is computed.Fig. S5. Box plot of number of human substrate relations observed as conserved in target species from randomized trials.Fig. S6. Core sites observed in activation loops of protein kinases.Table S1. Statistical significance of core sites observed between human and yeast.Table S2. Statistical significance of core sites observed between human and fly.Table S3. Statistical significance of core sites observed between human and worm.Table S4. Core sites identified in components of the β-catenin destruction complex.Table S5. Core sites identified in components of the clathrin coat of coated pits.Table S6. Core site identified in 22 cancer-associated genes.Table S7. Correlation of cancer-associated genes with conserved phosphorylation propensity k computed for individual target species.Table S8. Correlation of OMIM genes with conserved phosphorylation propensity k computed for individual target species.ReferencesSupplementary Data File Descriptions

Other Supplementary Material for this manuscript includes the following:(available at www.sciencesignaling.org/cgi/content/full/2/81/ra39/DC1)

Supplementary Data (ZIP file containing files in cys, tsv, txt, csv, and fa formats)

Page 2: Supplementary Materials for - Rune  · PDF file  Supplementary Materials for ... Ruedi Aebersold ... Total collection time was 24 h and the volume

Supplementary Information for:Comparative Analysis Reveals Conserved Protein Phosphory lationNetworks Implicated in Multiple Diseases

Chris Soon Heng Tan 1,2,∗, Bernd Bodenmiller 3,∗, Adrian Pasculescu 1, Marko Jovanovic 4 ,Michael O. Hengartner 4, Claus Jørgensen 1, Gary D. Bader 1,2, Ruedi Aebersold 3,5,6,7, Tony Pawson 1,2

and Rune Linding 8†

1 Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, Canada2 Department of Molecular Genetics, University of Toronto, Toronto, Canada3 Institute of Molecular Systems Biology, ETH, Zurich, Switzerland4 Institute of Molecular Biology, University of Zurich, 8057 Zurich, Switzerland5 Institute for Systems Biology, Seattle, WA 98103, USA6 Competence Center for Systems Physiology and Metabolic Diseases, ETH Zurich, 8093 Zurich, Switzerland7 Faculty of Science, University of Zurich, 8057 Zurich, Switzerland8 Cellular & Molecular Logic Team, Section of Cell and Molecular Biology, The Institute of Cancer Research (ICR), SW3 6JB, London,

UK∗ These authors contributed equally to this work† To whom correspondence should be addressed; E-mail: [email protected]

Page 3: Supplementary Materials for - Rune  · PDF file  Supplementary Materials for ... Ruedi Aebersold ... Total collection time was 24 h and the volume

Comparative Analysis Reveals Conserved Protein Phosphorylation Networks Implicated in Multiple Diseases 2

Supplementary Methods

Generation of peptide samples

D. melanogaster samples were generated as follows: Kc167 cells were grown in Schneiders Drosophilamedium (Invitrogen) supplemented with 10% fetal calf serum, 100 U penicillin (Invitrogen) and 100 g/ml strep-tomycin (Invitrogen, Auckland, New Zealand) in an incubator at 25◦C. To increase the number of mappedphosphorylation sites, different batches of cells were pooled. Cells were either: 1) grown in rich medium, 2)serum-starved, 3) treated for 30 min with 100 nM Rapamycin (LClabs, Woburn, MA, USA) in rich medium, 4)treated for 30 min with 100 nM insulin (serum starved), or 5) treated for 30 min with 100 nM Calyculin A (richmedium). Then the cells were washed with ice-cold phosphate-buffered saline and resuspended in ice-coldlysis buffer containing 10 mM HEPES, pH 7.9, 1.5 mM MgCl2, 10 mM KCl, 0.5 mM dithiothreitol and a pro-tease inhibitor mix (Roche, Basel, Switzerland). To preserve protein phosphorylation, several phosphataseinhibitors were added to a final concentration of 20 nM calyculin A, 200 nM okadaic acid, 4.8 µm cyperme-thrin (all bought from Merck KGaA, Darmstadt, Germany), 2 mM vanadate, 10 mM sodium pyrophosphate,10 mM NaF and 5 mM EDTA. After 10 min incubation on ice, cells were lysed by douncing. Cell debris andnuclei were removed by centrifugation for 10 min at 4◦C at 5500 g. Then the cytoplasmic and membranefraction were separated by ultracentrifugation at 100000 g for 60 min at 4◦C. The proteins of the cytosolicfraction (supernatant) were subjected to acetone precipitation. The protein pellets were resolubilized in 3 mMEDTA, 20 mM Tris-HCl, pH 8.3, and 8 M urea. The disulfide bonds of the proteins were reduced with tris(2-carboxyethyl) phosphine at a final concentration of 12.5 mM at 37◦C for 1 h. The produced free thiolswere alkylated with 40 mM iodoacetamide at room temperature for 1 h. The solution was diluted with 20 mMTris-HCl (pH 8.3) to a final concentration of 1.0 M urea and digested with sequencing-grade modified trypsin(Promega, Madison, WI) at 20 µg per mg of protein overnight at 37◦C. Peptides were desalted on a C18Sep-Pak cartridge (Waters, Milford, MA) and dried in a speedvac.

The S. cerevisiae peptide samples were generated as follows. Wild type (BY7092: can1::STE2prSp his5lyp1∆ his3∆ leu2∆ ura3∆ met15∆) were grown to OD ∼0.8 at 30◦C in synthetic dened (SD) medium (perliter 1.7 g YNB, 5 g ammonium sulfate, 2% glucose, 0.03 g isoleucine, 0.15 g valine, 0.04 g adenine, 0.02g arginine, 0.1 g leucine, 0.03 g lysine, 0.02 g methionine, 0.05 g phenylalanine, 0.2 g threonine, 0.02 ghistidine, 0.02 g tryptophane, 0.03 g tyrosine, 0.02 g uracil, 0.1 g glutamic acid and 0.1 g aspartic acid).Cells were harvested at 30◦C by centrifugation at 850 g and then washed once in SD medium. Finally, theywere collected by centrifugation and shock-frozen in liquid nitrogen. Pellets were thawed in ice-cold lysisbuffer (20 mM Tris/HCl pH 8.0, 100 mM KCl, 10 mM EDTA, 0.1% NP40, 20 nM calyculin A, 200 nM okadaicacid, 4.8 µM cypermethrin (all obtained from Merck KGaA, Darmstadt, Germany), 2 mM vanadate, 10 mMsodium pyrophosphate and 10 mM NaF) using 1 mL of lysis buffer per gram of yeast. The cells were lysed byglass-bead beating (using acid-washed glass beads), the protein supernatant was precipitated using ice-coldacetone, and the pellet was resuspended in 8 M urea, 20 mM Tris/HCl at pH 8.3. After dilution to < 1.5 Murea with 20 mM Tris/HCl at pH 8.3, proteins were digested using trypsin in a w/w ratio of 1:125 and puriedusing C18 reverse phase chromatography (Sep-Pack, Waters).

For C. elegans the wild-type strain N2 (Bristol) was grown on 9-cm nematode growth medium (NGM)agar plates seeded with a lawn of the E. coli strain OP50 or in 100-ml liquid cultures in S-basal buffer inbevelled flasks (with concentrated E. coli NA22 as a food source). Worms were harvested from plates orliquid culture and separated from the bacteria by washing with M9 buffer three times. The worms wereharvested by centrifugation at 500 g and shock frozen using liquid nitrogen. Subsequently, the worms werehomogenized with glass beads (diameter of 212-300 µm, Sigma-Aldrich, St Louis, MO, USA) in the ratio of1:1:2 (worms:beads:buffer) in a cell disrupter (FastPrep FP120, Thermo Savant, Qbiogene Inc., Carlsbad,CA, USA) at 4◦C three times for 45 s at level 6. The buffer used was 50 mM Tris/HCl, pH 8.3, 5 mM EDTA, 8M urea. After glass bead beating treatment, 0.125% SDS was added and the homogenate was incubated for1 h at room temperature to solubilize proteins. Cell debris was removed by centrifugation. The peptides wereproduced from the proteins in the supernatant as described above. For further details refer to (1–4).

Peptide separation by isoelectric focusing

All peptides were separated according to their isoelectric point. For the D. melanogaster this was performedusing an free-flow electrophoresis instrument, type prometheus from FFE Weber Inc. (now BD-Diagnostics,PAS) and FFE-Weber reagent basic kit (Prolyte 1, Prolyte 2, Prolyte 3 and Prolyte 4-7 and pI markers) (BD-Diagnostics, NJ, USA). The digested peptides were diluted in separation media containing 8 M Urea, 250 mMMannitol and 20% ProLyte solution at a concentration of 10 mg/ml. This sample was loaded continuously for1 h at 1 ml/h. Total collection time was 24 h and the volume of each collected fraction was about 25 to 50ml. A Thermo Orion needle tip micro pH electrode (Thermo Electron Corporation, Beverly, MA) was used tomeasure the pH value of each fraction. Peptides from the FFE fractions 18-60 were purified on a C18 Sep-Pak cartridge (Waters Corporation, Milford, MA, USA) (1). For C. elegans and S. cerevisiae the dried-down

Page 4: Supplementary Materials for - Rune  · PDF file  Supplementary Materials for ... Ruedi Aebersold ... Total collection time was 24 h and the volume

Comparative Analysis Reveals Conserved Protein Phosphorylation Networks Implicated in Multiple Diseases 3

peptide samples (15 mg and 20 mg, respectively) were seperated with an Offgel fractionator and thereforeresolubilized to a final concentration of 1 mg/ml in off-gel electrophoresis buffer containing 6.25% glycerol and1.25% IPG buffer (GE Healthcare). The peptides were separated on pH 3-10 IPG strips (GE Healthcare) witha 3100 OFFGEL fractionator (Agilent) as previously described (4; 5). We performed a 1-hour rehydration atmaximum 500 V, 50 mA, and 200 mW followed by the separation at maximum 8000V, 100 mA, and 300 mWuntil 50 kVh were reached. Following isoelectric focussing, the fractions were concentrated and cleaned upby C18 reversed-phase spin columns according to the manufacturers instructions (Sep-Pack, Waters).

Detailed description of phosphopeptide isolation

Phosphopeptides were isolated using a titanium dioxide resin as follows: 1-3 mg of dried peptides were recon-stituted in 280 µl of a washing solution (WS), containing 80% acetonitrile and 3.5% TFA, which is saturatedwith phtalic acid (∼100 mg phtalic acid per ml). Then 1.25 mg TiO2 (GL Science, Saitama, Japan) resin wasplaced into a 1-ml Mobicol spin column (MoBiTec, Gottingen, Germany) and was subsequently washed with280 µl water, 280 µl methanol, and finally was equilibrated with 280 µl WS for at least 10 minutes. Afterremoval of the WS by centrifugation using 500 x g, the peptide solution was added to the equilibrated TiO2

in the blocked Mobicol spin column and was incubated for > 30 min with end-over-end rotation. After thisstep, the peptide solution was removed by centrifugation, and the resin was thoroughly washed two timeseach with 280 µl of the WS, with a 80% acetonitrile, 0.1% TFA solution, and finally with 0.1% TFA. In the finalstep, phosphopeptides were eluted from the TiO2 resin using two times 150 µl of a 0.3 M NH4OH solution (pH∼10.5). After elution, the pH of the pooled eluents was rapidly adjusted to 2.7 with 10% TFA, and the phos-phopeptides were purified with an appropriate reverse-phase column suitable for up to 20 µg peptide. Besidesthe separated peptides, this procedure was also performed on yeast and worm whole-cell or whole-organismlysates.

Alternatively, phosphopeptides were also isolated with immobilization by metal affinity chromatography(IMAC). In detail, 1-3 mg of peptides were reconstituted in 280 µl of a WS, consisting of 250 mM acetic acidwith 30% acetonitrile at pH 2.7. Then 60 µl of uniformly suspended PHOS-Select iron affinity gel (SigmaAldrich), corresponding to ∼30 µl resin, was placed into a 1-ml Mobicol spin column. The resin was equili-brated three times with 280 µl of the WS. After removal of the WS by centrifugation at 500 x g, the peptidesolution was added to the equilibrated IMAC resin in the blocked Mobicol spin column. To obtain reproducibleresults, it is crucial that the pH in all replicate samples is maintained at ∼2.5. The affinity gel was then in-cubated with the peptide solution for 120 min with end-over-end rotation. After the incubation, the liquid wasremoved by centrifugation and the resin was thoroughly washed two times with 280 µl of the WS, and oncewith ultra pure water. In the final step, phosphopeptides were eluted once with 150 µl of a 50 mM phosphatebuffer (pH 8.9) and once with 150 µl of a 100 mM phosphate buffer (pH 8.9), each time incubating the resin< 3 min with the elution buffer. Both elutes were pooled, the pH were rapidly adjusted to 2.7 using 10%TFA, and the phosphopeptides were purified with an appropriate reverse-phase column. This procedure wasperformed on separated fly and worm peptide samples and yeast whole-cell lysates.

Finally, phosphoramidate chemistry (PAC) was used in the case of D. melanogaster for phosphopeptideisolation on the peptide samples after isoelectric focusing (1; 2). Phosphopeptides were isolated with phos-phoramidate chemistry as follows: 1 mg of dried peptide was reconstituted in 750 µl of methanolic HCl, whichwas prepared by slowly adding 120 µl of acetyl chloride to 750 µl of anhydrous methanol. The methyl esteri-fication was then allowed to proceed at 12◦C for 120 min. The solvent was quickly removed in a cool vacuumconcentrator and peptide methyl esters were dissolved in 40 µl methanol, 40 µl water, and 40 µl acetonitrile.Then 500 µl of a solution containing 50 mM N-(3-Dimethylaminopropyl)-N’-ethylcarbodiimide (EDC), 100 mMimidazole pH 5.6, 100 mM 2-(N-Morpholino)ethanesulfonic acid (MES) pH 5.6, and 2 M cystamine was addedto the peptide solution. The reaction was allowed to proceed at room temperature with vigorous shaking for 8hours. The solution was then loaded onto an appropriate reverse-phase column and the derivatized peptideswere subsequently: First, washed with 0.1% TFA; second, treated with 10 mM TCEP (pH should be adjustedto ∼3 using sodium hydroxide (NaOH)) for 8 minutes, in order to produce free thiol groups; third, washedagain with 0.1% TFA to remove residual TCEP. Finally, the derivatized peptides were eluted with 80% acetoni-trile, 0.1% TFA and the pH was adjusted to 6.0 with phosphate buffer. Then acetonitrile was partially removedin the vacuum concentrator to yield a final concentration of ∼30%, and the derivatized phosphopeptides wereincubated with 5 mg maleimide functionalized-glass beads for 1 h at pH 6.2 in a Mobicol column. (The beadswere synthesized by dissolving 120 µmol hydroxybenzotriazole, 120 µmol of 3-maleimidopropionic acid, and120 µmol diisopropylcarbodiimide in 1 ml of dry dimethylformamide, completely. After 30 minutes of incu-bation, 100 mg CPG beads (Proligo Biochemie, Hamburg, Germany) corresponding to 40 µmol free aminogroups were added for 90 minutes. After the reaction, beads were washed with dimethylformamide and driedwith a vacuum concentrator. Beads were stored dry at 4◦C.) The derivatized beads were washed two timessequentially with 300 µl 3 M NaCl, water, methanol, and, finally, with 80% acetonitrile to remove nonspecif-ically bound peptides. In the last step, the beads were incubated with 5% TFA, 30% acetonitrile for 1 h to

Page 5: Supplementary Materials for - Rune  · PDF file  Supplementary Materials for ... Ruedi Aebersold ... Total collection time was 24 h and the volume

Comparative Analysis Reveals Conserved Protein Phosphorylation Networks Implicated in Multiple Diseases 4

recover the phosphopeptides. The recovered sample was dried in the vacuum concentrator. This procedurewas also performed on yeast whole-cell lysates. For further details refer to (1–4).

Mass spectrometry data analysis and sampling depths

The liquid chromatography–tandem mass spectrometry (LC-MS, on a Thermo Fisher Scientic LTQ ORBITRAP

XL) analysis and database searches were performed as described in (1). The S. cerevisiae and C. elegansMS spectra were searched against the SGD (release October 10th 2007) and WormBase (release WS183)databases and the D. melanogaster data were searched against the FlyBase database v4.3. In addition to thedata stored in the PhoshoPep database, we added in the case of S. cerevisiae electron transfer dissociation(ETD) fragmentation data from (6), although these data only constitute 19% of the total dataset. In this study,the D. melanogaster phosphopeptide isolates were most extensively analyzed in terms of the total number ofLC-MS/MS runs employed and consequently larger coverage was achieved than for the other target species.Finally, the expected sizes of the phosphoproteomes of yeast, worm, and fly strongly differ, simply due todifferences in their genome sizes and repertoires of kinases and phosphatases.

Page 6: Supplementary Materials for - Rune  · PDF file  Supplementary Materials for ... Ruedi Aebersold ... Total collection time was 24 h and the volume

Comparative Analysis Reveals Conserved Protein Phosphorylation Networks Implicated in Multiple Diseases 5

Supplementary Figures

Human Combined

21 sites

0.0

1.0

2.0

3.0

4.0

bits

-5

YVLKASR

VMHGEDC

QYK

VSLHGDAKR

TNARLHDSG

Y

V

T

Q

P

N

IG

F

A

LH

0

TS

S

NID

C

AVQPM

TKGVFEDS

LSD

YVKGFEDTP

5

RLIFANDSG

Similar Known Motifs

PDK1

-5 0 50.0

1.5

3.0

bits

QRSAEG

YVNKFTEASD

NSDAKLPRT

RMKGFQSEAT

QNMHDAKTFS

STPMLSYF

GFEDASTVC

ILNPFTVAG

IHVSPT

KINGEATP

CK2_group

bits

-5 0 50.0

1.5

3.0

EEESDE

EDT

SGSED

SED

SDE

SDE

DE

RCK_group

-5 0 50.0

2.0

4.0

bits

RCKPR

QTSLRCP

ASTYRT

SRGTSAP

DRST

CHK2

-5 0 50.0

2.0

4.0

bits

YWVFMIL

KR

AMSRQT

LNT

SMLFYI

HWCYRTS

JNK_group

-5 0 50.0

1.5

3.0

bits

SDE

SALP

TAPSLT

SQAPEP

EAPTG

AS

CDK1

CDK5

mTOR

CDK2,3 Group

GSK3 Group

JNK Group

P38 Group

MAPK1,3,7/NLK Group

RCK Group

CHK2

CLK Group

CaMKIIα/CaMKIIδ Group

CK2 Group

PDK1

1

2

3

4

5

6

7

8

9

10

0.0

1.0

2.0

3.0

4.0

bits

-5

G

KSAPLVR

Q

L

VTS

PF

GDIM

GEVTIRP

SLIATGI

VTGFCSPA

0

TS

RYGLFH

Q

H

E

DLARN

D

WRKTLEP

Y

S

LEHGIRK

5

L

C

SR

QPNMTKE

0.0

1.0

2.0

3.0

4.0

bits

-5

P

GSVAL

W

T

S

NGDQELR

V

Y

Q

GFANLKR

I

V

Q

F

TSPDRL

V

N

H

A

RKSP

0

TS

V

T

IRLDEP

T

M

I

H

E

D

C

S

G

VALP

T

M

I

D

RNSP

V

G

W

S

P

N

ITRQL

5

E

NFSDARQK

0.0

1.0

2.0

3.0

4.0

bits

-5

QNIG

TKIGDAE

FTVR

LKITR

VSRNKFAH

0

STPLDSF

VTIFC

MRG

SKT

5

QIHDP

0.0

1.0

2.0

3.0

4.0bits

-5

R

Q

P

M

I

H

E

YKDALV

QLKIEDVCR

QGR

SPKHEDYQR

EDTASYL

0

SSTEAYDL

IG

D

VPNFS

RNHALGTE

V

T

I

H

F

D

Y

S

L

E

AWR

5

YVTIFEDLS

0.0

1.0

2.0

3.0

4.0

bits

-5

S

N

M

I

L

K

H

P

E

A

R

Q

G

A

M

E

D

PSKR

T

A

S

P

K

G

VQEFR

Y

K

I

F

L

ARPS

S

H

T

L

I

G

AVPR

0

TS

Y

R

N

L

I

S

K

EVDAP

N

K

V

T

R

L

E

A

GSP

S

G

E

T

R

M

Y

L

VDAP

L

A

VPIGTERS

5

L

G

S

A

WYVRP

0.0

1.0

2.0

3.0

4.0

bits

-5

E

S

G

VNKTPA

L

K

A

G

IPNVS

L

I

E

D

N

C

P

F

VKGR

V

S

I

T

M

E

H

RPA

A

N

L

H

YPFMG

0

TS

M

L

K

Y

T

G

D

A

SQHP

V

F

A

Y

R

T

S

G

Q

L

E

Y

L

S

F

IPV

Q

F

D

YRESGV

5

Y

K

I

Q

R

PE

LST

0.0

1.0

2.0

3.0

4.0

bits

-5

R

P

A

LIVSKH

NMLASGR

DREAKH

CNGQS

V

T

Q

N

H

F

E

PKRMI

0

SS

IE

A

LVKDR

V

N

WKGDSRP

Y

V

T

S

R

M

KE

A

NG

VR

KENILST

5

K

IF

E

D

QPVSA

0.0

1.0

2.0

3.0

4.0

bits

-5

G

S

D

A

L

E

S

V

Q

A

K

P

DE

K

I

Y

S

D

EL

R

L

V

G

DFSE

V

Q

F

K

S

LGED

0

T

ST

SGPED

P

G

NAKESD

Y

I

S

A

RKDE

V

G

T

P

I

M

N

A

K

L

DE

5

R

M

Y

AKGDE

0.0

1.0

2.0

3.0

4.0

bits

-5

T

M

L

K

G

D

IRPS

W

C

T

P

K

G

AS

Q

P

M

L

D

W

V

T

S

K

IE

AR

R

K

E

S

I

V

R

H

D

G

S

E

I

0

T

SI

V

L

F

E

D

ASP

Q

L

A

T

H

S

G

VDIE

S

L

I

A

P

D

NGRE

R

P

L

I

V

Q

K

F

SD

5

S

L

F

A

NKVED

0.0

1.0

2.0

3.0

4.0

bits

-5

V

Q

I

F

KAYLSRL

H

E

C

QYK

V

G

E

D

LASHKR

N

M

I

A

T

LHGDSR

Y

S

P

N

I

G

V

L

H

0

T

SY

T

N

L

I

C

A

DVSPQM

K

I

G

A

DVTFES

TRPSLD

Y

R

M

K

G

F

VE

SDTP

5

V

P

L

I

R

F

E

A

DNGS

Pro

lin

e-b

ase

dA

rgin

ine

-ba

sed

Aci

dic

-ba

sed

0.0

1.0

2.0

3.0

4.0

bits

-5

PGVAL

WTQNGDLER

YVGFANLRK

V

Q

P

H

F

STRLD

V

T

ID

KHASRP

0

TS

VRLDEP

VMHGCSALP

V

T

E

RMDSNP

WVTSPNIRQL

5

NESFDRQK

19 sites

0.0

1.0

2.0

3.0

4.0

bits

-5

VASL

SNGDQELR

YNGFA

QLKR

VS

QIFDAPRL

YVMLRNSPK

0

TS

TRLIDEP

TIGEDLAVP

TLFANISRP

W

S

PNIG

RQTL

5

SNDRQKA

17 sites

0.0

1.0

2.0

3.0

4.0

bits

-5

YRQLKIHDAV

VQIECR

GR

YSQEDR

TSEAYL

0

SYTSEAD

VPNLIGFS

TRLGAE

YVSRLWEA

5

YIEDLS

11 sites

0.0

1.0

2.0

3.0

4.0

bits

-5

YPMKEDAVL

VLKDCR

QGR

YQPKHR

TDASYL

0

STEAYD

VPDNFS

NLHGTE

YTSLIHFDWR

5

VTFLDS

14 sites

0.0

1.0

2.0

3.0

4.0

bits

-5

S

M

I

YPNLHEKAR

V

Q

M

C

DPSRK

PGEVSQFR

Y

I

F

D

RLKAPS

T

S

H

F

IGVAPR

0

T

SYSRNLEDVAP

V

T

N

L

E

RGSP

M

YVRLDAP

P

L

IGTERS

5

T

M

L

G

A

WVYRP

30 sites

0.0

1.0

2.0

3.0

4.0

bits

-5

S

M

L

K

I

H

A

PER

W

Q

P

N

D

M

G

A

EKSR

V

T

Q

P

G

A

KEFR

Y

V

N

L

I

F

E

ARPS

S

N

I

H

T

G

LAVRP

0

TS

S

IVKEDAP

R

N

I

V

T

L

K

E

GASP

Y

S

M

G

E

V

T

DAP

L

K

H

T

IG

E

A

PVRS

5

L

K

G

WASYVRP

38 sites

0.0

1.0

2.0

3.0

4.0

bits

-5

PLVSKAR

TSPIVMGFD

VTIGERP

SLTG

TGFCVSPA

0

TS

RYLGHF

HELANR

WPDRLKET

SLIGEHRK

5

SRQNMCTKE

14 sites

0.0

1.0

2.0

3.0

4.0

bits

-5

SGARPLV

TSQPLGDMI

GEVTRPI

IATG

TSPIGFCA

0

TS

YRGFLH

QLDARN

WRKTLEP

YLHERGIK

5

TSR

QNMLPEK

16 sites

0.0

1.0

2.0

3.0

4.0

bits

-5

QIFYLSR

T

S

R

Q

N

H

E

C

YLK

LEASHKR

TMLIHGSDR

Y

R

P

N

L

I

G

E

C

VSH

0

SYTPLVDSQM

VIFATES

TRPLD

VRMEDTSP

5

VRPFDANGES

17 sites

0.0

1.0

2.0

3.0

4.0

bits

-5

VNG

TKGDE

FTRV

KTLR

NLHFRSK

0

STPLDSF

TIVFC

VRG

KST

5

KIDP

15 sites

0.0

1.0

2.0

3.0

4.0

bits

-5

QNIG

TKIGDAE

FTVR

LKITR

VSRNKFAH

0

STPLDSF

VTIFC

MRG

SKT

5

QIHDP

9 sites

0.0

1.0

2.0

3.0

4.0

bits

-5

M

VE

SNKGPTA

L

F

E

D

KGPNIVS

S

N

L

I

D

A

C

VPKFGR

V

S

Q

L

I

THERMPA

S

R

L

D

A

NH

YPFGM

0

TS

YS

Q

LDTGHP

V

T

R

L

F

Y

Q

G

SE

Y

R

L

F

PIV

L

K

F

C

A

YRESGV

5

Y

K

G

RQPILEST

32 sites

0.0

1.0

2.0

3.0

4.0

bits

-5

S

G

TNKVAP

L

G

VIAPNS

Q

P

L

I

F

D

C

VNEKGR

V

T

E

S

R

IH

PA

Q

N

K

E

A

LHYPFMG

0

TS

Y

M

K

D

SQAHP

R

G

T

Q

A

LE

Y

T

Q

M

L

K

H

G

C

IF

D

SPV

F

E

YQDRSGV

5

Y

V

Q

K

H

F

E

D

RPLST

31 sites

0.0

1.0

2.0

3.0

4.0

bits

-5

VSPILHK

MGASR

EARKH

NCGQS

VTRPNMHEI

0

SLIDAVKR

WKGDSPR

YMKENG

VRNITSL

5

QPFEVSA

11 sites

0.0

1.0

2.0

3.0

4.0

bits

-5

RIAVSH

SNLGR

EDAKH

QGCNS

QPIFRMK

0

SVSLKEDR

WVSRNKGDP

VTSRANG

NKIEST

5

VQPKIDSA

10 sites

0.0

1.0

2.0

3.0

4.0

bits

-5

T

R

I

G

N

S

D

A

LE

Q

G

S

K

V

P

A

DE

K

Y

R

I

G

S

D

LE

T

R

A

L

V

GDFSE

V

K

F

SLGED

0

T

ST

GSPED

P

KGNAESD

Y

V

N

A

SRKDE

T

S

R

M

I

V

P

N

L

K

G

ADE

5

T

R

M

L

Y

K

AGDE

65 sites

0.0

1.0

2.0

3.0

4.0

bits

-5

Y

V

R

L

K

S

G

D

A

E

I

Q

A

KPED

I

K

F

Y

E

S

DL

W

V

R

G

DFSE

Y

V

S

K

I

H

Q

LGED

0

T

ST

S

L

GEPDI

G

NEAKSD

A

IRKDE

Q

P

T

I

N

M

K

LDE

5

R

P

M

Y

AKGED

72 sites

0.0

1.0

2.0

3.0

4.0

bits

-5

T

Q

M

K

I

E

GRPLDS

W

R

Q

L

K

I

D

C

PGTAS

WVPMLKIDTAR

W

V

R

Q

H

G

Y

T

S

F

KIE

W

N

LKFDGESI

0

SQ

L

SFEDVAP

Y

T

P

K

H

A

V

Q

IG

D

SE

V

T

S

Q

M

I

F

NDRGE

T

R

L

I

E

Q

PKVFDS

5

T

S

Q

N

I

A

LKVDE

24 sites

0.0

1.0

2.0

3.0

4.0

bits

-5

W

V

T

G

D

A

MKRPIS

W

P

G

E

C

KAS

WV

Q

KISEAR

M

H

A

SRI

P

VRHIE

0

TS

T

M

H

G

F

E

D

LIASP

S

A

TLHGVDIE

S

I

H

D

LGARPNE

R

Q

L

K

I

A

S

D

5

Y

S

P

KFANEVD

27 sites

Target

Figure S1: Clustering analysis of core phosphorylation sites. For every pair of aligned phosphorylated residues, a consensussequence of the local alignment from -5 to 5 of the aligned phosphorylated residues is first defined. For example, ..RK.SP..D. isthe consensus pattern of GTRKGpSPLKDE aligned to NERKVpSPDEDM. Next, a consensus pattern S encoded as a vector set V =(v−5,v−4, ..., v4, v5) is defined, where vector vi is a vector of the 20 element coding for number of specific amino acids appearing atposition i among the consensus sequences. The similarity between vector set Vx and Vy is computed as the sum of cosine similaritiesof all corresponding vectors across the two sets, as follows: First, a vector set is encoded for every consensus sequence. Secondly, thesimilarity between pairs of vector sets are computed, and the most similar pair is then merged into a new vector set by summing up thecorresponding vectors across the two old sets. The previous step is iteratively performed, and if the two most similar vector sets at eachiteration encodes 10 or more core sites, they are distilled out from the computational analysis. Lastly, core sites in human and targetspecies represented by output vector sets are visualized separately with sequence logo, manually, inspected and grouped as shown. Anykinase consensus motif in NetPhorest (termed kinase group) (7) that resembles a motif discovered above are listed to the right and thecorresponding selected motif logos from the NetPhorest atlas is shown. For example, number 8 corresponds to a conserved segment inthe activation loop of some serine/threonine kinases and resembles the consensus motif of the ”master” kinase regulator PDK1. As themotifs above are derived from unsupervised clustering, some of them may not correspond to kinase consensus motif. In addition, somephosphorylation motifs could be missed due to weak signal from limited motif instances.

Page 7: Supplementary Materials for - Rune  · PDF file  Supplementary Materials for ... Ruedi Aebersold ... Total collection time was 24 h and the volume

Comparative Analysis Reveals Conserved Protein Phosphorylation Networks Implicated in Multiple Diseases 6

0%

10%

20%

30%

40%

50%

60%

70%

FLY WORM YEAST

Percentage of phosphorylatable residues (S, T, Y) in humanaligned to phosphorylatable residues in target species

Disordered Ordered

Figure S2: Phosphorylatable residues in disordered regions are fast e volving. Intrinsic disordered regions in proteins are generallyrapidly evolving and hard to align. However, human phosphorylatable residues (serine, theronine, and tyrosine) in disordered regions thatare well-aligned to target species (defined as those with at least 50% identity in -5 to +5 of the central residue) are less conserved thanphosphorylatable residues in ordered regions at the same identity threshold. Here, we deemed a phosphorylatable residue in humanto be conserved in target species if it is aligned to another phosphorylatable residue. The statistics are computed only from humanorthologs in target species that have one-to-one orthologous relationship to avoid relaxation of conservation caused by duplicate genes.

01

23

45

Amino Acid

Ra

tio

of site

s r

esid

ing

in

ord

er/

dis

ord

er

pS pT pY

Observed vs random expectation comparison

of non-core sites in order vs disorderA

S T Y

01

23

45

Amino Acid

Ra

tio

of site

s r

esid

ing

in

ord

er/

dis

ord

er

Core pS

Core pT

Core pY

Observed vs random expectation comparison

of core sites in order vs disorderB

Figure S3: Phosphorylation site disorder analysis. A, Known human phosphorylation sites occur more often in disordered regionsthan would be expected by random chance. The boxplots (in black) display the distributions of in-order/in-disorder ratios for 100 randomlypicked sets of serines, threonines, and tyrosines from human phosphoproteins. Phosphosphorylated serines, threonines, and tyrosinesmight fall by random chance into these sets as well. The size of each set equals that of the respective observed number of phosphorylatedresidues. The colored bars denote the observed in-order/in-disorder ratio of phosphorylated residues. B, Core sites occur more oftenin ordered or structured regions than do other phosphorylation sites. To assess the significance of this observation, we sampled thedistribution of in-order/in-disorder ratio with sets of randomly picked phosphorylation sites from core site proteins. The distributions ofthe random sampling are shown as boxplots (in black) and the colored bars denote the in-order/in-disorder ratio of core sites. Separateanalyses were performed for phosphoserine, -threonine, and -tyrosine.

Page 8: Supplementary Materials for - Rune  · PDF file  Supplementary Materials for ... Ruedi Aebersold ... Total collection time was 24 h and the volume

Comparative Analysis Reveals Conserved Protein Phosphorylation Networks Implicated in Multiple Diseases 7

1

3 1

2 2

1

3 1

1 0

2

0

A B

C D

A' B'

C' D'

E E'

1

3 1

2 1

2

0 1

2 1

0

2

A B

C D

A'' B''

C'' D''

E E''

Min (2, 3)

Min (1, 2)

Min (2, 0)

Min (1, 1)

Local phosphorylation

network of protein E

(Human)

Local phosphorylation

network of protein E

(Fly)

Local phosphorylation

network of protein E

(Human)

Local phosphorylation

network of protein E

(Yeast)

Human-Fly conserved

phosphorylation of protein E

Human-Yeast conserved

phosphorylation of protein E

2

3 1

2

Sum (1, 2)

Combined conserved

network of protein E

Total conserved phosphorlation

propensity k of protein E = 8

Figure S4: Schematic diagram of how the conserved phosphorylation pro pensity k of each human substrate is computed.Inferred conserved human kinase-substrate relationships are quantified based on the number of sites predicted to be phosphorylatedby an orthologous kinase on an orthologous substrate in human and target species. The conserved phosphorylation propensity of asubstrate is computed from its conserved human kinase-substrate relationships as k which is the weighted sum of all its incoming edges.Kinases are represented as red-colored nodes. For simplicity, the illustration is shown for two out of the three target species used. Thenumbers located at the end of each arrow indicate the number of phosphorylation sites predicted [by NetworKIN (8; 9) using NetPhorest(7) as motif engine] to be targeted by the corresponding kinase.

40

06

00

80

01

00

01

20

0

Nu

mb

er

of h

um

an

kin

ase

-su

bstr

ate

re

latio

nsh

ips

ob

se

rve

d c

on

se

rve

d in

ta

rge

t sp

ecie

s

Figure S5: Boxplot of the number human-substrate relations observed t hat were conserved in target species from randomizedtrials. Red bar denotes the actual number of observed conserved human kinase-substrate relationships.

Page 9: Supplementary Materials for - Rune  · PDF file  Supplementary Materials for ... Ruedi Aebersold ... Total collection time was 24 h and the volume

Comparative Analysis Reveals Conserved Protein Phosphorylation Networks Implicated in Multiple Diseases 8

A

CDK7 (1UA2)

B

Figure S6: Core sites observed in activation loops of protein kinases. A, A multiple sequence alignment of activation loop regionsin protein kinases with core sites marked in red. Gene symbols, kinase names, and the sequence coordinates for each activation loop isshown on the left. B, The corresponding region from panel A is highlighted (yellow) on the protein structure of CDK7.

Page 10: Supplementary Materials for - Rune  · PDF file  Supplementary Materials for ... Ruedi Aebersold ... Total collection time was 24 h and the volume

Comparative Analysis Reveals Conserved Protein Phosphorylation Networks Implicated in Multiple Diseases 9

Supplementary Tables

Table S1: Statistical significance of core sites observed between hum an and yeast . The number of core sites expected by randomchance is computed from the observed sequence alignments. The statistical significance is assessed with χ

2 test in R statistical software.

Residue Expected Observed Odds Ratio p-value

pS 9.54 46 4.82 1.0e-07pT 0.73 2 2.74 -pY 1.48 5 3.38 -

Table S2: Statistical significance of core sites observed between hum an and fly . The number of core sites expected by randomchance is computed from the observed sequence alignments. The statistical significance is assessed with χ

2 test in R statistical software.

Residue Expected Observed Odds Ratio p-value

pS 55.53 325 5.85 2.2e-16pT 6.43 64 9.95 6.9e-12pY 11.23 33 2.94 1.1e-3

Table S3: Statistical significance of core sites observed between hum an and worm . The number of core sites expected by randomchance is computed from the observed sequence alignments. The statistical significance is assessed with χ

2 test in R statistical software.

Residue Expected Observed Odds Ratio p-value

pS 14.9 116 7.79 2.2e-16pT 0.89 7 7.87 -pY 2.99 13 4.35 1.2e-2

Table S4: Core sites identified in components of beta-catenin destruc tion complex.

Gene Symbol (HGNC) Ensembl Protein ID Residue Position Target Species Source TypeGSK3B ENSP00000264235 Y 216 WORM FLY YEAST HTP LTPGSK3B ENSP00000264235 S 9 FLY HTP LTPAPC ENSP00000257430 S 1279 FLY HTP LTPAXIN1 ENSP00000262320 T 79 FLY HTPCTNNB1 ENSP00000344456 S 675 FLY HTP LTP

Table S5: Core sites identified in components of the clathrin coat of co ated pits. The LTCL1 subunit of clarthin coat does notcontain any core sites according to our approach.

Gene Symbol (HGNC) Ensembl Protein ID Residue Position Target Species Source TypeCLTA ENSP00000242285 Y 94 WORM HTPCLTC ENSP00000269122 Y 634 FLY HTPAP2M1 ENSP00000292807 T 156 FLY HTP LTPCLTB ENSP00000309415 Y 87 WORM HTP

Page 11: Supplementary Materials for - Rune  · PDF file  Supplementary Materials for ... Ruedi Aebersold ... Total collection time was 24 h and the volume

Comparative Analysis Reveals Conserved Protein Phosphorylation Networks Implicated in Multiple Diseases 10

Table S6: Core sites identified in 22 cancer-associated genes. Phosphorylation sites annotated with HTP and not LTP evidence areconsidered recently dicovered in this study.

Gene Symbol(HGNC) Ensembl Protein ID Residue Position Target species Source Type

APC ENSP00000257430 S 1279 FLY LTPCCDC6 ENSP00000263102 S 367 FLY HTPCCDC6 ENSP00000263102 S 240 FLY HTPCCDC6 ENSP00000263102 S 323 FLY HTPCDK2 ENSP00000266970 T 160 FLY HTP LTPCDK2 ENSP00000266970 Y 15 YEAST HTP LTPCLTC ENSP00000269122 Y 634 FLY HTPCREB1 ENSP00000236996 S 117 WORM LTPCTNNB1 ENSP00000344456 S 675 FLY HTP LTPDEK ENSP00000244776 S 230 FLY HTPDEK ENSP00000244776 S 231 FLY HTPDEK ENSP00000244776 S 306 FLY HTPDEK ENSP00000244776 S 232 FLY HTPFIP1L1 ENSP00000351383 S 85 FLY HTPFOXO1 ENSP00000368880 S 319 WORM LTPFOXO3 ENSP00000339527 S 315 WORM LTPHSP90AA2 ENSP00000216281 S 231 FLY HTP LTPHSP90AA2 ENSP00000216281 Y 627 FLY HTPHSP90AB1 ENSP00000360709 Y 619 FLY HTPHSP90AB1 ENSP00000360709 S 226 FLY HTP LTPMAP2K4 ENSP00000262445 S 257 FLY HTP LTPMLLT4 ENSP00000345834 S 1090 FLY HTPFOX04 ENSP00000363377 S 262 WORM LTPMSH6 ENSP00000234420 S 14 FLY HTPRPS10 ENSP00000347271 T 101 YEAST HTPSEPT6 ENSP00000341524 T 418 FLY HTPSMAD2 ENSP00000262160 T 8 FLY HTP LTPSUZ12 ENSP00000316578 S 583 FLY HTPTCF12 ENSP00000267811 S 559 FLY HTPVCP ENSP00000367954 S 748 YEAST LTP

Table S7: Correlation of cancer-associated genes with conserved pho sphorylation propensity k computed for individual targetspecies.

Fly + Worm + Yeast Fly Worm Yeast

k Cancer Total % Cancer Total % Cancer Total % Cancer Total %Genes Genes Genes Genes Genes Genes Genes Genes

≥ 1 47 759 6.2 44 631 7.0 3 96 3.1 6 215 2.8≥ 2 25 450 5.6 21 344 6.1 1 49 2.0 3 101 3.0≥ 3 15 271 5.5 15 209 7.2 0 17 0.0 1 41 2.4≥ 4 12 192 6.2 12 138 8.7 0 14 0.0 1 25 3.6≥ 5 11 143 7.7 10 93 10.8 0 6 0.0 1 15 6.7≥ 6 10 103 9.7 8 70 11.4 0 3 0.0 1 9 11.1≥ 7 8 73 11.0 7 47 14.9 0 3 0.0 0 7 0.0≥ 8 4 54 7.4 4 31 12.9 0 3 0.0 0 5 0.0

Page 12: Supplementary Materials for - Rune  · PDF file  Supplementary Materials for ... Ruedi Aebersold ... Total collection time was 24 h and the volume

Comparative Analysis Reveals Conserved Protein Phosphorylation Networks Implicated in Multiple Diseases 11

Table S8: Correlation of OMIM genes with conserved phosphorylation p ropensity k computed for different target species.

Fly + Worm + Yeast Fly Worm Yeast

k Cancer Total % Cancer Total % Cancer Total % Cancer Total %Genes Genes Genes Genes Genes Genes Genes Genes

≥ 1 99 759 13.0 85 631 13.5 8 96 8.3 27 215 12.6≥ 2 55 450 12.2 42 344 12.2 5 49 10.2 14 101 13.9≥ 3 34 271 12.5 26 209 12.4 1 17 5.9 7 41 17.1≥ 4 24 192 12.5 20 138 14.5 1 14 7.1 4 25 16.0≥ 5 23 143 16.1 19 93 20.4 0 6 0.0 3 15 20.0≥ 6 18 103 17.5 16 70 22.9 0 3 0.0 3 9 33.3≥ 7 15 73 20.5 14 47 29.8 0 3 0.0 2 7 28.6≥ 8 13 54 24.1 10 31 32.3 0 3 0.0 1 5 20.0≥ 9 11 41 26.8 5 20 25.0 0 2 0.0 0 4 0.0≥ 10 10 35 28.6 4 16 25.0 0 2 0.0 0 4 0.0≥ 11 8 27 29.6 4 11 36.4 0 2 0.0 0 1 0.0≥ 12 7 24 29.2 4 11 36.4 0 1 0.0 0 1 0.0≥ 13 7 22 31.8 4 9 44.4 0 1 0.0 0 0 0.0≥ 14 7 20 35.0 4 6 66.7 0 1 0.0 0 0 0.0≥ 15 4 16 25.0 3 5 60.0 0 1 0.0 0 0 0.0

Page 13: Supplementary Materials for - Rune  · PDF file  Supplementary Materials for ... Ruedi Aebersold ... Total collection time was 24 h and the volume

Comparative Analysis Reveals Conserved Protein Phosphorylation Networks Implicated in Multiple Diseases 12

Supplementary References

[1] B. Bodenmiller, J. Malmstrom, B. Gerrits, D. Campbell, H. Lam, A. Schmidt, O. Rinner, L. N. Mueller, P. T.Shannon, P. G. Pedrioli, C. Panse, H.-K. Lee, R. Schlapbach, R. Aebersold, Phosphopep–a phosphopro-teome resource for systems biology research in drosophila kc167 cells. Mol Syst Biol 3, 139 (2007).

[2] B. Bodenmiller, L. N. Mueller, M. Mueller, B. Domon, R. Aebersold, Reproducible isolation of distinct,overlapping segments of the phosphoproteome. Nat Methods 4, 231–237 (2007).

[3] B. Bodenmiller, L. N. Mueller, P. G. A. Pedrioli, D. Pflieger, M. A. Jnger, J. K. Eng, R. Aebersold, W. A.Tao, An integrated chemical, mass spectrometric and computational strategy for (quantitative) phospho-proteomics: application to drosophila melanogaster kc167 cells. Mol Biosyst 3, 275–286 (2007).

[4] B. Bodenmiller, D. Campbell, B. Gerrits, H. Lam, M. Jovanovic, P. Picotti, R. Schlapbach, R. Aebersold,Phosphopep–a database of protein phosphorylation sites in model organisms. Nat Biotechnol 26, 1339–1340 (2008).

[5] M. Heller, M. Ye, P. E. Michel, P. Morier, D. Stalder, M. A. Junger, R. Aebersold, F. Reymond, J. S. Rossier,Added value for tandem mass spectrometry shotgun proteomics data validation through isoelectric focus-ing of peptides. J Proteome Res 4, 2273–2282 (2005).

[6] A. Chi, C. Huttenhower, L. Y. Geer, J. J. Coon, J. E. P. Syka, D. L. Bai, J. Shabanowitz, D. J. Burke, O. G.Troyanskaya, D. F. Hunt, Analysis of phosphorylation sites on proteins from saccharomyces cerevisiaeby electron transfer dissociation (ETD) mass spectrometry. Proc Natl Acad Sci U S A 104, 2193–2198(2007).

[7] M. L. Miller, L. J. Jensen, F. Diella, C. Jørgensen, M. Tinti, L. Li, M. Hsiung, S. A. Parker, J. Bordeaux,T. Sicheritz-Ponten, M. Olhovsky, A. Pasculescu, J. Alexander, S. Knapp, N. Blom, P. Bork, S. Li, G. Ce-sareni, T. Pawson, B. E. Turk, M. B. Yaffe, S. Brunak, R. Linding, Linear motif atlas for phosphorylation-dependent signaling. Sci Signal 1, ra2 (2008).

[8] R. Linding, L. J. Jensen, G. J. Ostheimer, M. A. van Vugt, C. Jørgensen, I. M. Miron, F. Diella, K. Colwill,L. Taylor, K. Elder, P. Metalnikov, V. Nguyen, A. Pasculescu, J. Jin, J. G. Park, L. D. Samson, J. R.Woodgett, R. B. Russell, P. Bork, M. B. Yaffe, T. Pawson, Systematic discovery of in vivo phosphorylationnetworks. Cell 129, 1415–1426 (2007).

[9] R. Linding, J. L. J., A. Pasculescu, M. Olhovsky, K. Colwill, P. Bork, M. B. Yaffe, T. Pawson, NetworKIN: aresource for exploring cellular phosphorylation networks. Nucleic Acids Res. 36, D695–D699 (2008).

Page 14: Supplementary Materials for - Rune  · PDF file  Supplementary Materials for ... Ruedi Aebersold ... Total collection time was 24 h and the volume

Supplementary Data File Descriptions The supplementary data files are provided as a 17 Mb Zip archive and when unzipped, the data are organized into 8 subdirectories as described below. Yeast/ YeastSeq.fa: Yeast sequences used. YeastSite.csv: Experimentally detected yeast phosphosites. YeastSiteNetworKIN.tsv: NetworKIN + NetPhorest predictions of kinases phosphorylating the sites. Worm/ WormSeq.fa: Worm sequences used. WormSite.csv: Experimentally detected worm phosphosites. WormSiteNetworKIN.tsv: NetworKIN + NetPhorest predictions of kinases phosphorylating the sites. Fly/ FlySeq.fa: Fly sequences used. FlySite.csv: Experimentally detected fly phosphosites. FlySiteNetworKIN.tsv: NetworKIN + NetPhorest predictions of kinases phosphorylating the sites. Human/ HumanSite.csv: Human phosphosites assembled from the PhosphoSite and Phospho.ELM databases. HumanSiteNetworKIN.tsv: NetworKIN + NetPhorest prediction of kinases phosphorylating the sites CoreSites/ CoreSite.csv: core sites detected. Included are NetworKIN and NetPhorest predictions. CoreSiteinDomain.csv: Core site occurring within computationally inferred domains using HMM profiles from the PFAM and SMART databases. CoreSiteinMultipleSpecies.csv: List of human core sites occurring in more than one target species. ClusterAnalysis.txt: List of motif clusters identified CoreSites/GOAnalysis/: Bingo (Cytoscape plugin) outputs for Gene Ontology function enrichment analysis of core site proteins. CoreNet/ CoreNet.txt: Inferred human kinase-substrate relationships conserved in target species. CoreNetWithoutCoreSite.txt: Inferred human kinase-substrate relationships conserved in target species when kinase-substrate relationships involving core sites are excluded. CoreNet/GOAnalysis/: Bingo (Cytoscape plugin) outputs for Gene Ontology function enrichment analysis of core net proteins. MSA/ Multiple sequence alignments used in core site detection. DiseaseGeneAnalysis/ DiseaseGene.csv: List of core site and core net genes with known mutations associated with cancer and other diseases (OMIM). Included also is computed conserved phosphorylation propensity (k) of these genes.