nature|methods
Identification of cross-linked peptides from complex samples
Bing Yang, Yan-Jie Wu, Ming Zhu, Sheng-Bo Fan, Jinzhong Lin, Kun Zhang, Shuang Li, Hao Chi, Yu-Xin Li, Hai-Feng Chen, Shu-Kun Luo, Yue-He Ding, Le-Heng Wang, Zhiqi Hao, Li-Yun Xiu, She Chen, Keqiong Ye, Si-Min He & Meng-Qiu Dong
Supplementary File Title
Supplementary Figure 1 Cross-linking products and BS3 the cross-linker.
Supplementary Figure 2 The spectrum of a pair of cross-linked peptides is much more complex than that of a single linear peptide.
Supplementary Figure 3 Optimization of cross-linking conditions.
Supplementary Figure 4 HPLC separation of cross-linking isoforms.
Supplementary Figure 5 Estimation of non-specific cross-linking.
Supplementary Figure 6 Effectiveness of the spectral quality score (SQS).
Supplementary Figure 7 Unmatched peaks in 1 m/z bins collected from 1,030 HCD spectra.
Supplementary Figure 8 Open search mode and FDR estimation.
Supplementary Figure 9 Monte Carlo simulation of the probability distribution of T.
Supplementary Figure 10 Illustration of Pre_score calculation.
Supplementary Figure 11 Illustration of fragment ions found only in cross-link spectra.
Supplementary Figure 12 Usage of different fragment-ion types in KSDP scoring.
Supplementary Figure 13 Optimization of ion type usage improves the separation between target and random matches.
Supplementary Figure 14 Consideration of cross-link specific ion types improves identification.
Supplementary Figure 15 FDR estimation.
Supplementary Figure 16 A cross-link spectrum annotated by pLabel.
Supplementary Figure 17 Cα-Cα distance distribution of any lysine pairs, cross-linkable lysine pairs and observed cross-linked lysine pairs in GST.
Supplementary Figure 18 CXMS analysis of E. coli and C. elegans lysates.
Nature Methods: doi:10.1038/nmeth.2099
Supplementary Table 1 Cross-link search space of E. coli, C. elegans, and human proteome.
Supplementary Table 2 Sequences of 38 synthetic peptides.
Supplementary Table 3 Datasets used for training and testing pLink.
Supplementary Table 4 Features and their weights in SQS calculation.
Supplementary Table 5 Significantly matched fragments after spectrum pre-processing.
Supplementary Table 6 Usage of fragment ions in pLink fine scoring.
Supplementary Table 7 CXMS analysis of GST.
Supplementary Table 8 CXMS analysis of the CNGP (Cbf5-Nop10-Gar1-Nhp2) complex.
Supplementary Table 9 CXMS result of the six-subunit, 550 kDa UTP-B complex.
Supplementary Table 10 CXMS analysis of C. elegans FIB-1::GFP IP.
Supplementary Table 11 Inter-linked peptides identified from E. coli lysates.
Supplementary Table 12 Inter-linked peptides identified from a C. elegans lysate.
Supplementary Table 13 Cross-linking analysis of the CNGP complex using DSS.
Supplementary Table 14 Cross-linking analysis of the CNGP complex using EDC.
Supplementary Table 15 Cross-linking analysis of the CNGP complex using AMAS.
Supplementary Table 16 Cross-linking analysis of the CNGP complex using Sulfo-GMBS.
Supplementary Table 17 xQuest search of the GST [d0/d4]-BS3 cross-linking data.
Supplementary Table 18 Comparison of the GST [d0/d4]-BS3 cross-links identified by xQuest and pLink.
Supplementary Table 19 xQuest search of the GST [d0]-BS3 cross-linking data.
Supplementary Table 20 Comparison of the GST [d0]-BS3 cross-links identified by xQuest and pLink.
Supplementary Note The pLink algorithm and supplementary discussion.
Supplementary Figure 1. Cross-linking products and BS3 the cross-linker. (A) Digestion of chemically cross-linked proteins generates mono-, loop-, and inter-linked peptides besides regular peptides unmodified by the linker. (B) Chemical structures of [d0]- and [d4]-BS3 and (C) the expected pattern of cross-linked peptides in full MS scans.
[Panel A labels: Regular; Mono-linked (Type 0); Loop-linked (Type 1); Inter-linked (Type 2). Panel B labels: Light [d0]-BS3; Heavy [d4]-BS3 (four deuterium atoms, D). Panel C labels: 4.025 Da; expected L:H intensity ratio 1:1.]
Supplementary Figure 1. Cross-linking products and BS3 the cross-linker. (D-G) Representative full MS scans showing that [d0]- and [d4]-BS3 modified peptides co-elute, with the heavy cross-link slightly ahead of the light one.
[Panels D-G: full MS scans (FTMS + p NSI, m/z 300.00-2000.00) of sample UTP-b-XL-2500ng-HCD, plotted over m/z 406.8-409.4 with relative abundance (0-100) on the y-axis. Isotope peaks are labeled L (light) or H (heavy) and fall into two series with about 0.25 m/z spacing, starting near m/z 406.98 and m/z 407.99.
(D) Scan #2195, RT 34.64 min, NL 2.14E6.
(E) Scan #2201, RT 34.70 min, NL 4.25E6.
(F) Scan #2213, RT 34.82 min, NL 2.39E6.
(G) Scan #2225, RT 34.93 min, NL 9.92E5.]
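As a rough check on the doublet pattern in panels C-G, the m/z spacing between a light and a heavy cross-link should equal the 4.025 Da label difference divided by the charge state. The sketch below is illustrative only: the function name is hypothetical, and the 4+ charge is an assumption inferred from the roughly 0.25 m/z isotope spacing of the labeled peaks.

```python
D4_D0_MASS_SHIFT = 4.025  # Da, [d4]- vs [d0]-BS3 difference as labeled in the figure


def heavy_mz(light_mz: float, charge: int, n_linkers: int = 1) -> float:
    """Expected m/z of the heavy partner of a light cross-linked precursor."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return light_mz + n_linkers * D4_D0_MASS_SHIFT / charge


# Light monoisotopic peak near m/z 406.979, assumed to be 4+ (assumption).
expected = heavy_mz(406.979, charge=4)
print(f"expected heavy peak near m/z {expected:.3f}")
```

Under these assumptions the predicted heavy peak lies within a few thousandths of an m/z unit of the peak labeled at 407.986.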
Supplementary Figure 2. The spectrum of a pair of cross-linked peptides is much more complex than that of a single linear peptide.
Supplementary Figure 3. Optimization of cross-linking conditions. (A) 10 µM GST was cross-linked with the indicated amounts of BS3 in 50 mM HEPES, pH 7.5, 100 mM KCl at room temperature for 1 hour. (B) 10 µM GST was cross-linked with 0.5 mM BS3 (50x) in the same buffer at room temperature for the indicated amounts of time.
[Gel images. Lanes in (A): M, 0, 10x, 20x, 50x, 100x, 200x, 400x. Lanes in (B): M, 0, 5 min, 15 min, 30 min, 1 h, 2 h, 4 h. Labeled bands: GST; X-linked GST dimer. Two lanes are marked with asterisks.]
Supplementary Figure 4. HPLC separation of cross-linking isoforms. (A) Different isoforms of cross-links, along with mono- and loop-links (not shown), are generated from a pair of peptides and can be separated by reverse phase HPLC. (B) A pair of cross-link isoforms from BS3 treated E. coli lysate.
Supplementary Figure 5. Estimation of non-specific cross-linking. Ovalbumin, BSA, and a hetero-dimeric protein complex F15E11.13/F15E11.14 were mixed at the indicated concentrations and processed for CXMS analysis. Non-specific cross-linking between ovalbumin and BSA, BSA and F15E11.13/14, or ovalbumin and F15E11.13/14 was observed 0.3 times per sample per experiment. In contrast, cross-links within a protein or complex were observed 168.2 times per sample per experiment. BSA cross-links were more readily detected than others. Plotted are the averages of two sets of experiments.

Protein (µg/µl) | Sample 1 | Sample 2 | Sample 3 | Sample 4 | Sample 5
ovalbumin | 1.0 | 4.0 | 9.5 | 15.0 | 18.0
BSA | 18.0 | 15.0 | 9.5 | 4.0 | 1.0
F15E11.13/14 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0

[Bar chart: #Cross-link spectra (y-axis, 0-250) for each mixture. Series: Oval-Oval; BSA-BSA; F15E11.13-F15E11.14; Non-Specific-Total.]
Supplementary Figure 6. Effectiveness of the Spectral Quality Score (SQS). With a 4.2 SQS cutoff (grey vertical lines), 93% of the non-cross-link spectra are removed while 99.5% of the cross-link spectra are retained.
Supplementary Figure 7. Unmatched peaks in 1 m/z bins collected from 1030 HCD spectra. Some low m/z noise peaks, especially 108, 153, and 200, occur in nearly every spectrum.
[Plot: relative intensity (y-axis) versus peaks in HCD spectra, m/z (x-axis, 0-2000). Labeled peaks: 108, 153, 200.]
Supplementary Figure 8. Open search mode and FDR estimation. (A) The open search mode for large databases. (B) A –log(pre_score) cutoff of 3.0 is effective in removing 89% of the non-cross-link spectra.
[Panel A, flowchart:
1. Open database search: treat Δmass as a modification on K, and compute the Pre_score against any peptide with mass < precursor.
2. Is the peptide mass (without modification) ≥ or ≤ ½ precursor? Peptides with mass ≥ ½ precursor are α peptides; those with mass ≤ ½ precursor are β peptides.
3. Pair up the top 500 α and β peptides such that α + β + linker = precursor.
4. Fine scoring against the candidate pairs.]
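The pairing step of the open search in panel A (top 500 α and β peptides, α + β + linker = precursor) can be sketched as follows. This is a minimal illustration, not the pLink implementation: the function and variable names are hypothetical, the 138.06808 Da linker mass is the value commonly used for a BS3 inter-link, the 10 ppm tolerance is borrowed from the filtering criteria in the tables, and treating a lower pre-score as better assumes the score behaves like a p-value.

```python
from bisect import bisect_left, bisect_right

BS3_LINKER_MASS = 138.06808  # Da, assumed mass added by one BS3 inter-link


def pair_candidates(alpha, beta, precursor_mass,
                    linker_mass=BS3_LINKER_MASS, tol_ppm=10.0, top_n=500):
    """Pair alpha and beta candidates whose summed mass explains the precursor.

    alpha, beta: lists of (peptide, neutral_mass, pre_score) tuples;
    a lower pre_score is assumed to be better.
    """
    # Keep only the best-scoring candidates on each side.
    top_alpha = sorted(alpha, key=lambda c: c[2])[:top_n]
    top_beta = sorted(beta, key=lambda c: c[2])[:top_n]
    # Sort the beta side by mass so matching masses can be found by bisection.
    top_beta.sort(key=lambda c: c[1])
    beta_masses = [c[1] for c in top_beta]
    tol = precursor_mass * tol_ppm * 1e-6

    pairs = []
    for pep_a, mass_a, _ in top_alpha:
        target = precursor_mass - linker_mass - mass_a
        lo = bisect_left(beta_masses, target - tol)
        hi = bisect_right(beta_masses, target + tol)
        pairs.extend((pep_a, pep_b) for pep_b, _, _ in top_beta[lo:hi])
    return pairs


# Toy example with made-up masses and scores.
alpha = [("ALPHAK", 1000.0, 1e-5)]
beta = [("BETAK", 500.0, 1e-4), ("OTHERK", 600.0, 1e-4)]
print(pair_candidates(alpha, beta, 1000.0 + 500.0 + BS3_LINKER_MASS))
```

Restricting each side to its top candidates before pairing keeps the number of pairs sent to fine scoring small even when the database is large.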
Supplementary Figure 9. Monte Carlo simulation of the probability distribution of T. Statistic T is the normalized length of the longest sequence tag from a spectrum. For a candidate peptide of 5, 8, 10, 15, 18, 20, or 25 aa, the probability of T (by chance) > t (observed) is shown.
[Plot: probability (y-axis, 0-1) versus normalized tag length (x-axis, 0-1), one curve per peptide length: 5, 8, 10, 15, 18, 20, 25.]
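Supplementary Figure 10. Illustration of Pre_score calculation. See supplemental methods for explanation.

Example: candidate peptide ACDEKGK with n = 6 (A, AC, ACD, ACDE, ACDEK, ACDEKG), l = 17, x = 4, and r = 2/17, 3/17, 4/17, 10/17.

N = MH/w = int(1538.637002/0.5) = 1538

\[ P(x) = \frac{C_{n}^{x}\, C_{N-n}^{\,l-x}}{C_{N}^{\,l}} = \frac{C_{6}^{4}\, C_{1532}^{13}}{C_{1538}^{17}} = 1.5 \times 10^{-7} \]

\[ \mathrm{Hyper}(X \ge x) = 1.5 \times 10^{-7} \]

\[ \bar{r} = (2/17 + 3/17 + 4/17 + 10/17)/4 = 0.2794; \quad \mu = 0.5; \quad \sigma = \sqrt{1/(12x)} = \sqrt{1/(12 \cdot 4)} = 0.1443 \]

\[ P(\bar{r}) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(\bar{r}-\mu)^2}{2\sigma^2}} = 0.8593; \quad \mathrm{Norm}(R \le \bar{r}) = 0.0632 \]

\[ t = 2/7 = 0.28; \quad \mathrm{Taglen}(T > 0.28) = 0.37 \]

\[ \mathrm{pvalue}(x, r, t) = \mathrm{Hyper}(X \ge x) \cdot \mathrm{Norm}(R \le r) \cdot \mathrm{Taglen}(T > t) = 3.3 \times 10^{-9} \]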
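The numbers in this figure can be reproduced approximately with a few lines of code. The sketch below is a reading of the figure rather than the pLink source: it assumes that N is the number of m/z bins, l the number of peaks in the spectrum, n the number of theoretical fragments, and x the number of matches, that the mean normalized rank of x matched peaks is approximately normal with mean 0.5 and variance 1/(12x), and that the tag-length probability (0.37) is simply read off the Monte Carlo curves. All function names are hypothetical.

```python
from math import comb, erf, sqrt


def hyper_tail(N: int, l: int, n: int, x: int) -> float:
    """P(X >= x) when n theoretical fragments meet l occupied bins out of N."""
    total = comb(N, l)
    hits = sum(comb(n, k) * comb(N - n, l - k) for k in range(x, min(n, l) + 1))
    return hits / total


def norm_lower_tail(mean_rank: float, x: int) -> float:
    """P(R <= mean_rank) for the mean of x ranks, assumed uniform on (0, 1)."""
    mu = 0.5
    sigma = sqrt(1.0 / (12 * x))
    z = (mean_rank - mu) / sigma
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


# Values taken from the worked example in the figure.
ranks = [2 / 17, 3 / 17, 4 / 17, 10 / 17]
hyper = hyper_tail(N=1538, l=17, n=6, x=4)
norm = norm_lower_tail(sum(ranks) / len(ranks), len(ranks))
taglen = 0.37  # read from the figure, not computed here
p_value = hyper * norm * taglen

print(f"Hyper = {hyper:.2e}, Norm = {norm:.4f}, pvalue = {p_value:.1e}")
```

With these inputs the hypergeometric and normal terms agree with the figure to the precision shown, and the product is of the same order as the reported p-value; small differences are expected because the figure's factors are rounded.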
Supplementary Figure 11. Illustration of fragment ions found only in cross-link spectra. (A-B) Examples of yb-type ions; (C) example of a K-linked ion; (D) fragments resulting from cleavage of the linker.
Supplementary Figure 12. Usage of different fragment ion types in KSDP scoring. See Supplemental Methods for explanation.
Supplementary Figure 13. Optimization of ion type usage improves the separation between target and random matches. A total of 1030 cross-link spectra were searched against a positive database that contains the target sequences (solid lines) and a negative database that does not (dashed lines), using the initial (green) or optimized (red) ion type setting.
[Plot: cumulative number of spectra (y-axis) versus –ln(E-value) (x-axis).
Old setting: b1+, b2+, y1+, y2+.
New setting: b1+, b2+, y1+, y2+, a1+, a2+, yb1+, ya1+, KLα(KLβ), Lα/Lβ1+, Lα/Lβ2+.]
Supplementary Figure 14. Consideration of cross-link specific ion types improves identification. A total of 1030 cross-link spectra were searched against a positive database that contains the target sequences (solid lines) and a negative database that does not (dashed lines), using only the basic ion types (green) or the basic plus cross-link specific ion types (red).
[Plot: cumulative number of spectra (y-axis) versus Refined_Score (x-axis).
Basic: b1+, b2+, y1+, y2+, a1+, a2+.
All: b1+, b2+, y1+, y2+, a1+, a2+, yb1+, ya1+, KLα(KLβ), Lα/Lβ1+, Lα/Lβ2+.]
Supplementary Figure 15. FDR estimation. (A) FDR estimation based on a modified reverse database strategy. T, U, and F, for true, union, and false, are possible outcomes of in silico cross-linking of forward (F) and reversed (R) protein/peptide sequences. By random match, spectra fall into T, U and F at a 1:2:1 ratio. (B) When the CXMS data of the yeast UTP-B complex are searched against three databases (human, archaea, and random) that have no UTP-B subunit sequences in them, the percentages of spectra that match to T, U, and F are about 25%, 50%, and 25%, respectively. When searched against a target database containing UTP-B subunit sequences, the percentage of spectra that match to T increases, while those to U and F decrease.
[Panel A labels: F + R; cross-link in silico; F-F, R-R, F-R, R-F; T, U, F.
Panel B: % spectra (y-axis, 0-50) matching T, U, and F for the human, archaea, random, and target databases.]
Supplementary Figure 15. FDR estimation. (C) Reliable estimation of FDR for cross-link identification. Only when the FDR exceeds 45%, which is much greater than the conventional cutoff values, does the estimated FDR begin to deviate significantly from the real FDR.
[Panel C, "Estimated and Real FDR on all charges": FDR (y-axis, 0-0.5) versus log10(E-value) (x-axis, -30 to 5). Curves: estimated FDR; real FDR. Estimated FDR = (U-F)/T.]
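The estimate in panel C, Estimated FDR = (U-F)/T, can be sketched as below. The mapping of in silico pair types to T, U, and F (F-F to T; F-R and R-F to U; R-R to F) is an assumption based on the panel A labels and the 1:2:1 random ratio, and the function names and the clamp at zero are illustrative additions.

```python
def classify_pair(first: str, second: str) -> str:
    """Map a pair of sequence origins ('F' forward, 'R' reversed) to T, U or F.

    Assumed mapping: F-F -> T, F-R or R-F -> U, R-R -> F.
    """
    origins = (first, second)
    if origins == ("F", "F"):
        return "T"
    if origins == ("R", "R"):
        return "F"
    if set(origins) == {"F", "R"}:
        return "U"
    raise ValueError(f"unexpected origins: {origins!r}")


def estimated_fdr(n_t: int, n_u: int, n_f: int) -> float:
    """Estimated FDR = (U - F) / T, clamped at zero (the clamp is an assumption)."""
    if n_t <= 0:
        raise ValueError("need at least one T match")
    return max(0.0, (n_u - n_f) / n_t)


# Purely random matches fall into T, U, F at 1:2:1, so the estimate is 1.0.
print(estimated_fdr(25, 50, 25))
# A hypothetical target search dominated by T matches.
print(round(estimated_fdr(950, 40, 10), 4))
```

The first call illustrates why the formula is sensible: when every T match is random, (U-F)/T equals 1, and as true identifications accumulate in T the estimate falls.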
Supplementary Figure 16. A cross-link spectrum annotated by pLabel. The inset shows double cleavage products.
Supplementary Figure 17. Cα-Cα distance distribution of any lysine pairs, cross-linkable lysine pairs, and observed cross-linked lysine pairs in GST.
[Histogram for GST: number of lysine pairs (y-axis, 0-70) versus Cα-Cα distance in Å (x-axis, 5-65). Series: any K-K pair; cross-linkable pair; observed cross-link.]
Supplementary Figure 18. CXMS analysis of E. coli and C. elegans lysates. (A) 394 inter-linked peptide pairs were identified from E. coli lysates. (B) 75.5% of the E. coli cross-links are compatible with the structures of corresponding proteins/complexes deposited in the PDB database (179/237). (C) Y2H verified interactions represented by five E. coli inter-links. (D) 39 inter-linked peptide pairs were identified from a C. elegans lysate.
[Panel A (E. coli): intra-molecular, 270; inter-molecular, 124.
Panel B (E. coli): compatible, 179; incompatible, 58; structure unavailable, 157.
Panel C (Y2H plates, –LW and –LWH):
1. positive control
2. negative control
3. AD-AAA97042.1 + BD-NP_416801.2 (#91)
4. AD-AAC73200.1 + BD-AAC75219.1 (#98)
5. AD-NP_416518.2 + BD-AAC73708.1 (#115)
6. AD-YP_026243.1 + BD-AAA58136.1 (#71)
7. AD-YP_025307.1 + BD-AAA58136.1 (#69)
8. AD-AAC74522.1 + BD-AAA58136.1 (#70)
Panel D (C. elegans): inter-molecular, 10; intra-molecular, 29.]
Yang et al. 6/7/12 2:08 PM 1
Supplementary Tables for “Identification of Cross-linked Peptides from Complex Samples” by Yang et al.
Supplementary Table 1: Cross-link search space of E. coli, C. elegans, and human proteome

Database | Proteins | Regular search space (#candidate peptides) | Cross-link search space (#candidate peptide pairs)
E. coli | 6126 | 3.07 × 10^5 | 4.72 × 10^10
C. elegans | 24652 | 2.14 × 10^6 | 2.31 × 10^12
Human | 87069 | 3.67 × 10^6 | 6.74 × 10^12

Note: For n candidate linear peptides, the number of possible cross-linked peptide pairs in a cross-link search is 0.5*n*(n+1).
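The quadratic growth stated in the note is easy to check numerically. The snippet below is a small illustration (the function name is hypothetical); because the table reports n rounded to three significant figures, the recomputed values match the table only approximately.

```python
def crosslink_search_space(n_peptides: int) -> int:
    """Number of unordered peptide pairs, including self-pairs: n(n+1)/2."""
    return n_peptides * (n_peptides + 1) // 2


# Candidate-peptide counts as rounded in the table.
for name, n in [("E. coli", 307_000), ("C. elegans", 2_140_000), ("Human", 3_670_000)]:
    print(f"{name}: {crosslink_search_space(n):.2e} candidate pairs")
```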
Supplementary Table 2: Sequences of 38 synthetic peptides

Peptide | Sequence | Length (aa) | Mass of [M+H]+
AR-9 | AILVNFKAR | 9 | 1031.6360
AR-9-1 | AQFKTVSTR | 9 | 1037.5738
DK-10 | DGMIKLWDLK | 10 | 1218.6551
DR-7 | DMKLWQR | 7 | 976.5033
DR-19 | DPTSPAPLKTHTIELQGQR | 19 | 2089.1036
DK-7 | DQEAQKK | 7 | 846.4315
DR-14 | DWNTNAKTHTIAQR | 14 | 1655.8248
ER-28 | EGSQSKDYSSLLATLINFSPAAVDLEIR | 28 | 3024.5524
ER-13 | EKQFLNALVMAFR | 13 | 1566.8461
FK-12 | FILTTSKDLSAK | 12 | 1323.7518
FR-9 | FVKQQWNLR | 9 | 1218.6742
GK-5 | GKNSK | 5 | 533.3042
GK-21 | GLWDVSFCQYDKLLATSSGDK | 21 | 2390.1332
IR-9 | IHVLKNIHR | 9 | 1129.6952
KR-14 | KCLHTLQEHTSAVR | 14 | 1679.8646
KR-11 | KDAQSQEMSQR | 11 | 1307.6008
KK-20 | KLGEAPIKPQGNAVLIAVNK | 20 | 2060.2226
KR-7 | KMRPEVR | 7 | 915.5193
KK-10 | KNIAAIDLNK | 10 | 1099.6469
KR-7 | KVEEDVR | 7 | 874.4628
LK-11 | LDAEEQMNKFK | 11 | 1352.6514
LK-12 | LDIDQLQLSVKK | 12 | 1399.8155
LR-7 | LFNVLKR | 7 | 889.5618
LK-12 | LLATSSGDKTVK | 12 | 1219.6892
LK-17 | LNEEYLINKVYEAIPIK | 17 | 2049.1266
LR-20 | LSEIPGMVKIVDAIIPYTQR | 20 | 2243.2467
NR-10 | NEELKLQINR | 10 | 1256.6957
NK-8 | NKFLPVLK | 8 | 958.6084
NP-9 | NSKIFSPFR | 9 | 1095.5945
SR-14 | SDFKFSNLLGTVYR | 14 | 1646.8536
SK-22 | SHKDSITGFWCQGEDWLISTSK | 22 | 2582.1980
SK-26 | SLNSFEPFDEIVWFIDALTQGLKSNK | 26 | 2998.5196
TR-8 | TPDVNKDR | 8 | 944.4796
VK-9 | VGGSTIKSK | 9 | 876.5149
VR-11 | VKCVTFHPATR | 11 | 1315.6939
VR-6 | VKTELR | 6 | 745.4566
VR-7 | VWDLVKR | 7 | 915.5410
YK-14 | YFAYISKLDSASVK | 14 | 1591.8366

Note: All peptides are ≥ 98% pure, with a free N-terminal NH2 and a C-terminal COOH. Cysteine residues were carbamidomethylated.
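The [M+H]+ values in this table can be recomputed from the sequences using standard monoisotopic residue masses. The sketch below is an independent check, not the authors' code; the function name is hypothetical, and the residue, water, proton, and carbamidomethyl masses are the usual monoisotopic values.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565            # free N-terminal NH2 and C-terminal COOH
PROTON = 1.007276
CARBAMIDOMETHYL = 57.021464  # added to each cysteine, as stated in the note


def mh_plus(sequence: str, carbamidomethyl_cys: bool = True) -> float:
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    mass = WATER + PROTON
    for residue in sequence:
        mass += MONO[residue]
        if residue == "C" and carbamidomethyl_cys:
            mass += CARBAMIDOMETHYL
    return mass


for name, seq in [("GK-5", "GKNSK"), ("VR-6", "VKTELR"), ("VR-11", "VKCVTFHPATR")]:
    print(f"{name}: {mh_plus(seq):.4f}")
```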
Supplementary Table 3: Datasets used for training and testing pLink

Dataset | Source of data | #Spectra
A | Non-redundant inter-link spectra from 741 peptide pairs (Note: the highest scoring spectrum was kept from the ones identifying the same cross-link) | 2077
Sub_A1 | A subset of A, containing only light [d0]-BS3 cross-links | 1030
Sub_A2 | A subset of A, containing only heavy [d4]-BS3 cross-links | 1047
B | All inter-link spectra from 741 peptide pairs, including redundant ones | 13267
Sub_B1 | A subset of B, containing only light [d0]-BS3 cross-links | 7706
Sub_B2 | A subset of B, containing only heavy [d4]-BS3 cross-links | 5561
C | Spectra that failed to identify inter-links from the 741 peptide pair experiments | 153368
D | HCD spectra of regular peptides identified from C. elegans lysates | 21116
Supplementary Table 4: Features and their weights in SQS calculation

No. | Feature | Weight
1 | Number of peaks | 0.306
2 | Number of peaks with known charge states | 0.256
3 | Fraction of known-charge-state peaks in all peaks | 0.007
4 | Average peak intensity | 0.029
5 | Standard deviation of peak intensities | 0.043
6 | Smallest m/z range containing 95% of the total peak intensity | 0.008
7 | Smallest m/z range containing 50% of the total peak intensity | 0.024
8 | Total ion current per m/z | 0.045
9 | Standard deviation of the consecutive m/z gaps between all peaks | -0.023
10 | Average number of neighbor peaks within a 2-Da interval around any peak | -0.005
11 | Length of the longest tag | 0.257
12 | #Tags (number of peak pairs that differ by an amino acid residue mass) | 0.205
13 | #Tags per peak (#Tags / #peaks) | 0.097
14 | Fractional peak intensity of tags (summed geometric means of peak intensities of every peak pair in feature 12 divided by the summed geometric means of all peak pairs) | 0.0004

All features except 2 and 3 are calculated as described before by Nesvizhskii (ref. 2).
Supplementary Table 5. Significantly matched fragments after spectrum pre-processing

Ion type | Match significance | Match count ratio | Match gain ratio | Average intensity | Ave. mass deviation (m/z) | Cont. ratio | Ave. # of residues
y1+ | 0.034 | 0.218 | 0.671 | 0.232 | -0.00062 | 0.617 | 4.583
y2+ | 0.004 | 0.093 | 0.314 | 0.143 | -0.00061 | 0.270 | 9.270
b1+ | 0.001 | 0.047 | 0.161 | 0.119 | -0.00013 | 0.143 | 4.330
yb1+ | 7.627 × 10^-4 | 0.109 | 0.096 | 0.073 | -0.00042 | n/a | 3.226
b2+ | 5.865 × 10^-4 | 0.046 | 0.164 | 0.077 | -0.00054 | 0.133 | 8.591
ya1+ | 2.797 × 10^-4 | 0.060 | 0.053 | 0.089 | -0.00038 | n/a | 1.772
a1+ | 2.279 × 10^-4 | 0.020 | 0.065 | 0.179 | -0.00024 | 0.051 | 1.737
y3+ | 8.825 × 10^-5 | 0.019 | 0.069 | 0.069 | -0.00046 | 0.058 | 7.008
αL/βL | 7.403 × 10^-5 | 0.021 | 0.092 | 0.037 | -0.00049 | n/a | 4.988
b3+ | 5.976 × 10^-5 | 0.025 | 0.086 | 0.028 | -0.00021 | 0.071 | 7.881
[y-H2O]1+ | 4.256 × 10^-5 | 0.009 | 0.031 | 0.152 | -0.00007 | 0.032 | 0.574
a2+ | 3.921 × 10^-5 | 0.014 | 0.056 | 0.049 | -0.00036 | 0.048 | 4.903
KLα/KLβ | 3.772 × 10^-5 | 0.007 | 0.140 | 0.036 | -0.00029 | n/a | 2.562
[M+5H]5+ | 3.001 × 10^-5 | 0.003 | 0.192 | 0.053 | -0.00006 | n/a | 5.842
a3+ | 1.867 × 10^-5 | 0.010 | 0.042 | 0.043 | -0.00032 | 0.038 | 5.321
b4+ | 1.656 × 10^-5 | 0.014 | 0.050 | 0.023 | -0.00043 | 0.040 | 8.576
[y-NH3-H2O]5+ | 1.620 × 10^-5 | 0.005 | 0.015 | 0.200 | 0.00046 | 0.015 | 1.400
[M+4H]4+ | 1.207 × 10^-5 | 0.002 | 0.110 | 0.055 | 0.00011 | n/a | 2.710
a4+ | 1.081 × 10^-5 | 0.008 | 0.050 | 0.027 | -0.00018 | 0.050 | 3.300
[M+5H-NH3-H2O]5+ | 8.058 × 10^-6 | 0.002 | 0.175 | 0.022 | -0.00062 | n/a | 4.350

Note: Ion types colored in red are only found in spectra of cross-linked peptides. The αL/βL and KLα/KLβ categories contain all charge states found in the spectra. KLα/KLβ ions include the neutral loss species KLα-17 and KLβ-17, too.
Supplementary Table 6: Usage of fragment ions in pLink fine scoring

Setting | Ion type | Sub-type (S/X/B) | Weighted for ion continuity (Y/N)
Initial | b1+ | B | Y
Initial | b2+ | B | Y
Initial | y1+ | B | Y
Initial | y2+ | B | Y
Optimized | b1+ | S | Y
Optimized | b2+ | X | N
Optimized | b3+ | X | N
Optimized | y1+ | B | Y
Optimized | y2+ | B | Y
Optimized | y3+ | X | N
Optimized | a1+ | S | N
Optimized | a2+ | B | N
Optimized | yb1+ | B | N
Optimized | ya1+ | B | N
Optimized | KLα(KLβ) | X | N
Optimized | αL(βL) | B | N

S for simple ions: of a fragment ion type in question, only the cross-link-free sub-type is considered;
X for Xlink ions: of a fragment ion type in question, only the cross-link-containing sub-type is considered;
B for both: all ions of the indicated type are considered, containing a linker or not.
Supplementary Table 7. CXMS analysis of GST

Cross-linking sites | Cross-linked peptides | Cα-Cα (Å) | #total spec | #spec exp-1 | #spec exp-2 | #spec exp-3 | Best E-value | Spec qual. manual evaluation | 1:1 d0:d4 pair? | Agree with struc.?
GST26-0 | LLLEYLEEKYEEHLYER(9)-MSPILGYWK(1) | ~7.24 | 30 | 13 | 12 | 5 | 6.89E-21 | high | Yes | Yes
GST26-1 | LLLEYLEEKYEEHLYER(9)-SPILGYWK(1) | 7.24 | 24 | 13 | 10 | 1 | 2.59E-18 | high | Yes | Yes
GST124-112 | VDFLSKLPEMLK(6)-IAYSKDFETLK(5) | 20.39/*22.02 | 12 | 7 | 1 | 4 | 6.68E-05 | mid | Yes | Yes
GST180-193 | KRIEAIPQIDK(1)-YLKSSK(3) | 14.24 | 13 | 3 | 0 | 10 | 8.83E-12 | high | Yes | Yes
GST217-10 | YIAWPLQGWQATFGGGDHPPKSDLVPR(21)-IKGLVQPTR(2) | ~15.85 | 16 | 6 | 2 | 8 | 1.99E-09 | high | Yes | Yes
GST26-39 | LLLEYLEEKYEEHLYER(9)-DEGDKWR(5) | 24.57 | 19 | 2 | 5 | 12 | 2.38E-04 | low | Yes | No
GST63-0 | KFELGLEFPNLPYYIDGDVKLTQSMAIIR(20)-MSPILGYWK(1) | ~12.10 | 3 | 1 | 2 | 0 | 7.54E-05 | low | Yes | Yes
GST217-112 | YIAWPLQGWQATFGGGDHPPKSDLVPR(21)-IAYSKDFETLK(5) | ~16.46 | 5 | 0 | 0 | 5 | 2.63E-07 | mid | Yes | Yes

Note: Experiments #1 and #2 were carried out on an LTQ-orbitrap-ETD and experiment #3 on an LTQ-orbitrap Velos. Cross-link identifications were filtered by requiring 10 ppm mass accuracy, FDR < 5%, E-value < 0.01, and ≥ 3 spectral observations. Those that were detected in 2 out of 3 experiments and have > 3 spectral copies are shown in Fig. 2a. The starting amino acid Met0 and amino acids C-terminal to Pro216 are not visible in the X-ray structure, so the distance involving Met0 or Lys217 is measured using the nearest residue Ser1 or Pro216 instead. All cross-links except GST26-39 have an intra-subunit Cα-Cα distance of less than 24 Å and are therefore structurally sound. The cross-link GST124-112 can be either intra-subunit or inter-subunit (labeled with *).
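The per-identification part of the filtering criteria in the note (10 ppm mass accuracy, E-value < 0.01, ≥ 3 spectral observations) and the 24 Å structural check can be expressed as a small filter. This is a hedged sketch: the class, field, and function names are hypothetical, the ppm errors in the example are invented because the table does not report them, and the FDR < 5% criterion is a dataset-level threshold that is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class CrossLinkId:
    site: str
    ppm_error: float      # hypothetical field; not reported in the table
    best_e_value: float
    n_spectra: int
    ca_ca_distance: float  # in angstroms


def ppm_error(observed_mass: float, theoretical_mass: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def passes_filters(x: CrossLinkId, max_ppm: float = 10.0,
                   max_e_value: float = 0.01, min_spectra: int = 3) -> bool:
    """Apply the per-identification criteria from the table note."""
    return (abs(x.ppm_error) <= max_ppm
            and x.best_e_value < max_e_value
            and x.n_spectra >= min_spectra)


def agrees_with_structure(x: CrossLinkId, max_distance: float = 24.0) -> bool:
    """A cross-link is taken as structurally sound below the distance cutoff."""
    return x.ca_ca_distance < max_distance


# Two rows from the table, with made-up ppm errors.
ids = [
    CrossLinkId("GST180-193", 2.1, 8.83e-12, 13, 14.24),
    CrossLinkId("GST26-39", -3.4, 2.38e-04, 19, 24.57),
]
for x in ids:
    print(x.site, passes_filters(x), agrees_with_structure(x))
```

Keeping the identification filter separate from the structural check mirrors the table, where GST26-39 passes the filters but does not agree with the structure.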
Supplementary Table 8. CXMS analysis of the CNGP (Cbf5-Nop10-Gar1-Nhp2) complex

No. | Inter-linked lysine pairs | #total spec (#pair) | #spec exp-1 | #spec exp-2 | Best E-value | Cα-Cα distance (Å) | Manual eval. of spec qual. | Compatible with structure?
1 | Cbf5 161-Gar1 115 | 15 (2) | 0 | 15 | 1.77E-10 | 18.85 | high | Yes
2 | Cbf5 180-Cbf5 134 | 21 (2) | 4 | 17 | 1.59E-24 | 10.99 | high | Yes
3 | Cbf5 180-Nop10 18 | 9 (1) | 3 | 6 | 6.98E-17 | 12.98 | high | Yes
4 | Cbf5 180-Nop10 19 | 17 (1) | 10 | 7 | 2.20E-14 | 15.46 | high | Yes
5 | Cbf5 267-Cbf5 31 | 9 (1) | 0 | 9 | 3.68E-13 | 11.38 | high | Yes
6 | Gar1 77-Gar1 115 | 23 (2) | 11 | 12 | 9.73E-20 | 11.01 | high | Yes
7 | Nop10 1-Nop10 19 | 31 (2) | 15 | 16 | 1.48E-19 | 11.48 | high | Yes
8 | Nop10 40-Nhp2 69 | 5 (1) | 1 | 4 | 1.82E-03 | 17.18 | mid | Yes
9 | Nop10 40-Nhp2 65 | 14 (1) | 7 | 7 | 1.82E-03 | 13.96 | high | Yes
10 | Nop10 40-Nhp2 61 | 11 (1) | 6 | 5 | 1.07E-09 | 12.30 | high | Yes
11 | Gar1 115-Gar1 104 | 9 (1) | 7 | 2 | 1.61E-16 | 24.75 | high | No
12 | Nop10 40-Nop10 19 | 6 (1) | 1 | 5 | 7.09E-08 | 30.57 | high | No
13 | Cbf5 87-Cbf5 114 | 3 (1) | 2 | 1 | 1.83E-03 | 14.71 | low | Yes
14 | Nhp2 65-Nhp2 69 | 3 (1) | 3 | 0 | 6.59E-03 | 5.79 | high | Yes
15 | Nop10 40-Cbf5 114 | 3 (1) | 3 | 0 | 2.89E-08 | 34.61 | high | No

Note: The results came from two experiments; a low-flow 50-µm ID column was used in experiment #1 and a high-flow 100-µm ID column was used in experiment #2. Inter-link identifications were filtered by requiring 10 ppm mass accuracy, FDR < 5%, E-value < 0.01, and ≥ 3 spectral copies. Only those with 4 or more spectra are illustrated in Fig. 2b. Some inter-links were identified by two peptide pairs due to missed cleavage of trypsin, e.g., Cbf5 161-Gar1 115 was supported by SLENLTGALFQRPPLISAVKR(20)-EGDKFYIAADKLLPIER(11) and SLENLTGALFQRPPLISAVKR(20)-FYIAADKLLPIER(7).
Supplementary Table 9. CXMS result of the six-subunit, 550 kDa UTP-B complex

No. | Protein 1-Protein 2 | #total spec (#pairs) | #spec exp_1 | #spec exp_2 | #spec exp_3 | Best E-value | Manual eval. of spec qual. | Domain of protein 1 | Domain of protein 2
UTP13-UTP13
1 | UTP13(86)-UTP13(94) | 7 | 5 | 0 | 2 | 7.77E-05 | high | U13-WD1 | U13-WD1
2 | UTP13(91)-UTP13(51) | 5 | 5 | 0 | 0 | 1.61E-09 | high | U13-WD1 | U13-WD1
3 | UTP13(181)-UTP13(179) | 19 | 12 | 0 | 7 | 3.97E-05 | mid | U13-WD1 | U13-WD1
4 | UTP13(228)-UTP13(181) | 4 (2) | 0 | 3 | 1 | 5.75E-10 | high | U13-WD1 | U13-WD1
5 | UTP13(533)-UTP13(555) | 17 | 10 | 0 | 7 | 7.95E-08 | high | U13-WD2 | U13-WD2
6 | UTP13(741)-UTP13(780) | 18 | 7 | 4 | 7 | 7.43E-18 | high | U13-CTD | U13-CTD
7 | UTP13(751)-UTP13(699) | 129 (2) | 77 | 12 | 40 | 5.53E-17 | high | U13-CTD | U13-CTD
UTP12-UTP12
8 | UTP12(163)-UTP12(187) | 24 | 15 | 3 | 6 | 1.61E-20 | high | U12-WD1 | U12-WD1
9 | UTP12(279)-UTP12(337) | 10 | 7 | 3 | 0 | 4.11E-06 | high | U12-WD1 | U12-WD1
10 | UTP12(318)-UTP12(230) | 34 | 20 | 8 | 6 | 6.56E-17 | high | U12-WD1 | U12-WD1
11 | UTP12(318)-UTP12(237) | 46 | 22 | 6 | 18 | 8.94E-18 | high | U12-WD1 | U12-WD1
12 | UTP12(337)-UTP12(253) | 16 | 11 | 0 | 5 | 1.26E-05 | high | U12-WD1 | U12-WD1
13 | UTP12(381)-UTP12(279) | 4 | 4 | 0 | 0 | 7.19E-06 | high | U12-WD2 | U12-WD1
14 | UTP12(404)-UTP12(420) | 15 | 11 | 2 | 2 | 1.66E-09 | high | U12-WD2 | U12-WD2
15 | UTP12(486)-UTP12(503) | 26 | 8 | 8 | 10 | 6.23E-17 | high | U12-WD2 | U12-WD2
16 | UTP12(780)-UTP12(774) | 19 | 8 | 3 | 8 | 2.31E-08 | high | U12-CTD | U12-CTD
17 | UTP12(877)-UTP12(884) | 8 | 4 | 4 | 0 | 1.59E-06 | high | U12-CTD | U12-CTD
UTP21-UTP21
18 | UTP21(9)-UTP21(9) | 5 | 3 | 2 | 0 | 4.27E-05 | high | n/a | n/a
19 | UTP21(9)-UTP21(19) | 19 (2) | 16 | 3 | 0 | 2.36E-05 | high | U21-WD2 | U21-WD2
20 | UTP21(22)-UTP21(661) | 3 | 2 | 0 | 1 | 5.29E-11 | high | U21-WD2 | U21-CTD
21 | UTP21(148)-UTP21(102) | 31 | 23 | 0 | 8 | 2.07E-05 | high | U21-WD1 | U21-WD1
22 | UTP21(288)-UTP21(408) | 45 | 30 | 5 | 10 | 4.23E-14 | high | U21-WD1 | U21-WD2
23 | UTP21(435)-UTP21(408) | 6 | 6 | 0 | 0 | 2.18E-07 | high | U21-WD2 | U21-WD2
24 | UTP21(539)-UTP21(502) | 6 | 0 | 1 | 5 | 1.38E-03 | mid | U21-WD2 | U21-WD2
25 | UTP21(661)-UTP21(9) | 3 | 0 | 1 | 2 | 1.20E-08 | high | U21-CTD | U21-WD2
26 | UTP21(661)-UTP21(19) | 30 | 22 | 4 | 4 | 1.27E-05 | high | U21-CTD | U21-WD2
27 | UTP21(819)-UTP21(806) | 12 | 2 | 3 | 7 | 3.24E-11 | high | U21-CTD | U21-CTD
28 | UTP21(828)-UTP21(873) | 18 | 10 | 3 | 5 | 5.24E-23 | high | U21-CTD | U21-CTD
29 | UTP21(873)-UTP21(917) | 7 (2) | 0 | 7 | 0 | 6.18E-08 | high | U21-CTD | U21-CTD
30 | UTP21(873)-UTP21(918) | 26 (2) | 21 | 5 | 0 | 2.39E-07 | high | U21-CTD | U21-CTD
31 | UTP21(873)-UTP21(828) | 7 | 6 | 0 | 1 | 1.35E-08 | high | U21-CTD | U21-CTD
32 | UTP21(804)-UTP21(794) | 3 | 0 | 3 | 0 | 5.15E-07 | high | U21-CTD | U21-CTD
UTP1-UTP1
33 | UTP1(27)-UTP1(85) | 4 | 4 | 0 | 0 | 4.34E-03 | high | U1-WD1 | U1-WD1
34 | UTP1(46)-UTP1(6) | 11 | 5 | 3 | 3 | 5.71E-17 | high | U1-WD1 | U1-WD2
35 | UTP1(56)-UTP1(674) | 4 | 1 | 2 | 1 | 4.87E-06 | mid | U1-WD1 | U1-WD2
36 | UTP1(96)-UTP1(129) | 14 | 7 | 0 | 7 | 2.59E-12 | high | U1-WD1 | U1-WD1
37 | UTP1(166)-UTP1(264) | 29 | 16 | 4 | 9 | 2.33E-12 | high | U1-WD1 | U1-WD1
38 | UTP1(180)-UTP1(96) | 17 | 4 | 0 | 13 | 7.27E-15 | high | U1-WD1 | U1-WD1
39 | UTP1(180)-UTP1(129) | 6 | 4 | 2 | 0 | 9.12E-08 | high | U1-WD1 | U1-WD1
40 | UTP1(211)-UTP1(264) | 16 | 7 | 2 | 7 | 2.95E-14 | high | U1-WD1 | U1-WD1
41 | UTP1(264)-UTP1(674) | 9 | 0 | 0 | 9 | 3.26E-08 | high | U1-WD1 | U1-WD2
42 | UTP1(536)-UTP1(557) | 4 | 0 | 1 | 3 | 2.55E-14 | high | U1-WD2 | U1-WD2
43 | UTP1(536)-UTP1(572) | 9 | 8 | 1 | 0 | 1.78E-09 | high | U1-WD2 | U1-WD2
44 | UTP1(572)-UTP1(557) | 22 | 11 | 2 | 9 | 2.52E-16 | high | U1-WD2 | U1-WD2
45 | UTP1(572)-UTP1(674) | 55 | 31 | 6 | 18 | 4.65E-10 | mid | U1-WD2 | U1-WD2
46 | UTP1(753)-UTP1(733) | 64 | 26 | 9 | 29 | 8.07E-32 | high | U1-CTD | U1-CTD
47 | UTP1(46)-UTP1(27) | 3 | 0 | 0 | 3 | 1.64E-07 | low | U1-WD1 | U1-WD1
48 | UTP1(46)-UTP1(85) | 3 | 0 | 0 | 3 | 9.56E-03 | low | U1-WD1 | U1-WD1
UTP18-UTP18
49 | UTP18(154)-UTP18(134) | 7 | 7 | 0 | 0 | 8.89E-13 | high | U18-NTD | U18-NTD
50 | UTP18(154)-UTP18(170) | 13 | 7 | 0 | 6 | 4.17E-12 | high | U18-NTD | U18-NTD
51 | UTP18(585)-UTP18(538) | 31 | 14 | 6 | 11 | 3.36E-08 | high | U18-WD | U18-WD
UTP6-UTP6
52 | UTP6(389)-UTP6(439) | 21 | 12 | 2 | 7 | 4.12E-12 | high | U6-CTD | U6-CTD
53 | UTP6(397)-UTP6(389) | 9 | 5 | 0 | 4 | 6.78E-18 | high | U6-CTD | U6-CTD
54 | UTP6(321)-UTP6(361) | 3 | 0 | 3 | 0 | 1.60E-09 | high | U6-CTD | U6-CTD
UTP13-UTP12
55 | UTP12(381)-UTP13(555) | 14 (2) | 8 | 1 | 5 | 2.29E-10 | high | U12-WD2 | U13-WD2
56 | UTP12(855)-UTP13(751) | 16 (2) | 0 | 7 | 9 | 1.30E-14 | high | U12-CTD | U13-CTD
57 | UTP13(533)-UTP12(381) | 11 | 6 | 0 | 5 | 2.99E-08 | high | U13-WD2 | U12-WD2
58 | UTP13(546)-UTP12(515) | 22 | 9 | 6 | 7 | 1.41E-11 | high | U13-WD2 | U12-WD2
59 | UTP13(699)-UTP12(843) | 18 | 11 | 7 | 0 | 2.39E-05 | high | U13-CTD | U12-CTD
60 | UTP13(780)-UTP12(909) | 44 | 29 | 6 | 9 | 5.51E-11 | high | U13-CTD | U12-CTD
61 | UTP13(815)-UTP12(866) | 4 | 0 | 1 | 3 | 1.42E-12 | high | U13-CTD | U12-CTD
62 | UTP13(815)-UTP12(884) | 9 | 0 | 6 | 3 | 2.34E-06 | high | U13-CTD | U12-CTD
63 | UTP13(741)-UTP12(111) | 17 (2) | 4 | 10 | 3 | 5.22E-11 | high | U13-CTD | U12-WD1
64 | UTP13(815)-UTP12(877) | 3 | 0 | 3 | 0 | 2.90E-08 | high | U13-CTD | U12-CTD
UTP12-UTP21
65 | UTP21(890)-UTP12(906) | 4 | 0 | 0 | 4 | 5.87E-07 | high | U21-CTD | U12-CTD
UTP21-UTP1
66 | UTP1(6)-UTP21(661) | 11 | 6 | 3 | 2 | 2.48E-12 | high | U1-WD2 | U21-CTD
67 | UTP1(27)-UTP21(730) | 9 | 4 | 3 | 2 | 1.09E-27 | high | U1-WD2 | U21-CTD
68 | UTP1(96)-UTP21(382) | 49 | 23 | 11 | 15 | 6.73E-24 | high | U1-WD1 | U21-WD2
69 | UTP1(129)-UTP21(9) | 18 | 7 | 3 | 8 | 7.37E-13 | high | U1-WD1 | U21-WD2
70 | UTP1(129)-UTP21(102) | 23 | 11 | 4 | 8 | 2.30E-06 | high | U1-WD1 | U21-WD1
71 | UTP21(382)-UTP1(129) | 25 | 6 | 4 | 15 | 9.83E-19 | high | U21-WD2 | U1-WD1
72 | UTP21(661)-UTP1(85) | 7 | 1 | 0 | 6 | 3.39E-16 | high | U21-CTD | U1-WD1
UTP1-UTP18
73 | UTP1(572)-UTP18(418) | 6 | 4 | 0 | 2 | 1.09E-03 | mid | U1-WD2 | U18-WD
UTP21-UTP18
74 | UTP21(245)-UTP18(341) | 5 | 3 | 2 | 0 | 2.12E-06 | high | U21-WD1 | U18-WD
75 | UTP21(288)-UTP18(341) | 10 | 10 | 0 | 0 | 2.16E-09 | high | U21-WD1 | U18-WD
76 | UTP21(408)-UTP18(538) | 3 | 3 | 0 | 0 | 3.24E-07 | high | U21-WD2 | U18-WD
UTP1-UTP12
77 | UTP1(800)-UTP12(884) | 24 | 24 | 0 | 0 | 5.67E-08 | high | U1-CTD | U12-CTD
UTP1-UTP6
78 | UTP6(65)-UTP1(572) | 4 | 4 | 0 | 0 | 1.94E-05 | low | U6-NTD | U1-WD2
UTP6-UTP13
79 | UTP13(751)-UTP6(72) | 5 | 0 | 0 | 5 | 4.09E-03 | low | U13-CTD | U6-NTD
Note: Cross-link identifications were filtered by requiring 10 ppm mass accuracy, FDR < 5%, E-value < 0.01, and ≥ 3 spectral observations. Highlighted in grey are eight cross-links that are observed in only one experiment and are associated with either low-quality spectra or only 3 spectral copies; they are not shown in Fig. 2c. WD1: WD domain 1, WD2: WD domain 2; NTD: N-terminal domain; CTD: C-terminal domain. Domains of each protein were predicted. Utp1, Utp12, Utp13 and Utp21 contain tandem WD domains that were modeled according to the AIP1 structure (PDB code: 1PI6), where the first strand of WD repeat 1 is paired with the last strand of WD repeat 14 in WD domain 2. Utp6 is composed of NTD (residues 1-206) and CTD (207-440). Utp18 is composed of NTD (1-224) and WD domain (225-594). Utp1 is composed of WD1 (18-346), WD2 (1-17, 347-707) and CTD (708-923). Utp21 is composed of WD1 (33-352), WD2 (1-32, 353-656) and CTD (657-939). Utp12 is composed of WD1 (20-361), WD2 (1-19, 362-687) and CTD (688-943). Utp13 is composed of WD1 (19-339), WD2 (1-18, 340-648) and CTD (649-817).
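The domain boundaries listed in the note can be turned into a lookup that assigns a cross-linked residue to its predicted domain. The sketch below simply encodes the residue ranges given above; the dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# Predicted domain boundaries (inclusive residue ranges) as listed in the note.
DOMAINS = {
    "Utp6": {"NTD": [(1, 206)], "CTD": [(207, 440)]},
    "Utp18": {"NTD": [(1, 224)], "WD": [(225, 594)]},
    "Utp1": {"WD1": [(18, 346)], "WD2": [(1, 17), (347, 707)], "CTD": [(708, 923)]},
    "Utp21": {"WD1": [(33, 352)], "WD2": [(1, 32), (353, 656)], "CTD": [(657, 939)]},
    "Utp12": {"WD1": [(20, 361)], "WD2": [(1, 19), (362, 687)], "CTD": [(688, 943)]},
    "Utp13": {"WD1": [(19, 339)], "WD2": [(1, 18), (340, 648)], "CTD": [(649, 817)]},
}


def domain_of(protein: str, residue: int) -> str:
    """Return the predicted domain containing the given residue."""
    for domain, ranges in DOMAINS[protein].items():
        if any(start <= residue <= end for start, end in ranges):
            return domain
    raise ValueError(f"residue {residue} is outside the listed ranges of {protein}")


print(domain_of("Utp1", 6))     # N-terminal strand assigned to WD domain 2
print(domain_of("Utp21", 661))
print(domain_of("Utp18", 341))
```

The split WD2 ranges reflect the modeling described in the note, in which the first N-terminal strand pairs with the last strand of WD domain 2.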
Supplementary Table 10. CXMS analysis of C. elegans FIB-1::GFP IP

No. | Protein1-Protein2 | Peptide1-Peptide2 | #total spec | #spec exp_1 | #spec exp_2 | Cα-Cα distance (Å) | Best E-value | Manual eval. of spec qual.
1 | ce_Fib1(115)-ce_Fib1(133) | GGKTVVVEPHR(3)-GKEDALATK(2) | 10 | 5 | 5 | ~8.7 | 5.30E-12 | high
2 | ce_Snu13(21)-ce_Snu13(118) | AFPLADTNLSQKLMDLVQQAMNYK(12)-SQIQKIKEDVEK(5) | 13 | 5 | 8 | 9.3 | 4.80E-10 | high
3 | ce_Nop56(161)-ce_Fib1(236) | VKFDVHR(2)-DLLGVAKK(7) | 7 | 4 | 3 | 17.1 | 5.34E-08 | high
  |  | SKVKFDVHR(4)-DLLGVAKK(7) | 3 | 0 | 3 | 17.1 | 1.61E-11 | high
4 | ce_Nop58(172)-ce_Nop56(183) | IDTMIVQAVSLLDDLDKELNNYVMR(17)-VDNMVIQSIALLDQLDKDINLFGMR(17) | 3 | 3 | 0 | 11.4 | 4.34E-09 | high

Note: Cross-link identifications were filtered by requiring 10 ppm mass accuracy, FDR < 5%, E-value < 0.01, and ≥ 3 spectral observations. Highlighted in yellow are cross-links between Nop58 and Nop56, and between Nop56 and FIB-1. The Cα-Cα distances were measured on the equivalent residues in an archaeal C/D RNP structure (PDB code 3PLA), assuming that Nop56 and Nop58 form a heterodimer. Residue 115 of FIB-1 is not present in the archaeal structure and is approximated by the equivalent archaeal residue of FIB-1 residue 117. The gene names for C. elegans FIB-1, Nop56, Nop58, and Snu13 are T01C3.7, K07C5.4, W01B11.3/nol-5, and M28.5.
Supplementary Table 11. Inter-linked peptides identified from E. coli lysates.
Filtering criteria: mass accuracy 10 ppm, FDR < 5%, E-value < 0.01. Cross-links that may be interpreted as either intra-molecular or inter-molecular are taken as intra-molecular. In the column "Note", Yes-TAP denotes interactions verified by affinity purification/mass spectrometry analysis (experimental datasets in reference 23), while Yes-Y2H or NO-Y2H indicates a positive or negative yeast two-hybrid (Y2H) test result (this paper).
Columns: No.; ID_protein1-protein2; Name_protein1-protein2; Sequence_pep1-pep2; #Spec exp_1; #Spec exp_2; Best E-Value; Struct. in PDB?; Cα-Cα <24 Å?; Note
Inter-‐molecular Cross-‐links
1 gb|AAC73894.2|(278)-ref|NP_417064.1|(2)/ref|NP_418081.1|(2)/gb|AAB40481.1|(108)
23S rRNA mA1618 methyltransferase, SAM-dependent |(278)-back-translocating Elongation Factor EF4, GTPase |(2)/ lipopolysaccharide core biosynthesis protein |(2)/ ORF_o346 predicted oxidoreductase |(108)
KEMAQGQK(1)-KNIR(1) 1 4.88E-05 NO
2 gb|AAC73820.1|(476)-‐gb|AAC74882.1|(274)
2-‐oxoglutarate decarboxylase, thiamin-‐requiring |(476)-‐aminodeoxychorismate
synthase, subunit I |(274)
HGHNEADEPSATQPLMYQKIK(19)-‐PIKGTLPR(3)
1 2.59E-‐05 NO
3 gb|AAC73820.1|(476)-‐gb|AAC74659.1|(748)
2-‐oxoglutarate decarboxylase, thiamin-‐requiring |(476)-‐probable selenate
reductase, periplasmic |(748)
HGHNEADEPSATQPLMYQKIK(19)-‐LPAKVTPR(4)
1 5.11E-‐07 NO
4 gb|AAC73997.1|(1)-‐gb|AAC73280.1|(11)
30S ribosomal subunit protein S1 |(1)-‐30S ribosomal subunit protein S2 |(11)
MTESFAQLFEESLK(1)-‐DMLKAGVHFGHQTR(4)
2 2.68E-‐05 NO
5 gb|AAC73997.1|(158)-‐gb|AAA97098.1|(9)
30S ribosomal subunit protein S1 |(158)-‐30S ribosomal subunit protein S18 |(9)
VIKLDQK(3)-‐KFCR(1) 1 4.50E-‐08 NO
6 gb|AAC73997.1|(162)-‐gb|AAA97098.1|(9)
30S ribosomal subunit protein S1 |(162)-‐30S ribosomal subunit protein S18 |(9)
LDQKR(4)-‐KFCR(1) 1 5.88E-‐06 NO
7 gb|AAA58139.1|(108)-‐gb|AAC74323.1|(18)
30S ribosomal subunit protein S12 |(108)-‐fused acetaldehyde-‐CoA
dehydrogenase/iron-‐dependent alcohol dehydrogenase/pyruvate-‐formate lyase
deactivase |(18)
GALDCSGVKDR(9)-‐KAQR(1)
1 3.87E-‐08 NO
8 gb|AAA58139.1|(108)-‐gb|AAC75516.1|(440)
30S ribosomal subunit protein S12 |(108)-‐fused malic enzyme predicted oxidoreductase/predicted
phosphotransacetylase |(440)
GALDCSGVKDR(9)-‐KAPKR(1)
2 7.22E-‐05 NO
9 gb|AAA58139.1|(44)-‐gb|AAC73134.1|(19)
30S ribosomal subunit protein S12 |(44)-‐30S ribosomal subunit protein S20 |(19)
KPNSALR(1)-‐KHNASR(1) 7 5.21E-‐08 YES NO
10 gb|AAA58139.1|(44)-‐gb|AAA57987.1|(85)
30S ribosomal subunit protein S12 |(44)-‐50S ribosomal subunit protein L21 |(85)
KPNSALR(1)-‐KQQGHR(1) 3 2.51E-‐07 NO
11 gb|AAC75658.1|(13)-‐gb|AAA69236.1|(24)
30S ribosomal subunit protein S16 |(13)-‐CG Site no. 33104: hypA, hydrogenase nickel incorporation protein ORF_o116
|(24)
KRPFYQVVVADSR(1)-‐HGAKR(4)
3 9.73E-‐08 NO
12 gb|AAA97098.1|(30)-‐gb|AAA89145.1|(5)
30S ribosomal subunit protein S18 |(30)-‐30S ribosomal subunit protein S21 |(5)
DIATLKNYITESGK(6)-‐PVIKVR(4)
1 7.37E-‐07 YES YES 16.9 Å
13 gb|AAA97098.1|(30)-‐gb|AAA97096.1|(104)
30S ribosomal subunit protein S18 |(30)-‐30S ribosomal subunit protein S6 |(104)
DIATLKNYITESGK(6)-‐HAVTEASPMVKAK(11)
1 3.09E-‐04 YES NO
14 gb|AAA58113.1|(21)-‐ref|NP_416245.2|(2)
30S ribosomal subunit protein S19 |(21)-‐cell division modulator |(2)
VEKAVESGDK(3)-‐KKPLR(1) 1 3.24E-‐04 YES YES 10.2 Å
15 gb|AAA58113.1|(21)-‐gb|AAC74441.1|(214)
30S ribosomal subunit protein S19 |(21)-‐Rac prophage; conserved protein |(214)
KVEKAVESGDK(4)-‐KPIR(1) 3 8.57E-‐09 NO
16 gb|AAC73134.1|(16)-gb|AAC75676.1|(214)/gb|AAA79797.1|(213)
30S ribosomal subunit protein S20 |(16)-CP4-57 prophage; predicted protein |(214)/ORF_f538 |(213)
AIQSEKAR(6)-KGEHSR(1) 1 1.18E-04 NO
17 gb|AAA89145.1|(25)-‐gb|AAA97098.1|(9)
30S ribosomal subunit protein S21 |(25)-‐30S ribosomal subunit protein S18 |(9)
SCEKAGVLAEVR(4)-‐KFCR(1)
3 1.03E-‐08 YES NO
18 gb|AAA89145.1|(54)-gb|AAA57986.1|(4)/ref|NP_415615.1|(182)
30S ribosomal subunit protein S21 |(54)-50S ribosomal subunit protein L27 |(4)/ predicted aminodeoxychorismate lyase |(182)
ASAVKR(5)-AHKK(3) 2 1.65E-05 NO
19 gb|AAA58111.1|(108)-‐gb|AAA58112.1|(16)
30S ribosomal subunit protein S3 |(108)-‐50S ribosomal subunit protein L22 |(16)
KPELDAK(1)-‐SSAQKVR(5) 3 1.76E-‐08 NO
20 gb|AAA97096.1|(93)-‐gb|AAA97098.1|(50)
30S ribosomal subunit protein S6 |(93)-‐30S ribosomal subunit protein S18 |(50)
TKHAVTEASPMVK(2)-‐AKYQR(2)
2 8.87E-‐07 YES YES 11.7 Å
21 gb|AAA58138.1|(11)-‐gb|AAA58032.1|(100)
30S ribosomal subunit protein S7 |(11)-‐30S ribosomal subunit protein S9 |(100)
KILPDPK(1)-‐KAGFVTR(1) 7 30 4.36E-‐15 YES YES 17.2 Å
22 gb|AAA58138.1|(131)-‐ref|YP_025294.2|(197)
30S ribosomal subunit protein S7 |(131)-‐acetolactate synthase III, large subunit
|(197)
LANELSDAAENKGTAVKK(12)-‐GQIKR(4)
1 5.25E-‐07 NO
23 gb|AAA58138.1|(131)-gb|AAA69125.1|(122)
30S ribosomal subunit protein S7 |(131)-ORF_f239; was ORF_f191 and ORF_f194 before splice |(122)
LANELSDAAENKGTAVK(12)-LTKLDAQLK(3)
1 6.41E-06 NO
24 gb|AAA58138.1|(131)-‐gb|AAC74772.1|(667)
30S ribosomal subunit protein S7 |(131)-‐phosphoenolpyruvate synthase |(667)
LANELSDAAENKGTAVKK(12)-‐QGLKR(4)
1 1.88E-‐06 NO
25 gb|AAA58138.1|(131)-‐gb|AAC73932.1|(11)
30S ribosomal subunit protein S7 |(131)-‐predicted transporter |(11)
LANELSDAAENKGTAVKK(12)-‐NALKR(4)
9 1.33E-‐10 NO
26 gb|AAA58138.1|(131)-‐gb|AAC74957.1|(56)
30S ribosomal subunit protein S7 |(131)-‐purine-‐binding chemotaxis protein |(56)
LANELSDAAENKGTAVKK(12)-‐IANTPAFIKGVTNLR(9)
2 2.05E-‐10 NO
27 gb|AAA58138.1|(131)-‐gb|AAB18602.1|(181)
30S ribosomal subunit protein S7 |(131)-‐rfaY; lipopolysaccharide core biosynthesis
protein |(181)
LANELSDAAENKGTAVKK(12)-‐IIDLSGKR(7)
1 3.89E-‐06 NO
28 gb|AAA58138.1|(136)-‐gb|AAC74957.1|(56)
30S ribosomal subunit protein S7 |(136)-‐purine-‐binding chemotaxis protein |(56)
LANELSDAAENKGTAVKK(17)-‐IANTPAFIKGVTNLR(9)
1 4.79E-‐06 NO
29 gb|AAA58103.1|(41)-‐gb|AAC75616.1|(1)/ gb|AAA79825.1|(1)
30S ribosomal subunit protein S8 |(41)-‐holo-‐[acyl-‐carrier-‐protein] synthase 1
|(1)/ dpj|(1)
VAIANVLKEEGFIEDFK(8)-‐MAILGLGTDIVEIAR(1)
1 1.72E-‐05 NO
30 gb|AAA58103.1|(50)-‐gb|AAC73418.1|(226)
30S ribosomal subunit protein S8 |(50)-‐c-‐di-‐GMP-‐specific phosphodiesterase |(226)
VAIANVLKEEGFIEDFKVEGDTK(17)-‐LGNDKIK(5)
5 1.50E-‐05 NO
31 gb|AAA58098.1|(29)-‐gb|AAA57987.1|(85)
50S ribosomal subunit protein L15 |(29)-‐50S ribosomal subunit protein L21 |(85)
GIGSGLGKTGGR(8)-‐KQQGHR(1)
2 1.94E-‐09 YES YES 13.1 Å
32 gb|AAA58114.1|(183)-‐gb|AAA58118.1|(1)
50S ribosomal subunit protein L2 |(183)-‐30S ribosomal subunit protein S10 |(1)
KVEADCR(1)-‐MQNQR(1) 9 6.83E-‐08 NO
33 gb|AAA58114.1|(207)-‐gb|AAC73134.1|(19)
50S ribosomal subunit protein L2 |(207)-‐30S ribosomal subunit protein S20 |(19)
VLGKAGAAR(4)-‐KHNASR(1) 1 8.95E-‐09 NO
34 gb|AAA58114.1|(207)-‐gb|AAA97170.1|(6)
50S ribosomal subunit protein L2 |(207)-‐ORF_o111 |(6)
VLGKAGAAR(4)-‐NKWLR(2) 1 1.22E-‐05 NO
35 gb|AAA58114.1|(71)-‐gb|AAC75376.1|(84)
50S ribosomal subunit protein L2 |(71)-‐acetyl-‐CoA carboxylase, beta
(carboxyltransferase) subunit |(84)
NKDGIPAVVER(2)-‐DVLKFR(4)
1 9.01E-‐06 NO
36 gb|AAA57987.1|(85)-‐gb|AAA58118.1|(1)
50S ribosomal subunit protein L21 |(85)-‐30S ribosomal subunit protein S10 |(1)
KQQGHR(1)-‐MQNQR(1) 2 3.99E-‐07 NO
37 gb|AAA58112.1|(16)-‐gb|AAC73134.1|(19)
50S ribosomal subunit protein L22 |(16)-‐30S ribosomal subunit protein S20 |(19)
SSAQKVR(5)-‐KHNASR(1) 1 1.58E-‐08 NO
38 gb|AAA58112.1|(16)-‐gb|AAA57987.1|(85)
50S ribosomal subunit protein L22 |(16)-‐50S ribosomal subunit protein L21 |(85)
SSAQKVR(5)-‐KQQGHR(1) 2 1.60E-‐08 YES NO
39 gb|AAA58112.1|(16)-‐gb|AAA96986.1|(277)
50S ribosomal subunit protein L22 |(16)-‐ORF_f510; fused D-‐allose transporter
subunits of ABC superfamily: ATP-‐binding components|(277)
SSAQKVR(5)-‐KKVR(1) 1 3.62E-‐05 YES NO
40 gb|AAA58112.1|(16)-‐gb|AAA96986.1|(278)
50S ribosomal subunit protein L22 |(16)-‐ORF_f510; fused D-‐allose transporter
subunits of ABC superfamily: ATP-‐binding components|(278)
SSAQKVR(5)-‐KKVR(2) 1 3.26E-‐07 YES NO
41 gb|AAA57986.1|(19)-‐gb|AAC74308.1|(459)
50S ribosomal subunit protein L27 |(19)-‐nitrate reductase 1, alpha subunit |(459)
DSEAKR(5)-‐LPVKR(4) 4 2.50E-‐12 NO
42 gb|AAC76661.1|(10)-‐gb|AAC76752.1|(231)
50S ribosomal subunit protein L28 |(10)-‐L-‐glutamine:D-‐fructose-‐6-‐phosphate
aminotransferase |(231) VCQVTGKR(7)-‐TGAEVKR(6) 1 1.02E-‐08 NO
43 gb|AAC76661.1|(26)-‐gb|AAA58114.1|(207)
50S ribosomal subunit protein L28 |(26)-‐50S ribosomal subunit protein L2 |(207)
SHALNATKR(8)-‐VLGKAGAAR(4)
1 4.99E-‐09 YES NO
44 gb|AAA58109.1|(9)-‐gb|AAB18601.1|(132)
50S ribosomal subunit protein L29 |(9)-‐rfaZ; lipopolysaccharide core biosynthesis
protein |(132)
ELREKSVEELNTELLNLLR(5)-‐IKFNILR(2)
1 5.69E-‐05 NO
45 gb|AAA58117.1|(7)-‐gb|AAC75655.1|(2)
50S ribosomal subunit protein L3 |(7)-‐50S ribosomal subunit protein L19 |(2)
MIGLVGKK(7)-‐SNIIK(1) 2 3.82E-‐06 YES YES 13.6 Å
46 gb|AAA58096.1|(32)-‐gb|AAC77492.1|(298)
50S ribosomal subunit protein L36 |(32)-‐threonine deaminase |(298)
VICSAEPKHK(8)-‐KYIALHNIR(1)
1 9.72E-‐05 NO
47 gb|AAC43084.1|(82)-‐gb|AAC74011.1|(308)
50S ribosomal subunit protein L7/L12 |(82)-‐murein L,D-‐transpeptidase |(308)
GATGLGLKEAKDLVESAPAALK(8)-‐SKPAPAVR(2)
1 2.81E-‐03 NO
48 gb|AAC43084.1|(82)-‐gb|AAA58134.1|(13)
50S ribosomal subunit protein L7/L12 |(82)-‐ORF_f64; bacterioferritin-‐associated
ferredoxin|(13)
GATGLGLKEAKDLVESAPAALK(8)-‐KIRQAVR(1)
2 6.50E-‐04 NO
49 gb|AAC43084.1|(85)-‐gb|AAA58134.1|(13)
50S ribosomal subunit protein L7/L12 |(85)-‐ORF_f64; bacterioferritin-‐associated
ferredoxin|(13)
GATGLGLKEAKDLVESAPAALK(11)-‐KIRQAVR(1)
1 1.73E-‐03 NO
50 gb|AAA97099.1|(42)-‐gb|AAC76661.1|(44)
50S ribosomal subunit protein L9 |(42)-‐50S ribosomal subunit protein L28 |(44)
KNIEFFEAR(1)-‐FWVESEKR(7)
6 1.51E-‐12 YES YES 21.2 Å
51 gb|AAA97099.1|(89)-‐gb|AAA58114.1|(183)
50S ribosomal subunit protein L9 |(89)-‐50S ribosomal subunit protein L2 |(183)
AGDEGKLFGSIGTR(6)-‐KVEADCR(1)
2 1.66E-‐10 YES YES 22.7 Å
52 gb|AAC73576.1|(141)-‐ref|NP_415483.2|(1)
adenylate kinase |(141)-‐methylglyoxal synthase |(1)
FNPPKVEGKDDVTGEELTTR(5)-‐MELTTR(1)
4 2.82E-‐06 NO NO-‐Y2H
53 gb|AAC73576.1|(145)-‐gb|AAA57978.1|(2)
adenylate kinase |(145)-‐dihydropteroate synthase |(2)
VEGKDDVTGEELTTR(4)-‐LRGFFLSIHTR(1)
1 3.00E-‐05 NO
54 gb|AAC73576.1|(145)-‐ref|NP_415483.2|(1)
adenylate kinase |(145)-‐methylglyoxal synthase |(1)
FNPPKVEGKDDVTGEELTTR(9)-‐MELTTR(1)
1 1.91E-‐06 NO
55 gb|AAC73341.1|(389)-‐gb|AAC73134.1|(19)
aminoacyl-‐histidine dipeptidase (peptidase D) |(389)-‐30S ribosomal
subunit protein S20 |(19) LAGAKTEAK(5)-‐KHNASR(1) 1 3.66E-‐05 NO
56 gb|AAC74376.1|(369)-‐gb|AAA58106.1|(97)
antimicrobial peptide transport ABC transporter periplasmic binding protein |(369)-‐50S ribosomal subunit protein L24
|(97)
SREQLKSLGLENLTLK(6)-‐FFKSNSETIK(3)
1 2.26E-‐03 NO
57 gb|AAA97142.1|(179)-‐gb|AAA97141.1|(137)
aspartate carbamoyltransferase catalytic subunit |(179)-aspartate carbamoyltransferase regulatory subunit |(137)
TVHSLTQALAKFDGNR(11)-‐ANDIALKCK(7)
7 5.06E-‐20 YES YES 20.6 Å
58 gb|AAC75632.1|(62)-‐gb|AAC73989.1|(591)
autonomous glycyl radical cofactor |(62)-‐pyruvate formate lyase I |(591)
EVPVEVKPEVR(7)-‐IQKLHTYR(3)
3 5.56E-‐14 NO Yes-‐TAP
59 gb|AAC75632.1|(92)-‐gb|AAC75449.1|(81)
autonomous glycyl radical cofactor |(92)-‐conserved protein |(81)
HPEKYPQLTIR(4)-‐KAYERGYR(1)
1 1.35E-‐04 NO
60 gb|AAC73144.1|(366)-‐ref|NP_415790.1|(186)
carbamoyl-‐phosphate synthase large subunit|(366)-‐DNA topoisomerase I,
omega subunit|(186)
FNFEKFAGANDR(5)-‐KIAR(1)
*KIAR can be mapped to 3 other proteins with no evidence of binding to
AAC73144.1
1 6.36E-‐03 NO Yes-‐TAP
61 gb|AAC73144.1|(504)-‐ ref|NP_415790.1|(186)
carbamoyl-‐phosphate synthase large subunit |(504)-‐DNA topoisomerase I,
omega subunit|(186)
LAKLAGVR(3)-‐KIAR(1)
*same as above 3 5.32E-‐05 NO
Yes-‐TAP
62 gb|AAC73144.1|(940)-‐ref|NP_417006.2|(41)/ gb|AAC76893.1|(122)
carbamoyl-‐phosphate synthase large subunit |(940)-‐GTPase; multicopy
suppressor of ftsJ |(41)/ sensory histidine kinase in two-‐component regulatory
system with CpxR |(122)
AQLGSNSTMKK(10)-‐KYGR(1)
1 5.90E-‐07 NO
63 gb|AAA58092.1|(1)-‐gb|AAC73297.1|(437)
CG Site no. 234; RNA polymerase alpha subunit |(1)-‐lysine decarboxylase 2,
constitutive |(437)
MQGSVTEFLKPR(1)-‐KEVQR(1)
1 9.95E-‐04 NO
64 gb|AAB18435.1|(3)-‐gb|AAC75642.1|(2)
CG Site no. 551; Leu/Ile/Val-‐binding protein |(3)-‐conserved protein, UPF0124
family |(2)
LKNNITTHVITRR(2)-‐SKLIVPQWPQPK(1)
1 4.61E-‐06 NO
65 gb|AAA58136.1|(10)-‐gb|AAC75629.1|(411)
CG Site No. 61; translation elongation factor EF-‐Tu|(10)-‐ATP-‐dependent RNA
helicase |(411)
TKPHVNVGTIGHVDHGK(2)-‐EKEK(2)
2 1.21E-‐04 NO Yes-‐TAP
66 gb|AAA58136.1|(10)-‐gb|AAC73699.1|(2)
CG Site No. 61; translation elongation factor EF-‐Tu|(10)-‐carbon starvation
protein |(2)
TKPHVNVGTIGHVDHGK(2)-‐NKSGK(1)
1 9.92E-‐03 NO
67 gb|AAA58136.1|(57)-‐gb|AAB18579.1|(10)
CG Site No. 61; translation elongation factor EF-Tu |(57)-alternate gene name yibL |(10)
AFDQIDNAPEEKAR(12)-‐NEIKRLSDR(4)
2 4.14E-‐05 NO
68 gb|AAA58136.1|(57)-‐gb|AAB18493.1|(4) / gb|AAC76542.1|(4)
CG Site No. 61; translation elongation factor EF-‐Tu|(57)-‐GAD alpha protein |(4)/
glutamate decarboxylase A, PLP-‐dependent|(4)
AFDQIDNAPEEKAR(12)-‐DQKLLTDFR(3)
1 2.83E-‐05 NO
69 gb|AAA58136.1|(57)-‐ref|YP_025307.1|(1)
CG Site No. 61; translation elongation factor EF-‐Tu| (57)-‐multidrug efflux system
transporter |(1)
AFDQIDNAPEEKAR(12)-‐MQKYISEAR(1)
2 3.27E-‐06 NO Yes-‐Y2H
70 gb|AAA58136.1|(57)-‐gb|AAC74522.1|(3)
CG Site No. 61; translation elongation factor EF-‐Tu| (57)-‐polyhydroxybutyrate
(PHB) synthase, ABC transporter periplasmic binding protein homolog |(3)
AFDQIDNAPEEKAR(12)-‐MSKTFAR(3)
2 4.30E-‐06 NO Yes-‐Y2H
71 gb|AAA58136.1|(57)-‐ref|YP_026243.1|(62)
CG Site No. 61; translation elongation factor EF-Tu |(57)-predicted von Willebrand factor containing protein |(62)
AFDQIDNAPEEKAR(12)-‐SRLKDAR(4)
2 6.84E-‐07 NO Yes-‐Y2H
72 gb|AAA58137.1|(370)-gb|AAC74021.1|(207)
CG Site No. 732; alternate name far; elongation factor EF-G |(370)-alkanesulfonate monooxygenase, FMNH(2)-dependent |(207)
IVQMHANKR(8)-‐EKIEQVR(2)
1 1.78E-‐05 NO
73 gb|AAA58137.1|(423)-‐gb|AAA58099.1|(6)
CG Site No. 732; alternate name far; elongation factor EF-‐G|(423)-‐50S ribosomal subunit protein L30 |(6)
MEFPEPVISIAVEPKTKADQEK(15)-‐TIKITQTRSAIGR(3)
4 3.37E-‐09 YES NO
74 gb|AAA58137.1|(423)-‐gb|AAC73558.1|(25)
CG Site No. 732; alternate name far; elongation factor EF-‐G|(423)-‐conserved
protein, DUF1428 family |(25)
MEFPEPVISIAVEPKTK(15)-‐EMAAKAAPLFKEFGALR(5)
1 2.06E-‐06 NO
75 gb|AAA58165.1|(2)-‐ref|NP_416491.2|(2)
CG Site no. 893; siroheme synthase |(2)-predicted multidrug exporter, MATE family |(2)
DHLPIFCQLR(1)-‐WFHFLQLR(1)
1 9.48E-‐05 NO
76 gb|AAC75673.1|(141)-‐gb|AAC73702.1|(118)
CP4-‐57 prophage; predicted protein |(141)-‐conserved protein |(118)
LKELLTTNPKAPVR(2)-‐HEIGKGSSSLKLR(11)
1 1.05E-‐05 NO
77 gb|AAC75071.2|(328)-gb|AAC75496.1|(17)
D-alanyl-D-alanine carboxypeptidase (penicillin-binding protein 6b) |(328)-CPZ-55 prophage; predicted protein |(17)
AEIPHIKAKYTLDGK(7)-INTNKSPR(5)
1 1.79E-04 NO
78 gb|AAC74146.1|(9)-‐
ref|YP_026161.1|(120)
dihydro-‐orotase |(9)-‐RNA chaperone, probable regulator of ProP translation
|(120)
TAPSQVLKIRR(8)-‐AEQQAKK(6)
2 6.16E-‐07 NO
79 gb|AAC43085.1|(236)-‐gb|AAC75621.1|(30)
DNA-‐directed RNA polymerase, beta-‐subunit |(236)-‐leader peptidase (signal
peptidase I) |(30)
DNKLQMELVPER(3)-‐FFFAPKR(6)
1 4.36E-‐04 NO
80 gb|AAC43086.1|(74)-‐gb|AAC75459.1|(412)
DNA-‐directed RNA polymerase, beta'-‐subunit |(74)-‐xanthosine transporter
|(412) DYECLCGKYK(8)-‐IKHR(2) 3 5.81E-‐10 NO
81 gb|AAC76962.1|(781)-‐gb|AAC76961.1|(163)
DNA-‐directed RNA polymerase, beta prime subunit|(781)-‐RNA polymerase,
beta subunit|(163)
KGLADTALK(1)-‐GKTHSSGK(2)
1 1.61E-‐07 YES NO Yes-‐TAP
82 gb|AAC75562.1|(356)-‐gb|AAA58099.1|(19)
exonuclease VII, large subunit |(356)-‐50S ribosomal subunit protein L30 |(19)
LNQQNPQPKIHRAQTR(9)-‐LPKHK(3)
1 2.94E-‐03 NO
83 gb|AAC74462.1|(70)-‐gb|AAA58098.1|(129)
fermentative D-‐lactate dehydrogenase, NAD-‐dependent |(70)-‐50S ribosomal
subunit protein L15 |(129) HGVKYIALR(4)-‐VTKGAR(3) 2 2.76E-‐05 NO
84 gb|AAC73552.1|(380)-‐gb|AAA69082.1|(1)
fused predicted multidrug transporter subunits of ABC superfamily: ATP-‐binding
components |(380)-‐ORF_f76 |(1)
NFVALVGHTGSGKSTLASLLMGYYPLTEGEIR(13)-‐
MHFAQR(1) 1 5.22E-‐03 NO
85 gb|AAC75710.1|(459)-‐gb|AAC76725.1|(388)
gamma-‐aminobutyrate transporter |(459)-‐chromosomal replication initiator protein DnaA, DNA-‐binding transcriptional
dual regulator |(388)
LVLWQKTPVHNTR(6)-‐TVAEYYKIK(7)
1 5.54E-‐05 NO
86 gb|AAC74666.1|(2)-‐
ref|NP_418673.4|(142)/ gb|AAA97148.1|(145)
Global DNA-‐binding transcriptional repressor; autorepressor; required for anaerobic growth on glucosamine |(2)-‐biofilm modulator regulated by toxins
|(142)/ ORF_o153b|(145)
VAENQPGHIDQIKQTNAGAVYR(1)-‐KAVVK(1)
1 2.02E-‐04 NO
87 gb|AAB18608.1|(57)-‐gb|AAA58136.1|(57)
glucosyltransferase I |(57)-‐CG Site No. 61; translation elongation factor Tu;
translation elongation factor Tu |(57)
AFELIQVPVKSHTNHGR(10)-‐AFDQIDNAPEEKAR(12)
1 2.94E-‐06 NO
88 gb|AAC74849.1|(184)-‐gb|AAC75139.1|(2)
glyceraldehyde-‐3-‐phosphate dehydrogenase A |(184)-‐sensory histidine
kinase in two-‐component regulatory system with BaeR |(2)
VINDNFGIIEGLMTTVHATTATQKTVDGPSHK(24)-‐
KFWR(1) 9 2.49E-‐07 NO
89 gb|AAC74849.1|(61)-‐gb|AAA58005.1|(91)/ gb|AAC76235.1|(91)
glyceraldehyde-‐3-‐phosphate dehydrogenase A |(61)-‐ORF_o95 |(91)/ ribosome hibernation promoting factor HPF; stabilizes 70S dimers (100S) |(91)
FDGTVEVKDGHLIVNGKK(8)-‐HKDKLK(4)
1 1.86E-‐04 NO
90 gb|AAB03058.1|(488)-ref|NP_416294.4|(293)/ref|YP_026224.1|(1003)/gb|AAB18570.1|(1003)/gb|AAC73794.1|(1003)/gb|AAC76675.1|(200)
glycerol kinase |(488)-conserved protein |(293)/ RshB |(1003)/ rhsA |(1003)/ RshC |(1003)/ tRNA mG18-2'-O-methyltransferase, SAM-dependent |(200)
YAGWKK(5)-VAKR(3) 1 7.51E-03 NO
91 gb|AAA97042.1|(117)-‐ref|NP_416801.2|(256)
GroEL protein |(117)-‐predicted inner membrane protein |(256)
AVAAGMNPMDLKR(12)-‐KNPLLSR(1)
2 2.30E-‐10 NO NO-‐Y2H
92 gb|AAA97042.1|(277)-‐gb|AAC74356.1|(659)
GroEL protein |(277)-‐DNA topoisomerase I, omega subunit |(659)
VAAVKAPGFGDR(5)-‐AKRR(2)
1 1.21E-‐05 NO
93 gb|AAA97042.1|(364)-‐gb|AAC75438.1|(1)
GroEL protein |(364)-‐valine-‐pyruvate aminotransferase 3 |(1)
QQIEEATSDYDREKLQER(14)-‐MADTRPER(1)
1 3.54E-‐03 NO
94 gb|AAA97042.1|(51)-‐gb|AAC73887.1|(8)
GroEL protein |(51)-‐predicted family 3 glycosyltransferase |(8)
SFGAPTITKDGVSVAR(9)-‐IIKEIGR(3)
1 5.15E-‐05 NO
95 gb|AAC75369.1|(234)-‐gb|AAA97125.1|(322)
histidine/lysine/arginine/ornithine transporter subunit |(234)-‐ORF_o417a;
ATP-‐binding component of ABC transporter superfamily |(322)
EALNKAFAEMR(5)-‐GKPQNLR(2)
1 24 1.30E-‐12 NO
96 gb|AAA97304.1|(53)/ gb|AAC73117.1|(190)-‐gb|AAC73227.1|(106)
hypothetical protein 126 of GenBank Accession Number D10483 (ECO110K)
|(53)/ Peroxide resistance protein, lowers intracellular iron|(190)-‐lipoamide
dehydrogenase, E3 component is part of three enzyme complexes |(106)
KLNAEIIKPVFLDEK(8)-‐VINQLTGGLAGMAKGR(14)
1 1.05E-‐03 NO
97 gb|AAC74001.1|(189)-‐gb|AAC74635.1|(1)
lipid A 4'kinase |(189)-‐Qin prophage; small toxic polypeptide |(1)
AGRLKSVDAVIVNGGVPR(5)-‐MKQQK(1)
1 6.28E-‐03 NO
98 gb|AAC73200.1|(83)-‐gb|AAC75219.1|(8)
Lipid II flippase; integral membrane protein involved in stabilizing FtsZ ring during cell division |(83)-inner membrane protein, UPF0324 family |(8)
LTNDPFFFAKR(10)-‐TNITLQKQHR(7)
2 7.82E-‐06 NO Yes-‐Y2H
99 gb|AAA58215.1|(1)-‐
ref|NP_415376.4|(120)
maltodextrin phosphorylase |(1)-‐putrescine transporter subunit: ATP-‐
binding component of ABC superfamily |(120)
MSQPIFNDKQFQEALSR(1)-‐QDKLPKAEIASR(3)
1 2.77E-‐05 NO
100 gb|AAA58215.1|(254)-‐gb|AAC74429.1|(25)
maltodextrin phosphorylase |(254)-‐Rac prophage; predicted protein |(25)
AEQQGINAEKLTKVLYPNDNHTAGK(13)-‐SKLTK(2)
1 3.27E-‐05 NO
101 gb|AAC75211.1|(81)-‐gb|AAC75438.1|(2)
methyl-‐galactoside transporter subunit |(81)-‐valine-‐pyruvate aminotransferase 3
|(2)
QNDQIDVLLAKGVK(11)-‐ADTR(1)
1 4.50E-‐04 YES NO
102 gb|AAC74310.1|(173)-‐gb|AAC43066.1|(188)
molybdenum-‐cofactor-‐assembly chaperone subunit (delta subunit) of
nitrate reductase 1 |(173)-‐argininosuccinate lyase |(188)
LANTAIDSDKVAEK(10)-‐LQDALKR(6)
1 1.69E-‐06 NO
103 gb|AAC74925.1|(323)-‐gb|AAC73700.1|(361)
myristoyl-‐acyl carrier protein (ACP)-‐dependent acyltransferase |(323)-‐predicted oxidoreductase |(361)
KDLYPIK(7)-‐KVESFKA(6) 1 4.77E-‐05 NO
104 gb|AAA58162.1|(90)-‐gb|AAC75024.1|(138)
NADH-nitrate oxidoreductase apoprotein |(90)-conserved inner membrane protein |(138)
AITINRQEKVIHSSAGR(9)-‐LAAQDPLKFEK(8)
2 2.50E-‐05 NO
105 gb|AAC43111.1|(66)-‐gb|AAA96987.1|(183)
ORF_f728; ankyrin repeat protein |(66)-‐ORF_f311; D-‐allose transporter
subunit|(183)
HIFSNKDFVIK(6)-‐NGATEAFKK(8)
1 9.01E-‐03 NO
106 gb|AAC43111.1|(199)-‐gb|AAA69243.1|(791)
ORF_f728; ankyrin repeat protein |(199)-‐DNA mismatch repair protein |(791)
EALHDSLKR(8)-‐QKLR(1) 1 3.95E-‐05 NO
107 gb|AAC43111.1|(199)-gb|AAA69113.1|(2)/gb|AAC76245.1|(376)/ref|NP_415636.1|(16)/ref|NP_417354.1|(763)
ORF_f728; ankyrin repeat protein |(199)-ORF_o252 |(2)/ glutamate synthase, 4Fe-4S protein, small subunit |(376)/ lipoprotein-releasing system transmembrane protein |(16)/ predicted oxidoreductase, Fe-S subunit |(763)
EALHDSLKR(8)-GRRR(1) 1 5.18E-10 NO
108 gb|AAC43111.1|(199)-gb|AAA58124.1|(155)/ref|NP_417786.1|(155)/gb|AAB18034.1|(191)/gb|AAC73410.1|(191)
ORF_f728; ankyrin repeat protein |(199)-ORF_o398 |(155)/ general secretory pathway component, cryptic |(155)/ hypothetical protein |(191)/ predicted electron transport protein with ferredoxin-like domain |(191)
EALHDSLKR(8)-QKIR(2) 1 1.00E-03 NO
109 gb|AAC74697.1|(9)-‐gb|AAA58115.1|(1)
oriC-‐binding complex H-‐NS/Cnu; binds 26 bp cnb site; also forms a complex with StpA |(9)-‐50S ribosomal subunit protein
L23 |(1)
TVQDYLLKFR(8)-‐MIREER(1) 1 9.21E-‐06 NO
110 gb|AAA69093.1|(14)-‐gb|AAC74589.1|(2)
phosphoglycerate kinase |(14)-‐autoinducer 2-‐binding protein |(2)
MTDLDLAGKR(9)-‐TLHRFKK(1)
3 8.97E-‐07 NO NO-‐Y2H
111 gb|AAA69093.1|(14)-‐gb|AAC73248.1|(94)
phosphoglycerate kinase |(14)-‐predicted fimbrial-‐like adhesin protein |(94)
MTDLDLAGKR(9)-‐KAQIKLTK(1)
1 6.97E-‐05 NO
112 ref|NP_416391.4|(96)-‐gb|AAA58068.1|(1)/ gb|AAC76296.1|(1)
predicted protein |(96)-‐ORF_f220 |(1)/ DNA-‐binding transcriptional regulator|(1)
KSQRAWLDFR(1)-‐MAKRTK(1)
1 1.45E-‐03 NO
113 gb|AAC74788.1|(77)-‐gb|AAC73245.1|(16)
protein chain initiation factor IF-‐3 |(77)-‐3-‐methyl-‐2-‐oxobutanoate
hydroxymethyltransferase |(16) FLYEKSK(5)-‐QEKK(3) 1 5.50E-‐05 NO
114 gb|AAA97280.1|(57)-‐ gb|AAC76634.1|(2)
purine-‐nucleoside phosphorylase |(57)-‐glutaredoxin 3 |(2)
GRKISVMGHGMGIPSCSIYTK(3)-‐ANVEIYTK(1)
2 4.64E-‐04 NO
115 ref|NP_416518.2|(2)-‐gb|AAC73708.1|(85)
putrescine importer, low affinity |(2)-‐universal stress protein UP12 |(85)
SHNVTPNTSR(1)-‐IKQHVR(2)
2 1.27E-‐13 NO Yes-‐Y2H
116 gb|AAC74647.1|(35)-‐ref|NP_415638.2|(2)/ gb|AAC75629.1|(412)
Qin prophage; cell division inhibition protein |(35)-‐deacetylase of acs and cheY,
regulates chemotaxis |(2)/ ATP-‐dependent RNA helicase|(412)
RKQER(2)-‐EKPR(1) 3 1.20E-‐05 NO
117 gb|AAA97208.1|(90)-‐gb|AAC73235.1|(685)
recombinase involved in phase variation |(90)-‐glucose dehydrogenase |(685)
EVQALKNWLSIR(6)-‐TNEVVWKK(7)
1 1.13E-‐06 NO
118 gb|AAC73823.1|(241)-‐gb|AAC73822.1|(1)
succinyl-‐CoA synthetase, NAD(P)-‐binding, alpha subunit |(241)-‐succinyl-‐CoA synthetase, beta subunit |(1)
EHVTKPVVGYIAGVTAPKGK(18)-‐MNLHEYQAK(1)
2 5.12E-‐14 YES YES 15.4 Å
119 gb|AAC74198.1|(340)-‐gb|AAA58137.1|(370)
transcription-‐repair coupling factor |(340)-‐CG Site No. 732; alternate name
far; elongation factor EF-‐G|(370)
VQLKTEHLPTK(4)-‐IVQMHANKR(8)
1 2.00E-‐06 NO
120 ref|YP_026188.1|(591)-‐gb|AAC74745.1|(48)/ gb|AAB47951.1|(48)
transketolase 1, thiamin-‐binding |(591)-‐predicted protein |(48)/ hypothetical
protein |(48)
VVSMPSTDAFDKQDAAYR(12)-‐TALANKRIQR(6)
1 1.46E-‐06 NO
121 gb|AAB18523.1|(1)-‐gb|AAC73164.1|(394)
unnamed protein product |(1)-‐peptidyl-‐prolyl cis-‐trans isomerase (PPIase) |(394)
MLSPVCPGFVCMR(1)-‐TDAAQKDR(6)
4 2.62E-‐06 NO
122 gb|AAC76834.1|(40)-‐gb|AAA97161.1|(2)
uridine phosphorylase |(40)-‐ORF_f332 |(2); DNA-‐binding transcriptional repressor, 5-‐gluconate-‐binding(2)
IAALMDKPVK(7)-‐RNHR(1) 2 3.99E-‐06 NO
123 ref|NP_416570.2|(151)/gb|AAC73670.1|(46)/gb|AAC73678.1|(264)-gb|AAC73565.1|(3)
uridine/cytidine kinase |(151)/ bacteriophage N4 receptor, inner membrane subunit |(46)/ mechanosensitive channel protein, miniconductance |(264)-multidrug efflux system |(3)
RIKR(3)-NKNR(2) 1 2.90E-06 NO
124 gb|AAA57977.1|(392)-‐gb|AAB18536.1|(674)
yhbF; phosphoglucosamine mutase |(392)-‐glycine-‐tRNA synthetase, beta
subunit |(674)
YTAGSGDPLEHESVKAVTAEVEAALGNR(15)-‐LTMLEKLR(6)
1 7.55E-‐05 NO
Intra-‐molecular Cross-‐links
125 gb|AAC75568.1|(1)-‐gb|AAC75568.1|(11)
1-‐hydroxy-‐2-‐methyl-‐2-‐(E)-‐butenyl 4-‐diphosphate synthase |(1)-‐1-‐hydroxy-‐2-‐methyl-‐2-‐(E)-‐butenyl 4-‐diphosphate
synthase |(11)
MHNQAPIQR(1)-‐KSTR(1) 1 3 7.17E-‐06 NO
126 gb|AAC73277.1|(254)-‐gb|AAC73277.1|(263)
2,3,4,5-‐tetrahydropyridine-‐2-‐carboxylate N-‐succinyltransferase |(254)-‐2,3,4,5-‐tetrahydropyridine-‐2-‐carboxylate N-‐
succinyltransferase |(263)
YSLYCAVIVKK(10)-‐GKVGINELLR(2)
3 2.23E-‐22 NO
127 gb|AAC73277.1|(263)-‐gb|AAC73277.1|(259)
2,3,4,5-‐tetrahydropyridine-‐2-‐carboxylate N-‐succinyltransferase |(263)-‐2,3,4,5-‐tetrahydropyridine-‐2-‐carboxylate N-‐
succinyltransferase |(259)
GKVGINELLR(2)-‐VDAKTR(4)
6 4 3.23E-‐09 NO
128 gb|AAC73820.1|(54)-‐gb|AAC73820.1|(71)
2-‐oxoglutarate decarboxylase, thiamin-‐requiring |(54)-‐2-‐oxoglutarate
decarboxylase, thiamin-‐requiring |(71)
STFQQLPGTGVKPDQFHSQTR(12)-‐LAKDASR(3)
3 7.49E-‐05 YES NO
129 gb|AAC73997.1|(260)-‐gb|AAC73997.1|(247)
30S ribosomal subunit protein S1 |(260)-‐30S ribosomal subunit protein S1 |(247)
VSLGLKQLGEDPWVAIAK(6)-‐VLKFDR(3)
2 1.49E-‐05 NO
130 gb|AAC73997.1|(279)-‐gb|AAC73997.1|(347)
30S ribosomal subunit protein S1 |(279)-‐30S ribosomal subunit protein S1 |(347)
YPEGTKLTGR(6)-‐ISLGLKQCK(6)
3 3.05E-‐12 YES NO
131 gb|AAA58118.1|(59)-‐gb|AAA58118.1|(11)
30S ribosomal subunit protein S10 |(59)-‐30S ribosomal subunit protein S10 |(11)
FTVLISPHVNKDAR(11)-‐LKAFDHR(2)
6 1.96E-‐22 YES NO
132 gb|AAA58118.1|(82)-‐gb|AAA58118.1|(1)
30S ribosomal subunit protein S10 |(82)-‐30S ribosomal subunit protein S10 |(1)
LVDIVEPTEKTVDALMR(10)-‐MQNQR(1)
3 7.54E-‐07 YES NO
133 gb|AAA58118.1|(82)-‐gb|AAA58118.1|(30)
30S ribosomal subunit protein S10 |(82)-‐30S ribosomal subunit protein S10 |(30)
LVDIVEPTEKTVDALMR(10)-‐LIDQATAEIVETAKR(14)
1 6.67E-‐10 YES YES 10.5 Å
134 gb|AAA58139.1|(108)-‐gb|AAA58139.1|(120)
30S ribosomal subunit protein S12 |(108)-‐30S ribosomal subunit protein S12 |(120)
GALDCSGVKDR(9)-‐YGVKRPK(4)
1 2.95E-‐11 YES YES 9.6 Å
135 gb|AAA58139.1|(108)-‐gb|AAA58139.1|(51)
30S ribosomal subunit protein S12 |(108)-‐30S ribosomal subunit protein S12 |(51)
GALDCSGVKDR(9)-‐KVCR(1) 4 8.06E-‐07 YES YES 15.9 Å
136 gb|AAA58139.1|(108)-‐ref|NP_417045.4|(269)
30S ribosomal subunit protein S12 |(108)-‐predicted DNA-‐binding transcriptional
regulator |(269)
GALDCSGVKDR(9)-‐KQAR(1)
13 4.69E-‐07 YES YES 6.1 Å
137 gb|AAA58139.1|(51)-‐ref|NP_417045.4|(269)
30S ribosomal subunit protein S12 |(51)-‐predicted DNA-‐binding transcriptional
regulator |(269) KVCR(1)-‐KQAR(1) 3 3.95E-‐06 YES YES 21.6 Å
138 gb|AAA58093.1|(103)-‐ref|NP_416896.4|(571)
30S ribosomal subunit protein S13 |(103)-‐predicted diguanylate cyclase |(571)
TKTNAR(2)-‐KGPR(1) 1 3 3.69E-‐06 YES YES 12.2 Å
139 gb|AAA58093.1|(110)-‐gb|AAA58093.1|(103)
30S ribosomal subunit protein S13 |(110)-‐30S ribosomal subunit protein S13 |(103)
TRKGPR(3)-‐TKTNAR(2) 5 2.05E-‐06 YES YES 12.2 Å
140 gb|AAA58093.1|(78)-‐gb|AAA58093.1|(103)
30S ribosomal subunit protein S13 |(78)-‐30S ribosomal subunit protein S13 |(103)
EISMSIKR(7)-‐TKTNAR(2) 5 2.83E-‐07 YES YES 22.5 Å
141 gb|AAA58093.1|(78)-‐ref|NP_416896.4|(571)
30S ribosomal subunit protein S13 |(78)-‐predicted diguanylate cyclase |(571)
EISMSIKR(7)-‐KGPR(1) 1 1.13E-‐07 YES YES 19.0 Å
142 gb|AAA97098.1|(50)-‐gb|AAA97098.1|(9)
30S ribosomal subunit protein S18 |(50)-‐30S ribosomal subunit protein S18 |(9)
AKYQR(2)-‐KFCR(1) 3 2.55E-‐11 YES NO
143 gb|AAA58113.1|(29)-‐gb|AAA58113.1|(18)
30S ribosomal subunit protein S19 |(29)-‐30S ribosomal subunit protein S19 |(18)
AVESGDKKPLR(8)-‐KVEK(1) 1 2.58E-‐04 YES YES 14.4 Å
144 gb|AAC73280.1|(11)-‐gb|AAC73280.1|(2)
30S ribosomal subunit protein S2 |(11)-‐30S ribosomal subunit protein S2 |(2)
DMLKAGVHFGHQTR(4)-‐ATVSMR(1)
6 1.14E-‐12 YES NO
145 gb|AAC73280.1|(115)-‐gb|AAC73280.1|(66)
30S ribosomal subunit protein S2 |(115)-‐30S ribosomal subunit protein S2 |(66)
LKDLETQSQDGTFDK(2)-‐KGKILFVGTK(3)
1 2.15E-‐03 YES YES 19.8 Å
146 gb|AAC73280.1|(115)-‐ gb|AAC73280.1|(112)
30S ribosomal subunit protein S2 |(115)-‐30S ribosomal subunit protein S2|(112)
LKDLETQSQDGTFDK(2)-‐QSIKR(4)
1 8.91E-‐05 YES YES 5.4 Å
147 gb|AAC73134.1|(16)-‐gb|AAC73134.1|(19)
30S ribosomal subunit protein S20 |(16)-‐30S ribosomal subunit protein S20 |(19)
AIQSEKAR(6)-‐KHNASR(1) 2 20 3.04E-‐08 YES YES 5.0 Å
148 gb|AAC73134.1|(16)-‐gb|AAC73134.1|(19)
30S ribosomal subunit protein S20 |(16)-‐30S ribosomal subunit protein S20 |(19)
RAIQSEKAR(7)-‐KHNASR(1) 5 4.17E-‐09 YES YES 5.0 Å
149 gb|AAC73134.1|(16)-‐gb|AAC73134.1|(19)
30S ribosomal subunit protein S20 |(16)-‐30S ribosomal subunit protein S20 |(19)
AIQSEKAR(6)-‐KHNASRR(1) 6 1.89E-‐07 YES YES 5.0 Å
150 gb|AAC73134.1|(49)-‐gb|AAC73134.1|(34)
30S ribosomal subunit protein S20 |(49)-‐30S ribosomal subunit protein S20 |(34)
AAAQKAFNEMQPIVDR(5)-‐KVYAAIEAGDK(1)
2 7.28E-‐13 YES YES 9.8 Å
151 gb|AAC73134.1|(5)-‐gb|AAC73134.1|(19)
30S ribosomal subunit protein S20 |(5)-‐30S ribosomal subunit protein S20 |(19)
ANIKSAK(4)-‐KHNASR(1) 2 5.44E-‐08 YES YES 23.0 Å
152 gb|AAC73134.1|(64)-‐gb|AAC73134.1|(71)
30S ribosomal subunit protein S20 |(64)-‐30S ribosomal subunit protein S20 |(71)
QAAKGLIHK(4)-‐NKAAR(2) 3 8.11E-‐05 YES YES 8.6 Å
153 gb|AAA58111.1|(108)-‐gb|AAA58111.1|(147)
30S ribosomal subunit protein S3 |(108)-‐30S ribosomal subunit protein S3 |(147)
KPELDAK(1)-‐LGAKGIK(4) 7 5.45E-‐11 YES YES 15.9 Å
154 gb|AAA58111.1|(49)-‐gb|AAA58111.1|(79)
30S ribosomal subunit protein S3 |(49)-‐30S ribosomal subunit protein S3 |(79)
ELAKASVSR(4)-‐PGIVIGKK(7)
3 1.04E-‐08 YES YES 17.3 Å
155 gb|AAA58094.1|(156)-‐ref|NP_418672.4|(5)
30S ribosomal subunit protein S4 |(156)-‐predicted transcriptional regulator |(5)
VKAALELAEQR(2)-‐KQSR(1) 3 4 7.88E-‐06 YES YES 9.3 Å
156 gb|AAA58094.1|(167)-‐gb|AAA58094.1|(183)
30S ribosomal subunit protein S4 |(167)-‐30S ribosomal subunit protein S4 |(183)
EKPTWLEVDAGK(2)-‐MEGTFKR(6)
1 5 1.41E-‐05 YES YES 11.5 Å
157 gb|AAA58094.1|(83)-‐gb|AAA58094.1|(185)
30S ribosomal subunit protein S4 |(83)-‐30S ribosomal subunit protein S4 |(185)
LKGNTGENLLALLEGR(2)-‐KPER(1)
3 27 1.21E-‐07 YES YES 14.3 Å
158 gb|AAA58094.1|(83)-‐gb|AAA58094.1|(77)
30S ribosomal subunit protein S4 |(83)-‐30S ribosomal subunit protein S4 |(77)
LKGNTGENLLALLEGR(2)-‐NYYKEAAR(4)
2 1.66E-‐06 YES YES 11.7 Å
159 gb|AAA58138.1|(137)-‐gb|AAA58138.1|(110)
30S ribosomal subunit protein S7 |(137)-‐30S ribosomal subunit protein S7 |(110)
KREDVHR(1)-‐KRGDK(1) 1 6.75E-‐06 YES YES 16.6 Å
160 gb|AAA58103.1|(41)-‐gb|AAA58103.1|(56)
30S ribosomal subunit protein S8 |(41)-‐30S ribosomal subunit protein S8 |(56)
VAIANVLKEEGFIEDFK(8)-‐VEGDTKPELELTLK(6)
9 1.44E-‐07 YES YES 23.5 Å
161 gb|AAA58032.1|(100)-‐gb|AAA58032.1|(13)
30S ribosomal subunit protein S9 |(100)-‐30S ribosomal subunit protein S9 |(13)
KAGFVTR(1)-‐RKSSAAR(2) 1 1.43E-‐08 YES YES 18.5 Å
162 gb|AAA58032.1|(2)-‐gb|AAA58032.1|(100)
30S ribosomal subunit protein S9 |(2)-‐30S ribosomal subunit protein S9 |(100)
AENQYYGTGR(1)-‐KAGFVTR(1)
12 2.28E-‐12 YES NO
163 gb|AAA58032.1|(2)-‐gb|AAA58032.1|(13)
30S ribosomal subunit protein S9 |(2)-‐30S ribosomal subunit protein S9 |(13)
AENQYYGTGR(1)-‐RKSSAAR(2)
1 4.33E-‐05 YES NO
164 gb|AAA58032.1|(2)-‐gb|AAA58032.1|(13)
30S ribosomal subunit protein S9 |(2)-‐30S ribosomal subunit protein S9 |(13)
AENQYYGTGRR(1)-‐KSSAAR(1)
13 2.16E-‐12 YES NO
165 gb|AAA58032.1|(2)-‐gb|AAA58032.1|(22)
30S ribosomal subunit protein S9 |(2)-‐30S ribosomal subunit protein S9 |(22)
AENQYYGTGR(1)-‐VFIKPGNGK(4)
5 10 7.98E-‐12 YES NO
166 gb|AAC43082.1|(167)-‐gb|AAC43082.1|(54)
50S ribosomal subunit protein L1 |(167)-‐50S ribosomal subunit protein L1 |(54)
YRNDKNGIIHTTIGK(5)-‐KSDQNVR(1)
3 9.04E-‐21 YES YES 7.3 Å
167 gb|AAC43082.1|(167)-‐gb|AAC43082.1|(54)
50S ribosomal subunit protein L1 |(167)-‐50S ribosomal subunit protein L1 |(54)
NDKNGIIHTTIGK(3)-‐KSDQNVR(1)
3 7.37E-‐17 YES YES 7.3 Å
168 gb|AAC43082.1|(205)-‐gb|AAC43082.1|(54)
50S ribosomal subunit protein L1 |(205)-‐50S ribosomal subunit protein L1 |(54)
AKPTQAKGVYIK(7)-‐KSDQNVR(1)
8 3.12E-‐19 YES YES 17.2 Å
169 gb|AAC43082.1|(205)-‐gb|AAC43082.1|(54)
50S ribosomal subunit protein L1 |(205)-‐50S ribosomal subunit protein L1 |(54)
PTQAKGVYIK(5)-‐KSDQNVR(1)
8 9.67E-‐20 YES YES 17.2 Å
170 gb|AAC43082.1|(54)-‐gb|AAC43082.1|(210)
50S ribosomal subunit protein L1 |(54)-‐50S ribosomal subunit protein L1 |(210)
KSDQNVR(1)-‐GVYIKK(5) 1 2.32E-‐04 YES YES 20.0 Å
171 gb|AAC43083.1|(37)-‐gb|AAC43083.1|(105)
50S ribosomal subunit protein L10 |(37)-‐50S ribosomal subunit protein L10 |(105)
GVTVDKMTELR(6)-‐ANAKFEVK(4)
2 5.59E-‐09 NO
172 gb|AAA58091.1|(78)-‐gb|AAA58091.1|(121)
50S ribosomal subunit protein L17 |(78)-‐50S ribosomal subunit protein L17 |(121)
TRDNEIVAKLFNELGPR(9)-‐SEKAEAAAE(3)
1 3.91E-‐06 YES YES 8.8 Å
173 gb|AAC75655.1|(111)-‐gb|AAC75655.1|(106)
50S ribosomal subunit protein L19 |(111)-‐50S ribosomal subunit protein L19 |(106)
IKERLN(2)-‐TGKAAR(3) 5 1.23E-‐06 YES YES 16.8 Å
174 gb|AAC75655.1|(63)-‐gb|AAC75655.1|(106)
50S ribosomal subunit protein L19 |(63)-‐50S ribosomal subunit protein L19 |(106)
KISNGEGVER(1)-‐TGKAAR(3)
2 6 1.06E-‐06 YES YES 13.5 Å
175 gb|AAC75655.1|(87)-‐gb|AAC75655.1|(106)
50S ribosomal subunit protein L19 |(87)-‐50S ribosomal subunit protein L19 |(106)
VFQTHSPVVDSISVKR(15)-‐TGKAAR(3)
3 2.44E-‐07 YES YES 17.0 Å
176 gb|AAC75655.1|(87)-‐gb|AAC75655.1|(63)
50S ribosomal subunit protein L19 |(87)-‐50S ribosomal subunit protein L19 |(63)
VFQTHSPVVDSISVKR(15)-‐KISNGEGVER(1)
4 25 2.77E-‐26 YES YES 18.1 Å
177 gb|AAA58114.1|(125)-‐gb|AAA58114.1|(108)
50S ribosomal subunit protein L2 |(125)-‐50S ribosomal subunit protein L2 |(108)
AGDQIQSGVDAAIKPGNTLPMR(14)-‐YILAPKGLK(6)
3 1.31E-‐06 YES YES 10.7 Å
178 gb|AAA58114.1|(183)-‐gb|AAA58114.1|(207)
50S ribosomal subunit protein L2 |(183)-‐50S ribosomal subunit protein L2 |(207)
KVEADCR(1)-‐VLGKAGAAR(4)
16 1.17E-‐13 YES NO
179 gb|AAA58114.1|(59)-‐gb|AAA58114.1|(183)
50S ribosomal subunit protein L2 |(59)-‐50S ribosomal subunit protein L2 |(183)
HIGGGHKQAYR(7)-‐KVEADCR(1)
8 7.32E-‐09 YES NO
180 gb|AAA58114.1|(71)-‐gb|AAA58114.1|(68)
50S ribosomal subunit protein L2 |(71)-‐50S ribosomal subunit protein L2 |(68)
NKDGIPAVVER(2)-‐IVDFKR(5)
12 5.83E-‐07 YES YES 9.5 Å
181 gb|AAA58112.1|(70)-‐gb|AAA58112.1|(1)
50S ribosomal subunit protein L22 |(70)-‐50S ribosomal subunit protein L22 |(1)
VLESAIANAEHNDGADIDDLKVTK(21)-‐METIAK(1)
3 2.75E-‐10 YES YES 1.4 Å
182 gb|AAA57986.1|(19)-‐gb|AAA57986.1|(24)
50S ribosomal subunit protein L27 |(19)-‐50S ribosomal subunit protein L27 |(24)
DSEAKR(5)-‐LGVKR(4) 7 1.34E-‐09 YES YES 9.9 Å
183 gb|AAC76661.1|(10)-‐gb|AAC76661.1|(26)
50S ribosomal subunit protein L28 |(10)-‐50S ribosomal subunit protein L28 |(26)
VCQVTGKRPVTGNNR(7)-‐SHALNATKR(8)
1 7.38E-‐07 YES YES 17.3 Å
184 gb|AAC76661.1|(10)-‐gb|AAC76661.1|(54)
50S ribosomal subunit protein L28 |(10)-‐50S ribosomal subunit protein L28 |(54)
VCQVTGKRPVTGNNR(7)-‐VSAKGMR(4)
3 2.92E-‐06 YES YES 10.3 Å
185 gb|AAC76661.1|(26)-‐gb|AAC76661.1|(10)
50S ribosomal subunit protein L28 |(26)-‐50S ribosomal subunit protein L28 |(10)
SHALNATKR(8)-‐VCQVTGKR(7)
6 19 2.76E-‐17 YES YES 17.3 Å
186 gb|AAC76661.1|(26)-‐gb|AAC76661.1|(54)
50S ribosomal subunit protein L28 |(26)-‐50S ribosomal subunit protein L28 |(54)
SHALNATKR(8)-‐VSAKGMR(4)
13 3.05E-‐07 YES YES 24.0 Å
187 gb|AAA58117.1|(38)-‐gb|AAA58117.1|(1)
50S ribosomal subunit protein L3 |(38)-‐50S ribosomal subunit protein L3 |(1)
VTQVKDLANDGYR(5)-‐MIGLVGK(1)
5 2.52E-‐10 YES YES 14.2 Å
188 gb|AAC76660.1|(50)-‐gb|AAC76660.1|(33)
50S ribosomal subunit protein L33 |(50)-‐50S ribosomal subunit protein L33 |(33)
QHVIYKEAK(6)-‐TKPEKLELK(5)
5 7.73E-‐11 YES YES 1.1 Å
189 gb|AAA58116.1|(123)-‐gb|AAA58116.1|(137)
50S ribosomal subunit protein L4 |(123)-‐50S ribosomal subunit protein L4 |(137)
LIVVEKFSVEAPK(6)-‐LLAQKLK(5)
1 3.73E-‐04 YES YES 15.6 Å
190 gb|AAA58116.1|(166)-‐gb|AAA58116.1|(63)
50S ribosomal subunit protein L4 |(166)-‐50S ribosomal subunit protein L4 |(63)
NLHKVDVR(4)-‐QKGTGR(2) 1 5 1.50E-‐08 YES YES 17.4 Å
191 gb|AAA58116.1|(74)-‐gb|AAA58116.1|(63)
50S ribosomal subunit protein L4 |(74)-‐50S ribosomal subunit protein L4 |(63)
SGSIKSPIWR(5)-‐QKGTGR(2)
8 1.64E-‐05 YES YES 20.4 Å
192 gb|AAA58105.1|(47)-‐gb|AAA58105.1|(69)
50S ribosomal subunit protein L5 |(47)-‐50S ribosomal subunit protein L5 |(69)
ITLNMGVGEAIADKK(14)-‐PLITKAR(5)
4 5.72E-‐10 YES YES 12.7 Å
193 gb|AAC43084.1|(109)-gb|AAC43084.1|(101)
50S ribosomal subunit protein L7/L12 |(109)-50S ribosomal subunit protein L7/L12 |(101)
KALEEAGAEVEVK(1)-EGVSKDDAEALK(5)
1 8.98E-05 YES NO
194 gb|AAC43084.1|(85)-gb|AAC43084.1|(71)
50S ribosomal subunit protein L7/L12 |(85)-50S ribosomal subunit protein L7/L12 |(71)
EAKDLVESAPAALK(3)-VAVIKAVR(5)
6 9.19E-09 YES NO
195 gb|AAA97099.1|(22)-‐gb|AAA97099.1|(1)
50S ribosomal subunit protein L9 |(22)-‐50S ribosomal subunit protein L9 |(1)
VANLGSLGDQVNVKAGYAR(14)-‐MQVILLDK(1)
2 1.99E-‐15 YES YES 6.0 Å
196 ref|NP_414903.4|(2)-ref|NP_414903.4|(2)
5-aminolevulinate dehydratase (porphobilinogen synthase) |(2)-5-aminolevulinate dehydratase (porphobilinogen synthase) |(2)
TDLIQRPR(1)-TDLIQRPR(1)
3 3 2.23E-11 YES ? 0.0 Å
197 ref|NP_414903.4|(217)-ref|NP_414903.4|(213)
5-aminolevulinate dehydratase (porphobilinogen synthase) |(217)-5-aminolevulinate dehydratase (porphobilinogen synthase) |(213)
KSYQMNPMNR(1)-EAAGSALKGDR(8)
1 1.96E-12 YES YES 12.8 Å
198 ref|NP_418346.2|(48)-ref|NP_418346.2|(207)
6-N-hydroxylaminopurine resistance protein |(48)-6-N-hydroxylaminopurine resistance protein |(207)
KVHGGPDR(1)-TMQKR(4)
3 2.25E-03 YES YES 22.0 Å
199 gb|AAC73296.1|(53)-gb|AAC73296.1|(46)
acetyl-CoA carboxylase, carboxytransferase, alpha subunit |(53)-acetyl-CoA carboxylase, carboxytransferase, alpha subunit |(46)
KIFADLGAWQIAQLAR(1)-EKSVELTR(2)
2 7.59E-07 YES YES 11.0 Å
200 gb|AAC74358.1|(10)-‐gb|AAC74358.1|(2)
aconitate hydratase 1 |(10)-‐aconitate hydratase 1 |(2)
EASKDTLQAK(4)-‐SSTLR(1) 2 1.06E-‐06 NO
201 gb|AAC74178.1|(10)-‐gb|AAC74178.1|(1)
acyl carrier protein (ACP) |(10)-‐acyl carrier protein (ACP) |(1)
KIIGEQLGVK(1)-‐MSTIEER(1)
7 8 1.34E-‐15 YES NO
202 gb|AAC74178.1|(10)-‐gb|AAC74178.1|(2)
acyl carrier protein (ACP) |(10)-‐acyl carrier protein (ACP) |(2)
KIIGEQLGVK(1)-‐STIEER(1) 4 24 3.10E-‐11 YES YES 9.4 Å
203 gb|AAC73576.1|(50)-‐gb|AAC73576.1|(40)
adenylate kinase |(50)-‐adenylate kinase |(40)
QAKDIMDAGK(3)-‐AAVKSGSELGK(4)
1 5.47E-‐05 YES YES 8.9 Å
204 gb|AAC74215.1|(19)-‐gb|AAC74215.1|(83)
adenylosuccinate lyase |(19)-‐adenylosuccinate lyase |(83)
YGDKVSALR(4)-‐IKTIER(2) 4 6 2.86E-‐12 YES YES 15.9 Å
205 gb|AAC73706.1|(27)-gb|AAC73706.1|(7)
alkyl hydroperoxide reductase, C22 subunit |(27)-alkyl hydroperoxide reductase, C22 subunit |(7)
NGEFIEITEKDTEGR(10)-SLINTKIKPFK(6)
1 1.71E-05 NO
206 gb|AAA97297.1|(195)-gb|AAA97297.1|(188)
alternate gene names arcA, fexA, msp, seg, sfrA; CG Site No. 831; alternate gene names arcA, fexA, msp, seg, sfrA; DNA-binding response regulator in two-component regulatory system with ArcB or CpxA |(195)-alternate gene names arcA, fexA, msp, seg, sfrA; CG Site No. 831; alternate gene names arcA, fexA, msp, seg, sfrA; DNA-binding response regulator in two-component regulatory system with ArcB or CpxA |(188)
ELKPHDR(3)-KMTGR(1)
2 7.48E-06 NO
207 gb|AAC73341.1|(315)-gb|AAC73341.1|(315)
aminoacyl-histidine dipeptidase (peptidase D) |(315)-aminoacyl-histidine dipeptidase (peptidase D) |(315)
AALIAKSR(6)-AALIAKSR(6)
1 5.58E-05 NO
208 gb|AAC73341.1|(389)-gb|AAC73341.1|(315)
aminoacyl-histidine dipeptidase (peptidase D) |(389)-aminoacyl-histidine dipeptidase (peptidase D) |(315)
LAGAKTEAK(5)-AALIAKSR(6)
2 1.28E-06 NO
209 gb|AAC73341.1|(59)-gb|AAC73341.1|(43)
aminoacyl-histidine dipeptidase (peptidase D) |(59)-aminoacyl-histidine dipeptidase (peptidase D) |(43)
KPATAGMENR(1)-EKGFHVER(2)
2 4.41E-10 NO
210 gb|AAA97157.1|(80)-‐gb|AAA97157.1|(147)
aminopeptidase A/1 |(80)-‐aminopeptidase A/1 |(147)
ILLIGCGKER(8)-‐TNKSEPR(3) 4 8.51E-‐08 YES YES 18.2 Å
211 gb|AAC74018.1|(843)-‐gb|AAC74018.1|(839)
aminopeptidase N |(843)-‐aminopeptidase N |(839)
QEKMR(3)-‐YDAKR(4) 2 6.67E-‐08 YES YES 6.7 Å
212 gb|AAC74016.1|(286)-‐gb|AAC74016.1|(294)
asparaginyl tRNA synthetase |(286)-‐asparaginyl tRNA synthetase |(294)
ADDMKFFAER(5)-‐VDKDAVSR(3)
28 1.55E-‐10 NO
213 gb|AAC74016.1|(351)-‐gb|AAC74016.1|(294)
asparaginyl tRNA synthetase |(351)-‐asparaginyl tRNA synthetase |(294)
YLAEEHFKAPVVVK(8)-‐VDKDAVSR(3)
1 1.08E-‐10 NO
214 gb|AAC74014.1|(134)-gb|AAC74014.1|(121)
aspartate aminotransferase, PLP-dependent |(134)-aspartate aminotransferase, PLP-dependent |(121)
RVWVSNPSWPNHKSVFNSAGLEVR(13)-VAADFLAKNTSVKR(13)
3 8.25E-08 YES YES 18.1 Å
215 gb|AAC74014.1|(276)-gb|AAC74014.1|(93)
aspartate aminotransferase, PLP-dependent |(276)-aspartate aminotransferase, PLP-dependent |(93)
AFSQMKAAIR(6)-GSALINDKR(8)
4 4.83E-12 YES YES 10.8 Å
216 gb|AAA97142.1|(30)-gb|AAA97142.1|(2)
aspartate carbomoyltransferase catalytic subunit |(30)-aspartate carbomoyltransferase catalytic subunit |(2)
DDLNLVLATAAKLK(12)-ANPLYQK(1)
8 8.92E-23 YES YES 17.6 Å
217 gb|AAA97142.1|(41)-gb|AAA97142.1|(41)
aspartate carbomoyltransferase catalytic subunit |(41)-aspartate carbomoyltransferase catalytic subunit |(41)
ANPQPELLKHK(9)-ANPQPELLKHK(9)
2 5.14E-08 YES ?
218 gb|AAC75632.1|(26)-‐gb|AAC75632.1|(1)
autonomous glycyl radical cofactor |(26)-‐autonomous glycyl radical cofactor |(1)
AANDDLLNSFWLLDSEKGEAR(17)-‐MITGIQITK(1)
7 4.32E-‐20 NO
219 gb|AAC75632.1|(62)-‐gb|AAC75632.1|(1)
autonomous glycyl radical cofactor |(62)-‐autonomous glycyl radical cofactor |(1)
EVPVEVKPEVR(7)-‐MITGIQITK(1)
4 2.35E-‐13 NO
220 gb|AAC73229.1|(1)-gb|AAC73229.1|(7)
bifunctional aconitate hydratase 2/2-methylisocitrate dehydratase |(1)-bifunctional aconitate hydratase 2/2-methylisocitrate dehydratase |(7)
MLEEYR(1)-KHVAER(1)
8 50 1.74E-15 YES YES 10.2 Å
221 gb|AAC73229.1|(396)-‐gb|AAC73229.1|(373)
bifunctional aconitate hydratase 2/2-‐methylisocitrate dehydratase |(396)-‐bifunctional aconitate hydratase 2/2-‐methylisocitrate dehydratase |(373)
ACGVKGIRPGAYCEPK(5)-‐QAKDVAESDR(3)
4 2.92E-‐11 YES YES 19.3 Å
222 gb|AAC73229.1|(759)-‐gb|AAC73229.1|(722)
bifunctional aconitate hydratase 2/2-‐methylisocitrate dehydratase |(759)-‐bifunctional aconitate hydratase 2/2-‐methylisocitrate dehydratase |(722)
MDAAQLTEEGYYSVFGKSGAR(17)-‐AAGKLLDAHK(4)
2 7.04E-‐13 YES YES 12.7 Å
223 gb|AAC73144.1|(412)-gb|AAC73144.1|(504)
carbamoyl-phosphate synthase large subunit |(412)-carbamoyl-phosphate synthase large subunit |(504)
GLEVGATGFDPKVSLDDPEALTK(12)-LAKLAGVR(3)
3 12 2.66E-07 YES YES 13.9 Å
224 gb|AAC73144.1|(649)-gb|AAC73144.1|(366)
carbamoyl-phosphate synthase large subunit |(649)-carbamoyl-phosphate synthase large subunit |(366)
GVIVQYGGQTPLKLAR(13)-FNFEKFAGANDR(5)
7 2.30E-14 YES YES 20.6 Å
225 gb|AAA58092.1|(1)-‐gb|AAA58092.1|(145)
CG Site no. 234; RNA polymerase alpha subunit |(1)-‐CG Site no. 234; RNA polymerase alpha subunit |(145)
MQGSVTEFLKPR(1)-‐IKVQR(2)
4 3 7.21E-‐07 YES YES 23.9 Å
226 gb|AAA58092.1|(304)-‐gb|AAA58092.1|(297)
CG Site no. 234; RNA polymerase alpha subunit |(304)-‐CG Site no. 234; RNA polymerase alpha subunit |(297)
SLTEIKDVLASR(6)-‐TPNLGKK(6)
3 3.12E-‐12 YES NO
227 gb|AAA58092.1|(95)-‐gb|AAA58092.1|(145)
CG Site no. 234; RNA polymerase alpha subunit |(95)-‐CG Site no. 234; RNA polymerase alpha subunit |(145)
VQGKDEVILTLNK(4)-‐IKVQR(2)
1 4.77E-‐05 YES YES 12.1 Å
228 gb|AAA58136.1|(177)-gb|AAA58136.1|(57)
CG Site No. 61; translation elongation factor Tu |(177)-CG Site No. 61; translation elongation factor Tu |(57)
GSALKALEGDAEWEAK(5)-AFDQIDNAPEEKAR(12)
1 2.65E-16 YES NO
229 gb|AAA58136.1|(209)-gb|AAA58136.1|(264)
CG Site No. 61; translation elongation factor Tu |(209)-CG Site No. 61; translation elongation factor Tu |(264)
AIDKPFLLPIEDVFSISGR(4)-KLLDEGR(1)
14 1.20E-15 YES NO
230 gb|AAA58136.1|(209)-gb|AAA58136.1|(57)
CG Site No. 61; translation elongation factor Tu |(209)-CG Site No. 61; translation elongation factor Tu |(57)
AIDKPFLLPIEDVFSISGR(4)-AFDQIDNAPEEKAR(12)
3 9.03E-15 YES NO
231 gb|AAA58136.1|(253)-gb|AAA58136.1|(295)
CG Site No. 61; translation elongation factor Tu |(253)-CG Site No. 61; translation elongation factor Tu |(295)
ETQKSTCTGVEMFR(4)-GQVLAKPGTIKPHTK(6)
4 1.73E-13 YES YES 9.8 Å
232 gb|AAA58136.1|(253)-gb|AAA58136.1|(300)
CG Site No. 61; translation elongation factor Tu |(253)-CG Site No. 61; translation elongation factor Tu |(300)
ETQKSTCTGVEMFR(4)-GQVLAKPGTIKPHTK(11)
5 3.62E-14 YES YES 12.8 Å
233 gb|AAA58136.1|(253)-gb|AAA58136.1|(300)
CG Site No. 61; translation elongation factor Tu |(253)-CG Site No. 61; translation elongation factor Tu |(300)
ETQKSTCTGVEMFR(4)-PGTIKPHTK(5)
12 8.30E-14 YES YES 12.8 Å
234 gb|AAA58136.1|(264)-gb|AAA58136.1|(5)
CG Site No. 61; translation elongation factor Tu |(264)-CG Site No. 61; translation elongation factor Tu |(5)
KLLDEGR(1)-EKFER(2)
1 4.56E-05 YES NO
235 gb|AAA58136.1|(391)-gb|AAA58136.1|(300)
CG Site No. 61; translation elongation factor Tu |(391)-CG Site No. 61; translation elongation factor Tu |(300)
TVGAGVVAKVLG(9)-PGTIKPHTK(5)
2 2.13E-08 YES NO
236 gb|AAA58136.1|(57)-gb|AAA58136.1|(264)
CG Site No. 61; translation elongation factor Tu |(57)-CG Site No. 61; translation elongation factor Tu |(264)
AFDQIDNAPEEKAR(12)-KLLDEGR(1)
4 60 1.15E-19 YES NO
237 gb|AAA58137.1|(23)-gb|AAA58137.1|(143)
CG Site No. 732; alternate name far; elongation factor EF-G |(23)-CG Site No. 732; alternate name far; elongation factor EF-G |(143)
NIGISAHIDAGKTTTTER(12)-IAFVNKMDR(6)
3 1.02E-10 YES YES 11.7 Å
238 gb|AAA58137.1|(370)-gb|AAA58137.1|(375)
CG Site No. 732; alternate name far; elongation factor EF-G |(370)-CG Site No. 732; alternate name far; elongation factor EF-G |(375)
IVQMHANKR(8)-EEIKEVR(4)
2 8 2.57E-11 YES YES 17.2 Å
239 gb|AAA58137.1|(389)-gb|AAA58137.1|(440)
CG Site No. 732; alternate name far; elongation factor EF-G |(389)-CG Site No. 732; alternate name far; elongation factor EF-G |(440)
AGDIAAAIGLKDVTTGDTLCDPDAPIILER(11)-LAKEDPSFR(3)
3 8.65E-06 YES YES 8.6 Å
240 gb|AAA58137.1|(440)-gb|AAA58137.1|(134)
CG Site No. 732; alternate name far; elongation factor EF-G |(440)-CG Site No. 732; alternate name far; elongation factor EF-G |(134)
LAKEDPSFR(3)-YKVPR(2)
6 1.83E-08 YES NO
241 gb|AAA58137.1|(686)-gb|AAA58137.1|(643)
CG Site No. 732; alternate name far; elongation factor EF-G |(686)-CG Site No. 732; alternate name far; elongation factor EF-G |(643)
ASYTMEFLKYDEAPSNVAQAVIEAR(9)-GMLKGQESEVTGVK(4)
1 5.60E-07 YES YES 15.3 Å
242 gb|AAC73125.1|(166)-gb|AAC73125.1|(155)
chaperone Hsp70, co-chaperone with DnaJ |(166)-chaperone Hsp70, co-chaperone with DnaJ |(155)
IAGLEVKR(7)-QATKDAGR(4)
11 30 7.82E-16 YES YES 9.3 Å
243 gb|AAC73125.1|(304)-gb|AAC73125.1|(246)
chaperone Hsp70, co-chaperone with DnaJ |(304)-chaperone Hsp70, co-chaperone with DnaJ |(246)
AKLESLVEDLVNR(2)-KDQGIDLR(1)
4 3.33E-17 YES YES 16.2 Å
244 gb|AAC73125.1|(304)-gb|AAC73125.1|(299)
chaperone Hsp70, co-chaperone with DnaJ |(304)-chaperone Hsp70, co-chaperone with DnaJ |(299)
AKLESLVEDLVNR(2)-HMNIKVTR(5)
8 69 4.42E-15 YES YES 9.7 Å
245 gb|AAC73125.1|(55)-gb|AAC73125.1|(263)
chaperone Hsp70, co-chaperone with DnaJ |(55)-chaperone Hsp70, co-chaperone with DnaJ |(263)
TTPSIIAYTQDGETLVGQPAKR(21)-LKEAAEK(2)
2 17 1.76E-09 YES YES 14.2 Å
246 ref|NP_417434.4|(87)/gb|AAA69126.1|(97)-ref|NP_417434.4|(93)/gb|AAA69126.1|(103)
conserved protein, DUF469 family |(87)/ ORF_f118(97)-conserved protein, DUF469 family |(93)/ ORF_f118(103)
KWLEER(1)-KLDEVR(1)
1 47 2.83E-10 NO
247 ref|NP_416797.2|(55)-‐ref|NP_416797.2|(1)
conserved protein, UPF0304 family |(55)-‐conserved protein, UPF0304 family |(1)
EFGELKEETCR(6)-‐MEMTNAQR(1)
22 3.38E-‐15 YES NO
248 ref|NP_416797.2|(55)-‐ref|NP_416797.2|(1)
conserved protein, UPF0304 family |(55)-‐conserved protein, UPF0304 family |(1)
ELDREFGELKEETCR(10)-‐MEMTNAQR(1)
1 2.23E-‐14 YES NO
249 gb|AAC77485.1|(30)/ gb|AAA67568.1|(30)-gb|AAC77485.1|(80)/ gb|AAA67568.1|(80)
conserved protein, UPF0438 family |(30)/ o137|(30)-conserved protein, UPF0438 family |(80)/ o137|(80)
HGDFTIKEAQLLER(7)-VWSKYMTR(4)
2 9.49E-09 NO
250 gb|AAC76668.1|(61)-‐gb|AAC76668.1|(61)
conserved protein, UPF0701 family |(61)-‐conserved protein, UPF0701 family |(61)
GKVECTLR(2)-‐GKVECTLR(2) 2 7 5.15E-‐08 NO
251 gb|AAC73821.1|(133)-‐gb|AAC73821.1|(148)
dihydrolipoyltranssuccinase |(133)-‐dihydrolipoyltranssuccinase |(148)
LLAEHNLDASAIKGTGVGGR(13)-‐EDVEKHLAK(5)
6 1.16E-‐09 YES YES 10.7 Å
252 gb|AAC73821.1|(133)-‐gb|AAC73821.1|(94)
dihydrolipoyltranssuccinase |(133)-‐dihydrolipoyltranssuccinase |(94)
LLAEHNLDASAIKGTGVGGR(13)-‐SEEKASTPAQR(4)
2 4.02E-‐16 NO
253 gb|AAC73821.1|(156)-‐gb|AAC73821.1|(148)
dihydrolipoyltranssuccinase |(156)-‐dihydrolipoyltranssuccinase |(148)
APAKESAPAAAAPAAQPALAAR(4)-‐EDVEKHLAK(5)
1 5.75E-‐05 NO
254 gb|AAC73821.1|(94)-‐gb|AAC73821.1|(85)
dihydrolipoyltranssuccinase |(94)-‐dihydrolipoyltranssuccinase |(85)
SEEKASTPAQR(4)-‐EGNSAGKETSAK(7)
3 7.30E-‐07 NO
255 gb|AAC73777.1|(148)-gb|AAC73777.1|(117)
DNA-binding transcriptional dual regulator of siderophore biosynthesis and transport |(148)-DNA-binding transcriptional dual regulator of siderophore biosynthesis and transport |(117)
EDEHAHEGK(9)-EIAAKHGIR(5)
1 1.50E-09 NO
256 gb|AAC73975.1|(129)-gb|AAC73975.1|(162)
DNA-binding transcriptional dual regulator, leucine-binding |(129)-DNA-binding transcriptional dual regulator, leucine-binding |(162)
KLLGETLLR(1)-LVIKTR(4)
2 4.23E-08 YES YES 20.6 Å
257 gb|AAC43085.1|(1065)-‐gb|AAC43085.1|(1073)
DNA-‐directed RNA polymerase, beta-‐subunit |(1065)-‐DNA-‐directed RNA polymerase, beta-‐subunit |(1073)
IQPGDKMAGR(6)-‐HGNKGVISK(4)
6 5 3.19E-‐09 YES NO
258 gb|AAC43086.1|(1132)-‐gb|AAC43086.1|(781)
DNA-‐directed RNA polymerase, beta'-‐subunit |(1132)-‐DNA-‐directed RNA polymerase, beta'-‐subunit |(781)
IPQESGGTKDITGGLPR(9)-‐KGLADTALK(1)
3 1.82E-‐12 YES NO
259 gb|AAC43085.1|(1158)-‐gb|AAC43085.1|(1)
DNA-‐directed RNA polymerase, beta-‐subunit |(1158)-‐DNA-‐directed RNA polymerase, beta-‐subunit |(1)
QKVDLSTFSDEEVMR(2)-‐MVYSYTEK(1)
16 3.49E-‐13 YES NO
260 gb|AAC43086.1|(1192)-‐gb|AAC43086.1|(1072)
DNA-‐directed RNA polymerase, beta'-‐subunit |(1192)-‐DNA-‐directed RNA polymerase, beta'-‐subunit |(1072)
LVITPVDGSDPYEEMIPKWR(18)-‐TAGGKDLRPALK(5)
1 8.39E-‐10 YES NO
261 gb|AAC43086.1|(87)-‐gb|AAC43086.1|(50)
DNA-‐directed RNA polymerase, beta'-‐subunit |(87)-‐DNA-‐directed RNA polymerase, beta'-‐subunit |(50)
GVICEKCGVEVTQTK(6)-‐TFKPER(3)
2 3.61E-‐08 YES YES 18.1 Å
262 gb|AAC43086.1|(953)-‐gb|AAC43086.1|(992)
DNA-‐directed RNA polymerase, beta'-‐subunit |(953)-‐DNA-‐directed RNA polymerase, beta'-‐subunit |(992)
AAAESSIQVKNK(10)-‐TKESYK(2)
1 4.76E-‐09 YES YES 8.9 Å
263 gb|AAC43086.1|(992)-‐gb|AAC43086.1|(955)
DNA-‐directed RNA polymerase, beta'-‐subunit |(992)-‐DNA-‐directed RNA polymerase, beta'-‐subunit |(955)
TKESYK(2)-‐NKGSIK(2) 2 1.88E-‐07 YES YES 9.9 Å
264 gb|AAC76774.1|(101)-gb|AAC76774.1|(119)
D-ribose transporter subunit |(101)-D-ribose transporter subunit |(119)
ILLINPTDSDAVGNAVKMANQANIPVITLDR(17)-QATKGEVVSHIASDNVLGGK(4)
2 3.48E-08 YES YES 7.9 Å
265 gb|AAC76774.1|(275)-‐gb|AAC76774.1|(281)
D-‐ribose transporter subunit |(275)-‐D-‐ribose transporter subunit |(281)
GVETADKVLK(7)-‐GEKVQAK(3)
4 4 1.53E-‐11 YES YES 7.0 Å
266 gb|AAC76774.1|(291)-‐gb|AAC76774.1|(143)
D-‐ribose transporter subunit |(291)-‐D-‐ribose transporter subunit |(143)
YPVDLKLVVK(6)-‐IAGDYIAKK(8)
6 1.39E-‐24 YES YES 16.4 Å
267 gb|AAC74220.1|(378)-gb|AAC74220.1|(58)
e14 prophage; isocitrate dehydrogenase, specific for NADP+ |(378)-e14 prophage; isocitrate dehydrogenase, specific for NADP+ |(58)
HMGWTEAADLIVKGMEGAINAK(13)-AYKGER(3)
1 4.16E-07 YES YES 10.2 Å
268 gb|AAC74220.1|(62)-gb|AAC74220.1|(12)
e14 prophage; isocitrate dehydrogenase, specific for NADP+ |(62)-e14 prophage; isocitrate dehydrogenase, specific for NADP+ |(12)
KISWMEIYTGEK(1)-VVVPAQGKK(8)
7 7.84E-17 YES YES 13.6 Å
269 gb|AAA69289.1|(195)-gb|AAA69289.1|(56)
enolase |(195)-enolase |(56)
MGSEVFHHLAKVLK(11)-DGDKSR(4)
2 3.47E-04 YES YES 18.4 Å
270 gb|AAC74370.1|(205)-gb|AAC74370.1|(201)
enoyl-[acyl-carrier-protein] reductase, NADH-dependent |(205)-enoyl-[acyl-carrier-protein] reductase, NADH-dependent |(201)
KMLAHCEAVTPIR(1)-TLAASGIKDFR(8)
6 8.61E-13 YES YES 9.0 Å
271 gb|AAC74370.1|(205)-gb|AAC74370.1|(201)
enoyl-[acyl-carrier-protein] reductase, NADH-dependent |(205)-enoyl-[acyl-carrier-protein] reductase, NADH-dependent |(201)
KMLAHCEAVTPIRR(1)-TLAASGIKDFR(8)
10 8.50E-16 YES YES 9.0 Å
272 gb|AAC76757.1|(384)-gb|AAC76757.1|(388)
F1 sector of membrane-bound ATP synthase, alpha subunit |(384)-F1 sector of membrane-bound ATP synthase, alpha subunit |(388)
VGGAAQTKIMK(8)-KLSGGIR(1)
5 3.30E-13 NO
273 gb|AAC73899.1|(101)-‐gb|AAC73899.1|(27)
Fe-‐binding and storage protein |(101)-‐Fe-‐binding and storage protein |(27)
AVQLGGVALGTTQVINSKTPLK(18)-‐KATVELLNR(1)
2 1.23E-‐11 YES YES 18.3 Å
274 gb|AAC73899.1|(140)-‐gb|AAC73899.1|(27)
Fe-‐binding and storage protein |(140)-‐Fe-‐binding and storage protein |(27)
AIGEAKDDDTADILTAASR(6)-‐KATVELLNR(1)
4 4.07E-‐16 YES YES 11.5 Å
275 gb|AAB18612.1|(255)-gb|AAB18612.1|(240)
formamidopyrimidine-DNA glycosylase |(255)-formamidopyrimidine-DNA glycosylase |(240)
VCGTPIVATKHAQR(10)-KGEPCR(1)
1 6 3.77E-07 YES YES 11.9 Å
276 gb|AAC74683.1|(430)-‐gb|AAC74683.1|(426)
fumarate hydratase (fumarase C),aerobic Class II |(430)-‐fumarate hydratase (fumarase C),aerobic Class II |(426)
AHKEGLTLK(3)-‐AAEIAKK(6) 3 1.17E-‐13 YES YES 6.1 Å
277 gb|AAC74683.1|(69)-‐gb|AAC74683.1|(127)
fumarate hydratase (fumarase C),aerobic Class II |(69)-‐fumarate hydratase (fumarase C),aerobic Class II |(127)
VNEDLGLLSEEKASAIR(12)-‐KVHPNDDVNK(1)
3 7.00E-‐19 YES YES 8.6 Å
278 gb|AAC74683.1|(8)-gb|AAC74683.1|(426)
fumarate hydratase (fumarase C),aerobic Class II |(8)-fumarate hydratase (fumarase C),aerobic Class II |(426)
SEKDSMGAIDVPADK(3)-AAEIAKK(6)
4 1.11E-08 YES NO
279 gb|AAC77114.1|(550)/gb|AAA97053.1|(550)-gb|AAC77114.1|(527)/gb|AAA97053.1|(527)
formamidopyrimidine/5-formyluracil/ 5-hydroxymethyluracil DNA glycosylase|(550)/fumarate reductase, flavoprotein subunit |(550)-formamidopyrimidine/5-formyluracil/ 5-hydroxymethyluracil DNA glycosylase|(527)/fumarate reductase, flavoprotein subunit |(527)
DDVNFLKHTLAFR(7)-KESR(1)
1 1.50E-04 YES YES 12.1 Å
280 gb|AAC74319.1|(57)-‐gb|AAC74319.1|(120)
global DNA-‐binding transcriptional dual regulator H-‐NS |(57)-‐global DNA-‐binding transcriptional dual regulator H-‐NS |(120)
KLQQYR(1)-‐TPAVIKK(6) 4 1.94E-‐09 NO
281 gb|AAC74319.1|(57)-‐gb|AAC74319.1|(57)
global DNA-‐binding transcriptional dual regulator H-‐NS |(57)-‐global DNA-‐binding transcriptional dual regulator H-‐NS |(57)
KLQQYR(1)-‐KLQQYR(1) 2 2.48E-‐07 NO
282 gb|AAC74319.1|(96)-‐gb|AAC74319.1|(136)
global DNA-‐binding transcriptional dual regulator H-‐NS |(96)-‐global DNA-‐binding transcriptional dual regulator H-‐NS |(136)
AQRPAKYSYVDENGETK(6)-‐SLDDFLIKQ(8)
4 8.13E-‐18 NO
283 gb|AAC74319.1|(96)-‐gb|AAC74319.1|(57)
global DNA-‐binding transcriptional dual regulator H-‐NS |(96)-‐global DNA-‐binding transcriptional dual regulator H-‐NS |(57)
AQRPAKYSYVDENGETK(6)-‐KLQQYR(1)
7 6.84E-‐15 NO
284 gb|AAC74319.1|(96)-‐gb|AAC74319.1|(57)
global DNA-‐binding transcriptional dual regulator H-‐NS |(96)-‐global DNA-‐binding transcriptional dual regulator H-‐NS |(57)
PAKYSYVDENGETK(3)-‐KLQQYR(1)
1 5.38E-‐07 NO
285 gb|AAB03004.1|(2)-‐gb|AAB03004.1|(231)
glutamine synthetase |(2)-‐glutamine synthetase |(231)
SAEHVLTMLNEHEVK(1)-‐FNTMTKK(6)
6 1.11E-‐07 NO
286 gb|AAC73774.1|(312)-‐gb|AAC73774.1|(2)
glutamyl-‐tRNA synthetase |(312)-‐glutamyl-‐tRNA synthetase |(2)
EFCKR(4)-‐SEAEAR(1) 2 2.69E-‐06 YES NO
287 gb|AAC74726.1|(101)-‐gb|AAC74726.1|(108)
glutaredoxin-‐4 |(101)-‐glutaredoxin-‐4 |(108)
GELQQLIKETAAK(8)-‐YKSEEPDAE(2)
4 26 4.34E-‐10 YES NO
288 gb|AAB18476.1|(329)-gb|AAB18476.1|(430)
glutathione oxidoreductase |(329)-glutathione oxidoreductase |(430)
LFNNKPDEHLDYSNIPTVVFSHPPIGTVGLTEPQAR(5)-KDFDNTVAIHPTAAEEFVTMR(1)
2 1.96E-10 YES YES 15.5 Å
289 gb|AAA69114.1|(174)-gb|AAA69114.1|(205)
glutathione synthetase |(174)-glutathione synthetase |(205)
VKEGDPNLGVIAETLTEHGTR(2)-YCMAQNYLPAIKDGDKR(12)
7 7.66E-22 NO
290 gb|AAC74849.1|(124)-‐gb|AAC74849.1|(192)
glyceraldehyde-‐3-‐phosphate dehydrogenase A |(124)-‐glyceraldehyde-‐3-‐phosphate dehydrogenase A |(192)
VVMTGPSKDNTPMFVK(8)-‐TVDGPSHKDWR(8)
1 4.74E-‐07 YES YES 18.9 Å
291 gb|AAC74849.1|(213)-gb|AAC74849.1|(192)
glyceraldehyde-3-phosphate dehydrogenase A |(213)-glyceraldehyde-3-phosphate dehydrogenase A |(192)
GASQNIIPSSTGAAKAVGK(15)-TVDGPSHKDWR(8)
2 1.94E-07 YES YES 19.8 Å
292 gb|AAC74849.1|(225)-‐gb|AAC74849.1|(249)
glyceraldehyde-‐3-‐phosphate dehydrogenase A |(225)-‐glyceraldehyde-‐3-‐phosphate dehydrogenase A |(249)
VLPELNGKLTGMAFR(8)-‐LEKAATYEQIK(3)
2 6.37E-‐06 YES YES 15.9 Å
293 gb|AAC74849.1|(331)-‐gb|AAC74849.1|(331)
glyceraldehyde-‐3-‐phosphate dehydrogenase A |(331)-‐glyceraldehyde-‐3-‐phosphate dehydrogenase A |(331)
VLDLIAHISK(10)-‐VLDLIAHISK(10)
1 5.65E-‐15 YES ? 0.0 Å
294 gb|AAB18536.1|(592)-gb|AAB18536.1|(584)
glycine-tRNA synthetase, beta subunit |(592)-glycine-tRNA synthetase, beta subunit |(584)
VSNILAKSDEVLSDR(7)-TLDAAAALAAANKR(13)
1 1.17E-20 NO
295 gb|AAA97042.1|(117)-gb|AAA97042.1|(34)
GroEL protein |(117)-GroEL protein |(34)
AVAAGMNPMDLKR(12)-VTLGPKGR(6)
2 5 1.90E-12 YES YES 16.4 Å
296 gb|AAA97042.1|(122)-gb|AAA97042.1|(117)
GroEL protein |(122)-GroEL protein |(117)
GIDKAVTAAVEELK(4)-AVAAGMNPMDLKR(12)
1 2.00E-13 YES YES 8.7 Å
297 gb|AAA97042.1|(122)-gb|AAA97042.1|(34)
GroEL protein |(122)-GroEL protein |(34)
GIDKAVTAAVEELK(4)-VTLGPKGR(6)
2 4.95E-05 YES YES 17.2 Å
298 gb|AAA97042.1|(272)-gb|AAA97042.1|(226)
GroEL protein |(272)-GroEL protein |(226)
GIVKVAAVK(4)-KISNIR(1)
2 3 9.57E-09 YES YES 22.2 Å
299 gb|AAA97042.1|(364)-gb|AAA97042.1|(371)
GroEL protein |(364)-GroEL protein |(371)
QQIEEATSDYDREKLQER(14)-VAKLAGGVAVIK(3)
1 1.73E-04 YES YES 10.6 Å
300 gb|AAA97042.1|(371)-gb|AAA97042.1|(364)
GroEL protein |(371)-GroEL protein |(364)
VAKLAGGVAVIK(3)-EKLQER(2)
4 3.71E-06 YES YES 10.6 Å
301 gb|AAA97042.1|(51)-gb|AAA97042.1|(117)
GroEL protein |(51)-GroEL protein |(117)
SFGAPTITKDGVSVAR(9)-AVAAGMNPMDLKR(12)
1 7.39E-10 YES YES 17.9 Å
302 gb|AAA97042.1|(51)-gb|AAA97042.1|(34)
GroEL protein |(51)-GroEL protein |(34)
SFGAPTITKDGVSVAR(9)-VTLGPKGR(6)
2 4 2.24E-07 YES YES 13.2 Å
303 gb|AAA97042.1|(7)-gb|AAA97042.1|(15)
GroEL protein |(7)-GroEL protein |(15)
DVKFGNDAR(3)-VKMLR(2)
4 11 1.23E-06 YES YES 14.0 Å
304 gb|AAA97042.1|(7)-gb|AAA97042.1|(15)
GroEL protein |(7)-GroEL protein |(15)
AAKDVKFGNDAR(6)-VKMLR(2)
2 2.21E-05 YES YES 14.0 Å
305 gb|AAA97041.1|(1)-gb|AAA97041.1|(13)
GroES protein |(1)-GroES protein |(13)
MNIRPLHDR(1)-VIVKR(4)
1 6.54E-05 YES NO
306 gb|AAA97041.1|(1)-gb|AAA97041.1|(15)
GroES protein |(1)-GroES protein |(15)
MNIRPLHDR(1)-RKEVETK(2)
3 4.96E-10 YES NO
307 gb|AAC43098.1|(3)-gb|AAC43098.1|(18)
histonelike DNA-binding protein HU-alpha (NS2) (HU-2) |(3)-histonelike DNA-binding protein HU-alpha (NS2) (HU-2) |(18)
MNKTQLIDVIAEK(3)-AELSKTQAK(5)
4 6.18E-10 YES YES 11.9 Å
308 gb|AAC74782.1|(66)-gb|AAC74782.1|(57)
integration host factor (IHF), DNA-binding protein, alpha subunit |(66)-integration host factor (IHF), DNA-binding protein, alpha subunit |(57)
NPKTGEDIPITAR(3)-DKNQRPGR(2)
1 1.37E-05 YES YES 23.8 Å
309 gb|AAC73743.1|(675)-‐gb|AAC73743.1|(738)
leucyl-‐tRNA synthetase |(675)-‐leucyl-‐tRNA synthetase |(738)
VWKLVYEHTAK(3)-‐LAKAPTDGEQDR(3)
5 1.48E-‐20 NO
310 gb|AAC76752.1|(51)-gb|AAC76752.1|(246)
L-glutamine:D-fructose-6-phosphate aminotransferase |(51)-L-glutamine:D-fructose-6-phosphate aminotransferase |(246)
LGKVQMLAQAAEEHPLHGGTGIAHTR(3)-QDIESNLQYDAGDKGIYR(14)
22 5.06E-15 YES NO
311 gb|AAC73227.1|(339)-‐gb|AAC73227.1|(299)
lipoamide dehydrogenase, E3 component is part of three enzyme complexes |(339)-‐lipoamide dehydrogenase, E3 component is part of three enzyme complexes |(299)
KHYFDPK(1)-‐VDKQLR(3) 1 1.90E-‐04 NO
312 gb|AAC73227.1|(370)-‐gb|AAC73227.1|(299)
lipoamide dehydrogenase, E3 component is part of three enzyme complexes |(370)-‐lipoamide dehydrogenase, E3 component is part of three enzyme complexes |(299)
EKGISYETATFPWAASGR(2)-‐VDKQLR(3)
2 4.32E-‐06 NO
313 gb|AAA97029.1|(156)-‐gb|AAA97029.1|(2)
lysyl-‐tRNA synthetase |(156)-‐lysyl-‐tRNA synthetase |(2)
ALRPLPDKFHGLQDQEVR(8)-‐SEQETR(1)
3 1.86E-‐05 YES NO
314 gb|AAA58038.1|(217)-gb|AAA58038.1|(82)
malate dehydrogenase |(217)-malate dehydrogenase |(82)
IQNAGTEVVEAKAGGGSATLSMGQAAAR(12)-KPGMDR(1)
5 7.72E-11 YES YES 14.8 Å
315 gb|AAA58038.1|(99)-‐gb|AAA58038.1|(134)
malate dehydrogenase |(99)-‐malate dehydrogenase |(134)
SDLFNVNAGIVKNLVQQVAK(12)-‐KAGVYDK(1)
1 9.27E-‐07 YES YES 11.4 Å
316 gb|AAA69143.1|(312)-‐gb|AAA69143.1|(379)
malate synthase |(312)-‐malate synthase |(379)
KLNDDR(1)-‐VQKNSR(3) 4 4.73E-‐09 YES YES 23.6 Å
317 gb|AAA58215.1|(54)-‐gb|AAA58215.1|(2)
maltodextrin phosphorylase |(54)-‐maltodextrin phosphorylase |(2)
AQPFAKPVANQR(6)-‐SQPIFNDK(1)
6 5.93E-‐15 YES YES 9.0 Å
318 gb|AAB03063.1|(109)-gb|AAB03063.1|(72)
matches PS00017: ATP_GTP_A; similar to Pasteurella haemolytica hypoth. protein ORF1; heat shock induced |(109)-matches PS00017: ATP_GTP_A; similar to Pasteurella haemolytica hypoth. protein ORF1; heat shock induced |(72)
DLTDAAVKMVR(8)-LAKLANAPFIK(3)
2 1.68E-18 YES YES 15.0 Å
319 gb|AAC75175.1|(466)-‐gb|AAC75175.1|(403)
methionyl-‐tRNA synthetase |(466)-‐methionyl-‐tRNA synthetase |(403)
YVDEQAPWVVAKQEGR(12)-‐NAGFINKR(7)
1 1 2.32E-‐08 YES YES 13.7 Å
320 gb|AAC74920.1|(212)-gb|AAC74920.1|(202)
multifunctional 2-keto-3-deoxygluconate 6-phosphate aldolase and 2-keto-4-hydroxyglutarate aldolase and oxaloacetate decarboxylase |(212)-multifunctional 2-keto-3-deoxygluconate 6-phosphate aldolase and 2-keto-4-hydroxyglutarate aldolase and oxaloacetate decarboxylase |(202)
EAVEGAKL(7)-ITKLAR(3)
1 9.69E-06 YES YES 18.2 Å
321 gb|AAB03056.1|(226)-‐gb|AAB03056.1|(233)
ORF_f248 |(226)-‐ORF_f248 |(233) DTQQLLKETR(7)-‐QMTKHLR(4)
1 8 9.98E-‐14 YES YES 10.7 Å
322 gb|AAA69076.1|(102)-‐gb|AAA69076.1|(2)
ORF_f441; third start codon |(102)-‐ORF_f441; third start codon |(2)
LGQDAAPEKLGVDR(9)-‐SEISR(1)
13 7.23E-‐09 YES YES 18.7 Å
323 gb|AAA57918.1|(356)-‐gb|AAC73989.1|(293)
ORF_f746 |(356)-‐pyruvate formate lyase I |(293)
TLVTKNSFR(5)-‐DLKAGK(3) 1 3.98E-‐05 YES YES 13.2 Å
324 gb|AAC73539.1|(1)-‐gb|AAC73539.1|(81)
peptidyl-‐prolyl cis/trans isomerase (trigger factor) |(1)-‐peptidyl-‐prolyl
cis/trans isomerase (trigger factor) |(81)
MQVSVETTQGLGR(1)-‐NFIDAIIKEK(8)
2 2.27E-‐12 YES NO
325 gb|AAC73539.1|(279)-‐gb|AAC73539.1|(272)
peptidyl-‐prolyl cis/trans isomerase (trigger factor) |(279)-‐peptidyl-‐prolyl
cis/trans isomerase (trigger factor) |(272) ELKSAIR(3)-‐KNMER(1) 1 25 8.30E-‐08 YES YES 10.4 Å
326 gb|AAC73539.1|(361)-‐gb|AAC73539.1|(392)
peptidyl-‐prolyl cis/trans isomerase (trigger factor) |(361)-‐peptidyl-‐prolyl
cis/trans isomerase (trigger factor) |(392)
TNELKADEER(5)-‐NKELMDNMR(2)
2 9.59E-‐15 YES YES 17.0 Å
327 gb|AAC75299.1|(140)-‐gb|AAC75299.1|(195)
periplasmic glycerophosphodiester phosphodiesterase |(140)-‐periplasmic
glycerophosphodiester phosphodiesterase |(195)
FPMGKSDFR(5)-‐KYGYTGK(1)
6 3.87E-‐14 YES YES 19.8 Å
328 gb|AAA58200.1|(91)-‐gb|AAA58200.1|(68)
phosphoenolpyruvate carboxykinase |(91)-‐phosphoenolpyruvate
carboxykinase |(68)
GKNDNKPLSPETWQHLK(2)-‐SPKDK(3)
1 1.26E-‐04 YES YES 12.1 Å
329 gb|AAC73782.1|(273)-‐gb|AAC73782.1|(2)
phosphoglucomutase |(273)-‐phosphoglucomutase |(2)
FMHLDKDGAIR(6)-‐AIHNR(1)
3 3.63E-‐12 NO
330 gb|AAA69093.1|(27)-‐gb|AAA69093.1|(84)
phosphoglycerate kinase |(27)-‐phosphoglycerate kinase |(84)
ADLNVPVKDGKVTSDAR(8)-‐DKLSNPVR(2)
2 1.56E-‐08 YES YES 18.9 Å
331 gb|AAA69093.1|(30)-‐gb|AAA69093.1|(84)
phosphoglycerate kinase |(30)-‐phosphoglycerate kinase |(84)
DGKVTSDAR(3)-‐DKLSNPVR(2)
2 5.22E-‐07 YES YES 14.0 Å
332 gb|AAA69093.1|(49)-‐gb|AAA69093.1|(84)
phosphoglycerate kinase |(49)-‐phosphoglycerate kinase |(84)
ASLPTIELALKQGAK(11)-‐DKLSNPVR(2)
2 1.95E-‐08 YES YES 14.5 Å
333 gb|AAC73842.1|(100)-‐gb|AAC73842.1|(113)
phosphoglyceromutase 1 |(100)-‐phosphoglyceromutase 1 |(113)
HYGALQGLNKAETAEK(10)-‐YGDEQVKQWR(7)
5 9.31E-‐13 YES YES 9.9 Å
334 gb|AAC73842.1|(146)-‐gb|AAC73842.1|(86)
phosphoglyceromutase 1 |(146)-‐phosphoglyceromutase 1 |(86)
LSEKELPLTESLALTIDR(4)-‐SWKLNER(3)
1 1.31E-‐05 YES YES 13.4 Å
335 ref|YP_026170.1|(853)-‐ref|YP_026170.1|(622)
phosphoribosylformyl-‐glycineamide synthetase |(853)-‐phosphoribosylformyl-‐
glycineamide synthetase |(622)
QLGDKPADVR(5)-‐AKGDALAR(2)
1 3.45E-‐10 NO
336 gb|AAC75738.1|(38)-‐gb|AAC75738.1|(1)
pleiotropic regulatory protein for carbon source metabolism |(38)-‐pleiotropic regulatory protein for carbon source
metabolism |(1)
IGVNAPKEVSVHR(7)-‐MLILTR(1)
2 6.06E-‐08 YES YES 23.6 Å
337 gb|AAA57967.1|(286)-‐ref|NP_417633.4|(1)
polynucleotide phosphorylase |(286)-‐polynucleotide
phosphorylase/polyadenylase |(1) ITDKQER(4)-‐MLNPIVR(1) 2 29 4.17E-‐08 YES YES 15.3 Å
338 ref|NP_418204.2|(35)-‐ref|NP_418204.2|(35)
predicted cytoplasmic sugar-‐binding protein |(35)-‐predicted cytoplasmic
sugar-‐binding protein |(35)
LGHTDTLVVCDAGLPIPKSTTR(18)-‐
LGHTDTLVVCDAGLPIPKSTTR(18)
1 2.15E-‐09 NO
339 ref|NP_415943.4|(13)-‐ref|NP_415943.4|(1)
predicted protein |(13)-‐predicted protein |(1)
LKNENPR(2)-‐MFPEYR(1) 3 5.47E-‐10 NO
340 ref|NP_415943.4|(25)-‐ref|NP_415943.4|(36)
predicted protein |(25)-‐predicted protein |(36)
FMSLFDKHNK(7)-‐KEGSDGR(1)
2 8.45E-‐13 NO
341 ref|NP_415193.2|(331)-‐ref|NP_415193.2|(281)
predicted protein with nucleoside triphosphate hydrolase domain |(331)-‐
predicted protein with nucleoside triphosphate hydrolase domain |(281)
KAALAAER(1)-‐NTKSGLR(3) 1 13 3.47E-‐07 NO
342 gb|AAA57971.1|(131)-‐gb|AAA57971.1|(125)
protein chain initiation factor 2 |(131)-‐protein chain initiation factor 2 |(125)
EAQQKAER(5)-‐EAEESAKR(7)
9 3.23E-‐08 NO
343 gb|AAA57971.1|(131)-‐gb|AAA57971.1|(149)
protein chain initiation factor 2 |(131)-‐protein chain initiation factor 2 |(149)
EAQQKAER(5)-‐EAAEQAKR(7)
8 2.63E-‐10 NO
344 gb|AAA57971.1|(149)-‐gb|AAA57971.1|(149)
protein chain initiation factor 2 |(149)-‐protein chain initiation factor 2 |(149)
EAAEQAKR(7)-‐EAAEQAKR(7)
1 10 2.97E-‐09 NO
345 gb|AAA57971.1|(184)-‐gb|AAA57971.1|(149)
protein chain initiation factor 2 |(184)-‐protein chain initiation factor 2 |(149)
EQEAAELKR(8)-‐EAAEQAKR(7)
1 1.54E-‐06 NO
346 gb|AAA57971.1|(186)-‐gb|AAA57971.1|(194)
protein chain initiation factor 2 |(186)-‐protein chain initiation factor 2 |(194)
RKAEEEAR(2)-‐KLEEEAR(1) 4 1.34E-‐09 NO
347 gb|AAA57971.1|(186)-‐gb|AAA57971.1|(194)
protein chain initiation factor 2 |(186)-‐protein chain initiation factor 2 |(194)
KAEEEARR(1)-‐KLEEEAR(1) 10 1.52E-‐15 NO
348 gb|AAA57971.1|(194)-‐gb|AAA57971.1|(186)
protein chain initiation factor 2 |(194)-‐protein chain initiation factor 2 |(186)
KLEEEARR(1)-‐KAEEEARR(1) 1 2.58E-‐09 NO
349 gb|AAA57971.1|(194)-gb|AAA57971.1|(186)
protein chain initiation factor 2 |(194)-protein chain initiation factor 2 |(186)
KLEEEAR(1)-KAEEEAR(1)
16 4.84E-11 NO
350 gb|AAC74788.1|(97)-‐gb|AAC74788.1|(125)
protein chain initiation factor IF-‐3 |(97)-‐protein chain initiation factor IF-‐3 |(125)
EIKFRPGTDEGDYQVK(3)-‐AKITLR(2)
3 7.18E-‐15 YES NO
351 gb|AAC73575.1|(238)-‐gb|AAC73575.1|(362)
protein refolding molecular co-‐chaperone Hsp90, Hsp70-‐dependent; heat-‐shock
protein; ATPase |(238)-‐protein refolding molecular co-‐chaperone Hsp90, Hsp70-‐dependent; heat-‐shock protein; ATPase
|(362)
NKSEITDEEYKEFYK(2)-‐VLQMLEKLAK(7)
3 1.71E-‐06 YES NO
352 gb|AAC73575.1|(516)-‐gb|AAC73575.1|(524)
protein refolding molecular co-‐chaperone Hsp90, Hsp70-‐dependent; heat-‐shock
protein; ATPase |(516)-‐protein refolding molecular co-‐chaperone Hsp90, Hsp70-‐dependent; heat-‐shock protein; ATPase
|(524)
VKALLGER(2)-‐VKDVR(2) 2 2.73E-‐06 YES NO
353 gb|AAA97280.1|(57)-‐gb|AAA97280.1|(84)
purine-‐nucleoside phosphorylase |(57)-‐purine-‐nucleoside phosphorylase |(84)
KISVMGHGMGIPSCSIYTK(1)-‐ELITDFGVKK(9)
2 9.58E-‐05 YES YES 10.7 Å
354 gb|AAC73225.1|(368)-‐gb|AAC73225.1|(305)
pyruvate dehydrogenase, decarboxylase component E1, thiamin-‐binding |(368)-‐pyruvate dehydrogenase, decarboxylase component E1, thiamin-‐binding |(305)
GGHDPKK(6)-‐KDTSGK(1) 2 9.03E-‐12 YES YES 20.5 Å
355 gb|AAC73225.1|(375)-‐gb|AAC73225.1|(381)
pyruvate dehydrogenase, decarboxylase component E1, thiamin-‐binding |(375)-‐pyruvate dehydrogenase, decarboxylase component E1, thiamin-‐binding |(381)
IYAAFKK(6)-‐AQETKGK(5) 1 4 1.64E-‐09 YES YES 12.3 Å
356 gb|AAC73988.1|(97)-‐gb|AAC73988.1|(2)
pyruvate formate lyase activating enzyme 1 |(97)-‐pyruvate formate lyase activating
enzyme 1 |(2)
KEGIHTCLDTNGFVR(1)-‐SVIGR(1)
2 7.19E-‐15 YES YES 18.9 Å
357 gb|AAC73989.1|(117)-‐gb|AAC73989.1|(162)
pyruvate formate lyase I |(117)-‐pyruvate formate lyase I |(162)
ALIPFGGIKMIEGSCK(9)-‐KSGVLTGLPDAYGR(1)
7 1.31E-‐17 YES YES 17.6 Å
358 gb|AAC73989.1|(162)-‐gb|AAC73989.1|(616)
pyruvate formate lyase I |(162)-‐pyruvate formate lyase I |(616)
KSGVLTGLPDAYGR(1)-‐KTGNTPDGR(1)
4 1.49E-‐15 YES YES 9.4 Å
359 gb|AAC73989.1|(195)-‐gb|AAC73989.1|(2)
pyruvate formate lyase I |(195)-‐pyruvate formate lyase I |(2)
VALYGIDYLMKDK(11)-‐SELNEK(1)
3 5 9.10E-‐13 YES YES 10.6 Å
360 gb|AAC73989.1|(195)-‐gb|AAC73989.1|(235)
pyruvate formate lyase I |(195)-‐pyruvate formate lyase I |(235)
VALYGIDYLMKDK(11)-‐ALGQMKEMAAK(6)
1 9.63E-‐06 YES YES 14.1 Å
361 gb|AAC73989.1|(454)-‐gb|AAC73989.1|(616)
pyruvate formate lyase I |(454)-‐pyruvate formate lyase I |(616)
TMLYAINGGVDEKLK(13)-‐KTGNTPDGR(1)
3 3.66E-‐10 YES YES 7.4 Å
362 gb|AAC73989.1|(725)-‐gb|AAC73989.1|(591)
pyruvate formate lyase I |(725)-‐pyruvate formate lyase I |(591)
EMLLDAMENPEKYPQLTIR(12)-‐IQKLHTYR(3)
1 4.57E-‐10 YES NO
363 gb|AAC74746.1|(68)-‐gb|AAC74746.1|(76)
pyruvate kinase I |(68)-‐pyruvate kinase I |(76)
TAAILLDTKGPEIR(9)-‐TMKLEGGNDVSLK(3)
1 4.35E-‐05 YES YES 20.7 Å
364 gb|AAC73283.1|(1)-‐gb|AAC73283.1|(7)
ribosome recycling factor |(1)-‐ribosome recycling factor |(7)
MISDIR(1)-‐KDAEVR(1) 1 17 4.69E-‐07 YES YES 10.4 Å
365 gb|AAC73283.1|(138)-‐gb|AAC73283.1|(1)
ribosome recycling factor |(138)-‐ribosome recycling factor |(1)
RDANDKVK(6)-‐MISDIR(1) 2 2.03E-‐08 YES YES 9.3 Å
366 gb|AAC73283.1|(138)-‐gb|AAC73283.1|(1)
ribosome recycling factor |(138)-‐ribosome recycling factor |(1)
DANDKVK(5)-‐MISDIR(1) 9 2.70E-‐09 YES YES 9.3 Å
367 gb|AAC73283.1|(138)-‐gb|AAC73283.1|(7)
ribosome recycling factor |(138)-‐ribosome recycling factor |(7)
DANDKVK(5)-‐KDAEVR(1) 8 3.67E-‐06 YES YES 12.9 Å
368 gb|AAC73283.1|(15)-‐gb|AAC73283.1|(26)
ribosome recycling factor |(15)-‐ribosome recycling factor |(26)
MDKCVEAFK(3)-‐TQISKIR(5) 1 7.67E-‐05 YES YES 17.2 Å
369 gb|AAC73283.1|(15)-‐gb|AAC73283.1|(7)
ribosome recycling factor |(15)-‐ribosome recycling factor |(7)
MDKCVEAFK(3)-‐KDAEVR(1)
2 4.57E-‐10 YES YES 12.5 Å
370 ref|NP_415894.4|(96)-‐ref|NP_415894.4|(104)
stress-‐induced protein, ATP-‐binding protein |(96)-‐stress-‐induced protein, ATP-‐
binding protein |(104)
VHVHVEEGSPKDR(11)-‐ILELAKK(6)
2 5.36E-‐10 NO
371 gb|AAC73822.1|(215)-‐gb|AAC73822.1|(1)
succinyl-‐CoA synthetase, beta subunit |(215)-‐succinyl-‐CoA synthetase, beta
subunit |(1)
QGDLICLDGKLGADGNALFR(10)-‐MNLHEYQAK(1)
4 9.28E-‐11 YES YES 11.1 Å
372 gb|AAC73822.1|(295)-‐gb|AAC73822.1|(360)
succinyl-‐CoA synthetase, beta subunit |(295)-‐succinyl-‐CoA synthetase, beta
subunit |(360)
LHGGEPANFLDVGGGATKER(18)-‐KLADSGLNIIAAK(1)
5 2.78E-‐30 YES YES 14.9 Å
373 gb|AAC74789.1|(286)-‐gb|AAC74789.1|(570)
threonyl-‐tRNA synthetase |(286)-‐threonyl-‐tRNA synthetase |(570)
LKEYQYQEVK(2)-‐VKADLR(2)
17 3.71E-‐08 YES NO
374 gb|AAC74789.1|(614)-‐gb|AAC74789.1|(638)
threonyl-‐tRNA synthetase |(614)-‐threonyl-‐tRNA synthetase |(638)
GKDLGSMDVNEVIEK(2)-‐SLKQLEE(3)
3 2.34E-‐05 YES YES 13.4 Å
375 gb|AAC74789.1|(638)-‐gb|AAC74789.1|(241)
threonyl-‐tRNA synthetase |(638)-‐threonyl-‐tRNA synthetase |(241)
SLKQLEE(3)-‐LEEAAKR(6) 3 5.87E-‐06 YES YES 21.3 Å
376 gb|AAA97278.1|(299)-‐gb|AAA97278.1|(291)
thymidine phosphorylase |(299)-‐thymidine phosphorylase |(291)
AKLQAVLDNGK(2)-‐LAKDDAEAR(3)
3 1.08E-‐07 NO
377 gb|AAA69102.1|(46)-gb|AAA69102.1|(226)
transketolase |(46)-transketolase |(226)
DFLKHNPQNPSWADR(4)-DIDGHDAASIKR(11)
4 6 9.58E-16 YES YES 15.2 Å
378 gb|AAA69102.1|(46)-gb|AAA69102.1|(226)
transketolase |(46)-transketolase |(226)
DFLKHNPQNPSWADRDR(4)-DIDGHDAASIKR(11)
2 1.82E-07 YES YES 15.2 Å
379 ref|YP_026188.1|(591)-‐gb|AAA69102.1|(453)
transketolase 1, thiamin-‐binding |(591)-‐transketolase |(453)
VVSMPSTDAFDKQDAAYR(12)-‐MAALMKQR(6)
1 1.98E-‐06 YES YES 16.4 Å
380 gb|AAC73970.1|(39)-‐gb|AAC73970.1|(42)
translation initiation factor IF-‐1 |(39)-‐translation initiation factor IF-‐1 |(42)
VELENGHVVTAHISGKMR(16)-‐KNYIR(1)
3 5 1.31E-‐17 NO
381 gb|AAC74709.1|(85)-‐gb|AAC74709.1|(90)
tyrosyl-‐tRNA synthetase |(85)-‐tyrosyl-‐tRNA synthetase |(90)
FQQAGHKPVALVGGATGLIGDPSFKAAER(25)-‐
KLNTEETVQEWVDK(1) 12 6.70E-‐16 YES YES 11.2 Å
382 ref|NP_417046.1|(293)-‐ ref|NP_417046.1|(375)
serine hydroxymethyltransferase |(293)-‐ serine hydroxymethyltransferase |(375)
TYQQQVAKNAK(8)-‐GFKEAEAK(3)
4 6.45E-‐17 YES YES 8.1 Å
383 ref|NP_417046.1|(331)-‐ ref|NP_417046.1|(62)
serine hydroxymethyltransferase |(331)-‐ serine hydroxymethyltransferase |(62)
NLTGKEADAALGR(5)-‐YAEGYPGKR(8)
4 25 4.28E-‐22 YES YES 11.4 Å
384 ref|NP_417046.1|(346)-‐ ref|NP_417046.1|(331)
serine hydroxymethyltransferase |(346)-‐ serine hydroxymethyltransferase |(331)
ANITVNKNSVPNDPK(7)-‐NLTGKEADAALGR(5)
3 3.93E-‐10 YES YES 5.5 Å
385 ref|NP_417046.1|(346)-‐ ref|NP_417046.1|(62)
serine hydroxymethyltransferase |(346)-‐ serine hydroxymethyltransferase |(62)
ANITVNKNSVPNDPK(7)-‐YAEGYPGKR(8)
2 4.67E-‐09 YES YES 12.4 Å
386 ref|NP_416993.2|(116)-‐ref|NP_416993.2|(148)
uracil phosphoribosyltransferase |(116)-‐uracil phosphoribosyltransferase |(148)
NEETLEPVPYFQKLVSNIDER(13)-‐KAGCSSIK(1)
1 9.67E-‐06 YES YES 13.5 Å
387 ref|NP_416993.2|(26)-‐ref|NP_416993.2|(14)
uracil phosphoribosyltransferase |(26)-‐uracil phosphoribosyltransferase |(14)
EQDISTKR(7)-‐HKLGLMR(2) 1 3.59E-‐09 YES YES 14.2 Å
388 ref|NP_416993.2|(26)-‐ref|NP_416993.2|(26)
uracil phosphoribosyltransferase |(26)-‐uracil phosphoribosyltransferase |(26)
EQDISTKR(7)-‐EQDISTKR(7) 3 1.57E-‐06 YES ? 0.0 Å
389 ref|NP_416993.2|(67)-‐ref|NP_416993.2|(70)
uracil phosphoribosyltransferase |(67)-‐uracil phosphoribosyltransferase |(70)
VTIEGWNGPVEIDQIKGK(16)-‐KITVVPILR(1)
2 7.26E-‐17 YES YES 8.3 Å
390 gb|AAC73282.1|(10)-‐gb|AAC73282.1|(2)
uridylate kinase |(10)-‐uridylate kinase |(2) PVYKR(4)-‐ATNAK(1) 1 1.76E-‐07 YES NO
391 gb|AAC73282.1|(68)-‐gb|AAC73282.1|(212)
uridylate kinase |(68)-‐uridylate kinase |(212)
GAGLAKAGMNR(6)-‐DHKLPIR(3)
1 4.51E-‐07 YES YES 17.8 Å
392 gb|AAA97155.1|(909)-‐gb|AAA97155.1|(926)
valyl-‐tRNA synthetase |(909)-‐valyl-‐tRNA synthetase |(926)
IENKLANEGFVAR(4)-‐APEAVIAKER(8)
5 17 4.22E-‐16 NO
393 gb|AAA57977.1|(392)-‐gb|AAA57977.1|(412)
yhbF; phosphoglucosamine mutase |(392)-‐yhbF; phosphoglucosamine mutase
|(412)
YTAGSGDPLEHESVKAVTAEVEAALGNR(15)-‐KSGTEPLIR(1)
10 2.08E-‐13 NO
394 gb|AAA57977.1|(412)-‐gb|AAA57977.1|(305)
yhbF; phosphoglucosamine mutase |(412)-‐yhbF; phosphoglucosamine mutase
|(305) KSGTEPLIR(1)-‐AKVGDR(2) 1 2.16E-‐05 NO
Summary
Of the 394 cross-links, 237 map to proteins or protein complexes with structural models deposited in the PDB. Of these 237, 179 (75.5%) are consistent with the structural models.
From Exp_1 From Exp_2 Overlap Union
235 cross-‐links (656 spectra) 208 cross-‐links (1372 spectra) 49 cross-‐links (923 spectra) 394 cross-‐links (2028 spectra)
51 inter-‐protein (91 spectra) 75 inter-‐protein (195 spectra) 2 inter-‐protein (68 spectra) 124 inter-‐protein (286 spectra)
184 intra-‐protein (565 spectra) 133 intra-‐protein (1177 spectra) 47 intra-‐protein (855 spectra) 270 intra-‐protein (1742 spectra)
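The Union column follows from inclusion-exclusion on the cross-link counts, while the spectra simply add across the two experiments. A minimal Python check of the figures above is sketched below; the variable and function names are illustrative and do not come from the source.

```python
# Counts copied from the summary table: (cross-links, spectra).
exp1 = {"inter": (51, 91), "intra": (184, 565)}
exp2 = {"inter": (75, 195), "intra": (133, 1177)}
overlap_links = {"inter": 2, "intra": 47}  # cross-links seen in both experiments


def union_counts(kind):
    """Return (unique cross-links, total spectra) for one cross-link type."""
    links = exp1[kind][0] + exp2[kind][0] - overlap_links[kind]
    spectra = exp1[kind][1] + exp2[kind][1]
    return links, spectra


inter = union_counts("inter")
intra = union_counts("intra")
total = (inter[0] + intra[0], inter[1] + intra[1])

# Share of structurally mapped cross-links that agree with the PDB models.
consistent_pct = round(100 * 179 / 237, 1)
```

With the tabulated inputs, `total` reproduces the 394 cross-links and 2028 spectra in the Union column.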
Supplementary Table 12. Inter-linked peptides identified from a C. elegans lysate. Filtering criteria: mass accuracy 10 ppm, FDR < 5%, E-value < 0.01.
No. Protein1-Protein2 CGC name Peptide1-Peptide2 #Spec-total #Spec-exp1 #Spec-exp2 Best E-value
inter-molecular
1 C04F12.4(102)-C52B9.9(135) [RPL-14]-[MEC-18] AKLTDFER(2)-KAQMR(1) 1 1 1.48E-06
2 C06A8.2(387)-C53H9.1(108) [C06A8.2]-[RPL-27] ANAKLVEVK(4)-KALVEVK(1) 1 1 7.05E-03
3 C09D4.5(152)-T08A11.2(673) [RPL-19]-[T08A11.2] AKQLADQAQAR(2)-KSWQAR(1) 1 1 1.10E-03
4 C16A3.9(106)-C41G7.3(281)/R10D12.17(232) [RPS-13]-[C41G7.3/SRW-14] KDIDSK(1)-KHIER(1) 2 2 7.30E-07
5 C53B4.5(1)/Y41E3.2(1)-M163.4(731) [COL-119/DPY-4]-[GFI-3] DIDSKIKAYR(1)-LLGKDPR(4) 1 1 4.37E-03
6 F10B5.1(81)-F55C5.7a(383) [RPL-10]-[RSKD-1] NCGKDGFHLR(4)-KYVMK(1) 1 1 9.64E-04
7 F11C3.3(1144)-C18E9.7(832) [UNC-54]-[C18E9.7] AKSDLQR(2)-KSADR(1) 1 1 6.25E-07
8 F23D12.10(510)-F42G2.5(3)/Y57G11A.1a(492)/Y57G11A.1b(777) [F23D12.10]-[F42G2.5/TAG-273] KNWLR(1)-MPEKK(4) 1 1 8.65E-08
9 T14G12.2(206)-Y82E9BR.18(802) [AEX-4]-[Y82E9BR.18] LQEPKLNR(5)-IQEKR(4) 1 1 6.02E-04
10 Y111B2A.16(0)-F32D8.2(488) [TAF-7.2]-[F32D8.2] MSIYPGVR(1)-THKCVR(3) 1 1 9.89E-04
intra-molecular
11 C04F6.1(1572)-C04F6.1(1565)/F59D8.2(1572)-F59D8.2(1565) VIT-5/VIT-4 FLKEAR(3)-HSKNAR(3) 1 1 2.74E-07
12 C14B9.7(77)-C14B9.7(86) RPL-21 GAVGIIVNKR(9)-GNILPKR(6) 1 1 3.59E-07
13 C53H9.1(116)-C53H9.1(108) RPL-27 SKFEER(2)-KALVEVK(1) 1 1 6.09E-07
14 F11C3.3(1144)-F11C3.3(1139) UNC-54 AKSDLQR(2)-SKADR(2) 1 1 5.00E-09
15 F11C3.3(979)-F11C3.3(971) UNC-54 QSKDHQIR(3)-KAESEK(1) 1 1 3.27E-04
16 F46E10.10a(239)-F46E10.10a(237) MDH-1 KLSSAMSAAK(1)-GGVIIEKR(7) 2 2 1.19E-09
17 F52D10.3a(12)-F52D10.3a(76) FTT-2 AKLAEQAER(2)-KQQMAK(1) 4 1 3 2.30E-07
18 F53G12.10(51)-F53G12.10(47) RPL-7 AEKYVQEYR(3)-TQYFKR(5) 1 1 6.53E-07
19 F53G12.10(99)-F53G12.10(102) RPL-7 GINQLHPKPR(8)-KALQILR(1) 3 3 1.25E-11
20 F55D10.2(60)-F55D10.2(53) RPL-25.1 TSKMDHFR(3)-KSAPK(1) 1 1 3.17E-06
21 K02F2.2(323)-K02F2.2(332) AHCY-1 DTIKPQVDR(4)-YTLKNGR(4) 1 1 3.66E-07
22 K10B3.7(256)-K10B3.7(265) GDP-3 LEKPASLDDIK(3)-KVIK(1) 1 1 4.48E-04
23 K12F2.1(1850)-K12F2.1(1854) MYO-3 HQDTEKNWR(6)-KAER(1) 1 1 3.46E-06
24 M01F1.2(100)-M01F1.2(78) RPL-16 GNEALKNLR(6)-APGKIFWR(4) 4 4 3.04E-15
25 M117.2(12)-M117.2(76) PAR-5 AKLAEQAER(2)-KQQLAK(1) 5 2 3 3.86E-11
26 M117.2(144)-M117.2(123) PAR-5 AAVVEKSQK(6)-MKGDYYR(2) 2 2 1.13E-11
27 R13A5.8(56)-R13A5.8(49) RPL-9 KWFGVR(1)-IGKSTLR(3) 1 1 1.01E-05
28 R13A5.8(62)-R13A5.8(49) RPL-9 KELAAIR(1)-IGKSTLR(3) 9 3 6 7.73E-13
29 R13A5.8(62)-R13A5.8(56) RPL-9 KELAAIR(1)-KWFGVR(1) 3 3 3.26E-11
30 T05E11.1(197)-T05E11.1(206) RPS-5 KKDELER(1)-VAKSNR(3) 2 2 2.03E-08
31 T05E11.1(198)-T05E11.1(206) RPS-5 KDELER(1)-VAKSNR(3) 2 1 1 5.66E-07
32 T05F1.3(125)-T05F1.3(129) RPS-19 ILSKQGR(4)-KDLDR(1) 3 1 2 2.72E-06
33 Y105E8B.1a(127)-Y105E8B.1a(127) LEV-11 KVMENR(1)-KVMENR(1) 2 2 3.08E-09
34 Y105E8B.1a(232)-Y105E8B.1a(127) LEV-11 LKEAETR(2)-KVMENR(1) 1 1 1.30E-07
35 Y105E8B.1a(34)-Y105E8B.1a(27) LEV-11 QITEKLER(5)-ADAAEEKVR(7) 2 2 2.30E-11
36 Y24D9A.8a(312)-Y24D9A.8a(305) Y24D9A.8 TLEKLIEAK(4)-NFAKDAR(4) 2 2 1.31E-10
37 Y38A10A.5(368)-Y38A10A.5(361) CRT-1 KKAEEEK(1)-KAEEEAR(1) 1 1 1.40E-07
38 Y38H6C.1a(34)-Y38H6C.1a(32) DCT-16 KDDEPER(1)-IAATYKK(6) 2 1 1 3.04E-08
39 ZC434.2(95)-ZC434.2(101) RPS-7 DILILAKR(7)-ILPKPQR(4) 2 2 3.33E-11
Supplementary Table 13. Cross-linking analysis of the CNGP complex using DSS
No. Inter-linked sites #total spec (#pair) Best E-value Cα-Cα Distance (Å) Manual eval. of spec qual. Compatible with structure?
1 Cbf5_K161-Cbf5_K114 7 (1) 1.34E-‐08 23.99 High Yes
2 Cbf5_K161-‐Gar1_K115 19 (2) 5.66E-‐12 18.85 High Yes
3 Cbf5_K161-‐Gar1_K59 4 (1) 7.67E-‐14 21.95 High Yes
4 Cbf5_K180-Cbf5_K134 11 (1) 1.15E-‐18 10.99 High Yes
5 Cbf5_K180-‐Cbf5_K137 2 (1) 7.99E-‐17 13.58 High Yes
6 Cbf5_K180-Nop10_K18 9 (1) 2.42E-‐15 12.98 High Yes
7 Cbf5_K180-Nop10_K19 5 (1) 1.92E-‐12 15.46 High Yes
8 Cbf5_K267-Cbf5_K31 10 (1) 5.68E-‐15 11.38 High Yes
9 Cbf5_K87-Cbf5_K114 10 (1) 3.50E-‐12 14.71 High Yes
10 Cbf5_K9-‐Cbf5_K50 7 (2) 2.94E-‐13 16.57 Low Yes
11 Gar1_K77-‐Cbf5_K161 5 (1) 4.92E-‐07 18.89 High Yes
12 Gar1_K77-Gar1_K115 22 (2) 9.61E-‐16 11.01 High Yes
13 Nhp2_K131-‐Nhp2_K133 2 (1) 2.79E-‐05 5.86 High Yes
14 Nhp2_K143-‐Nhp2_K133 3 (1) 7.37E-‐06 16.4 High Yes
15 Nhp2_K45-‐Nhp2_K49 2 (1) 6.92E-‐08 6.07 High Yes
16 Nop10_K1-Nop10_K19 6 (2) 1.11E-‐20 11.48 High Yes
17 Nop10_K28-Nop10_K49 2 (1) 5.55E-04 n/a Low
18 Nop10_K40-Nhp2_K61 15 (2) 1.26E-11 12.3 High Yes
19 Nop10_K40-Nhp2_K65 4 (1) 3.84E-‐10 13.96 High Yes
20 Nop10_K40-Nhp2_K69 3 (1) 1.36E-‐03 17.18 High Yes
21 Nop10_K40-Nop10_K49 2 (1) 4.80E-05 n/a Low
22 Gar1_K115-Gar1_K104 1 (1) 3.42E-05 24.75 Mid No
Filtering criteria: 10 ppm mass accuracy, FDR < 5%, E-value < 0.01, and ≥ 2 spectral copies (except Gar1_K115-‐Gar1_K104). Shown in bold are 12 cross-linked lysine pairs observed previously in the BS3 experiments (Supplementary Table 8). The Gar1_K115-‐Gar1_K104 cross-link is listed because it was observed nine times with BS3. n/a, not available (either or both residues are invisible in the CNGP structure).
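The E-value and spectral-copy criteria in the note above can be expressed as a short filter. The sketch below is illustrative only: it assumes the 10 ppm mass-accuracy and FDR cuts were applied upstream, the function and constant names are not from the source, and the last row in the example is hypothetical.

```python
MAX_E_VALUE = 0.01          # E-value must be below this
MIN_SPECTRA = 2             # at least two spectral copies
EXCEPTIONS = {"Gar1_K115-Gar1_K104"}  # listed despite a single spectrum


def passes_filter(sites, spectra, best_e_value):
    """Apply the E-value and spectral-copy criteria to one cross-link."""
    if best_e_value >= MAX_E_VALUE:
        return False
    return spectra >= MIN_SPECTRA or sites in EXCEPTIONS


rows = [
    ("Cbf5_K161-Cbf5_K114", 7, 1.34e-08),   # from the table
    ("Nop10_K40-Nhp2_K69", 3, 1.36e-03),    # from the table
    ("Gar1_K115-Gar1_K104", 1, 3.42e-05),   # from the table (exception)
    ("Hypo_K1-Hypo_K2", 1, 1.00e-06),       # hypothetical single-spectrum hit
]
kept = [sites for sites, n_spec, e_val in rows if passes_filter(sites, n_spec, e_val)]
```

Only the hypothetical single-spectrum row is dropped; the Gar1_K115-Gar1_K104 pair survives through the explicit exception.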
Supplementary Table 14. Cross-linking analysis of the CNGP complex using EDC
No. Inter-linked sites #total spec (#pair) Best E-value Cα-Cα Distance (Å) Manual eval. of spec qual. Compatible with structure?
1 Cbf5_K267-‐Cbf5_D273 9 (2) 1.27E-‐07 19.11 High No
2 Gar1_K108-‐Gar1_D96 2 (1) 4.91E-‐06 19.06 High No
3 Gar1_K115-‐Gar1_D107 11 (1) 9.44E-‐07 20.46 High No
4 Gar1_K115-‐Gar1_E105 4 (1) 1.49E-‐04 24.12 High No
5 Gar1_K72-‐Gar1_E80 3 (1) 5.21E-‐09 21.27 Mid No
6 Nhp2_K143-‐Nhp2_E152 6 (1) 2.36E-‐05 14.17 High No
7 Nhp2_K37-‐Nhp2_D18 5 (1) 5.40E-‐10 n/a Mid
Filtering criteria: 10 ppm mass accuracy, FDR < 5%, E-value < 0.01, and ≥ 2 spectral copies. The maximum Cα-Cα distance of an EDC cross-linked K-D or K-E pair is expected to be 9.7 Å or 11 Å, respectively. This is calculated based on the projected distances of C-C, C-O, C-N (amide), and C-N (amine) bonds, at 1.26, 1.17, 1.15, and 1.19 Å, respectively.
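The "Compatible with structure?" column can be reproduced by comparing each measured Cα-Cα distance with the residue-specific maximum given in the note. The following is a sketch under the assumption that a distance at or below the stated maximum counts as compatible; the function name and the parsing of the site label are illustrative choices, not part of the source.

```python
# Maximum expected Cα-Cα distance (Å) for an EDC cross-link, keyed by the
# acidic residue type, as stated in the table note.
MAX_CA_CA = {"D": 9.7, "E": 11.0}


def compatible(sites, distance):
    """Return 'Yes' or 'No', or '' when no distance is available (n/a)."""
    if distance is None:
        return ""
    # 'Cbf5_K267-Cbf5_D273' -> second site 'Cbf5_D273' -> residue letter 'D'
    acid_residue = sites.split("-")[1].split("_")[1][0]
    return "Yes" if distance <= MAX_CA_CA[acid_residue] else "No"


table_14 = [
    ("Cbf5_K267-Cbf5_D273", 19.11),
    ("Gar1_K108-Gar1_D96", 19.06),
    ("Gar1_K115-Gar1_D107", 20.46),
    ("Gar1_K115-Gar1_E105", 24.12),
    ("Gar1_K72-Gar1_E80", 21.27),
    ("Nhp2_K143-Nhp2_E152", 14.17),
    ("Nhp2_K37-Nhp2_D18", None),  # n/a in the table
]
calls = [compatible(sites, dist) for sites, dist in table_14]
```

Every pair with a measurable distance exceeds its maximum, which matches the "No" entries in the table.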
Supplementary Table 15. Cross-linking analysis of the CNGP complex using AMAS
No. Inter-linked sites #total spec (#pair) Best E-value Cα-Cα Distance (Å) Manual eval. of spec qual. Compatible with structure?
1 Cbf5_K114-‐Cbf5_C339 4 (1) 8.67E-‐08 13.07 High Yes
2 Cbf5_K358-‐Cbf5_C339 6 (2) 1.63E-‐06 n/a High
3 Cbf5_K370-‐Cbf5_C339 3 (1) 3.66E-‐07 n/a Mid
4 Cbf5_K383-‐Cbf5_C339 5 (1) 2.29E-‐06 n/a High
5 Gar1_K77-‐Gar1_C94 7 (1) 6.84E-‐11 6.82 High Yes
Filtering criteria: 10 ppm mass accuracy, FDR < 5%, E-value < 0.01, and ≥ 2 spectral copies. The Cα-Cα distance of a K-C pair cross-linked by AMAS is expected to be less than 16.5 Å. This is calculated based on the projected distances of C-C, C-O, C-N (amide), C-N (amine), C-S, and S-S bonds, at 1.26, 1.17, 1.15, 1.19, 1.44, and 1.50 Å, respectively.
Supplementary Table 16. Cross-linking analysis of the CNGP complex using Sulfo-GMBS
No. Inter-linked sites #total spec (#pair) Best E-value Cα-Cα Distance (Å) Manual eval. of spec qual. Compatible with structure?
1 Cbf5_K114-Cbf5_C339 7 (1) 2.52E-‐07 13.07 High Yes
2 Cbf5_K137-‐Cbf5_C125 6 (1) 2.40E-‐07 17.73 High Yes
3 Cbf5_K161-‐Cbf5_C125 3 (1) 9.62E-‐08 26.76 High No
4 Cbf5_K161-‐Cbf5_C190 11 (1) 7.99E-‐07 23.68 High No
5 Cbf5_K358-‐Cbf5_C280 2 (1) 4.17E-‐07 n/a High
6 Cbf5_K358-Cbf5_C339 15 (3) 4.08E-‐11 n/a High
7 Cbf5_K360-‐Cbf5_C339 2 (1) 4.14E-‐06 n/a High
8 Cbf5_K363-Cbf5_C339 4 (2) 1.21E-‐06 n/a High
9 Cbf5_K367-‐Cbf5_C339 4 (1) 1.01E-‐07 n/a High
10 Cbf5_K370-‐Cbf5_C339 11 (2) 4.78E-‐09 n/a High
11 Cbf5_K383-‐Cbf5_C280 3 (1) 5.76E-‐08 n/a High
12 Cbf5_K383-Cbf5_C339 6 (1) 1.22E-‐05 n/a High
13 Cbf5_K50-‐Cbf5_C280 6 (1) 9.15E-‐07 21.45 High No
14 Cbf5_K50-‐Gar1_C94 3 (1) 3.11E-‐07 61.68 High No
15 Gar1_K59-‐Cbf5_C190 2 (1) 1.44E-‐06 24.88 Mid No
16 Gar1_K59-‐Gar1_C94 3 (1) 3.73E-‐07 21.44 High No
17 Gar1_K77-Gar1_C94 14 (2) 1.66E-‐12 6.82 High Yes
18 Nop10_K19-‐Cbf5_C125 2 (1) 7.48E-‐08 17.13 High Yes
19 Nop10_K40-‐Cbf5_C125 4 (1) 7.64E-‐07 32.61 High No
20 Nop10_K40-‐Gar1_C94 3 (1) 9.30E-‐10 56.17 High No
Filtering criteria: 10 ppm mass accuracy, FDR < 5%, E-value < 0.01, and ≥ 2 spectral copies. Cross-links observed with AMAS (see Supplementary Table 15) are highlighted in yellow. The maximum Cα-Cα distance of a K-C pair cross-linked by Sulfo-GMBS is 19 Å. This is calculated based on the projected distances of C-C, C-O, C-N (amide), C-N (amine), C-S, and S-S bonds, at 1.26, 1.17, 1.15, 1.19, 1.44, and 1.50 Å, respectively.
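The overlap between the AMAS and Sulfo-GMBS results can be checked with a set intersection over the site labels. This is a minimal sketch; the labels are copied from Supplementary Tables 15 and 16 with plain ASCII hyphens, and the variable names are illustrative.

```python
amas = {
    "Cbf5_K114-Cbf5_C339", "Cbf5_K358-Cbf5_C339", "Cbf5_K370-Cbf5_C339",
    "Cbf5_K383-Cbf5_C339", "Gar1_K77-Gar1_C94",
}
sulfo_gmbs = {
    "Cbf5_K114-Cbf5_C339", "Cbf5_K137-Cbf5_C125", "Cbf5_K161-Cbf5_C125",
    "Cbf5_K161-Cbf5_C190", "Cbf5_K358-Cbf5_C280", "Cbf5_K358-Cbf5_C339",
    "Cbf5_K360-Cbf5_C339", "Cbf5_K363-Cbf5_C339", "Cbf5_K367-Cbf5_C339",
    "Cbf5_K370-Cbf5_C339", "Cbf5_K383-Cbf5_C280", "Cbf5_K383-Cbf5_C339",
    "Cbf5_K50-Cbf5_C280", "Cbf5_K50-Gar1_C94", "Gar1_K59-Cbf5_C190",
    "Gar1_K59-Gar1_C94", "Gar1_K77-Gar1_C94", "Nop10_K19-Cbf5_C125",
    "Nop10_K40-Cbf5_C125", "Nop10_K40-Gar1_C94",
}

shared = amas & sulfo_gmbs          # cross-links seen with both reagents
gmbs_only = sulfo_gmbs - amas       # cross-links seen only with Sulfo-GMBS
```

All five AMAS cross-links also appear in the Sulfo-GMBS list, leaving fifteen that are specific to Sulfo-GMBS.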
Supplementary Table 17. xQuest search of the GST [d0/d4]-BS3 cross-linking data
On-line xQuest search parameters:
(A) 4misclv_no-a-ion_0.01MS2
(B) 4misclv_a-ion_0.01MS2
(C) 4misclv_no-a-ion_0.02MS2
(D) 4misclv_a-ion_0.02MS2
(E) 2misclv_a-ion_0.01MS2
(F) 2misclv_no-a-ion_0.10MS2
Filtering of results (A) F-F hits (A) non F-F (B) F-F hits (B) non F-F (C) F-F hits (C) non F-F (D) F-F hits (D) non F-F (E) F-F hits (E) non F-F (F) F-F hits (F) non F-F
#spec with score >= 4 47 81 36 70 42 42 29 44 30 61 16 14
#spec with score >= 5 41 38 26 31 28 23 22 23 22 32 10 6
#spec with score >= 6 28 19 21 15 21 16 20 18 18 18 6 4
#spec with score >= 7 20 13 17 15 16 9 17 11 16 16 5 3
#spec with score >= 8 17 10 15 10 11 6 14 8 13 12 4 2
#spec with score >= 9 15 6 13 6 10 4 11 5 12 9 4 2
#spec with score >= 10 11 6 10 6 7 4 8 5 9 7 2 2
Top 10 7 3 7 3 7 3 8 2 5 5 6 4
Top 20 14 6 13 7 14 6 13 7 11 9 12 8
(i) #spec scoring higher than the 1st reverse match in own charge group 10 0 9 0 5 0 5 0 4 0 4 0
(ii) #spec scoring higher than the 2nd reverse match in own charge group 11 4 10 4 12 4 11 4 6 4 10 4
(iii) #spec scoring higher than the 3rd reverse match in own charge group 15 8 10 8 16 8 11 8 9 8 15 8
Details of (i), # of F-F hits for +3, +4, +5, and +6 spectra: (A) 5, 3, 1, and 1; (B) 1, 3, 2, and 3; (C) 1, 1, 2, and 1; (D) 1, 1, 0, and 3; (E) 2, 1, 1, and 0; (F) 4, 0, 0, and 0
Details of (ii), # of F-F hits for +3, +4, +5, and +6 spectra: (A) 5, 3, 1, and 2; (B) 1, 4, 2, and 3; (C) 4, 3, 3, and 2; (D) 2, 3, 3, and 3; (E) 2, 2, 1, and 1; (F) 6, 2, 2, and 0
Details of (iii), # of F-F hits for +3, +4, +5, and +6 spectra: (A) 7, 3, 3, and 2; (B) 1, 4, 2, and 3; (C) 8, 3, 3, and 2; (D) 2, 3, 3, and 3; (E) 3, 3, 2, and 1; (F) 10, 2, 3, and 0
The cross-linking data of GST exp_1 in Supplementary Table 7 were searched with xQuest on-line. For the search parameters indicated in the table: 2/4misclv, 2 or 4 missed cleavages allowed; a-ion/no-a-ion, a, b, and y ions or only b and y ions considered; 0.01/0.02/0.10MS2, MS/MS mass tolerance of 0.01, 0.02, or 0.1 m/z. Other parameters were: cross-linker BS3_delta4; enzyme trypsin; xlink mass-shift 138.0680796; monolink mass-shifts 156.0786442, 155.0964278; isotope shift 4.0247 Da; retention time tolerance +/- 3 minutes; reactive amino acid K; ionization mode ESI; fixed modification C 57.02146; MS1 mass tolerance 10 ppm; min 3 AA, max 40 AA. An F-F cross-link means that both peptides match the forward sequence of GST. The best result in each category is colored red. No spectra with charge > +6 scored 4 or higher.
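The parameter-set labels used in this table encode three settings each, so they can be unpacked mechanically. The sketch below is a hypothetical helper, not part of xQuest; it assumes the labels are written with plain ASCII hyphens and follow the `<n>misclv_<ion option>_<tolerance>MS2` pattern described in the note.

```python
import re

# <missed cleavages>misclv_<a-ion | no-a-ion>_<MS/MS tolerance>MS2
LABEL_RE = re.compile(r"^(\d+)misclv_(no-a-ion|a-ion)_(\d+\.\d+)MS2$")


def parse_label(label):
    """Split a parameter-set label into its three search settings."""
    match = LABEL_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognized parameter label: {label!r}")
    missed, ions, tolerance = match.groups()
    return {
        "missed_cleavages": int(missed),
        # a-ion: a, b, and y ions considered; no-a-ion: only b and y ions
        "ion_types": ("a", "b", "y") if ions == "a-ion" else ("b", "y"),
        "ms2_tolerance": float(tolerance),  # m/z
    }


param_sets = {
    "A": parse_label("4misclv_no-a-ion_0.01MS2"),
    "B": parse_label("4misclv_a-ion_0.01MS2"),
    "F": parse_label("2misclv_no-a-ion_0.10MS2"),
}
```

For example, `param_sets["B"]` records four missed cleavages, a, b, and y ions, and a 0.01 m/z fragment tolerance.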
Supplementary Table 18. Comparison of the GST [d0/d4]-BS3 cross-links identified by xQuest and pLink
GST cross-links identified by xQuest using parameter set (A) and filtering condition (i) in Supplementary Table 17 are shown.
Scan# Charge xQuest score GST_pep1–pep2 Sites in pep. Sites in GST intra/inter Dist. (Å) < 24 Å? Spec IDed by pLink? Xlink IDed by pLink? (Supp Table 7, exp_1)
1498 3 17.2 KRIEAIPQIDK–YLKSSK K1-K3 K180-K193 intra 14.2 Yes Yes, same result Yes (13 spectral copies)
3208 4 15.4 LLLEYLEEKYEEHLYERDEGDKWRNK–YLKSSK K22-K3 K39-K193 intra 28.8 No No No
3315 6 13.6 KFELGLEFPNLPYYIDGDVK–YIADKHNMLGGCPKERAEISMLEGAVLDIR K1-K14 K44-K86 inter 21.4 Yes No No
1481 5 13.3 IKGLVQPTR–YIADKHNMLGGCPKER K2-K14 K10-K86 inter 25.2 No No No
3435 3 12.9 IAYSKDFETLKVDFLSK–WRNKK K11-K4 K118-K43 inter 28.5 No Yes, different result (VDFLSKLPEMLK(6)-IAYSKDFETLK(5), K124-K112, 20.4 or 22.0 Å) No
3719 3 12.6 DFETLKVDFLSK–WRNKK K6-K4 K118-K43 inter 28.5 No No No
3247 4 11.6 DFETLKVDFLSK–LLLEYLEEKYEEHLYERDEGDK K6-K9 K118-K26 intra 37.7 No No No
3767 3 11.4 IAYSKDFETLK–RIEAIPQIDKYLK K5-K10 K112-K190 intra 36.4 No No No
4064 3 11.2 IAYSKDFETLK–RIEAIPQIDKYLK K5-K10 K112-K190 intra 36.4 No No No
3386 4 10.5 IAYSKDFETLKVDFLSK–WRNKK K11-K4 K118-K43 inter 28.5 No Yes, different result (VDFLSKLPEMLK(6)-IAYSKDFETLK(5), K124-K112, 20.4 or 22.0 Å) No
Supplementary Table 19. xQuest search of the GST [d0]-BS3 cross-linking data
On-line xQuest search parameters:
(A) 4misclv_no-a-ion_0.01MS2
(C) 4misclv_no-a-ion_0.02MS2
Filtering of results (A) F-F hits (A) non F-F (C) F-F hits (C) non F-F
#spec with score >= 4 59 113 59 113
#spec with score >= 5 46 94 45 76
#spec with score >= 6 42 71 32 56
#spec with score >= 7 28 50 23 32
#spec with score >= 8 15 32 15 22
#spec with score >= 9 12 18 11 17
#spec with score >= 10 8 12 8 12
Top 10 5 5 5 5
Top 20 8 12 8 12
(i) #spec scoring higher than the 1st reverse match in own charge group 5 0 7 0
(ii) #spec scoring higher than the 2nd reverse match in own charge group 7 4 9 4
(iii) #spec scoring higher than the 3rd reverse match in own charge group 9 8 11 8
Details of (i), # of F-F hits for +3, +4, +5, and +6 spectra: (A) 1, 2, 1, and 1; (C) 1, 3, 1, and 2
Details of (ii), # of F-F hits for +3, +4, +5, and +6 spectra: (A) 1, 2, 1, and 3; (C) 1, 3, 3, and 2
Details of (iii), # of F-F hits for +3, +4, +5, and +6 spectra: (A) 1, 2, 3, and 3; (C) 1, 3, 4, and 3
GST was cross-linked with [d0]-BS3 alone. Parameter sets (A) and (C) in Supplementary Table 17 were used for the on-line xQuest search, except that the isotope shift was set to zero. The best result is colored red.
Supplementary Table 20. Comparison of the GST [d0]-BS3 cross-links identified by xQuest and pLink
Cross-links colored blue are identified by both xQuest and pLink. For those colored red or orange, identical or similar ones are identified in Supplementary Table 7.
xQuest Result (search parameter set (C), filtering condition (i) from Supp Table 19)
Scan# Charge xQuest score GST_pep1–pep2 Sites in pep. Sites in GST Intra/inter Cα-Cα (Å) < 24 Å? Spec IDed by pLink? Xlink IDed by pLink?
3899 6 27.57 DEGDKWRNKKFELGLEFPNLPYYIDGDVK–DEGDKWRNKKFELGLEFPNLPYYIDGDVK K9-K9 K43-K43 intra 51.0 No No No
3896 5 27.18 DEGDKWRNKKFELGLEFPNLPYYIDGDVK–DEGDKWRNKKFELGLEFPNLPYYIDGDVK K5-K5 K39-K39 intra 52.9 No No No
3563 3 16.96 VDFLSKLPEMLK–VDFLSKLPEMLK K6-K6 K124-K124 intra 31.8 No Yes Yes
3737 4 15.18 IAYSKDFETLKVDFLSK–WRNKK K11-K4 K118-K43 inter 28.5 No No No
3680 4 10.43 YGVSRIAYSKDFETLK–YIAWPLQGWQATFGGGDHPPKSDLVPR K10-K21 K112-K217 n/a No No
3611 4 10.4 YIADKHNMLGGCPK–YIADKHNMLGGCPK K5-K5 K77-K77 intra 20.4 No No No
3439 6 5.94 LTQSMAIIRYIADKHNMLGGCPKER–YGVSRIAYSKDFETLK K14-K10 K77-K112 intra 40.7 No No No
pLink Result (precursor mass accuracy 10 ppm, fragment mass accuracy 20 ppm, FDR <5%, E-‐value <0.01)
Scan#.charge Spec qual. Best E-value GST_pep1–pep2 Sites in GST Intra/inter Cα-Cα (Å) < 24 Å? Spec IDed by xQuest? Xlink IDed by xQuest?
3249.3 2989.3 2984.4 3246.4 2993.5
high 7.13E-‐21
KFELGLEFPNLPYYIDGDVK(1)-‐IKGLVQPTR(2) NKKFELGLEFPNLPYYIDGDVK(3)-‐IKGLVQPTR(2) NKKFELGLEFPNLPYYIDGDVK(3)-‐IKGLVQPTR(2) KFELGLEFPNLPYYIDGDVK(1)-‐IKGLVQPTR(2) NKKFELGLEFPNLPYYIDGDVK(3)-‐IKGLVQPTR(2)
K44-‐K10 intra 20.04 Yes No No
1793.4 1799.3
high 7.87E-‐21 YIADKHNMLGGCPKER(14)-‐YIADKHNMLGGCPK(5) YIADKHNMLGGCPKER(14)-‐YIADKHNMLGGCPK(5)
K86-‐K77 inter 8.39 Yes No No
2541.3 high 1.14E-‐16 YIADKHNMLGGCPK(5)-‐SPILGYWK(1) K77-‐K1 intra 14.25 Yes No No
3387.3 high 1.30E-‐15 IEAIPQIDKYLK(9)-‐MSPILGYWK(1) K190-‐K0 n/a No No
1936.5 1940.3
high 1.26E-‐14 YIADKHNMLGGCPK(5)-‐IKGLVQPTR(2) YIADKHNMLGGCPK(5)-‐IKGLVQPTR(2)
K77-‐K10 intra 23.03 Yes No No
2377.3 high 8.19E-‐14 YIADKHNMLGGCPKER(14)-‐MSPILGYWK(1) K86-‐K0 n/a No No
2460.4 high 8.84E-‐14 IEAIPQIDKYLK(9)-‐IKGLVQPTR(2) K190-‐K10 intra 16.99 Yes No No
3092.3 high 1.43E-‐13 VDFLSKLPEMLK(6)-‐IKGLVQPTR(2) K124-‐K10 intra 23.51 Yes No No
1820.5 high 1.64E-‐13 YIADKHNMLGGCPKER(14)-‐KRIEAIPQIDK(1) K86-‐K180 intra 29.02 No No No
2335.3 2824.3
high 3.57E-‐13 IAYSKDFETLK(5)-‐IAYSKDFETLK(5)
IAYSKDFETLKVDFLSK(5)-‐IAYSKDFETLK(5) K112-‐K112 inter 16.80 Yes No No
1737.4 1738.3
high 1.57E-‐12 YIADKHNMLGGCPKER(14)-‐IKGLVQPTR(2) YIADKHNMLGGCPKER(14)-‐IKGLVQPTR(2)
K86-‐K10 inter 25.22 No No No
3534.4 high 2.28E-‐12 NKKFELGLEFPNLPYYIDGDVK(3)-‐LPEMLKMFEDR(6) K44-‐K130 inter 14.86 Yes No No
2114.5 2210.3
low 6.92E-‐12 YIADKHNMLGGCPKER(14)-‐RIEAIPQIDKYLK(10) YIADKHNMLGGCPKER(14)-‐IEAIPQIDKYLK(9)
K86-‐K190 inter 29.76 No No No
3695.3 high 2.09E-10 NKKFELGLEFPNLPYYIDGDVK(2)-MSPILGYWK(1) K43-K0 n/a No No
3774.3 3776.5
high 4.93E-‐10 NKKFELGLEFPNLPYYIDGDVK(3)-‐VDFLSKLPEMLK(6) NKKFELGLEFPNLPYYIDGDVK(3)-‐VDFLSKLPEMLK(6)
K44-‐K124 inter 15.77 Yes No No
2267.3 2263.4 2261.5
high 8.66E-‐10 YIADKHNMLGGCPKER(14)-‐MSPILGYWKIK(9) YIADKHNMLGGCPKER(14)-‐MSPILGYWKIK(9) YIADKHNMLGGCPKER(14)-‐MSPILGYWKIK(9)
K86-‐K8 inter 26.14 No No No
2677.4 high 1.77E-09 NKKFELGLEFPNLPYYIDGDVK(3)-YIADKHNMLGGCPKER(14) K44-K86 inter 21.35 Yes No No
1744.3 high 3.08E-‐09 KRIEAIPQIDK(1)-‐YLKSSK(3) K180-‐K193 intra 14.24 Yes No No
3801.3 mid 8.33E-‐09 KFELGLEFPNLPYYIDGDVK(1)-‐LPEMLKMFEDR(6) K44-‐K130 inter 14.86 Yes No No
1741.4 high 1.13E-‐08 YIADKHNMLGGCPKER(14)-‐LVCFKK(5) K86-‐K179 intra 26.29 No No No
3773.4 3894.4
mid 7.69E-‐06 NKKFELGLEFPNLPYYIDGDVK(2)-‐VDFLSKLPEMLK(6)
NKKFELGLEFPNLPYYIDGDVK(2)-‐DFETLKVDFLSKLPEMLK(12)
K43-‐K124 inter 19.13 Yes No No
3129.4 high 1.89E-‐05 NKKFELGLEFPNLPYYIDGDVK(2)-‐DEGDKWR(5) K43-‐K39 intra 6.42 Yes No No
2634.3 high 2.30E-‐05 YIADKHNMLGGCPK(5)-‐MSPILGYWK(1) K77-‐K0 n/a No No
3737.4 high 2.52E-‐05 VDFLSKLPEMLK(6)-‐IAYSKDFETLK(5)
K124-‐K112 intra/*inter
20.39/*22.02
Yes No No
1628.6 high 7.21E-‐05 YIADKHNMLGGCPKER(14)-‐YIADKHNMLGGCPKER(14) K86-‐K86 inter 21.29 Yes No No
3981.3 high 1.45E-‐04 FELGLEFPNLPYYIDGDVKLTQSMAIIR(19)-‐SPILGYWK(1) K63-‐K1 intra 12.10 Yes No No
2999.4 low 1.66E-‐04 YIAWPLQGWQATFGGGDHPPKSDLVPR(21)-‐LVCFKK(5) K217-‐K179 n/a No No
2150.4 high 2.03E-‐04 YIADKHNMLGGCPKER(14)-‐IEAIPQIDKYLKSSK(12) K86-‐K193 inter 32.23 No No No
3498.5 low 4.58E-‐04 KFELGLEFPNLPYYIDGDVK(1)-‐LLLEYLEEKYEEHLYER(9) K44-‐K26 intra 23.61 Yes No No
3563.3 low 4.54E-‐03 VDFLSKLPEMLK(6)-‐VDFLSKLPEMLK(6) K124-‐K124 inter 31.80 No Yes Yes
Supplementary Note for “Identification of Cross-linked Peptides from Complex Samples” by Yang et al.
pLink
• Datasets for software training and testing
The key to optimizing CXMS is to understand the fragmentation patterns of cross-linked peptides. Thus, we synthesized peptides mimicking cross-linked sequences after trypsin digestion. A total of 38 peptides of 5 to 28 amino acids were synthesized, each containing a lysine (K) or arginine (R) at the C-terminus and at least one non-C-terminal K for cross-linking (Supplementary Table 2). All possible pair combinations (741 in total, including self-self pairs) were treated with a 1:1 mix of [d0]- and [d4]-BS3 and analyzed one by one using reverse phase HPLC coupled to an LTQ-Orbitrap-ETD mass spectrometer. We collected normal-resolution collision-induced dissociation (CID), normal-resolution electron-transfer dissociation (ETD), and high-resolution higher-energy collisional dissociation (HCD) spectra for each precursor ion and found that HCD produced the best results, followed by ETD (data not shown). Therefore, we focused on HCD. A total of 2077 non-redundant HCD spectra were collected from inter-linked, synthetic peptides, including cross-linking isoforms of the same pair (Supplementary Fig. 4).
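As a quick check of the pair count above, the following sketch enumerates unordered pairs of 38 peptides with self-self pairs allowed. The peptide indices are placeholders, not the actual sequences from Supplementary Table 2.

```python
from itertools import combinations_with_replacement

n_peptides = 38  # number of synthetic peptides stated in the text

# Unordered pairs, self-self pairs included: n * (n + 1) / 2
pairs = list(combinations_with_replacement(range(n_peptides), 2))
n_pairs = len(pairs)
print(n_pairs)  # 741
```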
The datasets are listed in Supplementary Table 3. Positive datasets A and B were generated from pair-wise cross-linking of two synthetic peptides. The resulting HCD spectra were searched against a database of only the two target peptides that were in the cross-linking reaction. The two peptide sequences were considered in every possible form, either not cross-linked or cross-linked in silico as mono-, loop-, or inter-links. A precursor mass tolerance of 50 ppm was specified for database search using pFind with simple adaptation1. Only single cleavage products b+1, b+2, y+1, and y+2 were considered at this step, and a positive match required an E-value no greater than 0.005 and an FDR no greater than 1%. Dataset A is made up of 2077 non-redundant HCD spectra of cross-linked peptides, 1030 of which are from light [d0]-BS3 (subset A1) and 1047 from heavy [d4]-BS3 (subset A2). Dataset B contains 13267 spectra and is equivalent to dataset A plus redundant spectral copies. Dataset B can be divided into subsets B1 ([d0]-BS3 cross-links) and B2 ([d4]-BS3 cross-links). The rest of the spectra from the 741 peptide-pair experiments, which failed to identify inter-links, were collected into dataset C. Some spectra in dataset C represent mono- or loop-linked peptides, and some may be inter-link spectra of very poor quality. Dataset D contains HCD spectra of regular peptides identified from C. elegans lysates (not cross-linked). Datasets C and D are negative datasets. From the negative datasets C and D, 5060 spectra were randomly selected for training and another 5060 spectra were selected for testing. From the positive dataset B, 5468 and 5467 spectra were randomly selected for training and testing, respectively. There was no overlap between the training set and the test set.
• Spectral quality filtering
Because the computational cost of cross-link search is high, it is necessary to remove spectra that are almost certainly not going to identify cross-links. A Spectral Quality Score (SQS) was calculated as described before using 14 spectral features listed in Supplementary Table 4. Of these, 12 spectral features have been described for CID spectra2. Since high resolution and high mass accuracy HCD spectra generally allow charge state determination of peptides and their fragments, we added two new features, the number of peaks with known charge-states and the fraction of peaks with known charge-states.
To determine the weight of each spectral feature in SQS calculation, we used Linear Discriminant Analysis (LDA). The high-quality cross-link spectra in positive dataset B were randomly sorted into two groups (6633 spectra in each), one for training and the other for testing. Similarly, two non-overlapping negative datasets were randomly selected from dataset C, one containing 6135 spectra for training and the other containing 6134 spectra for testing. The weights of the 14 features obtained from LDA are shown in Supplementary Table 4.
We found from the test data sets that an SQS threshold of 4.2 achieved 99.5% sensitivity, 94% accuracy, and 93% specificity, that is, 99.5% of the positive spectra were correctly retained and 93% of the negative spectra were correctly removed (Supplementary Fig. 6). Hence, all spectra were filtered by requiring an SQS greater than 4.2.
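A minimal sketch of how a linear score and a cut-off translate into sensitivity and specificity is given below. The feature weights and scores are made-up placeholders (the real weights are in Supplementary Table 4), and the function names are hypothetical.

```python
def linear_score(features, weights, bias=0.0):
    """Linear discriminant score: weighted sum of spectral features.
    The weights and bias here are hypothetical, not those of Supplementary Table 4."""
    return sum(f * w for f, w in zip(features, weights)) + bias

def sensitivity_specificity(pos_scores, neg_scores, threshold):
    """A spectrum is retained when its score is strictly greater than the threshold."""
    retained_pos = sum(s > threshold for s in pos_scores)
    removed_neg = sum(s <= threshold for s in neg_scores)
    return retained_pos / len(pos_scores), removed_neg / len(neg_scores)

# Toy scores for five positive and five negative spectra
pos = [6.1, 5.0, 4.8, 4.3, 3.9]
neg = [1.2, 2.5, 3.0, 4.5, 0.7]
sens, spec = sensitivity_specificity(pos, neg, threshold=4.2)
print(sens, spec)  # 0.8 0.8
```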
• Pre-processing
All peaks in a spectrum are classified into the 6 categories shown below. Only those in the first category are used for peptide-spectrum matching; the rest are removed.
1. Main peaks (monoisotopic)
2. Isotopic peaks
3. Peaks resulting from a neutral loss of ammonia
4. Peaks resulting from a neutral loss of water
5. Precursor ions
6. Noise peaks
Precursor ions and noise peaks are discarded first. Noise peaks are those that do not match any theoretical ions and appear in almost every spectrum. Shown in Supplementary Fig. 7 is the histogram (in 1 m/z bin) of unmatched peaks from 1030 HCD spectra (dataset Sub_A1, Supplementary Table 3). There are obvious noise peaks in the low m/z range, especially at 108, 153, and 200. In most spectra, these noise peaks tend to appear simultaneously.
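The binning behind such a histogram can be sketched as follows. The input is a hypothetical list of unmatched peak m/z values per spectrum; the toy numbers are illustrative only and are not taken from dataset Sub_A1.

```python
import math
from collections import Counter

def unmatched_peak_histogram(unmatched_mz_per_spectrum):
    """Count unmatched peaks in 1 m/z bins, pooled over all spectra."""
    counts = Counter()
    for mz_values in unmatched_mz_per_spectrum:
        for mz in mz_values:
            counts[math.floor(mz)] += 1
    return counts

# Three toy spectra; bins that recur across spectra hint at systematic noise peaks
toy = [[108.04, 153.1, 421.7], [108.05, 200.1], [108.04, 153.1, 200.1, 733.2]]
hist = unmatched_peak_histogram(toy)
recurrent = sorted(b for b, c in hist.items() if c >= 2)
print(recurrent)  # [108, 153, 200]
```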
Then isotopic peaks and ammonia- and water-loss peaks are identified and removed. These peaks are redundant and subsidiary to the main peaks they are affiliated with. If a main peak exists, its affiliated peaks are removed to reduce random matches.
After removing the peaks that contribute little to peptide-spectrum matching as above, the remaining peaks are called main peaks and their charge states are labeled if possible. The intensities of all main peaks are square root transformed to prevent peptide-spectrum matching from being excessively influenced by a few high-intensity peaks.
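The removal of affiliated peaks and the square-root transform can be illustrated with a simplified sketch. It assumes a single known charge state for all peaks, a fixed absolute tolerance, and that precursor and noise peaks have already been discarded; the function name, the tolerance, and the toy peak list are assumptions, not pLink's actual implementation.

```python
import math

ISOTOPE_SPACING = 1.003355  # approximate 13C - 12C mass difference, Da
NH3_MASS = 17.026549        # monoisotopic mass of ammonia, Da
H2O_MASS = 18.010565        # monoisotopic mass of water, Da

def keep_main_peaks(peaks, charge=1, tol=0.01):
    """peaks: list of (mz, intensity). A peak is dropped when it lies at the
    isotope, ammonia-loss, or water-loss offset of another peak in the list.
    Remaining (main) peaks are returned with square-root intensities."""
    offsets = (ISOTOPE_SPACING / charge, -NH3_MASS / charge, -H2O_MASS / charge)
    mz_values = [mz for mz, _ in peaks]
    kept = []
    for mz, intensity in peaks:
        affiliated = any(
            abs(mz - (parent + off)) <= tol
            for parent in mz_values
            for off in offsets
        )
        if not affiliated:
            kept.append((mz, math.sqrt(intensity)))
    return sorted(kept)

toy_peaks = [
    (500.0, 100.0),    # main peak
    (501.0034, 40.0),  # its isotopic peak
    (481.9894, 16.0),  # its water-loss peak
    (482.9735, 9.0),   # its ammonia-loss peak
    (300.0, 64.0),     # another main peak
]
main = keep_main_peaks(toy_peaks)
print(main)  # [(300.0, 8.0), (500.0, 10.0)]
```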
• Open search against large databases for complex protein mixtures
For CXMS analysis of complex protein mixtures, the database search space (all possible peptide pair combinations, or combinatorial mode) is prohibitively large. To make it feasible, we devised an open-search mode for any protein database of more than 100 proteins (Supplementary Fig. 8a). Briefly, before searching for a cross-linked peptide pair α−β as a whole, pLink looks for possible sequences of α and β separately by treating the other peptide as a modification on a lysine residue (pre-scoring). Then, the spectrum is scored against candidate α and β sequence pairs (fine-scoring).
The pre-scoring step resembles a conventional database search in that no candidate sequences are cross-linked in silico. However, unlike conventional search, which stipulates a narrow mass tolerance, the mass tolerance window is “opened up” to allow a spectrum to match to all candidate peptides whose theoretical masses are the same as or lower than that of the precursor. In addition, the candidate sequences must bear the specificity of the cross-linker and the digestion enzyme, and contain no more than a user-defined limit for missed cleavages. With BS3 and trypsin and no more than two missed cleavages, it means that the candidate sequence must have a K/R at the C-terminus and at least one but no more than three non-C-terminal K unless it is a protein N-terminal sequence, in which case zero, one, or two non-C-terminal K are allowed. For each filtered and pre-processed spectrum, a fast pre-scoring algorithm (described below) finds a list of possible sequences with a modification on K. The mass of the modification is the difference between the mass of the precursor and that of the peptide sequence being examined. By requiring a minimum –log(pre-score) value of 3.0, 89% of the non-target sequences were removed, while 99.7% of the targets were retained (Supplementary Fig. 8b).
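The candidate rule for BS3 and trypsin with at most two missed cleavages, together with the open mass window, can be sketched as below. The function names and the peptide masses in the toy list are placeholders chosen for illustration; only the filtering logic follows the text.

```python
def is_candidate(peptide, is_protein_n_term=False):
    """Rule stated in the text for BS3 + trypsin, up to two missed cleavages:
    C-terminal K/R, and 1 to 3 non-C-terminal K (0 to 2 for a protein N-terminal peptide)."""
    if len(peptide) < 2 or peptide[-1] not in "KR":
        return False
    internal_k = peptide[:-1].count("K")
    if is_protein_n_term:
        return internal_k <= 2
    return 1 <= internal_k <= 3

def open_window_candidates(peptides, precursor_mass):
    """peptides: iterable of (sequence, mass, is_protein_n_term).
    Keeps candidates no heavier than the precursor and reports the mass
    that would have to sit on a lysine as a 'modification'."""
    hits = []
    for seq, mass, n_term in peptides:
        if mass <= precursor_mass and is_candidate(seq, n_term):
            hits.append((seq, precursor_mass - mass))
    return hits

# Masses below are placeholders, not real peptide masses
toy = [
    ("IKGLVQPTR", 1000.0, False),
    ("GLVQPTR", 700.0, False),             # no internal K: rejected
    ("MSPILGYWK", 1100.0, True),           # protein N-terminal peptide: allowed
    ("YIADKHNMLGGCPKER", 3500.0, False),   # heavier than the precursor: rejected
]
hits = open_window_candidates(toy, precursor_mass=3000.0)
print(hits)  # [('IKGLVQPTR', 2000.0), ('MSPILGYWK', 1900.0)]
```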
After pre-scoring, candidate sequences are sorted into two groups. Those with a theoretical mass (the peptide itself, ignoring the mass of the modification) greater than (M – 50 ppm)/2, where M is the precursor mass, are classified as candidate α sequences. Similarly, those with a theoretical mass less than (M + 50 ppm)/2 are classified as candidate β sequences. In a cross-linked peptide pair α−β, α is always the one with a higher mass.
When all the spectra in the positive dataset “B” (13267 cross-link spectra, Supplementary Table 3) were searched against the source peptide sequences plus a background E. coli protein database (6164 forward and 6164 reversed protein sequences), it was found that 99% of the time, the correct α sequence was among the top 20 candidates while the correct β sequence was among the top 250. The average rank orders for the correct α and β sequences are 1.65 and 8.73, respectively. As such, the top 500 α and β candidates are kept for each spectrum, and the associated probability of losing the correct α or β sequence is 0.003.
Next, candidate α and β sequences are paired up to satisfy the following relationship within a specified mass tolerance range (default is 50 ppm for HCD): $M_{\alpha} + M_{\beta} + M_{\mathrm{linker}} = M$.
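The α/β split and the mass-based pairing can be sketched as follows. The linker mass of 138.06808 Da for a [d0]-BS3 cross-link is an assumption not stated in this note, the tolerance is interpreted as 50 ppm of the precursor mass, and the function names and toy masses are hypothetical.

```python
from bisect import bisect_left, bisect_right

BS3_LINKER_MASS = 138.06808  # assumed mass added by a [d0]-BS3 cross-link, Da

def split_alpha_beta(masses, precursor_mass, tol_ppm=50.0):
    """alpha candidates: mass > (M - tol)/2; beta candidates: mass < (M + tol)/2."""
    tol = precursor_mass * tol_ppm * 1e-6
    alphas = [m for m in masses if m > (precursor_mass - tol) / 2]
    betas = [m for m in masses if m < (precursor_mass + tol) / 2]
    return alphas, betas

def pair_candidates(alphas, betas, precursor_mass, tol_ppm=50.0,
                    linker=BS3_LINKER_MASS):
    """Return (M_alpha, M_beta) pairs with M_alpha + M_beta + M_linker = M within tolerance."""
    tol = precursor_mass * tol_ppm * 1e-6
    betas = sorted(betas)
    pairs = []
    for ma in alphas:
        target = precursor_mass - linker - ma
        lo = bisect_left(betas, target - tol)
        hi = bisect_right(betas, target + tol)
        pairs.extend((ma, mb) for mb in betas[lo:hi])
    return pairs

masses = [1800.0, 1061.93192, 1500.0, 900.0]  # toy candidate masses
alphas, betas = split_alpha_beta(masses, precursor_mass=3000.0)
pairs = pair_candidates(alphas, betas, precursor_mass=3000.0)
print(pairs)  # [(1800.0, 1061.93192)]
```

Sorting the β masses once and using binary search keeps the pairing close to linear in the number of α candidates, which matters when up to 500 candidates of each kind are kept per spectrum.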
The last step of the open search is to score the spectrum again against candidate α−β pairs cross-linked in silico using a refined scoring algorithm (see below). For low-complexity samples, filtered and pre-processed spectra can go straight to fine scoring against candidate α−β pairs generated directly from a small protein database (Fig. 1a).
• Pre-scoring algorithm
The pre-scoring algorithm in the open-search mode measures the significance of the match between ions in an experimental spectrum and peaks in the theoretical spectrum of a candidate sequence. It consists of three components X, R, and T, representing the number of matched ions, the average intensity ranking of matched ions, and the length of the longest sequence tag, respectively, that arise by chance.
(1) Statistic X: number of matched ions
In the case of a random match, the number of matched ions X follows a hypergeometric distribution. The probability mass function of X is as follows:
$$P(X = x) = \frac{\binom{n}{x}\binom{N-n}{l-x}}{\binom{N}{l}} \quad (1)$$
In equation (eq.) 1, “N” is the maximal number of distinguishable peaks, equal to the theoretical mass of a candidate sequence divided by the mass tolerance width for fragments; “n” is the number of peaks in the theoretical spectrum; “l” is the number of fragment ions in an experimental spectrum; and “x” is the number of fragment ions that match to the candidate sequence.
The probability of “X,” the number of matched ions arising by chance, reaching or exceeding “x” is calculated using eq. 2.
$$P(X \ge x) = \sum_{k=x}^{\min(n,\,l)} \frac{\binom{n}{k}\binom{N-n}{l-k}}{\binom{N}{l}} \quad (2)$$
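The hypergeometric tail probability for statistic X can be computed directly with exact binomial coefficients. This is a sketch under the symbol definitions given above (N, n, l, x); including x itself in the tail is an assumption about the convention, and the numbers in the example are arbitrary.

```python
from math import comb

def hypergeom_pmf(x, N, n, l):
    """P(X = x): x of the l experimental ions hit one of the n theoretical
    peaks, out of N distinguishable peak positions."""
    return comb(n, x) * comb(N - n, l - x) / comb(N, l)

def hypergeom_tail(x, N, n, l):
    """P(X >= x), summed up to the largest possible number of matches."""
    return sum(hypergeom_pmf(k, N, n, l) for k in range(x, min(n, l) + 1))

# Arbitrary illustration: 2000 positions, 40 theoretical peaks, 100 observed ions
p_few = hypergeom_tail(3, N=2000, n=40, l=100)
p_many = hypergeom_tail(12, N=2000, n=40, l=100)
print(p_few > p_many)  # True: many matches are far less likely by chance
```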
(2) Statistic R: mean intensity ranking of matched ions
Ions are ranked by intensity from high to low. The relative ranking of the ith ion is Ri (Ri = i/n, where i = 1, 2, 3, …, n). For randomly matched ions, we assume an independent and even distribution of Ri between 0 and 1. The expectation of Ri is 0.5 and the variance is 1/12. Let “R” be the mean relative ranking of a group of randomly matched ions. Then, R follows a normal distribution and the probability density function of R is:
$$f(r) = \frac{1}{\sqrt{2\pi\sigma^{2}}} \exp\left(-\frac{(r-\mu)^{2}}{2\sigma^{2}}\right) \quad (3)$$
In eq. 3, µ and σ² are the expectation and variance of R, respectively. For x matched ions:
$$\mu = \frac{1}{2} \quad (4)$$
$$\sigma^{2} = \frac{1}{12x} \quad (5)$$
For R (the mean intensity ranking of randomly matched ions) to be less than r (the observed mean intensity ranking of matched ions), the probability can be calculated as follows:
$$P(R \le r) = \int_{-\infty}^{r} f(u)\,du \quad (6)$$
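Under the normal approximation described for statistic R (mean 0.5, variance 1/(12x) for x matched ions), the lower-tail probability can be evaluated with the error function. The function name is hypothetical and the example values are arbitrary.

```python
from math import erf, sqrt

def rank_p_value(r, x):
    """P(R <= r) for the mean relative rank of x randomly matched ions,
    using a normal distribution with mean 0.5 and variance 1/(12x)."""
    mu = 0.5
    sigma = sqrt(1.0 / (12.0 * x))
    return 0.5 * (1.0 + erf((r - mu) / (sigma * sqrt(2.0))))

# The same observed mean rank is more significant when more ions are matched
p_3_ions = rank_p_value(0.2, x=3)
p_12_ions = rank_p_value(0.2, x=12)
print(p_12_ions < p_3_ions)  # True
```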
(3) Statistic T: length of the longest sequence tag
T is the amino acid length of the longest sequence tag obtained from a spectrum divided by that of a candidate peptide, so T ∈ [0,1]. Let a string of 0 and 1 indicate whether or not an amino acid residue is correctly identified at each and every position of a peptide. Assuming that there is a 50% chance of correct identification at any position, and that identification at each position is an independent event, then T is the longest run of 1s normalized to the length of the string. The probability distribution of T with different peptide lengths can be simulated using the Monte Carlo method and stored in a table. Given the length of a candidate peptide, the probability of T (by chance) > t (observed) can be looked up from the table. Supplementary Fig. 9 shows the probability values as a function of T for peptides of 5, 8, 10, 15, 18, 20, or 25 aa, as generated from Monte Carlo simulation.
Lastly, the pre-score is the product of the three p-values described above, written here as $P_X$ (eq. 2), $P_R$ (eq. 6), and $P_T$ (the table look-up for T):
$$\mathrm{Pre\_score} = P_X \cdot P_R \cdot P_T \quad (7)$$
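The Monte Carlo table for statistic T can be sketched as below, assuming independent positions with a 50% chance of a correct call. The number of trials, the seed, and the choice to tabulate P(longest run ≥ k) are assumptions made for this illustration.

```python
import random

def longest_run(bits):
    """Length of the longest run of truthy values."""
    best = cur = 0
    for b in bits:
        cur = cur + 1 if b else 0
        best = max(best, cur)
    return best

def simulate_tag_table(length, trials=20000, p=0.5, seed=0):
    """Return tail, where tail[k] estimates P(longest run >= k) for k = 0..length.
    Dividing k by length gives the normalized statistic T."""
    rng = random.Random(seed)
    counts = [0] * (length + 1)
    for _ in range(trials):
        run = longest_run(rng.random() < p for _ in range(length))
        counts[run] += 1
    tail, acc = [0.0] * (length + 1), 0
    for k in range(length, -1, -1):
        acc += counts[k]
        tail[k] = acc / trials
    return tail

table_5 = simulate_tag_table(5)
# A full-length tag in a 5-residue peptide should occur by chance about 1/32 of the time
print(round(table_5[5], 2))
```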
Supplementary Fig. 10 illustrates how Pre_score is computed. We found that with a Pre_score threshold of 10^-3, i.e. –log(Pre_score) ≥ 3, sensitivity reached 99.7%, specificity 89%, and accuracy 70% (Supplementary Fig. 8b).
Pre-scoring is essential for large database search. When 5455 spectra were searched against a database of 108 proteins on a personal computer (Intel Core 2 Quad CPU, 2.4 GHz, 2 GB RAM), pre-scoring reduced the search time from 10 hours to 43 minutes.
• Fine-scoring algorithm
The fine-scoring algorithm is an extension of the Kernel Spectral Dot Product (KSDP) algorithm previously developed for the database search engine pFind6. We took the following steps to refine the KSDP algorithm for cross-link identification.
(1) Selection of fragment ion types for scoring
Theoretically many types of fragment ions can be generated from inter-linked peptides, but in actuality some are rarely seen and some never. It is crucial to select the appropriate ion types for scoring. Failing to consider ions that do exist abundantly in cross-link spectra compromises identification, and so does careless inclusion of ions that exist only theoretically, because it does nothing but increase the chance of random matches.
To distinguish various fragment ion types, we used the systematic nomenclature suggested previously with slight modifications3.
In an inter-linked peptide pair, the higher-mass peptide is α and the lower-mass peptide is β; if two peptides are of the same mass but different sequence (extremely rare), α sequence is lower than β in alphabetical order; if two molecules of the same peptide cross-link, then it is both α and β.
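The α/β naming rule can be written as a small function. The masses in the example are placeholders, and comparing floating-point masses for exact equality is a simplification made for this sketch.

```python
def assign_alpha_beta(pep1, mass1, pep2, mass2):
    """Return (alpha, beta): the heavier peptide is alpha; for equal masses,
    the sequence that comes first alphabetically is alpha."""
    if mass1 != mass2:
        return (pep1, pep2) if mass1 > mass2 else (pep2, pep1)
    return (pep1, pep2) if pep1 <= pep2 else (pep2, pep1)

# Placeholder masses, for illustration only
print(assign_alpha_beta("IKGLVQPTR", 1000.0, "YIADKHNMLGGCPK", 1500.0))
# ('YIADKHNMLGGCPK', 'IKGLVQPTR')
```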
For single backbone cleavage products, only the dissociated peptide is indicated, e.g. βy₂¹⁺, βb₅²⁺, αy₈²⁺. If αy₈²⁺ contains the cross-linking site it may be labeled as αy₈²⁺x to indicate that the intact β peptide is attached to the α fragment through the linker. Suffix such as “–H2O” or “–18.0153” indicates neutral loss, e.g. αb₅¹⁺–NH3. For fragments that require two backbone cleavage events, they are labeled like αy₅αb₃¹⁺, βy₆βa₅²⁺, αb₅βy₂¹⁺, or αy₈βa₇³⁺ (illustrated in Supplementary Fig. 11a-c). Always, α precedes β if both peptides are dissociated and y precedes b or a when both cleavages occur on the same peptide. These fragments all contain the cross-link and may be decorated with “x” (e.g. αy₈βa₇³⁺x). Among these types are two special αyαa and βyβa fragments that result from enhanced cleavage of the nearest peptide bond N-terminal to a cross-linked lysine Cα atom and breakage of the C-C bond involving this Cα atom3. These are called K-linked fragments, labeled as KLα or KLβ, equivalent to a lysine immonium ion attached to an intact α or β peptide through the linker (Supplementary Fig. 11c). KLα or KLβ ions with a neutral loss of ammonia (KLα/β –17) are also common as reported before4.
Either amide bond in the BS3 linker can also break3,4. α/βL5 or α/βL13 refer to the resulting α or β peptide with a linker fragment (illustrated in Supplementary Fig. 11d). The suffix number refers to the number of atoms between the lysine Cα atom and the cleaved amide bond.
Since we know the absolute identifications of the 2077 non-redundant cross-link spectra from dataset A (Supplementary Table 3), we matched all theoretical fragments (peaks) to ions found in the experimental spectra after pre-processing. For each ion type, the following features were collected:
1. Count Ratio, the number of matched peaks of an ion type divided by the number of matched peaks of all ion types
2. Gain Ratio, the number of matched peaks of an ion type divided by the total number of theoretical peaks of that ion type
3. Average Intensity, the average normalized intensity of matched peaks belonging to the same ion type (normalized to the base peak intensity)
4. Match Significance, the product of Count Ratio, Gain Ratio, and Average Intensity
5. Average Mass Deviation, the average mass deviation of matched ions from the theoretical peaks they are matched to
6. Tag Length, the amino acid length of the longest sequence tag deduced from ions of the same type, normalized to the length of the peptide (for cleavages that are all on one side of the cross-link on either α or β peptide)
7. Average Number of Amino Acids, indicating on average how many amino acid residues constitute an ion of a given type
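The first four features can be computed per ion type as in the sketch below. The input dictionaries and all counts and intensities are toy values invented for illustration; they are not the values of Supplementary Table 5.

```python
def ion_type_features(matched, theoretical, intensities):
    """matched: {ion_type: number of matched peaks}
    theoretical: {ion_type: number of theoretical peaks}
    intensities: {ion_type: normalized intensities of the matched peaks}"""
    total_matched = sum(matched.values())
    features = {}
    for ion_type, m in matched.items():
        count_ratio = m / total_matched
        gain_ratio = m / theoretical[ion_type]
        values = intensities[ion_type]
        avg_intensity = sum(values) / len(values) if values else 0.0
        features[ion_type] = {
            "count_ratio": count_ratio,
            "gain_ratio": gain_ratio,
            "average_intensity": avg_intensity,
            "match_significance": count_ratio * gain_ratio * avg_intensity,
        }
    return features

toy = ion_type_features(
    matched={"y1+": 6, "b1+": 3, "yb1+": 1},
    theoretical={"y1+": 8, "b1+": 12, "yb1+": 10},
    intensities={"y1+": [0.5] * 6, "b1+": [0.2] * 3, "yb1+": [0.1]},
)
ranking = sorted(toy, key=lambda t: toy[t]["match_significance"], reverse=True)
print(ranking)  # ['y1+', 'b1+', 'yb1+']
```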
We ranked all ion types by match significance. The top 20 ion types are shown in Supplementary Table 5. The most significant ion type is y1+. Of all matched peaks, 21.8% are y1+, and 67.1% of all theoretical y1+ peaks have matching ions in experimental spectra. Moreover, matched y1+ peaks have much higher intensity compared to the rest of the ion types. The y1+ ions also tend to be continuous, generating sequence tags that can cover, on average, 61.7% of the peptide sequence.
Besides single-cleavage ion types, several double-cleavage ion types are also significant, such as yb1+, ya1+, α/βL, and KLα/β. The yb1+ ions include αyαb1+, βyβb1+, αyβb1+, and αbβy1+ sub-types. Similarly, the ya1+ ions include αyαa1+, βyβa1+, αyβa1+, and αaβy1+. Related ions αyβy1+, αbβb1+, and αaβa1+ are not included because they are too rare.
Precursor ions [M+5H]5+ and [M+4H]4+ are intense in HCD spectra of high charge state cross-links. Without pre-processing, [M+5H]5+ and [M+4H]4+ ranked 3rd and 4th in match significance, next only to y1+ and y2+ (data not shown). These precursor ions provide no sequence information, therefore they are not used for scoring. Although a fraction remains after pre-processing, they are out of the top 10 in match significance.
To further analyze the properties of fragment ions common in cross-link spectra, we divided each ion type into two sub-types depending on whether or not a fragment contains a cross-link. Their theoretical peak-experimental ion match properties were analyzed as above, and the results indicate that: a) the charge states of matched peaks containing a cross-link tend to be higher than those of peaks that do not; b) most of the matched peaks and ion intensity are accounted for by those containing no cross-links; c) for ion types b1+, y1+, a1+, and ya1+, the sub-types that contain a cross-link can be ignored; d) for ion types b2+, b3+, a2+, a3+, and y3+, the sub-types that do not contain a cross-link are negligible; e) for ion types y2+ and yb1+ both sub-types should be taken into account.
(2) Weighting factors in KSDP
The refined scoring function is based on the KSDP model 1:
(8)
In eq. 8, “S” is the sum of the intensity of all matched ions, “L” is the number of peaks to be weighted for continuity, “K” is the sum of the weight for ion continuity and the weight for coexistence of different types of ions supporting the same cleavage, and “θ” is a constant balancing K and S.
According to ion continuousness (feature “Tag Length”), only the ion types with a large Tag Length value such as y1+, y2+, b1+ and b2+ may be weighted for continuity. In contrast, all ion types are to be weighted for co-existence.
(9)
(10)
In the equations above “M” is a matrix, in which each row is an ion type and each column
represents a cleavage site. The value in each cell indicates whether or not a theoretical peak finds a matching ion in the experimental spectrum, 1 for yes and 0 for no. “l1” and “l2” are the lower and upper boundaries of a continuity window from which Kcontinuity is to be calculated. Similarly, “l3” and “l4” are the boundaries of a coexistence window. An example is shown in Supplementary Fig. 12. There are two continuity windows 1 (for b ions) and 2 (for y ions). Also, there are four coexistence windows 3 through 6 whose Kcoexistence values are to be computed. Window 3 emphasizes the cleavages immediately C-terminal to the cross-linking site in different ion types, and window 6 highlights fragments resulting from linker cleavage.
For inter-linked peptides α-β, Kcontinuity and Kcoexistence are computed separately for α and β, and the final K value is calculated as follows:
(11)
(3) Optimizing the use of ion types
To find out the best way to utilize the information carried by each ion type, we considered the following options: a) whether or not an ion type is to be used in KSDP scoring; b) for each ion type, whether its cross-link-containing and cross-link-free sub-types should be treated separately; c) for each ion type or sub-type, whether it should be weighted for continuity.
To be able to decide on each option or combination of options, we searched dataset_A1 (1030 non-redundant HCD spectra, Supplementary Table 3) against a target database and a decoy database.
The goal was to find the best setting to maximize the difference between the target database score and the decoy database score. Here we took normalized refined_scores (E-values, see below) and used the negative values of their natural logarithms, i.e. –ln(E-value), to compare differences. The initial setting and the optimal setting we finally arrived at are shown in Supplementary Table 6. In the final setting, yb1+ and ya1+ ions are limited to the ones that contain no more than 4 amino acids to reduce random match, since matched yb1+ and ya1+ peaks have an average of 3.226 and 1.772 amino acids (Supplementary Table 5). After optimization, the negative ln(E-value) of target database search increased by an average of 7.1772, whereas the decoy database search only increased by 1.918 (Supplementary Fig. 13). Similar results were obtained with 240 positively identified cross-link spectra from the CXMS analyses of GST and CNGP. The negative ln(E-value) of target database search increased by an average of 20.9 from the initial setting to the optimized setting, whereas the decoy database search increased by 11.9. Consideration of cross-link specific ions made a significant contribution to this improvement (Supplementary Fig. 14).
• Calculation of E-values
Refined_scores cannot be compared between spectra. The absolute score value is related to the precursor charge state and number of ions in a spectrum, and the length of a matching peptide, so a “busy” spectrum matched to a wrong sequence may have a higher Refined_score than a “sparse” spectrum matched to the right sequence. We normalize Refined_scores by converting them to E-values. The following steps are taken to calculate E-values.
1) Generate 5000 Theoretical Peptide Candidates (TPCs). For each TPC, randomly generate an amino acid at every position until the cumulative mass is just above the precursor mass minus the linker. Then split each TPC into halves, one half as α (marked as α’) and the other as β (marked as β’). TPCs from this step mimic the situation where both α and β identifications are out of random match.
2) Keep α peptide the same and randomly generate 5000 TPCs in place of β (marked as β’). For each TPC, randomly generate an amino acid at every position until the cumulative mass is just above the mass of β peptide.
3) Keep β peptide the same and randomly generate 5000 TPCs in place of α (marked as α’) in the same way. TPCs from step 2 and step 3 are meant to mimic the situation where the identification of one cross-linked peptide is correct but the other is by chance.
4) Compute Refined_score for a spectrum against every TPC in each of the three sets of 5000 (a total of 15,000 TPCs).
5) Take the top 10% Refined_scores and rank them from low to high. These are background data points.
6) Calculate empirical p-value for each Refined_score value, that is, at each score value, the empirical p-value for a score higher than current one is the number of data points with a higher score divided by the number of total data points (5000).
7) From the background data points above, find the maximum likelihood estimate of the parameters of the linear model between log(p-value) and Refined_score: log(p-value) = a*x + b (x is Refined_score, a<0, b>0)
8) Obtain the p-value(s) for the candidate cross-link(s) matched to a spectrum.
9) Calculate E-value by multiplying the p-value of a match by the number of candidate cross-links falling within the mass tolerance window from a search database.
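Steps 5 to 9 can be sketched as follows. Several choices here are assumptions rather than the pLink implementation: ordinary least squares stands in for the maximum likelihood fit (the two coincide under Gaussian errors), natural logarithms are used, and the empirical p-value counts the data point itself so that the top score does not produce log(0). The background scores are synthetic.

```python
import math

def fit_log_p_model(background_scores, top_fraction=0.10):
    """Fit ln(p) = a * score + b on the top fraction of background scores.
    The empirical p-value of the k-th best score is taken as k / n_total."""
    n_total = len(background_scores)
    ranked = sorted(background_scores, reverse=True)
    n_top = max(2, int(n_total * top_fraction))
    xs = ranked[:n_top]
    ys = [math.log((rank + 1) / n_total) for rank in range(n_top)]
    x_mean = sum(xs) / n_top
    y_mean = sum(ys) / n_top
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = y_mean - a * x_mean
    return a, b

def e_value(score, a, b, n_candidates):
    """Extrapolated p-value times the number of candidate cross-links."""
    return math.exp(a * score + b) * n_candidates

# Synthetic background whose survival function is exactly exp(-2 * score)
background = [-math.log(i / 1000) / 2 for i in range(1, 1001)]
a, b = fit_log_p_model(background)
print(round(a, 6), round(abs(b), 6))  # -2.0 0.0
ev = e_value(5.0, a, b, n_candidates=100)
```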
We tested other methods to generate background candidate peptides, for example, taking Real Peptide Candidates (RPCs) from the search database, or using other methods to generate TPCs (data not shown). The simple TPC method described above proved to be fast, effective and stable.
• Estimation and control of FDR
A major problem of CXMS is how to calculate and control the FDR associated with cross-link identifications. For this, we devised a modified version of the reversed database strategy5 to estimate FDR. The sequence of each protein entry in a forward or target database is reversed to create a decoy database the same size as the forward database. Peptide sequences from both the forward (F) and the reversed (R) database are cross-linked in silico in every possible combination, producing three categories of cross-links: those between two forward peptides (F-F), between two reversed peptides (R-R), and those between a forward peptide and a reversed peptide (F-R and R-F). As shown in Supplementary Fig. 15a, the F-F category is marked “T” for true because only this category contains true identifications; the R-R category is marked “F” for false; the F-R and R-F are collectively called “U” for union of F and R. The size ratio among T, F, and U is 1:1:2. For a random match, a spectrum has a ¼, ¼, and ½ chance to match to T, F, and U, respectively.
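The construction of the decoy database and the sizes of the three categories can be illustrated with a short sketch. It counts unordered peptide pairs (self-pairs included) for a hypothetical database of 100 forward peptides, for which the ratio is close to, but not exactly, 1:1:2.

```python
from itertools import combinations_with_replacement

def reverse_database(proteins):
    """Decoy database: every protein sequence reversed."""
    return [seq[::-1] for seq in proteins]

def category_counts(n_peptides):
    """Count in silico pairs among n forward and n reversed peptides:
    T = forward-forward, F = reversed-reversed, U = mixed."""
    labels = ["fwd"] * n_peptides + ["rev"] * n_peptides
    counts = {"T": 0, "F": 0, "U": 0}
    for a, b in combinations_with_replacement(labels, 2):
        if a == b == "fwd":
            counts["T"] += 1
        elif a == b == "rev":
            counts["F"] += 1
        else:
            counts["U"] += 1
    return counts

print(reverse_database(["MSPILGYWK"]))  # ['KWYGLIPSM']
counts = category_counts(100)
print(counts)  # {'T': 5050, 'F': 5050, 'U': 10000}
```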
To find out if this theoretical prediction holds true, we searched the cross-link data of the yeast UTP-B complex against three databases (a human protein database, an archaea protein database, and a random sequence database built by computer) that contain none of the UTP-B subunit sequences. Therefore, all matches are random. We found that the numbers of matches to T, F, and U closely followed the expected ratio of 1:1:2 for all three databases (Supplementary Fig. 15b). If the UTP-B protein sequences were included, then the number of matches to T increased, accompanied by a drop in both F and U. Thus, the theory holds true and we can use the number of matches to T, F, and U to estimate FDR.
Among the spectra that match to peptide pairs in T (NT), there are two types of false matches: (1) both peptide sequences are wrong and (2) one peptide sequence is correct but the other is wrong. The number of type 1 false matches is estimated by the number of spectra that match to F (NF); meanwhile, twice as many of them (2NF) are expected to match to U. Therefore, the number of type 2 false matches is estimated by (NU – 2NF). Hence, we derive the following:
$$N_{\mathrm{false}} = N_F + (N_U - 2N_F) = N_U - N_F \quad (12)$$
$$\mathrm{FDR} = \frac{N_U - N_F}{N_T} \quad (13)$$
In rare cases where NU – NF ≤ 0, FDR is estimated using the ratio NF/NT.
To test the accuracy of the estimated FDR using eq. 13, we pooled dataset A (cross-link spectra with known identity) and dataset D (spectra from non-cross-linked samples) together and searched a database containing target and decoy sequences. The result shows that the estimated FDR is in excellent agreement with the true FDR (Supplementary Fig. 15c). Only when FDR reaches above 40%, which is too high to be worthy of fine distinction, does the estimation begin to deviate.
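The FDR estimate follows directly from the counts of matches to T, F, and U: type 1 false matches are estimated by NF and type 2 by NU – 2NF, so their sum is NU – NF. A minimal sketch, with hypothetical counts in the example:

```python
def estimate_fdr(n_t, n_f, n_u):
    """FDR among matches to T: (N_U - N_F) / N_T, falling back to
    N_F / N_T in the rare case where N_U - N_F <= 0."""
    if n_t <= 0:
        raise ValueError("no target matches, FDR is undefined")
    false_in_t = n_u - n_f
    if false_in_t <= 0:
        return n_f / n_t
    return false_in_t / n_t

print(estimate_fdr(n_t=1000, n_f=10, n_u=40))  # 0.03
print(estimate_fdr(n_t=1000, n_f=10, n_u=8))   # 0.01 (fallback)
```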
• pLink performance tests
(1) Test#1–Small Dataset against Small Database
The test data consisted of 2077 cross-link spectra from dataset A and 3016 non-cross-link spectra from dataset D. In the database are the sequences of 38 synthetic peptides and the UTP-B proteins (six proteins), along with the reversed sequences. The search parameters are: precursor mass tolerance 50 ppm, fragment mass tolerance 20 ppm, cross-linker [d0]-BS3, search mode combinatorial (all peptide pair combinations enter fine scoring). Of the 2077 spectra in dataset A, 1030 are [d0]-BS3 cross-links and 1047 are [d4]-BS3 cross-links. Therefore, in this search the [d4]-BS3 cross-link spectra can match to one correct peptide sequence at best. This was done on purpose to create the scenario where only one peptide identification is right and the other is wrong. Altogether, we expect 1030 positive identifications and 4063 negative identifications (1047 + 3016) in a perfect search. Sensitivity, accuracy, and specificity are calculated as follows.
(14)
(15)
(16)
The search result shows that positive and negative identifications are well separated by E-value and precursor mass accuracy. A 10 ppm precursor mass accuracy was applied and the result was examined at varying FDR control levels. At 5% FDR, sensitivity is as high as 99%, accuracy 98%, and specificity 99%.
(2) Test#2–Small Dataset against Large Database
In open-search mode, the same test data above were searched against an E. coli database with 38 synthetic peptide sequences appended to it, along with the reversed sequences. There are 12328 protein sequences (forward+reversed) in this database, from which 657195 peptide sequences are derived. Repeating the same analysis method, we find a small drop in sensitivity, but sensitivity, accuracy, and specificity are still at 95% or above.
(3) Test#3–Large Dataset against Small Database
A total of 42051 spectra from datasets B and D and a subset of C (7668 spectra) constituted the test data. The search and filtering parameters were the same as above. As in test#1, only the 7706 [d0]-BS3 cross-link spectra (dataset_B1) were expected to be identified because the cross-linker in the search parameters was set to [d0]-BS3 only. The small database was the same one used in test#1. In the combinatorial-mode search, sensitivity, accuracy, and specificity are all above 97% at 5% FDR. The pLink performance in this test is comparable to that in test#1, suggesting that with a small database, a large number of interfering negative spectra hardly affects the result.
(4) Test#4–Large Dataset against Large Database
The same dataset as in test#3 (42051 spectra) was searched in open-search mode against the database used in test#2 (12328 proteins, or 657195 peptides). At 5% FDR, the sensitivity decreased to 92%, but accuracy and specificity remained above 95% (Fig. 1b).
• pLink run time in typical experiments
(1) Example #1: 5267 spectra + tiny database (6 proteins + reverse sequences)
hardware: personal computer, Intel Core 2 Quad CPU, 2.33 GHz, 2 GB RAM
search parameters: same as above
run time: 120 min in combinatorial search mode
(2) Example #2: 5455 spectra + medium database (108 proteins + reverse sequences)
hardware: personal computer, Intel Core 2 Quad CPU, 2.33 GHz, 2 GB RAM
search parameters: same as above
run time: 43 min in open search mode
(3) Example #3: 6403 spectra + large database (10065 proteins + reverse sequences)
hardware: computer cluster, 8 units of Dual Xeon 5405, 2.0 GHz, 4 GB RAM, 8 cores/unit
search parameters: same as above
run time: 46 min in open search mode
• Comparison of pLink and xQuest
To benchmark the performance of pLink, we compared it to xQuest⁶. The GST cross-linking data from experiment 1 in Supplementary Table 7 (using [d0]/[d4]-BS3) and a database containing GST and trypsin, along with their reversed sequences, were submitted to the xQuest web server (http://prottools.ethz.ch/orinner/public/htdocs/xquest/xquest_review.html).
As shown in Supplementary Table 17, the search parameters were varied in order to find the best setting for HCD data. Allowing 4 missed trypsin cleavage sites, no consideration of a-ions, an MS1 mass tolerance of 10 ppm, and an MS2 mass tolerance of either 0.01 or 0.02 m/z (equivalent to 10 or 20 ppm at 1000 m/z) yielded better results than the other settings. The best result (setting A with filtering condition (i), bolded and colored red in Supplementary Table 17) identified 10 spectra, corresponding to 8 cross-linked lysine pairs (Supplementary Table 18). Most of these lysine pairs have a Cα-Cα distance greater than 24 Å; only two are within the distance limit for BS3. The overlap between the xQuest result and the pLink result is marginal. Only the K180-K193 cross-link, which has the highest xQuest score and the shortest Cα-Cα distance, was identified by both programs. From the same data and similar parameters (MS2 mass tolerance at 20 ppm instead of 0.01 m/z), pLink identified 7 cross-linked lysine pairs from over 40 spectra; all except one are formed between two lysine residues less than 24 Å apart (Supplementary Table 7, exp_1). Repeating the comparison using the GST exp_2 data from Supplementary Table 7 recapitulated the difference. Thus, the pLink results appear to be more reliable and encompass more successful spectral identifications.
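The equivalence quoted above between an absolute MS2 tolerance in m/z and a relative tolerance in ppm holds only at the stated reference value of 1000 m/z. A minimal sketch of the conversion follows; the function names are illustrative and not part of pLink or xQuest.

```python
def mz_window_to_ppm(delta_mz, mz):
    """Express an absolute m/z tolerance as parts per million at a given m/z."""
    if mz <= 0:
        raise ValueError("m/z must be positive")
    return delta_mz / mz * 1e6


def ppm_to_mz_window(ppm, mz):
    """Express a ppm tolerance as an absolute m/z window at a given m/z."""
    if mz <= 0:
        raise ValueError("m/z must be positive")
    return ppm * mz / 1e6


for delta in (0.01, 0.02):
    print(f"{delta} m/z at 1000 m/z = {mz_window_to_ppm(delta, 1000.0):.1f} ppm")
```

Because the same absolute window is a larger relative error for lighter ions (0.01 m/z at 500 m/z is 20 ppm), a fixed m/z tolerance and a fixed ppm tolerance are only "similar" settings, not identical ones.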
For samples treated with light cross-linker only, pLink is equally effective. For instance, a similar number of cross-links are identified from E. coli experiment #1 (with [d0]-BS3) and #2 (with 1/1 [d0]/[d4]-BS3) using our CXMS method (Supplementary Table 11). The xQuest algorithm relies on isotopic spectral pairs to differentiate common and xlink ions, which are the basis of its identification strategy⁶. Without light/heavy isotope labeling, xQuest is expected to be ineffective. This was verified experimentally using data from a GST sample cross-linked with light [d0]-BS3. As shown in Supplementary Tables 19 and 20, the best xQuest identification result consists of 7 GST cross-links, none of which is supported by the GST structure (Cα-Cα distance > 24 Å for six of them, one without structural evidence). In contrast, pLink identified 30 cross-linked lysine pairs from 43 spectra. Most of them are structurally sound (Cα-Cα distance < 24 Å for 18 pairs, > 24 Å for 7 pairs, and 5 without structural evidence). There is only one overlap between the xQuest result and the pLink result (colored blue in Supplementary Table 20). Between the cross-links identified by pLink from the light BS3 experiment and those from the [d0]/[d4]-BS3 experiments (Supplementary Table 7), two are identical (colored red in Supplementary Table 20) and another pair is very similar (K63-K1, colored orange in Supplementary Table 20, vs K63-M0, i.e. the protein N-terminus, in Supplementary Table 7).
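The structural check used throughout this comparison, whether the Cα-Cα distance of a cross-linked lysine pair is within 24 Å, can be sketched as follows. The coordinates, function names and labels are hypothetical, and treating a distance of exactly 24 Å as compatible is an assumption, since the text only distinguishes < 24 Å from > 24 Å.

```python
import math

CA_CA_LIMIT = 24.0  # Å, the BS3 distance cutoff quoted in the text


def classify_crosslink(ca_a, ca_b, limit=CA_CA_LIMIT):
    """Label a cross-link from the Calpha coordinates (x, y, z in Å) of its two residues.

    A residue missing from the structure is passed as None.
    """
    if ca_a is None or ca_b is None:
        return "no structural evidence"
    distance = math.dist(ca_a, ca_b)  # Euclidean distance between the two atoms
    # Assumption: a pair at exactly the limit counts as compatible.
    return "compatible" if distance <= limit else "incompatible"


# Hypothetical coordinates, not taken from the GST structure.
print(classify_crosslink((0.0, 0.0, 0.0), (10.0, 10.0, 5.0)))   # 15 Å apart
print(classify_crosslink((0.0, 0.0, 0.0), (0.0, 0.0, 30.0)))    # 30 Å apart
print(classify_crosslink((0.0, 0.0, 0.0), None))                # residue not resolved
```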
Supplementary Discussion
The earliest software for CXMS can be traced back to the year 2000⁷. Since then, much effort has been
invested in this technology. Divide-and-conquer is one strategy, whose critical component is a cross-linker
that breaks in an MS2 scan, thereby allowing the two released peptides to be sequenced in subsequent MS3 scans.
Such cross-linkers include disuccinimidyl sulfoxide (DSSO), Protein Interaction Reporter (PIR) cross-linkers,
Isotopically Coded Cleavable (ICC) cross-linkers, and cyanurbiotindipropionylsuccinimide (CBDPS)⁸⁻¹¹.
Another approach focuses on developing software tools that identify two inter-linked peptides without
having to separate them, including SearchXLinks¹², X!Links¹³, xComb¹⁴, Popitam¹⁵, Xi¹⁶, PepLynx¹⁷,
Xlink-Identifier¹⁸, and a SEQUEST-like search engine for cross-link analysis¹⁹. Here we took the latter approach and
developed a software tool compatible with a variety of common cross-linkers. Our goal is to make CXMS easy,
effective, and readily available.
In our BS3 cross-linking analysis of GST and the CNGP complex, most of the observed cross-links are
consistent with the crystal structure data, but there are a few that are not. In repeated experiments, the
problematic cross-links are mostly observed only 1–3 times; in contrast, the structurally supported cross-links
are observed an average of 14.7 times for GST (three repeats) and 13.4 times for CNGP (two repeats)
(Supplementary Tables 7-8). The structurally incompatible cross-links may be non-specific ones coming from
multiple sources. They could arise by chance when two non-interacting proteins in Brownian motion happen to
be momentarily close enough to be captured by a cross-linker. We show experimentally that this does occur, but
the frequency is very low, at least 100-fold lower than cross-links within a protein or protein complex
(Supplementary Fig. 5). Another possibility is that they result from cross-linking of protein aggregates, i.e.
denatured or partially denatured proteins. We are careful to avoid protein aggregates and over-cross-linking. For
example, we abandon cross-linking conditions that cause any visible precipitation. However, microscopic
aggregation remains a possibility. A third possibility is that some regions of a protein adopt alternative
structures in solution that differ from the structure in the crystal lattice.
Notably, there are cross-linking “hot” sites; these are lysine residues that form cross-links with two or
more sites, e.g. GST_K26 (Supplementary Table 7) or Cbf5_K180 (Supplementary Table 8). Conversely,
there are “cold” sites that are not observed at all even though they could theoretically form cross-links with
other lysine residues. Out of >300 K-K combinations in GST, 54 K-K cross-links are theoretically possible
using BS3 after applying surface accessibility and distance constraints (Supplementary Fig. 17)¹⁸. Yet only
eight of them were identified and, of the eight, three involve K26 (Supplementary Table 7). In GST and CNGP,
cross-links that are incompatible with the structure often originate from hot sites. For the UTP-B complex, out
of 345 amine groups (339 lysines plus 6 N-termini), only 94 (27%) were detected as cross-linking sites. Yet, 39
(41%) of these lysine residues cross-link with two or more sites, accounting for 88.5% of the total cross-link
spectra.
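The definition of a hot site used above, a residue that cross-links with two or more distinct sites, can be applied to a list of identified site pairs as in the sketch below. The site labels and function names are hypothetical and the example pairs are not taken from the supplementary tables.

```python
from collections import defaultdict


def partner_counts(crosslinks):
    """Map each site to the number of distinct sites it is cross-linked to."""
    partners = defaultdict(set)
    for site_a, site_b in crosslinks:
        partners[site_a].add(site_b)
        partners[site_b].add(site_a)
    return {site: len(linked) for site, linked in partners.items()}


def hot_sites(crosslinks, min_partners=2):
    """Sites that cross-link with at least `min_partners` distinct sites."""
    counts = partner_counts(crosslinks)
    return sorted(site for site, n in counts.items() if n >= min_partners)


# Hypothetical identifications: site A links to B and C, while D links only to E.
example = [("A", "B"), ("A", "C"), ("D", "E"), ("B", "A")]
print(hot_sites(example))
```

Storing partners in a set means a pair observed in several spectra, or reported in both orientations, is counted once, so the count reflects distinct partner sites and not spectral counts.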
The uneven distribution of observed cross-links is likely a combined result of solution phase chemistry
(accessibility and reactivity of two amine groups within 11.4 Å distance) and gas phase chemistry (ionization,
m/z, and fragmentation of cross-linked peptides). The former determines how many cross-links are formed and
the amount of each; the latter governs the visibility of cross-linked peptides in mass spectrometry. Solvent
accessible surface distance can be calculated using Xwalk²⁰. Here we focus on reactivity of lysine residues. The
pKa value of the ε-amino group of lysine is usually around 10.5²¹, but it changes with local environment and
can be as low as 5.3²². Those with higher pKa values are less nucleophilic, i.e. less reactive to BS3. A positively
charged environment favors the deprotonated form of the lysine ε-amine, lowering its pKa and increasing its
reactivity. On the contrary, a negatively charged surrounding stabilizes the protonated form, increasing its pKa
and lowering its reactivity. Moreover, positively charged regions would attract negatively charged BS3 better.
So, everything else being the same, a lysine adjacent to an arginine would be more reactive than one next to
serine or glutamic acid. If a lysine side chain forms a salt bridge with an acidic residue, then it is unlikely to
react with BS3. Cross-linking hot sites possibly occupy regions that are positively charged and highly accessible
to BS3.
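The link between pKa and reactivity described above can be made concrete with the Henderson-Hasselbalch relation, which gives the fraction of an amine in its deprotonated (nucleophilic) form at a given pH. The sketch below is a simplification: it assumes reactivity toward BS3 tracks this fraction alone, and the reaction pH of 8.0 is an assumed value within the pH 7-9 range typical of NHS-ester reactions.

```python
def fraction_deprotonated(pka, ph):
    """Henderson-Hasselbalch: fraction of an amine present as the free base at this pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


REACTION_PH = 8.0  # assumed value, not specified in the text

# pKa 10.5 is the typical lysine value and 5.3 the lowest value cited in the text.
for pka in (10.5, 5.3):
    frac = fraction_deprotonated(pka, REACTION_PH)
    print(f"pKa {pka}: {frac:.4f} of the ε-amine is deprotonated at pH {REACTION_PH}")
```

Under these assumptions a typical lysine is well under 1% deprotonated at pH 8, whereas a lysine with a strongly depressed pKa is almost fully deprotonated, which is consistent with the argument that local charge environment can make some sites far more reactive than others.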
So far, pLink is the only algorithm that has been optimized with a large standard dataset in which the
absolute identity of each cross-linked peptide pair is known. However, it is optimized only for BS3. For
comprehensive structural analysis by CXMS, other homo- or hetero-bifunctional cross-linkers would be helpful.
pLink is compatible with other cross-linkers, whether they are specific to lysine or not, but for best performance
some of the ion types may need adjustment if the fragmentation behavior of a cross-linker differs from that of
BS3. As expected, we find that pLink is equally effective for DSS, a functional homolog of BS3
(Supplementary Table 13). Similar NHS-type cross-linkers like BS2G and DSG should work just as well.
pLink also works for the three hetero-bifunctional cross-linkers we have tested: EDC (inducing K-D or K-E
cross-links, zero-length), AMAS (K-C cross-link, 4.4-Å spacer arm) and sulfo-GMBS (K-C cross-link, 7.3-Å spacer
arm) (Supplementary Tables 14-16). The percentage of cross-links that are structurally sound is lower than
that obtained with the amine-specific cross-linkers BS3 and DSS (compare Supplementary Tables 14-16 to
Supplementary Tables 8 and 13). For EDC in particular, none of the cross-links identified fits the distance constraint.
This may be due to imperfect cross-linking conditions. For hetero-bifunctional cross-linkers, different
functional groups call for different reaction conditions, and the best balance between preserving the native
conformation of proteins and maximizing cross-linking efficiency has yet to be determined. The EDC-carboxyl
reaction is most efficient at pH 4.5, the maleimide-sulfhydryl reaction is best at pH 6.5-7.5, whereas
the amine-targeting NHS-ester reaction is performed at pH 7-9. For EDC, the reaction has to be carried out in two
steps, first at slightly acidic pH, then at neutral or slightly alkaline pH. The pH change may affect protein
conformation somewhat. In spite of this, the general trend holds that the longer the spacer arm of a
cross-linker, the more cross-links are identified. For example, the sulfo-GMBS cross-links of the CNGP complex included all five
AMAS cross-links and 15 additional ones (Supplementary Tables 15-16).
Besides chemical cross-linking, pLink is applicable to natural cross-links such as disulfide bonds or
sumoylation. Again, for optimal results, a sufficiently large standard dataset of disulfide-linked or
sumoylated peptides will be helpful for fine-tuning the pLink parameters.
The current version of CXMS still has limited sensitivity to detect protein-protein interactions in
endogenous protein complexes. With further improvement, such as the development of cross-linkers with
increased efficiency and affinity tags for specific enrichment of cross-linked peptides, the technique will
become more powerful to explore the interactome of complex samples.
Overall, CXMS provides reliable structural information, trading resolution for ease and speed. It complements high-resolution approaches, particularly in cases where a protein of interest is difficult to crystallize. CXMS may also be used in kinetic studies of protein folding and unfolding, or of protein complex assembly and disassembly.
References
1. Fu, Y. et al. Bioinformatics 20, 1948–1954 (2004).
2. Nesvizhskii, A. I. et al. Mol Cell Proteomics 5, 652–670 (2006).
3. Schilling, B., Row, R. H., Gibson, B. W., Guo, X. & Young, M. M. J Am Soc Mass Spectrom 14, 834–850 (2003).
4. Gaucher, S. P., Hadi, M. Z. & Young, M. M. J Am Soc Mass Spectrom 17, 395–405 (2006).
5. Elias, J. E. & Gygi, S. P. Nat Methods 4, 207–214 (2007).
6. Rinner, O. et al. Nat Methods 5, 4 (2008).
7. Young, M. M. et al. Proc Natl Acad Sci USA 97, 5802–5806 (2000).
8. Kao, A. et al. Mol Cell Proteomics 10, M110 002212 (2011).
9. Anderson, G. A., Tolic, N., Tang, X., Zheng, C. & Bruce, J. E. J Proteome Res 6, 3412–3421 (2007).
10. Petrotchenko, E. V. & Borchers, C. H. BMC Bioinformatics 11, 64 (2010).
11. Petrotchenko, E. V., Serpa, J. J. & Borchers, C. H. Mol Cell Proteomics 10, M110 001420 (2011).
12. Wefing, S., Schnaible, V. & Hoffmann, D. Anal Chem 78, 1235–1241 (2006).
13. Lee, Y. J., Lackner, L. L., Nunnari, J. M. & Phinney, B. S. J Proteome Res 6, 3908–3917 (2007).
14. Panchaud, A., Singh, P., Shaffer, S. A. & Goodlett, D. R. J Proteome Res 9, 2508–2515 (2010).
15. Singh, P. et al. Anal Chem 80, 8799–8806 (2008).
16. Chen, Z. A. et al. EMBO J 29, 717–726 (2010).
17. Zelter, A. et al. J Proteome Res 9, 3583–3589 (2010).
18. Du, X. et al. J Proteome Res 10, 923–931 (2011).
19. McIlwain, S., Draghicescu, P., Singh, P., Goodlett, D. R. & Noble, W. S. J Proteome Res 9, 2488–2495 (2010).
20. Kahraman, A., Malmstrom, L. & Aebersold, R. Bioinformatics 27, 2163–2164 (2011).
21. Grimsley, G. R., Scholtz, J. M. & Pace, C. N. Protein Sci 18, 247–251 (2009).
22. Isom, D. G., Castaneda, C. A., Cannon, B. R. & Garcia-Moreno, B. Proc Natl Acad Sci USA 108, 5260–5265 (2011).
23. Su, C. et al. Nucleic Acids Res 36, D632–D636 (2008).