Supporting Information
Visualizing Proton Antenna in High Resolution Green Fluorescent Protein Structure
Ai Shinobu, Gottfried J. Palm*, Abraham J. Schierbeek and Noam Agmon*
Table S1. X-ray data collection and refinement statistics for GFP sg11
Data collection
X-ray source Bruker Microstar
Montel multilayer optic
Wavelength (Å) 1.5418
Resolution limits (Å) (highest shell) 68 - 0.90 (1.00 - 0.90)
Space group P2₁2₁2₁
Unit cell parameters a, b, c (Å) 52.15 59.42 68.11
Unique reflections 155426 (41798)
Redundancy 19.7 (9.2)
Completeness (%) 91.4 (90.8)
I/σ(I) 20.4 (3.0)
Rmerge 0.080 (0.487)
Wilson B factor (Å²) 8.4
Solvent content (%) 39
Refinement
Resolution limits (Å) 6 - 0.90
Total reflections 133320
No. reflections in test set 4231
Rcryst 0.1456
Rfree 0.1740
No. amino acid residues in a.u. 229
No. water/other solvent molecules in a.u. 326
Rmsd bond lengths (Å) 0.034
Rmsd bond angles (deg.) 3.4
PDB entry code 2WUR
Table S2. List of PDB codes for structures of GFP and its mutants. Green shading indicates a continuous ASW without a break at Ser72, whereas blue shading indicates structures belonging to type B (see text).
No.  PDB code  Resolution [Å]  Mutations*  Monomers per asymmetric unit**
1 1BFP 2.1 Y66H/Y145F 1
2 1C4F 2.25 S65T 1
3 1CV7 2.5 K26R/F64L/S65T/Y66W/N146I/M153T/V163A/N164H 1
4 1EMA 1.9 S65T 1
5 1EMB 2.13 wt 1
6 1EMC 2.3 F64L/I167T 4
7 1EME 2.5 F64L/I167T 1
8 1EMF 2.4 F64L/Y66H/V163A 1
9 1EMG 2.0 S65T 1
10 1EML 2.3 F64L/I167T 1
11 1EMM 2.3 F64L 1
12 1F09 2.14 S65G/V68L/S72A/H148Q/T203Y 1
13 1F0B 2.1 S65G/V68L/S72A/H148Q/T203Y 1
14 1GFL 1.9 wt 2
15 1HCJ 1.8 wt 4
16 1HUY 2.2 S65G/V68L/Q69M/S72A/T203Y 1
17 1JBY 1.8 S65T/H148G/T203C 1
18 1JBZ 1.5 S65T/H148G/T203C 1
19 1KYR 1.5 F64L/Y66H/F99S/Y145F/H148G/M153T/V163A 1
20 1KYS 1.44 F64L/S65T/Y66H/F99S/Y145F/H148G/M153T/V163A 1
21 1MYW 2.2 F46L/F64L/S65G/V68L/S72A/M153T/V163A/S175G/T203Y 1
22 1OXD 1.15 F64L/S65T/Y66W/N146I/M153T/V163A 1
23 1OXE 1.15 F64L/S65T/Y66W/N146I/M153T/V163A 1
24 1Q4A 1.45 S65T 1
25 1Q4B 1.48 S65T 1
26 1Q4C 1.55 S65T/T203C 1
27 1Q4D 1.58 S65T/T203C 1
28 1Q4E 1.38 S65T/Y145C 1
29 1Q73 1.6 S65T/Y145C/T203C 1
30 1QXT 2.0 F64L/S65T/R96A/F99S/M153T/V163A 1
31 1QY3 2.0 F64L/S65T/R96A/F99S/M153T/V163A 1
32 1QYF 1.5 F64L/S65T/R96A/F99S/M153T/V163A 1
33 1QYO 1.8 F64L/S65G/Y66G/F99S/M153T/V163A 1
34 1QYQ 1.8 F64L/S65G/Y66G/F99S/M153T/V163A 1
35 1RM9 2.9 F64L/S65T/Y66W/N146I/M153T/V163A 1
36 1RMM 1.9 F64L/S65T 1
37 1RMP 3.0 F64L/S65T 1
38 1RRX 2.1 F64L/S65T 1
39 1S6Z 1.5 F64L/S65T/Y66L 1
40 1W7S 1.85 wt 4
41 1W7T 1.85 wt 4
42 1W7U 1.85 wt 4
43 1YFP 2.5 S65G/V68L/S72A/T203Y 2
44 1YHG 2.5 F64L/S65G/Y66S/V68G/F99S/M153T/V163A 2
45 1YHH 1.5 F64L/S65A/Y66S/G67A/F99S/M153T/V163A 1
46 1YHI 1.9 S65A/Y66S/R96A/F99S/M153T/V163A 1
47 1Z1P 2.0 F64L/S65T/Y66L 1
48 1Z1Q 1.5 F64L/S65T/Y66L 1
49 2AWJ 1.6 F64L/S65T/R96M/F99S/M153T/V163A 1
50 2AWL 1.85 F64L/S65T/R96K/F99S/M153T/V163A 1
51 2AWM 1.7 F64L/S65T/R96A/F99S/M153T/V163A/Q183R 1
52 2B3P 1.4 S30R/Y39N/F64L/S65T/F99S/N105T/Y145F/M153T/V163A/I171V/A206V 1
53 2DUE 1.24 S65T 1
54 2DUF 1.5 S65T/H148D 1
55 2DUG 1.4 S65T/H148N 1
56 2DUH 1.2 S65T/H148N 1
57 2DUI 1.36 Q80R/H148D 1
58 2EMD 2.0 F64L/Y66H 1
59 2EMN 2.3 F64L/Y66H 1
60 2EMO 2.6 F64L/Y66H/V163A 1
61 2G16 2.0 S65A/Y66S/F99S/M153T/V163A 1
62 2G2S 1.2 F64L/S65G/Y66S/F99S/M153T/V163A 1
63 2G3D 1.35 F64L/S65G/Y66A/F99S/M153T/V163A 1
64 2G5Z 1.8 S65G/Y66S/F99S/M153T/V163A 1
65 2H6V 1.47 F64L/S65T/T203Y 1
67 2H9W 1.82 F64L/S65T/T203Y 1
68 2HGD 1.6 S65A/Y66F/F99S/M153T/V163A 1
69 2HGY 2.05 S65A/Y66F/F99S/M153T/V163A/E222A 1
70 2HJO 1.25 F64L/S65T/V224H 1
71 2HQZ 1.2 L42H/F64L/S65T 1
72 2O24 1.45 F64L/S65T/T203Y 1
73 2O29 1.8 F64L/S65T/T203Y 1
74 2O2B 1.94 F64L/S65T/T203Y 1
75 2OKW 1.9 F64L/S65T/S205C 6
76 2OKY 2.4 F64L/S65T/S205C 2
77 2Q57 2.0 F64L/S65T/Y66W/S72A/Y145A/N146I/H148D/M153T/V163A 1
78 2QLE 1.59 Q80R/S205V 4
79 2YFP 2.6 S65G/V68L/S72A/H148G/T203Y 1
80 3CBE 1.31 C48S/F64L/F99S/S147C/H148S/M153T/V163A/I167T/Q204C 1
81 3CD9 1.5 C48S/F64L/F99S/S147C/H148S/V163A/I167T/Q204C 1
82 2WUR 0.9 F64L/I167T 1
* The mutations Q80R and K238N and the Ala insertion after the N-terminal Met are cloning artifacts
and are not listed explicitly. They do not affect the spectral properties of GFP.
** In PDB entries with more than one monomer per asymmetric unit, each monomer was used
independently for structural comparisons, with the exception of 1YFP chain A, 1YHG chain A,
2OKW chain C, and 2QLE chain B. This gives a total of 104 monomers.
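The total of 104 monomers can be checked against the table itself. The short sketch below assumes the table has 81 rows (the numbering runs to 82 but skips 66) and that every entry not named in `multi_monomer` has one monomer per asymmetric unit, as tabulated. The variable names are illustrative only.

```python
# Monomers per asymmetric unit for the Table S2 entries that have more than one.
multi_monomer = {
    "1EMC": 4, "1GFL": 2, "1HCJ": 4, "1W7S": 4, "1W7T": 4, "1W7U": 4,
    "1YFP": 2, "1YHG": 2, "2OKW": 6, "2OKY": 2, "2QLE": 4,
}
n_entries = 81  # assumption: rows listed in Table S2 (row number 66 is absent)
excluded_chains = ["1YFP:A", "1YHG:A", "2OKW:C", "2QLE:B"]  # chains left out

single = n_entries - len(multi_monomer)       # entries with a single monomer
total = single + sum(multi_monomer.values())  # all monomers in the table
used = total - len(excluded_chains)           # monomers used for comparisons
print(total, used)  # prints: 108 104
```

Under these assumptions the 108 tabulated monomers, less the four excluded chains, reproduce the stated total of 104.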
Table S3. Statistical properties of hydrogen-bonded clusters for the 104 PDB structures of GFP
mutants listed in Tbl. S2.
Linear fit to the water fraction (fw): y = A + B fw
Property y | Average | STD | Correlation coefficient with resolution | Correlation coefficient with water fraction | A | B
Number of clusters | 22.5 | 8.6 | -0.624 | 0.720 | 9.09 ± 1.41 | 149.96 ± 14.32
Maximal cluster size | 63.4 | 48.4 | -0.664 | 0.762 | -16.29 ± 7.37 | 892.07 ± 74.91
Average cluster size | 14.0 | 5.2 | -0.647 | 0.741 | 5.73 ± 0.82 | 92.44 ± 8.29
Total number of protein atoms that participate in clusters | 193.4 | 88.0 | -0.829 | 0.965 | 9.82 ± 5.41 | 2053.99 ± 55.01
Total number of water oxygens that participate in clusters | 135.0 | 83.6 | -0.796 | 0.984 | -42.71 ± 3.53 | 1988.29 ± 35.91
Total number of atoms that participate in clusters* | 328.4 | 169.9 | -0.820 | 0.984 | -32.89 ± 7.23 | 4042.27 ± 73.49
* protein + waters
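As a consistency check on the linear fits in Table S3: a least-squares line passes through the point of means, so each row implies a mean water fraction of (average - A)/B. The sketch below assumes the fits are ordinary (unweighted) least-squares fits, which the text does not state. The numbers are copied from the table, and the function name is illustrative.

```python
# (average, A, B) for each property, copied from Table S3.
rows = {
    "number of clusters":   (22.5,    9.09,  149.96),
    "maximal cluster size": (63.4,  -16.29,  892.07),
    "average cluster size": (14.0,    5.73,   92.44),
    "protein atoms":        (193.4,   9.82, 2053.99),
    "water oxygens":        (135.0, -42.71, 1988.29),
    "all atoms":            (328.4, -32.89, 4042.27),
}

def implied_mean_fw(average, a, b):
    """Mean water fraction implied by y_mean = A + B * fw_mean."""
    return (average - a) / b

implied = {name: implied_mean_fw(*vals) for name, vals in rows.items()}
for name, fw in implied.items():
    print(f"{name:22s} {fw:.4f}")
```

All six rows give a mean water fraction close to 0.089, so the tabulated averages and fit parameters are mutually consistent under that assumption.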
sg11-based reference atom list:
The partition of the ASW into sub-clusters was done here differently than in the main text (Tbl. S4 below). Atoms were assigned to the cluster in which they are present in more than 50% of the GFP structures. This is reflected mainly in the cut near Ser72, which divides sub-clusters 2 and 3. In addition, 10 atoms from the exit end were deleted because they participate in very few structures.
Table S4. Partition of the sg11-based reference atom list into sub-clusters. From this 71-atom list, 10 atoms were deleted from cluster 3 (these are depicted in purple in Fig. S1).
Sub-cluster 1: O GLU 5, O GLU 6, O LEU 7, O PHE 8, O THR 9, Oγ1 THR 9, N GLY 10, O GLY 10, N VAL 11, O VAL 11, O GLU 34, Oε1 GLU 34, Oε2 GLU 34, Oδ1 ASP 36, Oδ2 ASP 36, N ALA 37, O ALA 37, N THR 38, Oγ1 THR 38, Oη TYR 39, Oγ1 THR 43, O ASP 117, Oδ1 ASP 117, Oδ2 ASP 117, Oγ1 THR 118
Sub-cluster 2: Oε1 GLU 5, Oε2 GLU 5, O GLN 69, O CYS 70, O SER 72, Oγ SER 72, N TYR 74, Nζ LYS 79, O LYS 79, O ASP 82, Oδ1 ASP 82, Oδ2 ASP 82, Nζ LYS 85
Sub-cluster 3: Oγ SER 65, Oη TYR 66, N VAL 68, O VAL 68, Nε2 GLN 69, O PHE 71, O TYR 143, O TYR 145, O ASN 146, N SER 147, Oγ SER 147, Nδ1 HIS 148, O HIS 148, Oγ1 THR 203, O THR 203, O GLN 204, N SER 205, O SER 205, Oγ SER 205, N LEU 207, O LEU 207, Oε1 GLU 222, Oε2 GLU 222
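The majority rule used for this partition (an atom goes to the sub-cluster in which it is present in more than 50% of the structures) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the data layout and the toy observations are hypothetical, and it assumes the 50% threshold is taken relative to all structures examined.

```python
from collections import Counter

def assign_by_majority(observations, n_structures):
    """Assign each atom to the sub-cluster in which it appears in more than
    50% of the structures; atoms with no such sub-cluster are left out."""
    assignment = {}
    for atom, labels in observations.items():
        if not labels:
            continue
        label, count = Counter(labels).most_common(1)[0]
        if count > 0.5 * n_structures:
            assignment[atom] = label
    return assignment

# Hypothetical observations: for each atom, the sub-cluster it was found in
# for every structure where it took part in a cluster (4 structures in total).
observations = {
    "O SER 72": [2, 2, 2, 3],  # sub-cluster 2 in 3 of 4 structures
    "N VAL 68": [3, 3, 2, 3],  # sub-cluster 3 in 3 of 4 structures
    "atom X": [3],             # seen in only 1 of 4 structures: dropped
}
result = assign_by_majority(observations, n_structures=4)
print(result)  # prints: {'O SER 72': 2, 'N VAL 68': 3}
```

Atoms that never reach the threshold are dropped, which mirrors the deletion of rarely observed atoms from the exit end.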
Figure S1. Schematic depiction of the connectivity in the sg11 active-site cluster (PDB file 2WUR). Atoms are colored according to the partition into sub-clusters in Tbl. S4: red, sub-cluster 1; blue, sub-cluster 2; green, sub-cluster 3; purple, sub-cluster 4, which was not considered in the sub-cluster analysis because of its scarcity within the 103 PDB structures. Note that O-T203 and N-S147 are in green because in structures other than sg11 (e.g., 1EMB), they usually belong to sub-cluster 3.
The division into sub-clusters allows the number of atoms in each sub-cluster to be correlated
separately with the water fraction of the various GFP structures. The correlation shown in the
main text for the total ASW holds for every sub-cluster separately.
Figure S2. Correlation graphs for the number of atoms from the sg11-based list (Tbl. S4) appearing in a
specific structure vs. its water fraction. Graphs are shown for sub-clusters 1, 2 and 3 and for the
combined list of 71 atoms (the latter is the same as Fig. 6B in the main text). Linear fit data are
presented in Tbl. S5.
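For readers who want to reproduce this kind of analysis, the sketch below shows one way to obtain a correlation coefficient and a linear fit y = A + B fw with uncertainties using NumPy. It is an illustration on synthetic data, not the authors' procedure: the function name and the input values are hypothetical, and an unweighted least-squares fit is assumed.

```python
import numpy as np

def correlate_and_fit(fw, y):
    """Return Pearson r and the least-squares fit y = A + B*fw with 1-sigma errors."""
    fw = np.asarray(fw, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(fw, y)[0, 1]
    # polyfit returns the highest-order coefficient first: slope B, then intercept A.
    (b, a), cov = np.polyfit(fw, y, deg=1, cov=True)
    b_err, a_err = np.sqrt(np.diag(cov))
    return r, (a, a_err), (b, b_err)

# Synthetic data for illustration only (not taken from the paper).
rng = np.random.default_rng(0)
fw = np.linspace(0.02, 0.20, 30)
y = 12.0 + 330.0 * fw + rng.normal(0.0, 3.0, fw.size)

r, (a, a_err), (b, b_err) = correlate_and_fit(fw, y)
print(f"r = {r:.3f}, A = {a:.2f} ± {a_err:.2f}, B = {b:.1f} ± {b_err:.1f}")
```

Applied to the per-structure atom counts and water fractions, the same call would yield quantities of the kind tabulated in Tbl. S5.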
Table S5. Statistical properties of the number of protein atoms participating in the hydrogen-bonded
sub-clusters from the list in Tbl. S4 for the 104 PDB structures of GFP mutants.
Linear fit to the water fraction (fw): y = A + B fw (see Fig. S2)
Number of protein atoms (y) present in GFP mutants in: | Average | STD | Correlation coefficient with resolution | Correlation coefficient with water fraction | A | B
Sub-cluster 1 ¹ | 15.08 | 7.73 | -0.782 | 0.902 | 0.01 ± 0.78 | 168.56 ± 7.98
Sub-cluster 2 ² | 9.51 | 3.21 | -0.813 | 0.772 | 4.15 ± 0.48 | 59.94 ± 4.89
Sub-cluster 3 ³ | 14.23 | 4.51 | -0.592 | 0.602 | 8.35 ± 0.85 | 65.77 ± 8.62
Total sg11-based reference list ⁴ | 41.07 | 14.86 | -0.826 | 0.909 | 11.87 ± 1.46 | 326.58 ± 14.87
¹ Out of 25 atoms.
² Out of 13 atoms.
³ Out of 23 atoms.
⁴ Out of 71 atoms: sub-clusters 1, 2 and 3 combined, plus the 10 additional deleted atoms (colored purple in Fig. S1).
A similar analysis is possible for water oxygen atoms (instead of protein atoms) in each
sub-cluster. These correlations are summarized in Tbl. S6 below. Note that sub-cluster 1, which is on a
hydrophilic surface patch, has the highest water fraction of all sub-clusters.
Table S6. Statistical properties of the number of water oxygen atoms participating in the
hydrogen-bonded sub-clusters from the list in Tbl. S4 for the 104 PDB structures of GFP mutants. Because the
clusters of each PDB structure are not identical to the 2WUR clusters, the waters were counted
as follows: for each PDB structure and each sub-cluster, if an atom from the reference list appeared
in one of that PDB structure's clusters, we counted all the water oxygen atoms that are H-bonded to
that atom, and summed over all the atoms in the sub-cluster.
Linear fit to the water fraction (fw): y = A + B fw
Fraction of water oxygens (y) in: | Average | STD | Correlation coefficient with resolution | Correlation coefficient with water fraction | A | B
Sub-cluster 1 | 0.306 | 0.168 | -0.806 | 0.931 | -0.032 ± 0.014 | 3.780 ± 0.146
Sub-cluster 2 | 0.260 | 0.102 | -0.803 | 0.808 | 0.081 ± 0.015 | 1.996 ± 0.144
Sub-cluster 3 | 0.231 | 0.101 | -0.619 | 0.750 | 0.066 ± 0.016 | 1.837 ± 0.160
Total sg11-based reference list | 0.270 | 0.122 | -0.806 | 0.937 | 0.024 ± 0.010 | 2.758 ± 0.102
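The water-counting procedure described in the caption of Table S6 can be sketched as below. This is a literal reading of that description, not the authors' code: the function name, the data layout and the toy structure are hypothetical. A water bonded to two reference atoms is counted once per atom here, since the text does not say whether duplicates were removed. The conversion of the count into the tabulated fraction is likewise not specified and is left out.

```python
def count_waters(reference_atoms, clustered_atoms, hbonds, water_oxygens):
    """Sum, over the reference atoms that appear in the structure's clusters,
    the number of water oxygens H-bonded to each atom."""
    total = 0
    for atom in reference_atoms:
        if atom not in clustered_atoms:
            continue  # reference atom absent from this structure's clusters
        partners = hbonds.get(atom, set())
        total += len(partners & water_oxygens)
    return total

# Hypothetical toy structure (names and bonds invented for illustration).
reference = ["O SER 72", "Oγ SER 72", "N TYR 74"]
clustered = {"O SER 72", "Oγ SER 72"}
hbonds = {
    "O SER 72": {"HOH 301", "HOH 302"},
    "Oγ SER 72": {"HOH 302", "N VAL 68"},
    "N TYR 74": {"HOH 303"},
}
waters = {"HOH 301", "HOH 302", "HOH 303"}

n = count_waters(reference, clustered, hbonds, waters)
print(n)  # prints: 3
```

In the toy case, N TYR 74 is skipped because it is not in any cluster of the structure, and HOH 302 contributes twice because it is bonded to two reference atoms.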
Figure S3. ASW fragments of a decarboxylated wt-GFP: (A) at 100 K (PDB file 1W7T, A subunit);
(B) after annealing at 170 K (PDB file 1W7U). Hydrogen bonds are color-coded: black bonds exist in
both structures, blue bonds exist only in 1W7U, and red bonds exist only in 1W7T. Annealing leads to
formation of the Thr203-His148 exit pathway and to the restructuring of 3 water
molecules near Ser72, which now connect it with Ser65.
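The bond coloring in Figure S3 amounts to a set comparison between the hydrogen-bond lists of the two structures. The sketch below illustrates this with hypothetical bond lists. The function name and the atom pairs are invented for illustration and are not read from 1W7T or 1W7U.

```python
def classify_bonds(bonds_a, bonds_b):
    """Split H-bonds into those common to both structures and those unique to each."""
    # Store each bond as a frozenset so that (x, y) and (y, x) compare equal.
    a = {frozenset(pair) for pair in bonds_a}
    b = {frozenset(pair) for pair in bonds_b}
    return {"both": a & b, "only_a": a - b, "only_b": b - a}

# Hypothetical bond lists: "before" stands for the 100 K structure,
# "after" for the annealed one.
before = [("Oγ SER 72", "HOH 1"), ("HOH 1", "HOH 2")]
after = [("HOH 1", "Oγ SER 72"), ("Oγ1 THR 203", "Nδ1 HIS 148")]

groups = classify_bonds(before, after)
print({k: len(v) for k, v in groups.items()})
# prints: {'both': 1, 'only_a': 1, 'only_b': 1}
```

With the first list taken from 1W7T and the second from 1W7U, "both" would correspond to the black bonds, "only_a" to the red bonds and "only_b" to the blue bonds.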