anatomy of the e2 ligase fold: implications for enzymology and evolution of ubiquitin/ub-like...

14

Click here to load reader

Upload: a-maxwell-burroughs

Post on 27-Oct-2016

220 views

Category:

Documents


4 download

TRANSCRIPT

Page 1: Anatomy of the E2 ligase fold: Implications for enzymology and evolution of ubiquitin/Ub-like protein conjugation

Available online at www.sciencedirect.comJournal of

www.elsevier.com/locate/yjsbi

Journal of Structural Biology 162 (2008) 205–218

StructuralBiology

Anatomy of the E2 ligase fold: Implications for enzymologyand evolution of ubiquitin/Ub-like protein conjugation

A. Maxwell Burroughs a,b, Marcie Jaffee a, Lakshminarayan M. Iyer a, L. Aravind a,*

a National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USAb Bioinformatics Program, Boston University, Boston, MA 02215, USA

Received 11 October 2007; received in revised form 12 December 2007; accepted 13 December 2007Available online 1 January 2008

Abstract

The configuration of the active site of E2 ligases, central enzymes in the ubiquitin/ubiquitin-like protein (Ub/Ubl) conjugation sys-tems, has long puzzled researchers. Taking advantage of the wealth of newly available structures and sequences of E2s from diverseorganisms, we performed a large-scale comparative analysis of these proteins. As a result we identified a previously under-appreciateddiversity in the active site of these enzymes, in particular, the spatial location of the catalytic cysteine and a constellation of associatedconserved residues that potentially contributes to catalysis. We observed structural innovations of differing magnitudes occurring in var-ious families across the E2 fold that might correlate in part with differences in target interaction. A key finding was the independent emer-gence on multiple occasions of a polar residue, often a histidine, in the vicinity of the catalytic cysteine in different E2 families. Wepropose that these convergently emerging polar residues have a common function, such as in the stabilization of oxyanion holes duringUb/Ubl transfer and spatial localization of the Ub/Ubl tails in the active site. Thus, the E2 ligases represent a rare example in enzymeevolution of high structural diversity of the active site and position of the catalytic residue despite all characterized members catalyzing asimilar reaction. Our studies also indicated certain evolutionarily conserved features in all active members of the E2 superfamily thatstabilize the unusual flap-like structure in the fold. These features are likely to form a critical mechanical element of the fold requiredfor catalysis. The results presented here could aid in new experiments to understand E2 catalysis.Published by Elsevier Inc.

Keywords: Ubiquitin; Ligation; trans-thiolation; E2; E1; Active site; UBC; RWD; Ubiquitin-conjugation; Enzyme evolution; Catalysis

1. Introduction

The E2 enzyme is the central component in the transferof ubiquitin (Ub) and ubiquitin-like (Ubl) proteins to tar-gets in diverse conjugation pathways. During the ubiquiti-nation of proteins, an Ub/Ubl is first adenylated by the E1enzyme, and then transferred to a conserved cysteine resi-due on the E1 protein through a thioester linkage formedwith the terminal carboxyl group of the Ub/Ubl. A subse-quent trans-thiolation reaction transfers the Ub/Ubl to aconserved cysteine residue on the E2 enzyme. In the finalstep of the cascade, the Ub/Ubl is transferred from the

1047-8477/$ - see front matter Published by Elsevier Inc.

doi:10.1016/j.jsb.2007.12.006

* Corresponding author. Fax: +1 301 435 7794E-mail address: [email protected] (L. Aravind).

E2 to the e-amino group of lysine residues on protein sub-strates or the amine group of phosphatidylethanolamine inthe case of the autophagy pathway (Ciechanover et al.,2000; Ichimura et al., 2000). This final step is usually med-iated by E3 ligases that may function in one of two distinctways: The HECT-like E3 ligases transfer the Ub/Ubl fromE2 to an internal cysteine through a further trans-thiola-tion step before transferring it to the target, whereas theRING (U-Box) and A20 finger-type E3 ligases appear toonly bridge the E2 enzyme and the substrate with the targetlysine residues (Aravind et al., 2003; Ardley and Robinson,2005; Wertz et al., 2004). The role of the E2 enzymes in theUb/Ubl conjugation cascade, and the importance of theconserved cysteine shared by all catalytically active E2enzymes in the trans-thiolation reaction are well-estab-lished (Berleth and Pickart, 1996; Dye and Schulman,

Page 2: Anatomy of the E2 ligase fold: Implications for enzymology and evolution of ubiquitin/Ub-like protein conjugation

206 A.M. Burroughs et al. / Journal of Structural Biology 162 (2008) 205–218

2007; Hershko et al., 1983; Pickart, 2001). The transfer ofUb/Ubl from E1 to E2 involves a nucleophilic attack bythe conserved E2 cysteine on the carbonyl group of theUb/Ubl-E1 thioester linkage (Pickart, 2001). Experimentshave suggested that this reaction is primarily catalyzed byresidues on the E1 protein in addition to the E2 cysteine(Wu et al., 2003).

Likewise, the conserved cysteine in the E2 protein is theonly residue shown to be essential for the transfer of Ub/Ubl from E2 to the HECT ligases (Dye and Schulman,2007; Wu et al., 2003). However, all other terminal Ub/Ubl transfer reactions involving E2 enzymes suggest a crit-ical role for the target lysine. Studies on the ubiquitinatingand sumoylating E2 enzymes, Ubc13 and Ubc9, indicatethat the target lysine makes a nucleophilic attack on thecarbonyl group of the Ub/Ubl-E2 thioester linkage upondeprotonation. Together, these studies also implicate threeother residues (namely an asparagine, tyrosine and aspar-tate) in E2 function, which are proposed to mediate local-ization of the target lysine, and act to lower the effectivepKa of the active site to allow the lysine’s deprotonation(Capili and Lima, 2007; Wu et al., 2003; Yunus and Lima,2006). The asparagine is also proposed to stabilize an oxy-anion formed in the reaction intermediate during the nucle-ophilic attack (Wu et al., 2003). These additional residuesimplicated in E2 action appear to be only required in thetransfer of Ub/Ubl from E2 to the e amino groups of targetlysine residues in conjugations mediated by non-covalentlyinteracting E3, like the RING superfamily (Knipscheer andSixma, 2006; Melchior et al., 2003; Wu et al., 2003; Yunusand Lima, 2006; Zheng et al., 2000). However, the general-ity of this proposal for E2 function, and other aspects ofthe biochemical mechanism by which Ub transfer iseffected, including the role of the E2s in target specificity,is poorly understood.

All eukaryotes possess several E2s ranging from about 8to 11 in certain apicomplexans and Giardia to a little over50 in certain multicellular plants and animals (Supplemen-tary material). Most of them, barring those in the autoph-agy pathway (Apg3 and Apg10), are relatively close to eachother in sequence. The entire diversity of eukaryotic E2shas not been systematically explored through the applica-tion of evolutionary principles to identify key functionalresidues. The extreme divergence of Apg3 and Apg10 fam-ilies also did not allow for a direct comparison with theclassical E2s (Ichimura et al., 2000; Yamada et al., 2007).More recently, we discovered the first prokaryotic homo-logs of the E2 enzymes proposed to function analogousto their eukaryotic counterparts in predicted Ub-conjuga-tion like systems in bacteria (Iyer et al., 2006). A compar-ison of their conservation profiles with the eukaryotic E2srevealed that while the E2 catalytic cysteine is strictly con-served, they include representatives which are highly diver-gent in sequence. Importantly, comparisons of theprokaryotic and eukaryotic sequences suggested that theresidues besides the catalytic cysteine proposed to beinvolved in E2 function might not be generally conserved.

Concomitantly, several new crystal structures of E2enzymes have been solved, including that of Apg3, whichwas also shown to lack the other residues involved in E2function, despite conserving the catalytic cysteine (Araiet al., 2006; Mizushima et al., 2007; Yamada et al., 2007).Structures of catalytically inactive versions of the E2 foldsuch as UEV1, Tsg101 (UEV) and the RWD domain havealso been solved, providing structural information regard-ing the non-enzymatic adaptations of the fold (Eddinset al., 2006; Hau et al., 2006; Nameki et al., 2004).

These pieces of new data presented us with an opportu-nity to use the evolutionary information from bacterial E2homologs in conjunction with the new crystal structures toexplore previously unknown aspects of E2 catalysis andbiochemical function. To this end, we performed a compre-hensive comparative analysis of the E2 superfamily, identi-fying all structural and sequence variations. As a result wewere able to uncover an unusually high and previouslyunder-appreciated diversity in the sequence and structuralfeatures of the active sites of the E2 superfamily. In partic-ular, it appears that the E2s might represent an infrequentexample in enzyme evolution, where despite retention ofthe key catalytic residue and general biochemical functionthere is high active site diversity. We present evidence thatthe structural features identified here might provide newinsights into the action of these enzymes, especially in termsof residues other than the catalytic cysteine.

2. Results and discussion

2.1. Comprehensive identification and classification ofmembers of the E2 superfamily

A multi-pronged search strategy utilizing sequence andstructure similarity searches was used to comprehensivelyidentify all members of the E2 fold. For structural searcheswe began with a non-redundant set of representatives ofthe E2 fold from the SCOP database ((Murzin et al.,1995); http://scop.mrc-lmb.cam.ac.uk/scop/) and itera-tively searched the current version of the PDB databaseusing the DALILITE program (Holm and Sander, 1995).In addition to previously described E2 proteins, thesesearches retrieved several E2 homologs (Z score > 9), suchas UbcI (PDB: 2f4w) and Ufc1 (PDB: 2in1), crystallized aspart of structural genomics initiatives. Sequences of all dis-tinct structural representatives were then used as queriesfor sequence profile searches with the PSI-BLAST program(see Section 4 for search strategy and threshold for signifi-cance). Alignments of all true positives were used for fur-ther HMM searches with the HMMER program (Eddy,1998). The complete set of true positive sequences detectedin these searches were clustered and classified into familiesbased on uniquely shared sequence conservation patternsand distinct structural features (supplementary material).A structural multiple alignment of all available E2 foldstructures was constructed using the MUSTANG program(Konagurthu et al., 2006). A sequence alignment including

Page 3: Anatomy of the E2 ligase fold: Implications for enzymology and evolution of ubiquitin/Ub-like protein conjugation

A.M. Burroughs et al. / Journal of Structural Biology 162 (2008) 205–218 207

representatives of all detected families of the E2 domainwas constructed using the MUSCLE and KALIGN pro-grams and adjusted based on the above-mentioned struc-tural alignment (Fig. 1; see Section 4 for details). As aresult of our analysis, we identified 16 distinct families ofE2 domains, 11 eukaryotic and five prokaryotic. Of thesethree families in eukaryotes and one family in prokaryoteswere either predicted or have been experimentally shown tobe inactive, and lacked the catalytic cysteine.

Among the eukaryotic E2 families, Ub ligase activity hasbeen demonstrated in representatives across their entirespectrum of structural diversity (Fig. 1), ranging from thehighly divergent families such as Apg3/Apg10 to the formslike UbcI and UBE2W-like families, which are closer to theclassical E2 family (Bartke et al., 2004; Chen et al., 1993;Christensen et al., 2007; Hershko et al., 1983; Ichimuraet al., 2000; Komatsu et al., 2004; Melner et al., 2006). Thus,the entire structure and sequence diversity of the eukaryoticmembers of the superfamily is bracketed by versions withUb ligase activity, suggesting that this activity is a reason-able assumption for all representatives of the group thatbear a conserved cysteine. The bacterial versions lack exper-imental evidence regarding catalytic activity. However, theyshow contextual associations in the form of domain fusionsand conserved gene neighborhoods with homologs of othercore components of the ubiquitin signaling system, such asE1, Ub and the JAB metalloprotease. Further, none of thecontextual connections of the prokaryotic E2 proteinsreveal associations suggesting participation in alternateenzymatic pathways. Hence it is highly probable that theprokaryotic E2s show similar enzymatic activity as theireukaryotic counterparts (Iyer et al., 2006).

A preliminary comparison of the residue conservationacross the 12 catalytically active families revealed thatother than the cysteine none of the other residues impli-cated in Ub/Ubl transfer based on the studies on UBC9and UBC13 (Wu et al., 2003; Yunus and Lima, 2006) arewidely conserved. The asparagine residue implicated inpositioning the lysine and stabilizing the oxyanion (Wuet al., 2003) was conserved in four families (Fig. 1). Ofthe other residues implicated in providing the environmentto reduce the pKa in the active site, the tyrosine residue wasonly detected in 3 eukaryotic families (although it is often ahydrophobic in others), whereas the aspartate residue wasonly detected in certain members of the classical E2 family(Fig. 1 and Table 1). This immediately suggested that theproposed additional residues of the E2 active site are unli-kely to be a general feature of E2 catalysis, and that there isa striking diversity in the active sites of the E2 superfamily.In order to understand better the role of this diversity in E2function we used the above multiple sequence alignment,and the corresponding structural alignment for an explora-tion of the evolution of the active site of the superfamily.To do this we first identified the structure and sequence fea-tures conserved across the fold and used this as a baselineto systematically explore the innovations occurring in eachdistinctive family.

2.2. Core structure, active site configuration and surface

features conserved in the E2 fold

As previously described the core of the E2 fold is basedon a four-stranded b-meander (henceforth labeled strands1–4) that resembles a single ‘‘blade” of the b-propellerstructures (Aravind et al., 2006). Beyond this basic core,the E2 fold displays certain distinctive structural accretionsthat distinguish it from all other comparable four-strandedb-meanders. These include: (1) a C-terminal flap-like struc-ture that contains within it two small elements in extendedconformation forming a b-hairpin (hereinafter termed the‘‘flap”; Fig. 2). (2) Flanking the meander and the flap, thereis a helix at the N-terminus (helix 1) and 1–2 helices at theC-terminus (henceforth helix 2a and 2b, Fig. 2). The cata-lytic cysteine is usually located at the C-terminus of theflap, and in every available structure is in an exposed loca-tion (Capili and Lima, 2007; Mizushima et al., 2007; Mor-aes et al., 2001; Tong et al., 1997; Yamada et al., 2007).This flap appears to be an important feature for E2 catal-ysis—comparisons of structures with and without theUb/Ubl substrate (Capili and Lima, 2007; Reverter andLima, 2005; Yunus and Lima, 2006) show that the C-termi-nal extended segment of the flap undergoes a shift to analternative conformation. It moves away from the N-termi-nal extended segment and pairs with the extended tail ofthe Ub/Ubl through hydrogen-bonding (Fig. 2, see Supple-mentary material). This is critical for appropriate position-ing the Ub/Ubl tail for attack by the target lysine (Reverterand Lima, 2005; Yunus and Lima, 2006).

The only conserved residue that is seen in a number offamilies in close proximity to the cysteine is the asparaginethat was previously implicated in stabilizing the oxyanionhole ((Wu et al., 2003); hereafter termed ‘‘flap aspara-gine”). In structures with the co-crystallized substratesthe terminal residue of the Ub/Ubl (glycine) is sandwichedbetween this asparagine and the cysteine (Reverter andLima, 2005). In several families, including the classical E2of Ub in eukaryotes and certain bacterial families theside-chain of the flap asparagine is flanked, on the sideopposite to the catalytic cysteine and substrate Ub/Ubl,by a conserved histidine (termed flap histidine) that is pres-ent two residues upstream in the sequence (HxN signature,Figs. 1 and 2). As mentioned above, the tyrosine or anequivalent hydrophobic residue implicated in forming apart of the active site in studies on UBC9 is only conservedin a small number of families, and the co-crystal structuresshow it forming a part of the interface with the target pep-tide (Reverter and Lima, 2005; Yunus and Lima, 2006).Hence, its poor conservation is likely to be a consequenceof the differences in target peptide recognized by differentE2s, as well and alterations in the architecture of the activesite seen in various families of the E2 fold (see below).

Examination of the sequence and structure alignmentsreveal several residues, besides the catalytic cysteine, thatare highly conserved across most members of the fold,including the divergent Apg3/10 and bacterial families

Page 4: Anatomy of the E2 ligase fold: Implications for enzymology and evolution of ubiquitin/Ub-like protein conjugation

208 A.M. Burroughs et al. / Journal of Structural Biology 162 (2008) 205–218

Page 5: Anatomy of the E2 ligase fold: Implications for enzymology and evolution of ubiquitin/Ub-like protein conjugation

Fig

.1.

Mu

ltip

leal

ign

men

to

fse

lect

edE

2d

om

ain

s,re

pre

sen

tin

gac

tive

site

vari

ants

.S

equ

ence

sar

esh

ow

nw

ith

thei

rge

ne

nam

e,sp

ecie

sab

bre

viat

ion

s,an

dge

nb

ank

ID(g

i)n

um

ber

sse

par

ated

by

un

der

sco

res

give

nto

the

left

of

the

alig

nm

ent.

PD

Bid

so

fse

qu

ence

sw

ith

solv

edX

-ray

crys

tals

tru

ctu

res

are

hig

hli

ghte

din

ora

nge

.Th

eto

pli

ne

of

the

alig

nm

ent

lab

els

the

maj

or

con

serv

edfe

atu

res

of

the

fold

and

the

lin

eb

elo

wth

eal

ign

men

tsh

ow

sth

eco

nse

nsu

sse

con

dar

yst

ruct

ure

;E

and

Hd

eno

teb-

stra

nd

and

a-h

elix

,re

spec

tive

ly.

Fam

ily

nam

esar

eli

sted

toth

eri

ght

of

the

alig

nm

ent.

Th

eb

acte

rial

fam

ilie

sA

–Eco

rres

po

nd

tore

pre

sen

tati

ves

of

row

s6A

–6E

inT

able

2o

fIy

eret

al.

(200

6).

Co

lum

ns

inth

eal

ign

men

tar

eco

lore

db

ased

on

the

fun

ctio

nal

role

of

the

resi

du

e:th

eca

taly

tic

cyst

ein

eis

colo

red

inye

llo

wan

dsh

aded

inre

d,

flap

his

tid

ine

and

asp

arag

ine

resi

du

esan

do

ther

resi

du

esp

red

icte

dto

per

form

anal

ogo

us

role

sar

eco

lore

din

red

and

shad

edin

yell

ow

,th

ep

roli

ne

and

hyd

rop

ho

bic

resi

du

efo

rmin

ga

con

serv

edst

abil

izin

gco

nta

ctar

esh

aded

ingr

een

,an

dth

eco

nse

rved

aro

mat

icre

sid

ue

and

oth

erre

sid

ues

pre

dic

ted

tofo

rma

chai

no

fin

tera

ctin

gre

sid

ues

are

colo

red

ingr

ay.

Sp

ecie

sab

bre

viat

ion

sar

eas

foll

ow

s:A

fum

,A

sper

gil

lus

fum

igatu

s;A

mel

,A

pis

mel

life

ra;

Asp

.,A

zoarc

us

sp.;

Ath

a,A

rab

ido

psi

sth

ali

ana

;B

cer,

Baci

llus

cere

us;

Bvi

e,B

urk

ho

lder

iavi

etn

am

ien

sis;

Cel

e,C

aen

orh

ab

dit

isel

ega

ns;

Cn

eo,

Cry

pto

cocc

us

neo

form

an

s;C

per

,C

lost

rid

ium

per

frin

gen

s;D

dis

,D

icty

ost

eliu

md

isco

ideu

m;

Dm

el,

Dro

sop

hil

am

elan

og

ast

er;

Eco

l,E

sch

eric

hia

coli

;E

his

,E

nta

mo

eba

his

toly

tica

;G

lam

,G

iard

iala

mbli

a;

Go

xy,

Glu

con

oba

cter

ox

yd

ans;

Gu

ra,

Geo

ba

cter

ura

niu

mre

du

cen

s;H

sap

,H

om

osa

pie

ns;

Lm

aj,

Lei

shm

ania

majo

r;M

aqu

,M

ari

no

ba

cter

aq

uaeo

lei;

Md

eg,

Mic

rob

ulb

ifer

deg

rad

an

s;M

mu

s,M

us

musc

ulu

s;O

tau

,O

stre

oco

ccu

sta

uri

;P

arc,

Psy

chro

ba

cter

arc

ticu

s;P

ber

,P

arv

ula

rcu

lab

erm

ud

ensi

s;P

fal,

Pla

smo

diu

mfa

lcip

aru

m;

Pn

ap,

Pola

rom

onas

naphth

ale

niv

ora

ns;

Pp

ro,

Pel

ob

act

erp

ropio

nic

us;

Psp

.,P

ola

rom

on

as

sp.;

Pte

t,P

ara

mec

ium

tetr

au

reli

a;

Reu

t,R

als

tonia

eutr

op

ha;

Rm

et,

Ra

lsto

nia

met

all

idu

ran

s;R

sp.,

Rh

izob

ium

sp.;

Sce

r,S

acc

ha

rom

yce

sce

revi

sia

e;S

po

m,

Sch

izo

sacc

ha

rom

yce

sp

om

be;

Tan

n,

Th

eile

ria

an

nu

lata

;T

bru

,T

ryp

ano

som

ab

ruce

i;T

cru

,T

ryp

ano

som

acr

uzi

;T

vag,

Tri

chom

on

as

vagin

ali

s;V

cho

,V

ibri

och

ole

rae;

Xax

o,

Xa

nth

om

on

as

ax

on

opo

dis

.

A.M. Burroughs et al. / Journal of Structural Biology 162 (2008) 205–218 209

(Fig. 1). Amongst these are a proline N-terminal to strand4 and its hydrophobic interacting partner (usually trypto-phan or methionine) located between the flap and helix-2a (Figs. 1 and 2). This pair of residues is also conservedin inactive versions indicating a central role for them in sta-bilizing the E2 fold by tethering the flap to the core sheet ofthe b-meander (Figs. 1 and 2). Another well-conserved res-idue present in both active and inactive versions is a hydro-phobic (mostly aromatic) residue in the loop betweenstrand-2 and strand-3 (Figs. 1 and 2). This residue is partof a ‘‘chain” of interacting residues that, in some form, isseen in most members of the fold, and might play animportant role both in connecting the flap to the core b-sheet as well as potentially allowing conformationalchanges during catalysis (see below for details).

In order to determine if the different versions of the E2fold might share a general mode of target interaction, wecompared their protein surfaces and conservation of sur-face residues. We observed that all catalytically active ver-sions including the divergent Apg3/Apg10 and the inactiveUEV1 proteins and Tsg101 (UEV) family shared the pres-ence of two distinct surface grooves. These grooves arelocated on either face of the central b-sheet and arepredominantly lined by hydrophobic residues that are con-served within individual families (Fig. 3, see supplementarymaterial). Prior structural studies suggest that the grooveon the face of the sheet which packs with the helices inter-acts with the C-terminal tail of Ub/Ubl (Reverter andLima, 2005). The catalytic cysteine of the Ubc6 andAPG3/APG10 families is located towards the center ofthe groove, while that of the other E2 families is closer tothe end of the groove (Fig. 3). However, the invariant asso-ciation of the catalytic cysteine with the groove suggeststhat, despite the differences in residues lining it in the var-ious families, its role in binding the Ub/Ubl tail is an ances-tral feature of the E2 superfamily. The role of the grooveon the exposed face is less clear, but certain structures sug-gest that it might accommodate a part of the E1 duringtrans-thiolation (Wang et al., 2007). Its conservation acrossdiverse E2 families suggests that an E1 interaction via thisgroove is a potentially ancestral feature of the E2 fold.

2.3. Major structural variations in members of the E2 fold

Although the core E2 topology is retained across thefold, and the catalytic cysteine shared by all active mem-bers, we observed rather dramatic structural variations incertain members that entirely altered the local architectureof the active site. The most striking of these was seen in theAPG3 and APG10 families involved in direct E3-indepen-dent transfer of Ubls to protein or lipid substrates inautophagy (Ichimura et al., 2000; Yamada et al., 2007).In these two families, there is a unique helix inserted intothe N-terminal portion of the flap region, and the flap isalso much longer than all other E2 families. Furthermore,instead of forming the usual b-hairpin that points awayfrom the core b-meander, its extended regions stack as

Page 6: Anatomy of the E2 ligase fold: Implications for enzymology and evolution of ubiquitin/Ub-like protein conjugation

Fig. 2. Topology diagram of generalized E2 domain and tree depicting general relationships between E2 domain families. (A) Cartoon representationshowing conserved elements of the E2 fold. The extended elements of the flap are colored in gray. Labels for all elements in the figure correspond to thosein the text. Locations of the flap histidine and flap asparagine in the classic E2 active site are shown as circles colored in yellow, while other conservedresidues mentioned in the text are depicted as line drawings. The C-terminal Ub/Ubl tail that forms hydrogen-bonding interactions with the C-terminalextended element is depicted as a brown colored arrow. The red lines represent hydrogen-bonding occurring before attack by the target lysine. (B)Reconstructed evolutionary history of the E2 protein fold. Lineages in the fold are listed to the right, with inferred evolutionary depth traced by solidhorizontal lines across the relative temporal epochs representing key evolutionary transition periods shown as vertical lines. Horizontal lines are color-coded according to observed phyletic distribution in a given lineage, red applied to bacteria-specific lineages and green applied to eukaryote-specificlineages. Dashed lines indicate uncertainty regarding the origins of a lineage. Yellow circles placed on a horizontal line indicate a lineage has lost catalyticactivity.

210 A.M. Burroughs et al. / Journal of Structural Biology 162 (2008) 205–218

additional strands to the core b-meander (Figs. 2 and 4,PDB: 2dyt). Consequently, the catalytic cysteine is nolonger in an equivalent location as that of the other E2 foldmembers, and is displaced further downstream. This unu-sual displacement precludes interactions with the positionequivalent to the above-mentioned asparagine. Not sur-prisingly, members of the Apg3/Apg10 families lack theflap histidine and asparagine, instead possessing a strictlyconserved histidine residue two residues N-terminal tothe catalytic cysteine (Table 1). Strikingly, the spatial prox-imity of the side-chain of this conserved histidine vis-a-visthe catalytic cysteine is highly reminiscent of that of theflap asparagine vis-a-vis the cysteine of the classical E2(Figs. 2 and 4, PDB: 2dyt). In addition, members of theApg3/Apg10 families also contain large inserts betweenthe 2nd and 3rd strands of the core b-meander.

A comparable structural variation is furnished by theUbc6 family (PDB: 2F4W), which is unified by a distinctivethree-residue insert in the flap after the catalytic cysteine(Figs. 1 and 2). The flap b-hairpin is reorganized due to

hydrogen-bonding interaction between the first extendedregion and a neomorphic extended region between helices2a and 2b (a loop in most other E2 families). This freesthe second extended region of the b-hairpin that insteadappears to hydrogen bond with the core b-meander witha consequent spatial displacement of the catalytic cysteinefrom its usual structural neighborhood. Consistent withthis feature, the Ubc6 family lacks the flap histidine andasparagine, and the equivalent residues are not spatiallypositioned to interact with the active cysteine (Figs. 2 and4). However, we uncovered a histidine, seven residuesdownstream of the catalytic cysteine in the family-specificinsert, which is universally conserved in the Ubc6 family(Fig. 1 and Table 1). Although this region is not resolvedin the available crystal structure, the location of the histi-dine suggests that it is likely to come close to the catalyticcysteine and flank it similar to the flap asparagine in theclassical E2s, or the above-mentioned histidine of theAgp3/Apg10 families (Figs. 2 and 4). The third majorstructural variation is the UbcI family (PDB: 1ZUO) in

Page 7: Anatomy of the E2 ligase fold: Implications for enzymology and evolution of ubiquitin/Ub-like protein conjugation

Table 1Distinguishing sequence and structural features of E2 families

Family namea ReferencePDB/Genbank index no.

Catalyticcysteineb

Flapasparagine

Flaphistidine

Otherpolarresidues

Chain ofconservedinteracting residuesc

Classical E2-UBC13 1J7D-chain B + + + � S45-F47-H77-N79-C87Classical E2-UBC9 1Z5S-chain A + + + Y87, D127 T51-W53-H83-N85-C93Bruce-like E2 61744456 + + N � 4591T-4593Y-4628N-4630N-C4638d

UBE2W 2A7L + H + � T47-Y49-H81-H83-C91Ubc6 2F4W + � � H101 T54-Y56-M122-F87- C94-H101UbcI 1ZUO + � � H362, E306 L252-I268-L292-V297-L298-C304-H362Ufc 2IN1 + K (backbone) � W145 F80-L102-M109-C116-W145APG3 2DYT + � � H232 Y169-M201-V229-T215-T213-A214-H232-C234APG10 119616289 + � � H164 Y102-I128-F161-T149-T147-I146-H164-C166d

Tsg101(UEV)

2F0R � � [H]e � �

AKTIP 61743933 � H � � �Bacterial family A 48864353 + + + � S62-I64-H97-N99-C110d

Bacterial family D 56315656 + + � H185 �a Only relevant families discussed in the text are shown. Specific members of a family that are being described are shown next to a hyphen.b In columns 3,4, and 5, + denotes presence and � denotes absence. Alternate conserved residues found in the same position are also shown.c The chain is represented as a sequence of interacting amino acid residues, separated by hyphens. The number next to the residue is position of the

residue in the reference sequence.d Chain of interacting residues predicted based on sequence conservation and comparison to known structures.e Square brackets are enclose residues that are widely but not absolutely conserved.

A.M. Burroughs et al. / Journal of Structural Biology 162 (2008) 205–218 211

which a constellation of interconnected modifications areobserved. The N-terminal region of the flap upstream tothe b-hairpin forms an extended region that packs with aneomorphic extended region between helices 2a and 2b(usually a loop in most other E2s). This not only appearsto have displaced the flap hairpin with the catalytic cysteine‘‘upwards” (Figs. 2 and 4), but also appears to have dis-placed helix 2b to pack against the flap hairpin. Keepingwith these rearrangements, this family lacks the equivalentof the flap histidine and asparagine, instead bearing a fam-ily-specific conserved histidine at the C-terminus of helix 2b(Figs. 1 and 2, Table 1). This histidine occupies a spatialposition equivalent to the flap asparagine or its structuralanalogs in the other variants discussed above (Fig. 4). Anacidic residue (usually glutamate), universally conservedin this family and located two residues C-terminal to thecatalytic cysteine, appears to tether the histidine in thisposition by forming a salt bridge.

The catalytically inactive RWD domain is the moststructurally divergent version of the E2 fold. Here, the flapappears to have assumed an entirely helical conformation.The retention of the hydrophobic residue, normally presentbetween the flap and helix 2a, and its interacting partner,the conserved proline N-terminal to strand 4, suggests thatthe family-specific helix is derived from the flap and thestructural transition may have been favored by the lossof catalytic activity.

2.4. Subtle sequence variations in E2 families affecting active

site properties

Beyond the major structural variations in the fold andthe corresponding changes in active site architecture, wedetected several sequence variations specific to particular

families that could alter the configuration of the active site.The most striking variation is seen in the Ufc1 family inwhich both the flap asparagine and histidine are absent(Table 1). In place of the former there is a family-specificconserved lysine whose backbone carbonyl is juxtaposedclose to the catalytic cysteine rather than the side-chain.Additionally, this family contains a universally conservedtryptophan in the middle of helix 2a whose indole nitrogenalso comes in close vicinity to the catalytic cysteine andbackbone carbonyl of the lysine (Figs. 1 and 4). Thesetogether appear to form a constellation highly reminiscentof the flap asparagine side-chain of the classical E2s. Sub-tler changes are seen in the Bruce-like E2 family, whichlack independent E3 ligases, and in which the E2 domainis always part of a larger multi-domain polypeptide ((Bart-ke et al., 2004) and AMB, LMI and LA unpublished obser-vations). Here, the flap histidine is substituted by anasparagine. Likewise, throughout the UBE2W family, theflap asparagine residue is replaced by a conserved histidine(Figs. 1 and 4), but its side chain occupies a position similarto the flap asparagine of the classical forms.

Four of the bacterial families (A–D) are predicted to becatalytically competent forms due to conservation of theactive cysteine, with members of the family E likely to beinactive versions analogous to the eukaryotic UEV1 pro-teins and Tsg101 (UEV) family. Though members of thebacterial families exhibit a high level of divergence amongstthemselves, most members of families A, B, and D conserveeither an asparagine or a histidine in position equivalent tothe flap asparagine of the classical E2s (Fig. 1). Family Ccurrently has too few members to derive a meaningful con-servation profile. Family A displays the same active siteconfiguration as the classical E2 family with both a histi-dine and asparagine residue in the flap (i.e. with a HXN

Page 8: Anatomy of the E2 ligase fold: Implications for enzymology and evolution of ubiquitin/Ub-like protein conjugation

Fig. 3. Surface diagrams displaying conserved binding grooves. Molecular surfaces are shown for the two conserved binding grooves in selected membersof the E2 fold. The different grooves are labeled at the top of the figure, the ubiquitin-binding groove is on the left and the putative E1-binding groove is onthe right. Two surface depictions in the same orientation are provided for each groove, the left surface provides an opaque view of the surface with thecatalytic cysteine and flap asparagine/histidine residues colored in yellow, with family-specific conserved residues associated with the groove colored blue.The right surface provides cartoon depictions of the secondary structure of the domain colored in yellow framed by the transparent outline of themolecular surface. The catalytic cysteine and flap asparagine/histidine (or equivalent residues) are rendered as ball-and-sticks and colored in red. Thesurfaces of other conserved residues in a family are colored blue. Family names and PDB codes are given at the left.

212 A.M. Burroughs et al. / Journal of Structural Biology 162 (2008) 205–218

signature). All members of family B have a conserved his-tidine as the cognate of the flap asparagine and lack anyother conserved polar residue that is likely to be in closeproximity to the catalytic cysteine. Members of family D,which have a conventional flap asparagine, lack the flaphistidine. Interestingly, they possess a highly conserved his-tidine between helices 2a and 2b. Based on alignments with

known E2 structures this residue is predicted to occupy aspatial position equivalent to the flap histidine of the con-ventional forms ((Iyer et al., 2006); Fig. 1).

Members of the Tsg101 (UEV) family occasionally pre-serve the flap histidine and rarely the flap asparagine(Fig. 1 and Table 1). The structure of the Tsg101 (UEV)protein suggests that the general architecture of the active

Page 9: Anatomy of the E2 ligase fold: Implications for enzymology and evolution of ubiquitin/Ub-like protein conjugation

Fig. 4. Cartoon structural diagrams showing E2 domain active site variants. Cartoon representations of X-ray structures are shown with PDB codes andfamily names provided at the left. Strands are colored in blue and helices in red. The catalytic cysteine and the flap asparagine/histidine or predictedequivalent residues are labeled and rendered as ball-and-sticks colored in yellow. The molecular surfaces of these residues, as well as residues predicted toform the conserved interacting chain are rendered using their van der Waal radii. The histidine residue predicted to be important in the Ubc6 family, butnot observed in the crystal structure due to structure disorder is placed in its approximate predicted spatial location and colored pink. The proline andhydrophobic resides forming a conserved hydrophobic contact are rendered as ball-and-sticks and are colored orange. The active versions of the domainare boxed in orange and the inactive Tsg101 (UEV) family boxed in purple.

A.M. Burroughs et al. / Journal of Structural Biology 162 (2008) 205–218 213

site is retained despite lacking the active cysteine, and isconsistent with its role in binding mono-ubiquitin on cargo

proteins as part of the endosomal sorting ESCRT-I com-plex (Garrus et al., 2001). Likewise, the inactive UEV1 pro-

Page 10: Anatomy of the E2 ligase fold: Implications for enzymology and evolution of ubiquitin/Ub-like protein conjugation

214 A.M. Burroughs et al. / Journal of Structural Biology 162 (2008) 205–218

teins, which are very close to the classical E2 proteins,retain their interaction with Ub as a subunit of a heterodi-meric E2 complex involved in polyubiquitination (Eddinset al., 2006). All members of the other inactive eukaryoticE2 family, namely the enigmatic AKTIP family, retainthe flap histidine (Fig. 1 and Table 1). Hence, these toomight preserve the general active site architecture, andpotentially function as an Ub-binding protein comparableto UEV1 and Tsg101 (UEV).

2.5. Identification of a chain of interacting residues

connecting the active site to the core b-meander

To understand better the role of the highly conservedhydrophobic (mostly aromatic) residue in the loop betweenstrand-2 and strand-3 (Fig. 1) we systematically examinedits interacting partners and their conservation. We foundthat this hydrophobic residue always interacted with theflap histidine when it was present. As the flap histidine(or cognate asparagine in the BRUCE family) in turn inter-acts with the flap asparagine, and that residue in turn withthe catalytic cysteine, these residues formed an interactingchain connecting the active site to the core b-meander(Fig. 4). In four families (Figs. 1 and 4, Table 1), an alco-holic residue, also from the loop between strand-2 andstrand-3 and two positions upstream of the highly con-served hydrophobic residue, is part of this interactingchain. The hydrophobic residue between strand-2 andstrand-3 is also nearly absolutely conserved in families thathave undergone major structural alterations, or those lack-ing the flap histidine and/or asparagine. An examination ofthese families suggested that even in these cases a compara-ble chain is constituted by the conserved hydrophobic res-idue along with other interacting residues conserved infamily-specific manner (Table 1). For example, in theUbcI-like proteins the interaction chain has been elongatedto include more residues, concomitant with the displace-ment of the active cysteine due to emergence of an addi-tional extended region in the flap. The additional residuesinclude conserved hydrophobic residues from strand-3,the neomorphic extended region in the flap and a couplefrom the first extended segment of the flap hairpin (Table1).

In the Ubc6p family this chain, rather than directlyleading to the cysteine, connects with the loop betweenhelix-2a and helix-2b. This loop in turn furnishes a resi-due, which is usually hydrophobic, that appears todirectly interact with the catalytic cysteine. Furthermore,the afore-stated residue might also convergently providea hydrophobic environment for the cysteine as suggestedfor the tyrosine in experiments on Ubc9 (Figs. 1 and 4and Table 1). In line with the major structural modifica-tion seen in the APG3/APG10 families the interactionchain is longer and more diffuse than any other represen-tatives of the fold. Here, the conserved hydrophobic resi-due between strand -2 and strand-3 initiates the chain viaa contact with a highly conserved methionine from the

family-specific helix in the flap. This in turn interacts witha conserved hydrophobic residue and the Apg3/Apg10-specific ThTb motif (h: hydrophobic and b: big residues),respectively, from the two extended regions of the exag-gerated b-hairpin seen in these families. The ThTb motifthen provides the link to the HPC motif at the active site(Figs. 1 and 4, Table 1).

2.6. Implications of the conserved and diversified features for

catalytic mechanisms of the E2 fold

From the earliest studies on E2 mechanism it has beenremarked that the ‘‘catalytic landscape of the E2 activesite is remarkably sterile” (Johnson, 2004; Pickart, 2001;Wu et al., 2003). While subsequent studies have uncov-ered potentially important details regarding the E2 mech-anism (Wu et al., 2003; Yunus and Lima, 2006), therehas been no systematic attempt to unite these structureand mutagenesis studies with evolutionary conservationacross the E2 fold and within individual families. Theabove observations emerging from our analysis basedon a natural classification of the E2 domains and thewealth of new structures provides several new insightsinto E2 catalytic function:

(1) In terms of mechanical properties of the E2 domain thepresence of a conserved surface groove suggests a simi-lar mode of Ub/Ubl-interaction throughout the fold(Fig. 3). Furthermore, conservation of the flap structuredespite the variety in active site architecture suggests animportant generalized role for it in E2 action (Fig. 2 andFig. 4). The opening and pairing of the second extendedsegment of the flap hairpin with the Ub/Ubl-tail sug-gests that this part of the flap is central to interactionwith the Ub/Ubl tail and specific recognition of the cog-nate Ub/Ubl.

(2) The conserved proline-hydrophobic side chain inter-action at the ‘‘top” of the flap and the chain of inter-acting residues at the ‘‘bottom” of the flap may alsobe interpreted as contributing to the mechanical fea-tures of the E2-Ub/Ubl interaction (Fig. 2). Thesetether the C-terminal end of the flap and the firstextended segment to the rest of the b-meander scaf-fold. Hence, they probably stabilize the rest of thestructure while allowing the second segment of thehairpin to alternatively form hydrogen-bonds withthe first segment (when there is no Ub/Ubl) or withthe Ub/Ubl-tail (upon and after trans-thiolation)(see supplementary material; (Reverter and Lima,2005; Yunus and Lima, 2006)).

(3) In terms of reaction mechanism, the cysteine being theonly universally conserved residue in all active formsis consistent with experimental evidence that it is theonly required residue for most reactions involvingE2s, except for the E3-dependent final transfer of Ub/Ubl to a target lysine. Our comparisons also show thatthe E2 fold represents an exceptional case in terms of

Page 11: Anatomy of the E2 ligase fold: Implications for enzymology and evolution of ubiquitin/Ub-like protein conjugation

A.M. Burroughs et al. / Journal of Structural Biology 162 (2008) 205–218 215

enzyme evolution: while always being on the secondextended segment of the flap and lying in an exposedposition in a groove, the active cysteine occupies aremarkable diversity of spatial positions in differentfamilies with respect to the core scaffold of the fold(Fig. 4). This is no more strikingly illustrated than in acomparison of the Apg3 structure with that of a mem-ber of the classical E2 family (Fig. 4) (Yamada et al.,2007). This observation is again consistent with therebeing few constraints on the specific environment ofthe cysteine. It also means that there are no specific con-served residues across the E2 fold for creating the rightenvironment for lowering the pKa of the attacking tar-get lysine (Yunus and Lima, 2006). Instead this is likelyto be achieved in a generic fashion due to the interfaceformed by target-E2 packing.

(4) While the flap asparagine implicated in oxyanion holestabilization (Wu et al., 2003; Yunus and Lima, 2006)is conserved only in certain families, we found that itis always substituted by a histidine from the cognateposition, or other polar side chains from entirely dif-ferent locations (Fig. 1 and Fig. 4 and Table 1). Inmost instances, these alternative polar side chainsare histidine, and in one case (Ufc1) appears to bereplaced by a combination of a backbone carbonyland the nitrogen of the tryptophan indole group.Thus, a polar group appears to have convergentlyemerged in an equivalent spatial location in the vicin-ity of the active cysteine in different families of thefold. The available structures suggest that this polargroup is likely to interact with the C-terminus ofthe Ub/Ubl. Further, this residue being almostalways an asparagine or a histidine is reminiscent ofthe fact that these two residues appear frequently inpeptidase catalytic triads to stabilize oxyanion holes(Berg et al., 2002; Johnston et al., 1999; Rawlingsand Barrett, 1994). Hence, it is possible that each ofthese represent independent innovations to stabilizethe oxyanion hole and localize the Ub/Ubl tail forattack by lysine. However, in some cases like Apg3/Apg10 the conserved histidine could additionally pro-vide a general base for activating the lysine. In a moregeneral sense, this convergent evolution of a polargroup in the vicinity of the cysteine in the E2s is rem-iniscent of the convergent evolution of the third (thecysteine and histidine being usually invariant and intheir ancestral positions) residue of the catalytic triadin peptidases of the papain-like fold (Balakirev et al.,2003; Iyer et al., 2004).

The flap histidine which interacts with the flap aspara-gine and is part of the above-described interacting residuechain is fairly widespread. Moreover, in the bacterial fam-ily D, where it is absent it seems to have a functional ana-log in the form of another histidine from the loopbetween helices 2a and 2b. This suggests a potentiallyimportant role for this histidine with respect to the flap

asparagine, by interacting with the oxo-group of theasparagine side-chain as suggested by some substrate-complexed structures (e.g. PDB: 1z5s; (Reverter andLima, 2005)). Mutagenesis of the histidine in Ubc13affected protein solubility/stability (Wu et al., 2003);hence, it is not clear if the experiments that failed touncover a role for it in catalysis have objectively evalu-ated its function. Given the nature of the histidine imidaz-ole ring it is possible that it acts as a switch thattransiently regulates the asparagine, by alternately assum-ing acidic and basic states. As the substitutes for the flapasparagine are usually histidines, it is possible that inthese cases the histidine directly shows such switch-likeproperties. The only multi-domain E2s are the BRUCE-like family, and these have an asparagine in place of theflap histidine (Fig. 1 and Table 1). Here, the other globu-lar domains fused to the E2 domain might help in provid-ing an E3 like function. This raises the question as towhether the role of the flap histidine is mainly importantin the context of E2s functioning with independent E3s ofthe non-HECT variety.

(5) A corollary to the observation that there is a wholerange of family-specific innovations in the regionaround the active cysteine is that these could playa role in target specificity. Extrapolating on thebasis of the target-E2 complexes (e.g. PDB: 1z5s;(Reverter and Lima, 2005)), we propose that theneomorphic helices or strands in the flap region(for example in Agp3/Apg10, UbcI, and Ubc6)and the region between helix2a and helix 2b andadditional C-terminal helices would provide fam-ily-specific interfaces for target specificity. In othercases, apparently drastic changes in substrate spec-ificity might also be achieved via relatively subtlechanges. While Apg3 and Apg10 differ in, respec-tively, operating on a protein and lipid target theyare closer to each other than any other E2 family.Autophagy or related processes are apparently uni-versal to eukaryotes, but some eukaryotes lackApg10, while possessing Apg3 (Supplementarymaterial). This suggests that Apg3 might performboth roles in these organisms without any majorchanges in its active site. Nevertheless, we identifieda few positions that might contribute towards thedifferences in target specificities of the two enzymes.Through systematic comparison of relative entropythe cognate alignment positions of the two familieswe identified one strongly differentially conservedposition two residues upstream of the catalytic cys-teine (Fig. 1). It is, respectively, a highly conservedthreonine in Apg10 and a histidine in Apg3. Thestructure of Apg3 (Yamada et al., 2007) suggeststhat it is likely to line the groove containing thecatalytic cysteine and control access to it. Takentogether, these observations suggest that the residueat this position might play a role in target specific-ity of the two families.

Page 12: Anatomy of the E2 ligase fold: Implications for enzymology and evolution of ubiquitin/Ub-like protein conjugation

216 A.M. Burroughs et al. / Journal of Structural Biology 162 (2008) 205–218

2.7. Evolutionary history of E2 enzymes

The availability of complete genome sequences of early-branching eukaryotes like Trichomonas and Giardia, andancient branches like the kinetoplastid-heterolobosanclade allows an objective reconstruction of the E2 super-family in eukaryotes. Phyletic patterns of different E2 fam-ilies suggest that the Last Eukaryotic Common Ancestor(LECA) possessed members of the large classical E2 fam-ily, along with other catalytically active families such asUbc6p, APG10, APG3, and UbcI, and catalytically inac-tive versions such as Tsg101 (UEV), UEV1 and AKTIP.Thus, the major structural variants in the E2 family seenin extant eukaryotes were already present in LECA (Anan-tharaman et al., 2007) and adapted to diverse niches suchas degradation of proteins in the cytosol and endoplasmicreticulum (classical E2 and Ubc6p families), autophagythat involves transfer of Ubls to proteins (APG10) and lip-ids (APG3), transfer of polyubiquitin to proteins (UEV1family), and endosomal sorting (Tsg101) (Fig. 2). TheUfc1 and the inactive RWD families are apparently absentin the basal eukaryotic branches and appear to haveemerged slightly later, prior to the diversification of theheterolobosea-kinetoplastid clade (Fig. 2 and Supplemen-tary material). The UBE2W and Bruce-like E2 familiesemerged in the common ancestor of the crown-groupeukaryotes (plants, slime molds, fungi, and animals) andthe chromalveolate clades. These diversifications involvedfour independent innovations of polar residues in thevicinity of the catalytic cysteine and loss of catalytic activ-ity on at least three different occasions. This evolutionaryreconstruction of the E2 families is consistent with the cor-responding reconstructions of the cognate Ub/Ubl (Bur-roughs et al., 2007) and E1 families (MB, LMI, LAmanuscript in preparation) that interact with the differentE2 proteins. Thus, there appears to have been a major con-comitant radiation of these three components resulting inmultiple parallel Ub-conjugation systems prior to LECAitself, which were recruited to diverse physiological pro-cesses. In contrast, the E3-ligases show major lineage-spe-cific expansions in different eukaryotic lineages (Lespinetet al., 2002) suggesting that they were the main playersin recruiting the ancient core of the conjugation systemsto various new targets.

In our earlier studies, we had reported that the bacterialmembers of the E2 superfamily are encoded by predictedmobile operons that also code for E1-like, Ub-like andJab-peptidase components (Iyer et al., 2006). Based on thiswe proposed that they constitute potential modificationsystems that were likely to have been precursors of theeukaryotic E2s. Of the five distinct bacterial E2 families,family A most closely resembles the large classical E2 fam-ily of eukaryotes with a flap histidine and flap asparagine.We suggest this is likely to represent the family from whichthe eukaryotic ancestral version emerged, most probablyduring the primary endosymbiotic event that spawned theeukaryotes.

3. General conclusions

The catalytic mechanism of the E2 enzymes has been anenduring problem of interest in the enzymology of ubiqui-tin-conjugation. Taking advantage of the wealth of recentstructural data, sensitive sequence comparisons, and anobjective natural classification of the E2 superfamily weidentify many under-appreciated features of the E2 fold.We present hypotheses for the roles of these features inmechanical and biochemical aspects of E2 catalysis. Cru-cially, we identify the major unifying themes of E2 catalysisranging from the bacterial versions, to the divergent Agp3/Apg10 and the classical eukaryotic E2 proteins and theirvariants. Our studies show that the E2 superfamily hasundergone considerable sequence and structure diversifica-tion in the region of its active site since the time of its emer-gence in the bacteria. Our analysis suggests that thesequence and structural diversity in the E2 active siteswas not driven to a major extent by changes in the modeof E1 or Ub/Ubl binding. Instead, it is appears that theE2 diversification reflects the presence of relatively few con-straints on the catalytic cysteine, and adaptations that facil-itate target peptide (i.e. context of the target lysine) or lipidchoice. A common theme is presence of a polar residue(mostly histidine or asparagine) that has convergentlyemerged in various families in the vicinity of the catalyticcysteine. This is likely to both position the Ub-tail as wellas possibly stabilize the oxyanion hole during transfer ofUb/Ubl to lysine. We hope that the analysis presented herespurs further experiments in the form of site directed muta-genesis and structure determination to elucidate E2catalysis.

4. Material and methods

The non-redundant (NR) database of protein sequences(National Center for Biotechnology Information, NIH,Bethesda, MD) was searched with the BLASTP program(Altschul et al., 1997). Profile searches were conductedusing the PSI-BLAST program (Altschul et al., 1997) witheither single sequences or multiple alignments as queries,with a profile inclusion expectation (e) value threshold of0.01; searched were iterated until convergence. HiddenMarkov models (HMMs) built from alignments using thehmmbuild program were also employed in searches carriedout using the hmmsearch program from the HMMERpackage (Eddy, 1998). For queries and searches containingcompositionally biased segments, the statistical correctionoption built into the BLAST program was used (Schafferet al., 2001). Multiple alignments were constructed usingthe MUSCLE (Edgar, 2004) and/or KALIGN programs(Lassmann and Sonnhammer, 2005), followed by manualadjustment based on PSI-BLAST hsps and informationprovided by solved three-dimensional structures. Proteinstructures were visualized using the Swiss-PDB viewer(Guex and Peitsch, 1997) and cartoons were constructedwith the PyMOL program (http://www.pymol.org). Pro-

Page 13: Anatomy of the E2 ligase fold: Implications for enzymology and evolution of ubiquitin/Ub-like protein conjugation

A.M. Burroughs et al. / Journal of Structural Biology 162 (2008) 205–218 217

tein secondary structure predictions were made with theJPRED program, which computes a consensus predictedstructure using the information from the conservation pro-file, a PSI-BLAST position-specific score matrix and a hid-den Markov model (Cuff and Barton, 2000).

Structure similarity searches were conducted using thestandalone version of the DALI program with the querystructures scanned against a local current version of thePDB that has all chains as separate entries (Holm and San-der, 1995). The structural hits for each query were col-lected, even if the DALI Z-score for the match was lessthan 2.0 and then parsed for topological congruence tothe E2 structural template (Fig. 2) using a custom PERLscript. To assess topological congruence, coordinates ofthe matching regions detected by DALI searches usingknown E2 domains as queries were extracted and analyzedfor secondary structure using the DSSP program (Kabschand Sander, 1983). These secondary structure elementswere then represented as a string (corresponding to a rowin table 1) along with the polarity of the secondary struc-ture element determined from the DALI match to thequery structure. These strings were then matched with theequivalent secondary structure pattern strings constructedof bona fide E2 domains. If a complete match was obtainedthese structures were tagged as congruent, while thosewhich were not were ranked in descending order of ele-ments that did not match. This discrimination of the poten-tial candidates was further confirmed by visualexamination of each structure. Structural multiple align-ments were carried out using the MUSTANG program(Konagurthu et al., 2006). The interacting residues in var-ious members of the E2 fold were deduced using a PERLscripts from the TASS package (V. Anantharaman, S. Bal-aji and LA, unpublished). The scripts encode interactingdistance cut-off values of 5.0 and 3.5 A between appropri-ate atoms in 3-D for deducing the hydrophobic and polarinteractions respectively. These inferred interactions wereconfirmed via manually examination using Swiss-PDBviewer (Guex and Peitsch, 1997).

Preliminary clustering of the E2 superfamily was per-formed using the BLASTCLUST program (ftp://ftp.ncbi.nih.gov/blast/documents/blastclust.html) with a range ofBLAST bit score density cutoffs (1–0.4) and length ratiosfor the alignments (0.8–0.6) in order to establish the coreof potential families. These were further extended usingreciprocal BLAST hits. The phyletic profiles of individualfamilies were determined using scripts of the TASS pack-age that query into the NCBI taxonomy database. Higherorder relationship between families was inferred usingshared conserved motifs in structure (when available) andsequence. Phylogenetic analysis within families was carriedout using a variety of methods including maximum-likeli-hood, neighbor-joining, and minimum evolution (leastsquares) methods with the MEGA, PHYLIP and MOL-PHY packages (Adachi and Hasegawa, 1992; Felsenstein,1996; Kumar et al., 2004). All large-scale sequence andstructure analysis procedures were carried out with the

TASS software package, a successor to the SEALS pack-age (Walker and Koonin, 1997).

Acknowledgments

The authors gratefully acknowledge the Intramural re-search program of the National Library Of Medicine, Na-tional Institutes of Health, USA for funding their research.Comprehensive supplementary material is also available at:http://www.ncbi.nlm.nih.gov/CBBresearch/Lakshmin/E2.

Appendix A. Supplementary data

Supplementary data associated with this article can befound, in the online version, at doi:10.1016/j.jsb.2007.12.006.

References

Adachi, J., Hasegawa, M., 1992. MOLPHY: Programs for MolecularPhylogenetics Institute of Statistical Mathematics, Tokyo.

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller,W., Lipman, D.J., 1997. Gapped BLAST and PSI-BLAST: a newgeneration of protein database search programs. Nucleic Acids Res.25, 3389–3402.

Anantharaman, V., Iyer, L.M., Aravind, L., 2007. Comparative genomics ofprotists: new insights into the evolution of eukaryotic signal transductionand gene regulation (�). Annu. Rev. Microbiol. 61, 453–475.

Arai, R., Yoshikawa, S., Murayama, K., Imai, Y., Takahashi, R.,Shirouzu, M., Yokoyama, S., 2006. Structure of human ubiquitin-conjugating enzyme E2 G2 (UBE2G2/UBC7). Acta Crystallogr. Sect.F. Struct. Biol. Cryst. Commun. 62, 330–334.

Aravind, L., Iyer, L.M., Koonin, E.V., 2003. Scores of RINGS but noPHDs in ubiquitin signaling. Cell Cycle 2, 123–126.

Aravind, L., Iyer, L.M., Koonin, E.V., 2006. Comparative genomics andstructural biology of the molecular innovations of eukaryotes. Curr.Opin. Struct. Biol. 16, 409–419.

Ardley, H.C., Robinson, P.A., 2005. E3 ubiquitin ligases. Essays Biochem.41, 15–30.

Balakirev, M.Y., Tcherniuk, S.O., Jaquinod, M., Chroboczek, J., 2003.Otubains: a new family of cysteine proteases in the ubiquitin pathway.EMBO Rep. 4, 517–522.

Bartke, T., Pohl, C., Pyrowolakis, G., Jentsch, S., 2004. Dual role ofBRUCE as an antiapoptotic IAP and a chimeric E2/E3 ubiquitinligase. Mol. Cell 14, 801–811.

Berg, J.M., Tymoczko, J.L., Stryer, L., 2002. Biochemistry, fifth ed. W.H.Freeman and Co., New York.

Berleth, E.S., Pickart, C.M., 1996. Mechanism of ubiquitin conjugatingenzyme E2-230K: catalysis involving a thiol relay? Biochemistry 35,1664–1671.

Burroughs, A.M., Balaji, S., Iyer, L.M., Aravind, L., 2007. Small butversatile: the extraordinary functional and structural diversity of thebeta-grasp fold. Biol. Direct 2, 18.

Capili, A.D., Lima, C.D., 2007. Taking it step by step: mechanisticinsights from structural studies of ubiquitin/ubiquitin-like proteinmodification pathways. Curr. Opin. Struct. Biol.

Chen, P., Johnson, P., Sommer, T., Jentsch, S., Hochstrasser, M., 1993.Multiple ubiquitin-conjugating enzymes participate in the in vivodegradation of the yeast MAT alpha 2 repressor. Cell 74,357–369.

Christensen, D.E., Brzovic, P.S., Klevit, R.E., 2007. E2-BRCA1 RINGinteractions dictate synthesis of mono- or specific polyubiquitin chainlinkages.. Nat. Struct. Mol. Biol. 14, 941–948.

Page 14: Anatomy of the E2 ligase fold: Implications for enzymology and evolution of ubiquitin/Ub-like protein conjugation

218 A.M. Burroughs et al. / Journal of Structural Biology 162 (2008) 205–218

Ciechanover, A., Orian, A., Schwartz, A.L., 2000. Ubiquitin-mediatedproteolysis: biological regulation via destruction. Bioessays 22, 442–451.

Cuff, J.A., Barton, G.J., 2000. Application of multiple sequence alignmentprofiles to improve protein secondary structure prediction. Proteins 40,502–511.

Dye, B.T., Schulman, B.A., 2007. Structural mechanisms underlyingposttranslational modification by ubiquitin-like proteins. Annu. Rev.Biophys. Biomol. Struct. 36, 131–150.

Eddins, M.J., Carlile, C.M., Gomez, K.M., Pickart, C.M., Wolberger, C.,2006. Mms2-Ubc13 covalently bound to ubiquitin reveals the struc-tural basis of linkage-specific polyubiquitin chain formation. Nat.Struct. Mol. Biol. 13, 915–920.

Eddy, S.R., 1998. Profile hidden Markov models. Bioinformatics 14, 755–763.

Edgar, R.C., 2004. MUSCLE: a multiple sequence alignment method withreduced time and space complexity. BMC Bioinformatics 5, 113.

Felsenstein, J., 1996. Inferring phylogenies from protein sequences byparsimony, distance, and likelihood methods.. Methods Enzymol. 266,418–427.

Garrus, J.E., von Schwedler, U.K., Pornillos, O.W., Morham, S.G., Zavitz,K.H., Wang, H.E., Wettstein, D.A., Stray, K.M., Cote, M., Rich, R.L.,Myszka, D.G., Sundquist, W.I., 2001. Tsg101 and the vacuolar proteinsorting pathway are essential for HIV-1 budding. Cell 107, 55–65.

Guex, N., Peitsch, M.C., 1997. SWISS-MODEL and the Swiss-PdbView-er: an environment for comparative protein modeling. Electrophoresis18, 2714–2723.

Hau, D.D., Lewis, M.J., Saltibus, L.F., Pastushok, L., Xiao, W.,Spyracopoulos, L., 2006. Structure and interactions of the ubiquitin-conjugating enzyme variant human Uev1a: implications for enzymaticsynthesis of polyubiquitin chains. Biochemistry 45, 9866–9877.

Hershko, A., Heller, H., Elias, S., Ciechanover, A., 1983. Components ofubiquitin–protein ligase system. Resolution, affinity purification, androle in protein breakdown. J. Biol. Chem. 258, 8206–8214.

Holm, L., Sander, C., 1995. Dali: a network tool for protein structurecomparison. Trends Biochem. Sci. 20, 478–480.

Ichimura, Y., Kirisako, T., Takao, T., Satomi, Y., Shimonishi, Y.,Ishihara, N., Mizushima, N., Tanida, I., Kominami, E., Ohsumi, M.,Noda, T., Ohsumi, Y., 2000. A ubiquitin-like system mediates proteinlipidation. Nature 408, 488–492.

Iyer, L.M., Koonin, E.V., Aravind, L., 2004. Novel predicted peptidaseswith a potential role in the ubiquitin signaling pathway. Cell Cycle 3,1440–1450.

Iyer, L.M., Burroughs, A.M., Aravind, L., 2006. The prokaryoticantecedents of the ubiquitin-signaling system and the early evolutionof ubiquitin-like beta-grasp domains. Genome Biol. 7, R60.

Johnson, E.S., 2004. Protein modification by SUMO. Annu. Rev.Biochem. 73, 355–382.

Johnston, S.C., Riddle, S.M., Cohen, R.E., Hill, C.P., 1999. Structuralbasis for the specificity of ubiquitin C-terminal hydrolases. EMBO J.18, 3877–3887.

Kabsch, W., Sander, C., 1983. Dictionary of protein secondary structure:pattern recognition of hydrogen-bonded and geometrical features.Biopolymers 22, 2577–2637.

Knipscheer, P., Sixma, T.K., 2006. Divide and conquer: the E2 active site.Nat. Struct. Mol. Biol. 13, 474–476.

Komatsu, M., Chiba, T., Tatsumi, K., Iemura, S., Tanida, I., Okazaki, N.,Ueno, T., Kominami, E., Natsume, T., Tanaka, K., 2004. A novelprotein-conjugating system for Ufm1, a ubiquitin-fold modifier.EMBO J. 23, 1977–1986.

Konagurthu, A.S., Whisstock, J.C., Stuckey, P.J., Lesk, A.M., 2006.MUSTANG: a multiple structural alignment algorithm. Proteins 64,559–574.

Kumar, S., Tamura, K., Nei, M., 2004. MEGA3: Integrated software forMolecular Evolutionary Genetics Analysis and sequence alignment.Brief Bioinform. 5, 150–163.

Lassmann, T., Sonnhammer, E.L., 2005. Kalign—an accurate and fastmultiple sequence alignment algorithm. BMC Bioinformatics 6, 298.

Lespinet, O., Wolf, Y.I., Koonin, E.V., Aravind, L., 2002. The role oflineage-specific gene family expansion in the evolution of eukaryotes.Genome Res. 12, 1048–1059.

Melchior, F., Schergaut, M., Pichler, A., 2003. SUMO: ligases, isopep-tidases and nuclear pores.. Trends Biochem. Sci. 28, 612–618.

Melner, M.H., Haas, A.L., Klein, J.M., Brash, A.R., Boeglin, W.E.,Nagdas, S.K., Winfrey, V.P., Olson, G.E., 2006. Demonstration ofubiquitin thiolester formation of UBE2Q2 (UBCi), a novel ubiquitin-conjugating enzyme with implantation site-specific expression. Biol.Reprod. 75, 395–406.

Mizushima, T., Tatsumi, K., Ozaki, Y., Kawakami, T., Suzuki, A.,Ogasahara, K., Komatsu, M., Kominami, E., Tanaka, K., Yamane,T., 2007. Crystal structure of Ufc1, the Ufm1-conjugating enzyme.Biochem. Biophys. Res. Commun. 362, 1079–1084.

Moraes, T.F., Edwards, R.A., McKenna, S., Pastushok, L., Xiao, W.,Glover, J.N., Ellison, M.J., 2001. Crystal structure of the humanubiquitin conjugating enzyme complex, hMms2-hUbc13. Nat. Struct.Biol. 8, 669–673.

Murzin, A.G., Brenner, S.E., Hubbard, T., Chothia, C., 1995. SCOP: astructural classification of proteins database for the investigation ofsequences and structures. J. Mol. Biol. 247, 536–540.

Nameki, N., Yoneyama, M., Koshiba, S., Tochio, N., Inoue, M., Seki, E.,Matsuda, T., Tomo, Y., Harada, T., Saito, K., Kobayashi, N., Yabuki,T., Aoki, M., Nunokawa, E., Matsuda, N., Sakagami, N., Terada, T.,Shirouzu, M., Yoshida, M., Hirota, H., Osanai, T., Tanaka, A.,Arakawa, T., Carninci, P., Kawai, J., Hayashizaki, Y., Kinoshita, K.,Guntert, P., Kigawa, T., Yokoyama, S., 2004. Solution structure of theRWD domain of the mouse GCN2 protein. Protein Sci. 13, 2089–2100.

Pickart, C.M., 2001. Mechanisms underlying ubiquitination. Annu. Rev.Biochem. 70, 503–533.

Rawlings, N.D., Barrett, A.J., 1994. Families of cysteine peptidases.Methods Enzymol. 244, 461–486.

Reverter, D., Lima, C.D., 2005. Insights into E3 ligase activity revealedby a SUMO-RanGAP1-Ubc9-Nup358 complex. Nature 435, 687–692.

Schaffer, A.A., Aravind, L., Madden, T.L., Shavirin, S., Spouge,Wolf, Y.I., Koonin, E.V., Altschul, S.F., 2001. Improving theaccuracy of PSI-BLAST protein database searches with composi-tion-based statistics and other refinements. Nucleic Acids Res. 29,2994–3005.

Tong, H., Hateboer, G., Perrakis, A., Bernards, R., Sixma, T.K., 1997.Crystal structure of murine/human Ubc9 provides insight into thevariability of the ubiquitin-conjugating system. J. Biol. Chem. 272,21381–21387.

Walker, D.R., Koonin, E.V., 1997. SEALS: a system for easy analysis oflots of sequences. Proc. Int. Conf. Intell. Syst. Mol. Biol. 5, 333–339.

Wang, J., Hu, W., Cai, S., Lee, B., Song, J., Chen, Y., 2007. The intrinsicaffinity between E2 and the Cys domain of E1 in ubiquitin-likemodifications. Mol. Cell 27, 228–237.

Wertz, I.E., O’Rourke, K.M., Zhou, H., Eby, M., Aravind, L., Seshagiri,S., Wu, P., Wiesmann, C., Baker, R., Boone, D.L., Ma, A., Koonin,E.V., Dixit, V.M., 2004. De-ubiquitination and ubiquitin ligasedomains of A20 downregulate NF-kappaB signalling. Nature 430,694–699.

Wu, P.Y., Hanlon, M., Eddins, M., Tsui, C., Rogers, R.S., Jensen, J.P.,Matunis, M.J., Weissman, A.M., Wolberger, C., Pickart, C.M., 2003.A conserved catalytic residue in the ubiquitin-conjugating enzymefamily. EMBO J. 22, 5241–5250.

Yamada, Y., Suzuki, N.N., Hanada, T., Ichimura, Y., Kumeta, H.,Fujioka, Y., Ohsumi, Y., Inagaki, F., 2007. The crystal structure ofAtg3, an autophagy-related ubiquitin carrier protein (E2) enzyme thatmediates Atg8 lipidation. J. Biol. Chem. 282, 8036–8043.

Yunus, A.A., Lima, C.D., 2006. Lysine activation and functional analysisof E2-mediated conjugation in the SUMO pathway. Nat. Struct. Mol.Biol. 13, 491–499.

Zheng, N., Wang, P., Jeffrey, P.D., Pavletich, N.P., 2000. Structure of ac-Cbl-UbcH7 complex: RING domain function in ubiquitin-proteinligases.. Cell 102, 533–539.