predicting allergen cross reactions by protein sequence

Predicting Allergen Cross Reactions by Protein Sequence Graduate School for Cellular and Biomedical Sciences University of Bern MD-PhD Thesis Submitted by Pascal Bruno Pfiffner from Mels SG and Mels-Weisstannen SG Thesis advisor Prof. Dr. Beda M Stadler University Institute of Immunology Medical Faculty of the University of Bern Original document saved on the web server of the University Library of Bern This work is licensed under a Creative Commons Attribution-Non-Commercial-No derivative works 2.5 Switzerland licence. To see the licence go to http://creativecommons.org/licenses/by-nc-nd/2.5/ch/ or write to Creative Commons, 171 Second Street, Suite 300, San Francisco, California 94105, USA.


Copyright Notice
You are free:
to copy, distribute, display, and perform the work
Under the following conditions:
Attribution. You must give the original author credit.
Non-Commercial. You may not use this work for commercial purposes.
No derivative works. You may not alter, transform, or build upon this work.
For any reuse or distribution, you must make clear to others the license terms of this work.
Any of these conditions can be waived if you get permission from the copyright holder.
Nothing in this license impairs or restricts the author’s moral rights according to Swiss law.
The detailed license agreement can be found at: http://creativecommons.org/licenses/by-nc-nd/2.5/ch/legalcode.de
Accepted by the Faculty of Medicine, the Faculty of Science and the
Vetsuisse Faculty of the University of Bern at the request of the Graduate
School for Cellular and Biomedical Sciences
Bern, Dean of the Faculty of Medicine
Bern, Dean of the Faculty of Science
Bern, Dean of the Vetsuisse Faculty Bern
Contents
3 Scientific Overview
  3.1 Allergen Cross-Reactions
    3.1.1 Allergy
    3.1.2 The Context of Cross-Reactions
    3.1.3 The Molecular Basis
  3.2 Allergenicity Testing
    3.2.1 From Skin Testing to Laboratory Analysis
    3.2.2 Quantifying Cross-Reactivity
    3.2.3 Allergy Array Test System
  3.3 Allergenicity Prediction
    3.3.1 Necessity
    3.3.2 Epitope Focused Prediction
    3.3.3 Structure-Sequence Relationship
    3.3.4 Identifying Conserved Domains
    3.3.5 Motifs and General Profiles
  3.4 Bioinformatics of Cross Reactions
    3.4.1 Motif Calculation
    3.4.2 Web Interface
  3.5 Outlook: Protein Surface Comparison
    3.5.1 Ab initio Protein Folding Prediction
    3.5.2 Homology Modeling
    3.5.3 Prediction of Similar Surfaces
4 Results – Dissertation Equivalents
  4.1 Dissertation Equivalent I
  4.2 Dissertation Equivalent II
5 Acknowledgements
List of Figures
3.1 Structural alignment of Bet v 1 and Mal d 1
3.2 Structures and folds newly added to the PDB
3.3 Iterative motif discovery
3.4 Web-based front-end
List of Tables
1 Abstract
Clinically important allergen cross-reactions such as the pollen-food syndromes have been shown to originate from structural homology. Additionally, in the last two years no new protein folds were discovered, implying that the universe of unique protein folds may be almost complete. For allergy this may suggest that the immune system also reacts adversely in a predictable way, namely by recognizing homologous proteins once a sensitization is established. Thus we have applied sequence-based computational homology prediction to assess the extent of cross-reactivity.
In a first paper we have analyzed more than 5’000 serum samples, each tested for specific IgE against multiple allergen extracts (ImmunoCAP®). We found the degree of cross-reactivity to be astonishingly high. However, as specific IgE determinations were based on crude allergen extracts, we were unable to conclude that the observed cross-reactivity reliably depends on the allergen sequence.
Thus in a second paper we utilized data obtained for specific IgE directed against highly purified natural or recombinant proteins on a new allergen chip system (ISAC®). Thereby we assessed the sensitization pattern against 105 proteins in more than 3’000 serum samples. Protein pairs predicted to cross-react, based on computationally identified homology, co-reacted significantly more often than protein pairs without apparent homology. Additionally, we demonstrated that the allergen source, and therefore co-sensitization, was much less important than protein homology.
We conclude that cross-reactivity is an important mechanism in the development of allergic diseases, more important than generally accepted. Allergy diagnosis and treatment may benefit from the combination of allergen chip data, i.e. specific IgE values directed against purified and recombinant proteins, and computationally predicted cross-reactivity. Finally, our continuous endeavor to assess the number of structural motifs in allergens shows that not only protein folds, but also the number of allergen motifs may soon reach a plateau. Hence allergenicity prediction may become as valid as wet lab testing for new and potential allergens.
Bet v 1 Betula verrucosa 1, major birch pollen allergen
CASP Critical Assessment of Techniques for Protein Structure Prediction
CCD Cross-reacting Carbohydrate Determinant
CDR Complementarity Determining Region
FEIA Fluoro-Enzyme Immuno Assay
Mal d 1 Malus domestica 1, major apple allergen
NMR Nuclear Magnetic Resonance
PDB Protein Data Bank
SPT Skin Prick Test
TCR T cell receptor
WHO World Health Organization
3.1.1 Allergy
Allergy is a hypersensitivity type I disorder. It is characterized by typical clinical reactions such as hay fever, asthma, food allergies, eczema and urticaria against usually harmless substances. These manifestations are mostly mediated by Immunoglobulin E (IgE) antibodies, highly specific binding proteins produced by plasma cells. By recognizing a certain pattern on the surface of their antigen, the epitope, antibodies elicit various immune reactions against these molecules. Antibodies of the IgE class are also expressed in high quantities during infections with helminths (Erb, 2007), but are the main culprit in allergic sensitizations leading to type I hypersensitivity (Gould et al., 2003).
The allergic reaction. Antibodies of the IgE class are required for type I hypersensitivity reactions (Kay, 2000). Symptoms occurring during an allergic response, such as swelling and itching, result from mast cell and basophil degranulation. Mast cells and basophils carry Fcε receptors on their surface which bind the Fc portion of IgE antibodies. Cross-linking of surface-bound IgE on high-affinity Fcε type I receptors triggers degranulation (Helm et al., 1988). The cells release mediators such as histamine and serotonin into the surrounding tissue (Metzger, 1991; Nadler et al., 2000), which causes the symptoms typical of allergy.
3.1.2 The Context of Cross-Reactions
The phenomenon of cross-reactivity has long been known. Reports from the 1930s mention evolutionary relationship as a possible explanation for observed cross-reactivity, but state that “mere similarity would be sufficient” (Hooker and Boyd, 1934). After publications revealed that any form of gelatin, essentially a denatured protein fragment without fixed 3-D structure, was antigenic in man (Maurer, 1954), Gell and Benacerraf presumed that “any specificity which it [the protein] has must therefore be resident in the amino-acid sequence of the protein chain” (Gell and Benacerraf, 1959). In a series of publications, they assessed various aspects of cross-reactivity against native and denatured proteins. Interestingly, they found delayed type hypersensitivity skin reactions in guinea pigs sensitized to native ovalbumin when challenged with denatured ovalbumin, and vice versa (Gell and Benacerraf, 1959). Still, the denatured proteins were unable to elicit immediate type hypersensitivity reactions, meaning that the antibodies recognizing the native protein were unable to recognize linear parts thereof. This suggested that structural epitopes are more important for antigen recognition by antibodies than sequence features alone. The two even considered different parts of the proteins to be independently antigenic, coining the term “antigenic motifs” (Benacerraf and Gell, 1959).
Allergic patients commonly react to more than a single allergen; true single positive sensitizations are very rare (cf. dissertation equivalent I). To what extent this observation is caused by multiple sensitizations and to what extent by cross-reactions may be disputed. On the one hand, a TH2 response to an allergen in the TH2-tilted milieu of allergic patients facilitates sensitization against concurrently present proteins, for example different proteins in a pollen grain. In principle, T helper cells will also stimulate B cells that are reactive to a non-cross-reactive epitope. On the other hand, it can be demonstrated that co-sensitization between proteins occurring in the same material is astonishingly low (cf. dissertation equivalent II).
Cross-reactivity may not be sufficiently explained by predicting IgE cross-reactivity alone. Factors other than the protein itself, most importantly the type and time of exposure, are also crucial to sensitization (Ferreira et al., 2004). Yet these factors can neither be influenced nor measured for diagnostic purposes. Thus the question remains what diagnostic and therapeutic value IgE cross-reactivity prediction can add to the current allergy evaluation procedure.
3.1.3 The Molecular Basis
The immune system recognizes allergen antigens in two different forms:
• Antigen presenting cells (APCs) present peptide fragments of the digested antigens on MHC class II molecules as linear structures. T helper cells require this form of antigen presentation in order to elicit effector functions.
• Antibodies, and with them B cell receptors, recognize conformational epitopes on the tertiary structure of the folded antigen.
Ultimately, the activation of B cells resulting in antibody production requires both forms of recognition, as most allergens are proteins and thus thymus-dependent antigens. There is no B cell activation without help of T helper cells. For cross-reactivity the question arises which presentation leads to the generation of a cross-reactive immune response.
T cell cross-reactivity. Peptides presented on MHC class II molecules are usually 10 to 25 amino acids long. Of this fragment, only a limited number of residues is directly contacted by the T cell receptor (TCR) (Wucherpfennig and Strominger, 1995). By contrast, a high affinity antibody can form around fifteen to twenty bonds with its antigen (Davies et al., 1988; Lafont et al., 2007). Thus the T cell is, for statistical reasons alone, more likely than the B cell to encounter indistinguishable structures originating from different proteins. It can be demonstrated that some TCRs recognize not only a single peptide, but rather a limited repertoire of related peptides derived from different antigens. This cross-recognition leads to efficient activation of the T cell, for example in the setting of autoimmunity linked to viral antigens such as Multiple Sclerosis (Wucherpfennig and Strominger, 1995). Additionally, during the physiological process of positive selection in the thymus, cross-reactivity at the T cell level is a common phenomenon. T cells with low affinity for self-MHC molecules are kept alive (positive selection) while T cells with higher affinity for the self-MHC/self-peptide complex are deleted (negative selection) (Kappler et al., 1987; Kisielow et al., 1988). These findings demonstrate that cross-reactivity at the T cell level is common. An activated T helper cell potentially activates B cells presenting a range of different peptides; hence T helper cells do not limit allergen cross-reactivity.
Cross-reactive antibodies. Cross-reactive antibodies have the ability to bind an antigen different from their immunogen. Given the high specificity antibodies achieve during affinity maturation, the existence of antibodies recognizing structures different from their template structure seems surprising. A way out of this apparent contradiction is to regard antigen specificity as a quantitative rather than a qualitative concept. Antigen-antibody interactions are based on physicochemical processes dependent on spatial and electrostatic properties of both molecules’ surfaces. In this context, a perfect match in the sense of spatially and electrostatically perfectly complementary molecules, reminiscent of the lock-and-key analogy, would result in binding of maximum affinity. However, less-than-perfect surface pairs would still be able to bind, admittedly with lower affinity. An antibody may therefore be expected to bind various related and unrelated molecules given a high enough similarity. The quality of such an interaction would only differ quantitatively (pun intended) in thermodynamic properties, such as interaction rates or dissociation constants. This mechanism is commonly termed “molecular mimicry”. Thus, from this stereochemical standpoint, molecular mimicry, and with it structural similarity between proteins, builds the foundation for cross-reactivity at the level of the antibody.
The need for structural similarities for cross-reactivity at the antibody level is not only of theoretical nature but can easily be demonstrated. Structural similarity as a consequence of phylogenetic inheritance correlates well with observed cross-reactivity. When examining clinically well known cross-reactions such as the apple-birch cross-reactivity, their major allergenic proteins Bet v 1 and Mal d 1 exhibit potential epitopes for cross-reactive antibodies as identified by crystal structure and sequence analysis (Holm et al., 2001). Figure 3.1 demonstrates the similarities between the backbones of Bet v 1 [1] and Mal d 1 [2]. Further examples are highly conserved proteins such as taxonomically related group I grass pollen allergens, which demonstrate a high degree of cross-reactivity (Laffer et al., 1994, 1996). Thus also experimentally, phylogenetic relationship and with it conserved protein domains exhibiting structural similarity are a main cause of cross-reactivity.
Figure 3.1: Cartoon models of a structural alignment of Bet v 1 (purple, PDB accession number 1B6F) and Mal d 1 (green, modeled). Mal d 1 has been modeled after Pru av 1 (PDB accession number 1E09). Orientation in view B is perpendicular to the Y-axis of view A.
Profilins and pathogenesis related proteins. Following the early investigations into the apple-birch cross-reactivity, it became clear that even much more distantly related proteins were capable of eliciting cross-reactions. For the apple-birch cross-reactivity, the proline-binding protein profilin was identified as causative allergen (Valenta et al., 1992). This protein turned out to be the most important cross-reacting allergen discovered thus far. The profilin family constitutes a type of pan-allergens sharing IgE epitopes present in most cells of eukaryotic organisms (Valenta et al., 1992). Nowadays, there are many well-known cross-reactions which can be attributed to omnipresent, evolutionarily conserved pan-allergens. These include profilins, α-amylase inhibitors, peroxidases, thiol-proteases, seed storage proteins and lectins (Breiteneder and Ebner, 2000).
[1] PDB Model 1B6F: http://www.pdb.org/pdb/explore/explore.do?structureId=1e09
[2] Protein model based on template 1E09 chain A: http://www.proteinmodelportal.org/?pid=modelDetail&pmpuid=1000000075750
PR      Classification             Example allergens
PR-2    β-1,3-Glucanases           Banana, latex, potato, tomato
PR-3    Basic chitinases           Avocado, banana, chestnut, latex
PR-4    Win-like proteins          Elderberry, turnip
PR-5    Thaumatin-like proteins    Apple, bell pepper, cherry, kiwi, mountain cedar
PR-10   Bet v 1 homologs           Apple, apricot, carrot, celery, cherry, parsley, pear, potato
PR-14   Lipid transfer proteins    Apple, barley, peach, soybean
Table 3.1: Examples of allergens homologous to pathogenesis related proteins
Hydrophobic Stickiness. It has been suggested that antibodies may be cross-reactive due to hydrophobic stickiness, a nonspecific hydrophobic interaction (Padlan, 1994). Additionally, antibodies have been demonstrated to bind a range of antigens directly related to their hydrophobicity (Barbas et al., 1997). However, such nonspecific binding cannot explain the high specificity with which antibodies are known to interact with their antigen. Furthermore, no correlation between hydrophobicity and affinity has been found in recent studies (James and Tawfik, 2003). Thus hydrophobic stickiness may contribute to cross-reactions, but is not their basis.
Post-Translational Modification. An aspect easily forgotten is that translation is not the final step in the formation of a protein from DNA. Post-translational modifications (PTM) are processes not entirely defined by the DNA sequence, but instead determined by factors of the host. A broad range of PTMs has been described. For example, these mechanisms are able to add functional groups or even entire proteins by gamma-carboxylation, change the chemical nature of amino acids by citrullination or induce structural changes, most notably by disulfide bond formation.
A PTM leading to an extraordinarily broad cross-reactivity through anti-carbohydrate responses is glycosylation, creating cross-reacting carbohydrate determinants (CCDs) on proteins from different sources. CCDs are common in plant allergens (pollen as well as food) and in Hymenoptera venoms. Even though a clinical effect of CCDs has been suggested (Fotisch et al., 1999), it is the general opinion that CCDs are not clinically relevant but must be considered when interpreting in-vitro specific IgE assays, especially in pollen and Hymenoptera venom sensitivity (Aalberse et al., 1981; Mari et al., 1999; Erzen et al., 2009). The clinical insignificance mostly stems from an inability to trigger mast cells or basophils through receptor cross-linking. CCD structures are monoglycosylated as a consequence of their small size, and thus only represent monovalent epitopes (Vieths et al., 2002), unable to establish the cross-linking.
Cross-reactions require homology. It seems clear that there is one main reason for cross-reactivity: homology. There is no relevant cross-reactivity without structural similarity (Aalberse et al., 2001). Cross-reactions without sequence similarity have so far only been demonstrated between anti-idiotypic antibodies. These antibodies do not have any similarity in the amino-acid sequences encoding their complementarity determining region (CDR) (Lescar et al., 1995). With the exception of these antibodies, cross-reactive allergens without apparent homology have not been demonstrated so far (Aalberse, 2005).
3.2 Allergenicity Testing
3.2.1 From Skin Testing to Laboratory Analysis
Skin testing. Introduced by Blackley in 1873 (Blackley, 1873), skin prick tests (SPT) are still the most widely used clinical tests to assess sensitization against a substance of interest (Neto and Rosario, 2009). Hypersensitivity type I reactions can be provoked by pricking or injecting a minute amount of allergen intradermally, usually into a patient’s forearm. Comparing the size of the wheal induced by an allergen to the size of a control wheal (usually provoked by a saline solution) makes it possible to diagnose whether a patient is sensitized against the allergen and, to a certain degree, how strong the reaction is. This allows for quick and reliable testing as it has a good negative predictive value (Sicherer and Sampson, 2010). However, SPTs are impractical for assessing the full range of sensitizations due to the number of pricks a patient would have to endure.
To not only demonstrate the presence of sensitization but also to quantitate its strength, intradermal dilutional testing (IDT) can be performed by applying various dilutions of the antigen. As one early form of IDT, skin end point titration (SET) is widely used in the diagnosis and treatment of inhalant allergies. Its efficacy in guiding desensitization immunotherapy, however, is only weakly supported by controlled experimental data (Krouse and Mabry, 2003), despite clinical expertise having shown its usefulness and effectiveness. SET is commonly used to find a safe starting dose for immunotherapy. In the case of food allergies, SET is still investigatory and is not typically used in the clinical setting, but holds promise to become a realistic diagnostic choice (Tripodi et al., 2009).
In vitro. The presence of IgE antibodies specific against an allergen is necessary but not sufficient to provoke allergic responses. In other words, sensitization does not necessarily imply clinical allergy. This, however, is a topic beyond the scope of this thesis. Nevertheless, testing patients for the presence of specific IgE is an important part of every thorough allergic assessment.
Today, the fluoro-enzyme immuno assay (FEIA) is the most widely used specific IgE detection method. It has replaced the previously used radio allergo-sorbent tests (RAST). After adding a patient’s serum sample to the test capsule, specific IgE present in the serum binds to covalently coupled allergen preparations. A fluorescently labelled anti-human IgE antibody mixture is then added and the resulting fluorescence is measured in a spectrophotometer (ImmunoCAP® system, Phadia AB, Uppsala, Sweden). IgE quantities are expressed in kilo-units of allergen-specific IgE per liter (kUA/L), where 1 unit corresponds to 2.4 ng of IgE (Pastorello et al., 1995).
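The unit definition above can be turned into a mass concentration with a one-line conversion. The following is a minimal sketch, assuming only the factor of 2.4 ng of IgE per unit quoted in the text; the function name is hypothetical and not part of any assay software.

```python
NG_IGE_PER_UNIT = 2.4  # from the text: 1 unit corresponds to 2.4 ng of IgE


def kua_per_l_to_ng_per_ml(kua_per_l: float) -> float:
    """Convert a specific IgE reading from kUA/L to ng/mL.

    1 kUA/L = 1000 UA/L = 1 UA/mL, so the numeric value carries over
    unchanged and only the unit-to-mass factor has to be applied.
    """
    if kua_per_l < 0:
        raise ValueError("IgE concentration cannot be negative")
    return kua_per_l * NG_IGE_PER_UNIT
```

Because kilo-units per liter and units per milliliter are numerically equal, the conversion needs no additional scaling factor.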
The allergen preparations used in these assays are mixtures of proteins which are prepared from biological extracts and are known to be heterogeneous, often also containing non-allergenic proteins (Chapman et al., 2000). They can even be contaminated with allergens from different sources. For these reasons, the use of highly purified natural or even recombinant allergen proteins has been promoted. Recombinant allergen proteins are an attractive choice because their pure form promotes reproducibility and standardization (Hamilton, 2010) and makes it possible to determine exactly which proteins a patient is sensitized against. The latest microarray chip technology utilizing recombinant proteins will be discussed in more detail in section 3.2.3.
3.2.2 Quantifying Cross-Reactivity
In a first study (Dissertation Equivalent I) we have utilized a large database of specific IgE values obtained by FEIA (ImmunoCAP®) to evaluate the degree of sensitization against various allergens and their relationship to cross-reactivity. We found that allergen cross-reactions might be much more common than generally assumed. For some extracts we found that well over 80% of the patients who tested positive also tested positive against extracts presumably cross-reacting with the original extract. Furthermore, with an increasing number of extracts tested, the percentage of sera sensitized against only one single allergen extract decreased from approximately 10% for sera tested against 10 to 20 extracts to 1.6% for sera tested against at least 90 extracts. This suggests that the true number of single positive sera must be low and therefore the rate of co-sensitization and presumably the rate of cross-reactivity must be high. We concluded that using allergen extracts for cross-reactivity assessment might introduce a certain bias as an extract contains a number of allergenic and non-allergenic proteins. Therefore, an assessment at the protein level using recombinant proteins would be desirable.
3.2.3 Allergy Array Test System
The ability to clone and purify single proteins has recently opened the door to component-resolved diagnostics (CRD) (Valenta et al., 1999; van Hage-Hamsten and Pauli, 2004). CRD has been commercialized in the form of a microarray chip, the Immuno Solid-phase Allergen Chip (ISAC®) (Hiller et al., 2002). Firstly, CRD makes it possible to identify the disease-eliciting protein and not only the extract potentially containing many different proteins. Secondly, CRD in the form of the ISAC system makes it possible to determine sensitivity against a broad panel of allergens in a single measurement. Currently available chips contain 103 different purified allergen molecules. There are plans to extend this number further in pursuit of an allergen screening test covering the widest range of allergens possible and necessary.
Testing a patient’s serum against 103 purified proteins yields a sensitization pattern that makes it possible to study the relationship between co-sensitization and cross-reaction patterns further. In our second study (Dissertation Equivalent II) we analyzed the sensitization pattern of 3’142 patients, determined by ISAC evaluations. The focus of our analysis lay on the relationship between predicted cross-reactions and observed co-reactions, as described in section 3.3. We found a high correlation between predicted and observed reactions, which further validates the use of probabilistic sequence motifs for allergenicity prediction of new proteins.
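The comparison of predicted cross-reactions with observed co-reactions can be illustrated with a small sketch. It is not the analysis used in Dissertation Equivalent II: the co-reaction measure (sera positive for both proteins divided by sera positive for at least one), the function names and the toy data are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean


def co_reaction_rate(sera, a, b):
    """Share of sera positive for both a and b among sera positive for either.

    `sera` is a list of sets, each holding the proteins one serum reacted to.
    This Jaccard-style measure is an assumption, not the thesis' statistic.
    """
    both = sum(1 for s in sera if a in s and b in s)
    either = sum(1 for s in sera if a in s or b in s)
    return both / either if either else 0.0


def compare_pairs(sera, proteins, predicted_pairs):
    """Return mean co-reaction rate for (predicted, non-predicted) protein pairs."""
    predicted = {frozenset(p) for p in predicted_pairs}
    rates = {True: [], False: []}
    for a, b in combinations(proteins, 2):
        rates[frozenset((a, b)) in predicted].append(co_reaction_rate(sera, a, b))
    if not rates[True] or not rates[False]:
        raise ValueError("need at least one predicted and one non-predicted pair")
    return mean(rates[True]), mean(rates[False])


# Hypothetical toy data: four sera tested against three proteins.
sera = [
    {"Bet v 1", "Mal d 1"},
    {"Bet v 1", "Mal d 1", "Phl p 1"},
    {"Phl p 1"},
    {"Bet v 1"},
]
proteins = ["Bet v 1", "Mal d 1", "Phl p 1"]
predicted_rate, other_rate = compare_pairs(sera, proteins, [("Bet v 1", "Mal d 1")])
```

In this toy data set the pair predicted to cross-react co-reacts more often than the remaining pairs, which is the kind of difference the study tested for on real ISAC data.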
3.3 Allergenicity Prediction
3.3.1 Necessity
Risk in biotechnology. Allergenicity is one of the most frequently raised questions in connection with the safety of genetically modified (GM) foods (FAO and WHO, 2001). Consequently, allergenicity assessment of GM foods is one of the most important parts of risk assessment in biotechnology, in line with the evaluation of direct toxicological and nutritional effects. The importance of the allergenicity aspect has become especially clear after the inadvertent generation of an allergenic soy plant by transfer of a Brazil nut allergen (Nordlee et al., 1996). As a result of this assessment, development of said soy plant was abandoned and the organism was never introduced to the food chain.
Naturally, the amino-acid sequences of GM foods are known. Hence the most obvious choice to assess potential allergenicity is by sequence comparison to known allergens. Significant similarity between transgenic and known allergen sequence would predict the transgene to be allergenic itself or to cross-react with known allergens. The question arises what constitutes a “significant similarity”.
A Joint FAO (Food and Agriculture Organization of the United Nations)/WHO (World Health Organization) Expert Consultation on Foods Derived from Biotechnology devised guidelines for allergenicity evaluation. According to these guidelines, a novel protein is regarded as allergenic if:
a) it has an identity of at least six contiguous amino acids or
b) more than 35% sequence similarity over a window of 80 amino acids
compared to any known allergen [3]. However, this method proved to be of low precision, predicting more than 40% of human proteins as allergens (Stadler and Stadler, 2003). In the same study, a new sequence-based method was proposed, which is introduced in section 3.4.1. Various alternative allergenicity prediction methods utilizing different statistical models have been published in the following years (Li et al., 2004; Riaz et al., 2005; Thomas et al., 2005; Saha and Raghava, 2006; Zhang et al., 2006; Kong et al., 2007; Cui et al., 2007; Schein et al., 2007; Barrio et al., 2007; Tong and Tammi, 2008; Lim et al., 2008; Muh et al., 2009; Ivanciuc et al., 2009).
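The two criteria can be sketched in a few lines of code, which also makes it apparent why they flag so many proteins: a single shared hexapeptide is enough. This is a simplified illustration and not the procedure prescribed by the guidelines. In particular, criterion b) is approximated here by counting identical residues in ungapped 80-residue windows, whereas real assessments rely on gapped local alignments; all function names are hypothetical.

```python
def shares_kmer(query: str, allergen: str, k: int = 6) -> bool:
    """Criterion a): any stretch of k contiguous identical residues."""
    kmers = {allergen[i:i + k] for i in range(len(allergen) - k + 1)}
    return any(query[i:i + k] in kmers for i in range(len(query) - k + 1))


def max_window_identity(query: str, allergen: str, window: int = 80) -> float:
    """Highest fraction of identical residues in any ungapped window.

    Assumption: identities are counted per window and divided by the full
    window length, so overlaps shorter than the window are penalized.
    """
    best = 0.0
    # Each offset is one diagonal: query[i] is compared with allergen[i + offset].
    for offset in range(-(len(query) - 1), len(allergen)):
        start = max(0, -offset)
        end = min(len(query), len(allergen) - offset)
        matches = [query[i] == allergen[i + offset] for i in range(start, end)]
        running = 0
        for j, is_match in enumerate(matches):
            running += is_match
            if j >= window:
                running -= matches[j - window]  # drop residue leaving the window
            best = max(best, running / window)
    return best


def is_flagged(query: str, allergen: str) -> bool:
    """True if either simplified criterion is met for this allergen."""
    return shares_kmer(query, allergen) or max_window_identity(query, allergen) > 0.35
```

A query would be screened by calling `is_flagged` against every sequence in a list of known allergens and reporting it as potentially allergenic as soon as one comparison returns true.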
Significance in standard allergy assessment. Cross-reactivity also has several implications for clinical allergy assessment. Not only can cross-reactions in vitro create clinically non-significant results, as for example with CCDs, but determining the original sensitizing agent may be complicated by true cross-reactions. Identification of the primary sensitizing allergen, however, is likely to be relevant, because desensitization against the “true sensitizer” may cover the widest spectrum of specificities (Aalberse et al., 2001) and is likely to relieve symptoms provoked by other allergens as well (Asero, 1998). Additionally, the ability to predict potential cross-reactions may alleviate the need for multiple allergen tests. These predictions would probably comprise a broader range of allergens than direct testing and would therefore also caution the patient about unforeseen allergic (cross-)reactions. Thus, allergen cross-reactivity prediction would constitute a valuable asset in clinical allergy diagnosis.
3.3.2 Epitope Focused Prediction
As mentioned above, T cell epitopes possess a certain potential in inducing cross-reactions. However, cross-reactivity at the T cell level is abundant and therefore not the limiting step in cross-reactivity induction. The key point whether cross-reactivity occurs rather lies at the level of the antibody, therefore cross-reactivity prediction efforts should commence at the antibody level.
Cross-reacting antibodies are able to recognize epitopes on different proteins, given that these epitopes are stereochemically similar enough. As we have seen, the sin-
[3] Full report: http://www.who.int/foodsafety/publications/biotech/en/ec_jan2001.
3.3.3 Structure-Sequence Relationship
When Pascarella and Argos in 1991 grouped protein structures with similar main-chain fold, they found that proteins in the same group exhibited strong sequence and functional similarities. Not only did this strongly imply their evolution from a common ancestor (Pascarella and Argos, 1992), their findings confirmed that structural similarities are reflected in sequence similarity.
Current State of the PDB. If similar sequences encode the same main-chain fold, then a large number of sequences has to encode only a limited number of folds. Indeed, when comparing the number of protein structures with the number of different folds in the Protein Data Bank (PDB), this reasoning seems to hold true. During the first decade of this millennium (2000-2009) the number of structures contained in the PDB increased from 9’749 to 57’613, an almost six-fold increase. In the same timeframe, the number of unique folds grew from 622 to 1’393, still a more than two-fold increase. The yearly increase, however, stagnated at around 100 new folds until 2006, dropped to 6 in 2008 and since then no new folds have been added (as of January 2011). This declining rate is documented in figure 3.2, where it can also be seen that the number of new structures characterized yearly keeps increasing. This observation roughly coincides with early predictions that the majority of proteins stems from no more than a thousand different families (Chothia, 1992; Aloy and Russell, 2004), representing the structural building blocks of protein evolution.
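The growth factors quoted above follow directly from the four counts given in the text. The short calculation below reproduces them and adds the ratio of structures per fold, which is not stated in the text but is derived from the same figures; the variable names are chosen for illustration only.

```python
# Counts as quoted in the text for the years 2000 and 2009.
structures_2000, structures_2009 = 9749, 57613
folds_2000, folds_2009 = 622, 1393

# Growth factors over the decade.
structure_growth = structures_2009 / structures_2000  # "almost six-fold"
fold_growth = folds_2009 / folds_2000                 # "more than two-fold"

# Average number of deposited structures per unique fold (derived quantity).
structures_per_fold_2000 = structures_2000 / folds_2000
structures_per_fold_2009 = structures_2009 / folds_2009

print(f"structures grew {structure_growth:.2f}-fold, folds {fold_growth:.2f}-fold")
print(f"structures per fold: {structures_per_fold_2000:.1f} -> {structures_per_fold_2009:.1f}")
```

The rising number of structures per fold expresses the same point as the paragraph: newly solved structures increasingly fall into folds that are already known.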
3.3.4 Identifying Conserved Domains
A correlation between structural similarity and amino-acid sequence seems plausible, as mentioned above. However, measuring sequence similarity as percent identity in sequence alignments is too simplistic. As an illustration, proteins of the globin family have evolved diversely in different species, yet still fold into the same general 3-D pattern. Their amino-acid sequences are identical in only very few residues; some globins differ from others in as many as 130 of the approximately 150 positions (Dickerson and Geis, 1983). This means that the “globin fold” is encoded by various amino-acid sequences that barely resemble each other. Thus, a more dedicated approach to inferring structural similarity from sequence similarity, termed “generalized profiles”, was introduced.
3. Scientific Overview 18
[Figure 3.2 (plot not reproduced): "Structures and folds newly added to the PDB (1990 - 2010)"; x-axis: year.]
Figure 3.2: The number of newly added structures and folds to the PDB, per year. The numbers for the years 1990-2010 have been extracted from data available at http://www.rcsb.org/pdb/statistics/contentGrowthChart.do?content=molType-protein&seqid=100 (Yearly Growth of Protein Structures) and http://www.rcsb.org/pdb/statistics/contentGrowthChart.do?content=fold-scop (Growth Of Unique Folds Per Year As Defined By SCOP (v1.75))
Generalized Profiles Generalized profiles are a very sensitive method for detecting even distant protein relationships by sequence comparison (Gribskov et al., 1987). In order to identify homologous proteins, candidate sequences are compared not with a single sequence but with a profile constructed from a family of related sequences. The profiles themselves are derived from multiple alignments of an initial sequence pool and contain the following information:
• Which residues are allowed at which position
• How important each position is
• Which positions allow insertions
• Which positions may be dispensable
As such, generalized profiles may describe common characteristics of even distantly related protein sequences. Proteins which contain the desired motif can be identified by comparing their sequence to the profile.
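To make the idea of a profile concrete, the sketch below derives per-position log-odds scores from a small multiple alignment. It is a deliberately reduced illustration and not the generalized-profile format used by pftools: it assumes an ungapped alignment, a uniform background frequency and a simple pseudocount, and it omits the position-specific insertion and deletion information listed above. The function name `build_pssm` and the toy alignment are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_pssm(alignment, pseudocount=1.0):
    """Return one {residue: log-odds score} dict per alignment column.

    Assumes an ungapped alignment of equal-length sequences and a uniform
    background frequency; both are simplifications for illustration.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(alignment) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for column in zip(*alignment):
        counts = Counter(column)
        # Positive scores mark residues that are enriched at this position.
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm

# Toy family of aligned sequences (made up, not real allergen data).
alignment = ["MKVLA", "MKILA", "MRVLG", "MKVLA"]
pssm = build_pssm(alignment)
```

In this toy case the fully conserved first column scores M highly and penalizes all other residues, while the second column tolerates both K and R, which is the kind of position-specific information a single query sequence cannot carry.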
3.3.5 Motifs and General Profiles
The profile information is stored in a position-specific scoring matrix (PSSM). This matrix can be used to search a sequence database for occurrences of the motif. As the name “scoring matrix” implies, comparing sequences to PSSMs returns a match score, a quantitative rather than a qualitative estimation for the relationship between a protein sequence and the profile. In order to decide whether a query sequence contains a motif, the significance of the profile-sequence match has to be evaluated by defining threshold levels for the match score. Sequences scoring above the match score threshold likely contain the motif and are therefore predicted to be phylogenetically related.
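The thresholding step described above can be sketched as follows. This is a minimal illustration, assuming a gap-free PSSM represented as a list of per-position score dictionaries; the penalty for unlisted residues, the toy matrix and the function names are assumptions made for the example and do not reproduce the pfscan implementation.

```python
def best_match(sequence, pssm, default=-4.0):
    """Slide the PSSM along the sequence; return (best_score, start_index).

    Residues not listed in a column receive the `default` penalty.
    Returns None if the sequence is shorter than the matrix.
    """
    width = len(pssm)
    if len(sequence) < width:
        return None
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(col.get(res, default) for col, res in zip(pssm, window))
        if best is None or score > best[0]:
            best = (score, start)
    return best

def contains_motif(sequence, pssm, threshold):
    """Qualitative call derived from the quantitative match score."""
    hit = best_match(sequence, pssm)
    return hit is not None and hit[0] >= threshold

# Hypothetical three-position matrix, for illustration only.
toy_pssm = [{"M": 2.0}, {"K": 1.5, "R": 0.5}, {"V": 1.0, "I": 0.8}]
print(best_match("AAMKVAA", toy_pssm))
print(contains_motif("AAMKVAA", toy_pssm, threshold=4.0))
```

The choice of `threshold` is exactly the difficulty discussed next: the raw score only becomes interpretable once it is related to the scores expected by chance.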
Obtaining relevant cut-off levels is a difficult task. We used an approach based on the probability of finding a profile in nature, substituting ‘nature’ with a randomized database. Our randomized database was created from Uniprot4 release 44.0 by regional shuffling using a window of 20 amino-acid residues, thereby preserving size, sequence length distribution and amino-acid composition (Pearson and Lipman, 1988). This approach has been criticized for introducing significant bias, such as over-fitting, and for failing to reflect the natural processes of random nucleotide and amino-acid replacement. Using a database of randomly selected but unshuffled sequences might therefore be an alternative worth considering (Mitrophanov and Borodovsky, 2006).
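Regional shuffling can be sketched in a few lines. The version below assumes consecutive, non-overlapping windows of 20 residues, which is one plausible reading of the procedure; the function name and the example sequence are hypothetical, and the actual tool used to shuffle the database may differ in detail.

```python
import random

def regional_shuffle(sequence, window=20, rng=None):
    """Shuffle residues within consecutive, non-overlapping windows.

    Length and overall composition are preserved exactly, and local
    composition is preserved within each window.
    """
    if rng is None:
        rng = random.Random()
    blocks = []
    for start in range(0, len(sequence), window):
        block = list(sequence[start:start + window])
        rng.shuffle(block)
        blocks.append("".join(block))
    return "".join(blocks)

# Made-up sequence for illustration; a fixed seed keeps the run repeatable.
original = "MKVLAAGIVGLLLAQSAHAEDKCPYGWTRNSFHQ" * 2
shuffled = regional_shuffle(original, window=20, rng=random.Random(1))
```

Because every window keeps its own residues, compositionally biased regions stay biased after shuffling, which is the property that distinguishes regional from global shuffling.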
Scoring all amino acid sequences of the randomized database against the profile returns an empirical score frequency distribution. By fitting an extreme value distribution (EVD, Gumbel distribution) to these scores, an E-value can be deduced for each score with the formula:

\[ E(x, A) = A \times 10^{-R_1 - R_2 x} \]

where

\[ R_1 = \frac{\ln A - \ln N - \lambda\mu}{\ln 10}, \qquad R_2 = \frac{\lambda}{\ln 10} \]
The E-value associates an expected number of chance hits with each score, i.e. the number of hits with scores exceeding x in a database with A residues (Pagni and Jongeneel, 2001). N corresponds to the number of sequences in the database, λ and µ are characteristics of the EVD. The E-value parameters R1 and R2 can then be used by profile search algorithms to return normalized scores5. Unlike raw scores, the normalized scores can be compared between different profiles and a threshold value separating significant from random matches can be defined.
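The relationship between the EVD parameters and the normalization parameters can be illustrated numerically. The sketch below rests on two stated assumptions rather than on the pftools source: the Gumbel tail is approximated so that the expected number of chance hits is N·exp(−λ(x − µ)), and equating this with A × 10^(−R1 − R2·x) gives R2 = λ / ln 10 and R1 = (ln A − ln N − λµ) / ln 10. The parameters are estimated by the method of moments for simplicity, the scores are simulated rather than real, and all function names and database sizes are hypothetical.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329

def fit_gumbel_moments(scores):
    """Estimate (lambda, mu) of a Gumbel distribution by the method of moments."""
    sigma = statistics.stdev(scores)
    lam = math.pi / (sigma * math.sqrt(6.0))
    mu = statistics.mean(scores) - EULER_GAMMA / lam
    return lam, mu

def normalization_parameters(lam, mu, n_sequences, n_residues):
    """R1 and R2 under the tail approximation E = N * exp(-lam * (x - mu))."""
    r1 = (math.log(n_residues) - math.log(n_sequences) - lam * mu) / math.log(10.0)
    r2 = lam / math.log(10.0)
    return r1, r2

def normalized_score(raw_score, r1, r2):
    return r1 + r2 * raw_score

def e_value(raw_score, r1, r2, n_residues):
    return n_residues * 10.0 ** (-normalized_score(raw_score, r1, r2))

# Simulated raw scores drawn from a Gumbel distribution (illustration only).
rng = random.Random(0)
true_lam, true_mu = 0.25, 30.0
scores = [
    true_mu - math.log(-math.log(max(rng.random(), 1e-300))) / true_lam
    for _ in range(20000)
]

lam_hat, mu_hat = fit_gumbel_moments(scores)
N, A = 20000, 7_000_000  # hypothetical database: sequences and residues
r1, r2 = normalization_parameters(lam_hat, mu_hat, N, A)
print(f"lambda={lam_hat:.3f} mu={mu_hat:.2f} R1={r1:.3f} R2={r2:.4f}")
print(f"E-value of raw score 60: {e_value(60.0, r1, r2, A):.3g}")
```

A threshold on the normalized score then has the same meaning for every profile, which is what makes scores comparable across motifs.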
4 Uniprot: http://www.uniprot.org/
5 pftools Nscore calculation: http://www.isrec.isb-sib.ch/profile/scoredoc.html
3.4.1 Motif Calculation
The theoretical background to allergen cross-reactivity prediction has been described in section 3.3.5. We have utilized an approach previously developed at our institute (Stadler and Stadler, 2003), which aimed at identifying potentially allergenic novel proteins in GM food. This approach uses MEME6 (Bailey and Elkan, 1994) and the pftools7 (Bucher et al., 1996) in an iterative fashion as shown in figure 3.3.
[Figure 3.3 (diagram not reproduced): an iterative cycle with the labelled elements "MEME", "pftools" and "remove matching sequences".]
Figure 3.3: Iterative motif discovery. Allergen sequences are downloaded from Allergome and clustered using cd-hit. MEME analyzes all sequences in the run set and identifies the most significant motif. This motif is scaled against a randomized database using pfscale and stored as a PSSM. The run set is scanned using pfscan and all matching sequences are removed from the set. With the remaining sequences this cycle is repeated until no more significant motifs are found.
Three changes to the original approach have been applied:
First, almost five times as many allergen sequences are available now as at the inception of the original approach. This increase in sequences was accompanied by an increase in isoforms. We therefore decided to cluster sequences with an identity of 90% or more using cd-hit8 (Li et al., 2001) before submitting them to our iterative motif discovery. This clustering not only reduces the required calculation time, which rises exponentially with the number of sequences, but also prevents the generation of motifs consisting entirely of isoforms.
6 MEME is available from http://meme.sdsc.edu/meme/
7 pftools are available from http://www.isrec.isb-sib.ch/ftp-server/pftools/
8 cd-hit is available from http://www.bioinformatics.org/cd-hit/

Second, in order to allow several motifs per protein sequence, all protein sequences are screened against all discovered motifs. In the original approach, only the motif discovered during the iterative process was assigned to a protein.
Third, MEME was allowed to choose a variable motif length of 35 to 70 amino acid residues (cf. dissertation equivalent II).
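The control flow of the iterative discovery, including the final screening of all sequences against all motifs, can be sketched as follows. The motif finder and the matcher are passed in as functions; `toy_discover` and `toy_matches` are hypothetical stand-ins based on shared 3-mers and do not call or emulate MEME, pfscale or pfscan. Clustering with cd-hit is assumed to have happened beforehand.

```python
from collections import Counter

def iterative_motif_discovery(sequences, discover_motif, matches, max_rounds=100):
    """Repeat: find the best motif, remove the sequences it matches.

    discover_motif(run_set) returns a motif or None when nothing
    significant is left; matches(motif, sequence) returns True or False.
    """
    run_set = dict(sequences)  # id -> sequence, not yet covered by a motif
    motifs = []
    for _ in range(max_rounds):
        if not run_set:
            break
        motif = discover_motif(run_set)
        if motif is None:
            break
        matched = [sid for sid, seq in run_set.items() if matches(motif, seq)]
        if not matched:  # guard against a loop that makes no progress
            break
        motifs.append(motif)
        for sid in matched:
            del run_set[sid]
    # Screen every sequence against every motif, so that one protein
    # can be assigned several motifs.
    assignments = {
        sid: [m for m in motifs if matches(m, seq)]
        for sid, seq in sequences.items()
    }
    return motifs, assignments

def toy_discover(run_set, k=3, min_support=2):
    """Stand-in motif finder: the k-mer shared by the most sequences."""
    counts = Counter()
    for seq in run_set.values():
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    if not counts:
        return None
    kmer, support = max(counts.items(), key=lambda item: (item[1], item[0]))
    return kmer if support >= min_support else None

def toy_matches(motif, seq):
    return motif in seq

sequences = {
    "a": "MKVLAGG", "b": "TTMKVPP", "c": "GGWYFQQ",
    "d": "WYFMKVA", "e": "CCCCCCC", "f": "AWYFA",
}
motifs, assignments = iterative_motif_discovery(sequences, toy_discover, toy_matches)
print(motifs)
print(assignments["d"])
```

In the toy data, sequence "d" is removed from the run set by the first motif found, yet the final screening still assigns it both motifs, which is the effect of the second change described above.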
3.4.2 Web Interface
The work presented here requires data from different sources. PSSMs representing allergen motifs (1) are calculated from allergen sequences (2) retrieved from an online database, Allergome9. These calculations are compared against wet lab data (3) obtained in the form of spreadsheet files. In order to work efficiently with this diverse data, source material was processed and stored in a MySQL10 database. A web-based front-end in the PHP11 and JavaScript programming languages was built in order to allow quick data lookups. It is publicly accessible from our institute’s website: http://www.iib.unibe.ch/allergen/. Figure 3.4 shows a screenshot of the front-end.
3.5 Outlook: Protein Surface Comparison
The rationale behind predicting cross-reactivity by sequence similarity mostly stems from the broad availability of sequence data. A prediction closer to nature would compare the surfaces of two proteins directly and judge the possibility of a cross-reaction from that comparison. After all, structure is evolutionarily more conserved than sequence (Holm and Sander, 1996). The low number of available structures compared to the number of sequences has made this approach unfeasible so far. However, this proportion is changing over time. More and more protein structures are being determined by X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy. Additionally, structures not yet experimentally determined may be inferred from ab initio protein folding prediction and homology modeling with increasing reliability.
3.5.1 Ab initio Protein Folding Prediction
Figure 3.4: Screenshot of the web-based front-end. The website is built with Web 2.0 technologies and allows looking up allergen extracts, proteins and motifs via live search. Furthermore, custom protein sequences can be checked for occurrences of allergen motifs.

9 Allergome: http://www.allergome.org/
10 MySQL is available from http://www.mysql.com/downloads/mysql/
11 PHP is available from http://www.php.net/downloads.php

Ab initio protein folding prediction is the prediction of yet unknown protein structures from amino-acid sequences alone. Despite the vast number of possible conformations for each sequence, proteins generally fold into a unique native state, their thermodynamically most stable conformation. The dihedral angles φ and ψ may each assume one of three stable positions, hence knowledge of the amino-acid sequence is potentially sufficient to predict the native fold of a protein. The idea of letting a computer test all possible conformations comes to mind. This computer would choose the thermodynamically most favorable conformation, so finding a protein’s native state would merely be a question of available computer time. However, an amino-acid sequence of 100 residues may hypothetically fold into 3^198 potential conformations (three states for each of the 99 φ and 99 ψ angles). If a protein were to fold into each of these conformations in order to find its native state, it would have to fold for a time period much longer than the age of our known universe, even if it used only picoseconds per state (cf. the Levinthal paradox). As of December 2010, the fastest supercomputer in the world can perform 2’570 calculations per picosecond12. This machine would have to calculate for more than 10^71 years, even when oversimplifying one complete structure comparison to one clock cycle. It is evident that this brute-force approach is infeasible.
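The orders of magnitude in this estimate can be reproduced directly. The sketch below uses the figures from the text (100 residues, three states per dihedral angle, 2.566 petaflops) and adopts the same oversimplification of one conformation per operation; the length of a year in seconds and the variable names are the only additions.

```python
import math

residues = 100
states_per_angle = 3
angles = 2 * (residues - 1)                  # 99 phi and 99 psi angles
conformations = states_per_angle ** angles   # exact integer, 3**198

ops_per_second = 2.566e15                    # 2.566 petaflops
seconds_per_year = 365.25 * 24 * 3600

# Work in log10 to keep the magnitudes readable.
log10_conformations = angles * math.log10(states_per_angle)
log10_years = (log10_conformations
               - math.log10(ops_per_second)
               - math.log10(seconds_per_year))

print(f"conformations ~ 10^{log10_conformations:.1f}")
print(f"years at one conformation per operation ~ 10^{log10_years:.1f}")
```

Both results agree with the magnitudes given in the text: roughly 10^94 conformations and more than 10^71 years of computing time.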
When applying evolutionary information and stochastic methods to this approach, the calculations still require vast computational resources and so far have only been
12 2.566 petaflops. Rank 1 in November 2010’s TOP500 list of the world’s most powerful supercomputers: http://www.top500.org/lists/2010/11
3.5.2 Homology Modeling
A different approach to bioinformatic protein structure prediction is homology modeling, or template-based modeling. In contrast to ab initio prediction, homology modeling predicts the protein structure via comparison to a template structure. Therefore, the existence of similar structures in the PDB is a necessity for a successful prediction. Identifying and aligning the best template structure (termed threading or fold recognition) is the first important step towards a correct prediction. Not surprisingly, the most frequently used of the many threading approaches rely on sequence profile-profile alignments to identify phylogenetically related structure templates (Skolnick et al., 2004; Jaroszewski et al., 2005). Zhang and Skolnick recently showed that high-quality full-length models can be built for all protein targets with an average root mean square deviation (rmsd) of 2.25 Å (Zhang and Skolnick, 2005). This suggests that the structural universe of the current PDB library is essentially complete for solving the protein structure problem, at least for single-domain proteins.
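For readers unfamiliar with the measure, the rmsd between a model and a reference structure can be computed as in the following minimal sketch. It assumes that both coordinate sets list the same atoms in the same order and have already been superimposed; published evaluations first determine an optimal superposition (for example with the Kabsch algorithm), a step omitted here. The function name and the coordinates are illustrative.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two paired coordinate lists.

    Each list holds (x, y, z) tuples in Angstrom; atoms are paired by
    index and the structures are assumed to be superimposed already.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Made-up coordinates: the model is the reference shifted by (3, 4, 0).
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 1.0, 0.5)]
model = [(x + 3.0, y + 4.0, z) for x, y, z in reference]
print(rmsd(reference, model))
```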
The protein folding prediction field has quite literally turned into a sport, with different research groups trying to best each other in predicting structures in a biennial large-scale experiment known as the Critical Assessment of Techniques for Protein Structure Prediction (CASP)13. The advances in the field already make it possible to predict yet unknown protein structures from sequence with astonishing accuracy. For cross-reactivity prediction, homology modeling may provide protein structures even for novel proteins, which may subsequently be used to seek surface epitopes. Using the above-mentioned rmsd of 2.25 Å as a reference, the accuracy of the predicted structures is potentially high enough for the prediction of cross-reactive epitopes, given that the antibody-antigen binding surface encompasses almost 1’000 Å² (Davies et al., 1988; Braden and Poljak, 1995). Therefore it seems feasible to substitute computationally predicted structures for protein structures which have not yet been experimentally determined.
13 CASP: http://predictioncenter.org/
3.5.3 Prediction of Similar Surfaces
Predicting the fold of a protein however is only half the story. After the generation of the tertiary structure, the proteins’ B-cell epitopes have to be identified and subsequently, these epitopes have to be compared in order to identify cross-reacting proteins. A full molecular docking prediction is not needed as we are not interested in the binding capacity of an antibody, but merely the similarity of two protein surfaces.
The first problem, accurately predicting B-cell epitopes, is a major challenge, especially in vaccine development. Even though recent publications propose improved epitope prediction methods (Scarabelli et al., 2010; Fiorucci and Zacharias, 2010), the field has apparently not yet achieved a level of reliability that would allow laboratory experiments to be forgone (Bryson et al., 2010). Whether the reliability would be high enough for cross-reactivity prediction would have to be determined. In any case, a similarity search on entire protein surfaces, as opposed to searching only epitopes, might eliminate the need to identify epitopes in the first place.
Thus a last problem persists: comparing the surfaces of two proteins and identifying similar patches. Several approaches to this problem have been proposed, some purely geometrical (e.g. spin-image representations (Bock et al., 2007), geometric invariant fingerprints (Yin et al., 2009)), others respecting electrostatic properties (e.g. the adaptive Poisson-Boltzmann solver (Baker et al., 2001)). It would certainly be interesting to apply these techniques to 3-D allergen structures in order to identify potentially cross-reacting surface patches.
References
Klaus J Erb. Helminths, allergic disorders and ige-mediated immune responses: where do we stand? Eur J Immunol, 37(5):1170–3, May 2007. doi: 10.1002/eji.200737314.
Hannah J Gould, Brian J Sutton, Andrew J Beavil, Rebecca L Beavil, Natalie McCloskey, Heather A Coker, David Fear, and Lyn Smurthwaite. The biology of ige and the basis of allergic disease. Annu Rev Immunol, 21:579–628, Jan 2003. doi: 10.1146/annurev.immunol.21.120601.141103. URL http://www.annualreviews.org/doi/abs/10.1146/annurev.immunol.21.120601.141103.
A B Kay. Overview of ’allergy and allergic diseases: with a view to the future’. Br Med Bull, 56(4):843–64, 2000. URL http://www.ncbi.nlm.nih.gov/pubmed/11359624.
Birgit Helm, Philip Marsh, Donata Vercelli, Eduardo Padlan, Hannah Gould, and Raif Geha. The mast cell binding site on human immunoglobulin e. Nature, 331(6152):180, Jan 1988. doi: 10.1038/331180a0. URL http://www.nature.com/nature/journal/v331/n6152/abs/331180a0.html.
H Metzger. The high affinity receptor for ige on mast cells. Clin Exp Allergy, 21(3):269–79, May 1991.
M J Nadler, S A Matthews, H Turner, and J P Kinet. Signal transduction by the high-affinity immunoglobulin e receptor fc epsilon ri: coupling form to function. Adv Immunol, 76:325–55, Jan 2000.
Sanford B Hooker and William C Boyd. The existence of antigenic determinants of diverse specificity in a single protein. Journal of Immunology, 26:469–79, 1934. URL http://www.jimmunol.org/content/26/6/469.abstract.
P H Maurer. I. antigenicity of oxypolygelatin and gelatin in man. J Exp Med, 100(5):497–513, Nov 1954. URL http://www.ncbi.nlm.nih.gov/pubmed/13211910.
P G H Gell and B Benacerraf. Studies on hypersensitivity. ii. delayed hypersensitivity to denatured proteins in guinea pigs. Immunology, 2(1):64–70, Jan 1959. URL http://www.ncbi.nlm.nih.gov/pubmed/13640681.
B Benacerraf and P G H Gell. Studies on hypersensitivity. i. delayed and arthus-type skin reactivity to protein conjugates in guinea pigs. Immunology, 2(1):53–63, Jan 1959. URL http://www.ncbi.nlm.nih.gov/pubmed/13640680.
F Ferreira, T Hawranek, P Gruber, N Wopfner, and Adriano Mari. Allergic cross-reactivity: from gene to the clinic. Allergy, 59(3):243–67, Mar 2004. doi: 10.1046/j.1398-9995.2003.00407.x.
K W Wucherpfennig and J L Strominger. Molecular mimicry in t cell-mediated autoimmunity: viral peptides activate human t cell clones specific for myelin basic protein. Cell, 80(5):695–705, Mar 1995.
D R Davies, S Sheriff, and E A Padlan. Antibody-antigen complexes. J Biol Chem, 263(22):10541–4, Aug 1988. URL http://www.jbc.org/content/263/22/10541.long.
Virginie Lafont, Michael Schaefer, Roland H Stote, Daniele Altschuh, and Annick Dejaegere. Protein-protein recognition and interaction hot spots in an antigen-antibody complex: free energy decomposition identifies “efficient amino acids”. Proteins, 67(2):418–34, May 2007. doi: 10.1002/prot.21259. URL http://www.ncbi.nlm.nih.gov/pubmed/17256770.
J W Kappler, N Roehm, and P Marrack. T cell tolerance by clonal elimination in the thymus. Cell, 49(2):273–80, Apr 1987.
P Kisielow, H S Teh, H Bluthmann, and H von Boehmer. Positive selection of antigen-specific t cells in thymus by restricting mhc molecules. Nature, 335(6192):730–3, Oct 1988. doi: 10.1038/335730a0. URL http://www.nature.com/nature/journal/v335/n6192/abs/335730a0.html.
J Holm, G Baerentzen, M Gajhede, H Ipsen, J N Larsen, H Løwenstein, M Wissenbach, and M D Spangfort. Molecular basis of allergic cross-reactivity between group 1 major allergens from birch and apple. J Chromatogr B Biomed Sci Appl, 756(1-2):307–13, May 2001. URL http://www.ncbi.nlm.nih.gov/pubmed/11419722.
S Laffer, R Valenta, S Vrtala, M Susani, R van Ree, D Kraft, O Scheiner, and M Duchene. Complementary dna cloning of the major allergen phl p i from timothy grass (phleum pratense); recombinant phl p i inhibits ige binding to group i allergens from eight different grass species. J Allergy Clin Immunol, 94 (4):689–98, Oct 1994. URL http://www.ncbi.nlm.nih.gov/pubmed/7930302.
S Laffer, M Duchene, I Reimitzer, M Susani, C Mannhalter, D Kraft, and R Valenta. Common ige-epitopes of recombinant phl p i, the major timothy grass pollen allergen and natural group i grass pollen isoallergens. Mol Immunol, 33(4-5):417–26, Jan 1996. URL http://www.ncbi.nlm.nih.gov/pubmed/8676893.
R Valenta, M Duchene, C Ebner, P Valent, C Sillaber, P Deviller, F Ferreira, M Tejkl, H Edelmann, and D Kraft. Profilins constitute a novel family of functional plant pan-allergens. J Exp Med, 175(2):377–85, Feb 1992.
H Breiteneder and C Ebner. Molecular and biochemical classification of plant-derived food allergens. J Allergy Clin Immunol, 106(1 Pt 1):27–36, Jul 2000. doi: 10.1067/mai.2000.106929.
T Midoro-Horiuti, E G Brooks, and R M Goldblum. Pathogenesis-related proteins of plants as allergens. Ann. Allergy Asthma Immunol., 87(4):261–71, Oct 2001. doi: 10.1016/S1081-1206(10)62238-7.
1998.00325.x/pdf.
E A Padlan. Anatomy of the antibody molecule. Molecular Immunology, 31(3): 169–217, 1994. URL http://www.ncbi.nlm.nih.gov/pubmed/8114766.
C F Barbas, A Heine, G Zhong, T Hoffmann, S Gramatikova, R Bjornestedt, B List, J Anderson, E A Stura, I A Wilson, and R A Lerner. Immune versus natural selection: antibody aldolases with enzymic rates but broader scope. Science, 278(5346):2085–92, Dec 1997. URL http://www.sciencemag.org/content/278/5346/2085.long.
Leo C James and Dan S Tawfik. The specificity of cross-reactivity: promiscuous antibody binding involves specific hydrogen bonds rather than nonspecific hydrophobic stickiness. Protein Science : A Publication of the Protein Society, 12(10):2183–93, Oct 2003. doi: 10.1110/ps.03172703. URL http://www.ncbi.nlm.nih.gov/pubmed/14500876.
K Fotisch, F Altmann, D Haustein, and S Vieths. Involvement of carbohydrate epitopes in the ige response of celery-allergic patients. Int Arch Allergy Immunol, 120(1):30–42, Sep 1999. URL http://content.karger.com/produktedb/produkte.asp?typ=fulltext&file=iaa20030.
Rob C Aalberse, V Koshte, and J G J Clemens. Immunoglobulin e antibodies that crossreact with vegetable foods, pollen, and hymenoptera venom. Journal of Allergy and Clinical Immunology, 68(5):356–364, 1981. doi: 10.1016/0091-6749(81)90133-0. URL http://www.ncbi.nlm.nih.gov/pubmed/7298999.
A Mari, P Iacovacci, C Afferni, B Barletta, R Tinghino, G Di Felice, and C Pini. Specific ige to cross-reactive carbohydrate determinants strongly affect the in vitro diagnosis of allergic diseases. J Allergy Clin Immunol, 103(6):1005–11, Jun 1999. URL http://linkinghub.elsevier.com/retrieve/pii/S0091674999003486.
Renato Erzen, Peter Korosec, Mira Silar, Ema Music, and Mitja Kosnik. Carbohydrate epitopes as a cause of cross-reactivity in patients allergic to hymenoptera venom. Wiener klinische Wochenschrift, 121(9-10):349–52, Jan 2009. doi: 10.1007/s00508-009-1171-1.
Stefan Vieths, Stephan Scheurer, and Barbara Ballmer-Weber. Current understanding of cross-reactivity of food allergens and pollen. Ann N Y Acad Sci, 964:47–68, May 2002.
Rob C Aalberse, J Akkerdaas, and R van Ree. Cross-reactivity of ige antibodies to allergens. Allergy, 56(6):478–90, Jun 2001.
J Lescar, M Pellegrini, H Souchon, D Tello, R J Poljak, N Peterson, M Greene, and P M Alzari. Crystal structure of a cross-reaction complex between fab f9.13.7 and guinea fowl lysozyme. J Biol Chem, 270(30):18067–76, Jul 1995. URL http://www.jbc.org/content/270/30/18067.long.
Rob C Aalberse. Assessment of sequence homology and cross-reactivity. Toxicol Appl Pharmacol, 207(2 Suppl):149–51, Sep 2005. doi: 10.1016/j.taap.2005.01.021.
C H Blackley. Experimental researches on the causes and nature of catarrhus aestivus. Baillière, Tindall, & Cox, 1873.
H J Chong Neto and N A Rosario. Studying specific ige: in vivo or in vitro. Allergologia et immunopathologia, 37(1):31–5, Jan 2009. URL http://www.elsevier.es/revistas/ctl_servlet?_f=7014&articuloid=13133446.
Scott H Sicherer and Hugh A Sampson. Food allergy. J. Allergy Clin. Immunol., 125(2 Suppl 2):S116–25, Feb 2010. doi: 10.1016/j.jaci.2009.08.028. URL http://www.ncbi.nlm.nih.gov/pubmed/20042231.
John H Krouse and Richard L Mabry. Skin testing for inhalant allergy 2003: current strategies. Otolaryngol Head Neck Surg, 129(4 Suppl):S33–49, Oct 2003. URL http://www.ncbi.nlm.nih.gov/pubmed/14574280.
S Tripodi, A Di Rienzo Businco, C Alessandri, V Panetta, P Restani, and P M Matricardi. Predicting the outcome of oral food challenges with hen’s egg through skin test end-point titration. Clin Exp Allergy, 39(8):1225–33, Aug 2009. doi: 10.1111/j.1365-2222.2009.03250.x. URL http://onlinelibrary.wiley.com/doi/10.1111/j.1365-2222.2009.03250.x/abstract.
E A Pastorello, C Incorvaia, C Ortolani, S Bonini, G W Canonica, S Romagnani, A Tursi, and C Zanussi. Studies on the relationship between the level of specific ige antibodies and the clinical expression of allergy: I. definition of levels distinguishing patients with symptomatic from patients with asymptomatic allergy to common aeroallergens. J Allergy Clin Immunol, 96(5 Pt 1):580–7, Nov 1995. URL http://linkinghub.elsevier.com/retrieve/pii/S0091-6749(95)70255-5.
Martin D Chapman, A M Smith, L D Vailes, L K Arruda, V Dhanaraj, and A Pomes. Recombinant allergens for diagnosis and therapy of allergic disease. J Allergy Clin Immunol, 106(3):409–18, Sep 2000. doi: 10.1067/mai.2000.109832. URL http://linkinghub.elsevier.com/retrieve/pii/S0091674900564069.
20176264.
R Valenta, J Lidholm, V Niederberger, B Hayek, D Kraft, and H Gronlund. The recombinant allergen-based concept of component-resolved diagnostics and immunotherapy (crd and crit). Clin Exp Allergy, 29(7):896–904, Jul 1999.
M van Hage-Hamsten and G Pauli. Provocation testing with recombinant allergens. Methods, 32(3):281–91, Mar 2004. doi: 10.1016/j.ymeth.2003.08.007. URL http://www.ncbi.nlm.nih.gov/pubmed/14962763.
Reinhard Hiller, Sylvia Laffer, Christian Harwanegg, Martin Huber, Wolfgang M Schmidt, Anna Twardosz, Bianca Barletta, Wolf M Becker, Kurt Blaser, Heimo Breiteneder, Martin Chapman, Reto Crameri, Michael Duchene, Fatima Ferreira, Helmut Fiebig, Karin Hoffmann-Sommergruber, Te Piao King, Tamara Kleber-Janke, Viswanath P Kurup, Samuel B Lehrer, Jonas Lidholm, Ulrich Muller, Carlo Pini, Gerald Reese, Otto Scheiner, Annika Scheynius, Horng-Der Shen, Susanne Spitzauer, Roland Suck, Ines Swoboda, Wayne Thomas, Raffaela Tinghino, Marianne Van Hage-Hamsten, Tuomas Virtanen, Dietrich Kraft, Manfred W Muller, and Rudolf Valenta. Microarrayed allergen molecules: diagnostic gatekeepers for allergy treatment. FASEB J., 16(3):414–6, Mar 2002. doi: 10.1096/fj.01-0711fje. URL http://www.fasebj.org/content/early/2002/03/02/fj.01-0711fje.long.
FAO and WHO. Evaluation of allergenicity of genetically modified foods. report of a joint fao/who expert consultation on allergenicity of foods derived from biotechnology. Jan 2001.
J A Nordlee, S L Taylor, J A Townsend, L A Thomas, and R K Bush. Identification of a brazil-nut allergen in transgenic soybeans. N. Engl. J. Med., 334(11):688–92, Mar 1996. doi: 10.1056/NEJM199603143341103. URL http://www.nejm.org/doi/full/10.1056/NEJM199603143341103.
Michael B Stadler and Beda M Stadler. Allergenicity prediction by protein sequence. FASEB J., 17(9):1141–3, Apr 2003. doi: 10.1096/fj.02-1052fje. URL http://www.fasebj.org/cgi/content/abstract/02-1052fjev1.
Kuo-Bin Li, Praveen Issac, and Arun Krishnan. Predicting allergenic proteins using wavelet transform. Bioinformatics, 20(16):2572–8, Nov 2004. doi: 10.1093/bioinformatics/bth286. URL http://bioinformatics.oxfordjournals.org/cgi/reprint/20/16/2572.
Tariq Riaz, Hen Ley Hor, Arun Krishnan, Francis Tang, and Kuo-Bin Li. Weballergen: a web server for predicting allergenic proteins. Bioinformatics, 21(10):2570–1, May 2005. doi: 10.1093/bioinformatics/bti356. URL http://bioinformatics.oxfordjournals.org/cgi/content/full/21/10/2570.
Karluss Thomas, Gary Bannon, Susan Hefle, Corinne Herouet, Michael Holsapple, Gregory Ladics, Sue Macintosh, and Laura Privalle. In silico methods for evaluating human allergenicity to novel proteins: International bioinformatics workshop meeting report, 23-24 february 2005. Toxicological Sciences, 88(2):307–10, Dec 2005. doi: 10.1093/toxsci/kfi277.
Sudipto Saha and G P S Raghava. Algpred: prediction of allergenic proteins and mapping of ige epitopes. Nucleic Acids Res, 34(Web Server issue):W202–9, Jul 2006. doi: 10.1093/nar/gkl343. URL http://nar.oxfordjournals.org/cgi/content/full/34/suppl_2/W202.
ZH Zhang, JL Koh, GL Zhang, KH Choo, MT Tammi, and JC Tong. Allertool: a web server for predicting allergenicity and allergic cross-reactivity in proteins. Bioinformatics, Dec 2006. doi: 10.1093/bioinformatics/btl621. URL http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btl621v1.
Waiming Kong, Tsu Soo Tan, Lawrence Tham, and Keng Wah Choo. Improved prediction of allergenicity by combination of multiple sequence motifs. In Silico Biol (Gedrukt), 7(1):77–86, Jan 2007. URL http://www.bioinfo.de/isb/2006070006/.
Juan Cui, Lian Yi Han, Hu Li, Choong Yong Ung, Zhi Qun Tang, Chan Juan Zheng, Zhi Wei Cao, and Yu Zong Chen. Computer prediction of allergen proteins from sequence-derived protein structural and physicochemical properties. Mol Immunol, 44(4):514–20, Jan 2007. doi: 10.1016/j.molimm.2006.02.010.
Catherine H. Schein, Ovidiu Ivanciuc, and Werner Braun. Bioinformatics approaches to classifying allergens and predicting cross-reactivity. Immunology and allergy clinics of North America, 27(1):1, Feb 2007. doi: 10.1016/j.iac.2006.11.005. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1941676.
Alvaro Martinez Barrio, Daniel Soeria-Atmadja, Anders Nister, Mats G Gustafs- son, Ulf Hammerling, and Erik Bongcam-Rudloff. Evaller: a web server for in silico assessment of potential protein allergenicity. Nucleic Acids Res, 35 (Web Server issue):W694–700, Jul 2007. doi: 10.1093/nar/gkm370. URL http://nar.oxfordjournals.org/cgi/content/full/35/suppl_2/W694.
Joo Chuan Tong and Martti T Tammi. Prediction of protein allergenicity using local description of amino acid sequence. Front Biosci, 13:6072–8, Jan 2008. URL http://www.bioscience.org/2008/v13/af/3138/fulltext.htm.
Shen Jean Lim, Joo Chuan Tong, Fook Tim Chew, and Martti T Tammi. The value of position-specific scoring matrices for assessment of protein allegenicity. BMC Bioinformatics, 9 Suppl 12:S21, Jan 2008. doi: 10.1186/1471-2105-9-S12-S21.
Hon Cheng Muh, Joo Chuan Tong, and Martti T Tammi. Allerhunter: a svm-pairwise system for assessment of allergenicity and allergic cross-reactivity in proteins. PLoS ONE, 4(6):e5861, Jan 2009. doi: 10.1371/journal.pone.0005861. URL http://www.plosone.org/article/info%253Adoi%252F10.1371%252Fjournal.pone.0005861.
R Asero. Effects of birch pollen-specific immunotherapy on apple allergy in birch pollen-hypersensitive patients. Clin Exp Allergy, 28(11):1368–73, Nov 1998. URL http://onlinelibrary.wiley.com/doi/10.1046/j.1365-2222.1998.00399.x/abstract.
S Pascarella and P Argos. A data bank merging related protein structures and sequences. Protein Eng, 5(2):121–37, Mar 1992. URL http://peds.oxfordjournals.org/content/5/2/121.long.
C Chothia. Proteins. one thousand families for the molecular biologist. Nature, 357(6379):543–4, Jun 1992. doi: 10.1038/357543a0. URL http://www.nature.com/nature/journal/v357/n6379/abs/357543a0.html.
Patrick Aloy and Robert B Russell. Ten thousand interactions for the molecular biologist. Nature Biotechnology, 22(10):1317, Oct 2004. doi: 10.1038/nbt1018. URL http://www.nature.com/nbt/journal/v22/n10/full/nbt1018.html.
R E Dickerson and I Geis. Hemoglobin: structure, function, evolution and pathology. Benjamin/Cummings Publishing Co., Inc., 1983.
M Gribskov, A D McLachlan, and D Eisenberg. Profile analysis: detection of distantly related proteins. Proc Natl Acad Sci USA, 84(13):4355–8, Jul 1987. URL http://www.pnas.org/cgi/reprint/84/13/4355.
W R Pearson and D J Lipman. Improved tools for biological sequence comparison. Proc Natl Acad Sci USA, 85(8):2444–8, Apr 1988. URL http://www.pnas.org/content/85/8/2444.long.
Alexander Yu Mitrophanov and Mark Borodovsky. Statistical significance in biological sequence analysis. Brief Bioinformatics, 7(1):2–24, Mar 2006.
M Pagni and C V Jongeneel. Making sense of score statistics for sequence alignments. Brief Bioinformatics, 2(1):51–67, Mar 2001. URL http://bib.oxfordjournals.org/content/2/1/51.long.
Timothy L Bailey and C Elkan. Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proceedings / International Conference on Intelligent Systems for Molecular Biology ; ISMB International Conference on Intelligent Systems for Molecular Biology, 2:28–36, Jan 1994. URL http://www.ncbi.nlm.nih.gov/pubmed/7584402?dopt=abstract.
P Bucher, K Karplus, N Moeri, and K Hofmann. A flexible motif search technique based on generalized profiles. Comput Chem, 20(1):3–23, Mar 1996. URL http://www.ncbi.nlm.nih.gov/pubmed/8867839.
W Li, L Jaroszewski, and A Godzik. Clustering of highly homologous sequences to reduce the size of large protein databases. Bioinformatics (Oxford, England), 17(3):282–3, Mar 2001.
L Holm and C Sander. Mapping the protein universe. Science, 273(5275):595–603, Aug 1996. URL http://www.sciencemag.org/content/273/5275/595.long.
Yang Zhang. Progress and challenges in protein structure prediction. Curr Opin Struct Biol, 18(3):342–8, Jun 2008. doi: 10.1016/j.sbi.2008.02.004.
Bojan Zagrovic, Christopher D Snow, Michael R Shirts, and Vijay S Pande. Simulation of folding of a small alpha-helical protein in atomistic detail using worldwide-distributed computing. J Mol Biol, 323(5):927–37, Nov 2002. URL http://www.ncbi.nlm.nih.gov/pubmed/12417204.
Michael Shirts and Vijay S Pande. Screen savers of the world unite! Science, 290(5498):1903–4, 2000. doi: 10.1126/science.290.5498.1903. URL http://www.ncbi.nlm.nih.gov/pubmed/17742054.
Vijay S Pande. A simple theory of protein folding kinetics. 2010. URL http://arxiv.org/abs/1007.0315.
Jeffrey Skolnick, Daisuke Kihara, and Yang Zhang. Development and large scale benchmark testing of the prospector 3 threading algorithm. Proteins, 56(3):502–18, Aug 2004. doi: 10.1002/prot.20106. URL http://onlinelibrary.wiley.com/doi/10.1002/prot.20106/abstract;jsessionid=7D0BBD01853416611E9A418F01E41F82.d02t02.
Lukasz Jaroszewski, Leszek Rychlewski, Zhanwen Li, Weizhong Li, and Adam Godzik. Ffas03: a server for profile–profile sequence alignments. Nucleic Acids Res, 33(Web Server issue):W284–8, Jul 2005. doi: 10.1093/nar/gki418. URL http://nar.oxfordjournals.org/content/33/suppl_2/W284.long.
Yang Zhang and Jeffrey Skolnick. The protein structure prediction problem could be solved using the current pdb library. Proc Natl Acad Sci USA, 102(4):1029–34, Jan 2005. doi: 10.1073/pnas.0407152101. URL http://www.pnas.org/content/102/4/1029.long.
B C Braden and R J Poljak. Structural features of the reactions between antibodies and protein antigens. FASEB J., 9(1):9–16, Jan 1995.
Guido Scarabelli, Giulia Morra, and Giorgio Colombo. Predicting interaction sites from the energetics of isolated proteins: a new approach to epitope mapping. Biophys J, 98(9):1966–75, May 2010. doi: 10.1016/j.bpj.2010.01.014. URL http://www.ncbi.nlm.nih.gov/pubmed/20441761.
Sebastien Fiorucci and Martin Zacharias. Prediction of protein-protein interaction sites using electrostatic desolvation profiles. Biophys J, 98(9):1921–30, May 2010. doi: 10.1016/j.bpj.2009.12.4332. URL http://www.ncbi.nlm.nih.gov/pubmed/20441756.
Mary Ellen Bock, Claudio Garutti, and Concettina Guerra. Discovery of similar regions on protein surfaces. J Comput Biol, 14(3):285–99, Apr 2007. doi: 10.1089/cmb.2006.0145.
S Yin, E A Proctor, A A Lugovskoy, and N V Dokholyan. Fast screening of protein surfaces using geometric invariant fingerprints. Proceedings of the National Academy of Sciences, 106(39):16622–16626, Sep 2009. doi: 10.1073/pnas.0906146106. URL http://www.pnas.org/content/106/39/16622.full.
N A Baker, D Sept, S Joseph, M J Holst, and J A McCammon. Electrostatics of nanosystems: application to microtubules and the ribosome. Proc Natl Acad Sci USA, 98(18):10037–41, Aug 2001. doi: 10.1073/pnas.181342398. URL http://www.pnas.org/content/98/18/10037.long.
Dissertation Equivalent I
Pfiffner P, Truffer R, Matsson P, Rasi C, Mari A, Stadler BM. Allergen cross reactions: a problem greater than ever thought? Allergy 2010; 65: 1536–1544.
Dissertation Equivalent II
Pfiffner P, Stadler BM, Rasi C, Scala E, Mari A. Allergen clustering evaluated by in silico motifs or in vitro IgE microarray testing using highly purified allergens. Manuscript in preparation.
ORIGINAL ARTICLE EXPERIMENTAL ALLERGY AND IMMUNOLOGY
Allergen cross reactions: a problem greater than ever thought?
P. Pfiffner¹, R. Truffer¹, P. Matsson², C. Rasi³, A. Mari³,⁴ & B. M. Stadler¹
¹University Institute of Immunology, Bern, Switzerland; ²Phadia AB, Uppsala, Sweden; ³Center for Clinical and Experimental Allergology, IDI-IRCCS, Rome; ⁴Allergy Data Laboratories s.c., Latina, Italy
To cite this article: Pfiffner P, Truffer R, Matsson P, Rasi C, Mari A, Stadler BM. Allergen cross reactions: a problem greater than ever thought? Allergy 2010;
65: 1536–1544.
Keywords
Immunology, Sahli Haus 2, Inselspital, 3010 Bern, Switzerland.
E-mail: [email protected]
DOI: 10.1111/j.1398-9995.2010.02420.x
Abstract
Background: Cross reactions are an often observed phenomenon in patients with allergy. Sensitization against some allergens may cause reactions against other seemingly unrelated allergens. Today, cross reactions are being investigated on a per-case basis, analyzing blood serum specific IgE (sIgE) levels and clinical features of patients suffering from cross reactions. In this study, we evaluated the level of sIgE compared to patients’ total IgE assuming epitope specificity is a consequence of sequence similarity.
Methods: Our objective was to evaluate our recently published model of molecular sequence similarities underlying cross reactivity using serum-derived data from IgE determinations of standard laboratory tests. We calculated the probabilities of protein cross reactivity based on conserved sequence motifs and compared these in silico predictions to a database consisting of 5362 sera with sIgE determinations.
Results: Cumulating sIgE values of a patient resulted in a median of 25–30% of total IgE. Comparing motif cross reactivity predictions to sIgE levels showed that on average three times fewer motifs than extracts were recognized in a given serum (correlation coefficient: 0.967). Extracts belonging to the same motif group co-reacted in a high percentage of sera (up to 80% for some motifs).
Conclusions: Cumulated sIgE levels are exaggerated because of a high level of observed cross reactions. Thus, not only bioinformatic prediction of allergenic motifs, but also serological routine testing of allergic patients implies that the immune system may recognize only a small number of allergenic structures.
Allergy 65 (2010) 1536–1544 © 2010 John Wiley & Sons A/S
4. Results – Dissertation Equivalents
Determination of specific IgE in patient sera is a valuable test for allergologists (1). The number of potential allergens is steadily increasing (2) and suppliers of allergy tests are providing ever longer lists of allergenic preparations to be used for in vitro assays. In most instances, allergens are still relatively crude extracts of organisms or parts thereof (3). Recently, allergen diagnosis has improved by the use of highly purified natural or recombinant allergens and protein microarrays (3–8). This may improve allergy diagnostics in the future.
Cross reactions are allergic reactions against other allergens without prior sensitization. They have been extensively studied and a handful of well-defined cross reactivity syndromes are clinically highly important, e.g., the pollen-food syndromes (9). Cross reactions between recombinant allergens are also documented (3, 7, 10). Thus, the immune system might recognize common structures, which would make it possible to predict allergic reactions that have not been tested physically but are inferred from similarity.
In earlier work, we were able to define a much lower number of potentially allergenic structures, termed motifs, than the number of known protein sequences of allergens (11). These motifs represent a scaled profile over a window of 50 amino acids, derived from all currently known allergen protein sequences. They serve as an identifier for evolutionarily conserved protein domains. Consequently, if protein sequences match a given motif, these proteins are predicted to fold into the same protein domain and therefore exhibit similar surface structures. We showed that this method of cross reactivity prediction is superior to the FAO/WHO rule, which states that a protein is allergenic if it has either an identity of at least six contiguous amino acids or more than 35% sequence similarity over a window of 80 amino acid residues. Especially in view of false positive matches (67.3% of all Swiss-Prot proteins were predicted to be allergenic by the FAO/WHO rule), the motif-based approach performed much better (2.6% predicted to be allergenic) (11).
Thus, the question remains whether the in silico prediction of allergenicity can be confirmed by wet lab data. For this purpose, we have analyzed 5362 sera corresponding to 203 283 specific IgE determinations. We could demonstrate that the degree of cross reaction was greater than ever thought.
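For illustration, the FAO/WHO rule quoted in the introduction can be sketched in a few lines of Python. This is a simplified reading and not the procedure used in (11): the 35% criterion is approximated here by ungapped identity over aligned 80-residue windows, whereas a real assessment would rely on a gapped alignment. All function names are hypothetical.

```python
def shares_kmer(query: str, allergen: str, k: int = 6) -> bool:
    """True if the two sequences share an identical stretch of k residues."""
    kmers = {allergen[i:i + k] for i in range(len(allergen) - k + 1)}
    return any(query[i:i + k] in kmers for i in range(len(query) - k + 1))


def max_window_identity(query: str, allergen: str, window: int = 80) -> float:
    """Highest fraction of identical residues in any ungapped window.

    Assumption: ungapped diagonals stand in for a real (gapped) alignment.
    Returns 0.0 if either sequence is shorter than the window.
    """
    best = 0.0
    # offset = allergen index minus query index along one diagonal
    for offset in range(window - len(query), len(allergen) - window + 1):
        q0, a0 = max(0, -offset), max(0, offset)
        length = min(len(query) - q0, len(allergen) - a0)
        if length < window:
            continue
        same = [query[q0 + i] == allergen[a0 + i] for i in range(length)]
        count = sum(same[:window])
        best = max(best, count / window)
        for i in range(window, length):  # slide the window by one residue
            count += same[i] - same[i - window]
            best = max(best, count / window)
    return best


def fao_who_positive(query: str, allergens: list[str]) -> bool:
    """Flag a protein if either criterion is met against any known allergen."""
    return any(
        shares_kmer(query, a) or max_window_identity(query, a) > 0.35
        for a in allergens
    )
```

Because a single shared hexamer is enough to flag a protein, a rule of this kind is expected to produce many chance matches, which is consistent with the high false positive rate reported above.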
Materials and methods
Data on 5456 serum samples were obtained by testing for IgE using Phadia’s ImmunoCAP (formerly UniCAP, Phadia AB, Uppsala, Sweden) systems. These are sandwich immunoassay systems in which serum IgE antibodies react with anti-IgE covalently coupled to the system in the case of total IgE determination, or with solid-phase bound allergen extracts to determine specific IgE. Bound antibodies are detected and quantified using enzyme-labeled anti-IgE antibodies and fluorescence detection.
Tests were performed between 1988 and 2006 in different laboratories in 17 countries. Raw, anonymized IgE data (without age, sex, or other demographic and clinical information) were collected for quality assurance; therefore, no selection criteria were applied. Test results were collected in a clinical setting; most sera are presumably from patients with atopy.
All IgE levels are expressed in kilo units of antigen per liter serum (kUA/l). Specific IgE levels >0.35 kUA/l (Class I and higher) were regarded as a positive test result; levels >100 kUA/l were capped at 100 kUA/l, which affected 1578 values.
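The positivity threshold and the capping described above amount to two simple rules. The following Python sketch shows them under the stated values; the constant and function names are hypothetical and not taken from the original scripts.

```python
POSITIVE_CUTOFF = 0.35  # kUA/l; values above this count as Class I or higher
CAP = 100.0             # kUA/l; upper limit applied to specific IgE values


def cap_sige(value: float) -> float:
    """Cap a specific IgE value at 100 kUA/l."""
    return min(value, CAP)


def is_positive(value: float) -> bool:
    """A test is positive if the specific IgE level exceeds 0.35 kUA/l."""
    return value > POSITIVE_CUTOFF


def count_capped(values: list[float]) -> int:
    """Number of raw values that would be affected by the cap."""
    return sum(v > CAP for v in values)
```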
Included in the database were serum levels for 99 allergens as well as the total IgE level. According to the manufacturer, the 99 allergen extracts used to determine the specific IgE values are the 99 most tested allergens among a list of more than 700 allergens available in Phadia’s catalog. Table 1 lists the extracts and groups them into major subsets.
To be included in the final database, sera had to be tested for total IgE, tested against at least 10 different allergens, and yield at least one positive specific IgE test result. With a total of 203 283 specific IgE tests, 5362 sera met our criteria and were used for the analysis.
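The three inclusion criteria can be expressed as a single filter. The sketch below assumes each serum is represented by its total IgE value (or `None` if not determined) and a dictionary of specific IgE results keyed by extract code; this data layout and the function name are assumptions made for illustration.

```python
def is_eligible(total_ige, specific_ige, min_tests=10, cutoff=0.35):
    """Apply the inclusion criteria to one serum.

    total_ige:    total IgE in kUA/l, or None if it was not determined
    specific_ige: dict mapping extract code -> specific IgE in kUA/l
    """
    if total_ige is None:
        return False                      # total IgE must be available
    if len(specific_ige) < min_tests:
        return False                      # at least 10 different allergens
    return any(v > cutoff for v in specific_ige.values())  # one positive test
```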
Databases and software
We created a MySQL database to hold the serum data (MySQL 5.0, obtained from http://www.mysql.com/). Allergen protein sequences were extracted from the Allergome database (http://www.allergome.org/ as of January 2009). MEME 3.5.7 (12) (obtained from http://meme.sdsc.edu/meme/) and pftools 2.3.4 (13) (obtained from http://www.isrec.isb-sib.ch/ftp-server/pftools/) were used for the iterative allergen motif discovery. Perl 5.8.8 (http://www.perl.org/), PHP 5.2+ (http://www.php.net/), and R 2.8 (14) (http://www.r-project.org/) scripts were created to perform the desired statistical calculations.
Allergen motifs were identified according to Stadler and Stadler (11) using 2189 protein sequences from Allergome. These sequences are known to encode allergenic proteins (2). We identified 97 motifs with a length of 50 amino acid residues. Of the sequences used for identification, 304 did not match an allergen motif; 96 of these were shorter than 50 amino acids and therefore could not match a motif, and 26 additional proteins were known to encode only protein fragments.
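To make the idea of matching a sequence against a fixed-width motif concrete, the following Python sketch scans a sequence with an ungapped position-specific profile. It is a toy illustration only: the motifs in this study were derived with MEME and pftools, whose generalized profiles and score calibration are not reproduced here, and the function name, the `missing` penalty and the profile representation are assumptions. The sketch also shows why sequences shorter than the motif width cannot produce a match.

```python
def best_profile_hit(sequence, profile, missing=-1.0):
    """Return (score, start) of the best ungapped window, or None if too short.

    profile: list of dicts, one per motif position, mapping residue -> score.
    missing: assumed score for residues absent from a profile column.
    """
    width = len(profile)
    if len(sequence) < width:
        return None  # e.g. a sequence shorter than a 50-residue motif
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(col.get(res, missing) for col, res in zip(profile, window))
        if best is None or score > best[0]:
            best = (score, start)
    return best
```

A sequence would then be counted as matching the motif if the best score exceeds a threshold; how such a threshold is chosen is outside the scope of this sketch.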
To identify the motifs present in each allergen extract, the proteins used in the motif identification process had to be linked to the extracts in which they occur. This linking was achieved by first matching the allergen extract to the corresponding allergen source within Allergome and then assigning the proteins to the extract as defined by Allergome (15).
Our database now made it possible to computationally determine which motifs occur in which extract(s), and therefore which extracts are likely to cross react due to their structural similarity.
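The linking step described above can be sketched as two small mappings: proteins to motifs, and extracts to proteins. From these, the motifs per extract and the extract pairs predicted to cross react follow directly. The Python sketch below uses plain dictionaries in place of the MySQL database; the function names and the identifiers in the usage note are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def motifs_per_extract(extract_proteins, protein_motifs):
    """Map each extract to the set of motifs matched by its proteins."""
    return {
        extract: set().union(*(protein_motifs.get(p, set()) for p in proteins))
        for extract, proteins in extract_proteins.items()
    }


def predicted_cross_reactions(extract_motifs):
    """Return {(extract_a, extract_b): shared motifs} for pairs sharing a motif."""
    by_motif = defaultdict(set)
    for extract, motifs in extract_motifs.items():
        for motif in motifs:
            by_motif[motif].add(extract)
    pairs = defaultdict(set)
    for motif, extracts in by_motif.items():
        for a, b in combinations(sorted(extracts), 2):
            pairs[(a, b)].add(motif)
    return dict(pairs)
```

With `extract_proteins = {"x1": ["p1", "p2"], "x2": ["p3"]}` and `protein_motifs = {"p1": {1}, "p2": {2}, "p3": {1}}`, extracts `x1` and `x2` share motif 1 and would be predicted to cross react.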
We used the data of specific and total IgE determinations comprising a total of 5362 sera. Figure 1A shows the distribution of IgE levels in this serum collection. As expected for a serum collection of allergic individuals, total IgE levels peaked above the threshold value of 100 kUA/l (16), namely between 200 and 400 kUA/l. For our study purpose, it was not necessary to define a lower cut-off of total IgE levels. Most sera were from the US (2264) and Sweden (2119), followed by Western European countries (635) and Russia (230); the remaining sera were obtained from Japan, Southern Africa or Canada (81) or were of unspecified origin (33).
All sera were tested against a subset of a panel consisting of 99 allergens, as described in Table 1. Among these sera, 1471 were tested against 90 or more of the allergens and 3448 against 10–30 allergens.
We used the traditional cutoff of 0.35 kUA/l (Class I or higher) to define a positive test against a given allergen extract. This yielded 99 276 positive values representing IgE from classes I–IV from a total of 203 283 allergen-specific IgE determinations.
While total IgE had no influence on the number of tests performed (Fig. 1B), Fig. 1C shows that the percentage of positive specific IgE tests steadily increased with increasing total IgE, resulting in more than 90% positive tests for sera with very high total IgE (>3200 kUA/l).
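Overall, 99 276 of 203 283 determinations (about 48.8%) were positive. The grouping of sera by total IgE and the per-group percentage of positive tests can be sketched as follows. The bin edges are an assumption (doubling steps consistent with the values of 100, 200, 400, 1600 and 3200 kUA/l mentioned in the text); the exact bins of Figure 1 are not given here, and the function names are hypothetical.

```python
from bisect import bisect_left
from statistics import median

# Assumed upper bin edges in kUA/l (doubling steps); the last bin is open-ended.
BIN_EDGES = [100, 200, 400, 800, 1600, 3200]


def total_ige_bin(total_ige):
    """Index of the bin containing total_ige (0 = up to 100, 6 = above 3200)."""
    return bisect_left(BIN_EDGES, total_ige)


def median_percent_positive(sera, cutoff=0.35):
    """sera: iterable of (total_ige, [sIgE values]); returns {bin: median %}."""
    per_bin = {}
    for total_ige, values in sera:
        pct = 100.0 * sum(v > cutoff for v in values) / len(values)
        per_bin.setdefault(total_ige_bin(total_ige), []).append(pct)
    return {b: median(p) for b, p in sorted(per_bin.items())}
```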
Table 1 The allergen extracts used to determine specific IgE levels. Extracts are identified by Allergome’s accession number and Phadia’s product code. The number of known proteins (excluding isoforms; in brackets, the total number of proteins including isoforms) is estimated based on available sequence data. The number of distinct motifs present within the extract is derived from the proteins it contains.
Extract | Allergome accession | Phadia code | Known proteins (incl. isoforms) | Motifs
Epidermals
Cat epithelium and dander | 1819 | e1 | 6 (8) | 3
Dog dander | 1756 | e5 | 5 (7) | 4
Guinea pig epithelium | 1765 | e6 | 2 (2) | 0
Horse dander | 1813 | e3 | 5 (6) | 3
Mouse epithelium; serum
Rat epithelium; serum and
Foods of animal origin
Beef | 2019 | f27 | 5 (5) | 3
Blue mussel | 1413 | f37 | 1 (1) | 1
Chicken | 2037 | f83 | 0 | 0
Egg white | 1832 | f1 | 6 (8) | 4
Fish (cod) | 1831 | f3 | 1 (2) | 1
Milk | 1747 | f2 | 7 (11) | 6
Pork | 2088 | f26 | 0 | 0
Scallop | 2012 | f338 | 3 (3) | 2
Shrimp | 1893 | f24 | 0 | 0
Tuna | 2375 | f40 | 0 | 0
Foods of plant origin
Almond | 1948 | f20 | 4 (5) | 2
Apple | 1871 | f49 | 4 (71) | 4
Banana | 1882 | f92 | 1 (1) | 1
Barley | 2040 | f6 | 7 (13) | 4
Brazil nut | 1738 | f18 | 2 (2) | 2
Buckwheat | 1816 | f11 | 4 (9) | 3
Cacao | 1369 | f93 | 0 | 0
Carrot | 1799 | f31 | 2 (8) | 2
Celery | 1721 | f85 | 4 (5) | 2
Cherry | 1946 | f242 | 4 (7) | 3
Coconut | 3559 | f36 | 1 (1) | 0
Garlic | 1706 | f47 | 1 (2) | 1
Gluten | 651 | f79 | 0 | 0
Hazel nut | 2028 | f17 | 7 (17) | 5
Kiwi | 1697 | f84 | 8 (21) | 3
Maize; Corn | 2092 | f8 | 4 (5) | 3
Oat | 2018 | f7 | 0 | 0
Onion | 1704 | f48 | 1 (1) | 1
Orange | 1774 | f33 | 3 (6) | 2
Pea | 1931 | f12 | 2 (4) | 1
Peach | 1949 | f95 | 3 (6) | 2
Peanut | 1723 | f13 | 13 (36) | 9
Potato | 1977 | f35 | 6 (19) | 4
Rice | 2058 | f9 | 2 (2) | 1
Rye | 2076 | f5 | 1 (2) | 0
Sesame seed | 1971 | f10 | 7 (7) | 3
Soya bean | 1834 | f14 | 13 (32) | 7
Strawberry | 2251 | f44 | 3 (14) | 3
Tomato | 1870 | f25 | 5 (12) | 5
Wheat | 1993 | f4 | 13 (29) | 8
White bean | 1923 | f15 | 0 | 0
Yeast | 1960 | f45 | 2 (2) | 1
Grass pollens
Cocksfoot | 1798 | g3 | 5 (16) | 2
Common reed | 1927 | g7 | 0 | 0
Johnson grass | 1979 | g10 | 1 (1) | 0
Sweet vernal grass | 1718 | g1 | 1 (2) | 0
Timothy | 1924 | g6 | 10 (36) | 8
Insects
Cockroach; German | 1742 | i6 | 11 (115) | 10
Microorganisms
Alternaria alternata (tenuis) | 1708 | m6 | 12 (16) | 10
Aspergillus fumigatus | 1730 | m3 | 24 (27) | 14
Aureobasidium pullulans | 2197 | m12 | 0 | 0
Botrytis cinerea | 1630 | m7 | 0 | 0
Candida albicans | 1757 | m5 | 3 (3) | 2
Cladosporium herbarum (Hormodendrum) | 1775 | m2 | 12 (13) | 7
Epicoccum purpurascens | 1810 | m14 | 2 (2) | 1
Fusarium moniliforme | 1554 | m9 | 0 | 0
Helminthosporium halodes | 1125 | m8 | 0 | 0
Mucor racemosus | 2291 | m4 | 0 | 0
Penicillium notatum | 1912 | m1 | 5 (8) | 2
Phoma betae | 2303 | m13 | 0 | 0
Rhizopus nigricans | 1622 | m11 | 0 | 0
Stemphylium botryosum | 2637 | m10 | 1 (1) | 1
Mites
Dermatophagoides pteronyssinus | 1803 | d1 | 16 (32) | 12
Dermatophagoides farinae | 1801 | d2 | 20 (72) | 14
Tree pollens
American beech | 2249 | t5 | 0 | 0
Box-elder | 2136 | t1 | 0 | 0
Common silver birch | 1741 | t3 | 6 (67) | 6
Cottonwood | 2324 | t14 | 0 | 0
Elm | 2385 | t8 | 0 | 0
Japanese cedar | 1784 | t17 | 5 (31) | 6
Maple leaf sycamore; London plane | 1932 | t11 | 3 (4) | 3
Mountain juniper | 1851 | t6 | 3 (4) | 3
Oak | 1955 | t7 | 1 (5) | 1
Olive | 1888 | t9 | 11 (97) | 6
Walnut | 2044 | t10 | 0 | 0
White ash | 2253 | t15 | 0 | 0
White pine | 2312 | t16 | 0 | 0
Willow | 2355 | t12 | 0 | 0
Venoms
Common wasp (Yellow jacket) | 2008 | i3 | 4 (5) | 4
Honey bee | 1722 | i1 | 9 (13) | 8
Weed pollens
Cocklebur | 2401 | w13 | 0 | 0
Common ragweed | 1710 | w1 | 9 (17) | 5
Dandelion | 6146 | w8 | 1 (1) | 1
Goosefoot; Lamb’s quarters | 1768 | w10 | 3 (3) | 3
Marguerite; Ox-eye daisy | 1567 | w7 | 0 | 0
Mugwort | 1728 | w6 | 6 (25) | 7
Nettle | 2390 | w20 | 0 | 0
Plantain (English); Ribwort | 1933 | w9 | 0 | 0
Saltwort (prickly); Russian thistle | 1961 | w11 | 2 (3) | 1
Scale; Lenscale | 2193 | w15 | 0 | 0
Sheep sorrel | 2353 | w18 | 0 | 0
Wall pellitory (Parietaria officinalis) | 1906 | w19 | 1 (8) | 0
Wall pellitory (Parietaria judaica) | 1904 | w21 | 4 (9) | 3
Western ragweed | 1711 | w2 | 1 (2) | 1
Cumulated specific IgE
Next, we cumulated all individual specific IgE levels within the 5362 sera and plotted them against the total serum IgE (Fig. 2A). The data show that in 88.95% of the sera, total serum IgE exceeds the cumulated specific IgE. Figure 2B shows the percentage of cumulated specific IgE when sera were grouped according to total IgE levels. Of all data, 90.2% lay within a range between >0 and 1600 kUA/l, and in this range specific IgE was between 25% and 30% of total IgE.
Sera with more than 1600 kUA/l total IgE showed a decreasing percentage of specific IgE. Thus, even though sera with high total IgE yielded a greater number of positive tests (Fig. 1C), their cumulated specific IgE fraction was lower (Fig. 2B). However, sIgE determinations were capped at 100 kUA/l and we found that with increasing total IgE, sera were more likely to contain sIgE values affected by this capping (Fig. 2C), which would lead to an underestimate of the percentage of cumulated specific IgE in these sera.
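The cumulated fraction discussed above is the sum of a serum's specific IgE values divided by its total IgE. The Python sketch below computes this percentage together with the number of values sitting at the cap, so that sera whose fraction is likely underestimated can be identified. The function name is hypothetical, and total IgE is assumed to be greater than zero.

```python
CAP = 100.0  # kUA/l, upper limit of reported specific IgE values


def cumulated_fraction(total_ige, specific_values):
    """Return (cumulated sIgE as % of total IgE, number of capped values).

    Assumes total_ige > 0 and that specific_values are already capped,
    so a value equal to CAP may stand for a higher true level.
    """
    cumulated = sum(specific_values)
    capped = sum(v >= CAP for v in specific_values)
    return 100.0 * cumulated / total_ige, capped
```

A result above 100% corresponds to the minority of sera in which cumulated specific IgE exceeds the measured total IgE.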
Cross reactivity by motifs
Based on the most recent sequence database, we found that 64 of the 99 tested allergen extracts contain proteins matching our defined motifs. Some extracts contain more than one motif, resulting in 86 motifs being associated with the tested allergen extracts. In 13 extracts, we found only one motif (Table 1). On the other hand, extract d2 (House Dust Mite, Dermatophagoides farinae) contained 14 motifs, m3 (Aspergillus fumigatus) also 14, d1 (House Dust Mite, Dermatophagoides pteronyssinus) 12, i6 (German Cockroach) 10, and so on. Excluding those extracts without a known motif, we found a median of three motifs per extract.
Figure 3A shows that the number of recognized motifs also steadily increased with higher IgE levels. On the other hand, Fig. 3B shows that approximately three times fewer motifs than extracts from different allergenic sources are recognized (correlation coefficient: 0.967).
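The comparison of recognized extracts and recognized motifs can be sketched as follows. A motif is counted as recognized by a serum if at least one positive extract contains it. The type of correlation coefficient is not stated in the text; Pearson's coefficient is assumed here, and the function names are hypothetical.

```python
from math import sqrt


def recognized_counts(positive_extracts, extract_motifs):
    """Return (number of positive extracts, number of distinct motifs in them)."""
    motifs = set()
    for extract in positive_extracts:
        motifs |= extract_motifs.get(extract, set())
    return len(positive_extracts), len(motifs)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)
```

Applying `recognized_counts` to every serum yields two lists (extract counts and motif counts) that can be passed to `pearson`.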
The question remained whether this relation between motifs and extracts is linear and is directly due to cross reactions. Our next assumption was simple: if different extracts from different sources theoretically contain the same motif, one would expect a cross reaction. For this purpose, we analyzed all motifs that occurred in three or more different extracts. Table 2 shows all motifs that fulfill this criterion. For example, motif 1 (corresponding to the group of the Bet v 1 allergen) occurs in 25 extracts of our allergen panel.
Figure 1 Grouping the sera by their total IgE content resulted in the distribution shown in (A). The amount of total IgE had no influence on the number of tests being performed (B); however, it affected the outcome of the tests (C). The lines in B and C indicate the median; the error bars represent the 25th and 75th percentile, respectively. [x-axis: total IgE content (kU/l)]
Figure 4 depicts two examples from the list in Table 2. We have chosen motif 4, occurring in 10 different extracts, and motif 8, occurring in nine different extracts, as examples. Motif 4 was chosen because it is an example of relatively high cross reaction, while motif 8 is an example of a ‘low’ cross reactive motif. Our results, depicted in spider form, show how closely related the different allergen motif-containing extracts actually are. The graphical depiction of the relationship between the motif-defined cross reactions also allows a ranking to be created. For example, for motif 4, extract f85 (Celery) shows cross reaction at an average of 92.7% with all other allergen extracts in the group, while the lowest extract (t3, Common silver birch) cross reacts at 69.6%. For illustration, we have chosen this high and low cross reactive level from Table 2, which also shows the absolute number of sera falling within this group.
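One plausible way to arrive at an average cross reaction percentage for an extract within a motif group is sketched below. The exact definition used for Figure 4 and Table 2 is not spelled out in this section, so the following is an assumption: for each other extract in the group, take the sera that are positive for the extract of interest and were also tested against the other extract, compute the percentage that is positive for the other extract, and average these percentages. The function name and data layout are hypothetical.

```python
def average_coreactivity(extract, group, sera, cutoff=0.35):
    """Average % of sera positive for `extract` that are also positive for
    each other extract in `group`.

    sera: list of dicts mapping extract code -> specific IgE (kUA/l);
          an extract missing from a dict means the serum was not tested.
    Returns None if no serum allows a comparison.
    """
    rates = []
    for other in group:
        if other == extract:
            continue
        tested = [s for s in sera
                  if s.get(extract, 0.0) > cutoff and other in s]
        if tested:
            hits = sum(s[other] > cutoff for s in tested)
            rates.append(100.0 * hits / len(tested))
    return sum(rates) / len(rates) if rates else None
```

Ranking the extracts of a group by this value gives an ordering of the kind described for motif 4.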
As cross reactions seemed to be very frequent, we analyzed how many sera recognized extracts without cross reaction. Three hundred and eighty-five sera (7.2%) were positive against one extract only. However, 12 single positive sera could not be considered as there are as yet no protein sequences containing a motif within the positively tested extract [2× guinea pig epithelium (e6); 2× elm (t8); 2× wall pellitory (w19); 1× rye (f5); 1× oat (f7); 1× shrimp (f24); 1× chicken (f83); 1× Mucor racemosus (m4); 1× walnut (t10)].
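The selection of single positive sera can be sketched as a simple filter: 385 of 5362 sera corresponds to 7.2%, and removing the 12 sera whose only positive extract has no known motif leaves 373 for further consideration. The Python sketch below illustrates the two steps; the function name and data layout are assumptions.

```python
def single_positive_sera(sera, extract_motifs, cutoff=0.35):
    """Return (all single-positive sera, those usable for motif analysis).

    sera:           list of dicts mapping extract code -> specific IgE (kUA/l)
    extract_motifs: dict mapping extract code -> set of motifs in that extract
    Each result entry is a tuple (serum index, the single positive extract).
    """
    singles, usable = [], []
    for idx, serum in enumerate(sera):
        positives = [e for e, v in serum.items() if v > cutoff]
        if len(positives) == 1:
            hit = (idx, positives[0])
            singles.append(hit)
            if extract_motifs.get(positives[0]):  # extract has a known motif
                usable.append(hit)
    return singles, usable
```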
To truly exclude cross reactions, for each of the motifs
occurring in the positive extract, the single positive sera
would have to be tested against at least one other extract the-
oretically containi