Synthetic program of the course
Structural Biology
The state of the art
Experimental approaches
X-ray crystallography
Electron microscopy
Computational approaches
Protein structure prediction methods
Molecular dynamics simulation
Structure validation
Successful examples: The ribosome machinery and the DNA structure
Rational Design of bioactive molecules
Properties of protein structures
Ligand-protein interactions
Structure-based drug discovery
Ligand-based drug discovery
Fragment-based drug discovery
Classical and recent successful examples of rational drug design
Lesson 1
Part b
Structural Biology
• Successful examples: The ribosome machinery
and the DNA structure
The structural biology and the DNA
A very brief description of (a) the gene theory and of (b) the experiments
that led to the identification of DNA as the chemical entity carrying the
hereditary traits
The discovery of the DNA double helix
The impact of this structure in life sciences
The discovery of the DNA structure
Complexity of nucleic acids: DNA and RNA
Chains of DNA are made of millions of nucleotides
The current record for sequence decoding is 2.3 million bases of human DNA in one step
Formula Cxxxxxxxx Oxxxxxxx Hxxxxxxxx Pxxxxxx Nxxxxxxxx
In fibers the molecular entities are ordered along the so-called
fiber axis
Fiber X-ray diffraction analysis
Theory
Fiber diffraction
The discovery of the alpha-helix (1951)
The fiber diffraction
pattern
The 3D model
by Pauling and Corey
Wool diffraction pattern:
meridional reflection (5.1 Å)
equatorial reflection (10 Å)
The DNA structure
• 1866: Gregor Mendel discovered that heredity is transmitted in discrete units
called “cell elements”; these elements were later termed “genes” by Johannsen
(1909);
• 1900: Mendel was re-discovered by De Vries, Correns and Tschermak.
• 1920: Thomas Hunt Morgan found hereditary traits in fruit flies and started his
experiments on radiation-induced mutagenesis;
• 1937: Max Delbrück moved to Caltech, to work with Morgan on fruit flies, but
soon he decided that this kind of research was sterile and started his own
research on phages; a group soon formed, known informally as the “Phage
group”
• 1940’s: George Beadle and Edward Tatum established that any given enzyme is
associated with just one gene (a one-to-one relationship);
• 1943: Salvador Luria and Max Delbrück published their research that proved
unequivocally that new features (phenotype) arise from random mutations
(Lamarck loses, Darwin wins ... );
Key events and concepts in understanding heredity
• 1902: William Sutton recognized that chromosomes (whose function nobody
knew) were like Mendel’s “units”;
• 1944: Avery, MacLeod and McCarty demonstrated that DNA is the substance
responsible for bacterial transformation, i.e. that it carries hereditary information;
• 1952: Alfred Hershey and Martha Chase provided further experimental evidence
that DNA and not protein is the genetic material; they showed that when
bacteriophages, which are composed of DNA and protein, infect bacteria, their
DNA enters the host bacterial cell, but most of their protein does not;
In 1922, Muller published a summary of the gene capacities
1) Autocatalysis (self-reproduction)
2) Heterocatalysis (production of non-genetic material)
3) Ability to mutate (while retaining the other two properties)
Main steps into the identification of the chemical entity
carrying the genetic information
The DNA (macro)molecule
The early days of structural biology
The state of the art of structural characterizations of
biomolecules in 1940-1950
Examples of the most complex systems unveiled at atomic level
Mid-forties: Penicillin
Mid-fifties: Vitamin B12
The birth of molecular biology, now called structural biology
In this scenario, the determination of atomic structures of
biological macromolecules was an extremely challenging
task
Some data were available but their structural interpretation
was missing – Fiber data provided only some structural hints
whereas the phase problem was a major hurdle for the
exploitation of single crystal data
It is important to note that the consideration of these
macromolecules as defined chemical entities with specific
sequences of either amino acids or nucleotides was a very
recent, and somewhat debated, concept
Biological macromolecules
Protein crystals have been known for a long time. In 1909,
crystals of hemoglobins isolated from 200 different
organisms were reported
Bernal and Hodgkin obtained the first diffraction spots from a
protein crystal (pepsin) in 1934
In 1937 Max Perutz started a Ph.D. project whose aim was the
determination of the hemoglobin crystal structure
The determination of the hemoglobin structure took 23 years
Perutz once stated that he was lucky that his
thesis supervisors and evaluators did not require him to solve
the structure in order to award the degree
Availability of protein crystals
One of the first attempts to retrieve structural features for a
protein was made by the founder of structural crystallography
(Bragg law) and by the founders of structural biology…
… it was a failure despite the scientific rank (three
Nobel laureates) of the authors
According to Linus Pauling of Caltech, the failure was due to the authors' scarce
knowledge of basic stereochemistry. He stated that probably none of them had read his
seminal textbook “The Nature of the Chemical Bond”
This was the scenario when, in 1951, James D. Watson, a
young American zoologist with a passion for bird watching,
a vague fascination for genetics and no experience
in organic and structural chemistry, decided to turn his
attention to the structure of DNA
Watson's interest in DNA began during the congress
“The Submicroscopical Structure of the Protoplasm” (May
1951), held at the Stazione Zoologica in Napoli. During this event
Maurice Wilkins delivered a lecture on X-ray diffraction of
DNA crystals, and on that occasion he showed some slides.
In Watson’s words: “Suddenly, I was excited about chemistry.
Before Maurice’s talk I had worried about the possibility that
the gene might be fantastically irregular. Now, however, I
knew that genes could crystallize; hence they must have a
regular structure that could be solved in a straightforward
fashion.” (Watson, 1968).
An old picture of the
Stazione Zoologica
Anton Dohrn
“Immediately,” Watson continues, “I began to wonder whether
it would be possible for me to join Wilkins in working on DNA”
During an excursion to Paestum on the following day, Watson
tried to impress the British professor with his enthusiasm
and ideas. Unsuccessfully, though.
Only in the autumn of 1951 would he finally succeed in his
attempt to work in England. Not in London, however: he was
sent to Cambridge, to the biophysics laboratory directed by
William Lawrence Bragg. There he had the second seminal
and lucky meeting of his life, with Francis Crick.
A false start
Watson and Crick started working on the DNA structure by
considering a triple helix model with the phosphate groups
in the center and the bases protruding toward the exterior.
In their model, the negatively charged moieties of the
phosphate groups were held together by positively charged
magnesium ions.
However, for several reasons, Watson and Crick were not
convinced by this model.
Somewhat surprisingly, Linus Pauling was working on a similar triple
helix model for DNA that was characterized by the presence of the
phosphate groups in the central core and the nucleobases in the
external region.
Watson and Crick were pretty sure that the Pauling model was incorrect,
as they had previously investigated and discarded a similar model. Therefore,
they rushed toward alternative solutions, as they were aware of the fact
that Pauling would soon recognize the incorrectness of his own
model
A key event
At a meeting in London, Wilkins showed Watson an impressive
DNA diffraction pattern recorded by Rosalind Franklin.
The famous Photo 51
Fiber diffraction data
Photo 51 Analysis
◦ “X” pattern characteristic of helix
◦ Diamond shapes indicate long, extended molecules
◦ Smear spacing reveals distance between repeating structures
◦ Missing smears indicate interference from second helix
Summary of the information gained from Photo 51
The structure of DNA must be characterized by:
Double Helix
Radius: 10 angstroms
Distance between bases: 3.4 angstroms
Distance per turn: 34 angstroms
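As a quick consistency check on these numbers, the short Python sketch below derives the number of bases per helical turn and the rotation between consecutive bases from the two spacings listed above. The function and constant names are illustrative only and are not part of the lecture.

```python
# Spacings quoted in the summary above, in angstroms.
RISE_PER_BASE = 3.4   # distance between adjacent bases along the helix axis
PITCH = 34.0          # distance covered by one full helical turn


def bases_per_turn(pitch: float = PITCH, rise: float = RISE_PER_BASE) -> float:
    """Number of bases stacked within one full turn of the helix."""
    return pitch / rise


def twist_per_base(pitch: float = PITCH, rise: float = RISE_PER_BASE) -> float:
    """Rotation (in degrees) between two consecutive bases around the axis."""
    return 360.0 / bases_per_turn(pitch, rise)


def axial_length(n_bases: int, rise: float = RISE_PER_BASE) -> float:
    """Length (in angstroms) spanned along the axis by n_bases stacked bases."""
    return n_bases * rise


if __name__ == "__main__":
    print(f"bases per turn: {bases_per_turn():.1f}")
    print(f"twist per base: {twist_per_base():.1f} degrees")
    print(f"length of 100 stacked bases: {axial_length(100):.0f} angstroms")
```

With the values above, one turn contains about 10 bases, and each base is rotated by about 36 degrees with respect to the previous one.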
Chargaff "Rules" (1950)
After developing a new paper chromatography method for separating and
identifying small amounts of organic material, Chargaff concluded that
almost all DNA, no matter what organism or tissue type it comes from,
maintains certain properties, even as its composition varies. In particular,
the amount of adenine (A) is usually similar to the amount of thymine (T),
and the amount of guanine (G) usually approximates the amount of
cytosine (C). In other words, the total amount of purines (A + G) and the
total amount of pyrimidines (C + T) are usually nearly equal. (This second
major conclusion is now known as "Chargaff's rule.") Chargaff himself
could not imagine the explanation of these relationships.
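The regularities described by Chargaff can be illustrated with a small Python sketch. It is an idealized example, not Chargaff's experimental procedure: it assumes a perfect duplex in which every base faces its complementary partner (A with T, G with C), and the input strand is hypothetical. In such a duplex the ratios are exactly 1, whereas measured values are only approximately so.

```python
from collections import Counter

# Complementary partners (an assumption of this sketch: an ideal, fully paired duplex).
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}


def duplex_composition(strand: str) -> Counter:
    """Count the bases over both strands of a duplex built from one strand."""
    strand = strand.upper()
    partner_strand = "".join(PARTNER[base] for base in strand)
    return Counter(strand) + Counter(partner_strand)


def chargaff_ratios(counts: Counter) -> dict:
    """Return the A/T, G/C and purine/pyrimidine ratios.

    All four bases must occur at least once, otherwise a division by zero occurs.
    """
    return {
        "A/T": counts["A"] / counts["T"],
        "G/C": counts["G"] / counts["C"],
        "purines/pyrimidines": (counts["A"] + counts["G"]) / (counts["C"] + counts["T"]),
    }


if __name__ == "__main__":
    # Hypothetical, deliberately A/T-rich strand: the composition varies, the ratios do not.
    counts = duplex_composition("ATTAGATTACATTAAG")
    print(dict(counts))
    print(chargaff_ratios(counts))
```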
Watson tried to couple bases through H-bonds
The initial pairings were, however, not satisfactory
Jerry Donohue, an American crystallographer working in the same office,
noticed that he was using the enol forms of the bases and suggested that he
switch to the keto forms
When he switched to the keto forms he found perfect
couplings
When the H-bonding pattern was included in the double helix
scaffold Watson and Crick were able to generate a reliable
and insightful model
The famous Nature paper that
announced the modelling of the DNA
structure as a double helix
A closer look at the model
It would be difficult to find a more pointed example illustrating how a
molecular structure explains function. Admiring the elegant double-
helical DNA with a constant sugar-phosphate backbone and a variable
sequence of uniquely paired A–T and G–C bases, even a layman almost
intuitively feels how such a molecule can pass its sequence to daughter
molecules. (Wlodawer et al. 2014)
The double helix (experimentally) seen at atomic level
The time interval between the proposal of the structure of DNA and its
verification at atomic detail was quite long, leading Richard Dickerson to
comment that ‘DNA is probably the most discussed and least observed
of all biological macromolecules’
In the early 1980s, the structures of the right-handed double helices of
B- and A-DNA were confirmed with much more precise data derived
from single-crystal diffraction.
Also, in the late 1970s, the structure of an entirely different, left-handed
DNA was discovered in the laboratory of Alexander Rich
B-DNA Z-DNA
Major issues related to the properties of the gene could be
easily interpreted in structural terms at atomic level.
Self-Replication
Possibility to incorporate and preserve mutations
A possible genetic code as a function of the sequence of the
bases in the gene
What about Linus Pauling?
Why did the most prominent structural chemist predict the
wrong model?
This is not an original question…
Watson and Crick took the center stage, with Pauling assuming the
smaller part of an offstage voice, a legendary Goliath in a far land felled
by two unlikely Davids. A year would rarely go by after 1953 without
someone, a scientist or writer, asking Pauling where he had gone
wrong. His wife, Ava Helen, finally tired of it. After hearing the questions
and explanations over and again, she cut through the excuses with a
simple question. "If that was such an important problem," she asked
her husband, "why didn't you work harder on it?"
The main open question was related to the mechanism by
which the genetic information is transferred to proteins
Just after the publication of the Watson and Crick paper,
the cosmologist George Gamow, the author of the alpha-beta-gamma theory,
entered the field.
He was deeply struck by the paper. Nirenberg, the scientist
who deciphered the genetic code, recalls:
“He told me that he went down to his driveway to the mailbox to pick up
the mail, and picked up that issue of Nature that contained Watson and
Crick's article on the helical nature of DNA. And he read it while he was
standing at the mailbox with one arm on the mailbox, and immediately
thought that three bases in DNA corresponded to one amino acid, there
are four kinds of bases in DNA, twenty kinds of amino acids in protein.
And so, taking them three at a time there are 64 possible combinations
of the three bases.”
Gamow's contributions to the field:
The prediction that a sequence of three bases defines a
single amino acid in the expressed protein
The correct assumption that 20 is the number of genetically
encoded amino acids
He founded a club of scientists (the RNA Tie Club) to discuss
this problem
The RNA Tie Club Members
Gamow also devised a model on how the DNA sequence
could determine the protein sequence
Francis Crick and the Central Dogma
Although Crick acknowledged the role of Gamow in
stimulating the field, he considered his model highly unlikely
He was convinced that the entire process required an
intermediate player. He suggested, on the basis of the DNA
structure and of the base pairing, that the intermediate
species was RNA
For Crick, four kinds of information transfer clearly existed:
DNA->DNA (DNA replication),
DNA->RNA (the first step of protein synthesis),
RNA->protein (the second step of protein synthesis),
RNA->RNA (RNA viruses copying themselves).
There were two steps for which there was no evidence but
that Crick thought were possible (hence the dotted lines
in the figure):
DNA->protein (this would mean RNA was not involved in
protein synthesis)
RNA->DNA (structurally possible, but at the time there
was no perceptible biological function).
Crick was able to make some astonishing predictions
Role of protein sequence in their folding
Use of DNA sequence comparison for evolutionary studies
Features of the intermediate state
(adaptor)
In 1970, following the discovery by Howard Temin and David
Baltimore of reverse transcriptase, which enables information
to flow in the direction RNA->DNA, Nature published an
editorial entitled `Central dogma reversed'. Crick wrote a
slightly tetchy response
In summary, this was a short and incomplete story of one of
the most influential discoveries in science that changed the
logic of biology in less than a decade
One of the main actors in this process was a young
American at the very beginning of his career, with very
limited scientific expertise but with a lot of enthusiasm
But then people get old and can quickly destroy their
reputation
Oct 25th 2007. Ten days after sparking controversy with comments on
race and intelligence, James Watson today announced that he is
retiring as chancellor of Cold Spring Harbor Laboratory (CSHL) in
Long Island, New York.
Jan 11th 2019. In response to his statements made during the recent PBS
documentary, which “effectively” reverse the written apology and retraction
Watson made in 2007, the CSHL also revoked his honorary titles of
Chancellor Emeritus, Oliver R. Grace Professor Emeritus, and Honorary
Sic transit gloria mundi
So pass the worldly glories
….but surely the DNA double helix will
survive
Central Dogma of Molecular Biology:
THE CONTRIBUTION FROM X-RAY
CRYSTALLOGRAPHY
DNA -> DNA (replication); DNA -> mRNA (transcription); mRNA -> polypeptide (translation)
Francis Crick 1958, the central dogma of molecular biology:
Francis Crick, Nobel Prize in Physiology or Medicine 1962
The central dogma of molecular biology was first
enunciated by Francis Crick in 1958 and re-stated in a Nature paper published in 1970:
The central dogma of molecular biology deals with the detailed residue-by-residue transfer of sequential
information. It states that such information cannot be transferred back from protein to either protein or nucleic
acid.
In other words, 'once information gets into protein, it can't flow
back to nucleic acid.'
The central dogma
Protein biosynthesis
DNA tells a cell how to build the proteins.
The code cannot be based on a one-to-one match between nucleotides and amino acids: there are only 4 nucleotides, but 20 amino acids must be encoded.
If the nucleotides are grouped in threes, there are 4 x 4 x 4 = 64 possible triplets (codons).
DNA: 4 nucleotides
PROTEIN: 20 amino acids (Ala, Arg, Cys,…)
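The counting argument above can be checked with a few lines of Python. This is only a sketch of the combinatorics; the function names are illustrative. It enumerates all words of a given length over the four bases and finds the shortest word length that provides at least 20 distinct words.

```python
from itertools import product

NUCLEOTIDES = "ACGU"     # the four RNA bases
N_AMINO_ACIDS = 20       # genetically encoded amino acids, as stated above


def code_words(length: int) -> list:
    """All possible words of the given length over the four bases."""
    return ["".join(word) for word in product(NUCLEOTIDES, repeat=length)]


def shortest_sufficient_length(n_targets: int = N_AMINO_ACIDS) -> int:
    """Smallest word length whose number of words is at least n_targets."""
    length = 1
    while len(NUCLEOTIDES) ** length < n_targets:
        length += 1
    return length


if __name__ == "__main__":
    for length in (1, 2, 3):
        print(length, len(code_words(length)))
    print("shortest sufficient word length:", shortest_sufficient_length())
```

Single bases give 4 words and pairs give 16, both fewer than 20; triplets give 64, which is more than enough and leaves room for several codons per amino acid.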
3' T-A-C-A-A-G-C-A-G-T-T-G-G-T-C... 5' DNA
5' A-U-G-U-U-C-G-U-C-A-A-C-C-A-G... 3' mRNA
1. Transcription
The RNA strand that contains the message coded in the DNA is called
messenger RNA, or mRNA.
Transcription: a naive sketch
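The base-by-base correspondence between the DNA template strand and the mRNA shown above can be written as a minimal Python sketch. It is as naive as the figure: it only applies the pairing rules (A->U, T->A, G->C, C->G) to a template written 3'->5', and ignores promoters, the polymerase and RNA processing. The function name is illustrative.

```python
# RNA base incorporated opposite each base of the DNA template strand.
TEMPLATE_TO_MRNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the mRNA (written 5'->3') for a DNA template written 3'->5'.

    Dashes used as visual separators (as on the slide) are ignored.
    """
    bases = template_3_to_5.replace("-", "").upper()
    return "".join(TEMPLATE_TO_MRNA[base] for base in bases)


if __name__ == "__main__":
    template = "T-A-C-A-A-G-C-A-G-T-T-G-G-T-C"   # 3'->5', from the example above
    print(transcribe(template))
```

Applied to the template strand of the example, this reproduces the mRNA sequence AUGUUCGUCAACCAG.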
2. Translation
The messenger RNA now binds to a ribosome, and the message is translated into a sequence of amino acids.
The Genetic Code (excerpt)
First position   Second position        Third position
                 U    C    A    G
U                Phe  Ser  Tyr  Cys     U
U                Phe  Ser  Tyr  Cys     C
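To make the use of the genetic code concrete, the following Python sketch translates an mRNA codon by codon using the standard genetic code. It is a simplification under stated assumptions: the reading frame is taken to start at the first base, one-letter amino acid symbols are used (F = Phe, S = Ser, Y = Tyr, C = Cys, M = Met, and so on), and nothing of the ribosomal machinery is modeled. The names `CODON_TABLE` and `translate` are illustrative.

```python
BASES = "UCAG"
# Standard genetic code, one-letter amino acid symbols, '*' = stop codon.
# Codons are ordered with the first, second and third position each running over U, C, A, G.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # first position U
    "LLLLPPPPHHQQRRRR"   # first position C
    "IIIMTTTTNNKKSSRR"   # first position A
    "VVVVAAAADDEEGGGG"   # first position G
)
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(mrna: str) -> str:
    """Translate an mRNA (5'->3') codon by codon, stopping at a stop codon."""
    mrna = mrna.replace("-", "").upper()
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[start:start + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("A-U-G-U-U-C-G-U-C-A-A-C-C-A-G"))
```

For the mRNA of the transcription example, AUG UUC GUC AAC CAG, the result is MFVNQ, i.e. Met-Phe-Val-Asn-Gln.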
Translation into bits and pieces:
What is needed:
1) a ribosome
2) tRNAs
3) Loads of protein factors
a) Initiation factors IF1, IF2, IF3
b) Elongation factor
c) Translocation factor EF-G
d) Termination factor
e) Trigger factor
f) Recycling factor
…
4) mRNA
[Figure: tRNA carrying phenylalanine, with its anticodon paired to the mRNA; ribosomal E, P and A sites]
Elongation cycle
Codon-anticodon interaction
Conformational change triggering GTP hydrolysis
30 years of structural biology of the ribosome
1970s: low-resolution negative-stain EM, ~ 50 Å
1990s: high-resolution cryo-EM, ~ 10 Å
2000s: atomic-resolution X-ray crystallography, ~ 3 Å
50S subunit
M.W. 1,450,000 Da
About 3000 nucleotides
35 proteins
30S subunit
M.W. 850,000 Da
1500 nucleotides
About 20 proteins
Deinococcus radiodurans
Thermus thermophilus
Several lines of evidence support the idea
that the ribosome is a ribozyme.
1. The existence of other RNA catalysts
2. The fact that rRNA is the major and most
conserved component of ribosomes
3. Extraction of most ribosomal proteins
does not block catalysis
4. Most mutations that confer resistance to
antibiotics that block protein synthesis occur
in the rRNA genes
5. Specific 23S rRNA residues are required
for catalysis (peptide bond formation can be
catalyzed by the 50S subunit alone)
Ribosome is a ribozyme
The peptidyl transferase
center is located entirely
on the 50S subunit.
A full description of protein biosynthesis can be found at
https://www.youtube.com/watch?v=BSRzTBHjQcQ