
Synthetic program of the course

Structural Biology

The state of the art

Experimental approaches

X-ray crystallography

Electron microscopy

Computational approaches

Protein structure prediction methods

Molecular dynamics simulation

Structure validation

Successful examples: The ribosome machinery and the DNA structure

Rational Design of bioactive molecules

Properties of protein structures

Ligand-protein interactions

Structure-based drug discovery

Ligand-based drug discovery

Fragment-based drug discovery

Classical and recent successful examples of rational drug design

Lesson 1

Part b

Structural Biology

• Successful examples: The ribosome machinery and the DNA structure

Structural biology and DNA

A very brief description of (a) the gene theory and (b) the experiments that led to the identification of DNA as the chemical entity carrying the hereditary traits

The discovery of the DNA double helix

The impact of this structure in life sciences

The discovery of the DNA structure

Complexity of nucleic acids: DNA and RNA

Chains of DNA are made of millions of nucleotides

The current record for sequence decoding is 2.3 million bases of human DNA in one step

Formula Cxxxxxxxx Oxxxxxxx Hxxxxxxxx Pxxxxxx Nxxxxxxxx

In fibers the molecular entities are ordered along the so-called fiber axis

Fiber X-ray diffraction analysis

Theory

Fiber diffraction

The discovery of the alpha-helix (1951)

The fiber diffraction pattern

The 3D model by Pauling and Corey

Wool diffraction pattern:

meridional reflection (5.1 Å)

equatorial reflection (10 Å)

The DNA structure

• 1866: Gregor Mendel discovered that heredity is transmitted in discrete units, which he called “cell elements”; these elements were later termed “genes” by Johannsen (1909);

• 1900: Mendel's work was rediscovered by De Vries, Correns and Tschermak;

• 1920: Thomas Hunt Morgan found hereditary traits in fruit flies and started his experiments on radiation-induced mutagenesis;

• 1937: Max Delbrück moved to Caltech to work with Morgan on fruit flies, but he soon decided that this kind of research was sterile and started his own research on phages; a group soon formed, known informally as the “Phage group”;

• 1940s: George Beadle and Edward Tatum established that any given enzyme is associated with just one gene (a one-to-one relationship);

• 1943: Salvador Luria and Max Delbrück published research proving unequivocally that new features (phenotypes) arise from random mutations (Lamarck loses, Darwin wins ...);

Key events and concepts in understanding heredity

• 1902: Walter Sutton recognized that chromosomes (whose function nobody knew) behaved like Mendel’s “units”;

• 1944: Avery, MacLeod and McCarty demonstrated that DNA, and not protein, is the “transforming principle” that carries hereditary traits between bacteria;

• 1952: Alfred Hershey and Martha Chase provided further experimental evidence that DNA, and not protein, is the genetic material; they showed that when bacteriophages, which are composed of DNA and protein, infect bacteria, their DNA enters the host bacterial cell, but most of their protein does not;

In 1922, Muller published a summary of the capacities of the gene:

1) Autocatalysis (self-reproduction)

2) Heterocatalysis (production of non-genetic material)

3) Ability to mutate (while retaining the other two properties)

Main steps in the identification of the chemical entity carrying the genetic information

The DNA (macro)molecule

The early days of structural biology

The state of the art of structural characterization of biomolecules in 1940-1950

The birth of molecular biology, now called structural biology

Examples of the most complex systems unveiled at atomic level: penicillin (mid-forties) and vitamin B12 (mid-fifties)

In this scenario, the determination of atomic structures of biological macromolecules was an extremely challenging task

Some data were available, but their structural interpretation was missing: fiber data provided only some structural hints, whereas the phase problem was a major hurdle for the exploitation of single-crystal data

It is important to note that the view of these macromolecules as defined chemical entities with specific sequences of either amino acids or nucleotides was a very recent, and somewhat debated, concept

Biological macromolecules

Protein crystals have been known for a long time. In 1909, crystals of hemoglobins isolated from 200 different organisms were reported

Bernal and Hodgkin obtained the first diffraction spots from a protein crystal (pepsin) in 1934

In 1937 Max Perutz started a Ph.D. project whose aim was the determination of the hemoglobin crystal structure

The determination of the hemoglobin structure took 23 years

Perutz once remarked that he was lucky that his thesis supervisors and evaluators did not require him to solve the structure in order to award the degree

Availability of protein crystals

One of the first attempts to retrieve structural features of a protein was made by the founder of structural crystallography (Bragg's law) and by the founders of structural biology…

… it was a failure, despite the scientific rank of the authors (three Nobel laureates)

According to Linus Pauling of Caltech, the failure was due to the authors' scarce knowledge of basic stereochemistry.

He stated that likely none of them had read his seminal textbook “The Nature of the Chemical Bond”

This was the scenario when, in 1951, James D. Watson, a young American zoologist with a passion for bird watching, a vague fascination for genetics and no experience in organic and structural chemistry, decided to turn his attention to the structure of DNA

Watson's interest in DNA began during the congress “The Submicroscopical Structure of the Protoplasm” (May 1951), held at the Stazione Zoologica in Napoli. During this event Maurice Wilkins delivered a lecture on X-ray diffraction of DNA crystals, and on that occasion he showed some slides.

In Watson’s words: “Suddenly, I was excited about chemistry. Before Maurice’s talk I had worried about the possibility that the gene might be fantastically irregular. Now, however, I knew that genes could crystallize; hence they must have a regular structure that could be solved in a straightforward fashion.” (Watson, 1968).

An old picture of the Stazione Zoologica Anton Dohrn

“Immediately,” Watson continues, “I began to wonder whether it would be possible for me to join Wilkins in working on DNA”

During an excursion to Paestum on the following day, Watson tried to impress the British professor with his enthusiasm and ideas. Unsuccessfully, though.

Only in the autumn of 1951 would he finally succeed in his attempt to work in England. Not in London, however: he was sent to Cambridge, to the laboratory directed by William Lawrence Bragg. There he had the second seminal and lucky meeting of his life, with Francis Crick.

A false start

Watson and Crick started to work on the DNA structure by considering a triple-helix model with the phosphate groups in the center and the bases protruding toward the exterior.

In their model, the negatively charged moieties of the phosphate groups were held together by positively charged magnesium ions.

However, for several reasons, Watson and Crick were not convinced by this model.

Somewhat surprisingly, Linus Pauling was working on a similar triple-helix model for DNA, characterized by the presence of the phosphate groups in the central core and the nucleobases in the external region.

Watson and Crick were pretty sure that the Pauling model was incorrect, as they had previously investigated and discarded a similar model. Therefore, they rushed toward alternative solutions, aware that Pauling would soon recognize the incorrectness of his own model

A key event

At a meeting in London, Wilkins showed Watson an impressive DNA diffraction pattern recorded by Rosalind Franklin.

The famous Photo 51

Fiber diffraction data

Photo 51 Analysis

◦ “X” pattern characteristic of helix

◦ Diamond shapes indicate long, extended molecules

◦ Smear spacing reveals distance between repeating structures

◦ Missing smears indicate interference from second helix


Summary of the information gained from Photo 51

The structure of DNA must be characterized by:

Double Helix

Radius: 10 angstroms

Distance between bases: 3.4 angstroms

Distance per turn: 34 angstroms
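The helix parameters listed above are tied together by simple arithmetic. As a rough check, the following sketch derives the number of bases per turn and the rotation between consecutive bases from the two distances read off Photo 51. The function and variable names are illustrative assumptions, not part of the lecture.

```python
# Helix parameters taken from the Photo 51 summary above.
RISE_PER_BASE_A = 3.4   # distance between stacked bases, in angstroms
PITCH_A = 34.0          # distance covered by one full helical turn, in angstroms


def bases_per_turn(pitch: float, rise: float) -> float:
    """Number of stacked bases needed to complete one helical turn."""
    return pitch / rise


def twist_per_base_deg(pitch: float, rise: float) -> float:
    """Rotation (in degrees) between two consecutive bases along the helix."""
    return 360.0 / bases_per_turn(pitch, rise)


if __name__ == "__main__":
    print(f"{bases_per_turn(PITCH_A, RISE_PER_BASE_A):.1f} bases per turn")
    print(f"{twist_per_base_deg(PITCH_A, RISE_PER_BASE_A):.1f} degrees per base")
```

With the values above, the helix completes one turn every 10 bases, i.e. about 36 degrees of rotation per base.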

Chargaff "Rules" (1950)

After developing a new paper chromatography method for separating and identifying small amounts of organic material, Chargaff concluded that almost all DNA, no matter what organism or tissue type it comes from, maintains certain properties, even as its composition varies. In particular, the amount of adenine (A) is usually similar to the amount of thymine (T), and the amount of guanine (G) usually approximates the amount of cytosine (C). In other words, the total amount of purines (A + G) and the total amount of pyrimidines (C + T) are usually nearly equal. (This second major conclusion is now known as "Chargaff's rule.") Chargaff himself could not imagine the explanation of these relationships.

Watson tried to couple the bases through H-bonds

The initial pairings were, however, not satisfactory

Jerry Donohue, an American crystallographer working in his office, noticed that Watson was using the enol forms of the bases and suggested that he switch to the keto forms

When he switched to the keto forms he found perfect couplings

When the H-bonding pattern was included in the double-helix scaffold, Watson and Crick were able to generate a reliable and insightful model
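One consequence of the A-T and G-C pairing is that Chargaff's ratios hold automatically for any double-stranded molecule. The sketch below illustrates this under stated assumptions: the sequence is an arbitrary made-up example, and the helper names are hypothetical.

```python
from collections import Counter

# Watson-Crick pairing: each base on one strand faces its partner on the other.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the partner strand, written 5'->3' like the input."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def duplex_composition(strand: str) -> Counter:
    """Count the bases of both strands of the double helix."""
    return Counter(strand + reverse_complement(strand))


if __name__ == "__main__":
    strand = "ATGGCGTACGCTTAGGA"  # arbitrary illustrative sequence
    counts = duplex_composition(strand)
    print(dict(counts))
    print("A == T:", counts["A"] == counts["T"])
    print("G == C:", counts["G"] == counts["C"])
    print("purines == pyrimidines:",
          counts["A"] + counts["G"] == counts["C"] + counts["T"])
```

Whatever single strand is chosen, the duplex contains as many A as T and as many G as C, because every base is counted together with its partner.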

The famous Nature paper that announced the modelling of the DNA structure as a double helix

A closer look at the model

It would be difficult to find a more pointed example illustrating how a molecular structure explains function. Admiring the elegant double-helical DNA with a constant sugar-phosphate backbone and a variable sequence of uniquely paired A–T and G–C bases, even a layman almost intuitively feels how such a molecule can pass its sequence to daughter molecules. (Wlodawer et al. 2014)

The double helix (experimentally) seen at atomic level

The time interval between the proposal of the structure of DNA and its verification at atomic detail was quite long, leading Richard Dickerson to comment that ‘DNA is probably the most discussed and least observed of all biological macromolecules’

In the early 1980s, the structures of the right-handed double helices of B- and A-DNA were confirmed with much more precise data derived from single-crystal diffraction.

Also, in the late 1970s, the structure of an entirely different, left-handed DNA was discovered in the laboratory of Alexander Rich

B-DNA Z-DNA

Major issues related to the properties of the gene could be easily interpreted in structural terms at atomic level.

Self-replication

Possibility to incorporate and preserve mutations

A possible genetic code as a function of the sequence of the bases in the gene

What about Linus Pauling?

Why did the most prominent structural chemist predict the wrong model?

This is not an original question…

Watson and Crick took the center stage, with Pauling assuming the smaller part of an offstage voice, a legendary Goliath in a far land felled by two unlikely Davids. A year would rarely go by after 1953 without someone, a scientist or writer, asking Pauling where he had gone wrong. His wife, Ava Helen, finally tired of it. After hearing the questions and explanations over and again, she cut through the excuses with a simple question. "If that was such an important problem," she asked her husband, "why didn't you work harder on it?"

The main open question was related to the mechanism by which the genetic information is transferred to proteins

Just after the publication of the Watson and Crick paper, the cosmologist George Gamow, the author of the alpha-beta-gamma theory, entered the field.

He was deeply struck by the paper. Nirenberg, the scientist who deciphered the genetic code, recalls:

“He told me that he went down to his driveway to the mailbox to pick up the mail, and picked up that issue of Nature that contained Watson and Crick's article on the helical nature of DNA. And he read it while he was standing at the mailbox with one arm on the mailbox, and immediately thought that three bases in DNA corresponded to one amino acid, there are four kinds of bases in DNA, twenty kinds of amino acids in protein. And so, taking them three at a time there are 64 possible combinations of the three bases.”

Gamow's contribution to the field:

The prediction that a sequence of three bases defines a single amino acid in the expressed protein

The correct assumption that 20 is the number of genetically encoded amino acids

He founded a club of scientists (the RNA Tie Club) to discuss this problem

The RNA Tie Club Members

Gamow also devised a model of how the DNA sequence could determine the protein sequence

Francis Crick and the Central Dogma

Although Crick acknowledged the role of Gamow in stimulating the field, he considered his model highly unlikely

He was convinced that the entire process required an intermediate player. He suggested, on the basis of the DNA structure and of the base pairing, that the intermediate species was RNA

For Crick, four kinds of information transfer clearly existed:

DNA->DNA (DNA replication),

DNA->RNA (the first step of protein synthesis),

RNA->protein (the second step of protein synthesis)

RNA->RNA (RNA viruses copying themselves).

There were two steps for which there was no evidence but that Crick thought were possible (hence the dotted lines in the figure):

DNA->protein (this would mean RNA was not involved in protein synthesis)

RNA->DNA (structurally possible, but at the time there was no perceptible biological function).

Crick was able to make some astonishing predictions

Role of protein sequence in their folding

Use of DNA sequence comparison for evolutionary studies

Features of the intermediate state (adaptor)

In 1970, following the discovery by Howard Temin and David Baltimore of reverse transcriptase, which enables information to flow in the direction RNA->DNA, Nature published an editorial entitled `Central dogma reversed'. Crick wrote a slightly tetchy response

In summary, this was a short and incomplete story of one of the most influential discoveries in science, which changed the logic of biology in less than a decade

One of the main actors in this process was a young American at the very beginning of his career, with very limited scientific expertise but with a lot of enthusiasm

But then people get old and can quickly destroy their reputation

Oct 25th 2007. Ten days after sparking controversy with comments on race and intelligence, James Watson today announced that he is retiring as chancellor of Cold Spring Harbor Laboratory (CSHL) in Long Island, New York.

Jan 11th 2019. In response to his statements made during the recent PBS documentary, which “effectively” reverse the written apology and retraction Watson made in 2007, the CSHL also revoked his honorary titles of Chancellor Emeritus, Oliver R. Grace Professor Emeritus, and Honorary

Sic transit gloria mundi

Thus passes the glory of the world

… but surely the DNA double helix will survive

Central Dogma of Molecular Biology: THE CONTRIBUTION FROM X-RAY CRYSTALLOGRAPHY

Francis Crick 1958, the central dogma of molecular biology:

DNA->DNA (replication), DNA->mRNA (transcription), mRNA->Polypeptide (translation)

Francis Crick, Nobel Prize in Physiology or Medicine 1962

The central dogma of molecular biology was first enunciated by Francis Crick in 1958 and re-stated in a Nature paper published in 1970:

The central dogma of molecular biology deals with the detailed residue-by-residue transfer of sequential information. It states that such information cannot be transferred back from protein to either protein or nucleic acid.

In other words, 'once information gets into protein, it can't flow back to nucleic acid.'


The central Dogma

Protein biosynthesis

DNA tells a cell how to build the proteins.

The code cannot be based on a one-to-one match between nucleotides and amino acids: there are only four nucleotides, but 20 amino acids must be encoded.

If the nucleotides are grouped in threes, there are 64 possible triplets (codons).
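The counting argument above can be checked directly. The following sketch enumerates all words of a given length over a four-letter alphabet and looks for the shortest word length that offers at least 20 distinct combinations; names such as `n_words` and `min_word_length` are illustrative assumptions.

```python
from itertools import product

NUCLEOTIDES = "UCAG"      # the four RNA nucleotides
N_AMINO_ACIDS = 20        # number of genetically encoded amino acids


def n_words(length: int) -> int:
    """Number of distinct nucleotide words of the given length (4 ** length)."""
    return len(NUCLEOTIDES) ** length


def min_word_length(n_targets: int) -> int:
    """Shortest word length that yields at least n_targets distinct words."""
    length = 1
    while n_words(length) < n_targets:
        length += 1
    return length


# Explicit enumeration of every triplet (codon).
codons = ["".join(triplet) for triplet in product(NUCLEOTIDES, repeat=3)]

if __name__ == "__main__":
    for length in (1, 2, 3):
        print(length, "->", n_words(length), "combinations")
    print("shortest sufficient word length:", min_word_length(N_AMINO_ACIDS))
    print("number of codons:", len(codons))
```

Single nucleotides (4) and pairs (16) are too few for 20 amino acids; triplets (64) are the shortest words that suffice, which leaves room for redundancy in the code.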

DNA: 4 nucleotides

PROTEIN: 20 amino acids (Ala, Arg, Cys, …)

3' T-A-C-A-A-G-C-A-G-T-T-G-G-T-C... 5' DNA

5' A-U-G-U-U-C-G-U-C-A-A-C-C-A-G... 3' mRNA

1. Transcription

The RNA strand carries the message that was encoded in the DNA; it is called messenger RNA, or mRNA.
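As a minimal sketch of the base-pairing rule behind transcription, the snippet below converts the DNA template strand shown above (written 3'->5', without the dashes) into the corresponding mRNA (written 5'->3'). The function name `transcribe` is an illustrative assumption, and the enzymatic machinery is ignored entirely.

```python
# Pairing rule used during transcription: the mRNA base that faces
# each base of the DNA template strand (U replaces T in RNA).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the mRNA (5'->3') for a DNA template strand written 3'->5'.

    Because the two strands are antiparallel, reading the template 3'->5'
    and writing the mRNA 5'->3' is a position-by-position complement.
    """
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


if __name__ == "__main__":
    template = "TACAAGCAGTTGGTC"   # template strand from the slide, 3'->5'
    print(transcribe(template))    # AUGUUCGUCAACCAG
```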

Transcription: a naive sketch

2. Translation

The messenger RNA now binds to a ribosome, and the message is translated into a sequence of amino acids.
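The decoding step can be sketched as a table lookup, one codon at a time. The example below builds the standard genetic code from a compact 64-letter string (one-letter amino acid codes, `*` for stop, bases ordered U, C, A, G) and translates the mRNA fragment from the transcription example. It is a toy illustration under these assumptions: it starts at the first base and knows nothing about ribosomes, tRNAs or initiation.

```python
BASES = "UCAG"
# Standard genetic code, one letter per codon, ordered by first, second and
# third base (each running over U, C, A, G); '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(mrna: str) -> str:
    """Translate an mRNA (5'->3') codon by codon until a stop codon or the end."""
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[start:start + 3]]
        if amino_acid == "*":      # stop codon: release the chain
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("AUGUUCGUCAACCAG"))  # MFVNQ (Met-Phe-Val-Asn-Gln)
```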

The Genetic Code

First      Second position       Third
position   U    C    A    G      position
U          Phe  Ser  Tyr  Cys    U
U          Phe  Ser  Tyr  Cys    C

Translation into bits and pieces:

What is needed:

1) a ribosome

2) tRNAs

3) Loads of protein factors

a) Initiation factors IF1, IF2, IF3

b) Elongation factor

c) Translocation factor EF-G

d) Termination factor

e) Trigger factor

f) Recycling factor

…

4) mRNA

[Figure: tRNA carrying phenylalanine, with its anticodon paired to the mRNA; ribosomal E, P and A sites]

Elongation cycle

Codon-anticodon interaction

Conformational change triggering GTP hydrolysis

30 years of structural biology of the ribosome

1970s: low-resolution negative-stain EM, ~50 Å

1990s: high-resolution cryo-EM, ~10 Å

2000s: atomic-resolution X-ray crystallography, ~3 Å

50S subunit: M.W. 1,450,000 Da; about 3000 nucleotides; 35 proteins

30S subunit: M.W. 850,000 Da; 1500 nucleotides; about 20 proteins

Deinococcus radiodurans

Thermus thermophilus

Several lines of evidence support the idea that the ribosome is a ribozyme.

1. The existence of other RNA catalysts

2. The fact that rRNA is the major and most conserved component of ribosomes

3. Extraction of most ribosomal proteins does not block catalysis

4. Most mutations that confer resistance to antibiotics that block protein synthesis occur in the rRNA genes

5. Specific 23S rRNA residues are required for catalysis (peptide bond formation can be catalyzed by the 50S subunit alone)

The ribosome is a ribozyme

The peptidyl transferase center is located entirely on the 50S subunit.

A full description of protein biosynthesis can be found at

https://www.youtube.com/watch?v=BSRzTBHjQcQ

