
LSM4242 Protein Engineering


Lecture 1 – Prokaryotic Expression Systems

Example of recombinant insulin:

Digestion of plasmid with restriction endonucleases

Insertion of insulin gene into plasmid through ligation

Transformation into host cell where recombinant plasmid will be propagated.

Use blue-white assay for screening:

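Host strain carries the lacZ deletion mutant (lacZΔM15), which produces only the ω-peptide; this is inactive on its own.

Transformed strain carries the α-peptide from the plasmid, which complements the ω-peptide to give fully functional β-galactosidase; this turns X-gal blue in the presence of IPTG. Colonies in which the insert disrupts the α-peptide coding region stay white.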

Expression systems are based on the insertion of the gene of interest into a host cell for its

translation and expression into protein.

Every protein is unique and no single strategy will work every time, thus different systems have to be tried.

Expression systems include bacteria, yeast, insect and mammalian cells.

Prokaryotic expression system

Advantages:

Low cost

Simple and well characterized physiology

Short time required for growth and expression

Large yield of products

Elements needed for expression which are often in a vector:

Promoters – control point for transcription

Strong promoters have high affinity for RNA polymerase and are

frequently transcribed

Regulatable promoters can be controlled by inducers/co-repressors

E.g. lac promoter.

o Negative and inducible regulation where repressor binds to

operator and inducer (allolactose/IPTG) binds to repressor

to inactivate it.

o Positive regulation where high concentrations of cAMP will

bind to CAP during low glucose conditions, activating the

promoter.



E.g. lacUV5 promoter.

o CAP binding site is deleted and the -10 sequence is optimised.

E.g. Trp promoter

o Gene is inactive when repressor + co-repressor (Trp) is

present.

E.g. tac (trc) promoter

o Hybrid of lac and trp promoters where the -35 region is from

trp and the -10 region is from lac. Inducible by IPTG and has

about 3 times the strength of trp promoter and 11 times the

strength of the lac promoter.

E.g. λ pL promoter

o Regulated by the λ repressor which is encoded by the cI gene.

Often the cI857 mutant is used as it encodes a temperature-sensitive repressor. At 28 °C it represses the λ promoter strongly, but at 42 °C the repressor denatures, which allows expression.

E.g. Phage T7 promoter

o Requires T7 RNA polymerase for expression. Often the T7 RNA polymerase gene is placed under control of the lac promoter, and the gene of interest is placed under control of the T7 gene 10 promoter, which ensures tighter control.

o In BL21(DE3) strain of E. coli, the gene for T7 RNA

polymerase is integrated into its genome under the control of

Plac and thus inducible by IPTG.

Ribosome binding sites (RBS) – necessary for translation

In prokaryotes this is the Shine-Dalgarno sequence, which is 6-8 bp in length and often about 10 bp upstream of the AUG start codon. The sequence is complementary to the 3’ region of the 16S rRNA in the small ribosomal subunit.

The stronger the binding between mRNA and 16S rRNA, the greater

the efficiency of translation initiation.
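The relationship between the Shine-Dalgarno sequence and the start codon can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the consensus core AGGAGG (complementary to CCUCCU near the 3’ end of 16S rRNA), the 20-nt search window and the simple match-count score are choices made here, not values from the lecture, and the first AUG in the string is assumed to be the start codon.

```python
# Assumed consensus SD core; it pairs with CCUCCU near the 3' end of 16S rRNA.
SD_CORE = "AGGAGG"


def reverse_complement_rna(seq):
    """Return the reverse complement of an RNA sequence."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def best_sd_site(mrna, window=20):
    """Find the hexamer upstream of the first AUG that best matches SD_CORE.

    Returns (score, spacing, site): score is the number of matching bases
    (0-6) and spacing is the number of bases between the site and the AUG.
    """
    start = mrna.find("AUG")
    if start < 0:
        raise ValueError("no AUG start codon found")
    best = (0, None, None)
    first = max(0, start - window)
    for i in range(first, start - len(SD_CORE) + 1):
        site = mrna[i:i + len(SD_CORE)]
        score = sum(a == b for a, b in zip(site, SD_CORE))
        spacing = start - (i + len(SD_CORE))
        if score > best[0]:
            best = (score, spacing, site)
    return best


if __name__ == "__main__":
    toy_mrna = "GGCUAAGGAGGUAUAUACAUGGCU"  # hypothetical 5' end of an mRNA
    print(best_sd_site(toy_mrna))
    print(reverse_complement_rna(SD_CORE))
```

A higher score stands in for stronger pairing with the 16S rRNA; a real analysis would use base-pairing free energies rather than a match count.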

Antibiotic resistance and tags – for selection, purification and detection.

Examples of popular plasmid vectors

pUC family: ColE1 origin of replication, which gives a high copy number; the drug resistance marker is β-lactamase (ampR); has lacZ for blue/white selection. Plac and Olac are used to control gene expression. The gene insertion site (polylinker region – pLink) contains many restriction enzyme sites.

pET family: widely used due to the strong selectivity of T7 RNA polymerase for its promoter, the high activity of the polymerase, and high translation efficiency. Has a ColE1 origin of replication and ampR resistance marker. The main differences are the T7 promoter and the presence of the lacI gene, which codes for the lac repressor.



Two main problems associated with production of recombinant proteins are

degradation/insolubility and purity. Solution is to use a cleavable fusion protein system

which attaches target protein to a stable cellular protein.

Results in a protein more soluble and resistant to proteases. E.g. thioredoxin

(11.7kDa) can keep fusion protein soluble even though it makes up 40% of total

protein.

For fusion protein, stop codon of stable cellular protein must be removed and

reading frame of fusion protein must be contiguous.

Example would be to fuse with ompF (Porin – forms pores that allow passive

diffusion of small molecules across the outer membrane) upstream for secretion or

lacZ downstream for stability.

Host proteins fused to the protein of interest act as tags, which allows purification in one step.

Affinity chromatography - a binding partner for the tag (e.g. streptavidin for a biotin tag, or an antigen) is covalently linked to the column matrix, so that the fusion protein binds to the column.

Insert a protease cleavage sequence between fusion partner and gene of interest to

isolate your desired protein.

Examples of proteases: factor Xa, enterokinase (serine protease) and TEV

protease.

The pTrcHis system uses lacIq, which produces more lac repressor than normal; the stronger tac (trc) promoter; lacO, which allows regulation by IPTG; a “His tag” region for purification via metal chelation; and an enterokinase (EK) cleavage recognition sequence downstream of the “His tag”.

The His residues may interfere with normal protein function, so the tag can be cleaved away by the endopeptidase at the EK site.

NH2-(His)6-(Asp)4-Lys-(Protein of interest)
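The tag layout above can be illustrated with a short sketch that builds the fusion in one-letter amino acid code and removes the tag at the EK site. It assumes the usual enterokinase specificity (cleavage directly after the Lys of Asp-Asp-Asp-Asp-Lys); the function names and the example protein sequence are hypothetical.

```python
EK_SITE = "DDDDK"  # (Asp)4-Lys in one-letter code; assumed cut is after the K


def build_fusion(protein, n_his=6):
    """Return NH2-(His)n-(Asp)4-Lys-(protein) as a one-letter sequence."""
    return "H" * n_his + EK_SITE + protein


def cleave_enterokinase(fusion):
    """Split a fusion after the first EK site; return [tag, released protein].

    If no EK site is present the fusion is returned uncut.
    """
    index = fusion.find(EK_SITE)
    if index < 0:
        return [fusion]
    cut = index + len(EK_SITE)
    return [fusion[:cut], fusion[cut:]]


if __name__ == "__main__":
    fusion = build_fusion("MKTAYIAK")  # hypothetical protein of interest
    print(fusion)
    print(cleave_enterokinase(fusion))
```

Because the cut falls after the Lys, the released protein carries no residues from the tag, which is the reason for placing the EK site immediately upstream of the protein of interest.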

The PinPoint Xa system is used for the production and purification of fusion proteins that are biotinylated in vivo.

Gene of interest is fused to a biotinylated lysine tag followed by a factor Xa protease recognition site.

Purification can be carried out on an avidin/streptavidin resin, followed by proteolytic cleavage.

Disadvantages: the protein often does not fold properly and lacks post-translational modifications. Toxicity of the product can be managed by using inducible promoters, while insoluble proteins can be expressed with a highly soluble partner such as glutathione-S-transferase (GST) or maltose binding protein (MBP) to improve solubility.



Lecture 2 – Eukaryotic Expression Systems

Problems with regard to stability and activity would arise when eukaryotic proteins are

expressed in prokaryotic cells mainly because of the absence of PTM.

E.g. correct disulfide bond formation, proteolytic cleavage of inactive precursor,

glycosylation, alteration of amino acids such as phosphorylation and acetylation.

Genetic features in eukaryotic expression vectors include selectable markers (e.g. AmpR), eukaryotic promoters, an mRNA polyA signal, a eukaryotic origin of replication (ori) if plasmid-based, and a chromosomal DNA segment for homologous recombination into the host chromosome.

Yeast Expression Systems

Advantages:

Single cell

Well characterized genetically

Strong promoters available

Natural plasmid (2μ)

Post-translational modification

Secretes a few proteins normally

Generally recognized as safe (GRAS) organism according to FDA.

Three types of expression vectors:

Episomal – most widely used but unstable in large scale cultures (>10L).

Strategy is to alter growth conditions to try and stabilize the episomal vector

by using mutant host strains requiring specific nutrients (e.g. Leu, Trp, His)

Integrating – generally stable but only has 1 copy inserted into chromosome.

Able to insert tandem arrays of genes and thus increases expression but also

instability.

E.g. Use AOX1 gene sequence to flank gene of interest and selection

marker, where integration into yeast chromosome will be done

through homologous recombination.

Yeast artificial chromosome (YAC) – designed to clone large pieces of DNA

(100 to 1500kb) and thus is generally not stable and not used for protein

production.

Yeast promoters:

Disadvantages:

Instability of plasmids in scale-up especially for episomal vectors.

Over-glycosylation of glycoproteins which may alter protein activity

100+ mannose residues vs. 8-13 normally



Solution to problems would then be to use other types of yeast such as Pichia

pastoris or Candida sp rather than the usual S. cerevisiae or other eukaryotic cells.

Current applications of yeast include:

Production of human glycoproteins by deleting yeast glycosylation genes

and adding human ones.

Yeast as cell factories by reconstructing heterologous biosynthesis pathways

to produce complex natural products such as isoprenoids and sterols

through the mevalonate pathway and flavonoids, stilbenes and opioids

through amino acid biosynthesis pathways.

Higher eukaryotic cell expression systems – often needed for production of therapeutic

proteins where correct PTMs are needed e.g. erythropoietin, interleukin-2, mAbs, growth

hormones etc.

Generalized mammalian expression vectors often include the following:

Origin of replication derived from an animal virus (e.g. SV40)

Promoters derived from animal viruses or highly expressed mammalian genes

Selectable markers, e.g. glutamine synthetase: methionine sulfoximine (MSX) inhibits the endogenous glutamine synthetase so that cells cannot make their own glutamine, and only cells transfected with a vector encoding glutamine synthetase survive in culture.

Translation control elements on mRNA in order from 5’ to 3’ end:

Kozak sequence CCRCCAUGG

Signal sequence for secretion

Affinity tag for purification

Proteolytic cleavage site
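As a small illustration of the first element in this list, the Kozak consensus CCRCCAUGG (R = A or G) can be searched for with a regular expression. The function name and the example sequences are hypothetical, and real start sites often match the consensus only partially, so this strict match is a simplification.

```python
import re

# CCRCCAUGG, where R is a purine (A or G); the AUG starts 5 nt into the match.
KOZAK = re.compile(r"CC[AG]CCAUGG")


def find_kozak_starts(mrna):
    """Return the index of the A of AUG for every exact Kozak consensus match."""
    return [match.start() + 5 for match in KOZAK.finditer(mrna)]


if __name__ == "__main__":
    print(find_kozak_starts("GGGCCACCAUGGCU"))  # purine at the R position
    print(find_kozak_starts("GGGCCUCCAUGGCU"))  # U at the R position, no match
```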

Some therapeutic proteins are composed of two chains such as insulin and thus we

can express both subunits at the same time in stoichiometric amounts through:

Two vector expression system – clone target genes into 2 vectors using 2

different selectable markers and co-transfect them into cells.

Issues: can lose one of the plasmids, different copy numbers and

different promoter strength.

Two gene expression vector (double cassette vectors) – put both genes into

one plasmid where each gene is a separate transcription unit with own

promoter and polyA region.

Issues: may not get same amount of protein due to differences in

transcription and translation.



Dicistronic vectors – gene expression of both genes controlled by a single

promoter, thus sharing the same transcription unit. However this requires

an internal ribosome entry site which is derived from mammalian virus

genomes.

Insect cell expression systems are based on baculoviruses, which exclusively infect invertebrates.

Baculovirus infects the insect cell and produces polyhedrin protein, which traps many virions in a stable polyhedron package. Upon ingestion by an insect host, the polyhedrin is broken down and the virions are released, starting the infection.

Since the polyhedrin promoter is strong, replacing the polyhedrin coding sequence with the gene of interest allows large amounts of protein to be expressed within 36-48 hours post-infection.

The virus used is often Autographa californica multiple nucleopolyhedrovirus (AcMNPV), which can infect more than 30 insect species, and the cell line used is Sf9, derived from Spodoptera frugiperda (fall armyworm moth).

The transfer vector is designed so that the gene of interest is flanked by AcMNPV sequences at both the 5’ and 3’ ends, and it is introduced into the viral genome by homologous recombination.

To improve this process, linearise the AcMNPV prior to transfection to increase

frequency of recombination due to crossing over.

Flow chart: transform gene of interest into bacteria → amplify the recombinant bacterial plasmid → transfect with baculovirus into insect cell line and obtain recombinant virus particles → for higher expression, infect a new insect cell line → harvest protein.

Disadvantages of this method would be that it is expensive and production is not

continuous since baculovirus kills the host.

GATEWAY cloning technology is used to transfer DNA fragments between plasmids using a set of recombination sequences and enzymes. The goal is to move the gene from one vector backbone to another.

Site-specific recombination is mediated by phage λ recombination proteins. It is specific and directional, and the two reactions require two different combinations of enzymes.

The att sites contain binding sites for the proteins that mediate recombination: the BP reaction is mediated by integrase (Int) and integration host factor (IHF), and the LR reaction additionally requires excisionase (Xis).

When integration occurs, two new sites are created which flank the integrated

prophage with no loss of DNA sequence.

Advantages: the entry clone can easily be sub-cloned into a wide variety of destination vectors, which minimizes planning, eliminates repeated cloning steps and maximizes compatibility and flexibility.
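The directionality of the two Gateway reactions can be summarised as a small lookup table. The enzyme sets follow the notes above; the site names (attB, attP, attL, attR) and the rule attB × attP ⇄ attL × attR are the standard phage λ nomenclature and are added here as an assumption, since the notes refer only to "att sites". The function name is hypothetical.

```python
# Each reaction consumes one pair of att sites and produces the other pair.
GATEWAY_REACTIONS = {
    "BP": {
        "sites_in": {"attB", "attP"},
        "sites_out": ("attL", "attR"),
        "enzymes": ("Int", "IHF"),
    },
    "LR": {
        "sites_in": {"attL", "attR"},
        "sites_out": ("attB", "attP"),
        "enzymes": ("Int", "IHF", "Xis"),
    },
}


def recombine(reaction, site_a, site_b):
    """Return the att sites produced by a BP or LR reaction on two input sites."""
    rule = GATEWAY_REACTIONS[reaction]
    if {site_a, site_b} != rule["sites_in"]:
        raise ValueError(f"{reaction} reaction cannot use {site_a} and {site_b}")
    return rule["sites_out"]


if __name__ == "__main__":
    entry_sites = recombine("BP", "attB", "attP")  # makes the entry clone
    print(entry_sites)
    print(recombine("LR", *entry_sites))           # moves the gene onward
```

Because each reaction accepts only its own pair of sites, the LR reaction exactly reverses the BP reaction, which is what makes the transfer directional.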



Lecture 3 – In vitro translation and Site-directed Mutagenesis

Wide variety of applications:

Rapid identification of gene products

Localization of mutations via synthesis of truncated gene products

Incorporation of modified or unnatural amino acids for functional studies

Protein folding studies by using chaperones

Advantages over in vivo expression when:

Product is toxic to host cell

Product is insoluble or forms inclusion bodies

Protein undergoes rapid proteolytic degradation by intracellular proteases

Standard translation systems require purified RNA as a template for translation. If DNA is

used, then transcription and translation are coupled.

They generally contain all the macromolecular components required for translation

such as 70S (bacteria)/80S (eukaryotic) ribosomes, tRNAs, aminoacyl-tRNA

synthetases, initiation, elongation and termination factors supplemented with amino

acids, energy sources and energy regenerating systems.

In an Eppendorf tube containing the system, we can add either DNA or RNA to kick-start the transcription or translation process (~1 h each at 30 °C). Because other proteins are also present in the system, the product can be quantified by incorporating 35S-labelled methionine, separating the proteins by SDS-PAGE and performing autoradiography.

Examples of standard translation systems are:

Rabbit reticulocyte lysate – highly efficient in vitro eukaryotic protein synthesis

system used for translation of exogenous RNAs. In vivo, reticulocytes are highly

specialized cells primarily responsible for the synthesis of hemoglobin, which

represents more than 90% of the protein made in the reticulocyte. These immature

red cells have already lost their nuclei, but contain adequate mRNA, as well as

complete translation machinery, for extensive globin synthesis. They are often

treated with nuclease to reduce background and increase efficient utilization of

exogenous RNAs.

Wheat germ extract – has low background incorporation due to its low level of

endogenous mRNA. Recommended for translation of RNA containing small

fragments of double-stranded RNA or oxidized thiols, which are inhibitory to the

rabbit reticulocyte lysate.

E. coli cell-free system – simple translational apparatus with less complicated

control at the initiation level, allowing this system to be very efficient in protein

synthesis. Bacterial extracts are often unsuitable for translation of RNA, because

exogenous RNA is rapidly degraded by endogenous nucleases. However, E. coli

extracts are ideal for coupled transcription-translation from DNA templates.

Linked transcription-translation:

Transcription with bacteriophage polymerase in prokaryotic system followed by

translation in eukaryotic system.

Coupled transcription-translation:

Simultaneously in E. coli extract, one-step reaction in vitro which results in efficient

expression of either prokaryotic or eukaryotic gene products.

Important elements in DNA for translation – eukaryotic requires 7-methyl-GTP 5’ cap, 3’

poly A tail and Kozak sequence while prokaryotic require Shine-Dalgarno sequence (RBS).



Site-directed mutagenesis (SDM) – alteration of amino acids at a given position.

Conventional mutagenesis is random, results in multiple possible mutations where

most are detrimental.

SDM allows for the characterization of the dynamic and complex relationships

between protein structure and function, study of gene expression elements and

vector modification.

Requires:

DNA sequence to determine which codons to alter.

3D structure of protein or bioinformatics to determine which amino acids

are candidates for modification in relation to active site, protein stability and

regulatory elements.

Oligonucleotide directed mutagenesis requires templates:

M13 ssDNA – single-stranded bacteriophage, but double-stranded in its replicative form.

Use an oligonucleotide that is complementary to the region around the target codon but carries the desired change.

Theoretically, half of the phage should be mutants, but in reality only 1-5% are mutated due to DNA repair mechanisms.

Improved method for M13 vector – grow the M13 vector in a mutant E. coli strain carrying two mutations:

dut – defective dUTPase, which elevates the intracellular level of dUTP so that some is incorporated into DNA.

ung – defective uracil N-glycosylase, which prevents uracil from being removed from DNA.

Thus ~1% uracil exists in the template DNA. When the DNA is transformed back into wild-type E. coli (ung+), the uracil is removed and the template strand is degraded, whereas the mutated strand, which contains no uracil, is not degraded. This lowers the possibility of DNA repair at the mutagenesis site.

Note that it is inconvenient to work with M13 phage as it requires many steps.

Plasmid dsDNA – preferred method for mutagenesis as it is specific and quicker, and any kind of DNA is usable.

Introduce a mutagenic oligonucleotide in PCR to generate mutants.

In 2-step overlap PCR, the first round of PCR introduces the mutation into two overlapping halves of the gene, and the second PCR joins them to give a clonable mutated gene.

QuikChange II site-directed mutagenesis – commercial product which allows for mutagenesis in any double-stranded plasmid.

Quick three-step procedure with a mutation efficiency of greater than 80% in a single reaction.



Requires:

2 synthetic oligonucleotide primers containing desired mutation.

High-fidelity (HF) DNA polymerase which extends primer with highest

fidelity.

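DpnI endonuclease, which digests the parental (methylated) DNA template.

Note that dam- strains are not suitable for preparing the template plasmid, since their DNA is unmethylated and would not be digested by DpnI.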
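The two complementary mutagenic primers listed under "Requires" can be sketched as follows. This is a minimal illustration, not the kit's design procedure: the flank length, the function names and the example template are assumptions, and real primer design would also check melting temperature and GC content.

```python
def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(dna))


def mutagenic_primers(template, codon_index, new_codon, flank=12):
    """Design a complementary primer pair that replaces one codon.

    codon_index is 0-based and counted in the reading frame that starts at
    template[0]. The new codon sits in the middle of the forward primer,
    with `flank` template bases on either side.
    """
    pos = codon_index * 3
    if len(new_codon) != 3:
        raise ValueError("new_codon must be 3 nt long")
    if pos - flank < 0 or pos + 3 + flank > len(template):
        raise ValueError("not enough flanking sequence around the codon")
    forward = template[pos - flank:pos] + new_codon + template[pos + 3:pos + 3 + flank]
    return forward, reverse_complement(forward)


if __name__ == "__main__":
    gene = "ATGGCTAGCAAAGGTGAAGAACTGTTTACCGGT"  # hypothetical coding sequence
    # Change codon 5 (GAA, Glu) to GCG (Ala) with 9-nt flanks.
    fwd, rev = mutagenic_primers(gene, codon_index=5, new_codon="GCG", flank=9)
    print(fwd)
    print(rev)
```

Both primers carry the mutation, so every newly synthesised strand is mutant, while the methylated parental strands are left for DpnI to digest.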

Lecture 4 – Molecular Evolution

Enzymes are adapted to function optimally in living cells for the conversion of natural

substrates, metabolic control and rapid turnover.

Since they generally have limited stress tolerance, a natural variant may not perform

perfectly in an industrial process because of the distinct conditions and demands.

Thus laboratory evolution methods are required to fine-tune the selectivity and activity of enzymes.

Differs from natural evolution as it is directed towards a functional goal –

think of breeding.

Have been successfully applied in:

Protein/ligand binding – antibody detection, stronger ligand binding

Improving protein stability – heat and solvent stability

Modifying enzyme selectivity – accepting other substrates

The rate of evolution of a single gene can indeed be accelerated under in vitro selective pressure through the generation of a new and more efficient functional variant of the same gene. General approach: construct a library of variant genes and screen/select the protein products of the genes.



Library construction method:

Random mutagenesis – introduce changes throughout the gene, useful if you don’t know which mutation to make.

Use physical (UV) or chemical mutagens (ROS)

Error prone PCR - add Mn2+ and biased concentrations of dNTPs

along with error prone DNA polymerases (e.g. Mutazyme or Taq).

Mutator strains containing error-prone DNA polymerases

Directed methods – randomize only at a specific position, useful if you know

the area of interest.

Recombination (chimeric) methods – bring existing sequence diversity

together in novel combinations, either from point mutants or from different

parental DNA sequences, results in overall structural change.

DNA shuffling – use DNaseI digestion on dsDNA to generate small

fragments which act as overlapping primers where they are

randomly annealed to obtain full-length DNA. Factors involved:

♦ Similarity of genes selected

♦ Size of DNA fragments

♦ Annealing temperature

Staggered extension process (StEP) where small segments are added

to the end of a growing DNA strand in a series of very short extension

steps (extension time is varied)

Random chimeragenesis on transient templates (RACHITT) which

produce chimeras with a much larger number of crossovers.
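Of the library construction methods above, random mutagenesis is the simplest to sketch in code. The toy model below copies a template with a fixed per-base error rate; uniform substitution to any other base is an assumption made for simplicity (real error-prone polymerases have biased mutation spectra), and the function names and error rate are hypothetical.

```python
import random

BASES = "ACGT"


def error_prone_copy(template, error_rate, rng):
    """Copy a DNA template, substituting each base with probability error_rate."""
    copy = []
    for base in template:
        if rng.random() < error_rate:
            # Assumed: every other base is equally likely as the replacement.
            copy.append(rng.choice([b for b in BASES if b != base]))
        else:
            copy.append(base)
    return "".join(copy)


def make_library(template, size, error_rate, seed=0):
    """Return `size` independently mutated copies of the template."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    return [error_prone_copy(template, error_rate, rng) for _ in range(size)]


if __name__ == "__main__":
    gene = "ATGGCTAGCAAAGGTGAAGAACTG"  # hypothetical gene fragment
    for variant in make_library(gene, size=5, error_rate=0.05):
        changes = sum(a != b for a, b in zip(gene, variant))
        print(variant, changes)
```

Each variant would then be expressed and screened; the directed and recombination methods differ only in how the variants are generated.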

Example of DNA shuffling:

4 genes from 4 microbial species encoding class C cephalosporinases which are 58-

82% identical at DNA level were shuffled either individually or as a pool where the

transformants with resultant libraries were screened for antibiotic moxalactam

resistance.

Single gene shuffling only resulted in 8-fold increased resistance while multi-

gene shuffling resulted in 270-540 fold increase in resistance.

By mixing several gene sequences, the enzyme structure can change

which may prove to be more effective than changing sequences on a

single gene.

In viruses, multiple parental sequences of a viral glycoprotein used in a viral vector can be shuffled to obtain vectors with altered tropism.



Directed evolution – majority of reported experiments are a combination of error-prone

PCR and DNA shuffling.

E.g. GFP gene was amplified by ep-PCR and cut into 50-300bp pieces. They were

then assembled by second PCR without primer (random annealing) and the

products were cloned into an expression vector.

Selection of resulting clones was done by FACS and the most fluorescent cells

(> 100 fold) were amplified, sorted, characterized and sequenced.

Resulted in the creation of eBFP, eCFP, eGFP, eYFP and dsRED.

E.g. Engineering of P450 BM3 from Bacillus megaterium to metabolize hydrocarbons

Medium chain fatty acid monooxygenase heme enzyme.

Single polypeptide chain containing hydroxylase domain and reductase

domain.

Upon mutagenic PCR, StEP and ep-PCR, the mutant obtained was able to use

other hydrocarbons as substrate with higher maximum turnover rate as

compared to the wild type.

General strategy for large scale analysis of protein function:

DNA library → express proteins → select/screen proteins for desired function → isolate DNA and select DNA sequence → amplify or mutate for improvements → repeat

Lecture 5 – Display Technology

Genetic material is physically associated with the proteins for selection/screening in library

and thus the success of display selection relies on the ability to retrieve the genetic

information along with the functional protein.

In vivo display – based on M13 and phagemid-based cloning system

Mutated gene is inserted into M13 g3p (pilus) or g8p (surface coat protein) gene to

form C-terminal fusion protein, and then transformed into E. coli.

By fusing proteins to pIII (>50aa) and pVIII (6-30aa) gene, they can be displayed on

the phage. Note that this only works for small proteins/peptides, bigger proteins

would interfere in phage assembly.

In screening the phage library assuming they express a scFv coupled to a HA tag,

incubate them in an antigen coated well.

Eluted phages can then be used for enrichment by infecting E. coli again to generate a secondary library. Use ELISA to test for binding affinity.



Single-Chain Fv (variable fragment) is the favoured form to be displayed.

In construction of antibody libraries, DNA sequences encoding VH and VL domains

are amplified by PCR and paired randomly. The scFv sequences are amplified by PCR

using primers incorporating restriction sites and then cloned into the phagemid

vector on pIII.

In vitro display – cell free display system through formation of stable protein-ribosome-

mRNA (PRM) complexes.

In this system, stable PRM complexes and correct folding of protein needs to be

established.

The stop codon can be removed to ensure that the protein does not dissociate from the ribosome.

Advantages:

Larger screening capacities (10^14/mL) - probably the smallest system since ribosomes are used.

No limit on transformation efficiency

PCR products can be utilized

Able to handle toxic or proteolytically sensitive proteins and modified amino acids, which may not be possible in vivo.

Can be used for the improvement of stability and activity of proteins, antibody

engineering and generation of new multidomain/multifunctional proteins.



Lecture 6 – Antibodies and siRNA

Polyclonal antibodies are those collected from serum of exposed animals which can

recognize multiple antigenic sites of injected biochemical.

Monoclonal antibodies (mAbs) are secreted by cloned and cultured individual B lymphocyte hybridomas and collected from the culture medium; they recognize only one antigenic site of the injected biochemical.

Hybridoma technology is a method for producing large numbers of identical antibodies

(also called monoclonal antibodies).

This process starts by injecting a mouse (or other mammal) with an antigen that

provokes an immune response. B cells will produce antibodies that bind to the

injected antigen and these newly produced cells are then harvested.

These isolated B cells are in turn fused with immortal B cell cancer cells, a myeloma

to produce a hybrid cell line called a hybridoma, which has both the antibody-

producing ability of the B-cell and the exaggerated longevity and reproductivity of

the myeloma.

B cells are HGPRT+ (HGPRT plays a central role in the generation of purine nucleotides through the purine salvage pathway) while the myeloma cells are HGPRT- and thus cannot utilize the salvage pathway.

When grown in HAT (hypoxanthine, aminopterin and thymidine) medium, which inhibits de novo synthesis of nucleotides, myeloma cells are killed because, lacking HGPRT, they cannot switch over to the salvage pathway.

Aminopterin inhibits the de novo pathway, which is required for cell division, while hypoxanthine and thymidine provide the substrates for the salvage pathway, provided the right enzymes are present.
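The logic of HAT selection can be written as a two-condition rule: a cell keeps growing only if it is immortal and can use the salvage pathway. The sketch below encodes that rule; the assumption that unfused B cells are lost simply because they are not immortal in culture is added here, and the function and variable names are hypothetical.

```python
def survives_hat(immortal, hgprt_positive):
    """A cell line persists in HAT medium only if it is immortal and HGPRT+."""
    return immortal and hgprt_positive


# (immortal, HGPRT+) for each cell type present after the fusion step.
CELL_TYPES = {
    "unfused B cell": (False, True),    # has HGPRT but a limited lifespan
    "unfused myeloma": (True, False),   # immortal but cannot use salvage
    "hybridoma": (True, True),          # inherits one trait from each parent
}


def select_on_hat(cell_types):
    """Return the names of the cell types that survive HAT selection."""
    return [name for name, traits in cell_types.items() if survives_hat(*traits)]


if __name__ == "__main__":
    print(select_on_hat(CELL_TYPES))
```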

In vitro culture would be less concentrated and contains bovine serum while ascites

fluid (from mouse) will contain high concentration with minor contamination of

mouse Ig.

The mAbs can be purified via affinity purification using epitope (the part of an

antigen molecule to which an antibody attaches itself).

In scFv fragments, the variable domains are joined C-terminus to N-terminus by a linker peptide to stabilize them. Their size and specificity may allow for attachment to cryptic sites. They can be selected through phage display technology.



Testing of mAb production by enzyme linked immunosorbent assay (ELISA)

Direct sandwich method for testing antigen: antibody + antigen + enzyme-linked

antibody

Indirect method for testing antiserum: antigen + antibody + enzyme-linked antibody

With enzyme-linked antibody, substrate is added and reaction produces a visible

colour change.

PNPP (p-Nitrophenyl Phosphate, Disodium Salt) is a widely used substrate

for detecting alkaline phosphatase in ELISA applications.

TMB (3,3',5,5'-tetramethylbenzidine) soluble substrates yield a blue color

when detecting HRP.

Monoclonal antibodies can be used for:

Protein purification in affinity chromatography

Identification and isolation of cell subpopulations using FACS – fluorescent

antibodies bind to surface markers on cells

Tumor detection, imaging and killing – select antibodies from phage display library

using antigen from cancer patient serum.

E.g. mAbs can be tagged with radioactive tracer which when injected into the

body will localize in areas of recurrent carcinoma cells.

Molecular drugs – against TNF-α (septic shock), CD3 (transplants e.g. anti-CD3 mAbs such as Muromonab which eliminates graft vs. host reaction), IL-2R (leukaemia/lymphoma), anti-venoms (snake bites), viruses (infections)

E.g. Metastatic breast cancer patients overexpress HER-2 (EGF) receptor and thus anti-HER mAbs can be used to block EGF binding to HER-2, which slows cell growth.

E.g. fuse toxin to antibody to generate immunotoxin, only the targeted

tumour cells are killed by the toxin.

Problems arising in mAb production:

Repeated immunisation required in mice, lengthy procedure which involves

recovery of B cells and generation of hybridomas.

Humans can generate immune response to mouse antibodies.

Human mAb production will require large blood volumes/excised lymph nodes and

thus ethical difficulties, cell lines are also often unstable.

Thus instead of producing human antibodies, we produce chimeric human-mouse

antibodies which can function the same way.

Xenomouse – genetically engineered mouse where murine IgH and IgK loci

are replaced with human Ig counterparts.

Method works as human Ig transgenes carry majority of the variable

repertoire and can undergo class switching from IgM to IgG isotypes.

Antisense oligonucleotides can be used to selectively suppress and shut down a target mRNA:

When directed to terminus of 5’ UTR, no initiation due to prevention of ribosome

binding.

When directed to downstream of 5’ UTR, no translation occurs

When oligomer directed to splice site, no splicing occurs.

When oligomer directed to critical region of RNA domain of ribonucleoprotein (e.g.

telomerase), it inhibits the activity.



E.g. Protein kinase C (PKC) is a central enzyme in tumor progression. The use of ISIS3521 specifically reduces the expression of a PKC isozyme.

Disadvantages include:

Limited efficacy

Poor specificity

Platelet toxicity

Overall down regulation of gene expression resulting in increased cell

invasiveness.

Solution lies in siRNAs (small interfering RNAs), which are duplexes of 19-21-nucleotide RNAs with symmetric 2-nucleotide 3’ overhangs.

Their precursors are transcribed as longer RNAs, and the siRNAs are produced when DICER (an endonuclease) cuts these into short pieces.

Binds to mRNA in the RISC complex which causes mRNA degradation, thus silencing

the gene.

Been shown to be able to suppress expression of GFP in oocytes.

Been shown to be able to fight HIV virus by silencing the gag gene which encodes for

an essential HIV core protein p24.

Guidelines for siRNA design (note that 50% of them give >50% silencing while 25% give >

70% silencing, thus need to test with at least 4 to be sure):

1. Find occurrences within mRNA with “AA” dinucleotide overhangs

2. Capture following 19 nucleotides

3. GC content should be 30-50%

4. BLAST search to find sequences with low homology to other genes
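Steps 1 to 3 of these guidelines can be sketched as a simple scan over the mRNA. The sketch below is an illustration only: the function names and the example sequence are hypothetical, and step 4 (the BLAST search for low homology to other genes) is not performed.

```python
def gc_fraction(seq):
    """Return the fraction of G and C bases in a sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def sirna_candidates(mrna, gc_min=0.30, gc_max=0.50):
    """Find 'AA' dinucleotides followed by 19 nt with 30-50% GC content.

    Returns a list of (index of the AA, 19-nt target sequence).
    """
    mrna = mrna.upper().replace("T", "U")
    hits = []
    for i in range(len(mrna) - 20):      # room for AA plus 19 nt
        if mrna[i:i + 2] == "AA":
            target = mrna[i + 2:i + 21]
            if gc_min <= gc_fraction(target) <= gc_max:
                hits.append((i, target))
    return hits


if __name__ == "__main__":
    toy_mrna = "GGAAGCAUGCAUUAGCAUUAGCACC"  # hypothetical mRNA fragment
    for index, target in sirna_candidates(toy_mrna):
        print(index, target, round(gc_fraction(target), 2))
```

Since only a fraction of designs silence well, several candidates from such a scan would be carried forward for testing.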

Methods to produce siRNAs:

In vitro:

Chemical synthesis

In vitro transcription

RNase III/DICER digestion of long dsRNA

In vivo:

Plasmids, PCR Templates (siRNA expression cassettes), viral vectors.

Consist of promoter, siRNA template (hairpin) consisting of sense +

antisense strand and termination signal (3-5Ts or polyA signal)

Use of plasmid vectors eliminates the need to work with RNA and can produce large quantities, but it requires cloning and is thus more laborious.

Use of siRNA expression cassettes involves a three-step PCR method and thus skips cloning, but time is needed to optimize the PCR conditions.



Lecture 7 – Genome Editing

Genomic editing is the introduction of targeted genomic sequence changes including

targeted deletions, insertions and precise sequence changes into living cells and organisms.

First step is to create a DNA double-stranded break (DSB) which can be repaired by NHEJ or

HDR.

Systems for inducing DSBs initially relied on protein-based systems with customizable DNA-binding specificities, such as zinc finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs).

Both use protein-DNA interactions for targeting and have extended DNA recognition sequences (14 to 40 bp).

The construction of engineered zinc finger arrays is difficult, while the highly repetitive nature of TALEN-coding sequences is a problem for delivery using viral vectors.

More recently, the bacterial CRISPR-associated protein 9 (Cas9) from Streptococcus pyogenes, an RNA-guided nuclease, has been developed as an editing tool.

Use simple base-pairing rules between engineered RNA and target DNA site.

CRISPR systems are adaptable immune mechanisms used by bacteria to protect

themselves from foreign nucleic acids such as viruses or plasmids.

Requires either a crRNA/tracrRNA hybrid or a single gRNA bound to the Cas9 protein; cleavage

occurs where the target DNA matches the guide and lies next to a PAM (NGG) site.

Each nuclease domain in Cas9 nicks one strand: the HNH domain cleaves the DNA strand that is

complementary to and recognized by the gRNA, while the RuvC domain cleaves the other strand.

Cas9 has been shown to be able to insert/delete base pairs, insert/replace

sequences, delete/rearrange sequences.
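The targeting rule above (a guide-matching sequence next to an NGG PAM) can be illustrated by scanning a DNA sequence for candidate sites on both strands. This is a minimal sketch: the function names are hypothetical, the 20-nt protospacer length is an assumption not stated in these notes, and only the canonical NGG PAM is considered.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]


def find_target_sites(dna, protospacer_len=20):
    """Return (strand, protospacer, pam) for every protospacer followed by NGG."""
    dna = dna.upper()
    # A lookahead is used so that overlapping sites are all reported.
    pattern = r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len
    sites = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for match in re.finditer(pattern, seq):
            sites.append((strand, match.group(1), match.group(2)))
    return sites


# Toy sequence (made up for illustration only)
print(find_target_sites("ACGTACGTACGTACGTACGTTGG"))
```

Each protospacer returned is written 5' to 3' on the strand where its PAM is found.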

By deactivating both nuclease domains, catalytically dead Cas9 (dCas9) can be used for:



Gene activation by fusing it to an activation domain

DNA modification by fusing it to an effector domain

Imaging of a genomic locus by fusing it to GFP

Parameters to evaluate genome editing tool:

Targeting efficiency – % of desired mutation achieved

Cas9 (>70% in zebrafish) performs much better than TALENs and ZFNs

(1-50% in human cells)

Off-target mutations – likely to appear at sites that differ by only a few

nucleotides from the intended target sequence, as long as they are adjacent to a PAM

sequence.

Cas9 can tolerate up to 5 base mismatches within the protospacer region or

a single base difference in PAM sequence

Off target mutations are hard to detect as whole genome sequencing is

required.

To reduce off-target mutations:

Use truncated gRNA or add two extra G nucleotides at the 5’ end

Use paired nickases – two sgRNAs complementary to adjacent regions on

opposite strands. The two offset nicks together form a DSB at the target, but

each sgRNA alone creates only a single nick at off-target locations, so off-target

mutations are minimal.

Use web based tools to facilitate identification of potential CRISPR

target sites and assess their potential for off-target cleavage.
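The idea behind such off-target assessment can be sketched as a mismatch count against every PAM-adjacent site. This is a simplified illustration, not how any particular web tool works: the function names are hypothetical, only the given strand and the canonical NGG PAM are scanned, and the default of 5 mismatches is taken from the tolerance quoted above.

```python
def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def potential_off_targets(guide, genome, max_mismatches=5):
    """Return (position, site, mismatch count) for NGG-adjacent sites near the guide.

    The on-target site, if present, is reported with 0 mismatches.
    """
    guide, genome = guide.upper(), genome.upper()
    n = len(guide)
    hits = []
    for i in range(len(genome) - n - 2):
        site, pam = genome[i:i + n], genome[i + n:i + n + 3]
        if pam[1:] == "GG":                 # canonical NGG PAM only
            mm = mismatches(guide, site)
            if mm <= max_mismatches:
                hits.append((i, site, mm))
    return hits


# Toy guide and sequence, shortened to 10 nt for readability (made up)
toy_genome = "ACGTACGTACTGG" + "TTT" + "ACGAACGTACAGG"
print(potential_off_targets("ACGTACGTAC", toy_genome))
```

A fuller check would also scan the reverse strand and allow a single-base difference in the PAM, as noted above.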

Current applications of CRISPR/Cas9:

Has already been used in many cell lines and organisms such as human, bacteria,

zebrafish, C. elegans, plants, Xenopus tropicalis, yeast, Drosophila, monkeys, rabbits,

pigs, rats and mice.

Single point mutations in a particular target gene via single gRNA.

Induce large deletions or rearrangements such as inversions or translocations using

a pair of gRNA-directed Cas9 nucleases.



Using dCas9 to target protein domains for transcriptional regulation, epigenetic

modification and visualization of genome loci.

Enables rapid genome-wide interrogation of gene function by generating large gRNA

libraries for genomic screening.



Lecture 8 – Structure-based Protein Design and Engineering

Protein engineering is needed for better catalysts in the industry, and as therapeutic agents.

We want to manipulate proteins in a controlled and rational way. Thus we need to know the

principles and mechanisms such as structure, folding and stability as well as catalysis before

using molecular biology to engineer it.

E.g. The dengue virus genome is translated as one long polyprotein which gets cleaved by

proteases. The genome encodes 10 proteins, of which 3 are structural proteins (coat

and RNA delivery) and the remaining 7 are non-structural proteins (production of

new viruses).

Structural studies reveal that NS3 protease is an intrinsically disordered

chymotrypsin fold which requires NS2B for correct folding and functional dynamics.

Solution conformations of NS2B and NS3 proteins show that they can be inhibited by

natural products from edible plants.

Proteins are made up of 20 natural amino acids which are hydrophobic, charged or polar.

Their functions are defined by their 3D structures, built from elements such as random coil, alpha helix and beta sheet.

Protein folding is spontaneous and starts from the random coil state.

Anfinsen experiment on ribonuclease:

Adding urea and mercaptoethanol to ribonuclease results in an inactive enzyme

with reduced disulfide bonds.

Removal of urea first allowed protein to reform into its native state,

following which removal of mercaptoethanol allowed the correct disulfide

bonds to form.

Removal of mercaptoethanol first caused the wrong disulfide bonds to form,

following which when urea was removed, the enzyme was inactive.

This suggests that the amino acid sequence determines the folding of the

protein where it folds to reduce the Gibbs free energy of the whole system.
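The link between free energy and folding can be made concrete with a two-state calculation. This is a minimal sketch that assumes only folded and unfolded states exist (an assumption, not something stated in these notes); the function name and the example values are hypothetical.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol*K)


def fraction_folded(delta_g_fold, temperature=298.15):
    """Fraction of molecules folded for a two-state protein.

    delta_g_fold is G(folded) - G(unfolded) in kJ/mol; negative favours folding.
    """
    k_fold = math.exp(-delta_g_fold / (R * temperature))  # [folded]/[unfolded]
    return k_fold / (1.0 + k_fold)


# Illustrative free energy differences in kJ/mol (made up)
for dg in (-20.0, 0.0, 20.0):
    print(dg, round(fraction_folded(dg), 4))
```

A more negative free energy of folding gives a larger folded population, and at zero the two states are equally populated.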

Folding models:

Framework model – secondary structures formed first which further pack into final

structure with well-defined side chain packing through diffusion.

Nucleation model – most stable secondary structure formed first, folding starts at

nucleation site and spreads throughout protein.



Hydrophobic collapse & “molten globule” model – more compact state with

hydrophobic side chains inside.

Folding funnels – many possible pathways to native state, however there will only

be one global minimum.

Driving forces for protein folding:

Hydrophobic effect is the main driving force: hydrophobic side chains

cluster and exclude water, which releases the ordered water cages that surround them in

the unfolded state and so lowers the free energy.

Hydrogen bonds, electrostatic interactions (salt bridge) and chemical cross links

such as disulfides or metal ions also stabilize protein structure.

Some comments:

Are there any exact rules for H-bonds as a basis for folding? We can remove an H-bond via

site-directed mutagenesis (SDM) to test, but there is still no exact answer because this is

context dependent.

The entire system of H-bonds must be considered for the entire structure. Removing

one may cause the formation of another H-bond etc. Can we do a systems approach

for protein folding?

Introducing disulfide bonds may not always stabilize the protein. Stability

may be higher, but the protein may not fold into the desired conformation.

Each mutation does not just affect one area; it will affect many other factors such as

van der Waals interactions, electrostatic interactions etc.

Misfolded proteins can be inserted into membranes due to the formation of helices/loops, which

may be caused by the environment, and this may lead to toxicity.



Lecture 9 – Intrinsically disordered proteins and protein-protein Interactions

Many gene sequences in eukaryotic genomes encode entire proteins or large segments of

proteins that lack a well-structured three-dimensional fold.

Disordered regions can be highly conserved between species in both composition

and sequence.

Most are functional where many disordered segments fold on binding to their

biological targets (coupled folding and binding) whereas others constitute flexible

linkers that have a role in assembly of macromolecular arrays

E.g. CREB-binding protein binding to CREB.

Well folded proteins have high complexity sequences but they only make up 50% of

all proteins. IDPs are hard to study as they get rapidly degraded by proteases.

Hydrophobicity and mean net charge can help to determine if a protein is

natively unfolded or folded.

IDPs are characterised by high net charge and low hydrophobicity.

IDPs function by coupled folding and binding: the IDP folds upon binding to a partner protein.

E.g. IDP binds to PAK4 and forms a helix structure.
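The hydrophobicity and mean net charge criterion above can be sketched as a simple calculation. This is an illustration under several assumptions not given in these notes: hydrophobicity is taken as the Kyte-Doolittle scale rescaled to 0-1, charge counts only K/R as +1 and D/E as -1 (histidine is ignored), and the boundary line R = 2.785 H - 1.151 is the charge-hydropathy relation commonly quoted in the IDP literature. All function names are hypothetical.

```python
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydrophobicity(seq):
    """Mean Kyte-Doolittle hydropathy, rescaled so each residue lies in 0-1."""
    return sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in seq) / len(seq)


def mean_net_charge(seq):
    """Absolute net charge per residue (K/R = +1, D/E = -1; assumption)."""
    charge = sum(aa in "KR" for aa in seq) - sum(aa in "DE" for aa in seq)
    return abs(charge) / len(seq)


def predicted_disordered(seq, slope=2.785, intercept=-1.151):
    """True if the sequence falls on the high-charge, low-hydrophobicity side."""
    seq = seq.upper()
    return mean_net_charge(seq) > slope * mean_hydrophobicity(seq) + intercept


# Toy sequences (made up for illustration only)
print(predicted_disordered("EEKEEDEESEE"), predicted_disordered("LLVIALLVAIL"))
```

Such a two-parameter test is only a rough first filter; it does not use sequence order or complexity.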

Infectious (misfolded) prions can cause a change in secondary structure in normal prion proteins upon contact.

Prion-like domains have low complexity sequence enriched in polar and uncharged

amino acids such as Gln, Asn, Ser, Gly and Tyr.

Generally involved in neurodegenerative diseases such as mad-cow disease and

Creutzfeldt-Jakob disease.

Liquid-liquid phase separation (LLPS) is the principle behind the formation of membrane-

less organelles.

ATP concentration is very high in the cell and thus it is suggested that ATP helps to

solubilize hydrophobic molecules in aqueous solutions, acting as a bivalent binder

(binding to different parts) and thus preventing formation of protein aggregates.

Phase-separated droplets dissolve at high ATP concentrations.

Lack of ATP can also enhance aggregation.

Proteins can interact with other proteins, nucleic acids and small molecules at genome,

proteome and metabolome level.



Focus is at the proteome level where protein-protein interactions are examined

Scaffolding proteins are very important for integrating signals from upstream.

Driving forces for protein-protein interactions are the same as those for protein

folding

Protein interfaces are diverse in size, shape, composition and solvent content and thus no

reliable method is available so far to detect protein-protein interaction interfaces.

However there are two categories which are well studied – enzyme protein

substrate interactions and pure protein-protein interactions.

Protein-ligand interaction can be described by 3 theories:

Lock and key model – specificity of enzyme where substrate has

specific complementary shape that fits active site.

Induced fit hypothesis – some complexes have different

conformation from unbound state as the bound conformations are

induced by the binding partner.

Conformational selection and population shift – biomolecules exist in

dynamic ensembles of conformations; during binding, the pre-existing

conformers that are most complementary to the ligand are

preferentially bound. This disturbs the conformational

equilibrium, resulting in a population shift towards those conformers until equilibrium is

restored. Note that in this model the bound conformation already exists before binding and is selected, rather than induced, by the ligand.
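The population shift described above can be illustrated with a small equilibrium calculation. This is a sketch under simplifying assumptions not taken from these notes: the protein has just two conformations, the ligand binds only the binding-competent one, and the free ligand concentration is treated as fixed. The function and parameter names are hypothetical.

```python
def competent_fraction(k_conf, ligand, kd):
    """Fraction of protein in the binding-competent conformation (free or bound).

    k_conf: [P*]/[P] in the absence of ligand
    ligand: free ligand concentration
    kd:     dissociation constant of the P*-ligand complex (same units as ligand)
    """
    # Statistical weights relative to the non-competent state P (weight 1).
    w_free = k_conf
    w_bound = k_conf * ligand / kd
    return (w_free + w_bound) / (1.0 + w_free + w_bound)


# Illustrative values (made up): a rare conformer that binds tightly
for ligand in (0.0, 1.0, 10.0, 100.0):
    print(ligand, round(competent_fraction(0.1, ligand, 1.0), 3))
```

With no ligand the competent conformer is a minor species, and adding ligand pulls the ensemble towards it even though the conformer existed beforehand.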

Design and discovery of molecules to disrupt protein-protein interaction interfaces is a

critical approach to develop therapeutics, either by random screening or by rational design.
