biotech.bio5.orgbiotech.bio5.org/sites/default/files/vaccines_sg_5-17.docx  · web...

Post on 27-Jun-2020

Page 1: biotech.bio5.orgbiotech.bio5.org/sites/default/files/Vaccines_sg_5-17.docx  · Web view2020-05-19 · OC43 Common cold symptoms (Beta CoV) HKU1 Common cold symptoms (Beta CoV) 229E

Creating a vaccine against COVID-19

What is a vaccine? The goal of a vaccine is to safely create an immune response in an individual so that they are protected against an infectious disease or pathogen. There are different types of vaccines:

● Whole-Pathogen Vaccine: Traditionally, vaccines consisted of the entire infectious agent, either killed or weakened so that infection does not occur. Inactivated vaccines are usually pathogens that have been killed with chemicals, heat or radiation but are still capable of causing an immune response to specific aspects of the pathogen. The injected polio vaccine is generally an inactivated vaccine. Influenza vaccine can also come in an inactivated form; inactivated vaccines generally produce a weaker immune response than live vaccines, but they are the form given to pregnant women and individuals who are immunocompromised. Live-attenuated (weakened) vaccines are more effective in terms of immune response. These are live pathogens that have been weakened so that they are less pathogenic. The measles, mumps and rubella vaccines are generally live vaccines. The immune response is robust, but these vaccines may not be safe for some individuals; on rare occasions the vaccine may unintentionally cause the disease.

● Subunit Vaccines: These vaccines use small pieces of the pathogen for immune identification. This could be part of the protein surface of a virus, such as the Spike Protein, which is on the outside of the coronavirus (CoV). Many bacteria have carbohydrate molecules on their surface that do not, on their own, produce a strong immune response. A conjugate formed between the carbohydrate and a toxin can increase the immune response and the recognition of the bacterial carbohydrate.

● Nucleic Acid Vaccines: Current vaccine efforts include the use of nucleic acids that encode the information to make the antigen, which is then used for the immune response. The nucleic acid can be either DNA carrying the gene(s) encoding the antigen, or mRNA, which can be used directly by the cell to create the antigen. Taking the Spike Protein of SARS CoV-2 as an example of a DNA vaccine, the gene encoding the Spike Protein would be placed into a vector, which is usually a nonpathogenic virus that will infect host cells. The DNA has to enter the nucleus, where it may be incorporated into the host DNA, and be transcribed into mRNA, which is then translated into protein, the antigen. Vaccines made of mRNA, in contrast, can be used immediately for protein synthesis and antigen production. mRNA molecules are delivered to the host cell in a variety of ways that do not involve a virus, and since incorporation into the nuclear DNA is not involved, mutagenesis is not a side effect. The disadvantage of mRNA is the instability of the molecule. Some delivery methods add stabilizing agents.
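The DNA to mRNA to protein pathway described for nucleic acid vaccines can be illustrated with a minimal Python sketch. The short `toy_gene` below is an invented example, not the real Spike Protein gene, and the codon table is deliberately limited to the few codons it uses.

```python
# Partial codon table: only the codons needed for this toy example.
CODON_TABLE = {
    "AUG": "M",  # methionine (start)
    "UUU": "F",  # phenylalanine
    "GUU": "V",  # valine
    "CUU": "L",  # leucine
    "UAA": "*",  # stop
}


def transcribe(dna):
    """Turn a coding-strand DNA sequence into mRNA (T becomes U)."""
    return dna.upper().replace("T", "U")


def translate(mrna):
    """Read the mRNA three bases at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy gene (an assumption for illustration only).
toy_gene = "ATGTTTGTTTTTCTTGTTTAA"
mrna = transcribe(toy_gene)
protein = translate(mrna)
print(mrna)
print(protein)
```

A DNA vaccine has to go through both steps inside the cell, while an mRNA vaccine starts at the `translate` step.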

Using the Spike Protein of SARS CoV-2 as the target for a vaccine, which of these vaccines could be used? You can have more than one answer.

This lesson will look at Spike Protein sequences and determine if this would be a good choice for a SARS CoV-2 vaccine. What makes something a good choice?

1. It must provoke an immune response: it needs to be recognized by the immune system as foreign.

2. It should be something that won’t change (mutate) into a new strain. For example, influenza has close to 200 different strains, though multiple strains have proteins that are similar enough to be recognized by the immune system.

To start this lesson, you will need a DNA Subway account. When you register for a DNA Subway account, you will be directed to the CyVerse website. Once you have registered, you can log into DNA Subway. You will be using the Blue Line (Determine Sequence Relationships) for this analysis.

We will look at the sequence of the Spike Protein, which is on the outside of all coronaviruses. There are only seven different coronaviruses that have infected humans:

● SARS CoV-2, which causes COVID-19 (Beta CoV)
● SARS CoV, the original SARS from 2003 (Beta CoV)
● MERS (Middle East Respiratory Syndrome), outbreak in 2012 (Beta CoV)
● OC43, common cold symptoms (Beta CoV)
● HKU1, common cold symptoms (Beta CoV)
● 229E, common cold symptoms (Alpha CoV)
● NL63, common cold symptoms (Alpha CoV)

How similar are the Spike Proteins between all of the seven CoVs?

Part 1: Identifying and copying the Spike Protein translations for all seven Coronaviruses that infect humans.

In order to compare the Spike Proteins of all seven coronaviruses, you will use the NCBI (National Center for Biotechnology Information) website and Cold Spring Harbor’s DNA Subway.

1. Finding and identifying the protein sequence for the SARS CoV-2 Spike Protein

● Open NCBI in your browser (https://www.ncbi.nlm.nih.gov/).
● Select Genome, on the right-hand side of the page.
● Select Viruses in the center of the new page.
● Click on NCBI Virus.
● Click on “Search by virus”.
● Search for SARS CoV-2 (the taxid is 2697049).

The first results shown are called RefSeq (Reference Sequence) records. Note that the first SARS CoV-2 genome sequence was added on 1-13-2020 and has an accession number (NC_045512). Every sequence uploaded to NCBI is given a unique accession number that serves as a catalog number in the genomic archive known as GenBank. After the reference sequences, the SARS-CoV-2 sequences are listed in chronological order starting with the most recent deposit. The number of deposits is growing daily, with a total of 1,998 on 5-1-2020.

2. Select NC_045512. When you click on NC_045512, a Nucleotide Detail window opens.

Using information in that window, answer the following questions:

1. What is the length of this nucleotide sequence?
2. What is the molecule type?
3. Where and when was the sample collected?
4. What is the host?

3. Select NC_045512 in the Nucleotide Detail window, which will take you to GenBank and the complete record for this viral sequence (Accession NC_045512). This page contains information about publications related to the sequence, amino acid sequences based on translations of Open Reading Frames and genes, and the nucleotide sequence itself.

Notice that the first line is labeled LOCUS. It lists the accession number as well as the genome size in base pairs, and notes that the genome is linear single-stranded RNA:

LOCUS NC_045512 29903 bp ss-RNA linear VRL 30-MAR-2020
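To make the fields of the LOCUS line easier to read, here is a minimal Python sketch that splits the line shown above into labeled parts. It assumes the fields are separated by spaces exactly as printed here; real GenBank files use fixed column positions, so this is an illustration rather than a general parser. The field names in the dictionary are our own labels.

```python
locus_line = "LOCUS NC_045512 29903 bp ss-RNA linear VRL 30-MAR-2020"


def parse_locus(line):
    """Split a LOCUS line (as printed above) into named fields."""
    fields = line.split()
    if len(fields) != 8 or fields[0] != "LOCUS":
        raise ValueError("unexpected LOCUS line layout")
    return {
        "accession": fields[1],       # catalog number in GenBank
        "length_bp": int(fields[2]),  # genome size in base pairs
        "molecule": fields[4],        # ss-RNA = single-stranded RNA
        "topology": fields[5],        # linear or circular
        "division": fields[6],        # VRL = viral sequences
        "date": fields[7],            # date of last modification
    }


locus = parse_locus(locus_line)
for key, value in locus.items():
    print(key, "=", value)
```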

Then there are several journal articles associated with this sequence. If you scroll down the page, you will see a FEATURES section, which details the translated sequences of Open Reading Frames (such as ORF1ab) and genes associated with the virus. At the very bottom of the page, you will see the ORIGIN section showing the nucleotide sequence, which is shown as DNA even though this is an RNA genome.

4. We are interested in the translated amino acid sequence of the Spike Protein. Scroll back to the middle of the page to find the FEATURES of this sequence.

Under FEATURES, find gene “S”, which encodes the Spike Protein, a structural protein.

5. Copy the entire translated amino acid sequence, from MFV…HYT, and paste this sequence into a word document. There will be some spaces in the sequence; this does not matter.

6. Save the sequence in FASTA format, in which each unique sequence is identified with a > followed by a defining name. For example:

>refseq_SARS-CoV-2_NC_045512_2020Jan13
MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFR...to the end of the sequence

Now you have the SARS CoV-2 sequence saved in FASTA format and ready for DNA Subway analysis. Save this as a word document, but keep the document handy for the other six sequences needed. You will follow the same steps for the other six coronaviruses, adding each one to the word document in FASTA format.

1. SARS CoV-2, which causes COVID-19 (Beta CoV)
2. SARS CoV, the original SARS from 2003 (Beta CoV)
3. MERS (Middle East Respiratory Syndrome), outbreak in 2012 (Beta CoV)

4. OC43, common cold symptoms (Beta CoV)
5. HKU1, common cold symptoms (Beta CoV)
6. 229E, common cold symptoms (Alpha CoV)
7. NL63, common cold symptoms (Alpha CoV)

7. The next sequence will be that of the original SARS outbreak in 2003. In NCBI Virus, search by virus and type in SARS. Select the first result, Severe acute respiratory syndrome-related CoV, taxid:694009.

There are over 5000 nucleotide deposits on this site. It is important to realize that SARS CoV-2 sequences are included, but you only want the original SARS from 2003 (Accession NC_004718).

Click the accession number, copy the spike protein translated sequence and paste it in your word document. Be sure to include the FASTA format, and give it a name that you will recognize in DNA Subway.

8. Type MERS into the NCBI Virus search. You will see two reference sequences; feel free to use either one. Click the accession number, copy the spike protein translated sequence and paste it in your word document. Be sure to include the FASTA format, and give it a name that you will recognize in DNA Subway.

9. Search for OC43 and select Human CoV OC43 (HCoV-OC43), taxid:31631. Select the refseq (accession NC_006213), copy the spike protein sequence and add it to your word document in FASTA format.

10. Search for HKU1 and select Human CoV HKU1 (HCoV-HKU1), taxid:290028. Select the refseq (accession NC_006577), copy the spike protein sequence and add it to your word document in FASTA format.

11. Search for 229E and select Human CoV 229E, taxid:11137. Select the refseq (accession NC_002645), copy the spike protein sequence and add it to your word document in FASTA format.

12. Search for NL63 and select Human CoV NL63 (HCoV-NL63), taxid:277944. Select the refseq (accession NC_005831), copy the spike protein sequence and add it to your word document in FASTA format.

You have now completed the first part of the investigation, identifying and copying the Spike Protein translations for all seven Coronaviruses that infect humans.
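If you prefer to prepare the FASTA text with a short script rather than by hand, the steps of Part 1 can be sketched in Python as follows. This is a minimal illustration: the helper names `to_fasta` and `read_fasta` are our own, and the sequence is only the opening fragment of the SARS CoV-2 Spike Protein quoted earlier, not the full protein.

```python
def to_fasta(name, sequence, width=60):
    """Build one FASTA record: a '>' header line, then the sequence.

    Spaces and line breaks copied from GenBank are removed first.
    """
    cleaned = "".join(sequence.split()).upper()
    lines = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


def read_fasta(text):
    """Read FASTA text back into a {name: sequence} dictionary."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


# Fragment only (first 44 amino acids), pasted with GenBank-style spaces.
fragment = "MFVFLVLLPL VSSQCVNLTT RTQLPPAYTN SFTRGVYYPD KVFR"
record = to_fasta("refseq_SARS-CoV-2_NC_045512_2020Jan13", fragment)
print(record)
```

Records for the other six viruses can be added the same way and joined into one block of text to paste into DNA Subway.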

Part 2: Aligning Coronavirus Spike Protein sequences

1. Open DNA Subway, Blue Line. Select project type Protein, select “enter sequences in FASTA format”, then copy your seven FASTA-formatted sequences from your word document and paste them in the window (or upload the file). Give your project a name and continue.

2. View the sequences in the sequence viewer.

3. Align the sequences using MUSCLE.

Select all of the sequences, save, and use MUSCLE to see the alignment. Look at the pattern of the bar-code alignment of the seven sequences. Be sure to scroll to the right for more sequence information. If you see a lot of different lines, these represent differences compared to the consensus sequence; dark gray areas represent deletions of amino acids. The consensus sequence is somewhat arbitrary, in that it represents whichever amino acid is found at that position most often among the sequences you have selected. If you add one more sequence, the consensus could be dramatically different.
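The idea of a consensus sequence, and why it can shift when one sequence is added, can be sketched in a few lines of Python. The five-letter sequences below are invented toy data, and breaking ties alphabetically is our own assumption; DNA Subway may handle ties and gaps differently.

```python
from collections import Counter


def consensus(aligned):
    """Return the most common letter at each position of an alignment.

    All sequences must already be aligned to the same length.
    Ties are broken alphabetically (an arbitrary choice).
    """
    result = []
    for column in zip(*aligned):
        counts = Counter(column)
        top = max(counts.values())
        result.append(min(letter for letter, n in counts.items() if n == top))
    return "".join(result)


# Invented toy alignment (not real Spike Protein data).
base = ["MKVLA", "MKILA", "MRILS", "MRVLS"]
extra = "MRVLS"

before = consensus(base)
after = consensus(base + [extra])
print(before)
print(after)
```

With four sequences, three positions are tied two against two; adding a fifth sequence settles all three ties and changes the consensus at each of those positions.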

Do you see differences between these sequences? How can you explain the differences?

Do you see any similarity between any of the sequences? If yes, for which viruses? Why do you think these sequences are more similar?

4. To see the sequence similarity, select SEQUENCE SIMILARITY %. Determine the similarity between:

1. The two SARS CoVs (SARS CoV and SARS CoV-2).
2. The two Alpha CoVs (229E and NL63).
3. The two Beta CoVs (OC43 and HKU1) that cause the common cold.
4. Where does MERS fit in based on sequence similarity?

5. Build a phylogenetic tree from these sequences.

To build a phylogenetic tree, use an outgroup (a sequence that is evolutionarily very different) as a comparison. Gamma CoV should be different enough from the seven viruses in our analysis to serve as an outgroup for rooting the tree. Go back to NCBI Virus, choose Search by virus, type in Gamma CoV, and select Gamma CoV, taxid:694013. You will see four reference sequences; it should not matter which one you select (the example uses NC_010800). Copy the spike protein sequence and add it to your word document in FASTA format. In Select Data, select the sequences, including the Gamma CoV, run MUSCLE, and then build a tree using PHYLIP ML:

What can you tell about the relationships of these viruses to each other based on the spike protein sequence? What surprised you about the phylogenetic results?

Comparison of the five Beta CoVs Spike Proteins.

Use the data in the DNA Subway project you already have. Select all of the Beta CoVs: OC43, HKU1, MERS, SARS, and SARS-CoV-2. Then save selection and MUSCLE for the alignment.

1. Do any similar features stand out with this alignment?

2. Add Gamma CoV as an outgroup, realign the sequences, and build a tree for this alignment.

Compare the two Alpha CoVs to each other. Go back to Select Data, select just 229E and NL63, save selection and MUSCLE for the alignment.

What do you notice? In what way are the results as you expected them to be and in what ways do the results surprise you?

Compare the Spike Proteins of the two SARS viruses (SARS CoV, SARS CoV-2)

What regions of the protein would you suggest to target for vaccine production? Why?

Compare different SARS CoV-2 Spike Proteins

Return to the NCBI Virus database and search for several SARS CoV-2 sequences. Use the link “Search, retrieve and analyze Severe acute respiratory syndrome CoV-2 (SARS-CoV-2) sequences in NCBI Virus”; if this is not an option on the page, search for the virus by typing in SARS CoV-2 (the taxid is 2697049). Select at least 10 other SARS CoV-2 sequences to add to and compare with your ref seq, which was collected in Dec 2019.

To capture the scope of mutations or differences in SARS CoV-2, select sequences based on time, particularly the date collected, and on location. Notice that after the ref seq, the sequences are in order of most recently released. Some of those may be from another date, for example, Accession number MT370517 sequence was released on 4-23-2020 from a sample that was collected 2-27-2020. Scroll right in the results window to see the collection date. Not every sequence has the collection date information.

Order your sequences based on collection date. Try also to select sequences from samples collected in different locations (China, USA, Italy, etc.). The geographic location is available when you click on the accession number and open the Nucleotide Detail window.
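If you keep your selected sequences in a small table, ordering them by collection date can be sketched in Python as below. Only the MT370517 dates come from this guide; the `EXAMPLE_` accessions and their dates are hypothetical placeholders. Records with no collection date are placed at the end, since release date is not a reliable substitute.

```python
from datetime import date

# MT370517 dates are from the text above; EXAMPLE_* rows are made up.
records = [
    {"accession": "MT370517", "released": date(2020, 4, 23),
     "collected": date(2020, 2, 27)},
    {"accession": "EXAMPLE_1", "released": date(2020, 4, 30),
     "collected": None},  # collection date not reported
    {"accession": "EXAMPLE_2", "released": date(2020, 3, 5),
     "collected": date(2020, 1, 15)},
    {"accession": "EXAMPLE_3", "released": date(2020, 4, 1),
     "collected": date(2020, 3, 20)},
]


def order_by_collection(recs):
    """Sort by collection date, oldest first; undated records go last."""
    dated = [r for r in recs if r["collected"] is not None]
    undated = [r for r in recs if r["collected"] is None]
    return sorted(dated, key=lambda r: r["collected"]) + undated


ordered = order_by_collection(records)
for r in ordered:
    print(r["accession"], r["collected"])
```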

As an alternative to copying and pasting the translated sequences of the Spike Protein, you can upload the sequence directly into DNA Subway using the protein ID. Select Upload Data and enter the protein_id in the Import sequence from GenBank box.
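Outside of DNA Subway, a protein sequence can also be requested by its protein ID from NCBI's E-utilities service. The Python sketch below builds the request address for the `efetch` endpoint; the protein ID shown is our assumption for the Spike Protein of the reference genome, so check the protein_id listed in the FEATURES section of the record before relying on it. The download itself is left as a commented-out call because it needs an internet connection.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(protein_id):
    """Build an E-utilities address asking for a protein in FASTA format."""
    query = urlencode({
        "db": "protein",
        "id": protein_id,
        "rettype": "fasta",
        "retmode": "text",
    })
    return EFETCH + "?" + query


def fetch_protein_fasta(protein_id):
    """Download the FASTA text for one protein ID (needs internet)."""
    with urlopen(efetch_url(protein_id)) as response:
        return response.read().decode("utf-8")


# Assumed protein ID; confirm it in the GenBank record's FEATURES section.
spike_id = "YP_009724390.1"
url = efetch_url(spike_id)
print(url)
# fasta_text = fetch_protein_fasta(spike_id)  # uncomment to download
```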

Align only SARS CoV-2 sequences and see how these samples compare.

1. How many differences do you see among the sequences?

2. What is the % similarity?

3. Based on this analysis, would you think the Spike Protein is a good choice for a vaccine target? Why or why not?
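Counting differences against the reference sequence can be sketched in Python as follows. The sequences are invented toy data, not real SARS CoV-2 samples, and the sketch assumes every sample is already aligned to the reference with no gaps in the reference. Each difference is written in the usual shorthand: reference letter, position, sample letter.

```python
def differences(reference, sample):
    """List positions where an aligned sample differs from the reference."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    found = []
    for position, (ref, alt) in enumerate(zip(reference, sample), start=1):
        if ref != alt:
            found.append(f"{ref}{position}{alt}")
    return found


# Invented toy data (not real Spike Protein sequences).
reference = "MFVFLVLLPL"
samples = {
    "sample_1": "MFVFLVLLPL",
    "sample_2": "MFAFLVLLPL",
    "sample_3": "MFVFLVLLSL",
}

results = {name: differences(reference, seq) for name, seq in samples.items()}
for name, changes in results.items():
    print(name, len(changes), changes)
```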

Compare ten Spike Protein sequences for at least two of the four coronaviruses that produce symptoms of the common cold.

Since SARS CoV-2 has not been around for long, not seeing many differences in Spike Proteins may be because there has not been enough time for mutations to accumulate. The four common cold coronaviruses have been around for decades and may be a better tool to look at how rapidly this protein can accumulate mutations.

As above, in the search window of NCBI Virus, type in either OC43, HKU1, NL63 or 229E. Find ten sequences with divergent collection dates and align these sequences in DNA Subway.

1. What are the % sequence similarities between these Spike Proteins?
2. How does that compare to the % similarity between different SARS CoV-2 viruses?
3. Based on all of the evidence, do you think the Spike Protein is a good target for a vaccine? Why or why not? (Use evidence from this analysis to defend your answer.)

Additional resources

April 28, 2020 Nature article on the race for a coronavirus vaccine.

A promising vaccine is being developed by Moderna, a company in Massachusetts. They are targeting the Spike Protein. For more information, check out the YouTube video and their website:

https://youtu.be/qJlP91xjvsQ

https://www.modernatx.com/modernas-work-potential-vaccine-against-covid-19