Sequence analysis of SARS-CoV-2 genome reveals features important for vaccine design.

Scientific reports | 2020

As the SARS-CoV-2 pandemic is rapidly progressing, the need for the development of an effective vaccine is critical. A promising approach for vaccine development is to generate an attenuated virus through codon pair deoptimization. This approach carries the advantage that it requires only limited knowledge specific to the virus in question, other than its genome sequence. It is therefore well suited to emerging viruses, for which we may not have extensive data. We performed comprehensive in silico analyses of several features of the SARS-CoV-2 genomic sequence (e.g., codon usage, codon pair usage, dinucleotide/junction dinucleotide usage, RNA structure around the frameshift region) in comparison with other members of the Coronaviridae family of viruses, the overall human genome, and the transcriptome of specific human tissues such as lung, which are primarily targeted by the virus. Our analysis identified the spike (S) and nucleocapsid (N) proteins as promising targets for deoptimization and suggests a roadmap for SARS-CoV-2 vaccine development, which can be generalized to other viruses.
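To make the sequence features named in the abstract concrete, here is a minimal Python sketch that tallies codon usage, codon pair usage, dinucleotide usage, and junction dinucleotide usage for a single in-frame coding sequence. It is an illustration only, not the authors' pipeline: the function names (`split_codons`, `usage_counts`, `pair_log_ratio`) and the toy sequence are hypothetical, and the log ratio shown is a simplified observed/expected measure that treats codons as independent, not the amino-acid-normalized codon pair score used in the deoptimization literature.

```python
from collections import Counter
from math import log


def split_codons(seq):
    """Split an in-frame coding sequence into codons, dropping any trailing partial codon."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def usage_counts(seq):
    """Return counters for codons, adjacent codon pairs, junction dinucleotides, and all dinucleotides."""
    cds = split_codons(seq)
    codon_counts = Counter(cds)
    # Codon pairs are adjacent, overlapping pairs: (c1, c2), (c2, c3), ...
    pair_counts = Counter(zip(cds, cds[1:]))
    # Junction dinucleotide: last base of one codon plus first base of the next.
    junction_counts = Counter(a[2] + b[0] for a, b in zip(cds, cds[1:]))
    joined = "".join(cds)
    dinuc_counts = Counter(joined[i:i + 2] for i in range(len(joined) - 1))
    return codon_counts, pair_counts, junction_counts, dinuc_counts


def pair_log_ratio(codon_counts, pair_counts, pair):
    """Simplified ln(observed / expected) for a codon pair, assuming independent codons.

    Returns None when the pair is unobserved or the expectation is zero.
    """
    n_codons = sum(codon_counts.values())
    n_pairs = sum(pair_counts.values())
    first, second = pair
    expected = (codon_counts[first] / n_codons) * (codon_counts[second] / n_codons) * n_pairs
    observed = pair_counts[pair]
    if observed == 0 or expected == 0:
        return None
    return log(observed / expected)


# Toy sequence, made up for illustration (not taken from the SARS-CoV-2 genome).
toy = "ATGGCTGCTGCTTAA"
codon_counts, pair_counts, junction_counts, dinuc_counts = usage_counts(toy)
score = pair_log_ratio(codon_counts, pair_counts, ("GCT", "GCT"))
```

A negative ratio marks a pair seen less often than its codon frequencies would predict. Codon pair deoptimization, as described in the abstract, recodes a gene toward such under-represented pairs while leaving the encoded protein unchanged.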

PubMed ID: 32973171

Research resources used in this publication

None found

Additional research tools detected in this publication

Antibodies used in this publication

None found

Associated grants

  • Agency: NHLBI NIH HHS, United States
    Id: R01 HL151392
  • Agency: NIH HHS, United States
    Id: HL151392
  • Agency: Harvard University, International
    Id: Harvard Quantitative Biology Initiative
  • Agency: U.S. Food and Drug Administration, International
    Id: CBER Coronavirus (COVID-19) Supplemental Funding

Publication data is provided by the National Library of Medicine® and PubMed®. Data is retrieved from PubMed® on a weekly schedule. For terms and conditions, see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


MATLAB (tool)

RRID:SCR_001622

Multi-paradigm numerical computing environment and fourth-generation programming language developed by MathWorks. It allows matrix manipulations, plotting of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with programs written in other languages, including C, C++, Java, Fortran, and Python. It is used to explore and visualize ideas and to collaborate across disciplines, including signal and image processing, communications, control systems, and computational finance.


Codon and Codon-Pair Usage Tables (tool)

RRID:SCR_018504

Database that includes genomic codon-pair and dinucleotide statistics for all organisms with a sequenced genome. It facilitates genetic variation analyses and recombinant gene design, and is derived from all available GenBank and RefSeq data.
