Computational Prediction and Evaluation of SARS-CoV-2 Protein Structures: Investigating the Molecular and Evolutionary Dynamics of NSP4, NSP6, ORF6, and ORF10

  • Annie Omoregie

Student thesis: Master's Thesis


The outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has made it necessary to understand both the structure and the molecular and evolutionary dynamics of the virus's lesser-studied proteins. This study computationally predicts and evaluates the structures of four SARS-CoV-2 proteins (NSP4, NSP6, ORF6, and ORF10) and investigates their molecular and evolutionary dynamics. Analysis of genetic variability among protein variants revealed sequence variations that suggest functional implications. A phylogenetic tree was constructed to show the branching patterns and the relatedness of the SARS-CoV-2 proteins to those of other species. The Fixed Effects Likelihood (FEL) and Single-Likelihood Ancestor Counting (SLAC) methods detected sites under positive and negative selection pressure in NSP4, NSP6, and ORF6, indicating adaptive change at some sites and conservation of specific amino acids at others. The Mixed Effects Model of Evolution (MEME) and Fast, Unconstrained Bayesian Approximation (FUBAR) methods were used to identify positively selected sites that could be associated with functional changes in the proteins of interest. Although this study does not conclusively determine the origin of SARS-CoV-2, our findings indicate that the NSP4, NSP6, and ORF6 proteins have undergone significant evolutionary pressure. This information could inform antiviral drug development by targeting functional changes or protein stability, leveraging adaptive advantages, clarifying resistance mechanisms, and modulating protein-protein interactions. The suggested roles of these proteins in the SARS-CoV-2 replicase system and in host cell and immune responses make them particularly relevant for such strategies. A range of in-silico tools, including Robetta, I-TASSER, and Phyre2, were used to generate candidate three-dimensional (3D) models of NSP4, NSP6, ORF6, and ORF10.
The Ramachandran plot, the ERRAT score, the QMEAN quality score, and Molecular Dynamics (MD) simulation were used to assess the validity and stability of some of the proteins in an in-silico membrane. These assessments were compared against well-established in-silico 3D models from programs such as AlphaFold. Together with the findings of the molecular evolution analysis, the in-silico models of SARS-CoV-2 NSP4, NSP6, ORF6, and ORF10 can guide experimental design. The models can assist data interpretation, enhance structural refinement, and serve as substitutes that accelerate structure determination for virtual screening, structure-based optimisation of drug candidates, rational drug design, and the identification of specific regions and binding sites that can serve as targets for drug molecules. Overall, this study contributes to our understanding of the molecular evolutionary dynamics and protein structure of the lesser-known SARS-CoV-2 proteins NSP4, NSP6, ORF6, and ORF10, which could lead to better functional annotation of these proteins and to the identification of drug targets against COVID-19.
Date of Award: 26 Jan 2024
Original language: English
Supervisor: Richard Bingham (Main Supervisor) & Dougie Clarke (Co-Supervisor)
