Computational Modeling and Quality Validation of HMG-CoA Reductase for Drug Design Applications
The enzyme 3-hydroxy-3-methylglutaryl CoA reductase (HMG-CoA reductase or HMGR) catalyzes the conversion of HMG-CoA to mevalonate, which is the rate-limiting step in cholesterol and isoprenoid biosynthesis. HMGR is the main target of statins, which are commonly prescribed to lower cholesterol and prevent cardiovascular disease. However, statins may cause harmful side effects, such as muscle pain (myopathy) and muscle breakdown (rhabdomyolysis), creating a demand for alternative therapeutic options. This study aims to analyze and predict the 3D structure of HMGR from 12 sequences spanning different species, including humans, fruit flies, plants, and protists, using computational methods to evaluate their potential for future drug design. Protein sequences were retrieved from the NCBI database and analyzed using various bioinformatics tools. The physicochemical properties and amino acid composition were computed with ExPASy's ProtParam tool, and the secondary structural features were determined with the SOPMA and PHD servers. ClustalW, CELLO, and MEME were used to predict the phylogenetic tree, subcellular localization, and structural motifs, respectively. Homology modeling was performed using the SWISS-MODEL Workspace with selected templates, and the models were visualized with PyMOL. The predicted models were evaluated by PROCHECK Ramachandran plot analysis and by VERIFY 3D and ERRAT scores obtained from the SAVES server. The aliphatic index and GRAVY values indicated that most HMGRs were thermally stable and hydrophobic. According to the secondary structure predictions, the alpha helix dominated over the extended strand, beta turn, and random coil in the selected HMG-CoA reductase sequences. The phylogenetic analysis confirmed the relationships among the studied species, and the predicted motifs indicated uniformity in the conserved sequences. The predicted models showed good quality and high reliability for most species, particularly for H. sapiens and D. melanogaster. This in silico study provides valuable insights into the structural and functional aspects of HMGR across species, laying a strong foundation for the development of safer and more effective cholesterol-lowering medications in the future.
HMG-CoA reductase (3-hydroxy-3-methyl-glutaryl-CoA reductase) is considered an important enzyme due to its catalytic activity. It catalyzes the metabolic processes that synthesize cholesterol, isoprenoids, and other lipids by converting HMG-CoA to mevalonate. Within this metabolic pathway, HMG-CoA reductase controls the rate-limiting step of cholesterol synthesis (Friesen & Rodwell, 2004). Excessive synthesis of cholesterol has a major impact on human morbidity and mortality, particularly from atherosclerosis, followed by myocardial infarction or stroke (Vance & Van den Bosch, 2000). This metabolic route, commonly known as the mevalonate or isoprenoid pathway, is highly conserved across eukaryotes, archaea, and certain bacteria (Buhaescu & Izzedine, 2007).
Drugs that inhibit HMG-CoA reductase are referred to as HMG-CoA reductase inhibitors, or "statins." Statins primarily target this enzyme and block the conversion of HMG-CoA to mevalonate (Jiang et al., 2018). Statins inhibit the activity of 3-hydroxy-3-methyl-glutaryl-CoA reductase (HMGCR) by directly binding to its active site and inducing conformational changes (Stancu & Sima, 2001). Owing to this mechanism, statins are the most commonly prescribed drugs for the prevention of cardiovascular diseases (Wang et al., 2015). Additionally, recent research has indicated that statins have protective effects against colorectal cancer, and by inhibiting HMGCR, they can reduce the risk of colorectal cancer by 43% with at least 5 years of use (Bonovas et al., 2007; Lochhead & Chan, 2013). Currently, more than 25 million people worldwide are using statins to lower their risk of cardiovascular disorders (Lipkin et al., 2010). But statins can cause myalgia and rhabdomyolysis, which is characterized by muscular injury, weakness, and dark urine, and severe rhabdomyolysis can cause acute kidney disease (Malani et al., 2024; Petreski et al., 2021). These side effects highlight the need for alternative therapeutic strategies (Gesto et al., 2014; Yousuf et al., 2023).
In this context, computational approaches have become indispensable in structure-based drug design, as they help in understanding the three-dimensional structure of the protein target to which the candidate drugs are intended to bind. Once the special target site of the protein is known, docking methods can be applied for designing drugs (Nishant et al., 2011). Experimental techniques like NMR and X-ray crystallography are typically used to discover the three-dimensional structure of proteins, but these methods are highly expensive, time-consuming, and quite labor-intensive. With the help of various bioinformatics tools, this study examines evolutionary relationships, conserved motifs, and physicochemical parameters of HMGR proteins, thereby providing a primary step for designing a target drug for the reduction of cholesterol synthesis in humans.
Retrieval of target and template sequences
Twelve full-length amino acid sequences of 3-hydroxy-3-methylglutaryl CoA reductase (HMGR) from Drosophila melanogaster I (AAF56175.1), Drosophila melanogaster II (NP_732900.1), Homo sapiens (NP_001124468.1), Paris fargesii (AEL30660.1), Catharanthus roseus (AAA33108.1), Trypanosoma grayi (XP_009313205.1), Leishmania donovani (AF054499.1), Brevibacillus laterosporus (WP_022584705.1), Vibrio alginolyticus (WP_017820181.1), Encephalitozoon cuniculi GB-M1 (CAD25893.2), Aspergillus oryzae 100-8 (KDE78541.1), and Haloterrigena turkmenica DSM 5511 (YP_003402108.1) were retrieved from the NCBI database and saved in FASTA format. For clarity, the two Drosophila melanogaster sequences were designated as D. melanogaster I and D. melanogaster II. The sequences were organized according to species group: fruit fly, human, plants, protists, bacteria, fungi, and archaea.
Physicochemical characterization
Physicochemical properties of the proteins, including the theoretical isoelectric point (pI) (Bjellqvist et al., 1993), total number of negatively (Asp+Glu) and positively (Arg+Lys) charged residues, extinction coefficient (Gill & von Hippel, 1989), instability index (Guruprasad et al., 1990), aliphatic index (Ikai, 1980; Saikat et al., 2020), and grand average of hydropathicity (GRAVY) (Kyte & Doolittle, 1982), were computed using Expasy's ProtParam server (Wilkins et al., 1999). The molecular weight and amino acid composition (essential and non-essential residues) were also calculated using the same server.
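As an illustration of two of the parameters listed above, the following minimal Python sketch computes GRAVY (the mean Kyte-Doolittle hydropathy per residue) and the aliphatic index (Ikai, 1980) directly from a sequence. It is not the ProtParam implementation; the function names and the toy sequences in the usage note are assumptions made for illustration only.

```python
from collections import Counter

# Kyte-Doolittle hydropathy scale (Kyte & Doolittle, 1982)
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    seq = seq.upper()
    return sum(KD[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)), X in mole percent."""
    seq = seq.upper()
    counts = Counter(seq)
    n = len(seq)

    def mole_pct(aa):
        return 100.0 * counts[aa] / n

    return mole_pct("A") + 2.9 * mole_pct("V") + 3.9 * (mole_pct("I") + mole_pct("L"))


def charged_residues(seq):
    """Return (negative, positive) counts: Asp+Glu and Arg+Lys."""
    counts = Counter(seq.upper())
    return counts["D"] + counts["E"], counts["R"] + counts["K"]
```

For the toy peptide "AVIL" (hypothetical, chosen only because it is easy to check by hand), `gravy` returns 3.575 and `aliphatic_index` returns 292.5; a positive GRAVY is read as hydrophobic and a negative one as hydrophilic.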
Secondary structure prediction
Secondary structural features of the amino acid sequences were predicted by the Self-Optimized Prediction Method with Alignment (SOPMA) server (Geourjon & Deleage, 1995) and the PHD server (Rost et al., 1994). These tools classified regions of the proteins into alpha helices, beta strands, and random coils. The predictions were run with the default parameters of the web server (number of states: 3, similarity threshold: 8, and window width: 17).
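Once a server returns a per-residue state assignment, the share of each structural class can be summarized as below. This is a small sketch that assumes the prediction has already been reduced to a string over the three states H (helix), E (extended strand), and C (coil); the function name and the example string are hypothetical, and the actual SOPMA and PHD output formats are not reproduced here.

```python
from collections import Counter


def state_composition(states, alphabet="HEC"):
    """Percentage of residues assigned to each secondary structure state."""
    counts = Counter(states.upper())
    n = len(states)
    return {s: 100.0 * counts[s] / n for s in alphabet}


def dominant_state(states, alphabet="HEC"):
    """State with the highest percentage (first in alphabet order on ties)."""
    comp = state_composition(states, alphabet)
    return max(alphabet, key=lambda s: comp[s])
```

For example, `state_composition("HHHHEECCCC")` gives 40% helix, 20% strand, and 40% coil.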
Subcellular localization prediction
Subcellular localization was predicted to provide insights into the biological functions of the proteins (Gillani & Pollastri, 2024). For fruit fly, plant, and fungal sequences, localization was predicted using WoLF PSORT (Horton et al., 2007). Subcellular localization of human, protists, and bacteria was determined by CELLO (Yu et al., 2004), while archaeal sequences were analyzed by PSORT-b (Gardy et al., 2003).
Multiple sequence alignment and construction of a phylogenetic tree
A multiple sequence alignment was conducted using ClustalW (Thompson et al., 1994) to compare the HMG-CoA reductase sequences from human and the other species. A phylogenetic tree was then constructed from the aligned sequences using the UPGMA method (Schlee, 1975).
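The UPGMA clustering step named above can be sketched in a few lines of Python. This is an illustrative toy, not ClustalW's implementation: it assumes pairwise distances have already been derived from the alignment, it returns only the tree topology as nested tuples (no branch lengths), and the labels and distances in the usage note are made up.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster labels by UPGMA.

    names: list of leaf labels.
    dist:  dict mapping frozenset({a, b}) -> distance for every pair of labels.
    Returns the tree topology as nested tuples.
    """
    clusters = {name: 1 for name in names}  # cluster label -> number of leaves
    d = dict(dist)                          # working copy, extended as clusters merge
    while len(clusters) > 1:
        # Join the pair of current clusters with the smallest distance.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        merged = (a, b)
        # Distance to the merged cluster is the size-weighted average.
        for c in clusters:
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters))


def pair_distances(table):
    """Helper: convert {(a, b): distance} into the frozenset-keyed form."""
    return {frozenset(pair): value for pair, value in table.items()}
```

With four hypothetical taxa where A and B are closest (distance 2), C is at distance 6 from both, and D is at distance 10 from all others, `upgma` groups A with B first, then adds C, then D.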
Motif analysis
Protein sequence motifs were predicted using Multiple EM for Motif Elicitation (MEME) version 4.10.0 (Bailey et al., 2006) and Motif Alignment and Search Tool (MAST) version 4.10.0 (Bailey & Gribskov, 1998). The parameters were set as follows: the minimum and maximum motif widths were 6 and 50, respectively; the maximum number of motifs was limited to 10; and each sequence was allowed zero or one occurrence of a motif.
Homology modeling and visualization
Three-dimensional structures of the HMGR proteins were generated using the SWISS-MODEL workspace (https://swissmodel.expasy.org) (Arnold et al., 2006). The predicted models were visualized and analyzed using PyMOL version 1.3 (DeLano, 2002).
Protein 3D model quality assessment
The stereochemical quality of the predicted protein models was assessed by Ramachandran plot analysis (Spencer et al., 2019) using PROCHECK (Laskowski et al., 1996), accessed through PDBsum (Laskowski et al., 2005). Additionally, ERRAT (Colovos & Yeates, 1993) and VERIFY 3D (Bowie et al., 1991; Eisenberg et al., 1997) scores were obtained from the Structure Analysis and Verification (SAVES) Server version 4 (https://saves.mbi.ucla.edu/) for further evaluation of the HMG-CoA reductase protein models.
Physico-chemical characterization
The physicochemical properties of each of the proteins were determined by the ProtParam server (Wilkins et al., 1999). These characteristics include the molecular weight, theoretical pI, and extinction coefficient (Gill & von Hippel, 1989), instability index (Guruprasad et al., 1990), aliphatic index (Ikai, 1980), grand average of hydropathicity (GRAVY) (Kyte & Doolittle, 1982) and the absolute number of negatively and positively charged residues, which are shown in Table 1, and the essential amino acid compositions are shown in Fig. 1.
Table 1: Parameters computed using Expasy's ProtParam tools.
*Extinction coefficient (EC), *Isoelectric Point (PI), *Grand average of hydropathicity (GRAVY)
The theoretical pI, which is the pH at which a molecule or surface carries no net electrical charge, helps in understanding the charge stability of proteins (Enany, 2014). The calculated isoelectric points (pI) of the HMG-CoA reductase proteins ranged from 4.43 to 9.01. The instability index provides an estimate of the stability of a protein: a value above 40 indicates that the protein is unstable, whereas a value below 40 predicts a stable protein (Guruprasad et al., 1990). The instability index values indicated that the HMG-CoA reductase proteins from H. sapiens (51.39), C. roseus (52.25), P. fargesii (44.59), and D. melanogaster I and II (41.89) are unstable; the proteins of the remaining species were stable (instability index < 40), indicating better robustness.
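The threshold rule of Guruprasad et al. (1990) described above amounts to a single comparison. The short sketch below applies it to the instability index values reported in this section; the function and variable names are illustrative assumptions, and values for the remaining species are not listed because only their classification (below 40) is stated in the text.

```python
INSTABILITY_THRESHOLD = 40.0  # cutoff from Guruprasad et al. (1990)

# Instability index values reported in the text above.
reported = {
    "H. sapiens": 51.39,
    "C. roseus": 52.25,
    "P. fargesii": 44.59,
    "D. melanogaster I": 41.89,
    "D. melanogaster II": 41.89,
}


def classify_stability(index, threshold=INSTABILITY_THRESHOLD):
    """Label a protein 'unstable' above the threshold and 'stable' otherwise."""
    return "unstable" if index > threshold else "stable"


labels = {species: classify_stability(value) for species, value in reported.items()}
```

All five entries in `labels` come out as "unstable", in line with the classification given in the text.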
A protein with a high aliphatic index is thermostable over a broad temperature range (Ikai, 1980). The aliphatic index (AI), an indicator of thermostability, was highest in H. sapiens (97.27), closely followed by C. roseus (96.74), D. melanogaster (96.58), and P. fargesii (95.48), while the lowest value was found in E. cuniculi (87.66), indicating reduced thermal stability. Proteins with positive or negative grand average of hydropathicity (GRAVY) values are considered hydrophobic or hydrophilic, respectively (Kyte & Doolittle, 1982). GRAVY analysis suggested that most HMG-CoA reductase proteins, including that of H. sapiens, were hydrophobic. The exceptions were B. laterosporus (-0.013), E. cuniculi (-0.108), and H. turkmenica (-0.084), whose negative values indicate hydrophilicity.
The percentages of essential amino acids show that all species contain higher amounts of leucine and valine compared to other essential amino acids.
Fig. 1: Percentages of essential amino acids of HMG-CoA reductase of human and 11 other species.
Protein secondary structure prediction
There remains a close relationship between protein structure and function, so secondary structure analysis is a necessity for understanding protein function (Mamun et al., 2024). Predicting the secondary structure of proteins can also help predict the tertiary structure by minimizing the gap between the primary sequence and the tertiary structure (Zhang et al., 2018). The secondary structure implies whether the given amino acids lie in a helix, strand, or coil. Secondary structural features analyzed by SOPMA (Geourjon & Deleage, 1995) and PHD server (Rost et al., 1994) are shown in Table 2.
In this study, the amino acid sequences were retrieved from the NCBI database and saved in FASTA format. Sequence analysis showed that the HMG-CoA reductases of D. melanogaster I, D. melanogaster II, H. sapiens, P. fargesii, C. roseus, T. grayi, L. donovani, B. laterosporus, V. alginolyticus, E. cuniculi GB-M1, A. oryzae 100-8, and H. turkmenica DSM 5511 consisted of 920, 920, 835, 575, 601, 435, 434, 432, 420, 398, 1044, and 408 amino acids, respectively. Analysis of the instability index (Guruprasad et al., 1990) indicated that most of the evaluated proteins were stable, except the proteins from humans, fruit flies, and plants, which showed values above the threshold of 40. In this in silico analysis, aliphatic indices were calculated to assess the thermal stability of the selected enzymes. The results showed that the aliphatic index (Ikai, 1980) of the protein sequences ranged from 87.66 to 97.27, suggesting the stability of these enzymes across a wide range of temperatures. Positive GRAVY (Kyte & Doolittle, 1982) values indicated the hydrophobic nature of the proteins from most species, with the exceptions of B. laterosporus (−0.013), E. cuniculi (−0.108), and H. turkmenica DSM 5511 (−0.084). Overall, the computational analysis suggested that HMG-CoA reductase proteins from D. melanogaster, P. fargesii, and C. roseus share strong structural and physicochemical similarities with the human enzyme.
Essential amino acids, particularly leucine and valine, are known to play a crucial role in reducing the synthesis of cholesterol (Adhikari et al., 2013). By lowering serum cholesterol, the essential amino acids leucine and valine may help in reducing the risk of atherosclerosis (Cojocaru et al., 2010). In this study, the high level of leucine and valine in every species suggested that these enzymes could serve as potential sources of HMG-CoA reductase for reducing the human blood cholesterol levels. In the case of secondary structure analysis, joint prediction with SOPMA and a neural network method (PHD) correctly predicts secondary structure with 82.2% accuracy (Rost et al., 1994). Both prediction servers indicated that α-helices were the predominant structural elements, followed by random coils and extended strands.
Multiple sequence alignment of HMG-CoA reductases from different species indicated that they likely originated from a common ancestral gene. A motif is a sequence pattern that occurs repeatedly in a group of related protein or DNA sequences. It may also have a role in the development of the protein's binding sites (Sanchita et al.). A motif analysis by MEME (Bailey et al., 2006) and the MAST server (Bailey & Gribskov, 1998) indicated uniformity in the conserved sequences of all the species. Proteins found in the cytoplasm and surface membranes can act as potential drug or vaccine targets (Barh et al., 2011). Subcellular localization analysis using PSORT-b (Gardy et al., 2003), WoLF PSORT (Horton et al., 2007), and CELLO (Yu et al., 2004) suggested that most of the proteins were cytoplasmic and could serve as promising drug target candidates. To predict the three-dimensional structure of the HMG-CoA reductase proteins of fruit flies, humans, plants, protists, bacteria, archaea, and fungi by homology modeling, appropriate templates were selected by subjecting the protein sequences to BLASTp against the PDB database, considering parameters such as query coverage, resolution (lower values preferred), lowest E-value, and sequence identity above 30%. The SWISS-MODEL server (Arnold et al., 2006) provided suitable templates, and all sequences were found to have identity values above 30%, suggesting their suitability for reliable 3D modeling.
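The template selection criteria described above (identity above 30%, lowest E-value, high query coverage, low resolution value) can be expressed as a filter followed by a sort. The sketch below is one possible reading of those criteria, not the procedure actually run by BLASTp or SWISS-MODEL; the `Hit` record, the ordering of the tie-breakers, and the template names in the usage note are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    template: str      # template identifier (hypothetical labels in the example)
    identity: float    # sequence identity to the target, percent
    coverage: float    # query coverage, percent
    evalue: float      # BLAST E-value
    resolution: float  # experimental resolution in angstroms (lower is better)


def rank_templates(hits, min_identity=30.0):
    """Keep hits above the identity cutoff, then rank them.

    Assumed priority: lowest E-value, then highest coverage, then lowest resolution.
    """
    eligible = [h for h in hits if h.identity > min_identity]
    return sorted(eligible, key=lambda h: (h.evalue, -h.coverage, h.resolution))
```

Given three made-up hits, one with 28% identity and two with 45% identity and equal E-values, the 28% hit is dropped and the remaining two are ordered by coverage.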
Model validation using PROCHECK (Laskowski et al., 1996) demonstrated that most species had 85% to 89.9% of residues in the most favored regions of the Ramachandran plot (Spencer et al., 2019). This high percentage of residues in the favorable regions of the plot signifies good stereochemical quality and structural reliability. None of the modeled sequences had more than 90% of residues in the most favored regions. For the quality of a protein model to be considered satisfactory, it is expected to have a Verify3D score greater than 80% (Khor et al., 2014). In this study, most species had a value above 80%, meaning that at least 80% of the amino acids scored > 0.2 in the 3D/1D profile, which is considered a pass and suggests that the quality of most protein models in this analysis is satisfactory. ERRAT (Colovos & Yeates, 1993) predicted an overall quality factor above 80 for most of the proteins, which indicates high reliability of the predicted models. However, according to Messaoudi et al. (2011), an overall quality factor above 50 is generally acceptable for a high-quality model. The ERRAT analysis showed that the overall quality factor was above 60 for all species and above 80 for H. sapiens, D. melanogaster I, D. melanogaster II, P. fargesii, C. roseus, T. grayi, L. donovani, and H. turkmenica DSM 5511 (Table 5). The overall quality factors obtained through ERRAT therefore indicated high-quality protein models.
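The Verify3D pass criterion discussed above can be checked from per-residue 3D/1D profile scores as in the following sketch. It assumes the scores have already been extracted into a list of floats; the function name and the example scores are hypothetical, and whether the 0.2 cutoff is inclusive is treated here as an assumption (the comparison uses >=).

```python
def verify3d_pass(scores, cutoff=0.2, required_fraction=0.80):
    """Return (fraction of residues at or above cutoff, pass flag).

    A model passes when at least `required_fraction` of residues reach the cutoff.
    """
    passing = sum(1 for s in scores if s >= cutoff)
    fraction = passing / len(scores)
    return fraction, fraction >= required_fraction
```

For ten made-up residue scores of which eight reach the cutoff, the function reports a fraction of 0.8 and a pass; with only seven it reports a fail.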
This study successfully performed a comprehensive in silico characterization and comparative analysis of HMG-CoA reductase (HMGR) enzymes from twelve diverse species to investigate their properties, subcellular localization, functional motifs, and phylogenetic relationships using various bioinformatics tools and servers. Additionally, this study aimed to construct a 3D model of the HMG-CoA reductase enzyme, as understanding its three-dimensional structure is crucial for gaining insights into its functional characteristics. The findings provide comprehensive information on the molecular properties, architecture, and functional roles of the HMG-CoA reductase enzyme. The results suggest that HMG-CoA reductase enzymes from all the studied species may serve as potential sources for developing targeted cholesterol-lowering drugs in humans.
R: Conducted data collection, performed bioinformatics analyses, and prepared the initial draft of the manuscript. M.M.R.: Supervised methodological design, guided phylogenetic and motif analyses, and critically reviewed and revised the manuscript for important intellectual content. S.M.: Conceived and designed the research, interpreted the results, and finalized the manuscript for submission and publication. All authors read and approved the final version of the manuscript.
The authors gratefully acknowledge Mr. Utpal Kumar Adhikari of the School of Medicine, Western Sydney University, Australia, for his kind cooperation during this research.
The authors declare that there is no conflict of interest.
Academic Editor
Dr. Phelipe Magalhães Duarte, Professor, Faculty of Biological and Health Sciences, University of Cuiabá, Mato Grosso, Brazil
Rubaya, Rahman MM, and Mahmud S. (2025). Computational modeling and quality validation of HMG-CoA reductase for drug design applications. Am. J. Pure Appl. Sci., 7(6), 485-498. https://doi.org/10.34104/ajpab.025.04850498