Schizophrenia (SCZ) is a major psychiatric disorder that often presents with psychiatric comorbidities. However, the links between the pathogenesis of SCZ and that of its comorbidities are not known. In this study, we aimed to develop an integrated multi-omics approach based on gene expression, gene ontology, pathway, and protein-protein interaction data to help clinical researchers assess the links between SCZ and major psychiatric pathologies. We compared the transcriptomic alterations between diseases and controls and observed significantly perturbed gene expression patterns, i.e., differentially expressed genes (DEGs), shared among SCZ and major depressive disorder, obsessive-compulsive disorder, alcoholism, and eating disorder. We observed deregulated expression of three DEGs, namely HAPLN1, CNDP1, and SLC12A2, that were common to SCZ and all of the selected pathologies, suggesting that the selected disorders are comorbidities of SCZ. We also identified pathways altered in both SCZ and its psychiatric comorbidities, including the FoxO signaling pathway, MAPK signaling pathway, transcriptional misregulation in cancer, cellular senescence, cell cycle, PI3K-Akt signaling pathway, TNF signaling pathway, and TGF-beta signaling pathway. The present study revealed biomolecules (DEGs), ontologies, and cellular pathways involved in the etiopathogenetic mechanisms of SCZ and its psychiatric comorbidities.
Schizophrenia (SCZ) is a prevalent psychiatric disorder characterized by psychosis (Rund, 2018). Epidemiological evidence suggests that SCZ has several major psychiatric comorbidities, namely obsessive-compulsive disorder (OCD), major depression, and substance abuse disorder (i.e., alcoholism). These comorbidities worsen the burden on patients with SCZ (Buckley et al., 2009). Genome-wide association studies (GWAS) have previously identified significant genetic heritability, with an SNP-based heritability of about 23% on the liability scale assuming a population prevalence of 0.7% (Pardiñas et al., 2018). This polygenic signal from GWAS is distributed across many genes and complex biological systems genome-wide (Sullivan et al., 2012); thus, to fully appreciate these GWAS findings, further studies at the integrative omics level are required to clarify their relevance to the pathogenesis of SCZ. Despite previous efforts to identify the links between SCZ and psychiatric illness, no effective markers are currently available (Etemadikhah et al., 2020; Rees et al., 2014; Brown et al., 2015; Rahman et al., 2021). Thus, to understand the molecular links between SCZ and psychiatric illness, we designed a bioinformatics and systems biology pipeline that can detect the interactions between SCZ and its comorbidities.
Previous studies have sought to identify critical genes and pathways in individual psychiatric diseases (Etemadikhah et al., 2020; Rees et al., 2014; Brown et al., 2015; Rahman et al., 2021), but the interconnection between SCZ and its comorbidities has not yet been studied. SCZ and its associated psychiatric comorbidities are very complex in terms of molecular pathological mechanisms. The vast amount of dispersed transcriptomic data and the lack of dataset integration point to an unmet need for an integrative analysis that dissects the correlations between SCZ and associated disorders. Moreover, numerous existing databases and clinical resources cannot be used because of the dearth of bioinformatics pipelines. Therefore, in this study we implemented several bioinformatics methodologies to investigate SCZ and its comorbidities by integrating gene expression, gene ontology, and pathway data through gene set enrichment analysis (GSEA) and semantic similarity methods.
In the present study, we developed a bioinformatics and systems biology framework/pipeline to assess the common cellular pathways and markers shared by SCZ and its well-known comorbidities using gene expression microarray datasets, gene ontology, pathways, and protein-protein interactions (Saikat et al., 2020). The analysis revealed crucial pathways and regulators of the DEGs that may drive the pathogenesis of psychiatric disorders.
2.1 Retrieval of Datasets - We obtained the transcriptomic datasets from the Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo/) (Clough and Barrett, 2016). We obtained five transcriptomic datasets of SCZ that satisfied our selection criterion of being human brain transcriptomic data. Datasets were selected according to four criteria: i) redundancy: several datasets were generated under similar states or conditions; ii) typology: datasets should have an accurate structure; iii) relevance: datasets should be relevant to the specific pathologies examined in this study; and iv) species: datasets must be generated from a human source.
2.2 Transcriptome Profiling and Statistical Analyses - For this study, we collected 15 brain datasets of schizophrenia and its psychiatric comorbidities from the NCBI Gene Expression Omnibus (GEO) database (Clough and Barrett, 2016). The characteristics and details of the datasets are presented in Table 1. We conducted differential expression analysis of each dataset independently using the widely used R package limma. A threshold of p-value<0.01 was used to screen for statistically significant differentially expressed genes (DEGs).
Table 1: Characteristics of employed datasets and differentially expressed genes
GEO: Gene expression omnibus; DEGs: differentially expressed genes; SCZ: Schizophrenia; MDD: Major depression disorder; OCD: Obsessive-compulsive disorder; AUD: Alcohol use disorder; ED: Eating disorder.
2.3 Gene Set Enrichment Analysis - Gene Set Enrichment Analysis (GSEA) is an analytical approach that uses statistical methods to identify classes of genes that share biological functions and/or regulation (Subramanian et al., 2005). These genes may be interrelated with disease phenotypes. GSEA compares genes obtained from transcriptomics by analyzing differential expression among disease states. These genes may be causative for disease and are placed in lists of up-regulated and down-regulated genes related to the phenotypic differences.
2.4 Ontology Analysis - Gene ontology (GO) is a conceptual model that provides significant biological information in computable and well-known structures (Schriml et al., 2012; Satu et al., 2021). GO terms describe genes and their attributes across all species. The Gene Ontology comprises three aspects: biological process, cellular component, and molecular function. It does not capture pathological developments, experimental conditions, or temporal information. Disease ontology (DO), on the other hand, is an open-source ontology with extensive information about inherited, developmental, and acquired human diseases (Schriml et al., 2012). The DO terms used in this study for the corresponding diseases are: schizophrenia, DOID:5419; post-traumatic stress disorder, DOID:2055; obsessive-compulsive disorder, DOID:10933; generalized anxiety disorder, DOID:14320; major depression disorder, DOID:1470; and alcohol abuse disorder, DOID:1574. These DO IDs were retrieved from https://disease-ontology.org/. The whole procedure of analyzing and visualizing the GO and DO terms was implemented with the clusterProfiler R package (Yu et al., 2012).
2.5 Pathway Enrichment Analysis - Identifying the pathways enriched by DEGs may reveal critical signaling pathways involved in pathogenesis (Satu et al., 2021). We used the Kyoto Encyclopedia of Genes and Genomes (KEGG) database to identify the molecular pathways enriched by the DEGs of schizophrenia and its psychiatric comorbidities via the clusterProfiler R package (Yu et al., 2012).
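Over-representation analysis of a pathway is commonly framed as a one-sided hypergeometric test. The following is a minimal Python sketch of that test, not the clusterProfiler implementation used in this study; the function name and all gene counts in the usage line are hypothetical values chosen only for illustration.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_pathway, n_degs, n_overlap):
    """One-sided over-representation p-value.

    Probability of seeing at least `n_overlap` pathway genes when `n_degs`
    genes are drawn without replacement from a universe of `n_universe`
    genes, of which `n_pathway` belong to the pathway.
    """
    max_overlap = min(n_pathway, n_degs)
    total = comb(n_universe, n_degs)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_degs - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts: 20000 background genes, a 130-gene pathway,
# 400 DEGs, 12 of which fall in the pathway.
p_value = hypergeom_enrichment_p(20000, 130, 400, 12)
print(f"enrichment p-value: {p_value:.3g}")
```

In practice the p-values of all tested pathways would also be adjusted for multiple testing; that step is omitted from this sketch.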
2.6 Semantic Similarity - Semantic similarity is a technique for measuring the proximity between two terms of an ontology by defining a topological similarity (Satu et al., 2021; Pesquita et al., 2008). Various measures use annotation statistics of the shared ancestors of the terms. In this study, a graph-based approach was employed to compare the relationships among individual terms (genes, GO, DO) (Satu et al., 2021). For this purpose, the Wang method is suitable because it is a graph-based methodology built on the topology of the selected ontology (Pesquita et al., 2008).
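To make the graph-based idea concrete, the following Python sketch computes a Wang-style similarity between two terms of a toy ontology. It is an illustration under stated assumptions, not the code used in this study: the term names and the toy graph are hypothetical, and the edge weights (0.8 for is_a, 0.6 for part_of) are the values commonly quoted for the Wang method.

```python
# Contribution factors per relation type (assumed, commonly quoted values).
WEIGHTS = {"is_a": 0.8, "part_of": 0.6}

# Hypothetical toy ontology: term -> list of (parent, relation).
PARENTS = {
    "term_A": [("term_X", "is_a")],
    "term_B": [("term_X", "is_a")],
    "term_X": [("term_root", "is_a")],
    "term_root": [],
}


def s_values(term, parents):
    """S-value of every ancestor of `term` (and of `term` itself, fixed at 1).

    An ancestor's S-value is the best weighted contribution that reaches it
    along any path from `term`.
    """
    s = {term: 1.0}
    stack = [term]
    while stack:
        child = stack.pop()
        for parent, relation in parents.get(child, []):
            candidate = WEIGHTS[relation] * s[child]
            if candidate > s.get(parent, 0.0):
                s[parent] = candidate
                stack.append(parent)
    return s


def wang_similarity(term_a, term_b, parents):
    """Sum of S-values over shared terms, divided by the sum over both graphs."""
    s_a = s_values(term_a, parents)
    s_b = s_values(term_b, parents)
    shared = set(s_a) & set(s_b)
    numerator = sum(s_a[t] + s_b[t] for t in shared)
    denominator = sum(s_a.values()) + sum(s_b.values())
    return numerator / denominator


similarity = wang_similarity("term_A", "term_B", PARENTS)
print(round(similarity, 3))
```

Two sibling terms share all of their ancestors but not themselves, so the score falls below 1; a term compared with itself scores exactly 1.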
2.7 Protein-protein interaction analysis - We conducted protein-protein interaction (PPI) analysis as described elsewhere (Rahman et al., 2021). We used several databases, namely BioGrid, OmniPath, and InWeb_IM, to build the PPI network (Zhou et al., 2008). We then detected modules (i.e., densely connected components of the network) in this PPI network using the MCODE plugin. To gain better insight into the functional importance of these modules, we also performed pathway and enrichment analysis as described elsewhere (Zhou et al., 2008). The three best-scoring terms by p-value are presented as functional descriptions of the modules.
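Dense-module detection rests on the notion of a densely connected subgraph. As a rough illustration, the sketch below extracts a k-core, i.e., the largest subgraph in which every node keeps at least k neighbours, which is one building block that density-based methods such as MCODE draw on. This is a simplified stand-in and not the MCODE algorithm itself; the protein names and the toy network are hypothetical.

```python
def k_core(adjacency, k):
    """Return the node set of the k-core of an undirected graph.

    Nodes with fewer than k surviving neighbours are peeled away
    repeatedly until no such node remains.
    """
    alive = set(adjacency)
    changed = True
    while changed:
        changed = False
        for node in list(alive):
            degree = sum(1 for nb in adjacency[node] if nb in alive)
            if degree < k:
                alive.discard(node)
                changed = True
    return alive


# Hypothetical toy PPI network: a triangle (P1, P2, P3) with P4 hanging off P3.
toy_network = {
    "P1": {"P2", "P3"},
    "P2": {"P1", "P3"},
    "P3": {"P1", "P2", "P4"},
    "P4": {"P3"},
}

print(sorted(k_core(toy_network, 2)))
```

Here the pendant node P4 is peeled away and the triangle remains as the 2-core; no node has three neighbours inside any subgraph, so the 3-core is empty.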
3.1 Differential expression analysis of transcriptomes between schizophrenia and its major psychiatric disorders - In this study, we employed bioinformatics methodologies to identify DEGs, gene ontologies, and pathways that are common to and associated with schizophrenia and its major psychiatric comorbidities. We comprehensively searched the GEO database to identify gene expression profiling datasets of schizophrenia (SCZ) and its major psychiatric comorbidities, namely major depression disorder (MDD), obsessive-compulsive disorder (OCD), and substance use disorder (eating disorder (ED) and alcohol use disorder (AUD)). We processed and analyzed the transcriptomic data from these datasets to identify DEGs via the R package limma at p-value<0.01. The statistical summary of the employed datasets is presented in Table 2. Comparing the DEGs of SCZ with those of its selected comorbidities, we observed deregulated expression of three DEGs, namely HAPLN1, CNDP1, and SLC12A2, that were common to SCZ and all of the selected pathologies, suggesting that the selected disorders are comorbidities of SCZ. Fig 1 shows the common genes between SCZ and the selected comorbidities. We then detected common genes in subset comparisons. SCZ shared four DEGs (HIF3A, HAPLN1, CNDP1, SLC12A2) with AUD, MDD, and OCD. A gene signature of six genes (ENPP2, HAPLN1, DTNA, TMTC4, CNDP1, SLC12A2) was common to SCZ, MDD, AUD, and ED. Furthermore, a signature of 15 genes (ACKR1, TBC1D2B, COL6A2, ADAM22, MAPK11, HAPLN1, MTHFR, ZNF493, CLIP1, CMTM3, CNDP1, SLC12A2, DCT, MAP2K7, DIAPH1) was common to SCZ, MDD, OCD, and ED. Finally, ten genes (CEBPD, HSBP1, CREB1, HAPLN1, MRVI1, NTM, ANGPTL4, CNDP1, SLC12A2, ZNF114) were common to SCZ, AUD, OCD, and ED.
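The relationship between the subset comparisons and the three genes shared by all pathologies can be checked with a simple set intersection. The Python sketch below uses the gene lists reported in this section; the dictionary keys are labels introduced here only for illustration.

```python
# Gene signatures reported in Section 3.1, keyed by the disorders that share them.
signatures = {
    "SCZ+AUD+MDD+OCD": {"HIF3A", "HAPLN1", "CNDP1", "SLC12A2"},
    "SCZ+MDD+AUD+ED": {"ENPP2", "HAPLN1", "DTNA", "TMTC4", "CNDP1", "SLC12A2"},
    "SCZ+MDD+OCD+ED": {
        "ACKR1", "TBC1D2B", "COL6A2", "ADAM22", "MAPK11", "HAPLN1", "MTHFR",
        "ZNF493", "CLIP1", "CMTM3", "CNDP1", "SLC12A2", "DCT", "MAP2K7", "DIAPH1",
    },
    "SCZ+AUD+OCD+ED": {
        "CEBPD", "HSBP1", "CREB1", "HAPLN1", "MRVI1", "NTM", "ANGPTL4",
        "CNDP1", "SLC12A2", "ZNF114",
    },
}

# Genes present in every signature are shared by SCZ and all four comorbidities.
shared = set.intersection(*signatures.values())
print(sorted(shared))  # ['CNDP1', 'HAPLN1', 'SLC12A2']
```

The intersection recovers HAPLN1, CNDP1, and SLC12A2, consistent with the three DEGs reported as common to all selected pathologies.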
Fig 1: The Venn diagram shows the pairwise comparison of genes among schizophrenia and comorbidities.
Table 2: Summary statistics of significant differentially expressed genes
GEO: Gene expression omnibus; DEGs: differentially expressed genes; SCZ: Schizophrenia; MDD: Major depression disorder; OCD: Obsessive-compulsive disorder; AUD: Alcohol use disorder; ED: Eating disorder
3.2 GO Pathways - Gene ontologies help to dissect the molecular involvement of the DEGs and provide insight into the underlying biological processes. Detecting enriched gene ontologies is a well-accepted bioinformatics method for clarifying biological associations. We sought to identify gene ontologies that are common between SCZ and its comorbidities. We obtained several gene ontologies that were significantly enriched by the DEGs of schizophrenia. The biological processes enriched by each SCZ dataset are summarized in Table 3.
We performed comparative analyses to identify common gene ontologies between SCZ and the selected pathologies. Fig 2 shows the gene ontologies that were common between SCZ and its comorbid pathologies:
a) reproductive structure development;
b) reproductive system development;
c) response to antibiotic;
d) regulation of MAP kinase activity;
e) stress-activated protein kinase activity;
f) positive regulation of protein serine/threonine kinase activity;
g) stress activated MAPK cascade;
h) activation of protein kinase activity;
i) multicellular organism process;
j) ameboidal-type cell migration;
k) meiotic cell cycle;
l) modulation of chemical synaptic transmission;
m) regulation of trans-synaptic signaling;
n) regulation of protein kinase B signaling;
o) peptidyl-tyrosine phosphorylation;
p) peptidyl-tyrosine modification;
q) positive regulation of protein kinase B signaling;
r) negative regulation of cytokine production.
Table 3: The biological processes enriched by each dataset of schizophrenia
3.3. Semantic similarity and KEGG enrichment - We evaluated the similarities of the ontologies and pathways via a semantic similarity approach. Our analysis showed the semantic connection of the DEGs (Fig 3).
At a semantic similarity value of 0.7, SCZ1 GSE12654, SCZ2 GSE17612, SCZ3 GSE21138, SCZ4 GSE21935, and SCZ5 GSE37981 were associated with several comorbidities, particularly MDD, OCD, AUD, and ED. We then investigated the semantic similarity of the GO terms. The close associations based on semantic similarity of the GO terms are shown in Fig 4.
Fig 2: The bar plot shows the biological processes.
Fig 3: Semantic similarity matrix of differential genes (from the first five GO terms). The three-letter prefixes before the GSE codes refer to the following: AUD, Alcohol use disorder; GAD, generalized anxiety disorder; MDD, major depressive disorder; OCD, Obsessive-compulsive disorder; PTSD, post-traumatic stress disorder; SCZ, Schizophrenia.
Fig 4: Semantic matrix of GO similarities (first five GO terms).
Fig 5: Semantic matrix of DO terms. Legend: AUD, Alcohol use disorder; GAD, generalized anxiety disorder; MDD, major depressive disorder; OCD, Obsessive-compulsive disorder; PTSD, post-traumatic stress disorder; SCZ, Schizophrenia.
Notably, at a value over 0.7, the SCZ1 GSE12654, SCZ2 GSE17612, SCZ3 GSE21138, SCZ4 GSE21935, and SCZ5 GSE37981 datasets clustered with all MDD and AUD datasets. ED and OCD clustered at a semantic similarity score of 0.4. We also performed semantic similarity analysis of the DO terms, which showed that, above a threshold of 0.2, MDD and OCD were the disorders most closely associated with SCZ (Fig 5). In particular, MDD had a similarity value of 0.39.
Finally, we analyzed the semantic similarity of KEGG pathways across the investigated datasets (Fig 6). The major pathways recurring among the SCZ datasets were the FoxO signaling pathway, MAPK signaling pathway, transcriptional misregulation in cancer, cellular senescence, cell cycle, PI3K-Akt signaling pathway, prostate cancer, TNF signaling pathway, and TGF-beta signaling pathway. Most of these pathways are also common between SCZ and the other pathologies (Fig 6).
Fig 6: KEGG pathway overrepresentation analysis of differential genes. Rows represent pathways related to diseases; columns show the datasets. The circle size is proportional to the frequency of genes in the pathway.
Fig 7: PPI network of common differential genes between SCZ and its comorbidities. A) The merged PPI network of differentially expressed genes; B) The modules obtained in the PPI network; C) The ontologies enriched by the identified modules.
3.4 Protein-protein interaction network analysis - To reveal the interactions of signaling molecules in the context of cellular networks, we performed PPI and cluster analysis with a global network to provide insight into the interactions of SCZ and its comorbidities (Fig 7a). We used several databases, namely BioGrid, OmniPath, and InWeb_IM, to build the PPI network. We then detected modules (i.e., densely connected components of the network) in this PPI network using the MCODE plugin. MCODE detected eight densely connected modules (Fig 7b,c). To gain better insight into the functional importance of these modules, we also performed pathway and enrichment analysis. The three best-scoring terms by p-value are presented as functional descriptions of the modules (Fig 7c).
The core aim of this study was to design a systems biology approach to investigate the molecular associations of the complex disease schizophrenia and its psychiatric comorbidities. The study leveraged publicly available information and multi-omics datasets to identify potential interconnections of SCZ and other pathologies. Gene expression profiling is a rapid and extensively used technique for identifying significant genes and markers of disease compared to controls (Rahman et al., 2021), and the expression profiling data of SCZ were the initial step of our approach. GSEA is a well-known bioinformatics method that explains the involved pathways, biological processes, and relationships with other pathologies, and works as a bridge among various levels of omics data using different ontologies such as GO and DO (Rahman et al., 2021).
Another interesting bioinformatics approach is semantic similarity, which quantifies the closeness of different datasets (i.e., omics data) based on a selected ontology without considering statistical parameters or measures. By combining these approaches, we designed a bioinformatics approach to study the interconnections of SCZ with other pathologies. Our approach reveals significant key genes, biological processes, cellular pathways, and signaling molecules. The key genes, namely HAPLN1, CNDP1, and SLC12A2, were critical components at the transcriptome level in the development and progression of SCZ and the other selected pathologies, i.e., MDD, OCD, AUD, and ED. HAPLN1 is one of the major components of the perineuronal structures that are principal components of the extracellular matrix of neurons in the central nervous system, including the hippocampus, cerebellum, and spinal cord (Zimmermann et al., 2008). This perineuronal structure is often deregulated in neurodegenerative diseases. The mRNA expression levels of HAPLN1 have been observed to be significantly upregulated in neuronal differentiating cells compared to non-differentiating cells (Zimmermann et al., 2008).
However, despite the upregulation of mRNA expression, intracellular protein levels of HAPLN1 were reduced, suggesting that after neuronal differentiation HAPLN1 was secreted into the extracellular milieu. HAPLN1 has an important role in the formation of the perineuronal structure, which begins in the embryonic stages. Given this critical role of HAPLN1 in neuronal differentiation, we suspect that developmental defects may be associated with predisposition to SCZ. The CNDP1-encoded protein is exclusively expressed in the brain. In a proteomic profiling study of neuropsychiatric disease, CNDP1 was significantly down-expressed in the cerebrospinal fluid of SCZ patients in comparison to healthy control individuals (Al Shweiki et al., 2020), which corroborates our finding that CNDP1 is one of the critical markers of psychiatric illness. SLC12A2 encodes a Cl(-)-importing cation-Cl(-) cotransporter that is involved in γ-aminobutyric acid (GABA) neurotransmission. Perturbed GABA neurotransmission in the prefrontal region of the brain has been suggested to be associated with SCZ pathogenesis. SNPs in SLC12A2 have previously been reported to confer an increased risk of SCZ via dysregulated expression at the mRNA level. Moreover, a functional missense mutation of SLC12A2 was later discovered in human SCZ, suggesting that genetically regulated alterations may be involved in the etiopathogenesis of SCZ (Merner et al., 2016; Panichareon et al., 2012).
The gene ontology terms common to SCZ and its psychiatric comorbidities were explored, and several crucial processes implicated in pathogenesis were revealed. For instance, MAP kinase activity and the stress-activated MAPK cascade have a crucial role in neuronal functions including synaptic plasticity, learning, memory, and cell survival. The role of MAP kinase activity in the pathogenesis of SCZ is increasingly being recognized. Recent evidence suggests that the MAP kinase pathway, via ERK signaling, contributes to the pathogenesis of SCZ, particularly in the cerebellum. Our approach also revealed synaptic dysregulation (modulation of chemical synaptic transmission; regulation of trans-synaptic signaling), a prominent feature of psychiatric and neurological disorders (Lepeta et al., 2016). Perturbed synaptic function and communication are widely accepted to be associated with psychiatric and neurological diseases (Lepeta et al., 2016). Recent findings show that major psychiatric pathologies are connected to synapse pathology characterized by perturbed synaptic signaling, synapse loss, and altered density and morphology of dendritic spines (Wang et al., 2018; Van Spronsen et al., 2010). Thus, the synapse is an essential focus for therapy to delay disease development and retain cognitive and functional capacities throughout the disorder. Among the KEGG pathways detected via the semantic similarity approach, the FoxO pathway was notably common to SCZ and the other pathologies (Santo et al., 2018).
FoxO proteins occur throughout the body but are selectively expressed in the central nervous system (CNS) (Santo et al., 2018). They have been proposed to regulate stem cell proliferation and the survival of differentiated cells (Santo et al., 2018). Given these crucial roles in cellular functions in the CNS, they have been suggested as therapeutic targets for various neurological diseases (Maiese, 2015). The complex interactions of FoxO proteins and their associated pathways affect apoptosis and autophagy, which points to potential therapeutic strategies for the treatment of neurological disease. Other prominent pathways identified in this study, the MAPK signaling pathway and the PI3K-Akt signaling pathway, are known to contribute to cellular proliferation, survival, differentiation, cell death, and neural plasticity (Kim and Choi, 2010; Matsuda et al., 2019). Perturbations of these pathways have been suggested to be implicated in neurodegenerative disease and psychiatric illness. One study revealed that the MAPK and cAMP pathways were significantly altered in the frontal cortex in schizophrenia, suggesting hypoglutamatergic function in SCZ (Funk et al., 2012; Yuan et al., 2010). These findings are consistent with our finding that the MAPK pathway, a key intracellular pathway, may contribute to the pathophysiology of psychiatric illness. The PI3K-Akt signaling pathway participates in neural plasticity, and aberrations of this pathway may be involved in the development and progression of SCZ. Previous findings showed that synaptic dysfunction is associated with the development of SCZ, either at the initial stage of brain synaptic circuit development or later by modulating synaptic plasticity.
Alterations of the PI3K-Akt pathway have previously been shown to be involved in SCZ (Kalkman et al., 2006; Law et al., 2012; Zheng et al., 2012), which is consistent with our observation. The PI3K-Akt pathway has been described as a target in the treatment of psychiatric disorders. The TNF signaling pathway was also significantly identified in this study. A previous study assessed this pathway to decipher its role in SCZ and bipolar disorder using plasma and brain dorsolateral prefrontal cortex (Hoseth et al., 2017). That study suggests increased levels of TNF signaling pathway markers in plasma without a corroborating increase in gene expression in blood cells; however, the role of these markers in the dorsolateral prefrontal cortex remained uncertain. The involvement of the immune system and cytokines is hypothesized as a possible cause in the etiology of SCZ. One study demonstrated that inflammatory markers including IL-6 and TGF-beta were significantly overexpressed in patients with schizophrenia (Ergün et al., 2018), which is consistent with our observation that the TGF-beta pathway is enriched in SCZ and its comorbidities. Despite the critical findings of this study, several limitations of our approach should be noted. First, brain data for psychiatric illness are scarce, so we could not assess all types of psychiatric comorbidities. Second, the approach is based on the integration of heterogeneous diseases, so covariates may affect the outcome. Since the findings rely extensively on bioinformatic integration, the results should be interpreted with caution.
The present study aimed to understand the interconnections of SCZ and its major psychiatric comorbidities based on an integrative bioinformatics approach. We developed analytical pipelines to decode the molecular functions and processes dysregulated in both SCZ and its related pathologies. The analysis showed three genes, namely HAPLN1, CNDP1, and SLC12A2, as key DEGs shared between SCZ and other psychiatric pathologies. The findings also highlight several interesting pathways mainly involved in synaptic plasticity and the regulation of synaptic activity. The FoxO signaling pathway, MAPK pathway, PI3K-Akt pathway, TNF signaling pathway, and TGF-beta signaling pathway emerged as crucial pathways that may be involved in the etiopathogenetic mechanism of SCZ and associated pathologies. The designed pipeline will be of interest to biological scientists and clinicians who study SCZ and associated comorbidities. To further evaluate the clinical significance of these findings, we suggest further studies by clinical researchers.
Data Availability
All the data used are available in public databases. The code used in this study will be provided by the corresponding author upon reasonable request.
Not applicable.
The authors declare no conflict of interest.
Academic Editor
Dr. Abduleziz Jemal Hamido, Deputy Managing Editor (Health Sciences), Universe Publishing Group (UniversePG), Haramaya, Ethiopia.
Assistant Professor, Dept. of Biotechnology and Genetic Engineering, Islamic University, Kushtia, Bangladesh.
Islam MM, Auwul MR, Begum MM, and Faruquee HM. (2021). Bioinformatics and multi-omics approach to identify comorbidities with application in Schizophrenia with psychiatric disorders, Eur. J. Med. Health Sci., 3(2), 35-47. https://doi.org/10.34104/ejmhs.021.035047