Academic year: 2022

The role of RFX-target genes in

neurodevelopmental and psychiatric disorders

Abhishekapriya Ganesan

Degree project in bioinformatics, 2021

Examensarbete i bioinformatik 45 hp till masterexamen, 2021

Biology Education Centre, Uppsala University, and Department of Women’s and Children’s Health, Karolinska Institutet

Supervisors: Kristiina Tammimies and Michelle Watts

ABSTRACT

Neurodevelopmental disorders such as autism spectrum disorder (ASD) and psychiatric disorders such as schizophrenia (SCZ) represent a large spectrum of disorders that manifest through cognitive and behavioural problems. ASD and SCZ are both highly heritable, and some phenotypic similarities between ASD and SCZ have sparked an interest in understanding their genetic commonalities. The genetics of both disorders exhibit significant heterogeneity. Developments in genomics and systems biology continually increase our understanding of these disorders. Recently, pathogenic genetic variants in the regulatory factor X (RFX) family of transcription factors have been identified in a number of ASD cases. In this thesis, common genetic variants and expression patterns of genes identified to have a conserved promoter X-Box motif region, a binding site of RFX factors, are studied. Significant common variants identified through expression quantitative trait loci (eQTLs) and genome wide association studies (GWAS) are mapped to the regulatory regions of these genes and analysed for putative enrichment. In addition, single-cell RNA sequencing data is utilised to examine enrichment of cell types having high X-Box gene expression in the developing human cortex.

Through the study, genes that have eQTLs or SNPs in the genomic regulatory regions of the X-Box genes have been identified. While there were no eQTLs or GWAS SNPs in the X-Box motifs themselves, the X-Box promoter regions contained 247 eQTLs, 359 ASD GWAS SNPs, 16 eQTL-ASD GWAS SNP overlaps, 745 SCZ GWAS SNPs and 27 eQTL-SCZ GWAS SNP overlaps. Based on the p-values obtained from hypergeometric distribution testing, all of these distributions are statistically under-enriched. Further, major cell types in the cortical region with increased expression of the X-Box genes, and the most expressed genes among these enriched cell types, have been identified. Among the 11 cell types, seven were found to be enriched for X-Box genes, and many of the most expressed genes in these cell types were similar. A further study into the cell types and genes identified, along with additional systems biological data analysis, could reveal a larger list of X-Box genes involved in ASD and SCZ and the specific roles of these genes.

ARE GENES CONTROLLED BY RFX FACTORS INVOLVED IN NEURODEVELOPMENTAL AND PSYCHIATRIC DISORDERS?

Popular Science Summary Abhishekapriya Ganesan

Neurodevelopmental and psychiatric disorders are a large group of disorders characterised by behavioural and cognitive defects. Neurodevelopmental disorders like autism spectrum disorder (ASD) and psychiatric disorders like schizophrenia (SCZ) are known to affect a significant number of people. Given the similarities in phenotype between these two disorders, understanding the commonalities in the underlying genetics is also of interest. Abnormalities in genes can cause ASD and SCZ. Unlike in many well-understood disorders, these abnormalities can vary greatly among autistic and schizophrenic individuals. The genetic changes can be inherited or occur de novo, and variants can be classified as either rare or common. Transcription factors are a set of proteins that bind to certain regions of the DNA, called regulatory regions, and regulate the expression of associated genes. RFX transcription factors have recently been identified as ASD risk genes through the identification of rare variants associated with ASD. RFX factors are known to bind X-Box motif sites in the promoters of their target genes, referred to here as X-Box genes. The aim of this study was to characterise regulatory targets of RFX factors by examining common variants in X-Box sites and patterns of X-Box gene expression in the brain. By studying X-Box gene regulation, the role of RFX factors and their downstream regulatory networks in ASD could be further understood, and potential X-Box genes that may contribute to the pathogenicity of RFX alterations could be identified. Through this study, ASD and SCZ related genetic factors have been identified in the regulatory regions of the X-Box genes, indicating that these genetic changes could potentially affect the regulation of the X-Box genes. Also, by analysing single cells from the developing human cortex, a number of cell types have been shown to have increased expression of X-Box genes, and specific genes that are highly expressed in each of these cell types have been identified. Further studies of these genes could establish a stronger connection between the disorders and these X-Box genes.


CONTENTS

1. Introduction
   1. Neurodevelopmental and psychiatric disorders
      1.1. Autism Spectrum Disorder
         1.1.1. Genetics of ASD
      1.2. Schizophrenia
      1.3. Schizophrenia and Autism spectrum disorders: the connection
   2. Regulatory factor X transcription factors
      2.1. Target genes of RFX factors
      2.2. RFX factors in psychiatric disorders
   3. Expression quantitative trait loci
   4. Genome wide association studies
   5. Single-cell RNA sequencing analysis
      5.1. Expression weighted cell type enrichment
2. Aims
3. Methods
   Part-A
   1. Analysis of eQTLs and X-Box genes
      1.1. GTEx eQTL data
      1.2. RFX target gene list
      1.3. Background gene list: all brain expressed genes
      1.4. Data processing and statistical analysis
      1.5. Data visualization
   2. Analysis of GWAS data with eQTLs and X-Box Genes
      2.1. GWAS data
      2.2. Data processing and statistical analysis
         2.2.1. eQTLs – SNPs in X-Box promoters
         2.2.2. Statistical analysis
      2.3. Data visualization
   3. Overall data processing
   Part-B
   1. Single cell RNA sequencing data – Expression weighted cell type enrichment analysis
      1.1. Data processing
      1.2. Data visualization
4. Results
   Part-A
   1. Investigating brain eQTLs in X-Box gene regulatory regions
      1.1. eQTL distribution in X-Box gene TSS flanking regions
      1.2. Proximity of eQTLs to regulatory regions
      1.3. eQTLs in the promoter region of X-Box genes and background genes
   2. GWAS SNPs within the promoter regions of X-Box genes
      2.1. A comparison of GWAS hits in X-Box gene promoters and background promoters
      2.2. Intersection of eQTLs and GWAS SNPs in X-Box gene promoter regions
      2.3. Visualizing genes with many eQTLs and GWAS hits in the promoter
   Part-B
   1. Single cell RNA sequencing analysis
      1.1. Expression weighted cell type enrichment analysis of scRNA seq expression matrix
      1.2. Identifying top genes in the level 1 annotation cell types with enriched expression
5. Discussion
6. Future work
7. Acknowledgement
8. References
9. Supplementary material

ABBREVIATIONS

ASD         Autism Spectrum Disorder
BED         Browser Extensible Data
CNV         Copy Number Variation
DSM-5       The Diagnostic and Statistical Manual of Mental Disorders
eQTL        Expression Quantitative Trait Loci
GTEx        Genotype-Tissue Expression
GWAS        Genome wide association studies
MANE        Matched Annotation between NCBI and EBI
RFX         Regulatory factor X
scRNA seq   Single-cell RNA sequencing
SCZ         Schizophrenia
SMR         Summary-data-based Mendelian Randomization
SNP         Single Nucleotide Polymorphism
TF          Transcription factors
TSS         Transcription Start Site
WGS         Whole Genome Sequencing

INTRODUCTION

1 NEURODEVELOPMENTAL AND PSYCHIATRIC DISORDERS

Neurodevelopmental and psychiatric disorders represent a large spectrum of disorders affecting the brain and manifesting through problems in areas such as cognition, learning, behavior, and emotion (Thapar, Cooper, and Rutter 2017; Miyoshi and Morimura 2010). In the case of neurodevelopmental disorders such as autism spectrum disorder (ASD), the growth and development of the brain is disrupted, and symptoms have early onset (Thapar, Cooper, and Rutter 2017). Psychiatric disorders such as schizophrenia (SCZ) can occur later in development during adolescence and adulthood (Thapar, Cooper, and Rutter 2017; Masi and Liboni 2011).

The challenging neurobiology of the brain impedes deeper understanding in the field. Notably, neurodevelopmental and psychiatric disorders are highly heritable and studying the genetic basis of these disorders would help further our understanding of the biology of these disorders (Lotan et al. 2014).

1.1 AUTISM SPECTRUM DISORDER

ASD is a neurodevelopmental disorder with varying severity and a wide spectrum of behavioural and cognitive characteristics (Lai, Lombardo, and Baron-Cohen 2014). In Europe, ASD is estimated to affect 1 in 89 children (“ASDEUExecSummary27September2018.Pdf”) and incurs a substantial economic burden. Based on the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), the diagnosis of ASD is determined by persisting deficits in social communication, repetitive activities and behaviours, impairment in social situations due to these symptoms, and presentation of symptoms during early developmental stages (Association 2013). ASD is often comorbid with other neurodevelopmental disorders (Simonoff et al. 2008) and psychiatric disorders (Lugo Marín et al. 2018). Although there are currently no reliable biomarkers for the diagnosis of ASD, a molecular diagnosis can be established in a minority of cases through genome-wide genetic tests such as whole exome sequencing (Tammimies et al. 2015; Rossi et al. 2017).

1.1.1 GENETICS OF ASD

Genetics plays a major role in ASD risk. ASD is highly heritable (Sandin et al. 2017; Tick et al. 2016), with an estimated 90% of phenotypic variance being heritable (Sandin et al. 2017), and the genetic factors that play a role in the disorder are highly heterogeneous (Betancur 2011), with contributions from common and rare as well as inherited and de novo genetic variants (Betancur 2011; An and Claudianos 2016). ASD sequencing studies have identified hundreds of ASD-associated genes (Betancur 2011), and rare variants in these genes account for about 10-30% of ASD cases (Ronemus et al. 2014; Ohashi et al. 2021).

Although a large number of genes involved in ASD have been identified, their function in ASD etiology is not clearly understood. Mechanistic understanding of the molecular effects would prove useful in the treatment of ASD. Efforts to understand the mechanisms have indicated that ASD genes are interconnected within regulatory gene networks, such that ASD gene variations cause trans-regulatory effects and also cause the cis-regulatory targets to regulate proximal ASD genes (Iakoucheva, Muotri, and Sebat 2019).

Currently, relatively little is known about the regulatory variants in ASD and how they add to the genomic landscape. For instance, studying rare variants affecting promoter regions and expression quantitative trait loci (eQTLs), i.e. genetic variants associated with gene expression levels, could reveal additional levels of genetic association in ASD (Cheng, Quinn, and Weiss 2013; Persico and Napolioni 2013). To better understand the role of identified genes and biological networks underlying the disorder, detailed studies investigating expression patterns and regulatory effects using relevant tissues and cells are needed.

1.2 SCHIZOPHRENIA

SCZ is a psychiatric condition with heterogeneous behavioural and cognitive manifestations (Owen, Sawa, and Mortensen 2016). It occurs at a rate of 15.2 per 100,000 people and is associated with a two- to three-fold increased risk of death (McGrath et al. 2008). It incurs a significant economic as well as emotional burden. As per the DSM-5, SCZ can be diagnosed by the persistent presentation of any two symptoms such as delusions, hallucinations, incoherent speech or disorganized behaviour, which cause disturbances in social interactions and normal functioning for at least six months (Association 2013). SCZ is also highly heritable (Hilker et al. 2018), with an estimated heritability of about 81% (Sullivan, Kendler, and Neale 2003) and contributions from common and rare mutations (Mojarad et al. 2021).

1.3 SCHIZOPHRENIA AND AUTISM SPECTRUM DISORDER: THE CONNECTION

SCZ and ASD share phenotypic similarities related to difficulties in social interactions and cognition. Due to this, in recent times, there has been an increase in the efforts to identify whether there may be a link between the two and to understand the molecular basis of commonalities between ASD and SCZ (Pinkham et al. 2008; De Crescenzo et al. 2019).

2 REGULATORY FACTOR X TRANSCRIPTION FACTORS

Transcription Factors (TFs) encompass a wide range of proteins involved in transcription initiation and regulation of gene expression. They contain a DNA binding domain which enables them to bind to specific regions in the DNA, namely enhancers and promoters, initiating transcription through the formation of the transcription initiation complex which then causes transcriptional regulation (“General Transcription Factor / Transcription Factor | Learn Science at Scitable”).

In mammals, the RFX (Regulatory factor binding to the X-Box) TF family consists of eight members (RFX1 to RFX8), the majority of which are expressed within the brain (Choksi et al. 2014). The RFX family shares a characteristic winged-helix type DNA binding domain.

2.1 TARGET GENES OF RFX FACTORS

In different organisms, RFX TFs are known to control genes involved in various pathways from DNA repair (Zaim, Speina, and Kierzek 2005) to the regulation of cells involved in immune response (Senti et al. 2009) and the formation of cilia on cell surfaces (Choksi et al. 2014).

Such a broad range of functions implies that alteration to the RFX family of transcription factors and their targets could affect various cellular pathways.

The RFX family of transcription factors are regulators of ciliogenesis gene networks and contain a highly conserved DNA binding domain that recognizes regulatory targets through X-Box motifs in promoter regions (Efimenko et al. 2005). The structural arrangement of a target gene and its X-Box motif is represented in Fig 1.

FIGURE 1: Representation of X-Box and transcription start site with respect to the downstream gene.

2.2 RFX FACTORS IN PSYCHIATRIC DISORDERS

There are numerous mutations identified in RFX factors and primarily RFX3 that link it to ASD and SCZ. A recent analysis of the de novo and inherited variants of RFX3 identified 15 distinct variants of various types, including two frameshift variants and eight missense variants (Harris et al. 2021). The study has also shown that there is an enrichment of RFX ChIP-seq peaks in known ASD risk genes, specifically in the cis-regulatory regions (Harris et al. 2021). The expression patterns of RFX TFs in developing and adult human cortex showed the highest expression for RFX3 and RFX7. RFX3 was expressed in glutamatergic layer 2/3 neurons and astrocytes in the adult cortex, and RFX7 in inhibitory and excitatory neurons. Such patterns may indicate that they play a role in altering cell fates during development or impact the functioning of the mentioned neuronal cell types and astrocytes (Harris et al. 2021). There is also evidence of RFX3 being implicated in SCZ through copy number variants (CNVs) (Sahoo et al. 2011). Additionally, as RFX3 contributes to development and functioning of cilia (El Zein et al. 2009), the connection between neurodevelopment and ciliary function can be an aspect of investigation.

3 EXPRESSION QUANTITATIVE TRAIT LOCI

Expression quantitative trait loci (eQTLs) are common genomic variants that explain a fraction of the variation in a particular gene expression phenotype (Nica and Dermitzakis 2013). eQTLs are identified by analysing genetic variation and gene expression data together to find associations between variants and gene expression. eQTLs can act in trans or in cis: a trans eQTL affects a gene located far from the variant, whereas a cis eQTL lies close to the gene it affects and is likely of higher importance, as proximity is typically associated with greater regulatory control. The Genotype-Tissue Expression (GTEx) project is a public resource for studying gene expression and regulation in different tissues. As it is a comprehensive data source, the single-tissue cis eQTL data for 13 different regions of the brain used in this study were obtained from the GTEx portal.

4 GENOME WIDE ASSOCIATION STUDIES

To understand the contributions of common genetic variants to a disease, large scale genomic studies are required. Genome wide association studies (GWAS) utilize high-throughput single nucleotide polymorphism (SNP) arrays to screen hundreds or thousands of individuals and identify significant associations of SNPs with the phenotype in question (Manolio 2010).

eQTL data can also be intersected with GWAS statistics to determine whether disease-associated SNPs from GWAS are involved in regulating gene expression. In many cases, GWAS SNPs that are in linkage disequilibrium, i.e. SNPs which are non-randomly associated, span multiple genes. eQTLs aid in the selection of the causal genes in such a scenario (Jansen et al. 2017).

5 SINGLE-CELL RNA SEQUENCING ANALYSIS

Single-cell RNA sequencing (scRNA-seq) is a method of sequencing RNA (transcriptomic data) isolated from single-cells (Tang et al. 2009). In comparison to other methods that analyse large tissues or bulk cells, looking at individual cells increases the resolution. This enables scRNA-seq analysis to identify expression profiles (cell states, phenotypes) of various cell types in a sample, along with genes in each cell-type (Hedlund and Deng 2018). This serves as another method of analysing and identifying risk genes associated with disorders such as ASD and SCZ.

5.1 EXPRESSION WEIGHTED CELL-TYPE ENRICHMENT

Given the varied genes contributing to brain-related disorders, identifying the cell types that are affected, and the stages at which they may be affected, would deepen the understanding of neurodevelopmental and psychiatric disorders. This can be done by analysing the expression level probability distribution of genes with known susceptibility to a particular disorder with respect to the average expression for the cell types under normal conditions. The enrichment analysis utilizes the expression matrix generated in single-cell sequencing studies to statistically evaluate the chance of expression of a given set of genes in the cell types included in the matrix (Skene and Grant 2016). This has mainly been impactful in analysing polygenic disorders, where the impacted cell types and the progression of the disorders can be identified (Skene and Grant 2016; Amar et al. 2021; Kim et al. 2021).

AIMS

This master's thesis had two aims:

Part - A: Map and analyse common variants occurring in X-Box genes

The regulatory variants within X-Box genes were to be mapped and their enrichment in ASD and SCZ was to be investigated.

To achieve this aim, a two-part approach was taken. The first part was the mapping of significant eQTLs to the regulatory regions of RFX target genes and all brain expressed genes. The next part was the mapping of the ASD and SCZ GWAS SNPs, and of overlapping eQTL-GWAS SNPs, to the regulatory regions of RFX target genes and all brain expressed genes. Finally, the enrichment of these variants in ASD and SCZ was to be identified.

Part – B: Determine enrichment of X-Box genes in neuronal cell-types using scRNA-seq data

The expression matrix obtained from analysis of scRNA-seq data of the human foetal cortex was to be used to accomplish this aim. The expression matrix was to be used to identify enriched cell types, i.e. those with higher expression of the X-Box genes. Further, specific genes that may be highly expressed in the enriched cell types were to be identified.

METHODS

PART-A

1 Analysis of eQTLs and X-Box genes

1.1 GTEx eQTL data

Single-tissue conditionally independent cis eQTL data for all brain regions (13 regions in total) were obtained from the GTEx portal. The 13 lists were merged and reformatted in R using the dplyr package (version 1.0.5). A browser extensible data (BED) file was generated containing only the columns with the eQTL chromosome, eQTL start and stop positions, the associated gene, and the brain region in which the eQTL was found.
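The merging step can be illustrated with a minimal sketch. The thesis performed this step in R with dplyr; the Python below is only an illustration, and the record layout, tissue names and gene names are hypothetical rather than taken from the GTEx files. It assumes that each eQTL is a single-base variant reported as a 1-based position, which is converted to the 0-based, half-open interval used by the BED format.

```python
# Hypothetical per-tissue eQTL records: (chromosome, 1-based position, gene).
# The layout and values are illustrative, not copied from GTEx.
tissue_tables = {
    "Brain_Cortex": [("chr1", 1000, "GENE_A"), ("chr2", 500, "GENE_B")],
    "Brain_Cerebellum": [("chr1", 1000, "GENE_A")],
}


def to_bed_rows(tables):
    """Merge per-tissue tables into BED-style rows with a tissue column."""
    rows = []
    for tissue, records in tables.items():
        for chrom, pos, gene in records:
            # A variant at 1-based position p occupies the BED interval [p-1, p).
            rows.append((chrom, pos - 1, pos, gene, tissue))
    # Sort by chromosome name, then start position, then tissue name.
    return sorted(rows, key=lambda r: (r[0], r[1], r[4]))


bed_rows = to_bed_rows(tissue_tables)
for row in bed_rows:
    print("\t".join(str(field) for field in row))
```

Keeping the tissue as an extra column means the same variant can appear once per brain region, which is what later region-wise counts rely on.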

1.2 RFX target gene list

The curated list of conserved human X-Box sites, obtained from the collaborator Dr Peter Swoboda (KI), was used (Henriksson et al. 2013). The list contained information on 999 X-Box motifs, their associated genes, and the transcription start site (TSS) and strand of each gene. The X-Box list was processed to contain the chromosome, start and stop positions, and the corresponding gene and strand information. Three X-Box BED files were used, with positional information for 1) the X-Box motif sites, 2) the TSS flanking regions of the X-Box genes, and 3) the promoter regions of the X-Box genes. The TSS flanking regions were defined as 2,500 base pairs upstream and 1,000 base pairs downstream of the transcription start site. These regions had originally been selected to scan for the X-Box motifs, and the same regions were retained in this study to analyse the common variants. The list of promoter regions of the X-Box genes was obtained by selecting the 2,000 base-pair region upstream of the transcription start site of each gene.
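As an illustration of the region definitions above, the following sketch derives the TSS flanking region and the promoter from a TSS coordinate. The text does not state how strand was handled or which coordinate convention was used, so the strand-aware logic, the clamping at zero, and the function names are assumptions made for this example.

```python
def window_around_tss(tss, strand, upstream, downstream):
    """Return (start, end) around a TSS, with 'upstream' interpreted relative to the strand.

    Coordinates are plain integer positions; the start is clamped at 0.
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        # On the minus strand, upstream lies at higher genomic coordinates.
        start, end = tss - downstream, tss + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    return max(0, start), end


def tss_flank(tss, strand):
    """TSS flanking region: 2,500 bp upstream and 1,000 bp downstream."""
    return window_around_tss(tss, strand, upstream=2500, downstream=1000)


def promoter(tss, strand):
    """Promoter region: the 2,000 bp upstream of the TSS."""
    return window_around_tss(tss, strand, upstream=2000, downstream=0)


print(tss_flank(10_000, "+"), promoter(10_000, "+"))
print(tss_flank(10_000, "-"), promoter(10_000, "-"))
```

Handling the strand explicitly matters because a fixed "subtract 2,000" rule would place the promoter of a minus-strand gene inside the gene body.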

1.3 Background gene list: all brain expressed genes

A background gene list of all brain expressed genes was generated from the Human Protein Atlas, which contains 16,227 human brain expressed genes. The background gene list was used for comparison with the genes of interest. To extract promoter regions for genes with multiple transcripts, the Matched Annotation between NCBI and EBI (MANE) Select transcript was used. This provided a main transcript for each gene based on various annotations. Ensembl transcript IDs were then used to extract the chromosome, strand and TSS information using Ensembl BioMart. As before, the 2,000 base-pair region upstream of the transcription start site of each gene was taken as the promoter, forming the background promoter list.

1.4 Data processing and statistical analysis

Analysis of eQTLs was performed on the Uppsala Multidisciplinary Center for Advanced Computational Science (UPPMAX) server (codes in supplementary material S1) after loading the necessary bioinformatics tools module and the BEDTools module (version 2.29.2). The BED file entries were first sorted lexicographically (alphabetically) using the BEDTools sort function. The BEDTools intersect function was then used to identify common entries among the input files. The eQTLs were intersected with the X-Box sites, the X-Box genes, the X-Box gene promoters, and the background promoter regions.
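To make the intersection step concrete, the sketch below shows the overlap test that an interval intersection relies on. It is not the BEDTools implementation and does not reproduce its options or output format; the records are hypothetical, and a simple quadratic scan is used for readability.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True when two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end


def intersect(variants, regions):
    """Return (variant, region) pairs on the same chromosome that overlap.

    Each record is (chrom, start, end, name). Every variant is compared with
    every region, which is adequate for a small illustration only.
    """
    hits = []
    for variant in variants:
        for region in regions:
            same_chrom = variant[0] == region[0]
            if same_chrom and overlaps(variant[1], variant[2], region[1], region[2]):
                hits.append((variant, region))
    return hits


# Hypothetical BED-style records (0-based start, end exclusive).
eqtls = [("chr1", 999, 1000, "eqtl_1"), ("chr1", 5000, 5001, "eqtl_2")]
promoters = [("chr1", 0, 2000, "GENE_A_promoter"), ("chr2", 0, 2000, "GENE_B_promoter")]

hits = intersect(eqtls, promoters)
for variant, region in hits:
    print(variant[3], "overlaps", region[3])
```

Because BED intervals are half-open, two intervals that merely touch (one ends where the other starts) are not counted as overlapping.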

Using the BEDTools closest function, the distances of eQTL hits from both the X-Box motif sites and the gene TSS were determined. To summarise the eQTL distances from the respective regulatory sites, the distance data were processed and binned in set intervals up to 2,000 base pairs upstream and 2,000 base pairs downstream of the regulatory sites, for a total region of 4,000 base pairs. Next, the TSS alone was considered, and the data were binned again in set intervals up to 2,000 base pairs upstream and 2,000 base pairs downstream of the TSS.
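The distance binning can be sketched as follows. The interval width is not stated in this section, so the 500 base-pair bins used here are an assumption, as are the sign convention (negative distances are upstream), the treatment of boundary values, and the example positions.

```python
from collections import Counter


def signed_distance(variant_pos, site_pos, strand):
    """Distance from a regulatory site to a variant; negative means upstream."""
    d = variant_pos - site_pos
    return d if strand == "+" else -d


def bin_label(distance, width=500, limit=2000):
    """Assign a distance to an interval such as '500-1000 upstream'.

    Returns None beyond the limit. A boundary value falls into the lower bin
    (1000 -> '500-1000'), and a distance of 0 is labelled downstream.
    """
    magnitude = abs(distance)
    if magnitude > limit:
        return None
    lower = (max(magnitude - 1, 0) // width) * width
    side = "upstream" if distance < 0 else "downstream"
    return f"{lower}-{lower + width} {side}"


# Hypothetical variant positions around a plus-strand site at position 10,000.
distances = [signed_distance(p, 10_000, "+") for p in (9_300, 9_400, 10_700, 13_000)]
counts = Counter(label for label in map(bin_label, distances) if label is not None)
print(counts)
```

Variants further than 2,000 base pairs from the site are dropped rather than collected in an overflow bin, mirroring the limit described in the text.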

This was followed by enrichment testing of the brain region-wise eQTL distributions in the X-Box promoters versus the background promoters using a hypergeometric p-value calculator.
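For readers unfamiliar with the hypergeometric test, a minimal sketch of the two tail probabilities is given below. The exact definition of the population, the successes and the draws used in the thesis is not given in this section, so the parameter names and the small counts are purely illustrative. The lower tail P(X <= k) speaks to under-enrichment and the upper tail P(X >= k) to over-enrichment.

```python
from math import comb


def hypergeom_pmf(k, population, successes, draws):
    """P(X = k) when drawing `draws` items without replacement from `population`."""
    return comb(successes, k) * comb(population - successes, draws - k) / comb(population, draws)


def hypergeom_tails(k, population, successes, draws):
    """Return (P(X <= k), P(X >= k)) for an observed count k."""
    lowest = max(0, draws - (population - successes))
    highest = min(draws, successes)
    p_under = sum(hypergeom_pmf(i, population, successes, draws) for i in range(lowest, k + 1))
    p_over = sum(hypergeom_pmf(i, population, successes, draws) for i in range(k, highest + 1))
    return p_under, p_over


# Hypothetical counts, chosen only so the arithmetic is easy to follow:
# 10 items in total, 4 of them "successes", 3 drawn, 0 successes observed.
p_under, p_over = hypergeom_tails(k=0, population=10, successes=4, draws=3)
print(f"P(X <= 0) = {p_under:.4f}, P(X >= 0) = {p_over:.4f}")
```

Reporting both tails makes it explicit whether a small p-value reflects depletion or enrichment, which a single "enrichment p-value" can obscure.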

1.5 Data visualization

Using the ggplot2 package (version 3.3.3) in R, the distribution of eQTLs in different brain regions was visualized in a bar plot.

The distribution of eQTLs at different distance intervals from the X-Box sites and TSS were also visualized in R through the ggplot2 package (version 3.3.3), as percentage plots.

Then, the distribution of eQTLs in different brain regions for the target gene promoters and background promoters were plotted.

2 Analysis of GWAS data with eQTLs and X-Box Genes

2.1 GWAS data

For the GWAS analysis of ASD, the data from Grove et al. (2019) were used; this GWAS included 18,381 ASD patients. For SCZ, the data from Consortium et al. (2020), comprising 69,369 patients, were used.

2.2 Data processing and statistical analysis

For both ASD and SCZ, the GWAS data were processed to ensure that the columns required for the BED file format were present. The files were then lexicographically sorted using the BEDTools sort function. The GWAS data were then intersected individually with the X-Box sites, the X-Box gene TSS flanking regions, the corresponding promoter sites, and all brain expressed gene promoters using the BEDTools intersect function (codes in supplementary material S2).

2.2.1 eQTLs – SNPs in X-Box promoters

To identify overlapping eQTL-SNPs in the X-Box data, the eQTL file was intersected with the GWAS files using BEDTools intersect. The resulting file was then intersected with the X-Box sites, the X-Box genes and their promoters.
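The two-step overlap described above can be sketched with plain set operations. The thesis used BEDTools intersect for both steps; the keys, p-values, gene names and closed 1-based promoter intervals below are hypothetical, and both variant sets are assumed to be on the same genome build.

```python
# Hypothetical records keyed by (chromosome, 1-based position).
eqtls = {("chr1", 1000): "GENE_A", ("chr3", 42): "GENE_C"}
gwas_snps = {("chr1", 1000): 3e-6, ("chr2", 77): 1e-9}

# Step 1: variants present in both the eQTL and the GWAS data.
shared = sorted(eqtls.keys() & gwas_snps.keys())

# Step 2: keep shared variants that fall inside a promoter, given here as
# closed 1-based intervals (chromosome, first base, last base).
promoters = {"GENE_A": ("chr1", 500, 2499), "GENE_B": ("chr2", 1, 2000)}


def promoter_hits(variants, regions):
    """Return (chromosome, position, gene) for variants inside any promoter."""
    hits = []
    for chrom, pos in variants:
        for gene, (r_chrom, first, last) in regions.items():
            if chrom == r_chrom and first <= pos <= last:
                hits.append((chrom, pos, gene))
    return hits


overlap_hits = promoter_hits(shared, promoters)
print(overlap_hits)
```

In this toy data the GWAS SNP on chr2 lies inside a promoter but is not an eQTL, so it is excluded at the first step and never reaches the promoter check.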

2.2.2 Statistical analysis

Hypergeometric p-value calculation was performed to identify enrichment of the GWAS SNP counts in the target gene promoters in comparison to the background.

Similarly, hypergeometric distribution calculation was done to test for enrichment of overlapping eQTL-SNPs in the X-Box gene promoters versus background promoters for both ASD and SCZ GWAS SNPs.

2.3 Data visualization

The distribution of ASD and SCZ GWAS SNPs in the X-Box and background promoter regions were plotted in R using ggplot2 package.

The eQTL-GWAS SNPs in X-Box promoters were visualized in R using ggplot2 for the distribution of overlapping eQTL-SNPs in different brain regions for the X-Box gene promoters.

The genes identified to have overlapping eQTL-GWAS SNPs in the promoters as well as those genes with a high count of either eQTLs or SNPs were then visualized using the UCSC genome browser.

3 Overall data processing

As the GWAS data were only available in the human genome version GRCh37/hg19 and the eQTL data were generated in the GRCh38/hg38 version, the eQTL data were converted to GRCh37/hg19 coordinates using the UCSC genome LiftOver tool. The same was done for the TSS of the MANE Select transcripts, as these transcripts are only annotated in Ensembl GRCh38/hg38.

PART-B

1 Single cell RNA sequencing data – Expression weighted cell type enrichment analysis

The second part of the project entailed utilizing the expression matrices from the single-cell RNA sequencing data (Accession number: SRP161906) (Fan et al. 2020). In the reference source, single-cell RNA sequencing had been performed on cells from the human foetal cortical region, followed by quality control and filtering, which left data for 13,399 cells. These cells had then been clustered to identify major cell types, which were analysed for the expression of 24,153 genes (Fan et al. 2020).

1.1 Data processing

The scRNA-seq data were formatted by combining the normalized matrix and metadata. This combined data frame was then given as input to the expression weighted cell-type enrichment (EWCE) package (version 0.99.2) in R, with the X-Box genes as the target genes and the list of all brain expressed genes as the background. Bootstrap analysis was performed with 10,000 repetitions. The bootstrap results were processed to obtain -log10(P-value) values in order to identify enriched cell types.
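The idea behind the bootstrap enrichment test can be sketched in a simplified form. This is not the EWCE package API and omits much of what the package does; it only illustrates the principle of comparing the summed cell-type specificity of the target genes against random gene sets of the same size drawn from the background. The gene names, expression values, the number of repetitions and the flooring of a zero p-value at 1/reps before taking the logarithm are all assumptions made for this example.

```python
import random
from math import log10
from statistics import mean, pstdev


def specificity(mean_expr):
    """Convert one gene's mean expression per cell type into proportions summing to 1."""
    total = sum(mean_expr.values())
    return {ct: (value / total if total else 0.0) for ct, value in mean_expr.items()}


def bootstrap_enrichment(spec, targets, background, cell_type, reps=1000, seed=0):
    """Return (observed score, p-value, standard deviations from the bootstrap mean)."""
    rng = random.Random(seed)
    observed = sum(spec[g][cell_type] for g in targets)
    null = [
        sum(spec[g][cell_type] for g in rng.sample(background, len(targets)))
        for _ in range(reps)
    ]
    # Fraction of random gene sets scoring at least as high as the target set.
    p_value = sum(score >= observed for score in null) / reps
    sd = pstdev(null)
    sd_from_mean = (observed - mean(null)) / sd if sd else 0.0
    return observed, p_value, sd_from_mean


def neg_log10(p_value, reps):
    """-log10(p), with p = 0 floored at 1/reps so the logarithm stays finite."""
    return -log10(max(p_value, 1 / reps))


# Hypothetical data: 20 background genes, 3 of which are neuron-specific targets.
background = [f"gene_{i}" for i in range(20)]
targets = background[:3]
mean_expr = {
    g: ({"neuron": 9.0, "glia": 1.0} if g in targets else {"neuron": 1.0, "glia": 9.0})
    for g in background
}
spec = {g: specificity(expr) for g, expr in mean_expr.items()}

observed, p_value, sd_from_mean = bootstrap_enrichment(spec, targets, background, "neuron")
print(observed, p_value, round(sd_from_mean, 2), neg_log10(p_value, 1000))
```

With a finite number of repetitions the smallest resolvable p-value is 1/reps, which is why the number of repetitions bounds the largest -log10(P-value) that can be reported.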

The single-cell data were subset using the X-Box gene list to include only the genes present in both datasets. The enriched cell types were separated into individual matrices, and the mean expression of each gene in the X-Box gene subset of the single-cell data was then calculated for each enriched cell type.
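The subsetting and averaging described above can be sketched as follows. The thesis worked with an expression matrix in R; the per-cell dictionaries, gene names and cell-type labels below are hypothetical and only illustrate the logic.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical single-cell data: one (cell type, {gene: expression}) pair per cell.
cells = [
    ("neuron", {"GENE_A": 2.0, "GENE_B": 0.0, "GENE_Z": 5.0}),
    ("neuron", {"GENE_A": 4.0, "GENE_B": 1.0, "GENE_Z": 5.0}),
    ("glia", {"GENE_A": 0.0, "GENE_B": 3.0, "GENE_Z": 1.0}),
]
xbox_genes = {"GENE_A", "GENE_B", "GENE_Q"}  # GENE_Q is absent from the single-cell data
enriched_cell_types = {"neuron"}


def mean_expression(cells, genes, cell_types):
    """Mean expression per gene within each selected cell type.

    Only genes present in both the gene list and the single-cell data are kept.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for cell_type, expression in cells:
        if cell_type not in cell_types:
            continue
        for gene in genes:
            if gene in expression:
                grouped[cell_type][gene].append(expression[gene])
    return {
        cell_type: {gene: mean(values) for gene, values in per_gene.items()}
        for cell_type, per_gene in grouped.items()
    }


means = mean_expression(cells, xbox_genes, enriched_cell_types)
print(means)
```

Genes in the target list that were not measured in the single-cell data simply drop out, which corresponds to restricting the analysis to the overlap of the two datasets.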

1.2 Data visualization

The standard deviation from the mean expression for each cell-type obtained through bootstrap analysis and the -log10(P-value) were both plotted in R using ggplot2 (version 3.3.3).

After identification of enriched cell types, a threshold value was set to filter the top genes, which were then plotted in R using the pheatmap package (version 1.0.12) and the RColorBrewer package (version 1.1-2).

RESULTS

PART – A

1 Investigating brain eQTLs in X-Box gene regulatory regions

The investigation of eQTL distributions could shed light on specific genes in the target gene list that may have a role to play in neurodevelopment after further analysis. The eQTLs obtained were from 13 brain regions amounting to a total of 109,792 eQTLs.

Part of the first goal was to report eQTLs identified within the X-Box motifs of RFX target genes; however, no eQTLs were identified within the X-Box motifs. Next, larger regions within the X-Box genes were analysed, including the TSS flanking region as well as the proximal promoter regions of the X-Box genes.

1.1 eQTL distribution in X-box gene TSS flanking regions

In the initial analysis of eQTLs intersecting with X-Box gene TSS flanking region covering 3500 base pairs with 2500 base pairs upstream and 1000 base pairs downstream of the TSS, a total of 505 eQTLs were identified among 219 of the X-Box genes.

The frequency distribution of eQTLs identified in the X-Box associated gene TSS flanking regions was visualised as eQTL counts per gene (Fig. 2A).

Among the X-Box gene TSS flanking regions containing eQTLs, a majority had one to two eQTLs. There were six genes with more than 10 eQTLs, of which the gene Zinc finger 593 (ZNF593) had 17 eQTLs and the gene Cortexin 1 (CTXN1) had 14 eQTLs.

Next, the distribution of eQTLs across the different brain regions was analysed by the count of eQTLs in each region of the brain (Fig. 2B).

Fig. 2B indicates that the highest proportion of X-Box associated gene eQTLs is found in the cerebellum; however, it is not clear from these data whether this is only due to the overall distribution of eQTLs or whether it is specific to the distribution of X-Box gene eQTLs.

FIGURE 2. eQTL occurrences in X-Box gene TSS flanking regions spanning 3500 base pairs. A. Number of genes as a function of count of eQTLs in each gene. B. Distribution of eQTLs in different brain regions, plotted by counts of eQTL in each region

1.2 Proximity of eQTLs to regulatory regions

In order to further identify eQTLs that may be the most relevant to gene regulation, the distance of eQTLs was calculated from regulatory regions. Then, the distribution of eQTLs was analysed by binning the distance into various intervals. To limit spread of data, the distances were binned and plotted only up to 2000 base pairs upstream and 2000 base pairs downstream of the regulatory regions.

From the X-Box site to 2000 base pairs in both directions, there were 543 eQTLs identified.

After examining different intervals within this region, the highest number of eQTLs (162) was found within 500-1000 base pairs on both sides of the X-Box motifs. Similarly, from the TSS to 2000 base pairs in both directions, 545 eQTLs were identified, among which the highest number of eQTLs (176) was in the 500-1000 base pairs interval in both directions from the TSS.


Since the X-Box motifs can be located anywhere from a few tens to hundreds of base pairs away from the TSS, there is a high overlap in the observed data. As both analyses gave a similar number of eQTLs with a high degree of overlap, the regions were then defined relative to the TSS: the 2000 base pairs upstream of the TSS (comprising the promoter) contained 247 eQTLs, and the 2000 base pairs downstream of the TSS (comprising part of the gene body) contained 298 eQTLs.

To determine whether there was any brain-region specific pattern of eQTLs close to TSS, the data of eQTL distributions within 2000 base pairs in both directions of the regulatory regions was visualized as the percentage of eQTLs in different brain regions at specified distance intervals upstream (Fig. 3A) and downstream (Fig. 3B) of the TSS.

No brain region could be distinguished to be specifically important, as at every distance interval the eQTLs were evenly distributed across each brain region.

FIGURE 3: Percentage distribution of eQTLs in different brain regions. A. eQTL distribution at varying distances upstream of the TSS; B. eQTL distribution at varying distances downstream of the TSS. Percentage calculations were done for the same 13 brain regions in both plots; thus, the legend in Fig. 3A also applies to Fig. 3B.


1.3 eQTLs in the promoter region of X-Box genes and background genes

From the previous analysis, many eQTLs had been found in the 2000 base pairs flanking the regulatory regions. Therefore, the 2000 base-pair regions upstream of the TSS constituting the promoters were further analysed. There were 247 eQTLs identified in the X-Box gene promoters.

The distribution of the eQTLs in X-Box gene promoters across different brain regions was visualized as a distribution of raw eQTL counts per region (Fig. 4A).

For comparison, the raw counts of number of eQTLs in the background promoters was visualized across different brain regions (Fig. 4B).

FIGURE 4: Plot of raw counts distribution of eQTLs among the different brain regions A. eQTL distributions in the X-Box gene promoters; B. eQTL distributions in the background gene promoters. The x-axis labelling is common for both the plots.


In Fig. 4A and 4B, a higher proportion of promoter eQTLs is observed in the cerebellum in the X-Box gene promoters compared to the background. The statistical significance of this difference was therefore tested for each brain region (Table 1).

Brain Region                       X-Box count   Background count   X-Box %   Background %   P-value
Amygdala                           9             222                3.64      3.42           0.4711
Anterior Cingulate Cortex          12            385                4.86      5.93           0.2866
Caudate Basal Ganglia              20            601                8.10      9.25           0.3062
Cerebellar Hemisphere              29            762                11.74     11.73          0.5289
Cerebellum                         44            920                17.81     14.17          0.0597
Cortex                             26            691                10.53     10.64          0.5287
Frontal Cortex                     17            522                6.88      8.04           0.2947
Hippocampus                        17            411                6.88      6.33           0.3952
Hypothalamus                       9             372                3.64      5.73           0.0916
Nucleus Accumbens Basal Ganglia    22            588                8.91      9.05           0.5239
Putamen Basal Ganglia              25            530                10.12     8.16           0.1516
Spinal Cord                        11            294                4.45      4.53           0.5569
Substantia Nigra                   6             197                2.43      3.03           0.3721

TABLE 1. Comparison of raw counts for the different brain regions between X-Box gene promoters and all brain expressed gene promoters (background) and the hypergeometric distribution p-values.

The hypergeometric p-values listed in Table 1 indicate that there is no significant difference in any of the brain regions; however, a trend towards more cerebellum eQTLs in the X-Box promoters was noted (p-value 0.0597).
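For readers unfamiliar with the test, an upper-tail hypergeometric probability can be sketched with the Python standard library as below. The exact parametrisation used for Table 1 is not stated in the text, so the choice of population (all 6,495 background promoter eQTLs), successes (920 cerebellum eQTLs), draws (247 X-Box promoter eQTLs) and observed value (44) is an assumption, and the result is not claimed to reproduce the reported p-value.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n items are drawn without replacement from a
    population of N items, K of which are 'successes'."""
    upper = min(K, n)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Assumed parametrisation for the cerebellum row of Table 1
p_cerebellum = hypergeom_sf(k=44, N=6495, K=920, n=247)
```

Exact integer arithmetic is used here to avoid depending on an external statistics package; a library routine for the hypergeometric distribution would serve equally well.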

2 GWAS SNPs within the promoter regions of X-Box genes

To further investigate common variants in X-Box genes that may be relevant in neurodevelopmental and psychiatric disorders, genetic variants were mapped to X-Box regulatory regions from the ASD and SCZ GWAS datasets. Filtering SNPs with p-value less than 0.05 resulted in 620,887 SNPs in the ASD GWAS dataset and 1,081,200 SNPs in the SCZ GWAS dataset.

Initially, the prevalence of the SNPs in the X-Box motifs was analysed, but as none were found there, the promoter regions were analysed instead. Of the ASD GWAS SNPs, 359 were found in the X-Box gene promoters and 6,157 in the background promoters. Among the SCZ GWAS SNPs, there were 745 SNPs in the X-Box gene promoters and 11,635 in the background promoters.


The distribution of ASD GWAS SNPs in the promoter regions of X-Box genes and background genes was compared (Fig. 5A and 5B). A similar distribution analysis was done for the SCZ GWAS SNPs (Fig. 6A and 6B).

2.1 A comparison of GWAS hits in X-Box gene promoters and background promoters

FIGURE 5: Plot of distribution of ASD GWAS SNPs A. ASD GWAS SNP counts in X-Box gene promoters; B. ASD GWAS SNP counts in background promoters

Among the 999 X-Box gene promoters, 359 intersected with ASD GWAS SNPs, and in the background 6,157 of the 14,525 promoter regions contained ASD GWAS SNPs.

Testing the expected SNP occurrences and their significance gave 423.46 expected ASD GWAS SNPs in the X-Box gene promoters. With a p-value of 9.57e-06, the result indicated a significant 1.18-fold under-enrichment, i.e. significantly fewer SNPs than expected were observed in the X-Box promoters.


FIGURE 6: Plot of distribution of SCZ GWAS SNPs A. GWAS SNP counts in X-Box promoter associated genes; B. GWAS SNP counts in all brain expressed genes’ promoters

Of the 999 X-Box gene promoters, 745 intersected with SCZ GWAS SNPs and 11,635 of the 14,525 background promoters intersected with SCZ GWAS SNPs.

Testing the significance of the observed counts gave 800.23 expected SCZ GWAS SNPs in the X-Box promoters. The p-value of 6.08e-06 indicated a statistically significant 1.07-fold under-enrichment, i.e. a lower than expected prevalence of SCZ GWAS SNPs in the X-Box promoter sites.


2.2 Intersection of eQTLs and GWAS SNPs in X-Box gene promoter regions

To tie together the eQTLs and GWAS SNPs with the X-Box region, the distribution of the overlapping eQTLs and SNPs of each disorder GWAS dataset was analysed in the X-Box gene promoter regions and the background.
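One way to picture the overlap analysis is as a set intersection restricted to promoter intervals, as in the sketch below. Matching variants by chromosome and position is an assumption (matching by rsID would be an equally plausible choice), and all names and coordinates are hypothetical.

```python
def in_any_promoter(chrom, pos, promoters):
    """True if the position falls inside any (chrom, start, end) promoter."""
    return any(c == chrom and start <= pos <= end for c, start, end in promoters)

def overlapping_variants(eqtls, gwas_snps, promoters):
    """Variants present in both sets that also fall inside a promoter."""
    shared = set(eqtls) & set(gwas_snps)
    return {v for v in shared if in_any_promoter(v[0], v[1], promoters)}

# Toy data (hypothetical coordinates)
promoters = [("chr1", 8_000, 10_000)]
eqtls = {("chr1", 8_500), ("chr1", 9_000), ("chr1", 20_000)}
gwas = {("chr1", 9_000), ("chr1", 20_000), ("chr2", 9_000)}
overlap = overlapping_variants(eqtls, gwas, promoters)
```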

Table 2 summarizes the counts of eQTLs and GWAS SNPs and their overlap in X-Box promoters and background promoters, together with the respective hypergeometric distribution test results.

Counts                      eQTLs      ASD GWAS SNPs   SCZ GWAS SNPs   eQTLs intersecting ASD GWAS SNPs   eQTLs intersecting SCZ GWAS SNPs
X-Box promoter hits         247        359             745             16                                 27
Background promoter hits    6495       6157            11635           477                                1016
p-value                     7.18E-42   9.57E-06        6.08E-06        5.55E-05                           5.46E-10
Fold enrichment             -1.81      -1.18           -1.07           -2.05                              -2.59

TABLE 2: Summary of individual and overlapping eQTLs and GWAS SNPs for both ASD and SCZ
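The expected counts and fold values reported in this section are numerically consistent with scaling the background hits by the ratio of X-Box promoters to background promoters (999/14,525). The sketch below reproduces that arithmetic; it is an illustration of this consistency rather than the original analysis code, the function name is hypothetical, and the ASD background count of 6,157 is taken from the running text.

```python
N_XBOX = 999            # X-Box gene promoters
N_BACKGROUND = 14_525   # background promoters

def expected_and_fold(observed_xbox, background_hits):
    """Expected hits in X-Box promoters and the fold under-enrichment."""
    expected = background_hits * N_XBOX / N_BACKGROUND
    fold = expected / observed_xbox  # > 1 means fewer hits than expected
    return expected, fold

# (observed in X-Box promoters, hits in background promoters)
table = {
    "eQTLs": (247, 6495),
    "ASD GWAS SNPs": (359, 6157),
    "SCZ GWAS SNPs": (745, 11635),
    "eQTL-ASD overlaps": (16, 477),
    "eQTL-SCZ overlaps": (27, 1016),
}
results = {name: expected_and_fold(obs, bg) for name, (obs, bg) in table.items()}
```

In Table 2 the fold values are written with a negative sign to mark under-enrichment; the sketch returns the corresponding positive ratio of expected to observed counts.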

The overlap of X-Box promoters-eQTLs and ASD GWAS SNPs

Among the 999 X-Box gene promoters, 16 intersected with overlapping eQTL-ASD GWAS SNPs, and 477 of the 14,525 background promoters intersected with overlapping eQTL-ASD GWAS SNPs.

The expected number of eQTL-SNPs in the X-Box promoters was calculated to be 32.8; thus, the analysis shows that there were fewer eQTL-SNPs than expected (2.05-fold, p-value 5.55e-05).

The overlap of X-Box promoters-eQTLs and SCZ GWAS SNPs

There were 27 out of the 999 X-Box promoters and 1,016 of the 14,525 background promoters intersecting with the overlapping eQTL-SCZ GWAS SNPs.

Significance testing resulted in 69.87 expected overlaps of X-Box promoter-eQTL-GWAS SNPs. Thus, the analysis shows that there were significantly fewer eQTL-SNPs than expected (2.59-fold, p-value 5.46e-10).


2.3 Visualizing genes with many eQTLs and GWAS hits in the promoter

Although most genes with eQTLs/GWAS SNPs had very few common variants, at least three genes had a higher density of overlapping eQTL-SNP occurrences within their promoter regions.

These genes are potentially of interest, as a higher density of eQTLs could indicate greater susceptibility to changes in brain expression from common variants in regulatory regions and a greater proportion of GWAS SNPs indicate a greater disease association.

The UCSC genome browser enabled visualization of such higher eQTL or GWAS SNP distributions, with the possibility to change the focus and specifically visualize TSS flanking regions of interest. The gene Mitochondrial ribosomal protein L34 (MRPL34) had overlapping eQTLs and SNPs, as seen in Fig. 7.

FIGURE 7: A snapshot from the UCSC genome browser: the distribution of ASD and SCZ GWAS SNPs along with the eQTLs and X-Box motifs is shown for the promoter region of the MRPL34 gene. The gene site is indicated by the UCSC gene track, which matches the data used in this study.

PART-B

1 Single-cell RNA-seq analysis

To further understand the function of X-Box genes in neurodevelopment, single-cell data from the developing human brain was used. For this study, an expression matrix of single-cell RNA-seq data from different regions of developing human cortices, published by Fan et al. (2020), was used to identify whether the X-Box genes were specifically enriched in the identified cell types.

The data had been clustered into 11 main cell-types (level 1 annotation) and 30 sub-types (level 2 annotation).

1.1 Expression weighted cell type enrichment analysis of scRNA-seq expression matrix

Standard deviations from the expected bootstrapped values and the p-values were obtained by performing a bootstrap enrichment test on the expression matrix. The negative log base 10 of the p-values was plotted, and the Bonferroni p-value adjustment revealed the cell types with significantly high enrichment of X-Box genes.
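The idea behind a bootstrap enrichment test can be sketched as follows. This is a simplified illustration written for this text and not the implementation used in the analysis: for one cell type, the mean expression of the target gene list is compared with the means of randomly drawn gene lists of the same size. The function name, the number of bootstrap replicates and the toy expression values are all assumptions.

```python
import random
from statistics import mean, stdev

def bootstrap_enrichment(expr, target_genes, n_boot=1000, seed=0):
    """expr: {gene: expression in one cell type}.
    Returns (p_value, standard deviations from the bootstrapped mean)."""
    rng = random.Random(seed)
    background = list(expr)
    observed = mean(expr[g] for g in target_genes)
    # Mean expression of random gene lists of the same size as the target list
    boot = [
        mean(expr[g] for g in rng.sample(background, len(target_genes)))
        for _ in range(n_boot)
    ]
    p_value = sum(b >= observed for b in boot) / n_boot
    sd_from_mean = (observed - mean(boot)) / stdev(boot)
    return p_value, sd_from_mean

# Toy example: 100 hypothetical genes, the target list being the 10 highest
expr = {f"g{i}": float(i) for i in range(100)}
target = [f"g{i}" for i in range(90, 100)]
p_value, sd_from_mean = bootstrap_enrichment(expr, target)
```

A positive number of standard deviations indicates higher expression of the target list than expected by chance, and the p-value is the fraction of random lists that reach the observed mean.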


FIGURE 8: Level 1 annotation cell-types expression in X-Box genes versus expected values of background genes calculated by bootstrapping. A. Std. deviation of cell-type expression; B. -log10(p-value) of cell-type expression with indication of cell types that have a significant p-value (where the base line was obtained using Bonferroni p-value adjustment). The cell types in level 1 annotation are: Pons-neu: projection neuron in pons; Oligo: oligodendrocyte; NPC: neuronal progenitor cells; IN_cor: cortical inhibitory neurons; Immune: immune cells; EX_cor: cortical excitatory neuron; Endo: endothelial cells; CR: Cajal-Retzius cell; Astro: astrocyte. The early cluster included the majority of the cells in gestational week 7/8 cortices and gestational week 8/9 pons.


FIGURE 9: Level 2 annotation cell-types expression in X-Box genes versus expected values of background genes calculated by bootstrapping. A. Std. deviation of cell-type expression; B. -log10(p-value) of cell-type expression with indication of cell types that have a significant p-value (where the base line was obtained using Bonferroni p-value adjustment). The cell types in level 2 annotation are: Pons-neu: projection neurons in pons; Oligo: oligodendrocyte; NPC: neuronal progenitor cells; IN: inhibitory neurons; Immune: immune cells including microglia, macrophages and T-cells; EX: excitatory neurons; CR: Cajal-Retzius cell; Astro: astrocyte; SMC: smooth muscle cell; VEC: vascular endothelial cell; VLMC: vascular leptomeningeal cell. The early cluster included the majority of the cells in gestational week 7/8 cortices and gestational week 8/9 pons.


While the eQTL and GWAS data did not reveal any specific X-Box genes having higher expression, the single-cell expression matrix analysis revealed specific cortical cell types having a higher expression of X-Box genes with respect to background genes.

From Fig. 8A, 9 cell-types out of 11 had an increase in X-Box gene expression levels in comparison to the bootstrapped mean expression level, and from Fig. 9A, 23 out of the 30 cell types had increased X-Box gene expression levels.

In Fig. 8B and 9B, the list of enriched cell-types was narrowed down using a Bonferroni-adjusted p-value threshold of 0.05/30, resulting in seven significantly enriched cell-types in level 1 annotation and 16 significantly enriched cell-types in level 2 annotation. For the level 1 annotation, the cell-type enrichment showed significantly enriched expression of the X-Box genes in projection neurons in pons (Pons-neu), neuronal progenitor cells (NPC), cortical inhibitory neurons (IN_cor), cortical excitatory neurons (EX_cor), Cajal-Retzius cells (CR), astrocytes (Astro), and the early cell cluster (Early), which included the majority of the cells in gestational week 7/8 cortices and gestational week 8/9 pons.

While the enrichment of seven out of 11 cell types could be indicative of enhanced X-Box gene expression, this enrichment had to be scrutinised further to determine whether it was the effect of a select few genes or an overall increase in expression of all the X-Box genes.
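The thresholding step can be illustrated with a short sketch. The threshold of 0.05/30 is taken from the text; the function name and the p-values in the toy example are hypothetical and do not correspond to the values plotted in Fig. 8B or 9B.

```python
from math import log10

def bonferroni_significant(p_values, alpha=0.05, n_tests=30):
    """Return {cell_type: (-log10(p), is_significant)} for a
    Bonferroni-adjusted threshold of alpha / n_tests."""
    threshold = alpha / n_tests
    return {
        name: (-log10(p), p < threshold)
        for name, p in p_values.items()
    }

# Hypothetical p-values, for illustration only
toy_p_values = {"NPC": 0.0001, "Oligo": 0.2, "Astro": 0.0016}
result = bonferroni_significant(toy_p_values)
```

On a -log10 scale the same threshold appears as a horizontal base line at -log10(0.05/30), which is roughly 2.78.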

1.2 Identifying top genes in the level 1 annotation cell types with enriched expression

As the next goal was to identify highly expressed X-Box genes using the scRNA-seq expression matrix data, the cell types identified to have higher expression of the X-Box genes were analysed.

Among the 999 X-Box genes, 847 were expressed in these enriched cell-types. A list of 35 genes that were highly expressed per cell type was obtained based on a cut-off value of 2.5 for the expression means. The expression of these genes was then visualized for each cell type as depicted in Fig. 10.
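The selection of highly expressed genes by a cut-off on the mean expression can be sketched as below. The cut-off of 2.5 comes from the text; whether the comparison is strict, the function name, the placeholder gene GENE_X and the toy expression values are assumptions made for illustration.

```python
def top_genes(mean_expr, cutoff=2.5):
    """mean_expr: {cell_type: {gene: mean expression}}.
    Returns, per cell type, the genes above the cut-off, highest first."""
    return {
        cell_type: sorted(
            (g for g, v in genes.items() if v > cutoff),
            key=lambda g: genes[g],
            reverse=True,
        )
        for cell_type, genes in mean_expr.items()
    }

# Hypothetical mean expression values
toy = {
    "NPC": {"HMGN2": 3.4, "CFL1": 2.9, "GENE_X": 0.7},
    "Astro": {"HMGN2": 2.6, "CFL1": 2.1, "GENE_X": 0.2},
}
selected = top_genes(toy)
```

The union of the per-cell-type lists gives the set of genes that would be displayed in a heatmap.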


FIGURE 10: Heatmap of the most expressed genes in each of the 7 enriched cell types of level 1 annotation.

The cell types are: Astro: astrocyte; CR: Cajal-Retzius cell; Early: the early cluster, which included the majority of the cells in gestational week 7/8 cortices and gestational week 8/9 pons; EX_cor: cortical excitatory neuron; IN_cor: cortical inhibitory neurons; NPC: neuronal progenitor cells; Pons-neu: projection neuron in pons.

Observing expression patterns of the highly expressed genes through the heatmap indicated that all the enriched cell types had high expression of the top three genes which were the High Mobility Group Nucleosomal Binding Domain 2 (HMGN2), Cofilin 1 (CFL1) and Calmodulin 2 (CALM2). On comparing these genes with the genes having high incidence of eQTLs or GWAS SNPs, no overlapping genes were identified.


DISCUSSION

The thesis aimed at mapping common variants affecting genomic regions of X-Box genes and analysing the enrichment of these common variants in ASD and SCZ. The X-Box genes that were highly expressed in human foetal cortical cell types were also identified using scRNA-seq data.

First, the eQTL distributions were analysed in the X-Box datasets and the different brain regions. The attempt to identify eQTLs in the X-Box motifs resulted in no eQTLs being found. Given that the X-Box motif is a small region of 14 base pairs, this was to be expected. Analysis of eQTLs in the X-Box gene TSS flanking regions led to the identification of a number of genes with a high number of eQTLs. Among these genes, ZNF593 was found to have the highest number of eQTLs, i.e. 17. ZNF593 is the zinc finger 593 protein coding gene and is known to be involved in the modulation of DNA binding (Hayes et al. 2008). Such findings indicate that a few genes among the X-Box genes may be of particular interest.

Analysing the brain regions to identify eQTL distribution patterns indicated a non-significant trend towards more cerebellum eQTLs in the X-Box gene TSS flanking regions and promoters, which could have several implications. The study confirming that RFX3 disruption causes cerebellar developmental defects (Ha et al. 2019) indicates that the X-Box genes containing a high number of cerebellar eQTLs may be involved in cerebellum development. Studies showing the role of the cerebellum in cognitive functions (Buckner 2013) and emotional processing (Snow et al. 2014) could indicate that X-Box genes identified to have increased cerebellar eQTLs may be associated with cognitive and emotional neural processes. An additional analysis approach could compare the eQTL distributions in the whole gene body with those upstream and downstream of the gene body. This may provide results that differ from what has been observed in this study, but it could comprehensively identify all the eQTLs in the gene bodies and allow them to be compared with those in the upstream promoters.

The next part of the analyses comprised studying the GWAS SNPs in the X-Box datasets. In the GWAS data analysis, for both ASD and SCZ, there were no SNPs in the X-Box motifs, but a number of SNPs were present in the described gene regions and promoters. Again, the absence of SNPs in the X-Box motifs can be attributed to the small size of the X-Box motifs. Looking at the gene region SNP distributions, the genes with the highest number of GWAS SNPs were ORAI2, with 4 SCZ GWAS SNPs, and FABP3, with 2 ASD GWAS SNPs. ORAI2 codes for the calcium release activated calcium modulator 2 protein, which mediates Ca2+ currents and controls neuronal functioning, with identified SCZ links (Moccia et al. 2015). FABP3, coding for fatty acid binding protein 3, is involved in the transport of fatty acids and has known links to both ASD and SCZ (Shimamoto et al. 2014). While this finding indicates that some of the X-Box genes are of particular interest, such a small number of genes having a higher number of variants could also mean that, overall, the X-Box gene list might not be regulated by common variants implicated in ASD and SCZ.


As for the SNPs identified in the X-Box promoters, there is evidence, such as the study by Iotchkova et al. (2016), showing that a prevalence of SNPs in regulatory regions is to be expected. A study by Zhang et al. (2019) shows how common variants in the promoter region of a gene significantly increase the risk of disease. This can be extrapolated to this project, where further analysis may reveal the disease risk that the promoter SNPs pose.

To tie together the two parts of the analysis described above, the eQTL and GWAS SNP data were studied for overlaps, and the enrichment of these common variants in ASD and SCZ was analysed. Studying the overlap of GWAS SNPs and eQTLs has been shown to give insights into variant distributions and gene associations for a range of complex disorders (Zhao et al. 2019; Albert and Kruglyak 2015). On assessing the overlap in this project, there were 16 eQTL-GWAS SNPs in the X-Box gene promoters for the ASD dataset and 27 for the SCZ dataset. Though the distribution was under-enriched, i.e. there were fewer overlaps than expected, an analysis of the effect of the identified variants on disease risk would be needed to establish the importance of these overlaps. A further analysis strategy that can be adopted to study the overlap of eQTLs and GWAS SNPs would be an SMR (Summary-data-based Mendelian Randomization) analysis, where genes influencing multiple phenotypes and having an association between gene expression and complex traits can be identified (Pavlides et al. 2016).

The under-enrichment observed for eQTLs, GWAS SNPs and overlapping eQTL-SNPs in the regulatory regions of the target genes versus those of the background genes indicates that there are fewer common variants overall than expected. Such an observation may reflect a lower amount of overall variation in these regions, possibly caused by conservation of these genetic regions, which is also reflected in the lower number of GWAS hits. In such a scenario, studying the impact of the identified hits would be of interest, since these hits may have a larger effect. Alternatively, a possible cause of under-enrichment could be that not all the variants were identified in the variant identification studies. Only a fraction of variants being identified in studies like GWAS could be caused either by common variants with diminished effects or by extremely impactful, hidden rare variants (Gibson 2012). Studies on some ultra-rare variants have reported a potential increase in disease risk in the case of SCZ (Halvorsen et al. 2020), and rare variant analysis of ASD has resulted in the identification of genes that may lead to molecular and cellular defects associated with the disorder (Buxbaum 2009).

The under-enrichment of the common variants, in addition to the lack of common genes among the few identified hits of the ASD and SCZ datasets, meant that a link between the genetics of the two disorders could not be established. As common variants only explain a limited fraction of the disease traits, identifying and analysing the rare variants in target genes, which account for about 30% of the ASD cases (Ronemus et al. 2014; Ohashi et al. 2021), could help identify the causal genes for many of the ASD phenotypes as well as the underlying genetic make-up. The understanding of X-Box genes involved in ASD and SCZ could then be used to draw conclusions on their interconnectivity.


The final aim of this study entailed the analysis of the expression matrix from the single-cell data, which indicated a high enrichment of X-Box genes among multiple cell types in the data. Among the enriched cell types, a small number of X-Box genes showed high expression across all cell types, and these genes could potentially have driven the enrichment. Further analysis could help identify whether the enrichment has been caused by an overall enrichment of the X-Box list or whether it is being driven by core genes that are highly expressed.


FUTURE WORK

Owing to time constraints, some extensions to the study could not be performed, but these can be incorporated to achieve a complete picture of the X-Box gene associations with ASD and SCZ.

While there were no eQTLs and SNPs identified in the X-Box motifs, a number of them were found in the promoters. Thus, the regulatory effect of these eQTLs and SNPs and the role of the X-Box genes in disorder conditions could be studied further along the lines of the study by Zhang et al. (2019).

Further studies need to be performed with single-cell sequencing data in the other brain regions to expand on the knowledge about region specific X-Box gene expression.

Whole genome sequencing (WGS) provides improved precision in variant identification (Höglund et al. 2019). WGS data may thus identify an increased number of disease-associated variants, and such a dataset could be used to identify associations of target genes with the disease. For instance, rare variants in the X-Box motifs, which may have a more severe disease impact, might be more likely to be identified.

These additional analyses would serve as basis to ascertain the specific genes among the target gene list which may have a role to play in ASD, SCZ and other neurodevelopmental and psychiatric disorders.


ACKNOWLEDGEMENTS

I would like to acknowledge everyone who has been a part of, and eased, this journey through the master’s degree and this thesis.

A heartfelt thanks to my supervisor Kristiina Tammimies for providing the opportunity to work on this project and for always being ready to provide guidance. Another thanks to Michelle Watts for providing constant support, for the brainstorming sessions and for accommodating the oddly timed meeting requests. Additionally, I would like to thank all the members of the Tammimies group for the feedback and interactions. A thank you to Lars Feuk, who agreed to be the subject reader and provided a necessary and useful outsider perspective on the findings of the project and the due course of action. Lastly, I am ever grateful to the wonderful friends and family who have made this journey a breeze and have provided much needed moral and emotional support.


REFERENCES

1. Albert, Frank W., and Leonid Kruglyak. 2015. “The Role of Regulatory Variation in Complex Traits and Disease.” Nature Reviews Genetics 16 (4): 197–212. https://doi.org/10.1038/nrg3891.

2. Amar, Megha, Akula Bala Pramod, Nam-Kyung Yu, Victor Munive Herrera, Lily R. Qiu, Patricia Moran-Losada, Pan Zhang, et al. 2021. “Autism-Linked Cullin3 Germline Haploinsufficiency Impacts Cytoskeletal Dynamics and Cortical Neurogenesis through RhoA Signaling.” Molecular Psychiatry, March, 1–28. https://doi.org/10.1038/s41380-021-01052-x.

3. An, Joon Yong, and Charles Claudianos. 2016. “Genetic Heterogeneity in Autism: From Single Gene to a Pathway Perspective.” Neuroscience and Biobehavioral Reviews 68 (September): 442–53. https://doi.org/10.1016/j.neubiorev.2016.06.013.

4. “ASDEUExecSummary27September2018.Pdf.” n.d. Accessed May 19, 2021. http://asdeu.eu/wp-content/uploads/2016/12/ASDEUExecSummary27September2018.pdf.

5. American Psychiatric Association. 2013. Diagnostic and Statistical Manual of Mental Disorders (DSM-5®). American Psychiatric Pub.

6. Betancur, Catalina. 2011. “Etiological Heterogeneity in Autism Spectrum Disorders: More than 100 Genetic and Genomic Disorders and Still Counting.” Brain Research, The Emerging Neuroscience of Autism Spectrum Disorders, 1380 (March): 42–77. https://doi.org/10.1016/j.brainres.2010.11.078.

7. Buckner, Randy L. 2013. “The Cerebellum and Cognitive Function: 25 Years of Insight from Anatomy and Neuroimaging.” Neuron 80 (3): 807–15. https://doi.org/10.1016/j.neuron.2013.10.044.

8. Buxbaum, Joseph D. 2009. “Multiple Rare Variants in the Etiology of Autism Spectrum Disorders.” Dialogues in Clinical Neuroscience 11 (1): 35–43.

9. Cheng, Ye, Jeffrey Francis Quinn, and Lauren Anne Weiss. 2013. “An EQTL Mapping Approach Reveals That Rare Variants in the SEMA5A Regulatory Network Impact Autism Risk.” Human Molecular Genetics 22 (14): 2960–72. https://doi.org/10.1093/hmg/ddt150.

10. Choksi, Semil P., Gilbert Lauter, Peter Swoboda, and Sudipto Roy. 2014. “Switching on Cilia: Transcriptional Networks Regulating Ciliogenesis.” Development 141 (7): 1427–41. https://doi.org/10.1242/dev.074666.

11. Consortium, The Schizophrenia Working Group of the Psychiatric Genomics, Stephan Ripke, James TR Walters, and Michael C. O’Donovan. 2020. “Mapping Genomic Loci Prioritises Genes and Implicates Synaptic Biology in Schizophrenia.” MedRxiv, September, 2020.09.12.20192922. https://doi.org/10.1101/2020.09.12.20192922.

12. De Crescenzo, Franco, Valentina Postorino, Martina Siracusano, Assia Riccioni, Marco Armando, Paolo Curatolo, and Luigi Mazzone. 2019. “Autistic Symptoms in Schizophrenia Spectrum Disorders: A Systematic Review and Meta-Analysis.” Frontiers in Psychiatry 10 (February). https://doi.org/10.3389/fpsyt.2019.00078.


13. Efimenko, Evgeni, Kerry Bubb, Ho Yi Mak, Ted Holzman, Michel R. Leroux, Gary Ruvkun, James H. Thomas, and Peter Swoboda. 2005. “Analysis of Xbx Genes in C. Elegans.” Development (Cambridge, England) 132 (8): 1923–34. https://doi.org/10.1242/dev.01775.

14. El Zein, Loubna, Aouatef Ait-Lounis, Laurette Morlé, Joëlle Thomas, Brigitte Chhin, Nathalie Spassky, Walter Reith, and Bénédicte Durand. 2009. “RFX3 Governs Growth and Beating Efficiency of Motile Cilia in Mouse and Controls the Expression of Genes Involved in Human Ciliopathies.” Journal of Cell Science 122 (17): 3180–89. https://doi.org/10.1242/jcs.048348.

15. Fan, Xiaoying, Yuanyuan Fu, Xin Zhou, Le Sun, Ming Yang, Mengdi Wang, Ruiguo Chen, et al. 2020. “Single-Cell Transcriptome Analysis Reveals Cell Lineage Specification in Temporal-Spatial Patterns in Human Cortical Development.” Science Advances 6 (34). https://doi.org/10.1126/sciadv.aaz2978.

16. “General Transcription Factor / Transcription Factor | Learn Science at Scitable.” n.d. Accessed April 22, 2021. https://www.nature.com/scitable/definition/transcription-factor-167/.

17. Gibson, Greg. 2012. “Rare and Common Variants: Twenty Arguments.” Nature Reviews. Genetics 13 (2): 135–45. https://doi.org/10.1038/nrg3118.

18. Grove, Jakob, Stephan Ripke, Thomas D. Als, Manuel Mattheisen, Raymond K. Walters, Hyejung Won, Jonatan Pallesen, et al. 2019. “Identification of Common Genetic Risk Variants for Autism Spectrum Disorder.” Nature Genetics 51 (3): 431–44. https://doi.org/10.1038/s41588-019-0344-8.

19. Ha, Thomas J., Peter G. Y. Zhang, Remi Robert, Joanna Yeung, Douglas J. Swanson, Anthony Mathelier, Wyeth W. Wasserman, et al. 2019. “Identification of Novel Cerebellar Developmental Transcriptional Regulators with Motif Activity Analysis.” BMC Genomics 20 (1): 718. https://doi.org/10.1186/s12864-019-6063-9.

20. Halvorsen, Matthew, Ruth Huh, Nikolay Oskolkov, Jia Wen, Sergiu Netotea, Paola Giusti-Rodriguez, Robert Karlsson, et al. 2020. “Increased Burden of Ultra-Rare Structural Variants Localizing to Boundaries of Topologically Associated Domains in Schizophrenia.” Nature Communications 11 (1): 1842. https://doi.org/10.1038/s41467-020-15707-w.

21. Harris, Holly K., Tojo Nakayama, Jenny Lai, Boxun Zhao, Nikoleta Argyrou, Cynthia S. Gubbels, Aubrie Soucy, et al. 2021. “Disruption of RFX Family Transcription Factors Causes Autism, Attention-Deficit/Hyperactivity Disorder, Intellectual Disability, and Dysregulated Behavior.” Genetics in Medicine, March, 1–13. https://doi.org/10.1038/s41436-021-01114-z.

22. Hayes, Paulette L., Betsy L. Lytle, Brian F. Volkman, and Francis C. Peterson. 2008. “The Solution Structure of ZNF593 from Homo Sapiens Reveals a Zinc Finger in a Predominately Unstructured Protein.” Protein Science 17 (3): 571–76. https://doi.org/10.1110/ps.073290408.

23. Hedlund, Eva, and Qiaolin Deng. 2018. “Single-Cell RNA Sequencing: Technical Advancements and Biological Applications.” Molecular Aspects of Medicine, The emerging field of single-cell analysis, 59 (February): 36–46. https://doi.org/10.1016/j.mam.2017.07.003.


24. Henriksson, Johan, Brian P. Piasecki, Kristina Lend, Thomas R. Bürglin, and Peter Swoboda. 2013. “Finding Ciliary Genes: A Computational Approach.” Methods in Enzymology 525: 327–50. https://doi.org/10.1016/B978-0-12-397944-5.00016-X.

25. Hilker, Rikke, Dorte Helenius, Birgitte Fagerlund, Axel Skytthe, Kaare Christensen, Thomas M. Werge, Merete Nordentoft, and Birte Glenthøj. 2018. “Heritability of Schizophrenia and Schizophrenia Spectrum Based on the Nationwide Danish Twin Register.” Biological Psychiatry, Novel Mechanisms in Schizophrenia Pathophysiology, 83 (6): 492–98. https://doi.org/10.1016/j.biopsych.2017.08.017.

26. Höglund, Julia, Nima Rafati, Mathias Rask-Andersen, Stefan Enroth, Torgny Karlsson, Weronica E. Ek, and Åsa Johansson. 2019. “Improved Power and Precision with Whole Genome Sequencing Data in Genome-Wide Association Studies of Inflammatory Biomarkers.” Scientific Reports 9 (1): 16844. https://doi.org/10.1038/s41598-019- 53111-7.

27. Iakoucheva, Lilia M., Alysson R. Muotri, and Jonathan Sebat. 2019. “Getting to the Cores of Autism.” Cell 178 (6): 1287–98. https://doi.org/10.1016/j.cell.2019.07.037.

28. Iotchkova, Valentina, Graham R. S. Ritchie, Matthias Geihs, Sandro Morganella, Josine L. Min, Klaudia Walter, Nicholas Timpson, et al. 2016. “GARFIELD - GWAS Analysis of Regulatory or Functional Information Enrichment with LD Correction.”

BioRxiv, November, 085738. https://doi.org/10.1101/085738.

29. Jansen, Rick, Jouke-Jan Hottenga, Michel G. Nivard, Abdel Abdellaoui, Bram Laport, Eco J. de Geus, Fred A. Wright, Brenda W.J.H. Penninx, and Dorret I. Boomsma. 2017.

“Conditional EQTL Analysis Reveals Allelic Heterogeneity of Gene Expression.”

Human Molecular Genetics 26 (8): 1444–51. https://doi.org/10.1093/hmg/ddx043.

30. Kim, Minsoo, Jillian R. Haney, Pan Zhang, Leanna M. Hernandez, Lee-kai Wang, Laura Perez-Cano, Loes M. Olde Loohuis, Luis de la Torre-Ubieta, and Michael J.

Gandal. 2021. “Brain Gene Co-Expression Networks Link Complement Signaling with Convergent Synaptic Pathology in Schizophrenia.” Nature Neuroscience, May, 1–11.

https://doi.org/10.1038/s41593-021-00847-z.

31. Lai, Meng-Chuan, Michael V Lombardo, and Simon Baron-Cohen. 2014. “Autism.”

The Lancet 383 (9920): 896–910. https://doi.org/10.1016/S0140-6736(13)61539-1.

32. Lotan, Amit, Michaela Fenckova, Janita Bralten, Aet Alttoa, Luanna Dixson, Robert W. Williams, and Monique van der Voet. 2014. “Neuroinformatic Analyses of Common and Distinct Genetic Components Associated with Major Neuropsychiatric Disorders.” Frontiers in Neuroscience 8. https://doi.org/10.3389/fnins.2014.00331.

33. Lugo Marín, Jorge, Montserrat Alviani Rodríguez-Franco, Vinita Mahtani Chugani, María Magán Maganto, Emiliano Díez Villoria, and Ricardo Canal Bedia. 2018. “Prevalence of Schizophrenia Spectrum Disorders in Average-IQ Adults with Autism Spectrum Disorders: A Meta-Analysis.” Journal of Autism and Developmental Disorders 48 (1): 239–50. https://doi.org/10.1007/s10803-017-3328-5.

34. Manolio, Teri A. 2010. “Genomewide Association Studies and Assessment of the Risk of Disease.” New England Journal of Medicine, July. https://doi.org/10.1056/NEJMra0905980.

35. Masi, Gabriele, and Francesca Liboni. 2011. “Management of Schizophrenia in Children and Adolescents.” Drugs 71 (2): 179–208. https://doi.org/10.2165/11585350-000000000-00000.

36. McGrath, John, Sukanta Saha, David Chant, and Joy Welham. 2008. “Schizophrenia: A Concise Overview of Incidence, Prevalence, and Mortality.” Epidemiologic Reviews 30 (1): 67–76. https://doi.org/10.1093/epirev/mxn001.

37. Miyoshi, Koho, and Yasushi Morimura. 2010. “Clinical Manifestations of Neuropsychiatric Disorders.” In Neuropsychiatric Disorders, edited by Koho Miyoshi, Yasushi Morimura, and Kiyoshi Maeda, 1–14. Tokyo: Springer Japan. https://doi.org/10.1007/978-4-431-53871-4_1.

38. Moccia, Francesco, Estella Zuccolo, Teresa Soda, Franco Tanzi, Germano Guerra, Lisa Mapelli, Francesco Lodola, and Egidio D’Angelo. 2015. “Stim and Orai Proteins in Neuronal Ca2+ Signaling and Excitability.” Frontiers in Cellular Neuroscience 9 (April). https://doi.org/10.3389/fncel.2015.00153.

39. Mojarad, Bahareh A., Yue Yin, Roozbeh Manshaei, Ian Backstrom, Gregory Costain, Tracy Heung, Daniele Merico, Christian R. Marshall, Anne S. Bassett, and Ryan K. C. Yuen. 2021. “Genome Sequencing Broadens the Range of Contributing Variants with Clinical Implications in Schizophrenia.” Translational Psychiatry 11 (1): 1–12. https://doi.org/10.1038/s41398-021-01211-2.

40. Nica, Alexandra C., and Emmanouil T. Dermitzakis. 2013. “Expression Quantitative Trait Loci: Present and Future.” Philosophical Transactions of the Royal Society B: Biological Sciences 368 (1620). https://doi.org/10.1098/rstb.2012.0362.

41. Ohashi, Kei, Satomi Fukuhara, Taishi Miyachi, Tomoko Asai, Masayuki Imaeda, Masahide Goto, Yoshie Kurokawa, et al. 2021. “Comprehensive Genetic Analysis of Non-Syndromic Autism Spectrum Disorder in Clinical Settings.” Journal of Autism and Developmental Disorders, February. https://doi.org/10.1007/s10803-021-04910-3.

42. Owen, Michael J., Akira Sawa, and Preben B. Mortensen. 2016. “Schizophrenia.” The Lancet 388 (10039): 86–97. https://doi.org/10.1016/S0140-6736(15)01121-6.

43. Pavlides, Jennifer M. Whitehead, Zhihong Zhu, Jacob Gratten, Allan F. McRae, Naomi R. Wray, and Jian Yang. 2016. “Predicting Gene Targets from Integrative Analyses of Summary Data from GWAS and eQTL Studies for 28 Human Complex Traits.” Genome Medicine 8 (1): 84. https://doi.org/10.1186/s13073-016-0338-4.

44. Persico, Antonio M., and Valerio Napolioni. 2013. “Autism Genetics.” Behavioural Brain Research, SI: Neurobiology of Autism, 251 (August): 95–112. https://doi.org/10.1016/j.bbr.2013.06.012.

45. Pinkham, Amy E., Joseph B. Hopfinger, Kevin A. Pelphrey, Joseph Piven, and David L. Penn. 2008. “Neural Bases for Impaired Social Cognition in Schizophrenia and Autism Spectrum Disorders.” Schizophrenia Research 99 (1): 164–75. https://doi.org/10.1016/j.schres.2007.10.024.

46. Ronemus, Michael, Ivan Iossifov, Dan Levy, and Michael Wigler. 2014. “The Role of de Novo Mutations in the Genetics of Autism Spectrum Disorders.” Nature Reviews Genetics 15 (2): 133–41. https://doi.org/10.1038/nrg3585.

47. Rossi, Mari, Dima El-Khechen, Mary Helen Black, Kelly D. Farwell Hagman, Sha Tang, and Zöe Powis. 2017. “Outcomes of Diagnostic Exome Sequencing in Patients
