
Characterizing Rett disease genes on the promoter level (Paper IV)

In contrast to the previous study, which characterized and contrasted various aspects of a vast amount of data, in this study we picked three genes specific to Rett syndrome. We used the FANTOM5 data for human and mouse to characterize in detail the promoters, regulators and expression patterns of these three genes.

First, we identified CAGE-derived TSSs for all three genes in both human and mouse. We identified a novel main promoter in mouse for Foxg1. Both human and mouse had high expression of Foxg1 in brain but, intriguingly, expression was absent in the cerebellum samples. Analyzing ENCODE DNase I hypersensitive sites and active promoter histone marks (H3K4me3) available for mouse, we found evidence of silencing of Foxg1 by the PRC2 complex in the cerebellum. For Mecp2, we identified two TSSs in both human and mouse (with the exception of a third, lowly expressed TSS found exclusively in blood primary cells, particularly in CD14 monocytes). We found Mecp2 to be expressed ubiquitously in all tissues, not just the brain. For Cdkl5 we also found two TSSs in both species, likewise expressed ubiquitously.

We compared all the TSSs of these three genes for correlations in expression levels and found that Mecp2 and Cdkl5 are more highly correlated with each other than they are with Foxg1. Additionally, the expression patterns of the main TSSs of Foxg1 and Mecp2 contrast with each other.
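A comparison of this kind can be sketched as follows. This is a minimal illustration only: the text does not state which correlation measure was used, so Pearson correlation on log-transformed expression is an assumption, and the TSS names and TPM values below are hypothetical rather than FANTOM5 data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


def log_profile(tpm_values):
    """Log2-transform expression values, adding a pseudocount of 1."""
    return [math.log2(v + 1.0) for v in tpm_values]


def pairwise_correlations(profiles):
    """Correlate every pair of TSS expression profiles (same sample order)."""
    names = sorted(profiles)
    logged = {name: log_profile(profiles[name]) for name in names}
    result = {}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            result[(first, second)] = pearson(logged[first], logged[second])
    return result


# Hypothetical TPM values for three TSSs across the same five samples.
profiles = {
    "tss_a": [10.0, 20.0, 40.0, 80.0, 160.0],
    "tss_b": [12.0, 18.0, 45.0, 70.0, 150.0],
    "tss_c": [200.0, 90.0, 30.0, 5.0, 0.0],
}

correlations = pairwise_correlations(profiles)
for pair, r in sorted(correlations.items()):
    print(pair, round(r, 3))
```

In this toy input, tss_a and tss_b rise together across samples while tss_c falls, so the first pair correlates positively and the other two pairs negatively.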

Using ENCODE ChIP data we additionally characterized the mouse promoters for the presence of enhancer or promoter markers. For all the main promoters of the three genes, we find enhancer marks close by in the mouse, while for human, using the enhancer data of FANTOM5, we find 4, 14 and 1 significantly correlated enhancers for FOXG1, MECP2 and CDKL5 respectively.

When we characterized the promoter shapes of all the TSSs, we found that most of them are broad, consistent with their connection to CpG islands. We also found strong conservation of promoter shape across species, particularly visible in the most highly expressed TSS of FOXG1.
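One common way to quantify promoter shape from CAGE data is the interquantile width of the tag distribution within a TSS cluster. The sketch below illustrates that idea; it is not necessarily the measure used in paper IV, and the 10th to 90th percentile range, the 10 bp threshold and the tag counts are all assumptions chosen for illustration.

```python
def interquantile_width(tag_counts, q_low=0.1, q_high=0.9):
    """Width in bp between the positions where the cumulative CAGE signal
    first reaches q_low and q_high of the cluster total.

    tag_counts maps genomic position -> number of CAGE tags at that position.
    """
    total = sum(tag_counts.values())
    if total <= 0:
        raise ValueError("cluster contains no tags")
    cumulative = 0
    low_pos = None
    high_pos = None
    for pos in sorted(tag_counts):
        cumulative += tag_counts[pos]
        fraction = cumulative / total
        if low_pos is None and fraction >= q_low:
            low_pos = pos
        if fraction >= q_high:
            high_pos = pos
            break
    return high_pos - low_pos + 1


def classify_shape(tag_counts, threshold_bp=10):
    """Label a cluster 'broad' or 'sharp' using an assumed width threshold."""
    width = interquantile_width(tag_counts)
    return "broad" if width >= threshold_bp else "sharp"


# Hypothetical clusters: one dominated by a single position, one spread out.
sharp_cluster = {99: 2, 100: 95, 101: 3}
broad_cluster = {pos: 5 for pos in range(40)}

print(interquantile_width(sharp_cluster), classify_shape(sharp_cluster))
print(interquantile_width(broad_cluster), classify_shape(broad_cluster))
```

Using quantiles rather than the full cluster span keeps a few stray tags at the edges from making a sharp promoter look broad.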

We also calculated the probabilities that the same TFs regulate the three genes.

Our data reveal that the sequence region around the main promoter of FOXG1 in human is significantly enriched in binding sites for the RREB1 (p = 0.01), FOXP1 (p = 0.03), and NFY (p = 0.01) transcription factors. NFY is also predicted to regulate MECP2 (p = 0.01) and possibly CDKL5 (p = 0.09). In mouse, for all three genes the promoter regions are enriched for motifs associated with transcription factor NFY, as well as Sp1.

FOXG1 and MECP2 are both TFs and thus bind DNA at promoter regions.

Using the expression data of the whole FANTOM5 we checked their motif activities (ref). Although FOXG1 is expressed mostly in brain tissues, its motif activity did not show any significant difference between brain and other tissues in either human or mouse (human p <= 0.2738, mouse p <= 0.3272). In contrast, the motif activity of MECP2 is significantly lower in brain compared to other tissues both in human and mouse (human p <= 1.019e-10, mouse p <= 0.0005343), consistent with a role of MECP2 as a negative regulator in brain.
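A two-group comparison of motif activities like the one above can be sketched with a rank-sum test. The text does not name the statistical test behind the reported p-values, so the Mann-Whitney U test with a normal approximation (without tie or continuity correction) is an assumption here, and the activity values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_test(group_a, group_b):
    """Two-sided Mann-Whitney U test using the normal approximation.

    Returns (U statistic for group_a, z score, two-sided p-value).
    No tie or continuity correction is applied, so this is only a sketch.
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a == 0 or n_b == 0:
        raise ValueError("both groups must be non-empty")
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    mean_u = n_a * n_b / 2.0
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    z = (u_a - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u_a, z, p_value


# Hypothetical motif activities per sample (not FANTOM5 values).
brain_activity = [-0.9, -0.8, -0.7, -0.6, -0.5, -0.4]
other_activity = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]

u_stat, z_score, p_value = rank_sum_test(brain_activity, other_activity)
print(u_stat, round(z_score, 3), p_value)
```

A negative z score here means the first group (brain) tends to have lower activity than the second. For real data with many ties or small samples, an established implementation with the appropriate corrections would be the safer choice.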

Our comprehensive analyses of the data from the FANTOM5 project reveal that although the three genes are related by disease phenotype, their genomic architecture, expression and regulation are independent of each other. Thus intersecting molecular pathways likely cause the overlapping phenotypes seen in Rett patients.

9 CONCLUSIONS AND PERSPECTIVES

Through these papers we used the power of CAGE in conjunction with next-generation sequencing to define the promoter regions of genes, regulatory element binding sites and alternative promoter usage in different states.

In paper I we developed a method combining siRNA knockdown and sequencing-based CAGE to show that we can accurately predict genomic loci where TFs are actively regulating transcription. We then applied the same approach to a gene linked to a disease but without a completely known function and without a known DNA binding site. Even without a TF function, we could still derive the downstream effects of perturbing DYX1C1.

For papers III and IV we used the FANTOM5 database of tissues and cells. In paper III we used this to make global conclusions about differences in regulation between i) brain and other tissues, ii) adult and infant brain as well as iii) different adult brain regions. Using an opposite approach, in paper IV we started with 3 previously not associated genes and investigated their detailed promoter-level functionality using the database of tissues and cells. This way we have shown how the same data repository can be used for more global questions as well as specific questions such as the common characteristics of three genes.

This approach has shown us that genome-wide studies and quantitative measures can give a glimpse of why cells differ and what the underlying mechanisms might be. To gain further detail, additional studies are needed that involve complementary methods such as RNA-Seq, to indicate whether the alternative TSSs we find do give rise to different protein products in the end. Of course, in complex disorders we should not forget the potential influences of other factors including non-coding RNAs, enhancers, epigenetics or repeat regions.

The advent of next-generation sequencing has brought us into an era of vast biological data. Advances in sequencing technologies and the general lowering of costs have brought us closer to clinical application of these technologies. A main goal in this area is to develop diagnostic tools that medical doctors can use in everyday work with patients to quickly diagnose or classify a disease. This would be particularly necessary to end the 'diagnostic odyssey': the grueling, painful, expensive and sometimes decades-long journey from negative test to negative test, failing to diagnose a rare disease. Methods are currently being developed that use exome sequencing as a diagnostic tool, with a diagnosis rate of 25% for now (Glusman G, 2013). Additional diagnostic methods could rely on developing non-invasive ways to diagnose a patient who would normally require invasive procedures, such as identifying markers in the blood that can be checked quickly by clinical personnel. The miniaturization of sequencing machines is also leading to the point where desktop sequencers for small samples would be available to clinicians.

Good communication needs to be established between the lab and the clinic for both to profit from it. In this way the clinician could provide valuable disease samples, which in turn could lead to better diagnostic methods or even help identify intervention points for drug treatment.

Another novel topic in this field, which arose from the falling cost of sequencing and the ability to work with small amounts of RNA, is single-cell sequencing. It is a common perception that expression in a tissue is an average of the expression of all the single cells making it up. The first single-cell studies have found that genes can be highly expressed in some cells while being virtually unexpressed in other cells of the same type, possibly as a result of expression bursts. Not only is the biology of single cells still poorly understood, but the methods required to obtain single cells (often manually) are also complicated, and it is hard to separate cells while keeping the information about their cross-talk visible in the sequencing step.

The methods described here are basic science that is gradually turning toward clinical samples and clinical application. Hopefully, the average patient will soon be able to benefit from these services and have diagnostics and treatment sped up by these technologies.

10 ACKNOWLEDGEMENTS

I am grateful to have been given the opportunity to work towards my PhD in a joint graduate program of RIKEN, Japan and Department of Cell and Molecular Biology, Karolinska Institutet, Sweden. This has been an amazing experience and I have been lucky and fortunate to be working with such fantastic people all around the world.

First of all, to Carsten Daub, my supervisor. Thank you for being so positive from the beginning and embarking on this long journey with me, bringing me to Japan and then connecting me to Sweden. Thank you for your guidance, advice (both on science and life) and discussions. This whole journey was a big learning experience and your calm composure helped make it all possible and very fun. To Juha Kere, my mentor. Thank you for your words of advice throughout these years and for encouraging me to push forward. I am looking forward to our future plans. To Björn Andersson, my co-supervisor, thank you for your support. To Matti Nikkola, head of studies at CMB. Thank you for your support for this program through these years, you have always been there to help us and show us the hoppi happy way. To Yoshihide Hayashizaki, thank you for having me at your center and helping us out in realizing this joint graduate program, I feel very privileged to have been able to do my work at the OSC.

Throughout my studies I have been supported by the International Program Associate Grant from RIKEN and by a grant from the Frankopani Foundation. A special thank you also goes out to the administrative staff that also made this collaboration possible, especially Reiji Nagashima, from the GRO at RIKEN, thank you for your support of this project and making the contract between RIKEN and KI possible.

I have had the best and most amazing Lab during my stay at the OSC. I will never forget the great times I had with the bioinformatics group and the amazing people I met, you were more than co-workers and without your support I would not have been here today. Timo, thank you for your patience with me and teaching me the ropes in that first year, I learned a lot about science and what to focus on to progress in it.

Alistair, thank you for your encouragement at tough times, scientific discussions and overall positive support, I look forward to many more discussions over beer in the future! Michiel, thank you for teaching me statistics on the serious side and for exploring ‘culture’ with me on the fun side. I did miss the Yamathon this year, but I am sure we can come up with some equivalent for it in Europe when we meet again. Erik, thank you for always being so positive and encouraging, your words helped in the tough times. Nicolas, thank you for the advice, doing weird configurations for me and overall support. Kawaji-san, thank you for the support and for starting the student journal club, I learned a lot in that seminar series about presentations and critical opinion.

My dear Marina, thank you so so much for being a great friend and co-worker through these years, don’t know what I would have done without you! Keep up the good work and I hope you find your shining star soon! Andrew, thank you for the one year of awesome mayhem and for all the fun gaming, drinking and talking, some jokes will never get old! Mickaël, don’t know what I would have done about that zombie infestation without you! Thank you for the fun times, we need to meet up sometime to finish the game. Jessica, thank you for your friendship, support and company during the late working hours. I always enjoyed going to your DJs and talking about music and life with you! Jordan, dude, it was short, but so much fun! Thank you for your continued support and middle of the night messages of support during my writing period. Jayson, though we did not hang out as much as we planned, I still enjoyed our little talks and complaining sessions. Ohmiya-san, it was a great experience to work with you on a project, I hope you get to work on many more interesting projects.

Hasegawa-san, thank you for your constant support and your bright attitude. Bogumil, Serkan, Makis, thank you guys for your support over the short period we worked together.

To some ex-lab members. Thierry, thank you for your company through all the different activities. I am looking forward to making the Europe plans come true. Max, thank you for teaching me things during your stay at the OSC, I wish you all the best in your endeavors. Eivind and Joost, you guys made my first year in Japan, we had so much fun, thank you also for waiting for me to finish my late evening entrance exam to celebrate over some iconic doughnuts. Sylvia, I won’t forget our fun times when you stayed with us and all the different stories we shared, I am looking forward to seeing you again soon, XOXO. Owen, what can I say, not fair you coming to visit after I left, hope we can make up for it somewhere in Europe. Sebastian, thank you for improving my coding skills and enthusiasm for the small project. Nancy, it was nice to have you with us for a short stay, hopefully we can enjoy a great Stockholm summer this year.

Jenny, thank you for your support during your short stay, I hope you also choose your path and pursue it in the future. Helena, thank you for the pancakes and discussions!

To my Yuripong. It is hard now to imagine having lunch at work without you and your calm perspective of things. Thank you for all your trust, support and humor over this time, you helped me see the fun side of even the hardest moments.

Many other members of the OSC have contributed to make my stay in Japan more enjoyable either scientifically or as friends. Piero, thank you for your support of my ideas and guidance on the general perspective. I always enjoyed your enthusiasm for science and I hope I can keep that spark going also in myself. Yuki-chan, you are amazing and so sweet! Thank you for all the good times and the incredible support, never change and good luck with your future projects! Dave, my basketball buddy, we had so much fun at the practice and tournaments, please keep up the good work and the trophy in our hands. Also, thank you for the bioinformaticy discussions and advice! To a certain group of misfits whose natural habitat is the Fantom Bar, Hazuki, Ana, Ale, Giovanni, Marco. Thank you for the sometimes very much needed distractions from the gruelling science and for the fun beer and churrasco times we shared. Diane, thank you for the super nice mousse and always being so positive. Kimura-san, thank you for the basketball fun and overall sports discussions. Linda, my dear, thank you for all your help and I hope your future work will be successful! My and Hanna, thank you for your help, support and enthusiasm through these years. Tsugumi-san, thank you for your support, especially during the quake time, I really enjoyed your company.

I was also lucky to work with some amazing collaborators during my studies. Marghi, I could not have wished for a more fun collaborator or a better friend, working with you on ‘our’ projects was always both fun and intense, thank you for your continued support and for cheering me on, so looking forward to meeting again soon. Peter, thank you for having me work on your amazing data and for your support and advice. Alka, thank you for the great support and work on the Rett project, I learned so much from you. Also thank you for your ‘women in science’ perspective. Kristiina, thank you for the dyslexia work and for your continued support. I hope we can still make some more nice papers out of this collaboration. Isabel, thank you for the fun discussion and teaching me so much more about dyslexia genes. Robin, thank you for the enhancer analysis for the papers. Albin, thank you for your support for the papers and sharing advice with me.

Finally, to my friends outside of work. Living these years in Japan I was a very lucky person to meet absolutely fantastic friends in such a short time. My girls from pole dance really made my free time so much more fun and cheerful, I will never forget the farewell party in kimonos you organized and the way you escorted me to the airport when I left Japan. Thank you for everything Momo, Chika, Juri, Rico, Rachel, Rena, Yumi, Kanae, Tsuji-chan, Ai, Mariko, Sayo, Marzena,

Thanks to all of you, my life in Tokyo was fun and fulfilling. I am grateful from the bottom of my heart, and I look forward to the day we can meet again.

My friends back home in Croatia, many of you came to visit me during the studies and kept in touch through the years. I look forward to catching up with everyone again. A big thank you to my Ana for always being there, even through Skype. We celebrate 20 years of friendship this year my dear, I look forward to welcoming you in Stockholm.

Damir, thank you for making the cover page for me and for making me laugh in the hard times.

Finally, to my parents, thank you for being so supportive of me through my whole life and especially for letting me embark on this dream. You have taught me some basic values that I hope I will be able to live up to all my life. I hope we get to share many many more happy moments together. Piceki, we did it!

11 REFERENCES

Amir, R. E., Van den Veyver, I. B., Wan, M., Tran, C. Q., Francke, U., & Zoghbi, H. Y. (1999). Rett syndrome is caused by mutations in X-linked MECP2, encoding methyl-CpG-binding protein 2. Nature genetics, 23(2), 185–188.

Ariani, F., Hayek, G., Rondinella, D., Artuso, R., Mencarelli, M. A., Spanhol-Rosseto, A., et al. (2008). FOXG1 is responsible for the congenital variant of Rett syndrome. American journal of human genetics, 83(1), 89–93.

Ayub, M., & Bayley, H. (2012). Individual RNA base recognition in immobilized oligonucleotides using a protein nanopore. Nano letters, 12(11), 5637–5643.

Bailey, T. L., Williams, N., Misleh, C., & Li, W. W. (2006). MEME: discovering and analyzing DNA and protein sequence motifs. Nucleic acids research, 34(Web Server issue), W369–73.

Bentley, D. R., Balasubramanian, S., Swerdlow, H. P., Smith, G. P., Milton, J., Brown, C. G., et al. (2008). Accurate whole human genome sequencing using reversible terminator chemistry. Nature, 456(7218), 53–59.

Bork, P., & Copley, R. (2001). The draft sequences. Filling in the gaps. Nature, 409(6822), 818–820.

Branton, D., Deamer, D. W., Marziali, A., Bayley, H., Benner, S. A., Butler, T., et al. (2008). The potential and challenges of nanopore sequencing. Nature biotechnology, 26(10), 1146–1153.

Brivanlou, A. H., & Darnell, J. E. (2002). Signal transduction and the control of gene expression. Science (New York, N.Y.), 295(5556), 813–818.

Carninci, P., Kasukawa, T., Katayama, S., Gough, J., Frith, M. C., Maeda, N., et al. (2005). The transcriptional landscape of the mammalian genome. Science (New York, N.Y.), 309(5740), 1559–1563.

Carninci, P., Sandelin, A., Lenhard, B., Katayama, S., Shimokawa, K., Ponjavic, J., et al. (2006). Genome-wide analysis of mammalian promoter architecture and evolution. Nature genetics, 38(6), 626–635.

Cerami, E. G., Gross, B. E., Demir, E., Rodchenkov, I., Babur, O., Anwar, N., et al. (2011). Pathway Commons, a web resource for biological pathway data. Nucleic acids research, 39(Database issue), D685–90.

Chahwan, R., Wontakal, S. N., & Roa, S. (2011). The multidimensional nature of epigenetic information and its role in disease. Discovery medicine, 11(58), 233–243.

Chen, Y., Zhao, M., Wang, S., Chen, J., Wang, Y., Cao, Q., et al. (2009). A novel role for DYX1C1, a chaperone protein for both Hsp70 and Hsp90, in breast cancer. Journal of cancer research and clinical oncology, 135(9), 1265–1276.

Cock, P. J. A., Fields, C. J., Goto, N., Heuer, M. L., & Rice, P. M. (2010). The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic acids research, 38(6), 1767–1771.

Collas, P. (2010). The current state of chromatin immunoprecipitation. Molecular biotechnology, 45(1), 87–100.

Crick, F. (1970). Central dogma of molecular biology. Nature, 227(5258), 561–563.

D'Arcangelo, G., Homayouni, R., Keshvara, L., Rice, D. S., Sheldon, M., & Curran, T. (1999). Reelin is a ligand for lipoprotein receptors. Neuron, 24(2), 471–479.

de Hoon, M., & Hayashizaki, Y. (2008). Deep cap analysis gene expression (CAGE): genome-wide identification of promoters, quantification of their expression, and network inference. BioTechniques, 44(5), 627–628, 630, 632.

Deaton, A. M., & Bird, A. (2011). CpG islands and the regulation of transcription. Genes & development, 25(10), 1010–1022.

Dixit, R., Zimmer, C., Waclaw, R. R., Mattar, P., Shaker, T., Kovach, C., et al. (2011). Ascl1 participates in Cajal-Retzius cell development in the neocortex. Cerebral cortex (New York, N.Y. : 1991), 21(11), 2599–2611.

Djebali, S., Davis, C. A., Merkel, A., Dobin, A., Lassmann, T., Mortazavi, A., et al. (2012). Landscape of transcription in human cells. Nature, 489(7414), 101–108.

Elkon, R., Linhart, C., Sharan, R., Shamir, R., & Shiloh, Y. (2003). Genome-wide in silico identification of transcriptional regulators controlling the cell cycle in human cells. Genome research, 13(5), 773–780.

ENCODE Project Consortium, Birney, E., Stamatoyannopoulos, J. A., Dutta, A., Guigó, R., Gingeras, T. R., et al. (2007). Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature, 447(7146), 799–816.

ENCODE Project Consortium, Dunham, I., Kundaje, A., Aldred, S. F., Collins, P. J., Davis, C. A., et al. (2012). An integrated encyclopedia of DNA elements in the human genome. Nature, 489(7414), 57–74.

FANTOM Consortium, Suzuki, H., Forrest, A. R. R., van Nimwegen, E., Daub, C. O., Balwierz, P. J., et al. (2009). The transcriptional network that controls growth arrest and differentiation in a human myeloid leukemia cell line. Nature genetics, 41(5), 553–562.

Frith, M. C., Ponjavic, J., Fredman, D., Kai, C., Kawai, J., Carninci, P., et al. (2006). Evolutionary turnover of mammalian transcription start sites. Genome research, 16(6), 713–722.

Gerstein, M. B., Kundaje, A., Hariharan, M., Landt, S. G., Yan, K.-K., Cheng, C., et al. (2012). Architecture of the human regulatory network derived from ENCODE data. Nature, 489(7414), 91–100.

Glusman, G. (2013). Clinical applications of sequencing take center stage. Genome biology, 14, 303.

Hawrylycz, M. J., Lein, E. S., Guillozet-Bongaarts, A. L., Shen, E. H., Ng, L., Miller, J. A., et al. (2012). An anatomically comprehensive atlas of the adult human brain
