Integrating gene expression with genotyping

detailed view of gene regulation.

Protein expression measurements could also provide new opportunities for the reverse engineering community and enable researchers to move beyond the effective gene network (see Section 4.2). It would be very interesting to determine whether transcription factor protein levels correlate with the mRNA expression of their target genes (see also Section 4.2). Moreover, it would be interesting to cluster protein expression data and compare the results with clusters obtained from mRNA expression data. These data sources are likely to reveal different aspects of the investigated processes. Consequently, extending Study III in this thesis with protein expression measurements may enable a more complete identification of gene modules important to atherosclerosis severity.
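As a minimal sketch of the first comparison suggested above, and assuming that matched protein and mRNA measurements were available for the same samples, a rank correlation between a transcription factor's protein level and the mRNA level of each target gene could be computed as follows. The values and the names `tf_protein`, `target_mrna`, `target_A`, and `target_B` are hypothetical and purely illustrative. Spearman correlation is only one reasonable choice here, not a method taken from the studies in this thesis.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy data: protein level of one transcription factor and
# mRNA levels of two putative target genes, measured in six samples.
tf_protein = [1.0, 2.1, 2.9, 4.2, 5.1, 6.0]
target_mrna = {
    "target_A": [0.9, 1.8, 3.2, 3.9, 5.3, 6.4],
    "target_B": [5.0, 5.2, 4.1, 3.0, 2.2, 1.1],
}

correlations = {
    gene: spearman(tf_protein, levels) for gene, levels in target_mrna.items()
}
for gene, rho in correlations.items():
    print(f"{gene}: Spearman rho = {rho:.3f}")
```

In a real data set the number of samples would be larger, and the resulting correlations would need to be assessed for significance and corrected for multiple testing.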

Last, genome-wide expression studies, such as those in this thesis and by others, generate an increasingly robust list of atherosclerosis candidate genes. A few years from now, we may end up with a list of thousands of relatively well-established atherosclerosis genes. Clearly, developing custom-made protein analysis platforms focusing on these genes will bypass some of the problems inherent in whole-genome proteomic approaches.

will cause changes leading to a higher-order phenotype (e.g., sickle-cell anemia and cystic fibrosis), while in other cases, the interplay of several genetic changes leads to a higher-order phenotype. Traditional approaches, which focus on mapping genotypes directly to higher-order phenotypes, have had trouble unraveling complex phenotypes such as atherosclerosis.

Gene expression may serve as an intermediate step between genotype and complex phenotype. In early studies by Brem et al. [130] and Schadt et al. [131], gene expression patterns were shown to be highly heritable; moreover, a large number of genetic loci affecting gene expression, referred to as expression quantitative trait loci (eQTLs), were identified in yeast, mouse, maize, and human. Schadt and coworkers also used eQTLs and gene expression to link five genomic regions that were important in defining the fat-pad-mass trait in the mice studied, which would not have been possible using traditional techniques [131].
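To make the notion of a locus affecting gene expression concrete, the sketch below fits a simple additive model in which the expression of one gene is regressed on the genotype at one SNP, coded as the number of minor alleles (0, 1, or 2). This is an illustrative simplification under stated assumptions, not the procedure used in the cited studies; the function name `eqtl_fit` and all data values are hypothetical.

```python
def eqtl_fit(genotypes, expression):
    """Least-squares fit of expression = intercept + slope * genotype.

    genotypes  -- minor-allele counts (0, 1, or 2), one per individual
    expression -- expression level of one gene, one per individual
    Returns (slope, intercept, r_squared).
    """
    n = len(genotypes)
    if n != len(expression) or n < 3:
        raise ValueError("need matched genotype and expression values (n >= 3)")
    mg = sum(genotypes) / n
    me = sum(expression) / n
    sxy = sum((g - mg) * (e - me) for g, e in zip(genotypes, expression))
    sxx = sum((g - mg) ** 2 for g in genotypes)
    syy = sum((e - me) ** 2 for e in expression)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    slope = sxy / sxx
    intercept = me - slope * mg
    # Fraction of expression variance explained by the genotype.
    r_squared = 0.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical toy data: six individuals, one SNP, one gene.
genotypes = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]

slope, intercept, r_squared = eqtl_fit(genotypes, expression)
print(f"effect per allele = {slope:.2f}, R^2 = {r_squared:.3f}")
```

A genome-wide analysis would repeat such a test for many SNP and gene pairs, typically with covariates and with correction for multiple testing, which this sketch omits.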

In more recent studies, this approach has been applied to a range of settings to identify potential susceptibility genes for several complex traits, including obesity, diabetes, atherosclerosis, and neuronal function, in mice and in human subjects [132–136].

In the light of these results, it would be interesting to genotype patients in the STAGE study (see Section 3.1.1) using a global SNP array. The benefit of the STAGE cohort is that we have multiple expression profiles for the same gene in up to five tissues, which would enable us to identify similarities and differences in the genetic architecture of those tissues.

The combined expression and genotype data would, for instance, give us the opportunity to find genomic regions associated with the module shown to be related to atherosclerosis severity in Study III. In a small-scale study involving a handful of selected SNPs, we could, using statistical and bioinformatic methods, show evidence that one of these SNPs is responsible for regulation in this module. This SNP has also been further validated and found to cause myocardial infarction or atherosclerosis in three independent cohorts from the Swedish population [137–139].

6 Concluding Remarks

This thesis provides evidence that analysis of global gene expression profiles isolated from a wide range of biological specimens can be used to infer functional interactions of genes in modules or networks. The content and structure of these modules and networks can be used to improve our understanding of how complex disorders like atherosclerosis develop.

It is hard to predict the most efficient path to a more complete understanding of complex diseases. I believe in-depth investigation of candidate genes will be important in the future, but only as a complement to global approaches. Much will be learnt from combining different genomic strategies, bringing their different strengths and weaknesses to the same table. By doing this we can get a coarse-grained picture of the disease process at different levels, giving us the opportunity to find new disease-relevant relationships.

Acknowledgements

This research has been performed at the Computational Medicine group, Department of Medicine at Karolinska Institutet. There are several people who have contributed directly or indirectly to this thesis. In particular I wish to thank:

My supervisors Johan Björkegren and Jesper Tegnér for introducing me to the computational medicine and atherosclerosis research fields, and also for supporting me and not losing faith in me when I chose alternative paths. Johan for being creative, intelligent, and having a positive and easygoing attitude to science and life in general. Jesper for being open-minded and willing to discuss all sorts of ideas about science, philosophy, and totally unrelated matters like the stock market.

My supervisor Josefin, for being enthusiastic, knowledgeable, and eager to explain. And also for friendly talks and advice in scientific and non-scientific matters.

All members of the Computational Medicine group, Sara, Peri, Shoreh, Olivia, and also all previous group members, for good collaboration and nice lunch and coffee breaks. This thesis would not have been completed without you.

All members of the atherosclerosis research unit for providing a good scientific environment.

Mika and Michael, for interesting discussions about reverse engineering schemes and cell regulation and for fruitful collaboration.

My mum and dad for always being supportive without interfering with my life.

My two children, Hampus and Molly, for being the best kids and for distracting my attention away from thesis writing and to more important things.

I also want to thank Maria for everything. You are my true love.

This research has been supported by the Swedish Knowledge Foundation through the Industrial PhD programme in Medical Bioinformatics at the Strategy and Development Office at Karolinska Institutet. The thesis has been proofread and edited by Stephen Ordaway.

References

[1] Riordan, J. R., Rommens, J. M., Kerem, B., Alon, N., Rozmahel, R., Grzelczak, Z., Zielenski, J., Lok, S., Plavsic, N., and Chou, J. L. (1989). Identification of the cystic fibrosis gene: cloning and characterization of complementary DNA. Science 245, 1066–73.

[2] Rommens, J. M., Zengerling, S., Burns, J., Melmer, G., Kerem, B. S., Plavsic, N., Zsiga, M., Kennedy, D., Markiewicz, D., and Rozmahel, R. (1988). Identification and regional localization of DNA markers on chromosome 7 for the cloning of the cystic fibrosis gene. Am J Hum Genet 43, 645–63.

[3] McPherson, R., Pertsemlidis, A., Kavaslar, N., Stewart, A., Roberts, R., Cox, D. R., Hinds, D. A., Pennacchio, L. A., Tybjaerg-Hansen, A., Folsom, A. R., et al. (2007). A common allele on chromosome 9 associated with coronary heart disease. Science 316, 1488–91.

[4] Helgadottir, A., Thorleifsson, G., Manolescu, A., Gretarsdottir, S., Blondal, T., Jonasdottir, A., Jonasdottir, A., Sigurdsson, A., Baker, A., Palsson, A., et al. (2007). A common variant on chromosome 9p21 affects the risk of myocardial infarction. Science 316, 1491–3.

[5] Tegner, J., Skogsberg, J., and Björkegren, J. (2007). Multi-organ whole-genome measurements and reverse engineering to uncover gene networks underlying complex traits. Journal of Lipid Research 48, 267–277.

[6] Ideker, T., Galitski, T., and Hood, L. (2001). A new approach to decoding life: systems biology. Annu Rev Genomics Hum Genet 2, 343–72.

[7] Kitano, H. (2002). Systems biology: a brief overview. Science 295, 1662–4.

[8] Ehrenberg, M., Elf, J., Aurell, E., Sandberg, R., and Tegner, J. (2003). Systems biology is taking off. Genome Res 13, 2377–80.

[9] Lander, E. S., Linton, L. M., Birren, B., Nusbaum, C., Zody, M. C., Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W., et al. (2001). Initial sequencing and analysis of the human genome. Nature 409, 860–921.

[10] Venter, J. C., Adams, M. D., Myers, E. W., Li, P. W., Mural, R. J., Sutton, G. G., Smith, H. O., Yandell, M., Evans, C. A., Holt, R. A., et al. (2001). The sequence of the human genome. Science 291, 1304–51.

[11] Waterston, R. H., Lindblad-Toh, K., Birney, E., Rogers, J., Abril, J. F., Agarwal, P., Agarwala, R., Ainscough, R., Alexandersson, M., An, P., et al. (2002). Initial sequencing and comparative analysis of the mouse genome. Nature 420, 520–62.

[12] Lockhart, D. J., Dong, H., Byrne, M. C., Follettie, M. T., Gallo, M. V., Chee, M. S., Mittmann, M., Wang, C., Kobayashi, M., Horton, H., et al. (1996). Expression monitoring by hybridization to high-density oligonucleotide arrays. Nat Biotechnol 14, 1675–80.

[13] Schena, M., Shalon, D., Davis, R. W., and Brown, P. O. (1995). Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 270, 467–70.

[14] Ginsburg, G. S., Donahue, M. P., and Newby, L. K. (2005). Prospects for personalized cardiovascular medicine: the impact of genomics. J Am Coll Cardiol 46, 1615–27.

[15] MacBeath, G. (2002). Protein microarrays and proteomics. Nat Genet 32 Suppl, 526–32.

[16] Lusis, A. J., Mar, R., and Pajukanta, P. (2004). Genetics of atherosclerosis. Annu Rev Genomics Hum Genet 5, 189–218.

[17] Mecham, B. H., Klus, G. T., Strovel, J., Augustus, M., Byrne, D., Bozso, P., Wetmore, D. Z., Mariani, T. J., Kohane, I. S., and Szallasi, Z. (2004). Sequence-matched probes produce increased cross-platform consistency and more reproducible biological results in microarray-based gene expression measurements. Nucleic Acids Res 32, e74.

[18] Bolstad, B. M., Irizarry, R. A., Astrand, M., and Speed, T. P. (2003). A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 19, 185–93.

[19] Irizarry, R. A., Bolstad, B. M., Collin, F., Cope, L. M., Hobbs, B., and Speed, T. P. (2003). Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res 31, e15.

[20] World Health Organization http://www.who.int/mediacentre/factsheets/fs317/en/. (2007). Fact sheet No317: Cardiovascular diseases.

[21] Ghazalpour, A., Doss, S., Yang, X., Aten, J., Toomey, E. M., Nas, A. V., Wang, S., Drake, T. A., and Lusis, A. J. (2004). Thematic review series: The pathogenesis of atherosclerosis. Toward a biological network for atherosclerosis. J Lipid Res 45, 1793–805.

[22] Cohn, J. S., Wat, E., Kamili, A., and Tandy, S. (2008). Dietary phospholipids, hepatic lipid metabolism and cardiovascular disease. Curr Opin Lipidol 19, 257–62.

[23] Balkau, B., Hu, G., Qiao, Q., Tuomilehto, J., Borch-Johnsen, K., and Pyorala, K. (2004). Prediction of the risk of cardiovascular mortality using a score that includes glucose as a risk factor. The DECODE Study. Diabetologia 47, 2118–28.

[24] Nigro, J., Osman, N., Dart, A. M., and Little, P. J. (2006). Insulin resistance and atherosclerosis. Endocr Rev 27, 242–59.

[25] Yokoyama, S. (2000). Release of cellular cholesterol: molecular mechanism for cholesterol homeostasis in cells and in the body. Biochim Biophys Acta 1529, 231–44.

[26] Tsujita, M., Wu, C.-A., Abe-Dohmae, S., Usui, S., Okazaki, M., and Yokoyama, S. (2005). On the hepatic mechanism of HDL assembly by the ABCA1/apoA-I pathway. J Lipid Res 46, 154–62.

[27] Maxfield, F. R. and Tabas, I. (2005). Role of cholesterol and lipid organization in disease. Nature 438, 612–21.

[28] Libby, P. (2002). Inflammation in atherosclerosis. Nature 420, 868–74.

[29] Hansson, G. K. (2005). Inflammation, atherosclerosis, and coronary artery disease. N Engl J Med 352, 1685–95.

[30] Packard, R. R. S. and Libby, P. (2008). Inflammation in atherosclerosis: from vascular biology to biomarker discovery and risk prediction. Clin Chem 54, 24–38.

[31] International Human Genome Sequencing Consortium. (2004). Finishing the euchromatic sequence of the human genome. Nature 431, 931–45.

[32] Goff, S. A., Ricke, D., Lan, T.-H., Presting, G., Wang, R., Dunn, M., Glazebrook, J., Sessions, A., Oeller, P., Varma, H., et al. (2002). A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science 296, 92–100.

[33] Yu, J., Hu, S., Wang, J., Wong, G. K.-S., Li, S., Liu, B., Deng, Y., Dai, L., Zhou, Y., Zhang, X., et al. (2002). A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science 296, 79–92.

[34] Adams, M. D., Celniker, S. E., Holt, R. A., Evans, C. A., Gocayne, J. D., Amanatides, P. G., Scherer, S. E., Li, P. W., Hoskins, R. A., Galle, R. F., et al. (2000). The genome sequence of Drosophila melanogaster. Science 287, 2185–95.

[35] Goffeau, A., Barrell, B. G., Bussey, H., Davis, R. W., Dujon, B., Feldmann, H., Galibert, F., Hoheisel, J. D., Jacq, C., Johnston, M., et al. (1996). Life with 6000 genes. Science 274, 546, 563–7.

[36] Claverie, J. M. (2001). Gene number. What if there are only 30,000 human genes? Science 291, 1255–7.

[37] Tegnér, J. and Björkegren, J. (2007). Perturbations to uncover gene networks. Trends Genet 23, 34–41.

[38] Uetz, P., Giot, L., Cagney, G., Mansfield, T. A., Judson, R. S., Knight, J. R., Lockshon, D., Narayan, V., Srinivasan, M., Pochart, P., et al. (2000). A comprehensive analysis of protein-protein interactions in saccharomyces cerevisiae. Nature 403, 623–627.

[39] Ito, T., Chiba, T., Ozawa, R., Yoshida, M., Hattori, M., and Sakaki, Y. (2001). A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc Natl Acad Sci U S A 98, 4569–74.

[40] Lee, T. I., Rinaldi, N. J., Robert, F., Odom, D. T., Bar-Joseph, Z., Gerber, G. K., Hannett, N. M., Harbison, C. T., Thompson, C. M., Simon, I., et al. (2002). Transcriptional regulatory networks in saccharomyces cerevisiae. Science 298, 799–804.

[41] Horak, C., Luscombe, N., Bertone, J. Q. P., Piccirrillo, S., Gerstein, M., and Snyder, M. (2002). Complex transcriptional circuitry at the g1/s transition in saccharomyces cerevisiae. Genes Dev 16, 3017–33.

[42] Luscombe, N., Babu, M., Yu, H., Snyder, M., Teichmann, S., and Gerstein, M. (2004). Genomic analysis of regulatory network dynamics reveals large topological changes. Nature 431, 308–312.

[43] Kanehisa, M., Goto, S., Hattori, M., Aoki-Kinoshita, K. F., Itoh, M., Kawashima, S., Katayama, T., Araki, M., and Hirakawa, M. (2006). From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res 34, D354–7.

[44] Duarte, N. C., Becker, S. A., Jamshidi, N., Thiele, I., Mo, M. L., Vo, T. D., Srivas, R., and Palsson, B. O. (2007). Global reconstruction of the human metabolic network based on genomic and bibliomic data. Proc Natl Acad Sci U S A 104, 1777–82.

[45] MacIsaac, K., Wang, T., Gordon, D., Gifford, D., Stormo, G., and Fraenkel, E. (2006). An improved map of conserved regulatory sites for saccharomyces cerevisiae. BMC Bioinformatics 7, 113.

[46] Barabási, A.-L. and Albert, R. (1999). Emergence of scaling in random networks. Science 286, 509–12.

[47] Jeong, H., Mason, S. P., Barabasi, A. L., and Oltvai, Z. N. (2001). Lethality and centrality in protein networks. Nature 411, 41–2.

[48] Barabási, A. and Oltvai, Z. (2004). Network biology: understanding the cell’s functional organization. Nat Rev Genet 5, 101–113.

[49] Ravasz, E., Somera, A. L., Mongru, D. A., Oltvai, Z. N., and Barabasi, A. L. (2002). Hierarchical organization of modularity in metabolic networks. Science 297, 1551–5.

[50] Shen-Orr, S. S., Milo, R., Mangan, S., and Alon, U. (2002). Network motifs in the transcriptional regulation network of Escherichia coli. Nat Genet 31, 64–8.

[51] Bansal, M., Belcastro, V., Ambesi-Impiombato, A., and di Bernardo, D. (2007). How to infer gene networks from expression profiles. Mol Syst Biol 3, 78.

[52] Kuhn, K., Baker, S. C., Chudin, E., Lieu, M.-H., Oeser, S., Bennett, H., Rigault, P., Barker, D., McDaniel, T. K., and Chee, M. S. (2004). A novel, high-performance random array platform for quantitative gene expression profiling. Genome Res 14, 2347–56.

[53] Affymetrix. (2004). GeneChip® Expression Analysis Technical Manual.

[54] Naef, F., Hacker, C. R., Patil, N., and Magnasco, M. (2002). Empirical characterization of the expression ratio noise structure in high-density oligonucleotide arrays. Genome Biol 3, RESEARCH0018.

[55] Boguski, M. S. and Schuler, G. D. (1995). ESTablishing a human transcript map. Nat Genet 10, 369–71.

[56] Mecham, B. H., Wetmore, D. Z., Szallasi, Z., Sadovsky, Y., Kohane, I., and Mariani, T. J. (2004). Increased measurement accuracy for sequence-verified microarray probes. Physiol Genomics 18, 308–15.

[57] Gautier, L., Moller, M., Friis-Hansen, L., and Knudsen, S. (2004). Alternative mapping of probes to genes for Affymetrix chips. BMC Bioinformatics 5, 111.

[58] Pruitt, K. D., Tatusova, T., and Maglott, D. R. (2007). NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res 35, D61–5.

[59] Maglott, D., Ostell, J., Pruitt, K. D., and Tatusova, T. (2005). Entrez Gene: gene-centered information at NCBI. Nucleic Acids Res 33, D54–8.

[60] Affymetrix. GeneChip® Expression Analysis Data Analysis Fundamentals.

[61] Li, C. and Wong, W. H. (2001). Model-based analysis of oligonucleotide arrays: expression index computation and outlier detection. Proc Natl Acad Sci U S A 98, 31–6.

[62] Zhang, L., Miles, M. F., and Aldape, K. D. (2003). A model of molecular interactions on short oligonucleotide microarrays. Nat Biotechnol 21, 818–21.

[63] Wu, Z., Irizarry, R. A., Gentleman, R., Murillo, F. M., and Spencer, F. (2004). A model based background adjustment for oligonucleotide expression arrays. Journal of the American Statistical Association 99, 909–917.

[64] Affymetrix http://www.affymetrix.com/support/technical/technotes/plier technote.pdf. (2005). Guide to Probe Logarithmic Intensity Error (PLIER) Estimation.

[65] Wit, E. and McClure, J. (2004). Statistics for Microarrays: Design, Analysis and Inference. (John Wiley & Sons, Ltd).

[66] Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Statist. Soc. pp. 289–300.

[67] Tusher, V. G., Tibshirani, R., and Chu, G. (2001). Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci U S A 98, 5116–21.

[68] Efron, B., Tibshirani, R., Storey, J., and Tusher, V. (2001). Empirical bayes analysis of microarray experiment. J Am Stat Assoc 96, 1151–1160.

[69] Hartwell, L. H., Hopfield, J. J., Leibler, S., and Murray, A. W. (1999). From molecular to modular cell biology. Nature 402, C47–52.

[70] Ravasz, E. and Barabasi, A.-L. (2003). Hierarchical organization in complex networks. Phys Rev E Stat Nonlin Soft Matter Phys 67, 026112.

[71] D’haeseleer, P., Liang, S., and Somogyi, R. (2000). Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16, 707–26.

[72] Asyali, M. H., Colak, D., Demirkaya, O., and Inan, M. S. (2006). Gene expression profile classification: A review. Current Bioinformatics 1, 55–73.

[73] Kerr, G., Ruskin, H. J., Crane, M., and Doolan, P. (2008). Techniques for clustering gene expression data. Comput Biol Med 38, 283–93.

[74] Eisen, M. B., Spellman, P. T., Brown, P. O., and Botstein, D. (1998). Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci U S A 95, 14863–8.

[75] Tamayo, P., Slonim, D., Mesirov, J., Zhu, Q., Kitareewan, S., Dmitrovsky, E., Lander, E. S., and Golub, T. R. (1999). Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation. Proc Natl Acad Sci U S A 96, 2907–12.

[76] Tavazoie, S., Hughes, J. D., Campbell, M. J., Cho, R. J., and Church, G. M. (1999). Systematic determination of genetic network architecture. Nat Genet 22, 281–5.

[77] Segal, E., Shapira, M., Regev, A., Pe’er, D., Botstein, D., Koller, D., and Friedman, N. (2003). Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data. Nat Genet 34, 166–76.

[78] Segal, E., Friedman, N., Kaminski, N., Regev, A., and Koller, D. (2005). From signatures to models: understanding cancer using microarrays. Nat Genet 37 Suppl, S38–45.

[79] Getz, G., Levine, E., and Domany, E. (2000). Coupled two-way clustering analysis of gene microarray data. Proc Natl Acad Sci U S A 97, 12079–84.

[80] Getz, G., Gal, H., Kela, I., Notterman, D. A., and Domany, E. (2003). Coupled two-way clustering analysis of breast cancer and colon cancer gene expression data. Bioinformatics 19, 1079–89.

[81] Margolin, A. A., Nemenman, I., Basso, K., Wiggins, C., Stolovitzky, G., Favera, R. D., and Califano, A. (2006). ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7 Suppl 1, S7.

[82] de la Fuente, A., Bing, N., Hoeschele, I., and Mendes, P. (2004). Discovery of meaningful associations in genomic data using partial correlation coefficients. Bioinformatics 20, 3565–74.

[83] Yeung, M. K., Tegner, J., and Collins, J. J. (2002). Reverse engineering gene networks using singular value decomposition and robust regression. Proc Natl Acad Sci U S A 99, 6163–6168.

[84] Tegnér, J., Yeung, M. K., Hasty, J., and Collins, J. J. (2003). Reverse engineering gene networks: Integrating genetic perturbations with dynamical modeling. Proc Natl Acad Sci U S A 100, 5944–5949.

[85] Gustafsson, M., Hornquist, M., and Lombardi, A. (2005). Constructing and analyzing a large-scale gene-to-gene regulatory network–lasso-constrained inference and biological validation. IEEE/ACM Trans Comput Biol Bioinform 2, 254–61.

[86] Gustafsson, M., Hörnquist, M., Lundström, J., Björkegren, J., and Tegnér, J. (2008). Reverse engineering of gene networks with lasso and non-linear basis functions. Manuscript.

[87] Friedman, N., Linial, M., Nachman, I., and Pe’er, D. (2000). Using Bayesian networks to analyze expression data. J Comput Biol 7, 601–20.

[88] Pena, J. M., Bjorkegren, J., and Tegner, J. (2005). Growing Bayesian network models of gene networks from seed genes. Bioinformatics 21 Suppl 2, ii224–9.

[89] Butte, A. J. and Kohane, I. S. (2000). Mutual information relevance networks: functional genomic clustering using pairwise entropy measurements. Pac Symp Biocomput pp. 418–29.

[90] de Lichtenberg, U., Jensen, L. J., Brunak, S., and Bork, P. (2005). Dynamic complex formation during the yeast cell cycle. Science 307, 724–7.

[91] Pujana, M. A., Han, J.-D. J., Starita, L. M., Stevens, K. N., Tewari, M., Ahn, J. S., Rennert, G., Moreno, V., Kirchhoff, T., Gold, B., et al. (2007). Network modeling links breast cancer susceptibility and centrosome dysfunction. Nat Genet 39, 1338–49.

[92] Lieu, H. D., Withycombe, S. K., Walker, Q., Rong, J. X., Walzem, R. L., Wong, J. S., Hamilton, R. L., Fisher, E. A., and Young, S. G. (2003). Eliminating atherogenesis in mice by switching off hepatic lipoprotein secretion. Circulation 107, 1315–21.

[93] Xenarios, I., Rice, D., Salwinski, L., Baron, M., and Marcotte, E. (2000). Dip: the database of interacting proteins. Nucleic Acids Res. 28, 289–91.

[94] Deane, C., Salwiński, L., Xenarios, I., and Eisenberg, D. (2002). Protein interactions: two methods for assessment of the reliability of high throughput observations. Mol Cell Proteomics 1, 349–56.

[95] Hughes, T. R., Marton, M. J., Jones, A. R., Roberts, C. J., Stoughton, R., Armour, C. D., Bennett, H. A., Coffey, E., Dai, H., He, Y. D., et al. (2000). Functional discovery via a compendium of expression profiles. Cell 102, 109–126.

[96] Mnaimneh, S., Davierwala, A., Haynes, J., Moffat, J., Peng, W., Zhang, W., Yang, X., Pootoolal, J., Chua, G., Lopez, A., et al. (2004). Exploration of essential gene functions via titratable promoter alleles. Cell 118, 31–44.

[97] Efron, B. and Tibshirani, R. (2002). Empirical bayes methods and false discovery rates for microarrays. Genet Epidemiol 23, 70–86.

[98] Wolfram Research, I. (2003). Mathematica Edition: Version 5.1. (Champaign, Illinois: Wolfram Research, Inc.).

[99] Kaufman, L. and Rousseeuw, P. (1990). Finding Groups in Data: An Introduction to Cluster Analysis. (New York: John Wiley & Sons).

[100] Strandberg, P. E. (2005). On text mining to identify gene networks with a special reference to cardiovascular disease. Master’s thesis, Linköping University.

[101] Dennis, G. J., Sherman, B. T., Hosack, D. A., Yang, J., Gao, W., Lane, H. C., and Lempicki, R. A. (2003). DAVID: Database for Annotation, Visualization, and Integrated Discovery. Genome Biol 4, P3.

[102] Helfand, M., Carson, S., and Kelley, C. (2006). Drug class review on hmg-coa reductase inhibitors (statins). http://www.ohsu.edu/drugeffectiveness/reports/final.cfm.

[103] Blatt, M., Wiseman, S., and Domany, E. (1996). Superparamagnetic clustering of data. Phys. Rev. Lett. 76, 3251–3254.

[104] Tetko, I. V., Facius, A., Ruepp, A., and Mewes, H.-W. (2005). Super paramagnetic clustering of protein sequences. BMC Bioinformatics 6, 82.

[105] Matys, V., Kel-Margoulis, O. V., Fricke, E., Liebich, I., Land, S., Barre-Dirrie, A., Reuter, I., Chekmenev, D., Krull, M., Hornischer, K., et al. (2006). TRANSFAC and its module TRANSCompel: transcriptional gene regulation in eukaryotes. Nucleic Acids Res 34, D108–10.

[106] Mishra, G. R., Suresh, M., Kumaran, K., Kannabiran, N., Suresh, S., Bala, P., Shivakumar, K., Anuradha, N., Reddy, R., Raghavan, T. M., et al. (2006). Human protein reference database–2006 update. Nucleic Acids Res 34, D411–4.

[107] Ashburner, M., Ball, C. A., Blake, J. A., Botstein, D., Butler, H., Cherry, J. M., Davis, A. P., Dolinski, K., Dwight, S. S., Eppig, J. T., et al. (2000). Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 25, 25–9.

[108] Schuler, G. D. (1997). Pieces of the puzzle: expressed sequence tags and the catalog of human genes. J Mol Med 75, 694–8.

[109] Cleveland, W. (1979). Robust locally weighted regression and smoothing scatterplots. J Am Statist Assoc 74, 829–836.

[110] Zweigenbaum, P., Demner-Fushman, D., Yu, H., and Cohen, K. B. (2007). Frontiers of biomedical text mining: current progress. Brief Bioinform 8, 358–75.

[111] Zhou, D. and He, Y. (2008). Extracting interactions between proteins from the literature. J Biomed Inform 41, 393–407.

[112] Jimeno, A., Jimenez-Ruiz, E., Lee, V., Gaudan, S., Berlanga, R., and Rebholz-Schuhmann, D. (2008). Assessment of disease named entity recognition on a corpus of annotated sentences. BMC Bioinformatics 9 Suppl 3, S3.

[113] Kim, J.-J., Pezik, P., and Rebholz-Schuhmann, D. (2008). MedEvi: retrieving textual evidence of relations between biomedical concepts from Medline. Bioinformatics 24, 1410–2.

[114] Fink, L., Kwapiszewska, G., Wilhelm, J., and Bohle, R. M. (2006). Laser-microdissection for cell type- and compartment-specific analyses on genomic and proteomic level. Exp Toxicol Pathol 57 Suppl 2, 25–9.

[115] Fink, L., Kohlhoff, S., Stein, M. M., Hanze, J., Weissmann, N., Rose, F., Akkayagil, E., Manz, D., Grimminger, F., Seeger, W., et al. (2002). cDNA array hybridization after laser-assisted microdissection from nonneoplastic tissue. Am J Pathol 160, 81–90.

[116] Sims, F. H. (1983). A comparison of coronary and internal mammary arteries and implications of the results in the etiology of arteriosclerosis. Am Heart J 105, 560–6.

[117] Herrgard, M. J., Covert, M. W., and Palsson, B. O. (2003). Reconciling gene expression data with known genome-scale regulatory network structures. Genome Res 13, 2423–34.

[118] Kong, Y. M., Macdonald, R. J., Wen, X., Yang, P., Barbera, V. M., and Swift, G. H. (2006). A comprehensive survey of DNA-binding transcription factor gene expression in human fetal and adult organs. Gene Expr Patterns 6, 678–86.

[119] Yu, H., Luscombe, N. M., Qian, J., and Gerstein, M. (2003). Genomic analysis of gene expression relationships in transcriptional regulatory networks. Trends Genet 19, 422–7.

[120] Spellman, P. T., Sherlock, G., Zhang, M. Q., Iyer, V. R., Anders, K., Eisen, M. B., Brown, P. O., Botstein, D., and Futcher, B. (1998). Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol Biol Cell 9, 3273–97.

[121] Golub, T. R., Slonim, D. K., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J. P., Coller, H., Loh, M. L., Downing, J. R., Caligiuri, M. A., et al. (1999). Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286, 531–7.

[122] Quackenbush, J. (2006). Microarray analysis and tumor classification. N Engl J Med 354, 2463–72.

[123] Wong, J. W. H., Sullivan, M. J., and Cagney, G. (2008). Computational methods for the comparative quantification of proteins in label-free LCn-MS experiments. Brief Bioinform 9, 156–65.

[124] Berglund, L., Björling, E., Oksvold, P., Fagerberg, L., Asplund, A., Szigyarto, C. A.-K., Persson, A., Ottosson, J., Wernerus, H., Nilsson, P., et al. (2008). A gene-centric human protein atlas for expression profiles based on antibodies. Mol Cell Proteomics 7, 2019–2027.

[125] Haab, B. B., Dunham, M. J., and Brown, P. O. (2001). Protein microarrays for highly parallel detection and quantitation of specific proteins and antibodies in complex solutions. Genome Biol 2, RESEARCH0004.

[126] Guo, Y., Xiao, P., Lei, S., Deng, F., Xiao, G. G., Liu, Y., Chen, X., Li, L., Wu, S., Chen, Y., et al. (2008). How is mRNA expression predictive for protein expression? A correlation study on human circulating monocytes. Acta Biochim Biophys Sin (Shanghai) 40, 426–36.

[127] Redon, R., Ishikawa, S., Fitch, K. R., Feuk, L., Perry, G. H., Andrews, T. D., Fiegler, H., Shapero, M. H., Carson, A. R., Chen, W., et al. (2006). Global variation in copy number in the human genome. Nature 444, 444–54.

[128] Gonzalez, E., Kulkarni, H., Bolivar, H., Mangano, A., Sanchez, R., Catano, G., Nibbs, R. J., Freedman, B. I., Quinones, M. P., Bamshad, M. J., et al. (2005). The influence of CCL3L1 gene-containing segmental duplications on HIV-1/AIDS susceptibility. Science 307, 1434–40.

[129] Komura, D., Shen, F., Ishikawa, S., Fitch, K. R., Chen, W., Zhang, J., Liu, G., Ihara, S., Nakamura, H., Hurles, M. E., et al. (2006). Genome-wide detection of human copy number variations using high-density DNA oligonucleotide arrays. Genome Res 16, 1575–84.

[130] Brem, R. B., Yvert, G., Clinton, R., and Kruglyak, L. (2002). Genetic dissection of transcriptional regulation in budding yeast. Science 296, 752–5.

[131] Schadt, E. E., Monks, S. A., Drake, T. A., Lusis, A. J., Che, N., Colinayo, V., Ruff, T. G., Milligan, S. B., Lamb, J. R., Cavet, G., et al. (2003). Genetics of gene expression surveyed in maize, mouse and man. Nature 422, 297–302.

[132] Emilsson, V., Thorleifsson, G., Zhang, B., Leonardson, A. S., Zink, F., Zhu, J., Carlson, S., Helgason, A., Walters, G. B., Gunnarsdottir, S., et al. (2008). Genetics of gene expression and its effect on disease. Nature 452, 423–8.

[133] Chen, Y., Zhu, J., Lum, P. Y., Yang, X., Pinto, S., MacNeil, D. J., Zhang, C., Lamb, J., Edwards, S., Sieberts, S. K., et al. (2008). Variations in DNA elucidate molecular networks that cause disease. Nature 452, 429–35.

[134] Chesler, E. J., Lu, L., Shou, S., Qu, Y., Gu, J., Wang, J., Hsu, H. C., Mountz, J. D., Baldwin, N. E., Langston, M. A., et al. (2005). Complex trait analysis of gene expression uncovers polygenic and pleiotropic networks that modulate nervous system function. Nat Genet 37, 233–42.
