
Origin of the eukaryotic cell

Min Wu

Degree Project in Biology, Master of Science (2 years), 2011 Examensarbete I biologi 45 hp till masterexamen, 2011

Biology Education Centre and Department of Molecular Evolution, Uppsala University

Supervisor: Siv Andersson


Abstract

The discovery of 2.1-billion-year-old eukaryotic fossils in Gabon in July 2010 has brought new insights into the origin of the eukaryotic cell (Albani et al. 2010). In this project, the relationship among the three domains of the tree of life, Eukaryotes, Bacteria and Archaea, was investigated through phylogenies constructed from two mitochondrial transmembrane proteins, HSP70 and HSP60. The trees were rooted with a paralog clade from the endoplasmic reticulum. In the HSP70 tree, eukaryotes were the earliest-diverging clade, whereas in the HSP60 tree they were more closely related to Archaea, as observed in the universal tree of life. A lateral gene transfer between Bacteria and Archaea was suspected for both genes. All mitochondrial copies of the genes, including those from secondarily amitochondrial organisms, share the same last common ancestor with alphaproteobacteria and appear to be particularly affiliated with Rickettsiales. The placement of the SAR11 clade within alphaproteobacteria fluctuated between Rickettsiales and Rhodobacterales. The placement of SAR116 within alphaproteobacteria is identified here for the first time since the release of its genome, and it consistently clustered within Rhodospirillales.

Ten years ago, 137 Eukaryotic Signature Proteins were defined in the yeast mitochondrial proteome (Karlberg, 2000). A BLAST search with these proteins found new prokaryotic homologs for twelve of them, of which only five had more than 5 hits.

Patterns of tRNA gene loss in mitochondrial genomes were also studied. The analysis of 1704 mitochondrial genomes supported a co-evolution between tRNA gene loss and the corresponding aminoacyl-tRNA synthetases. A Pearson correlation test did not support the hypothesis proposed by Schneider (Schneider 2001; Lithgow 2010) that the frequency of tRNA gene loss is related to the sequence identity between bacterial and eukaryotic aminoacyl-tRNA synthetases.


Contents

1. Introduction
1.1 The latest discovery of 2.1 Gyr old Eukaryotic Fossils
1.2 The 3D Scenario versus the 2D Scenario
1.3 Endosymbiont Theory
1.4 Eukaryotic Signatures
1.5 Hsp multigene family and Mitochondrial Protein Import Machine
1.6 tRNA gene loss in Mitochondrial genome and Aminoacyl-tRNA Synthetase gene

2. Material and Methods
2.1 Blast for Eukaryotic Signature Proteins
2.2 Phylogenies for Mitochondrial Transmembrane proteins
2.3 Mitochondrial tRNA gene loss
2.3.1 Construction of phylogenetic trees
2.3.2 Parsimony inference of tRNA loss patterns
2.3.3 Statistics in R

3. Results
3.1 Blast for Eukaryotic Signature Proteins
3.2 Phylogenies for HSP70 and HSP60 multigene family
3.2.1 Trees for three domains of Life
3.2.2 Trees for Mitochondria and Alphaproteobacteria
3.3 Mitochondrial tRNA gene loss
3.3.1 tRNA gene loss in Viridiplantae Kingdom
3.3.2 tRNA gene loss in Fungi Kingdom
3.3.3 ANOVA for loss tRNA genes of different origin
3.3.4 Correlation between tRNA Min-loss and sequence identity
3.3.5 Correlation between tRNA Total-loss and Mitochondrial Genome size
3.3.6 Correlation between tRNA Total-loss and Mitochondrial Coding Genome size

4. Discussion
4.1 Lateral gene transfer between Bacteria and Archaea
4.2 Deep node of Cyanobacteria
4.3 Place of SAR11 and SAR116 sequences in HSP phylogenies
4.4 Clean-ups of alignment result in variations of phylogenetic topology
4.5 Paralog Rooting
4.6 Consistence with Endosymbiont Theory
4.7 Co-evolution of tRNA gene loss and Aminoacyl-tRNA Synthetase gene

Acknowledgement

Reference

Appendix


1. Introduction

1.1 The latest discovery of 2.1 Gyr old Eukaryotic Fossils

Ever since Charles Darwin, the palaeontological record has been accepted as one of the most powerful lines of evidence for biological evolution. His prediction of an extensive prehistory of life, reasoned from Cambrian fossils dated to 542 million years ago, was borne out by the recently discovered macroscopic fossils from Paleoproterozoic rocks in southeastern Gabon, dated to 2.1 billion years ago. These fossils are believed to be eukaryotic because they contain the eukaryotic signature compound sterane, which is derived from sterol precursors (Albani et al. 2010; Donoghue & Antcliffe 2010).

These eukaryotic fossils postdate the Great Oxidation Event (Figure 1.1; Donoghue & Antcliffe 2010). Although it was astonishing to push the fossil record of eukaryotes back by 200 million years, it is reasonable to believe that eukaryotes originated shortly after the first rise of atmospheric oxygen 2.4 billion years ago, as cyanobacteria consumed toxic greenhouse gases and oxygen accumulated. Moreover, it is quite possible that these fossils represent now-extinct ancestral eukaryotes, from which more details of the poorly understood characters of this ancient lineage could be inferred.

Figure 1.1. Fossil evidence of early life and the corresponding chemical composition of the oceans and atmosphere on Earth (Donoghue & Antcliffe 2010, with permission from the publisher).

1.2 The 3D Scenario versus the 2D Scenario

The two-domain scenario (2D Scenario), based on phylogenetic trees that show an archaeal origin of Eukaryotes, states that there are only two domains in the universal tree of life: the Prokaryotes, composed of Bacteria and Archaea, and the Eukaryotes, which arose as descendants of the first domain (Gribaldo, 2010). Within the two-domain scenario, one hypothesis suggests that Eukarya originated from the fusion of a bacterium and a member of the archaeal phylum Crenarchaeota, based on small-subunit ribosomal RNA (16S rRNA) phylogenies (Lake, et al. 1984; Rivera & Lake, 1992). An alternative hypothesis holds that Eukarya emerged from a symbiosis between a bacterium and a member of the other archaeal phylum, Euryarchaeota (Margulis, 1970; Margulis, 1996; Searcy & Stein, 1978; Searcy, 2003). However, an argument against the 2D Scenario is that, despite the example of gammaproteobacterial symbionts hosted in a betaproteobacterium, no evidence of an endosymbiotic association between an archaeon and a bacterium has yet been obtained (Poole & Penny, 2007).

The three-domain scenario (3D Scenario), based on the classical universal tree of life, proposes that all living organisms are divided into three independent domains: Eukaryotes, Bacteria and Archaea (Woese, 1990). In this scenario, a now-extinct ancestral eukaryotic lineage is invoked to explain the origin of mitochondria and the evolution of other complex characters in eukaryotes (Kurland, et al. 2006; Poole & Penny, 2007; Field & Dacks, 2009).

Seven recent large-scale phylogenomic investigations of the tree of life, applying different strategies, failed to reach a consensus. Three studies show independent clades of Archaea and Eukaryotes, in agreement with the 3D Scenario, whereas four studies show a relationship between Eukaryotes and a particular lineage of Archaea, which supports the 2D Scenario, although the archaeal lineage involved varies from one study to another. Thus, a robust rooted phylogeny of Archaea has been emphasized as essential for answering the question of the origin of Eukaryotes (Brochier Armanet, et al. 2008; Robertson, et al. 2005; Gribaldo, et al. 2010).

1.3 Endosymbiont Theory

The endosymbiont theory, which states that mitochondria and chloroplasts were derived from free-living bacteria, was first proposed in the nineteenth century, but it was not widely accepted until Margulis revived it with her own molecular experimental evidence in the 1970s (Brindefalk, 2009).

Many characters, such as a single circular genome, bacterial-type transcription and translation enzymes and components, and replication and division resembling bacterial binary fission, support the link of mitochondria and plastids to bacteria (Barton, et al. 2007). The phagotrophic nature of eukaryotes might also have facilitated the occurrence of endosymbiotic events (Kurland & Penny, 2006). Conclusive proof of the endosymbiont theory came from phylogenetic analysis.

Phylogenetic analyses trace all primary plastids back to a single endosymbiotic event between a eukaryotic cell and a cyanobacterium, and trace the origin of mitochondria back to a common ancestor within alphaproteobacteria. Early genomic studies suggested that Rickettsia, with its highly reduced genome, is the alphaproteobacterium most closely related to mitochondria (Andersson, et al. 1998). A recent phylogenomic study also placed mitochondria within the alphaproteobacterial clade (unpublished results), but their exact position within alphaproteobacteria is still unknown.

The discovery of homologs of mitochondrial proteins in amitochondrial protists (Bui, et al., 1996; Clark & Roger, 1995; Germot, et al., 1996; Germot, et al., 1997; Horner, et al., 1996; Mueller, 1998; Roger, et al., 1996; Roger, et al., 1998) suggested that these organisms, such as Trichomonas, Giardia and Entamoeba, are actually secondarily amitochondrial, and that all eukaryotes have, or at least have had, mitochondria. Hence, it was proposed that the endosymbiotic event that gave rise to mitochondria occurred before the emergence of the last common ancestor of eukaryotes (Kurland & Andersson, 2000). Answering the question of the origin of mitochondria therefore helps to unravel the mystery of the eukaryotic origin.

Another interesting consequence of endosymbiosis was the migration of genes from the organelle genome to the nuclear genome, and thus the reduction of the organelle genome, since the stable environment within eukaryotic cells made many genes that are essential for free-living bacteria unnecessary. For example, the maximum number of protein-coding genes in organelle genomes is only 100 (Barton, et al. 2007). In compensation, materials including metabolites, non-coding RNAs and proteins are imported from the host. However, many imported proteins and RNAs are not necessarily of eukaryotic origin; in some cases the genes that encode them were transferred from the organelle and are of bacterial origin.

1.4 Eukaryotic Signatures

Features such as the nucleus, the endomembrane system (Bapteste, et al., 2005; Mans, et al., 2004; Field, et al., 2009), the mitochondrion (Embley, 2006; Giezen & Tovar, 2005), spliceosomal introns (Collins & Penny, 2005; Roy & Gilbert, 2006), linear chromosomes with telomeres synthesized by telomerases (Nakamura & Cech, 1998), meiotic sex (Ramesh, et al., 2005), sterol synthesis (Desmond & Gribaldo, 2009), unique cytokinesis structures (Eme, et al., 2009) and the capability of phagocytosis (Cavalier, 2002; Jekely, 2003) distinguish eukaryotes from the other two domains, Bacteria and Archaea (Gribaldo, et al. 2010).

From the perspective of comparative genomics, proteins with no orthologs in prokaryotic genomes are regarded as eukaryotic signature proteins (ESPs) (Kurland, et al. 2006), and 137 proteins in the yeast mitochondrial proteome were described as ESPs ten years ago (Karlberg, et al., 2000). As much more genomic data is available today, it is worthwhile to BLAST these previously defined ESPs against the current prokaryotic genome database, looking for homologs that might have been missed ten years ago because of limited data.

1.5 Hsp multigene family and Mitochondrial Protein Import Machine

Heat shock proteins (Hsps) are groups of proteins that are conserved in almost all living organisms (Craig, 1985). They function as molecular chaperones, preventing proteins from degradation and promoting the folding of preproteins during their translocation across membranes (Johnston, et al. 1998; Gupta, et al. 1994).

Families of large Hsps are categorized by molecular weight, such as Hsp90, Hsp70, Hsp60 and Hsp40. In particular, mtHsp70 has a well-established function as a central part of the ATP-dependent motor that imports preproteins across the mitochondrial inner membrane into the matrix (Figure 1.2; Pfanner & Geissler, 2001; Kang, et al., 1990; Horst, et al., 1997; Voos, et al., 1996).


It was recently discovered that all large Hsps in eukaryotic cells cluster in four different monophyletic groups according to their subcellular location (Huang, et al. 2008). The cytoplasmic and endoplasmic reticulum versions appear to have evolved from a duplication of the same ancestral gene, while the origins of the mitochondrial and plastid versions are more controversial and differ among Hsps. However, phylogenies for both Hsp70 and Hsp60 are consistent with the endosymbiont theory: mitochondrial versions group with proteobacteria (Hsp60 was shown to group with alpha-proteobacteria) and plastid versions group with cyanobacteria (Figure 1.3; Boorstein, et al. 1994; Huang, et al. 2008).

Figure 1.2. The two main mitochondrial protein import pathways (Pfanner & Geissler, 2001, with permission from the publisher).

Since the placement of the different versions of Hsps in the context of the three domains is still unclear, it is interesting to look at the whole picture within the tree of life and to try to untangle the evolutionary relationships among eukaryotes, bacteria and archaea. Several questions concerning the origin of mitochondria remain to be answered: is the mitochondrial version of Hsp70 grouped with alpha-proteobacteria, as observed for Hsp60, and which order or, more specifically, which living species of alpha-proteobacteria best represents the ancestor of mitochondria? In addition, mitochondrial versions from eukaryotic species that have lost their mitochondria might provide another perspective on both the origin of mitochondria and the origin of eukaryotes.

1.6 tRNA gene loss in Mitochondrial genome and Aminoacyl-tRNA Synthetase gene

In contrast to the monophyletic origin of mitochondrial protein import, tRNA import evolved multiple times during the evolution of eukaryotes, since some tRNA genes were lost from the mitochondrial genome, and the same tRNA gene was lost independently several times in different eukaryotic lineages (Adams and Palmer 2003; Gissi et al. 2008).

Schneider proposed the hypothesis that the frequency of mitochondrial tRNA gene loss is positively correlated with the sequence similarity between the bacterial-type (endosymbiont) mitochondrial aminoacyl-tRNA synthetases (aaRSs) and their eukaryotic-type cytoplasmic homologs (Schneider 2001; Lithgow 2010).

However, phylogenetic studies of aaRSs have falsified the assumption underlying Schneider's hypothesis, namely that mitochondrial genes are always of bacterial origin and cytoplasmic genes of eukaryotic origin. In fact, the complex evolutionary history of mitochondrial and cytoplasmic aaRSs has left some aaRSs active in both compartments but of a single origin, either bacterial or eukaryotic (Brindefalk et al. 2007).

Figure 1.3. Multigene copies of HSP70 and HSP60 in eukaryotic cells (Boorstein, et al. 1994; Huang, et al. 2008, with permission from the publishers). A: Distance-matrix-based phylogeny of HSP70 (Boorstein, et al. 1994). B: Neighbor-joining phylogeny of HSP60 (Huang, et al. 2008). Both trees show different origins for genes functioning in different organelles. The mitochondrial copy of HSP70 groups with proteobacteria and, more specifically, that of HSP60 groups with Alphaproteobacteria; the plastid copy of HSP70 groups with Cyanobacteria, as does the chloroplast copy of HSP60.

Hence, it is interesting to take this study further, untangle the patterns of tRNA loss, and ask: since tRNA loss is not explained by the sequence similarity of aaRS homologs, what is the appropriate explanation for tRNA loss in the mitochondrial genome?

Unpublished evidence in Brindefalk's PhD thesis suggested a co-evolution between mitochondrial tRNA loss and the replacement of the corresponding aaRSs, based on all mitochondrial genomes available up to November 2006. In this project, we update Brindefalk's analysis with the mitochondrial genome resources available in December 2010.


2. Material and Methods

2.1 Blast for Eukaryotic Signature Proteins

There are 137 genes from the yeast proteome that were defined as Eukaryotic Signature Proteins (ESPs), whose homologs were found only in Eukaryotes (Karlberg, et al., 2000). Since more genomic data is available now than ten years ago, when Karlberg did his research, these 137 ESPs were searched with BLAST against the NCBI prokaryotic genome database as updated in April 2010. The database includes genomes of 1113 species, of which 78 are Archaea and 992 are Bacteria. The program BLASTP 2.2.18 was run locally, and the E-value threshold was set at 10^-10.
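The filtering step described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in this project: it assumes the BLAST results were saved in the common 12-column tab-separated layout (query ID, subject ID, ..., E-value in column 11, bit score in column 12), and the function name `count_hits` and the example records are hypothetical.

```python
from collections import Counter

E_VALUE_THRESHOLD = 1e-10  # cut-off stated in the text


def count_hits(tabular_lines, threshold=E_VALUE_THRESHOLD):
    """Count distinct subject sequences per query at or below the E-value cut-off.

    Assumption: each line is 12-column, tab-separated BLAST output with the
    query ID in column 1, the subject ID in column 2 and the E-value in column 11.
    """
    passing = set()
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, e_value = fields[0], fields[1], float(fields[10])
        if e_value <= threshold:
            passing.add((query, subject))  # several alignments to one subject count once
    return Counter(query for query, _ in passing)


# Hypothetical records, for illustration only (not data from this study).
example = [
    "ESP_A\tprok_1\t31.2\t250\t160\t4\t10\t255\t5\t250\t2e-25\t110",
    "ESP_A\tprok_1\t29.0\t90\t60\t1\t300\t389\t280\t369\t4e-12\t70.1",
    "ESP_A\tprok_2\t27.5\t120\t80\t3\t15\t130\t20\t138\t3e-04\t42.0",
    "ESP_B\tprok_3\t35.0\t200\t120\t2\t1\t198\t1\t200\t1e-10\t65.5",
]
for query, n_hits in sorted(count_hits(example).items()):
    print(query, n_hits)
```

Counting distinct subjects rather than alignment lines is a design choice of this sketch; whether the hit numbers reported in this project were counted per subject or per alignment is not stated in the text.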

For each gene with prokaryotic hits in the BLAST results, a maximum likelihood tree was constructed in RAxML 7.0.4, using the CAT approximation of rate heterogeneity and the WAG substitution matrix. Sequences were aligned with Kalign version 2.03.

2.2 Phylogenies for Mitochondrial Transmembrane proteins

Protein sequences of HSP60 and HSP70 from the three domains were retrieved by BLAST from the NCBI dataset, including 52 Bacteria, 8 Archaea and 31 Eukaryotes. Of the 52 bacterial sequences, 33 were from six orders of alphaproteobacteria (Rickettsiales, Rhodospirillales, Rhizobiales, Rhodobacterales, Sphingomonadales and Caulobacterales), and 2 were the environmental sequences SAR11 and SAR116, where the SAR11 clade was represented by Candidatus Pelagibacter. Of the 31 eukaryotic sequences, 7 were from the endoplasmic reticulum, 8 from the cytoplasm and 16 from mitochondria. For all three eukaryotic copies, some species that have lost their mitochondria were included, such as Giardia intestinalis, Entamoeba dispar, Trichomonas vaginalis and Encephalitozoon cuniculi.

Kalign 2.03 was used for sequence alignment, and a maximum likelihood tree was constructed with RAxML 7.0.4, using the CAT approximation of rate heterogeneity and the WAG substitution matrix. Before the Bayesian analysis, a protein model test was performed in ProtTest version 2.4 to find the best-fitting combination of model and matrix. The Bayesian trees were then run in PhyloBayes 3.2, using the models suggested by ProtTest. Two chains were run in PhyloBayes for more than 10,000 generations, and the final phylogeny was accepted only when the maximum difference between the two chains was below 0.01. As an optional treatment, only the conserved part of the alignment, selected with GBLOCKS 0.91b, was used to construct the phylogenies.
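The acceptance criterion above can be illustrated with a short Python sketch. It assumes that the "maximum difference between two chains" is the largest discrepancy in the posterior frequency of any bipartition between the two chains; the function names and the example frequencies are hypothetical, and the sketch does not reproduce how PhyloBayes itself computes or reports this value.

```python
def max_bipartition_difference(freqs_chain1, freqs_chain2):
    """Largest absolute difference in bipartition frequency between two chains.

    Each argument maps a bipartition label (any hashable value) to its posterior
    frequency in one chain. A bipartition missing from a chain counts as 0 there.
    """
    all_splits = set(freqs_chain1) | set(freqs_chain2)
    return max(
        (abs(freqs_chain1.get(s, 0.0) - freqs_chain2.get(s, 0.0)) for s in all_splits),
        default=0.0,
    )


def chains_converged(freqs_chain1, freqs_chain2, threshold=0.01):
    """Apply the acceptance rule from the text: maximum difference below 0.01."""
    return max_bipartition_difference(freqs_chain1, freqs_chain2) < threshold


# Hypothetical bipartition frequencies, for illustration only.
chain1 = {"mito+Rickettsiales": 1.00, "SAR11+Rhodobacterales": 0.75}
chain2 = {"mito+Rickettsiales": 1.00, "SAR11+Rhodobacterales": 0.50}
print(max_bipartition_difference(chain1, chain2))  # 0.25
print(chains_converged(chain1, chain2))            # False
```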

The parameters tested in the Bayesian phylogenies are listed in Table 2.1. We were interested in how the tree topology and the relative placement of certain species changed when the sequence content covered the three domains or only alphaproteobacteria, when the tree was built from the conserved part of the sequences or from the whole alignment, when species without mitochondria were or were not included, and when the trees were rooted in different ways.


Table 2.1. Parameters for HSP phylogenies

Gene   Content  Clean-ups  Amito-taxa  Root
HSP70  3D       -          +           Endoplasmic reticulum copy
HSP70  3D       +          -           Endoplasmic reticulum copy
HSP70  3D       +          +           Endoplasmic reticulum copy
HSP70  Alpha    -          -           Gammaproteobacteria / Aquificae
HSP70  Alpha    -          +           Gammaproteobacteria
HSP70  Alpha    +          +           Gammaproteobacteria
HSP70  Alpha    +          +           Gammaproteobacteria
HSP60  3D       -          +           Endoplasmic reticulum copy
HSP60  3D       +          +           Endoplasmic reticulum copy
HSP60  Alpha    -          -           Gammaproteobacteria / Aquificae
HSP60  Alpha    -          +           Gammaproteobacteria
HSP60  Alpha    +          -           Gammaproteobacteria
HSP60  Alpha    +          +           Gammaproteobacteria

2.3 Mitochondrial tRNA gene loss

2.3.1 Construction of phylogenetic trees

A phylogenetic tree for all Viridiplantae species with an available mitochondrial genome was inferred from the highly conserved cox1 gene, which encodes cytochrome c oxidase subunit 1. All sequences were obtained from NCBI Organelle Genome Resources. Since only one species per order was retained, 37 plants are shown in the final Bayesian tree. A maximum likelihood tree was constructed in RAxML 7.0.4, using the CAT approximation of rate heterogeneity and the WAG substitution matrix. ProtTest 2.4 was used to find the protein matrix and model that best fit our data. The Bayesian tree was constructed in PhyloBayes 3.2, with two chains run in parallel for more than 10,000 generations; the final phylogeny was accepted only when the maximum difference between the two chains was below 0.01.

The phylogeny for all Fungi species was constructed in the same way as for Viridiplantae. The final Fungi tree contains 47 species in total.

2.3.2 Parsimony inference of tRNA loss patterns

The presence/absence matrix of tRNA genes was compiled from the mitochondrial genomes of all taxa obtained from the Organelle Genome Resources. The most parsimonious pattern of tRNA gene loss was inferred with PAUP* 4.0b10, and the minimum number of tRNA gene losses was computed with a BioPerl script written in-house.
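To make the idea of a minimum number of losses concrete, the following Python sketch counts losses for one tRNA gene on a rooted tree. It is not the PAUP* analysis or the BioPerl script used in this project: it rests on the simplifying assumptions that the gene was present at the root and is never regained once lost, so that the minimum number of losses equals the number of maximal clades in which every leaf lacks the gene. The tree encoding, the function name and the example taxa are hypothetical.

```python
def count_min_losses(tree, present):
    """Minimum number of losses of one gene, assuming presence at the root
    and no regain after a loss (a loss-only, Dollo-like model).

    tree:    a leaf name (str) or a tuple of subtrees, e.g. (("A", "B"), "C")
    present: dict mapping each leaf name to True (gene present) or False
    """
    def visit(node):
        # Returns (gene_found_in_subtree, losses_strictly_inside_subtree).
        if isinstance(node, str):
            return present[node], 0
        results = [visit(child) for child in node]
        if not any(found for found, _ in results):
            return False, 0  # whole clade lacks the gene: counted once by the parent
        # A child clade lacking the gene entirely costs one loss on its stem branch.
        losses = sum(inner if found else 1 for found, inner in results)
        return True, losses

    found, losses = visit(tree)
    return losses if found else 1  # gene absent everywhere: one loss at the root


# Hypothetical five-taxon example: the gene is missing from A, B and E.
tree = ((("A", "B"), "C"), ("D", "E"))
present = {"A": False, "B": False, "C": True, "D": True, "E": False}
print(count_min_losses(tree, present))  # 2: one loss on the (A, B) stem, one in E
```

Summing this count over all tRNA genes gives a total of the kind reported per kingdom in the Results, under the stated assumptions.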


2.3.3 Statistics in R

ANOVA of the minimum number of mitochondrial tRNA losses against the class of the corresponding aminoacyl-tRNA synthetase gene was performed in R, version 2.7.2. Two classifications of aaRS genes were tested: the three-group classification of Brindefalk (Brindefalk et al. 2007), and a two-group classification in which one group contains aaRS genes whose mitochondrial and cytoplasmic copies are of different origin and the other contains aaRS genes of single origin (Table 2.2).

Table 2.2. Classification and number of aminoacyl-tRNA synthetase genes

Aminoacyl-tRNA synthetase  Group  Number of genes
TyrRS                      A      2
TrpRS                      A      2
GluRS                      A      2
MetRS                      A      2
PheRS                      A      2
ProRS                      A      2
SerRS                      A      2
LeuRS                      A      2
AspRS                      A      2
IleRS                      B      2
ArgRS                      B      2
AsnRS                      B      2
CysRS                      C      1
HisRS                      C      1
AlaRS                      C      1
ThrRS                      C      1
GlyRS                      C      1
ValRS                      C      1
LysRS                      C      2
GlnRS                      C      1

Three-group classification (from Brindefalk et al. 2007): A: mitochondrial aaRS of bacterial affiliation and cytoplasmic aaRS of archaeal affiliation; B: both mitochondrial and cytoplasmic aaRS of bacterial affiliation; C: single aaRS active in both compartments. Two-group classification: A: mitochondrial aaRS of bacterial affiliation and cytoplasmic aaRS of archaeal affiliation; BC: both mitochondrial and cytoplasmic aaRS of single origin.
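For readers unfamiliar with the test, the one-way ANOVA applied to these groups compares the variance between group means with the variance within groups. The Python sketch below computes the F statistic from first principles; it is an illustration only, the function name is hypothetical, and the numbers in the example are toy values, not loss counts from this project. The P-values reported in this thesis were obtained in R and additionally require the F distribution, which the sketch does not implement.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples.

    groups: list of lists, one list of observations per group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy numbers for illustration only (not data from this study).
group_a = [1, 2, 3]
group_bc = [2, 3, 4]
print(one_way_anova_f([group_a, group_bc]))  # (1.5, 1, 4)
```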

The Pearson correlation between the minimum number of mitochondrial tRNA gene losses and the sequence identity between aaRSs from 5 eukaryotes and 10 bacteria was analysed in R. The Pearson correlations of the total number of mitochondrial tRNA gene losses with the mitochondrial genome size and with the coding genome size were analysed as well.
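As a reminder of what the correlation test measures, the Python sketch below computes the Pearson coefficient r and the t statistic (with n - 2 degrees of freedom) on which the usual significance test for r is based. The function names are hypothetical and the example values are toy numbers, not measurements from this project; converting t to a P-value needs the t distribution, which is left to a statistics package.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Toy values for illustration only (not data from this study).
losses = [1, 2, 3, 4]
identity = [30.0, 28.0, 35.0, 33.0]
r = pearson_r(losses, identity)
print(round(r, 4), round(t_statistic(r, len(losses)), 4))
```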


3. Results

3.1 Blast for Eukaryotic Signature Proteins

Of the 137 eukaryotic signature proteins previously defined in the yeast mitochondrial proteome, based on the genome resources available in 2000, 12 had prokaryotic homolog hits in the BLAST results at an E-value threshold of 10^-10, and 5 of these 12 had more than 5 hits (Table 3.1). PIF1, a mitochondrial DNA repair and recombination protein, had the highest number of hits (93), whereas ATP16, IMP1, IMP2, MIP1 and PIS1 had only 1 hit each. However, the maximum likelihood trees of the prokaryotic homologs of these genes appear inconsistent with the usual clustering of bacterial phyla, which suggests that not all of these prokaryotic hits are true homologs of the corresponding ESPs. For instance, the identity between query and hit sequences mostly ranges from 25% to 35%, and the highest sequence identity is 51.11%.

Table 3.1. Prokaryotic homologs found among the 137 previously defined Eukaryotic Signature Proteins in the yeast mitochondrial proteome

Gene               Hits  Description
PIF1               93    mitochondrial DNA repair and recombination protein PIF1 precursor
SHY1               26    outer membrane translocase complex
MCR1               21    NADH-cytochrome B5 reductase precursor (P34/P32)
ribosomal protein  8     mitochondrial ribosomal protein of the small subunit; member of E. coli S4 superfamily
RPO41              7     mitochondrial DNA-directed RNA polymerase
QCR2               2     ubiquinol-cytochrome C reductase core protein 2
YML025C            2     putative L4P-like ribosomal protein
ATP16              1     ATP synthase delta chain
IMP1               1     mitochondrial inner membrane protease subunit 1
IMP2               1     mitochondrial inner membrane protease subunit 2
MIP1               1     DNA polymerase gamma (mitochondrial DNA polymerase catalytic subunit)
PIS1               1     CDP-diacylglycerol-inositol 3-phosphatidyltransferase

3.2 Phylogenies for HSP70 and HSP60 multigene family

3.2.1 Trees for three domains of Life

In the Bayesian phylogeny of HSP70, eukaryotes and prokaryotes form two independent clades when the tree is rooted with the eukaryotic endoplasmic reticulum paralog, and the topology does not change when the tree is midpoint-rooted (Figure 3.1). The mitochondrial copies of HSP70 from all lineages, including the amitochondrial species Encephalitozoon cuniculi, Giardia lamblia, Entamoeba histolytica and Trichomonas vaginalis, appear to be of single origin with 0.99 support. The mitochondrial clade groups within alphaproteobacteria, as the endosymbiont theory suggests, and is especially closely related to Rickettsiales, with one hundred percent support. The remaining orders of alphaproteobacteria are well separated, and Pelagibacter, the representative of SAR11, clusters within Rhodobacterales, also with one hundred percent support.


Figure 3.1. Tree of the three domains inferred from HSP70

Bayesian inference based on the full sequence alignment, with a Dirichlet process for rates across sites (CAT model) and LG for relative exchangeabilities (exchange rates), rooted with the eukaryotic endoplasmic reticulum copy of the gene. The phylogeny shows a single mitochondrial clade that is sister to alphaproteobacteria. Archaea are mixed with Firmicutes and Actinobacteria, which may have resulted from a lateral gene transfer. The cytoplasmic copy of the gene from Eukaryotes appears at the base of the tree, which suggests an ancient origin of eukaryotic genes.


Figure 3.2. Tree of the three domains inferred from HSP60

Bayesian inference based on an alignment of 456 amino acids after clean-up of the original alignment, with a Dirichlet process for rates across sites (CAT model) and LG for relative exchangeabilities (exchange rates), rooted with the eukaryotic endoplasmic reticulum copy of the gene. The phylogeny shows a single mitochondrial clade that is sister to alphaproteobacteria. Archaea appear at the base of the tree, and the single clade of cytoplasmic copies of the gene from eukaryotes diverges within the Archaea. As in the case of HSP70, one archaeal lineage groups with Firmicutes.


The alphaproteobacterial clade is followed in turn by gammaproteobacteria, betaproteobacteria and deltaproteobacteria, as seen in the universal tree of life. Chlamydia, Victivallis and Planctomyces group close to each other, as expected for the previously suggested PVC superphylum, but do not form a clade of their own. Aquificae appear to be the earliest diverged bacterial phylum in the HSP70 tree, and the Cyanobacteria node is also placed quite deep in the tree.

The most unexpected feature of the HSP70 tree is the mixture of Archaea with Firmicutes and Actinobacteria. Thus neither the bacterial nor the archaeal domain is monophyletic, and a horizontal gene transfer between Archaea and Bacteria is suspected for HSP70.

The Bayesian phylogeny of HSP60 is also rooted with the endoplasmic reticulum copy, under the assumption that HSP70 and HSP60 share the same evolutionary history, and the topology remains the same when the tree is midpoint-rooted (Figure 3.2). However, the endoplasmic reticulum copy of HSP60 is found only in mammals, which may result from a recent duplication of this gene.

Figure 3.3. Tree of alphaproteobacteria and mitochondria inferred from HSP70 (with amitochondrial eukaryotes and clean-ups)


In the HSP60 tree, the eukaryotic clade shows an affiliation with Archaea, as seen in the universal tree of life, and is especially closely related to an archaeal clade that contains two Crenarchaeota species, Thermofilum pendens and Staphylothermus marinus, and one Korarchaeota species, Candidatus Korarchaeum. However, neither Euryarchaeota nor Crenarchaeota is monophyletic in the HSP60 tree. Furthermore, one Euryarchaeota species, Methanospirillum hungatei, shows an affiliation with Bacteria rather than with the other archaeal lineages.

The remaining patterns in the HSP60 phylogeny are consistent with HSP70: the tree strongly supports the endosymbiotic origin of mitochondria and mitochondria-derived organelles from Rickettsiales within alphaproteobacteria. The representative of the SAR11 clade groups with Rhodobacterales, and the representative of the SAR116 clade clusters with Rhodospirillales and Caulobacterales.

Figure 3.4. Tree of alphaproteobacteria and mitochondria inferred from HSP60 (without amitochondrial eukaryotes or clean-ups)


3.2.2 Trees for Mitochondria and Alphaproteobacteria

Phylogenies for both HSP70 and HSP60 were constructed within the context of alphaproteobacteria, and both trees were rooted with gammaproteobacteria (Figure 3.3 & Figure 3.4). Both trees strongly support the endosymbiotic origin of mitochondria from Rickettsiales; the SAR11 clade groups with Rhodobacterales and Caulobacterales, and the SAR116 clade clusters within Rhodospirillales.

However, Rhodospirillales is not monophyletic in either the HSP70 or the HSP60 tree. Sphingopyxis alaskensis, from Sphingomonadales, clusters with Rhizobiales in the HSP60 tree, which may be due to horizontal gene transfer or merely to a coding bias.

3.3 Mitochondrial tRNA gene loss

3.3.1 tRNA gene loss in Viridiplantae Kingdom

Figure 3.5. Parsimony inference of tRNA gene loss patterns in the Viridiplantae kingdom

The phylogeny for the Viridiplantae kingdom was constructed from the highly conserved COX1 gene (cytochrome c oxidase subunit 1). The tree was rooted with the metazoan species Nematostella, because the fungal species Saccharomyces cerevisiae clusters as a sister node of the plants and rooting the tree with Saccharomyces would be more controversial. In the figure, the amino acids of the lost tRNA genes are labelled on the branches where the losses were inferred.


The minimum of 95 tRNA gene losses in the Viridiplantae kingdom, calculated by parsimony inference, is mapped onto the constructed phylogeny (Figure 3.5). tRNA genes such as Thr and Val were lost more than ten times, on both internal branches and terminal leaves, whereas tRNA genes such as Cys, Glu and Tyr were each lost only once within the whole Viridiplantae kingdom.

3.3.2 tRNA gene loss in Fungi Kingdom

Figure 3.6. Parsimony inference of tRNA gene loss patterns in the Fungi kingdom

The phylogeny for the Fungi kingdom was also constructed from the COX1 gene and was rooted with a node containing one species from the Viridiplantae kingdom, Zea mays, and one species from Metazoa, Nematostella.

Those tRNA annotated as tRNA-X, tRNA-Sec, tRNA-Glx were double checked and reannotated according to their anti-codon (Appendix, Table A2). The pattern of 28 minimum number of tRNA loss was calculated from parsimony inference was presented in the constructed phylogeny for Fungi.
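Re-annotation by anticodon can be sketched as follows. The example takes the reverse complement of the anticodon to obtain the codon it reads and looks that codon up in a genetic code table. It is only a sketch: wobble pairing and modified bases are ignored, only NCBI translation tables 1 (standard) and 2 (vertebrate mitochondrial) are included, and the function names are hypothetical. It also shows why the choice of table matters, since an anticodon reading TGA corresponds to a stop (or Sec) codon in the standard code but to Trp in the vertebrate mitochondrial code.

```python
# Minimal sketch: assign an amino acid to a tRNA from its anticodon by taking
# the reverse complement (the codon it reads by standard base pairing) and
# looking it up in a genetic code table. Wobble pairing and modified bases
# are ignored here, which is a simplification.

BASES = "TCAG"
# NCBI translation table 1 (standard) and table 2 (vertebrate mitochondrial),
# written in the usual TCAG codon order.
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
VERT_MITO = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"

def codon_table(amino_acids):
    """Build a codon -> one-letter amino acid dictionary."""
    codons = [a + b + c for a in BASES for b in BASES for c in BASES]
    return dict(zip(codons, amino_acids))

def reverse_complement(seq):
    """Reverse-complement a DNA string containing only A, C, G and T."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))

def amino_acid_for_anticodon(anticodon, table):
    """Anticodon is given 5'->3' in DNA letters, e.g. 'TCA'."""
    return table[reverse_complement(anticodon.upper())]

standard = codon_table(STANDARD)
vert_mito = codon_table(VERT_MITO)

# Anticodon TCA reads codon TGA: '*' in the standard code,
# 'W' (Trp) in the vertebrate mitochondrial code.
print(amino_acid_for_anticodon("TCA", standard))
print(amino_acid_for_anticodon("TCA", vert_mito))
```

A fungal or protist genome would need its own translation table, so a real re-annotation would select the table per taxon.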

3.3.3 ANOVA for loss of tRNA genes of different origin

The number of deletions for each tRNA is shown in Figure 3.7. tRNAs corresponding to the nine aminoacyl-tRNA synthetases (aaRSs) classified as group A (two genes of different origins) were less likely to be lost, whereas tRNAs corresponding to the remaining three aaRSs from group B and nine from group C, referred to together as group BC in the two-group classification (mitochondrial and cytoplasmic aaRSs of a single origin), were more likely to be lost.

The ANOVA test confirmed this tendency for the two-group classification, both when Gln and Lys were included and when they were excluded (P=0.03659 and P=0.0316, respectively). The difference was not significant for the three-group classification, whether Gln and Lys were included or not (P=0.0851 and P=0.0915, respectively).

Figure 3.7. Minimal number of mitochondrial tRNA losses inferred from the analysis of 1704 mitochondrial genomes. tRNAs are categorized according to the evolutionary history of their aaRSs listed in Table 2.2. Black bars: two genes of different origins; grey bars: two genes of similar origin; white bars: a single gene of either origin. The x-axis lists the tRNAs with their aaRS group (Met A, His C, Tyr A, Trp A, Phe A, Leu A, Lys C, Ala C, Ser A, Glu A, Gln C, Asn B, Ile B, Pro A, Asp A, Cys C, Gly C, Val C, Thr C, Arg B); the y-axis gives the number of deletions (scale 0 to 40).

Table 3.2. ANOVA test for Min-tRNA loss

Classification of aminoacyl-tRNA synthetase    tRNA-Gln & tRNA-Lys    P-value
2 groups (A, BC)                               +                      0.03659 *
2 groups (A, BC)                               -                      0.0316 *
3 groups (A, B, C)                             +                      0.0851
3 groups (A, B, C)                             -                      0.0915

+: tRNA-Gln and tRNA-Lys included; -: excluded.
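For readers unfamiliar with the test, the one-way ANOVA F statistic underlying such a comparison can be sketched from scratch. The loss counts below are hypothetical placeholders and not the data of this project, and the function name is an assumption made for illustration. Converting F into a P-value requires the F distribution with the returned degrees of freedom, which statistical packages provide.

```python
# Minimal sketch: one-way ANOVA F statistic computed from scratch.
# The loss counts below are hypothetical placeholders, not the thesis data.

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    # Between-group sum of squares: group size times squared mean offset.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Within-group sum of squares: squared deviations from each group mean.
    ss_within = sum(
        (x - sum(g) / len(g)) ** 2 for g in groups for x in g
    )
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical minimum-loss counts per tRNA for a two-group classification.
group_a = [2, 4, 3, 5]     # stand-in for aaRSs with two genes of different origins
group_bc = [8, 6, 9, 7]    # stand-in for aaRSs of a single origin

f_stat, df1, df2 = one_way_anova_f([group_a, group_bc])
print(f_stat, df1, df2)
```

With two groups the test is equivalent to a two-sample t-test with pooled variance; the three-group classification simply passes three lists to the same function.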

3.3.4 Correlation between tRNA Min-loss and sequence identity

The minimum number of tRNA losses was plotted against the sequence identity between bacterial and eukaryotic genes in Figure 3.8. The non-significant result of the Pearson correlation test (Cor=0.3855, P=0.1141) suggests that the frequency of tRNA loss cannot be explained by aminoacyl-tRNA synthetase sequence identity.


3.3.5 Correlation between tRNA Total-loss and Mitochondrial Genome size

A Pearson correlation test performed in R found no significant correlation between the total number of tRNA losses and mitochondrial genome size, whether tRNA-Gln and tRNA-Lys were taken into account or not (Cor=-0.06877, P=0.8234; Cor=-0.05822, P=0.8502).

Figure 3.8. Sequence similarity between bacterial and eukaryotic aminoacyl-tRNA synthetases plotted against the minimum number of tRNA gene losses. Black squares: two genes of different origins; grey triangles: two genes of similar origins; white diamonds: a single gene of either origin. No correlation was found between sequence identity and the number of tRNA losses.

3.3.6 Correlation between tRNA Total-loss and Mitochondrial Coding Genome size

A Pearson correlation test performed in R found no significant correlation between the total number of tRNA losses and mitochondrial coding genome size, whether tRNA-Gln and tRNA-Lys were taken into account or not (Cor=-0.4530, P=0.1200; Cor=-0.4524, P=0.1206; Table 3.3).

Table 3.3. Pearson correlation test

Correlation                           tRNA-Gln & tRNA-Lys    Cor         P-value
Total-tRNA loss & mito-Genomesize     -                      -0.05822    0.8502
Total-tRNA loss & mito-Codingsize     -                      -0.4524     0.1206
Total-tRNA loss & mito-Genomesize     +                      -0.06877    0.8234
Total-tRNA loss & mito-Codingsize     +                      -0.4530     0.1200
Min-tRNA-loss & Sequence-identity     -                      0.3855      0.1141
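The correlation tests were run in R; an equivalent calculation can be sketched in Python as below. The example uses the per-group average genome sizes and total tRNA losses from Appendix Table A1 as input, on the assumption (not stated explicitly in the text) that the test was run across those 13 eukaryotic groups, so it need not reproduce the reported coefficients exactly. The function names are hypothetical. The second function converts a correlation coefficient into the t statistic, with n - 2 degrees of freedom, from which the P-value is obtained.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def t_statistic(r, n):
    """t statistic (n - 2 degrees of freedom) for testing whether r differs from 0."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Columns copied from Appendix Table A1 (one value per eukaryotic group):
# average mitochondrial genome size and total number of tRNA losses.
genome_size = [16164, 43357, 252019, 42438, 15402, 52314, 31600,
               54308, 29013, 69034, 47328, 76568, 49843]
trna_total_loss = [818, 97, 174, 23, 149, 40, 3, 0, 0, 1, 0, 0, 5]

r = pearson_r(genome_size, trna_total_loss)
t = t_statistic(r, len(genome_size))
print(r, t)
```

A weak negative coefficient with a t statistic close to zero, as obtained here, corresponds to the absence of a significant correlation.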


4. Discussion

4.1 Lateral gene transfer between Bacteria and Archaea

The mixing of archaeal sequences within bacterial phyla, namely Firmicutes, Actinobacteria and Thermotogae, indicates lateral gene transfer between Archaea and Bacteria. This phenomenon has been reported previously; according to that work, only the affiliation between Actinobacteria and Archaea could be explained by coding bias (Macario et al. 2006).

4.2 Deep node of Cyanobacteria

Cyanobacteria are placed at a deep node in both the HSP70 and HSP60 trees, which differs from what was observed in the universal tree of life based on concatenated protein sequences (Brown et al. 2001). However, the early divergence of Cyanobacteria in these phylogenies is consistent with the discovery of 3.5-billion-year-old stromatolite fossils of ancient life resembling modern Cyanobacteria (Barton et al. 2007).

4.3 Place of SAR11 and SAR116 sequences in HSP phylogenies

The placement of SAR11 within alphaproteobacteria was emphasized because this most abundant clade of free-living bacteria is likely to be the closest representative of the ancestors of mitochondria: its lifestyle in the upper layer of the ocean matches the oxygen-rich environment required for the rise of an aerobic cellular organelle. However, it is difficult to conclude where Pelagibacter belongs. It clustered within both Rickettsiales and Rhodobacterales in phylogenies constructed with different parameters, although support tended to be higher for the placement with Rhodobacterales.

SAR116 is another clade whose lifestyle is still unclear. The genome of Puniceispirillum marinum from this clade was published in June 2010 (Oh, 2010), and the sequence was available online in early April. This is therefore the first investigation of the phylogenetic placement of this lineage within alphaproteobacteria. The results from both HSP70 and HSP60 agree that SAR116 consistently groups within Rhodospirillales.

4.4 Clean-ups of alignment result in variations of phylogenetic topology

The topology within alphaproteobacteria for both HSP70 and HSP60 varies depending on the clean-up of the sequence alignment. For HSP70, the order Rickettsiales is monophyletic only when amitochondrial species are excluded from the tree and the sequence alignment is used directly without clean-up. Otherwise, the order is divided into two subclades, one containing Rickettsia and the other containing Wolbachia and the remaining species of Rickettsiales, with the mitochondrial clade more closely related to the Wolbachia subclade. In the alpha tree for HSP70 with amitochondrial lineages and a cleaned-up alignment, the mitochondrial clade shares a common ancestor with the Wolbachia subclade of Rickettsiales.


For HSP60, the relationship between the mitochondrial clade and Rickettsia does not change whether or not the alignment clean-up is applied. However, the relative positions of Planctomyces and Chlamydia fluctuated with the clean-up: the PVC phylum broke apart when the clean-up step was omitted. One explanation for why HSP60 benefits from alignment clean-up is the lower identity between the paralogs from the endoplasmic reticulum and the cytoplasm.

4.5 Paralog Rooting

The extreme conservation and multiple copies of the two genes make them excellent markers for investigating the evolutionary relationships among Eukaryotes, Bacteria and Archaea. The duplication that gave rise to the endoplasmic reticulum and cytoplasmic copies of the gene in Eukaryotes makes it possible to root the phylogeny with a eukaryotic paralog.

The ER copy of HSP60 was found only in mammals. Its protein sequences, except for the cpn60 domain, are more variable than those of the cytoplasmic copy and less conserved than either the archaeal or the bacterial homologs, which suggests an ancient duplication of this gene. The advantage of rooting in this manner is the monophyly of prokaryotes in the HSP70 phylogeny and the monophyly of bacteria in the HSP60 phylogeny.

4.6 Consistency with Endosymbiont Theory

Phylogenies from both HSP70 and HSP60 support the endosymbiont theory and indicate a single origin of mitochondria in all lineages, including secondarily amitochondrial organisms. There is no doubt that the mitochondrial copies of these two genes were derived from alphaproteobacteria, specifically from within the order Rickettsiales. This topology is robust no matter how the tree is rooted or how many species of alphaproteobacteria are included.

However, not all proteins of the HSP multigene family share the same evolutionary history. The phylogenies in this project are only single-gene trees; further genome-wide analysis would provide a broader view of the origin of eukaryotic cells.

4.7 Co-evolution of tRNA gene loss and Aminoacyl-tRNA Synthetase gene

Both tRNA-Gln and tRNA-Lys were excluded from the correlation test between tRNA loss and aminoacyl-tRNA synthetase sequence identity because of the complex evolutionary history of GlnRS and LysRS. In the correlation tests of tRNA loss against mitochondrial genome size and coding genome size, statistics were calculated both with and without tRNA-Gln and tRNA-Lys. The two sets of analyses gave the same result: neither mitochondrial genome size nor coding genome size is related to the total number of tRNA losses.

The ANOVA and Pearson correlation tests suggest that aminoacyl-tRNA synthetases with higher sequence identity between the bacterial and eukaryotic types do not necessarily tend to lose the corresponding tRNAs from the mitochondrial genome. Rather, aaRSs of a single origin that function in both mitochondria and cytoplasm are more likely to lose their corresponding tRNAs.

The minimum number of tRNA losses inferred by the parsimony algorithm will change as more mitochondrial genomes are released. Nevertheless, compared with Brindefalk's work in 2008 (Brindefalk, 2009), this study updated the database after three years, during which the total number of eukaryotic taxa almost doubled, from 914 to 1704 (Appendix, Table A1), and the conclusion still holds. This supports a co-evolution between the loss of tRNA genes and their aminoacyl-tRNA synthetases.

Acknowledgements

I thank my supervisor, Professor Siv Andersson, for her guidance at each step, for providing opportunities to learn various bioinformatic methodologies, and for trusting in my capability as a Master's student to complete such a large project and to address such a fundamental scientific question within six months.

I thank Professor Charles Kurland from Lund University for his interest in this project, for visiting my lab, for his practical and theoretical suggestions, and for the discussions even while he was traveling around the world.

I thank Johan Viklund for his magical computational tutorials and for the data mining work when we were solving the problems of tRNA loss.

I thank Thijs Ettema for the use of his computer, on which all my Bayesian trees were constructed, and for his suggestions and discussions.

I thank Lin Gen for helping me get started with everything in this lab.

Thanks to everybody in the Department of Molecular Evolution.

Finally, thanks to Nature Publishing Group and Springer for permission to reproduce some figures in this thesis.


References

Adams K. L., Palmer J. D., Evolution of mitochondrial gene content: gene loss and transfer to the nucleus. Mol Phylogenet Evol 2003; 29: 380-95.

Albani A. E., Bengtson S., Canfield D. E., et al. Large colonial organisms with coordinated growth in oxygenated environments 2.1 Gyr ago. Nature 2010; 466: 100-104.

Andersson S. G. E., Zomorodipour A., Andersson J. O., et al. The genome sequence of Rickettsia prowazekii and the origin of mitochondria. Nature. 1998; 396: 133-143.

Bapteste E., Charlebois R. L., MacLeod D. & Brochier C. The two tempos of nuclear pore complex evolution: highly adapting proteins in an ancient frozen structure. Genome Biol 2005; 6: R85.

Barton N. H., Briggs D. E. G., Eisen J. A., Goldstein D. B., Patel N. H., Evolution, Cold Spring Harbor Lab Press, New York. 2007; 8: 202-210.

Boorstein W. R., Ziegelhoffer T., Craig E. A., Molecular Evolution of the HSP70 Multigene Family. J Mol Evol 1994; 38: 1-17.

Brindefalk B., Mitochondrial and Eukaryotic Origins: a phylogenetic perspective. Uppsala University. 2009; 1: 15-20.

Brindefalk B., Viklund J., Larsson D., Thollesson M., Andersson S. G. E., Origin and evolution of the mitochondrial aminoacyl-tRNA synthetases. Mol Biol Evol. 2007; 24: 743-56.

Brochier-Armanet C., Boussau B., Gribaldo S., Forterre P., Mesophilic Crenarchaeota: proposal for a third archaeal phylum, the Thaumarchaeota. Nature Rev. Microbiol. 2008; 6: 245-252.

Brown J. R., Douady C. J., Italia M. J., Marshall W. E., Stanhope M. J., Universal trees based on large combined protein sequence data sets. Nature Genet. 2001; 28: 281-285.

Bui E. T. N., Bradley P. J., Johnson P. J., A common evolutionary origin for mitochondria and hydrogenosomes. Proc. Natl. Acad. Sci. USA. 1996; 93: 9651-9656.

Cavalier-Smith T., The phagotrophic origin of eukaryotes and phylogenetic classification of Protozoa. Int. J. Syst. Evol. Microbiol. 2002; 52: 297–354.

Clark C. G., Roger A. J., Direct evidence for secondary loss of mitochondria in Entamoeba histolytica. Proc. Natl. Acad. Sci. USA. 1995; 92: 6518-6521.

Collins L. & Penny D., Complex spliceosomal organization ancestral to extant eukaryotes. Mol. Biol. Evol. 2005; 22: 1053–1066.

Craig E.A., The heat shock response. CRC Crit Rev Biochem 1985; 18: 239-280.

Desmond E. & Gribaldo S., Phylogenomics of sterol synthesis: insights into the origin, evolution, and diversity of a key eukaryotic feature. Genome Biol. Evol. 2009; 2009: 364–381.

Donoghue P. C. J., Antcliffe J. B., Origins of multicellularity. Nature 2010; 466: 41-42.

Embley T. M., Multiple secondary origins of the anaerobic lifestyle in eukaryotes. Philos. Trans. R. Soc. Lond. B Biol. Sci. 2006; 361: 1055–106.

Eme L., Moreira D., Talla E. & Brochier-Armanet C., A complex cell division machinery was present in the last common ancestor of eukaryotes. PLoS ONE 2009; 4: e5021.

Field M. C. & Dacks J. B., First and last ancestors: reconstructing evolution of the endomembrane system with ESCRTs, vesicle coat proteins, and nuclear pore complexes. Curr. Opin.Cell Biol. 2009; 21: 4–13.

Germot A., Philippe H., Le Guyader H., Presence of a mitochondrial-type 70-kDa heat shock protein in Trichomonas vaginalis suggests a very early mitochondrial endosymbiosis in eukaryotes. Proc. Natl. Acad. Sci. U.S.A. 1996; 93: 14614-14618.

Germot A., Philippe H., Le Guyader H., Evidence for loss of mitochondria in microsporidia from a mitochondrial type HSP70 in Nosema locustae. Mol. Biochem. Parasitol. 1997; 87: 159-168.

Gissi C., Iannelli F., Pesole G., Evolution of the mitochondrial genome of Metazoa as exemplified by comparison of congeneric species. Heredity 2008; 101: 301-20.

Giezen M. & Tovar J., Degenerate mitochondria. EMBO Rep. 2005; 6: 525–530.

Gribaldo S., Poole A. M., Daubin V., Forterre P., Brochier-Armanet C., The origin of eukaryotes and their relationship with the Archaea: are we at a phylogenomic impasse? Nature Rev. Microbiol. 2010; 8: 743-752.

Gupta R. S., Aitken K., Falah M., Singh B., Cloning of Giardia lamblia heat shock protein HSP70 homologs: Implications regarding origin of eukaryotic cells and of endoplasmic reticulum. Proc. Natl. Acad. Sci. U.S.A. 1994; 91: 2895-2899.

Horner D. S., Hirt R. P., Kilvington S., Lloyd D., Embley T. M., Molecular data suggest an early acquisition of the mitochondrion endosymbiont. Proc. R. Soc. London Ser. B 1996; 263: 1053-1059.

Horst M. et al., Sequential action of two hsp70 complexes during protein import into mitochondria. EMBO J. 1997; 16: 2668-2677.

Huang L. H., Wang H. S., Kang L., Different evolutionary lineages of large and small heat shock proteins in eukaryotes. Cell Res 2008; 18: 1074-1076.

Jekely G., Small GTPases and the evolution of the eukaryotic cell. Bioessays 2003; 25: 1129–1138.

Johnston J. A., Ward C. L., Kopito R. R., Aggresomes: A cellular response to misfolded proteins. J Cell Biol 1998; 143: 1883-1898.

Kang P. J., et al., Requirement for hsp70 in the mitochondrial matrix for translocation and folding of precursor proteins. Nature 1990; 348: 137-143.

Karlberg O., Canback B., Kurland C. G., Andersson S. G. E., The dual origin of the yeast mitochondrial proteome. Yeast Comp. Functional Genomics 2000; 17: 170-187.

Kurland C. G., Andersson S. G. E., Origin and Evolution of the Mitochondrial Proteome. Microbiol. Mol. Biol. Rev. 2000; 64: 786-820.

Kurland C. G., Collins L. J., Penny D., Genomics and the irreducible nature of eukaryote cells. Science 2006; 312: 1011-1014.

Lake J. A., Henderson E., Oakes M., Clark M. W., Eocyte: a new ribosome structure indicates a kingdom with a close relationship to eukaryotes. Proc. Natl Acad. Sci. USA 1984; 81: 3786-3790.

Lithgow T., Schneider A., Evolution of macromolecular import pathways in mitochondria, hydrogenosomes and mitosomes. Phil. Trans. R. Soc. B 2010; 365: 799-817.

Macario A. J. L., Brocchieri L., Shenoy A. R., Macario E. C., Evolution of a Protein-Folding Machine: Genomic and Evolutionary Analyses Reveal Three Lineages of the Archaeal hsp70 (dnaK) Gene. J. Mol Evol. 2006; 63: 74-86.

Mans, B. J., Anantharaman, V., Aravind, L. & Koonin, E. V., Comparative genomics, evolution and origins of the nuclear envelope and nuclear pore complex. Cell Cycle 2004; 3: 1612–1637.

Margulis L., Origin of Eukaryotic Cells. Yale Univ. Press, New Haven, 1970.

Margulis L., Archaeal-eubacterial mergers in the origin of Eukaryota: phylogenetic classification of the life. Proc. Natl Acad. Sci. USA. 1996; 93: 1071-1076.

Mueller M., Enzymes and compartmentation of core energy metabolism of anaerobic protists: a special case in eukaryotic evolution. In Coombs G. H., Vickerman K., Sleigh M. A., Warren A. (ed.), Evolutionary relationships among protozoa. Kluwer Academic Publishers, London, U.K. 1998: 109-127.

Nakamura T. M. & Cech T. R., Reversing time: origin of telomerase. Cell 1998; 92: 587–590.

Oh H. M., Kwon K. K., Kang I., Kang S. G., Lee J. H., Kim S. J., Cho J. C., Complete Genome Sequence of “Candidatus Puniceispirillum marinum” IMCC1322, a Representative of the SAR116 Clade in the Alphaproteobacteria. J. Bacteriol. 2010; 192: 3240-3241.

Pfanner N., Geissler A., Versatility of the mitochondrial protein import machinery. Nature Rev. Mol. Cell Biol. 2001; 2: 339-349.

Poole A. M., Penny D. Evaluating hypotheses for the origin of eukaryotes. Bioessays 2007; 29: 74-84.

Ramesh M. A., Malik S. B. & Logsdon J. M. Jr., A phylogenomic inventory of meiotic genes; evidence for sex in Giardia and an early eukaryotic origin of meiosis. Curr. Biol. 2005; 15: 185–191.

Rivera M. C., Lake J. A., Evidence that eukaryotes and eocyte prokaryotes are immediate relatives. Science 1992; 257: 74-76.

Robertson C. E., Harris J. K., Spear J. R., Pace N. R., Phylogenetic diversity and ecology of environmental archaea. Curr. Opin. Microbiol. 2005; 8: 638-642.

Roy S. W. & Gilbert W., The evolution of spliceosomal introns: patterns, puzzles and progress. Nature Rev. Genet. 2006; 7: 211–221.

Searcy D. G., Stein D. B., Green G. R., Phylogenetic affinities between eukaryotic cells and a thermophilic mycoplasma. Biosystems 1978; 10: 19-28.

Searcy D. G., Metabolic integration during the evolutionary origin of mitochondria. Cell Res. 2003; 13: 229-238.

Schneider A., Does the evolutionary history of aminoacyl-tRNA synthetases explain the loss of mitochondrial tRNA genes? Trends Genet. 2001; 17: 557-558.

Schneider H. C., Westermann B., Neupert W., Brunner M., The nucleotide exchange factor MGE exerts a key function in the ATP-dependent cycle of mtHsp70-Tim44 interaction driving mitochondrial protein import. EMBO J. 1996; 15: 5796-5803.

Woese C. R., Kandler O., Wheelis M. L., Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc. Natl Acad. Sci. USA 1990; 87: 4576-4579.

Appendix

Table A1. Mitochondrial tRNA gene loss in different eukaryotic groups

Group               Number (a)    Genome size (b)    Coding size    tRNA-Tot (c)
Metazoa             1574          16,164             11,338         818
Fungi               54            43,357             18,123         97
Viridiplantae       36            252,019            37,725         174
Stramenopiles       17            42,438             31,096         23
Alveolata           8             15,402             10,596         149
Amoebozoa           5             52,314             35,291         40
Rhodophyta          3             31,600             23,296         3
Cryptophyta         2             54,308             31,932         0
Haptophyceae        1             29,013             16,494         0
Jakobida            1             69,034             55,920         1
Malawimonadidae     1             47,328             31,824         0
Choanoflagellida    1             76,568             24,969         0
Heterolobosea       1             49,843             40,608         5
Total               1704          -                  -              1310

(a) Number of genomes analyzed
(b) Average mitochondrial genome size of each group
(c) Total number of tRNA losses

Table A2. Re-annotation of tRNA according to anticodon

Taxa ID    Taxa name                                        Annotated tRNA    Anti-codon    Group
153609     Moniliophthora perniciosa                        XaatRNA           Terminal      Fungi
5755       Acanthamoeba castellanii                         XaatRNA           tRNA-Lys      Amoebozoa
416842     Saccharina ochotensis                            trnaX             tRNA-Lys      Stramenopiles
9913       Bos taurus (cattle)                              Sec_tRNA          tRNA-Trp      Metazoa
9615       Canis lupus familiaris (dog)                     Sec_tRNA          tRNA-Trp      Metazoa
9598       Pan troglodytes (chimpanzee)                     Sec_tRNA          tRNA-Trp      Metazoa
7668       Strongylocentrotus purpuratus (purple urchin)    Sec_tRNA          tRNA-Trp      Metazoa
5062       Aspergillus oryzae                               tRNA-Sec          tRNA-Trp      Fungi
8226       Katsuwonus pelamis                               Glx_tRNA          tRNA-Glu      Metazoa
8228       Euthynnus alletteratus                           Glx_tRNA          tRNA-Glu      Metazoa
8235       Thunnus alalunga                                 Glx_tRNA          tRNA-Glu      Metazoa


Figure A1. Phylogeny for HSP70 with increased number of alphaproteobacteria

Bayesian algorithm based on full sequence alignment, Dirichlet process for rates across sites (CAT model), LG distribution for relative exchangeability (exchange rate), rooted with endoplasmic reticulum copy of gene from eukaryotes.
