


DEGREE PROJECT IN BIOTECHNOLOGY, SECOND CYCLE, 30 CREDITS

STOCKHOLM, SWEDEN 2017

CRISPR/Cas9 mediated metabolic engineering of carbon fixation in cyanobacteria

LINNEA ÖSTERBERG


Acknowledgements

I want to thank my supervisor Paul Hudson and my co-supervisor Markus Janasch. You have given me unlimited support and encouragement. I also want to give special thanks to the master of cloning, Lun Yao, for your patience with all my questions. I want to thank Danuta Kaczmarzyk, Ivana Cengic and Kiyan Shabestary for providing me with many day-to-day tips and tricks for making my project work. I also want to thank Michael Jahn, Johannes Asplund-Samuelsson, Jan Karlsen, Da Wang and the rest of Gamma 5 for providing a fun and stimulating work environment.

Last, I want to thank Synechocystis sp. PCC 6803. Even though we will always have a love-hate relationship, I know that this thesis would not have been possible without you.

List of Abbreviations

aTc - anhydrotetracycline
Calvin cycle - Calvin-Benson-Bassham cycle
Cas - CRISPR associated
CRISPR - clustered regularly interspaced short palindromic repeats
CRISPRi - CRISPR interference
crRNA - CRISPR RNA
dDNA - donor DNA
E. coli - Escherichia coli
HR - homologous recombination
NHEJ - non-homologous end joining
PAM - protospacer adjacent motif
PCR - polymerase chain reaction
RBS - ribosome binding site
sgRNA - single-guide RNA
Synechocystis - Synechocystis sp. PCC 6803
tracrRNA - trans-activating crRNA


Contents

1 Introduction
2 Theoretical Background
2.1 Cyanobacteria
2.1.1 Photosynthesis and Carbon Fixation
2.1.2 Synechocystis sp. PCC 6803
2.2 Metabolic Engineering
2.2.1 Examples of Metabolic Engineering in Synechocystis
2.2.2 Systems Biology and Metabolic Engineering
2.2.3 Identifying Metabolic Control
2.3 Controlling Expression
2.3.1 Ribosome Binding Sites
2.4 Tools for Metabolic Engineering in Cyanobacteria
2.4.1 The CRISPR Toolbox
2.4.2 Shuttle Vectors for Cyanobacteria
2.4.3 Insulator
3 Materials and Methods
3.1 Tools and Software
3.2 Plasmid Construction
3.3 Bacterial Strains, Culturing Conditions and Transformations
3.4 CRISPR/Cas9
3.4.1 Cas9 DNA Cleaving Assay
3.4.2 System I
3.4.3 System II
4 Results
4.1 Design of Targeting and Donor Constructs
4.2 Two Systems for Cas9-editing
4.2.1 System I
4.2.2 System II
5 Discussion
5.1 Targets and Constructs
5.2 Systems for CRISPR/Cas9 editing
6 Future Perspectives
Appendices
A List of Figures
B List of Tables
C List of Plasmids
D List of Primers
E List of Strains
F Protocols
F.1 Golden Gate Assembly Protocol
F.2 Transformation Protocols
F.2.1 Electroporation
F.2.2 Natural transformation
F.3 Antibiotics and inducers
G Raw data
G.1 Plates


Abstract

Cyanobacteria can be used to produce various biofuels and chemicals, as so-called cell factories. However, carbon fixation is often a limiting step, and to improve carbon flux it is important to understand the control of the Calvin-Benson-Bassham cycle (Calvin cycle). This thesis suggests a way to utilize CRISPR/Cas9 to evaluate the controlling elements of the Calvin cycle.

In vivo characterization of the control over carbon flux can be achieved by systematically changing the ribosome binding site (RBS) of the enzymes in the Calvin cycle. By utilizing an engineered, ethanol-producing strain of Synechocystis sp. PCC 6803 (Synechocystis), one can correlate the change in productivity, growth rate and CO2 consumption to the change in abundance of the corresponding enzyme, and thus determine the control exerted by that enzyme.

Due to time restrictions, most constructs were not finished and none of the Synechocystis strains were characterized. However, it was found that expression of the CRISPR/Cas9 editing vector appears to be toxic in Escherichia coli. Moreover, using a two-vector system seems to pose a high mutational pressure in Synechocystis. Thus, this might not be an optimal approach for CRISPR/Cas9 editing in cyanobacteria.


Summary

Cyanobacteria can be used as cell factories to produce a variety of chemicals, some of which can be used as biofuels. However, carbon fixation is often a limitation, and to improve carbon flux it is important to understand how the control over the Calvin-Benson-Bassham cycle (Calvin cycle) works. This thesis proposes a way to use CRISPR/Cas9 to evaluate the controlling components of the Calvin cycle.

By systematically changing the ribosome binding site (RBS) of the enzymes involved in the Calvin cycle, the control of carbon flux can be characterized in vivo. If this is done in a strain of Synechocystis sp. PCC 6803 (Synechocystis) genetically modified to produce ethanol, the change in ethanol productivity, growth rate and CO2 consumption can be related to the amount of the respective enzyme and thereby to its control over the Calvin cycle.

Due to lack of time, not all constructs were completed and no Synechocystis variants were characterized. However, it was observed that the CRISPR/Cas9 editing vector appears to be toxic to Escherichia coli. In addition, a two-vector system for CRISPR/Cas9 editing appears to impose a mutational pressure in Synechocystis, and this system is therefore probably not optimal.


1 Introduction

The continuous increase of anthropogenic greenhouse gases is contributing to irreversible effects on the environment, partly due to the long-lasting effects of CO2 perturbation [57]. Atmospheric CO2 is one of the main greenhouse gases [46], and to manage these effects, strategies for CO2 mitigation have been suggested, including carbon sequestration and CO2-neutral technologies [33].

Against this background, the photosynthetic conversion of CO2 and H2O to products such as fuels and chemicals would be an ideal solution, as the current production of such compounds usually results in CO2 emission. The ability of cyanobacteria to fix CO2 through photosynthesis makes them promising hosts for a sustainable bioeconomy [4, 5].

Cyanobacteria use the Calvin-Benson-Bassham cycle (Calvin cycle) to fix CO2. This is, however, an energy-consuming process [23] and remains a challenge in the effort to turn cyanobacteria into efficient cell factories. Many attempts have been made to increase the efficiency of the Calvin cycle by focusing on the main CO2 fixation enzyme, RuBisCO. Despite large efforts, no substantial improvement has been achieved, suggesting that the cycle's limiting step lies elsewhere [18]. In addition, it has been shown that overexpression of the Calvin cycle enzyme sedoheptulose-1,7-bisphosphatase improved the metabolic flux of CO2 fixation in tobacco [51], further supporting this idea.

Figure 1: The Calvin cycle is one of six natural pathways known to fix CO2, and it is the primary pathway of plants, algae and cyanobacteria. It is the most energy-consuming CO2-fixing pathway and contains 11 enzymes and 13 reactions converting 3 CO2 to 1 glyceraldehyde 3-phosphate [23]. Dotted lines show reactions in proximal pathways where Calvin cycle intermediate metabolites serve as branch points, being substrates or products of those enzymes.


Convenient approaches for systematic in vivo characterization of the Calvin cycle have long been lacking in cyanobacteria. However, the development of CRISPR/Cas9 as a tool for cyanobacteria eases such characterizations. This thesis suggests how CRISPR/Cas9 can be used in cyanobacteria to systematically evaluate the controlling elements of the Calvin cycle by changing the ribosome binding sites (RBS), and hence the abundance, of the catalyzing enzymes. By using an engineered, ethanol-producing strain of Synechocystis sp. PCC 6803 (Synechocystis), one can correlate the change in productivity, growth rate and CO2 consumption to the change in abundance of the corresponding enzyme, thus achieving in vivo characterization of that enzyme's control over carbon flux.

2 Theoretical Background

2.1 Cyanobacteria

Cyanobacteria form a diverse group of oxygenic, phototrophic, gram-negative prokaryotes including the key genera Synechocystis, Synechococcus, Oscillatoria and Anabaena. Their morphology is diverse and includes unicellular forms dividing by binary fission, unicellular colonial forms, filamentous heterocystous and non-heterocystous forms, and branching filamentous forms. They all produce chlorophyll a, but as they have different types of pigments, their colors span from blue-green to red or brown. Cyanobacteria are widely found in terrestrial, freshwater and marine habitats, including extreme environments such as hot springs, saline lakes and desert soils. As well as being a key factor in oxygenating the atmosphere billions of years ago, they are also key factors in maintaining the biosphere today [38].

The ability to grow using sunlight, with CO2 as a carbon source and a minimal nutrient supply, has sparked interest in using cyanobacteria in biotechnological applications. Compared to other photosynthetic organisms such as plants and algae, they grow faster and are easier to engineer [25]. However, many strains of cyanobacteria have multiple copies of the genome, making it hard to genetically modify them in an efficient way. To make sure that the introduced genotype translates into the phenotype, the mutation must be introduced in all copies; otherwise its effect will be diluted by expression from the unedited genome copies. In addition, cyanobacteria are still outcompeted by other cell factories such as Escherichia coli (E. coli) and Saccharomyces cerevisiae in terms of titer and yield [72].

2.1.1 Photosynthesis and Carbon Fixation

Carbon enters the Calvin cycle through RuBisCO and is converted to glyceraldehyde 3-phosphate. To pass once through the Calvin cycle, 3 CO2, 9 ATP and 6 NADPH are consumed, resulting in the generation of 1 glyceraldehyde 3-phosphate, 9 ADP and 6 NADP+ [23].
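The stoichiometry above can be turned into a small bookkeeping sketch. The function and dictionary names below are illustrative assumptions and not part of the thesis; only the numbers 3 CO2, 9 ATP and 6 NADPH per glyceraldehyde 3-phosphate (G3P) are taken from the text.

```python
from fractions import Fraction

# Stoichiometry stated in the text: 3 CO2 + 9 ATP + 6 NADPH -> 1 G3P
CALVIN_INPUTS_PER_G3P = {"CO2": 3, "ATP": 9, "NADPH": 6}


def cofactor_demand(n_g3p):
    """Return the CO2, ATP and NADPH consumed to produce n_g3p molecules of G3P."""
    return {name: count * n_g3p for name, count in CALVIN_INPUTS_PER_G3P.items()}


def cost_per_co2():
    """Return the ATP and NADPH consumed per CO2 fixed, as exact fractions."""
    co2 = CALVIN_INPUTS_PER_G3P["CO2"]
    return {
        name: Fraction(count, co2)
        for name, count in CALVIN_INPUTS_PER_G3P.items()
        if name != "CO2"
    }


if __name__ == "__main__":
    print(cofactor_demand(2))  # demand for two G3P
    print(cost_per_co2())      # ATP and NADPH per CO2 fixed
```

Read this way, each fixed CO2 costs 3 ATP and 2 NADPH, which is the energy demand referred to in the Introduction.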

However, RuBisCO can also use O2 as a substrate in a process called photorespiration. This process is not only energy-wasting, but also results in the cell losing carbon as CO2. It can cause a loss of up to 25% of all fixed carbon, owing to RuBisCO's imperfect preference for carboxylation [11]. However, cyanobacteria have circumvented this problem by compartmentalizing the RuBisCO enzyme and enriching CO2 through the carbon concentrating mechanism [23, 72].

The carbon concentrating mechanism is a property of cyanobacteria that concentrates CO2 around RuBisCO. This is facilitated by the carboxysome, which is a sub-cellular compartment. The carboxysome has a positively charged surface, preventing O2 permeability and CO2 escape [23, 72].

The carboxysome imports the carbon as HCO3−. In the carboxysome, the HCO3− is converted into CO2 by carbonic anhydrase, thus concentrating carbon and keeping photorespiration at a low level [23, 72].

2.1.2 Synechocystis sp. PCC 6803

Synechocystis is a naturally competent [72], unicellular cyanobacterial strain [32] that was first isolated from a freshwater lake in 1968 and is part of the Pasteur Culture Collection. It is the most extensively studied strain of cyanobacteria [72], partly due to the early acquisition of the genome sequence and annotation in 1996 [31].

There are many similarities between Synechocystis and many higher plants. Furthermore, the photosynthetic proteins from Synechocystis can be heterologously expressed in plants [23]. Synechocystis grows under photoautotrophic, mixotrophic and heterotrophic conditions [72], and since it is much easier to cultivate and genetically modify than plants, it has become a model organism for studying and engineering photosynthesis [32].

In contrast to Synechococcus sp. PCC 7942, which is also often used as a model organism, Synechocystis has an approximately 40% larger genome and a more complex and flexible metabolism including more isoenzymes. In the Calvin cycle these include fructose-1,6-bisphosphatase and ribose-5-phosphate isomerase [29].

2.2 Metabolic Engineering

Metabolic engineering is the targeted use of recombinant DNA technologies to modify the metabolic network, either by redirecting existing pathways or introducing new ones, in order to increase the production of metabolites, produce metabolites non-native to the host organism or improve cell properties [59].

What characterizes metabolic engineering is the iterative process in which the impact of each introduced modification is rigorously analyzed at every step. This gives a cycle going from design to genetic modification, to characterization and back to design again, as shown in Figure 2. The results from the characterization can be used to improve models and understanding of the biological system, leading to improved design. Thus, the relationship between experimental and computational biology is very strong and is one of the hallmarks of metabolic engineering [59].

2.2.1 Examples of Metabolic Engineering in Synechocystis

As a cell factory, Synechocystis has been used for the production of commodity chemicals such as isoprene [35] and H2 [7] as well as biofuels, including fatty acids [36], fatty alcohols [49], and ethanol [22]. Ethanol was first produced in Synechococcus sp. PCC 7942 by introducing exogenous pyruvate decarboxylase and alcohol dehydrogenase, thus creating a novel pathway to ethanol [17]. To produce ethanol in Synechocystis only pyruvate decarboxylase is needed, since Synechocystis has endogenous alcohol dehydrogenase [22].

Figure 2: Metabolic engineering is characterized by working as a closed cycle that goes from design to genetic modification, to characterization and back to design again in order to improve or enable production of metabolites [59].

2.2.2 Systems Biology and Metabolic Engineering

The field of systems biology has greatly improved the strategies of metabolic engineering. A system is defined as a network of interdependent components making up the parts of a unified whole [63]. Metabolic networks make a perfect example of a system and as systems biology aims to examine biology as a system, the fields are tightly interlocked.

From systems biology, strategies for computer-aided modeling of metabolic networks have emerged. There are two common approaches: stoichiometry-based approaches and kinetics-based approaches. Of the stoichiometry-based approaches, flux balance analysis is the most common method. This method finds the steady-state solutions of fluxes where the stoichiometric model is constrained by measured fluxes [14]. When using constrained models, a solution space containing all feasible solutions is obtained, and the true state will be included in this space [60, 53]. In addition to stoichiometry, kinetics-based approaches also try to evaluate the relationship between flux and metabolites with respect to the underlying kinetic rate laws. However, many parameters are unknown or uncertain [14].

To circumvent this problem, parameter sampling frameworks have been developed [14]. A common framework is Monte Carlo sampling of the parameter space, which uses random sampling to achieve a probabilistic interpretation [60, 53].
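As a minimal sketch of the Monte Carlo idea, the example below samples two uncertain kinetic parameters and reports the spread of the resulting reaction rate. The Michaelis-Menten rate law, the log-uniform sampling and all parameter ranges are assumptions chosen for illustration; they are not the models or frameworks used in the cited work.

```python
import math
import random


def michaelis_menten(vmax, km, substrate):
    """Illustrative rate law (an assumption): v = Vmax * S / (Km + S)."""
    return vmax * substrate / (km + substrate)


def sample_log_uniform(rng, low, high):
    """Draw a value whose logarithm is uniformly distributed between low and high."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def monte_carlo_rates(n_samples, substrate, kcat_range, km_range, enzyme, seed=0):
    """Sample uncertain kcat and Km values and return the resulting rates."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    rates = []
    for _ in range(n_samples):
        kcat = sample_log_uniform(rng, *kcat_range)
        km = sample_log_uniform(rng, *km_range)
        # Vmax = kcat * [E], with [E] the enzyme abundance
        rates.append(michaelis_menten(kcat * enzyme, km, substrate))
    return rates


def summarize(rates):
    """Return the minimum, median and maximum of the sampled rates."""
    ordered = sorted(rates)
    return {"min": ordered[0], "median": ordered[len(ordered) // 2], "max": ordered[-1]}


if __name__ == "__main__":
    rates = monte_carlo_rates(1000, 1.0, (1.0, 100.0), (0.1, 10.0), enzyme=0.5)
    print(summarize(rates))
```

The output is a distribution of rates rather than a single value, which is what allows a probabilistic interpretation when the true parameters are unknown.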

2.2.3 Identifying Metabolic Control

Control has a very specific meaning in the context of metabolic engineering. The maximum reaction rate (Vmax) is dependent on enzyme abundance, and it follows from Equation (1) [11] that when the abundance of any enzyme in a pathway increases, the maximum rate of that reaction increases. As the law of mass action dictates, this will change the steady state of the system. This change in the system is described here with the term control [47].

$$V_{max} \overset{\mathrm{def}}{=} k_{cat}[E] \tag{1}$$

The flux-force relationship theorem relates the forward and backward fluxes (J⁺ and J⁻), and thereby the net flux (J⁺ − J⁻), to the Gibbs free energy (∆G) for chemical reactions working in a non-equilibrium steady state, as can be seen in Equation (2). Since these are the conditions for a chemical reaction in a cell, this theorem provides the framework in which the net flux can be related to the ∆G of enzyme reactions, thus connecting reaction rate and control to the thermodynamics of the reaction [9].

$$\Delta G = -RT \ln\frac{J^{+}}{J^{-}} \tag{2}$$

From the flux-force relationship, it follows that an enzyme catalyzing a reaction far from its equilibrium, having a large negative ∆G, uses the majority of its enzyme units to catalyze the forward reaction, while an enzyme catalyzing a reaction close to its equilibrium wastes many enzyme units on catalyzing the backward reaction as well [47].

Changing the abundance of an enzyme working far from its equilibrium point will have a large effect on the system, since the enzyme is mainly catalyzing the forward reaction, meaning that the enzyme has a large control over the system. In contrast, an enzyme catalyzing a reaction working close to its equilibrium point will have little control over the system, as this enzyme is also catalyzing the backward reaction. An increase in abundance will increase the backward flux as well as the forward flux, and thus has only a small effect on the net flux. This results in a small effect on the system [47].
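The argument can be made concrete by rearranging the flux-force relationship to J⁺/J⁻ = exp(−∆G/RT), so that the share of the forward flux that survives as net flux is 1 − exp(∆G/RT). The sketch below evaluates this for a few values of ∆G. The function names and the example ∆G values are illustrative assumptions; the default temperature corresponds to 28 °C.

```python
import math

R = 8.314e-3  # gas constant in kJ mol^-1 K^-1


def forward_backward_ratio(delta_g, temperature=301.15):
    """Return J+/J- for a reaction with Gibbs free energy delta_g (kJ/mol)."""
    return math.exp(-delta_g / (R * temperature))


def net_fraction(delta_g, temperature=301.15):
    """Return (J+ - J-)/J+, the share of forward flux that contributes to net flux."""
    return 1.0 - math.exp(delta_g / (R * temperature))


if __name__ == "__main__":
    for delta_g in (-0.5, -1.0, -5.0, -20.0):  # hypothetical values in kJ/mol
        print(delta_g, forward_backward_ratio(delta_g), net_fraction(delta_g))
```

For a reaction near equilibrium (for example ∆G = −1 kJ/mol) only about a third of the forward flux appears as net flux, whereas for ∆G = −20 kJ/mol nearly all of it does, which matches the reasoning about control above.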

Consequently, perturbing the system by systematically changing the enzyme abundances of a pathway will give information about the controlling elements of that pathway. This information can then be implemented in the models and greatly aid metabolic engineering strategies aimed at increasing the net flux of the pathway.

2.3 Controlling Expression

There are several ways of changing protein abundance. Going back to the central dogma, there are in principle three levels at which protein abundance can be controlled: the transcriptional, translational and post-translational levels.

For every level down, the dynamic range becomes more limited. Therefore, to achieve large effects, changing expression at the transcriptional level, e.g. by changing the promoter, might be preferred, since the enzyme abundance is then not limited by the abundance of mRNA. However, many genes are organized in operons, and changing the promoter will affect all genes in that operon. At the transcriptional level one can also change the copy number of the gene. This is, however, very limited, since it is only possible to achieve a multiple of the wild-type expression.

At the translational level one can target either the mRNA abundance or the translation efficiency. This can be done with RNA interference, ribozymes, codon optimization, introducing mRNA secondary structures etc. [44].


It has been shown that one can change protein abundance by targeting translational elongation efficiency through the addition of poly-A tracks to the target gene. This has been shown to decrease mRNA stability and protein expression [6]. However, this method only lowers the expression, making the wild-type expression profile a limiting factor.

2.3.1 Ribosome Binding Sites

The ribosome binding site (RBS) regulates the expression of individual genes in an operon [19]. However, the genetic context of the RBS also plays a role for both translation and transcription. Mutalik et al. evaluated the impact of part-junction interference for different genetic elements in E. coli. The interface between the UTR and the gene plays a big role in overall expression, mRNA abundance and translation efficiency. One reason for this could be RNA secondary structures. Sequences downstream of the promoter can regulate RNA polymerase promoter escape and hence also the promoter strength [43]. This means that changing the RBS could also alter promoter strength.

2.4 Tools for Metabolic Engineering in Cyanobacteria

2.4.1 The CRISPR Toolbox

The clustered regularly interspaced short palindromic repeat (CRISPR) system was first described in 1987 by Ishino et al.: “Five highly homologous sequences of 29 nucleotides were arranged as direct repeats with 32 nucleotides as spacing. [...] So far, no sequence homologous to these has been found elsewhere in prokaryotes, and the biological significance of these sequences is not known.” [28]

Since then, the discovery now known as the CRISPR/Cas9 system has come to dominate the field, and its impact can be illustrated by the exponential increase in publications. The number of results containing the word “CRISPR” in the National Center for Biotechnology Information (NCBI) PubMed database increased from 1 in 2002 to 2168 in 2016.

Mechanism of CRISPR The diverse endogenous CRISPR systems are reviewed by Sorek et al. [58]. The CRISPR system is native to archaea and bacteria, mediating a nucleic-acid-based adaptive immune system. Although the stages by which microorganisms acquire immunity are distinct and common to all CRISPR systems, the CRISPR loci and the proteins mediating these stages differ. The host cell acquires immunity by integrating a part of the invading species' genome into the CRISPR element of the chromosome. These parts are called protospacers and are often flanked by a short motif called the protospacer adjacent motif (PAM). From these CRISPR elements, a short RNA is generated in a process known as CRISPR RNA (crRNA) biogenesis. The crRNA is used by the cell to prime a CRISPR associated (Cas) protein to mediate the interference with the invading species [58].

The Cas gene families are named based on phylogenetic studies, which generated an orthologue-based classification system dividing the systems into types and subtypes [58]. As the dynamics of the CRISPR loci hampered the classification, a new classification system was introduced in 2015, dividing the systems into classes, types and subtypes. The new system is based on signature protein families and the architecture of the cas loci [39].

Derivatives of the type II CRISPR system, including Cas9 primed with sgRNA, are the most commonly used tools for genome editing. The endogenous system uses crRNA together with trans-activating crRNA (tracrRNA) to guide the Cas9 nuclease. However, these two RNAs can be replaced by a single-guide RNA (sgRNA) [13]. Cpf1 is another nuclease, from the type V CRISPR system, that only uses crRNA as a guide [73]. After a double-stranded break has been introduced, the cell uses either homologous recombination (HR) or non-homologous end joining (NHEJ) to repair the break. When HR is used, one can modify or introduce new DNA by supplying a donor DNA (dDNA) that the cell can use as a template for repair. When NHEJ is employed, the repair process is error-prone and will introduce deletions or insertions [13].

The NHEJ pathway has been extensively characterized in eukaryotes but is not as well studied in prokaryotes [48, 55]. This pathway was long considered unique to eukaryotes. This was discovered to be untrue after bioinformatic studies identified ATP-dependent ligases in bacteria, which led to studies confirming this pathway in mycobacteria [48, 55]. It is not clearly established whether cyanobacteria carry this pathway or not.

CRISPR/Cas9 Specificity CRISPR/Cas9 is the best characterized system for use in genetic engineering. The target specificity of the CRISPR/Cas9 system has been reviewed by Wu et al. [67] and is an important and often critical factor when using this tool. There are several factors that affect the specificity. First, the PAM sequence, directly flanking the 3’ end of the target sequence, is very important for specificity. For the Cas9 derived from Streptococcus pyogenes this sequence is NGG. In addition, the 10-12 nucleotides proximal to the PAM have been identified as especially important for specificity. This region is called the seed region. Furthermore, the abundance of the Cas9/sgRNA complex also influences the specificity: the higher the abundance, the lower the specificity [67].
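The PAM and seed rules described above can be sketched as a simple sequence scan. This is an illustrative example and not the CasOT tool used in this work; the function names, the 20 nt protospacer length, the 12 nt seed length and the restriction to the letters A, C, G and T are assumptions. The cut position three nucleotides upstream of the PAM follows the description of S. pyogenes Cas9.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence containing only A, C, G, T."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def find_targets(seq, protospacer_len=20, seed_len=12):
    """Find protospacers that are directly followed by an NGG PAM on the given strand."""
    seq = seq.upper()
    targets = []
    for pam_start in range(protospacer_len, len(seq) - 2):
        # NGG: any base followed by two guanines
        if seq[pam_start + 1:pam_start + 3] == "GG":
            protospacer = seq[pam_start - protospacer_len:pam_start]
            targets.append({
                "pam_start": pam_start,
                "pam": seq[pam_start:pam_start + 3],
                "protospacer": protospacer,
                "seed": protospacer[-seed_len:],  # PAM-proximal part of the target
                "cut_site": pam_start - 3,        # first base after the break
            })
    return targets


def find_targets_both_strands(seq):
    """Scan the given strand ('+') and its reverse complement ('-')."""
    seq = seq.upper()
    return {"+": find_targets(seq), "-": find_targets(reverse_complement(seq))}


if __name__ == "__main__":
    example = "A" * 20 + "TGG" + "A" * 5  # hypothetical sequence with one PAM
    print(find_targets_both_strands(example))
```

Positions on the "-" strand are reported relative to the reverse-complemented sequence, so they would need to be converted before being compared with coordinates on the "+" strand.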

CRISPR in Cyanobacteria One of the challenges in genetic engineering of cyanobacteria is their multiple chromosome copies. When introducing a change in the genome, one needs to make sure that the change is incorporated in all copies. This is one of the reasons why the introduction of the CRISPR/Cas9 system for gene editing in cyanobacteria has opened many doors in the context of metabolic engineering.

So far, CRISPR-based systems have a short history in cyanobacteria. CRISPR interference (CRISPRi) is a technique utilizing a Cas9 protein lacking nuclease activity in order to repress genes. This was the first tool introduced for cyanobacteria, in 2016 [70], and shortly after, Cas9 with endonuclease activity was implemented for genome editing [66]. Furthermore, in the same year a CRISPR system utilizing Cpf1 as a nuclease was also implemented for cyanobacteria [64].

Since its introduction, CRISPRi has been used for titratable repression, demonstrated by controlling the flux of carbon fixation in Synechococcus sp. PCC 7002 [24], and for succinate production in Synechococcus elongatus PCC 7942 [26]. CRISPR/Cas9 has also been used for succinate production [34].


2.4.2 Shuttle Vectors for Cyanobacteria

A shuttle vector is a vector that can replicate and be maintained in more than one organism [15]. This way, cloning can be done in a fast-growing organism with well-established techniques, such as E. coli, and the finished construct can then be transformed into the target organism. The available shuttle vectors for cyanobacteria were reviewed in 2012 by Wang et al. Seven representative vectors were available, of which RSF1010 derivatives were used with Synechocystis as host [65]. These plasmids are generally not stable, and antibiotic pressure is needed to maintain the plasmid [65, 72].

Plasmid pPMQAK1 The plasmid pPMQAK1 is a broad-host-range replicating vector derived from RSF1010 to be compatible with the use of BioBricks and is known to replicate in both E. coli and Synechocystis [27]. This allows constructs to be assembled in E. coli and then transformed into Synechocystis. However, both RSF1010 and pPMQAK1 encode the protein MobA, which is needed for conjugative DNA transfer [21]. As part of a complex, MobA nicks plasmid DNA and binds covalently to the single strand [42]. The bound protein and the single-stranded DNA copies, being denatured during cell lysis, may interfere with enzymatic reactions and cause low yields during cloning [62].

2.4.3 Insulator

When introducing a genetic element, its function is dependent on the genetic, cellular and environmental context. This is a problem when predicting how the element is going to work. To address this problem, Lou et al. [37] evaluated how ribozymes with endonuclease activity could be used as genetic insulators to buffer a genetic element from its genetic context. RiboJ is one of these insulators and works by cleaving itself at its 3’ end. This way the 5’UTR of an mRNA can be removed, thus isolating the genetic element from its genetic context on a translational level [37]. The usage of RiboJ has been shown to strengthen the correlation between actual expression and predicted expression for different genetic contexts in E. coli [37]. Later, RiboJ has been used in Synechocystis [20, 62].

3 Materials and Methods

3.1 Tools and Software

The following tools and software were used: CyanoBase [1], CasOT tool [68], QuickFold [40], Geneious R6, Tm calculator from Thermo Fisher Scientific Inc, Primer Blast from NCBI [71], GeneArt Primer and Construct Design Tool from Thermo Fisher Scientific Inc, Ligation Calculator from the 2011 iGEM UT Dallas team [2], KiNG (Kinemage, Next Generation), ChemDrawBio and POPPY (submitted manuscript, https://github.com/Asplund-Samuelsson/POPPY).


3.2 Plasmid Construction

System I The sgRNAs were constructed with overlap extension PCR followed by restriction cloning into the pPMQAK1 vector containing Cas9 originating from S. pyogenes under tetR repression. The dDNA for rbcL was cloned into a non-replicating plasmid using Golden Gate Cloning. Protocols, primers used and resulting plasmids are listed in Appendices C-F.

System II All plasmids were created from a template of sgRNA and dDNA that was ordered as a construct in a plasmid from Thermo Fisher Scientific Inc. The template was diversified to contain RBS of different strengths by overlap extension PCR, followed by restriction cloning into the pPMQAK1 vector containing Cas9 under tetR repression. The sequences and the different strengths of the RBS can be seen in Table 1. Primers used and resulting plasmids are listed in Appendices C-D.

Table 1: Standard part Ribosome Binding Sites for Synechocystis

Part        Sequence        Expression
BBa_B0032   TCACACAGGAAAG   Low
BBa_B0034   AAAGAGGAGAAA    Strong
BBa_B0064   AAAGAGGGGAAA    Medium

Englund et al. have systematically evaluated different promoters and ribosome binding sites in Synechocystis sp. PCC 6803. These ribosome binding sites showed a low, medium and strong expression profile when evaluated using YFP and BFP [19].

3.3 Bacterial Strains, Culturing Conditions and Transformations

Cloning was done in the E. coli strains 10-beta [45], CopyCutter [52] and XL1-Blue. All E. coli strains were cultured at 37 °C in LB medium, either in liquid culture or on agar plates, supplemented with antibiotics for selection. All Synechocystis strains were cultured in a climate chamber (Percival Climatics SE-1100) at 28 °C, 50 µE s⁻¹ m⁻² illumination and 1% v/v CO2 in BG-11, either in liquid culture or on agar plates, supplemented with antibiotics for selection and anhydrotetracycline (aTc) as inducer.

Transformation of replicating plasmids into Synechocystis was performed using electroporation. Non-replicating plasmids were transformed using natural transformation. Transformation protocols can be found in Appendix F.

3.4 CRISPR/Cas9

3.4.1 Cas9 DNA Cleaving Assay

Colonies were taken from plates, inoculated into BG-11 liquid culture supplemented with antibiotics for selection and grown until visibly green. Then 5 µL of culture was spotted on BG-11 plates supplemented with antibiotics and a double concentration of inducer, and on BG-11 plates with antibiotics only.


3.4.2 System I

The dDNA was transformed using natural transformation and plated on plates supplemented with aTc.

3.4.3 System II

After electroporation, half of the transformed cells were plated as normal on antibiotic plates. A single colony was resuspended in 0.5 mL MilliQ water and plated on plates containing aTc.

The other half of the transformed cells were cultured in liquid culture supplemented with antibiotics for selection. When the liquid culture was visibly green, the cells were plated on plates containing aTc.

4 Results

4.1 Design of Targeting and Donor Constructs

When choosing which enzymes to target in the Calvin cycle there are many things to consider.

The target list was based on network analysis modeling, proteomics data, mRNA data and literature. The targets were decided based on the following assumptions: 1) Enzymes working far from equilibrium have a larger control over the system and are thus more interesting for this study. This was evaluated from network analysis data using POPPY. 2) The isoenzyme having the biggest effect is the one with the largest relative abundance based on proteomics data. 3) Enzymes for which mRNA data and proteomics data correlate are less likely to be regulated at a post-transcriptional level, so changing the RBS will have the intended effect on protein abundance. To see how the results from this study compare to previous studies, it is also of interest to include well-studied enzymes such as RuBisCO. The resulting list of targeted genes can be seen in Table 2.

Table 2: Genes targeted by CRISPR/Cas9

Gene name   Gene product                                                  Gene locus
rbcL        Ribulose bisphosphate carboxylase large subunit               slr0009
rbcS        Ribulose bisphosphate carboxylase small subunit               slr0012
pgk         Phosphoglycerate kinase                                       slr0394
gap2        NAD(P)-dependent glyceraldehyde-3-phosphate dehydrogenase     sll1342
tpi         Triosephosphate isomerase                                     slr0783
fbaA        Fructose-bisphosphate aldolase, class II                      sll0018
fbpI        Fructose-1,6-/sedoheptulose-1,7-bisphosphatase                slr2094
tktA        Transketolase                                                 sll1070
rpiA        Ribose 5-phosphate isomerase                                  slr0194
prk         Phosphoribulokinase                                           sll1525
talB        Transaldolase                                                 slr1793

This table lists the targeted genes, their gene products and their gene loci. All genes code for enzymes or subunits of enzymes in the Calvin cycle.


Preferably, an sgRNA target site would be found in the 5’UTR, but since that is not always possible, the beginning of the target gene was also included in the search. For each gene, between 6 and 39 PAM sites were found. Out of those, the sgRNA target closest to the ribosome binding site was chosen, provided that it had an acceptably low risk of off-target effects. This means that the most similar off-target site had at least two mismatches in the seed region. Otherwise, the sgRNA target with the lowest risk was chosen. Figure 3 shows how the sgRNA binds to the 5’UTR and where the double-stranded break would occur, and Table 3 lists the targets together with the distance between the PAM sites and the gene start codon.
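The PAM search described above can be illustrated with a minimal sketch. It is not the tool used in this thesis: the function names, the 20 nt spacer length and the sign convention for the offset (PAM position minus start codon position, on the forward strand) are assumptions made for illustration, and off-target screening is not included.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_targets(seq, start_codon_index, spacer_len=20):
    """List candidate Cas9 targets (protospacer followed by an NGG PAM).

    Both strands are scanned. `pam_offset` is the forward-strand index of the
    PAM's N base minus `start_codon_index` (an assumed convention).
    Hits are sorted by how close the PAM lies to the start codon.
    """
    seq = seq.upper()
    # Lookahead so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            pam_start = match.start() + spacer_len
            if strand == "-":
                # Map the index back to forward-strand coordinates.
                pam_start = len(seq) - 1 - pam_start
            hits.append({
                "strand": strand,
                "spacer": match.group(1),
                "pam": match.group(2),
                "pam_offset": pam_start - start_codon_index,
            })
    return sorted(hits, key=lambda hit: abs(hit["pam_offset"]))
```

In practice, the input would be the 5’UTR plus the beginning of the gene, and each returned spacer would still need to be checked against the genome for similar sites before it is chosen.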

Figure 3: Cas9 forms a complex with the sgRNA, which guides the protein to the corresponding site in the genomic DNA. Cas9 is a nuclease and makes a double-stranded break three nucleotides upstream of the PAM sequence, which is NGG for this particular system. In this figure the sgRNA is targeting the 5’UTR of the RuBisCO large subunit gene. The Cas9 structure is based on PDB:4OO8 [54].

The dDNA needs two homology arms, one on each side of the target region, to be successfully integrated into the genome. The homology regions surrounding the RBS site are designed to be 400 bp on each side. Since the ribosome binding site is not annotated for many of the genes in Synechocystis, 15 bp are removed at the 5’ end of the gene to at least disrupt the wild-type RBS, and are replaced with RiboJ and one of the standard RBS parts BBa_B0032, BBa_B0064 and BBa_B0034 shown in Table 1. These standard RBSs have been characterized to give a low, strong and medium expression profile in Synechocystis [19].
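The donor layout described above can be sketched as a simple string operation. This is an illustration only: the function name is hypothetical, the 15 bp are assumed to lie immediately upstream of the start codon, and the insert (RiboJ plus an RBS) is passed in as a placeholder string rather than a real part sequence.

```python
def build_donor(genome, start_codon_index, insert, arm_len=400, removed_len=15):
    """Assemble left homology arm + insert + right homology arm.

    Assumption: the `removed_len` bases directly upstream of the start codon
    are replaced by `insert`; the right arm begins at the start codon.
    """
    cut_start = start_codon_index - removed_len
    if cut_start - arm_len < 0 or start_codon_index + arm_len > len(genome):
        raise ValueError("not enough flanking sequence for the homology arms")
    left_arm = genome[cut_start - arm_len:cut_start]
    right_arm = genome[start_codon_index:start_codon_index + arm_len]
    return left_arm + insert + right_arm
```

With the default settings the result is 800 bp of homology plus the length of the insert, which gives a quick check on the expected size of the construct.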

To remove the target site in the dDNA, synonymous mutations were introduced to remove the PAM site or to introduce mismatches in the seed region. When possible, the codon preference of the removed codon was matched. Unwanted restriction sites were also removed. Figure 4 gives an overview of the dDNA design together with the genomic DNA.
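As a rough illustration of how synonymous PAM-removing mutations can be enumerated, the sketch below lists single-base changes to either G of an NGG PAM that leave the encoded amino acid unchanged. It assumes a forward-strand PAM inside an in-frame coding sequence and uses the standard amino acid assignments; the function name is hypothetical, and codon preference and restriction sites are not considered.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_pam_edits(cds, pam_index):
    """Return (position, old_base, new_base) edits that destroy an NGG PAM.

    `cds` must start in frame at index 0 and `pam_index` is the index of the
    N base of a forward-strand NGG PAM. Only silent changes are returned.
    """
    if cds[pam_index + 1:pam_index + 3] != "GG":
        raise ValueError("no NGG PAM at the given index")
    edits = []
    for pos in (pam_index + 1, pam_index + 2):  # the two G positions
        codon_start = pos - pos % 3
        codon = cds[codon_start:codon_start + 3]
        if len(codon) < 3:
            continue  # incomplete codon at the end of the sequence
        for base in "ACT":
            mutated = codon[:pos % 3] + base + codon[pos % 3 + 1:]
            if CODON_TABLE[mutated] == CODON_TABLE[codon]:
                edits.append((pos, cds[pos], base))
    return edits
```

A PAM in the 5’UTR is not constrained by codons at all, so this check only matters for targets that fall within the gene.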


Table 3: Proximity of sgRNA to target

Gene name   Distance of PAM from start codon   Gene locus
rbcL        +1                                 slr0009
rbcS        -18                                slr0012
pgk         -5                                 slr0394
gap2        +26                                sll1342
tpi         -47                                slr0783
fbaA        -35                                sll0018
fbpI        +14                                slr2094
tktA        +16                                sll1070
rpiA        +16                                slr0194
prk         +22                                sll1525
talB        -7                                 slr1793

This table shows the distance of the PAM site from the start codon of the gene. CRISPR/Cas9 makes a double-stranded cut three nucleotides upstream of the PAM site. The targets, namely the RBSs of the genes, are often not annotated. However, we assume that they are located within 15 nucleotides of the gene start codon.

Figure 4: The donor DNA is designed so that the regions flanking the RBS are homologous to the genomic DNA.

4.2 Two Systems for Cas9-editing

Two systems were tested for Cas9-editing. System I has Cas9 and the sgRNA on a replicating vector and the dDNA on a non-replicating vector. This reduces the amount of cloning, since the same sgRNA-Cas9 vector can be used for several RBS mutants. System II has the dDNA, sgRNA and Cas9 on the same replicating vector. This system reduces the mutational pressure on Synechocystis, since the dDNA is provided from the beginning and leaky expression therefore will not lead to cell death.

4.2.1 System I

The sgRNA was created with overlap extension PCR, see Figure 5. The band corresponding to the sgRNA fragment of 413 bp is clearly visible, but there is also unspecific binding yielding a band of around 300 bp.

Figure 5: The sgRNA was created by overlap extension PCR. The sgRNAs are labeled by their corresponding targets. Primers are designed to yield a product of 413 bp; however, some samples show an additional band at around 300 bp.

The sgRNA was cloned into the vector pPMQAK1 containing Cas9. The Cas9 was put under a tetR repression system, see Figure 6. When transformed into the K12-derived E. coli strain 10-beta, 52.2% of the sequenced colonies had large deletions, always affecting the tetR gene: 23 colonies were sequenced and 12 of them showed a large deletion. Two hot spots were found for the start of the deletion, one upstream of the sgRNA promoter and one upstream of the tetR promoter. All deletions result in the loss of functioning genes for tetR and Cas9.

Figure 6 shows the deletions in the 10-beta strain. When moving to the CopyCutter strain, 5 colonies were sequenced and none of them showed deletions. However, the colony PCR had been optimized by then, which means that only samples with a positive colony PCR were sequenced. Thus, the sequencing data from 10-beta and CopyCutter are not entirely comparable. In the colony PCR, only 2 out of 10 colonies showed an indication of deletion, which gives 20% suspected deletions compared to the 52.2% confirmed deletions in the 10-beta strain.
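The arithmetic behind these frequencies, and one way to gauge how much weight such small counts can carry, is sketched below. The Fisher exact test is an addition for illustration and was not part of the original analysis; it also inherits the caveat above that confirmed deletions (sequencing) are being compared with suspected deletions (colony PCR).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, total = a + b, c + d, a + c, a + b + c + d

    def prob(k):
        # Hypergeometric probability of k "deletion" counts in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / comb(total, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )


# Counts taken from the text: deletions vs. no deletions.
ten_beta = (12, 11)    # 23 colonies sequenced
copycutter = (2, 8)    # 10 colonies screened by colony PCR

freq_ten_beta = 100 * ten_beta[0] / sum(ten_beta)
freq_copycutter = 100 * copycutter[0] / sum(copycutter)
p_value = fisher_exact_two_sided(*ten_beta, *copycutter)
print(f"10-beta: {freq_ten_beta:.1f}%  CopyCutter: {freq_copycutter:.1f}%  p = {p_value:.3f}")
```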

The replicating plasmid was transformed into Synechocystis using electroporation. To test whether the sgRNA targets the gene, Cas9 was induced. Induction of Cas9 should result in cell death. However, as can be seen in Figure 7, some cells died on both plates and some cells survived on both plates. These results are summarized in Table 4. Colony PCR of the plasmid, using two different primer pairs, failed to yield any product for the constructs with negative results.

Figure 6: Deletions were found in the 10-beta strain transformed with the constructs including the sgRNA and the Cas9 under tetR repression. Two hot spots for deletions were found. The deletion starts at the 3’ end of the sequence displayed. The figure shows the size of the deletions and an overview of the constructs on the plasmid, where the start sites of the deletions are displayed.

Figure 7: The cells grown on plates with and without inducer. The cells grown on the aTc+ plate should die and the cells grown without the inducer should survive. On these plates positive (+), negative (-) and inconclusive (X) results can be seen.

The dDNA for rbcL with high and low expression RBS was assembled using Golden Gate Assembly and confirmed by Sanger sequencing. The plasmid was transformed by natural transformation into Synechocystis containing a replicating plasmid with the sgRNA targeting rbcL. The cells were plated on plates with and without aTc and, as can be seen in Figure 8, there is a clear selection pressure on the induced cells where colonies are formed. However, there is no clear difference between the cells transformed with the dDNA and those that did not receive the dDNA.

4.2.2 System II

To reduce the amount of cloning, the sgRNA and the donor DNA containing BBa_B0034 were synthesized with BioBrick cloning sites flanking the constructs. These constructs were to be diversified with the other RBS versions using overlap PCR, cloned into the pPMQAK1 plasmid containing Cas9 under tetR repression and transformed into Synechocystis. However, due to time restrictions this was not completed; a summary of the progress can be seen in Table 5.

Table 4: Transformation of pPMQAK1 and induction of Cas9 in Synechocystis.

Gene name   CFU      Death by induction   Survives induction
rbcL        21       2                    0
rbcS        10       0                    7
pgk         31       0                    4
gap2        > 100    1                    1
tpi         > 100    4                    1
fbaA        2        0                    1
fbpI        2        0                    2
tktA        > 100    2                    0
rpiA        2        0                    1
prk         > 1000   2                    0
talB        > 1000   0                    2

Figure 8: Induced and non-induced Synechocystis containing the sgRNA and Cas9 under tetR repression on a replicating plasmid, with and without dDNA. There is a clear difference between the induced and the non-induced cells. However, there is no significant difference between the cells with the dDNA and the cells without dDNA.

5 Discussion

In this thesis, two systems for using CRISPR/Cas9 to systematically edit the RBS of the enzymes in the Calvin cycle were designed, and their induction was tested in an engineered, ethanol-producing strain of Synechocystis. The aim was to characterize the resulting strains by chemostat cultivation and to correlate the change in productivity, growth rate and CO2 consumption with the change in abundance of the corresponding enzyme, in order to achieve an in vivo characterization of each enzyme's control over carbon flux.


Table 5: Progress of System II

Gene name   PCR product   pPMQAK1   Synechocystis CFU
rbcL        No            No        NaN
rbcS        Yes           Yes       > 10
pgk         Yes           No        NaN
gap2        No            No        NaN
tpi         No            No        NaN
fbaA        No            No        NaN
fbpI        Yes           No        NaN
tktA        No            No        NaN
rpiA        No            No        NaN
prk         Yes           No        NaN
talB        No            No        NaN

Due to time restrictions, System II was not tested to completion. The PCR products are diversified versions of the template, which was ordered as constructs on a cloning plasmid. These are then cloned into the pPMQAK1 vector containing Cas9 under tetR repression. That plasmid is then transformed into Synechocystis using electroporation.

Due to time restrictions and difficulties in cloning, the aim was not fulfilled and only a few constructs were transformed into Synechocystis. The project will be continued by the lab but will unfortunately not be included in this thesis.

5.1 Targets and Constructs

Isoenzymes When choosing the targets, three assumptions were made. One of them was that the isoenzyme with the highest abundance would have the largest activity. There are many isoenzymes related to the Calvin cycle; however, the role of these isoenzymes has not been established. Depending on the kinetic parameters of an isoenzyme, the abundance of the protein translates into different activities. For example, an enzyme working far from its equilibrium point can have the same net activity as a highly abundant enzyme working close to its equilibrium point. Consequently, whether this assumption holds depends on the role of the isoenzymes in the Calvin cycle. Some are believed to be regulated together to be active under different growth conditions, such as high light versus low light [61], but it could also be that the enzymes are compartmentalized and are active in different parts of the cell. The isoenzymes can have different tendencies for the forward or reverse reaction, thus specifying which reactions in the metabolism they catalyze. It could also be that changing the abundance of one isoenzyme causes the other to be more active, thus buffering the system and making the reactions more robust. This is supported by Jablonsky et al., who made an in silico comparative study of Synechocystis and Synechococcus sp. PCC 7942 and determined that the higher number of isoenzymes in Synechocystis allowed for a higher homeostatic stability of metabolites in response to changes in gene expression, indicating enzymatic regulation [29].

Whether this assumption about isoenzymes is true or not is not crucial for this study. Either way, the results would yield useful information to improve the kinetic models of carbon fixation and help us understand the role of these isoenzymes.

Connecting Pathways It is also interesting to consider evaluating the enzymes connecting the Calvin cycle to adjacent pathways. Some of the enzymes in the adjacent pathways share metabolites with the Calvin cycle, both as substrates and as products, and thus influence the flux. It is possible that the controlling step of the Calvin cycle actually lies outside the pathway itself.

Donor DNA A previous study in the cyanobacterium Synechococcus elongatus PCC 7942 suggested that homology arms of 400 bp were sufficient for effective homologous repair [34], and therefore the homology region surrounding the RBS site was set to 400 bp. In the cases where the donor DNA was made by Golden Gate Cloning, the GC content at the ends of the homology arms had to be considered, and the arms might therefore differ in size between constructs.

There are several reasons for wanting to reduce the size of the dDNA. In System I and System II the dDNA is made by PCR. A high-fidelity polymerase is used when making the constructs, yet the polymerase can still incorporate the wrong nucleotide, and the risk of errors increases with length. In addition, when the dDNA, sgRNA and Cas9 under tetR repression are on the same plasmid, the plasmid becomes around 13 kb. This large size affects the transformation efficiency, the plasmid stability and plasmid segregation [56]. Considering the problems already discussed with the cloning efficiency of the pPMQAK1 vector due to the mobility proteins, any additional reduction in transformation efficiency due to size needs to be limited as much as possible.

5.2 Systems for CRISPR/Cas9 editing

Regarding the setup for CRISPR/Cas9 editing used in this study, there were two interesting findings: 1) tetR might be toxic to E. coli under certain conditions and 2) the mutational pressure on the cell is very high when using System I. These findings can be useful for improving the implementation of CRISPR/Cas9 as a tool for cyanobacteria.

TetR Could be Toxic in E. coli When cloning the construct in the 10-beta strain, several unique deletions were observed. They all affected the tetR gene and the Cas9 gene, making these proteins non-functional. The number of unique deletions all affecting the same genes indicates that they are related to the function of the deleted genes and not to an unstable plasmid. That the problem is reduced when the copy number of the plasmid is reduced also indicates that something expressed from the plasmid is problematic for the cell.

One reason for this could be that the sgRNA has a target in the E. coli genome and thus provides a selection pressure for mutants that have deleted the Cas9 gene. A BLAST search including possible PAM sites showed that the highest similarity between an sgRNA target and the E. coli K12 genome is for the sgRNA targeting fbaA, with one mismatch in the seed region and four in the non-seed region. This should yield a very low risk of off-target effects, especially when Cas9 is under tight repression and only protein from leaky expression is present. Additionally, the mutations were not limited to the fbaA construct, meaning that the other affected constructs posed an even lower risk of off-target effects. Thus, this explanation is not consistent with the current information about CRISPR/Cas9 specificity and is unlikely to be the cause of these deletions.
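The mismatch counts quoted above can be obtained with a small helper like the one sketched here. The function name is hypothetical, and the seed length of 12 PAM-proximal nucleotides is an assumption, since the text does not state which seed definition was used.

```python
def count_mismatches(spacer, candidate, seed_len=12):
    """Return (seed_mismatches, non_seed_mismatches) for two protospacers.

    Both sequences are written 5' to 3' with the PAM following the 3' end,
    so the seed is taken as the last `seed_len` bases (assumed length).
    """
    if len(spacer) != len(candidate):
        raise ValueError("sequences must have the same length")
    split = len(spacer) - seed_len
    non_seed = sum(a != b for a, b in zip(spacer[:split], candidate[:split]))
    seed = sum(a != b for a, b in zip(spacer[split:], candidate[split:]))
    return seed, non_seed
```

Applied to each BLAST hit that is followed by a PAM, this yields pairs such as the one seed and four non-seed mismatches reported for the fbaA sgRNA.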

Another explanation for these deletions could lie in the tetR gene. TetR is part of a family of regulators characterized by a conserved helix-turn-helix motif [50]. These tetR family proteins regulate many anabolic functions, like amino acid synthesis, but also efflux pumps, which can have a large effect on drug resistance [50]. The specificity with which these proteins bind to their operators in the DNA is believed to be determined mainly by four residues [50]. TetR shows a high specificity, leading to efficient repression [12].

When the copy number of the plasmid is lowered, the deletion frequency goes down. Consequently, if the tetR gene is the problem, the deletion frequency goes down with lower expression of the tetR repressor protein. This could be due to metabolic pressure, off-target effects or interference with endogenous regulation.

The copy number of the pPMQAK1 vector in E. coli should be similar to that of the RSF1010 vector it is derived from, since the replicon is identical [27]. RSF1010 has been reported to have an average copy number of 12 copies per cell [21]. The size of a plasmid is known to reduce its copy number [56], and since the RSF1010 vector is 8684 bp [21], the pPMQAK1 plasmid is expected to have a lower copy number. It has been shown that the replicative stability of the RSF1010 replicon is reduced with the copy number [27]. CopyCutter reduces the copy number of the plasmid even further.

TetR is under a strong promoter and thus highly expressed. It could be that the metabolic pressure is making the plasmid unstable. The metabolic pressure of a plasmid has consequences for growth rate [10], and the reduced metabolic pressure in the mutants gives them a growth advantage. Since the copy number is already low, the deletions are unique and the cells have only grown for one hour in liquid culture, this seems like an unlikely explanation. If this explanation were true, one would expect a lower frequency, or the cells would have had to be grown for longer.

High expression of the tetR protein may have off-target effects and display toxicity in the cell, creating a selection pressure favoring cells with a tetR deletion. Every plasmid has one operator site where tetR can bind and act as a repressor. Since tetR is constitutively produced, there will be excess protein that does not occupy any operator site. TetR has been shown to have a high specificity, but since only four residues are responsible for the specificity, and these can be similar between different regulators in the tetR family, it could be that the excess protein shows unspecific binding when tetR is placed on a low-copy-number plasmid under a strong constitutive promoter. Many of the tetR family regulators regulate essential functions in the cell, so unspecific binding could be highly toxic. Lowering the copy number of the plasmid would also lower the amount of excess protein, leading to a decrease in toxicity.

Excess protein could also interact with endogenous operator sites in E. coli. Endogenously, the tetR gene is under feedback regulation [12]. If a constitutive supply of the protein is provided, this might interfere with endogenous regulation and thus cause toxicity.

Both cross-regulation between tetR family regulators [3] and interaction with endogenous operator sites [16] have been reported and could be a reasonable explanation for the apparent toxicity of the tetR gene in these constructs.

System I Poses a Mutational Pressure on Synechocystis In both experiments where Cas9 was induced, there were cases where cells that should have died survived. When Cas9 is induced while the sgRNA is expressed, the two form a complex. The sgRNA guides Cas9 to a specific site, where Cas9 makes a double-stranded cut. To cut, Cas9 needs a PAM site. A feature of Synechocystis is that each cell has several copies of its genome. This means that it has a template with which to repair itself through homologous recombination. However, since the repaired copy will have the same sequence as the original, it will still be a target, and as long as Cas9 is induced the genome will continue to be cut until the cell dies.

Some possible explanations for the survival of the cells include 1) mutations in the replicating plasmid containing the sgRNA and Cas9, 2) faulty design of the sgRNA, 3) mutations in the Synechocystis genome or 4) cells surviving long enough for the inducer to be broken down.

In the first experiment, when Cas9 was induced without the dDNA present, about half of the constructs survived induction. Since twice the amount of inducer was used compared to previous studies made with CRISPRi [69], the results are probably not due to degradation of the inducer. The results were mostly uniform within each sample: all colonies tested for rbcL died, while all colonies tested for rbcS survived. Only gap2 and tpi had colonies that both survived and died. Uniform results, where a construct either functions or not, would indicate a faulty sgRNA design, since this is a binary property. Mutations in either the plasmid or the genome would be expected to be more randomly distributed. However, the small sample size makes such patterns hard to establish, and due to the time limitation of the thesis and, in some cases, a lack of colonies, it was not possible to increase the sample size. Furthermore, the essentiality of the targeted genes could also have an effect. If so, the mutational pressure would be more uniform within a sample than between samples, which could also account for the seemingly uniform results within samples. One could try to reduce the mutational pressure by supplementing the media with glucose, making the photosynthetic genes ”less” essential. It would have been interesting to test this with a larger sample size.

When the second experiment inducing Cas9 was performed, all cells originated from one colony that had consistently died upon induction in the earlier experiment. This means that the sgRNA is definitely targeting the gene, yet the cells still survive when they should not. This is a strong indication that this system poses a high mutational pressure on the cells.

It could also indicate that cyanobacteria have other mechanisms for repairing double stranded breaks than homologous repair.

Given this high mutational pressure on the cells, it can be concluded that this approach is not optimal, and System II was therefore tried in order to reduce the mutational pressure on the cyanobacterium.

References
