• No results found

Flux  balance  analysis  to  improve  butanol  productivity  in  Synechocystis  PCC  6803      Master  of  Science  Thesis    Kiyan  Shabestary

N/A
N/A
Protected

Academic year: 2021

Share "Flux  balance  analysis  to  improve  butanol  productivity  in  Synechocystis  PCC  6803      Master  of  Science  Thesis    Kiyan  Shabestary"

Copied!
38
0
0

Loading.... (view fulltext now)

Full text

(1)

 

 

   

   

Flux  balance  analysis  to  improve  butanol  productivity  in   Synechocystis  PCC  6803  

   

Master  of  Science  Thesis   Kiyan  Shabestary    

       

 

Supervisor:  

Josefine  Anfelt,  PhD  student,  KTH    

Examiner:  

Paul  Hudson,  Assistant  Professor,  KTH  

 

Degree  Project  in  Biotechnology  BB202X  

(2)

 

Abstract  

 

Engineering  microorganisms  at  the  systems  level  is  recognized  to  be  the  future  of   metabolic   engineering.   Thanks   to   the   development   of   genome   annotation,   microorganisms   can   be   understood,   as   never   before,   and   be   reconstructed   in   the   form   of   computational   models.   Flux   balance   analysis   provides   a   deep   insight   into   cellular   metabolism   and   can   guide   metabolic   engineering   strategies.   In   particular,   algorithms  can  assess  the  cellular  complexity  of  the  metabolism  and  hint  at  genetic   interventions   to   improve   product   productivity.   In   this   work,   Synechocystis   PCC   6803   metabolism   was   investigated   in   silico.   Genetic   interventions   could   be   suggested   to   couple   butanol   synthesis   to   growth   as   a   way   to   improve   current   productivities.   Cofactor   recycling   and,   in   particular,   buffering   mechanisms   were   shown   to   be   important   targets.   Creating   a   cofactor   imbalance   and   removing   these   buffering   mechanisms   is   an   important   driving   force.   This   forces   a   carbon   flux   through  butanol  synthesis  to  maintain  cofactor  balance  and  sustain  growth.  

 

Objective  

 

The  objective  of  the  present  work  is  to  identify  gene  targets  in  silico  at  the  systems   level  to  improve  n-­‐butanol  and  isobutanol  productivities  in  Synechocystis  PCC  6803.  

An   emphasis   is   put   on   coupling   product   to   growth   and   understanding   the   driving   forces  behind  it.  

   

(3)

Contents    

Introduction  ………4  

Theory……….6                        6   Genome-­‐scale  metabolic  models…….………..6  

Flux  balance  analysis  ………7  

Optimization  algorithms  ………9                9   n-­‐butanol  and  isobutanol  pathways  ………11                                  11   Metabolic  engineering  in  cyanobacteria  ………...14  

Results  ……….18    

OptKnock  to  predict  reaction  deletions  ………...18  

OptForce  to  predict  reaction  modulations    .………23  

Identifying  buffering  mechanisms  for  cofactor  balancing  ..………28  

Discussion  ……….30  

Conclusion  ……….33  

Future  directions  ………..33  

References  ……….34  

 

   

(4)

Introduction  

 

Industrial  biotechnology  is  an  emerging  and  promising  discipline  for  the  production   of   a   wide   range   of   compounds   including   fine   chemicals,   biopolymers,   biofuels   or   biopharmaceuticals.   Using   living   organisms   as   ‘cell   factories’,   mainly   micro-­‐

refineries  that  perform  a  specific  task,  represents  the  cornerstone  of  this  field  and   lay   the   foundations   towards   a   future   bio-­‐based   society.   The   apparition   of   high-­‐

throughput   sequencing   to   read   cellular   genetic   information   has   appeared   as   a   revolution  leading  biotechnology  into  a  new  era.  Understanding  the  cellular  genetic   code   enable   an   unprecedented   insight   into   microorganisms   physiology   and   functions.  As  a  result,  tools  to  genetically  engineer  cell  factories  have  evolved  to  gain   in  accuracy.    

 

Previous  methods  to  seek  for  high-­‐producing  strains  were  mainly  based  on  chance   where  random  mutagenesis  created  strains  diversity  and  good  screening  methods   enabled   selection   of   the   best   producers.   However,   these   methods   were   time-­‐

consuming  and  did  not  find  the  optimal  strains.  Recently,  genomics  paved  the  way   for   other   –omics   techniques   such   as   transcriptomics,   proteomics   and   finally   metabolomics.   This   high   amount   of   data   assessable   with   newly   available   high-­‐

throughput  analytical  techniques  can  be  translated  into  gene-­‐protein-­‐reaction  (GPR)   associations.   New   methods,   mainly   rational   design,   accurately   target   genes   to   modulate   cellular   functions   through   this   relationship   for   production   purposes.  

Metabolic   engineering   is   the   field   aiming   at   understanding   how   genes   influence   cellular  metabolism  and  further  exploit  them  to  shape  microorganisms  metabolism   to   meet   engineering   objectives.   Metabolic   engineer   may   count   on   tools   including   gene   deletions,   over-­‐expression,   down-­‐regulation   or   even   heterologous   gene   insertion  from  other  organisms  as  way  to  improve  strains.    

 

Initial  metabolic  engineering  strategies  aimed  at  redirecting  carbon  flux  to  product   synthesis   regardless   of   the   metabolic   burden   created   in   the   cell.   In   particular,   cofactors,  widely  connected  metabolites  helping  reactions  to  go  forward,  were  not   taken   into   account   and   their   imbalance   resulted   in   such   bottlenecks.   It   is   now   commonly   thought   that   understanding   the   cell   as   a   whole   is   an   important   key   to   metabolic  engineering  success.  However,  biological  organisms  are  complex  systems   involving   hundreds   if   not   thousands   of   biochemical   reactions.   For   metabolic   engineers,   this   means   a   high   number   of   combinatorial   gene   targets   to   reach   an   optimal  overproduction  phenotype.  Therefore,  to  efficiently  assess  the  high  number   of  possibilities  a  cell  can  offer,  computational  methods  are  very  welcoming  tools  to   support   experiments.   Genome-­‐scale   metabolic   models   aim   to   represent   cellular   metabolism  as  a  whole  using  GPR  associations  and  can  be  used  for  simulations.  Flux   balance   analysis   (FBA)   is   a   computational   method   aiming   at   solving   flux   distributions   (phenotype)   in   cellular   metabolism   for   a   given   genotype.   Algorithms   can  further  be  used  to  provide  strategies  (mainly  reaction  deletions  or  modulations)   in  order  to  reach  a  desired  overproduction  phenotype.  

 

(5)

An   important   issue   met   in   engineering   strategies   is   cellular   behavior.   Very   often,   engineering  strategies  are  in  opposition  to  the  cellular  ones.  Most  microorganisms   optimize   their   growth   as   a   mean   of   survival.   The   ones   growing   the   fastest   outcompete  slower  organisms.  This  forms  the  basis  for  natural  selection.  Organisms   have   evolved   for   millions   of   year   to   perfectly   fit   their   environment.   Coupling   product  synthesis  to  growth  is  a  way  to  bypass  this  issue.  A  successful  strategy  is  to   modify  the  stoichiometric  system  in  a  way  that  the  cell  needs  to  produce  a  product   in   order   to   achieve   growth.   Therefore   at   maximal   growth   rate,   the   product   is   synthesized  as  a  growth-­‐related  by-­‐product.    

 

The  aim  of  this  project  is  to  perform  FBA  on  cyanobacteria  Synechocystis  PCC  6803   genome   scale   model   to   identify   metabolic   engineering   strategies   for   enhanced   biofuel   production.   n-­‐butanol   and   isobutanol   are   target   molecules   in   this   project.  

Cyanobacteria  are  prokaryotic  algae  and  promising  cell  factories  for  the  production   of  a  wide  variety  of  compounds  including  biofuels.  There  are  two  types  of  process   using   algea   for   biofuel   production.   One   is   to   use   high-­‐biomass   content   eukaryotic   algae  such  as  chlamydomonas  species  for  biomass  production.  The  biomass  is  then   decomposed   into   sugars   and   fermented   into   biofuel   by   microorganisms   such   as   yeast.   The   other   is   the   direct   use   of   algae   as   a   chassis   for   biofuel   production.  

Cyanobacteria  species  are  preferred  cell  factories  in  this  case  due  to  easier  genetic   modifications.   Nevertheless,   algae   only   require   nitrogen,   carbon   dioxide   and   sunlight   to   grow.   This   low   requirement   makes   them   promising   cell   factories.  

Moreover  they  do  not  compete  with  food  production  as  they  do  not  require  arable   land.  

   

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

(6)

Theory  

 

Genome-­‐scale  metabolic  models      

Genome-­‐scale   metabolic   models   (GSMMs)   have   been   the   catalysts   for   the   development  of  flux  balance  analysis.  It  is  a  bottom-­‐up  approach  where  the  model  is   created  from  its  smallest  constituents,  starting  from  genes  to  establish  the  metabolic   map  (Thiele  and  Palsson,  2010;  Fig.  1).  Based  on  extensive  research  from  literature   and   databases   (KEGG,   Brenda),   a   draft   reconstruction   can   be   first   established.   It   should  be  noted  that  this  might  include  previous  GSMMs  to  be  upgraded.  Software   such   as   the   RAVEN   toolbox   (Agren   et   al.,   2013)   can   perform   reconstructions   in   a   semi-­‐automated   way.   This   reconstruction   often   results   in   gaps   i.e.   some   missing   reactions   in   a   pathway   (a   missing   serine   pathway   for   Synechocystis   PCC   6803;  

Knoop  et  al.,  2013).  For  further  readability,  confidence  scores  are  assigned  for  each   reaction.  Reactions  without  any  physiological  evidence,  introduced  in  the  model  for   functional  purposes  solely,  take  the  lowest  confidence  value.  Finally,  a  model  can  be   generated   into   a   mathematical   form,   the   stoichiometric   matrix   (S   matrix).     All   stoichiometric   coefficients   for   each   reaction   and   metabolite   are   stored   in   the   S   matrix.   One   important   part   of   the   model   is   the   addition   of   artificial   reactions   for   modeling  purposes.  This  includes  addition  of  maintenance  and  a  biomass  reaction.  

The  biomass  reaction  aims  to  link  precursors  to  biomass  constituent  such  as  lipids,   proteins   and   other   macromolecules   to   model   growth.   The   flux   going   through   this   reaction  is  scaled  in  order  to  represent  the  growth  rate  (μ).    

 

The   reconstructed   model   can   then   be   saved   in   the   Systems   Biology   Markup   Language   (SBML)   format   (Hucka   et   al.,   2003).   This   provides   a   standardized   way   for   model   usage   and   improvement.  

Genome-­‐scale   models   can   be   used  for  two  purposes  mainly.  

First,   they   can   be   used   as   a   way   to   provide   a   way   to   contextualize   high   trough-­‐put   omics  data.  Secondly,  they  can   be   used   alongside   algorithms   in   order   to   find   optimal   engineering   strategies.   The   latter  is  the  aim  of  the  present   work.    

Synechocystis  PCC  6803,  often   labeled   as   the   “green   Escherichia   coli”   (E.   coli),   is   a  

Fig.   1.   Genome-­‐scale   model   reconstruction   procedure.   Figure   from  Feist  et  al.  (2008).  

(7)

model   organism   for   phototrophs.   However,   lack   of   genome   annotation   and   extensive  in  vivo  gene  essentiability  have  slowed  down  the  development  of  accurate   cyanobacteria  genome-­‐scale  models  in  comparison  to  well-­‐studied  organisms  like  E.  

coli   or   Saccharomyces   cerevisiae   (S.   cerevisiae).   The   apparition   of   a   gene   database   (CyanoBase;  http://genome.microbedb.jp/cyanobase/)  for  all  known  Synechocystis   PCC  6803  and  Anabaena  PCC  7120  ORFs  (Nakamura  et  al.,  1999;  Nakao  et  al.,  2010)   paved   the   way   for   numerous   genome-­‐scale   reconstructions   (Shastri   and   Morgan,   2005;   Montagud   et   al.,   2011;   Saha   et   al.,   2012;   Nogales   et   al.,   2012;   Knoop   et   al.,   2013).      

 

Synechocystis  sp.  PCC  6803  model  iJN678  (Nogales  et  al.,  2012)  was  used  to  perform   flux   balance   analysis.   The   reconstructed   network   incorporates   678   genes,   863   reactions  and  795  metabolites.  The  model  supports  heterotrophic,  autotrophic  and   mixotrophic  conditions  simulation  by  constraining  uptakes.  Autotrophic  conditions   were   simulated   as   “light   limited”   where   photon   flux   was   constrained   to   -­‐18.7   mmol/gDW/h  (both  bounds)  and  carbon  uptake  (HCO3)  constrained  to  a  maximal   value   of   3.8   mmol/gDW/h.   In   heterotrophic   conditions,   glucose   uptake   was   constrained   to   0.8   mmol/gDCW/h   in   order   to   match   the   photoautotrophic   maximum  growth  rate.  Photon  flux  was  set  to  0.    

 

In  this  project,  iJN678  was  used  because  it  offers  a  standardized  metabolite,  reaction   and  transporter  annotation  comparable  to  E.  coli  genome-­‐scale  models.  The  biomass   objective   function   is   refined   in   comparison   to   other   models.   Respiration   and   photosynthesis  are  also  well  modeled,  in  particular  cyclic  electron  flows  which  are   important  targets  in  this  study.    

 

Results  have  been  shown  to  be  sensitive  to  the  model  used,  cofactor  preference  and   reaction  promiscuity.  Since  cofactor  preference  is  not  known  for  a  large  number  of   reactions,  models  can  show  high  differences  in  cofactor  assignment.  This  has  been   shown   to   be   a   real   issue   when   investigating   suitable   strategies   for   target   overproduction.   For   this   reason,   the   model   from   Knoop   et   al.   (2013)   is   used   to   contrast  and  discuss  the  validity  of  obtained  results.  

 Flux  balance  analysis    

Flux   balance   analysis   (FBA)   is   a   mathematical   method   assuming   steady-­‐state   growth   to   simulate   flux   distribution   for   a   given   genome-­‐scale   model   (Orth   et   al.,   2010).   Mainly,   internal   metabolites   such   as   the   ones   found   in   the   central   metabolism   are   assumed   not   to   accumulate.   For   each   internal   metabolite,   production   and   consumption   rates   equalize.   This   then   leads   to   a   set   of   equations   that   can   be   solved   using   linear   programming   (LP).   Since   the   system   is   underdetermined,  multiple  distributions  are  possible.  It  should  be  noted  that  this  is   a   fundamental   difference   with   metabolic   flux   analysis   (MFA)   where   exchange   reactions  are  measured  such  as  the  system  is  determined.  A  single  flux  distribution   can   then   be   obtained   for   MFA.   Using   an   objective   function   to   be   maximized   or   minimized   (a   reaction   in   the   model   needs   to   be   chosen),   a   certain   distribution   is  

(8)

picked   up.   For   the   flux   distribution   to   be   realistic,   the   objective   function   to   be   optimized   should   fit   the   organism   objective.   Natural   selection   has   compelled   microorganisms   to   aim   for   increased   fitness.   Therefore,   maximization   of   growth   (biomass  equation)  is  often  chosen  as  objective  function,  especially  for  prokaryotes.  

 

Alternatives  to  FBA  include  minimization  of  metabolic  adjustment  (MOMA;  Segrè  et   al.,  2002)  and  regulatory  on/off  minimization  (ROOM;  Shlomi  et  al.,  2005).  Both  are   used   to   find   distribution   in   mutants   cells   based   on   WT   distributions.   They   both   minimize   flux   distribution   differences   between   the   WT   and   the   mutant.   MOMA   is   used  to  find  immediate  flux  distribution  whereas  FBA  and  ROOM  give  the  final  one   (Segrè  et  al.,  2002).  MOMA  and  ROOM  are  relevant  tools  when  mutants  are  tested   experimentally.    

 

FBA   is   often   referred   to   as   a   constraint-­‐based   tool.   Constraints   present   in   current   GSMMs   can   be   classified   into   two   types:   (1)   Balance   constraints   due   to   the   stoichiometry   of   the   system.   (2)   Flux   constraints   where   each   flux   is   limited   by   an   upper   and   lower   bound   (i.e   irreversible   reaction   cannot   carry   a   negative   flux).  

Additional   constraints   on   the   system   can   greatly   influence   the   solution   space,   mainly,   the   set   of   all   mathematical   possible   flux   distributions   that   satisfy   the   constraints  (Fig.  2).  The  more  constraints  one  adds,  the  more  restricted  the  solution   space.  This  results  in  predicting  an  optimal  flux  distribution  with  greater  accuracy.  

   

Fig.  2.  The  conceptual  basis  of  constraint-­‐based  modeling.  In  this  representation,  a  three  reactions  model  is    

assumed   for   simplicity.   Without   any   constraints,   all   mathematical   distributions   are   possible   solutions   of   the   system.  When  constraints  are  added,  the  solution  space  is  restricted  to  a  constrained  area  (in  blue).  An  objective   function  can  be  used  to  get  an  optimal  flux  distribution  for  (v1,v2,v3).  In  this  example  v3  is  the  objective  function   to  be  optimized.  Adapted  from  Orth  et  al.  (2013).  

The  main  constraint  in  FBA  is  the  stoichiometry  of  the  system.  Net  formation  of  a   given  metabolite  is  calculated  as  the  sums  of  all  flux  leading  to  this  metabolite  minus   the  sums  of  all  flux  consuming  this  metabolite,  over  the  whole  system.  Depending  on   the   stoichiometry,   some   reactions   can   weigh   more   than   others.   For   a   given   flux,   a   reaction   whose   stoichiometry   is   2   moles   of   metabolites   formed   per   mole   of   substrate  leads  to  higher  formation  rate  than  if  only  1  mole  was  formed.  Thus,  net  

(9)

metabolite   formation   rate  !!!"!  can   be   simplified   to   the   sum   of   reaction   rates   multiplied  by  the  reaction-­‐specific  stoichiometry  coefficient  of  the  metabolite  :    

𝑑𝑥!

𝑑𝑡   =   𝑆!"  𝑣!

!

!!!

   

N   is   the   total   number   of   reactions,  𝑆!"  and   vj   are   stoichiometric   coefficient   and   reaction   rate   for   the   ith   metabolite   and   jth   reaction.   This   equation   can   be   further   generalized   for   all   metabolites   M   of   the   system.   A   vector   of   net   formation   rates   dx/dt  can  be  obtained  as  the  S  matrix  multiplied  by  a  vector  v   containing  the  flux   distribution.   Under   steady-­‐state,   the   net   rate   for   each   metabolites   is   zero   (no   accumulation).  

  𝑑𝑥

𝑑𝑡 = 𝑆𝑣 = 0    

Other  types  of  constraints  not  integrated  (for  now)  in  current  genome-­‐scale  models   are  regulatory  constraints  or  advanced  kinetic  constraints  such  as  enzyme  activity   or  kinetic  parameter  values.      

 Similarly,  all  possible  mathematical  distributions  can  be  further  plot  in  two  or  three   dimensions.  One  reaction  flux  can  be  plotted  against  another  one  giving  a  particular   solution  space.  This  is  useful  to  determine  possible  phenotypes  for  a  given  genotype.  

Ibarra   et   al.   (2002)   successfully   demonstrated   FBA   accuracy   for   long-­‐term   phenotypes.   After   40   days,   E.   coli   evolved   from   a   sub-­‐optimal   phenotype   to   reach   optimal  growth  as  predicted  by  FBA.  

 

The   most   widely   used   platform   to   perform   FBA   is   the   COBRA   toolbox   (Schellenberger  et  al.,  2011).  It  can  be  used  on  MATLAB  or  python  platforms  and  its   high   modularity   and   documentation   makes   COBRA   toolbox   the   best   choice   in   the   scientific   community.   Most   simulations   were   performed   using   the   COBRA   toolbox   2.0  on  MATLAB.  

 

Optimization  algorithms    

As   discussed   before,   FBA   offers   a   genuine   insight   into   the   flux   distribution   for   a   given  network.  As  an  extension,  perturbations  can  be  applied  to  the  system  in  the   form  of  reaction  deletions  or  modulations.  Flux  distributions  can  be  computed  and   differences   can   be   analyzed   to   understand   the   cellular   metabolism.   Additionally,   algorithms  can  be  employed  to  go  the  other  way  round.  For  a  particular  phenotype  – let’s   say   an   overproduction   phenotype-­‐,   algorithms   predict   associated   reaction   deletions  or  modulations  (genotype  alteration).  

   

(10)

OptKnock    

The   most   common   and   widely   used   algorithm   is   the   bi-­‐level   optimization   framework  OptKnock  (Burgard  et  al.,  2003).  It  identifies  reaction  deletion  strategies   that   couple   the   biomass   objective   function   with   a   biochemical   overproduction   target.    This  bi-­‐level  formulation  can  be  simplified  into  a  single-­‐level  mixed  integer   linear   programming   (MILP),   which   consists   of   problems   with   both   discrete   and   continuous  variables.  

   

maximize 𝑣!"#$%&' (OptKnock)

𝑦!

subject to maximize 𝑣!"#$%&& (Primal)

𝑣!

subject to !!!!𝑆!"  𝑣! = 0

𝑣!!!"!#_!"#$%& =   𝑣!!!"!#_!!_!"#$%&%"#'

  𝑣!"#!!   ≤   𝑣!"#!  !"#!                 𝑣!"#$!""   ≥ 𝑣!"#$%&&!"#$%!  

        𝑣!!"#∙ 𝑦!   ≤   𝑣!   ≤   𝑣!!"#∙ 𝑦!,        ∀𝑗 ∈ 𝑁    

1 − 𝑦! ≤ 𝐾      𝑤𝑖𝑡ℎ  𝑦!  

!∈! =   0,1 , ∀𝑗 ∈ 𝑁  𝑎𝑛𝑑  𝐾, 𝑛𝑢𝑚𝑏𝑒𝑟  𝑜𝑓  𝑘𝑛𝑜𝑐𝑘𝑜𝑢𝑡𝑠  𝑎𝑙𝑙𝑜𝑤𝑒𝑑    

 

However,  due  to  the  complexity  of  the  algorithm,  high  computational  time  has  often   been   an   issue.   OptKnock   screen   for   all   reactions,   and   the   more   knockout   allowed,   the  higher  number  of  combinations.  An  alternative  algorithm  is  OptGene,  a  genetic   algorithm   (GA)   to   find   knockout   strategies   quicker   (Patil   et   al.,   2005).   The   idea   behind   this   algorithm   is   to   find   individuals   with   high   fitness   values   (mainly   high   productivity).    Individuals  are  then  selected  for  mating.  Offspring  are  screened  and   the  iterative  process  continues.  A  mutation  rate  is  also  applied  each  round  in  order   to  create  diversity.  

 

OptForce    

Developped   as   an   extension   of   the   OptKnock   framework,   OptForce   finds   reaction   modulations   in   addition   to   deletions   (Ranganathan   et   al.,   2010).   The   algorithm   compares  the  flux  variability  between  an  overproducing  (OP)  phenotype  and  a  wild   type   (WT)   phenotype   (Fig.   3).   Reactions   whose   value   ranges   significantly   differ   between  OP  and  WT  phenotypes  are  listed  in  MustU  (over-­‐production)  and  MustL   (down-­‐regulation).  This  can  be  further  extended  to  reactions  pair  (MustUU,  MustLL,   MustUL).  As  a  first  output  of  the  algorithm,  these  reaction  lists  can  be  used  to  guide   strategies.   OptForce   can   further   compute   combinations   of   these   reaction   modulations  from  these  lists  alongside  reaction  knockouts.  The  obtained  strategy  is   referred   as   the   FORCE   set.   By   increasing   the   number   of   allowed   modifications   (knock-­‐out/-­‐up/-­‐down),   OptForce   can   iteratively   give   different   strategies   and   solutions  can  pictures  in  the  form  of  a  boolean  diagram.  

(11)

 

OptSwap    

OptSwap  (King  and  Feist,  2013)  is  a  bi-­‐level  MILP  problem  based  on  RobustKnock   formulation  (Tepper  and  Shlomi,  2010).  In  addition  to  RobustKnock  and  OptKnock,   it   incorporates   possible   swaps   in   cofactor   specificity   for   oxidoreductase   reactions.  

Since   cofactor   usage   is   an   essential   feature   of   coupling,   this   algorithm   provides   a   genius  tool  to  achieve  high  productivity.    

 

                             

Fig.   3.   Comparing   flux   ranges   between   normal   and   overproducing   phenotype   in   OptForce   (Ranganathan  et  al.,  2010).  

There  are  other  similar  algorithms  developed  for  specific  purposes  but  not  extended   to   the   COBRA   toolbox   community   (OptORF,   Kim   &   Reed,   2010;   CASOP,   Hädicke   &  

Klamt,   2010).   Nevertheless,   results   using   different   algorithms   are   useful   tools   to   compare  with  results  from  OptForce,  OptKnock  and  OptSwap.  

 

n-­‐butanol  and  isobutanol  pathways    

n-­‐butanol   (hereafter   butanol)   is   a   fermentative   product   produced   from   the   precursor   acetyl-­‐CoA   through   the   ABE   fermentation   process.   Traditionally,   production  of  fermentative  products  has  been  produced  under  anaerobic  conditions   in  facultative  or  obligate  fermentative  organisms.  Earlier  efforts  to  produce  butanol   have  been  achieved  in  the  native  butanol  producer  and  obligate  anaerobic  bacteria   Clostridia,  where  butanol  production  was  first  reported  (Pasteur,  1862).  Clostridium   species   are   not   great   hosts   due   to   their   slow   growth   and   thus   low   volumetric   productivity.   Therefore,   Clostridia   butanol   pathway   has   further   been   adapted   to   other   organisms   (Nielsen   et   al.,   2009).   Expressing   a   pathway   from   a   fermentative   organism   into   an   autotrophic   organism   such   as   cyanobacteria   raise   few   concerns.  

Among  them,  oxygen  sensitivity  of  pathway  enzymes  (Atsumi  et  al.,  2008;  Boynton   et  al.,  1996)  and  reversibility  of  trans-­‐enoyl-­‐CoA  reductase  (Bond-­‐Watts  et  al.,  2011)   have   been   targets   for   improvement.   To   solve   these   problems,   chimeric   pathway   using  enzymes  from  other  organisms  have  been  employed  (Lan  &  Liao,  2012;  Bond-­‐

Watts   et   al.,   2011;   Anfelt   et   al.,   2015).   Alternative   pathways   to   produce   butanol   include  modification  of  the  clostridia  pathway  at  one  or  several  steps.  For  instance,   Lan  and  Liao  (2012)  replaced  the  first  condensation  step  by  an  ATP-­‐driven  two-­‐step   process   using   beta-­‐oxidation   reversal   in   Synechococcus   elongatus   PCC   7942.   One  

(12)

malonyl-­‐CoA   and   one   acetyl-­‐CoA   ultimately   condense   to   one   acetoacetyl-­‐CoA.  

Pasztor  et  al.  (2014)  also  used  beta-­‐oxidation  reversal  and  extended  it  to  butyryl-­‐

ACP  further  transformed  to  butanol  using  the  TPC  pathway  in  E.  coli  (Fig.  4).  

 Although  isobutanol  is  similar  to  butanol,  it  is  less  soluble  and  has  an  energy  density   close  to  gasoline  making  it  a  better  biofuel.  Isobutanol  synthesis  takes  place  along   the   2-­‐keto-­‐acid   (or   Ehrlich)   pathway.   Amino   acid   precurssors   in   the   form   of   keto   acids  are  decarboxylated  into  aldehydes  and  reduced  to  the  corresponding  alcohols.  

Isobutanol   can   be   obtained   from   valine   whereas   propanol   from   isoleucine.  

Isobutanol   has   been   successfully   produced   in   E.   coli   (Yan   and   Liao,   2009)   and   Synechococcus  elongatus  PCC  7942  (Atsumi  et  al.,  2009).  All  pathways  are  studied  in   this   project   (Fig.   4).   They   are   hereafter   referred   to   as   LL   pathway   (Lan   and   Liao,   2012),  TPC  pathway  (Pasztor  et  al.,  2014),  isobutanol  pathway,  NADH  and  NADPH   pathway.  

 

FBA  was  used  to  determine  the  maximal  theoretical  butanol  (or  isobutanol)  yields   for   all   these   pathways   (Table   1).   Yields   were   calculated   as   production   rate   over   carbon   uptake   rate.   Under   light-­‐limited   conditions,   the   carbon   uptake   is   the   sole   limiting-­‐factor.   It   is   calculated   as   bicarbonate   input   minus   carbon   dioxide   output.  

Even   though   most   pathways   have   the   same   maximal   rate   towards   the   product,   yields  slightly  differ  because  of  different  uptake  rates.  

 

Table   1.   Maximum   rates   and   yields   for   the   different   pathways   in   light-­‐limited   conditions.  

 

Maximum   production  rate  

Bicarbonate   uptake  rate  

Carbon  dioxide     export  rate  

Maximum  theoretical   yield  YHCO3-­‐,P   Pathway    mmol  Prod.  /  gDW  h    mmol  HCO3-­‐  /  gDW  h    mmol  CO2  /  gDW  h   mol  Prod.  /  mol  HCO3-­‐  

Std  NADH   0.39    1.55   0   0.252  

Std  NADPH   0.39    1.55   0   0.252  

TPC      0.375   3.7                  2.19   0.248  

LL   0.39   3.7              2.14   0.250  

Isobutanol   0.39      1.55   0   0.252  

 

It  is  interesting  to  see  that  for  TPC  and  LL  pathways,  HCO3-­‐  is  taken  at  maximal  rate   even  though  one  third  is  required  as  represented  by  a  massive  CO2-­‐  export.  These   two  pathways  differ  from  the  others  in  using  ATP.  One  explanation  could  be  that  this   extra   demand   requires   higher   ATP   production.   The   main   ATP   source   is   ATP   synthase,   whose   activity   is   dependent   on   LEF,   fixed   in   this   case   (light-­‐limiting).  

Based  on  these  calculations,  each  pathway  can  achieve  similar  maximal  theoretical   yield   due   to   similar   stoichiometry.   However   TPC   and   LL   pathway   requires   more   carbon  dioxide.  It  was  mentioned  that  carbon  sinks  such  as  decarboxylation  steps   increased  carbon  uptake  (Oliver  and  Atsumi,  2015).  This  can  be  a  disadvantage  for   the  cost  of  the  process  with  a  high  amount  of  carbon  not  efficiently  used.  Production   of  isobutanol  is  suggested  here  since  it  a  better  fuel  and  has  a  similar  maximal  yield   than  butanol.  

(13)

   

  Fig.  4.  Pathways  leading  to  butanol  and  isobutanol  synthesis  from  pyruvate  precursor  as  implemented  in   the  model.  Plain  and  dashed  arrows  indicate  native  and  non-­‐native  reactions  respectively.  In  blue,  the  2-­‐keto-­‐

acid   route   for   isobutanol.   In   red,   standard   Clostridium   butanol   pathway.   In   this   project,   last   three   steps   were   studied  with  NADH  or  NADPH.    In  green,  ATP-­‐driven  pathway  reported  by  Lan  and  Liao  (2012).  The  first  two   steps  and  conversion  of  crotonyl-­‐CoA  to  butyryl-­‐CoA  useing  NADH  are  specific  for  this  pathway.  In  yellow,  TPC   pathway  using  β-­‐oxidation  reversal  reported  by  Pazstor  et  al.  (2014).  Metabolite  names  next  to  arrows  indicate   reaction  requirements.  

(14)

Metabolic  engineering  in  cyanobacteria    

Generic  approaches  /  Introduction  

 Regardless   of   the   organism   used,   metabolic   engineering   can   be   divided   into   three   levels   with   increased   complexity   and   predictive   power   (Fig.   5).   Initial   strategies   have   mainly   focused   on   gene   level.   Rerouting   carbon   flux   to   a   certain   pathway   by   simple   amplification   using   high   gene   copy   number,   optimized   codon   usage   or   increased   promoter   strength   was   common.   Quickly   it   was   realized   that   productivities   did   not   meet   expectations   due   to   cellular   balancing   omission.  

Stepping   back   to   the   pathway   level   granted   a   better   global   understanding   of   the   process.  The  “push  and  pull”  concept  aiming  at  increasing  pathway  precursors  and   minimizing   competing   pathways   is   an   example.   Balancing   the   pathway   using   cofactor   usage   and   the   notion   of   driving   force   enabled   to   achieve   higher   productivity.  The  current  state  of  metabolic  engineering  focuses  on  these  two  levels.  

Generic   strategies   to   increase   productivities   include   most   notably:   By-­‐product   production   inhibition,   removal   of   competing   reactions,   target   pathway   overexpression,   cofactor   usage,   product   degradation/uptake   removal,   feedback   inhibition  removal  or  toxicity  tolerance.  Built  on  the  previous  levels,  systems  level  is   at   its   infancy   due   to   its   high   level   of   complexity.   It   requires   high   computational   power  and  high  understanding  of  the  organism.  

 

Fig.   5.   Systemic   approach   of   strain   development   (Lan   and   Liao,   2013).   From   right   to   left,   increased   in    

productivity  often  means  increased  in  complexity.  

In   cyanobacteria,   early   strategies   on   individual   gene   level   have   been   performed   extensively.  Expression  of  different  native  and  foreign  promoters  has  been  achieved.  

Native   promoters   used   are   usually   found   in   housekeeping   genes   (light-­‐inducible   PpsbA2  from  PSII;  Prbc  in  CO2  fixation).  Strong  foreign  IPTG-­‐inducible  promoter  Ptrc  is   widely  used  in  pathway  engineering  (Atsumi  et  al.,  2009;  Huang  et  al.,  2010;  Lan  and   Liao,  2011).  Codon  usage  for  heterologous  gene  could  also  be  optimized.  IspS  gene   encoding  isoprene  synthase  from  a  plant  specie  was  successfully  codon-­‐optimized,   resulting  in  a  10-­‐fold  increase  in  gene  expression  (Lindberg  et  al,  2010).  

Recent  advances  have  mainly  focus  on  the  pathway  level  with  the  notion  of  a  driving   force   pushing   towards   product   synthesis.   Decarboxylation   and   ATP   usage   could   provide  such  a  driving  force  (Lan  and  Liao,  2012).  As  discussed  earlier,  reversibility  

(15)

pathway  reactions  has  also  been  investigated  as  a  way  to  push  forward  the  carbon   flux  (Bond-­‐Watts  et  al.,  2011).  Enzyme  kinetic  data  can  also  be  integrated  to  have  an   overview  of  the  rate-­‐controlling  step  in  a  pathway  using  metabolic  control  analysis   (Angermayr  et  al.,  2013).  Engineering  tolerance  to  product  is  also  an  important  part   to  achieve  high  productivity  (Kaczmarzyk  et  al.,  2014).  

 

Future   studies   aim   at   building   on   these   driving   forces   to   improve   current   productivity.   Genetic   engineering   of   non-­‐obvious   targets   aims   at   creating   such   driving   forces   over   the   whole   organism   rather   than   the   pathway   level.   Since   genome-­‐scale   reconstructions   involve   hundreds   of   reactions,   powerful   computational   techniques   are   therefore   essential.   In   E.  coli,   it   was   possible   to   use   OptKnock  to  couple  lactate  production  to  growth  (Fong  et  al.,  2005).  After  60  days,   experimental  results  were  in  good  agreement  with  simulations  and  approximately   90%   of   maximal   theoretical   lactate   production   rate   was   achieved.   Through   this   project,  analysis  of  driving  forces  that  lead  to  growth-­‐coupled  butanol  synthesis  is  a   recurring  theme.    

 

Cofactor  recycling    

One  important  issue  in  metabolic  engineering  is  cofactor  balance.  It  has  been  many   times  the  case  that  adding  and  removing  pathways  results  in  a  cofactor  imbalance,   equivalent   to   a   bottleneck.   Due   to   interconnections   between   photosynthesis   and   respiration,   cyanobacteria   have   a   highly   regulated   cofactor   balance   making   it   difficult  to  engineer.  

 

Cofactor   recycling   is   important   in   order   to   couple   biomass   to   product   synthesis.  

Since   cell   factories   internal   objectives   are   to   optimize   their   growth,   linking   it   to   engineering   objective   represents   a   genuine   driving   force   towards   overproduction   phenotype.  A  phenotypic  phase  plane  (PPP  or  production  envelope)  can  be  plot  to   illustrate  that.  Biomass  is  iteratively  forced  at  different  flux  values.  For  each  of  them,   flux  towards  product  synthesis  is  maximized  and  minimized.  For  a  given  genotype,   the  PPP  represents  all  mathematically  possible  phenotypes.  We  assume  that  the  cell   will  optimize  its  growth.  Thus,  the  cellular  flux  distribution  will  be  pinpointed  as  the   point   with   the   highest   growth   rate   (far   right;   Fig.   6).   Now   if   product   synthesis   is   coupled   to   growth,   at   maximal   growth   rate,   product   synthesis   will   occur   as   a   necessary   by-­‐product   to   growth.   This   notion   is   perhaps   the   central   aspect   of   this   work  and  provides  a  powerful  strategy  for  product  synthesis.  

 

Coupling  biomass  and  fermentative  products  requires  few  gene  deletions  in  E.  coli   or  S.  cerevisiae.  Lactate  production  in  E.  coli  is  a  good  example.  Biomass  formation   results  in  an  excess  of  reducing  equivalents  through  the  central  metabolism  in  the   form   of   NADH.   Under   anaerobic   conditions,   the   TCA   cycle   is   not   active.  

Fermentation  pathways  leading  to  lactate,  ethanol  or  succinate  (all  recycling  NADH   to  NAD+)  are  thus  the  main  remaining  mechanisms  to  re-­‐oxidize  excess  NADH.  This   recycling  is  essential  for  the  cell  to  sustain  growth.  Without  any  NAD+  available,  flux   through  the  central  metabolism  cannot  go  forward.  As  a  result,  lactate,  ethanol,  and  

(16)

succinate   are   all   potential   growth-­‐coupled   product   through   cofactor   recycling.  

Strong   coupling   between   one   of   these   products   and   growth   can   be   achieved   by   knocking-­‐out  other  fermentative  pathways.    

 

Fig.  6.  Simple  trade-­‐off  versus  growth-­‐coupled  phenotypic  phase  plane.  A.  In  the  simple  trade-­‐off  design,  no   product   is   synthesized   at   maximal   growth   rate.   B.   In   the   growth-­‐coupled   design,   product   is   synthesized   at   maximal  growth  rate.  

 

Another   key   concept   related   to   cofactor   recycling   as   driving   force   is   the   ATP/NADPH   ratio,   a   central   theme   in   cyanobacteria.   For   instance,   knocking   out   ethanol   dehydrogenase   and   acetate   kinase   lead   to   growth-­‐coupled   synthesis   of   lactate  in  silico  (Burgard  et  al.,  2003)  and  in  vivo  (Fong  et  al.,  2005).  This  suggests   that  single  NADH  recycling  does  not  achieve  coupling  in  this  case.  ATP  generation  is   also  an  important  factor.    

 

Coupling   product   with   biomass   has   been   shown   to   be   difficult   in   cyanobacteria   (Nogales   et   al.,   2013).   Under   phototrophic   conditions,   presence   of   mechanisms   to   balance   cofactor   redox   state   hinders   coupling   (Fig.   7).   Cyclic   electron   flows   (or   alternate   electron   flows;   CEF)   support   the   cell   in   ATP/NADPH   ratio   modulation   depending   on   environmental   conditions.   Because   carbon   dioxide   fixation   requires   an   ATP/NADPH   ratio   of   1.5,   higher   than   the   1.28   ratio   provided   by   the   linear   electron   flow,   CEF   decreases   the   ATP/NADPH   ratio   by   indirectly   increasing   ATP   production   at   the   expense   of   direct   NADH/NADPH   oxidation.   It   is   believed   that   biomass   requires   a   ratio   larger   than   1.51   or   2   (Erdrich   et   al.,   2014;   Knoop   and   Steuer,   2015).   For   this   reason,   a   successful   strategy   would   aim   at   decreasing   the   ATP/NADPH  ratio.  With  a  higher  ratio  requirement  needed  to  produce  biomass,  an   excess   NADPH   would   then   be   available   for   our   product.   NADPH   would   then   be   recycled  back  to  NADP+  using  product  synthesis  pathway.  This  would  provide  a  way   to  couple  product  with  growth.  

   

 

0   0.05   0.1   0.15   0.2   0.25   0.3   0.35   0.4   0.45  

0   0.01   0.02   0.03   0.04  

Product  mmol/gDW/h  

Growth  rate  1/h  

A.  Simple  trade-­‐off  

0   0.05   0.1   0.15   0.2   0.25   0.3   0.35   0.4   0.45  

0   0.01   0.02   0.03   0.04  

Product  mmol/gDW/h  

Growth  rate  1/h  

B.  Growth-­‐coupled  

(17)

  Fig.   7.   A.   Overview   of   photosynthetic   and   respiratory   electron   transport   chains.   In   most   cyanobacteria,   photosynthesis  only  occurs  in  thylakoids,  and  respiration  takes  place  in  both  cellular  and  thylakoid  membrane.  

As   a   result   photosynthesis   and   respiration   closely   interact   even   sharing   some   membrane   components   (plastoquinone   PQ   pool,   cytochrome   b6f).   Linear   electron   flow   includes   photosystem   II   (PSII),   cytochrome   b6f   (CYTBF),  photosystem  I  (PSI),  ferredoxin  NADP+  oxidoreductase  (FNR),  connected  trough  plastoquinone  (PQ),   cytochrome   c   (cytC),   and   ferredoxin,   respectively.   Cyclic   electron   flows   include   the   ferredoxin   PQ   reductase   (FQR),  the  NAD(P)H  dehydrogenase  complexes,  the  aa3-­‐type  terminal  oxidase  (CYO),  the  PQ  oxidase  (CydBD),   the  MEHLER  reaction,  and  the  hydrogenase  (H2ase).  B.  Simple  scheme  showing  how  ATP/NADPH  requirements   can  be  engineered  to  provide  a  driving  force  for  product  synthesis.  

     

ATP NADPH

1.3   LEF

≥  2   Biomass

   

   

Fixed  ATP/NADPH  ratios  

Extra  NADPH  for product  synthesis

B A

 

(18)

Results  

 

OptKnock  and  OptForce  were  applied  to  iJN678.  OptKnock  was  performed  using  the   COBRA   toolbox   2.0   on   MATLAB   (Schellenberger   et   al.,   2011).   OptForce   was   performed   using   the   GAMS   modeling   environment   (General   Algebraic   Modeling   System,   www.gams.com).   CPLEX   solver   was   used   for   both   algorithms.   Transport   reactions   were   removed   from   consideration   using   OptKnock.   For   OptForce,   transport   and   peripherical   reactions   associated   with   biomass   formation   (carotenoid,   riboflavin,   sterol,   nucleic   acid   components...)   were   removed   from   the   pool  as  much  as  possible  (reduced  set).  Two  approaches  were  used,  one  iteratively   removing  these  reactions  from  the  solution  and  another  removing  these  reactions   before   running   the   algorithm.   Transporters   or   other   peripherical   reactions   were   allowed   to   be   knockout   mainly   for   informative   reasons.   Understanding   the   logic   behind  these  knockouts  is  an  important  aspect  to  understand  driving  forces  behind   growth-­‐coupled  butanol  production.  In  many  cases,  strategies  could  not  be  found  for   the  reduced  set.  In  both  cases,  reaction  deletions  found  using  OptKnock  were  used   to  guide  the  algorithm  and  decrease  the  computation  time.  Simulations  were  limited   to  few  hours.    

Identifying  the  driving  force    

Identifying   the   reason   behind   coupling   is   an   important   part   of   understanding   to   create   high-­‐producing   strain.   For   each   strategy,   sensitivity   to   cofactor   recycling   is   tested  to  identify  which  cofactor  balance  requirements  push  the  flux  though  product   synthesis.   Implementing   a   reaction   that   re-­‐oxidize   a   specific   cofactor   and   test   its   influence  on  the  coupling  design  is  informative.  If  the  coupling  is  not  affected,  the   strategy   is   not   sensitive   to   this   cofactor.   Conversely,   no   coupling   means   that   the   reaction   is   used   instead   of   the   pathway   because   it   is   less   a   burden   for   the   cell   to   simply   recycle   cofactors   using   a   single   reaction   than   producing   butanol.   For   instance,  a  reaction  converting  NADH  to  NAD+  can  be  added  for  a  given  strategy  to   test  sensitivity  to  NADH  recycling.  If  the  PPP  shows  uncoupling  between  growth  and   product,  then  the  strategy  is  said  to  be  sensitive  to  NADH  consumption.  Strategies   were   sensitive   to   NADH   consumption,   NADPH   consumption,   and/or   ATP   production.  

 

OptKnock  to  predict  reaction  deletions    

OptKnock   provides   the   first   step   towards   identifying   suitable   gene   targets.   High   computational  time  or  even  existence  of  a  possible  solution  makes  it  difficult  to  find   targets.   OptKnock   hinted   strategies   for   both   heterotrophic   and   autotrophic   conditions.   For   simplicity   and   also   because   autotrophic   conditions   are   target   conditions   for   cyanobacteria   growth,   autotrophic   strategies   are   considered   here.  

Strategies  were  found  for  the  NADH  pathway  and  for  the  LL  pathway.  

   

(19)

Strategy  for  NADH  pathway    

The  initial  list  proposed  by  OptKnock  involved  more  than  10  reaction  deletions  to   achieve  coupling.  Refining  the  list  involved  successive  removal  of  each  target  to  get  a   minimal   set   required   to   reach   coupling.   The   reduced   list   involves   7   reaction   deletions   (Table   2).   To   compare   the   effect   of   each   reaction   deletion   on   the   flux   distribution,   simulation   with   the   restored   reaction   was   computed   for   each   target   (mutant+).   A   visualization   tool   was   used   to   easily   see   global   changes   in   the   metabolism  (Maarleveld  et  al.,  2014).  A  phenotypic  phase  plane  permits  to  visualize   growth-­‐coupled   production   of   butanol   (Fig.   8).   This   strategy   was   shown   to   be   sensitive  to  NADH  consumption.  

 

Fig.  8.  Phenotypic  phase  plane  for  the  standard  NADH  pathway.  In  the  modified  strain,  approximately  0.13    

mmol/gDW/h  butanol  production  is  predicted  at  maximal  growth.  

Among  the  targets,  hindering  the  circular  electron  flow  was  shown  to  be  essential   for   coupling.   NADH   dehydrogenases   (type   1   and   type   2)   were   required   to   be   knocked   out.   This   is   consistent   with   other   studies   from   Erdrich   et   al.   (2014)   and   Reed  et  al.  (2013),  where  they  used  other  algorithms  such  as  CASOP  and  OptORF,   respectively.   The   most   obvious   reason   behind   these   knockouts   is   the   removal   of   competing  reactions  for  NADH  re-­‐oxidation.  Since  biomass  is  optimized,  NADH  re-­‐

oxidation   will   preferentially   occurs   in   the   ETC,   where   ATP   can   be   produced   as   an   end   product   rather   than   being   ‘wasted’   for   a   product   without   interest   for   the   cell   (butanol  is  excreted  and  cannot  be  re-­‐used  after  all).    

0   0.05   0.1   0.15   0.2   0.25   0.3   0.35   0.4   0.45  

0   0.005   0.01   0.015   0.02   0.025   0.03   0.035   0.04  

Butanol  mmol/gDW/h  

Growth  rate  1/h  

PPP  for  NADH  pathway  

initial  strain   moditied  strain  

References

Related documents

Regression test results for Lines Executable as dependent variable indicate that Specification Line of Code, Conceptual Complexity, Definition-Use, Minimum Coverage,

We were able to (i) newly identify putative NtcA binding sites in the promoter regions of scpB and scpE genes and, using EMSA, we could show that NtcA indeed binds to these

Since the MXML format is intended to be used for transfer of models the thesis will also consider import of the generated XML into a runtime toolchain for optimization

According to the data from both standard batch cultivation systems described above, cultures of high cell densities appear to be beneficial for high volumetric sesquiterpene

In anoxic darkness with nitrogen availability upon combined treatment with TTA and acetate, the strains with partial or complete glyoxylate shunt, 2P_I and 2P_IM, increased

Interestingly, extra copies of the native pepc or the heterologous expression of PEPc from the cyanobacterium Synechococcus PCC 7002 (Synechococcus) in the CD-P, increased ethyl-

A compilation of interviews conducted with 60 building owners in greater Stockholm shows 

To learn how to increase the production of squalene, manoyl oxide or any other terpenoid, we conducted a detailed investigation of each step in the MEP biosynthesis pathway,