Affibody Molecules Targeting VEGFR2 - Two Turns Off and Four Turns On


Two Turns Off and Four Turns On

Rezan Güler

Kungliga Tekniska Högskolan, KTH Royal Institute of Technology

School of Engineering Sciences in Chemistry, Biotechnology and Health

Stockholm, Sweden 2020


KTH Royal Institute of Technology

School of Engineering Sciences in Chemistry, Biotechnology and Health

Department of Protein Science

AlbaNova University Center

SE-106 91 Stockholm, Sweden

Cover picture adapted from work by Tim Zarechny

Printed by US-AB, 2020

ISBN 978-91-7873-505-1 TRITA-CBH-FOU-2020:15


- Charles Darwin

The universe is change: life is judgement.

- Marcus Aurelius


The notion of employing proteins as drugs traces back several decades. As recombinant DNA technology emerged, it became a powerful tool for the tailoring of protein traits via genetic approaches - so-called protein engineering. The application of such tools to develop proteins that bind specific molecular targets has seen remarkable clinical success, and today seven out of the ten top-selling drugs in the world belong to this class of proteins.

A well-investigated protein scaffold for generating novel target-binding moieties is the small staphylococcal protein A-derived affibody molecule. This thesis revolves around the engineering of affibody-based binding proteins that aim to influence the signaling network in the biological process of blood vessel formation, so-called angiogenesis. The first study in this thesis describes the engineering of a heterodimeric affibody molecule that targets the principal regulating receptor of angiogenesis, vascular endothelial growth factor receptor-2 (VEGFR2). Two separate affibody molecules that bind adjacent VEGFR2 epitopes were previously fused into one biparatopic construct, leading to a remarkable increase in apparent affinity. Further, the biparatopic protein here demonstrated inhibition of vascular endothelial growth factor A (VEGF-A) binding to VEGFR2, and consequently inhibition of VEGFR2 phosphorylation, proliferation and in vitro sprouting of endothelial cells. In the second study, the aim was to evaluate the biparatopic protein as a molecular imaging probe for in vivo visualization of VEGFR2 expression in a glioblastoma multiforme (GBM) brain tumor-model. The results displayed significantly higher probe uptake in tumor compared to normal brain tissue, with a two-hour post injection tumor-to-brain ratio of 78. In the third study, the goal was to mimic the ability of the natural ligand to agonize VEGFR2 via receptor dimerization, and also simulate presentation as extracellular matrix (ECM) bound factors. To this end, the dimeric antagonist was reformatted into a tetrameric construct, hypothesized to bridge two receptor units, and fused to recombinant spider silk. Interestingly, whereas the tetramer displayed agonistic effects both in soluble and immobilized form, the activity of the dimer shifted from antagonistic to agonistic when immobilized. 
In the fourth study, a combined in silico and directed evolution approach was used to increase the thermal stability and hydrophilicity of the biparatopic protein. The final construct demonstrated an increase in melting temperature of about 15°C, complete refolding after heat-induced denaturation and decreased uptake in normal tissues when evaluated in mouse biodistribution studies.

In conclusion, this thesis covers the development, characterization and engineering of VEGFR2-binding affibody molecules, aimed for use in research, therapy and diagnosis.

Keywords: Protein engineering, affibody molecules, VEGFR2, angiogenesis


Our bodies constantly perform remarkable functions. What makes this possible? Why are you able to see these letters? How do you move your fingers and limbs? Why does drinking milk give some people wind and others not? What carries the oxygen we breathe through our veins? How is your skin elastic enough to withstand grandma’s pinch? How does your immune system recognize and defend you against pathogens such as the coronaviruses?

These are all fascinating questions, and in one way or another, their answer involves proteins.

Practically all functions necessary to life rely on proteins, the blueprints for which lie encoded in the genes of our DNA. Proteins are vital to our well-being, and it is therefore no surprise that their malfunction is linked to many diseases.

Two types of proteins are particularly crucial to this thesis. First, proteins that receive and relay messages (receptors), which allow cells to communicate with and react to their surroundings. These often take part in controlling biological processes, such as the formation of blood vessels. Second, guided molecular missiles (affinity proteins) which can interact with distinct molecular targets - in our immune system pathogens are detected by affinity proteins called antibodies. Affinity proteins can be designed to target specific receptors and influence various biological processes, and are a popular class of proteins in drug development.

Using proteins as drugs is a relatively old idea that has now revolutionized medicine, treating many previously untreatable diseases. A hundred years ago, before the discovery of the protein insulin, the lack of effective treatment meant that diabetes resulted in severe disease and often early death. This is in stark contrast to the life of diabetic patients today. Proteins have proven to be valuable complements to the traditional small molecule drugs, as they are generally more specific in their actions, have fewer side effects and can stay in our blood for long periods of time.

Designing a protein drug may sound trivial. Going to the drawing board based on rational principles is appealing, but still very difficult to accomplish due to the complexity of proteins. Thankfully, nature has inspired us with its own impressive design process. It has been active for four billion years to create life as we know it and is called evolution.

In nature, organisms continuously mutate while being subjected to various environmental conditions that put pressure on their ability to survive, forcing them to either adapt or perish. Mimicking this process, researchers can mutate the DNA of a protein and control environmental conditions in the test tube to artificially place pressure on protein traits, thereby directing evolution to fit our needs.


Blood vessels are crucial for the healthy development of our bodies. They transport nutrients and oxygen-carrying red blood cells to distant tissues and play an important role in our immune system. However, vessel function is dependent on balanced growth; too much or too little can result in several diseases. Influencing and visualizing blood vessel growth (angiogenesis) is therefore valuable for therapeutic, diagnostic and research applications. This thesis describes the design and evaluation of affinity proteins that target a receptor involved in angiogenesis. These proteins thereby provide an opportunity to manipulate the function of a receptor involved in blood vessel formation.


Our bodies ceaselessly carry out phenomenal functions. What makes this possible?

Why can you see these letters? How do you move your fingers and limbs? Why do some people get gassy from drinking milk while others do not? What carries oxygen through our blood vessels? How is your skin elastic enough to withstand grandma's cheek pinch? How does your immune system recognize and defend you against pathogens such as coronaviruses? These are fascinating questions, and in one way or another their answers involve proteins. Practically all functions necessary for life rely on proteins, whose blueprints lie encoded in the genes of our DNA. Proteins are thus essential to our health, and it is therefore not surprising that their dysfunction is linked to many diseases.

Two types of proteins are particularly important for this thesis. The first is proteins that receive and relay messages (receptors), which allow cells to communicate with and react to their surroundings. These often take part in controlling biological processes, such as the formation of blood vessels. The second is guided molecular missiles (affinity proteins), which can interact with distinct molecular targets. In our immune system, for example, pathogens are detected by affinity proteins called antibodies. Affinity proteins can be designed to target specific receptors and influence various biological processes, and are a popular category of proteins in drug development. Using proteins as drugs is a relatively old idea that has now revolutionized the field of medicine and treats many previously untreatable diseases. For example, 100 years ago, before the discovery of the protein insulin, diabetes often meant severe disease or premature death. This stands in sharp contrast to the lives of diabetes patients today. Proteins have proven to be valuable complements to the traditional small molecule drugs, since they are generally more specific, have fewer side effects and can remain in our blood for a long time.

The complexity of proteins makes it challenging to create new protein drugs.

Fortunately, nature has inspired us with its own impressive design process. It has been active for four billion years to create life on Earth and is called evolution. In nature, organisms mutate continuously while being exposed to various environmental conditions that put pressure on their ability to survive, forcing them to either adapt or disappear. This process is mimicked in the test tube: researchers mutate the DNA of a protein and then control the conditions to artificially put pressure on protein traits, thereby directing evolution according to our wishes.

Blood vessels are crucial for the healthy development of our bodies. They transport nutrients and oxygen-carrying red blood cells to distant tissues and play an important role in our immune defense. However, blood vessel function depends on balanced growth; too much or too little can result in several diseases. Influencing and visualizing blood vessel growth (angiogenesis) is therefore valuable for therapy, diagnostics and research. This thesis describes the design and evaluation of affinity proteins that bind to a receptor involved in angiogenesis.

The first study presented in this thesis describes the construction of an affibody molecule that binds to a receptor regulating angiogenesis, vascular endothelial growth factor receptor-2 (VEGFR2). The affibody protein further demonstrated the ability to block the receptor from interacting with the growth factor VEGF-A. The aim of the second study was to evaluate the ability of the affibody protein to visualize VEGFR2 expression in a brain tumor model. The results showed significantly higher uptake in tumor compared with healthy brain tissue. In the third study, the goal was to give the affibody molecule a function that mimics that of the natural growth ligand. The molecule was therefore redesigned into a structure with the potential ability to activate the receptor and was coupled to spider silk. In the fourth study, a combination of in silico design and directed evolution was used to increase the thermal stability of the protein. The final protein showed an increase in thermal stability of about 15°C and complete refolding after heat-induced denaturation. In summary, the proteins presented here provide an opportunity to manipulate the function of a receptor involved in blood vessel formation.


This thesis is based on the following articles and manuscripts, referred to in the text by their Roman numerals. Full versions of the documents are appended at the end of the thesis.

I Fleetwood, F.*, Güler, R.*, Gordon, E., Ståhl, S., Claesson-Welsh, L., Löfblom, J. (2016), Novel affinity binders for neutralization of vascular endothelial growth factor (VEGF) signaling. Cell. Mol. Life Sci. 73:1671-1683.

II Mitran, B.*, Güler, R.*, P. Roche, F., Lindström, E., Kumar Selvaraju, R., Fleetwood, F., Rinne, S. S., Claesson-Welsh, L., Tolmachev, V., Ståhl, S., Orlova, A., Löfblom, J. (2018), Radionuclide imaging of VEGFR2 in glioma vasculature using biparatopic affibody conjugate: proof-of-principle in a murine model. Theranostics. 8(16):4462-4476.

III Güler, R.*, Thatikonda, N.*, Ali Ghani, H., Hedhammar, M., Löfblom, J. (2019), VEGFR2-Specific Ligands Based on Affibody Molecules Demonstrate Agonistic Effects when Tetrameric in the Soluble Form or Immobilized via Spider Silk. ACS Biomater. Sci. Eng. 5:6474-6484.

IV Güler, R., Flemming Svedmark, S., Abouzayed, A., Orlova, A., Löfblom, J. Increasing thermal stability and improving biodistribution of VEGFR2-binding affibody molecules by a combination of in silico and directed evolution approaches. Manuscript

*These authors contributed equally to this work.

All articles are reprinted by permission of the copyright holders.


I Planned, designed and performed the experimental work together with co-authors. Performed design, production and characterization. Wrote the manuscript together with co-authors.

II Planned and designed the experimental work together with co-authors. Performed design, production and characterization of imaging agents pre-labeling. Wrote the manuscript together with co-authors.

III Planned, designed and performed the experimental work together with co-authors. Performed design, production and characterization of proteins. Wrote the manuscript together with co-authors.

IV Planned, designed and performed the experimental work together with co-authors. Performed library construction, selections, screening and protein characterization. Wrote the manuscript with contribution of co-authors.


This thesis will be defended May 15th, 2020 at 10:00, Kollegiesalen, Brinellvägen 8, våningsplan 4, KTH-huset, KTH Campus, Stockholm, for the degree of "Teknologie doktor" (Doctor of Philosophy, PhD) in Biotechnology.

Respondent:

Rezan Güler, M.Sci. in Biotechnology

Department of Protein Science, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, Stockholm, Sweden

Faculty opponent:

Prof. Dr. Dario Neri

Department of Chemistry and Applied Biosciences, ETH Zürich, Zürich, Switzerland

Evaluation committee:

Prof. Masood Kamali-Moghaddam

Department of Immunology, Genetics and Pathology, Molecular tools, Uppsala University, Uppsala, Sweden

Assoc. Prof. Lars Jakobsson

Department of Medical Biochemistry and Biophysics, Vascular Biology, Biomedicum, Karolinska Institutet, Stockholm, Sweden

Docent Marika Nestor

Department of Immunology, Genetics and Pathology, Medical Radiation Science, Uppsala University, Uppsala, Sweden

Chairman:

Prof. Per-Åke Nygren

Department of Protein Science, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH, Royal Institute of Technology, Stockholm, Sweden

Respondent’s main supervisor:

Assoc. Prof. John Löfblom

Department of Protein Science, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH, Royal Institute of Technology, Stockholm, Sweden

Respondent’s co-supervisor:

Prof. Stefan Ståhl

Department of Protein Science, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH, Royal Institute of Technology, Stockholm, Sweden


ABD Albumin binding domain

ADA Anti-drug-antibody

CD Circular dichroism

CDR Complementarity determining region

DNA Deoxyribonucleic acid

ECM Extracellular matrix

ELISA Enzyme-linked immunosorbent assay

Fab Fragment antigen binding

FACS Fluorescence-activated cell sorting

Fc Fragment crystallizable

FcRn Neonatal Fc receptor

FDA Food and Drug Administration

HSA Human serum albumin

IgG Immunoglobulin G

IMAC Immobilized metal ion affinity chromatography

ka Association rate constant

kd Dissociation rate constant

KD Equilibrium dissociation constant

mAb Monoclonal antibody

PCR Polymerase chain reaction

PET Positron emission tomography

p.i. Post injection

SDS-PAGE Sodium dodecyl sulfate polyacrylamide gel electrophoresis

SPECT Single-photon emission computed tomography

SPR Surface plasmon resonance

Tm Melting temperature

VH Heavy chain variable domain

VL Light chain variable domain

VEGF Vascular endothelial growth factor

VEGFR Vascular endothelial growth factor receptor


Contents

Abstract
Popular scientific summary
Populärvetenskaplig Sammanfattning
List of appended articles and manuscripts
Public defense of dissertation
Abbreviations

1 Proteins - the molecules of life

2 Proteins in therapy and diagnosis
2.1 Important protein traits
2.1.1 Affinity and selectivity
2.1.2 Stability
2.1.3 Immunogenicity
2.2 Affinity proteins
2.2.1 Antibodies
2.2.2 Alternative scaffold proteins
2.2.3 Affibody molecules

3 Designing proteins
3.1 Protein engineering
3.1.1 Rational design
3.1.2 Directed evolution
3.1.3 Library diversity
3.2 Selection platforms
3.2.1 Phage display
3.2.2 Cell surface display
3.2.3 Staphylococcal cell surface display
3.2.4 Cell-free display

4 Angiogenesis in sickness and in health
4.1 Angiogenesis
4.2 VEGF/VEGFR signaling
4.3 Therapeutics and diagnostics - VEGF/VEGFR2

5 Present investigation
5.1 Study I - Novel affinity binders for neutralization of vascular endothelial growth factor (VEGF) signaling
5.2 Study II - Radionuclide imaging of VEGFR2 in glioma vasculature using biparatopic affibody conjugate: proof-of-principle in a murine model
5.3 Study III - VEGFR2-specific ligands based on affibody molecules demonstrate agonistic effects when tetrameric in the soluble form or immobilized via spider silk
5.4 Study IV - Increasing thermal stability and improving biodistribution of VEGFR2-binding affibody molecules by a combination of in silico and directed evolution approaches
5.5 Concluding remarks and future outlook

Acknowledgements

References


Proteins - the molecules of life

Life exists and endures thanks to a myriad of essential processes, many operated by the macromolecules we call proteins, one of the four building blocks of life1. Ever since 1838, when Berzelius coined the term protein (’of first order’)2, proteins have been subject to intense study, and their diverse functions have been found to be connected to virtually all biological processes3. During four billion years of natural selection, proteins have evolved to perform various complex functions, including molecular recognition of pathogens4, catalysis of otherwise inert reactions5, nerve signal transmission6, structural support7, metabolic regulation8, and molecule transportation and storage9.

There are four levels of protein structure (Figure 1.1). The subunits of proteins - amino acids - consist of a conserved central carbon, an amino group, a carboxyl group, and a variable functional side chain. Amino acids from a repertoire of 20 (excluding selenocysteine and pyrrolysine) combine, via peptide bonds, into linear chains in an order dictated by the corresponding genes, forming the primary structure of proteins. Backbone hydrogen bonds then spatially arrange the linear sequence to form the two rudimentary motifs of protein secondary structure, the α-helix and the β-strand. Further, several forces (e.g. the hydrophobic effect, covalent disulfide bonds, hydrogen bonds and van der Waals forces) organize secondary structure motifs into what is called the tertiary structure10. Finally, if two or more polypeptides combine, they produce a fourth level of organization named the quaternary structure.

Figure 1.1: A schematic representation of the four levels of protein structure. Primary structure - the sequence of amino acids that build up the polypeptide chain. Secondary structure - local chain folding creates α-helices and β-sheets. Tertiary structure - several secondary structures combine to form the tertiary structure. Quaternary structure - separate folded polypeptide chains further interact and organize into multi-subunit quaternary structures. Structures were adapted from PDB ID: 2OTK.

Proteins vary in size and length, spanning from around 20 to 33 000 amino acids11,12. Moreover, the distinct side chains of amino acids mediate properties such as hydrophobicity, charge, chemical reactivity and size. The intrinsic properties of a protein's amino acid chain together with environmental influences determine the folding fate of a protein. For example, folding of globular proteins is considered to be driven to a large extent by the hydrophobic effect13 (i.e. burial of hydrophobic amino acids in the core). Small changes in primary structure, such as the introduction of a polar amino acid in the hydrophobic core, can thereby lead to dramatically altered protein folding (e.g. misfolding and aggregation)14. This is important, as protein function is most often directly connected to the three-dimensional fold. Consequently, it is not surprising that misfolding is associated with a multitude of diseases such as prion disease, Alzheimer's, Parkinson's and Huntington's disease15.

Human cells have approximately 20 000 protein-coding genes in their genetic material16. The double-helical chains of deoxyribonucleic acid (DNA) consist of four different nucleobases: adenine (A), thymine (T), cytosine (C) and guanine (G). The two separate strands are held together by hydrogen bonds between A-T and C-G nucleobase pairs17. In cells, nucleotide sequences are read and transcribed into messenger RNA (mRNA), and consecutive sequence triplets (denoted codons) are translated according to the genetic code into the string of amino acids that create proteins18. This transmission of information from DNA to RNA to protein is referred to as the central dogma of molecular biology19.
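The flow of information from DNA to mRNA to protein can be illustrated with a minimal Python sketch. It is not part of the work presented in this thesis. It assumes the standard genetic code, takes the coding (sense) strand as input so that transcription reduces to replacing T with U, and ignores splicing, reading-frame search and start-codon detection. The function names and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, one letter per codon, with codons enumerated in
# T, C, A, G order for each of the three positions ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA corresponding to a DNA coding strand (T becomes U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate mRNA codon by codon from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")  # table is keyed by DNA letters
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical 12-nucleotide gene: Met-Ala-Lys followed by a stop codon.
mrna = transcribe("ATGGCTAAATAA")
peptide = translate(mrna)
```

Here the three codons AUG, GCU and AAA are read in order and translation stops at UAA, so the sketch returns a three-residue peptide in one-letter code.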

Following biosynthesis, the complexity of proteins is often further increased by post-translational modifications (PTMs). The localization, function and fate of proteins can be significantly affected by modifications such as ubiquitination, glycosylation, phosphorylation, proteolysis and lipidation20. While basically every cellular process involves proteins, their function is rarely executed in isolation. Most proteins take part in molecular interplay. Specific protein-protein interactions, as well as protein-nucleic acid interactions, form many delicate networks that control the biochemical events that result in life. The biological importance and remarkable capabilities of proteins to carry out tasks at a molecular scale have stimulated great scientific and industrial interest. In particular, their capacity for molecular recognition has been widely used to visualize, treat and study disease, a topic covered in the following chapter.


Proteins in therapy and diagnosis

Since the introduction of recombinant therapeutic insulin four decades ago21, proteins have gone from being a practically nonexistent class of drugs to now encompassing approximately 250 different agents that are FDA approved for clinical use22. Further, the current top-selling drug Humira, a mAb for treatment of autoimmune diseases, sold for around 20 billion USD in 2018. Advancements in recombinant DNA and protein engineering technologies have promoted the success of proteins as drugs by contributing to the ability to tailor protein function, characteristics, safety, and efficacy for therapeutic and diagnostic use23. Chapter 3 will cover different processes of protein engineering.

Proteins are generally several orders of magnitude larger than conventional small molecules23. The size and complexity of proteins are linked to their clinical potential and enable the treatment of previously untreatable diseases24. Compared to small molecule drugs, proteins (i) can perform more sophisticated functions (e.g. catalytic activity), (ii) can be engineered for higher specificity and affinity, (iii) are innate to the body and therefore often well tolerated, (iv) can replace faulty or missing proteins, and (v) can be recombinantly produced by microorganisms or mammalian cells.

Protein-based therapeutics can be grouped in many ways. One way is by their pharmacological activity (e.g. replacing a dysfunctional protein, affecting a signal pathway, introducing a novel function, delivering toxic or radioactive compounds, recruiting the immune system)25. Given the scope of this thesis, this chapter focuses on important characteristics of affinity proteins and gives examples of affinity protein scaffolds used in therapeutic and diagnostic applications.


2.1 Important protein traits

2.1.1 Affinity and selectivity

Affinity and selectivity are critical traits for protein therapeutics as they rely on molecular recognition. Binding affinity refers to the strength of the noncovalent interaction between a protein and its interaction partner, and in protein science, it is commonly reported as the equilibrium dissociation constant KD [M]. This constant can be expressed as follows, where at equilibrium, the concentrations of free protein and ligand are represented by [A] and [B] respectively, and the concentration of the two in complex is represented by [AB]:

$$K_D = \frac{[A][B]}{[AB]}$$

Further, the affinity can also be expressed as a function of the kinetic rate constants characterizing the interaction, kd (dissociation rate constant, s⁻¹) and ka (association rate constant, M⁻¹ s⁻¹):

$$K_D = \frac{k_d}{k_a}$$

High-affinity interactions typically combine fast association with slow dissociation kinetics. Several forces contribute to the affinity between a protein and its ligand: electrostatic interactions (ionic or salt bonds as well as hydrogen bonds), van der Waals interactions and hydrophobic interactions26–28. Affinity is an important trait for many applications. In vitro diagnostic assays may, for example, require high-affinity binding to reach sufficient sensitivity for detecting disease markers. Moreover, high affinity can mediate high potency and thereby allow for lower dosing in therapeutic settings29. In vivo molecular imaging generally favors high affinity between a labeled probe and its target ligand as it increases the signal-to-background ratio30. However, for some applications, less is more31. For example, in an effort to cross the blood-brain-barrier via binding to the transferrin receptor, reducing the affinity of anti-TfR antibodies led to an increase in receptor-mediated transcytosis32.
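The relationship between the kinetic rate constants and the equilibrium dissociation constant can be illustrated with a short Python sketch. The rate constants below are hypothetical round numbers chosen for illustration, not measured values from this thesis. The fraction of protein in complex follows directly from the definition of KD: rearranging KD = [A][B]/[AB] gives [AB]/([A] + [AB]) = [B]/([B] + KD), where [B] is the free ligand concentration at equilibrium.

```python
def equilibrium_dissociation_constant(ka: float, kd: float) -> float:
    """Return KD [M] from ka [M^-1 s^-1] and kd [s^-1]."""
    if ka <= 0 or kd < 0:
        raise ValueError("ka must be positive and kd non-negative")
    return kd / ka


def fraction_bound(free_ligand: float, k_d: float) -> float:
    """Fraction of protein A in complex at equilibrium.

    Follows from KD = [A][B]/[AB]: [AB] / ([A] + [AB]) = [B] / ([B] + KD),
    with free_ligand = [B] and k_d = KD, both in molar units.
    """
    return free_ligand / (free_ligand + k_d)


# Hypothetical binder: fast association, slow dissociation.
ka = 1e6    # M^-1 s^-1 (assumed value)
kd = 1e-3   # s^-1 (assumed value)
k_d = equilibrium_dissociation_constant(ka, kd)  # about 1e-9 M, i.e. 1 nM

half_occupancy = fraction_bound(k_d, k_d)      # free ligand equal to KD
high_occupancy = fraction_bound(9 * k_d, k_d)  # nine-fold excess over KD
```

When the free ligand concentration equals KD, half of the protein is in complex. This is why a lower KD translates into target occupancy at lower concentrations, in line with the lower dosing discussed above.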

As mentioned, one important characteristic of protein therapeutics is their capability to interact selectively with intended targets - a crucial feature for several reasons. High binding selectivity ensures low off-target binding and, coupled with high affinity, it allows for lower dosing, both leading to fewer adverse effects24,25. On the other hand, therapeutic targets are often engaged in both normal and disease-related processes.


by the high potency of protein therapeutics33. In summary, affinity and selectivity are important properties, hence they are often addressed in the engineering of affinity proteins (Chapter 3).

2.1.2 Stability

It is essential for the activity of a protein that it occupies its correct three-dimensional fold. Denaturation and refolding are the processes where a protein shifts between its active (native) and its non-active (denatured) state. For most proteins, denaturation is an irreversible process34,35. Protein stability can be seen as the degree to which the native state is energetically favored over the denatured states36. Most globular proteins are, however, only marginally stable, with a ∆Gfolding of about 10 kcal/mol37. Hence, they are highly susceptible to denaturation by environmental factors such as high temperature, pH, and denaturants36.

The thermal stability of proteins is often reported as the melting temperature (Tm), which is defined as the temperature at which half of the proteins occupy the native state and the other half occupy a denatured state. Typically, high thermal stability for proteins is beneficial for therapeutic and diagnostic applications - as low thermal stability can be detrimental in many ways. For example, it can lead to exposure of proteolytically susceptible regions and aggregation-prone regions, and further, indirectly cause immunogenicity, low production yield, and a significant non-active fraction of administered proteins38. As proteins are engineered for improved traits such as affinity, introduced mutations are more likely to be destabilizing than stabilizing. Hence, protein engineering (Chapter 3) efforts sometimes develop into a step-wise process where the enhanced protein has to be re-engineered for increased stability39.
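The definition of Tm can be made concrete with a minimal two-state sketch in Python. It is an illustration under simplifying assumptions, not the analysis used in this thesis: the protein is assumed to populate only a native and a denatured state, the unfolding enthalpy is assumed to be independent of temperature (no heat capacity change), and the enthalpy and Tm values are hypothetical. Under these assumptions the free energy of unfolding is ΔG(T) = ΔH(1 - T/Tm), which is zero at Tm, so that exactly half of the molecules are native there.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def unfolding_free_energy(temp_k: float, tm_k: float, delta_h: float) -> float:
    """Free energy of unfolding [kcal/mol], assuming a constant enthalpy."""
    return delta_h * (1.0 - temp_k / tm_k)


def fraction_native(temp_k: float, tm_k: float, delta_h: float) -> float:
    """Fraction of molecules in the native state for a two-state model."""
    delta_g = unfolding_free_energy(temp_k, tm_k, delta_h)
    k_unfold = math.exp(-delta_g / (R_KCAL * temp_k))  # [denatured]/[native]
    return 1.0 / (1.0 + k_unfold)


# Hypothetical protein: Tm = 330 K, unfolding enthalpy = 80 kcal/mol.
at_tm = fraction_native(330.0, 330.0, 80.0)     # half native by definition
below_tm = fraction_native(310.0, 330.0, 80.0)  # almost fully native
above_tm = fraction_native(350.0, 330.0, 80.0)  # almost fully denatured

# Effect of a 15 K higher Tm on the native fraction at 335 K.
stabilized = fraction_native(335.0, 345.0, 80.0)
original = fraction_native(335.0, 330.0, 80.0)
```

The last comparison shows, for these assumed parameters, how raising Tm shifts the population toward the native state at a given temperature.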

For most applications involving proteins, stability is a critical aspect, and its functional importance is further highlighted by the fact that many mutations that cause human disorders do so by destabilizing proteins15,40.

2.1.3 Immunogenicity

Compared to conventional small molecule drugs, protein therapeutics have a much higher risk of being immunogenic and giving rise to unwanted anti-drug-antibodies (ADAs). The resulting ADAs can lead to decreased efficacy of the administered protein and significant adverse effects, particularly involving induction of the immune system toward biologically important endogenous proteins, i.e. autoimmunity23,41.


Numerous factors can influence the immunogenicity of proteins; examples include protein sequence and structure, impurities, aggregation, degradation, frequency of administration, dosage, administration route and formulation25. ADAs can significantly impact the pharmacokinetic profile or efficacy of protein drugs, creating the need for increased dosage, which may, in turn, induce further ADA production. Protein aggregation has been demonstrated to increase the risk of immunogenicity. Degradation, denaturation and aggregation can lead to exposure of otherwise buried antigenic sites. Further, aggregation can lead to multimeric repetitive structures, which in turn can induce T-cell-independent B-cell activation42. Hence, aggregation is typically a critical factor in controlling immunogenicity.

Many safety and efficacy issues of protein therapeutics originate in immunogenicity. Despite considerable efforts, in silico prediction of potential immunogenicity is still very challenging41.


2.2 Affinity proteins

Nature has created a plethora of proteins that engage in protein-protein interactions to carry out wide-ranging functions. As discussed in the previous section, proteins possess a capacity for molecular recognition that allows specific interaction with virtually any type of antigen43. Thus, they have become essential detection and bioseparation tools for in vitro applications such as enzyme-linked immunosorbent assay (ELISA), western blotting, immunoprecipitation and protein purification. Further, affinity proteins that bind disease-related molecules are being increasingly used for diagnostic and therapeutic applications44. In therapeutics, examples of mechanisms of action are target protein neutralization, interference (e.g. blocking), payload delivery and recruitment of the immune system25.

Historically, antibodies have been the primary choice in both research and medicine. However, their frequent use has also exposed some inherent disadvantages. Hence, in the last two decades, a new generation of alternative scaffold proteins with complementing properties has emerged43. In general, these are created by taking a stable protein found in nature and using it as a template to either re-engineer existing binding surfaces or introduce novel ones.

This section will describe different affinity protein scaffolds, including antibodies to give a frame of reference, and their most important features. Further, examples of therapeutic use and their targets will also be mentioned. As affibody molecules have been used in the work presented in this thesis, emphasis will be on describing this particular scaffold.

2.2.1 Antibodies

Antibodies, or immunoglobulins, are Y-shaped molecules that are a key part of the humoral immune system. In response to pathogens and allergens, B-cells produce these multifunctional affinity proteins. Antibodies perform several important functions in the immune system, such as tight target binding and functional interference (neutralization), tagging pathogens and recruiting specialized immune cells (opsonization), and complement activation to perforate the target cell membrane45.

Humans have five distinct immunoglobulin isotypes (IgG, IgM, IgD, IgA, and IgE), with IgG being the most abundant in serum, making up 15-20% of all plasma proteins45. IgGs consist of four polypeptide chains: two identical heavy chains (H) and two identical light chains (L). The heavy chain is further divided into one variable domain (VH) and three or four constant domains, and the light chain is divided into one variable and one constant domain. These subdomains are covalently linked by disulfide bridges to form a complex bivalent homodimer with a molecular weight of circa 150 kDa46 (Figure 2.1A). Together, these characteristics contribute to making production, purification and characterization relatively costly and complicated47.

The antigen-binding interface of antibodies is comprised of hypervariable loops present at the N-terminus of the variable heavy and light chain (VH and VL), denoted complementarity determining regions (CDRs). There are three CDRs per variable domain (CDR1/2/3), leading to a total of twelve per antibody. Sequence differences between antibodies in the CDR loops confer their diverse and specific binding capabilities. The human antibody repertoire is generated by random recombination of variable gene segments, followed by affinity maturation through somatic hypermutation48.

Many cell types, such as blood vessel endothelial cells, perform pinocytosis, which ingests plasma proteins for lysosomal degradation. IgGs avoid this degradation by pH-dependent binding to the neonatal Fc receptor (FcRn)49. The low pH of the endosomes facilitates the binding of the IgG Fc region to FcRn, allowing IgG to hitchhike back to the extracellular neutral pH, where it is released. Due to this mechanism, the in vivo half-life of IgG molecules (21 days) is at least four times longer than that of other isotypes (3-5 days)50. The large size of antibodies also puts them well above the cut-off for renal filtration, which is about 60 kDa51. The Fc region also has binding sites for other receptors, which mediate, for example, activation of the immune system, including phagocytosis by macrophages, antibody-dependent cellular cytotoxicity (ADCC), and complement cascade initiation.
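As a rough illustration of what this difference in half-life means for exposure, the sketch below assumes simple first-order (exponential) elimination, which is a simplification of real antibody pharmacokinetics. The 21-day value is taken from the text; the 4-day value is an assumed midpoint of the stated 3-5 day range, and the function name is hypothetical.

```python
def fraction_remaining(days: float, half_life_days: float) -> float:
    """Fraction of the initial amount left after `days`.

    Assumption: single-compartment, first-order elimination,
    i.e. the amount halves once per half-life.
    """
    return 0.5 ** (days / half_life_days)


# 21 d is the IgG half-life quoted in the text; 4 d is an assumed
# midpoint of the 3-5 d range quoted for other isotypes.
for label, t_half in [("IgG", 21.0), ("other isotype", 4.0)]:
    left = fraction_remaining(14.0, t_half)
    print(f"{label}: {left:.2f} of the dose remains after 14 days")
```

Under these assumptions, roughly two thirds of an IgG dose would remain after two weeks, whereas less than a tenth of a shorter-lived isotype would.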

Many of the mentioned characteristics of antibodies are beneficial for certain applications while undesirable or redundant for others. The exceptionally long half-life and slow blood clearance of IgG is generally an advantage in therapeutic applications as it allows for less frequent administration or lower dosage to reach and maintain relevant concentrations in blood52. However, long half-life can also lead to lower signal-to-background (tumor-to-blood) ratios and occlusion of clear tumor visualization in molecular imaging of solid tumors, as well as increased non-target adverse effects from elevated exposure of radioactivity in healthy tissues in radiotherapy53. Further, the size of full-length antibodies imposes some limitations. Both tissue penetration and blood vessel extravasation correlate inversely with size. Thus, antibody diffusion and extravasation are lower compared to smaller scaffold proteins54.

One strategy to reduce the size of antibodies, and to avoid their effector functions and bivalency, is to split them into smaller units. Several antibody fragments that retain the capacity for molecular recognition have been developed. Examples include Fab fragments, scFvs and single-domain antibodies (i.e. nanobodies) of camel or shark origin. These derivatives vary in size from around 14 kDa for nanobodies and 27 kDa for scFvs to 55 kDa for Fab fragments (Figure 2.1B-D).

Figure 2.1: Schematic illustration of an (A) antibody (IgG) and different antibody fragments (B) Fab, (C) scFv and (D) single-domain so-called nanobodies.

Harnessing the immune system for therapeutic applications is an area of intense study, with much focus on utilizing antibody moieties. Bispecificity, where two specificities are introduced into one construct, offers new potential mechanisms of action. The monoclonal bispecific T-cell engagers (BiTEs) are fusion proteins made of two distinct scFvs from two antibodies. By binding tumor-associated antigens and cytotoxic T-cell receptors, they bridge the two, facilitating T-cell responses toward tumor cells55. Another example of engaging T-cells in killing cancer is the genetic re-engineering of T-cells to express chimeric antigen receptors, arming them with new specificities56. Further, fusing antibodies to cytokines (immunocytokines) for selective tumor delivery is yet another anti-cancer immune recruiting strategy57.

2.2.2 Alternative scaffold proteins

As exemplified in the previous section, the bivalency, long half-life, large size, disulfide-dependent structure and effector functions of IgGs are not always desirable characteristics. In order to circumvent some of the limitations of antibodies, non-immunoglobulin affinity proteins (or alternative scaffold proteins) have been increasingly explored and are now reaching the clinical stage58. Further, alternative scaffold proteins are considered to have a significant advantage in terms of commercial opportunities, since they are not limited by the intellectual property landscape of antibody engineering58,59.

As discussed earlier in this thesis, most natural proteins are involved in protein-protein interactions. However, there are essential characteristics that make a protein more or less appropriate for engineering approaches and as potential biological drugs. In general, reported alternative scaffold proteins have several elements in common: (i) a stable structure with high solubility and thermal stability, which is crucial for tolerating introduced variation without negatively influencing protein folding, (ii) a small, single polypeptide chain, which facilitates modularity and protein fusions, and (iii) high production yields at relatively low cost43. Further, many of the alternative scaffolds are cysteine-free, which provides several advantages such as cytoplasmic expression in bacteria and the introduction of cysteines for site-specific conjugation58. In summary, scaffold proteins aim to replicate desirable antibody features while avoiding unwanted ones.

Immunogenicity has to be considered for therapeutic applications, as some of the scaffolds have a non-human origin. Further, the mutations introduced during protein engineering may, in addition to altering specificity, also introduce novel immunogenic epitopes. Moreover, the small size and lack of FcRn-mediated recycling lead to a short half-life for these scaffolds. For applications where extended circulation times are desired, in vivo half-life extension can be mediated by, for example, fusion to albumin, albumin-binding proteins or an Fc domain, conjugation of PEG molecules, or protein oligomerization58.

Over the last two decades, the emergence of protein engineering has successfully introduced an abundant variety of scaffolds for creating novel binding moieties47. Here I will briefly give an overview of some extensively studied protein scaffolds.

Anticalins are derived from lipocalins, a broad family present in many organisms, which are relatively low-molecular-weight (160-180 amino acids) and robust proteins that transport and store a diverse set of molecules60. More specifically, anticalins are of human origin and use ApoD as a scaffold. Anticalins have a conserved beta-barrel structure and four loops that form a flexible pocket-shaped binding site (Figure 2.2A). Diversification and library generation is achieved by randomizing up to 24 amino acids in the loop regions. Libraries are then subjected to selections, for example by phage display, to isolate anticalins with new binding specificities61. Currently, there are three anticalins in clinical trials, including (i) a bispecific 4-1BB/HER2 binder for immuno-oncology indications, and (ii) an IL4Ra binder for asthma. There is one therapeutic anticalin that targets angiogenesis; it binds VEGF-A and has completed phase 1 studies62.


Adnectins, or monobodies, are based on the tenth fibronectin type III domain, a protein that is abundant in the ECM and plasma. Adnectins are small (around 100 amino acids), stable, naturally multimerized, and cysteine-free. Fn3 domains consist of two anti-parallel beta-sheets with three variable loops, similar to CDRs, exposed on each end of the structure63 (Figure 2.2B). In most diversification efforts, either two or three loops are subjected to variation to generate adnectins with novel specificities. The first adnectin to reach clinical investigation, CT-322, was a PEGylated adnectin that binds VEGFR2. It has completed phase II trials, where it failed to demonstrate clinical significance in the treatment of recurrent glioblastoma47,64.

Figure 2.2: Illustrations of several alternative scaffold proteins. (A) Anticalin (PDB ID: 4QAF), (B) Adnectin (PDB ID: 3QWR), (C) Knottin (PDB ID: 6MM4), (D) Avimer (PDB ID: 1AJJ), and (E) DARPin (PDB ID: 4YDY).

Knottins, or cystine-knot proteins, are a family of proteins with a similar fold. The cystine-knot motif is present in various lifeforms with functions such as antimicrobial activity, ion channel blockage and protease inhibition. Moreover, they are small proteins (30-50 amino acids) with high thermal, proteolytic and chemical stability. Here, the stabilizing molecular ’knot’ core is created by three separate disulfide bonds that promote structural tolerance for introduced variation. Further, knottins have loop regions that mediate molecular recognition (Figure 2.2C). Thus, the loops are often targeted for diversification to construct libraries for directed evolution efforts65. Two naturally derived knottins have been FDA-approved: Ziconotide, as an analgesic, and Linaclotide, for treatment of irritable bowel syndrome66. In addition, combinatorially engineered knottins are under development for use as cancer therapeutics and diagnostics67.

Avimers are derived from naturally occurring A-domains, which are strings of oligomerized domains present in numerous cell-surface receptors59. Natural binding targets include small molecules, proteins and viruses. Native A-domains consist of about 35 amino acids, where seven are conserved and mediate a stable structure through disulfide formations and calcium-binding (Figure 2.2D). The remaining 27 non-conserved amino acids contribute to molecular recognition and are targeted in the engineering of novel binding specificities68. Avimers are often developed as multimers during directed evolution. Linking multiple domains that recognize different epitopes on the same target (a similar strategy is covered in the present investigation of this thesis, study I-IV), creates a protein chain capable of high-affinity interactions through avidity effects. Phase 1 trials have been completed for an avimer-based (Trimeric A-domain) IL-6 inhibitor in the treatment of Crohn’s Disease.

DARPins, designed ankyrin repeat proteins, are based on 33 amino acid long repeat sequences that are commonly found in intracellular protein-protein interactions, and that are tightly stacked together, facilitating modular target-binding surfaces69. Moreover, two hydrophilic capping repeats at each end are included to increase stability. Further, every repeat is made of a beta sheet-turn followed by two antiparallel alpha-helices (Figure 2.2E). Analysis of natural ankyrin repeat proteins revealed residue positions that are involved in protein-protein interactions, and are thus attractive for diversification efforts and engineering of novel specificities. Each repeat thus consists of six variable positions that are important for recognition, and 27 structurally important, conserved positions70. There is one therapeutic anti-VEGF DARPin that targets angiogenesis, and it has completed phase 3 studies for the treatment of wet age-related macular degeneration71.

2.2.3 Affibody molecules

Affibody molecules are one of the earliest reported scaffold proteins, and the scaffold is originally derived from one of the immunoglobulin-binding domains of staphylococcal protein A (SPA)72,73. This small protein domain is a cysteine-free three-helix bundle of 58 amino acids that exhibits fast folding, high solubility and high stability. Further, the absence of cysteines enables thiol-mediated site-specific attachment of, for example, fluorophores, cytotoxic drugs and organic chelators74 to introduced cysteines. In the development of the affibody scaffold, the original protein A domain was mutated in two positions to improve proteolytic stability and reduce Fab binding75,76. Later, the non-binding part of the scaffold was further modified to increase hydrophilicity, facilitate solid-phase peptide synthesis, and finally, to reduce any residual interactions with Fab fragments77.

Generation of affibody molecules that selectively bind novel targets is typically achieved by randomization of the residues originally located in the binding interface between immunoglobulin and SPA. These 13 surface-exposed amino acids are located on helices one and two and mediate molecular recognition (Figure 2.3). Helix three is generally excluded from combinatorial engineering efforts as it mediates framework stability73. Generally, to isolate novel binders, unbiased libraries are used in phage display selections, while subsequent affinity maturation often involves cell-surface display-based methods. Since the first reported combinatorial engineering of an affibody molecule, binders have been generated to more than 40 targets78. Examples include fibrinogen, insulin, IL-17A, VEGFR2, HER2, EGFR, CD28 and amyloid-beta53.
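To give a sense of scale for randomizing 13 positions, the short calculation below counts the theoretical number of protein variants. It assumes that all 20 amino acids are allowed at every randomized position, which is an illustrative assumption and not a description of any specific affibody library design; the function name is hypothetical.

```python
def theoretical_diversity(positions: int, allowed_residues: int = 20) -> int:
    """Number of distinct protein variants when `positions` sites
    are each randomized over `allowed_residues` amino acids."""
    return allowed_residues ** positions


# 13 randomized positions, as stated in the text.
# Assumption: every position may hold any of the 20 amino acids.
variants = theoretical_diversity(13)
print(f"Theoretical variants for 13 positions: {variants:.2e}")
```

The result, on the order of 10^16 to 10^17 variants, is far beyond what can be covered physically by most display systems, which is one reason library design and sampling matter.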

Figure 2.3: The structure of an affibody molecule (PDB ID: 3MZW) with the thirteen typically randomized positions highlighted in helix 1 and 2 (study IV).

Affibody molecules typically exhibit high-affinity target binding while retaining the three-helical structure and stability of the parent scaffold78. Today, affinities after maturation are typically in the subnanomolar range, e.g. femtomolar affinity for IL-17A (300 fM), picomolar affinity for HER2 (22 pM79) and HER3 (21 pM80), and, as reported in this thesis, picomolar affinity for VEGFR2 (200 pM81) (study I-IV). All these affibody molecules show a similar alpha-helical content as the parental scaffold. However, there is one reported study of an affibody molecule with a relatively different structure and mode of binding. When selecting affibodies against an amyloid beta peptide, disulfide-bridged affibody dimers were isolated, which demonstrated an altered fold that sequesters its target by β-hairpin interactions82. Interestingly, this unique interaction has given rise to a new type of scaffold that is efficient for sequestering other amyloidogenic proteins such as α-synuclein83.

The extremely fast and reversible folding of affibody molecules typically enables tolerance to harsh chemical conditions and facilitates their use in applications such as bioseparation, molecular DNA technology and in vivo molecular imaging. Ideally, a molecular imaging probe for cancer diagnostics should exhibit characteristics such as rapid blood clearance, straightforward radionuclide labeling, high tumor retention, and low non-target mediated uptake84. Affibody molecules possess many of the properties suitable for molecular imaging. Their small size leads to fast tissue penetration and rapid renal clearance, which enhances imaging contrast. Affibody molecules exhibiting high specificity and affinity are readily generated towards various targets74. High specificity is important to minimize non-target accumulation, which increases signal-to-background ratios and avoids false positives. High target affinity generally improves tumor-to-background signal ratios. Further, low dissociation rate constants prolong retention times, which in turn facilitate the use of radioisotopes with longer physical half-lives85. Lastly, their high stability and refolding capabilities are important for retained target binding after being subjected to harsh radiolabeling conditions85.
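The link between a low dissociation rate constant and long retention can be made concrete with the standard first-order relation t½ = ln(2)/k_off. The sketch below applies it to a few k_off values that are hypothetical and chosen only for illustration; it describes dissociation of a simple 1:1 complex and ignores rebinding, internalization and avidity.

```python
import math


def dissociation_half_life(k_off_per_s: float) -> float:
    """Half-life (in seconds) of a 1:1 complex with first-order
    dissociation rate constant `k_off_per_s` (unit: 1/s)."""
    return math.log(2) / k_off_per_s


# Hypothetical k_off values, not measurements from the thesis.
for k_off in (1e-3, 1e-4, 1e-5):
    hours = dissociation_half_life(k_off) / 3600
    print(f"k_off = {k_off:.0e} 1/s -> complex half-life of {hours:.1f} h")
```

Each tenfold decrease in k_off gives a tenfold longer complex half-life, which illustrates why slowly dissociating binders are compatible with longer-lived radioisotopes.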

As a result of their well-suited features, affibody molecules have been extensively studied as in vivo imaging agents. The first and most well-investigated affibody for medical imaging targets HER2, and has reached late-stage clinical studies for positron emission tomography (PET) imaging of breast cancer. Here, fast blood clearance and high-contrast images were observed within 4-24 h post-injection. Importantly, this affibody does not compete with the binding of clinically used HER2-targeting antibodies such as Herceptin and Perjeta. Thus, patients already under treatment with these two drugs can still be stratified without interfering with the therapeutic effects of the drugs. Moreover, no anti-affibody antibodies were detected in completed trials86. Several other targets have been pre-clinically investigated for affibody-based imaging studies, including EGFR87, HER388, VEGFR289 and PDGFRβ90.

Typically, therapeutic applications require administered proteins to have long in vivo half-lives, and the small size of affibody molecules leads to rapid clearance through renal filtration. However, the modularity of affibody scaffolds facilitates multimerization to harness avidity effects, introduce multiple specificities or modify biodistribution characteristics84. Thus, adapting the half-life for therapeutic applications is often achieved by fusion to other domains (e.g. FcRn binders91, albumin-binding domains92,93, and Fc94). Blocking of protein-protein interactions and subsequent signaling is one of several mechanisms for achieving therapeutic effects. This strategy is exploited by a therapeutic affibody that is currently in phase II clinical trials for treatment of psoriasis. It is formatted as a dimeric IL-17 binder that has the two affibody moieties separated by an ABD.


Designing proteins

The fascinating and diverse capabilities of proteins have contributed to their position as an important resource in research, medical, and industrial applications24,95. Using proteins in non-natural settings for various purposes, however, creates a set of trait requirements that need to be fulfilled, dependent on the given application. The ability to construct such tailored proteins was essentially enabled by the advent of recombinant DNA technology, which allowed for altering of proteins at gene level and subsequent production in chosen host cells. Hence, protein engineering is today used as a tool to create and adapt proteins for specific applications.

Tailored traits, such as physicochemical, pharmacokinetic and catalytic properties, affinity, selectivity, and immunogenicity, can be attained by either rational or directed evolution-based approaches. Approaches that use detailed knowledge of a protein to implement specific changes expected to result in a given trait are broadly classified as rational design96. In contrast, introducing random or semi-random changes to create a large population of protein variants, and then harnessing the power of selection or screening to isolate proteins with the desired trait, is referred to as directed evolution97.

In the studies presented in this thesis, VEGFR2-binding affibody molecules have been engineered by a combination of rational design and directed evolution. This chapter will, therefore, focus on discussing these two strategies and how they complement each other in the design of affinity proteins.


3.1 Protein engineering

3.1.1 Rational design

Rational design-driven protein engineering often relies on detailed structural and empirical knowledge of a protein to predict how certain modifications could alter or introduce a specific feature96.

The rise of X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, and more recently cryo-EM has made three-dimensional structures of protein-protein complexes solved at near-atomic resolution more prevalent, and the total number of publicly available structures now exceeds 150,00098,99. The availability of structural information greatly facilitates the prediction of what changes to introduce and their expected effects. However, even with such insight, proteins are often difficult to tailor by rational design. The difficulty is due to a two-part problem: (1) the complexity of intramolecular and solvent interactions of amino acids, and (2) how amino acids cooperate to give rise to protein traits. As a result, even slight modifications in primary sequence can lead to dramatic unwanted conformational changes. Still, many concepts labeled as ’rational’ have been employed, including site-directed mutagenesis, recombinant fusions or deletions, chemical or enzymatic changes, and in silico protein design.

Site-directed mutagenesis has been broadly applied in structure-guided protein engineering since its introduction more than 40 years ago100,101. Its power has been successfully demonstrated in examples such as engineering enzyme stability102, activity103, and substrate selectivity and specificity104. When the necessary structural information is lacking, site-directed mutagenesis can also be used to deepen our understanding of specific protein traits, by allowing interrogation of single amino acids and their individual contribution to features of interest. A frequently used technique to accomplish this goal is to separately replace each of a selected subset of amino acids with alanine (alanine scanning), which is then coupled to screening of the investigated trait105,106.
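The bookkeeping behind an alanine scan can be sketched in a few lines: for each selected position, one variant is generated in which only that residue is replaced by alanine. The function name, the 1-based position convention and the example peptide below are hypothetical and serve only to illustrate the idea.

```python
def alanine_scan(sequence: str, positions: list[int]) -> dict[str, str]:
    """Return single-alanine variants of `sequence`.

    `positions` are 1-based residue numbers. Keys use the common
    mutation notation, e.g. "K2A". Positions that already hold
    alanine are skipped, since substitution would change nothing.
    """
    variants = {}
    for pos in positions:
        wild_type = sequence[pos - 1]
        if wild_type == "A":
            continue
        mutant = sequence[: pos - 1] + "A" + sequence[pos:]
        variants[f"{wild_type}{pos}A"] = mutant
    return variants


# Hypothetical peptide and positions, for illustration only.
for name, seq in alanine_scan("MKVDNAF", [2, 4, 6, 7]).items():
    print(name, seq)
```

Each variant would then be produced and screened separately, so that a change in the measured trait can be attributed to a single side chain.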

Recombinant fusions or deletions are excellent ways of either adding or removing a function from a protein of interest. Tags such as His6 and Strep-tag, or proteins such as the albumin-binding domain (ABD) and the fragment crystallizable region (Fc region), can easily be genetically combined with the protein of interest to, for example, facilitate purification, solubility or detection, or to increase the half-life in vivo. Combining the specificity of an affinity protein with a cytokine57 or a toxin can decrease off-target side effects in vivo, enabling higher dosing. Avidity effects can be harnessed by fusing affinity proteins whose binding epitopes are in close proximity to each other81,107.


Furthermore, genetic fusions can mediate attachment of proteins to different surfaces, which can alter their function and half-life (see study III). For example, growth factors (GFs) often exhibit short half-lives in soluble form due to degradation and receptor turnover108. Successful immobilization strategies include chemical conjugation, tag-mediated capture and self-assembling tags. One such self-assembling tag is partial spider silk, which facilitates the coating of surfaces109 with the active protein. This contributes to imitating the natural presentation of GFs and may also increase their in vivo half-life by decreasing internalization76,110.

By employing the chemical properties of individual amino acids, it is possible to engineer proteins by site-directed conjugation of functional groups such as fluorophores, toxic chemicals, biotin, polyethylene glycol, and chelators. Maleimide labeling of proteins utilizes the selective reaction of maleimide with the sulfhydryl groups present on cysteines111. In cysteine-free proteins, it is therefore common to introduce cysteines at the C- or N-terminus to facilitate conjugation of, for example, chelators, which enables labelling with radiometals.

An affinity protein coupled to a radiometal can, as previously mentioned, function as a tracer, providing a noninvasive method for diagnostic molecular in vivo imaging112.

De novo protein design is to date one of the most challenging approaches to protein engineering. One major obstacle is the vast space of possible sequences. For a relatively small protein of 50 amino acids, there exist 20^50 possible sequences, which can be contrasted with the number of naturally occurring proteins, which is on the order of 10^12. Further, since the principle underlying in silico design of new proteins is that "proteins assume the lowest free-energy state"113, examining the protein sequence and folding space computationally becomes infeasible with increasing sequence length. Nonetheless, the increase in computing power has pushed the field forward. Examples of de novo protein design include the design of a very stable protein icosahedron114, a retro-aldol enzyme115 and a protein-protein interaction complex that can heterodimerize with a Kd of 130 nM116. The difficult challenge of predicting how a given primary sequence dictates its three-dimensional structure is called the protein folding problem117. Artificial intelligence is now showing great potential in solving this issue. In 2018, Google’s DeepMind made its first appearance with AlphaFold in the CASP13 blind protein structure prediction assessment, where it outperformed other contenders by predicting 24 out of 43 free-modelling structures (structures with no available homologues) with high accuracy118. Computational predictions are, however, yet to be accurate enough to be efficient in areas such as protein-based drug design119.
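The sequence-space comparison made above can be checked with exact integer arithmetic. The sketch below computes the number of possible 50-residue sequences over the 20 standard amino acids and relates it to the order-of-magnitude figure of 10^12 natural proteins quoted in the text; the function name is hypothetical.

```python
def sequence_space(length: int, alphabet_size: int = 20) -> int:
    """Number of distinct sequences of `length` residues drawn
    from an alphabet of `alphabet_size` amino acids."""
    return alphabet_size ** length


possible = sequence_space(50)   # 20^50, as in the text
natural = 10 ** 12              # order of magnitude quoted in the text

print(f"Possible 50-residue sequences: {possible:.2e}")
print(f"Possible sequences per natural protein: {possible // natural:.2e}")
```

Python integers have arbitrary precision, so the power is computed exactly before being formatted for display.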


3.1.2 Directed evolution

The increasing abundance of structural information and improved comprehension of protein folding undoubtedly contribute to the advance of rational design. Nevertheless, in most situations, predicting the effects of modifications on a protein with a high success rate is yet to be achieved. This alludes to the two-part problem of the intra- and intermolecular complexity of amino acid interactions mentioned in the previous subsection. Today, a considerable limitation of the rational approach is its reliance on individually examining the effects of each introduced mutation96. For small sample sizes, this may not create any significant issues. However, the fact that amino acids often act collaboratively warrants the study of combined mutations, leading to a rapid expansion of sample size to an impractically large number. Fortunately, nature provides us with an ’irrational’ idea of how to both create and investigate large sequence spaces.

Natural evolution is a repetitive process where proteins are randomly mutated to create new variants, whose fitness is then tested under various environmental conditions. Following this principle, nature has adapted proteins to fulfill the complex functions necessary for life. Mimicking evolution by manipulating the diversifying process, environmental conditions and selection pressure is referred to as directed evolution, and can be used to adapt proteins to our needs.

Selection strategies generally consist of three ingredients: (I) creation of genetic diversity, commonly referred to as a library, (II) linking of the gene sequence (genotype) to the protein variant (phenotype), and (III) selection under conditions that target, and select for, a trait of interest120. Different sources of library diversity and available systems for linking genotype to phenotype will be discussed in the two following sections.

The first law of directed evolution, ’You only get what you screen for’, represents the difficulty of shaping selection conditions to target specific protein traits121,122. Every selection system has unintended biases that could reduce the efficiency of selection and result in enrichment of unwanted traits. Examples of bias-coupled protein traits include stability, folding efficiency, non-specific binding, and host toxicity. Creating a system that minimizes these inherent pressures in favor of the intended pressure is therefore important. The next challenge is setting an appropriate stringency. With too low stringency, all proteins pass through the selection; with too high stringency, not even the most promising candidates in the library will survive120.

Nonetheless, the flexibility of directed evolution has resulted in an incredibly powerful strategy to tailor the traits of proteins, and it can also increase the understanding of how different traits emerge. Studying the output of isolated protein variants can provide information for incorporation in rational approaches. Conversely, a rational element in the design of library diversity can significantly improve the number of functional proteins in the pool. Thus, a combination of the two approaches, often called semi-rational, is a common protein engineering strategy.

3.1.3 Library diversity

Library diversity is available from a multitude of both natural and artificial sources. Natural immune repertoires of animal and human origin hold great diversity from which antibody gene libraries can be isolated. Thereby, the immune system’s ability to select for specific antibodies can be exploited by first immunizing an animal with an antigen and then isolating its B-lymphocytes and extracting the genetic information to create a library. Unchallenged naïve B-cells can be used to generate universal antibody libraries, effectively evading the disadvantage of having to generate a new library for each antigen. In terms of library composition, in vivo immune system-based libraries are, in some ways, black box strategies with low flexibility that essentially limit the subjected proteins to antibodies and the subjected traits to affinity123.

Conversely, synthetic libraries allow for intricate modifications of library design to fit specific proteins and traits. For example, residues identified as critical for binding particular antigens can easily be integrated into the library to increase the probability of isolating high-affinity binders124. Further, residues that are critical for stability can be left untouched while the surrounding amino acids are mutated, increasing the number of functional proteins in the library74.

One popular synthetic method employs random diversification of libraries during DNA replication, denoted error-prone PCR (epPCR). Here, the fidelity of DNA polymerases is decreased by the addition of metal ions or nucleotide analogs125. Another approach to increase the error rate is the use of DNA polymerases that have been engineered for lower fidelity126,127. Random incorporation of mutations by epPCR has several advantages: it is cheap, readily available, and has a considerable element of serendipity. Although it is possible to target specific regions and control the frequency of mutagenesis, the non-targeted approach results in several limitations. For instance, there is a risk of introducing stop codons and of mutating structurally important regions such as the hydrophobic core, and there is an inherent bias in the introduced mutations due to the natural degeneracy of the genetic code.
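The stop-codon risk and the codon-level bias mentioned above can be illustrated by enumerating every single-nucleotide substitution of a codon and classifying the outcome using the standard genetic code. This is a simplified sketch: it treats all substitutions as equally likely, whereas real polymerases have their own mutational spectra, and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def single_nucleotide_outcomes(codon: str) -> dict[str, int]:
    """Classify the nine single-base substitutions of `codon`.

    Assumption: every substitution is equally likely, which real
    error-prone polymerases do not satisfy.
    """
    original = CODON_TABLE[codon]
    counts = {"synonymous": 0, "missense": 0, "nonsense": 0}
    for pos, base in product(range(3), BASES):
        if base == codon[pos]:
            continue
        encoded = CODON_TABLE[codon[:pos] + base + codon[pos + 1:]]
        if encoded == original:
            counts["synonymous"] += 1
        elif encoded == "*":
            counts["nonsense"] += 1
        else:
            counts["missense"] += 1
    return counts


for codon in ("TGG", "GCT"):  # Trp and Ala codons
    print(codon, CODON_TABLE[codon], single_nucleotide_outcomes(codon))
```

A tryptophan codon has no silent substitutions and two routes to a stop codon, whereas an alanine codon tolerates any change in its third base, which is one way in which the degeneracy of the code skews the amino acid changes that random point mutations can reach.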

Sexual gene recombination of mutants isolated from screening, selection or natural sources is referred to as DNA shuffling128. This process pairs well with epPCR by recombining enriched mutations to harness potential synergistic effects instead of exposing them to a new round of epPCR, which may introduce deleterious mutations. In principle, gene recombination followed by another round of selection effectively concentrates beneficial mutations and further eliminates harmful mutations122.

To increase the quality of the library, the library design process can be guided by structural, functional, predictive in silico and general information regarding the protein of interest. This is commonly referred to as a semi-rational or knowledge-based approach127. Finding specific hot-spot residues to target and limiting the amino acid diversity effectively reduces library size while increasing functionality129.
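How strongly a restricted amino acid set reduces library size can be sketched by comparing degenerate codon schemes. The example below expands IUPAC degenerate codons against the standard genetic code and reports how many codons, amino acids and stop codons each scheme encodes, together with the DNA-level library size. The choice of schemes (NNN, NNK, NDT) and of 13 randomized positions is made here for illustration and is not taken from a specific library described in the thesis; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC nucleotide ambiguity codes used in the schemes below.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "ACGT", "K": "GT", "D": "AGT"}


def expand(degenerate_codon: str) -> list[str]:
    """All concrete codons covered by a degenerate codon."""
    return ["".join(c) for c in product(*(IUPAC[b] for b in degenerate_codon))]


def summarize(degenerate_codon: str, positions: int) -> dict[str, int]:
    """Codon-level statistics and DNA library size for a scheme."""
    codons = expand(degenerate_codon)
    encoded = [CODON_TABLE[c] for c in codons]
    return {
        "codons": len(codons),
        "amino_acids": len(set(encoded) - {"*"}),
        "stop_codons": encoded.count("*"),
        "dna_library_size": len(codons) ** positions,
    }


# Illustrative comparison for 13 randomized positions.
for scheme in ("NNN", "NNK", "NDT"):
    stats = summarize(scheme, positions=13)
    print(scheme, stats["codons"], "codons,", stats["amino_acids"], "aa,",
          stats["stop_codons"], "stop,", f"{stats['dna_library_size']:.1e} DNA variants")
```

Moving from NNN to a reduced scheme such as NDT removes stop codons and shrinks the theoretical DNA library by many orders of magnitude, at the cost of sampling fewer amino acid types per position.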


3.2 Selection platforms

So far, we have discussed how to emulate the first step in nature’s evolutionary design of proteins, diversification. Following the generation of genetic variation in nature, proteins are exposed to environmental conditions that test their fitness. An observable trait (phenotype) that provides a survival advantage allows the underlying genetic information (genotype) to be inherited by the offspring.

The linkage of genotype and phenotype connects the protein trait to its gene sequence and is a main requirement of any selection system. There is an abundance of strategies for achieving this goal, ranging from mammalian cell display to direct conjugation of proteins to their encoding nucleic acid sequences. They can be broadly divided into (i) cell-dependent display systems, which use a host organism for expression and for carrying the genetic information, and (ii) cell-free display systems, which translate and link genotype to phenotype in a cell-free manner. Each technology has its own advantages and limitations that should be considered depending on the application130. For example, the obtainable library size varies between methods and can be an important factor affecting the probability of isolating interesting proteins from a given selection. The characteristics and utilization of each system will be discussed separately in their corresponding sections.

The final ingredient in mimicking the evolutionary process is the artificial application of selection pressure to target specific protein attributes. However, every selection system comes with inherent background pressures that bias the outcome, including variations in protein expression levels, host toxicity, unspecific binding and other ways of circumventing the intended selection pressure. Hence, the experimental design and conditions have to be carefully planned to select for the intended attributes120.
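How repeated selection rounds concentrate a rare variant, and how a weaker effective selection pressure slows that process, can be illustrated with a toy enrichment model. The sketch below is an illustration under stated assumptions rather than a description of any specific display system: it assumes that each round multiplies the ratio of desired to undesired clones by a constant enrichment factor, and all numbers and names are hypothetical.

```python
def fraction_after_rounds(initial_fraction, enrichment_factor, rounds):
    """Fraction of desired clones after each selection round.

    Toy model (assumption): one round multiplies the ratio of desired to
    undesired clones by `enrichment_factor`.
    Returns a list with the starting fraction followed by one value per round.
    """
    fractions = [initial_fraction]
    fraction = initial_fraction
    for _ in range(rounds):
        enriched = enrichment_factor * fraction
        fraction = enriched / (enriched + (1.0 - fraction))
        fractions.append(fraction)
    return fractions


if __name__ == "__main__":
    # Hypothetical library with one desired clone per million.
    for factor in (100.0, 10.0):
        trajectory = fraction_after_rounds(1e-6, factor, rounds=4)
        print(factor, [f"{f:.2e}" for f in trajectory])
```

In this model, a lower effective enrichment factor, for instance because unspecific binders are co-selected, means that more rounds are needed before the desired clones dominate the pool.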
