Structural Insights at Sub-Ångstrom, Medium and Low Resolution:


Crystallization of Trypsin, Bacterioferritin, Photosynthetic Reaction Center, and Photosynthetic Core Complex

WEIXIAO YUAN WAHLGREN

Department of Chemistry and Molecular Biology – Biochemistry Göteborg, Sweden

2012


Cover: Overview of 4-fold pore of bacterioferritin from Blastochloris viridis

Copyright@2012 by Weixiao Yuan Wahlgren ISBN 978-91-628-8539-7

Available online at http://hdl.handle.net/2077/30181

Department of Chemistry and Molecular Biology
Biochemistry and Biophysics
University of Gothenburg
SE-413 90 Göteborg, Sweden

Printed by Kompendiet
Göteborg, Sweden 2012


To Erik and Johan


The catalytic action of serine proteases depends on the interplay of a nucleophile, a general base and a general acid. The catalytic triad is composed of serine, histidine and aspartate residues. The serine acts as the nucleophile, while the histidine plays a dual role as general base or general acid at different steps of the reaction. The role of the aspartate, however, is unclear. Here, an ultrahigh-resolution (0.93 Å) X-ray structure of a complex formed between trypsin and a canonical inhibitor is discussed.

At sub-ångstrom resolution, hydrogen atoms can be visualized, giving a clue to the protonation states of the catalytic residues. By comparing the experimental electron density with the theoretical electron density calculated by density functional theory, the protonation states of the catalytic histidine and aspartate are discussed, and a refined mechanism for serine protease action is proposed in this thesis.

Photosystems harvest energy from sunlight with a quantum yield of nearly 100%. To study light-induced structural changes of the photosynthetic reaction center from the purple non-sulfur bacterium Blastochloris viridis using X-ray crystallography, robust protein crystals with tight crystal packing are a prerequisite. In this thesis, the lipidic-sponge phase crystallization method was used and yielded well-diffracting crystals for structure determination. The crystals showed type I packing, and a structure was determined at 1.86 Å resolution with four lipid molecules captured in the structure.

Moreover, I demonstrated that an occupied QB binding site can be obtained by co-crystallization with UQ2 using the sponge phase crystallization method. However, crystallizing the reaction center-light harvesting 1 core complex, a 440 kDa membrane protein complex with a total of 54 putative subunits, required different crystallization methods. Here, the resolution was improved to beyond 8 Å by using the lipidic bicelle crystallization method.

The conflict between free but potentially toxic Fe(II) and the insolubility of Fe(III) led to the evolution of bacterioferritin in bacteria, which functions as an iron storage and detoxification protein.

Bacterioferritin from Blastochloris viridis (Bv Bfr) was crystallized and the structure was solved to 1.58 Å resolution. By combining the X-ray structure with degenerate PCR and tandem mass spectrometry, the previously unknown amino acid sequence of Bv Bfr was determined.

Conformational states of the ferroxidase center, which undergoes reorganization upon different soaking treatments, were trapped. One water-like small ligand coordinated to the Fe1 binding site was captured in the Fe(II)-soaked structure. The character of this small ligand was rationalized by density functional theory calculations. In addition, the structure and the mechanism of iron import of the protein were studied and discussed. Finally, the redox state of the heme in crystals with and without Fe(II) soaking was studied by single-crystal UV-VIS microspectrophotometry, before and after X-ray exposure.


Paper I. I was involved in data analysis, preparing figures and writing of the manuscript.

Paper II. I took a major part in the protein production, purification, crystallization and X-ray data collection. I took part in the crystallographic analysis and preparation of figures and manuscript.

Paper III. I took a major part in the work involving reaction center from Blastochloris viridis and the corresponding lipidic-sponge phase conditions.

Paper IV. I was involved in the entire project. I was responsible for the sequencing, production, purification and crystallization of the protein. I planned the soaking experiments and performed the X-ray data collection, single crystal UV-VIS microspectroscopy, structural analysis and density functional theory calculations. I took a major part in the interpretation of the results, preparation of figures and writing of the manuscript.

This thesis is based upon the following papers:

I. Wahlgren WY, Pal G, Kardos J, Porrogi P, Szenthe B, Patthy A, Graf L, Katona G. (2011) The Catalytic Aspartate Is Protonated in the Michaelis Complex Formed between Trypsin and an in Vitro Evolved Substrate-like Inhibitor - A REFINED MECHANISM OF SERINE PROTEASE ACTION. J. Biol. Chem. 286 (5), 3587-3596.

II. Wöhri AB, Wahlgren WY, Malmerberg E, Johansson LC, Neutze R, Katona G. (2009) Lipidic Sponge Phase Crystal Structure of a Photosynthetic Reaction Center Reveals Lipids on the Protein Surface. Biochemistry 48 (41), 9831-9838.

III. Wöhri AB, Johansson LC, Wadsten-Hindrichsen P, Wahlgren WY, Fischer G, Horsefield R, Katona G, Nyblom M, Oberg F, Young G, Cogdell RJ, Fraser NJ, Engström S, Neutze R. (2008) A lipidic-sponge phase screen for membrane protein crystallization. Structure 16 (7), 1003-1009.

IV. Wahlgren WY, Omran H, von Stetten D, Royant A, van der Post S, Katona G. Structural characterization of bacterioferritin from Blastochloris viridis. Accepted for publication at PLoS ONE.

Related publications:

Johansson LC, Arnlund D, White TA, Katona G, et al. (2012) Lipidic phase membrane protein serial femtosecond crystallography. Nature Methods 9, 263–265

Lundholm I, Wahlgren WY, Piccirilli F, Lupi S, Perucchi A, Katona G. Terahertz Absorption Change in a Photosynthetic Reaction Center upon Photoactivation. Submitted to Biophysics Journal

Other publication:

Rapali P, Radnai L, Suveges D, Harmat V, Tolgyesi F, Wahlgren WY, Katona G, Nyitray L, Pal G. (2011) Directed Evolution Reveals the Binding Motif Preference of the LC8/DYNLL Hub Protein and Predicts Large Numbers of Novel Binders in the Human Proteome. PLoS ONE. 6 (4)

ADP    adenosine-5’-diphosphate
AFM    atomic force microscopy
ATP    adenosine-5’-triphosphate
BChl    bacteriochlorophyll a
BPhe    bacteriopheophytin a
Bv Bfr    bacterioferritin from Blastochloris viridis
Bfr    bacterioferritin
CHAPS    3-([3-cholamidopropyl]-dimethylammonio)-1-propanesulfonate
CHAPSO    3-([3-cholamidopropyl]-dimethylammonio)-2-hydroxy-1-propanesulfonate
DDM    n-dodecyl-β-D-maltopyranoside
DFT    density functional theory
DMPC    1,2-dimyristoyl-sn-glycero-3-phosphocholine
DMPG    1,2-dimyristoyl-sn-glycero-3-phospho-rac-glycerol
DMSO    dimethyl sulfoxide
FI    fluid isotropic
kDa    kilodalton
LCP, Q    lipidic cubic phase
LDAO    lauryldimethylamine-N-oxide
LH1    light harvesting 1 complex
LH2    light harvesting 2 complex
LSP, L3    lipidic-sponge phase
Lα    lamellar phase
MAG    monoacylglycerol
MO    monoolein
MPD    2-methyl-2,4-pentanediol
MQA, QA    menaquinone
MW    molecular weight
NMR    nuclear magnetic resonance
P    special pair
PDB    protein data bank
PEG    polyethylene glycol
Pi    inorganic phosphate
PPO    pentaerythritol propoxylate
P960    special pair of Blastochloris viridis
QB, UQB    ubiquinone
UQBH2    ubiquinol
UQ2    ubiquinone-2
UQ9    ubiquinone-9
RC-LH1vir    photosynthetic reaction center and light harvesting 1 core complex from Blastochloris viridis
RC    reaction center
RCvir    reaction center from Blastochloris viridis
TM    transmembrane
Å    angstrom (10⁻¹⁰ m)

1 Introduction
1.1 The nature of enzymatic catalysis
1.2 The origin of photosynthesis
1.3 Iron storage and detoxification
1.4 Scope of the Thesis
2 Methodologies
2.1 Degenerate Polymerase chain reaction
2.2 Protein crystallization
2.2.1 Protein crystal growth
2.2.2 Membrane protein crystallization
2.2.3 Lipidic-cubic phase crystallization (LCP)
2.2.4 Lipidic-sponge phase (LSP)
2.2.5 Lipidic bicelle crystallization
2.2.6 Crystal packing
2.3 Protein X-ray crystallography
2.3.1 Structure determination
2.3.2 The phase problem
2.3.3 Structure refinement and validation
2.3.4 Data-quality validation and high-resolution cut-off
3 Trypsin: an example of serine protease catalysis (paper I)
3.1 Serine protease catalysis
3.2 Co-crystallization
3.3 Density functional theory
4 Photosynthetic reaction center and reaction center – light harvesting 1 core complex from Blastochloris viridis (paper II, III)
4.1 Photosynthetic reaction center from Blastochloris viridis (paper II and III)
4.1.1 The purification and lipidic-sponge phase crystallization of RCvir
4.1.2 Lipids bound to the protein surface
4.1.3 QB binding site (unpublished result)
4.2 Crystallization of reaction center – light harvesting 1 complex from Blastochloris viridis
4.2.1 Purification of RC-LH1vir core complex
4.2.2 Lipidic-sponge phase screen
4.2.3 Further optimization of lipidic-sponge phase crystallization method (unpublished results)
results)
4.2.6 Future perspective
5 Structure of bacterioferritin from Blastochloris viridis (paper IV)
5.1 Discovery and structure determination
5.2 DNA and amino acid sequencing
5.3 Soaking experiments
5.4 Ferroxidase site
5.5 Three- and four-fold pore
5.6 Single crystal UV-VIS microspectrophotometry
6 Acknowledgements
7 References

1 Introduction

Have you ever wondered about the origin of life on Earth? Although we are far from certain about how life arose, it is generally accepted that the development of life started with chemical evolution nearly four billion years ago [1]. In the reducing atmosphere on Earth at that time, simple geologically occurring molecules reacted with each other and formed organic polymer complexes [2, 3]. This was followed by a stage in which these collections of polymers self-organized and formed replicating units. At some point in this process, the transition from a lifeless collection to a living system occurred. Eventually, through biological evolution, the complex network of modern life was formed.

In the reducing atmosphere of the prebiotic Earth, reactions among simple molecules are thought to have formed the organic precursors from which biological molecules such as polypeptides and polynucleotides developed [2, 3]. Minerals such as clays may have played an important role as catalysts of these chemical reactions.

1.1 The nature of enzymatic catalysis

A catalyst works by providing an alternative reaction pathway between substrate and product. The rate of the reaction is increased because this alternative route has a lower activation energy than the reaction path without the catalyst. The enormous variety of biochemical reactions that comprise life are nearly all mediated by a series of biological catalysts named enzymes. These biological catalysts are one of the remarkable outcomes of biological evolution, and they differ from ordinary chemical catalysts in several important aspects. Compared with the most efficient analogous chemical catalysts, enzymes mediate reactions under milder conditions but with reaction rates that are at least several orders of magnitude greater. Furthermore, they have greater specificity for both their substrates and products than chemical catalysts and rarely yield side products. Finally, they have a high capacity for regulation in response to external signals other than substrates and products [4]. Considering all these remarkable catalytic properties, one of the central questions of biochemistry arises: how do enzymes work?

Compared to an uncatalyzed reaction, a catalyst stabilizes the transition state of the reaction, thus lowering the height of the kinetic barrier and increasing the reaction rate. Enzymes are unique in that they are able to combine specific substrate binding with an optimal arrangement of catalytic groups. To date, six types of catalytic mechanisms that enzymes employ have been classified: acid-base catalysis, covalent catalysis, metal ion catalysis, electrostatic catalysis, proximity and orientation effects, and finally preferential binding of the transition state complex [4]. Serine proteases are one of the best characterized groups of enzymes, and their catalytic mechanisms have been examined in detail.
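The link between a lower kinetic barrier and a higher rate can be made concrete with a small calculation. The following is a minimal sketch, assuming simple Arrhenius behaviour, k = A·exp(-Ea/RT), and the same pre-exponential factor A for the catalyzed and uncatalyzed pathways; the function name and the example barrier reductions are illustrative and are not taken from this thesis.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol*K)


def rate_enhancement(barrier_reduction_kj_per_mol, temperature_k=298.15):
    """Return k_catalyzed / k_uncatalyzed for a given drop in activation energy.

    Assumes Arrhenius behaviour, k = A * exp(-Ea / (R*T)), with an identical
    pre-exponential factor A for both pathways (a simplifying assumption).
    """
    delta_ea = barrier_reduction_kj_per_mol * 1000.0  # kJ/mol -> J/mol
    return math.exp(delta_ea / (GAS_CONSTANT * temperature_k))


if __name__ == "__main__":
    # Illustrative barrier reductions only; not measured values.
    for reduction in (5.7, 20.0, 40.0):
        factor = rate_enhancement(reduction)
        print(f"{reduction:5.1f} kJ/mol lower barrier -> {factor:.3g}-fold faster")
```

Under these assumptions, every reduction of roughly 5.7 kJ/mol in the barrier corresponds to about one order of magnitude in rate at room temperature, which is why even modest transition-state stabilization has a large kinetic effect.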


1.2 The origin of photosynthesis

The process of photosynthesis is possibly the most important chemical reaction on Earth and has led to the development of advanced life forms. The most accepted hypothesis for the origin of photosynthesis is that it developed in Bacteria after their divergence from Archaea and Eukarya [5]. Photosynthesis derives from five photosynthetic bacterial lineages.

Four of these lineages are anoxygenic: purple bacteria (non-oxygen-evolving type II photosystem), green sulphur bacteria (type I photosystem), green non-sulphur bacteria (non-oxygen-evolving type II photosystem) and heliobacteria (similar homodimeric type I reaction center). Among them, the purple bacteria were most likely the earliest emerging photosynthetic lineage [5, 6].

The fifth and only oxygenic photosynthetic bacterial lineage is the cyanobacteria (with both a type I photosystem and an oxygen-evolving type II photosystem), which were shown to be late-evolving [6]. It is widely accepted that the photosynthetic properties of eukaryotes, in the form of chloroplasts, developed from cyanobacteria through endosymbiosis [7]. Coinciding with Earth being dominated by cyanobacteria, the oxygen level on Earth started to rise.

1.3 Iron storage and detoxification

Iron is one of the most abundant metals on Earth. It has two stable oxidation states, Fe(II) and Fe(III). Depending on the environment, the two states are readily interconvertible, which makes iron an extremely useful redox mediator in biology [8]. Iron is involved in a variety of critical processes, such as respiration, photosynthesis, nitrogen fixation and DNA synthesis. It can be incorporated in protein molecules in the form of heme, bound to sulfur in various types of iron-sulfur clusters, as mononuclear iron centers or as di-iron centers [9].

In an oxidizing atmosphere, oxidation of the free ferrous ion to the solid ferric form creates two problems. First, Fe(II) is rapidly oxidized by oxygen to Fe(III) and produces reactive oxygen species through the Fenton reaction, which subsequently trigger oxidative damage. Second, Fe(III) has a low solubility of around 10⁻¹⁸ M under physiological conditions, which makes it unavailable as an iron source. Therefore, an efficient iron storage and release mechanism is required in living organisms, and ferritins fulfill this role.

Fe²⁺ + H₂O₂ → Fe³⁺ + OH⁻ + •OH    (Fenton reaction)

Ferritins constitute a broad superfamily of iron-storage proteins, widespread in all kingdoms of life. By controlling a reversible transition between the free ferrous ion in solution and the mineral ferric core inside their cavity, ferritins supply living cells with an effective iron concentration in the range of 10⁻³ to 10⁻⁵ M [10]. Additionally, by isolating excess iron inside the cavity, away from oxidizing molecules, ferritins minimize the production of harmful oxidative species in the cell. Therefore, ferritins are presumed to have an iron detoxification function, and they can be involved in cellular resistance to redox stress [9]. Ferritins isolated from bacteria may also contain a heme and are then called bacterioferritins (Bfr).
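To put the concentrations quoted above in perspective, they can be converted into an approximate number of iron ions per cell. The sketch below is purely illustrative arithmetic: the cell volume of 1 femtolitre is an assumed order-of-magnitude value for a bacterial cell and does not come from this thesis, and the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23   # particles per mole
CELL_VOLUME_L = 1e-15      # ASSUMPTION: ~1 femtolitre bacterial cell volume


def ions_per_cell(concentration_molar, cell_volume_l=CELL_VOLUME_L):
    """Average number of ions in one cell volume at the given molar concentration."""
    return concentration_molar * AVOGADRO * cell_volume_l


if __name__ == "__main__":
    cases = {
        "free Fe(III) at its solubility limit (1e-18 M)": 1e-18,
        "ferritin-buffered iron, lower bound (1e-5 M)": 1e-5,
        "ferritin-buffered iron, upper bound (1e-3 M)": 1e-3,
    }
    for label, molar in cases.items():
        print(f"{label}: {ions_per_cell(molar):.3g} ions per cell")
```

With the assumed cell volume, the solubility limit of Fe(III) corresponds to far less than one ion per cell on average, whereas the ferritin-buffered range corresponds to thousands to hundreds of thousands of ions, which illustrates why a dedicated storage protein is needed.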


1.4 Scope of the Thesis

The aim of this thesis has been to combine different biochemical methods with protein X-ray crystallography in order to obtain detailed structural and biochemical information on proteins. Furthermore, lipidic crystallization methods have been developed and evaluated in an attempt to obtain diffraction-quality crystals of membrane protein complexes.

Although serine proteases are among the best characterized enzymes with respect to the detailed description of their catalytic mechanisms, questions such as the protonation states of the catalytic residues at different steps of the reaction remain unresolved. An ultrahigh-resolution (0.93 Å) X-ray structure of a co-crystallized complex formed between the serine protease trypsin and a canonical inhibitor is discussed in this thesis. The electron density distributions of covalent bonds in the key catalytic residues were analyzed and compared with theoretical electron densities produced by density functional theory calculations. Combined with partially visible densities corresponding to putative hydrogen atoms, the protonation states and functions of these catalytic residues are discussed and a refined mechanism for serine protease action is proposed.

Photosynthesis, which converts sunlight into chemical energy, is one of the most important biological processes in nature. To study light-induced structural changes of the photosynthetic reaction center from the purple non-sulfur bacterium Blastochloris viridis using X-ray crystallography, robust protein crystals with tight crystal packing are a prerequisite. The protein was purified from its native source with a modified purification protocol, and the lipidic-sponge phase crystallization method was applied. The reaction center was used as a model protein to develop a lipidic-sponge phase screen and to optimize a modified sponge phase form.

It is still a challenge to study large membrane protein complexes using X-ray crystallography. Often, the bottleneck is the production of well-diffracting crystals. Traditional vapour diffusion, lipidic-cubic phase, lipidic-sponge phase and bicelle crystallization have all been used in attempts to crystallize the reaction center-light harvesting 1 core complex (RC-LH1vir). By using the bicelle crystallization method, the resolution of the crystal structure of this 440 kDa membrane protein complex has been improved.

Bacterioferritins provide an accessible store of iron in a mineralized ferric form. Simultaneously, they reduce the concentration of toxic free Fe(II). Bacterioferritin from Blastochloris viridis (Bv Bfr) was accidentally crystallized while we worked on the RC-LH1vir complex. By combining the X-ray structure with degenerate PCR and mass spectrometry, the previously unknown amino acid sequence of Bv Bfr was determined. Multiple soaking experiments were performed with the aim of studying the mechanism of iron transport of the protein. Conformational changes around the ferroxidase center upon different soaking treatments were captured in X-ray crystallographic structures. Density functional theory calculations and single crystal UV-VIS microspectrophotometry were applied to assist the interpretation of the structural details.

With a wide range of methodologies, structural and biochemical details of three different classes of proteins have been investigated. Additionally, different lipidic crystallization methods are discussed with a focus on the crystallization of a large membrane protein complex.


2 Methodologies

2.1 Degenerate Polymerase chain reaction

Polymerase chain reaction (PCR) is a biochemical technique that amplifies DNA to obtain numerous copies of a particular DNA sequence. The method consists of repeated cycles of DNA melting, primer annealing and enzymatic replication of the required DNA sequence. Degenerate PCR is fundamentally identical to ordinary PCR, except for one major difference: instead of specific PCR primers with a given sequence, degenerate primers are used, in which one or more positions have several alternative nucleotides. This method has proven to be a tremendously powerful tool for identifying new members of gene families [11].

Most genes within a family encode proteins sharing structural similarities. Bacterioferritin from Blastochloris viridis was accidentally crystallized while attempting to crystallize the reaction center-light harvesting 1 core complex of this organism (paper IV). In the absence of genomic information for Blastochloris viridis, the initial guess of the amino acid sequence was derived directly from the electron density map. In addition, by aligning the amino acid sequences of proteins from related organisms, conserved and variable regions can be identified. Once the sequences of the conserved regions are identified, their degeneracy can be determined by taking the product of the degeneracies of the individual amino acids in the sequence. For example, valine has four codons (GTT GTC GTA GTG) and thus a degeneracy of 4, while tryptophan has only one codon (TGG) and a degeneracy of 1. Methionine and tryptophan are the only amino acids that are coded by a single codon, while serine, arginine and leucine are coded by six codons and should therefore be avoided if possible [12-14].

For PCR, two conserved regions are needed to locate the forward and reverse primers. Once conserved amino acid regions with low degeneracy are chosen, they can be back-translated to the corresponding nucleotide sequences, which are used as starting points for designing degenerate PCR primers. So-called "wobbles" are inserted in the PCR primers at positions where more than one nucleotide is possible, and one pair of degenerate PCR primers is constructed. Degenerate PCR primers normally cover 5 to 7 amino acids, and the two priming regions should lie fairly close together (200 bp to 600 bp apart). PCR efficiency will drop if these regions are too far apart or the primers are too long. "Tails" can be added to the 5' ends of the degenerate primers and help to increase their PCR efficiency by increasing primer length and hence annealing temperature. Tails including restriction sites can be used for directional cloning. Alternatively, tails ending with terminal G's will encourage Taq polymerase to add overhanging A's for use in TA subcloning.

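The degeneracy calculation and the back-translation with wobble positions described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard genetic code (NCBI translation table 1) and IUPAC ambiguity codes for the wobbles; the function names and the example peptide are hypothetical and are not the primers or sequences used in paper IV.

```python
from itertools import product
from math import prod

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid (one-letter code) to the list of codons that encode it.
CODONS = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO_ACIDS):
    CODONS.setdefault(aa, []).append(codon)

# IUPAC nucleotide ambiguity codes used to write the "wobble" positions.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def degeneracy(peptide):
    """Product of the number of codons for each amino acid in the peptide."""
    return prod(len(CODONS[aa]) for aa in peptide)


def degenerate_sequence(peptide):
    """Back-translate a peptide, collapsing each codon position to one IUPAC code.

    Note: for six-codon amino acids (S, R, L) the per-position collapse also
    covers codons of other amino acids, so it overestimates the true degeneracy.
    """
    letters = []
    for aa in peptide:
        for position in range(3):
            letters.append(IUPAC[frozenset(c[position] for c in CODONS[aa])])
    return "".join(letters)


if __name__ == "__main__":
    peptide = "WDMNKE"  # hypothetical conserved stretch, for illustration only
    print(peptide, degeneracy(peptide), degenerate_sequence(peptide))
```

Scanning an alignment for stretches of 5 to 7 residues with the lowest `degeneracy` value is one simple way to shortlist candidate priming regions before designing the primers by hand.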

2.2 Protein crystallization

2.2.1 Protein crystal growth

The goal of crystallization is to produce well-ordered crystals which diffract when hit by an X-ray beam. It is often the rate-limiting step in protein crystallography, particularly for membrane proteins. The growth conditions for a protein crystal are unpredictable, and a systematic approach is essential.

The basic principle of protein crystallization is to alter the composition of the protein solution in a controlled way so that the protein precipitates and forms diffraction-quality crystals. In order to obtain crystals, the protein solution has to reach a supersaturated state, as described in the phase diagram (Figure 1). The supersaturated region can be divided into three zones: the metastable, nucleation and precipitation zones. If the precipitation process occurs too fast, the protein solution jumps to the precipitation zone, and amorphous precipitate, which lacks any internal order among the protein molecules, is usually observed. To obtain crystals, nuclei containing a small number of well-ordered protein molecules have to form in the nucleation zone. Afterwards, crystal growth spreads outwards from the nucleation site while the protein solution stays in the metastable zone. The successful production of diffraction-quality crystals depends on all parameters influencing crystallization, such as the type and concentration of precipitants, protein concentration, pH, buffer, temperature, detergent, protein purity and more.

Figure 1. Phase diagram [15]. The concentrations of purified protein and precipitant in the droplet increase and reach the nucleation zone (1). Nucleation occurs and the crystal starts to grow. The concentration of the protein in the solution decreases (2).

The most commonly used method for protein crystallization is vapour diffusion. There are three different experimental arrangements, known as the hanging-drop, sitting-drop and sandwich-drop methods (Figure 2). In a hanging-drop experiment, the reservoir contains the precipitant solution. A droplet containing purified protein mixed with precipitant solution from the reservoir is placed onto a glass cover slide, which is then sealed over the reservoir so that the drop hangs above the reservoir solution. Initially, the droplet contains a lower concentration of precipitant than the reservoir. Since the system strives towards equilibrium, water diffuses from the drop to the reservoir, and both the protein and precipitant concentrations gradually increase towards a level suitable for crystallization in the supersaturated state, where nucleation starts (Figure 1). The same principle applies to the sitting-drop and sandwich-drop setups.
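The concentration changes during vapour diffusion can be estimated with simple mass-balance arithmetic. The sketch below is an idealized model, not a description of the experiments in this thesis: it assumes that only water leaves the drop, that the protein stock contains no precipitant, that the reservoir is large enough for its concentration to stay constant, and that equilibrium is reached when the precipitant concentration in the drop equals that of the reservoir. All names and example values are illustrative.

```python
def vapour_diffusion_endpoint(protein_stock, reservoir_precipitant,
                              protein_volume_ul=1.0, reservoir_volume_in_drop_ul=1.0):
    """Estimate initial and equilibrium concentrations in a vapour diffusion drop.

    ASSUMPTIONS (idealized): only water is exchanged, the protein stock holds
    no precipitant, and the reservoir concentration does not change.
    Concentration units are whatever the caller uses (e.g. mg/ml and % w/v).
    """
    initial_volume = protein_volume_ul + reservoir_volume_in_drop_ul
    protein_initial = protein_stock * protein_volume_ul / initial_volume
    precipitant_initial = (reservoir_precipitant * reservoir_volume_in_drop_ul
                           / initial_volume)
    # The drop shrinks until its precipitant concentration matches the reservoir.
    final_volume = initial_volume * precipitant_initial / reservoir_precipitant
    protein_final = protein_initial * initial_volume / final_volume
    return {
        "protein_initial": protein_initial,
        "precipitant_initial": precipitant_initial,
        "final_volume_ul": final_volume,
        "protein_final": protein_final,
        "precipitant_final": reservoir_precipitant,
    }


if __name__ == "__main__":
    # Illustrative values: 10 mg/ml protein mixed 1:1 with a 20 % precipitant reservoir.
    for key, value in vapour_diffusion_endpoint(10.0, 20.0).items():
        print(f"{key}: {value:.2f}")
```

For a drop mixed 1:1, this model predicts that the drop volume halves and both concentrations double by the time equilibrium is reached, provided no nucleation or precipitation removes protein from solution first.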

Figure 2. The vapour diffusion experiment. In a hanging-drop setup (A), the protein/precipitant drop is turned upside-down. In a sitting-drop setup (B), a pedestal is used to separate the protein/precipitant drop from the reservoir solution. In a sandwich-drop setup (C), the protein/precipitant drop is located between two slides with a spacer [16].

The key to forming a crystal is the crystal contacts, where the protein molecules come into close contact and adhere to each other at specific points. Forming such contacts is particularly difficult for membrane proteins.

2.2.2 Membrane protein crystallization

There are several challenges when working with membrane proteins, such as overexpression in high yields, purification with intact function, and crystallization. The difficulties originate from the intrinsic nature of membrane proteins, which consist of a hydrophilic part facing the aqueous environment and a hydrophobic surface embedded in a lipid bilayer (Figure 3).

Figure 3. The biological membrane. The lipid bilayer is shown in light grey, with polar head groups as circles that are connected by fatty acid tails. Integral or peripheral membrane proteins are shown in dark grey [15].

With the help of detergents, membrane proteins can be extracted from the native lipid bilayer or refolded in an aqueous solution while the arrangement of their hydrophilic and hydrophobic parts is preserved. In aqueous solution, detergent molecules form a micelle around the hydrophobic surface of the protein, thereby helping to stabilize the membrane protein (Figure 4). In nature, membrane proteins have evolved to be sufficiently stable in a membrane full of lipids. However, they have not evolved to be stable in an environment of detergents. Thus, membrane proteins often have poor stability in detergent solutions, especially in detergents with short aliphatic chains and small or charged head groups, which are the detergents best suited for growing diffraction-quality membrane protein crystals.

Consequently, detergents used exclusively for membrane protein purification, such as DDM, often have longer aliphatic chains and big head groups. They are mild and more efficient at stabilizing membrane proteins and keeping their function intact, but they prevent the necessary crystal contacts.

Figure 4. Solubilized membrane protein in solution and crystal form. Hydrophobic parts of the membrane proteins are covered by detergent molecules in solution (left). In crystalline form, crystal contacts are mainly formed by hydrophilic parts of the membrane proteins (right) [15].

While purified detergent-solubilized membrane proteins are still required as a starting point, approaches, including lipidic-cubic phase, lipidic-sponge phase, and bicelle crystallization methods, have been developed to immerse purified membrane protein within a lipid-rich matrix before crystallization. This environment is hypothesized to contribute to the protein’s long-term structural stability and thereby favor crystallization [17].

2.2.3 Lipidic-cubic phase crystallization (LCP)

During the last decade, several alternative methods of membrane protein crystallization have been established to solve the problems of using detergents. In 1996 the concept of the lipidic-cubic phase crystallization technique was introduced. The idea is to bring the membrane proteins back into a lipidic environment after they have been purified with the help of detergents, thus facilitating crystallization [18]. Monoacylglycerols (MAGs) form the most widely used and best studied bi-continuous lipidic-cubic phases, and of these, monoolein (MO) is the most frequently used for membrane protein crystallization (Figure 5 A). When MO is mixed with water, a range of phases can be formed depending on the temperature and the water/lipid ratio.

Bi-continuous cubic phases (Q) are spontaneously formed when the protein solution is mixed with MO. The detergent molecules are presumably integrated into the lipids, and the membrane protein molecules are incorporated into the highly curved cubic phase. When precipitant solution or salt is added, water molecules are osmotically withdrawn from the interior of the cubic phase. Gradually, the bilayer curvature increases, and the bilayers cluster together and locally flatten regions of the cubic phase into planar lamellar (Lα) stacks into which the proteins move [19]. When the protein molecules pack together and associate, they can eventually nucleate and grow into crystals (Figure 7 A).


Figure 5. Phase map. MO (A) and schematic phase map for the solvent/additive-MO-H2O system and schematic representation of the water channels with surrounding MO bilayers (B). The three phase regions are indicated: lamellar phase (Lα), cubic phase (Q) and sponge phase (L3). To the right is shown a highly curved bilayer of the cubic phase, a reduced curvature of the bilayer of the sponge phase and a flat lamellar phase. The figure is adapted from [16].

2.2.4 Lipidic-sponge phase (LSP)

Almost ten years later, a liquid analogue of the LCP, the lipidic-sponge phase, was developed by Caffrey et al. [20]. Usually, the protein diffusion rate in the lipid bilayer is low. In order to make it easier for the proteins to diffuse in the lipid bilayer, the highly curved cubic phase can be flattened out by several methods. One option is to add solvents or other additives, such as jeffamine, polyethylene glycol (PEG) or 2-methyl-2,4-pentanediol (MPD), to the MO/water system in order to create a sponge phase (L3 phase) [21]. A MO-solvent-water system starts in the cubic phase. When solvent is added, the area exposed to the aqueous domains increases and the bilayer interaction decreases. The cubic phase swells until a liquid phase lacking long-range order is formed. The sponge phase can be regarded as a diluted or melted cubic phase with aqueous pores two to three times larger than those in the cubic phase (Figure 5 B) [22]. Since it contains large aqueous pores, membrane proteins with larger hydrophilic domains may be incorporated more easily in this phase than in the cubic phase.

2.2.5 Lipidic bicelle crystallization

Bicelles are a mixture of aliphatic long chain lipids, between 12 and 18 carbons, and short chain lipids or detergent, 6 to 8 carbons [23]. The morphology of bicelle is fairly adaptable depending on composition, temperature and hydration. The most recognized phase behavior of a bicelle mixture is a nano-disc with the long chain lipids present in majority in the disc plane and the short chain lipids or detergents mainly distributed in the torus of the disc (Figure 6 A). Within the last 20-30 years bicelles have been a membrane model system developed in the NMR field, where bicelles with neutral or charged lipids self-align parallel or perpendicular in a magnetic field depending on the charge of the lipid [24]. Bicelle sample can be rapidly spun at variable angles in a magnetic field [25]. Hence, orientation, structure and dynamic of membrane proteins inserted into these bicelles can be studied by both solid-state NMR and multidimensional liquid-state NMR spectroscopy [26, 27].

In 2001, Salem Faham and James U. Bowie presented bicelle crystallization as another lipidic crystallization technique [28]. Well-diffracting crystals of bacteriorhodopsin were grown from bicelles with a method that is flexible and simple to use. Bicelles tend to form small bilayer disks at low temperature and appear to form a perforated lamellar phase at higher temperature [29].

Detergent-purified membrane proteins can be readily reconstituted into the bilayered discoidal lipid-detergent assemblies, where they are maintained in a native-like bilayer environment (Figure 6 B). Because the protein-bicelle mixture is liquid-like at low temperatures such as 4 °C, the functionality of a membrane protein in bicelles can easily be measured before the crystallization setup in order to optimize the bicelle composition [30]. Moreover, protein-bicelle mixtures can be handled as easily as membrane protein-detergent solutions, making bicelles compatible with all commercially available screens and standard equipment, including high-throughput crystallization robots. A number of membrane proteins have now been successfully crystallized using the bicelle method, including bacteriorhodopsin, β2 adrenergic receptor, voltage-dependent anion channel, xanthorhodopsin and rhomboid protease [31-35].

Figure 6. Bicelles. A bicelle in the form of a small bilayer disk (A) is formed by lipids with a head group (dark grey) and double tails, and detergent molecules with a head group (white) and single tails. Membrane proteins can be reconstituted into the bilayered bicelle disk (B). Crystals of the reaction center-light harvesting 1 complex grown with the bicelle crystallization method (C).

2.2.6 Crystal packing

Three-dimensional crystals of membrane proteins can pack in two different ways: as type I crystals or as type II crystals (Figure 7). Protein molecules in type I crystals are packed in lipid monolayers and stabilized via hydrophobic interactions. The monolayers are then stacked on top of each other and stabilized via both hydrophobic interactions and hydrophilic interactions between the polar groups of the proteins. Type I crystals tend to diffract well and are more stable and robust. This is the type of crystal that usually forms when crystallizing with the lipidic phase methods. Type II crystals are often grown with the conventional detergent vapour diffusion technique. The hydrophilic regions of the proteins interact with each other and leave large solvent spaces around the hydrophobic parts.

Figure 7. Crystal packing. Type I crystal packing (A): membrane proteins (grey) packed in lipid (white head group) monolayers, and type II crystal packing (B): membrane proteins interact with each other by their hydrophilic regions [16].

2.3 Protein X-ray crystallography

The most commonly used method for determining the three-dimensional structure of a protein is X-ray crystallography. A crystal is a solid in which the constituent atoms, molecules or ions are packed in a regularly ordered, repeating pattern extending in three dimensions. When the crystal is exposed to X-rays, the photons interact with the electrons of the atoms and are scattered in all directions. The scattered beams are either reinforced or extinguished by constructive or destructive interference, producing a diffraction pattern. Bragg’s law gives the condition for constructive interference:

$$ n\lambda = 2d\sin\theta \qquad \text{(Equation 1)} $$

n is an integer

λ is the wavelength of the X-ray beam used

d is the distance between the planes in the atomic lattice

θ is the angle between the incident X-ray beam and the scattering planes
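As a worked illustration of Bragg's law, the following minimal Python sketch converts between the Bragg angle and the plane spacing. The function names and the numerical values (1.0 Å X-rays, planes 0.93 Å apart) are assumptions chosen for illustration and are not taken from the experiments described in this thesis.

```python
import math


def bragg_spacing(wavelength, theta_deg, n=1):
    """Plane spacing d from n*lambda = 2*d*sin(theta); d has the unit of wavelength."""
    if not 0.0 < theta_deg <= 90.0:
        raise ValueError("theta must lie in (0, 90] degrees")
    return n * wavelength / (2.0 * math.sin(math.radians(theta_deg)))


def bragg_angle(wavelength, d, n=1):
    """Bragg angle theta in degrees for a plane spacing d."""
    sin_theta = n * wavelength / (2.0 * d)
    if sin_theta > 1.0:
        raise ValueError("no reflection possible: n*lambda exceeds 2*d")
    return math.degrees(math.asin(sin_theta))


# Illustrative values (assumed): 1.0 A X-rays and lattice planes 0.93 A apart.
theta = bragg_angle(1.0, 0.93)
print(f"theta = {theta:.2f} degrees, scattering angle 2*theta = {2 * theta:.2f} degrees")
```

The sketch shows why finer detail (smaller d) appears at larger scattering angles: for a fixed wavelength, sin θ grows as d shrinks.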

The diffraction of X-rays through the closely spaced lattice of atoms in a crystal is recorded and then analyzed to reveal the nature of that lattice, which leads to an understanding of the molecular structure. To be able to solve the structure, a complete dataset needs to be collected by rotating the crystal in order to cover most parts of the reciprocal space. The data collected from a diffraction experiment is a reciprocal space representation of the crystal lattice.

Figure 8. Illustration of Bragg’s law [15].

2.3.1 Structure determination

In crystallography, the structure factor (static structure factor) describes how a crystal scatters incident radiation resulting in diffraction spots. It is a mathematical function defining the amplitude and phase of a wave diffracted from crystal lattice planes characterized by Miller indices h,k,l. The electron density ρ(x,y,z) at a position x,y,z in the unit cell can be fully described by the inverse Fourier transformation of the structure factor:

$$ \rho(x,y,z) = \frac{1}{V}\sum_{hkl} |F(hkl)|\, e^{i\alpha(hkl)}\, e^{-2\pi i(hx+ky+lz)} \qquad \text{(Equation 2)} $$

Here V is the volume of the unit cell, h, k, l are the Miller indices, |F(hkl)| is the amplitude of the structure factor for the reflection hkl, α(hkl) is its phase angle, and x, y, z define a position in the unit cell. However, only the intensity of each diffraction spot can be recorded experimentally, and it is proportional to the square of the structure factor amplitude, |F(hkl)|².
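The Fourier synthesis of the electron density can be sketched for a one-dimensional toy crystal. The example below is only an illustration under stated assumptions: two point atoms with made-up fractional positions and scattering weights, a unit cell of volume 1, and hypothetical function names. It computes structure factors, rebuilds the density from amplitudes and phases, and shows that a synthesis from amplitudes alone no longer locates the atoms.

```python
import cmath
import math


def structure_factors(atoms, h_max):
    """F(h) for point atoms given as (fractional position, scattering weight)."""
    return {h: sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms)
            for h in range(-h_max, h_max + 1)}


def density(F, x, volume=1.0):
    """rho(x) = (1/V) * sum_h |F(h)| * exp(i*alpha(h)) * exp(-2*pi*i*h*x)."""
    total = 0j
    for h, Fh in F.items():
        amplitude, alpha = abs(Fh), cmath.phase(Fh)
        total += amplitude * cmath.exp(1j * alpha) * cmath.exp(-2j * math.pi * h * x)
    return total.real / volume


# Assumed toy model: a lighter atom at x = 0.20 and a heavier one at x = 0.55.
atoms = [(0.20, 6.0), (0.55, 8.0)]
F = structure_factors(atoms, h_max=15)
grid = [i / 200 for i in range(200)]

# With amplitudes and phases the heaviest atom is found at its true position.
rho = [density(F, x) for x in grid]
peak_x = grid[max(range(len(grid)), key=rho.__getitem__)]

# The experiment only yields intensities |F(h)|^2; the phases are lost.
intensities = {h: abs(Fh) ** 2 for h, Fh in F.items()}
amplitudes_only = {h: math.sqrt(I) for h, I in intensities.items()}
rho_no_phase = [density(amplitudes_only, x) for x in grid]
wrong_peak = grid[max(range(len(grid)), key=rho_no_phase.__getitem__)]

print("peak with phases:", peak_x, " peak without phases:", wrong_peak)
```

In this toy case the phase-less synthesis peaks at the origin instead of at an atom, which is the practical reason initial phases have to be obtained by other means.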

2.3.2 The phase problem

The data obtained by X-ray diffraction lack phase information, so the electron density cannot be calculated directly from them. Several methods have been developed to obtain initial phases.

Experimental phasing methods such as MAD (multi-wavelength anomalous diffraction) and SAD (single-wavelength anomalous diffraction) can be used, together with more traditional methods such as MIR (multiple isomorphous replacement) and SIR (single isomorphous replacement) [36].

However, most protein structures are determined by using phases from an existing structure of a homologous protein or fragment. Owing to the remarkable growth in the number of protein structures available in the Protein Data Bank (PDB), molecular replacement is the most frequently used technique to overcome the phase problem. The Patterson function

$$ P(u,v,w) = \frac{1}{V}\sum_{hkl} |F(hkl)|^{2} \cos[2\pi(hu+kv+lw)] \qquad \text{(Equation 3)} $$

is an alternative expression of Equation 2. It represents an interatomic vector map containing one peak for each pair of atoms and is created by squaring the structure factor amplitudes and setting all phases to zero. Patterson maps are generated both for the data from the unknown structure and for the model, the structure of a previously solved homologue.

Subsequently, a rotation function is evaluated in Patterson space, while a translation search is applied in real space to move the model through the unit cell. In the correct orientation and position within the unit cell, the two Patterson maps should be closely correlated and the initial phases can be obtained. The quality of the solutions can then be evaluated by a standard linear correlation coefficient or an R-factor.
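The interatomic-vector character of the Patterson function can be illustrated with a one-dimensional sketch. The toy atoms, the grid, the peak threshold and the function name below are assumptions made for illustration; this is not the molecular replacement procedure itself, only the Patterson synthesis that underlies it.

```python
import cmath
import math


def patterson(intensities, u, volume=1.0):
    """P(u) = (1/V) * sum_h |F(h)|^2 * cos(2*pi*h*u), i.e. all phases set to zero."""
    return sum(I * math.cos(2.0 * math.pi * h * u) for h, I in intensities.items()) / volume


# Assumed toy model: two point atoms at fractional positions 0.20 and 0.55.
atoms = [(0.20, 6.0), (0.55, 8.0)]
h_max = 15
intensities = {
    h: abs(sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms)) ** 2
    for h in range(-h_max, h_max + 1)
}

n = 200
grid = [i / n for i in range(n)]
P = [patterson(intensities, u) for u in grid]

# Keep local maxima (cyclic neighbours) that rise above 30% of the origin peak.
threshold = 0.3 * P[0]
peaks = [grid[i] for i in range(n)
         if P[i] > threshold and P[i] > P[(i - 1) % n] and P[i] > P[(i + 1) % n]]
print("Patterson peaks at u =", peaks)
```

Apart from the origin peak, the map peaks at u = 0.35 and its symmetry mate 0.65, the vector between the two atoms, although no phase information was used.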

2.3.3 Structure refinement and validation

Once phases have been obtained, the first electron density map of the unknown structure can be calculated and a preliminary model can be built into the density map. The model is then refined by iterating between automated refinement in reciprocal space and manual real space refinement.

The agreement of the model with the data is evaluated with the help of the crystallographic R-factor:

$$ R = \frac{\sum_{hkl} \big|\, |F_{obs}(hkl)| - |F_{cal}(hkl)| \,\big|}{\sum_{hkl} |F_{obs}(hkl)|} \qquad \text{(Equation 4)} $$

Here Fobs is the experimental structure factor derived from the observed data and Fcal is the structure factor calculated from the model. As refinement progresses, the value of the R-factor should decrease, since ideally Fobs and Fcal would be identical. In practice, a value of about one tenth of the highest resolution (in Å) is generally considered a reasonable R-factor. An additional factor, Rfree, is commonly used for cross-validation. It is calculated in the same way as the R-factor, but from a small fraction of the reflections (usually 5%) that is never included in the refinement process. Rfree is thus a less biased indicator of the quality of the model.

Large discrepancies between these two values can be interpreted as implying that the refined structural model might be over-fitted to the data during the refinement.
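The R-factor and the work/free split can be sketched in a few lines of Python. This is a minimal illustration with made-up amplitudes and hypothetical function names; refinement programs additionally apply scaling between observed and calculated amplitudes, which is omitted here.

```python
import random


def r_factor(f_obs, f_calc):
    """Sum of | |Fobs| - |Fcal| | divided by the sum of |Fobs|."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def split_work_free(n_reflections, free_fraction=0.05, seed=0):
    """Randomly set aside a fraction of reflection indices as the free set."""
    rng = random.Random(seed)
    n_free = max(1, round(n_reflections * free_fraction))
    free = set(rng.sample(range(n_reflections), n_free))
    work = [i for i in range(n_reflections) if i not in free]
    return work, sorted(free)


# Made-up amplitudes for four reflections (illustration only).
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 85.0, 60.0, 45.0]
print(f"R = {r_factor(f_obs, f_calc):.4f}")

work, free = split_work_free(100)
print(len(work), "work reflections,", len(free), "free reflections")
```

Rfree is then the same `r_factor` evaluated only on the free indices, which the refinement never sees; a free value much higher than the work value is the over-fitting signal described above.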

2.3.4 Data-quality validation and high-resolution cut-off

Crystallographic data quality is evaluated by an analogous indicator, Rmerge (or Rsym), which measures the spread of n independent measurements of the intensity of a unique reflection, Ii(hkl), around their average Ī(hkl) [37]:

$$ R_{merge} = \frac{\sum_{hkl}\sum_{i=1}^{n} |I_i(hkl) - \bar{I}(hkl)|}{\sum_{hkl}\sum_{i=1}^{n} I_i(hkl)} \qquad \text{(Equation 5)} $$

Since the Ii(hkl) values influence Ī(hkl), Rmerge depends on the multiplicity. Rmeas, which equals Rmerge times a factor of $\sqrt{n/(n-1)}$, is therefore used in place of Rmerge to give an evaluation that is independent of the multiplicity [38].
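The two merging statistics can be compared in a short Python sketch. The intensities and function names are made up for illustration. The sketch applies the factor √(n/(n−1)) to each reflection separately, which reduces to a single overall factor when every reflection has the same multiplicity n.

```python
import math


def r_merge(observations):
    """Rmerge over a dict mapping each unique reflection to its measured intensities."""
    numerator = denominator = 0.0
    for intensities in observations.values():
        n = len(intensities)
        if n < 2:
            continue  # a single measurement carries no spread information
        mean = sum(intensities) / n
        numerator += sum(abs(i - mean) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def r_meas(observations):
    """Multiplicity-corrected version: each reflection is weighted by sqrt(n/(n-1))."""
    numerator = denominator = 0.0
    for intensities in observations.values():
        n = len(intensities)
        if n < 2:
            continue
        mean = sum(intensities) / n
        numerator += math.sqrt(n / (n - 1)) * sum(abs(i - mean) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


# Made-up measurements of two unique reflections, each observed twice.
observations = {(1, 0, 0): [100.0, 110.0], (0, 1, 0): [50.0, 40.0]}
print(f"Rmerge = {r_merge(observations):.4f}, Rmeas = {r_meas(observations):.4f}")
```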

Because the intensity of diffraction decreases with increasing resolution, a high-resolution cut-off is applied to discard data beyond the cut-off resolution, since these are often considered to be noise and their inclusion may degrade the quality of the model and add bias to it. As a rule of thumb, data are typically truncated at a resolution before the Rmerge (or Rmeas) value exceeds 0.6-0.8 and before the signal-to-noise ratio (⟨I/σ(I)⟩) drops below 2, although many uncertainties are associated with these criteria [39, 40].

For high-resolution data, background noise dominates the numerator of Equation 5 while the denominator approaches zero because the average net intensity becomes small. Thus, the commonly applied criterion for the high-resolution cut-off is not as reliable as it appears. Moreover, data-quality R-values are not comparable to R-values from model refinement. In May 2012, Karplus and Diederichs proposed using the Pearson correlation coefficient (CC), a measure of linear association between two variables, as a parameter that could evaluate both data accuracy and the agreement between model and data on a common scale [41]. Pearson’s CC has already been used effectively in crystallography in the processing of anomalous scattering data [42].

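Unmerged data are divided into two parts, each containing a random half of the measurements of each unique reflection, I1 and I2. The CC, denoted CC1/2, is calculated between the average intensities of each subset, ⟨I1⟩ and ⟨I2⟩:

$$ CC_{1/2} = \frac{\sum \left(\langle I_1\rangle - \overline{\langle I_1\rangle}\right)\left(\langle I_2\rangle - \overline{\langle I_2\rangle}\right)}{\left[\sum \left(\langle I_1\rangle - \overline{\langle I_1\rangle}\right)^{2} \sum \left(\langle I_2\rangle - \overline{\langle I_2\rangle}\right)^{2}\right]^{1/2}} \qquad \text{(Equation 6)} $$

Here the sums run over the unique reflections and the overbar denotes the mean over those reflections.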

Since CC1/2 measures the correlation of one noisy data subset with another noisy subset, it is expected to underestimate the real information content of the data. The true signal would be measured by CCtrue, the correlation between the averaged data set of the two subsets, which is less noisy owing to the extra averaging, and the noise-free true signal. Although the true signal is normally not known, the relation between CC1/2 and CCtrue can be derived under the assumption that the errors in the two half data sets are random and on average of similar size:

$$ CC^{*} = \sqrt{\frac{2\,CC_{1/2}}{1 + CC_{1/2}}} \qquad \text{(Equation 7)} $$

Here CC* estimates the value of CCtrue based on a finite-size sample. CC* approaches 1.0 at low resolution and drops to near 0.1 at high resolution cut-off. Moreover, CCwork and CCfree, the standard and cross-validated correlations of the experimental intensities with the intensities calculated from the refined molecular model, can be calculated on the same scale and directly compared with CC*. A CCwork larger than CC* indicates over-fitting while a CCfree smaller than CC* suggests that the model does not account for the entire signal in the data. A CCfree closely matching CC* implies that data quality is limiting model improvement, such as at high resolution. CC* can be used not only to evaluate data quality but also to link crystallographic model quality with data quality [41]. In this thesis, CC* was used as criterion for high-resolution cut-off when low resolution X-ray crystallographic data of reaction center-light harvesting 1 complex was analyzed.
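The calculation of CC1/2 and CC* can be sketched as follows. This is a minimal Python illustration with made-up intensities and hypothetical function names, not the implementation used by any data-processing program. CC* is computed from the relation of Karplus and Diederichs [41], CC* = sqrt(2·CC1/2 / (1 + CC1/2)).

```python
import math
import random


def pearson_cc(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def cc_half(observations, seed=0):
    """Split each reflection's measurements into random halves and correlate the half means."""
    rng = random.Random(seed)
    half_1, half_2 = [], []
    for intensities in observations.values():
        if len(intensities) < 2:
            continue  # cannot be split into two halves
        shuffled = list(intensities)
        rng.shuffle(shuffled)
        mid = len(shuffled) // 2
        half_1.append(sum(shuffled[:mid]) / mid)
        half_2.append(sum(shuffled[mid:]) / (len(shuffled) - mid))
    return pearson_cc(half_1, half_2)


def cc_star(cc_12):
    """Estimate of CCtrue from CC1/2."""
    if cc_12 < 0.0:
        raise ValueError("CC* is not defined for negative CC1/2")
    return math.sqrt(2.0 * cc_12 / (1.0 + cc_12))


# Made-up unmerged intensities for four unique reflections (illustration only).
observations = {
    (1, 0, 0): [102.0, 97.0, 100.0, 104.0],
    (0, 1, 0): [48.0, 55.0, 51.0, 47.0],
    (0, 0, 1): [12.0, 7.0, 10.0, 14.0],
    (1, 1, 0): [75.0, 71.0, 79.0, 73.0],
}
cc_12 = cc_half(observations)
print(f"CC1/2 = {cc_12:.3f}, CC* = {cc_star(cc_12):.3f}")
```

For any CC1/2 between 0 and 1, CC* is at least as large as CC1/2, which reflects the extra averaging in the merged data set.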

3 Trypsin: an example of serine protease catalysis (paper I)

3.1 Serine protease catalysis

Proteases are a group of proteolytic enzymes found in all organisms. They function in digestion, posttranslational processing of secreted proteins, neurotransmitters and hormones, blood coagulation and complement fixation [43]. Many of these enzymes are of medical importance and hence are potential drug targets. Proteases fall into four distinct classes based on catalytic type: serine, cysteine, aspartic and metallopeptidases.

Because the nitrogen lone pair is delocalized into the carbonyl group, the peptide bond is a strong linkage with a high degree of double-bond character. During hydrolysis of a peptide bond, all three heavy atoms of the peptide are directly involved in the catalytic reaction by interacting with appropriate catalytic groups of the enzyme (Figure 9) [44]. Thus, the collective action of the catalytic groups is essential. In the first step, the peptide bond is weakened by a nucleophile, such as a serine OH group or a water molecule, which attacks the carbonyl carbon atom of the peptide. With the help of a general base, which accepts the proton from the nucleophilic OH group, a tetrahedral intermediate forms. Subsequently, the intermediate state is stabilized by an electrophilic catalysis step. The amine group then leaves the tetrahedral intermediate. Since the amine is a poor leaving group, its protonation by a general acid is essential. The whole process is thus facilitated by a general base-general acid catalytic mechanism.

Figure 9. Common features of peptide bond cleavage by proteases. S represents the nucleophile and E the electrophile. XH acts as a general acid that donates its proton to the leaving amine group NH.

Owing to kinetic, chemical, physical and genetic analyses, serine proteases are among the most studied protein families. In addition, the determination of many three-dimensional structures has led to a better understanding of enzyme catalysis [45, 46]. As shown in Figure 10, all serine protease families are presumed to share a common catalytic mechanism, with a catalytic triad composed of Asp-His-Ser [45, 47]. Serine proteases cleave peptide bonds in two major steps: acylation and deacylation. When a suitable substrate binds to the enzyme, it induces a conformational change in the enzyme so that a transient β-sheet forms. Consequently, the active site is positioned next to the scissile peptide bond. The catalytic serine attacks the scissile peptide bond and a short-lived tetrahedral intermediate forms. A proton of the serine is then transferred to the catalytic histidine. The tetrahedral intermediate decomposes to the acyl-enzyme intermediate, which is rapidly deacylated by the reverse of the acylation steps, followed by the release of the resulting carboxylate product, thereby regenerating the active enzyme (Figure 10).

Figure 10. The catalytic mechanism of serine proteases (adapted from [4]). The reaction involves (step 1) the nucleophilic attack of the active site Ser on the carbonyl carbon atom of the scissile peptide bond to form the tetrahedral intermediate, with His accepting a proton from Ser; (step 2) the decomposition of the tetrahedral intermediate to the acyl-enzyme intermediate through general acid catalysis by the active site Asp-polarized His, followed by loss of the amine product and its replacement by a water molecule; (step 3) the reversal of step 2 to form a second tetrahedral intermediate; (step 4) the reversal of step 1 to yield the reaction’s carboxyl product and the active enzyme.

Serine functions as the primary nucleophile. At different steps of the reaction, histidine functions as proton acceptor and donor. However, the role of aspartic acid is unclear. One theory is the charge relay mechanism suggested by Blow et al., who proposed that the histidine becomes protonated by the OH group of serine and subsequently transfers one of its protons to the catalytic aspartate, which is negatively charged in the free enzyme, thus neutralizing the active site [45].

However, this theory depends critically on the correct pKa value and protonation state of the histidine. In the free, solvent-accessible form the pKa of the histidine is around 7.0-7.5, while the aspartate residue has an assumed pKa of around 3. These values do not favor the transfer of a proton from histidine to aspartate, since histidine holds a proton more strongly than aspartate [48, 49].
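The pKa argument can be made quantitative with the Henderson-Hasselbalch relation. The Python sketch below is an illustration only: it assumes the solvent-accessible pKa values quoted above (7.0 for histidine, 3.0 for aspartate), the function names are hypothetical, and the pKa values of residues buried in the active site may differ.

```python
def fraction_protonated(pH, pKa):
    """Henderson-Hasselbalch: fraction of a titratable group that carries its proton."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


def proton_transfer_constant(pKa_donor, pKa_acceptor):
    """Equilibrium constant for moving a proton from the donor acid to the acceptor base.

    pKa_acceptor is the pKa of the acceptor's conjugate acid.
    """
    return 10.0 ** (pKa_acceptor - pKa_donor)


PKA_HIS = 7.0  # assumed: lower end of the 7.0-7.5 range for a solvent-accessible histidine
PKA_ASP = 3.0  # assumed value for aspartate

K = proton_transfer_constant(PKA_HIS, PKA_ASP)
print(f"K for proton transfer from His-H+ to Asp- = {K:.1e}")

for pH in (3.0, 5.0, 7.0, 9.0):
    his = fraction_protonated(pH, PKA_HIS)
    asp = fraction_protonated(pH, PKA_ASP)
    print(f"pH {pH}: His protonated {his:.3f}, Asp protonated {asp:.3f}")
```

With these assumed values the equilibrium constant for the transfer is about 10⁻⁴, so in free solution the proton would remain on the histidine in the overwhelming majority of cases.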

Site-directed mutagenesis of the aspartate to the neutral asparagine decreased the catalytic rate by a factor of 10⁴, which supported the role of a negatively charged residue near the histidine [50]. Nevertheless, the protonation state and the role of the buried aspartate residue in the catalytic triad are still under debate.

3.2 Co-crystallization

Recent technical advances in crystallographic analysis, particularly the improved X-ray flux at third-generation synchrotrons with high-brilliance, highly focused beam lines, have made resolutions better than 1 Å achievable for many macromolecular crystal structures. At sub-ångstrom resolution, the level of visualized structural detail increases and includes the location of hydrogen atoms [51, 52]. Hydrogen atoms play a significant role in the catalytic mechanism of enzymes. By monitoring the hydrogen occupancy in structures captured at a series of pH values, the pKa can be determined directly [53]. Nevertheless, the key step in obtaining such ultrahigh-resolution diffraction is to grow crystals of exceptional quality.

Protein complexes formed by components with weak binding strength usually show different potential binding modes, thus their crystals often suffer from disorder and do not diffract well [54].

Optimization of the contact interfaces between a protein and its binding partner can lead to tighter binding. The resulting complexes are less dynamic, which may improve crystal quality [55].

Wild-type SGTI (also named SGPI-1) is a weak inhibitor of bovine trypsin (BT) with an inhibition constant in the micromolar range [56, 57]. Using a phage display assay, a set of SGTI mutants was selected that showed improved binding to bovine trypsin [56, 57]. The unique combination of mutations improved the inhibition constant of the mutant inhibitor SGPI-1-PO-2 from the micromolar to the picomolar range and made it possible to crystallize the complex [56]. Crystals of the complex grew from a crystallization buffer at pH 4.6 and diffracted to an ultrahigh resolution of 0.93 Å.
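To give a feeling for the size of this improvement, the change in inhibition constant can be converted into a binding free energy with ΔG° = RT ln Ki. The Python sketch below uses round illustrative values of 1 µM and 1 pM and a temperature of 298.15 K. These numbers are assumptions, since only the ranges are stated here, and the function name is hypothetical.

```python
import math

R = 8.314462618  # gas constant in J/(mol*K)


def binding_free_energy(k_i, temperature=298.15):
    """Standard binding free energy in kJ/mol from an inhibition constant in mol/L."""
    return R * temperature * math.log(k_i) / 1000.0


KI_WILD_TYPE = 1e-6   # assumed representative micromolar value
KI_MUTANT = 1e-12     # assumed representative picomolar value

ddg = binding_free_energy(KI_MUTANT) - binding_free_energy(KI_WILD_TYPE)
print(f"Change in binding free energy: {ddg:.1f} kJ/mol")
```

Under these assumptions a million-fold tighter inhibition constant corresponds to roughly 34 kJ/mol of additional binding free energy.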

3.3 Density functional theory

In physics and chemistry, density functional theory (DFT) is a quantum mechanical modeling method used to investigate the electronic structure, principally the ground state, of many-body systems such as atoms, molecules and condensed phases. The properties of a many-electron system can be determined by using functionals of the spatially dependent electron density. DFT has been increasingly used in applications related to biological systems, complementing experimental investigations. It allows a close connection between theory and experiment, serving either to validate the conclusions reached from the analysis of the experiments or to distinguish between possibilities that were left open. It often provides critical clues about the geometric, electronic and spectroscopic properties of the systems being studied [58]. Therefore, the experimental coordinates of the simplified active site in the structure were optimized using DFT with the aim of supporting the interpretation of the protonation state of the catalytic residues observed in the experimental electron density (Figure 11).

Figure 11. Comparison of the theoretical protonated and deprotonated carboxyl groups to experimental electron density in the crystal structure at 0.93 Å resolution (left). (a) Electron density of Asp-102 compared with theoretical protonated propionic acid. (b) Electron density of Asp-189 compared with theoretical deprotonated propionic acid. The 2mFobs - DFcalc electron density map (blue) is contoured at 4.5 σ, and the mFobs - DFcalc density (green) is contoured at 2.5 σ. Alternative protonation states of the imidazole ring as calculated by density functional theory and experimentally observed at residue His-57 in the electron density maps at 0.93 Å resolution (right). The 2mFobs - DFcalc electron density map (blue) is contoured at 3.5 σ, and the mFobs - DFcalc density (green) is contoured at 3.0 σ.
