• No results found

Proton binding in proteins: pK

N/A
N/A
Protected

Academic year: 2022

Share "Proton binding in proteins: pK"

Copied!
38
0
0

Loading.... (view fulltext now)

Full text

(1)

UPTEC X 04 025 ISSN 1401-2138 APR 2004

JENS CARLSSON

Proton binding in proteins: pK a

calculations with explicit and implicit solvent

Master’s degree project

(2)

Molecular Biotechnology Programme

Uppsala University School of Engineering

UPTEC X 04 025 Date of issue 2004-04

Author

Jens Carlsson

Title (English)

Proton binding in proteins: pKa calculations with explicit and implicit solvent

Title (Swedish)

Abstract

Accurate and reliable pKa prediction is of significant interest, because it provides direct information of the protonation state of a protein and can be compared to experimental data. In this work pKa shifts of three aminoacids in proteins have been calculated using molecular dynamics free energy simulations with an explicit solvent and implicit Generalized Born solvent model. The direction of the shifts were correctly predicted in both the explicit and implicit solvent calculations, but the results were not in perfect agreement with experimental data.

Keywords

Proton binding, pKa, free energy calculations, molecular dynamics

Supervisors

David A. Case

Department of Molecular Biology, The Scripps Research Institute, San Diego, USA Scientific reviewer

Johan Åqvist

Department of Cell and Molecular biology, Uppsala University

Project name Sponsors

Language

English

Security

ISSN 1401-2138 Classification

Supplementary bibliographical information Pages

(3)
(4)

Proton binding in Proteins:

pK

a

calculations with explicit and implicit solvent

Jens Carlsson

Sammanfattning

Att kunna modellera cellens molekyler med datorsimuleringar är av stort intresse inom biologin och kan ge djupare förståelse för relationen mellan proteiners stuktur och funktion. Proteiners aktivitet styrs bland annat av den omgivande syra-bas

koncentrationen i cellen och i detta projekt har förändringar hos proteiner vid olika pH- värden studerats. För att på ett korrekt sätt beskriva den miljö som omgiver proteinet måste naturligtvis vatten inkluderas i datorsimuleringarna och detta studerades med två olika modeller. Den ena modellen beskriver vattenmolekyler på ett detaljerat sätt, vilket leder till mycket kostsamma beräkningar, medan den andra modellen representerar vattnets effekt på ett mer förenklat sätt. Det visade sig att båda modeller kunde, på ett tillfredställande sätt, indikera proteinets egenskaper vid olika pH-värden.

Examensarbete 20 p i Molekylär bioteknikprogrammet Uppsala universitet April 2004

(5)

Contents

1 Introduction 2

2 Theory 4

2.1 Molecular dynamics . . . . 4

2.2 Molecular mechanics and force fields . . . . 5

2.3 Thermodynamic integration . . . . 6

2.4 The Generalized Born model . . . . 8

2.5 pKa shifts from free energies . . . . 10

2.6 Linear Response Theory . . . . 11

3 Materials and Methods 13 3.1 Modelling proton binding with molecular mechanics . . . . 13

3.2 Choice of appropriate residues and reference molecule . . . . 14

3.3 Setup of explicit solvent calculations . . . . 15

3.4 Setup of implicit solvent calculations . . . . 16

4 Results and Discussion 17 4.1 Linear free energy derivative and Gaussian distribution of the perturbation energy . . . . 17

4.1.1 Explicit solvent calculations . . . . 17

4.1.2 Implicit solvent calculations . . . . 20

4.2 Structural analysis and comparison to experimental data . . . . . 22

4.2.1 Asp26: structural re-organization on long timescales . . . 22

4.2.2 Asp20 . . . . 28

4.2.3 Asp14 . . . . 28

5 Conclusions 30

6 Acknowledgements 31

(6)

1 Introduction

Understanding the pH dependent properties of proteins is of great importance, because ionizable residues in active sites play key functional roles in fundamental biochemical processes. A deeper insight into proton binding provides knowledge in many fields of biochemistry, such as protein stability, substrate binding, en- zyme mechanisms and molecular recognition. For these reasons, accurate and reliable pKa prediction is of significant interest, because it provides direct in- formation of the protonation state of the protein and can easily be compared to experimental data.

Proton binding constants are sensitive to the environment created by the tertiary structure of the protein. For example, an aspartic acid in a nonpolar environment will have an enormous pKa shift upwards, because it is highly unfavorable to create a charge in such a region. Conversely, a positively charged environment would in this case favor an unprotonated state and hence downshift the pKa value (Figure 1).

Figure 1: The pKa shift of ionizable residues depends on the local environment in a protein. Buried residues can have substantially shifted pKa values compared to the amino acid in aqueous solution (The illustration was used with permission from Holger Gohlke [1]).

In the ionization of an aspartic acid in a protein a hydrogen ion is transferred into the solvent leaving a negatively charged carboxylate group. The protein itself undergoes dielectric relaxation by conformational changes and electronic polarization in response to the charge re-arrangement.

There are several macroscopic methods for calculating pKavalues. One pop- ular approach is based on solving the Poisson-Boltzmann equation(PB) using a continuum model for both the protein and the solvent [2, 3]. These calcula- tions are often based only on one or a few experimental conformations of the protein, but unfortunately, slightly different conformers have been shown to pro- duce results, that differ by several pKa units [4, 5, 6]. Attempts to incorporate conformational averaging in PB models by side chain flexibility [7, 8, 9], larger ensembles of structures [10], or an average structure [11] have been shown to improve results. Yet, these models still lack the ability to model the conforma- tional changes that occur in response to protonation state changes. These effects are accounted for implicitly in PB models by using higher values of the dielectric constant,  [12]. There is, however, no general consent about the meaning of an average  in such heterogenous structures as proteins and how an appropriate value for a specific model is chosen [13, 14, 15].

(7)

A more fundamental approach is given by microscopic free energy calcula- tions using Molecular Dynamics (MD) simulations. Unfortunately, calculations of free energies using an all atom representation of both solvent and protein have, for a long time, been impossible due to convergence problems, lack of computational power, and inaccurate treatment of long-range interactions. Mi- croscopic calculations of the free energy associated with deprotonation of ion- izable residues have been carried out, but many of these studies have made further approximations, for example by using simplified solvent representations such as the Langevin Dipole model [16]. Another popular assumption is the linear response approximation [17], which assumes a linear response of the pro- tein to the charge re-arrangement. Although linear response has been shown to be valid for small molecules, few computational studies with a fully microscopic treatment of the protein and solvent system have been performed in this area [18].

In recent years, there has been significant improvement in the accuracy of long-range electrostatics in MD simulations as well as an enormous increase of computer power, which provide the opportunity to reach converged results using a fully microscopic approach. Since pKa essentially is the free energy of deprotonation, a fundamental way to estimate this value is by using Molecular Dynamics Free Energy (MDFE) simulations. This work presents the first MDFE calculatations of pKa shifts with an all-atom representation of both the protein and the solvent as well as an accurate treatment of long-range electrostatics by Particle Mesh Ewald (PME) summation. The approach should therefore provide a more accurate description of both conformational flexibility and solvation effects compared to PB methods. Furthermore, it allows the protein to relax from the structure of the initial state in response to the charge re-arrangements.

The study focuses on three aspartic acids of which two are extremely per- turbed in pKa compared to the free amino acid in solution (pKa ∼ 4.0) [19]:

Asp26 (pKa ∼ 7.5) [20] and Asp20 (pKa ∼ 4.0) [21] of Escherichia coli Thiore- doxin and Asp14 (pKa ∼ 2.0) [22] of Ribonuclease A. For all cases, the direction of the pKa shifts were correctly predicted and the results are in fair agreement with experimental data. The simulations reported also directly verify linear response to the deprotonation of a solvated protein.

A second set of MDFE pKa calculations were carried out using a Gener- alized Born [23, 24, 25] implicit solvent model. Although this approach uses a continuum solvation model, it still explicitly takes into account conforma- tional averaging and protein relaxation and, thus, does not, contrary to most PB methods, assume a higher value of the protein dielectric constant. The same three residues as in the explicit solvent calculations were considered. Again the direction of the shifts were correctly predicted and the results were closer to the experimental data than the corresponding explicit solvent calculations.

(8)

2 Theory

This section outlines the basic principles of molecular dynamics and free en- ergy calculations. More detailed descriptions of these methods can be found in litterature [26, 27].

2.1 Molecular dynamics

In molecular dynamics the movement of a molecule is simulated as a function of time by solving Newtons equations of motion. Assume a system composed of N atoms, for example a protein in aqueous solution. By defining a potential function, V (rN), for the system, the force Fiacting on atom i, can be calculated:

Fi=∂V (rN)

i

(1) where rN = r1, ...,rN are the cartesian coordinates of all N atoms. Since the masses miof the atoms are known, Newton’s second law connects the force with the acceleration aiof the atoms:

ai= Fi

mi (2)

By integration of this expression, the velocities and positions of the atoms are obtained. Under the assumption that the force acting on each atom is constant for a given time step, , new positions and velocities can be generated by using finite difference methods. An example of such an algorithm is leap-frog [28], which is widely used in MD simulations:

ri(t + δt) = ri(t) + δt· vi(t) (3) vi(t + δt) = vi(t1

2δt) + δt· ai(t) (4) where ri(t) and vi(t) are the position and the velocity for particle i at time t. The time step of a biomolecular simulation is typically chosen to be in the order of one femto-second. Using these equations thousands of iterations can be performed to investigate the movement of a molecule [26].

(9)

2.2 Molecular mechanics and force fields

In order to use MD simulations a potential function, V (rN), must be defined and parametrized. There are several different types of potential functions (also called force fields) and this section describes the general form of those common in biomolecular simulations. These force fields are composed of two groups of terms: Intramolecular (Intra) terms, which are associated with covalently con- nected atoms and Intermolecular (Inter) terms, which represent the noncovalent or nonbonded interactions:

V (rN) = Vintra(rN) + Vinter(rN) (5) The intramolecular potential function has the following form:

Vintra = 

bonds

Kr(r−req)2+ 

angles

Kθ−θeq)2+ 

dihedrals

Vn

2 [1+cos(nφ−γ)] (6) The first two terms represent the potential energy of the bonds and angles.

For these contributions, the energy varies with the square of the displacement, which is an effective way to keep the bond lengths r and angles θ close to their equilibrium values req and θeq. Krand Kθ are the force constants of the bonds and angles near their equilibrium value. The last term in the intramolecular term represents the potential energy associated with torsional movement. The dihedral angle is φ, Vn is the force constant, n is the periodicity, and γ is the phase of the dihedral.

The intermolecular potential function has the following form:

Vinter =

i<j

4ij

σij rij

12

σij rij

6

+

i<j

qiqj

rij (7)

The first term in the intermolecular contribution is the Lennard-Jones (LJ) 6- 12 term and represents repulsion and dispersion interactions. There are two parameters in this equation; the well depth, ij, indicates the magnitude of favorable dipole induced interactions between atoms i and j, and σij is the interatomic distance at which the interaction energy is zero. The last term in the potential function is the electrostatic term, which depends on the partial atomic charges, qi and qj, and the distance, rij , between atom i and j [26].

The parameters in the intra- and intermolecular terms are derived by using experimental data, combined with quantum chemical calculations, to obtain the correct behavior of different types of molecules. AMBER, which is the force field used in this work, is parameterized to be used in simulations of proteins and nucleic acids.

(10)

2.3 Thermodynamic integration

Free energy calculations are among the most powerful and interesting tools in computational chemistry, and the technique is frequently used in biomolecular calculations. This section describes a common free energy calculation method, thermodynamic integration (TI).

If the free energy difference, ∆F, between two states A (protonated aspartic acid) and B (unprotonated aspartic acid) is assumed to be a continuous function of a coupling parameter λ, it can be written as:

∆G(λ) =

 1

0

∂G(λ)

∂λ dλ =

 1

0

∂U (rN, λ)

∂λ



λ

(8)

The free energy F(λ) is defined as:

F (λ) =−kBT ln(Z(λ)) =−kBT ln(Q(λ)) + c(pN) (9) where Z(rN, pN, λ) is the classical partition function and Q(λ) is the config- urational integral with the velocity part integrated out. The configurational integral in the NVT ensemble is:

Q(λ) =



exp[−U(rN, λ)/kBT ]drN (10)

U(rN, λ) is the potential energy function, T is the temperature and kB is the Boltzmann constant.

Computer simulations allow manipulations of systems in a way that would not be possible in real experiments. Since the free energy is a state function, the transformation from state A to state B can be performed stepwise, where different values of the coupling parameter λ correspond to mixtures of states A and B. A common hybrid function is a linear interpolation between the potential energy functions corresponding to each state, UA and UB:

U (rN, λ) = (1− λ)UA(rN) + λUB(rN) (11) The hybrid system is gradually changed from state A to state B by varying the coupling coordinate λ from 0 to 1.

By using the chain rule, the derivative of the free energy can be calculated:

∂F (λ)

∂λ =−kBT∂lnQ(λ)

∂λ =−kBT 1 Q(λ)

∂Q(λ)

∂λ (12)

The derivative of the classical partition function is:

∂Q(λ)

∂λ = 1 kBT

 ∂U (rN, λ)

∂λ exp[−U(rN, λ)/kBT ]drN (13) Substituting this into equation (12) yields:

∂F (λ)

∂λ =

 ∂U (rN, λ)

∂λ

exp[−U(rN, λ)/kBT ]

Q(λ) drN (14)

(11)

Since the Boltzmann probability function is ρ(λ) = exp[−U(rN, λ)/kBT ]

Q(λ) (15)

this yields

∂F (λ)

∂λ =

 ∂U (rN, λ)

∂λ ρ(λ)drN =

∂U (rN, λ)

∂λ



λ

(16)

The brackets indicate an average over the ensemble corresponding to the hybrid energy function, U(rN, λ). Finally the free energy difference can be calculated from:

∆F (λ) =

 1 0

∂U (rN, λ)

∂λ



λ

(17)

In practice, the integration can be achieved by making a series of simulations corresponding to a number of values of λ between 0 and 1. The total free energy difference can then be approximated by the area under the graph (Figure 2) [26].

0 0.2 0.4 0.6 0.8 1

<U/∂λ>λ

λ ∆G

Figure 2: The free energy can be calculated by making a series of calculations corre- sponding to differentλ values. The free energy change can be estimated by calculating the area under the graph.

The technique derived is not only valid for the NVT ensemble, but also for the other thermodynamical ensembles such as NPT and NVE. It should be noted that these ensembles essentially are artificial constructs and in the thermodynamic limit (large system size and long simulations) the calculated free energies become equivalent [29].

(12)

2.4 The Generalized Born model

Biomolecules interact in an aqueous phase and a correct description of this en- vironment is thus important to obtain realistic simulations. Explicit solvation using an all-atom representation of water molecules is the most accurate way to incorporate these effects in MD, but it is also the computationally most expen- sive. A more approximate, but less expensive, alternative is implicit solvation, where the explicit water molecules are replaced with a continuum medium hav- ing the average properties of water. This procedure decreases computational time, because the number of atoms used is reduced. In addition, there are several other advantages. First, there is no need for re-organization of water molecules in response to conformational changes of the solute and, therefore, it is not necessary to perform long equilibrations in order for water to fully penetrate cavities in the protein. Also, since implicit representation of solvent corresponds to solvation in an infinite volume, difficulties with using periodic systems, such as replica interaction, are avoided [30].

Protein εw

Protein

H H O

H H O H H

O H H

O

H H O H H H H O

O

H H O H H O H H OH H O

H H O H H

O

H H O H H

O

H H O

H H H H O

O H H

O

H H O

H H O

H H H H O O

H H O

H H O H H

O H H

O

H H O

H H O

H H O

H H O

H H O

H H O H H

O

H H O H H O

Figure 3: By using an implicit solvent model the water molecules in a simulation do not have to be explicitly included. The water molecules are replaced with the average properties of water, characterized by its dielectric constant,w. This reduces computation time and accelerates conformational sampling.

In 1920 Born derived the electrostatic component of the free energy of sol- vation for displacing a charge within a spherical solvent cavity. For a simple ion with radius a and charge q the solvational free energy is given by the well-known Born formula [31]:

∆GBorn=q2 2a

 1 1

w



(18)

(13)

If a protein is modelled as a set of spheres with charges q1,..., qN and radii a1,...,aN the total electrostatic free energy is a sum of Born- and Coulomb terms:

G =

N i=1

N j=i+1

qiqj

wrij 1 2

 1 1

w

N

i=1

q2i

ai (19)

=

N i=1

N j=i+1

qiqj rij

 1 1

w

N

i=1

qiqj rij 1

2

 1 1

w

N

i=1

q2i

ai (20) The last two terms represent the polarization free energy. This is the Generalized Born equation (GB) [23, 24]:

∆GGBpol =

 1 1

w

N

i=1

N j=i+1

qiqj rij 1

2

 1 1

w

N

i=1

qi2

ai (21) As a final approximation, the expression can be transformed into a single term:

∆GGBpol ≈ −1 2

 1 1

w

N

i=1

N j=1

qiqj

fGB(rij) (22) The function fGB should be constructed so that when i=j (rij = 0), it acts as an ”effective Born radius”, fGB = Ri. The effective Born radius, Ri, depends not only on ai, but also accounts for the fact that atoms in proteins, unlike ions, are surrounded by neighbors that displace the solvent and, therefore, decrease the polarization energy (Ri ≥ ai). At longer distances between atom i and j fGB acts as an effective interaction distance, fGB ≈ rij. One popular form of fGB is:

fGB(rij) =

r2ij+ RiRjexp[−rij/4RiRj]12

(23) Many methods have been proposed for the estimation of Ri and numerous ex- amples can be found in literature [30]. These models have been shown to be a good approximation of solvation energies for small molecules, but for molecules having large interior regions the errors increase. This also affects pKa calcu- lations, and it seems as GB models might not provide the accuracy needed to predict pKa shifts [32]. The largest errors in the calculation of effective Born radii were obtained for deeply buried atoms, where the model seems to under- estimate the pKa shifts. In this work a recently developed Generalized Born model, GBOBC [25], is used which corrects for the underestimation of effective Born radii for deeply buried atoms.

(14)

2.5 pKa shifts from free energies

The pKa value is connected to the free energy change of protonation. The equlibrium constant Ka is related to the pKa value:

AH + H2O A+ H3O+ (24)

pKa=−log10(Ka) (25)

Since the equilibrium constant is connected to the free energy by:

Ka= exp

−∆Gdeprotonation

RT

(26) The pKa can be calculated from equation (25) and (26), which yields:

pKa= 1

2.303RT∆Gdeprotonation (27)

the pKa shift of a residue relative to a reference compound with known pKa value can then be calculated by performing two free energy calculations (Figure 4) [16]. One calculation where the free energy of deprotonation for the amino acid in the protein, ∆Gprotein, is estimated and a second simulation where the same free energy for the amino acid in solution, ∆Gmodel, is computed.

Figure 4: By using two mutations, ∆pKa with respect to a model compund can be calculated. Protein-ASPH and ASPH represent the protonated form of aspartic acid in the proteins and the model compound respectively. Protein-ASPand ASP represent the unprotonated forms.

The relative free energies are given by:

∆∆Gdeprotonation= ∆Gprotein− ∆Gmodel (28) The pKavalue of the residue in the protein is then given by using equation (27):

pKa,protein= pKa,model+ 1

2.303RT∆∆Gdeprotonation (29)

∆pKa,protein= pKa,protein− pKa,model= 1

2.303RT∆∆Gdeprotonation (30)

(15)

2.6 Linear Response Theory

This section gives an introduction to linear response theory and provides a brief overview of concepts used in the analysis of the results of this work.

Assume that the potential function, U, for a system can be written as a sum of the initial state, UA, and a perturbation, ∆U , where the variable λ ∈ [0, 1]

determines the magnitude of the perturbation.

U = UA+ λ∆U (31)

In the case considered in this work the perturbation is the charge created on an amino acid by deprotonation and the λ value determines the mixed states of the protonated and deprotonated amino acid. For the linear interpolation (see Section 2.3), this means that ∆U = UB−UA, i.e. equation 31 can be expressed:

U = λUB+ (1− λ)UA= UA+ λ(UB− UA) (32) The free energy can be expressed through the perturbation formula derived by Zwanzig [33]:

∆F =−β−1lnexp(−βλ∆U)A (33)

Expansion of this expression in powers of βλ∆U yields [26]:

∆F = λ < ∆U >A−λ2β 2

(∆U− < ∆U >A)2

A+ λ3β62

(∆U− < ∆U >A)3

A+ ...

(34)

In early studies of electron transfer reactions, it was found that the free energy of charging is approximately quadratic as a function of charge. This implies that terms of order three and higher are negligible and this assumption leads to the linear response approximation (LR) [18].

∆FLR= λ < ∆U >A−λ2β 2

(∆U− < ∆U >A)2

A (35)

The free energy is hence a quadratic function of λ, in accordance with the results described above. From equation (35) it also follows that the derivative with respect to λ of ∆FLR is a linear function of λ:

∂∆FLR(λ)

∂λ =< ∆U >A−λβ

(∆U− < ∆U >A)2

A (36)

It can also be shown that an equivalent condition is that the probability distri- bution of the energy gap is gaussian [34].

The free energy can be decomposed into two components [18]. The static component to the free energy, ∆Fstatic, corresponds to the introduction of the

(16)

The relaxation free energy can hence be obtained by calculating the second derivative with respect to λ:

∆Frelaxation=β

2 < (∆U− < ∆U >A)2>A=1 2

2FLR(λ)

∂λ2 (39)

(17)

3 Materials and Methods

3.1 Modelling proton binding with molecular mechanics

In this work, molecular mechanics is used to model the proton binding reaction.

The proton is treated as a classical mechanical particle, bearing a partial charge, and interacts with the atoms in the system via the force field potential function.

The binding of the proton is introduced by creating a new particle, the hydrogen atom, in the simulation. By using thermodynamic integration, this can be modelled by gradually transforming the protonated form into the unprotonated of the amino acid.

Oδ1 Oδ2 -0.6376 (-0.8014)

Cγ 0.6462 (0.5307)

Cβ

Cα

Hδ2 0.4747 (0.0000)

-0.5554 (-0.8014)

Figure 5: Charges of the protonated form and the unprotonated form (in parenthesis) of aspartic acid.

The change of charge was localized to the Cγ, Oδ1, Oδ2and Hδ2atoms. The partial charges of aspartic acid were chosen so that the total change of charge is -1 (Table 1). In many other methods for pKacalculations, the change is located only to the heavy atoms, but the approach taken here should provide a more accurate description of the reaction because the hydrogen atom is explicitly included.

Table 1: Partial charges and atom types of the protonated and unprotonated form of aspartic acid.

Protonated Unprotonated Atom Atom type Charge Atom type Charge

Cγ C 0.6462 C 0.5307

Oδ1 O -0.5554 O2 -0.8014

Oδ2 OH -0.6376 O2 -0.8014

Hδ2 H 0.4747 H 0.0000

(18)

3.2 Choice of appropriate residues and reference molecule

The choice of residues for which the pKashifts were calculated was based on two critera: First, many studies of pKa values have focused on unperturbed surface groups, but it has been pointed out that in these cases reasonable agreement with experiment can be achieved even by fundamentally wrong models [14].

Thus, to be a proper test case of the applied methodology, the residues should have a large pKa shift. Second, the residues were selected such that they are not influenced by the protonation state of other ionizable residues. That is, there should not be any residues nearby that have a similar pKa value with respect to the residue of interest. Interacting titratable sites add complexity to the calculations, because all 2N different protonation states have to be taken into account in this case. The chosen residues are listed in Table 2.

Table 2: Experimental pKa values of the three chosen residues and the model com- pound.

Residue pKa

Thioredoxin - Asp26 7.5 Thioredoxin - Asp20 4.0 Ribonuclease A - Asp14 2.0

Model - Asp 4.0

Asp26 of the oxidized form of Escherichia coli Thioredoxin is a deeply buried residue in the interior of the protein and it has an experimentally determined pKa of 7.5 [20]. It was assumed that it titrates independently of other ionizable groups, which seems reasonable since no nearby group has a similar pKa value as Asp26. All the other ionizable groups in the protein have a fixed protonation state, which is determined by the most likely protonation state at pH close to the experimental pKa of the residue. In this case all other aspartic and glu- tamic acids were kept unprotonated, histidines were kept in a singly protonated state, and lysines and arginines were protonated. To confirm that a surface exposed residue is expected to show unshifted behavior, Asp20 of Escherichia coli Thioredoxin was chosen. No experimental data was found for this residue, but it is conserved in human Thioredoxin where its pKa is unperturbed (pKa

≈ 3.6-4.0) [21]. All aspartic and glumatic acids were protonated in this case and histidines were kept in a doubly protonated state. Arginines and Lysines were also kept in their protonated state. Finally, to complement the abnormally high pKa of Asp26, Asp14 of Ribonuclease A was chosen as the third case. Ac- cording to experiment it has an unusually low pKa value equal to 2.0 [22]. For the same reasons as in the Asp26 of Thioredoxin, this residue was expected to titrate independently of other residues. In this case aspartic and glutamic acids were kept unprotonated although the predominant state at pH = 2.0 probably is the protonated form. Even though all of them are more than 10 ˚A apart from Asp14, this might affect the final results of the calculations. To study the influence of this, Generalized Born calculations were carried out both with the protonated and unprotonated aspartic and glumatic acids. Histidines were doubly protonated in all these calculations.

(19)

As model compound aspartic acid with blocking groups N-acetyl and N- methylamide (Figure 6) was chosen. The model pKa value is, according to experiment, equal to 4.0 [19].

O H

N

H N CH3

H O

O CH3 O

Figure 6: Aspartic acid with protective groups was chosen as reference molecule for thepKacalculations.

3.3 Setup of explicit solvent calculations

All explicit solvent free energy calculations were carried out with AMBER 7.0 suite of programs [35] using the AMBER/parm94 force field [36] and the TIP3P water model [37]. Periodic boundary conditions were applied using an octahe- dral box. The temperature was set to 300 K, the time step to 1 fs, and bonds to hydrogens were constrained with SHAKE [38]. Long-range electrostatic interac- tions were treated by PME [39] with an 8 ˚A cutoff for the direct space sum and a continuum method to evaluate van der Waals interactions beyond the direct sum.

All calculations on Thioredoxin were performed starting from an X-ray struc- ture with 1.68 ˚A resolution (PDB code 2TRX) determined at pH=3.8 [40]. The crystallographic structure was placed in an octahedral box with a volume of about 181000 ˚A3. In the Asp26 case the box contained 4757 water molecules and 4 Na+ ions and in the Asp20 case 4753 water molecules and 3 Na+ ions.

The structure was equilibrated by performing 50 ps of NVT simulation, in which the temperature was raised to 300 K. An additional NPT simulation was carried out for another 110-250 ps until a proper density of the system was reached.

All calculations on Ribonuclease A were performed starting from an X-ray structure with 1.05 ˚A resolution determined at pH 5.2 ( PDB code 1KF2 ) [41].

The crystallographic structure was placed in an octahedral box with volume 267923 ˚A3 containing 7324 watermolecules and 9 Cl ions. The system was equilibrated by performing 50 ps of NVT simulation, in which the temperature was raised to 300 K. Finally a NPT simulation was carried out for another 250

(20)

Successive simulations were started from the configurations reached after 200- 1000 ps of the previous simulation. For Asp26 and Asp14, which are buried atoms, convergence was slow and a 3 ns equilibration was performed at each λ. After that a 5 ns production for Asp14 and 6 ns production for Asp26 was carried out. Asp20, which is a surface residue, was equilibrated for 200 ps at each λ and after that a 5 ns production were carried out. At each endpoint a short, 1.5 ns, reverse calculation (λ=0.88729 to λ= 0.5) was carried out for all residues. This was performed to detect possible hysteresis effects in the free energy derivative. The model compound was simulated for about 1.0 ns for each window.

3.4 Setup of implicit solvent calculations

All Generalized Born thermodynamic integration calculations were carried out with AMBER/GBOBC(parameter set I) [25] with a modified form of the parm99 force field, parm99Mod2, [42, 43]. The temperature was set to 300 K, the time step to 2 fs and bonds involving hydrogens were constrained with SHAKE [38].

No cut-off for long-range interactions was used, GB/SA was used and concen- tration of monovalent ions was set to 0.2 M. The same crystal structures as in the explicit solvent run were used. They were prepared by a 20 ps equilibration in which the temperature was gradually raised to 300 K. The coordinates were held fixed with harmonic restrains and gradually released during a period of 30 ps. The TI calculations were performed with five different λ’s between 0 and 1 (0.0, 0.11270, 0.5000, 0.88729, 1.0). For most points a 10 ps equilibration was performed before starting the production runs in order for the protein to adjust to the perturbation. All successive simulations were started from the configu- ration reached after 200-500 ps of the previous simulation except λ = 0 and λ

= 0.11270, which were both started from the equilibrated X-ray structure

(21)

4 Results and Discussion

This section is divided into two parts. The first part discusses the calculations and the agreement of the results with linear response theory. Section two ana- lyzes the structures, the calculated pKa values and compares this work to earlier computational studies and experimental data.

4.1 Linear free energy derivative and Gaussian distribu- tion of the perturbation energy

4.1.1 Explicit solvent calculations

The total production time for the explicit solvent calculations was on average more than 5 ns for each λ value. The calculated free energies of deprotonation are shown in Table 3 together with the estimated errors.

Table 3: Explicit solvent MDFE results (kcal/mol)

Model Asp26 Asp20 Asp14

run length (ns) 3.0 18.0 15.0 15.0

∂G

∂λ(λ=0.11270) -1.3(0.8) -3.1(12.0) -1.0 (1.6) -1.3(1.0)

∂G

∂λ(λ=0.5) -75.3(0.2) -64.5(4.6) -73.9(0.4) -78.3(1.8)

∂G

∂λ(λ=0.88279) -148.6(2.0) -131.3(7.8) -145.1(1.8) -146.8(3.2)

∆G -75.1(0.6) -66.0(4.3) -73.4(0.7) -75.9(1.1)

∆∆G 0 9.1(4.3/2.5) 1.7(0.9) -0.8(1.4)

∆∆G (exper.) - +4.8 0 -2.7

Absolute and relative free energies (kcal/mol) for the three residues and the model compound. ∆G and ∆∆G are calculated as presented in the methods section. The error in the derivatives are estimated as twice the standard deviation of block averages, each trajectory being divided into four blocks.

As described in Section 2.6 a gaussian distribution of the energy gap ( ∆U

= Uprotonated(λ) - Uunprotonated(λ) ) and a linear free energy derivative with respect to λ can be expected from linear response theory. Figure 7 shows the free energy derivative with respect to λ for each residue. Figure 8 shows the log probability distributions of the energy gap for each residue and λ value.

(22)

-160 -140 -120 -100 -80 -60 -40 -20 0

0 0.2 0.4 0.6 0.8 1

G/∂λ

λ Model

-160 -140 -120 -100 -80 -60 -40 -20 0

0 0.2 0.4 0.6 0.8 1

G/∂λ

λ Model

Fit

(a) Model.

-160 -140 -120 -100 -80 -60 -40 -20 0

0 0.2 0.4 0.6 0.8 1

G/∂λ

λ Asp26

-160 -140 -120 -100 -80 -60 -40 -20 0

0 0.2 0.4 0.6 0.8 1

G/∂λ

λ Asp26

Fit

(b) Asp26.

-160 -140 -120 -100 -80 -60 -40 -20 0

0 0.2 0.4 0.6 0.8 1

G/∂λ

λ Asp20

-160 -140 -120 -100 -80 -60 -40 -20 0

0 0.2 0.4 0.6 0.8 1

G/∂λ

λ Asp20

Fit

(c) Asp20.

-160 -140 -120 -100 -80 -60 -40 -20 0

0 0.2 0.4 0.6 0.8 1

G/∂λ

λ Asp14

-160 -140 -120 -100 -80 -60 -40 -20 0

0 0.2 0.4 0.6 0.8 1

G/∂λ

λ Asp14

Fit

(d) Asp14.

Figure 7: The free energy derivative (kcal/mol) as a function ofλ for the explicit solvent calculations: a) Model, B) Asp26, C) Asp20, D) Asp14.

The linearity in the free energy derivatives is close to perfect for all three residues, although it should be pointed out that only three data points were used in this study. Using more points would, of course, more clearly show if the response truly is linear. The ”backward” runs, carried out to detect pos- sible hysteresis, overlapped well with the ”forward” runs for all residues (data not shown). The decrease in the free energy derivative describes the increasing polarization of the environment surrounding Asp. This indicates that the envi- ronment, as described in Section 2.6, responds linearly to the deprotonation of the residues.

References

Related documents

Som ett steg för att få mer forskning vid högskolorna och bättre integration mellan utbildning och forskning har Ministry of Human Resources Development nyligen startat 5

Dels kom han att intaga en ledande ställning i folkskollärarekårens förenings- väsen — redan 1885 vardt han medlem och två år senare sekreterare i centralstyrelsen för Sveriges

A new constant pressure algorithm and periodic boundary conditions in molecular dynamics free energy calculations Title Swedish Abstract Periodic boundary conditions were implemented

In this work pK a shifts of three aminoacids in proteins have been calculated using molecular dynamics free energy simulations with an explicit solvent and implicit Generalized Born

In this work, an algorithm for calculating the free energy of solvation for a molecule, called the generalized Born/surface area (GB/SA) method [1, 2], was implemented into an

(1999) Escherichia coli mutants lacking all possible combinations of eight penicillin binding proteins: viability, characteristics, and implications for peptidoglycan

mmas plerumque magis excelluifie memoria, autima- ginatione, quam judicio Philofophico. Plurimae in Poéfi, Flifioria , &amp; linguis verfatifTimae fuere, fed, quod ad illas

[r]