Benchmarking Physical Properties of Water Models

(1)

Uppsala University

Degree Project C in Physics, 15c

Department of Physics and Astronomy

Division of Molecular and Condensed Matter Physics

Benchmarking Physical Properties of

Water Models

Author:

Tomas Andr´e

Supervisor:

Carl Caleman

Subject reader:

Jens Carlsson

(2)

Abstract

Water is a fundamental part of life as we know it, and by that also a fundamental for biology, chemistry, and parts of physics. Understanding how water behaves and interacts is key in many fields of all these three branches of science. Numerical simulation using molecular dynamics can aid in building insight in the behavior and interactions of water. In this thesis molecular dynamics is used to simulate common rigid 3 point water models to see how well they replicate certain physical and chemical properties as functions of temperature. This is done with molecular dynamics program GROMACS which offers a complete set of tools to run simulations and analyze results. Everything has been auto-mated to work with a python script and a file of input parameters. Most of the models follow the same trends and are valid within a limited temperature range.

Sammanfattning

Vatten är en av de fundamentala byggstenarna för liv, därmed är det även fundamen-talt för biologi, kemi och delar av fysiken. Att först˚a hur vatten beter sig och interagerar är en stor fr˚aga inom dessa tre grenar av vetenskap. Med molekyldynamik g˚ar det att utföra numeriska simuleringar som kan användas som hjälpmedel för att bygga en djupa-re först˚aelse för riktigt vatten. I den här uppsatsen s˚a har molekyldynamik använts till att simulera vanliga rigida 3 punkts parametiseringar av vatten för att se hur bra de kan repli-kera vissa egenskaper som funktioner av temperatur. Simuleringen är gjord med hjälp av molekyldynamik programet GROMACS som ger en fullständig uppsättning verktyg för att simulera och analysera molekylsystem. Alla simuleringar och analys är automatiserat med ett pythonprogram och en fil för parametrar. De allra flesta modeller följer liknande trender och är giltiga inom sm˚a temperaturintervall.

(3)

2.2 Molecular dynamics . . . 3 2.2.1 Lennard-Jones potential . . . 4 2.2.2 Intramolecular interactions . . . 4 2.2.3 The MD algorithm . . . 5 2.3 GROMACS programs . . . 6 3 Method 6 3.1 Simulation setup . . . 6 3.2 Analysis . . . 7 3.3 Automation script . . . 8 3.4 Definition of models . . . 9 4 Result 12 4.1 Density . . . 13 4.2 Enthalpy of vaporization . . . 13

4.3 Static dielectric constant . . . 14

5 Discussion and Conclusion 15 5.1 SPC and TIP3P . . . 16

5.1.1 SPC . . . 17

5.1.2 TIP3P . . . 17

6 Summary and Outlook 17 References 18 A Python code 20 A.1 main.py . . . 20 A.2 run-1.py . . . 21 A.3 run-1000.py . . . 24 A.4 analysis.py . . . 28

(4)

1 Introduction

Water is everywhere, from everyday interactions to scientific studies in chemistry and physics. Water was thought to be one of the four elements, along with fire, earth, and air, by the ancient Greeks. This belief was the leading theory up until the 1700-1800s when the scientific interest in water was sparked. Following the discovery of hydrogen and oxygen and how these formed water there were no longer doubts that water was not an element, but rather a compound. In the centuries that passed both atoms and molecules became studied in more detail and water was an established molecule, the H2O-molecule .

The structure of the water molecule is well known today and consists of one oxygen atom bonded with two hydrogen atoms. The atoms in the H2O-molecule are all electrically neutral

but polar, leading to the H2O-molecule creating a dipole electric field with the positive charge

located at the half-way point between the hydrogen atoms and negative charge located at the oxygen. The H2O-molecule is V-shaped with the angle between the hydrogen atoms being

about 104.5° and the length of the O-H being roughly 0.1nm.

Numerical methods can be used to make theoretical predictions about liquids which can later be tested by real experiments. This is highly beneficial when one would like to under-stand certain complicated systems or systems under extreme conditions since those can be studied with higher control and without real-world interference. Using different models for liquids, or more precisely for this work, water can yield divergent results, where some results may be more close to actual real values than others[1].

Molecular dynamics (MD) is a computer simulation method used for studying the inter-actions and dynamics of molecules and systems of molecules. Numerically solving Newtons equations of motions the trajectories of the constituent molecules of the system can be ac-quired by observing how the system evolves over time. MD is used a lot in biology and chem-istry, e.g. for studying protein folding and cell structures. There are uses in physics as well, most in molecular physics and material physics for obvious reasons. MD is used in this work to simulate different models for water molecules and determining what makes them different. How can these differences impact the physical and chemical properties of the water models.

A model for water is made up of a set of parameters simpler models have less parame-ters while more complex models usually have more. Some examples of common parameparame-ters are the angle between hydrogen, the bond length between oxygen and hydrogen, the electric charges, and two parameters σ and ε. The last two makes up the Lennard-Jones Potential, this will be discussed more in further detail in the theory section. These parameters are what gives rise to models performance and can be varied to more accurately replicate desired properties. At the time this is written there exists no universal model for water, this should be kept in mind when using these numerical simulations. Although the models are simple they can be a very powerful tool for studying water behavior and making theoretical predictions for the real properties of water.

The intent with this project to benchmark a subset of the more commonly used water mod-els with the use of molecular dynamics with GROMACS[2], a molecular dynamics program fitted with all tools needed to simulate and analyze systems of molecules. The aim is to see how physical properties depend on the temperature and see how these properties differ be-tween water models. Water models have widespread use in both computational chemistry and

(5)

computational physics. When simulating bio-molecules and other systems using water as a solvent, the majority of what actually is simulated is water molecules. So the choice of water model could impact the quality of the simulation.

2 Background and theory

In this work the average contents of ’real’ water are not considered. From now on, whenever ’water’ is mentioned, it is referring only to the H2O-molecule, singular or plural if nothing

else is explicitly stated.

2.1 The water molecule and how to parametrize it

The H2O-molecule, as the name implies, consists of two hydrogen atoms and one oxygen

atom. The constituent atoms are all neutral but the molecule is polar, and as a consequence, each H2O-molecule creates a dipole electric field. The dipole field of the H2O-molecule is a

direct cause of the electron distribution in the molecule. The distribution is due to the oxygen atom having a bigger atomic number than the two hydrogen atoms and therefore more pro-tons which in turns puts a lot of the positive charge at the oxygen site. This makes the electron slightly more attracted to the oxygen part of the molecule, and there for creating an overall dipole.

Figure 1: A simple parametrization of a water molecule. Here l1 is the length of the bond, θ

is the angle between the bonds, σ one of the Lennard-Jones parameters and q1 and q2 are the

partial charges of the molecule. Figure adopted from [3]

If one was to try to make a perfect model that would replicate water in all imaginable ways it would be orders of magnitude to complicated to be simulated in large quantities for any remarkable duration of time, when trying to replicate the properties and physics of water molecules with respect to all known interactions it would end up being very complex and com-plicated and thus hard to model. One would find out that simulating more than a few molecules

(6)

at a time would quickly become a cumbersome task due to the immense computation power that would be required. A way to circumvent this problem is to parametrize the molecules with simple models that can be handled with ease in numerical simulations. The most simple way to parametrize water is just to use a neutral sphere. A somewhat more accurate parametriza-tion would be to place three point charges, one at each atom in the molecule. The length of the distance between the oxygen site and the hydrogen site and the angle between the hydrogen bonds. This is seen in Figure 1. With this simple model, water molecules could be simulated with them only interacting with the Coulomb force from the charges. There is however an-other interaction at play here called the Lennard Jones potential which is parametrized by two parameters. The Lennard-Jones potential is explained in greater detail later on.

Not all models use only 3 points but some models use 4 sites, where the 4th site might be used to displace a charge or a model could use 5 sites and split the charge at the oxygen site into multiple points.

By varying these parameters the resulting model will produce different physical and chem-ical properties under computer simulations. The parameters are often experimentally tweaked to give the wanted results, there are however a few models built around ab inito calculations. Currently, there exists no universal model that is used to simulate water, so there are a lot of different models currently in use to replicate different properties.[1]

Some of the water models can be classified into different model types of which three are rigid, flexible and polarizable. Rigid models, as the name suggests have rigid lengths and angles, in other words, they are constant and non-changing in time. However, it is well known that the bonds of water models are not rigid but rather the bonds exhibits vibration. A more accurate representation of water would be to let models be flexible, these models try to imitate the molecular vibrations of water. This implies that the bond lengths and angle can oscillate. Another way to make models better fit real water is to make them polarizable, that is making the charge distribution non-constant and changing over time. Just like in actual water the elec-tron distribution is not constant and hence the electric field is constantly changing.[4] These two more complex model types increase accuracy at the cost of computing power. Models combining these types are also in existence. Only rigid models are considered in this work.

2.2 Molecular dynamics

This section aims to explain what forces and interactions governs the motion of systems of atoms and molecules. Following is some notation. The position of each atom i is kept track of and its position vector ri is simply given by

ri = xix + yˆ iˆy + zizˆ (1)

where xi,yi,zi is the coordinates of the ith atom. The magnitude of ri is given by

|ri| = ri =

q x2

i + yi2+ zi2. (2)

The vector between atom i and j is written as

rij =rj− ri = (xj− xi)ˆx + (yj − yi)ˆy + (zj− zi)ˆz (3)

and likewise

|rij| = rij =

q

(7)

Figure 2: The general form of the Lennard Jones potential[6] parameters σ and ε are marked. Figure adopted from [6].

The initial positions of all atoms in the systems are know. If initial velocities are not present they can be semi-randomly generated by the Maxwell-Boltzmann velocity distribution. [2] 2.2.1 Lennard-Jones potential

The Lennard-Jones potential (LJ potential) is an approximation of the potential governing the interaction between neutral atoms and molecules, first proposed by Lennard-Jones in 1924[5]. It combines a short-range repulsive term and a longer range attractive term, it is parameterized by two parameters. The LJ potential is often used in numerical methods and simulations for its mathematical simplicity which enables fast numerical calculation. The LJ potential has the form VLJ(rij) = 4εij " σij rij 12 − σij rij 6# (5) with εij being the depth of the potential well and σij being the distance to the minimum value

of the potential well.[2] The typical form of the LJ potential can be seen in Figure 2 with ε and σbeing marked in the figure.

The potential can also be rewritten on the form VLJ(rij) = 4εij " σ_ij rij 12 − σij rij 6# = C (12) ij r12 ij − C (6) ij r6 ij , (6)

with two other parameters, C(12)_{and C}(6)_{, which is the form used to define the models later}

in Table 1.

2.2.2 Intramolecular interactions

For flexible models that try to replicate the vibrations of the molecules, it is common to model the vibrations as oscillation’s of the bond length and the angle. Hence the potential is often modeled as harmonic oscillators. With the potential between atoms i and j

Vbond(rij) = 1 2kij(rij − r (0) ij ) 2 ₍₇₎

(8)

for the bond length where r(0)

ij being the equilibrium point, and kij being the proportionality

constant.

Similar for the angle between atoms ijk

Vangle(θijk) = 1 2kijk(θijk− θ (0) ijk) 2 ₍₈₎

where θijk being the angle between atoms i,j,and k, θ_ijk(0)being the equilibrium point, and kijk

being the proportionality constant. 2.2.3 The MD algorithm

Molecular dynamics (MD) is a numerical computer simulation method. It can be used to study large systems of interacting molecules which would be too complicated and time consuming to study otherwise. The idea of MD is to numerically solve Newtons equations of motion for all of the molecules in the system and see how the system evolves with time.

The force on any particle i, Fi is given by

Fi = − X j ∂V (rij) ∂rij ˆ rij

where the potential V is the pair potential between a pair of atoms ij, it is given by the sum of the Coulomb potential, the Lennard-Jones potential, and the intramolecular potentials

V (rij, θijk) = qiqj 4πε0rij + 4εij " σij rij 12 − σij rij 6# +1 2kij(rij − r (0) ij ) 2₊ 1 2kijk(θijk− θ (0) ijk) 2

where qiand qj are the local charges of the atoms. ε0is the permittivity of vacuum. Since this

project has only concerned rigid models it simplifies to V (rij) = qiqj 4πε0rij + 4εij " σij rij 12 − σij rij 6# (9) The motion of the the ith atom can then be found by numerically evaluating Newton’s equa-tions motion d2_r i dt2 = Fi mi . (10) or equivalently, dri dt =vi, dvi dt = Fi mi (11) where vi is the velocity of the ith atom.

With this, the velocity and the position is evaluated using a leap-frog algorithm[2]. They are obtained by vi(t + 1 2∆t) = vi(t − 1 2∆t) + Fi mi ∆t (12) and ri(t + ∆t) =ri(t) +vi(t + 1 2∆t)∆t (13)

respectively, where ∆t is the time step. This can then be repeated for as many time steps as needed. By doing this for all atoms the motion of the system is obtained.

(9)

The MD algorithm uses this setup to simulate systems of molecules as follows:

1. Input the topology of the system, which is the position and velocity coordinates of all atoms in the system.

2. Compute all the forces of the system by evaluating the pair potentials. 3. Numerically solve Newton’s equations of motion for a small time step. 4. Update the position and velocity for all atoms in the system.

Repeat step 2,3,4 for desired simulation length.[2] The relatively simplistic models used in molecular dynamics allow larger systems of molecules to be simulated in shorter times than what would be possible with more complicated models.

GROMACS[2] is a molecular dynamics simulation program, created mainly for simulat-ing complicated bio-molecules such as proteins and lipids, it still offers great advantages for molecular physics and really any system of molecules. The program comes with a large variety of tools for running MD simulations and analyzing the results. All simulations and analysis has been done with GROMACS built-in tools.

GROMACS is run completely from the OS terminal with commands and there is no need for a scripting language.

2.3 GROMACS programs

GROMACS is complete with tools to generate, simulate and analyze systems of molecules and atoms. Throughout this project GROMACS version 4.6.7 is used. To run a simulation three things are needed. Firstly, a .mdp file which specifies parameters for the simulation. Secondly, a .gro structure file that defines the system, i.e location and velocity coordinates for all con-stituent atoms of the system. Lastly, a .top topology file which defines the structure of the molecules in the system. The GROMPP program reads and validates these files and produces a .tpr file, a file prepared to be used as input for simulation. Then the MDRUN programs take the .tpr file as an input and spit out a .edr file containing information about the energy prop-erties from the simulation, a .trr file containing the trajectories for all the simulated atoms and a .gro structure file for the system after a simulation.

The energy file can be probed by a the g energy program which is used to extract data such as enthalpy, potential energy, kinetic energy, etc.

3 Method

3.1 Simulation setup

The simulations use the Nos´e-Hoover temperature coupling and Parrinello-Rahman pressure coupling, these couplings are used to keep the temperature respectively pressure constant during each simulation, with some fluctuations. The pressure were kept at 1bar= 105_{Pa, which}

is close to the atmospheric pressure. GROMACS regular MD integrator is used, which means it follows the MD algorithm. A 1.1nm cutoff for the Lennard-Jones interaction were used, this means that all interactions from the Lennard-Jones potential are ignored after the cutoff distance. All models were simulated in the temperature range from 233.13K to 353.13K with a simulation every 15K. The liquid phase systems were simulated for 1ns with a time-step of

(10)

Figure 3: A frame from the liquid phase simulation for TIP3P at 300K.

0.002ps and the gas phase systems were simulated for 10ns with the same length of the time-step. For the liquid phase, 1000 molecules are placed in a 3nm×3nm×3nm box with periodic boundary conditions as seen in Figure 3. The gas phase contains only one single molecule. All simulations use are of NPT ensemble type, which means that the number of molecules (N), the pressure (P), and temperature (T), are constant in all simulations. The pressure is kept constant across all simulations while the temperature is different but constant in each single simulation. Too keep the pressure constant the box changes its volume to accommodate for the new temperature for each new simulations at a new temperature. To not let this impact the results it is common to run an equilibrium simulation before the real simulation to let the box adjust itself to the new temperature. All for each model at each temperature the system goes through the following four steps:

1. The .mdp files parameters are changed to the correct temperature.

2. An energy minimization is performed to ’relax’ the system (This step is not required for the gas phase system)

3. Two PRE simulations are done to put the system into equilibrium.

4. The LONG simulation of the system is performed. This is the product simulation, or ”real” simulation.

The names PRE and LONG are used to refer to the equilibrium and production simulations.

3.2 Analysis

The set of physical and chemical properties calculated from the data are the density, the en-thalpy of vaporization, and the dielectric constant. Some of them have a direct temperature dependence and some of them have a more indirect temperature dependence.

The density ρ is simply

ρ = M

(11)

where M is the mass of the system and V is the average volume.[7] The enthalpy of vaporization ∆H can be calculated by

∆H = ((E_intragas + kBT ) − (E_intraliquid+ E_interliquid)) (15)

where Eintra is the intramolecular energy and Einter is the intermolecular energy. Though in

the simulations it is sufficient just to use

∆H = ((E_potgas+ kBT ) − E_potliquid) (16)

with Epot being the potential energy which we extract from the simulations.[7]

The static dielectric constant is computed by relating it to the fluctuations of the total dipole moment M of the liquid simulations.

ε = 1 + 4π 3

M2_{− hMi}2

V kBT

(17) where kBis the Boltzmann constant, T is the temperature, and V is the volume.[7]

3.3 Automation script

Doing simulations manually can be rather cumbersome when the number of simulations ex-ceeds a handful, to help with this a python script was created to handle the simulations. The script was created with three aspects in mind. Firstly to minimize required user interaction, secondly to keep all the files and directories sorted and organized so they are simple to nav-igate and find. Lastly, it is made to handle upscaling and expansion without any problems due to the algorithmic way everything is handled increasing size should not be any problem. The script is separated into three different types of scripts, simulation scripts, analysis scripts, and a single master script. The names are quite self-explanatory, simulations scripts run the simulations, analysis scripts run the analysis and the master script is responsible to organize and sort the directories and start the other scripts. The directories are organized by as seen by the visual representation in Figure 5 where blue rounded rectangles represent directories and green rectangles represent files. Objects marked with an asterisk are not unique, meaning they represent multiple objects (i.e. model∗.itp is not a single file but represents all .itp file models).

A simulation script is placed in each ../MODELTYPE/N POINT/ directory to account for dif-ference that could be present in different model types. In the same directory there are two scripts called run-1.py and run-1000.py which have almost the same functionality, one is to simulate the liquid phase system with 1000 H2O-molecules and the gas phase system with one

H2O-molecule respectively. The script works by changing the temperature in the .mdp files

and then running an energy minimization to ”relax” the system. After it runs two PRE simula-tions to put the water into equilibrium. Then the LONG simulation runs. The output structure from the simulation is used in the next iteration for the next temperature. This procedure is repeated until the whole range of temperature has been covered. This is done for both the liquid phase system and the gas phase system.

Similarly, an analysis script is placed in the same directory as the simulation script. It goes through all simulation outputs across the temperature range and extracts data from simulation output. This data is then analyzed and averages and errors are all saved in a results directory.

(12)

This data can be used to plot graphs and study the data from the simulations.

Lastly, the master script, it sorts all directories as seen in Figure 5 and is in charge of running all other scripts. All scripts read an input file where the user can choose which models to simulate, for how long, size of the time-step in the simulations, etc.

Following is a very rough outline for the automation script and a schematic overview in Figure 4, a lot of steps are omitted but this will hopefully give a general idea how the script works and what it does.

Read input file for script parameters Run Simulation script

for All Models do T = Tstart

while T < Tenddo

Run Energy minimization Run Equilibrium simulation 1 Run Equilibrium simulation 2 Run proper Simulation T = T + dT

end

Model = Next Model end

Run analysis script for All Models do

T = Tstart

while T < Tenddo

Extract energy terms Extract trajectory terms Save data in results T = T + dT

end

Model = Next Model end

Figure 4: Schematic overview of the script, START is the directory with model structures.

3.4 Definition of models

All models are defined in Table 1 sorted by year of implementation. The table contains the names and parameters for all models examined. The first column is simply the name of the water model and reference to its origin. The second column contains the implementation years. Third column contains the bond length, i.e the distance from the oxygen atom to the hydrogen atom. The fourth column contains the angle between the hydrogen bonds. Column five and six contains the charges of oxygen and hydrogen respectively. Following in column 7 and 8 the LJ parameters for the oxygen and similar for column 9 and 10 with hydrogen.

(13)

Table 1: Thr ee point mo dels: definition of mo dels.

impl.

R

OH

∠

HOH

q

O

q

H

C

12 O

C

6 O

C

12 H

C

6 H

(nm)

(degr

ees)

(e)

(kJnm

12

/mol)

(kJnm

6

/mol)

(kJnm

12

/mol)

(kJnm

6

/mol)

SPC

[8]

1981

0.10000

109.47 -0.82000

0.41000

0.26331E-05

0.26171E-02

0

0 TIP3P

[9]

1983

0.09572

104.52 -0.83400

0.41700

0.24352E-05

0.24889E-02

0

0 TIP3P-CHARMM

[10]

a

1985

0.09572

104.52 -0.83400

0.41700

2.80544E-06

2.67233E-03

1.29160E-17

3.15333E-09

SPC/E

[11]

1987

0.10000

109.47 -0.84760

0.42380

0.26331E-05

0.26171E-02

0

0 SPC-RE

[12]

1995

0.10000

109.47 -0.80680

0.40340

-0

0 TIP3P-MOD

[13]

1996

0.09572

104.52 -0.83400

0.41700

-F3C

[14]

1997

0.10000

109.47 -0.82000

0.41000

3.13139E-06

3.11195E-03

1.02147E-17

1.30749E-09

MSPC/E

[15]

1998

0.98390

109.47 -0.82160

0.41080

0.25153E-05

0.24976E-02

0

0 SPC/RF

[16]

1998

0.10000

109.47 -0.81240

0.40620

0.24768E-05

0.26171E-02

0

0 SPC/L

[17]

2002

0.11000

104.50 -0.68850

0.34425

0.26341E-05

0.26173E-02

0 0.28090E-06

SPC/A

[17]

2002

0.10000

109.47 -0.81000

0.40500

0.25504E-05

0.26173E-02

0 0.46182E-07

TIP3P/EW

-B

[18]

2004

0.09572

104.52 -0.83000

0.41005

0.18814E-05

0.17915E-02

0

0 TIP3P/EW

-F

[18]

2004

0.09572

104.52 -0.83000

0.41500

0.18489E-05

0.17408E-02

0

a First pr esente d in Reiher ,W .E., III. Ph.D .Thesis, Har var d Univ ersity ,1985.

(14)

Figure 5: Visual overview of the directory and file structure for the simulation where blue rounded rectangles are directories and green rectangles are files. Directories and files marked with an asterisk, ∗, are not unique, meaning there represent multiple of these with different names. The underlying structure for the Results directories have been omitted in favor of non-cluttering and relevance.

(15)

4 Result

For this project 13 rigid 3 point models were studied, the models are as follows in no particu-lar order; TIP3P[9], SPC[8], SPCE[11], F3C[14], MSPCE[15], SPCA[17], SPCL[17], SPCRE[12], SPCRF[16], TIP3PCHARMM[10], TIP3PEWB[18], TIP3PEWF[18], and TIP3PMOD[13]. These are all simple rigid models and are all defined in Table 1.

Each of the examined properties is plotted in Figures 6 - 8 as functions of temperature. For each figure, the indicated models are plotted and an experimental curve as a comparison. The properties studied are chosen since they all are dependent on lots of different other properties, from volume, potential energy, and dipole moment to name a few, all of which GROMACS can extract without much work. These properties are also quite intuitive, it is easy to understand what they actually symbolize and how it could affect simulations. Those factors were impor-tant while working on the script so that errors could be found by observing the results. Also the density could play a big role in say, water transport in bio-molecules such as cell mem-branes and the likes where density of the water could govern the transport rate.

The properties have been fitted to polynomials of the degree that matches the experimental data on the form of a function f such that f(T ) = a0× T0+ a1× T1+ ... + an× Tn, where

T is the temperature and n is the degree of the polynomial which the data has been fitted to. All data points are presented with errorbars which indicate the standard deviation. Following are some clarifying comments and remarks about each of the graphs and statistics for each of the properties.

(16)

4.1 Density

The density dependence on temperature is shown in Figure 6. the data is fitted to a 2nd degree polynomial.[19] All models seem to be following similar trends just with the height shifted. Many models overshoot the experimental value at the lower temperatures and undershoot at the higher temperatures. Experimental values from [20]

Figure 6: Comparison of the density and its dependence on temperature between different water models and experimental data.

4.2 Enthalpy of vaporization

The entalphy of vaporization describes how much energy is needed to transform a given amount of liquid into gas. It could be seen as the amount of energy needed for molecules to overcome the intermolcular interactions and ”break free” into gas. The enthalpy of vapor-ization and its dependence on temperature is shown in Figure 7. It is fitted to a 3rd degree polynomial. A very important thing to note about Figure 7 is that the experimental data is the enthalpy of vaporization at saturation pressure. No good reference data for the pressure at which the calculations were done at could be found. So the experimental data plotted in the Figure 7 is not the true value of the enthalpy of vaporization at 1bar, the real values should probably be higher at lower temperatures and lower at higher temperatures. This can be seen in the models, that the slope is steeper at low temperatures and vice-versa. All models seem to follow the same general trend, SPCE overshoots all other models by quite a lot. The data is quite accurate with low standard deviation. A lot of models are good at replicating the po-tential energy, which the enthalpy of vaporization is dependent on. Experimental values from [21].

(17)

Figure 7: Comparison of the enthalphy of vaporization and its dependence on temperature between different water models and experimental data.

4.3 Static dielectric constant

The dielectric constant is how much charge required to obtain on unit of electric flux. One way to think about it is how well a material will conduct electricity. A material with low dielectric constant will be good at conducting current, while a material with high dielectric constant, i.e a dielectric material (also called insulator for high values of the dielectric constant) will be worse at conducting current. The static dielectric constant and its dependence on temperature is shown in Figure 8. It is fitted to a 3rd degree polynomial.[19] TIP3P tends close to the experimental data[22], with it within its error margins at almost all points. as it stands now not much more can be said without less uncertainty in the data. Higher precision in permittivity often results in inaccuracies for other quantities like enthalpy of vaporization, which often are more desired in most MD simulations. Experimental values from [22].

(18)

Figure 8: Comparison of the dielectric constant and its dependence on temperature between different water models and experimental data.

5 Discussion and Conclusion

With molecular dynamics having such widespread use for studying systems of molecules and with water being heavily present in these systems, it seems crucial to know how the water behaves at different temperatures as it might be a common variable.

Whatever any of the results may indicate one should always remember not to confuse these models with real liquid water. The convenience of simplified molecular models makes them an attractive tool. In fact, they are a very useful tool for making predictions. But looking at the data it becomes clear that the results hugely vary depending on which model has been used which might lead to a different conclusion. So when using water models it might be worth considering to choose a model that is accurate in the temperature ranges of interest since it could impact the outcome of the simulations.

There is still great insight to be gained from these simulations if not about water, but about the models themselves. By looking at the parameters of the models we can see that they are two different angles that are used for all models, about 109.5° and about 104.5°, we can see no clear difference in the groups from the plots, this could be either because the angle does not impact the properties that were studied or that all parameters together have a larger impact or both. We can see that the two first models SPC and TIP3P each belong to one of these groups and that almost all the following SPC respectively TIP3P models keeps to that group, with an exception of SPCL which falls in the 104.5° group. From looking at the parameter we see a trend that the LJ parameters are only defined for the oxygen and not for the hydrogen, this

(19)

simplifies the models. By looking at TIP3PCHARMM and F3C that do use LJ parameters for the hydrogen, it can be seen that they do not behave in any notable way. On the same note, SPCRE and TIP3PMOD do not have any LJ parameters at all, neither do these models exhibit any notable difference. A conclusion we can draw is that many models are accurate within a limited temperature range and does not follow the general behavior of real water and that the rigid 3 point models are not good multipurpose water models but rather usable in very spe-cific conditions. Also, the models all have pretty similar results that seem independent of the parameters. Many models have had their parameters optimized to replicate the most wanted properties, i.e energy, this impacts some of the other properties, in this case, the dielectric constant.

When it comes to water one thing of interest is the density maximum at 4°. None of the stud-ied models are capable of replicating that maximum. The 3-point models lack the complexity to affect it, and thus the form of the density plots are all the same for all models. Some more advanced models with more parameters and thus more degrees of freedom can actually repli-cate the maximum of the density. One such model is TIP5P[23], a 5-point rigid model.

It would be very interesting to do this for other types of models with more than 3 points and also for other than rigid models, such as flexible and polarizable. To see if these other types of models to a better job at replicating properties from real water.

5.1 SPC and TIP3P

SPC and TIP3P are the two oldest models looked at, however, there do exits older models, yet these models still are some of the most common models that are used today. Some extra discussion around these models follow. First a comparison with the calculated values and values form [24] which did similar computations as to this project at 25°C. Both SPC and TIP3P are models optimized for 300K so 25°C is close to the optimized temperature. The comparison is presented in Table 2 and 3. There has been no calculations done at 25°C so data from the fitted polynomials are used instead as comparison. The tables contains the data from the fitted polynomials, data from [24] and the absolute difference, ∆abs, and the relative difference, ∆rel,

for these values.

Calculated data Comparison data[24] ∆abs ∆rel

Density 980[Kg/m3_] _985[Kg/m3_] _5[Kg/m3_] _0.51%

∆Hvap 44300[J/mol] 44936.15[J/mol] 636.15[J/mol] 1.42% Dielectric constant 67 ± 7 60 ± 10 – – Table 2: Comparison between calculated data for SPC in comparison with data form [24] with absolute and relative differences.

Calculated data Comparison data[24] ∆abs ∆rel

Density 987[Kg/m3_] _1002[Kg/m3_] _15[Kg/m3_] _1.49%

∆Hvap 42600[J/mol] 43555.44[J/mol] 955.44[J/mol] 2.14% Dielectric constant 92 ± 10 88 ± 6 – – Table 3: Comparison between calculated data for TIP3P in comparison with data form [24] with absolute and relative differences.

It can be seen from Tables 2 and 3 that the calculated data agrees with similar calculations done before with just over 2% disagreement at most. There is not much to say about the

(20)

dielectric constant due to the large uncertainities, but it can be seen that they do overlap. Other that that for the density and enthalpy of vaporization the calculated data is lower than the comparison data [24]. The calculations are not completely similar since they’ve used Monte Carlo simulations with ”suffciently many steps” to achieve convergence and in the calculations for this project MD has been used with a set amount of steps.

5.1.1 SPC

The SPC model was implemented 1981 [8] with its main goal being a simple, effective and reli-able model. It was made a rigid 3 point model with a single Lennard-Jones potential centered at the oxygen due to its simplicity and was optimized for liquid water at 300K with a density of around 1000Kg/m3_{. The later SPC-like models follow in SPC footsteps and use parameters}

not far from the original. Looking at the results and starting with density, Figure 6, it can be seen that at 300K SPC does not achieve the 1000Kg/m3_{it was designed for, but rather a}

den-sity of about 975Kg/m3_{, so quite a bit off. It does achieve 1000Kg/m}3_{at roughly the melting}

point of real water, around 270K. Looking at the next plot for enthalpy of vaporization, Figure 7, it can be seen that SPC is right on point at 300K and seem to replicate the experimental data. Lastly, in Figure 8 at 300K SPC do have the experimental value within the error of the data point, however, the value coincides much better at 325K.

5.1.2 TIP3P

The TIP3P model was implemented 1983 [9] in a study which compared some water models, including SPC, as well as implementing a few new models like TIP3P and TIP4P. TIP3P is opti-mized for liquid water at 1atm at around 300K. Most of the TIP3P-like models follows TIP3P parameters and keeps the angle close to 109.5°. In neither Figure 6, 7, or 8 TIP3P replicates the experimental values at 300K. It is hard to say which parameters impact what properties from the dataset used.

To conclude, a simulation script that could be easily expanded to simulate and extract data form more models has been created. The models studied are the ones that I could quickly get my hands on to start working on the script. The calculated data seem to coincide with other calculations of the same type, which strengthens the validity of the used methods. A few con-clusions can be drawn but it would be exciting to look at other types of models to see if they behave differently.

6 Summary and Outlook

What has been accomplished so far is a solid ground to build upon. One could expand upon this by using contents of this work to test more models. By doing more accurate simulations and better analysis. Testing more models would be as simple as adding their .itp to the cor-responding directory and adding them to the list in the input files. If a model requires any special treatment it could be in its own category and have its own .mdp files tailored for them. There are also a lot more properties of water one could investigate. The script I’ve created will hopefully be of future use. A few things that could be improved right now is:

• The script runs the simulations in succession, if instead done in parallel the runtime would be reduced a large amount. A temporary solution to reduce the runtime is simply

(21)

to start multiple sessions of the script. Furthermore one could apply more advanced statistical methods to study the data.

• Run the simulations for longer, my runs have been 1ns for the liquid phase systems and 10ns for the gas phase systems. More accurate results could easily be acquired by running longer simulations. Increasing the simulation time 10-fold should be sufficient to start and it would not take that much time.

• Looking at more water models. As mentioned this should not be a big problem.

• More properties could be looked at, this probably requires a little more work than adding new models, but some light editing of the analysis scripts should do it.

The vision I have is that one could collect data from all the water models currently in use, run long simulations and obtain accurate data for as many properties as possible. With this, one could create a collection of data that could be used to aid in any future use of these MD models. Unfortunately, I haven’t had time to accomplish all of that, so I leave that anyone that wants to continue down this path.

References

[1] Bertrand G. A reappraisal of what we have learnt during three decades of computer simulations on water. Journal of Molecular Liquids. 2002;101(1):219–260.

[2] van der Spoel D, Lindahl E, Hess B, and The GROMACS development team. GROMACS User Manual version 4.6.7; 2014. Available from: www.gromacs.org.

[3] Chaplin M. Water Models;. Accessed: 2019-06-03. Available from: http://www1. lsbu.ac.uk/water/water models.html.

[4] Yu H, van Gunsteren WF. Accounting for polarization in molecular simulation. Computer Physics Communications. 2005;172(2):69–85.

[5] Jones JE. On the Determination of Molecular Fields. II. From the Equation of State of a Gas. Proceedings of the Royal Society of London Series A, Containing Papers of a Mathematical and Physical Character. 1924;106(738):463–477. Available from: http: //www.jstor.org/stable/94265.

[6] Wikibooks. Molecular Simulation/The Lennard-Jones Potential — Wikibooks, The Free Textbook Project; 2018. [Online; accessed 4-June-2019. Picture edited to remove element ’Rmin’ and ’(˚A)’]. Available from: https://en. wikibooks.org/w/index.php?title=Molecular Simulation/ The Lennard-Jones Potential&oldid=3498647.

[7] Caleman C, van Maaren PJ, Hong M, Hub JS, Costa LT, van der Spoel D. Force Field Benchmark of Organic Liquids: Density, Enthalpy of Vaporization, Heat Capacities, Sur-face Tension, Isothermal Compressibility, Volumetric Expansion Coefficient, and Dielec-tric Constant. Journal of Chemical Theory and Computation. 2012;8(1):61–74.

[8] Berendsen HJC, Postma JPM, van Gunsteren WF, Hermans J. Interaction Models for Water in Relation to Protein Hydration. In: Pullman B, editor. Intermolecular Forces. Dordrecht: D. Reidel Publishing Company; 1981. p. 331–342.

(22)

[9] Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML. Comparison of simple potential functions for simulating liquid water. Journal of Chemical Physics. 1983;79:926– 935.

[10] MacKerell, Jr AD, Bashford D, Bellott M, Dunbrack, Jr RL, Evanseck JD, Field MJ, et al. All-Atom Empirical Potential for Molecular Modeling and Dynamics Studies of Proteins. Journal of Physical Chemistry B. 1998;102:3586 – 3616.

[11] Berendsen HJC, Grigera JR, Straatsma TP. The missing term in effective pair potentials. Journal of Physical Chemistry. 1987;91:6269–6271.

[12] Berweger CD, van Gunsteren WF, M¨uller-Plathe F. Force field parametrization by weak coupling. Re-engineering SPC water. Chemical Physics Letters. 1995;232:429–436. [13] Neria E, Fischer S, Karplus M. Simulation of activation free energies in molecular systems.

Journal of Chemical Physics. 1996;105(5):1902–1921.

[14] Levitt M, Hirshberg M, Sharon R, Laidig KE, Daggett V. Calibration and testing of a water model for simulation of the molecular dynamics of proteins and nucleic acids in solution. Journal of Physical Chemistry B. 1997;101:5051–5061.

[15] Boulougouris G, Economou I, Theodorou D. Engineering a molecular model for wa-ter phase equilibrium over a wide temperature range. Journal of Physical Chemistry B. 1998;102(6):1029–1035.

[16] van der Spoel D, van Maaren PJ, Berendsen HJC. A systematic study of water models for molecular simulation. Journal of Chemical Physics. 1998;108:10220–10230.

[17] Gl¨attli A, Daura X, van Gunsteren WF. Derivation of an improved simple point charge model for liquid water: SPC/A and SPC/L. Journal of Chemical Physics. 2002;116:9811– 9828.

[18] Price DJ, Brooks III CL. A modified TIP3P water potential for simulation with Ewald summation. Journal of Chemical Physics. 2004;121:10096–10103.

[19] Lide DR, Company CR. CRC Handbook of chemistry and physics. 90th ed. Cleveland, Ohio: CRC Press; 2009.

[20] Gmehling J, editor. Water Density: Datasheet from “Dortmund Data Bank (DDB) – Thermophysical Properties Edition 2014” in SpringerMaterials (https://materials.springer.com/thermophysical/docs/den c174). Springer-Verlag Berlin Heidelberg & DDBST GmbH, Oldenburg, Germany;. Copyright 2010-2014 Springer-Verlag Berlin Heidelberg & DDBST GmbH, Oldenburg, Germany. Available from: https://materials.springer.com/thermophysical/docs/ den c174.

[21] Engineering ToolBox. Water - Heat of Vaporization; 2010. Accessed: 2019-06-10. Available from: https://www.engineeringtoolbox.com/ water-properties-d 1573.html.

[22] Gmehling J, editor. Water Dielectric Constant: Datasheet from “Dortmund Data Bank (DDB) – Thermophysical Properties Edition 2014” in SpringerMaterials

(23)

(https://materials.springer.com/thermophysical/docs/dec c174). Springer-Verlag Berlin Heidelberg & DDBST GmbH, Oldenburg, Germany;. Copyright 2010-2014 Springer-Verlag Berlin Heidelberg & DDBST GmbH, Oldenburg, Germany. Available from: https://materials.springer.com/thermophysical/docs/ dec c174.

[23] Mahoney MW, Jorgensen WL. A five-site model for liquid water and the reproduction of the density anomaly by rigid, nonpolarizable potential functions. Journal of Chemical Physics. 2000;112(20):8910–8922.

[24] Jorgensen WL, Tirado-Rives J, Berne BJ. Potential Energy Functions for Atomic-Level Simulations of Water and Organic and Biomolecular Systems. Proceedings of the Na-tional Academy of Sciences of the United States of America. 2005;102(19):6665–6670.

A

Python code

A.1 main.py

# coding: utf-8

-*-# Last edited by Tomas Andr´e (2019-06-05)

import datetime import os import shutil

from shutil import copyfile

from distutils.dirutil import copytree import subprocess

from subprocess import call import sys

import fileinput import time

from decimal import Decimal

# This file is for sorting through all models and running all simulations scripts. # Psuedocode overview

#

# For all model types # Go to type directory

# For all different point model # Go to point directory

# Run simulation scripts # Run analysis script #

# Working directory should contain the following: # main.py (This script)

# Folders for all models of the form ’Modeltype’/’nPOINT’/START # input.txt (parameters and setup file)

start = time.time()

# Read parameters from file

with open(”input.txt”,”r”) as inp: param = inp.read()

param = param.split(’“n’) for i,line in enumerate(param):

if not (line.startswith(’#’) or line.startswith(’@’)): if line.startswith(’types’):

types = line.replace(”types = ”,””) types = types.split(’,’)

# Check so all input parameters exists. for i,TYPES in enumerate(types):

(24)

if types[i] not in os.listdir(os.getcwd()):

print(”There exists no modeltype folder for: ’” + types[i] + ”’ so it has been removed.”) types.remove(types[i])

# Establish working directory wd = os.getcwd();

# Go through all different types (i.e RIGID, FLEXIBLE...) for i,TYPES in enumerate(types):

modeltype = types[i]; # Go to modeltype dir os.chdir(modeltype);

points = os.listdir(os.getcwd()); cwd = os.getcwd();

# For all diffrent points models (i.e 3POINT, 4POINT, etc) for j,POINT in enumerate(points):

point = points[j]; os.chdir(point);

print(”Currently in ” + os.getcwd());

# Run simulation scripts by calling terminal.

input = (”python run-1.py”,”python run-1000.py”,”python analysis.py”) if (”run-1.py” and ”run-1000.py”) in (os.listdir(os.getcwd())): print(”Now running script files for ” + modeltype + ”/” + point) call(input[0], shell=True)

call(input[1], shell=True) else:

print(”Script files missing.”)

if (”analysis.py” in (os.listdir(os.getcwd()))):

print(”Now running analysis script file for ” + modeltype + ”/” + point) call(input[2], shell=True)

else:

print(”Analysis file missing.”)

os.chdir(cwd); os.chdir(wd); end = time.time()

print(”Script finished running at system time: ” + str(datetime.datetime.now())) print(”Time taken: ” + str((end-start)/(60*60))) + ” hours”

# Here one could make it send an email or something

A.2 run-1.py

# coding: utf-8

import os import shutil

################# SETUP # get the working directory

cwd = os.getcwd() # .../Simulations/RIGID/3POINT/ wd = cwd

(25)

# Find ’input.txt’ to establish parameters and other needed input. os.chdir(’..’)

os.chdir(’..’)

param = param.split(’“n’) for i,line in enumerate(param): # Ignore comments

if not (line.startswith(’#’) or line.startswith(’@’)): # Gromacs directory

if line.startswith(”gd”): gd = line.replace(”gd = ”,””) # Water models to be simulated if line.startswith(”rigid3PModels”): model = line.replace(”rigid3PModels = ”,””) model = model.split(’,’) # Start temp if line.startswith(”Tstart”): Tstart = Decimal(line.replace(”Tstart = ”,””)) # End temp if line.startswith(”Tend”): Tend = Decimal(line.replace(”Tend = ”,””)) # Temperature step if line.startswith(”dT”): dT = Decimal(line.replace(”dT = ”,””)) # Time step length (ps)

if line.startswith(”ts”): ts = Decimal(line.replace(”ts = ”,””)) # Number of timesteps if line.startswith(”long1ts”): long1ts = Decimal(line.replace(”long1ts = ”,””)) if line.startswith(”pre1ts”): pre1ts = Decimal(line.replace(”pre1ts = ”,””))

# Check that all given models are present in START directory os.chdir(wd + ”/START”)

for i, MODEL in enumerate(model):

if (model[i] + ”.itp”) not in os.listdir(os.getcwd()):

print(”Missng .itp-file for ’” + model[i] + ”’ in ” + os.getcwd() + ” so it has been removed.”) model.remove(model[i])

# Go back to working directory os.chdir(wd)

# specify directories to .mdp files for test sim. & long sim.

premdp = wd + ”/START” + ”/pre1base.mdp”

longmdp = wd + ”/START” + ”/long1base.mdp” # make outputfolder & go there

if not os.path.exists(wd + ”/SIMS”): os.makedirs(wd + ”/SIMS”)

os.chdir(wd + ”/SIMS”) ######################### # for every model

# make model-directory and go there # set temperature

# make ’1’-directory # while T¡= Tend

# make temperature-directory and go there (model/1/T) # copy embase.mdp, topol1.top, long.mdp to ”./model/1/T” # insert correct T in long.mdp

# make ”PRE”-directory and go there

# copy embase.mdp, topol1.top, pre1.mdp, pre2.mdp to ”./model/1/T/PRE” # insert correct T in pre.mdp

# make pre-simulations

# use output in long-simulation # T=T+dt

(26)

# For all models do this

for j,Model in enumerate(model): MODEL=model[j]

# Check that we are in the right directory if not os.getcwd()==wd + ”/SIMS”:

os.chdir(wd + ”/SIMS”) # if not, go there.

if not os.path.exists(”./” + MODEL): # make model-folder & go there os.makedirs(”./” + MODEL)

os.chdir(”./” + MODEL)

# Set temperature to starting temperature T = Tstart

# Make directory for gas-phase simulations if not os.path.exists(”./1”):

os.makedirs(”./1”) os.chdir(”./1”)

# While our temperature is in the wanted ranges keep going. while T ¡= Tend:

if not os.getcwd() == wd + ”/SIMS/” + MODEL + ”/1/”: os.chdir(wd + ”/SIMS/” + MODEL + ”/1/”)

print(”solo”+MODEL+””+str(T)+”K”) if not os.path.exists(”./” + str(T)): os.makedirs(”./” + str(T))

os.chdir(”./” + str(T))

cwd = os.getcwd() ## /SIMS/MODEL/1/T

# copy need simulation files from START directory to simulation directory src2 = wd + ”/START/topol1.top”

dst2 = cwd + ”/topol.top”

src3 = wd + ”/START/” + MODEL + ”.itp” dst3 = cwd + ”/topol.itp” src4 = longmdp dst4 = cwd + ”/long.mdp” copyfile(src2, dst2) copyfile(src3, dst3) copyfile(src4, dst4)

# Change simulation temperature and other parameters in long.mdp with open(’long.mdp’, ’r’) as file:

filedata = file.read()

for i,line in enumerate(filedata.split(’“n’)): # For every line if line.startswith(’bd-temp’):

filedata = filedata.replace(line, ”bd-temp = ” + str(T))

if line.startswith(’reft’):

filedata = filedata.replace(line, ”reft = ” + str(T))

if line.startswith(’gentemp’):

filedata = filedata.replace(line, ”gentemp = ” + str(T))

if line.startswith(’dt’):

filedata = filedata.replace(line, ”dt = ” + ts)

if line.startswith(’nsteps’):

filedata = filedata.replace(line, ”nsteps = ” + long1ts)

with open(’long.mdp’, ’w’) as file: file.write(filedata)

# make PRE-simulation folder and copy the files there too if not os.path.exists(”./PRE”): os.makedirs(”./PRE”) os.chdir(”./PRE”) ## /SIMS/MODEL/1/T/PRE cwd = os.getcwd() dst1pre = cwd + ”/em.mdp” dst2pre = cwd + ”/topol.top” dst3pre = cwd + ”/topol.itp”

(27)

src5 = premdp dst5 = cwd + ”/pre.mdp” copyfile(src1, dst1pre) copyfile(src2, dst2pre) copyfile(src3, dst3pre) copyfile(src5, dst5)

# write correct temperatures in pre1.mdp and pre2.mdp with open(’pre.mdp’, ’r’) as file:

for i,line in enumerate(filedata.split(’“n’)): if line.startswith(’bd-temp’):

filedata = filedata.replace(line, ”bd-temp = ” + str(T))

if line.startswith(’reft’):

filedata = filedata.replace(line, ”reft = ” + str(T))

if line.startswith(’gentemp’):

filedata = filedata.replace(line, ”gentemp = ” + str(T))

filedata = filedata.replace(line, ”dt = ” + ts)

filedata = filedata.replace(line, ”nsteps = ” + pre1ts)

with open(’pre.mdp’, ’w’) as file: file.write(filedata)

# Start PRE simulation

# If it is the first sim, use starting structure, if not use output from prevous simulation i.e. /SIMS/MODEL/1/T-dT

if T==Tstart:

instructure = wd + ”/START/1.gro” if not T==Tstart:

Tprev = T - dT

instructure = wd + ”/SIMS/” + MODEL + ”/1/” + str(Tprev) + ”/confout.gro”

input = (gd + ”/grompp -f em.mdp -c ” + instructure + ” -p topol.top -o em.tpr -maxwarn 2”, “ gd + ”/mdrun -s em -c afterem.gro -nt 1”, “

gd + ”/grompp -f pre.mdp -c afterem.gro -p topol.top -o pre.tpr -maxwarn 2”, “ gd + ”/mdrun -s pre -c afterpre.gro -x pre.xtc -o pre.trr -e pre.edr -nt 1”) call(input[0], shell=True)

call(input[1], shell=True) call(input[2], shell=True) call(input[3], shell=True)

# go up 1 folder to run the LONG simulation os.chdir(’..’)

input = (gd + ”/grompp -f long.mdp -c PRE/afterpre.gro p topol.top o topol.tpr

-maxwarn 2”, “ gd + ”/mdrun -s topol.tpr -nt 1”) call(input[0], shell=True) call(input[1], shell=True) T = T + dT

A.3 run-1000.py

# coding: utf-8

(28)

################### SETUP

# Working directory cwd = os.getcwd() wd = cwd

# Go find input.txt to establish parameters and other needed input. os.chdir(’..’)

os.chdir(’..’)

if line.startswith(”gd”): gd = line.replace(”gd = ”,””) # Water models to be simulated if line.startswith(”rigid3PModels”): model = line.replace(”rigid3PModels = ”,””) model = model.split(’,’) # Start temp if line.startswith(”Tstart”): Tstart = Decimal(line.replace(”Tstart = ”,””)) # End temp if line.startswith(”Tend”): Tend = Decimal(line.replace(”Tend = ”,””)) # Temperature step if line.startswith(”dT”): dT = Decimal(line.replace(”dT = ”,””)) # Time step length (ps)

if line.startswith(”ts”): ts = Decimal(line.replace(”ts = ”,””)) # Number of timesteps if line.startswith(”long1000ts”): long1000ts = Decimal(line.replace(”long1000ts = ”,””)) if line.startswith(”pre1000ts”): pre1000ts = Decimal(line.replace(”pre1000ts = ”,””))

# Check that all given models are present in START dir os.chdir(wd + ”/START”)

for i, MODEL in enumerate(model):

if (model[i] + ”.itp”) not in os.listdir(os.getcwd()):

print(”Missng .itp-file for ” + model[i] + ” in ’” + os.getcwd() + ”’ so it has been removed.”) model.remove(model[i])

# Location of mdp files (.mdp files) for pre sim and long sim

premdp1 = wd + ”/START” + ”/pre1000base1.mdp”

premdp2 = wd + ”/START” + ”/pre1000base2.mdp”

longmdp = wd + ”/START” + ”/long1000base.mdp” # make simulation outputfolder & go there if not os.path.exists(wd + ”/SIMS”): os.makedirs(wd + ”/SIMS”)

os.chdir(wd + ”/SIMS”)

############################################### # for every model

# make model-directory and go there # set temperature

# make ’1000’-directory # while T¡= Tend

# make temperature-directory and go there (model/1000/T) # copy embase.mdp, topol1000.top, long.mdp to ”./model/1000/T” # insert correct T in long.mdp

(29)

# copy embase.mdp, topol1000.top, pre.mdp to ”./model/1000/T/PRE”

# insert correct T in pre.mdp

# make pre-simulation

# use output in long-simulation # T=T+dt

for j,Model in enumerate(model): MODEL=model[j]

if not os.getcwd()==wd + ”/SIMS”: os.chdir(wd + ”/SIMS”) # go back

if not os.path.exists(”./” + MODEL): # make model-folder & go there os.makedirs(”./” + MODEL) os.chdir(”./” + MODEL) T = Tstart if not os.path.exists(”./1000”): os.makedirs(”./1000”) os.chdir(”./1000”) while T ¡= Tend: print(”multi”+MODEL+””+str(T)+”K”)

if not os.getcwd() == wd +”/SIMS/” + MODEL + ”/1000”: os.chdir(wd +”/SIMS/” + MODEL + ”/1000”)

if not os.path.exists(”./” + str(T)): os.makedirs(”./” + str(T))

os.chdir(”./” + str(T))

cwd = os.getcwd() ## /SIMS/MODEL/1000/T # copy & paste simulation files

src1 = wd + ”/START/embase.mdp” dst1 = cwd + ”/em.mdp”

src2 = wd + ”/START/topol1000.top” dst2 = cwd + ”/topol.top”

src3 = wd + ”/START/” + MODEL + ”.itp” dst3 = cwd + ”/topol.itp” src4 = longmdp dst4 = cwd + ”/long.mdp” copyfile(src1, dst1) copyfile(src2, dst2) copyfile(src3, dst3) copyfile(src4, dst4)

# write correct temperatures and other parameters in long.mdp with open(’long.mdp’, ’r’) as file:

for i,line in enumerate(filedata.split(’“n’)): # For every new line #if line.startswith(’bd-temp’):

# filedata = filedata.replace(line, ”bd-temp = ” + str(T))

if line.startswith(’ref-t’):

filedata = filedata.replace(line, ”ref-t = ” + str(T))

if line.startswith(’gen-temp’):

filedata = filedata.replace(line, ”gen-temp = ” + str(T))

filedata = filedata.replace(line, ”dt = ” + str(ts))

filedata = filedata.replace(line, ”nsteps = ” + str(long1000ts))

with open(’long.mdp’, ’w’) as file: file.write(filedata)

# make PRE-simulation folder and copy the files there too if not os.path.exists(”./PRE”): os.makedirs(”./PRE”) os.chdir(”./PRE”) ## /SIMS/MODEL/1000/T/PRE cwd = os.getcwd() dst1pre = cwd + ”/em.mdp” dst2pre = cwd + ”/topol.top” dst3pre = cwd + ”/topol.itp” src5 = premdp1 src6 = premdp2 dst5 = cwd + ”/pre1.mdp”

(30)

dst6 = cwd + ”/pre2.mdp” copyfile(src1, dst1pre) copyfile(src2, dst2pre) copyfile(src3, dst3pre) copyfile(src5, dst5) copyfile(src6, dst6)

# write correct temperatures in em-mdp.-file with open(dst1pre, ’r’) as file:

for i,line in enumerate(filedata.split(’“n’)): # For every new line #if line.startswith(’bd-temp’):

with open(dst1pre, ’w’) as file: file.write(filedata)

# write correct temperatures and other parameters in pre1.mdp and pre2.mdp with open(’pre1.mdp’, ’r’) as file:

for i,line in enumerate(filedata.split(’“n’)): # if line.startswith(’bd-temp’):

filedata = filedata.replace(line, ”nsteps = ” + str(pre1000ts))

with open(’pre1.mdp’, ’w’) as file: file.write(filedata)

with open(’pre2.mdp’, ’r’) as file: filedata = file.read()

for i,line in enumerate(filedata.split(’“n’)): # if line.startswith(’bd-temp’):

filedata = filedata.replace(line, ”nsteps = ” + str(pre1000ts))

with open(’pre2.mdp’, ’w’) as file: file.write(filedata)

# Start PRE simulation

# If it is the first sim, use starting structure, if not use output from prevous simulation i.e. /SIMS/MODEL/1/T-dT

if T==Tstart:

instructure = wd + ”/START/1000.gro” if not T==Tstart:

Tprev = T - dT

instructure = wd + ”/SIMS/” + MODEL + ”/1000/” + str(Tprev) + ”/confout.gro” input = (

gd + ”/grompp -f em.mdp -c ” + instructure + ” -p topol.top -o em.tpr -maxwarn 2”, gd + ”/mdrun -s em.tpr -c afterem.gro -nt 4 ”,

(31)

gd + ”/grompp -f pre1.mdp -c afterem.gro -p topol.top -o pre1.tpr -maxwarn 2”,

gd + ”/mdrun -s pre1.tpr -c afterpre1.gro -x pre1.xtc -o pre1.trr -e pre1.edr -nt 4”,

gd + ”/grompp -f pre2.mdp -c afterpre1.gro -p topol.top -o pre2.tpr -maxwarn 2”, gd + ”/mdrun -s pre2.tpr -c afterpre2.gro -x pre2.xtc -o pre2.trr -e pre2.edr -nt 4”) # Making the energy minimization

call(input[0], shell=True) call(input[1], shell=True)

# Pre simulation using the EM output call(input[2], shell=True)

call(input[3], shell=True) call(input[4], shell=True) call(input[5], shell=True)

# Go up 1 folder to run the long simulation using output form the pre-simulation os.chdir(’..’)

input = (

gd + ”/grompp -f long.mdp -c PRE/afterpre2.gro -p topol.top -o topol.tpr -maxwarn 2”, gd + ”/mdrun -s topol.tpr -nt 4”) call(input[0], shell=True) call(input[1], shell=True) T = T + dT

A.4 analysis.py

# coding: utf-8

######## SETUP

# Working directory cwd = os.getcwd() wd = cwd

# Go find input.txt to establish parameters and other needed input. os.chdir(’..’)

os.chdir(’..’)

if line.startswith(”gd”): gd = line.replace(”gd = ”,””) print(gd)

# Water models to be simulated if line.startswith(”rigid3PModels”):

model = line.replace(”rigid3PModels = ”,””) model = model.split(’,’)

(32)

if line.startswith(”Tstart”): Tstart = Decimal(line.replace(”Tstart = ”,””)) # End temp if line.startswith(”Tend”): Tend = Decimal(line.replace(”Tend = ”,””)) # Temperature step if line.startswith(”dT”): dT = Decimal(line.replace(”dT = ”,””))

if not os.path.exists(os.getcwd() + ”/RESULTS”): os.mkdir(os.getcwd() + ”/RESULTS”) os.mkdir(os.getcwd() + ”/RESULTS” + ”/1”) os.mkdir(os.getcwd() + ”/RESULTS” + ”/1000”) os.chdir(os.getcwd() + ”/RESULTS”) resultDir = os.getcwd() # go to outputfolder os.chdir(wd + ”/SIMS”);

# For both the gas-phase and liquid-phase simulations. first = False;

ii = 1

while ii ¡ 2:

# For all specified models for j,Model in enumerate(model): MODEL=model[j]

if not os.getcwd()==wd + ”/SIMS”: os.chdir(wd + ”/SIMS”) # go back os.chdir(”./” + MODEL)

if first: os.chdir(”./1”) else:

os.chdir(”./1000”)

# create an analysis folder

if not os.path.exists(”./ANALYSIS”): os.makedirs(”./ANALYSIS”) os.chdir(”./ANALYSIS”) cwd1 = os.getcwd() ## /SIMS/MODEL/(1),(1000)/analysis fileName = (”potentialData”,”epotData”,”temperatureData”,”pressureData”,”volumeData”,”enthalpyData”,”densityData”,”epsilonData”,”alphaData”,”kappaData”,”cpData”,”deltaCData”,”tempData”) potentialData = []; etotData = []; temperatureData = []; pressureData = []; volumeData = []; enthalpyData = [] densityData = []; epsilonData = []; alphaData = []; kappaData = []; cpData = []; deltaCData = []; tempData = []; T = Tstart while T ¡= Tend: os.chdir(’..’) os.chdir(”./” + str(T)) cwd2 = os.getcwd() ## /SIMS/MODEL/1000/T

# copy & paste simulation files needed for analysis: ener, topol.tpr, traj # cwd1 - analysis directory

# cwd2 - simulation directory src1 = cwd2 + ”/ener.edr”;

dst1 = cwd1 + ”/ener-” + str(T) + ”K.edr”; src2 = cwd2 + ”/topol.tpr”;

(33)

dst2 = cwd1 + ”/topol-” + str(T) + ”K.tpr”; src3 = cwd2 + ”/traj.trr”; dst3 = cwd1 + ”/traj-” + str(T) + ”K.trr”; copyfile(src1, dst1); copyfile(src2, dst2); copyfile(src3, dst3); # Go to analysis directory os.chdir(cwd1); input = (

”echo e ’Temperature“n Potential“n Pressure“n Enthalpy“n Volume“n TotalEnergy“n 0’— ” + gd + ”/genergy f ” + dst1 + ” s ” + dst2 + ” o energy” + str(T) + ”K.xvg nmol 1000 fluctprops

-driftcorr ¿ ’fluct” + MODEL + str(T) + ”K.txt’”,

”echo ’1’— ” + gd + ”/gdensity -f ” + dst3 + ” -s ” + dst2 + ” -o density” + str(T) +”K.xvg”, ”echo ’1’— ” + gd + ”/gdipoles f ” + dst3 + ” s ” + dst2 + ” o mtot” + str(T) + ”K.xvg -a -aver” + str(T) + ”K.xvg -eps epsilon” + str(T) + ”K.xvg -d dipdist” + str(T) + ”K.xvg”) call(input[0], shell = True)

call(input[1], shell = True) call(input[2], shell = True)

# All relevant data should now be stored in ’outputTK.xvg’ files.

# Appned data into files input = (

gd + ”/ganalyze -f energy” + str(T) + ”K.xvg — grep -i ’ss1’ ¿ Potential” + MODEL + str(T) + ”K.txt”, gd + ”/ganalyze -f energy” + str(T) + ”K.xvg — grep -i ’ss2’ ¿ Etot” + MODEL + str(T) + ”K.txt”,

gd + ”/ganalyze -f energy” + str(T) + ”K.xvg — grep -i ’ss3’ ¿ Temperature” + MODEL + str(T) + ”K.txt”, gd + ”/ganalyze -f energy” + str(T) + ”K.xvg — grep -i ’ss4’ ¿ Pressure” + MODEL + str(T) + ”K.txt”, gd + ”/ganalyze -f energy” + str(T) + ”K.xvg — grep -i ’ss5’ ¿ Volume” + MODEL + str(T) + ”K.txt”, gd + ”/ganalyze -f energy” + str(T) + ”K.xvg — grep -i ’ss6’ ¿ Enthalpy” + MODEL + str(T) + ”K.txt”, gd + ”/ganalyze -f density” + str(T) + ”K.xvg — grep -i ’ss1’ ¿ Density” + MODEL + str(T) + ”K.txt”, gd + ”/ganalyze -f epsilon” + str(T) + ”K.xvg — grep -i ’ss1’ ¿ Epsilon” + MODEL + str(T) + ”K.txt”) call(input[0], shell = True)

call(input[1], shell = True) call(input[2], shell = True) call(input[3], shell = True) call(input[4], shell = True) call(input[5], shell = True) call(input[6], shell = True) call(input[7], shell = True) # Extracting the fluct properties.

with open(”fluct” + MODEL + str(T) + ”K.txt”,”r”) as file: lines = file.readlines()

for i,line in enumerate(lines): if line.startswith(”Coefficient”):

alpha = ((line.replace(”Coefficient of Thermal Expansion AlphaP = ”,””)).split())[0] if line.startswith(”Isothermal”):

kappa = ((line.replace(”Isothermal Compressibility Kappa = ”,””)).split())[0]

if line.startswith(”Heat capacity at”):

cp = ((line.replace(”Heat capacity at constant pressure Cp = ”,””)).split())[0]

if line.startswith(”Cp-Cv”):

delC = ((line.replace(”Cp-Cv = ”,””)).split())[0]

with open(”Alpha” + MODEL + str(T) + ”K.txt”, ’w+’) as file: file.write(alpha + ’“n’)

with open(”Kappa” + MODEL + str(T) + ”K.txt”, ’w+’) as file: file.write(kappa + ’“n’)

with open(”Cp” + MODEL + str(T) + ”K.txt”, ’w+’) as file: file.write(cp + ’“n’)

Benchmarking Physical Properties of Water Models

Uppsala University

Degree Project C in Physics, 15c

Department of Physics and Astronomy

Division of Molecular and Condensed Matter Physics

Benchmarking Physical Properties of

Water Models

Author:

Tomas Andr´e

Supervisor:

Carl Caleman

Subject reader:

Jens Carlsson

Contents

1

Introduction

2

Background and theory

2.1

The water molecule and how to parametrize it

2.2

Molecular dynamics

2.3

GROMACS programs

3

Method

3.1

Simulation setup

3.2

Analysis

3.3

Automation script

3.4

Definition of models

impl.

R

∠

q

q

C

C

C

C

(nm)

(degr

ees)

(e)

(e)

(kJnm

/mol)

(kJnm

/mol)

(kJnm

/mol)

(kJnm

/mol)

SPC

[8]

1981

0.10000

109.47

-0.82000

0.41000

0.26331E-05

0.26171E-02

0

0

TIP3P

[9]

1983

0.09572

104.52

-0.83400

0.41700

0.24352E-05

0.24889E-02

0

0

TIP3P-CHARMM

[10]