Evaluation of ab initio molecular docking for prediction of amyloid-ligand interactions


Linköping University | Department of Physics, Chemistry and Biology | Bachelor thesis, 16 hp | Chemistry - Molecular Design | Spring term 2020 | LITH-IFM-G-EX--20/3849--SE


Leo Juhlin

Examiner: Per Hammarström

Tutor: Sofie Nyström


Division, Department

Department of Physics, Chemistry and Biology

Linköping University

Date: 2020-06-08

ISRN: LITH-IFM-G-EX--20/3849--SE

Abstract

Alzheimer's disease is a neurodegenerative disorder affecting more than 34 million people worldwide and is postulated to be caused by an abnormal accumulation of Aβ-amyloid plaques. The plaques consist of misfolded peptides, which allows them to adopt multiple morphologies. One way of studying these polymorphic amyloid fibrils is to use ligands with affinity for Aβ-amyloid fibrils. The purpose of this report was to evaluate whether the in silico method ab initio molecular docking can be used to accurately predict amyloid-ligand interactions. The method was evaluated by assessing its ability to reproduce experimentally derived relative binding affinities and the conformations of amyloid-ligand complexes. The results indicate several complications of ab initio molecular docking in performing these tasks. However, some modifications to the protocol used in this report might improve the performance of the method.

Keywords

Molecular docking, poses, conformations, binding modes, clusters, Alzheimer's disease, binding affinity, RMSD.


Abbreviations

AD – Alzheimer's disease

Aβ – Amyloid-beta peptide

MD – Molecular dynamics

NSB – Ligand with affinity for Aβ-amyloid fibrils (5,5'-((1E,1'E)-naphthalene-2,6-diylbis(ethene-2,1-diyl))bis(2-hydroxybenzoic acid))

TSB – Ligand with affinity for Aβ-amyloid fibrils (5,5'-((1E,1'E)-thiophene-2,5-diylbis(ethene-2,1-diyl))bis(2-hydroxybenzoic acid))

p-FTAA – Ligand with affinity for Aβ-amyloid fibrils (pentamer-formyl-thiophene acetic acid)

AD4 – Autodock4

Vina – Autodock Vina

ADT – AutoDockTools

VS – Virtual screening

Å – Ångström (10⁻¹⁰ m)


Introduction

Alzheimer's disease

Alzheimer's disease (AD) is a neurodegenerative disorder characterized by impaired cognitive abilities and memory loss(1). AD accounts for 50-75% of all cases of dementia and currently affects more than 34 million people worldwide(2, 3). There is currently no known cure for AD, and treatments are palliative(1). On a macroscopic scale, the AD brain features atrophy in different areas, resulting in enlargement of the sulcal spaces due to atrophy of the gyri, decreased brain weight and enlargement of the lateral ventricles(4). Microscopic features of the AD brain include extracellular amyloid plaques composed of the Aβ peptide and intracellular neurofibrillary tangles consisting of misfolded tau protein.

Amyloid cascade hypothesis

The amyloid cascade hypothesis (ACH) has for the past 20 years been the most influential hypothesis for AD research(5). The ACH proposes that AD is caused by an abnormal accumulation of amyloid-β (Aβ) plaques in the brain which subsequently triggers a pathological cascade, involving formation of neurofibrillary tangles, inflammatory response, oxidative injury, cell death and dementia(6).

The Aβ plaques consist of the Aβ peptide, which is a 39-43 residue long peptide produced by cleavage of the transmembrane amyloid precursor protein (APP) by β- and γ-secretase(6). The C-terminal end of APP is cleaved in different ways, yielding Aβ peptides of varying lengths, with the Aβ40 isoform being the most common and the more hydrophobic Aβ42 isoform being the second most prevalent. Following cleavage of APP, the Aβ peptide is released into the extracellular space, where it is rapidly degraded in healthy subjects(5).

The ACH furthermore proposes that the abnormal accumulation of Aβ plaques is caused by changes in the steady-state level of the Aβ peptide(7). This shift in the equilibrium can be mediated by an increased production of Aβ, a reduction in Aβ degradation or an increase in the Aβ42/Aβ40 ratio.

Amyloid fibril polymorphism

The Aβ plaques consist of misfolded extracellular Aβ peptides which have assembled into protofilaments and subsequently aggregated into amyloid fibrils(8, 9). Upon formation of the protofilaments, the β-strands of the Aβ peptides assemble into β-sheets, which are stabilized by backbone hydrogen bonding. The β-strands run perpendicular to the fibril axis while the β-sheets run parallel to it; this arrangement is called the cross-β structural pattern(8). The protofilaments are often assembled in a twisted conformation in mature amyloid fibrils, which can be seen in Figure 1(9).

Misfolded proteins exhibit a high degree of disorder(10). Therefore, multimerization of a misfolded protein, e.g. Aβ aggregating into amyloid fibrils, leads to an aggregated protein exhibiting multiple morphologies. Amyloid fibril polymorphism is a research area describing variations in amyloid fibril structures.

The Aβ fibril has been shown to be polymorphic in patient samples, and several different fibril structures of Aβ40 and Aβ42 have been determined in vitro. The described variations in the Aβ amyloid fibrils indicate in


thresholds of AD and responding differently to certain treatments. Thus, studying amyloid fibril polymorphism offers an opportunity for potentially individualized treatment of AD.

A useful tool for studying amyloid polymorphism is the use of ligands with affinity for amyloid fibrils whose spectral or fluorescent properties change depending on the type of amyloid fibril morphology they are bound to. The first example of amyloid fibril polymorphism was in fact observed as a difference in the binding of the ligand Pittsburgh compound B (PIB) to different samples of amyloid fibrils. PIB bound abundantly to brain Aβ but with a lower affinity to Aβ from animal models of AD, thus revealing a difference in amyloid fibril morphology between the samples.

Figure 1. Representation of an Aβ-amyloid fibril (PDB code 2BEG (11)). The fibril structure was duplicated in CreateFibril v 2.5 (12) and the image was made in SAMSON (13).

Purpose of the report

The purpose of this report is to investigate whether molecular docking can be used to accurately predict amyloid-ligand interactions. The ligands in focus all have properties useful for detecting and analysing amyloid fibrils. A reliable docking protocol would provide a tool for the design of new ligands with useful properties for the study of amyloid fibrils and amyloid fibril polymorphism.

Previous computational studies of ligand interactions with amyloid fibrils have used either molecular dynamics(14) or data-driven docking(15, 16). Both approaches are demanding, since data-driven docking requires prior experimental knowledge and molecular dynamics requires substantial computational power(17, 18).

This work will use a third alternative, called ab initio molecular docking, and perform the molecular docking with limited prior experimental knowledge about the amyloid-ligand interaction and without computationally demanding simulations. If this alternative is found to be successful, it adds a time- and labour-efficient option for the computational analysis of ligands capable of detecting and/or distinguishing amyloid fibril structures. This approach has been used earlier in the AD field, but with a focus on therapeutic drugs targeting different key structures in the Aβ pathology(19, 20).

To evaluate the performance of the ab initio molecular docking, two approaches will be used. Both approaches will evaluate features characteristic of molecular docking.

The first approach will use the modelled amyloid-ligand complexes(15, 16) described in the section “HADDOCK modelled amyloid-ligand complexes” and try to reproduce (redock) the conformations of the complexes. This will verify whether the docking process is able to accurately predict conformations of amyloid-ligand complexes.

The second approach will dock the ligands reported by Zhang et al.(21), described in the section “Fluorescent ligands for detection of amyloid fibril structures”, to Aβ-amyloid fibrils and try to reproduce the relative binding affinities reported for the ligands in the same article. This approach will verify whether the molecular docking can accurately predict relative binding affinities of amyloid-ligand complexes.

Computational methods to analyse amyloid-ligand interactions

There are various options for the computational analysis of protein-ligand interactions. These include molecular dynamics and molecular docking.

Molecular dynamics

Molecular dynamics (MD) simulates the movement of atoms in molecular or protein systems(18). The movement of the atoms is based on a model of interatomic interactions, and MD can be used to simulate various biomolecular processes such as ligand binding and protein folding. MD has previously been applied to the study of amyloid fibril-ligand interactions, which enabled the identification of a binding site for the amyloid ligand p-FTAA(14).

Molecular docking

Molecular docking is an in silico method to predict the interactions between a ligand and a biological target as well as the binding affinity between them(22). The molecular docking process generates different binding modes (poses) and ranks them according to binding affinity. Molecular docking thereby offers an efficient way of studying protein-ligand interactions and can further be used to improve the design of different types of drugs. The estimated binding affinities obtained by molecular docking have been shown to correlate poorly with experimental binding affinities(23). However, the binding affinities can still be used to compare the activity of different ligands on the same biological target, i.e. used as relative binding affinities between different ligands, a concept commonly used in virtual screening (VS)(24).

The molecular docking process varies with the amount of information given before the docking and can therefore be divided into two categories, ab initio docking and data-driven docking(17).

The ab initio approach only considers the starting coordinates of the system and disregards any prior experimental knowledge about the targeted docking system. However, experimental knowledge can be used to assess the results obtained from the docking process.

In contrast, data-driven docking uses either experimental or predicted information about the system to drive the docking process. The information provides knowledge about the interface region between the molecular entities and/or the relative positions of the molecular components. This can include chemical shift perturbations from NMR experiments, mutagenesis data, hydrogen/deuterium exchange and bioinformatic interface predictions. Data-driven docking generally produces more reliable results than ab initio molecular docking(25).

This report will mainly focus on the ab initio docking approach. Docking runs using the data-driven approach are presented in the appendix. The data-driven docking approach was used to test whether previous results from data-driven docking were reproducible or whether any specific unknown parameters contributed to the conformations generated by data-driven docking. The data-driven docking was also used to test its ability to reproduce the relative binding affinities of TSB and NSB to the Aβ-amyloid fibril(25). However, it was found that this approach requires much more experimental knowledge to accurately reproduce relative binding affinities, which is why it was not extensively focused on in this report.


Fluorescent ligands for detection of amyloid fibril structures

Zhang et al. describe the synthesis of fluorescent ligands, seen below in Figure 2, with useful properties for studying amyloid fibril polymorphism(21).

Figure 2. Molecular representations of X-34 and the X-34 analogues TSB, NSB, QSB and BTDSB presented in the article by Zhang et al. The molecules were generated by substituting the central benzene unit of X-34. The molecular structures were drawn using ChemDraw Ultra 12.0 (27).

The ligands showed a substantial difference in binding affinity, as can be seen in Table 1. The largest difference in binding affinity is observed between the TSB and NSB ligands, which display a binding affinity ratio of roughly 1:47 in the competition assay and 1:20 in the fluorescence assay (Table 1). Since molecular docking is inaccurate at reproducing absolute binding affinity values, the aim will be to reproduce a specific binding affinity ratio, relying on the same principle as virtual screening(23, 24). The ratio to be reproduced is the TSB/NSB ratio, since it is the largest binding affinity ratio among the ligands presented in Figure 2 and will therefore yield the most easily interpreted results. An additional reason for choosing these ligands as references is their high rigidity, since both ligands contain ethenyl linkers and only 8 rotatable bonds. The docking accuracy of several docking programs has been observed to decrease with an increasing number of rotatable bonds(28).
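To relate the experimental ratios to the energy scale on which docking scores are reported, the ratios can be converted into a free energy difference via ΔΔG = RT ln(ratio). The sketch below is illustrative only: it assumes that the EC50 and KD ratios can be treated as ratios of dissociation constants and that the temperature is 298.15 K, neither of which is stated in the source. The function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def ratio_to_ddg(affinity_ratio, temperature_k=298.15):
    """Free energy difference (kcal/mol) implied by a ratio of
    dissociation-type constants (assumption: EC50 or KD ratio)."""
    return R_KCAL * temperature_k * math.log(affinity_ratio)

def ddg_to_ratio(ddg_kcal, temperature_k=298.15):
    """Inverse conversion: affinity ratio implied by a score difference."""
    return math.exp(ddg_kcal / (R_KCAL * temperature_k))

# Ratios quoted in the text for TSB versus NSB
for assay, ratio in (("competition assay", 47.0), ("fluorescence assay", 20.0)):
    print(f"{assay}: 1:{ratio:.0f} -> {ratio_to_ddg(ratio):.2f} kcal/mol")
```

Under these assumptions the two ratios correspond to roughly 2.3 and 1.8 kcal/mol, which gives an idea of the score difference between TSB and NSB that a docking run would need to resolve.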

Table 1. Binding affinities for the X-34 molecule and the X-34 derivatives TSB, NSB, QSB and BTDSB. The binding affinities are expressed as EC50 values from the competition assay and KD values from the fluorescence assay. Both values are expressed in nanomolar (nM). Table adapted from Zhang et al.(21).

Ligand    Aβ EC50 [nM]    Aβ KD [nM]
X-34      80              80
BTDSB     140             750
QSB       120             16
NSB       750             120


Amyloid fibril binding site of p-FTAA

König et al. performed an unbiased molecular dynamics simulation and identified an Aβ-amyloid fibril binding site for the luminescent conjugated polythiophene p-FTAA, a ligand used for detecting and analysing amyloid fibrils(14, 29). The binding site is located in the groove formed by the residues Lys16, Val18 and Phe20 on 7 sequential Aβ monomers. As can be seen in Figure 3, the negatively charged carboxylate groups of the p-FTAA ligand interact electrostatically with the positively charged lysine wall. The molecule is positioned flat along the fibril axis, with the carbon and sulphur atoms interacting with the Val18 and Phe20 residues via van der Waals interactions. This binding site will be used for evaluating the ability of ab initio molecular docking to reproduce the relative binding affinities of NSB and TSB.

Figure 3. Illustration of the identified binding site of an Aβ-amyloid fibril for the p-FTAA ligand. The Phe20 residues are depicted in red, the Val18 residues in dark blue and the Lys16 residues in pink. As can be seen, the carboxylate groups of p-FTAA interact with the positively charged lysine residues and the core of the molecule interacts with the Val18 and Phe20 residues via van der Waals interactions. Both pictures were made in SAMSON(13).

The article did not characterize the binding site for the X-34 analogues. However, a previous binding competition study revealed that X-34 competes with p-FTAA in amyloid fibril binding, and Zhang et al. showed that TSB and NSB compete with X-34(21, 28). Therefore, it can be assumed that TSB and NSB bind to the same site as the one identified for p-FTAA.

A potential problem with focusing on this binding site for the docking of TSB and NSB is that König et al. used an amyloid fibril that would not be generated under the conditions used for fibril formation in Zhang et al. Hence, there is an inherent risk that the fibrils differ in morphology. For identification of the p-FTAA binding site, König et al. used a fibril with PDB code 5OQV(14, 29). For assessment of binding affinity, Zhang et al. used a fibril formation protocol which would generate fibrils more closely resembling the fold of PDB code 2NAO(21, 25).

As can be seen in Figure 4, the interface of 5OQV reported by König et al. is not identical to the interface at the same residue positions in 2NAO. In 2NAO, the Phe20 residues do not point towards the interface and are instead positioned adjacent to the outward-pointing Phe19 residues. However, in 2NAO there is an Ala21 residue pointing in towards the interface, and it can be theorized that this residue might substitute for the effect of Phe20. Nonetheless, the difference in interface structure between the proteins adds an element of uncertainty to the project, since NSB and TSB might bind differently to 2NAO than to 5OQV and thus produce different binding affinity ratios.


Figure 4. Interface structures for the 2NAO and 5OQV fibrils between residues 16-20/21. The lysine residues are depicted in pink, the valine residues in dark blue, the phenylalanine residues in red and the alanine residues in black. The images were made in SAMSON. (A) Interface between residues 16-21 of the 2NAO Aβ-amyloid fibril. The fibril formation protocol used by Zhang et al. generated a fibril resembling this fold. (B) Interface between residues 16-21 of the 5OQV Aβ-amyloid fibril, which was identified by König et al. as the binding site for the p-FTAA ligand.

HADDOCK modelled amyloid-ligand complexes

Below are representations of the HADDOCK modelled amyloid-ligand complexes HET-s – Congo Red(16) and HET-s – p-FTAA(15). The latter complex contains E265K point mutations in the HET-s amyloid fibril. Both ligands present in the complexes are used for detecting and analysing amyloid fibrils(29, 32). The complexes were modelled with the data-driven docking software HADDOCK and will be used as references to verify whether the ab initio docking approach can reproduce binding modes of amyloid-ligand complexes(17).

The Congo Red molecule seen below in Figure 5:A, exhibits a high degree of planarity, with the molecule being positioned parallel relative to the lysine residues. The molecule contains two sulfonate groups both pointing upwards and electrostatically interacting with the positively charged Lys229 residues on the second and fourth monomers(16).

The p-FTAA molecule seen below in Figure 5:B, shows a twisted conformation, with all four of its negatively charged carboxylate groups interacting with the lysine wall consisting of positively charged Lys229 and Lys265 residues on four sequential monomers(15).

Figure 5. Representations of the two HADDOCK modelled amyloid complexes. The lysine residues are depicted as pink residues. Both images were made in SAMSON. (A) Representation of the HADDOCK modelled HET-s – Congo Red complex. The molecule exhibits a high degree of planarity with both sulfonate groups pointing upwards interacting with the Lys229 residues. (B) Representation of the HADDOCK modelled HET-s – p-FTAA complex. The thiophene units of the molecule are twisted relative to each other and the four carboxylate groups are all pointing towards the lysine wall.

It must be emphasized that these complexes are computationally modelled and do not necessarily resemble the “true” pose. However, as previously mentioned, data-driven docking uses experimental knowledge about the binding interface to drive the docking process, which in general produces more reliable results than ab initio molecular docking. The complexes above, obtained from data-driven docking, can therefore function as references for the ab initio docking. Nevertheless, the HADDOCK models are less accurate, in terms of the orientation and position of the ligand, than a complex structure determined by an experimental method.

The molecular structures of the Congo Red and p-FTAA ligands can be viewed more clearly in Figure 6 below(29, 32). Comparing the structures reveals that the core of the Congo Red molecule is more rigid than the core of the p-FTAA molecule, with two non-rotatable azo groups connecting the molecule. In the p-FTAA ligand, the thiophene units comprising the core of the molecule are linked by rotatable single bonds, making the core of the ligand more flexible.

Figure 6. Molecular representations of the Congo Red and p-FTAA ligands present in the HADDOCK modelled amyloid-ligand complexes seen in Figure 5. The molecular structures were drawn in ChemDraw Ultra 12.0.

Limitations of this work

As previously mentioned, the work conducted in this report relied on limited prior experimental knowledge. This made it necessary to make assumptions, which might have negatively affected the results.

Another important limitation was the lack of computational power. The molecular dynamics simulation used by König et al. was computationally demanding, especially considering the system it was applied to(14, 18). Since such computational power was not available for this report, the method used by König et al. could not be applied, although this might have been desirable considering the promising results of that study.

An additional limitation was the discrepancy between the subject of the report and the earlier education of the author. The original project was aimed at wet lab work but was redirected to a theoretical project in computational chemistry due to the COVID-19 pandemic. The author was neither experienced in nor prepared for this kind of subject, which made the project more arduous.


Materials and Methods

Method theory

Molecular docking is composed of two basic components, a search algorithm and a scoring function, which are described below(33). This section also covers the software used in this report and describes the principle of redocking.

Search algorithm

Generating every possible conformation would require an excessive amount of time and computational power(33). A search algorithm addresses this problem by sampling only high-quality conformations and excluding the others(34). Search algorithms are defined by different sets of rules and parameters. One essential parameter is the level of flexibility incorporated into the docking system, i.e. how the bonds known to be rotatable in nature are treated. Even though they are rotatable in nature, they can be treated as non-rotatable in the docking system. The incorporation of flexibility can be categorized into three classes: rigid docking, semi-flexible docking and flexible docking(35).

Rigid docking uses the simplest system and does not include any rotatable bonds, which means that the conformations of the protein and the ligand are not changed during the docking run. This system therefore only considers the three rotational and three translational degrees of freedom characteristic of any object in space. Semi-flexible docking systems consider only the ligand as flexible and treat the protein as rigid, meaning that only the ligand may contain rotatable bonds. Flexible docking systems consider both the ligand and the protein as flexible. The rigid docking system can be useful in the initial stages of drug discovery(33). However, flexible systems are more realistic representations of nature and have become even more important with the knowledge that molecular entities change conformation upon complex formation.
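The difference between the three classes can be illustrated by counting the degrees of freedom the search algorithm has to explore. The following is a minimal sketch; the function name is hypothetical, the ligand torsion count of 8 is the number of rotatable bonds stated for TSB and NSB, and the receptor torsion count of 10 is an arbitrary illustrative value.

```python
def docking_degrees_of_freedom(ligand_torsions=0, receptor_torsions=0):
    """Number of search variables: rigid-body motion plus active torsions."""
    rigid_body = 6  # three translational + three rotational degrees of freedom
    return rigid_body + ligand_torsions + receptor_torsions

rigid = docking_degrees_of_freedom()
semi_flexible = docking_degrees_of_freedom(ligand_torsions=8)
# 10 receptor side-chain torsions is an assumed value for illustration
flexible = docking_degrees_of_freedom(ligand_torsions=8, receptor_torsions=10)

print(rigid, semi_flexible, flexible)
```

Each added torsion enlarges the search space, which is one reason why docking accuracy tends to decrease for ligands with many rotatable bonds.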

Scoring function

The scoring function (SF) evaluates the generated poses by the docking software and scores them, with the score representing the relative binding affinity. The top ranked pose is assumed to be the pose most accurately resembling the “true” binding mode of the protein-ligand complex(34). SFs are usually based on simplifications and assumptions to minimize the required time of a docking run.


Software used in this report

This report uses two docking programs: Autodock4 (AD4) and Autodock Vina (Vina). Both programs are operated through a graphical user interface called AutoDockTools (ADT). The central display is shown below in Figure 7.

Figure 7. The central display of the graphical user interface ADT. The picture shows a protein with an inserted grid box.

Autodock4

AD4 is a docking program using a grid-based method, meaning that a grid is placed around the target protein and a probe atom is placed on each grid point(36). The interaction energies between the probe atom and the protein are subsequently calculated. This information is stored in the grid and can be used during the docking process. AD4 requires a search space specification, i.e. a region defined by coordinates within which the search for the most stable bound state of a ligand is conducted(37).
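The idea of precalculating probe energies on a grid can be sketched as follows. This is a toy illustration and not the AD4 implementation: it uses a simple 12-6 potential with made-up parameters instead of the AD4 force field, a nearest-grid-point lookup instead of interpolation, and a hypothetical single-atom "protein". All function names are assumptions.

```python
import math

def lennard_jones(r, epsilon=0.15, r_min=4.0):
    """Toy 12-6 potential in kcal/mol (assumed parameters, not AD4's)."""
    r = max(r, 0.5)  # clamp to avoid division by zero on top of an atom
    q = (r_min / r) ** 6
    return epsilon * (q * q - 2.0 * q)

def build_grid_map(atoms, center, npts, spacing):
    """Precompute the probe energy on a cubic grid of npts**3 points."""
    half = (npts - 1) / 2.0
    grid = {}
    for i in range(npts):
        for j in range(npts):
            for k in range(npts):
                point = tuple(c + (idx - half) * spacing
                              for c, idx in zip(center, (i, j, k)))
                grid[(i, j, k)] = sum(lennard_jones(math.dist(point, atom))
                                      for atom in atoms)
    return grid

def lookup(grid, position, center, npts, spacing):
    """Return the stored energy of the grid point nearest to position."""
    half = (npts - 1) / 2.0
    index = tuple(min(npts - 1, max(0, round((p - c) / spacing + half)))
                  for p, c in zip(position, center))
    return grid[index]

protein_atoms = [(0.0, 0.0, 0.0)]  # hypothetical one-atom receptor
box_center = (0.0, 0.0, 0.0)
grid_map = build_grid_map(protein_atoms, box_center, npts=9, spacing=1.0)
print(lookup(grid_map, (3.9, 0.1, -0.1), box_center, npts=9, spacing=1.0))
```

Once the map is built, scoring a ligand atom during docking only requires a table lookup rather than a sum over all protein atoms, which is what makes the grid-based approach fast.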

AD4 allows full flexibility of specific parts of the protein as well as full flexibility of the ligand(36). This is achieved by selecting specific side chains to be flexible and treating these explicitly during the docking process.

The conformations generated by AD4 are analysed, and conformations which differ by less than 0.5 Ångström (Å) in root mean square deviation (RMSD) are grouped into one cluster, after which the clusters are ranked and presented(38). It is favourable that the cluster with the lowest estimated binding energy also has the highest number of conformations, also called cluster size, since large cluster sizes are indicative of a more probable conformation(39). The results obtained from AD4 will be referred to as clusters only when the cluster sizes are presented. In other sections the results from AD4 will be referred to as poses.
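The clustering step can be sketched as below, assuming a greedy scheme in which poses are visited in order of increasing energy and compared with the lowest-energy member (the seed) of each existing cluster. Whether this matches the AD4 implementation in every detail is not claimed here, and the poses are hypothetical two-atom "ligands" with invented energies.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two poses with identical atom ordering (no superposition)."""
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

def cluster_poses(poses, tolerance=0.5):
    """Group (energy, coords) poses; a pose joins the first cluster whose
    seed lies within the RMSD tolerance, otherwise it seeds a new cluster."""
    clusters = []
    for energy, coords in sorted(poses, key=lambda pose: pose[0]):
        for cluster in clusters:
            if rmsd(coords, cluster["seed"]) < tolerance:
                cluster["energies"].append(energy)
                break
        else:
            clusters.append({"seed": coords, "energies": [energy]})
    return clusters

# Hypothetical docking output: energies in kcal/mol, coordinates in Å
poses = [
    (-7.1, [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]),
    (-6.9, [(0.1, 0.0, 0.0), (1.6, 0.0, 0.0)]),
    (-6.5, [(5.0, 0.0, 0.0), (6.5, 0.0, 0.0)]),
    (-7.0, [(0.0, 0.2, 0.0), (1.5, 0.2, 0.0)]),
]
clusters = cluster_poses(poses)
for rank, cluster in enumerate(clusters, start=1):
    print(rank, cluster["energies"][0], len(cluster["energies"]))
```

In this toy example the lowest-energy cluster is also the largest one, which is the favourable situation described above.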

Autodock Vina

Vina is a successor to AD4 with a different search algorithm and scoring function(37). As with AD4, Vina uses a search space in which conformations are generated, and it is also able to incorporate side-chain flexibility for selected amino acids(37, 38). In contrast to AD4, Vina clusters and ranks the results itself and presents them as binding modes, which eliminates the need for the user to consider cluster size(37). Vina has been reported to have significantly higher accuracy and speed than AD4.

Redocking

Redocking is a method to verify the performance of a specific docking program(42). In redocking, a protein-ligand complex is split, and the ligand is subsequently docked back to the protein. If the initial conformation of the protein-ligand complex is reproduced, the docking program is considered viable for that system. The difference between the native ligand and the docked ligand is measured as RMSD, and a redocking run producing an RMSD ≤ 2.0 Å is viewed as successful.
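The success criterion can be expressed as a heavy-atom RMSD check. The sketch below assumes that the reference and docked ligands list their atoms in the same order and share the same coordinate frame, so no superposition is performed; the function names and the three-atom example are hypothetical.

```python
import math

def heavy_atom_rmsd(reference, docked):
    """RMSD over non-hydrogen atoms; inputs are lists of (element, (x, y, z))."""
    pairs = [(ref_xyz, dock_xyz)
             for (element, ref_xyz), (_, dock_xyz) in zip(reference, docked)
             if element != "H"]
    squared = sum(math.dist(a, b) ** 2 for a, b in pairs)
    return math.sqrt(squared / len(pairs))

def redocking_successful(reference, docked, threshold=2.0):
    """Apply the RMSD <= 2.0 Å criterion."""
    return heavy_atom_rmsd(reference, docked) <= threshold

def shifted(atoms, dx):
    """Helper for the example: translate all atoms along x."""
    return [(element, (x + dx, y, z)) for element, (x, y, z) in atoms]

native = [("C", (0.0, 0.0, 0.0)), ("N", (1.4, 0.0, 0.0)), ("H", (2.0, 0.9, 0.0))]
close_pose = shifted(native, 1.0)  # displaced by 1 Å
far_pose = shifted(native, 3.0)    # displaced by 3 Å

print(redocking_successful(native, close_pose), redocking_successful(native, far_pose))
```

Because the receptor is kept fixed during docking, comparing coordinates directly without fitting is a natural choice: a pose that has the right shape but sits in the wrong place should count as a failure.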

Experimental

The docking method differed between the redocking and the docking on 2NAO. Below in Figure 8 is a general workflow diagram describing both methods.

Figure 8. Workflow diagram for redocking of HET-s-Congo Red, HET-s-p-FTAA and docking to 2NAO with TSB and NSB.

Redocking of Congo Red – HET-s and p-FTAA - HET-s with Autodock Vina

The following description of the docking procedure is presented in singular form to avoid confusion. However, the description applies to both complexes mentioned in the heading.

The complex was downloaded from the Protein Data Bank (PDB codes 2MUS(15) and 2LBU(16)). The PDB file was loaded into the PyMOL Molecular Graphics System(43) and a grid box was inserted into the protein using the PyMOL Vina legacy plug-in. The relative position of the protein was adjusted, using the PyMOL “3-Button Editing” function, to position the binding site parallel to the visualized grid box. The protein-ligand complex was subsequently split into two separate PDB files, one containing the ligand and one containing the protein, using the PyMOL “export molecule” function or manually using a text editor. The ligand was loaded into ADT and the torsional root was assigned. Finally, the ligand was saved in PDBQT format.

The HET-s protein was loaded into ADT and modified by deleting any water molecules and adding polar hydrogens. The protein was subsequently saved in PDBQT format. Thereafter, the grid box was set up and the grid spacing was changed from 0.375 Å to 1.000 Å. The grid box was centered on the ligand and reshaped to include the residues observed to interact with the ligand in the complex.

The docking runs were performed with default settings, except that the exhaustiveness was increased to 100 to increase the probability of finding the most accurate conformation. After the docking, the input ligand and the generated poses in PDB format were loaded into Discovery Studio Visualizer(44). The RMSD of the heavy atoms was calculated between the starting conformation of the ligand and the conformations generated by the docking procedure.
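A Vina run of this kind is typically described by a small configuration file. The sketch below writes such a file with the exhaustiveness set to 100; the file names, box center and box size are placeholders, since the actual values used in this work are not stated here.

```python
def vina_config(receptor, ligand, center, size, exhaustiveness=100,
                out="docked_poses.pdbqt"):
    """Return the text of a Vina configuration file (box values in Å)."""
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"center_x = {center[0]:.3f}",
        f"center_y = {center[1]:.3f}",
        f"center_z = {center[2]:.3f}",
        f"size_x = {size[0]:.1f}",
        f"size_y = {size[1]:.1f}",
        f"size_z = {size[2]:.1f}",
        f"exhaustiveness = {exhaustiveness}",
        f"out = {out}",
    ]
    return "\n".join(lines) + "\n"

# Placeholder file names and box geometry, chosen for illustration only
config = vina_config("hets_protein.pdbqt", "ligand.pdbqt",
                     center=(0.0, 0.0, 0.0), size=(30.0, 20.0, 20.0))
print(config)
```

The text can be saved to a file and passed to Vina with its --config option; all settings not listed keep their default values.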

Redocking of p-FTAA-HET-s complex with Autodock4

The p-FTAA ligand and the HET-s protein were prepared in the same way as described in the section above, except that the HET-s protein was not modified by deleting water or adding polar hydrogens. The grid box was set with a spacing of 0.375 Å, centered on the ligand and reshaped to include the residues observed to interact with the ligand.

Prior to the docking run, the interaction energies for each grid point were calculated with AutoGrid4 using the different atom types in the ligand as probe atoms. Upon completion of the precalculation of interaction energies, docking of the p-FTAA ligand was set up. The default parameters were used except for the maximum number of evaluations, which was changed from “medium” to “short”, and the number of GA runs, which was increased from 10 to 50. The number of GA runs was increased to 50 to raise the probability of sampling more accurate poses. The number of evaluations was changed to “short” to decrease the docking run time. The docking results were thereafter loaded into Discovery Studio Visualizer(44). The RMSD of the heavy atoms was measured between the starting conformation of the ligand and the conformations generated by the docking procedure.
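The combined effect of these two changes on the sampling effort can be estimated with simple arithmetic. The sketch below assumes that the ADT presets “short”, “medium” and “long” correspond to 250,000, 2,500,000 and 25,000,000 energy evaluations per run; these numbers are an assumption and should be checked against the ADT version in use.

```python
# Assumed ADT presets for the maximum number of energy evaluations per GA run
EVAL_PRESETS = {"short": 250_000, "medium": 2_500_000, "long": 25_000_000}

def total_evaluations(ga_runs, preset):
    """Upper bound on energy evaluations for a whole docking job."""
    return ga_runs * EVAL_PRESETS[preset]

default_budget = total_evaluations(10, "medium")
used_budget = total_evaluations(50, "short")

print(default_budget, used_budget, used_budget / default_budget)
```

Under this assumption, the modified protocol uses about half the total evaluations of the default settings but distributes them over five times as many independent runs, trading search depth per run for a broader set of starting points.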

Docking to duplicated 2NAO with TSB and NSB

The docking of TSB and NSB to the duplicated 2NAO fibril was performed as both semi-flexible and flexible docking. Both Vina and AD4 were used for the flexible docking.

Duplication of amyloid fibril 2NAO using CreateFibril v 2.5

The amyloid fibril with the PDB code 2NAO was downloaded. Since the PDB file consisted of an NMR-ensemble, it was uploaded to the Olderado server to find the NMR model most representative of the NMR ensemble(45). NMR model six was selected and the rest models were deleted from the PDB file.

In order to obtain the appropriate settings for the duplication of 2NAO, the Python script AnglebetweenHelices was downloaded and used in PyMOL to measure the average rotational angle between the second beta-strands (residues 15-17) of chains A, B and C. The distance between the α-carbons of the 15th residue of chains A, B and C was measured using the measurement tool in PyMOL.

The duplication of the amyloid fibril was performed using the CreateFibril v 2.5 software(12). The coordinates of the γ-carbon of the 34th residue in chain B were set as the fibril axis, and chain B was also used as the extension unit and duplicated seven times. The distance was set to 5.1 Ångström (Å) and the rotational angle was set to 358.93°; both values were obtained from the measurements described above. The software was run on the Ubuntu operating system with Python version 2.0 since the software was incompatible with Python versions above 2.5.
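The geometric idea behind the duplication, a repeated rotation about the fibril axis combined with a translation along it, can be sketched as below. This is not the CreateFibril implementation. It assumes, for simplicity, a fibril axis parallel to the z axis through a given point, and it counts the original chain as the first of seven layers; the function name and the coordinates are hypothetical.

```python
import math


def screw_copy(coords, axis_point, rise, twist_deg, k):
    """Rotate coords by k*twist about a z-parallel axis and shift them k*rise along z."""
    angle = math.radians(twist_deg * k)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    ax, ay = axis_point
    copied = []
    for x, y, z in coords:
        dx, dy = x - ax, y - ay
        copied.append((ax + dx * cos_a - dy * sin_a,
                       ay + dx * sin_a + dy * cos_a,
                       z + k * rise))
    return copied


# Invented coordinates standing in for the atoms of chain B.
chain_b = [(10.0, 0.0, 0.0), (12.0, 1.0, 0.5)]

# Distance 5.1 Å and rotational angle 358.93° as stated in the text.
fibril = [screw_copy(chain_b, (0.0, 0.0), 5.1, 358.93, k) for k in range(7)]
print(len(fibril), round(fibril[6][0][2], 1))  # 7 30.6
```

Each layer keeps its distance to the axis, so only the rise and the twist determine the resulting fibril geometry.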

The output file from the CreateFibril v 2.5 software was missing the 10th, 11th and 12th columns necessary for visualization in ADT. The missing columns were manually corrected in the text editor for each atom. This was done by adding the original 10th and 11th columns from the input PDB file to the output PDB file. The 12th column was created by adding the first letter of the third column of the input PDB file to the empty 12th column in the PDB output file. Thereafter, the residues constituting the flexible domain (1-14) of each chain were deleted. Finally, the duplicated protein was energy minimized with 200 cycles of steepest descent using the Swiss-PdbViewer v4.1.0 software(46).
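The manual column correction could in principle be scripted. The sketch below assumes that the 10th, 11th and 12th whitespace-separated columns are the occupancy, the temperature factor and the element symbol of the fixed-width PDB format (columns 55-60, 61-66 and 77-78). The helper names and the example atom are hypothetical, and taking the first letter of the atom name as the element only works for one-letter elements such as C, N, O, S and H.

```python
def pdb_atom_line(serial, name, res_name, chain, res_seq, x, y, z, tail=""):
    """Format the first 54 columns of an ATOM record, plus an optional tail."""
    return (f"ATOM  {serial:>5} {name:<4} {res_name:>3} {chain}{res_seq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{tail}")


def restore_trailing_columns(output_line, input_line):
    """Copy occupancy and B-factor from input_line and derive the element symbol."""
    core = output_line.rstrip("\n")[:54].ljust(54)   # record up to the z coordinate
    occupancy_bfactor = input_line[54:66].ljust(12)  # fixed columns 55-66
    atom_name = input_line[12:16].strip()            # fixed columns 13-16
    element = atom_name[0]                           # first letter, as in the text
    return f"{core}{occupancy_bfactor}{'':10}{element:>2}"


# Invented atom; the input line carries the trailing columns, the output line does not.
full_tail = "  1.00  0.00" + " " * 10 + " N"
input_line = pdb_atom_line(1, " N", "LYS", "B", 16, 11.104, 6.134, -6.504, full_tail)
output_line = pdb_atom_line(1, " N", "LYS", "B", 16, 11.104, 6.134, -6.504)

print(restore_trailing_columns(output_line, input_line))
```

In a real run the coordinates of the output line differ from those of the input line, since only the trailing columns are transferred.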

Semi-flexible docking of NSB and TSB to 2NAO with Autodock Vina

The ligands NSB and TSB presented in the article by Zhang et al. were drawn in ChemDraw Ultra 12.0 and subsequently subjected to MM2 energy minimization using the same software.

The following description of the docking procedure will be presented in singular form to avoid confusion. However, the description applies to both the NSB to 2NAO - and the TSB to 2NAO docking.

The ligand was uploaded to ADT and the torsional root was assigned. The ligand was thereafter saved as PDBQT. The protein was uploaded to ADT and modified by deleting water molecules and adding polar hydrogens. The grid box was inserted with a spacing of 1.000 Å and was reshaped to include the interface created by the residues 16K, 18V and 21A positioned on six sequential monomers, as well as the side chains of these amino acids. The docking was run with default settings but with exhaustiveness increased to 100. The exhaustiveness was changed to increase the probability of finding the most accurate sampled pose.

Flexible docking of NSB and TSB to 2NAO with Autodock Vina

The preparatory steps for the ligand and the protein were the same as described in the section “Semi-flexible docking of NSB and TSB to 2NAO with Autodock Vina”. For the flexible docking, the Cβ-Cγ bonds of the Lysine16 residues on the first five sequential monomers were set as rotatable, and the protein was saved as one rigid PDBQT file and one flexible PDBQT file. The grid box included the interface created by the residues 16K, 18V and 21A positioned on six sequential monomers as well as the side chains of these amino acids. The grid box also included space behind the five lysine residues set as flexible, to allow for rotation of these residues. The docking procedure was run with default parameters but with exhaustiveness increased to 200. The exhaustiveness was changed to increase the probability of finding the most accurate sampled pose.
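The difference between the semi-flexible and the flexible Vina runs can be summarised as a command-line sketch. The options used (`--receptor`, `--flex`, `--ligand`, `--center_*`, `--size_*`, `--exhaustiveness`, `--out`) are standard Vina options, but all file names and the grid box centre and size are hypothetical placeholders, since the actual values are not stated here.

```python
def vina_command(receptor, ligand, center, size, exhaustiveness, out, flex=None):
    """Build an AutoDock Vina command line as a list of arguments."""
    cmd = ["vina", "--receptor", receptor, "--ligand", ligand]
    if flex is not None:
        cmd += ["--flex", flex]  # flexible side chains in a separate PDBQT file
    for axis, c, s in zip("xyz", center, size):
        cmd += [f"--center_{axis}", str(c), f"--size_{axis}", str(s)]
    cmd += ["--exhaustiveness", str(exhaustiveness), "--out", out]
    return cmd


# Placeholder box definition; the real centre and size depend on the fibril model.
box_center, box_size = (0.0, 0.0, 0.0), (30, 30, 30)

semi = vina_command("2nao_fibril.pdbqt", "tsb.pdbqt",
                    box_center, box_size, 100, "tsb_semi.pdbqt")
flexible = vina_command("2nao_rigid.pdbqt", "tsb.pdbqt",
                        box_center, box_size, 200, "tsb_flex.pdbqt",
                        flex="2nao_flex.pdbqt")
print(" ".join(flexible))
```

The resulting list could be passed to `subprocess.run` if a `vina` executable is available on the system.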

Flexible docking of NSB and TSB to 2NAO with Autodock4

The preparatory steps for the ligand and the protein were the same as described for semi-flexible docking. However, the protein was not modified by deleting the waters and adding polar hydrogens. The Cβ-Cγ bonds of Lysine16 on the first five sequential monomers were set as flexible and the protein was subsequently split into one rigid PDBQT file and one flexible PDBQT file. The rigid PDBQT was set to overwrite the file name for the initial protein PDBQT file. Thereafter, the interaction energies were calculated using the atom types in the ligand and the flexible residues as probe atoms. The docking run was set up with default parameters except for the number of evaluations, which was set to “short”, and the number of GA runs, which was set to 50. The number of GA runs was changed to 50 to increase the probability of sampling more accurate poses. The number of evaluations was changed to “short” to decrease the docking run time.

Results

Redocking of HET-s – p-FTAA and HET-s – Congo Red complexes

The redocking was done on two different complexes, HET-s – p-FTAA and HET-s – Congo Red. For the HET-s – p-FTAA redocking, AD4 and Vina were used. For the redocking of the HET-s – Congo Red complex, only the Vina software was used.

Autodock Vina redocking of HET-s – p-FTAA complex

Figure 9 below shows the highest ranked pose of the redocking procedure of the HET-s – p-FTAA complex with Vina. In contrast to Figure 5, the lysine residues are presented as a blue-white Gaussian surface instead of as pink residues.

The results indicate a low fit of the highest ranked pose relative to the starting pose, since the molecules occupy significantly different positions. One carboxylate group of the docked pose also points away from the lysine wall, which is not characteristic of the starting pose, in which all carboxylate groups point towards the lysine wall.

Figure 9. Two representations of the highest ranked pose of the Vina docking run and the starting pose. The lysine residues are illustrated as white-blue Gaussian surface. The CPK coloured molecule is the native pose and the red coloured molecule is the docking pose. The carboxylate groups for the starting pose are all pointing towards the lysine wall. In contrast, the docking pose exhibits one carboxylate group pointing away from lysine wall. It is also observable that the general fit between the native and docked pose is low. Both pictures were generated in SAMSON.

Figure 10 below presents every pose, nine in total, obtained by the Vina run. There is a high deviation between the generated poses. The carboxylate groups also frequently point away from the lysine wall, as was observed in Figure 9.


Figure 10. Two representations of every pose generated by the Vina software. The nine poses are illustrated with different colours and the lysine residues are depicted as white-blue Gaussian surfaces. The observed variance between the poses is high and multiple poses have carboxylate groups pointing away from the lysine wall, uncharacteristic of the starting pose. Both pictures were made in SAMSON.

The resemblance of the generated poses to the expected pose can be viewed below in Table 2. The binding affinity and the RMSD value are presented for each pose. The pose with the lowest RMSD value is pose 3, with an RMSD of 2.9602 Å. Since the acceptable limit for RMSD is ≤ 2.0 Å, it can be concluded that the docking run was unable to generate a single acceptable binding mode. It should also be noted that the pose assigned the highest binding affinity by the Vina software does not have the lowest RMSD. This is not desirable, since the pose deemed by the docking software to best fit the protein, i.e. the pose with the highest relative binding affinity, should be the pose most resembling the expected pose, if the expected pose is assumed to be the lowest energy pose.
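The two checks described above, whether any pose falls within the RMSD limit and whether the top ranked pose is also the closest one, can be expressed as a short sketch using the values from Table 2. The variable names are arbitrary.

```python
# (pose rank, Vina score in kcal/mol, RMSD in Å), copied from Table 2
poses = [(1, -7.5, 3.3071), (2, -7.3, 4.1046), (3, -7.1, 2.9602),
         (4, -6.9, 3.0034), (5, -6.7, 4.4517), (6, -6.5, 3.0976),
         (7, -6.4, 3.3595), (8, -6.3, 4.1585), (9, -6.3, 3.3595)]
RMSD_LIMIT = 2.0  # acceptance limit in Å

acceptable = [rank for rank, _, rmsd in poses if rmsd <= RMSD_LIMIT]
closest = min(poses, key=lambda p: p[2])     # pose with the lowest RMSD
top_ranked = min(poses, key=lambda p: p[1])  # most negative score

print(acceptable, closest[0], top_ranked[0])  # [] 3 1
```

The empty list reflects that no pose meets the limit, and the differing ranks show that the scoring and the geometric fit do not agree for this run.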

Table 2. Binding affinities and RMSD values of every pose generated by the Vina run. The binding affinities are expressed in kcal/mol. The RMSD values describe the distance between the generated pose and the expected pose. All poses show higher RMSD values than the acceptable limit of 2.0 Å.

Autodock4 redocking of HET-s – p-FTAA complex

Shown below in Figure 11 is the top ranked binding mode for the redocking of the HET-s – p-FTAA complex with AD4. The redocking gave results similar to the Vina run. The pose displays an orientation clearly deviating from the starting pose. As was previously observed in the Vina run, one of the carboxylate groups of the docked molecule is pointing away from the lysine wall, depicted as a white-blue Gaussian surface, and the core of the docked molecule displays a generally low fit relative to the expected pose.

Pose rank    Binding affinity (kcal/mol)    RMSD (Å)
1            -7.5                           3.3071
2            -7.3                           4.1046
3            -7.1                           2.9602
4            -6.9                           3.0034
5            -6.7                           4.4517
6            -6.5                           3.0976
7            -6.4                           3.3595
8            -6.3                           4.1585
9            -6.3                           3.3595


Figure 11. Two representations of the highest ranked pose of the AD4 docking run. The lysine residues are illustrated as white-blue Gaussian surface. The CPK coloured molecule is the starting pose and the red coloured molecule is the docking pose. The images reveal a similar tendency of deviation of a central carboxylate group relative to the lysine wall, not characteristic of the expected pose, and a general low fit. Both pictures were made in SAMSON.

Figure 12 below shows the top nine of the 34 clusters generated by AD4 for the redocking run of HET-s – p-FTAA. The figure shows a similarity between the AD4 generated poses and the Vina generated poses (Figure 10), with multiple poses having carboxylate groups oriented away from the lysine wall. However, there are two poses, coloured yellow and dark green, with all carboxylate groups pointing towards the lysine wall. These poses were the fourth and eighth highest ranked, respectively. In general, a high variation between the generated poses can be seen. For some of the poses (black, orange and light blue) the entire orientation deviates from the expected pose: the molecules are positioned more perpendicular to the lysine wall, whereas the ligand in the starting pose is oriented parallel to the lysine wall.

Figure 12. Two representations of the nine highest ranked poses generated by the AD4 software. The nine poses are illustrated with different colours and the lysine residues are depicted as white-blue Gaussian surfaces. The images show a high variance between the generated poses and carboxylate groups pointing in directions not characteristic of the starting pose. Some molecules are oriented entirely differently from the expected pose. Both pictures were made in SAMSON.

Table 3 below describes the estimated binding affinities, the RMSD values and the cluster sizes for the clusters (poses) generated by AD4 in the redocking run of HET-s – p-FTAA. As can be seen in the table, no generated cluster fulfilled the criterion of an RMSD value ≤ 2.0 Å. It is important to note that the table only presents nine of the generated clusters; the total number of clusters generated was 34. The highest ranked cluster also had the largest cluster size, meaning that the docking software views this interaction as favourable.

Comparing the RMSD values of the Vina run and the AD4 run shows generally lower RMSD values for the poses generated by AD4, not only for the first pose but for several lower ranked poses as well. However, it is important to note that there is possibly a high variance in the references, since the complexes were computer modelled, and conclusions regarding the general fit between the starting poses and docked clusters should therefore be viewed with caution.

Table 3. Estimated binding affinities, RMSD values and cluster sizes of the top nine of the 34 clusters generated by AD4. The top ranked cluster has the largest cluster size, which is desirable. The binding affinities are expressed in kcal/mol. The RMSD values describe the distance between each of the generated poses and the starting pose, i.e. the expected pose. All clusters show higher RMSD values than the acceptable limit of 2.0 Å.

Autodock Vina redocking of HET-s – Congo Red

The redocking of Congo Red on HET-s is shown in Figure 13. The highest ranked pose, shown in red, displays a similar tendency as the poses generated from the HET-s – p-FTAA docking runs, with one of the negatively charged sulfonate groups pointing away from the lysine wall, which is not characteristic of the starting pose. A generally low fit between the starting pose and the top ranked pose can also be observed.

Figure 13. Two representations of the highest ranked pose and the starting pose of the Vina docking run. The lysine residues are visualized as a white-blue Gaussian surface. The CPK coloured molecule represents the starting pose and the red coloured molecule represents the docking pose. The images show a deviation of the docking conformation relative to the starting pose, with one of the negatively charged sulfonate groups pointing away from the lysine wall instead of towards it. Both pictures were made in SAMSON.

Cluster rank    Binding affinity (kcal/mol)    RMSD (Å)    Cluster size
1               -11.75                         2.78        6
2               -11.03                         3.98        3
3               -11.33                         2.35        2
4               -11.33                         2.62        1
5               -10.83                         3.54        2
6               -11.26                         5.6         1
7               -11.10                         4.2         1
8               -10.72                         2.87        1
9               -10.63                         3.32        1

Figure 14 shows every pose generated by the Vina run for the HET-s – Congo Red complex. A strong tendency can be seen where at least one of the negatively charged sulfonate groups is leaning away from the lysine wall, similar to the previous redocking runs with p-FTAA. However, the generated binding modes vary less between each other than in the p-FTAA runs. This might be explained by the higher rigidity of the Congo Red core in comparison to the p-FTAA molecule, which has rotatable bonds between every thiophene unit. The picture does show one pose, coloured pistachio green, with both sulfonate groups to some degree oriented towards the lysine wall. However, it must be emphasized that this pose was ranked the lowest of all poses. Even though it might be a somewhat accurate representation of the expected pose, the Vina software was therefore not able to identify it as such.

Figure 14. Two representations of every pose generated by the Vina software. The nine poses are illustrated with different colours and the lysine residues are depicted as white-blue Gaussian surfaces. A strong tendency of having at least one negatively charged group pointing away from the lysine wall is seen. In contrast to the p-FTAA runs, there can be observed a smaller variation between the generated poses. Both pictures were made in SAMSON.

Table 4 below presents the binding affinities and RMSD values of the Vina HET-s – Congo Red run. The RMSD value describes the distance between each generated pose and the expected pose. Interestingly, the RMSD values for both the highest ranked pose and several lower ranked poses are lower than for the p-FTAA run with Vina. This difference in RMSD might be explained by Vina performing better when docking more rigid molecules such as Congo Red. However, these comparisons are hard to draw since the reference complexes were both modelled and contain internal variance.

Table 4. Binding affinities and RMSD values of every pose generated by the Vina run. The binding affinities are expressed in kcal/mol. The RMSD values describe the distance between the generated poses and the starting pose. All poses show higher RMSD values than the acceptable limit of 2.0 Å. Several poses show lower RMSD values in comparison to the RMSD values of the Vina docking run of p-FTAA, seen in Table 2.

Pose rank    Binding affinity (kcal/mol)    RMSD (Å)
1            -9.1                           3.0262
2            -9.0                           2.9019
3            -9.0                           3.0354
4            -8.9                           3.1539
5            -8.9                           2.1805
6            -8.7                           2.9753
7            -8.6                           2.9463
8            -8.5                           2.8877
9            -8.5                           2.5057

Docking to duplicated 2NAO with TSB and NSB ligands

The docking with TSB and NSB was performed with semi-flexible docking using Vina and flexible docking using Vina and AD4 on the duplicated 2NAO fibril. The duplicated 2NAO fibril consisted of 7 monomers obtained from the sixth NMR model of the 2NAO NMR ensemble. The monomers were positioned with 1.04° rotation between the second beta-strands (residues 15-17) and a 5.1 Å distance between the Cα-carbons of the 15th residues of each monomer.

Semi-flexible docking of NSB and TSB to 2NAO using Autodock Vina

The first step in trying to reproduce the binding affinities reported by Zhang et al. was to use a semi-flexible docking system for both NSB and TSB, i.e. setting only the ligand as flexible, and treating the duplicated 2NAO Aβ-amyloid fibril as rigid.

Semi-flexible docking of TSB

The top ranked pose of the semi-flexible docking with Vina of TSB to the fibril model of 2NAO generated by CreateFibril v 2.5 can be seen below in Figure 15. Interestingly, the molecule is oriented parallel to the lysine wall, coloured in pink, with the hydroxyl groups and one of the carboxylate groups pointing towards the lysine wall. This pose differs from the pose presented by König et al. for the p-FTAA molecule (Figure 3), in which the p-FTAA ligand is positioned with a higher degree of perpendicularity relative to the lysine wall than TSB in Figure 15. The parallel position results in large distances between the negatively charged groups of the TSB molecule and the lysine wall. It is therefore questionable whether this pose is correctly oriented.

Figure 15. Two representations of the top ranked pose for the semi-flexible docking run of TSB on 2NAO with Vina. The lysine residues are depicted in pink, the valine residues are depicted as dark blue and the alanine coloured in black. A parallel position relative to the lysine wall can be observed, with both hydroxyl groups and carboxylate group oriented towards the lysine wall. Both pictures were made in SAMSON.

Figure 16 below presents every binding mode obtained by the semi-flexible docking with Vina of TSB to the fibril model of 2NAO generated by CreateFibril v 2.5. The poses show high resemblance between each other with all poses being positioned parallel to the lysine wall, as was seen with the top ranked pose displayed in Figure 15. However, there are some variations regarding the positioning of the molecules along the fibril axis.

Figure 16. Two images depicting every pose generated by the semi-flexible docking run with Vina. The lysine residues are illustrated as pink residues, the valine residues are illustrated as dark blue residues and the alanine residues are illustrated as black residues. A high similarity between the poses and with the top ranked pose, presented above in Figure 15, can be observed. Both pictures were made in SAMSON.

Semi-flexible docking of NSB

The top ranked pose of the semi-flexible docking with Vina of NSB to the fibril model of 2NAO generated by CreateFibril v 2.5 can be seen below in Figure 17. The pose shows a high resemblance to the TSB pose, seen in Figure 15, with the molecule being positioned parallel to the lysine wall. However, with NSB one of the hydroxyl groups is pointing away from the lysine wall. Hydrogen bonding to the lysine residues should be favourable. If the scoring function prefers poses with hydroxyl groups pointing away from the lysine residues, this might indicate complications with the scoring function of the Vina software.

Figure 17. Two images depicting the top ranked pose for the semi-flexible docking run of NSB on 2NAO with Vina. The lysine residues are depicted in pink, the valine residues are depicted in dark blue and the alanine residues are coloured in black. A parallel orientation relative to the lysine wall can be observed, similar to the pose generated for TSB in Figure 15. Both pictures were made in SAMSON.

Figure 18 below presents every binding mode obtained by the semi-flexible docking with Vina of NSB to the fibril model of 2NAO generated by CreateFibril v 2.5. A similar tendency as with the semi-flexible TSB docking poses, seen in Figure 16, can be observed. All the poses are positioned parallel to the lysine wall, with the hydroxyl and carboxylate groups pointing upwards or downwards. Positional variations along the fibril axis between the molecules can be seen.

Figure 18. Two representations of all nine poses generated by the semi-flexible docking run of NSB on 2NAO with Vina. The lysine residues are depicted in pink, the valine residues are depicted in dark blue and the alanine residues are depicted in black. A strong tendency of the poses being positioned parallel to the lysine wall can be seen. Both pictures were made in SAMSON.

Binding affinities and binding affinity ratios for NSB and TSB

Table 5 below presents the estimated binding affinities of each NSB and TSB pose and the binding affinity ratio between the TSB and NSB poses of the same rank. The binding affinity ratio reported by Zhang et al. is not reproduced in the docking results. The ratio between the highest ranked poses is 0.850, with NSB having the highest binding affinity. This indicates that the semi-flexible docking by Vina is not able to reproduce the experimentally derived relative binding affinities of TSB and NSB to the 2NAO fibril.
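A ratio of docking scores and a ratio of experimental KD values are not measured on the same scale. As a complementary view, the sketch below relates the two through ΔG = RT ln KD. It assumes that the Vina scores may be read as binding free energies at 298.15 K, which they only approximate, and that an EC50 ratio can be treated like a KD ratio, which is a further simplification. The function names are hypothetical; the scores are those of the top ranked poses in Table 5.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def fold_difference_from_scores(score_a, score_b, temperature=298.15):
    """KD_a / KD_b implied by two free energies, using dG = RT ln KD."""
    return math.exp((score_a - score_b) / (R_KCAL * temperature))


def energy_gap_for_fold(fold, temperature=298.15):
    """Free energy gap in kcal/mol corresponding to a given KD ratio."""
    return R_KCAL * temperature * math.log(fold)


tsb_score, nsb_score = -7.4, -8.7  # kcal/mol, pose 1 in Table 5

print(f"score ratio TSB/NSB: {tsb_score / nsb_score:.2f}")  # 0.85
print(f"implied KD ratio TSB/NSB: {fold_difference_from_scores(tsb_score, nsb_score):.1f}")
print(f"gap for a 47-fold ratio: {energy_gap_for_fold(47):.2f} kcal/mol")
print(f"gap for a 20-fold ratio: {energy_gap_for_fold(20):.2f} kcal/mol")
```

Under these assumptions, the 1.3 kcal/mol score difference corresponds to roughly a nine-fold difference in KD, while 20-fold and 47-fold ratios correspond to gaps of about 1.8 and 2.3 kcal/mol, so a score ratio close to 1 does not by itself imply similar affinities.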


Table 5. Binding affinities (kcal/mol) for the semi-flexible docking run of TSB and NSB on duplicated 2NAO. The table also presents the experimental EC50-value and KD-value ratios (TSB/NSB) reported by Zhang et al. together with the binding affinity ratios (TSB/NSB) from the docking runs. The values obtained from the docking are inconsistent with the experimental values reported by Zhang et al. for TSB and NSB.

Pose rank/experiment                    Binding affinity TSB (kcal/mol)    Binding affinity NSB (kcal/mol)    Binding affinity ratio (TSB/NSB)
Pose 1                                  -7.4                               -8.7                               0.850
Pose 2                                  -7.3                               -8.7                               0.839
Pose 3                                  -7.3                               -8.7                               0.839
Pose 4                                  -7.3                               -8.7                               0.839
Pose 5                                  -7.3                               -8.6                               0.848
Pose 6                                  -7.3                               -8.6                               0.848
Pose 7                                  -7.3                               -8.6                               0.848
Pose 8                                  -7.3                               -8.5                               0.858
Pose 9                                  -7.2                               -8.5                               0.847
Zhang et al. competition assay (EC50)   N/A                                N/A                                47
Zhang et al. fluorescence assay (KD)    N/A                                N/A                                20

Flexible docking of NSB and TSB to duplicated 2NAO using Autodock Vina

It was theorized that the molecules being arranged parallel to the lysine wall might be caused by steric hindrance from the lysine residues leaning into the interface. Therefore, side-chain flexibility was incorporated into five lysine residues on five sequential monomers, to allow more space for the molecule in the interface.

Flexible docking of TSB

The top ranked pose of the flexible docking with Vina of TSB to the fibril model of 2NAO generated by CreateFibril v 2.5 can be seen below in Figure 19. In comparison to the semi-flexible docking of TSB, seen in Figure 15, a shift in the TSB molecule's position relative to the lysine wall can be seen, with the molecule being oriented with a higher degree of perpendicularity towards the lysine wall. The carboxylate groups are pointing away from the lysine wall and the hydroxyl groups are instead interacting with the lysine residues, which is suspicious, since the carboxylate-lysine interactions are stronger than the hydroxyl-lysine interactions.


Figure 19. Two representations of the top ranked pose for the flexible docking of TSB to 2NAO with Vina. The lysine residues are depicted in pink with the Cβ-Cγ bonds of the first five lysine residues coloured in bright green since those bonds were set as rotatable. The valine residues are depicted in blue and the alanine residues are coloured in black. A perpendicular orientation of the TSB molecule relative to the lysine wall can be seen, with the hydroxyl groups interacting with the lysine wall by hydrogen bonds. It is important to note that every pose has its own conformation of the five lysine residues set as flexible, but in these pictures only the conformation of the flexible residues for pose 1 is shown. Both pictures were made in SAMSON.

Figure 20 below presents every binding mode obtained by the flexible docking with Vina of TSB to the fibril model of 2NAO generated by CreateFibril v 2.5. A similar tendency as for the top ranked pose can be seen for multiple poses, with the molecule being oriented perpendicular to the lysine wall. However, some of the poses still show a partially parallel orientation relative to the lysine wall (coloured pistachio green, light blue and dark green), with some of the negatively charged side groups pointing upwards. These poses were not the highest ranked by Vina, which is favourable considering their weaker interactions with the lysine residues. Nevertheless, the Vina software was not entirely able to filter these poses out, since their binding affinities are close to that of the highest ranked pose, which had a perpendicular orientation relative to the lysine wall (Table 6). Some of the conformations also display carboxylate groups interacting with the lysine wall.

Figure 20. Two representations of every pose generated by the flexible docking of TSB to 2NAO using Vina. The lysine residues are depicted in pink with the Cβ-Cγ bonds of the first five lysine residues coloured in bright green since those bonds were set as rotatable. The valine residues are depicted in blue and the alanine residues are coloured in black. It is important to note that every pose has its own conformation of the five lysine residues set as flexible, but in these pictures only the residues for pose 1 are shown. A strong tendency of the molecules being positioned perpendicular to the lysine wall can be seen. In addition, some of the molecules' carboxylate groups can be seen interacting with the lysine wall, which was not observed for the top ranked pose. Both pictures were made in SAMSON.

Flexible docking of NSB

The top ranked pose of the flexible docking with Vina of NSB to the fibril model of 2NAO generated by CreateFibril v 2.5 can be seen below in Figure 21. A shift in the molecule's position relative to the lysine wall, similar to that of the flexible TSB docking, can be seen. However, in this case one of the carboxylate groups and one of the hydroxyl groups are positioned against the lysine wall, whereas in the TSB top binding mode both hydroxyl groups were positioned against the lysine wall. Still, it seems odd that the scoring function ranks a hydroxyl-lysine interaction as more favourable than a carboxylate-lysine interaction, since it can be theorized that the highest binding affinity pose should have both its carboxylate groups pointing towards the lysine wall. It is also observable that the hydroxyl group is pointing away from the lysine wall, which was also seen with the TSB molecule in Figure 18.

Figure 21. Two representations of the top ranked pose for the flexible docking of NSB to 2NAO with Vina. The lysine residues are depicted in pink with the Cβ-Cγ bonds of the first five lysine residues coloured in bright green since those bonds were set as rotatable. The valine residues are depicted in blue and the alanine residues are coloured in black. It is important to note that every pose has its own arrangement of the five lysine residues set as flexible, but in these pictures only the conformation of the flexible lysine residues for pose 1 is shown. The pose shows a perpendicular orientation to the lysine wall with one of the carboxylate groups and one of the hydroxyl groups being closest to the lysine wall. However, the hydroxyl group is pointing away from the lysine residues.

Figure 22 visualizes all the poses, nine in total, generated by the Vina run for the flexible docking of NSB to the fibril model of 2NAO generated by CreateFibril v 2.5. A similar tendency can be seen, with the molecules being positioned perpendicular to the lysine wall. However, some of the poses are still partially oriented parallel to the lysine wall, resembling the more unreliable poses seen in the semi-flexible docking (Figures 17 and 18). The parallel orientation of the molecules is more unreliable since it deviates from the p-FTAA binding pose presented by König et al. and also exhibits weaker interactions with the lysine wall in comparison to the perpendicular orientation. None of the poses with parallel orientation relative to the lysine wall is the highest ranked pose, which is desirable since the docking software should rank these poses the lowest considering their lower reliability. However, the binding affinities of these poses are still close to that of the top ranked pose, as can be seen in Table 6 below, which means that the scoring function is not entirely capable of filtering these poses out.

As seen with the top ranked pose, many molecules have at least one of their carboxylate groups pointing away from the lysine wall. This tendency indicates a low performance of the docking software since the carboxylate-lysine interaction should be more favourable than the hydroxyl-lysine interaction.


Figure 22. Two representations of every pose generated by the flexible docking of NSB to 2NAO using Vina. The lysine residues are depicted in pink with the Cβ-Cγ bonds of the first five lysine residues coloured in bright green since those bonds were set as rotatable. The valine residues are depicted in blue and the alanine residues are coloured in black. It is important to note that every pose has its own conformation of the five lysine residues set as flexible, but in these pictures only the conformation of the flexible residues for pose 1 is shown. A similar tendency as in the TSB run, seen in Figure 20, can be observed, with multiple poses having a perpendicular orientation towards the lysine wall and several poses presenting a partially parallel orientation towards the lysine wall. Both pictures were made in SAMSON.

Binding affinities and binding affinity ratios for NSB and TSB

Table 6, presented below, shows the binding affinities for all nine of the TSB and NSB poses and the binding affinity ratio between the poses with the same rank. Interestingly, an increase in binding affinity relative to the semi-flexible docking with Vina can be seen. This might be attributed to the shortened distance between the lysine residues and the carboxylate/hydroxyl groups of the NSB and TSB molecules when the poses display a more perpendicular orientation towards the lysine wall. However, it might also be explained by random variance since the increase in binding affinity is not exceptionally large. Even though some of the poses showed increased resemblance to the p-FTAA pose presented by König et al. in Figure 3, the binding affinity ratios obtained by the docking are still far from the 1:47 or 1:20 ratio between TSB and NSB presented by Zhang et al., with the ratios yielded by the docking run still being below 1.0.

Table 6. Binding affinities (kcal/mol) for the flexible docking run of TSB and NSB on duplicated 2NAO with Vina. The table also presents the experimental EC50-value and KD-value ratios (TSB/NSB) reported by Zhang et al. together with the binding affinity ratios (TSB/NSB) from the docking runs. The values obtained from the docking are inconsistent with the experimental values reported by Zhang et al. for TSB and NSB.

Pose rank/experiment                    Binding affinity TSB (kcal/mol)    Binding affinity NSB (kcal/mol)    Binding affinity ratio (TSB/NSB)
Pose 1                                  -8.3                               -9.6                               0.865
Pose 2                                  -8.2                               -9.6                               0.854
Pose 3                                  -8.1                               -9.6                               0.844
Pose 4                                  -8.1                               -9.5                               0.85
Pose 5                                  -8.1                               -9.4                               0.862
Pose 6                                  -8.0                               -9.4                               0.851
Pose 7                                  -8.0                               -9.4                               0.851
Pose 8                                  -8.0                               -9.3                               0.860
Pose 9                                  -7.9                               -9.3                               0.849
Zhang et al. competition assay (EC50)   N/A                                N/A                                47
Zhang et al. fluorescence assay (KD)    N/A                                N/A                                20


Flexible Docking of NSB and TSB to duplicated 2NAO with Autodock4

Since Vina did not reproduce the results obtained by Zhang et al., the AD4 software was tried. Because the incorporation of side-chain flexibility showed more promising results than the semi-flexible docking, the AD4 docking run was performed with flexible lysine residues.

Flexible Docking of TSB

Figure 23 shows the results for the top ranked pose for the flexible docking with AD4 of TSB to the fibril model of 2NAO generated by CreateFibril v 2.5. The system is set up identically to the flexible system with Vina (Figure 19-22), with the Cβ-Cγ bonds of the first five lysine residues being set as rotatable. As with the flexible docking with Vina, the TSB pose obtained by AD4 shows a shift in orientation, in comparison to the poses generated by semi-flexible docking, with the pose exhibiting a perpendicular orientation towards the lysine wall. However, in this binding mode, both carboxylate groups of the TSB molecule are oriented towards the lysine wall, indicating a more reliable performance by the AD4 software than by the Vina software, since the carboxylate-lysine interactions are stronger than the hydroxyl-lysine interactions.

Figure 23. Two representations of the top ranked pose for the flexible docking of TSB to 2NAO with AD4. The lysine residues are depicted in pink with the Cβ-Cγ bonds of the first five lysine residues coloured in bright green since those bonds were set as rotatable. The valine residues are depicted in blue and the alanine residues are coloured in black. An important thing to note is that every pose has its unique conformation of the five lysine residues set as flexible, but in these pictures only the conformation for the lysine residues of pose 1 is shown. The binding mode displays a parallel orientation towards the lysine wall, with both carboxylate groups directed towards the lysine wall. Both pictures were made in SAMSON.

Figure 24 visualizes all nine poses generated by the AD4 run for the flexible docking of TSB to the fibril model of 2NAO generated by CreateFibril v 2.5. A high variance between the generated binding modes can be seen. Similar to the flexible Vina runs, all molecules show an increased perpendicular orientation relative to the semi-flexible docking. In all poses, both carboxylate groups, rather than the hydroxyl groups, point towards the lysine wall, which is indicative of more accurate poses than those produced by Vina, since the electrostatic interactions are stronger than hydrogen bonding.


Figure 24. Two representations of every pose generated by the flexible docking of TSB to 2NAO using AD4. The lysine residues are depicted in pink with the Cβ-Cγ bonds of the first five lysine residues coloured in bright green since those bonds were set as rotatable. The valine residues are depicted in blue and the alanine residues are coloured in black. An important thing to note is that every pose has its unique arrangement of the five lysine residues set as flexible, but in these pictures only the conformation for the lysine residues of pose 1 is shown. The binding modes display parallel orientations towards the lysine wall, with both carboxylate groups directed towards the lysine wall. Both pictures were made in SAMSON.

Flexible docking of NSB

Figure 25 presents the top ranked pose for the flexible docking with AD4 of NSB to the fibril model of 2NAO generated by CreateFibril v 2.5. The molecule is not positioned as flat as the TSB pose in Figure 23. However, it is still positioned with a degree of perpendicularity that allows its carboxylate groups to lie proximal to the lysine wall.

Figure 25. Two representations of the top ranked pose for the flexible docking of NSB to 2NAO with AD4. The lysine residues are depicted in pink with the Cβ-Cγ bonds of the first five lysine residues coloured in bright green since those bonds were set as rotatable. The valine residues are depicted in blue and the alanine residues are coloured in black. An important thing to note is that every pose has its unique arrangement of the five lysine residues set as flexible, but in these pictures only the conformation for the lysine residues of the highest ranked pose is shown. The pose displays a parallel orientation towards the lysine wall, with both carboxylate groups directed towards the lysine wall. Both pictures were made in SAMSON.

Viewing all eight poses in Figure 26 for the flexible docking with AD4 of NSB to the fibril model of 2NAO generated by CreateFibril v 2.5 reveals a high variance in terms of position along the fibril axis between the molecules. However, as with the TSB poses, all carboxylate groups are interacting with the lysine residues, indicating a better performance of the AD4 software in comparison to the Vina software.
