• No results found

BCR-ABL1 A TYROSINE KINASE ASSOCIATED WITH CHRONIC MYELOID LEUKEMIA

N/A
N/A
Protected

Academic year: 2021

Share "BCR-ABL1 A TYROSINE KINASE ASSOCIATED WITH CHRONIC MYELOID LEUKEMIA"

Copied!
42
0
0

Loading.... (view fulltext now)

Full text

(1)

Master Thesis in Pharmacy, 30 HP Master of Science in Pharmacy, 120 HP

Report approved: Spring term 2021

BCR-ABL1

A TYROSINE KINASE

ASSOCIATED WITH

CHRONIC MYELOID

LEUKEMIA

Calculation of solvation energies

and electrostatics by APBS and

prediction of 2D and 3D structure

(2)
(3)

Abstract

Introduction

ABL1 tyrosine kinase is non-receptor tyrosine kinase that exist in all vertebrated

organisms. During reciprocal chromosomal translocations between chromosomes 9 and 22 forming the Philadelphia chromosome, BCR and ABL1 genes with location

t(9;22)(q34;q11) fuse resulting in a BCR-ABL1 oncoprotein. The tyrosine kinase activity is constitutively activated in BCR-ABL1 causing the pathogenesis of Chronic Myeloid

Leukemia (CML) arising in the hemopoietic stem cells. Tyrosine kinase inhibitors are central in the therapy of CML. Point mutations in the BCR-ABL1 kinase domain are involved in resistance development against tyrosine kinase inhibitors.

Objective

The main objective of the present work was to calculate the change in solvation free energiesand electrostatic surface potential of BCR-ABL1 kinase domain in active-/inactive state that harbor mutations of clinical importance.

Methods

Calculation of changes in solvation free energies, was performed by APBS. Visualization of molecular electrostatics was done with APBS using the PyMol APBS plug in. The active-/inactive state of the kinase domain were generated by protein homology modeling.

Result and discussion

Significant changes in solvation free energies between different clinically important mutations were detected. The results from calculations of solvation free energies with structures from homology protein modeling, largely support the data from calculation with the PDB structure of 2GQG. Of the analyzed double mutations, more than half were showing negative solvation free energies, suggesting these are stabilizing mutations.

Conclusions

It can be concluded that there were significant differences in solvation free energies and electrostatic surface potentials between some mutants compared to WT. The net

difference in solvation energy between the inactive and active state of kinase domain harboring point mutations generated by homology protein modeling shows that some mutations have significant impact on the solvation energy of the kinase domain during the transition from inactive to active state.

(4)
(5)

Table of contents

Abstract ... 3

Introduction ... 8

ABL1 Tyrosine Kinase ... 8

BCR-ABL1 and CML ... 9

Treatment of CML ... 9

BCR-ABL1 inhibitors ... 9

Active and inactive states ... 10

Mutations leading to resistance ... 10

Protein electrostatics ... 11

Solvation free energies and Poisson-Boltzmann equation ... 11

Objective ... 12

Methods ... 13

Calculation of solvation free energies with APBS ... 13

Visualization of molecular electrostatics ... 14

Homology protein modeling ... 14

BCR-ABL1 protein ... 15

Modeling of the inactive- and active kinase domain ... 15

Results ... 16

Identification of structural domains of ABL1 and BCR proteins ... 16

The BCR-ABL1 fusion protein modeling ... 16

Homology protein modeling of the kinase domain ... 17

The inactive form of the kinase domain ... 17

The active form of the kinase domain ... 18

Solvation free energies ... 20

Electrostatic surface potential ... 23

APBS calculation for double mutations in 2GQG structure ... 24

Discussion ... 26

Methods and results discussion ... 26

(6)
(7)
(8)

Introduction

ABL1 Tyrosine Kinase

ABL1 tyrosine kinase belong to a highly conserved ABL-family of protein kinases (ABL, from Abelson Murine Lymphosarcoma virus), which are non-receptor tyrosine kinases. All vertebrate organisms contain the ABL gene as well the paralog ARG gene (ABL1 reference gene), named ABL2, related to ABL gene by sequence similarity) [1]. The main function of the ABL1 protein is phosphorylation of tyrosine residues in target proteins. The molecular structure of the ABL1 protein consists of SH3-SH2-TK (Src (from sarcoma) homology 3-Src homology 2-tyrosine kinase (TK)) domains responsible for autoregulated kinase activity [1, 2]. At the carboxy (C) -terminal is found the actin-binding domain (ABD) containing elements for binding to DNA and cytoskeletal components such as F-actin, G-actin and microtubules [1]. In addition, the carboxy-terminal contain three nuclear localization signals (NLS, with independent function), which regulate the entry of ABL1 protein into the nucleus [2]. Additionally, the ABL1 molecule contains a nuclear export signal (NES), that binds to exportin-1 and thus facilitating the active export of ABL1 out of nucleus [2]. The DNA-binding domain consist of three HLB (HMG (high mobility group)-like box) regions: HLB1, HLB2 and HLB3, which function cooperatively and mediate the interaction between the ABL1 protein with A/T-rich DNA segments [2]. The ATP binding site is located at position K271 and the proton acceptor site is at the position D363 (catalytic segment), Appendix 1 (sequence alignment). The normal ABL1 protein tyrosine kinase (inclusive human ABL1) contains all, above described, domains. A myristoyl group is attached to the amino-terminal glycine of the ABL1 molecule and is essential for kinase autoinhibition. The myristoyl group is matching a surface pocket in the kinase domain of the C-lobe [1]. The inserting of the myristoyl group into the kinase pocket trigger a conformational change leading to docking of the SH3-SH2-unit onto the back of the kinase domain that increases the rigidity of the structure. The final docking of the SH2-unit “inside” the kinase domain leads to the ABL1 autoinhibition [3]. On the amino-terminal segment of the c-ABL1 (c- for cell), there is a so called “cap” sequence that stabilize the inactive conformation through surface interactions [3].

Tyrosine kinase activation is suppressed by cooperative action of SH3-SH2-domains, the myristoyl group and the “cap” sequence [4]. Changes leading to disruption of the above described autoinhibitory interactions result in increased tyrosine kinase activity of the ABL1 protein. Phosphorylation in the amino-terminal within the kinase domain via tyrosine phosphorylation leads to ABL1 activation and downstream signaling [1]. This activity is negatively regulated by tyrosine phosphatases in vivo [5].

Previously, it was considered that the combination of both ABL and ARG kinase functions were essential mainly during embryogenesis [2]. Later, it was reported that selective BCR-ABL1 (BCR from Breakpoint Cluster Region protein) inhibitors caused teratogenic effects on fetuses, thus indicating that ABL activity was important during this period as well. Recently, missense mutations resulting in substitutions Tyr245Cys and Ala356Thr in the ABL1 protein (recurring in both isoforms: 1a and 1b) were found to cause

(9)

BCR-ABL1 and CML

Activation of ABL genes as a result of chromosome translocations cause various hematopoietic malignancies [1], [9]. Chromatin structural elements, microhomologies and scattered repeat sequences in human chromosomes 9 and 22 contribute to reciprocal chromosomal translocations resulting in Philadelphia chromosome (Ph) [1]. As a result of chromosome translocation of BCR and ABL1 genes t(9;22)(q34;q11) the BCR-ABL1 fusion gene product (p210) is formed [1], [9]. The tyrosine kinase activity is constitutively

activated in BCR-ABL1 causing the pathogenesis of Chronic Myeloid Leukaemia (CML) [9]. Other examples of hematopoietic malignancies caused by chimeric ABL1 proteins are NUP214-ABL1 (NUP214 from Nuclear Pore complex 214) fusion associated with a rare subtype of B-cell precursor Acute Lymphoblastic Leukemia (ALL), inclusive childhood leukemia [10], EML1-ABL1 (EML1 from Echinoderm Microtubule-associated protein-Like1) described in patients with acute T-cell lymphoblastic leukemia [11] and other leukemias resulting from ETV6-ABL1 (ETV6 from translocation-Ets-leukemia Virus) [12]. A BCR-ABL1, p190 fusion protein is found in 5 % childhood and ca 30 % of adult ALL, while p230 is associated with chronic neutrophilic leukemia [1], [13]. Other mutations causing CML prior the formation of BCR-ABL1 oncoprotein, have been recently described as important cofactor in clonal evolution of CML [14].

BCR-ABL1 Protein Tyrosine Kinase cause chronic myeloid leukemia (CML) arising in the hemopoietic stem cells (HSC) [9]. The BCR-ABL1 gene is present in all cases of CML, therefore detecting the gene using fluorescence in situ hybridization (FISH) and quantitative reverse transcription PCR (RT-qPCR), along with karyotyping for Ph+ are used for confirmation of a CML diagnosis [9] [15]. CML can be divided into three phases based on progression of the disease. The first, is the chronic indolent phase (CP) that lasts, without interventional treatment, between 3 and 5 years. The second phase is the accelerated phase (AP) (blast cell count in the peripheral blood 10-20%) and the third phase is characterized by aggressive blast crisis (BC) (blast cell count >20%) [9] [15]. In Sweden the incidence of CML is ca 100 adults per year, generally 59 years of age, where, 9 of 10 patients are in the first, chronic phase without serious symptoms [15]. Nevertheless, blood analysis can reveal elevated white blood cells in the peripheral blood and some patients can have enlarged spleen. The phenotype of BC can be myeloid or lymphoid often in a ratio of 2:1, or in some cases, of mixed type [9] [15].

Treatment of CML

In Sweden the treatment regiments of CML are formed in accordance with therapy recommendations “Kronisk myeloisk leukemi. Nationellt vårdprogram. 2019-06-25.

Version 2.”[15]. Tyrosine kinase inhibitors (TKI) are central in the therapy of CML,

Appendix 2. CML patients which are in the blast crisis have a poor prognosis and treatment with conventional cytostatic therapy has shown to increase median survival with 4–8 months. Only with allogeneic stem cell transplantation it is possible to achieve long-term survival. Therefore, patients in accelerated phase should be considered for allogeneic stem cell transplantation if they do not respond optimally to TKIs. Allogeneic stem cell transplantation is prioritized when detecting the T315I mutation as well [15]. Unfortunately, the TKI therapy is not curative and some patients treated with imatinib, a first-generation inhibitor, can develop resistance. The resistance is mainly due to point mutations and in a lesser degree to BCR-ABL-independent mechanisms such as increased active drug efflux and activation of alternative onco-pathways [16]. Therefore, a second, and a third generation of TKIs was developed and approved in order to overcome resistance issues, without negative change in binding capacity or diminished specificity [16]. Today, new approaches and treatment combinations for CML are tested in clinical trials [17].

BCR-ABL1 inhibitors

(10)

Sweden there are several clinically approved TKIs that target the catalytic ATP binding site of ABL1 used to inhibit the constitutively active BCR-ABL1 [18]. These compounds are Imatinib (Glivec®), Dasatinib (Sprycel®), Nilotinib (Tasigna®), Ponatinib

(Iclusig®) and Bosutinib (Bosulif®) [15]. Asciminib is an allosteric inhibitor of the tyrosine kinase activity of BCR-ABL1 [13, 18]. Asciminib in combination with nilotinib were shown to eradicate resistant (resulting from single-agent treatment) CML xenograft tumors without recurrence after the cessation of treatment [18].

Under the past two decades new small molecule protein kinase inhibitors have been developed and more than 48 has been approved by FDA [19]. The classification of these compounds based on the structure of the enzyme-bound inhibitor complex was divided into three types I, II (A and B) and III [20]. A new, more detailed, classification was proposed recently [20]. For simplicity, only TKI that obtained registration in Sweden are shown below:

• Type I - dasatinib and bosutinib;

• Type II (subtype A) - imatinib, nilotinib and ponatinib; • Type II (subtype B) – asciminib.

Active and inactive states

Philadelphia (Ph, after the city where it was first discovered) chromosome was first observed in the bone marrow cells of CML patients in 1960 by Nowell and Hungerford, while Rowley discovered that translocation between chromosomes 9 and 22

(t(9;22)(q34;q11)) lead to formation of Ph chromosome in 1973 [4]. Later, it was

established that the long arm of the chromosome 22 contains a breakpoint cluster region,

BCR (DNA breaks occur in this region) [4], [21]. After Ph formation, the part of ABL1

gene containing the myristoyl N-cap (responsible for stabilization of the inactive form) is missing in the BCR-ABL1 fusion protein, thus rendering constitutive activation of the kinase domain [22].

The BCR-ABL1 protein consist of two highly conserved domains, where the ATP-binding site is positioned between the lobes. The kinase domain contains the phosphate binding loop (P-loop) and the activated loop (A-loop) that can be phosphorylated and leading to change in protein conformation responsible for the transformation from active- to inactive form and vice versa. A highly conserved aspartate-phenylalanine-glycine (DFG) motif positioned at the N-terminal end of the A-loop is an important regulatory

mechanism for the majority of kinases in general, and plays an important role in

determining the active-/inactive form of BCR-ABL1 and, thus, binding of TKIs [22]. The inactive conformation based on DFG position is “DFG- out/αC-helix-in, A-loop closed” and the active is, consequently, in a reverse way. The transition between active- to inactive state goes through a range of intermediate states rather rapidly (ca tenth of a millisecond) [23].

Mutations leading to resistance

Mutations leading to resistance in CML patients are known to occur in all parts of the BCR-ABL1 kinase domain such as in the P-loop, C-helix, ATP-binding pocket, catalytic segment, A-loop and C-term [16], [24], [25], [26]. Single, double or poly-BCR-ABL1 mutants for imatinib, dasatinib, ponatinib were mapped and described previously [24], [26]. Ponatinib have been used against the “gate keeper” residue mutation T315A after failure with first and second generation TKIs in CML patients [23]. Today there is enough data indicating that, developing at least one mutation against particular TKI, increases the risk for resistance [16]. One or more mutations in the BCR-ABL1 kinase domain leads to non-optimal outcome in the CML patients [27]. The IC50 kinase activity data (cellular

(11)

Based on these data, it can be concluded that, alongside to the gatekeeper mutant T315I which is of concern as a strong driver to imatinib resistance development, mutations such

as 317V

Intensive research is going on in order to understand the role of mutations located in the kinase domain of the BCR-ABL1 protein in the acquisition of resistance to TKI. Several of these mutations were shown to cause conformational changes, modification of the

binding energies (to ligand), loss of hydrogen bonds in important regions, such as binding pockets, catalytic regions, activation loop etc., resulting in lower efficacy of TKI

treatments and development of resistance [28], [29], [30], [31]. In order to obtain kinase structures or conformations of interest, as well as generating different mutations, the homology (template based) modeling is usually employed [32].

Protein electrostatics

Little information is available today on how point mutations affect BCR-ABL1 kinase domain, in particular its protein stability (often denoted as thermodynamic stability, ΔΔG). The effects of point mutations on thermodynamic stability of proteins/enzymes was shown to be: destabilizing at ΔΔG>0 kcal/mol (ca 70% of all mutations) or ΔΔG>1 kcal/mol (43 % of all mutations); significantly destabilizing ΔΔG>3 kcal/mol (ca 15 % of all mutations) and stabilizing exhibit ΔΔG<-1 kcal/mol [33], [34], [35], [36], [37], [38]. The effects of point mutations (leading to resistance development to TKI) on

thermodynamic stability of the BCR-ABL1 kinase domain is of interest and can be investigated with the help of measuring the changes in electrostatic solvation (ES) free energy and by solving the finite-difference with Poisson-Boltzmann (FDPB) methods [39], [40]. There is little known about how the point mutations influence the electrostatic potential on the kinase surface. The changes in the electrostatic solvation free energies are due to the changes in the electrostatic (ES) potential, so the examination of the ES

potential is important. Visual representation of electrostatic surface potential can be done by continuum electrostatics [41].

Electrostatics, in general, are very important for biomolecules, via conferring unique structures, that are significant for folding that affect various molecular properties.

Electrostatics interactions between biomolecules exist in all living organisms and occur in all biophysical processes, such as protein-protein associations, catalytic activity, ligand binding, intra- and intercellular transport processes etc. [40]. Solvation properties and electrostatic interactions are therefore of importance because of their long range nature of interaction within-, or between different biomolecules or other biopolymer components [42].

Solvation free energies and Poisson-Boltzmann equation

With the help of calculations derived from finite difference solutions of

(12)

(1)

where:

ϕ -electrostatic potential;

ε - dielectric constant describing the solvent;

M- mobile ion species, described by their charges q

i

, concentrations c

i

, as well as V

i

steric ion-solute interaction potential incorporating the molecular structure into the PB

equation;

i -ion type;

c

i

-the bulk number density of ion type i;

q

i

-point charges;

e

- charge of ion type i;

ρ–

atomic charges; is a sum of Dirac delta distribution located at atomic centers;

β=(kT)-1 is the inverse thermal energy, where k is the Boltzmann constant and T is the

absolute temperature [45].

PB electrostatic calculations are today utilized in the design of peptidic- and

peptidomimetic inhibitors. This is especially useful for proteins that are excessively charged in the catalytic regions and where the ligand binding is largely based on

electrostatic complementarity [46]. It was reported that calculations performed with PB with the DelPhi program generated results in high agreement with experimental data [40]. Electrostatic contribution to solvation free energy (ΔGsolv) showed excellent

agreement with results from microscopic solvent simulations (levels of hydrogen bonding interactions) [40]. A consolidated computational framework has recently been developed to investigate electrostatic interactions and compare electrostatic potentials across families of proteins (resulting in a library) in order to study the electrostatic structure of proteins [47]. The evaluation of electrostatics requires an atomic-resolution three-dimensional structure of a protein. The structures can be obtained from the protein data bank (PDB) (generated by X-ray crystallography or nuclear magnetic resonance (NMR)) and for proteins lacking an experimentally determined structure or with manipulated structure (missing loops, or other segments, mutants etc.), the homology modeling by sequence alignments using templates (with known two- and three dimensional structure ) is usually employed [46], [48].

Objective

The purpose of this study is to calculate the solvation free energy and electrostatic field potential surrounding the kinase domain of the BCR-ABL1 protein and

1) compare the electrostatic solvation free energy between wild type and different point mutations of clinical importance in the kinase domain,

2) which is in its active- and inactive states.

(13)

Methods

In order to make feasible to understand all steps in the present study, the projects workflow is shown figure 1:

Figure 1. Workflow of the present project.

The molecular structures determined by X-ray diffraction techniques 2GQG- active BCR-ABL1 kinase domain bounded to dasatinib [49], and 2HYY - inactive kinase domain bound to imatinib (STI571, Glivec) [50] were selected from RCSB Protein Data Bank (PDB). Dasatinib and imatinib, as well as ions, water and other small molecules were removed from the structure and one chain only was used in the further analysis and calculations. The latest step was done in PyMol software – “a user sponsored molecular

vizualization system on an open source foundation, maintained and distributed by Schrödinger,” https://pymol.org/2/. The point mutations of interest were done by mutagenesis function in PyMol in a wild type structure (WT). Following mutations were prepared: M244V; L248V; G250E; Q252H; Y253F; Y253H; E255K; E255V; D276G; E279K; V299L; V304D; F311L; T315I; T315A; T315M; F317L; F317V; M351T; F359V; V379I; L384M; L387M; H396R; H396P; F486S. The double mutants were:

T315I/Y253H; T315I/V299L; T315I/V304D; Y253H/V299L; V304D/Y253H; V304D/V299L; T315I/F317L; V299L/F317L; V304D/F317L; T315I/G250E; Y253H/E255V; T315I/E255V; T315I/F359V. The energy minimization in PyMol is rudimentary, therefore an relative energy minimization for each mutants is achieved during PDB2PQR preparation, where a “debump” option provides removing of steric clashes between residues [44]. Further energy optimization (minimization) is performed by APBS based on the integrated APBS/CHARMM module [51]. After mutagenesis step, the structures were saved in individual PDB file format that later were used for

calculation of surface electrostatic potential as well in calculations of solvation free energies.

Calculation of solvation free energies with APBS

(14)

assign charge and radius parameters from different force fields [53]. In this study, the parameters selected for generating of a PQR file are shown in the Appendix 4.

The resulting PQR files were used for APBS calculations of solvation free energies. APBS solves the equations of continuum electrostatics for large biomolecules inclusive proteins [44, 54]. After the APBS program was downloaded an input file was generated (originally based on the Born ion model) called “solvprot.in” [54], [55]. The grid spacing was 0,40 Å as it showed consistent results when different grid sizes were tested (0,39 Å, 0,40 Å and 0,41 Å). All the calculations of solvation free energies were performed in APBS by solving non-linearized Poisson-Boltzmann equation with 1:1 0,15 M NaCl concentration and ion charge radius of 2,0. The solute (molecule) dielectric was 1 and the solvent dielectric constant was set to 78,54 (for water), the temperature was set to 298,15 K; molecular surface with a solvent probe radius of 1,4 Å defined for dielectric boundary, see Appendix 5 for details.

The APBS software solves the Poisson-Boltzmann equation for the calculation of solvation free energies and electrostatics, se equation 1 [44]. The resulting data, obtained in

kJ/mol, was further used for calculation of solvation energies after multiplying with a factor of 0,239, in order to convert kJ/mol into kcal/mol. Changes in solvation free energies were calculated according equations:

ΔΔGsolv *=ΔGsolv mutation - ΔGsolv WT (2)

ΔΔG corrected= ΔΔGsolvwt→mutation+** (ΔGsolv amino acid A - ΔGsolvamino acid B (3)

*Energy of solvation calculated in APBS.

** Solvation free energies of amino acid side chains with PARSE parameters (in

kcal/mol); where A is the original amino acid and B is the replacing amino acid in a point mutation (data retrieved from literature, [39]), calculated using the finite-difference Poisson-Boltzmann (FDPB) method [39]. The correction of solvation free energies is done in order to take in an account the side chain solvation energies in accordance with the literature data [39]. The calculations of the raw data and building of diagrams were performed in Microsoft Excel.

Visualization of molecular electrostatics

The latest version of PyMol software, PyMol-2.4.0 (https://pymol.org/2/) was used to calculate APBS, generate an electrostatic map and visualize the electrostatic surface potential of biomolecules, in this case the kinase domain (KD) of the BCR-ABL1 protein. In the study, the active- and inactive forms ABL1 kinase domain (2GQG and 2HYY - used to generate the active- and respectively inactive form of ABL1 kinase domain), WT as well as different mutations such as E255K, E255V, G250E and V304D (causing the largest changes in solvation free energy) were used for visualizing of molecular electrostatics using PyMol APBS plug in.

Homology protein modeling

The identification of ABL1 protein kinase domains (P00519 (UniProtKB/Swiss-Prot)) was done by search in PROSITE and PFAM databases.

Prediction of protein secondary

structure and transmembrane regions was performed with PSIPRED Workbench.

The FASTA sequence (P00519) of ABL1 from UniProtKB was used in PSIPRED Workbench for predicting secondary structure of the protein. The following analysis functions were used

[56] [57]

:

• PSIPRED 4.0 (Predict Secondary Structure);

• MEMSAT-SVM (Membrane Helix Prediction);

• DISOPRED3 (Disopred Prediction);

• DomPred (Protein Domain Prediction);

(15)

BCR-ABL1 protein

The prediction model of the entire molecule of BCR-ABL1 was prepared by using SWISS-MODEL, ExPASy server, from 50 templates 3 were chosen with the highest QMEAN and template sequence identity (more than 98,40%). There were three separate models (molecular segments) obtained for the BCR-ABL1 protein from SWISS-MODEL because this protein is large. There was not possible to generate the entire molecule model for BCR-ABL1 protein by using SWISS-MODEL (only separate domain models were

obtained). Therefore, a new attempt was done, by submitting the sequence of BCR-ABL1 to CPHmodels-3.2 Server for homology modeling of the consolidated 3D structure of the fusion protein (Center for Biological Sequence Analysis, DTU Health Tech (Technical University of Denmark)) [48].

Modeling of the inactive- and active kinase domain

The homology protein modeling of the BCR-ABL1 kinase domain (KD) was performed by using SWISS-MODEL an automated protein structure homology-modeling server (also accessible via ExPASy). Following steps of homology protein modeling procedure were performed:

• In the first step the template recognition and initial alignment correction was done using UniProt. The target sequence of the kinase domain of BCR-ABL1 kinase (aa 242-493) was obtained from the UniProt Knowledgebase UniProtKb databank (P00519). After a search with BLAST against the database resulted in a list of known protein structures that were presented in Pfam with UniProt

residues, PDB ID, PDB chain ID and PDB residues (all have the corresponding 3 D structures). The UniProt residues 242-493 with PDB ID 2HYY were used for templates modeling of an inactive conformation of the KD, whereas 2GQG was used as a template for the active conformation structure.

• In the second step the backbone generation, loop modeling and side chain modeling were performed all in one using the server SWISS-MODEL.

(16)

Results

Identification of structural domains of ABL1 and BCR proteins

Search results from PROSITE database on sequence P00519 (UniProtKB/Swiss-Prot) identified following domains of the ABL1 protein kinase: SH3 (61-121), SH2 (127-217) and protein kinase domain (242-493). The search in the PFAM database resulted in more detailed structure and identified the domains: SH3_1 (67-113), SH2 (127-202),

Pkinase_TYR (242-493) and the final domain F-actin binding domain (1025-1130). The FASTA sequence (P00519) of ABL1 from UniProt was used in PSIPRED Workbench for predicting secondary structure of the protein. MEMSAT-SVM identified N-terminal as extracellular with a pore lining region at 422-437. Putative domain boundaries were predicted at locations 241 and 496. The knowing of the exact boundaries of the kinase domains were important for further work when generating the active- and inactive form of the BCR-ABL1 kinase domain. Prediction of native disorder of the ABL1 protein

identified high disordered regions (glycosylation regions) from residue 500. Identification of the BCR protein domains was performed by using PROSITE and PFAM databases. There were found four PROSITE hits in the BCR (Breakpoint cluster region protein or Renal carcinoma antigen) sequence (P112771). These are the domains DH (498-691), PH (708-866), C2 (893-1020) and RHOGAP Rho GTPase-activating proteins domain profile (1068-1220).

The BCR-ABL1 fusion protein modeling

In an attempt to generate the consolidated molecule of BCR-ABL1 fusion protein the segment of BCR aa 1-927 and ABL1 aa 46-1149 were submitted to CPHmodels for generating a secondary structure model and for generating a single PDB file for all domains of the molecule. The input data was in FASTA format and following amino acid residues (aa) were submitted: BCR aa 1-927 and ABL aa 46-1149 (for p210). No

consolidated 3D structure of BCR-ABL1 could be obtained, because of the large input segment, nevertheless the major domains could be modeled individually. The

oligomerization domain (CC-from coil-coil), DH and PH domains could be modeled using of CPHmodels-3.2 Server. The schematic structure for BCR-ABL1, p210 is shown in figure 2.

Figure 2. The modeling results for BCR-ABL1 (p210 isomer) using of CPHmodels-3.2 Server (F-actin binding domain is not showed). CC- coil-coil domain; DH- Dbl homology domain; PH- Pleckstrin Homology domain; SH3- and SH2- (Src (from sarcoma)

homology 3-Src homology 2-tyrosine kinase (TK)) domains.

CC DH PH

BCR aa 1-927 ABL aa 46-1149

SH2

(17)

Homology protein modeling of the kinase domain

The inactive form of the kinase domain

The structure prediction of the inactive form of the BCR-ABL1 kinase domain was based on the 2HYY, as a template, obtained from PDB database. The structure of inactive, unphosphorylated form of BCR-ABL1 kinase domain with imatinib in the binding pocket (figure 3 A, yellow arrow) is presented in the figure 3 A. The model shows high quality as deduced from QMEAN of 0,32; GMQE of 0,99, normalized QMEAN4 score |Z-score|<1 and 100% sequence identity and coverage, figure 3 B. The SWISS-MODEL generates Ramachandran plots automatically and the results from this quality estimation is presented, figure 3 C.

Figure 3. A)

The structure of inactive, unphosphorylated form of BCR-ABL1 kinase domain with imatinib in the binding pocket, 2HYY from PDB database; B) the model shows high quality as deduced from QMEAN of 0,32; GMQE of 0,99, normalized

QMEAN4 score |Z-score|<1 and 100% sequence identity and coverage; C) Ramachandran plot generated by SWISS-MODEL.

90°

A

B

(18)

The active form of the kinase domain

The structure of the active form of the BCR-ABL1 kinase domain was based on the 2GQG as a template selected from PDB database. In the figure 4 A, the structure of active form is shown. The model shows high quality as deduced from QMEAN of 0,07; GMQE of 0,99, normalized QMEAN4 score |Z-score|<1 and 100% sequence identity and coverage, figure 4 B. The Ramachandran plot is shown with results from the quality estimation, figure 4 C.

Figure 4. A) The structure of the active form of the BCR-ABL1 kinase domain was based on the 2GQG from PDB database.; B) the model shows high quality as deduced from QMEAN of 0,07; GMQE of 0,99, normalized QMEAN4 score |Z-score|<1 and 100% sequence identity and coverage; C) Ramachandran plot generated by SWISS-MODEL is shown with results from the quality estimation.

The results from homology protein modeling are presented in the figure 5. Both the active- (A) and inactive (B) form models of the BCR-ABL1 kinase domain, have the same sequence length of aa 242-493 and exact amino acid composition, as shown below:

ITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKEIKHPNLVQLLGVCTREPPFYI ITEFMTYGNLLDYLRECNRQEVNAVVLLYMATQISSAMEYLEKKNFIHRDLAARNCLVGENHLVKVADFGLS RLMTGDTYTAHAGAKFPIKWTAPESLAYNKFSIKSDVWAFGVLLWEIATYGMSPYPGIDLSQVYELLEKDYR MERPEGCPEKVYELMRACWQWNPSDRPSFAEIHQAF

The alignment of both structures shows opposite conformational position of the A-loop, as shown in the figure 5 C.

A

B

(19)

Figure 5. A) showing the inactive form of the protein kinase domain homology model based on 2HYY (“DFG- out/αC-helix-in, A-loop closed”, [23]); B) the active form of the protein kinase domain homology model based on 2GQG C) the alignment of both

structures in PyMol, in red - active and in green the inactive state). The residues used for kinase domain modeling were 242-493 for both models.

The resulting structures of the active- and inactive forms of the BCR-ABL1 kinase domain were further used in generating a number of mutants by mutagenesis function in PyMol software. The solvation free energies were calculated using Poisson-Boltzmann equation by using the Adaptive Poisson-Boltzmann Solver (APBS) software. Analysis of molecular electrostatics, such as electrostatic surface potentials was performed using the PyMol APBS plug in, as described in methods.

A

B

(20)

Solvation free energies

The results from calculations of solvation free energies using Poisson-Boltzmann equation methodology in APBS & PDB2PQR are presented in table 3 and figure 6. Initially, APBS calculations were performed with the kinase domain structure 2GQG (active state) from PDB. Of the total of 26 mutations, 19 showed a negative value of

ΔΔGcorr. These are M244V, L248V, Q252H, Y253F, Y253H, E255K, E255V, D276G, F311L,

T315I, T315A, T315M, F317L, F317V, F359V, L387M, H396R, H396P, F486S.

The solvation free energies data, that was not corrected for taking in considerations the solvation energies of amino acid chains (kcal/mol), in this case, the amino acid from the kinase domain, that have been mutated (substituted) with another one (see equation 2), showed, prevalently negative- or near zero values for ΔΔG, table 3. For mutations D276G, E279K, L384M positive ΔΔG values were obtained. The change in solvation free energies expressed as ΔΔGcorr (ΔΔGcorrected, kcal/mol) are used in figure 6. After correction of the

data (as described previously, see equation 3) the mutations followed the similar pattern, except for G250E and V304D showing a switch from significant negative to positive ΔΔGcorr values, table 3. The results for G250E correlated very well with the change in

electrostatic surface potential, compared to WT, see figure 7.

Interesting results were obtained for mutations E255K (ΔΔGcorr=-98.11 kcal/mol), E255V

(ΔΔGcorr=-92.11 kcal/mol) and D276G (ΔΔGcorr=-34.02 kcal/mol) that showed large

decrease in ΔΔGcorr. The negative changes in solvation free energies (-ΔΔGcorr) correlated

with substituting of an acidic and negatively charged (at pH7) amino acid such as glutamate (E) and aspartate (D) to a basic positively charged lysine (K) and aliphatic, hydrophobic (non-polar) valine (V) and glycine (G), table 3, figure 6, [33]. Though, it is not known entirely that these mutations increase thermodynamic stability, other factors could play role in the large negative changes in solvation free energies (-ΔΔGcorr). The rest

of mutations caused lesser changes in solvation energy, as shown in figure 6.

The results from calculations of solvation free energies with structures obtained by homology protein modeling, namely the kinase domain in active state and in inactive state, support the data from calculation with 2GQG- PDB structure, table 3 and figure 6. The net difference, calculated as change in solvation free energy (ΔΔGcorr, kcal/mol)

(21)

Table 1. The change in solvation free energies, for wild type and different mutants of 2GQG, as well as the active-and inactive form of BCR-ABL1 kinase domain obtained by protein homology protein modeling.

Mutation ΔΔG,

kcal/mol ΔΔG

corrected

kcal/mol Physical properties of the amino acid residues involved in the mutagenesis*

2GQG Active Inactive 2GQG Active Inactive WT

M244V 0.25 -2.00 -3.95 -3.52 -5.77 -7.72 neutral-polar/aliphatic hydrophobic

L248V -2.24 -1.07 0.13 -2.24 -1.07 0.13 hydrophobic/aliphatic, hydrophobic

G250E -45.01 -22.96 -34.85 36.83 58.88 46.99 hydrophobic/acidic, charged Q252H -3.45 -0.68 -7.14 -2.63 0.14 -6.32 neutral-polar/basic

Y253F 1.00 -1.57 -1.82 -4.67 -7.24 -7.49 aromatic/hydrophobic

Y253H -6.13 -19.91 -9.73 -2.20 -15.98 -5.80 aromatic/basic

E255K -89.76 -125.75 -50.19 -98.11 -134.10 -58.54

acidic, charged/basic, charged

E255V -10.27 -22.22 -5.49 -92.11 -104.06 -87.33 acidic, charged/aliphatic, hydrophobic

D276G 49.12 36.78 30.50 -34.02 -46.36 -52.64 acidic, charged/hydrophobic E279K 11.17 -17.63 -17.63 2.82 -5.45 -25.98 acidic, charged/basic, charged

V299L 0.60 2.90 -0.60 0.60 2.83 -0.60 aliphatic,

hydrophobic/hydrophobic

V304D -35.82 -11.37 -2.28 47.32 71.77 80.86 aliphatic,

hydrophobic/acidic,charged F311L 1.05 1.33 0.73 -2.19 -1.91 -2.51 hydrophobic/hydrophobic T315I 0.71 3.36 3.77 -6.45 -3.80 -3.39 neutral-polar/hydrophobic T315A 0.57 1.93 1.31 -6.59 -5.23 -5.85 neutral-polar/hydrophobic T315M -2.70 -1.71 2.30 -6.09 -5.10 -1.09 neutral-polar/neutral-polar F317L 0.24 3.34 1.70 -3.00 0.10 -1.54 hydrophobic/hydrophobic F317V 0.89 2.41 2.54 -2.35 -0.83 -0.70 hydrophobic/aliphatic, hydrophobic M351T -2.66 -1.40 -5.85 0.73 1.99 -2.46 neutral-polar/neutral-polar F359V -2.65 0.48 3.44 -5.89 -3.72 0.20 hydrophobic/aliphatic, hydrophobic V379I 2.88 -1.84 -0.18 2.88 -1.84 -0.18 aliphatic, hydrophobic/hydrophobic L384M 5.74 -0.01 -3.50 9.51 3.76 0.27 hydrophobic/neutral-polar L387M -14.94 -4.59 -4.70 -11.17 -0.82 -0.93 hydrophobic/neutral-polar H396R -63.31 -34.33 -23.47 -6.70 22.28 33.14 basic/basic, charged

H396P 0.60 27.80 7.33 -12.24 14.96 -5.51 basic/conformational, cyclic

F486S -6.91 -6.84 -5.33 -2.70 -2.63 -1.12 hydrophobic/neutral-polar

(22)

Figure 6. Solvation free energy calculations derived from finite difference solutions of Poisson-Boltzmann equation by APBS for wild type and different mutants. Serie 1 – the active state (generated by homology protein modeling); Serie 2 – the inactive state (generated by homology protein modeling); Serie 3 –2GQG (PDB).

The net difference of solvation free energy between the inactive-active states of the kinase domain in presented in the figure 7.

Figure 7. The net difference, calculated as change in solvation free energy (𝚫𝚫Gcorr,

kcal/mol) during transition from an inactive- to active state of the kinase domain harboring point mutations generated by homology protein modeling.

-150 -100 -50 0 50 100 150 𝚫 𝚫 G corr , k ca l/ m ol

Solvation Free Energy

Serie1 Serie2 Serie3

-1,95 1,2 -11,89-6,46 -0,25 10,18 75,56 16,73 -6,28 -20,53 -3,43 9,09 -0,6 0,41-0,624,01-1,640,13-4,453,92 1,66-3,49-0,11 10,86 -20,47 1,51 -40 -20 0 20 40 60 80 100 𝚫 𝚫 G cor r, k ca l/ m ol

(23)

Electrostatic surface potential

The structure of the active BCR-ABL1 kinase domain PDB entry 2GQG, as well as different mutations generated by mutagenesis tool in PyMol software was used in the analysis of molecular electrostatics using PyMol APBS plug in, figure 8. WT compared to mutations E255K, causing a large negative change as well as G250E causing a large positive change in solvation free energy were selected for analysis and visualization of molecular electrostatics. The electrostatic potential is presented on the molecular surface as Solvent Accessible Surface (range: ± 5,0 kT/e). Mapped electrostatic surface potentials are shown as following: the neutral charges are depicted in white; positive charges in blue and negative charges in red. The arrow points at the surface showing significant change of the electrostatic surface potential. The results show that the active state has more positive charges compared to the inactive kinase domain.

Mutant /state WT E255K G250E 2GQG active Inactive state

Figure 8. The influence of different mutations on electrostatic surface potential of 2GQG, as well as on the active-and inactive form of BCR-ABL1 kinase domain obtained by protein homology protein modeling. Mutations E255K, showing the largest negative changes in solvation free energies (ΔΔGcorr) and G250E showing the largest positive

changes in solvation free energies, (ΔΔGcorr) were compared to WT. The arrow points at

the surface showing significant change of the electrostatic surface potential. Electrostatic potential presented on the molecular surface as Solvent Accessible Surface (range: ± 5.0 kT/e. The mapped electrostatic surface potentials are shown as following: the neutral charges are shown in white; positive charges in blue and negative charges in red. Plot structure and electrostatic potential were generated by PyMol APBS plugin. The ΔΔGcorr

for each of the mutants is given in kcal/mol.

ΔΔGcorr = -58.54 kcal/mol ΔΔGcorr = 46.99 kcal/mol

ΔΔGcorr = 36.83 kcal/mol

(24)

APBS calculation for double mutations in 2GQG structure

The effect of double mutations on solvation free energy was calculated as well, see table 2 and figure 9. The changes in amino acid composition are shown in the table 2.

Table 2. The change in solvation free energies for wild type and double mutants for BCR-ABL1 kinase domain of 2GQG (PDB, active form).

Mutation ΔΔG,

kcal/mol ΔΔG

corrected

kcal/mol Physical properties of the amino acid residues involved in

the mutagenesis*

T315I/Y253H -5.15 -8.66 neutral-polararomatic/basic/hydrophobic and T315I/V299L

1.00 -5.86 neutral-polaraliphatic, hydrophobic/hydrophobic/hydrophobic and

T315I/V304D

-35.75 40.87

neutral-polar/hydrophobic and

aliphatic, hydrophobic/acidic,

charged

Y253H/V299L

-5.34 -1.61

aliphatic, hydrophobic/acidic,

charged and aliphatic,

hydrophobic/hydrophobic

V304D/Y253H

-41.57 45.11 aliphatic, hydrophobiccharged and aromatic//basicacidic,

V304D/V299L

-35.73 47.91

aliphatic, hydrophobic/acidic,

charged and aliphatic,

hydrophobic/hydrophobic

T315I/F317L 0.96 -9.46 neutral-polarhydrophobic/hydrophobic/hydrophobic and V299L/F317L

0.84 -2.41 aliphatic, hydrophobic/hydrophobic and hydrophobic/hydrophobic

V304D/F317L

-35.57 44.32

aliphatic, hydrophobic/acidic,

charged and

hydrophobic/hydrophobic

T315I/G250E -44.32 30.38 neutral-polarhydrophobic//acidic, chargedhydrophobic and Y253H/E255V -16.91 -94.31 aromaticcharged//aliphatic, hydrophobicbasic and acidic, T315I/E255V

-9.34 -98.56

neutral-polar/hydrophobic and

acidic, charged/aliphatic,

hydrophobic

T315I/F359V

-1.94 -12.34 neutral-polarhydrophobic/aliphatic, hydrophobic/hydrophobic and

*Amino acid residues involved substitutions caused by mutations are presented in different color in order to distinguish between their physiochemical properties such as hydrophobicity (in blue), charge, as well as ability to donate and accept hydrogen bonds or polarity (red and purple).

Of the analyzed double mutations, more than half were showing negative ΔΔGcorr,

suggesting these are stabilizing mutations. Mutations T315I/E255V and Y253H/E255V displayed large negative ΔΔGcorr values, -98.56 kcal/mol and -94.31 kcal/mol. Mutations

(25)

Figure 9. The change in solvation free energy calculations derived from finite difference solutions of Poisson-Boltzmann equation by APBS for wild type and different double mutants directly derived from the active structure of BCR-ABL1 kinase domain, PDB entry 2GQG. -120 -100 -80 -60 -40 -20 0 20 40 60

Solvation Free Energy

(26)

Discussion

The main objective of the present work was to calculate the change in solvation free energies (ΔΔGcorr), and electrostatic surface potential of BCR-ABL1 kinase domain (KD).

Different mutations of clinical importance were investigated as well as the active- and inactive form of BCR-ABL1 kinase domain.

In order to obtain the active- and inactive form

of the

kinase domain of the BCR-ABL1 protein with identical sequence, a protein

homology or template-based modeling methodology was employed. Therefore, another objective of the work was to generate a 2D and 3D structure of BCR-ABL1 kinase domain as

active- and inactive form

by homology protein modeling. The project was initiated with the analysis of the ABL1 protein (human) in order to identify all the domains and

structures needed for protein homology modeling.

Methods and results discussion

The identification and analysis of the natural variant of ABL1 protein was done by using the entry name P00519 in The Universal Protein Resource Knowledgebase (UniProtKB) database. The UniProtKB database provides information about sequence and function of a large number of proteins [58]. This is the first step in an analysis of a protein of interest and is used largely in bioinformatics. The isoform IA is the canonical sequence with

length 1130 amino acids (aa) and mass 122,873 Da. The ATP binding site was located at

position K271 and the proton acceptor site was found at the position D363 according to UniProtKB database. The breakpoint for translocation to form BCR-ABL1 oncoprotein was located between amino acids 26-27, Appendix 1. The identification of specific domains of the ABL1 protein was performed using PROSITE - a database of protein families and domains (PROSITE was created at the SIB, Swiss Institute of Bioinformatics in 1989) and PFAM (containing protein families represented by multiple sequence alignments and hidden Markov models) databases

[56] [57]

.

These databases provide

accurate data for protein domain predictions.

A pore-lining region was found between 422-437, corresponding to a region rich in nonpolar sidechains (WAFGVLLWEIATYGM), namely, myristoyl pocket and in accordance to previously published data [59]. The

disordered region that was identified, belongs to regions that are known to contain short linear peptide motifs, for example “hot” loops (structures difficult to predict), SH3 ligands, targeting signals etc., that are important for protein function, as well as glycosylation sites [56]. Glycosylation is a complex post-translational modification of proteins regulating a range of functions such as protein partition and subcellular

localization, ligand recognition, protein folding, dimerization and other interactions, thus suggesting the involvement of the BCR-ABL1 in multiple pathway of cellular events [57]. After the first step in identification of specific domains of the native ABL1 protein, the modeling of the secondary structure was performed by using Domserf 2.1 and

MODELLER with PSIPRED Workbench. The PSIPRED was chosen because this server contains tools for protein structure prediction. PSIPRED Workbench is allowing the submission of a protein sequence from the user and performs protein structure prediction with an high accuracy [60]. Domserf 2.1 and MODELLER generated eight well-organized domain structures of high accuracy from the sequence covering the entire protein, see additional data in Appendix 6, [1], [3]. The results correspond accurately with data from previously published studies showing that c-ABL1 is a multidomain protein, where different domains have a specific location in the cell. The c-ABL1 structure was suggested to be able to self-assemble in a compact or elongated form, where the later one is

considered to be in an active state [3]. The kinase domain (242-293), which was of a major interest in this project, showed a secondary structure in agreement with data from published literature sources [1], [24].

(27)

in PFAM database. The oligomerization domain called as well as coil-coil (CC) was previously found to be responsible for oncogenesis of BCR-ABL1 via dimerization and later tetramerization between several CC monomers, leading to reciprocal constitutive activation of kinase domains [61], [62]. After submission of segments of interest (in FASTA format), several domains of the BCR protein were generated with by Domserf 2.1 and MODELLER, PSIPRED Workbench, see additional information in Appendix 7. The CC domain, aa 2-72 is a linear (loop) structure between 6-150 aa (highly disordered); domain 144-343 is a serine/threonine kinase domain (S/TK); a linear (loop) structure between 328-432 (highly disordered); and DH (Dbl homology) domain at 475-708; PH (Pleckstrin Homology) domain at aa 701-866 [21]. These domains are a part of BCR-ABL1 fusion protein and therefore of interest to determine their structure. The domains 893-1034 and 1045-1239 (Rho GTPase-activating proteins domain profile) are not a part of the BCR-ABL1 oncoprotein [21]. The region 197-385 is known to bind to SH2-domain of the BCR-ABL1 and most likely, contributing to constitutive activation of the BCR-ABL1 kinase, Appendix 7 [21].

In an alternative attempt to generate a prediction model of the entire BCR-ABL1 by SWISS-MODEL, ExPASy server, from 50 templates, where 3 of them were chosen

(possessing the highest QMEAN and template sequence identity (more than 98,40%)) did not prove resultative, basically because of the large input sequence (data not showed). Therefore, the modeling of the secondary structure of the BCR-ABL1 protein was performed by template recognition based on profile-profile alignment guided by secondary structure predictions with the help of protein homology modeling server CPHmodels 3.2. This server is located at the bioinformatic unit, Center for Biological Sequence Analysis, DTU Health Tech (Technical University of Denmark) [48]. Based on previously obtained data, the segment 1-927 from BCR and the segment 46-1149 from ABL1, the input file in FASTA sequence, were submitted to CPHmodels in order to obtain a PDB file (consolidating several different domains) of BCR-ABL1 [21]. It was not

possible to model a consolidated structure of the BCR-ABL1 protein in this case either, because of the large size of the input sequence. After a search in PubMed for a

consolidated secondary and tertiary structure of the BCR-ABL1 fusion protein, only the schematic representations of different domains of BCR-ABL1 as separate units were found [21], [3]. Additionally, the BCR-ABL1 molecule was found to be strongly co-localized with F-actin and mainly occurred in the cytosol, according to these publications [21], [3].

The generation of the secondary and tertiary structures of the kinase domain in its active- and inactive form was performed with the help of protein homology modeling as well. Protein homology (template-based modeling) alongside to X-ray crystallography and nuclear magnetic resonance spectroscopy (from PDB) is largely used today in many areas, such as biotechnology, drug design and development etc. Protein homology modeling was shown to be especially of importance for generating proteins with customized structure and functions, for example with different mutations [46]. Therefore, protein structure homology modeling of the active (2GQG, as template) and inactive (2HYY, as template) forms of the kinase domain of BCR-ABL1 were applied in the study. The modelled

structures of kinase domain in its active- inactive state were obtained with high accuracy, as shown in figure 3, 4 and 5. Both the active- and inactive forms of the kinase domain, have the same sequence length and exact amino acid composition, as deduced in previous steps (aa 242-493). The alignment of both structures shows different states, where the A-loop shows opposite positions as visualized in the figure 5 C. As previously described in the literature, the kinase domain adopts multiple active conformational states in solution (most likely in the cytosol as well), nevertheless the active and inactive form show distinct differences [63], [64]. Therefore, in the present work, the active and inactive states (no intermediate states) of the kinase domain were chosen for calculation of solvation energy and electrostatics.

(28)

biophysics and protein biochemistry. APBS belong to implicit solvation models that can be run through a number of molecular stimulation programs such as AMBER, CHARM, NAMD, etc., as described previously [44], [45]. APBS in configuration with PDB2PQR (converts PDB files to PQR) provides a flexible tool that can be run from the command line on different platforms such as Linux, Mac and Windows [44]. With the help of Poisson-Boltzmann equation (PBE) numerical solutions makes it possible the description of the shape of the solute (in our case protein molecule) in a continuum of a solvent, so called continuum electrostatics. This makes the continuum electrostatics to be used extensively in structural biology, in particular for visualization of the surface electrostatic potential of the proteins and other biomolecules [40]. According to the authors, the information obtained by solving the PBE is more complex and detailed compared to simple three-dimensional protein structure and can give unique information about active cites [40]. Electrostatic surface potential of the kinase domain, 2GQG, was visualized by molecular electrostatics using PyMol APBS plugin, figure 8. Mutations E255K, causing a large negative-, as well as, G250E causing a large positive change in solvation free energy, ΔΔGcorr, were selected for analysis of molecular electrostatics and compared to WT.

Mutation E255K caused a significant increase in the positive charge compared to WT. This increase is due to substitution of the negatively charged glutamic acid (E) with positively charged lysine (K), correlating well with published data [39]. The contribution of individual amino acids to the electrostatic surface potential and changes in the

energetics of proteins surfaces can lead to alteration of protein function and stability [39]. Free energies changes, for example, in case of point mutations, might predict the stability of different conformation of proteins. It is of a goal to be able to predict changes in

protein stability upon amino acid substitutions. According to published literature, destabilizing mutations are direct cause of a range of diseases. Stabilizing mutations on the protein surface increase hydrophobicity of the protein, thus protein stability is

inversely related to its solubility and likely functionality as well [65]. There are numerous computational tools, such as Rosetta-ddG, FoldX, EGAD and PoPMuSiC, etc, to predict the effect of mutations on stability [65]. In a study done by computing with the protein design software FoldX it was found that mutations that have an implication on enzymatic functions (positioned in catalytic regions) and substrate binding pockets are mostly destabilizing. Nevertheless, proteins harboring these mutations can evolve anyway, based on other compensatory mutations (neutral), located in regions of the protein that are not important to its function, thus exerting a stabilizing effect [33].

The calculations of solvation free energies using APBS software package was done in the present work. The change in free solvation energies of, so called, WT BCR-ABL1 kinase domain as well as different mutants of clinical importance were investigated, Appendix 3 [16], [22], [24], [25], [26], [29], [28]. All these mutations render resistance to imatinib and other TKI in CML patients. Thus, the changes in the structure of the kinase domain caused by these mutations might affect the binding affinity of TKI agents to kinase domain. It would be of major interest to investigate the mechanism of resistance, though this task was not addressed in the present project. The obtained results show a distinct pattern of distribution of ΔΔGcorr values, both in active- and inactive conformation of the

kinase domain, table 1 and figure 6. In the table 1 are shown the residue types of amino acids involved substitutions upon mutations, and their physiochemical properties of their side chains, including hydrophobicity and charge, as well as ability to donate and accept hydrogen bonds. According to their location on the kinase domain structure the these mutations could be classified as P-loop mutations (M244V, L248V, G250E, Q252H, Y253F, Y253H, E255K, E255V), C-helix (D276G, E279K), ATP-binding region (V299L, V304D, F311L), gatekeeper residue mutants (T315I, T315A, T315M), hinge region mutants (F317L, F317V), catalytic segment (M351T, F359V, V379I), activation loop mutants (L384M, L387M, H396R, H396P) and C-term mutants (F486S) [66]. The substitution from one amino acid residue to another with opposite hydrophobicity and charge caused large changes in ΔΔGcorr values, resulting in direct influence on electrostatic

(29)

In the, so called “gate keeper” T315I mutant, the neutral and polar residue of threonine is replaced by hydrophobic isoleucine. During this substitution a hydrogen bond is lost between dasatinib and T315-residue of the highly conserved region of the binding pocket of the kinase domain (V299 and F317 are involved as well in this interaction) [64]. This cause to resistance to dasatinib and, consequently, has a serious implication on treatment and outcome for CML patients. Thought, no large change in the ΔΔGcorr values occur upon

this substitution and in the several mutations in proximity of T315I, gate keeper and hinge regions, they, obviously, have a negative effect on binding affinity of small molecular inhibitors into the binding pocket [23].

The effect of mutations on binding affinity to TKI was investigated previously experimentally, though changes in free energy upon mutations was investigated in a lesser degree, according to the literature. In a published work, the relative alchemical free-energy calculation and predicted free-energy changes (ΔΔG) were used to predict the effect of point mutations in kinase domain of BCR-ABL1 protein on binding affinity of small molecule TKI [67]. The study of data from 10,336 cancer patients showed that 68.5% of missense kinase mutation were new, and have never been observed previously, and 87.4% have been observed no more than ten times, thus concluding that the majority of mutations in kinase domain of BCR-ABL1 were unique for each patient. For 144

clinically identified mutations, the resistance to eight TKI was predicted with high

accuracy, showing that only ca 13% are likely to result in failure of therapy. When binding energy, or affinity, decreased by more than 10-fold, ΔΔG >1.36 kcal mol−1, the mutants were classified as resistant, while mutants with diminished affinity less than 10-fold,

ΔΔG ≤ 1.36 kcal mol−1where considered as susceptible [67]. The predicted resistance

correlated well with experimental results, indicating that simulation data are of

importance for predicting resistance in different mutants. In Appendix 3 there are shown the effectiveness of different TKI that were experimentally calculated for clinically

important mutations, shown as the half maximal inhibitory concentration IC50 (nM)

(adapted from [16]). There were no other publications (after searching in PubMed) on how mutations in kinase domain of BCR-ABL1 may affect the change in solvation free energies calculated by APBS software. In a similar study, point mutations in

pyrazinamidase (PZase) and ribosomal protein S1 (RpsA) causing resistance to

Mycobacterium tuberculosis strains, were shown to affect the solvation free energy. The

solvation free energy was lower in mutants compared to WT, except E325K and G341R. The authors concluded that point mutations affected the conformational landscape of these enzymes [68]. Recently a web-based application MutantElec was developed for the study of the effects of mutations on electrostatic potential of proteins of interest and providing a graphical display allowing visualization of changes caused by point mutations, resulting in alteration of protein function [69]. A new promising method was reported in the literature for calculating changes in binding energy, namely Single Amino Acid Mutation based change in Binding free Energy (SAAMBE) containing a fast algorithm based on modification of molecular mechanics Poisson-Boltzmann Surface Area (MM/PBSA), that could be employed for calculation of TKI affinity in mutants of BCR-Abl1 kinase domain [70].

Double mutations in BCR-ABL1 kinase domain are known to increase the risk for resistance to TKI. Accumulation of double and more somatic mutations during the course of CML, often leads to non-optimal or poor outcome CML patients [14, 27].

According to previous publications, changes in ΔΔG and functional effects of multiple mutations are often additive [33]. Ponatinib have good efficacy against all imatinib

resistant single mutants, including the “gate keeper” T315I mutation, nevertheless, double mutants such as G250E/T315I, E255K/T315I, and E255V/T315I have been shown to increase the risk for drug resistance development even for ponatinib [23]. There was a large variation in the ΔΔGcorr values when measured with help of APBS, indicating that

(30)

imatinib (first generation of TKIs), as well as dasatinib and nilotinib (second generation of TKIs) to the ATP-binding site containing double mutations dramatically diminished [26].

The net difference in solvation energy between the inactive and active state of kinase domain harboring point mutations generated by homology protein modeling

shown in figure 8, shows that some mutations have significant impact on the solvation energy of these two forms. This finding suggests that mutations can induce and stabilize kinase domain to be in an active form. For example, the mutation E255K has a large impact on solvation free energy (ΔΔGcorr, kcal/mol) during transition from an inactive- to

active state of the kinase domain. Other mutations, such like E279K and H396P diminish the net solvation energy between the inactive and active state of kinase domain compared to WT. During the transition between inactive- to the active form of the kinase domain, there is an increase in the positive charges (in blue) as shown as mapped electrostatic surface potentials in figure 8. The inactive form of the kinase domain harboring the mutation E255K is becoming more positive charged on its molecular surface during the transition to an active one, as illustrated in figure 7. Thus, mutations in the kinase domain might influence the likelihood of the kinase domain to be in inactive- or active form via changes in solvation free energy.

Conclusions

Calculations of solvation free energies for WT of BCR-ABL1 kinase domain as well as different mutants of clinical importance was performed by using APBS software. Significant changes in solvation free energies between different mutants were detected. The results from calculations of solvation free energies with structures from homology protein modeling, largely support the data from calculation with PDB structure 2GQG. The obtained results show a distinct pattern of distribution of ΔΔGcorr values, both in

active- and inactive conformation of the kinase domain. The net difference in solvation energy between the inactive and active state of kinase domain harboring point mutations generated by homology protein modeling shows that some mutations have significant impact on the solvation energy of the kinase domain during the transition from inactive to active state. These changes are reflected as changes in charges on the molecular surface (positive, negative and neutral charges), thus electrostatic potential of the kinase domain is affected by nature of these mutations. There was found a large variation in ΔΔGcorr

values for double mutations. More research is needed in order to investigate the effect of double mutations on electrostatic potential of the kinase domain of the BCR-ABL1

protein. Protein homology modeling was used in this study and the active- inactive kinase domain were generated. The method proved useful for modeling of customized

structures, in this case modeling the active- and inactive kinase domain that could be used in mutagenesis and molecular electrostatics.

Acknowlegements

(31)

Appendix 1.

ABL1-protein analysis using UniProtKB (entry name P00519). The isoform IA (the canonical sequence, identifier: P00519) (length 1130; mass 122,873 Da).

The ATP binding site is located at position K271 and the proton acceptor site is at the position D363.

CLUSTAL O(1.2.4) multiple sequence alignment SP|P00519|ABL1_HUMAN

SP|P00519-2|ABL1_HUMAN

(32)

Appendix 2.

Swedish therapy recommendations for CML [15].

Tyrosine kinase inhibitors

(TKIs) Target Indication

Imatinib (Glivec® Novartis, or generica) 1 x 400 mg per day (max 800 mg is possible)

Inhibition of BCR-ABL1, c-KIT

and PDGFR. First line treatment under chronic phase. Registered in Sweden since 2001 after failure of interferon-α therapy. Nilotinib (Tasigna®) 150 mg 2 x

2;

The dose of 200 mg 2 x 2 nilotinib is recommended as second line treatment.

Inhibition of BCR-ABL1, c-KIT, ARG, PDGFR-α, PDGFR-b, not SRC. Nilotinib is 30–50 times more potent compared to imatinib. Binds only to the activated conformation of BCR-ABL1. Nilotinib inhibits, in vitro, almost all tested BCR-ABL1-mutation forms, with exception of T315I.

Nilotinib was first registered in Sweden 2007 on indication CML resistant to imatinib. Nilotinib was registered even for first line treatment under chronic phase, 2010.

Bosutinib (Bosulif®, Pfizer) 400 mg x 1, max 500 mg x 1. Bosutinib is recommended under second line treatment in a dose of 400 mg x 1, 500 mg x 1 or 600 mg x 1 in case of therapy failure with other TKIs.

Inhibition of BCR-ABL1, SRC-kinase family members such SRC, LYN and HCK. Bosutinib has a slight inhibitory effect on PDGFR and c-KIT.

Bosutinib was first registered in 2013 on indication CML in cases of resistance issues or

intolerance towards other TKIs. Bosutinib was registered even for first line treatment under chronic phase, 2010.

Dasatinib (Sprycel®, Bristol-Myers Squibb) 100 mg x 1. In case of accelerated phase CML and blast crisis the recommended start dose is 140 mg x 1.

Inhibition of BCR-ABL1, SRC, c-KIT and PDGFR-b. In vitro is dasatinib 325 times more potent compared to imatinib. Unlike imatinib, dasatinib is binding to the active conformation form of BCR-ABL1. Dasatinib inhibits in vitro, effectively the majority of tested BCR-ABL1-mutation forms, except T315I.

Dasatinib was registered in 2006 with indication CML (CP, AP or BC) in relation to resistance and tolerance issues of imatinib, as well as for treatment of Ph+ ALL

(resistance and tolerance issues to earlier treatments).

In case of mutations in V299L, T315A and F317L/V/I/C should nilotinib be used. BCR-ABL1 present mutations Y253H, E255K/V and F359V/C/I dasatinib should be used. Ponatinib (Iclusig®, Incyte

Biosciences) 45-30 mg x 1.

BCR-ABL1-containing mutation

T315I. Ponatinib is registered for treatment of all phases CML in patients with resistance against dasatinib or nilotinib, or in case of intolerance against these TKIs, as well as for patients with T315I-mutation or Ph+ ALL. Pronatinib is not recommended for patients with acute cardio-events such stroke or heart attack, because of negative side effects. Pronatinib was

(33)

Appendix 3.

Mutations in BCR-ABL1 and their location on the kinase domain and TKI, IC50 (nM)

published 2018 (adapted from [16]).

Domain Mutated

kinase Imatinib, IC50 (nM) Ponatinib, IC(nM) 50

(34)

Appendix 4.

PQR output file was obtained in following steps: - uploading the PDB file;

- pKa →pH 7.0;

- used PROPKA (to assign protonation states at provided pH); - used forcefield → PARSE (PARameters for Solvation Energy, [71] ; - output →PARSE;

Select “ Additional Options:

- Ensure that new atoms are not rebuilt too close to existing atoms

- Optimize the hydrogen bonding network - Create an APBS input file

- Remove the waters from the output file” - Start Job” [53].

The running of the APBS calculation and visualization of the resulting electrostatic potential map was done as following:

• PDB file was loaded into

PyMol-2.4.0;

• Go to Plug in of APBS Electrostatics;

• In the window of APBS Electrostatics choose options:

• Prepare Molecule → Options >>Method: “pdb2pqr adds hydrogens and missing sidechain atoms, assigns partial charges and radii. Removes ligands and modified residues”;

• Calculate Map with APBS

→ Options >>Grid Spacing: 0,50;

• Molecular Surface Visualization

→ Options >> Projects the

electrostatic potential onto the molecular surface; Range +/- 5,00; Output Ramp; Solvent Excluded Surface (Connolly surface) and Solvent Accessible Surface. Other

Visualizations → Options >>After the map has been

calculated, create additional visualizations using “Action” items in the object menu panel, in this case Field lines with “A>gradient>default” was chosen.

In the Advanced Configurations are Program Locations for APBS, PDB2PQR and command line option ––ff=AMBER––chain was selected.

The commands used in the “Terminal—Bash” were: apbs solvprot .in>solvprot.out

(35)

Appendix 5.

References

Related documents

We would like to point out that the genetic link between TBK1 and neurodegeneration is largely based on Mendelian dominantly inherited deleterious loss-of-function mutations (such

Increased prevalence of prior malignancies and autoimmune diseases in patients diagnosed with chronic myeloid leukemia.. Second malignancies following treatment of

The primary aim of the present thesis was to investigate the influence of CYP3A metabolic activity and cellular transport mediated by genetic variants of ABCB1 and ABCG2

In this method of cancer treatment, the molecules have the ability to bind to the receptor and block the signal and therefore inhibit the dangerous cell growth. What the

Nilotinib versus imatinib for the treatment of patients with newly diagnosed chronic phase, Philadelphia chromosome-positive, chronic myeloid leukaemia: 24-month minimum follow-up

As GUS was the reference gene in the majority of cases (n = 33), we also compared the diagnostic phase BCR-ABL1 transcript levels between poor responders and responders in GUS

Bhargava R, Gerald WL, Li AR, Pan Q, Lal P, Ladanyi M, Chen B: EGFR gene amplification in breast cancer: correlation with epidermal growth factor receptor mRNA and protein

Linköping University Medical Dissertations