Theoretical study of the molecular aspect of the suspected novichok agent A234 of the Skripal poisoning

Novichoks are the suspected nerve agents in the March 2018 Skripal poisoning. In this context, the novichok agent A234 (chemical structure proposed by Mirzayanov) was studied using computational methods to shed light on its molecular, electronic, spectroscopic, thermodynamic and toxicity parameters as well as on potential thermal and hydrolysis degradation pathways. The poisoning action and antidote of A234 were also investigated. Some of these parameters were compared to three common G- and V-series nerve agents, namely GB, VR and VX. The research findings should be useful towards the detection, development of antidotes and destruction of A234.


Introduction
Nerve agents, organophosphate-containing chemical warfare agents, are among the most toxic chemicals known to mankind [1]. They can inactivate acetylcholinesterase (AChE), a key central nervous system (CNS) enzyme responsible for the breakdown of the neurotransmitter acetylcholine, leading to rapid and severe adverse effects on the environment, humans and animals [2,3]. Several nerve agents of the G-series [tabun (GA), sarin (GB), chlorosarin (GC) and soman (GD)] and the V-series (VE, VR, VS and VX) have been deployed not only in warfare but also in acts of terrorism and high-profile assassinations [4,5]. These nerve agents are still a threat to the international community, even though their use is regulated by the Organisation for the Prohibition of Chemical Weapons under the Chemical Weapons Convention (CWC) [6].
A novel class of nerve agents, the novichoks or the A-series, has recently come into the limelight following the March 2018 assassination attempt on the former Russian spy Sergei Skripal and his daughter Yulia in Salisbury, UK [7-9]. Almost 30 years have passed since these 'fourth generation' nerve agents were developed by the Soviet Union in a Cold War-era weapons programme [10]. However, information on novichoks is still guarded as 'top secret' and exact, reliable data are missing. The Skripal poisoning has catalysed the pursuit of detailed information on the history, chemical structure, synthesis, toxicity, deployment, detection and destruction of novichoks. This has yielded some reports which are insightful although incomplete at the molecular level [4,7-9,11-13]. One of the concerns relates to their chemical structure. For example, there is still a debate as to whether the novichok agent denoted as A234 (scheme 1) corresponds to structure (a) as proposed by Mirzayanov [14] or to structure (b) as proposed by Hoenig [15] and Ellison [16].
To the best of our knowledge, there is only one scientific publication that provides experimental data for a chemical (alkyl-N-[bis(dimethylamino)methylidene]-P-methylphosphonamidates) that is specifically identified as a novichok, and this compound is included in Schedule 2.B.04 of the CWC [17]. We report a theoretical study on novichoks, and in particular on the novichok agent A234 (structure (a) of scheme 1; N'-[ethoxy(fluoro)phosphoryl]-N,N-diethylethanimidamide), because some open sources [18-20] speculated on the use of A234 in the absence of official communications on the exact chemical identified from the UK incident. Henceforth, A234 will correspond to structure (a) of scheme 1 throughout this paper. It is noteworthy that A234 has not yet been scheduled under the CWC. At the revision stage of this manuscript, two reports on novichoks pertaining to QSAR [21] and molecular docking approaches [22] appeared.

Methodology
Conformational analysis for A234, VR and VX was carried out using the MMFF94s force field as implemented in the CONFLEX software [23-25]. The conformational search was limited to 5 kcal mol⁻¹ and this led to 115, 8061 and 2473 conformers of A234, VR and VX, respectively. The optimization of the 115 conformers of A234 using the B3LYP/6-311++G(d,p) [26-28] method converged to 26 conformers; their optimized structures together with their relative energies are provided in electronic supplementary material, figure S1. The first 125 conformers of VR and VX were also optimized using the B3LYP/6-311++G(d,p) method, and the relative energies and Cartesian coordinates of their resulting first 15 minimum energy structures are provided in the electronic supplementary material. Previously reported [29] minimum energy structures of GB were herein revisited using the B3LYP/6-311++G(d,p) method. The B3LYP lowest minimum energy structures of A234, GB, VR and VX were also optimized using the M06-2X/6-311++G(d,p) [ conjunction with different Pople basis sets have been used to study nerve agents and related organophosphorus species. Geometry optimization was followed by analytic Hessian computation using the same methods. The absence of negative Hessian eigenvalues confirmed the stationary points as minima on the potential energy hypersurfaces. Zero-point energy correction was included in the relative energies (ΔEs). Reported energies are given at 298.15 K and 1 atm. Natural bond orbital (NBO) analysis was also carried out using the M06-2X/6-311++G(d,p) method [37]. Computations were performed with the Gaussian 16 package [38] using resources provided by SEAGrid [39-42]. Geometry optimization was also conducted in solvents (water and n-octanol) using the M06-2X/6-311++G(d,p) method in order to calculate the lipophilicity of A234, GB, VR and VX. The solvent effect was taken into account using the polarizable continuum model [43].
The classical descriptor for lipophilicity is the log Po/w (partition coefficient between n-octanol and water) [44]. The log Po/w values were obtained from an equation relating log Po/w to ΔG, R and T, where R is the universal gas constant (8.314 J K⁻¹ mol⁻¹), T is the system temperature (298.15 K) and ΔG (in J mol⁻¹) is the difference between the solvation absolute Gibbs free energies in water, G(water), and in n-octanol, G(n-octanol). The log Po/w values for the four nerve agents were also calculated via SwissADME [44] and the ALOGPS 2.1 program [45]. The optimized structures in water were used for computing nuclear magnetic resonance (NMR) chemical shifts with the gauge-including atomic orbital method [46] using shieldings of trimethylsilane (for ¹H NMR and ¹³C NMR), nitromethane (for ¹⁵N NMR), phosphoric acid (for ³¹P NMR) and trichlorofluoromethane (for ¹⁹F NMR) computed using the M06-2X/6-311++G(d,p) method.
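The conversion from a free energy difference to log Po/w can be sketched numerically. The snippet below is a minimal illustration, assuming the commonly used form log Po/w = −ΔG/(RT ln 10) with ΔG = G(n-octanol) − G(water). The sign convention, the function name `log_p_ow` and the example free energies are assumptions made for illustration and are not taken from this study.

```python
import math

R = 8.314    # universal gas constant, J K^-1 mol^-1 (value quoted in the text)
T = 298.15   # system temperature, K (value quoted in the text)


def log_p_ow(g_water, g_octanol, temperature=T):
    """Return log P(o/w) from Gibbs free energies given in J mol^-1.

    Assumed convention: delta_g = G(n-octanol) - G(water), so a solute that
    is more stable in n-octanol (delta_g < 0) has a positive log P(o/w).
    """
    delta_g = g_octanol - g_water
    return -delta_g / (R * temperature * math.log(10))


# Hypothetical input: n-octanol favoured over water by 10 kJ mol^-1.
example = log_p_ow(g_water=0.0, g_octanol=-10_000.0)
print(f"log P(o/w) = {example:.2f}")
```

Under this convention a positive value indicates a lipophilic solute, which matches the way positive log Po/w values are interpreted in the text.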
The CBS-QB3 composite method [47] was employed to calculate the enthalpy of formation (ΔfH298) of the nerve agents. In the CBS-QB3 model, the geometry and frequency computations were performed using the B3LYP/6-311G(2d,d,p) method (also denoted as cbsb7). The CBS-QB3 model is known to perform well and predict accurate energies for organophosphorus compounds [48,49]. The ΔfH298 of A234 was further employed to determine its bond dissociation energies (BDEs) as per the process RR′ → R• + R′• (where RR′ and R•/R′• represent A234 and the individual radical fragments, respectively) to provide an insight into favourable thermal decomposition pathways [48]. The BDE corresponds to the enthalpy of reaction (ΔrH298) of the thermal dissociation process, where ΔrH298 = ΔfH298(R•) + ΔfH298(R′•) − ΔfH298(RR′). The ΔfH298 of each radical was determined using the CBS-QB3 method.
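The BDE bookkeeping for RR′ → R• + R′• is an application of Hess's law: the reaction enthalpy is the sum of the enthalpies of formation of the radical fragments minus that of the parent molecule. The sketch below illustrates only this arithmetic. The function name `reaction_enthalpy` and all numerical inputs are hypothetical placeholders, not values computed in this work.

```python
def reaction_enthalpy(dfh_parent, dfh_radicals):
    """Enthalpy of reaction for parent -> radical fragments.

    All enthalpies of formation must share one unit (e.g. kcal mol^-1).
    Returns sum(dfh of radicals) - dfh of parent, which equals the BDE
    for a single bond homolysis.
    """
    return sum(dfh_radicals) - dfh_parent


# Hypothetical placeholder enthalpies of formation (kcal mol^-1).
parent = -100.0
fragments = (-30.0, 20.0)

bde = reaction_enthalpy(parent, fragments)
print(f"BDE = {bde:.1f} kcal mol^-1")
```

A larger positive result corresponds to a stronger bond, so ranking the values for all bonds of a molecule points to the most likely unimolecular initiation step.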

Results and discussion
Section 3.1 presents the minimum energy structures of A234, GB, VR and VX. Their spectroscopic parameters, conceptual DFT-based reactivity descriptors, molecular electrostatic potential (MEP) and ADME (absorption, distribution, metabolism, excretion) properties are discussed in §§3.2-3.5, respectively. Their poisoning action and possible antidotes based on model reactions are reported in §3.6. The hydrolysis and thermal degradation of A234 are discussed in §§3.7 and 3.8, respectively.

Structural parameters
Selected bond lengths of the gas-phase optimized structures of A234, GB, VR and VX are provided in figure 1 and additional details are collected in electronic supplementary material,

Spectroscopic analysis
The IR and Raman spectra of GB, VR and VX were revisited and those of A234 are reported based on the M06-2X/6-311++G(d,p) method in the gas phase (figure 2 and table 1). The IR and Raman spectra of the nerve agents consist of two distinct regions, notably a low-wavenumber region at 200-1700 cm⁻¹ and a high-wavenumber region at 2950-3200 cm⁻¹. The high-wavenumber region consists of C-H stretching vibrations which are weakly IR active (A234: 3052-3173 cm⁻¹; GB: 3053-3176 cm⁻¹; VR: 2965-3179 cm⁻¹; VX: 3006-3172 cm⁻¹). This particular region is highly Raman active with broad absorption bands. The low-wavenumber region (fingerprint region comprising several vibrational modes) differs from one nerve agent to another. The low-wavenumber region features weakly to non-Raman active absorption bands. A sharp and distinct peak due to C=N stretching at 1670 cm⁻¹ is noticeable in the IR spectra of A234. This peak is characteristic of substituted amidines, which absorb strongly at 1600-1700 cm⁻¹ [53]. The C=N stretching of an N-phosphorylated alkylisourea of the type (EtO)2P(O)NC(OEt)N(CH2CH=CH2)2 has been reported at 1640 cm⁻¹ and this correlates with that of A234 [50]. The C-N stretching of the -N=CR-N< acetoamidine unit of A234 is of weaker intensity and appears at a lower wavenumber (1560 cm⁻¹) than that of C=N stretching. Literature values indicate that the C-N bond of amidine derivatives vibrates at lower frequencies within a range of 1230-1412 cm⁻¹ [54-56]. The peak of highest intensity within the low-wavenumber region corresponds to the O-C stretching for A234, VR and VX (1113, 1091 and 1107 cm⁻¹, respectively) and P-O-C asymmetric stretching for GB (1045 cm⁻¹). The ¹H, ¹³C, ¹⁵N, ³¹P and ¹⁹F NMR chemical shifts of A234, GB, VR and VX in water are provided in electronic supplementary material, table S3, and the atom labelling of A234 is given in figure 3.

¹³C NMR
The tertiary carbon atom (C2) of the acetoamidine unit of A234 resonates in a weak field with the highest chemical shift of 190.71 ppm compared to the other carbon atoms present in either A234 or GB, VR and VX. This is characteristic of amidine/guanidine-containing molecules [57,58]. The C2 atom is in an electron-poor environment, being bonded to two electronegative nitrogen atoms, and hence, less shielded as opposed to the other carbon atoms. Further, the carbon atom (C26) bonded to oxygen in

¹⁹F NMR
The chemical shift of the F6 atom of A234 appears at −86.22 ppm while that of GB is at −63.69 ppm.

Conceptual DFT-based reactivity descriptors
The HOMO and LUMO plots of A234, GB, VR and VX obtained using the M06-2X/6-311++G(d,p) method are illustrated in figure 4. The HOMO of A234 is mainly localized on the N atoms and the ethyl groups of the acetoamidine unit. A small contribution from the orbitals of the electronegative O and F atoms is also observed in the HOMO. Its LUMO is mainly centred on the alkyl groups of the acetoamidine unit. Some similarities are also reflected in the HOMO and LUMO of GB, VR and VX. In GB, the HOMO is mainly localized on its alkyl groups as well as its O atom, while in VR/VX, the HOMO is centred on their tertiary amino unit. Further, the LUMO is mainly localized on the alkyl groups of GB, VR and VX. A close analysis of the MO coefficients indicates that the orbitals of the P atom contribute to some extent to the HOMO and LUMO of A234, GB, VR and VX. Population analysis shows that the active orbitals associated with acetoamidine unit dominate over that of the A large HOMO-LUMO energy gap has been associated with a large dipole moment [74]. VX has both the lowest energy gap and the lowest dipole moment. A234 has the largest dipole moment (which points towards the N3-P4 bond) and hence is the most polar. The dipole moment decreases on going from A234 → GB → VR → VX.
The electrophilicity index increases from VX → VR → A234 → GB. All the nerve agents are highly electrophilic. They are prone to accept electron density from incoming nucleophiles bearing oxygen, nitrogen and sulfur donor atoms, preferentially at the electropositive centre(s) of the nerve agents. NBO analysis of A234 indicates that the carbon atom (0.578e) of the >N1-CR=N3- acetoamidine unit and the phosphorus atom (2.498e) are both positively charged. All other carbon atoms are , table S5). Thus, these two particular centres (C2 and P4) will be the prime target of nucleophiles. By contrast, GB, VR and VX have only one electropositive centre where nucleophilic attack can take place. Their phosphorus centres (P(GB) = 2.365e, P(VR) = 1.974e, P(VX) = 1.971e) are less positively charged than that of A234. This suggests that a nucleophile may attack the phosphorus centre of A234 more readily than that of GB, VR and VX.
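For readers unfamiliar with conceptual DFT descriptors, the following sketch shows how an electrophilicity index is commonly estimated from frontier orbital energies. It assumes Parr's definition ω = μ²/(2η), with the chemical potential approximated as μ ≈ (E_HOMO + E_LUMO)/2 and the hardness as η ≈ E_LUMO − E_HOMO. Some authors include an extra factor of ½ in η, so the convention used in this study may differ. The function name and the orbital energies are hypothetical.

```python
def electrophilicity_index(e_homo, e_lumo):
    """Estimate Parr's electrophilicity index (same energy unit as inputs).

    Assumed working definitions (frontier-orbital approximation):
      chemical potential  mu  = (e_homo + e_lumo) / 2
      chemical hardness   eta = e_lumo - e_homo
      electrophilicity    omega = mu**2 / (2 * eta)
    """
    mu = (e_homo + e_lumo) / 2.0
    eta = e_lumo - e_homo
    if eta <= 0:
        raise ValueError("LUMO energy must lie above HOMO energy")
    return mu ** 2 / (2.0 * eta)


# Hypothetical frontier orbital energies in eV (not from this study).
omega = electrophilicity_index(e_homo=-8.0, e_lumo=-1.0)
print(f"omega = {omega:.3f} eV")
```

With this definition, a smaller HOMO-LUMO gap or a more negative chemical potential both raise ω, so the index combines the two quantities when species are ranked.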

Molecular electrostatic potential
The MEP surfaces of A234, GB, VR and VX, obtained using the M06-2X/6-311++G(d,p) method, are shown in figure 5 and their back surfaces are also provided in electronic supplementary material, figure S4. Some common features are observed in the MEP surfaces of the four nerve agents. The negative charge (red region) is mainly localized on the oxygen atom of the P=O unit and this will favour strong hydrogen bonding, for example, with the amino acid residue of the AChE active site. The regions governing the electron-rich O (coordinated to alkyl group), S and F atoms are weakly negative (yellow in colour) and these may form weak hydrogen bonding with the -NH2 moiety of the amino acid of AChE. The region at the nitrogen atom of the tertiary amino unit of VR and VX is also weakly negative (see electronic supplementary material, figure S4); however, the same is not observed for A234. The surfaces around the nitrogen atom (N1) together with the coordinated ethyl groups of A234 are weakly positive (pale blue in colour). The -NEt2 unit is electron-deficient, most probably due to the significant charge transfer from the lone pair of the N1 atom to the anti-bonding orbital of the C2=N3 bond (as observed from the second-order perturbation theory analysis within the NBO framework). The observation made from the MEP analysis also correlates with the natural charges of the N atoms of the nerve agents. The N1 atom of A234 (−0.493e) is less negatively charged than that of VR (−0.597e) and VX (−0.598e). On the other hand, the N3 atom of A234 is more negative (−1.005e). Electrophiles will approach the N3 centre more readily than the N1 centre.

ADME parameters
The ADME parameters, namely lipophilicity (log P), solubility (log S), topological polar surface area (TPSA) and skin permeability (log Kp) of the nerve agents, are summarized in table 3. The nerve agents are lipophilic in nature (the log Po/w values are positive). The lipophilicity generally increases in the order of VX ≈ VR > A234 > GB. Data derived from SwissADME indicate a good correlation among the molecular weight, TPSA, lipophilicity, solubility, human gastrointestinal absorption, blood-brain barrier permeability and skin permeability of the nerve agents A234, GB, VR and VX (see electronic supplementary material, figures S5a-d). Lipophilicity increases as the molecular weight and TPSA increase. This can be correlated with an increase in the penetration of the nerve agent in the CNS [76]. All the nerve agents are soluble in water and solubility is in the order of GB > A234 > VR ≈ VX. It is known that the solubility of a compound can influence its absorption and blood-brain barrier permeability [77]. The solubility of the nerve agents can be associated with their high human gastrointestinal absorption as well as good blood-brain barrier and skin permeability. SwissADME predicts that the skin permeabilities of A234 and GB are comparable and that their more negative log Kp values indicate that they are marginally less skin permeant than VR and VX [44].

Nerve agent poisoning and antidotes
Nerve agent poisoning is caused by the inhibition of AChE enzyme activity through the formation of a covalent bond between the phosphorus atom of the nerve agent and the alcoholic oxygen of the serine residue of the active site. Modelling the reaction between AChE and nerve agents is complex and time-consuming due to the large size of the protein and the requirement of high-accuracy computations. This reaction can alternatively be studied with model species. The simplest model for the active serine site of AChE is methanol [78]. Thus, the phosphonylation reactions of A234 with both the deprotonated CH3O− anion and the neutral CH3OH molecule were investigated using the M06-2X/6-311++G(d,p) method (scheme 2). The corresponding reactions for GB, VR and VX are provided in electronic supplementary material, scheme S1. The central phosphorus atom of A234 is attached to three potential leaving groups: -F, -OEt or -N=C(Me)NEt2. Thus, the reaction for the displacement of each leaving group via nucleophilic attack by CH3O−/CH3OH was studied. The enthalpy and free energy changes of the reactions indicate that attack by the CH3O− anion is favoured over CH3OH. This is also observed in the case of GB, VR and VX, except for the processes involving the cleavage of the P-S bond. Further, attack by the CH3O− anion will result in the preferential displacement of the -F group over that of -OEt or -N=C(Me)NEt2. The reaction energies for the substitution of -F by -OMe are comparable for A234 (ΔH = −9.3 kcal mol⁻¹ and ΔG = −6.5 kcal mol⁻¹) and GB (ΔH = −9.3 kcal mol⁻¹ and ΔG = −7.5 kcal mol⁻¹). Nerve agent poisoning can traditionally be treated with pralidoxime in conjunction with atropine. The main function of the oxime is to act as a reactivator of the phosphorylated AChE [79]. Thus, the effect of two antidotes on the reactivation of the A234-inhibited AChE model was probed using the M06-2X/6-311++G(d,p) method (scheme 3).
The first model antidote used corresponds to the simplest oximate, the formoximate anion (H2C=NO−). The second one is the hydroxylamine anion (H2NO−), which has been predicted to be a better antidote than formoximate for the reactivation process of GB-inhibited AChE [80]. The current study also indicates that reactivation of the A234-inhibited AChE model will preferentially be induced by the hydroxylamine anion rather than by the formoximate anion. Reaction with the formoximate anion is highly endergonic for all studied nerve agents (scheme 3; electronic supplementary material, scheme S2). This suggests that the A234-inhibited AChE model obtained from the displacement of the -F atom can be successfully reactivated using the hydroxylamine anion.

Hydrolysis
Nerve agents commonly undergo nucleophilic attack by the water molecule at the electropositive phosphorus centre via either an SN1 or SN2 hydrolysis reaction [81]. Thus, the hydrolysis reaction at the P4 centre of A234 was investigated using the M06-2X/6-311++G(d,p) method. This is illustrated by reactions 5a-c (scheme 4), which involve the displacement of the leaving groups -F, -OEt and -N=C(Me)NEt2, respectively. The free energy values indicate that these processes are endergonic. The carbon atom (C2) of the acetoamidine unit of A234 is also electropositive. Hydrolysis at the acetoamidine unit can take place via two different pathways as highlighted by Wu et al. [82]. Nucleophilic attack by the water molecule at the C2 centre is illustrated by reactions 6a,b (scheme 4), which involve bond breaking at N3=C2 and C2-N1, respectively. Reaction 6a is exergonic (ΔG = −9.2 kcal mol⁻¹). Based on the reaction energetics and in the absence of information on A234, it can be deduced that its hydrolysis may potentially take place via reaction 6a to yield N,N-diethylacetamide and ethyl phosphoramidofluoridate.

Heat of formation and bond dissociation energy
The ΔfH298 was calculated using the CBS-QB3 method for the nerve agents. The formation of the nerve agents is exothermic with ΔfH298 of −238.13 (A234), −240.49 (GB), −179.54 (VR) and −179.51 (VX) kcal mol⁻¹. The ΔfH298 values were estimated as in [48] and details are provided in the electronic supplementary material. GB is marginally more stable than A234. GB and A234 have greater thermal stability than VR and VX (≈ 60 kcal mol⁻¹). Figure 6 illustrates the ΔrH298 values for the dissociation of various bonds in A234 and these were compared with those of GA and GB [48]. The ΔrH298 for the dissociation of the P-F bond to form the radicals •F and •P=O(OEt)(N=C(CH3)(NEt2)) was calculated to be 144.5 kcal mol⁻¹. The P-F bond is the strongest and this is comparable to that of GB (143.2 kcal mol⁻¹) [48]. The C-C bond breakings of the acetoamidine unit correspond to the lowest BDEs (82.5 and 85.1 kcal mol⁻¹) and these dissociations will predominate in unimolecular initiation reactions. These C-C BDEs are significantly lower than that of the -OEt unit of A234 (91.7 kcal mol⁻¹) and GA (93.8 kcal mol⁻¹) [48]. A significant amount of electron density transfer from the lone pair of the N1 atom to the anti-bonding orbital of the C2=N3 bond has a destabilizing effect on the ethyl groups attached to the N1 atom. There is also a slight difference between the C-C bond distances of the acetoamidine (C8-C9 = 1.530 Å and C15-C16 = 1.532 Å) and -OEt (C26-C27 = 1.515 Å) units of the B3LYP/6-311G(2d,d,p) optimized structure of A234. The next favourable unimolecular initiation will occur at the O-C bond of A234.

Summary
Theoretical methods were employed to study (i) the molecular, spectroscopic, electronic and toxicity properties, (ii) poisoning action and antidotes based on model reactions, and (iii) hydrolysis and thermal degradation of A234. Some of these parameters were compared with GB, VR and VX. Distinct features in the spectra of A234 are observed, namely (i) a sharp peak due to C=N stretching at 1670 cm⁻¹ in the IR spectra, (ii) a large ¹³C NMR chemical shift of 190.71 ppm due to the -N=CR-N< acetoamidine carbon atom, and (iii) a relatively small ³¹P NMR chemical shift of 3.89 ppm. NBO analysis indicates that A234 may have a different chemistry from other nerve agents due to the presence of two active electropositive centres, namely the carbon atom of the -N=CR-N< acetoamidine group and the phosphorus atom. The energetics for (i) the reactions of the nerve agents with AChE model nucleophiles, (ii) the reactions of nerve agent-inhibited AChE with model antidotes, and (iii) hydrolysis and thermal degradation of A234 will serve as foundations for future computations. Detailed mechanistic studies are currently being carried out in our laboratory for an in-depth understanding of the hydrolysis of A234. Overall, this study suggests that VX and VR are potentially more reactive than A234 and GB owing to (i) their lowest HOMO-LUMO energy gaps, (ii) marginally higher skin permeability, and (iii) highly negative ΔG values associated with their reaction with MeO− and MeOH (for the cleavage of the P-S bond). The current theoretical work could not authenticate the claim made by Mirzayanov that A234 is more potent than VX, and this is in accordance with the recent study carried out by Carlsen [21]. The findings from this research work should provide incentives towards efficient detection, development of antidotes and destruction of A234.
royalsocietypublishing.org/journal/rsos R. Soc. open sci. 6: 181831