Statistical efficiency of methods for computing free energy of hydration

The hydration free energy (HFE) is a critical property for predicting and understanding chemical and biological processes in aqueous solution. There are a number of computational methods to derive HFE, generally classified into the equilibrium or non-equilibrium methods, based on the type of calculations used. In the present study, we compute the hydration free energies of 34 small, neutral, organic molecules with experimental HFE between +2 and -16 kcal/mol. The one-sided non-equilibrium methods Jarzynski Forward (JF) and Backward (JB), the two-sided non-equilibrium methods Jarzynski mean based on the average of JF and JB, Crooks Gaussian Intersection (CGI), and the Bennett Acceptance Ratio (BAR) are compared to the estimates from the two-sided equilibrium method Multistate Bennett Acceptance Ratio (MBAR), which is considered as the reference method for HFE calculations, and experimental data from the literature. Our results show that the estimated hydration free energies from all the methods are consistent with MBAR results, and all methods provide a mean absolute error of ∼0.8 kcal/mol and root mean square error of ∼1 kcal for the 34 organic molecules studied. In addition, the results show that one-sided methods JF and JB result in systematic deviations that cannot be corrected entirely. The statistical efficiency ε of the different methods can be expressed as the one over the simulation time times the average variance in the HFE. From such an analysis, we conclude that ε(MBAR) > ε(BAR) ≈ ε(CGI) > ε(JX), where JX is any of the Jarzynski methods. In other words, the non-equilibrium methods tested here for the prediction of HFE have lower computational efficiency than the MBAR method.


I. INTRODUCTION
Hydration or aqueous solvation of molecules is essential in many biochemical processes such as transfer of compounds through the cell membrane or the activity of the biological macromolecules in cells 1 but also in chemical processes, such as micelle formation, protein folding, and aggregation or binding of drugs to biological macromolecules.The hydration free energy (HFE) is the amount of free energy needed to transfer a molecule from the gas phase to aqueous solution.It aids in understanding the outcomes of various chemical and biological processes in aqueous solutions. 2,3Computational approaches to predict HFE are important to understand molecular interactions in the aqueous phase.
There has been extensive research on the computation of HFE.For instance, hydration free energies have been calculated for 504 small neutral organic molecules in an implicit solvent 4 and explicit solvent 5 using the Bennett acceptance ratio (BAR) 6 based on molecular dynamics (MD) simulations.In a recent study, Matos et al. 7 have used the multistate a) Author to whom correspondence should be addressed: david.vanderspoel@icm.uu.se.
Bennett acceptance ratio (MBAR) to get highly accurate HFEs of organic molecules, and the results were presented in the FreeSolv database of hydration free energies. 8Furthermore, HFE calculations have been reported for amino acid side chain analogs [9][10][11][12][13] and 60 small molecules. 14,153][24][25] Gibbs free energies of solvation in other solvents than water are effectively uncorrelated to experimental data with correlation coefficients R from 0.25 (GB) to 0.5 (PB) compared to 0.85 for explicit models, 25 suggesting that implicit solvent models should be used with caution.In two studies, the reference interaction site models (RISM [26][27][28] and 3D-RISM [29][30][31] ) have been used for HFE calculations of drug-like molecules. 22,32However, even though statistical mechanical theories developed for molecular liquids by using the distribution functions of the system are computeefficient compared to MD, their HFE results are poorer due to the approximations in theories. 2,33,34n order to compute the accurate hydration free energies of small molecules via MD, many researchers have focused on equilibrium methods such as BAR 35 and MBAR. 36These methods are based on data collected from the equilibrium simulations as well as free-energy perturbations.On the other hand, non-equilibrium methods such as the one-sided Jarzynski Equality (JE) 37 and two-sided Crook Gaussian Intersection (CGI) [38][39][40][41] methods have been proposed to be computeefficient alternatives for estimating the hydration free energies of small molecules.Here, we calculated the HFE of 34 small neutral organic molecules with the experimental range from +2 kcal to −16 kcal/mol with both the two-sided CGI, BAR, and MBAR methods and the one-sided JE using MD.The calculated HFE results are compared with MBAR results and experimental values from the literature.

II. METHODS
The free energy corresponds to the amount of work that a system can perform.Free energy differences between two equilibrium states of a system can be calculated, using MD simulations, with the Jarzynski equality (JE), the Crooks Gaussian Intersection (CGI) relation, or the Bennett Acceptance Ratio (BAR) method.These methods are explained briefly below:

A. Jarzynski Equality
JE computes the free energy difference ∆G AB between two equilibrium states from the exponential average of the work W performed in non-equilibrium transitions from one state to the other.It requires that the transition be started from an equilibrium state.The energy ∆G AB is determined from e −β∆G AB = e −βW . ( The work done on the system along the variable λ which regulates the potential energy of the system can be obtained by In Eq. ( 2), instantaneous ∂H/∂λ values are integrated.The resulting work includes contributions from the free energy difference between the states and the dissipated work along the transition path.However, it has been reported that the Jarzynski estimator is biased by the finite number of the trajectories. 42or a small system in a near-equilibrium state, the magnitude of this bias B J (N) can be estimated with an empirical relationship proposed by Gore et al. 42 where N is the number of trajectories (N = 50 in our case), W is the mean dissipated work which is defined as the difference between the work and the equilibrium free energy difference, α is a decreasing function of W , and where β = (RT ) −1 , T is the temperature and R is the gas constant (R = 0.008 31 kJ/mol K), and σ 2 w is the variance of the work distribution. . ( The bias-corrected JE is defined as The variance for bias-corrected JE is given by For bias calculation, Eq. ( 5) was used with C = 15, but for the variance, αv = α(C = 50) based on the work by Gore et al. 42 Finally, the mean square error (MSE) for the bias-corrected Jarzynski estimator is defined as Here, we used three variations of the bias-corrected JE: Jarzynski Forward (JF), in which the simulation is started in the hydrated state, Jarzynski Backward (JB), starting from the gas state (decoupled), and Jarzynski Mean (JM), taking the mean of JF and JB.

B. Crooks Gaussian Intersection
The Crooks Fluctuation Theorem (CFT) is an equation based on the ratio of the work distributions of two-sided transitions (that are started from equilibrium ensembles), moving from state A to state B and from state B to state A, with dissipated work involved in the transformation.The free energy difference can be computed from the CFT as the work value W where both distributions overlap P f (W ) = P b (−W ), Plotting the logarithm of the left side of Eq. ( 9) against the work values obtained from the non-equilibrium transitions gives a line with a slope β.This line intercepts the work axis at a value equal to ∆G.
Goette and Grubmüller derived a CGI estimator which is based on Gaussian approximation and showed that CGI yields accurate free energy estimations. 41The CGI method is mathematically expressed as where W f n f and W b n b are the work averages in the forward and backward directions, respectively.n f and n b are the number of the transitions of the respective directions, σ f and σ b are the standard deviations of the respective Gaussian functions, and σ 2 f and σ 2 b are the variances of the work distributions for the transitions in both directions.
For the calculations of HFE with CGI, we used the work distributions of two-sided (forward and backward) transitions based on the Gaussian approximation.The HFE ∆G is where the two distributions intercept (Fig. 1).

C. Bennett Acceptance Ratio
The BAR method estimates the free energy difference between two states A and B based on the data obtained from the simulations of both states simultaneously.The free energy difference between two states is given as where β = 1/k B T , k B is the Boltzmann constant, and T is the temperature.The subscripts A and B in Eq. ( 11) denote that the ensemble averages < > are calculated from the trajectories of the initial (A) and final (B) states, respectively.The symbols f and and the Hamiltonian of a system that consists of a kinetic and a potential energy component H(p, q) = K(p) + U(q), respectively.The unknown constant C is found numerically from Here n A and n B are the number of samples in each state and Q denotes the corresponding partition function.The value of C obtained from Eq. ( 12) yields the free energy difference ∆G AB , To assess data from multiple states, Shirts and Chodera 36 extended the method to the Multistate Bennett Acceptance Ratio (MBAR), using a maximum likelihood formulation.

III. SIMULATION DETAILS
All simulations were run using the MD software package Groningen Machine for Chemical Simulation (GROMACS) (version 2016.2). [43][44][45][46][47] The force field models and the initial coordinates of compounds were taken from the FreeSolv database. 7,8In all simulations, we used a leap-frog stochastic dynamics integrator 48 with the AMBER99SB-ILDN force field, 49 TIP3P water model, 50 the Linear Constraint Solver (LINCS) algorithm 51 for hydrogen bond constraints, and SET-TLE algorithm 52 to keep the water bonds and angle rigid.The equations of motion were integrated using a time step of 2 fs.Temperature coupling was performed using Langevin dynamics 48,53 with a coupling constant of 1.0 ps and a reference temperature of 298.15 K. To establish and maintain a pressure of 1 bar, the Berendsen barostat 54 was used during the equilibrium stage of the simulations and the Parrinello-Rahman barostat 55 was used during production simulations, with a time constant of 1.0 ps and a compressibility of 4.5 × 10 −5 bar −1 .The particle mesh Ewald (PME) algorithm 56 was used for electrostatic interactions with a switching distance of 1.2 nm, a grid spacing of 0.12 nm, and an interpolation order of 6 for long range electrostatics.For the van der Waals interactions, a cutoff of 1.1 nm was used.A softcore potential 57 was used during free energy calculations with parameters α = 0.3, σ = 0.25, and p = 1.
For the HFE calculations, the non-equilibrium transition simulations were conducted based on a non-equilibrium fast-growth thermodynamic integration (FGTI) protocol.First, 10 ns simulations were performed in the equilibrium states A (coupled) and B (decoupled).Then, from the first 5 ns of the simulations, 50 snapshots were extracted and used to run short non-equilibrium simulations of 100 ps each, in which the coupling with the environment was inverted by switching lambda from 0 to 1 (forward: decoupling, A → B) and from 1 to 0 (backward: coupling, B → A) taking increments/decrements of ±2 × 10 −5 .For the analysis of the forward and backward simulations, the pmx tool was used. 58The uncertainties associated with each method were obtained with 1000 bootstrap iterations except for MBAR.Since MBAR presents an underestimate of the true uncertainty in the values due to finite sampling, 59 we performed three replicates of the simulations to obtain a more realistic uncertainty.

IV. RESULTS AND DISCUSSION
HFE calculations were performed on the 34 small neutral organic molecules given in Table I with the experimental range of free energies from +2 kcal to −16 kcal/mol using six different methods, JF, JB, JM, CGI, BAR, and MBAR, based on MD simulations.The MBAR results presented here are virtually identical to the FreeSolv values [Pearson R 2 = 99.95%,root mean square deviation (RMSD) 0.02 kcal/mol].Here, the main focus is on comparing the statistical efficiency of the methods, but first, we establish the correctness and convergence of the calculations.
To assess the convergence of the initial 10 ns trajectories, both in the coupled and the decoupled state, each was divided into two parts of 5 ns.The average root mean square deviation (RMSD) for both parts was calculated by dividing the first and last 5 ns of the MD trajectory into five blocks of 1 ns, computing the mean values in each block, and then calculating the standard deviation of the mean values from the five blocks.The similarity between the average RMSDs of the first and second parts of the trajectories (Fig. S1) suggests convergence in both parts of the simulations in the coupled state, where van der Waals and Coulomb interactions between the small molecule and the surrounding water molecules are on.This then suggests that only 5 ns of simulation should be sufficient for the selection of starting points for the non-equilibrium HFE calculations.To check the effect of simulation time on the accuracy of HFE, we calculated the hydration free energies of all the compounds studied for both trajectory parts as well.For the last 5 ns of the trajectories, the calculated hydration free energies are given in Table S1, but the results and figures presented in the main text were obtained using the first 5 ns of the equilibrated trajectories of 10 ns.The difference between the results obtained from the first and second half of the trajectory is small, as can be seen comparing Tables I and  S1.The results of HFE without bias correction, bias, variance, and mean square error (MSE) for JF and JB are presented in Tables S2 and S3.
Detailed results of the HFE estimates of the first 5 ns and the last 5 ns of the trajectories obtained using the CGI method are provided as the supplementary material (Figs.S2-S35 and Figs.S36-S69, respectively).It is noted that in those supplementary graphs, the sign of ∆G should be inverted since HFE is ∆G (B → A) = G (state A) − G (state B), while CGI is a specific estimator for the ∆G (A → B) = G (state B) -G (state A) based on the CFT.As can be seen from the joint set of CGI plots (Figs.S2-S69), for all the molecules, the work distributions of the forward and backward states exhibit a large overlap, suggesting that the equilibrated A and B states have converged well.
Table I lists the computational predictions of JF, JB, JM, CGI, BAR, and MBAR together with the experimental data for all the molecules studied.From this table, it is possible to assess how well the computational predictions correspond to the experimental values and which methods give similar or different results.Table I shows that all methods produce the same mean absolute error (MAE) and root mean square error (RMSE) from experiment, which is a prerequisite for further analysis.It is apparent that the error bars in the individual HFE are larger for JF, JB, JM, CGI, and BAR than for MBAR.The error bars are related to the amount of simulation time per molecule of the employed methods (Table I).The MBAR results involve a total of 20 windows (intermediate λ states) of 5 ns for each molecule, thus costing 100 ns for each molecule.In this study, we used equilibrium trajectories of 5 ns in both the coupled and the decoupled states.From these trajectories of the simulations, 50 snapshots were extracted and non-equilibrium transitions (simulations) were performed for 100 ps each.The cost of both the equilibrium and non-equilibrium simulations is 5 ns, with a total of 10 ns for each molecule using either of the one-sided methods, and double that for the two-sided methods (Table I).
To assess the accuracy of the calculated HFE for each method and see whether the size of the deviations depends on the size of the HFE and/or the nature of the compound, the deviations from the experimental values are shown in Fig. 2, with gray bands denoting the range of absolute errors within 1 and 2 kcal/mol.The figure shows that ∆∆G free energy differences of all the methods with experimental values are within 1 kcal/mol for 22 molecules, within 2 kcal/mol for 10 molecules, and within 3 kcal/mol for 2 molecules.From Fig. 2, it appears that both one-and two-sided methods employed in this study can be used to predict the experimental HFE values, but the use of one-sided methods, in particular, JF, leads to slightly larger systematic errors.Although the methods yield different RMSE, the deviation from the "true" value defined by the force field can be assumed to be stochastic with an expectation value of zero at least for the two-sided methods.As a result, the deviation from experiment is identical for these methods.The correction for the sampling bias 42 compensates this to a FIG. 2. Absolute errors between calculated and experimental hydration free energies.The darker and lighter gray shaded areas mark a deviation of 1 kcal/mol and 2 kcal/mol, respectively.Lines are drawn between neighboring points to guide the eye and stress the profiles in the deviations for comparing the different methods.
large extent.This means there is no established way to correct the JF or JB methods to obtain the true HFE ("true" as in the value inherent to the force field used).However, averaging JF and JB into JM results in a reliable estimate of the HFE, as proposed by Collin et al. 60 To investigate force-field consistency, a linear regression was performed of the computed values against the MBAR values for each of the methods in Fig. 3.The correlation coefficients between the computed hydration free energies with JF, JB, JM, CGI, BAR, and MBAR have R of 1.0.Additionally, the dataset of each method was compared statistically with MBAR values.The absolute error (MAE) and the root mean square error (RMSE) for all methods are given in Fig. 3.These show that JM, CGI, and BAR yield results that are more consistent with MBAR than the one-sided JF and JB.
Finally, we return to Table I and evaluate the statistical efficiency ε of each of the methods, defined as 61 where T is the simulation time that is multiplied by the average variance (MSD) in the HFE calculations.Here, the simulation time is used as a proxy for the amount of information in the simulation.This yields a number ε for each method that can be used as follows: for a desired RMSD , the simulation time T needs to be 1/(ε RMSD 2 ).This makes sense if we assume independent statistical sampling since in this case, the RMSD decreases as the square root of the sampling time.Table S1 shows that the ε are reproducible since they are roughly the same as in Table I.The MBAR method has more than double the efficiency of CGI and BAR, both of which have slightly higher efficiency than either JM, JF, or JB.In order for JM to obtain the same RMSD as MBAR, which is considered the best and most reliable estimator, 7,62 approximately 13 times more sampling would be needed, annulling any perceived computational efficiency gains of the non-equilibrium method.It should be noted that the absolute values of ε are dependent on the dataset being considered and that the amount of information may differ somewhat between different methods.There is no a priori reason to believe, however, that the relative efficiencies would be different for other datasets.It should be noted that the accuracy of non-equilibrium free energy methods depends on a number of settings, such as the size of molecule, the speed of (de)coupling, and the simulation time of the equilibrium trajectories and the non-equilibrium transitions.It is likely that the larger/flexible molecule, faster (de)coupling, and shorter simulation time would lead to higher variance due to the large and/or fast perturbations of the molecule, especially in the decoupled state, and hence lower statistical efficiency as what is found here.Paliwal and Shirts performed an extensive statistical analysis of free energy methods, including thermodynamic integration, BAR and MBAR.In line with our results, they concluded that MBAR is the most efficient method in terms of the lowest attainable error; 62 however, they did not take into account the amount of sampling needed to obtain this efficiency.A point of concern in our analysis is whether the errors in the equation for ε are accurate or indeed comparable between different methods.The error estimators used here are considered to be state of the art for all methods for BAR and MBAR as well as for the non-equilibrium methods. 62hould better estimators be derived, the analysis may need to be redone.

V. CONCLUSION
Hydration free energies of 34 small neutral organic molecules were computed using the non-equilibrium free energy calculation methods JF, JB, JM, BAR, and CGI and the results compared with the equilibrium method MBAR and experimental values.Comparison of the different methods shows that all the non-equilibrium one-and two-sided methods reproduce the HFE results from the equilibrium twosided method MBAR reasonably well.A careful comparison of the efficiency ε of the methods as expressed in average RMSD divided by simulation time yields that ε(MBAR) > ε(BAR) ≈ ε(CGI) > ε(JX), where JX is any of the Jarzynski methods.
It should be noted that all the molecules studied are small.Large and/or flexible molecules may suffer from convergence issues due to the large perturbations caused by their internal motions and local roto-translational motions in the decoupled state, even with two-sided methods.Since free energy is a state function, therefore, splitting a large perturbation into series of smaller ones by multistage free energy perturbation calculations using multiple λ values is recommended. 63

SUPPLEMENTARY MATERIAL
See supplementary material for figures of the predicted hydration free energies of all the compounds with the CGI method.

)FIG. 1 .
FIG. 1. Schematic representation of the non-equilibrium HFE calculation for switching from the equilibrium state A to another state B for a forward (A → B) and a backward (B → A) process with CGI.W f and W b are the means of the work distributions of the forward [P(W f )] and backward state [P(−W b )], respectively.∆G is the intercept of the two distributions.
Gore et al. have reported that bias needs a function α, which depends on a parameter C.

TABLE I .
7,8 hydration free energies from the first 5 ns of the trajectories of all the compounds studied.Energies are in kcal/mol.∆GExp.valuesare taken from the FreeSolv database.7,8Statisticsare computed with respect to experimental data, mean absolute error (MAE), root mean square error (RMSE), and Pearson correlation coefficient R.