Saturday, November 22, 2014

Computing a solvent accessible surface area using Jmol

Someone asked me how to compute a surface area of a molecule.  Here's how to compute a solvent accessible surface area using Jmol.

1. Load your molecule into Jmol (File > Open)

2. Open the Script Editor and Output Console (File > Script Editor, etc)

3. Type the following three lines into the Script Editor window and hit run

isosurface solvent
isosurface area set 0

1.2 Å is the probe radius for water solvent.  If you are using another solvent input another value.  It will correspond roughly to the radius of the solvent molecule.

4. The are will be printed out in the Console.  The units are not specified but I'm sure it's Å2 (square Ångstrom)

Questions about the book?

I have a sneaking suspicion that at least one of you have questions about somethings that I wrote in the book.  So ask away below in the comments section and please remember to tag me in the post (+jan jensen - see picture below) so I get alerted.

Wednesday, November 19, 2014

Introducing the Computational Chemistry Daily

Yesterday I started the Computational Chemistry Daily which aggregates posts related to computational chemistry on social media such as Twitter, Google+ and Facebook on a daily basis.  It identifies the posts by certain words or hashtags such as #compchem and everyone can contribute. Hope you find it useful.

Sunday, November 9, 2014

Calculation of Solvation Free Energies of Charged Solutes Using Mixed Cluster/Continuum Models

2015.01.25:  I have summarized discussion of this and related issues in this paper.  Point 1 is wrong and Gibbs free energies should be used throughout.

This post takes it title from this paper by Bryantsev et al., which provides an excellent discussion of the issue.  The point of this post is to suggest four modifications to the one of the approaches discussed in the paper:

1. The use of standard Helmholz free energies instead of standard Gibbs free energies

2. Correction for indistinguishability of water molecules

3. Parameterization of a water oxygen solvent radius

4. The general use of the cluster model

Background
Continuum solvation models do not account for strong and specific interactions of solvent molecules with the solute (notably ions) so one must include some explicit solvent molecules in the continuum calculations.  The question is how many explicit solvent molecules to use.

Pliego and co-workers suggested that the optimum number of water molecules is the one for which $\Delta G^*_{\text{solv}}(\text{A}^{m\pm})$, computed using Scheme 1, is most negative (refer to the paper for a definition of terms).

Taken from 10.1021/jp802665d. (c) American Chemical Society

However, Bryantsev et al. argue that $\Delta G^*_{\text{solv}}(\text{A}^{m\pm})$ should be computed according to Scheme 2 and that computed in this way, $\Delta G^*_{\text{solv}}(\text{A}^{m\pm})$ will converge smoothly to the best value.  Thus, the optimum number of explicit solvent molecules is that for which $\Delta G^*_{\text{solv}}(\text{A}^{m\pm})$ changes very little when you add another one.

In principle, Scheme 1 and 2 should yield the same $\Delta G^*_{\text{solv}}(\text{A}^{m\pm})$ but Bryantsev et al. argue that Scheme 2 provides better cancellation of error compared to Scheme 1. The authors demonstrate this by point showing that the computed free energy of water cluster formation in bulk water, computed using Scheme 3, deviates significantly from 0 with increasing cluster size.

Taken from 10.1021/jp802665d. (c) American Chemical Society

In Figure 1 I show a plot of the residual error ($RE$, large dots),

$RE=-n\Delta G^*_{\text{solv}}(\text{H}_2\text{O}) + \Delta G^\circ_{\text{g,bind}}-(n-1)\Delta G^{\circ-*}$
$+ \Delta G^*_{\text{solv}}((\text{H}_2\text{O})_n)-RT\ln (n[\text{H}_2\text{O}]^{n-1})$

computed using two different continuum solvation models (SM6 red and COSMO blue) for increasing number of $n$.

Figure 1. Plot of RE without (large dots) and with (small dots) the Helmholz free energy and correction for indistinguishability of water molecules

Next I'll suggest some changes that significantly reduce $RE$.

1. The use of standard Helmholz free energies instead of standard Gibbs free energies
The change in Helmholz free energy ($\Delta A^\circ$) is related to the change in Gibbs free energy ($\Delta G^\circ$) by

$\Delta G^\circ=\Delta A^\circ+p^\circ \Delta V$

In the gas phase $p^\circ \Delta V = \Delta n RT$ where $\Delta n$ is the difference between the number of product and reactant molecules.  In solution $\Delta V$ is hard to estimate, but the best approximation is $\Delta V \approx 0$ and $\Delta G^\circ \approx \Delta A^\circ$.  So, I suggest using

$\Delta A^\circ_{\text{g,bind}}=\Delta G^\circ_{\text{g,bind}}-(n-1)RT$

2. Correction for indistinguishability of water molecules
Let's consider formation of the water dimer in solution, i.e. Scheme 3 for $n=2$.  As I have written about earlier, there are two ways of making the water dimer: one in which molecule $A$ is the H-donor and one where molecule $B$ is the H-donor (I assume the gas phase Hessian calculation for the water molecule is done in $C_{2v}$ symmetry).  In general, a (H$_2$O)$_n$ cluster can be made $n!$ ways so I suggest using

$\Delta A^\circ_{\text{g,bind}}=\Delta G^\circ_{\text{g,bind}}-(n-1)RT-RT\ln(n!)$

With these corrections $RE$ vs $n$ is significantly reduced as shown in Figure 1 (small dots). For example, for $n=18$ and COSMO $RE$ has been reduced from 41 to 10 kcal/mol.  The remaining error comes from errors in the solvation energies, hydrogen bond strengths, anharmonicity, and conformational entropy.  The $n=18$ cluster has around 30 hydrogen bonds to the 10 kcal/mol error could easily be explained by a 0.3 kcal/mol error in the computed hydrogen bond strength.

3. Parameterization of a water oxygen solvent radius
The solvation energies are clearly another source of error.  For example, the COSMO value for $\Delta G^*_{\text{solv}}(\text{H}_2\text{O})$ is 0.35 kcal/mol lower than the experimental value, corresponding to an error of 6.3 kcal/mol for 18 water molecules.  Part of that error is probably cancelled by similar errors in $\Delta G^*_{\text{solv}}((\text{H}_2\text{O})_n)$ but not all.

In fact, the higher errors for SM6 must be due to its underestimating $\Delta G^*_{\text{solv}}(\text{H}_2\text{O})$ by 2.51 kcal/mol (corresponding to ca 45 kcal/mol in the gas phase for $n=18$!).  Thus if SM6 has to be used it is important to re-parameterize it so that $\Delta G^*_{\text{solv}}(\text{H}_2\text{O})$ is reproduced reasonably well.  The easiest way is probably to change the radius of the oxygen water molecule sphere.

Notice that results are not improved simply by using the experimental value of $\Delta G^*_{\text{solv}}(\text{H}_2\text{O})$ instead of the computed one.  For example, for $n=18$ the difference in $RE$ for SM6 and COSMO (where the error in the solvation energy is quite small) is "only" ca 20 kcal/mol so introducing a 45 kcal/mol correction will not improve things for the right reasons.

Changes in Scheme 1 and 2
So, I suggest the following corrections to Scheme 1 and 2, respectively

$\Delta A^\circ_{\text{g,bind}}(I)=\Delta G^\circ_{\text{g,bind}}(I)-nRT-RT\ln(n!)$

$\Delta A^\circ_{\text{g,bind}}(II)=\Delta G^\circ_{\text{g,bind}}(II)-RT$

Notice that there is a $RT\ln(n!)$ term for both $(\text{H}_2\text{O})_n$ and $[\text{A}(\text{H}_2\text{O})_n]^{m\pm}$ leading to cancellation of this term for Scheme 2.

Clearly the corrections are much more important for Scheme 1 and Scheme 2 is still the more accurate approach due to better cancellation of errors.

4. The general use of the cluster model
While Bryantsev et al. advocate the use of the cluster cycle to find the optimum cluster size for the ions they still seem to advocate the use of a monomer cycle for pKa predictions (Scheme 4).

Taken from 10.1021/jp802665d. (c) American Chemical Society

I think better cancellation would be achieved by using two water clusters of size $n$ and $m$ instead of $n+m$ water monomers.  Thus

$\Delta G^\circ_{\text{g,deprot}}-(n+m-1)\Delta G^{\circ-*} \rightarrow \Delta A^\circ_{\text{g,deprot}} - \Delta G^{\circ-*}$

$\Delta A^\circ_{\text{g,deprot}}=\Delta G^\circ_{\text{g,deprot}}-RT$

$(n+m)\Delta G^*_{\text{solv}}(\text{H}_2\text{O}) \rightarrow \Delta G^*_{\text{solv}}((\text{H}_2\text{O})_n)+ \Delta G^*_{\text{solv}}((\text{H}_2\text{O})_m)$

$(n+m)RT\ln([\text{H}_2\text{O}]) \rightarrow 2RT\ln([\text{H}_2\text{O}]/(nm))$