Sunday, December 19, 2010

ChemDoodling on the iPad and the future of interactive chemistry textbooks

Interactive model of morphine

I have bought an iPad!  In honor of this purchase I bring to you this blog's first interactive figure that also works on the iPad (and most other mobile devices).  It is made with ChemDoodle Web Components (a modified version of this page), an open source javascript based toolkit for chemistry, made by Kevin Theisen and co-workers at his company iChemLabs.

Readers of this blog will know that I am quite fond of Jmol for interactive molecular models, but Jmol is written in Java, which is not supported by the iOS operating system that iPads, iPhones, and iPods use - and perhaps it never will be.   This decision by Apple basically means back to square one for interactive chemistry when it comes to the iPad.

I know of just two options for interactive models for the iPad: the Molecules app by Brad Larson and ChemDoodle Web Components (TwirlyMol does not appear to be interactive on the iPad).  The Molecules app looks a bit more three dimensional, but works only on the iPad.  Chemdoodle Web Components should work on most browsers and most operating systems, and a fully 3D version is also available.  The 3D version of ChemDoodle Web Components requires something called WebGL, which is not available in standard browsers yet, but should be soon.  You can get access to it now by downloading Google Chrome (BETA).

It is the Molecules app that is used in the interactive text book from Inkling that I wrote about earlier (thanks again to Henry Rzepa for the info).  But I think ChemDoodle Web Components holds tremendous promise for interactive chemistry textbooks when combined with another new innovation on the horizon: EPUB3.

Epub is basically code that makes XHTML look nice when viewed in an epub reader (such as iBooks), but the current version does not allow for things like javascript, needed for interactivity.  That will change with epub3 and, when combined with ChemDoodle Web Components, should allow us to make interactive chemistry textbooks that can be read on most devices.   It will be an exciting time.

Saturday, December 11, 2010

Computational chemistry exercises

Someone (I'll call him N. O'Boyle ... no, too obvious ... Noel O.) wrote me asking if I had any computational exercises I'd be willing to share. Since I took the trouble of writing him back, it occurred to me that I had free blog material. Here's my reply in a slightly edited form.

When I taught a computational chemistry course in Iowa I used exercises from "A Laboratory Book of Computational Organic Chemistry" by Warren Hehre et al., and Spartan in those pre-Avogadro days. Specifically experiments 4, 11, 34 and 76.  See here for an example.  The computational component of the other half of the course was individual research projects.

I would assign an experiment on a Monday, discuss it the following Monday, and have a write-up due  the Monday after that. I graded the first write-up (of exp 4) very lightly and then gave the student this example write-up of exp 4 so they could see how how to do it.  I also made this check list for a report and a list of questions for each experiment (taken from the book): exp 4, exp 11, exp 34, and exp 76.

Here in Copenhagen I co-teach a similar course with 5 other people, so I just get 1-2 exercises a year, and here I try to fit the content of the exercises in with the other instructors and the topic I cover. I teach the chapter on DFT and here I have developed an exercise using bond energies.

Sometimes I also teach the chapter on geometry optimization and then I use exp 76. Other instructors use Gaussian/Gaussview so that's what the student tend to use here too, so I have made no tutorials to go with the exercises.

"Exercises" in Molecular Modeling Basics: In Chapter 4 I illustrate applications of QM to various chemical problems, and Chapter 5 gives you some details of the underlying GAMESS input and output files (and there are now several blog posts with even more information on the various examples). The intent is that people can reproduce the results I present in Chapter 4 relatively easily.  Depending on the level of the course, reproducing these example may be challenging enough. Otherwise, one could easily come up with additional related problems. Let me know if that is of interest.

Many of the molecules and concepts are very P-chem oriented, i.e. uses small non-organic molecules to illustrate P-chem concepts, but there are some organic molecule/concept examples too: steric strain, hydrogen bonding, amide hydrolysis.  

Sunday, December 5, 2010

Arsenic and odd life

A recent Science paper describes "A bacterium that can grow by using arsenic instead of phosphorus", and molecular visualization is used to elegantly illustrate the basic idea, a shown in the above screencast (for the whole video visit gizmodo).

Whether the basic idea is correct is a whole other matter: for examples see here and here (be sure to check out the comments).

Saturday, December 4, 2010

Simulations in teaching physical chemistry: thermodynamics and statistical mechanics

In this post I summarize the simulations and I have used in teaching thermo and stat mech, and talk a bit about how I use them.

I co-teach two quite similar courses on this topic: one for nano-students and another for chemistry and biochemistry students.  In the nano course we use the book Molecular Driving Forces by Dill and Bromberg, and in the other Quanta, Matter, and Change by Atkins, de Paula, and Friedman.  At the end of this post I have organized the simulations by chapter for each book.

Some of simulations I have made (or modified extensively) and most of these have been discussed in previous blog posts, so I simply give the link to the respective blog post where there is more information.

The other simulations are from the Molecular Workbench (MW) library of models, and here I provide links that will open in MW, so you need to install MW before clicking on the links.  For some of them I also provide a brief description of what concepts try to demonstrate using the simulations.

How do I use the simulations?
All simulations are used during lecture to visualize concepts, start discussions, and motivate equations. I'll take Illustrating energy states as an example: instead of saying "Molecules in a gas translate, rotate, vibrate, and ....", I say "Here is a zoomed-in view of butane gas where you can see the molecules.  You can see that individual molecules move differently.  How do they move differently?  Anyone?  Right, they have different speeds.  This kind of motion is called translation.  What else? ..."

Practical tips
On a very practical note, my own simulations are all on web sites and I make sure to open all of them before the lecture, while I have all the MW simulations for the course indexed on a single MW page (click here to open in MW). It is not possible to embed these simulations in Powerpoint slides, but you can switch between Powerpoint and other applications without quitting Powerpoint (on a Mac you use command-tab and on Windows i believe it is windowskey-tab).  Note that you need access to the internet in the lecture room.

While I have screencasts of most of simulations on the blog posts, I don't use these during lecture.  I think it is too passive, and puts the students to sleep.  But I believe the screencasts are a good way for the students to review the main points of simulations after the lecture.  I put links to the blog posts on the course web site and in the lecture notes.

Is using simulations a good idea?
If possible I try to use a simulation within the first five minutes of a lecture, and have a maximum of 20 minutes between simulations.  I now only have one (45 minute) lecture left where I don't use a single simulation and I can just feel how I loose the student's attention after about 30 minutes.  You can just see it.  That being said, no one has ever mentioned the simulations in their course evaluations (good or bad), so I have no hard evidence that it improves my teaching.  But I can tell you that I enjoy lecturing much more with the simulations, so unless I get complaints I'll keep doing it. 

Making room for simulations in the lecture
I have taught the topics for many years without any simulations, and was never at a loss for material to cover.  Lecture time is precious, and these simulations take time to present and discuss.  You really have to introduce the simulation carefully (don't rush this part!) before you start them, and very often you want the students to speculate about what will happen before you start them.  Furthermore, they tend to stimulate many more questions, that you can hopefully turn into a discussion instead of simply answering them, than derivations - that's the whole point.

So how do you "make room" for the simulations?  I have cut out most of the derivations from the lectures.  To pay for my sins, I provide relatively detailed (typed) lecture notes ahead of lecture (I generally don't use Powerpoint), which include step-by-step derivations. So I'll say things like "Starting with these assumptions we can write down this equation.  This can be rewritten as this equation, which is much simpler.  The details on how we got from here to there are in your notes, but note that in step 3 we assume that ... which is an approximation."  No complaints so far.  If only more progress had been made on simulating derivations ...

Here are the simulations organized by chapter

Molecular Driving Forces by Dill and Bromberg (1st edition)

Ch 6: Entropy and the Boltzmann distribution law
Illustrating entropy

Ch 10: Boltzmann distribution law
Polymer unfolding: The book uses two simple bead models of polymers in this chapter to illustrate micro and macrostates and model protein melting.  I use this example extensively both in lectures and homework problems.  So I made this simulation to illustrate how higher energy macrostates become more likely at higher temperatures.

Ch 11: Statistical mechanics of simple gasses and solids
Illustrating energy states
Energy states in the water molecule: a slightly more complicated molecule than HCl (used in Illustrating energy states) with more than one vibrational mode and 3 rotational degrees of freedom.
Internal energy and molecular motion
Entropy, volume, and temperature

Ch 12: Temperature, heat capacity
The molecular basis of differential scanning calorimetry: heat capacity and energy fluctuations

Ch 13: Chemical equilibria
Seeing chemical equilibrium (opens in MW)
Dalton's law of partial pressure (opens in MW)

Ch 14: Equilibria between solids, liquids, and gasses
Seeing specific and latent heat (opens in MW): I use this simulation to illustrate how the same substance can be solid, liquid, and gas depending on the temperature.
A gas under a piston (opens in MW): I use this simulation to show that, for example, decreasing the pressure can have the same effect as increasing the temperature.
The phase diagram explorer (opens in MW)
Raoult's law: ideal solutions (opens in MW): Here, I use the simulation of the pure liquid to illustrate vapor pressure.

Ch 15: Solution and Mixtures
Mixing gasses, and mixing of ideal and non-ideal liquids
Raoult's law: ideal solutions (opens in MW)
Raoult's law: negative deviation (opens in MW) 
Raoult's law: positive deviation (opens in MW)

Ch 16: Solvation and transfers of molecules between phases
Visualizing osmotic pressure in an osmotic equilibrium (opens in MW)
Desalination using reverse osmosis (opens in MW)

Quanta, Matter, and Change by Atkins, de Paula and Friedman (1st edition)

Ch 13: The Boltzmann distribution
Illustrating energy states
Energy states in the water molecule: a slightly more complicated molecule than HCl (used in Illustrating energy states) with more than one vibrational mode and 3 rotational degrees of freedom.
Internal energy and molecular motion

Ch 14: The first law of thermodynamics#
The molecular basis of differential scanning calorimetry: heat capacity and energy fluctuations
Ch 15: The second law of thermodynamics
Illustrating entropy
Entropy, volume, and temperature
Ch 16: Physical equilibria
Seeing specific and latent heat (opens in MW): I use this simulation to illustrate how the same substance can be solid, liquid, and gas depending on the temperature.

A gas under a piston (opens in MW): I use this simulation to show that, for example, decreasing the pressure can have the same effect as increasing the temperature.

The phase diagram explorer (opens in MW)
Raoult's law: ideal solutions (opens in MW): Here, I use the simulation of the pure liquid to illustrate vapor pressure.
Visualizing osmotic pressure in an osmotic equilibrium (opens in MW)
Desalination using reverse osmosis (opens in MW)

Ch 17: Chemical equilibria#
Seeing chemical equilibrium (opens in MW)
Dalton's law of partial pressure (opens in MW)

# I don't teach this part of the course, but if I did I would use these simulations

Related posts:
An Atkins Diet of Molecular Workbench 
One, Two, Three, MD 
Tunneling and STM (a first stab at using Molecular Workbench to teach quantum mechanics)

Illustrating mixing

This screencast shows Molecular Workbench simulations I have made to illustrate mixing.

The first simulation illustrates the mixing of 2 ideal gases, which mix readily.  Since the gas particles don't interact you can think of the mixing as each gas expanding to fill both containers independently of each other.  As I have shown in this simulation, the driving force for this expansion is an increase in entropy.  Therefore, the driving force for mixing two ideal gasses is also purely entropic.
The second set of simulations illustrates the mixing of 2 liquids.  Since they are liquids there must be attractive interactions between the atoms.  If there were no interactions they would be (ideal) gasses.  The strength of the interactions (and the temperature) determine whether they mix or not.

In the first liquid simulation, the attraction between two green atoms (εGG), between two blue atoms (εBB), and between a green and a blue atom (εGB) are the same.
This means that a green atom doesn't care whether it is sitting next to a blue atom or another green atom.  The net effect is that green and red atoms are equally likely to be on the right or left side of the container, and the liquids mix for the same reason as the ideal gasses mix: the driving force is purely entropic. That means the enthalpy of mixing is zero:
This is the definition of an ideal mixture (or ideal solution).  The two liquids will mix at any temperature.

In the second liquid simulation, the attraction between two blue atoms (εBB) is stronger than between two green atoms (εGG) and between a green and a blue atom (εGB).
Note that the ε's are negative: a smaller ε means a stronger attraction.  This means that the blue particles would rather be with other blue particles, i.e. the enthalpy increases if the particles are mixed.
(z is the number of contacts between particles in solution, and xG is the mole fraction of green atoms).  This is an example of a non-ideal mixture, where the definition for a non-ideal mixture is
Because ΔmixH > 0 this non-ideal solution mixes spontaneously (i.e. ΔmixG < 0) only for
Oil and water is a common example of such an non-ideal mixture: the oil-oil interactions are stronger than the oil-water and water-water interactions.

Salt and water is another example of on idea mixture, but here ΔmixH <  0 so salt and water almost always mixes spontaneously.  The interpretation is that the interactions between the salt ions and water is stronger than the average interaction between salt ions and between water molecules.
Implications and limitations
The definition of ΔmixH in terms of the ε's suggest that liquids should also mix if
which would be a more general definition of an ideal mixture.

This is tested in the third liquid simulation.  As you can see the liquids mix more than in the second simulation, but not quite as much as in the first simulation.  This is mostly because of the simulation runs only for 100 picoseconds, which is to short to mix fully.  But another reason is that there is less space (on average) between the blue atoms compared to the green atoms, because the blue atoms attract each other more.  The next effect is that the blue particles tend to stay together to lower the enthalpy.  More mathematically,  z (the number of contacts between particles in solution) is not exactly the same for the blue and green particles so the interpretation of ΔmixH in terms of the ε's breaks down.  The "safest" definition of an ideal mixture thus remains:
i.e. "like dissolves like".

Accessing the simulations
You can play around with the simulations here and here, or you can download the models here and here if you have Molecular Workbench installed on your computer.

Sunday, November 21, 2010

The Open Science unjournal

In this post I do something a little different than normal for this blog: I introduce a scientific unjournal called Open Science.  It doesn't exist, but it should.
What is an unjournal?
An unjournal is to journals what an unconference is to conferences.  To define what an unjournal is, take the first 2 sentences in the wikipedia entry on unconferences and substitute a few words:  "An unjournal is a facilitated, participant-driven journal centered on a theme or purpose. The term "unjournal" has been applied, or self-applied, to a wide range of publications that try to avoid one or more aspects of a conventional publishing, such as loss of copyright, high fees, [list your favorite pet-publishing-peeve here]."

Here are some frequently asked questions (FAQs) about Open Science:

How do I publish in Open Science?
The usual way is to deposit your manuscript on  Once you have completed the submission process the paper is given a time and date stamp and the paper is published and open for review (see next question).  While there is a recommended template available for the paper, there is no fixed formatIt is possible to upload all your raw data or link to the data if it is hosted elsewhere.  Experience has shown that this tends to increase the scores (see below) of your papers significantly.

You can also choose to publish the paper on any of the unjournal sites whose contents are linked to  These sites are often run by established publishers and offer more user-friendly interfaces, but may require a fee and may ask you to give up your copyright (though the content is open access by definition).

Is Open Science peer reviewed?
For a publication in an unjournal such as Open Science the question should be: is my article in Open Science peer reviewed?  That is in large part up to you.  The paper is open for online, non-anonymous, and completely transparent review and you have 2 months in which you can change the content of the article in response to the comments.  After the 2 month period you cannot change the content, but you can of course respond to new comments online.  For very serious criticisms you may want published a new Open Science article to respond.

It is up to you to solicit reviews, though any published author of a peer-reviewed paper (defined below) can review your paper during the 2 months review process.  Based on our experience only well-written and well-presented articles on scientifically interesting questions get reviewed.

An Open Science article is neither rejected nor accepted at the end of the review process.  Instead it receives an initial score (see next FAQ).  Obviously, any paper that does not generate a single review (or is reviewed but gets an initial score of 0) is not considered peer reviewed.

What is the impact factor of Open Science?
For a publication in an unjournal such as Open Science the question should be: what is the impact factor of my article in Open Science?  During the 2 months review process each reviewer gives the paper a score between 5 (good) and 0 (bad).  This score can be adjusted by the reviewers based on the changes you make within the first 2 months, after that it is fixed. The initial score for your paper is the average of all reviewers final scores.  

If the paper is cited it receives an additional score (called the current score).  The score is determined by the number of citations, and if the citing paper is published in Open Science, the score is weighted by the initial and current score of that paper.  The authors of that paper can also choose to indicate how important your paper was to theirs.  If high scoring papers cite your paper in a positive manner, the current score of your paper increases. Self-citations are not included.  

Why should I review for Open Science?
The work you put into reviewing is now documented for all to see.  Have you contributed greatly to science by identifying Open Science papers with high current scores?  Do the reviews you submit carry more weight with the author and other reviewers as a result?  Some sites now list Open Science reviewers with particularly high impact as a kind of editorial board for the journal.

How should I cite an Open Science paper?
One suggestion is: Author(s), Title, Open Science, date of submission, initial/current score.  If you publish in Open Science using the suggested template, the current score is updated automatically. 

Why should I publish in Open Science?
There are many reasons:
(1) You retain the copyright and anyone can see the paper.
(2) Your paper is accessible upon submission. (Don't rush to publish though: you only have 2 months to get a good initial score).
(3) The impact of your paper is evident in the citation, but disconnected from the conventional impact factor of the journal you managed to get it in to.  The initial score of your paper can help the paper off to a good start, but your truly important papers will ultimately be identified by its current score.
(4) You choose the publishing format you like.  What's your pleasure? machine readable? interactive figures? link to raw data?
(5) Your paper is a living document: comments or questions continue to roll in on important papers and you can update links to your papers (related articles, a new data format) as you see fit.
(6) If you write a good paper, you will get more reviews (i.e. more suggestions and input) but the rantings of a single idiot reviewer will not prevent publication.  Isn't this the place to publish daring and ground-breaking work?  

So what brought this on?
The blogosphere: Egon Willighagen's latest post got me thinking about this particular idea, but the general problems it is trying to address was brought to my attention by many other blogs such as Michael Nielsen's The Future of Science and Is Scientific Publishing About to be Disrupted? posts; most posts by Peter Murray-Rust; Henry Rzepa's long fight to include interactive figures in conventional journals; Mat Todd's excitement for an unconference and the discussion it generated at Derek Lowe's blog.  Why can't we have this in a journal?

Is the Open Science unjournal a good idea?
The blogosphere will decide: no comments on, and no re-blogging of, this post will mean this idea dies a quiet death (by receiving an initial and current score of 0).  But if you get enough smart people fired up about an important idea that can be solved by IT, good things can happen.

Saturday, November 6, 2010

The molecular basis of differential scanning calorimetry: heat capacity and energy fluctuations

Melting and boiling points are convenient and important measures of stability.  But how do you measure a melting point of, for example, nanoparticles that are too small to see?

This screencast shows two Molecular Workbench simulations (you can find them here and here) I made [see credits at the end of the post] to illustrate the connection between phase transitions, changes in heat capacity, and energy fluctuations, and the slides below takes you through the basic ideas behind them.

Slide 1: In the first simulation heat is added to a nano-particle and the resulting temperature increase is measured. When viewing the simulation notice that the temperature increases less during the melting/evaporation.

Slide 2: To analyze the data we first switch the x and y-axes, so that heat added (i.e. the internal energy, U) is plotted as a function of temperature.

Slide 3: The data is a bit noisy (mainly because the simulation heats the particle too fast: from 0 to almost 3000 K in about 200 picoseconds!), so I smooth it by fitting a curve to it.

Slide 4: From the smoothed data I can calculate how fast the energy changes with temperature.  This is the heat capacity (Cv), which peaks at a temperature around 1350 K - the melting temperature of the particle.

Slide 5: This observation forms the basis of differential scanning calorimetry, which measures the temperature as a function of the flow of energy to a system, and determines the melting point by finding the temperature where the heat capacity peaks.

Slide 6: One way to explain why the heat capacity peaks at a phase transition such as melting is through its relation to energy fluctuations: the system changes most during a phase transition ("bonds" between particles are broken and formed), so the energy fluctuates more, meaning that the heat capacity is largest.

Slides 7, 8, and 9: In the second set of simulations the energy is plotted a function of time at 3 temperatures: before the particle melts (500 K), when the particle melts/evaporates (1350 K), and after the particle has evaporated (3000 K).

Slide 10: Results from the 3 simulations are compared.  Clearly the fluctuations are largest when the temperature is 1350 K.  The fluctuations at 3000 K are larger than at 500 K, even though the heat capacities are similar.  This is because the heat capacity is proportional to the average energy fluctuation divided by the temperature squared (slide 6).
You may wonder why we don't see two heat capacity peaks: one for melting and one for evaporation.  This is because of the particle is so small (i.e. composed of relative few particles).  For a macroscopic systems (like water) the phase transitions are well defined.  Water is ice at 272 K, melts at 273 K, and is a liquid at 274 K (at 1 atmosphere of pressure); and the heat capacity has a very narrow peak at 273 K. As particles become smaller their phase transitions become less well defined, the heat capacity peak becomes broad, and in some cases (like this one) you get a single heat capacity peak for melting and evaporation.  This means that the phase transition cannot really be classified as melting or evaporation and that is occurs over a relatively large temperature range.  Li and Truhlar have an interesting article on this subject. If you would like to play around with or modify the simulations they can be downloaded here and here, but you need to download Molecular Workbench first. Credits: The simulations are based on models and scripts by Arie Aziman and Carlos Gardena, who based their work on a model by Dan Damelin, i.e. they are made possible, like Molecular Workbench itself, by open source science.

Thursday, October 7, 2010

Nobel reactions

You may have heard this already: this years Nobel prize in chemistry went to Richard F. Heck, Ei-ichi Negishi, and Akira Suzuki "for palladium-catalyzed cross couplings in organic synthesis"Like last year, molecular modeling and animation is here to bring the science behind the prize to life.  This time, through the very excellent Chemtube3D (a Molecular Modeling Basics favorite) and its page on on organopalladium chemistry.

Saturday, October 2, 2010

Illustrating energy states

Here is a screencast showing 2 simulations I made to illustrate energy states (often also called microstates).  Open any book on statistical mechanics and you'll see a formula like this (for the Boltzmann distribution),
and, perhaps, a figure much like this.
But what are these "energy states" and ε's exactly? Hopefully the screencast and associated simulations, which you can find here and here, will help make this clearer.

About the simulations
If you want to play around with the HCl simulation, note that you have to reload the page if you want to change the translational motion.  This simulation is made with Jmol.  The molecular dynamics simulation of butane is made with Molecular Workbench.

Saturday, September 11, 2010

iPad: even 3-D molecules that can be viewed from any angle

I recently came across this report on, a new company aimed at bringing interactive textbooks to the iPad.  Interactive molecular models was mentioned two times and clearly left an impression on the author (italics are mine):

"Inkling’s software turns textbooks into interactive content, with video, hyperlinks between text and images, notes that can be shared between students and teachers, and even 3-D molecules that can be viewed from any angle."

"MacInnis – who worked at Apple for eight years, including a stint in the company’s educational division — says that the iPad is the perfect device for the kind of interactivity that Inkling provides because it has the ability to produce high-end graphics, such as the 3-D spinning molecule that is a feature of the company’s biology textbook."

This feature is also shown in inklings promotion video, excerpt below:
Update: MacInChemBlog keeps a list of science related iPhone/iPad apps.

Thursday, September 2, 2010

Sunday, August 29, 2010

Entropy, volume and temperature

This screencast shows a Molecular Workbench simulation I made to illustrate the connection between entropy, volume, and temperature.

Each simulation runs for 50 picoseconds (ps) and records how much time the two particles spend together (as a dimer) and apart (as monomers).  These times are a reflection of the relative probabilities of the dimer (pdimer) and monomers (pmonomers).
In the first simulation of the screencast the particles spend 38.3 ps together and 11.7 ps apart.  When I double the volume (while keeping the temperature constant) in the second simulation, the particles spend 32.7 ps together and 17.3 ps apart.

This makes intuitive sense: when the particles are not together they have a harder time finding each other again in a bigger volume.  Put another way, the probability is lower that the particles find each other when they move in a larger volume.

Now I want to connect this intuitive explanation with a thermodynamic one (i.e. an explanation in terms of free energy and entropy) :

The relative probability of being together and apart is given by the change in free energy (ΔA) on going from the dimer to the two monomers (dimer -> 2 monomers)
The increase in pmonomers/pdimer (from 0.31 to 0.53) on doubling the volume must mean that ΔA  decreases when the volume is doubled.  Statistical mechanics tells us that the only thermodynamic term in A that is affected by volume (V) is the translational entropy
 and that an increase in volume will increase the entropy of a particle
So when the volume is doubled the entropy of each particle (the dimer and each monomer) is increased by Rln(2). As a result ΔS for the reaction dimer -> 2 monomers increases [by Rln(2)] and ΔA = ΔU - TΔS decreases by RTln(2).  So the probability of the dimer relative to the monomers decrease because the entropy is increased when the volume increases.

In the last simulation of the screencast I double the temperature (while keeping the volume constant) compared to the first simulation. As a result the particles now spend less time together (19.2 ps) than apart (30.8 ps).

This also makes intuitive sense: the dimer is more likely to be struck by a particle with a kinetic energy larger than the potential energy that is holding it together: Furthermore, when the monomers happen to collide they are more likely to have a combined kinetic that is larger than the potential energy holding the dimer together, i.e. "with too much kinetic energy to stick together".

As before, this must mean that ΔA decreases when the temperature is increased.  The increase in translational entropy due to temperature is indeed larger than for the volume
and consistent with the observed larger change in the time the particles spend apart.

If you think of the dimer as a crude model of a salt you want to dissolve, this explains why dilution (increasing the volume) or heating increases the solubility (the amount you can dissolve).  Conversely, increasing the concentration (the amount of particles per volume) or decreasing the temperature helps dimer formation and, in a larger sense, self assembly.

You can play around with it on this web page, or you can download the model if you have Molecular Workbench installed on your computer. Enjoy!

Clarifications and approximations
 You may wonder why I discuss the Helmholtz free energy change (ΔA; some books use ΔF) rather than the Gibbs free energy change, when the volume changes.  This is because the volume of the system (i.e. the two compartments) does not change. Another argument is that no work is done during the expansion.

I have implicitly invoked the ergodic hypothesis which states that the probability computed using a collection of molecules at a single instant in time is equal to the probability computed for a single molecule over a long period of time.  While it makes intuitive sense, I don't believe this has been rigorously proven.

Related blog posts
Illustrating entropy
Where does the ln come from in S = k ln(W)?

Tuesday, August 24, 2010

Installing GAMESS on a linux PC

Probably best viewed in full screen mode

2014.02.18 Update: See this post by +Jimmy Charnley Kromann for more up-to-date instructions

Alchemist requested a post on installing GAMESS on a linux system: after all "I[f] someone can not install a program, how he/she can use it?" Fair enough.

If you have a mac or windows machine, pre-compiled versions of GAMESS is available (see this post on for installing such a version on Macs).  But if you have bought a linux computer, you do have to compile GAMESS yourself.

Three disclaimers:
(1) The post is aimed at people who want to install GAMESS on their personal computer that happens to run linux.
(2) The post is based on installing the March 25, 2010 version of GAMESS and corresponding scripts as they were distributed in mid August, 2010.  If you view this post a few years from now things may have changed.
(3) I don't have a linux PC, so the screencast is made on a shared research computer.  This affects one of the steps as I describe below, and may affect others steps that I don't know about.

The screencast assumes you have already downloaded gamess (gamess-current.tar.Z).  To download GAMESS start here and follow the instructions.  If you have problems watch the first few minutes of the screencast this post.

Here's a summary of the steps in the  screencast:

1. Unpack GAMESS: type "zcat gamess-current.tar.Z | tar -xvf -"
Now you should have a GAMESS directory.  The file gamess/misc/readme.unix has most of the instructions you need.  What follows it basically a condensed version with a few modifications.

2. In the GAMES directory: type "./config".
Follow the instructions and answer the questions.  You need to know whether you have a 32- or 64-bit chip.

3. Type "/sbin/sysctl -w kernel.shmmax=1610612736"
If you are curious about what this does, read section 5 of the file gamess/ddi/readme.ddi.
You need to be logged in as root for this to work.  I don't have root access on the machine I used, so I get an error message in the screencast.  But our system administrator had executed the command previously.

5. Change to the ddi subdirectory (gamess/ddi)Type "./compddi >& compddi.log" and then "mv ddikick.x .."

6. Go back to the gamess directory and type "./compall >& compall.log"
It will take 30-60 minutes to compile GAMESS.

7. Type  "./lked gamess 00 >& lked.log"

8. Edit the file rungms.  Here are the lines I modify or add:
set SCR=./
set USERSCR=./
set GMSPATH=./

setenv ERICFMT ./ericfmt.dat
setenv MCPPATH ./mcpdata

setenv  MAKEFP $USERSCR/$JOB.efp
setenv   GAMMA $USERSCR/$JOB.gamma
setenv   INPUT $SCR/$JOB.F05
setenv   PUNCH $USERSCR/$JOB.dat

   if ($os == Linux)   set GMSPATH=./

#  if ($NCPUS == 1) then
#     set NNODES=1
#     set HOSTLIST=(`hostname`)
#  endif
   set HOSTLIST=(`hostname`:cpus=$NCPUS)
   set NNODES=1 

#           echo I do not know how to run this node in parallel.
#           exit 20

9. Edit one line in the file runall: #chdir /u1/mike/gamess.
Type "./runall >& runall.log"

10. Go to the tools/checktst subdirectory and type "./checktst"
(Update: if the gamess directory is not in your home directory, you need to change this in the checktst file.  See this comment)

Running GAMESS in parallel
If you computer has more than one core, you may want to run GAMESS in parallel. To run exam01 on 2 cores type "./rungms exam01 00 2 >& exam01.log &"

Other useful programs and getting started with GAMESS
Now that you have GAMESS running you may want to install other useful programs such as Avogadro, GAMESSQ, and MacMolplt.
Here are some related blogposts to wet you appetite:
Building a complicated molecule with Avogadro

And here are a few blogposts on getting started with GAMESS:
Some GAMESS input basics
A typical set of GAMESS calculations

Acknowledgment: Mike Schmidt and Casper Steinmann helped with this post.

August 30, 2010 update: Alchemist has made the following page on installing GAMESS on Ubuntu 64 bit.  Be sure to check out the comments for additional useful links.

October 20, 2010:  Are you using Ubuntu 9.10 and 10.04, Linux Mint 8 and 9, Knoppix 6.2.1 or Suse 11.2? Be sure to check out this comment by Mott.

Wednesday, August 11, 2010

Amide hydrolysis, revisited 2

In a previous post I discussed the TSs of acetamide hydrolysis and hydrolysis of a related amide (3).  I have already made a post on how to find the TS for 3, and in this post I summarize the two ways of finding the TS for acetamide hydrolysis, that I describe in Chapter 5 of the book (where you can find many more details).

To start, I use Avogadro to build the geometry shown in Figure 5.32a and use it to construct the input file for an optimization with 4 constrained bond length.  The values for the constraints are taken from a paper.  Figure 5.32b shows the equilibrium geometry obtained with GAMESS, after 17 steps. Based on this geometry GAMESS finds the TS in three steps.

Figure 5.32a. Initial guess geometry for the constrained optimization discussed in Figure 5.33.
Click on the picture for an interactive version.
From Molecular Modeling Basics CRC Press, 2010

Figure 5.32b. The geometry resulting from the constrained optimization and the normal mode of the imaginary frequency computed for this structure at the PM3 level.
Click on the picture for an interactive version.
From Molecular Modeling Basics CRC Press, 2010

It will not always be possible to find an article that reports the TS structure for your reaction of interest. So let’s try to find the TS without the information from the article.  I start from the same structure (Figure 5.32a), and I arbitrarily pick C1-O10 as the reaction coordinate constrained to 1.70 Å.

This constrained optimization results (after 23 steps) in the geometry shown in Figure 5.35a.  The subsequent TS search results in the geometry shown in Figure 5.35b, which has no imaginary frequency and is clearly the product.

Figure 5.35a. The geometry resulting from a constrained optimization in which the distance between atoms 1 and 10 (see Figure 5.32a for numbering) was constrained to 1.7 Å, and the normal mode of the imaginary frequency computed for this structure at the PM3 level.
Click on the picture for an interactive version.
From Molecular Modeling Basics CRC Press, 2010

Figure 5.35b. The geometry resulting from a TS search initiated from the geometry shown in Figure 5.35(a).
Click on the picture for an interactive version.
From Molecular Modeling Basics CRC Press, 2010

The fact that the C–O bond is re-formed indicates that it should be stretched more in the initial guess, so I repeat the optimization with the C–O distance constrained to 2.0 Å instead of 1.70 Å. This results (after 39 steps) in the geometry shown in Figure 5.36.  Using this as a starting geometry for the TS search leads to the TS in Figure 4.30a after 17 steps.

Figure 5.36. The geometry resulting from a constrained optimization in which the distance between atoms 1 and 10 (see Figure 5.32a for numbering) was constrained to 2.0 Å, and the normal mode of the imaginary frequency computed for this structure at the PM3 level.
Click on the picture for an interactive version.
From Molecular Modeling Basics CRC Press, 2010

In the screencast below I try to reproduce the calculations that lead to these figures.  Because I rebuild the structure in Figure 5.32a, I get different starting coordinates, so the energies and number of steps are different than what I discuss in the book for the 4-constraints approach.

In the case of the 1-constraint approach, I actually manage to find the TS using the 1.70 Å constraint.  Thus, the structures in Figures 5.35b and 5.36 do not appear in the screencast.  This just goes to show how finicky TS searchers are to starting geometries.

While editing the screencast I noticed that the PM3 energies of the two TS structures I find are very different (11 kcal/mol).  This is also true for the TSs I found when writing the book (where they are different by 6.6 kcal/mol).

This appears to be a problem that PM3 has with structures where bonds are partially broken or formed.  I could not reproduce this for structures with normal bond lengths such as the reactants and products.   Note that in the book, I use M06/6-31G(d) single point energies to compute barriers, and that the  M06/6-31G(d)//PM3 single point energies of the TSs found with the two different method are only different by 0.03 kcal/mol.

Thursday, July 29, 2010

Amide hydrolysis, revisited

Figure 4.29. Sketch of hydrolysis reaction of several amides together with their experimentally observed half-lives. The free energies of activation are computed via Equations (4.30) and (4.31). (Adapted from N. M. Hernandes et al 2008. Journal of Organic Chemistry 73: 6413–6416.)
From Molecular Modeling Basics CRC Press, 2010
January, 2011 update: The figure in the book is missing an N atom in structures 4 and 5

Figure 4.30a and b shows the TS geometries for the hydrolysis of 1 and 3 computed at the PM3 level of theory, together with the normal modes associated with the imaginary frequencies (2336i and 239i cm–1, respectively).

Figure 4.30a. PM3 geometries of the TSs for hydrolysis of compound 1 (Figure 4.29), and the normal modes associated with the imaginary frequencies.
Click on the picture for an interactive version.
From Molecular Modeling Basics CRC Press, 2010

Figure 4.30b. PM3 geometries of the TSs for hydrolysis of compound 3 (Figure 4.29), and the normal modes associated with the imaginary frequencies.
Click on the picture for an interactive version.
From Molecular Modeling Basics CRC Press, 2010

As I show in the book, the predicted activation free energy of 1 is 4.2 kcal/mol higher than the activation free energy for the hydrolysis of 3. The source of the difference is roughly half electronic (1.9 kcal/mol) and half thermodynamic (2.3 kcal/mol). The explanation for the latter is the loss of translational entropy associated with hydrolysis of 1, but it is not obvious why the electronic activation energy should be lower.

For example, one might imagine that it would be energetically unfavorable to fold the chain of 3 into a ring due to some kind of strain when forming the transition state. This can be tested by studying the hydrolysis of 2 (Figure 4.29), which should have roughly the same amount of ring-strain associated with the reaction.

Indeed, the free energy of activation for hydrolysis of 2 is considerably higher than that for 3, and due entirely to an increased electronic activation energy.  This points toward the importance of the amine group in the middle of the chain in lowering the electronic barrier for 3-hydrolysis, presumably by stabilizing the partially positive –NH3+-like portion of the TS (Figure 4.31).

Figure 4.31. 0.002 au isodensity surface with superimposed molecular electrostatic potential of the TS for hydrolysis of 3. The maximum potential value is 0.05 au, and the level of theory is M06/6-31G(d). The orientation is the same as Figure 4.30.
Click on the picture for an interactive version.
From Molecular Modeling Basics CRC Press, 2010

It might be tempting to ascribe the more open TS structure in 3 compared to 1 (Figure 4.30) to ring-strain, but the TS for “methanolysis” of 1 is equally open (Figure 4.32). This is presumably due to steric hindrance of the methanol methyl group and the carbonyl oxygen.

Figure 4.32. PM3 geometries of the TSs for methanolysis of compound 1 (Figure 4.29), and the normal modes associated with the imaginary frequencies.
Click on the picture for an interactive version.
From Molecular Modeling Basics CRC Press, 2010

I will discuss how to find the TS structure for the hydrolysis of 1 in a future post.  I have already discussed how to find the TS structure for the hydrolysis of 3 here and hereThis post describes how to verify that the TS connects the correct reactants and product, and this post describes the relationship between half lives and activation free energy.

Sunday, July 25, 2010

Melting: a simple model

Figure 4.26. PM3 optimized structure of the V-shaped water timer. Shown also is the normal mode corresponding to the lowest vibrational frequency (36 cm–1).
Click on the picture for an interactive version.
From Molecular Modeling Basics CRC Press, 2010

While the lowest energy conformation of three water molecules is the ring structure (Figure 4.17), there is another minimum (at least on the PM3 potential energy surface - Figure 4.26) that is 7.7 kcal/mol higher in (electronic) energy.

The free energy difference between the cyclic and V-shaped structure is zero at around 480 K. This can be considered a very simple model for melting (i.e., the T at which higher enthalpy conformations are most probable because of entropy). The entropic term has two basic contributions: there are more higher-energy structures (they are more disordered so there are more ways to make them: e.g. there are 3 identical V-shaped structures), which lowers the conformational free energy, and the structures are “floppier” (they have more low-frequency vibrational modes), which lowers the vibrational free energy.

Figure 4.28a. Structure of one of the water clusters found by Maeda and Ohno. The coordinates are taken from their supplementary materials. (From S. Maeda and K. Ohno, 2007. Journal of Physical Chemistry A. 111: 4526–4534.)
Click on the picture for an interactive version.
From Molecular Modeling Basics CRC Press, 2010

Of course, this is a hypothetical melting transition because the cyclic “ice” structure already sublimates at 285 K. The main reason for the high melting temperature is the small number of higher-enthalpy conformations, which increases quickly with the number of water molecules. For example, for water octamer [(H2O)8] a study by Maeda and Ohno found 164 different conformations, and only seven of these can be classified as some variant of the lowest-enthalpy cubic conformation (Figure 4.28a), while the rest are more disordered (e.g., Figure 4.28b). The study estimates that the temperature at which the cubic and more disordered structures become equally probable (i.e., the melting temperature) is around 280–320 K, which is significantly closer to the melting temperature of bulk ice of 273 K. The uncertainty in the estimate of Tmelt comes from the difficulty in estimating the effect of BSSE and anharmonic effects.

Figure 4.28b. Structure of one of the water clusters found by Maeda and Ohno. The coordinates are taken from their supplementary materials. (From S. Maeda and K. Ohno, 2007. Journal of Physical Chemistry A. 111: 4526–4534.)*
Click on the picture for an interactive version.
From Molecular Modeling Basics CRC Press, 2010
* The Jmol structure does not correspond exactly to the picture.  When making the figure I forgot to write down which of the 164 structures I used for this figure.  Still, the point is the same.

Wednesday, June 16, 2010 - Solve puzzles for science

I just came across this amazing game/puzzle/software - call it what you will - Foldit.

It's an interactive protein folder that evaluates the energy on the fly to let you know how well you are doing.

But, and here is the important bit, the energy function is the same used in the Rosetta packages - a state-of-the-art protein folder (at least that's how I read it - the details are a bit sketchy). This is not a toy, but an exciting experiment to test whether human intuition and pattern recognition can compete with Monte Carlo sampling.

The screencast below (see a bigger version here) shows 6 intro puzzles that teaches you some of the techniques for protein folding. You get the program here and make an account for yourself here (making an account using Foldit did not work for me).

When you start Foldit the first time it takes you into the intro puzzles immediately (i.e. bypassing the menu).

I don't know where to start about the many implications of this program for fundamental science, teaching, and public outreach. So I'll simply say wow!, and go on to level 2-3. Not a moment to lose, CASP9 is underway you know!

June 20th, 2010 update:

Not sure why I didn't think to check youtube immediately but there are many videos about foldit. Here are a couple:

Monday, June 7, 2010

Geometry and molecular motion

Figure 4.18. The relative energy of H2 as a function of H–H distance and the energy of the lowest vibrational level (horizontal line).
From Molecular Modeling Basics CRC Press, 2010

When discussing structure and other molecular properties it is important to remember that the molecule is in constant internal motion due to vibrational motion. The bond length and angles that one so carefully computes and reports to so many decimal places actually changes considerably with time.

Consider the H2 molecule. Using B3LYP/6-31G(d) and the harmonic oscillator approximation, the frequency for the H–H stretch vibration is 4450 cm–1, which corresponds to 6.4 kcal/mol of kinetic energy. If we compute the energy as a function of the H–H bond length (Figure 4.18), we see that this kinetic energy is enough to compress the H–H bond length to roughly 0.64 Å and increase it to 0.88 Å: the classical turning points.

The classical turning points can be estimated using the harmonic approximation of the potential energy surface (PES)
where νi is the vibrational frequency in units of cm–1, with a corresponding normal coordinate li (see section 1.3 of the book), and n is the vibrational quantum number. Thus, νi = 4450 cm–1 corresponds to a turning point at li =±0.087 amu½ × Å which can be converted to Cartesian coordinates (and hence a bond length) using the normal mode component Lij In the case of H2, this gives classical turning points at 0.621 Å and 0.865 Å, which are in good agreement with the previous estimate from Figure 4.18. The screencast below shows ho MacMolPlt can be used to estimate the classical turning points.

For lower frequencies the displacements at the classical turning points can be quite substantial, both because the PES is flatter and because higher vibrational levels are populated. For example, in the case of ethane the lowest frequency is 310 cm–1 [at the B3LYP/6-31G(d) level of theory] and corresponds mainly to rotation about the CC bond (Figure 4.20a). For the ground vibrational level the classical turning points occur at li =±0.330 amu½ × Å , which corresponds to a dihedral angle change of ±14° (Figure 4.20b, see screencast below). At room T 4% of the ethane molecules have three quanta of energy in this mode (n = 2), corresponding to a kinetic energy of 2.2 kcal/mol, enough to change the dihedral angle by ±31°.

Figure 4.20. (a) Normal mode associated with the lowest frequency computed for ethane using B3LYP/6-31G(d). (b) Structure obtained by displacing along the mode by 0.330.
Click on the picture for an interactive version.
Click here for a pop-up window
From Molecular Modeling Basics CRC Press, 2010

Some of the lowest frequencies you’ll observe are for intermolecular interactions such as the water dimer hydrogen bond (Figure 4.22). Curious how much the geometry is displaced? Why not try it yourself? (See this post to get started with GAMESS).

Figure 4.22. (a) The PM3 normal mode corresponding to (a) torsional motion about the hydrogen bond and (b) hydrogen bond stretch in the water dimer. The corresponding frequencies are 128 and 415 cm–1.
Click on the picture for an interactive version.
Click here for a pop-up window
From Molecular Modeling Basics CRC Press, 2010

Using MacMolPlt to compute the displaced geometry
As you can see from the second equation, the normal coordinate is basically a scale factor. Thus, MacMolPlt can be used to compute a displaced geometry by converting li to a percent and changing the units to amu½ × Bohr by dividing by 0.52918 Å/Bohr. Thus, li =±0.330 becomes 62.0% (note that there is a mistake in the book: I forgot about the unit conversion in Figure 5.27). Unfortunately, MacMolPlt does not include the m so this is only an estimate.