II. Simulation of water using periodic boundary conditions and the CHARMM program

Periodic boundary conditions are often used in the simulation of condensed phase systems. In this tutorial, a simulation of pure water in the liquid phase will be done using periodic boundary conditions; the water molecules are treated explicitly. Be forewarned that the explicit treatment of the solvent is much more CPU intensive than doing a vacuum simulation The goal is to familiarize the user with the commands necessary for setting up and defining periodic boundary conditions.

The water model used with the CHARMM force field is closely related to the TIP3P water model developed by Jorgensen and co-workers (Jorgensen et al. 1983, J. Chem. Phys. 79: 926; Neria et al 1996, J. Chem. Phys. 105: 1902). It is a three particle model which includes the oxygen atom and two hydrogen atoms; there is no internal flexibility in this model. Other models include internal flexibility and still others include the oxygen lone electron pairs.


Below are the different steps for setting up a simulation with periodic boundary conditions.

  • PSF Generation for water
  • Energy calculation

Molecular Dynamics

  • Heating-Equilibration-Production molecular dynamics
  • Analysis

i) Preparation

Create a new sub-directory exercise_pbc (mkdir exercise_pbc) and change to that directory. Copy the following files from your local CHARMM program directory to your new directory:



These two files are located in the toppar directory of the standard CHARMM distribution

Download the following input files from this course:

Type ls to see the contents of the directory. Make a subdirectory scr; this subdirectory will be used for writing files during the dynamics simulation.

As in the previous exercise, the first step in any CHARMM calculation is to generate the PSF file. Start by opening the file setup-box.inp using whatever editor you like. This file holds the appropriate CHARMM commands for reading in a coordinate file for a cube of water and generating the necessary CHARMM data structures. The coordinate file is in CHARMM format rather than PDB format.

* Generate PSF for water
! read the topology and parameter file
open read formatted unit 12 name top_all22_prot.inp
read rtf card unit 12
open read formatted unit 12 name par_all22_prot.inp
read para card unit 12
! Generate water psf and coordinates
set 4 bulk ! water identified
! read box of equilibrated water
open unit 1 read formatted name tip3p.crd
read sequence unit 1 coor
rewind unit 1
! generate new segment id and psf
generate bulk noangle nodihedral
! read water box coordinates
read coor card unit 1
close unit 1
! verify location of water box
! write out the psf and the coordinates
open unit 1 write formatted name scr/@4.psf
write psf card unit 1
open unit 1 write formatted name scr/@4.pdb
write coordinate pdb unit 1
* COORdinates

The initial coordinate set (tip3p.crd) is a cube of pre-equilibrated water that was generated in an earlier calculation; the length of each side of the cube is 18.856Å. In analogy with the protein setup done in Exercise1, a segment ID is generated for the water molecules. The number of water molecules is determined by reading in the water sequence from the coordinate file (read sequence). The generate command builds the psf and assigns the segment ID bulk , for bulk water. The two additional keywords in the generate command specify that there are no angle and dihedral terms in the potential energy function for the water model. This is because the water model used here is a rigid water model. The coor statistics command verifies the position of the box; it should be near (0.0, 0.0, 0.0). Finally, the PSF and the coordinate file are written to disk. Note that in this input file, we make use of a variable definition; the variable 4 is assigned the character string bulk, which, in turn, is used (@4) for defining the name of the output files for the psf and coordinate file.

Execute this CHARMM command file by typing the following command at the UNIX prompt; the job should take less than a minute to run.

charmm < setup-box.inp > setup-box.out

If you do an ls in your directory, you will see the additional files corresponding to the psf file and the pdb file. Look at the output file and determine how many water molecules were read into CHARMM?

{ answer 216 }

You can see the output here --> setup-box.out

ii) Dynamics


The next step is to heat the system to the desired temperature; the procedure is very much like that used in the previous exercise. The difference in the present exercise, relative to Exercise 1, is that one must specify the details for the periodic boundary conditions. Periodic boundary conditions are used in the simulation of systems which include solvent as well as for infinitely repeating system such as crystals, polymers or nucleic acids such as DNA or RNA. When employing periodic boundary conditions in a simulation, specific transformations such as translations or rotations are used to define the image cells which surround the primary cell containing the system of interest (periodic boundary conditions). CHARMM has a general image facility that allows the simulation of many different crystal type as well as infinite polymers and DNA. The image facility allows translations, rotations and reflections. The IMAGE facility is invoked by reading an image transformation file. Once initiated, the images of the primary atoms will be included in all energy and force calculations for the remainder of the calculation. The image facility must be re-initiated with each subsequent calculation, for example, when one wishes to continue a particular simulation. The image transformation file is discussed below.

Shown below is the input file for heating the system to the desired temperature, heat.inp. The topology, the parameter, the PSF and the coordinate files are first read into CHARMM; the coordinate file was generated in the previous calculation, setup-box.inp. Notice that we are again using variable definitions.

* system heating to 300 K
! { parameters}
set t 300?! temperature
set n bulk1?! output files
! read the topology and parameter file
open read formatted unit 12 name top_all22_prot.inp
read rtf card unit 12
open read formatted unit 12 name par_all22_prot.inp
read para card unit 12
! { psf }
open unit 1 read formatted name scr/bulk.psf
read psf card unit 1
close unit 1
! { coordinates }
OPEN UNIT 1 READ FORMatted NAME scr/bulk.pdb
! set up periodic boundary conditions using images
!set up periodic images
!size of box, must use variable 9 to be consistent with waterbox.img
set 9 18.856
!read in image file for translation images
open unit 8 read card name waterbox.img
read imag unit 8
image byres xcen 0.0 ycen 0.0 zcen 0.0 sele all end

After reading in the preliminary files, the images facility is invoked. This involves reading a file which contains the image transformations (read imag). The next command, image, defines the manner in which centering is done. Image centering is designed primarily for solvent. During a dynamics simulation, a particular solvent molecule may become far from the rest of the primary structure. The centering features allows the image closest to the primary space to become the primary solvent molecule. This maintains a constant number of molecules in the primary cell. The centering is done relative to the coordinates XCEN, YCEN and ZCEN, which are most often chosen to be (0,0,0). The centering selection is done on the basis of individual residues, since in this case, each residue of the sequence corresponds to an individual water molecule. See the CHARMM document images.doc for a more complete discussion.

In the simulation of a periodic box of water, the transformations involve only translations of the primary box; the translation length equals the length of one side of the cube. The file waterbox.img contains the translation definitions, only a portion of the file is shown below. The scale command scales all transformations by the variable 9 which is defined prior to the reading of the image transformation file. It is set to be equal to the length of one side of the cube, or 18.856Å in the present example. For each transformation, it is required that its inverse also be defined. For example, the transformation in the x direction is defined, as well as the inverse transformation in the -x direction. There are 26 transformations required to set up the periodic boundary condition.

* IMAGE transformation file for an infinite cube with an edge
* length passed as CHARMM parameter 9
* ...
! Scale all transformations by parameter 9
SCALE @9 @9 @9
! Define the 26 image objects required for each face, edge,
! and vertex of the primary cube. To construct the image
! name, use the letters X, Y, and Z for positive x, y,
! and z directions; use the letters A, B, and C for
! negative x, y, and z directions.
! Define adjacent IMAGE in x direction
TRANS 1.0 0.0 0.0
! Define adjacent IMAGE in the -x direction
TRANS -1.0 0.0 0.0
! Define adjacent IMAGE in y direction
TRANS 0.0 1.0 0.0
! Define adjacent IMAGE in -y direction
TRANS 0.0 -1.0 0.0

Once the image facility is invoked and initialized, the SHAKE algorithm is invoked to fix the bond lengths between all hydrogen and heavy atoms. The SHAKE algorithm is used to constrain all hydrogen-heavy atom bond lengths to their default values. This is done because simulating the high frequency of the hydrogen-heavy atom bond vibration would necessitate the use of a very short time step. This, in turn, would lead to a very time consuming simulation. The nonbonded model is defined and an energy calculation done. An important rule-of-thumb when using periodic boundary conditions is that the nonbond cutoff distance should be less than 1/2 the box length. In the current example, the box length is 18.856Å and the cutoff is 9.0Å. The keyword cutim defines the cutoff distance between a primary particle and an image particle; this is the distance for deciding which interactions are included on the nonbond list. The cutoff for the energy calculation is defined by the keyword ctofnb.

! use shake
shake bonh para
! energy parameters
energy cutnb 10.0 cutim 10.0 ctonnb 8.0 ctofnb 9.0 -
vswitch shift cdie eps 1
! md run
open write formatted unit 31 name scr/@n.res
dyna leap verlet start nstep 6000 timestep 0.001 -
iseed 12310238 firstt 0 finalt 300 -
ihtfrq 10 teminc .5 -
ieqfrq 0 iasors 1 iasvel 1 -
isvfrq 500 iprfrq 500 nprint 100 -
iunrea -1 iunwri 31 iuncrd -1 -
imgfrq 10 inbfrq 10
! write out coordinates
open write card unit 20 name scr/@n.pdb
write coor pdb unit 20
* COORdinates

To run the molecular dynamics, the CHARMM DYNAmics command is used. Since molecular dynamics simulations can often require a significant amount of CPU time, the simulations are often broken up into smaller pieces. The restart file is written at the end of a segment and it allows one to continue the simulation at another time. In preparation for writing the restart file at the end of the heating segment, the file opened with write permission before the dynamics simulation is started. During the course of the simulation, the restart file is written periodically to this file. For this simulation, the Leapfrog algorithm is used to integrate the equations of motion.

The different keywords specify how the heating simulation is done:

  • The start (for Start) keyword specifies that this is the beginning of a simulation.
  • The simulation starts at 0 K and the system is heated up to 300 K using 0.5 K increments (FIRSTT, FINALT, TEMInc) every 10 fs (= 0.01 ps).
  • The simulation is run for 6000 steps using a 0.001 picosecond time steps (NSTEp, TIMEstep). In this exercise, the entire heating process takes 6 ps; longer heating periodscan also be used.
  • IHTfrq indicates that a temperature increment will be performed every 10 integration steps.
  • The velocities are periodically and randomly chosen from a Gaussian distribution (iasors 1 iasvel 1) at the new temperature. The iseed keyword initiates the random distribution.
  • The energies are written out every 100 steps (NPRInt) and their averages are calculated every 500 steps (IPRfrq).
  • IUNWrite indicates to which file the information needed to restart the trajectory in the next stage will be written, in the present example, the information is written to unit 31 which was opened in the scr subdirectory as bulk1.res.
  • The other four keywords describe features such as the initial distribution of velocities.
  • The keywords INBFrq and IMGFrq specify the frequency with which to update the nonbond interaction list and the image atom list, respectively. In the current example, these updates are done every 10 time steps.

For a more detailed description of these and other keywords, look at the CHARMM documentation (dynamc.doc) found online at


Run this CHARMM input file for heating the system.

Equilibration and production

The input files for equilibration and for the production runs are very similar to the corresponding input files from Exercise1 except that they include the commands for invoking the image facility.


After heating to 300K, equilibration is done to ensure that the system is in a stable state. Use the file equil1.inp to start this procedure. Here, the system is equilibrated for 6 ps at 300 K. The trajectory is restarted from the previous heating stage. In this case, the goal is to keep the system at the target temperature of 300K. To do so, , the temperature is checked periodically and if it falls outside the window of ±5K (in this example) of the desired temperature, the velocities are scaled in such a way as to bring the temperature back around the target temperature.

Execute this and the subsequent equilibration file, equil2.inp. Carry out the same energy analysis as you did in the previous exercise starting with the heating stage. Plot the total energy, kinetic energy, potential energy and temperature as a function of time. Has the system reached a reasonably steady state? energy plot

Assess the stability of the system by comparing the rms fluctuations in the energy to the mean average energy. One measure of the stability is the fluctuation of the total energy. The simulations done here should conserve the total energy of the system and the fluctuations should be very small. Often, the ratio between the average RMS fluctuation of the total energy and the average total energy is calculated; if it less than 10-4 for the type of simulation done here, the energy presumed to be stable.

An awk script (energy.awk) is provided with which you can use to calculate the mean total energy and the fluctuations. To execute it type,

awk -f energy.awk <grep_output>

What is the quality of the simulation based of the fluctuations of the total energy?


The subsequent part of the simulation is the production phase where the system is at equilibrium. No modification of the velocities are done in this type of simulation where we keep the number of particle (N), the volume (V) and the energy (E) constant; this is an constant NVE simulation. To which ensemble does this correspond? (ensembles).

Execute the input file run1.inp and analysis the energy fluctuations of the simulation. Is this a stable simulation of pure water? [ It should be if all was done correctly]. To continue the simulation, modify the run1.inp input file. Remember to change the names of the restart files and coordinate files reflect the next stage of the simulation.