MBG wiki: Editing Calculation of water and ion distributions

The object of the exercise is to generate a map showing the distribution of waters (hydration pattern) and ions around the solute. This map will pretty much look like a crystallographic electron density map, the difference being that instead of electron density it will show average water (or ion) density. An example of what we are trying to produce is shown below (taken from Reddy, Lecelrc & Karplus, /Biophysical J/, 84, 2003) :

[[image: water and ions distribution around a DNA decamer]]

The contours depict the average water oxygen (yellow) and sodium (green) distributions around the MD-averaged structure of a DNA decamer.

Before starting with the practical issues, you should be warned about the two most notable shortcomings of the suggested procedure :

1. The method will _fail_ if the solute diffused away from the water box. The reason for this is that the periodic boundary periodicity is _not_ being taken into account (and the reason for this is programmer's lazyness).

The easiest way to check that the solute does stay inside the solvent box is to view the trajectory with /carma/ (See also [[Quick view of large trajectory (DCD) files]]). For example, giving something in the spirit of

<code>
carma -gl 80.0 -trace -atmid " OH2 " -atmid " CA " my.psf my.dcd
</code>

will pop-up a window showing the trajectory. Use the x, y, z keys to change the view. You should be seeing something like :

[[image: Snapshots from carma showing a trajectory]]

2. The method will _fail_ if the solute does not have a well behaved average structure. In other words, if the structure of the solute fluctuates too much, you should view the results with scepticism (or, repeat the calculation using shorter trajectory segments).

There are two ways of doing the calculation (both with /carma/). The first requires the creation of just one index file *but creates a large intermediate DCD file*. The second is doing the calculation on the fly (no intermediate files created) but requires you to create two index files (one for the atoms that will be used for fitting, the other for the atoms for which the density map will be build). You should probably use the second method.

== Method 1 (creates large intermediate DCD file) ==

* Start from a DCD file containing all atoms of your system (and its corresponding PSF). The first frame of your DCD should be the first frame from the heating phase (the heat_out.dcd). The reason is that the first frame from the heating phase is almost identical with the initial system you've prepared, and thus, the axes of inertia of the molecule are aligned with the orthogonal frame. We want the axes of inertia aligned in order to keep the size of the map to a minimum. You can artificially add the first frame from heat_out.dcd to an existing DCD using /catdcd/. You should probably avoid including the frames corresponding to the initial minimisation and heating-up phase (but you can include all frames after the final temperature is reached).

* The first step is to align all frames of the trajectory using the solute atoms, *but keeping all ions and water oxygens*. You can do this with carma by using its -/fit/ *and* -/index/ keywords :

<code>
carma -v -fit -index -atmid " OH2 " -atmid " SOD " -atmid " CLA " -atmid " CA " my.dcd my.psf
</code>

where the fit.index file (which should be present in the current directory) should contain indeces for the Cα atoms{{If you have, say, 112 Cα atoms, give

and save the resulting lines in the fit.index file.
}}.

* Rename the DCD file prepared by /carma/ : mv carma.fitted.dcd waters_ions.dcd

* Now prepare a PSF file for the reduced system using a /xplor/ script :

<code>
structure @my.psf end
delete selection=( not (name OH2 or name CA or name CLA or name SOD)) end
write structure output=waters_ions.psf end
stop
</code>

* You should now determine the limits on /x/, /y/, /z/ inside which a map will be calculated. If you did what suggested above (and the first frame of your DCD is the first frame from the heating phase), you can calculate these limits using /pdbset/ :

** Find the PDB file containing the solute coordinates from /psfgen/ and just before running /autoionize/ (something like psfgen.pdb).

** Run /pdbset/ : pdbset xyzin psfgen.pdb  (and type END).

** Find the limits of the molecule in the output :

<code>
 Orthogonal Coordinate limits in output file:
 Minimum Maximum Centre Range
 On X : -24.30 24.30 0.00 48.60
 On Y : -15.05 15.05 0.00 30.11
 On Z : -10.98 15.87 2.44 26.85
</code>

** You will have to add a border around it, so in this example the limits would be something like -30.0 to 30.0 on /x/, -20.0 to 20.0 on /y/ and -15.0 to 20.0 on /z/. Write them down.

* Calculate the hydration map :

<code>
carma -v -map -30 30 -20 20 -15 20 0.50 -atmid " OH2 " waters_ions.psf waters_ions.dcd
</code>
where the numbers after '-map' are the limits of the map and the grid size (in this case 0.50 Angstrom).

== Method 2 (requires the creation of two index files) ==

* You will need to prepare two index files. The first (*fit.index*) will contain the indeces for the solute atoms that will be used to remove overall rotations and translations. The second index file (*fmap.index*) should contain the indeces of the atoms for which the distribution map will be calculated (eg. water oxygens). Both index files should be zero-offset. An easy way to prepare these files (assuming that you start from /my.dcd/ and /my.psf/ which describe the whole system) is :

** Prepare a /xplor/ script to select the atoms that will be used for fitting *plus* the atoms for which you want to calculate the distribution map :
<code>
structure @my.psf end
delete selection=( not (name OH2 or name CA)) end
write structure output=waters_CAs.psf end
stop
</code>

** Examine the waters_CAs.psf file. Write down the atom number range corresponding to the two classes of atoms (for example, 1-112 for Cα atoms, 113-4000 for water oxygens).

** Use *awk* to prepare the index files. You need something in the spirit of :

** Do not forget to edit the index files and confirm that the numbers you have inside are indeed what you've wanted.

* Determine the limits on /x/, /y/, /z/ inside which a map will be calculated. If you did what suggested above (and the first frame of your DCD is the first frame from the heating phase), you can calculate these limits using /pdbset/ :

** Find the PDB file containing the solute coordinates from /psfgen/ just before running /autoionize/ (something like psfgen.pdb).

** Run /pdbset/ : pdbset xyzin psfgen.pdb  (and type END).

** Find the limits of the molecule in the output :

** You will have to add a border around it, so in this example the limits would be something like -30.0 to 30.0 on /x/, -20.0 to 20.0 on /y/ and -15.0 to 20.0 on /z/. Write them down.

* Calculate the hydration map :

<code>
carma -v -fit -index -fmap -30 30 -20 20 -15 20 0.50 -atmid " OH2 " -atmid " CA " my.psf my.dcd
</code>

where the numbers after '-fmap' are the limits of the map and the grid size (in this case 0.50 Angstrom).

== Viewing the map ==

The output from /carma/ will be a map file named carma.fitted.dcd.map. This is an electron density format used by /xplor/ and /cns/. For vmd to understand the format you will have to rename it so that its suffix is .edm. So :

<code>
mv carma.fitted.dcd.map carma.edm
vmd carma.edm
</code>

You will see nothing. Go to 'representations', select 'isosurface', select 'None' (instead of 'Box'), click 'create replication', and change the contouring level ('isolevel'). You should see it.

You can now quit /vmd/, use xplor to calculate an average solute structure (for the same frames) and then use /vmd/ to view and study the model + map. If you have a crystal structure for your molecule, you could even compare the experimentally observed and the MD-derived hydration patterns.

[[image: MD-derived hydration pattern example]]

[[image: MD-derived hydration pattern of zwitterionic AFA tripeptide at 280K]]

One other interesting analysis you can perform is whether the distribution of waters (or ions) converged. A simple-minded way of doing this is to calculate the distributions for two non-overlapping parts of the trajectory (which will have to be a sufficiently long one). If the two maps (from the two trajectory segments) do look alike (say, have a high value of the linear correlation coefficient) then you can argue that the distribution converged.

Summary:

This change is a minor edit.

Username:

Replace this text with a file.