MBG wiki: χ-value analysis (including χ1-χ2 plots and residence times)

In order to examine what rotamers a particular residue occupies during the trajectory, you need to calculate as much dihedrals as you need. If, for example, you are interested in Leu, then all you need is dihedrals χ1 and χ2 to specify a rotamer. Also, you need to know four atoms to specify a dihedral angle. A way to do that is via a script for xplor. Here is an example for χ1 and χ2 dihedrals.

structure @native.psf end

dynamics analysis
    pick dihedral
   ( residue 8 and name n   and segid="A   ")
   ( residue 8 and name ca  and segid="A   ")
   ( residue 8 and name cb  and segid="A   ")
   ( residue 8 and name cg  and segid="A   ")  geometry

  asci=false
  input=native.dcd
  begin=0 skip=1 stop=2000000

  output=chi1_8A.list

end

stop

structure @native.psf end

dynamics analysis
    pick dihedral
   ( residue 8 and name ca    and segid="A   ")
   ( residue 8 and name cb    and segid="A   ")
   ( residue 8 and name cg    and segid="A   ")
   ( residue 8 and name cd1   and segid="A   ")  geometry

  asci=false
  input=native.dcd
  begin=0 skip=1 stop=2000000

  output=chi2_8A.list

end

stop

This way, you have two ASCII files ( chi1_8A.list and chi2_8A.list ) each containing two columns. The first one is an attempt of xplor to convert frames into time and the second is the calculated dihedral angles of the specific residue for each frame. What you need to do is to make out of them one file containing also two columns, where the first one is the χ1 values and the second one the χ2. Also it would be better if the values of the angles are between 0 and 360 degrees. Probably xplor will give values from -180 to 180. A way to do that is, for example, to write a program in C to make the conversions of the angles for you. The program's source code could be something in the spirit of:

#include <stdio.h>
#include <math.h>

main ()
{
  float val1;
  float val2;

while ( scanf ( "%f %f", &val1, &val2)== 2 )
    
    {
      if ( val2 > 0.0 )
       printf( "%15.6f %15.6f\n", val1, val2);

      else
        { 
          val2+=360.0;
          printf( "%15.6f %15.6f\n", val1, val2);
                                                  }
                                                   
                                                      }
}

Assuming you have done this job for all the residues you need, you now have a number of postscript files from which you can obtain an nice image of the rotamers of each residue.

So, now you have a file containing a list of χ1,χ2 values for each frame. We can now calculate what percentage of simulation time is spend in each rotamer. To find this we will have to perform a cluster analysis of the χ1-χ2 distribution and count the number of data points in each cluster. This is easier than it sounds :

If you have more than ~3000 frames you will have to reduce the number of points, otherwise the physical memory will not be adequate. The easiest method is to use a macro-capable editor to delete, say, all data points corresponding to even numbered frames. Repeat the procedure until you have a sufficiently small number of data points.

Use wc to find how many pairs of values you have. Note this number.

The rest is R :

R : Copyright 2004, The R Foundation for Statistical Computing
Version 2.0.1  (2004-11-15), ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for a HTML browser interface to help.
Type 'q()' to quit R.

> A <- matrix(scan("chi1_chi2.list", n = 3217*2), 3217, 2, byrow = TRUE)
Read 6434 items
> library(cluster)
> p <- pam( A, 4)
> plot(p)

giving :

Repeat the procedure with different number of clusters (it is the '4' in the line "p <- pam( A, 4)").

Write down the number of points that belong to the various clusters (shown in the last graph).

In the R console type p and note down the χ1-χ2 centroids of the clusters. Confirm that the results are reasonable. Calculate percentages. You are done.