structure @native.psf end dynamics analysis pick dihedral ( residue 8 and name n and segid="A ") ( residue 8 and name ca and segid="A ") ( residue 8 and name cb and segid="A ") ( residue 8 and name cg and segid="A ") geometry asci=false input=native.dcd begin=0 skip=1 stop=2000000 output=chi1_8A.list end stop
structure @native.psf end dynamics analysis pick dihedral ( residue 8 and name ca and segid="A ") ( residue 8 and name cb and segid="A ") ( residue 8 and name cg and segid="A ") ( residue 8 and name cd1 and segid="A ") geometry asci=false input=native.dcd begin=0 skip=1 stop=2000000 output=chi2_8A.list end stop
This way, you have two ASCII files ( chi1_8A.list and chi2_8A.list ) each containing two columns. The first one is an attempt of xplor to convert frames into time and the second is the calculated dihedral angles of the specific residue for each frame. What you need to do is to make out of them one file containing also two columns, where the first one is the χ1 values and the second one the χ2. Also it would be better if the values of the angles are between 0 and 360 degrees. Probably xplor will give values from -180 to 180. A way to do that is, for example, to write a program in C to make the conversions of the angles for you. The program's source code could be something in the spirit of:
#include <stdio.h> #include <math.h> main () { float val1; float val2; while ( scanf ( "%f %f", &val1, &val2)== 2 ) { if ( val2 > 0.0 ) printf( "%15.6f %15.6f\n", val1, val2); else { val2+=360.0; printf( "%15.6f %15.6f\n", val1, val2); } } }
Assuming you have done this job for all the residues you need, you now have a number of postscript files from which you can obtain an nice image of the rotamers of each residue.
So, now you have a file containing a list of χ1,χ2 values for each frame. We can now calculate what percentage of simulation time is spend in each rotamer. To find this we will have to perform a cluster analysis of the χ1-χ2 distribution and count the number of data points in each cluster. This is easier than it sounds :
- If you have more than ~3000 frames you will have to reduce the number of points, otherwise the physical memory will not be adequate. The easiest method is to use a macro-capable editor to delete, say, all data points corresponding to even numbered frames. Repeat the procedure until you have a sufficiently small number of data points.
- Use wc to find how many pairs of values you have. Note this number.
- The rest is R :
R : Copyright 2004, The R Foundation for Statistical Computing Version 2.0.1 (2004-11-15), ISBN 3-900051-07-0 R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for a HTML browser interface to help. Type 'q()' to quit R. > A <- matrix(scan("chi1_chi2.list", n = 3217*2), 3217, 2, byrow = TRUE) Read 6434 items > library(cluster) > p <- pam( A, 4) > plot(p)
giving :
- Repeat the procedure with different number of clusters (it is the '4' in the line "p <- pam( A, 4)").
- Write down the number of points that belong to the various clusters (shown in the last graph).
- In the R console type p and note down the χ1-χ2 centroids of the clusters. Confirm that the results are reasonable. Calculate percentages. You are done.