Note to self: here's how you identified disordered residues in the NMR ensemble 2KCU.pdb
1. In Pymol: "fetch 2kcu"
2. Action > align > states (*/CA)
2016.08.07 update: the above command also aligns the tails. Use "intra_fit (2kcu///6-158/CA)" instead.
3. "save 2kcu_aligned.pdb, state=0"
4. In terminal: grep CA 2kcu_aligned.pdb > lis
5. python disorder.py (see the consolidated sketch right after this list)
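Putting that together, steps 1-5 as one sequence (just the commands above collected in one place; PyMOL first, then the terminal, using the updated intra_fit selection):

# In PyMOL
fetch 2kcu
intra_fit (2kcu///6-158/CA)
save 2kcu_aligned.pdb, state=0

# In a terminal
grep CA 2kcu_aligned.pdb > lis
python disorder.py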
disorder.py (given below) calculates, across the models of the ensemble, the standard deviation of the x, y, and z coordinates of each CA atom (\sigma_{x,i}, \sigma_{y,i}, \sigma_{z,i}). It then averages these three standard deviations for each CA atom (\sigma_i). To find outliers, it takes the mean of these values over the protein, \langle \sigma_i \rangle, and the standard deviation of the \sigma_i values, \sigma_{\langle \sigma_i \rangle} (in the script both are computed over residues 6-158, i.e. with the tails excluded). Any residue for which \sigma_i > \langle \sigma_i \rangle + \sigma_{\langle \sigma_i \rangle} is flagged as disordered.
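In symbols (just restating the above; N is the number of models and \bar{x}_i etc. are the means over the models):

\sigma_{x,i} = \sqrt{\tfrac{1}{N}\sum_{m=1}^{N}\left(x_{m,i} - \bar{x}_i\right)^2} \ (\text{likewise for } y, z), \qquad \sigma_i = \tfrac{1}{3}\left(\sigma_{x,i} + \sigma_{y,i} + \sigma_{z,i}\right)

Residue i is flagged as disordered if \sigma_i > \langle \sigma_i \rangle + \sigma_{\langle \sigma_i \rangle}.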
Here I've colored the disordered residues red (I haven't updated the picture to reflect the Step 2 change yet).
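The colouring itself is a single PyMOL command (sketch; the residue numbers here are hypothetical placeholders for whatever disorder.py prints):

color red, 2kcu and resi 12+47+113   # hypothetical list; substitute disorder.py's output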
Yes, I know: "the 1970's called and want their Fortran code back". How very droll.

# disorder.py
import math

import numpy as np

models = 20        # number of models in the NMR ensemble
res = 166          # number of CA lines per model in "lis"
filename = "lis"

# coord[i][j] = [x, y, z] of residue j's CA atom in model i
coord = [[[0.0] * 3 for _ in range(res)] for _ in range(models)]
ave = [[0.0] * 3 for _ in range(res)]    # mean x, y, z of each CA over the models
stdev = [[0.0] * 3 for _ in range(res)]  # std of x, y, z of each CA over the models

with open(filename, 'r') as f:
    for i in range(models):
        for j in range(res):
            words = f.readline().split()
            coord[i][j][0] = float(words[6])   # x
            coord[i][j][1] = float(words[7])   # y
            coord[i][j][2] = float(words[8])   # z
            for k in range(3):
                ave[j][k] += coord[i][j][k] / models

# population variance, then standard deviation, of each coordinate across the models
for i in range(models):
    for j in range(res):
        for k in range(3):
            stdev[j][k] += (coord[i][j][k] - ave[j][k]) ** 2 / models

ave_stdev = []   # sigma_i: mean of sigma_x, sigma_y, sigma_z for each residue
for j in range(res):
    temp = 0.0
    for k in range(3):
        stdev[j][k] = math.sqrt(stdev[j][k])
        temp += stdev[j][k]
    ave_stdev.append(temp / 3.0)

# tails are removed: keep residues 6-158 (0-based indices 5..157)
start_res = 5    # residue 6, counting from 0
end_res = 158    # slice end; the last residue kept is 158
v = np.asarray(ave_stdev[start_res:end_res])

# a residue is disordered if its sigma_i exceeds the mean by more than one std
for j in range(start_res, end_res):
    if ave_stdev[j] > np.mean(v) + np.std(v):
        print(j + 1)   # 1-based residue number
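For future reference, a roughly equivalent NumPy-vectorized version (a sketch under the same assumptions: "lis" holds models*res CA lines in model order, with x, y, z in whitespace-split columns 7-9):

import numpy as np

models, res = 20, 166
xyz = np.loadtxt("lis", usecols=(6, 7, 8)).reshape(models, res, 3)  # (model, residue, xyz)

# sigma_i: population std of each coordinate across models, averaged over x, y, z
sigma = xyz.std(axis=0).mean(axis=1)

core = slice(5, 158)                                   # residues 6-158, tails removed
cutoff = sigma[core].mean() + sigma[core].std()
print(np.arange(res)[core][sigma[core] > cutoff] + 1)  # 1-based residue numbers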

This work is licensed under a Creative Commons Attribution 4.0 International License.