pikie - 1 month ago
Python Question

Make several lists matching several ranges in python

I have spent a long time on a problem in my code and still can't fix it.

I have a file like this:

ATOM 1375 N PHE F 411 81.522 91.212 98.734 1.00 0.00 N
ATOM 1376 H PHE F 411 82.393 91.667 97.546 1.00 0.00 H
ATOM 1377 CA PHE F 411 80.451 91.974 95.377 1.00 0.00 C
ATOM 1378 CB PHE F 411 80.968 93.339 100.842 1.00 0.00 C
ATOM 1379 CG PHE F 411 81.813 93.277 102.083 1.00 0.00 C
ATOM 1381 HD1 PHE F 411 83.566 92.729 105.124 1.00 0.00 H


What I want to do is group the lines by the value in the eighth column (index 8 after line.split()), then take the corresponding values from the sixth column (index 6) and find their maximum and minimum for each group.

Like this:

Group 1 8th column values from 95 to 100

ATOM 1375 N PHE F 411 81.522 91.212 98.734 1.00 0.00 N

ATOM 1376 H PHE F 411 82.393 91.667 97.546 1.00 0.00 H

ATOM 1377 CA PHE F 411 80.451 91.974 95.377 1.00 0.00 C


My desired output is:

Min-80.451

Max-82.393


Second group - from 100 to 105

Min - 80.968

Max - 83.566


And so on and so forth

This is my code in python:

def get_x(file):
    x = []
    for line in file:
        new_line = line.split()
        for z in range(80, 140, 2):
            if float(new_line[8]) > z and float(new_line[8]) < z + 2:
                x.append(float(new_line[6]))
            else:
                pass
    maxx = max(x)
    minx = min(x)
    print(maxx, minx)


The output I get is the minimum and the maximum of all the values, like:

Min- 80.451 ; Max- 83.566


Any help is appreciated :)

Thanks in advance

Answer

Here is some code that does what you are looking for. I would look at the Python documentation for map and filter, which are really handy for this kind of task.

text = """ATOM   1375  N   PHE F 411      81.522  91.212  98.734  1.00  0.00           N
ATOM   1376  H   PHE F 411      82.393  91.667  97.546  1.00  0.00           H
ATOM   1377  CA  PHE F 411      80.451  91.974  95.377  1.00  0.00           C
ATOM   1378  CB  PHE F 411      80.968  93.339 100.842  1.00  0.00           C
ATOM   1379  CG  PHE F 411      81.813  93.277 102.083  1.00  0.00           C
ATOM   1381  HD1 PHE F 411      83.566  92.729 105.124  1.00  0.00           H """
lines = text.split('\n')

data = [line.split() for line in lines]

groups = [
    (95, 100),
    (100, 105),
]

for i, (gMin, gMax) in enumerate(groups):
    # Keep rows whose field 8 lies in the half-open range [gMin, gMax)
    results = filter(lambda x: gMin <= float(x[8]) < gMax, data)
    # Materialise as a list so min() and max() can both read it
    results = list(map(lambda x: float(x[6]), results))

    print("Group", i+1, "8th column values from", gMin, "to", gMax)
    print("Min -", min(results))
    print("Max -", max(results))
    print("")

Here is the output:

Group 1 8th column values from 95 to 100
Min - 80.451
Max - 82.393

Group 2 8th column values from 100 to 105
Min - 80.968
Max - 81.813

This code is inefficient because it rescans every line for each group rather than removing the lines already matched by a previous group. If you need the added efficiency, let me know.
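
The inefficiency mentioned above can be avoided by reading each line once and dropping its value into the first group whose range matches. The following is only a minimal sketch, not a tuned implementation: the function name min_max_by_group and the parameters key_col and value_col are made up for illustration, and it assumes zero-based field indices 8 and 6 and half-open ranges, as in the code above.

```python
from collections import defaultdict


def min_max_by_group(lines, groups, key_col=8, value_col=6):
    """Return {group_index: (min, max)} from whitespace-separated lines.

    Hypothetical helper: key_col and value_col are zero-based field
    indices (assumed to match x[8] and x[6] above), and each group is a
    half-open (low, high) range, i.e. low <= key < high.
    """
    values = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) <= max(key_col, value_col):
            continue  # skip blank or short lines
        key = float(fields[key_col])
        for index, (low, high) in enumerate(groups):
            if low <= key < high:
                values[index].append(float(fields[value_col]))
                break  # a line belongs to at most one group
    # Only groups that received at least one value appear in the result
    return {index: (min(v), max(v)) for index, v in values.items()}


sample = """ATOM   1375  N   PHE F 411      81.522  91.212  98.734  1.00  0.00           N
ATOM   1376  H   PHE F 411      82.393  91.667  97.546  1.00  0.00           H
ATOM   1377  CA  PHE F 411      80.451  91.974  95.377  1.00  0.00           C
ATOM   1378  CB  PHE F 411      80.968  93.339 100.842  1.00  0.00           C
ATOM   1379  CG  PHE F 411      81.813  93.277 102.083  1.00  0.00           C
ATOM   1381  HD1 PHE F 411      83.566  92.729 105.124  1.00  0.00           H"""

result = min_max_by_group(sample.splitlines(), [(95, 100), (100, 105)])
for index in sorted(result):
    lowest, highest = result[index]
    print("Group", index + 1, "Min -", lowest, "Max -", highest)
```

With the sample rows this reports the same minima and maxima as the output above. The row with 105.124 lands in neither group because the ranges are half-open, which is why the maximum for the second group is 81.813 rather than 83.566. To read from a file instead, an open file object can be passed as lines.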