Praxeolitic - 8 months ago 36

Python Question

What suggestions do people have for quickly calculating dihedral angles in Python?

In the diagrams, phi is the dihedral angle:

What's your best for calculating angles in the range 0 to pi? What about 0 to 2pi?

"Best" here means some mix of fast and numerically stable. Methods that return values over the full range 0 to 2pi are preferred but if you have an incredibly fast way of calculating the dihedral over 0 to pi share that too.

Here are my 3 best efforts. Only the 2nd one returns angles between 0 and 2pi. It's also the slowest.

**General comments about my approaches:**

arccos() in Numpy seems plenty stable but since people raise this issue I may just not fully understand it.

The use of einsum came from here. Why is numpy's einsum faster than numpy's built in functions?

The diagrams and some inspiration came from here. How do I calculate a dihedral angle given Cartesian coordinates?

**The 3 approaches with comments:**

`import numpy as np`

from time import time

# This approach tries to minimize magnitude and sqrt calculations

def dihedral1(p):

# Calculate vectors between points, b1, b2, and b3 in the diagram

b = p[:-1] - p[1:]

# "Flip" the first vector so that eclipsing vectors have dihedral=0

b[0] *= -1

# Use dot product to find the components of b1 and b3 that are not

# perpendicular to b2. Subtract those components. The resulting vectors

# lie in parallel planes.

v = np.array( [ v - (v.dot(b[1])/b[1].dot(b[1])) * b[1] for v in [b[0], b[2]] ] )

# Use the relationship between cos and dot product to find the desired angle.

return np.degrees(np.arccos( v[0].dot(v[1])/(np.linalg.norm(v[0]) * np.linalg.norm(v[1]))))

# This is the straightforward approach as outlined in the answers to

# "How do I calculate a dihedral angle given Cartesian coordinates?"

def dihedral2(p):

b = p[:-1] - p[1:]

b[0] *= -1

v = np.array( [ v - (v.dot(b[1])/b[1].dot(b[1])) * b[1] for v in [b[0], b[2]] ] )

# Normalize vectors

v /= np.sqrt(np.einsum('...i,...i', v, v)).reshape(-1,1)

b1 = b[1] / np.linalg.norm(b[1])

x = np.dot(v[0], v[1])

m = np.cross(v[0], b1)

y = np.dot(m, v[1])

return np.degrees(np.arctan2( y, x ))

# This one starts with two cross products to get a vector perpendicular to

# b2 and b1 and another perpendicular to b2 and b3. The angle between those vectors

# is the dihedral angle.

def dihedral3(p):

b = p[:-1] - p[1:]

b[0] *= -1

v = np.array( [np.cross(v,b[1]) for v in [b[0], b[2]] ] )

# Normalize vectors

v /= np.sqrt(np.einsum('...i,...i', v, v)).reshape(-1,1)

return np.degrees(np.arccos( v[0].dot(v[1]) ))

dihedrals = [ ("dihedral1", dihedral1), ("dihedral2", dihedral2), ("dihedral3", dihedral3) ]

`# Testing arccos near 0`

# Answer is 0.000057

p1 = np.array([

[ 1, 0, 0 ],

[ 0, 0, 0 ],

[ 0, 0, 1 ],

[ 0.999999, 0.000001, 1 ]

])

# +x,+y

p2 = np.array([

[ 1, 0, 0 ],

[ 0, 0, 0 ],

[ 0, 0, 1 ],

[ 0.1, 0.6, 1 ]

])

# -x,+y

p3 = np.array([

[ 1, 0, 0 ],

[ 0, 0, 0 ],

[ 0, 0, 1 ],

[-0.3, 0.6, 1 ]

])

# -x,-y

p4 = np.array([

[ 1, 0, 0 ],

[ 0, 0, 0 ],

[ 0, 0, 1 ],

[-0.3, -0.6, 1 ]

])

# +x,-y

p5 = np.array([

[ 1, 0, 0 ],

[ 0, 0, 0 ],

[ 0, 0, 1 ],

[ 0.6, -0.6, 1 ]

])

for d in dihedrals:

name = d[0]

f = d[1]

print "%s: %12.6f %12.6f %12.6f %12.6f %12.6f" \

% (name, f(p1), f(p2), f(p3), f(p4), f(p5))

def profileDihedrals(f):

t0 = time()

for i in range(20000):

p = np.random.random( (4,3) )

f(p)

p = np.random.randn( 4,3 )

f(p)

return(time() - t0)

print "dihedral1: ", profileDihedrals(dihedral1)

print "dihedral2: ", profileDihedrals(dihedral2)

print "dihedral3: ", profileDihedrals(dihedral3)

`dihedral1: 0.000057 80.537678 116.565051 116.565051 45.000000`

dihedral2: 0.000057 80.537678 116.565051 -116.565051 -45.000000

dihedral3: 0.000057 80.537678 116.565051 116.565051 45.000000

dihedral1: 2.79781794548

dihedral2: 3.74271392822

dihedral3: 2.49604296684

As you can see in the benchmarking, the last one tends to be the fastest while the second one is the only one that returns angles from the full range of 0 to 2pi since it uses arctan2.

Answer

Here's an implementation for torsion angle over the full 2pi range that is a bit faster, doesn't resort to numpy quirks (einsum being mysteriously faster than logically equivalent code), and is easier to read.

There's even a bit more than just hacks going on here -- the math is different too. The formula used in the question's `dihedral2`

uses 3 square roots and 1 cross product, the formula on Wikipedia uses 1 square root and 3 cross products, but the formula used in the function below uses only 1 square root and 1 cross product. This is probably as simple as the math can get.

Functions with 2pi range function from question, Wikipedia formula for comparison, and the new function:

dihedrals.py

```
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import numpy as np
def old_dihedral2(p):
"""http://stackoverflow.com/q/20305272/1128289"""
b = p[:-1] - p[1:]
b[0] *= -1
v = np.array( [ v - (v.dot(b[1])/b[1].dot(b[1])) * b[1] for v in [b[0], b[2]] ] )
# Normalize vectors
v /= np.sqrt(np.einsum('...i,...i', v, v)).reshape(-1,1)
b1 = b[1] / np.linalg.norm(b[1])
x = np.dot(v[0], v[1])
m = np.cross(v[0], b1)
y = np.dot(m, v[1])
return np.degrees(np.arctan2( y, x ))
def wiki_dihedral(p):
"""formula from Wikipedia article on "Dihedral angle"; formula was removed
from the most recent version of article (no idea why, the article is a
mess at the moment) but the formula can be found in at this permalink to
an old version of the article:
https://en.wikipedia.org/w/index.php?title=Dihedral_angle&oldid=689165217#Angle_between_three_vectors
uses 1 sqrt, 3 cross products"""
p0 = p[0]
p1 = p[1]
p2 = p[2]
p3 = p[3]
b0 = -1.0*(p1 - p0)
b1 = p2 - p1
b2 = p3 - p2
b0xb1 = np.cross(b0, b1)
b1xb2 = np.cross(b2, b1)
b0xb1_x_b1xb2 = np.cross(b0xb1, b1xb2)
y = np.dot(b0xb1_x_b1xb2, b1)*(1.0/np.linalg.norm(b1))
x = np.dot(b0xb1, b1xb2)
return np.degrees(np.arctan2(y, x))
def new_dihedral(p):
"""Praxeolitic formula
1 sqrt, 1 cross product"""
p0 = p[0]
p1 = p[1]
p2 = p[2]
p3 = p[3]
b0 = -1.0*(p1 - p0)
b1 = p2 - p1
b2 = p3 - p2
# normalize b1 so that it does not influence magnitude of vector
# rejections that come next
b1 /= np.linalg.norm(b1)
# vector rejections
# v = projection of b0 onto plane perpendicular to b1
# = b0 minus component that aligns with b1
# w = projection of b2 onto plane perpendicular to b1
# = b2 minus component that aligns with b1
v = b0 - np.dot(b0, b1)*b1
w = b2 - np.dot(b2, b1)*b1
# angle between v and w in a plane is the torsion angle
# v and w may not be normalized but that's fine since tan is y/x
x = np.dot(v, w)
y = np.dot(np.cross(b1, v), w)
return np.degrees(np.arctan2(y, x))
```

The new function would probably be a bit more conveniently called with 4 separate arguments but it to match the signature in the original question it simply immediately unpacks the argument.

Code for testing:

test_dihedrals.ph

```
from dihedrals import *
# some atom coordinates for testing
p0 = np.array([24.969, 13.428, 30.692]) # N
p1 = np.array([24.044, 12.661, 29.808]) # CA
p2 = np.array([22.785, 13.482, 29.543]) # C
p3 = np.array([21.951, 13.670, 30.431]) # O
p4 = np.array([23.672, 11.328, 30.466]) # CB
p5 = np.array([22.881, 10.326, 29.620]) # CG
p6 = np.array([23.691, 9.935, 28.389]) # CD1
p7 = np.array([22.557, 9.096, 30.459]) # CD2
# I guess these tests do leave 1 quadrant (-x, +y) untested, oh well...
def test_old_dihedral2():
assert(abs(old_dihedral2(np.array([p0, p1, p2, p3])) - (-71.21515)) < 1E-4)
assert(abs(old_dihedral2(np.array([p0, p1, p4, p5])) - (-171.94319)) < 1E-4)
assert(abs(old_dihedral2(np.array([p1, p4, p5, p6])) - (60.82226)) < 1E-4)
assert(abs(old_dihedral2(np.array([p1, p4, p5, p7])) - (-177.63641)) < 1E-4)
def test_new_dihedral1():
assert(abs(wiki_dihedral(np.array([p0, p1, p2, p3])) - (-71.21515)) < 1E-4)
assert(abs(wiki_dihedral(np.array([p0, p1, p4, p5])) - (-171.94319)) < 1E-4)
assert(abs(wiki_dihedral(np.array([p1, p4, p5, p6])) - (60.82226)) < 1E-4)
assert(abs(wiki_dihedral(np.array([p1, p4, p5, p7])) - (-177.63641)) < 1E-4)
def test_new_dihedral2():
assert(abs(new_dihedral(np.array([p0, p1, p2, p3])) - (-71.21515)) < 1E-4)
assert(abs(new_dihedral(np.array([p0, p1, p4, p5])) - (-171.94319)) < 1E-4)
assert(abs(new_dihedral(np.array([p1, p4, p5, p6])) - (60.82226)) < 1E-4)
assert(abs(new_dihedral(np.array([p1, p4, p5, p7])) - (-177.63641)) < 1E-4)
```

Code for timing:

time_dihedrals.py

```
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from dihedrals import *
from time import time
def profileDihedrals(f):
t0 = time()
for i in range(20000):
p = np.random.random( (4,3) )
f(p)
p = np.random.randn( 4,3 )
f(p)
return(time() - t0)
print("old_dihedral2: ", profileDihedrals(old_dihedral2))
print("wiki_dihedral: ", profileDihedrals(wiki_dihedral))
print("new_dihedral: ", profileDihedrals(new_dihedral))
```

The functions can be tested with pytest as `pytest ./test_dihedrals.py`

.

Timing results:

```
./time_dihedrals.py
old_dihedral2: 1.6442952156066895
wiki_dihedral: 1.3895585536956787
new_dihedral: 0.8703620433807373
```

`new_dihedral`

is about twice as fast as `old_dihedral2`

.

...you can also see that the hardware used for this answer is a lot beefier than the hardware used in the question (3.74 vs 1.64 for `dihedral2`

) ;-P

If you want to get even more aggressive you can use pypy. At the time of writing pypy doesn't support `numpy.cross`

but you can just use a cross product implemented in python instead. For a 3-vector cross product the C pypy generates is probably at least as good as what numpy uses. Doing so gets the time down to 0.60 for me but at this we're wading into silly hax.

Same benchmark but with same hardware as used in the question:

```
old_dihedral2: 3.0171279907226562
wiki_dihedral: 3.415065050125122
new_dihedral: 2.086946964263916
```