ifreak - 1 year ago
Python Question

Manual adjustment of a Newick tree and usage of ETE

I have a problem which seems to be too weird for me.

I have this Newick tree:

(((637,5250,607,14782)6942,641)6441)0;

When I try to read it using ETE:

from ete2 import Tree
t = Tree("(((637,5250,607,14782)6942,641)6441)0;", format=8)

everything works normally. Now I want to make the tree bifurcating, so the new tree should be something like:

(((((637,5250),607),14782)6942,641)6441)0;

When I try to read it using the same syntax as above:

t = Tree("(((((637,5250),607),14782)6942,641)6441)0;", format=8)

I get this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/site-packages/ete2-2.1rev539-py2.7.egg/ete2/coretype/tree.py", line 200, in __init__
    read_newick(newick, root_node = self, format=format)
  File "/usr/lib/python2.7/site-packages/ete2-2.1rev539-py2.7.egg/ete2/parser/newick.py", line 218, in read_newick
    return _read_newick_from_string(nw, root_node, format)
  File "/usr/lib/python2.7/site-packages/ete2-2.1rev539-py2.7.egg/ete2/parser/newick.py", line 280, in _read_newick_from_string
    _read_node_data(closing_internal, current_parent, "internal", format)
  File "/usr/lib/python2.7/site-packages/ete2-2.1rev539-py2.7.egg/ete2/parser/newick.py", line 351, in _read_node_data
    raise NewickError, "Unexpected leaf node format:\n\t"+ subnw[0:50]
ete2.parser.newick.NewickError: Unexpected leaf node format:

This is driving me crazy. Can anyone help with this?

Answer Source

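You are passing format=8 to Tree(). According to the specification, that format requires every node, internal nodes included, to have a name. The two internal nodes you added to make the tree bifurcating have no name, so the parser rejects the string.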

To work with format 8, you need to give names to the new nodes, like this:

t = Tree("(((((637,5250)a,607)b,14782)6942,641)6441)0;", format=8)

or:

t = Tree("(((((637,5250)0,607)0,14782)6942,641)6441)0;", format=8)
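If you adjust trees by hand often, naming the new nodes can be automated. The following is only a rough sketch that works on the Newick string itself and does not use ETE at all. The function name `name_unnamed_internal_nodes` and the `n1`, `n2`, ... placeholder names are made up for illustration, and it assumes node names are unquoted and contain no parentheses.

```python
import itertools
import re


def name_unnamed_internal_nodes(newick, prefix="n"):
    """Return the Newick string with a placeholder name on every unnamed internal node.

    Hypothetical helper, not part of ETE. Assumes unquoted names without parentheses.
    """
    counter = itertools.count(1)

    def add_name(match):
        # Keep the closing parenthesis and append the next placeholder name.
        return ")%s%d" % (prefix, next(counter))

    # A ")" directly followed by ",", ")", ":" or ";" closes a node that has no name.
    return re.sub(r"\)(?=[,);:])", add_name, newick)


bifurcating = "(((((637,5250),607),14782)6942,641)6441)0;"
named = name_unnamed_internal_nodes(bifurcating)
print(named)  # (((((637,5250)n1,607)n2,14782)6942,641)6441)0;
```

The resulting string has a name on every node, so it should be in the shape that Tree(named, format=8) expects. Nodes that already have a name, such as 6942 and 6441, are left untouched.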

You can also switch to format 9:

t = Tree("(((((637,5250),607),14782)6942,641)6441)0;", format=9)

Format 9 only requires the leaves to have names. You can also omit the format argument and pass only the Newick string. The default, format 0, is flexible, though as far as I know it reads labels on internal nodes as support values rather than names.