I want to convert multiple FASTA files (DNA sequences) to NEXUS format using the Bio.SeqIO module, but I get this error:
Traceback (most recent call last):
File "fasta2nexus.py", line 28, in <module>
File "fasta2nexus.py", line 23, in process
File "/Library/Python/2.7/site-packages/Bio/SeqIO/__init__.py", line 1003, in convert
with as_handle(in_file, in_mode) as in_handle:
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/contextlib.py", line 17, in __enter__
File "/Library/Python/2.7/site-packages/Bio/File.py", line 88, in as_handle
with open(handleish, mode, **kwargs) as fp:
IOError: [Errno 2] No such file or directory: 'c'
from __future__ import print_function # or just use Python 3!
from Bio import SeqIO, Nexus
from Bio.Alphabet import IUPAC
test = "/Users/teton/Desktop/test"
files = os.listdir(os.curdir)
# returns ("basename", "extension"), so [0] picks "basename"
base = os.path.splitext(filename)[0]
return SeqIO.convert(filename, "fasta",
base + ".nex", "nexus",
for files in os.listdir(test):
for file in files:
fullpath = os.path.join(file)
This code should solve the majority of problems I can see.

from __future__ import print_function  # or just use Python 3!

import fileinput
import os
import re
import sys

from Bio import SeqIO, Nexus
from Bio.Alphabet import IUPAC

test = "/Users/teton/Desktop"

def process(filename):
    # returns ("basename", "extension"), so [0] picks "basename"
    base = os.path.splitext(filename)[0]
    return SeqIO.convert(filename, "fasta",
                         base + ".nex", "nexus",
                         alphabet=IUPAC.ambiguous_dna)

for root, dirs, files in os.walk(test):
    for file in files:
        fullpath = os.path.join(root, file)
        print(process(fullpath))

I changed a few things. First, I ordered your imports (a personal preference) and made sure to import Bio.Alphabet so you can actually assign the correct alphabet to your sequences. Next, in your process() function, I added a line to split the extension off the filename, then used the full filename as the first argument and just the base (without the extension) to name the NEXUS output file. Speaking of which, I assume you'll be using the Nexus module in later code? If not, you should remove it from the imports.
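
The naming step described above can be sketched on its own. This is only an illustration: nexus_path and convert_fasta are hypothetical helper names, and sample.fasta is a made-up file name. Note also that Biopython 1.78 and later removed Bio.Alphabet; on those versions SeqIO.convert takes molecule_type="DNA" instead of alphabet=, which is what the wrapper below assumes. Biopython is imported lazily so the path helper can be tried without it installed.

```python
import os


def nexus_path(fasta_path):
    """Return fasta_path with its extension replaced by .nex (hypothetical helper)."""
    # os.path.splitext returns a (root, extension) tuple; [0] is the root.
    root = os.path.splitext(fasta_path)[0]
    return root + ".nex"


def convert_fasta(fasta_path):
    """Convert one FASTA file to NEXUS; assumes Biopython >= 1.78 is installed."""
    from Bio import SeqIO  # imported lazily so nexus_path works without Biopython

    # molecule_type replaces the old alphabet= argument in Biopython >= 1.78.
    return SeqIO.convert(fasta_path, "fasta",
                         nexus_path(fasta_path), "nexus",
                         molecule_type="DNA")


print(nexus_path("/Users/teton/Desktop/test/sample.fasta"))
# prints /Users/teton/Desktop/test/sample.nex
```

On older Biopython releases, keep the alphabet=IUPAC.ambiguous_dna form shown in the answer instead.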
I wasn't sure what the point of the last snippet was, so I didn't include it. In it, though, you appear to be walking the file tree and process()ing each file again, then referencing an undefined variable named count. Instead, just run process() once per file, and do whatever count refers to within that loop.
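
If count was meant to be a running total of converted sequences, one possible way to keep it in the same loop is sketched here. SeqIO.convert returns the number of records it converted, so the return value of process() can be summed as you go. The names convert_all and converter are made up for this example; converter stands in for your process() function.

```python
import os


def convert_all(top, converter):
    """Call converter once per file under top and total the record counts.

    converter is a stand-in for process(): it takes a path and returns the
    number of records converted (as SeqIO.convert does).
    """
    total = 0
    for root, dirs, files in os.walk(top):
        for name in files:
            total += converter(os.path.join(root, name))
    return total
```

With the code above, count = convert_all(test, process) would walk the tree once and give you the total.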
You may want to consider adding some logic to your for loop to test that the path returned by os.path.join() actually is a FASTA file. Otherwise, if a file of any other type is in one of the directories you search and you process() it, all sorts of weird things could happen.
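
A simple version of that check, as a sketch, is to filter on the file extension. The extension list below is an assumption (adjust it to whatever your files use), and is_fasta and fasta_files are hypothetical helper names. It only looks at the name, not the contents, so a mislabeled file would still slip through.

```python
import os

# Assumed set of extensions; adjust to whatever your files actually use.
FASTA_EXTENSIONS = (".fasta", ".fa", ".fna")


def is_fasta(path):
    """Guess from the extension alone whether path is a FASTA file."""
    ext = os.path.splitext(path)[1].lower()
    return ext in FASTA_EXTENSIONS


def fasta_files(top):
    """Yield the full path of every file under top that looks like FASTA."""
    for root, dirs, files in os.walk(top):
        for name in files:
            if is_fasta(name):
                yield os.path.join(root, name)
```

The main loop then becomes: for fullpath in fasta_files(test): print(process(fullpath)).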
OK, based on your new code I have a few suggestions. First, the line
files = os.listdir(os.curdir)
is completely unnecessary, because below the definition of the process() function you rebind the name files anyway. (The line itself would run without error: os.curdir is a string constant, ".", not a function, so it is not meant to be called.)
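The code at the bottom should simply be this:

for file in os.listdir(test):
    print(process(os.path.join(test, file)))

The inner for file in files loop is not just redundant, it is the source of your error: at that point files is a single filename string, so the loop iterates over its characters, which is why Python tried to open a file named 'c'. Calling os.path.join() with a single argument does nothing; pass the directory as the first argument so that process() gets a path that is valid regardless of the current working directory.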
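
To see why the nested loop produced 'c', here is a small stand-alone illustration. The file name cat.fasta is made up, and full_paths is a hypothetical helper showing the two-argument os.path.join() call.

```python
import os

filename = "cat.fasta"  # hypothetical name, standing in for one os.listdir() entry

# Iterating over a string yields its characters, not file names.
print([ch for ch in filename][:3])  # prints ['c', 'a', 't']


def full_paths(directory, names):
    """Join each bare name onto directory, as os.listdir() results require."""
    return [os.path.join(directory, name) for name in names]


print(full_paths("test", ["cat.fasta"]))
```

os.listdir() returns bare names, so each one has to be joined onto the directory before it is opened.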