Zaichao Sheng - 9 days ago
Python Question

How to insert tab in a sequential word in python?

I have a problem with a very large text file that looks like the following:

A T T A G C A
A AT A G C A
T TT AG G A
G T T A G C A


Every character is supposed to be separated by \t, but some characters are run together. I want to insert \t between the characters of these sequences. What I need is the following:

A T T A G C A
A A T A G C A
T T T A G G A
G T T A G C A


How can I do this in Python? I also need to make full use of my computer's memory to speed up the process.

Answer

Assuming the input is stored in in.txt, an elegant solution would be

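import re

# The output file must be opened in write mode ('w'); the default is read-only.
with open('in.txt') as fin, open('out.txt', 'w') as fout:
    for line in fin:
        # Collect every word character on the line, then rejoin them with tabs.
        fout.write('\t'.join(re.findall(r'\w', line)) + '\n')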

The output is stored in the file out.txt.
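
As a possible variant, here is a minimal sketch that does the same job without regular expressions. It assumes every non-whitespace character on a line should become its own tab-separated field. That differs slightly from \w, which matches only letters, digits, and underscores. The names retab_line and retab_file are hypothetical, chosen just for this illustration.

```python
def retab_line(line):
    """Return `line` with every non-whitespace character separated by a tab."""
    # str.split() with no arguments discards all whitespace (tabs, spaces,
    # the trailing newline), so joining the pieces leaves the bare characters.
    chars = "".join(line.split())
    # Joining a string with '\t' puts a tab between each of its characters.
    return "\t".join(chars) + "\n"


def retab_file(src_path, dst_path):
    """Rewrite src_path into dst_path one line at a time."""
    with open(src_path) as fin, open(dst_path, "w") as fout:
        for line in fin:
            fout.write(retab_line(line))
```

For example, retab_line("A\tAT\tA\tG\tC\tA\n") returns "A\tA\tT\tA\tG\tC\tA\n". Processing the file line by line keeps memory use small even for a very large input. Whether reading larger chunks into memory would be noticeably faster depends on the machine and has not been measured here.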
