DST DST - 1 year ago 71
Python Question

Python split() issue on whitespace, can someone explain?

AA vowel
AE vowel
AH vowel
AO vowel
AW vowel
AY vowel
B stop
CH affricate
D stop
DH fricative
EH vowel
ER vowel
EY vowel
F fricative
G stop
HH aspirate
IH vowel
IY vowel
JH affricate
K stop
L liquid
M nasal
N nasal
NG nasal
OW vowel
OY vowel
P stop
R liquid
S fricative
SH fricative
T stop
TH fricative
UH vowel
UW vowel
V fricative
W semivowel
Y semivowel
Z fricative
ZH fricative

This is the content of a file. I read it line by line and parse each line. The problem is that when I use split(), or even re.split(r'\t+', line) (since the whitespace between the two columns looks like a tab), I end up with single characters instead of the two fields. Help please, I don't understand where I am going wrong.

Code for the split:

try:
    datafile = open(filename, 'r')
except IOError:
    print('Could not open ' + filename)

stypes = {}

for line in datafile.readlines():
    if line:
        re.split(r'\t+', line)
        phone = line[0]
        type = line[1]
        print(line[0] + ' ' + line[1] + ' ' + line[2])

Answer Source

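You are indexing and printing the original line, not the list with the split results. re.split() returns a new list and leaves line unchanged, so line[0] and line[1] are just the first two characters of the string. Store the result of the split and use that instead. This should work better: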

with open('mywords.txt') as fobj:
    for line in fobj:
        res = line.split()  # split on any run of whitespace (tabs or spaces)
        print(res)


['AA', 'vowel']
['AE', 'vowel']
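
To see why the original code produced single characters, it may help to compare indexing the string with indexing the list returned by the split. The sketch below is only an illustration: the sample lines are hard-coded rather than read from the file, and parse_phones is a hypothetical helper name, not something from the question. It assumes two whitespace-separated columns per line and fills a dictionary like the stypes the question sets up.

```python
import re

def parse_phones(lines):
    """Map each phone symbol to its type, e.g. 'AA' -> 'vowel'.

    Hypothetical helper for illustration; assumes two
    whitespace-separated columns per line.
    """
    stypes = {}
    for line in lines:
        parts = line.split()      # splits on any run of whitespace, drops '\n'
        if len(parts) != 2:       # skip blank or malformed lines
            continue
        phone, kind = parts
        stypes[phone] = kind
    return stypes

# Hard-coded sample lines (assumption: the real file looks like this).
sample = ["AA\tvowel\n", "B\tstop\n", "\n", "CH affricate\n"]

line = sample[0]
first_char = line[0]                     # indexing the string gives one character
first_field = re.split(r'\t+', line)[0]  # indexing the split result gives a field

stypes = parse_phones(sample)
print(first_char, first_field, stypes)
```

Using line.split() with no argument is a deliberate choice here: it works whether the columns are separated by tabs or by spaces, and it also strips the trailing newline.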

The with statement opens the file and closes it automatically as soon as execution leaves the indented block, i.e. when you dedent back to the level of with (or when a return or an exception leaves the block early). The object returned by open() acts as a context manager, and the context is the indented lines below with.


with open('mywords.txt') as fobj:
    print('closed', fobj.closed)
print('closed', fobj.closed)


closed False
closed True
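
Roughly speaking, the with block behaves like a try/finally that always closes the file, even if the body raises. The following is a minimal sketch of that equivalence and a simplification of what the with statement really does. It writes a small temporary file first so that it can run on its own; the temporary file and the variable names are assumptions for illustration, not part of the original answer.

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as tmp:
    tmp.write('AA\tvowel\n')

# Approximately what `with open(path) as fobj:` does behind the scenes.
fobj = open(path)
try:
    fields = fobj.readline().split()
    open_inside = not fobj.closed
finally:
    fobj.close()  # runs even if the body raised an exception

closed_after = fobj.closed
os.remove(path)   # clean up the temporary file
print(fields, open_inside, closed_after)
```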