train.txt has 3 columns. I store the first and second columns in A, but I cannot pass A to fit_transform. Why? Please help.
from sklearn.feature_extraction import DictVectorizer
A=[]
B=[]
C=[]
D=[]
vec = DictVectorizer()
with open("train.txt") as f:
    # keep only non-empty lines, stripped of surrounding whitespace
    f1=[x.strip() for x in f if x.strip()]
for x in f1[0:]:
    data=[tuple(x.split())]
    for x in data:
        # columns 1 and 2 joined into one string, column 3 kept as the label
        A.append(x[0]+" "+x[1])
        B.append(x[2])
X=vec.fit_transform(A)
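DictVectorizer.fit_transform expects a list of dicts that map feature names to values, not a list of strings. You have to change the following line:
A.append(x[0]+" "+x[1])
to this:
A.append({x[0]+" "+x[1]: 1})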
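For reference, here is a minimal sketch of the whole flow with dict-shaped samples. It is a variant of the fix above, not the only way to do it: instead of one combined key, it uses two hypothetical feature names, "col1" and "col2", and lets DictVectorizer one-hot encode the string values. The sample_lines list is made-up data standing in for train.txt, and parse_rows is a helper name introduced only for this example.

```python
from sklearn.feature_extraction import DictVectorizer

# Made-up stand-in for train.txt: three whitespace-separated columns per line.
sample_lines = [
    "a b pos",
    "a c neg",
    "",
    "d b pos",
]


def parse_rows(lines):
    """Return (features, labels) from three-column, whitespace-separated lines."""
    features, labels = [], []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        first, second, label = parts
        # Each sample is a dict; "col1" and "col2" are hypothetical feature names.
        features.append({"col1": first, "col2": second})
        labels.append(label)
    return features, labels


A, B = parse_rows(sample_lines)
vec = DictVectorizer()
X = vec.fit_transform(A)  # sparse matrix, one row per sample

print(X.shape)
print(vec.feature_names_)
```

To read the real file, you could pass the open file object instead of sample_lines, since parse_rows only iterates over lines.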