Mehdi Mehdi - 1 month ago 14
Python Question

How to get number of dimensions in OneHotEncoder in Scikit-learn

I am using the OneHotEncoder from Scikit-learn in my project, and I need to know the size of each one-hot vector when n_values is set to 'auto'. I thought n_values_ would show that, but it seems I have no way to find out other than transforming a training sample and measuring the result. I made this toy example to show the problem. Do you know any other solution?

from sklearn.preprocessing import OneHotEncoder

data = [[1], [3], [5]]  # one feature with 3 distinct values

encoder = OneHotEncoder()
encoder.fit(data)

print(len(encoder.transform([data[0]]).toarray()[0]))  # 3: number of dimensions in the one-hot vector
print(encoder.n_values_)  # [6] == largest value + 1, i.e. len(range(6))

Answer

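Is this what you are looking for? active_features_ holds the indices that actually occurred in the training data, so its length is the size of each one-hot vector.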

>>> encoder.active_features_
array([1, 3, 5])

>>> len(encoder.active_features_)
3
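
To see why n_values_ reports 6 while the encoded vector has only 3 entries, here is a minimal pure-Python sketch of the bookkeeping. It is an illustration, not scikit-learn's implementation: the helper name one_hot_layout is hypothetical, and it assumes non-negative integer inputs, which is what the 'auto' setting works with in the question.

```python
def one_hot_layout(rows):
    """Illustrative only: mimic how 'auto' sizes the one-hot output.

    Returns (n_values, active), where n_values[i] is the largest value
    seen in column i plus one, and active lists the output indices that
    were actually observed in the data.
    """
    columns = list(zip(*rows))
    # Each column reserves one slot for every integer in range(max + 1).
    n_values = [max(col) + 1 for col in columns]
    # Starting offset of each column's block of slots in the full layout.
    offsets = [0]
    for n in n_values:
        offsets.append(offsets[-1] + n)
    # Only slots whose value occurred in the data are kept in the output.
    active = sorted({offsets[i] + v for i, col in enumerate(columns) for v in col})
    return n_values, active


data = [[1], [3], [5]]
n_values, active = one_hot_layout(data)
print(n_values)     # [6]
print(active)       # [1, 3, 5]
print(len(active))  # 3
```

The length of the active list is the dimension the question asks about. Note that n_values_ and active_features_ belong to the older OneHotEncoder API; in newer scikit-learn releases, where they no longer exist, the per-feature sizes can be read from encoder.categories_ instead.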