Applying OneHotEncoder on a NumPy Array
I am applying OneHotEncoder to a NumPy array. Here's the code:
print X.shape, test_data.shape  # gives (4100, 15) (410, 15)
onehotencoder_1 = OneHotEncoder(categorical_features = [0, 3,
Solution 1:
Don't create a new OneHotEncoder for test_data. Reuse the first one, and call only transform() on it. Do this:
test_data = onehotencoder_1.transform(test_data).toarray()
Never call fit() (or fit_transform()) on testing data.
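As a minimal sketch of that pattern, assuming a small made-up array in place of the real X and test_data (the values and the names X_train, X_test, and encoder are illustrative only), the encoder is fitted once on the training data and then reused as-is on the test data. The toy array has a single categorical column, so it does not need the categorical_features argument from the question.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-ins for X and test_data: one categorical column.
X_train = np.array([[0], [1], [2], [1]])
X_test = np.array([[1], [0]])  # category 2 never appears in the test data

encoder = OneHotEncoder()

# Learn the categories from the training data only.
train_encoded = encoder.fit_transform(X_train).toarray()

# Reuse the fitted encoder: transform() only, no fit() on the test data.
test_encoded = encoder.transform(X_test).toarray()

print(train_encoded.shape, test_encoded.shape)
```

Both arrays end up with the same number of columns, because the column layout was fixed when the encoder saw the training data.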
A different number of columns is entirely possible, because the test data may not contain some categories that are present in the training data. When you create a new OneHotEncoder and call fit() (or fit_transform()) on it, it learns only the categories present in test_data, so the resulting columns will differ from the ones produced for the training data.
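The mechanism can be illustrated without scikit-learn. The sketch below is a simplified stand-in, not the library's implementation: learn_categories and encode are hypothetical helpers that mimic what fit() and transform() do for a single column, and the colour values are made up.

```python
def learn_categories(column):
    """Mimic fit(): remember the sorted distinct values seen in a column."""
    return sorted(set(column))


def encode(column, categories):
    """Mimic transform(): emit one 0/1 column per remembered category."""
    return [[1 if value == cat else 0 for cat in categories] for value in column]


train_col = ["red", "green", "blue", "green"]
test_col = ["green", "red"]  # "blue" is missing from the test data

train_cats = learn_categories(train_col)  # categories learned from training data
refit_cats = learn_categories(test_col)   # the mistake: learning from test data

shared = encode(test_col, train_cats)  # uses the training categories
refit = encode(test_col, refit_cats)   # uses only the categories seen in test data

print(len(shared[0]), len(refit[0]))  # prints: 3 2
```

Encoding the test column with the training categories keeps the three-column layout, while re-learning the categories from the test column drops the column for the unseen value.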