Shape Mismatch: If Categories Is An Array, It Has To Be Of Shape (n_features,)
Here is the code I'm trying to execute to encode the values of the first column of my data set using dummy values.
import numpy as py
import matplotlib.pyplot as plt
import pandas
Solution 1:
Your second column doesn't seem to be a categorical feature. You should only one-hot encode features that can take a finite number of discrete values, like the first column, which can only take a limited number of values ('spain', 'germany', 'france'). If you only encode the first column, you can do:
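import numpy as np
from sklearn.preprocessing import OneHotEncoder
# categories takes one list per encoded feature (here: a single feature)
onehotencoder = OneHotEncoder(categories=[['France', 'Germany', 'Spain']])
# reshape(-1, 1) turns the first column into a 2D array with one feature
x_1 = onehotencoder.fit_transform(x[:, 0].reshape(-1, 1)).toarray()
x = np.concatenate([x_1, x[:, 1:]], axis=1)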
and then your data will be in the form:
France Germany Spain score
1 0 0 44.0
0 0 1 27.0
...
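To make the difference concrete, here is a minimal, self-contained sketch. The sample array `x` is made up for illustration (a country column followed by a numeric score column), so treat it as an assumption rather than the asker's actual data. It first passes a single category list while fitting on two columns, which is one way to run into the shape-mismatch `ValueError`, and then encodes only the first column as suggested above.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical sample data: column 0 is the country, column 1 is a numeric score.
x = np.array(
    [["France", 44.0], ["Spain", 27.0], ["Germany", 30.0]],
    dtype=object,
)

countries = [["France", "Germany", "Spain"]]  # one list -> one feature

# Fitting on both columns with a single category list does not match
# the number of features, so scikit-learn rejects it.
raised = False
try:
    OneHotEncoder(categories=countries).fit(x)
except ValueError as err:
    raised = True
    print("ValueError:", err)

# Encoding only the first column (reshaped to 2D) matches the category list.
encoder = OneHotEncoder(categories=countries)
x_1 = encoder.fit_transform(x[:, 0].reshape(-1, 1)).toarray()

# Encoded country columns first, then the untouched remaining column(s).
x_encoded = np.concatenate([x_1, x[:, 1:]], axis=1)
print(x_encoded)
```

The number of lists in `categories` has to equal the number of columns passed to `fit` or `fit_transform`, which is why slicing out the single categorical column avoids the error.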
Also, you only have 3 columns in your data, but you're selecting a fourth column with y=DataSet.iloc[:,3].values (column indices start at 0, so .iloc[:,3] refers to the 4th column). The last of three columns is .iloc[:,2].
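The indexing point can be checked with a small sketch. The frame below is a hypothetical stand-in for `DataSet`; its column names and values are assumptions chosen only to give it three columns.

```python
import pandas as pd

# Hypothetical three-column frame standing in for DataSet.
DataSet = pd.DataFrame(
    {
        "country": ["France", "Spain"],
        "score": [44.0, 27.0],
        "target": ["No", "Yes"],
    }
)

# Position 3 would be a fourth column, which does not exist here.
out_of_bounds = False
try:
    DataSet.iloc[:, 3].values
except IndexError as err:
    out_of_bounds = True
    print("IndexError:", err)

# Position 2 (or -1) selects the last of the three columns.
y = DataSet.iloc[:, 2].values
print(y)
```

Using `.iloc[:, -1]` instead of a hard-coded position selects the last column regardless of how many columns the frame has.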