Sklearn Multilabelbinarizer() Error When Using For Production
Edit: I have changed the code from mlb to TfidfVectorizer(), but I am still facing a problem. Please see my code below. from sklearn.externals import joblib from sklearn.preprocessin
Solution 1:
The issue is that you are not saving a fitted model to your path. Let's set the GridSearch aside for now and fit the pipeline directly:
import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.multiclass import OneVsRestClassifier
dataset = pd.DataFrame({'X': ['How to resent my Password',
                              'Where to See the next Road',
                              'What is my next topic',
                              'Can I move without pass']*10,
                        'Y': [['Pass','ResetPass'], ['Direction','NaN'], ['Topic','Class'], ['Pass','MovePass']]*10})
mlb = MultiLabelBinarizer()
X, Y = dataset['X'], mlb.fit_transform(dataset['Y'])
X_Train, X_Test, Y_Train, y_test = train_test_split(X, Y, random_state=0, test_size=0.33, shuffle=True)
clf = SGDClassifier(loss='hinge', penalty='l2',
                    alpha=1e-3, random_state=42,
                    max_iter=5, tol=None)
text_clf = Pipeline([('vect', TfidfVectorizer()),
                     ('clf', OneVsRestClassifier(clf))])
text_clf.fit(X, Y)  # new line: fit the pipeline before predicting or saving it
predict = text_clf.predict(X_Test)
predict_label = mlb.inverse_transform(predict)
joblib.dump(text_clf, 'PATHTO/model_mlb.pkl')  # save the fitted pipeline
joblib.dump(mlb, 'PATHTO/mlb.pkl')  # save the fitted MultiLabelBinarizer
model = joblib.load('PATHTO/model_mlb.pkl')  # load the pipeline
mlb = joblib.load('PATHTO/mlb.pkl')  # load the MultiLabelBinarizer
new_input = 'How to resent my Password'
pred = model.predict([new_input])  # pass raw text: the pipeline applies TF-IDF itself
pred = mlb.inverse_transform(pred)  # map the binary indicator row back to label names
This returns
[('Pass', 'ResetPass')]
which matches the labels for that sentence in your training data.
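Since the pipeline and the MultiLabelBinarizer always have to be loaded together, one option is to store both in a single file so they cannot get out of sync. This is only a sketch, not something from the original code: save_bundle, predict_labels and the file name model_bundle.pkl are hypothetical names, and a temporary directory stands in for your real path.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Same toy data as in the answer above.
texts = ['How to resent my Password',
         'Where to See the next Road',
         'What is my next topic',
         'Can I move without pass'] * 10
labels = [['Pass', 'ResetPass'], ['Direction', 'NaN'],
          ['Topic', 'Class'], ['Pass', 'MovePass']] * 10

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)

text_clf = Pipeline([
    ('vect', TfidfVectorizer()),
    ('clf', OneVsRestClassifier(
        SGDClassifier(loss='hinge', penalty='l2', alpha=1e-3,
                      random_state=42, max_iter=5, tol=None))),
])
text_clf.fit(texts, Y)


def save_bundle(pipeline, binarizer, path):
    """Hypothetical helper: persist the fitted pipeline and binarizer in one file."""
    joblib.dump({'model': pipeline, 'mlb': binarizer}, path)


def predict_labels(path, new_texts):
    """Hypothetical helper: load the bundle and return label tuples for raw texts."""
    bundle = joblib.load(path)
    pred = bundle['model'].predict(new_texts)
    return bundle['mlb'].inverse_transform(pred)


# Assumption: a temp directory replaces the real 'PATHTO/...' location.
bundle_path = os.path.join(tempfile.mkdtemp(), 'model_bundle.pkl')
save_bundle(text_clf, mlb, bundle_path)
labels_out = predict_labels(bundle_path, ['How to resent my Password'])
print(labels_out)
```

Keeping two separate files as in the answer works just as well; the single file only removes the chance of loading a binarizer that was fitted on a different label set.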
If you also want to save your grid search, save the fitted object, i.e. the one you called grid.fit() on.
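Saving the fitted grid search could look roughly like this. It is a minimal sketch under a few assumptions: the parameter grid (two values of alpha), cv=2, the file name grid_mlb.pkl and the temporary directory are all made up for illustration, since the original grid is not shown.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Same toy data as in the answer above.
texts = ['How to resent my Password',
         'Where to See the next Road',
         'What is my next topic',
         'Can I move without pass'] * 10
labels = [['Pass', 'ResetPass'], ['Direction', 'NaN'],
          ['Topic', 'Class'], ['Pass', 'MovePass']] * 10

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)

pipeline = Pipeline([
    ('vect', TfidfVectorizer()),
    ('clf', OneVsRestClassifier(
        SGDClassifier(loss='hinge', penalty='l2',
                      random_state=42, max_iter=5, tol=None))),
])

# Assumption: an illustrative grid; replace it with your own parameters.
param_grid = {'clf__estimator__alpha': [1e-3, 1e-4]}
grid = GridSearchCV(pipeline, param_grid, cv=2)
grid.fit(texts, Y)  # must be fitted before it is dumped

# Assumption: a temp directory replaces the real 'PATHTO/...' location.
grid_path = os.path.join(tempfile.mkdtemp(), 'grid_mlb.pkl')
joblib.dump(grid, grid_path)

# Later, in production: load the fitted search and predict on raw text.
loaded_grid = joblib.load(grid_path)
sample = ['How to resent my Password']
print(mlb.inverse_transform(loaded_grid.predict(sample)))
```

With the default refit=True, the loaded object predicts with the best pipeline found. If you do not need the search results in production, dumping grid.best_estimator_ instead gives a smaller file.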