This repository was archived by the owner on Jan 31, 2023. It is now read-only.

Error when predict with converted model built with CountVectorizer(binary=True) #19

@phongvis


Describe the bug
An error is raised when running inference with a converted sklearn pipeline whose vectorizer is CountVectorizer(binary=True). The same pipeline converts and predicts without error when binary=False.

To Reproduce

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from pure_sklearn.map import convert_estimator

vectorizer = CountVectorizer(binary=True)
model = LogisticRegression(random_state=0)
pipeline = Pipeline([
    ('vect', vectorizer),
    ('clf', model)
])

X_train = ['one text', 'two text', 'three text']
y_train = ['1', '2', '3']
pipeline.fit(X_train, y_train)
converted = convert_estimator(pipeline)
converted.predict(['four'])

No error occurs if the vectorizer is created with binary=False.
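For context on what the flag changes: in scikit-learn, `binary=True` sets every non-zero term count to 1, so the converted vectorizer only needs to clip counts. Below is a minimal pure-Python sketch of that semantics. It is not pure_sklearn's implementation; `count_vector` is a hypothetical helper, and the whitespace tokenizer is a simplification of sklearn's default token pattern.

```python
from collections import Counter


def count_vector(text, vocabulary, binary=False):
    """Hypothetical helper: bag-of-words row over a fixed, ordered vocabulary."""
    # Simplified tokenizer (assumption): lowercase and split on whitespace.
    counts = Counter(text.lower().split())
    row = [counts.get(term, 0) for term in vocabulary]
    if binary:
        # binary=True semantics: every non-zero count becomes 1.
        row = [1 if c > 0 else 0 for c in row]
    return row


# Vocabulary learned from the training texts in the repro, sorted alphabetically.
vocabulary = ["one", "text", "three", "two"]

print(count_vector("text text one", vocabulary))               # [1, 2, 0, 0]
print(count_vector("text text one", vocabulary, binary=True))  # [1, 1, 0, 0]
print(count_vector("four", vocabulary, binary=True))           # [0, 0, 0, 0]
```

Note that the repro input `'four'` is out of vocabulary, so the expected feature row is all zeros in either mode.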

Expected behavior
converted.predict(['four']) should return a prediction without raising an error, as it does when binary=False.


Labels

bug (Something isn't working)
