-
Notifications
You must be signed in to change notification settings - Fork 1.3k
[MRG+1] Pipeline Refactor - Reduce Code Footprint #654
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #654 +/- ##
=========================================
- Coverage 98.46% 96.4% -2.07%
=========================================
Files 82 82
Lines 4886 4807 -79
=========================================
- Hits 4811 4634 -177
- Misses 75 173 +98
Continue to review full report at Codecov.
|
This pull request fixes 1 alert when merging d4e7aea into 3839df1 - view on LGTM.com fixed alerts:
|
@@ -167,6 +167,15 @@ def _validate_steps(self): | |||
% (estimator, type(estimator)) | |||
) | |||
|
|||
def _iter( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At first I thought that we might need tests for this but since the tests of the pipeline are passing might be just a sanity check. So it's ok.
Tanks @MattEding. At first pass it seems ok. Since we delete so much code and we do not break tests I do not think that is a reason not to merge it. |
Hello @MattEding! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2019-12-05 13:07:54 UTC |
This is nice. I did not note the introduction of |
This pull request fixes 3 alerts when merging a9d7909 into 3839df1 - view on LGTM.com fixed alerts:
|
Changes/Fixes
Reduces much code duplication in
imblearn.pipeline.Pipeline
since much ofit could be inherited from
sklearn.pipeline.Pipeline
. This reduces need ofadapting to potential future changes in
sklearn.pipeline.Pipeline
'simplementation, increasing code maintainability, and reduces code footprint.
I also believe that it fixes some misleading docstrings--see below.
Notes
The only difference to sklearn's implementation for each of these
methods are the
if hasattr(transform, "fit_resample"): pass
and thedocstring starting with
"Apply transformers/samplers, and ..."
insteadof
"Apply transforms, and ..."
. I don't think this docstring differenceshould exist since none of these methods actually do
resampling; rather they explicitly pass over it. Also, not all of these
methods used the new wording. The skipping over
fit_transform
is simple to do via inheritance with by redefiningdef _iter(with_final=True, filter_passthrough=True, filter_resample=True)
.The default value allows resuse of
sklearn.Pipeline
inherited definitionswhile requiring a change to
_fit
to call it withfilter_resample=False
once.Sklearn
Imblearn
Has "/samplers" in docstring
predict
predict_proba
decision_function
predict_log_proba
transform
score
Omits "/samplers" in docstring
score_samples
inverse_transform
Duplicate definitions
score_samples
was redefined twice_fit_transform_one
was identical to sklearn's