
Commit 4d0a629

glemaitre and chkoar authored
Apply suggestions from code review
Co-Authored-By: Christos Aridas <chkoar@users.noreply.github.com>
1 parent 2940dbb commit 4d0a629

1 file changed (+9, -9)

examples/applications/plot_impact_imbalanced_classes.py

Lines changed: 9 additions & 9 deletions
@@ -1,7 +1,7 @@
 """
 ========================================================================
 Model fitting on imbalanced dataset and comparison of methods to improve
-performance
+its performance
 ========================================================================
 
 This example illustrates the problem induced by learning on datasets having
@@ -39,7 +39,7 @@
 
 ###############################################################################
 # This dataset is only slightly imbalanced. To better highlight the effect of
-# learning from imbalanced dataset, we will increase this ratio to 30:1
+# learning from an imbalanced dataset, we will increase its ratio to 30:1
 
 from imblearn.datasets import make_imbalance
 
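For context, `make_imbalance` is the function this hunk introduces: it rewrites the class counts of an existing dataset. A minimal sketch on a toy dataset (the example's real data is not shown in this diff, so the dataset and counts below are illustrative):

from collections import Counter

from sklearn.datasets import make_classification
from imblearn.datasets import make_imbalance

# Toy, roughly balanced dataset standing in for the example's data.
X, y = make_classification(n_samples=10_000, random_state=42)

# Keep 4500 majority samples and 150 minority samples: a 30:1 ratio.
X_res, y_res = make_imbalance(
    X, y, sampling_strategy={0: 4500, 1: 150}, random_state=42
)
print(Counter(y_res))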
@@ -87,7 +87,7 @@
 ###############################################################################
 
 ###############################################################################
-# We will first define an helper function which will train a given model
+# We will first define a helper function which will train a given model
 # and compute both accuracy and balanced accuracy. The results will be stored
 # in a dataframe
 
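The hunk header below carries the helper's signature, `evaluate_classifier(clf, df_scores, clf_name=None)`, but not its body. A hedged sketch of what such a helper could look like; the dataset, split, and column names are assumptions:

import pandas as pd
from sklearn.datasets import make_classification
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

# Stand-in imbalanced data and split; the real example defines its own.
X, y = make_classification(
    n_samples=10_000, weights=[0.97, 0.03], random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=42
)


def evaluate_classifier(clf, df_scores, clf_name=None):
    # Fit the model, then record plain and balanced accuracy in the frame.
    if clf_name is None:
        clf_name = clf.__class__.__name__
    clf.fit(X_train, y_train)
    row = pd.DataFrame(
        {
            "Accuracy": [clf.score(X_test, y_test)],
            "Balanced accuracy": [
                balanced_accuracy_score(y_test, clf.predict(X_test))
            ],
        },
        index=[clf_name],
    )
    return pd.concat([df_scores, row])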
@@ -177,7 +177,7 @@ def evaluate_classifier(clf, df_scores, clf_name=None):
 
 ###############################################################################
 # We can see that our linear model is learning slightly better than our dummy
-# baseline. However, it is impacted by class imbalanced.
+# baseline. However, it is impacted by the class imbalance.
 #
 # We can verify that something similar is happening with a tree-based model
 # such as `RandomForestClassifier`. With this type of classifier, we will not
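A usage sketch, reusing the hedged `evaluate_classifier` above; the model choices and row labels are assumptions, not the example's exact code:

import pandas as pd
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression

df_scores = pd.DataFrame()
df_scores = evaluate_classifier(
    DummyClassifier(strategy="most_frequent"), df_scores, "Dummy"
)
df_scores = evaluate_classifier(
    LogisticRegression(max_iter=1000), df_scores, "Logistic regression"
)
# Accuracy looks flattering on imbalanced data; balanced accuracy exposes
# how much of it comes from predicting the majority class.
print(df_scores)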
@@ -247,7 +247,7 @@ def evaluate_classifier(clf, df_scores, clf_name=None):
 #
 # Another way is to resample the training set by under-sampling or
 # over-sampling some of the samples. `imbalanced-learn` provides some samplers
-# to do such precessing.
+# to do such processing.
 
 from imblearn.pipeline import make_pipeline as make_pipeline_with_sampler
 from imblearn.under_sampling import RandomUnderSampler
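The two imports in this hunk are the building blocks for the resampling pipelines that follow; a minimal sketch of how they combine (the linear model and its parameters are assumptions):

from imblearn.pipeline import make_pipeline as make_pipeline_with_sampler
from imblearn.under_sampling import RandomUnderSampler
from sklearn.linear_model import LogisticRegression

# The sampler acts only during fit: the training set is under-sampled to a
# balanced ratio, while predict/score see the incoming data unchanged.
model = make_pipeline_with_sampler(
    RandomUnderSampler(random_state=42),
    LogisticRegression(max_iter=1000),
)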
@@ -277,7 +277,7 @@ def evaluate_classifier(clf, df_scores, clf_name=None):
 df_scores
 
 ###############################################################################
-# Applying a random under-sampler before to train the linear model or random
+# Applying a random under-sampler before the training of the linear model or random
 # forest, allows to not focus on the majority class at the cost of making more
 # mistake for samples in the majority class (i.e. decreased accuracy).
 #
@@ -290,7 +290,7 @@ def evaluate_classifier(clf, df_scores, clf_name=None):
 # Use of `BalancedRandomForestClassifier` and `BalancedBaggingClassifier`
 # .......................................................................
 #
-# We already show that random under-sampling can be effective on decision tree.
+# We already showed that random under-sampling can be effective on decision tree.
 # However, instead of under-sampling once the dataset, one could under-sample
 # the original dataset before to take a bootstrap sample. This is the base of
 # the `BalancedRandomForestClassifier` and `BalancedBaggingClassifier`.
@@ -306,7 +306,7 @@ def evaluate_classifier(clf, df_scores, clf_name=None):
 df_scores
 
 ###############################################################################
-# The performance with the `BalancedRandomForestClassifier` are better than
+# The performance with the `BalancedRandomForestClassifier` is better than
 # applying a single random under-sampling. We will use a gradient-boosting
 # classifier within a `BalancedBaggingClassifier`.
 
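A hedged sketch of the two ensembles named here; the parameter values are illustrative and the gradient-boosting estimator is an assumption (the plain import requires scikit-learn >= 1.0):

from imblearn.ensemble import (
    BalancedBaggingClassifier,
    BalancedRandomForestClassifier,
)
from sklearn.ensemble import HistGradientBoostingClassifier

# Each tree is grown on a bootstrap sample drawn after under-sampling.
brf = BalancedRandomForestClassifier(n_estimators=100, random_state=42)

# Each GBDT in the bag sees a differently under-sampled bootstrap, which
# brings the diversity the next paragraph discusses.
bag = BalancedBaggingClassifier(
    HistGradientBoostingClassifier(random_state=42),
    n_estimators=10,
    random_state=42,
)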
@@ -332,7 +332,7 @@ def evaluate_classifier(clf, df_scores, clf_name=None):
 # to bring some diversity for the different GBDT to learn and not focus on a
 # portion of the majority class.
 #
-# We will repeat the same experiment but a ratio of 100:1 and make a similar
+# We will repeat the same experiment but with a ratio of 100:1 and make a similar
 # analysis.
 
 ###############################################################################
