[BUG] NCL - class should be cleaned if number of samples is 0.5 * minority samples, not if 0.5 * data.shape[0] #764

#### Describe the bug

Neighbourhood cleaning rule procedure:

1. Split data T into the class of interest C (minority) and the rest of data O.
2. Identify noisy data A1 in O with edited nearest neighbor rule.
3. For each class Ci in O (that is, for each observation in the majority class(es)):
   if ( x ∈ Ci in 3-nearest neighbors of misclassified y ∈ C )
   and ( | Ci | ≥ 0.5 · | C | ) then A2 = { x } ∪ A2
4. Reduced data S = T - ( A1 union A2 )

The above is a copy of the pseudo code in the article. There, C is the minority class or class of interest.

The article further states:
"To avoid excessive reduction of small classes, only examples from classes larger or equal to 0.5 * | C | are considered while forming A2." Earlier, it states that C is the minority class, and it refers to the entire dataset as T.
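
To make the difference concrete, here is a minimal sketch comparing the two thresholds on a made-up label vector. It is not the library's code: the function name `classes_eligible_for_cleaning`, its `reference` argument, and the 60/30/10 class split are all hypothetical, chosen only to illustrate the rule quoted above against the behaviour named in the issue title.

```python
from collections import Counter


def classes_eligible_for_cleaning(y, minority_class, reference="minority"):
    """Return the non-minority classes whose size passes the 0.5 threshold.

    reference="minority": |Ci| >= 0.5 * |C|  (rule quoted from the article)
    reference="dataset":  |Ci| >= 0.5 * |T|  (threshold named in the issue title)
    """
    counts = Counter(y)
    # Pick the quantity that the 0.5 factor is applied to.
    base = counts[minority_class] if reference == "minority" else len(y)
    return sorted(
        label
        for label, n in counts.items()
        if label != minority_class and n >= 0.5 * base
    )


# Hypothetical 3-class problem: 60 / 30 / 10 samples, class 2 is the minority.
y = [0] * 60 + [1] * 30 + [2] * 10

print(classes_eligible_for_cleaning(y, minority_class=2, reference="minority"))  # [0, 1]
print(classes_eligible_for_cleaning(y, minority_class=2, reference="dataset"))   # [0]
```

With the minority-based threshold (0.5 * 10 = 5), both larger classes qualify for cleaning. With the dataset-based threshold (0.5 * 100 = 50), class 1 is skipped, and in a multiclass problem where no class holds half of the data, no class would qualify at all.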

