Closed
Description
Describe the bug
Neighbourhood cleaning rule procedure:
- Split data T into the class of interest C (minority) and the rest of data O.
- Identify noisy data A1 in O with edited nearest neighbor rule.
- For each class Ci in O: (this is, for each observation in the majority class(es)
if ( x Ci in 3-nearest neighbors of misclassified y C )
and ( | Ci | ‡ 0.5 · | C | ) then A2 = { x } A2 - Reduced data S = T - ( A1 union A2 )
The above is a copy of the pseudo code in the article. There, C is the minority class or class of interest.
Further quote what is on the article:
"To avoid excessive reduction of small classes, only examples from classes larger or equal to 0.5 * | C | are considered while forming A2. " and it previously mentions that C is the minority. They refer to the entire dataset as T.
Metadata
Metadata
Assignees
Labels
No labels