Closed
Description
The `*_Supervised` classes make it possible to use a weakly supervised algorithm on labeled data by creating constraints from the labeled points.
I do not really understand why these classes have a `num_labeled` parameter, which is used to ignore some labeled points when creating constraints. To reduce the training complexity, it is enough to limit the number of constraints with the `num_constraints` parameter. Using all available points to create the desired number of constraints can only benefit the algorithm, since it sees a wider variety of points in the training constraints (i.e., it reduces variance).
I think `num_labeled` could be removed so that all available points are used to create the desired number of constraints.
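To illustrate the point, here is a minimal sketch in plain NumPy. It is not the library's actual constraint-generation code: `sample_pair_constraints` is a hypothetical helper, and the assumption that `num_labeled` keeps a random subset of the labeled points is only my reading of the behavior described above. It compares how many distinct points end up in the constraints with and without the restriction, for the same `num_constraints`.

```python
import numpy as np


def sample_pair_constraints(y, num_constraints, num_labeled=None, seed=0):
    """Hypothetical helper: draw pair constraints from labeled points.

    Assumption: num_labeled, when given, restricts sampling to a random
    subset of that many labeled points before pairs are drawn.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    if num_labeled is None:
        pool = np.arange(n)  # use every labeled point
    else:
        pool = rng.choice(n, size=num_labeled, replace=False)
    m = len(pool)
    # Draw the first point of each pair, then a non-zero cyclic offset,
    # so the two points of a pair are always different.
    idx_a = rng.integers(0, m, size=num_constraints)
    offset = rng.integers(1, m, size=num_constraints)
    idx_b = (idx_a + offset) % m
    a, b = pool[idx_a], pool[idx_b]
    same_class = y[a] == y[b]  # True: similar pair, False: dissimilar pair
    return a, b, same_class


y = np.arange(1000) % 3  # toy labels for 1000 points, 3 classes

# Same number of constraints, with and without the num_labeled restriction.
a_sub, b_sub, _ = sample_pair_constraints(y, num_constraints=200, num_labeled=20)
a_all, b_all, _ = sample_pair_constraints(y, num_constraints=200)

distinct_sub = len(np.unique(np.concatenate([a_sub, b_sub])))
distinct_all = len(np.unique(np.concatenate([a_all, b_all])))
print(distinct_sub, distinct_all)
```

Both calls produce the same number of constraints, so the training cost is the same, but the unrestricted call draws them from a much larger set of distinct points.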