[Bug] RCA_Supervised crashes when fit on dataset with unlabeled points #260

Closed
@bellet

Description

While the Constraints module accepts unlabeled points as input (labeled -1), its chunks method removes those points from the returned chunks array. The array therefore has a different length than the dataset X passed to the fit method of RCA_Supervised.

I think the most natural and simple solution is to keep the unlabeled points in chunks with value -1, which is already interpreted by RCA as "not belonging to any chunk".
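To make the proposal concrete, here is a minimal sketch of a chunk assignment that returns one entry per input point and leaves unlabeled points at -1. It is not metric-learn's implementation: the helper name make_chunks, its parameters, and the round-robin strategy over classes are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict


def make_chunks(labels, num_chunks=2, chunk_size=2, seed=0):
    """Hypothetical helper: one chunk id per point, -1 meaning 'in no chunk'."""
    rng = random.Random(seed)

    # Group the indices of labeled points by class; -1 marks an unlabeled point.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        if label != -1:
            by_class[label].append(idx)

    # Start with every point (labeled or not) outside any chunk, so the
    # output always has the same length as the input.
    chunks = [-1] * len(labels)

    pools = list(by_class.values())
    for pool in pools:
        rng.shuffle(pool)

    chunk_id = 0
    while chunk_id < num_chunks:
        # Only classes with enough remaining points can supply a full chunk.
        pools = [pool for pool in pools if len(pool) >= chunk_size]
        if not pools:
            raise ValueError("not enough labeled points to form %d chunks" % num_chunks)
        for pool in pools:
            if chunk_id == num_chunks:
                break
            for _ in range(chunk_size):
                chunks[pool.pop()] = chunk_id
            chunk_id += 1
    return chunks


y = [1, 1, -1, 2, 2]
chunks = make_chunks(y)
print(chunks)  # [0, 0, -1, 1, 1]
```

Because the result is aligned with y, it is also aligned with X, so a length check between X and chunks would pass and the unlabeled point is simply ignored when chunks are formed.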

Steps/Code to Reproduce

from metric_learn import RCA_Supervised
import numpy as np

X = np.random.rand(5, 2)
y = [1, 1, -1, 2, 2]

rca = RCA_Supervised(num_chunks=2)
rca.fit(X, y)

Expected Results

Fit without error

Actual Results

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/aurelien/Documents/research/github/metric-learn/metric_learn/rca.py", line 244, in fit
    return RCA.fit(self, X, chunks)
  File "/home/aurelien/Documents/research/github/metric-learn/metric_learn/rca.py", line 132, in fit
    X, chunks = self._prepare_inputs(X, chunks, ensure_min_samples=2)
  File "/home/aurelien/Documents/research/github/metric-learn/metric_learn/base_metric.py", line 101, in _prepare_inputs
    **kwargs)
  File "/home/aurelien/Documents/research/github/metric-learn/metric_learn/_util.py", line 131, in check_input
    y_numeric=y_numeric)
  File "/home/aurelien/anaconda3/lib/python3.7/site-packages/sklearn/utils/validation.py", line 729, in check_X_y
    check_consistent_length(X, y)
  File "/home/aurelien/anaconda3/lib/python3.7/site-packages/sklearn/utils/validation.py", line 205, in check_consistent_length
    " samples: %r" % [int(l) for l in lengths])
ValueError: Found input variables with inconsistent numbers of samples: [5, 4]
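The [5, 4] in the message is the length of X next to the length of the chunks array. A small library-free sketch of where those two numbers come from, assuming unlabeled points are filtered out before chunks are built (the helper name below is hypothetical and only mirrors the behavior described in this report):

```python
# Same shape of data as in the reproduction: 5 points, one of them unlabeled.
X = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]
y = [1, 1, -1, 2, 2]


def lengths_after_dropping_unlabeled(X, y):
    """Hypothetical helper: sample count of X vs. labels left after dropping -1."""
    kept = [label for label in y if label != -1]
    return [len(X), len(kept)]


lengths = lengths_after_dropping_unlabeled(X, y)
print(lengths)  # [5, 4]
```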

Versions
