Error when taking sample data for SCML #348

Closed
@nikosmichas

Summary

Use replace=True when sampling triplets for SCML: https://github.com/scikit-learn-contrib/metric-learn/blob/master/metric_learn/scml.py#L268

Use Cases

When would you use this?

When the data has a large number of features (more features than triplets), the following error occurs:

  File "venv/lib/python3.9/site-packages/metric_learn/scml.py", line 268, in _generate_bases_dist_diff
    select_triplet = rng.choice(n_triplets, size=n_features, replace=False)
  File "mtrand.pyx", line 959, in numpy.random.mtrand.RandomState.choice
ValueError: Cannot take a larger sample than population when 'replace=False'

Changing the replace option to True would allow the same triplet to be sampled more than once, so the error would no longer be raised. Is this a correct approach?
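To make the question concrete, here is a minimal sketch that reproduces the error with plain NumPy and shows one possible workaround: sample with replacement only when there are more features than triplets. The helper name sample_triplet_indices and the seed argument are hypothetical and are not part of metric-learn; whether repeated triplets are acceptable for SCML's basis generation is exactly the open question above.

```python
import numpy as np


def sample_triplet_indices(n_triplets, n_features, seed=0):
    """Hypothetical helper, not metric-learn code.

    Draws n_features triplet indices. Sampling stays without replacement
    when enough triplets exist, and falls back to replacement otherwise.
    """
    rng = np.random.RandomState(seed)
    replace = n_features > n_triplets  # only repeat triplets when unavoidable
    return rng.choice(n_triplets, size=n_features, replace=replace)


# Reproduce the reported failure: 10 samples requested from 5 triplets.
rng = np.random.RandomState(0)
try:
    rng.choice(5, size=10, replace=False)
    raised = False
    message = ""
except ValueError as exc:
    raised = True
    message = str(exc)

# Fewer triplets than features: indices may repeat, but no error is raised.
few = sample_triplet_indices(n_triplets=5, n_features=10)

# Enough triplets: behaviour matches the current replace=False sampling.
many = sample_triplet_indices(n_triplets=20, n_features=10)
```

With this fallback, existing behaviour is unchanged whenever n_triplets >= n_features, and repeated triplets appear only in the case that currently fails.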


Message from the maintainers:

Want to see this feature happen? Give it a 👍. We prioritise the issues with the most 👍.
