Closed
Description
Summary
Use replace=True when sampling triplets in SCML's _generate_bases_dist_diff: https://github.com/scikit-learn-contrib/metric-learn/blob/master/metric_learn/scml.py#L268
Use Cases
When would you use this?
When the data has a large number of features (more features than triplets), the following error occurs:
File "venv/lib/python3.9/site-packages/metric_learn/scml.py", line 268, in _generate_bases_dist_diff
select_triplet = rng.choice(n_triplets, size=n_features, replace=False)
File "mtrand.pyx", line 959, in numpy.random.mtrand.RandomState.choice
ValueError: Cannot take a larger sample than population when 'replace=False'
By changing the replace option to True, we would be able to sample repeated triplets without raising an error. Is this approach correct?
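For reference, here is a minimal sketch that reproduces the error with NumPy alone, outside of metric-learn, and shows what the proposed change would do. The sizes are made up for illustration, and `choose_triplets` is a hypothetical helper, not part of metric-learn; it only samples with replacement when there are fewer triplets than features. Whether repeated triplets are statistically acceptable for SCML's basis generation is exactly the open question above.

```python
import numpy as np

rng = np.random.RandomState(0)

# Illustrative sizes only: fewer triplets than features triggers the error.
n_triplets, n_features = 5, 8

# Same call pattern as in the traceback: sampling without replacement fails
# because the requested sample is larger than the population.
try:
    rng.choice(n_triplets, size=n_features, replace=False)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Proposed change: sampling with replacement succeeds, but indices may repeat.
select_with_replacement = rng.choice(n_triplets, size=n_features, replace=True)


def choose_triplets(rng, n_triplets, n_features):
    """Hypothetical helper (not in metric-learn): fall back to sampling
    with replacement only when there are not enough triplets."""
    replace = n_features > n_triplets
    return rng.choice(n_triplets, size=n_features, replace=replace)
```

With this conditional fallback, the current behaviour (unique triplets) would be kept whenever there are at least as many triplets as features.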