transformer_from_metric can return wrong results or nan #175

Closed
@wdevazelhes

Description

Our strategy for computing transformer_from_metric can lead to wrong results. The example below exposes two problems:

  • The transformation matrix will contain NaNs because some elements of the Mahalanobis matrix are negative and np.sqrt is applied to them (see the RuntimeWarning in the output below). In this case we should maybe set to zero the elements that are close to zero but slightly negative.
  • Also, in this particular example, all the coefficients of the Mahalanobis matrix have the same order of magnitude (1e-9, see below), so it is in fact not a diagonal matrix. Even without the NaNs, this would still lead to a wrong result, because the matrix will be considered diagonal. We should not just test the off-diagonal coefficients against an absolute threshold; we should probably test them relative to the diagonal coefficients.
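Both points can be illustrated with a minimal sketch. It assumes the diagonality check is an absolute-tolerance comparison such as np.allclose with its default tolerances (an assumption for illustration, not a copy of the library code). The matrix is the one from the debug output reported in this issue, and is_diagonal_relative is a hypothetical helper, not part of metric-learn.

```python
import numpy as np

# Mahalanobis matrix copied from the debug output reported in this issue.
M = np.array([
    [ 1.79429787e-09,  1.06777934e-09, -3.56650702e-09, -1.25100147e-09],
    [ 1.06777933e-09,  2.07953604e-09, -3.35935951e-09, -2.34694895e-09],
    [-3.56650696e-09, -3.35935864e-09,  8.69685077e-09,  3.69865990e-09],
    [-1.25100146e-09, -2.34694906e-09,  3.69866011e-09,  4.22204202e-09],
])


def is_diagonal_relative(mat, rtol=1e-5):
    """Hypothetical helper: off-diagonal entries must be negligible
    relative to the largest diagonal entry, not to an absolute threshold."""
    off_diag = mat - np.diag(np.diag(mat))
    return np.abs(off_diag).max() <= rtol * np.abs(np.diag(mat)).max()


# Absolute test (assumed form of the check): every entry is smaller than
# np.allclose's default atol=1e-8, so the off-diagonal part is ignored.
abs_says_diagonal = np.allclose(M, np.diag(np.diag(M)))

# Element-wise square root of a matrix with negative entries produces NaN.
with np.errstate(invalid="ignore"):
    elementwise_sqrt = np.sqrt(M)

print(abs_says_diagonal)
print(np.isnan(elementwise_sqrt).any())
print(is_diagonal_relative(M))
```

With this matrix the absolute test reports a diagonal matrix while the relative test does not, which is the mismatch described in the second point.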

Steps/Code to Reproduce

from metric_learn import ITML
from sklearn.datasets import load_iris
from sklearn.utils import shuffle
from metric_learn import Constraints
from sklearn.utils import check_random_state
import numpy as np

SEED = 42

input_data, labels = load_iris(return_X_y=True)
X, y = shuffle(input_data, labels, random_state=SEED)
num_constraints = 50
constraints = Constraints(y)
pairs = (constraints
      .positive_negative_pairs(num_constraints, same_length=True,
                               random_state=check_random_state(SEED)))
c = np.vstack([np.column_stack(pairs[:2]), np.column_stack(pairs[2:])])
target = np.concatenate([np.ones(pairs[0].shape[0]),
                           - np.ones(pairs[0].shape[0])])
c, target = shuffle(c, target, random_state=SEED)

itml = ITML()
itml.fit(X[c], target)
print(itml.get_mahalanobis_matrix())
print(itml.predict(X[c]))

Expected Results

No error is thrown, the transformation matrix contains real values, and the result is not all -1

Actual Results

/home/will/anaconda3/envs/py27sklearnmin/bin/python /home/will/.PyCharmCE2018.3/config/scratches/scratch_50.py
[[ nan  nan  nan  nan]
/home/will/Code/metric-learn/metric_learn/_util.py:346: RuntimeWarning: invalid value encountered in sqrt
 [ nan  nan  nan  nan]
  return np.sqrt(metric)
 [ nan  nan  nan  nan]
 [ nan  nan  nan  nan]]
/home/will/Code/metric-learn/metric_learn/base_metric.py:335: RuntimeWarning: invalid value encountered in greater_equal
  return 2 * (self.decision_function(pairs) >= - self.threshold_) - 1
[-1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1]

Versions

Linux-4.4.0-142-generic-x86_64-with-debian-stretch-sid
('Python', '2.7.15 |Anaconda, Inc.| (default, May  1 2018, 23:32:55) \n[GCC 7.2.0]')
('NumPy', '1.8.2')
('SciPy', '0.13.3')
('Scikit-Learn', '0.20.3')
('Metric-Learn', '0.4.0')

Note: here is the Mahalanobis matrix that I got in debug mode:

[[  1.79429787e-09   1.06777934e-09  -3.56650702e-09  -1.25100147e-09]
 [  1.06777933e-09   2.07953604e-09  -3.35935951e-09  -2.34694895e-09]
 [ -3.56650696e-09  -3.35935864e-09   8.69685077e-09   3.69865990e-09]
 [ -1.25100146e-09  -2.34694906e-09   3.69866011e-09   4.22204202e-09]]

Note: here is our current implementation of transformer_from_metric:

https://github.com/metric-learn/metric-learn/blob/bf5c7224cc7ad4c025e15b247a80e076b7f75062/metric_learn/_util.py#L345-L351
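One possible direction, sketched here only as an illustration and not as the fix adopted by the project: compute the transformer from an eigendecomposition of the symmetrized metric and clip eigenvalues that are slightly negative to zero before taking the square root. The function name transformer_from_metric_sketch and the symmetrization step are assumptions made for this example.

```python
import numpy as np


def transformer_from_metric_sketch(metric):
    """Hypothetical sketch: return L such that L.T.dot(L) approximates
    `metric`, clipping small negative eigenvalues to zero to avoid NaNs."""
    metric = np.asarray(metric, dtype=float)
    # Symmetrize to remove tiny asymmetries caused by floating-point error.
    sym = (metric + metric.T) / 2.0
    # eigh returns real eigenvalues w and orthonormal eigenvectors
    # (columns of V) for a symmetric matrix: sym = V diag(w) V.T
    w, V = np.linalg.eigh(sym)
    # Clip negative eigenvalues (numerical noise) before the square root.
    w_clipped = np.maximum(w, 0.0)
    # L = diag(sqrt(w)) V.T, so that L.T L = V diag(w) V.T
    return np.sqrt(w_clipped)[:, None] * V.T


# Usage on a small positive semi-definite matrix built as A.T A.
A = np.array([[1.0, 2.0], [0.0, 3.0]])
M_psd = A.T.dot(A)
L = transformer_from_metric_sketch(M_psd)
print(np.allclose(L.T.dot(L), M_psd))

# A slightly negative eigenvalue no longer produces NaNs.
L_noisy = transformer_from_metric_sketch(np.diag([1.0, -1e-12]))
print(np.isfinite(L_noisy).all())
```

This avoids the element-wise square root entirely, so it does not depend on deciding whether the matrix is diagonal. The returned matrix is one valid factor among many, since any L with L.T.dot(L) equal to the metric induces the same distances.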
