Description
Our strategy for computing transformer_from_metric can lead to wrong results. For instance, the example below exposes two problems:
- The transformation matrix will contain NaNs because some elements of the Mahalanobis matrix are negative. In this case we should perhaps set to zero the elements that are close to zero but slightly negative.
- Also, in this particular example, all the coefficients of the Mahalanobis matrix are of the same order of magnitude (1e-9, see below), so it is not really a diagonal matrix. This alone would not produce a NaN or raise an error, but it still leads to a wrong result because the matrix is treated as diagonal. We should not test diagonality with an absolute tolerance only; we should probably test the diagonal coefficients relative to the off-diagonal ones.
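The two suggestions above could be combined roughly as in the sketch below. This is not the library's implementation: the function name `transformer_from_metric_sketch` and the `rel_tol` parameter are hypothetical, and the choice to fall back to an eigendecomposition with clipped eigenvalues is one possible design, assumed here for illustration.

```python
import numpy as np


def transformer_from_metric_sketch(metric, rel_tol=1e-10):
    """Hypothetical helper: return L such that L.T.dot(L) approximates metric.

    rel_tol is an assumed parameter; it is compared against the largest
    absolute entry of the matrix, so the test is scale-independent.
    """
    metric = np.asarray(metric, dtype=float)
    # Symmetrize to remove tiny asymmetries caused by round-off.
    metric = (metric + metric.T) / 2.0

    scale = np.abs(metric).max()
    if scale == 0:
        return np.zeros_like(metric)

    diag = np.diag(metric)
    off_diag = metric - np.diag(diag)

    # Relative diagonality test: off-diagonal entries must be negligible
    # compared with the matrix scale, not compared with a fixed constant.
    if np.abs(off_diag).max() <= rel_tol * scale:
        # Clip slightly negative diagonal entries to zero before the sqrt.
        return np.diag(np.sqrt(np.maximum(diag, 0.0)))

    # General case: eigendecomposition, with negative eigenvalues clipped
    # to zero so that the square root never produces NaN.
    w, V = np.linalg.eigh(metric)
    w = np.maximum(w, 0.0)
    return np.sqrt(w)[:, None] * V.T


if __name__ == "__main__":
    # A small positive definite matrix at the 1e-9 scale.
    M = 1e-9 * np.array([[2.0, 1.0], [1.0, 2.0]])
    L = transformer_from_metric_sketch(M)
    print(L)
    print(L.T.dot(L))
```

Because the diagonality test is relative to the largest entry, a matrix whose coefficients are all around 1e-9 is no longer mistaken for a diagonal one, and clipping guarantees a real-valued result even when round-off makes an eigenvalue slightly negative.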
Steps/Code to Reproduce
from metric_learn import ITML
from sklearn.datasets import load_iris
from sklearn.utils import shuffle
from metric_learn import Constraints
from sklearn.utils import check_random_state
import numpy as np
SEED = 42
input_data, labels = load_iris(return_X_y=True)
X, y = shuffle(input_data, labels, random_state=SEED)
num_constraints = 50
constraints = Constraints(y)
pairs = (constraints
         .positive_negative_pairs(num_constraints, same_length=True,
                                  random_state=check_random_state(SEED)))
c = np.vstack([np.column_stack(pairs[:2]), np.column_stack(pairs[2:])])
target = np.concatenate([np.ones(pairs[0].shape[0]),
                         - np.ones(pairs[0].shape[0])])
c, target = shuffle(c, target, random_state=SEED)
itml = ITML()
itml.fit(X[c], target)
print(itml.get_mahalanobis_matrix())
print(itml.predict(X[c]))
Expected Results
No error is thrown, the transformation matrix contains real values, and the result is not all -1
Actual Results
/home/will/anaconda3/envs/py27sklearnmin/bin/python /home/will/.PyCharmCE2018.3/config/scratches/scratch_50.py
[[ nan nan nan nan]
/home/will/Code/metric-learn/metric_learn/_util.py:346: RuntimeWarning: invalid value encountered in sqrt
[ nan nan nan nan]
return np.sqrt(metric)
[ nan nan nan nan]
[ nan nan nan nan]]
/home/will/Code/metric-learn/metric_learn/base_metric.py:335: RuntimeWarning: invalid value encountered in greater_equal
return 2 * (self.decision_function(pairs) >= - self.threshold_) - 1
[-1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1]
Versions
Linux-4.4.0-142-generic-x86_64-with-debian-stretch-sid
('Python', '2.7.15 |Anaconda, Inc.| (default, May 1 2018, 23:32:55) \n[GCC 7.2.0]')
('NumPy', '1.8.2')
('SciPy', '0.13.3')
('Scikit-Learn', '0.20.3')
('Metric-Learn', '0.4.0')
Note: here is the Mahalanobis matrix that I got in debug mode:
[[ 1.79429787e-09 1.06777934e-09 -3.56650702e-09 -1.25100147e-09]
[ 1.06777933e-09 2.07953604e-09 -3.35935951e-09 -2.34694895e-09]
[ -3.56650696e-09 -3.35935864e-09 8.69685077e-09 3.69865990e-09]
[ -1.25100146e-09 -2.34694906e-09 3.69866011e-09 4.22204202e-09]]
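The matrix above can be used to reproduce both symptoms in isolation, without running ITML. The sketch below assumes that the diagonality check behaves like `np.allclose` with its default absolute tolerance (1e-8); that is an assumption made for illustration, since the issue only shows that the code ends up in `np.sqrt(metric)`.

```python
import numpy as np

# Mahalanobis matrix copied from the debug output above.
M = np.array([
    [1.79429787e-09, 1.06777934e-09, -3.56650702e-09, -1.25100147e-09],
    [1.06777933e-09, 2.07953604e-09, -3.35935951e-09, -2.34694895e-09],
    [-3.56650696e-09, -3.35935864e-09, 8.69685077e-09, 3.69865990e-09],
    [-1.25100146e-09, -2.34694906e-09, 3.69866011e-09, 4.22204202e-09],
])

# Symptom 1: an element-wise square root of a matrix with negative
# entries yields NaN for those entries.
with np.errstate(invalid="ignore"):
    elementwise_sqrt = np.sqrt(M)
has_nan = np.isnan(elementwise_sqrt).any()

# Symptom 2: with an absolute tolerance of 1e-8 (assumed check), every
# off-diagonal entry of magnitude ~1e-9 is considered equal to zero.
looks_diagonal_abs = np.allclose(M, np.diag(np.diag(M)))

# A relative comparison shows the off-diagonal entries are not negligible.
off_diag_max = np.abs(M - np.diag(np.diag(M))).max()
diag_max = np.abs(np.diag(M)).max()
ratio = off_diag_max / diag_max  # roughly 0.4, far from negligible

print(has_nan, looks_diagonal_abs, ratio)
```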