Error on covariance initialization when there are linearly dependent dimensions #276

Closed
@grudloff

Description

While fitting MMC_Supervised with init="covariance" on the digits dataset (scikit-learn's small MNIST-like dataset), a warning is raised and an error is thrown. I tracked the problem down to the initialization, which uses the inverse of the covariance matrix. That matrix isn't invertible in this case because some dimensions are linearly dependent.

This can be easily fixed by replacing the inverse with the (Moore-Penrose) pseudo-inverse, as is done for the random initialization. I am not sure this is necessarily the desired behavior, but in any case this issue should at least be addressed with a more user-friendly error stating that some dimensions are linearly dependent and that the input should be reduced to eliminate this redundancy.
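To illustrate the pseudo-inverse idea, here is a minimal NumPy-only sketch. It is not the library's code: the helper name `pinv_from_eigh`, the tolerance `tol`, and the synthetic data are assumptions made for this example. It mirrors the `np.dot(u / s, u.T)` expression from the warning, but inverts only the eigenvalues that are clearly non-zero.

```python
import numpy as np


def pinv_from_eigh(cov, tol=1e-10):
    """Pseudo-inverse of a symmetric PSD matrix (hypothetical helper).

    Eigenvalues below tol * largest eigenvalue are treated as zero and are
    left at zero instead of being inverted.
    """
    s, u = np.linalg.eigh(cov)
    keep = s > tol * s.max()
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    # Equivalent to u @ diag(s_inv) @ u.T
    return np.dot(u * s_inv, u.T)


# Synthetic data (an assumption for this sketch): 3 independent columns,
# one column that is a sum of two others, and one constant column.
rng = np.random.RandomState(0)
X = rng.randn(100, 3)
X = np.hstack([X, X[:, :1] + X[:, 1:2], np.zeros((100, 1))])

cov = np.cov(X, rowvar=False)
M = pinv_from_eigh(cov)

print("rank of covariance:", np.linalg.matrix_rank(cov), "of", cov.shape[0])
print("M finite:", np.isfinite(M).all(), "| M symmetric:", np.allclose(M, M.T))
```

The relative tolerance is a judgment call: too small and near-zero eigenvalues get inverted into huge values, too large and genuine low-variance directions are discarded.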

I haven't checked, but this issue should arise in any algorithm that uses the covariance initialization and doesn't require an SDP matrix.

Steps/Code to Reproduce

from metric_learn import MMC_Supervised
from sklearn import datasets

# Load digits dataset
digits = datasets.load_digits()
X = digits.data
y = digits.target

# Fit with the covariance initialization (this triggers the error)
mmc = MMC_Supervised(init='covariance')
mmc.fit(X, y)

Expected Results

No error or a more user-friendly error.
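As one possible shape for the more user-friendly error, the sketch below checks the covariance matrix before inverting it. The function name `check_covariance_invertible`, the tolerance, and the message wording are hypothetical and are not part of metric-learn.

```python
import numpy as np


def check_covariance_invertible(X, tol=1e-10):
    """Return the covariance of X, or raise a descriptive error if it is
    singular (hypothetical helper, not part of metric-learn)."""
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    s = np.linalg.eigvalsh(cov)
    # Count eigenvalues that are numerically zero relative to the largest one
    n_dependent = int(np.sum(s <= tol * s.max()))
    if n_dependent > 0:
        raise ValueError(
            "The covariance matrix of the input is not invertible: "
            "{} of {} dimensions are linearly dependent. Reduce the input "
            "(for example by dropping redundant features) or use another "
            "initialization.".format(n_dependent, cov.shape[0]))
    return cov


rng = np.random.RandomState(0)
X_ok = rng.randn(50, 3)
X_bad = np.hstack([X_ok, X_ok[:, :1]])  # last column duplicates the first

check_covariance_invertible(X_ok)  # passes silently
try:
    check_covariance_invertible(X_bad)
except ValueError as exc:
    print(exc)
```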

Actual Results

The following warnings are thrown during runtime.

/home/gabrielrudloff/anaconda3/lib/python3.7/site-packages/metric_learn/_util.py:718: RuntimeWarning: divide by zero encountered in true_divide
  M = np.dot(u / s, u.T)
/home/gabrielrudloff/anaconda3/lib/python3.7/site-packages/metric_learn/_util.py:718: RuntimeWarning: invalid value encountered in true_divide
  M = np.dot(u / s, u.T)

The following error is then raised, stating that the matrix is not symmetric. The computed matrix is full of nan/inf.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-1-272efe10621c> in <module>
      9 
     10 mmc=MMC_Supervised(init='covariance')
---> 11 mmc.fit(X,y)

~/anaconda3/lib/python3.7/site-packages/metric_learn/mmc.py in fit(self, X, y, random_state)
    609                                         random_state=self.random_state)
    610     pairs, y = wrap_pairs(X, pos_neg)
--> 611     return _BaseMMC._fit(self, pairs, y)

~/anaconda3/lib/python3.7/site-packages/metric_learn/mmc.py in _fit(self, pairs, y)
     63       return self._fit_diag(pairs, y)
     64     else:
---> 65       return self._fit_full(pairs, y)
     66 
     67   def _fit_full(self, pairs, y):

~/anaconda3/lib/python3.7/site-packages/metric_learn/mmc.py in _fit_full(self, pairs, y)
    186     self.n_iter_ = cycle
    187 
--> 188     self.components_ = components_from_metric(self.A_)
    189     return self
    190 

~/anaconda3/lib/python3.7/site-packages/metric_learn/_util.py in components_from_metric(metric, tol)
    398   """
    399   if not np.allclose(metric, metric.T):
--> 400     raise ValueError("The input metric should be symmetric.")
    401   # If M is diagonal, we will just return the elementwise square root:
    402   if np.array_equal(metric, np.diag(np.diag(metric))):

ValueError: The input metric should be symmetric.
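The "should be symmetric" message is misleading here. A likely explanation, consistent with the traceback, is that `np.allclose(metric, metric.T)` returns False whenever the matrix contains nan, because nan never compares equal to itself by default. The small matrix below is made up for illustration:

```python
import numpy as np

# A matrix that is symmetric in layout but contains nan entries
M = np.array([[1.0, np.nan],
              [np.nan, 1.0]])

is_symmetric = np.allclose(M, M.T)           # nan != nan, so this is False
has_non_finite = not np.isfinite(M).all()    # True: the real problem

print("passes symmetry check:", is_symmetric)
print("contains nan/inf:", has_non_finite)
```

Checking `np.isfinite` before the symmetry test would let the error point at the actual cause.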

Versions

Linux-5.3.0-26-generic-x86_64-with-debian-buster-sid
Python 3.7.4 (default, Aug 13 2019, 20:35:49)
[GCC 7.3.0]
NumPy 1.17.2
SciPy 1.3.1
Scikit-Learn 0.21.3
Metric-Learn 0.5.0
