Description
According to the documentation, the learned distance between `x` and `y` is expressed as `sqrt((x-y).dot(inv(M)).dot(x-y))`, which is equivalent to evaluating the Euclidean distance in the transformed space obtained from `dot(X, L.T)` for `n x d` data matrices `X`, with `L = inv(cholesky(M))`.
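
For concreteness, here is a small self-contained check of that equivalence in plain NumPy (the random positive-definite `M` and the variable names are mine, for illustration only):

```python
import numpy as np

rng = np.random.RandomState(0)
d = 4
A = rng.randn(d, d)
M = A.dot(A.T) + d * np.eye(d)  # some positive-definite Mahalanobis matrix

x, y = rng.randn(d), rng.randn(d)
diff = x - y

# Distance as stated in the documentation: sqrt((x-y).dot(inv(M)).dot(x-y))
d_doc = np.sqrt(diff.dot(np.linalg.inv(M)).dot(diff))

# Euclidean distance in the transformed space, with L = inv(cholesky(M))
L = np.linalg.inv(np.linalg.cholesky(M))
d_euclidean = np.linalg.norm(L.dot(diff))

assert np.isclose(d_doc, d_euclidean)
```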
However, if an algorithm does not learn a metric `M` but its corresponding transformer `L` directly, the current implementation of `BaseMetricLearner.metric()` computes `M` as `L.T.dot(L)`, which actually equals `inv(M)` according to the definition of `L` given above.
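
This follows directly from the definitions: if `M = C.dot(C.T)` with `C = cholesky(M)` and `L = inv(C)`, then `L.T.dot(L) = inv(C).T.dot(inv(C)) = inv(C.dot(C.T)) = inv(M)`. A minimal sketch:

```python
import numpy as np

rng = np.random.RandomState(0)
A = rng.randn(4, 4)
M = A.dot(A.T) + 4 * np.eye(4)  # positive-definite M

L = np.linalg.inv(np.linalg.cholesky(M))

# L.T.dot(L) reconstructs inv(M), not M itself:
assert np.allclose(L.T.dot(L), np.linalg.inv(M))
assert not np.allclose(L.T.dot(L), M)
```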
If I got the documentation right, then the following should hold:

`learner.transformer().T.dot(learner.transformer()) == learner.metric()`

This is, however, currently not the case for algorithms learning `M` (e.g., ITML, LSML, or even the covariance method).
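
A reproduction sketch against the covariance method (this assumes the `Covariance` learner and the `metric()`/`transformer()` API of the version this issue is written against; treat it as illustrative):

```python
import numpy as np
from sklearn.datasets import load_iris
from metric_learn import Covariance

X = load_iris().data

learner = Covariance()
learner.fit(X)

L = learner.transformer()
# Expected from the documentation, but currently fails:
print(np.allclose(L.T.dot(L), learner.metric()))  # False
```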
In addition, the definition of a "metric" is not even consistent across different learners: ITML and LSML learn a metric used as `sqrt((x-y).dot(M).dot(x-y))`, while the covariance method and SDML (I guess) use `inv(M)`, as stated in the readme.
Thus, the transformers returned by ITML and LSML are wrong!
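
To make the inconsistency concrete, the two conventions give different distances for the same matrix `M` (a sketch; the function names are mine):

```python
import numpy as np

def dist_direct(x, y, M):
    # ITML/LSML convention: sqrt((x-y).dot(M).dot(x-y))
    diff = x - y
    return np.sqrt(diff.dot(M).dot(diff))

def dist_inverse(x, y, M):
    # Readme/covariance convention: sqrt((x-y).dot(inv(M)).dot(x-y))
    diff = x - y
    return np.sqrt(diff.dot(np.linalg.inv(M)).dot(diff))

rng = np.random.RandomState(0)
A = rng.randn(3, 3)
M = A.dot(A.T) + 3 * np.eye(3)
x, y = rng.randn(3), rng.randn(3)

# In general these disagree, so "metric" means two different things:
print(dist_direct(x, y, M), dist_inverse(x, y, M))
```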
I would suggest harmonizing the terminology towards the one currently employed by ITML and LSML, since it also seems predominant in the literature, in my opinion. This would require removing the `inv()` from `BaseMetricLearner.transformer()`, as well as changing all learners relying on the current terminology, such as ITML and LSML.
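
Under that convention, `transformer()` could look like the following minimal sketch (illustrative only, not a drop-in patch for the current class):

```python
import numpy as np

def transformer(M):
    """Return L such that L.T.dot(L) == M, i.e.
    sqrt((x-y).dot(M).dot(x-y)) == norm(L.dot(x - y))."""
    return np.linalg.cholesky(M).T

# Quick check of the proposed identity:
rng = np.random.RandomState(0)
A = rng.randn(4, 4)
M = A.dot(A.T) + 4 * np.eye(4)
L = transformer(M)
assert np.allclose(L.T.dot(L), M)
```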