Add description of algorithms to the doc #178

hansen7 · 2019-03-06T23:27:10Z

Fixes #172
add some descriptions of the algorithms, as discussed in the issue #172

bellet · 2019-03-07T18:07:20Z

Hi, thanks for the PR! I will review as soon as possible (most likely early next week)

wdevazelhes

Thanks a lot for your contribution @hansen7 ! I just did a pass too with a few comments and nitpicks

@bellet and @perimosocordiae feel free to react on LMNN, on the convexity and slack variables remarks

wdevazelhes · 2019-03-08T07:43:09Z

doc/supervised.rst

+sharing the same label, and :math:`x_l` are all the other instances within 
+that region with different labels, :math:`\eta_{ij}, y_{ij} \in \{0, 1\}` 
+are both the indicators, :math:`\eta_{ij}` represents :math:`x_{j}` is the 
+k nearest neighbors(with same labels) of :math:`x_{i}`, :math:`y_{ij}=0` 


Suggested change

k nearest neighbors(with same labels) of :math:`x_{i}`, :math:`y_{ij}=0`

k nearest neighbors (with same labels) of :math:`x_{i}`, :math:`y_{ij}=0`

wdevazelhes · 2019-03-08T08:08:31Z

doc/supervised.rst

@@ -46,12 +46,31 @@ LMNN

 Large-margin nearest neighbor metric learning.

-`LMNN` learns a Mahanalobis distance metric in the kNN classification
+`LMNN` learns a Mahalanobis distance metric in the kNN classification
 setting using semidefinite programming. The learned metric attempts to keep


It's not in your modifs, but I just realized this would be righter than what we have currently:

- setting using semidefinite programming. The learned metric attempts to keep - k-nearest neighbors in the same class, while keeping examples from different + setting using semidefinite programming. The learned metric attempts to keep close + k-nearest neighbors from the same class, while keeping examples from different

wdevazelhes · 2019-03-08T08:10:49Z

doc/supervised.rst

+are both the indicators, :math:`\eta_{ij}` represents :math:`x_{j}` is the 
+k nearest neighbors(with same labels) of :math:`x_{i}`, :math:`y_{ij}=0` 
+indicates :math:`x_{i}, x_{j}` belong to different class, :math:`[\cdot]_+` 
+is Hinge loss. In the optimization process, the second term is replaced 


Suggested change

is Hinge loss. In the optimization process, the second term is replaced

is the Hinge loss. In the optimization process, the second term is replaced

wdevazelhes · 2019-03-08T08:12:44Z

doc/supervised.rst

+
+.. math::
+
+      \min_\mathbf{M}\sum_{i, j}\eta_{ij}||\mathbf{L}(x_i-x_j)||^2 + 


Suggested change

\min_\mathbf{M}\sum_{i, j}\eta_{ij}||\mathbf{L}(x_i-x_j)||^2 +

\min_\mathbf{L}\sum_{i, j}\eta_{ij}||\mathbf{L}(x_i-x_j)||^2 +

wdevazelhes · 2019-03-08T08:18:04Z

doc/supervised.rst

 setting using semidefinite programming. The learned metric attempts to keep
 k-nearest neighbors in the same class, while keeping examples from different
 classes separated by a large margin. This algorithm makes no assumptions about
 the distribution of the data.

+The distance is learned using the following convex optimization:


Since we are optimizing on L, I think our problem is not necessarily convex (it's the M version that is). I am not sure we should say semidefinite programming either (above, l. 49-50).

Yes. The problem is not convex, so replace by
"The distance is learned by solving the following optimization problem"

wdevazelhes · 2019-03-08T08:46:04Z

doc/supervised.rst

+k nearest neighbors(with same labels) of :math:`x_{i}`, :math:`y_{ij}=0` 
+indicates :math:`x_{i}, x_{j}` belong to different class, :math:`[\cdot]_+` 
+is Hinge loss. In the optimization process, the second term is replaced 
+by the slack variables :math:`\xi_{ijk}` for the sake of convexity. 


I am not sure in our implementation we use slack variables, since we optimize on L and in the paper they mention slack variables only when optimizing on M (and for "sake of convexity", also same as above, I think it's convex only if we optimize on M which we don't)

Indeed we are not, we simply solve the problem by (sub)gradient descent, so this sentence should be removed

wdevazelhes · 2019-03-08T08:53:06Z

metric_learn/nca.py

@@ -1,6 +1,40 @@
 """


The latex part below does not render well on my computer, but with this it's solved:

Suggested change

"""

r"""

wdevazelhes · 2019-03-08T09:23:12Z

metric_learn/nca.py

@@ -1,6 +1,40 @@
 """
 Neighborhood Components Analysis (NCA)
 Ported to Python from https://github.com/vomjom/nca
+
+Neighborhood Components Analysis (`NCA`) is a distance


Now that we're at it, I know that it's not completely done in metric-learn yet, but we should try to adopt scikit-learn conventions for the user guide, i.e. :

We should add short descriptions of the algorithms (without mathematical formulation) in the docstring of the class we want to document

We should put the full description in the user guide

So I think what you could do is :

put a reference at the top of metric_learn.nca.rst like this:
https://github.com/scikit-learn/scikit-learn/blame/4140657700cc55830347c871134c8e982d29fab5/doc/modules/neighbors.rst#L515-L517

Put the following description in the docstring of the class NCA (at the top of the file we should just keep a title, the authors, and the encoding) with this line as a reference to the user guide: https://github.com/scikit-learn/scikit-learn/blob/4140657700cc55830347c871134c8e982d29fab5/sklearn/neighbors/nca.py#L37

I agree, the full doc with math descriptions in the docstrings is too much

wdevazelhes · 2019-03-08T09:27:05Z

metric_learn/rca.py

@@ -7,6 +7,19 @@
 Those relevant dimensions are estimated using "chunklets",
 subsets of points that are known to belong to the same class.

+For a training set with :math:`n` training points in :math:`k`


Same remark as above here

wdevazelhes · 2019-03-08T09:27:40Z

metric_learn/sdml.py

 An efficient sparse metric learning in high-dimensional space via
-L1-penalized log-determinant regularization.
-ICML 2009
+double regularization: L1-penalized on the off-diagonal elements of Mahalanobis


Same remark here

bellet · 2019-03-14T14:43:21Z

Hi @hansen7, any reason you are closing this? I was planning to have a look today

bellet

@hansen7 I interpreted you closing the branch as a mistake, so I have reviewed your PR. Please confirm that you still want to work on this - otherwise we will take it from there (we need this for our next release).

There are some changes to make, but this is definitely a good start. When you make the modifications in the rst files make sure you also update the docstrings accordingly

bellet · 2019-03-14T17:03:06Z

doc/supervised.rst

@@ -46,12 +46,31 @@ LMNN

 Large-margin nearest neighbor metric learning.

-`LMNN` learns a Mahanalobis distance metric in the kNN classification
+`LMNN` learns a Mahalanobis distance metric in the kNN classification
 setting using semidefinite programming. The learned metric attempts to keep


I would remove the reference to semidefinite programming, as we are actually not solving the problem in the PSD matrix but in the unconstrained linear transformation matrix

bellet · 2019-03-14T17:03:59Z

doc/supervised.rst

 setting using semidefinite programming. The learned metric attempts to keep
 k-nearest neighbors in the same class, while keeping examples from different
 classes separated by a large margin. This algorithm makes no assumptions about
 the distribution of the data.

+The distance is learned using the following convex optimization:


Yes. The problem is not convex, so replace by
"The distance is learned by solving the following optimization problem"

bellet · 2019-03-14T17:22:09Z

doc/supervised.rst

+
+      \min_\mathbf{M}\sum_{i, j}\eta_{ij}||\mathbf{L}(x_i-x_j)||^2 + 
+      c\sum_{i, j, k}\eta_{ij}(1-y_{ij})[1+||\mathbf{L}(x_i-x_j)||^2-||
+      \mathbf{L}(x_i-x_l)||^2]_+)


x_l should be x_k

bellet · 2019-03-14T17:24:09Z

doc/supervised.rst

+      c\sum_{i, j, k}\eta_{ij}(1-y_{ij})[1+||\mathbf{L}(x_i-x_j)||^2-||
+      \mathbf{L}(x_i-x_l)||^2]_+)
+
+where :math:`x_i` is the 'target', :math:`x_j` are its k nearest neighbors 


We should avoid using the word "target" for x_i as in LMNN the term "target neighbors" are used for x_j

bellet · 2019-03-14T17:24:23Z

doc/supervised.rst

+      \mathbf{L}(x_i-x_l)||^2]_+)
+
+where :math:`x_i` is the 'target', :math:`x_j` are its k nearest neighbors 
+sharing the same label, and :math:`x_l` are all the other instances within 


bellet · 2019-03-14T17:55:56Z

doc/weakly_supervised.rst

@@ -317,6 +335,20 @@ implicit assumptions of MMC is that all classes form a compact set, i.e.,
 follow a unimodal distribution, which restricts the possible use-cases of this
 method. However, it is one of the earliest and a still often cited technique.

+This is the first Mahalanobis distance learning method, the algorithm aims at 
+maximizing the sum of distances between all the instances from the dissimilar 


this is not consistent with the first paragraph. While the two problems are equivalent, it is better to stick to the one which is actually solved in the code (which is to minimize the sum of distances between similar pairs while keeping the sum of dissimilar pairs larger than 1

Yes, I agree that more doc for additional algorithms can be added in a next PR. It would be nice to be able to merge this soon. @hansen7 do you expect to have time to address the small comments?

Hi, @wdevazelhes @bellet , thanks for the strong support! I have resolved the listed the issues, and add the mathematical descriptions of all the metric learning algorithm but the cov. Please see attached, for the latex part, it is all work locally, and I have standardise the notations:

uppercase bolded letters for matrices

lowercase bolded letters for vectors

uppercase unbolded letters for sets

lowercase unbolded letters for scalars

bellet · 2019-03-14T17:56:38Z

doc/weakly_supervised.rst

+.. math::
+
+      \max_{\mathbf{M}\in\mathbb{S}_+^d}\sum_{(x_i, x_j)\in\mathbf{D}} 
+      d_{\mathbf{M}}(x_i, x_j)\qquad \qquad \text{s.t.} \qquad 


for some reason the \text{s.t.} does not show properly on my locally compiled version of the doc

bellet · 2019-03-14T17:57:38Z

doc/weakly_supervised.rst

+
+      \max_{\mathbf{M}\in\mathbb{S}_+^d}\sum_{(x_i, x_j)\in\mathbf{D}} 
+      d_{\mathbf{M}}(x_i, x_j)\qquad \qquad \text{s.t.} \qquad 
+      \sum_{(x'_i, x'_j)\in\mathbf{S}} d^2_{\mathbf{M}}(x'_i, x'_j) \leq 1


the primes are not necessary, it is clear without them as they come from the different S

bellet · 2019-03-14T17:58:12Z

metric_learn/lmnn.py

+k nearest neighbors(with same labels) of :math:`x_{i}`, :math:`y_{ij}=0`
+indicates :math:`x_{i}, x_{j}` belong to different class, :math:`[\cdot]_+`
+is Hinge loss. In the optimization process, the second term is replaced
+by the slack variables :math:`\xi_{ijk}` for the sake of convexity.


We do not, see discussion above

bellet · 2019-03-14T17:59:21Z

metric_learn/nca.py

@@ -1,6 +1,40 @@
 """
 Neighborhood Components Analysis (NCA)
 Ported to Python from https://github.com/vomjom/nca
+
+Neighborhood Components Analysis (`NCA`) is a distance


I agree, the full doc with math descriptions in the docstrings is too much

hansen7 · 2019-03-15T20:59:51Z

@hansen7 I interpreted you closing the branch as a mistake, so I have reviewed your PR. Please confirm that you still want to work on this - otherwise we will take it from there (we need this for our next release).

There are some changes to make, but this is definitely a good start. When you make the modifications in the rst files make sure you also update the docstrings accordingly

Sorry @bellet , I sort of mixing up the procedures for pushing the codes.. I thought my supplements has already been added, I will check all beforehand comments and make adjustments this weekend.

bellet · 2019-03-22T09:51:05Z

Hi @hansen7 do you expect to have some time to work on this soon?

hansen7 · 2019-03-22T20:09:08Z

Hi @hansen7 do you expect to have some time to work on this soon?

yes... I will do it this weekend, sorry for the delay...

…details of sdml

hansen7 · 2019-03-28T23:11:27Z

Please, see the most recent recent push.. I have noticed that the arrangement of doc is quite different..plus, is it more systematic to add all the mathematical interpretations of the algorithms for itml, lfda, lsml, mlkr, rca as well?

hansen7 · 2019-03-28T23:23:18Z

@wdevazelhes @bellet , I have deleted the math part from the docstring, and use the same style for referencing as sklearn. But I am not sure about using Read more in the :ref:User Guide <nca>. properly, I am not sure here User Guide should be replaced with some other keywords or not...

bellet

Thanks for the update! I have a few small change requests.
I think it would indeed be great to have a basic mathematical description for each algorithm. I think RCA is okay as it is - but there is currently nothing for itml, lfda, lsml and mlkr

bellet · 2019-03-31T19:21:09Z

doc/supervised.rst

      \mathbf{L}(x_i-x_l)||^2]_+)

-where :math:`x_i` is the 'target', :math:`x_j` are its k nearest neighbors 
+where :math:`x_i` is an data point, :math:`x_j` are its k nearest neighbors 


a data point

bellet · 2019-03-31T19:24:20Z

doc/weakly_supervised.rst

-double regularization: L1-penalized on the off-diagonal elements of Mahalanobis
-matrix :math:`\mathbf{M}` and the log-determinant divergence between 
+double regularization: an L1-penalization on the off-diagonal elements of the 
+Mahalanobis matrix :math:`\mathbf{M}`, and a log-determinant divergence between 
 :math:`\mathbf{M}` and :math:`\mathbf{M_0}` (set as either :math:`\mathbf{I}` 
 or :math:`\mathbf{\Omega}^{-1}`, where :math:`\mathbf{\Omega}` is the 
 covariance matrix).


empirical covariance matrix

bellet · 2019-03-31T19:25:22Z

doc/weakly_supervised.rst

+
+.. math::
+
+    \min_{\mathbf{M}} = \text{tr}((M_0 + \eta XLX^{T})\cdot M) - \log\det M 


need to bold all matrices

bellet · 2019-03-31T19:25:43Z

doc/weakly_supervised.rst

+    \min_{\mathbf{M}} = \text{tr}((M_0 + \eta XLX^{T})\cdot M) - \log\det M 
+    + \lambda ||M||_{1, off}
+
+where :math:`\mathbf{X}=[x_1, x_2, ..., x_n]`, :math:`\mathbf{L = D − K}` is 


where :math:\mathbf{X}=[x_1, x_2, ..., x_n] is the training data

bellet · 2019-03-31T19:30:38Z

doc/weakly_supervised.rst

+:math:`\mathbf{K}` is the incidence matrix to encode the (dis)similarity 
+information as :math:`\mathbf{K}_{ij} = 1` if :math:`(x_i,x_j)\in \mathbf{S}`, 
+:math:`\mathbf{K}_{ij} = -1` if :math:`(x_i,x_j)\in \mathbf{D}`, 
+:math:`||\cdot||_{1, off}` is the off-diagonal L1 norm of :math:`\mathbf{M}`.


there are a couple of issues here:

please inverse the order: start with K, then D, then L so that everything needed has been defined when you introduce them

D for dissimilar pairs conflicts with the matrix D. I suggest not to introduce new notations and simply say things like :math:\mathbf{K}_{ij} = 1 if :math:(x_i,x_j) is a similar pair

bellet · 2019-03-31T19:30:53Z

doc/weakly_supervised.rst

+    + \lambda ||M||_{1, off}
+
+where :math:`\mathbf{X}=[x_1, x_2, ..., x_n]`, :math:`\mathbf{L = D − K}` is 
+the Laplacian matrix, :math:`\mathbf{D}` is a diagonal matrix whose diagonal 


second occurrence of "diagonal" is not needed

bellet · 2019-03-31T19:33:31Z

doc/weakly_supervised.rst

@@ -327,18 +348,17 @@ Side-Information, Xing et al., NIPS 2002

 `MMC` minimizes the sum of squared distances between similar examples, while
 enforcing the sum of distances between dissimilar examples to be greater than a
-certain margin. This leads to a convex and, thus, local-minima-free
+certain margin, default is 1. This leads to a convex and, thus, local-minima-free


we do not provide a way to change this margin (which is only a scaling factor anyway), so please simply say greater than 1.

wdevazelhes · 2019-04-09T15:27:25Z

Hi @hansen7 , yes, thanks for the update !

I think your PR is almost ready for merging:
For RCA you already have the mathematical formulation in both the User Guide and the docstring, what you just need to do is to remove it from the docstring and replace it by Read more in the :ref:`User Guide <rca>` (and add the .. rca_: reference in the User Guide), like for the others you did.

But I am not sure about using Read more in the :ref:User Guide <nca>. properly, I am not sure here User Guide should be replaced with some other keywords or not...

You did it well, there's just need to replace :ref:User Guide <nca> by :ref:User Guide <lmnn> for instance for lmnn

Regarding ITML, LSML, LFDA and MLKR, (and Covariance) which right now have no mathematical formulation, yes eventually we will do a mathematical formulation for the release, but if you don't have time to work on it, I think it's OK, we'll deal with them in another PR

Also, there's just the following problem that have appeared: I noticed that when we have a reference for the title, the link of the algorithm in the description are now broken (like when we have .. _nca:) (the link `NCA` links now to the head of the paragraph ( .. _nca:) instead of the docstring). In the previous commit is an example of how to fix it: replace `ALGOXXX` in the User Guide by :py:class:`ALGOXXX <metric_learn.algoxxx.ALGOXXX>` . You could do that for all the algorithm where there is a reference in the User Guide like .. _nca:.

And I guess as soon as you have resolved @bellet's comments, we should be able to merge the PR

Also, there are conflicts that you need to resolve (but it's ok, just in the gitignore you should accept both changes (I think we want to gitignore idea and DS_Store files)

bellet · 2019-04-15T14:19:30Z

Yes, I agree that more doc for additional algorithms can be added in a next PR. It would be nice to be able to merge this soon. @hansen7 do you expect to have time to address the small comments?

hansen7 · 2019-04-16T22:27:24Z

Yes, I agree that more doc for additional algorithms can be added in a next PR. It would be nice to be able to merge this soon. @hansen7 do you expect to have time to address the small comments?

Hi @wdevazelhes and @bellet , thanks for the support! I have resolved all the comments listed above, plz check the most recent branch! I have added mathematical descriptions for all the metric learning algorithms but the cov, fix the references issues, delete the redundant math parts from the docstrings, and formalised the notation in the latex math parts:

bolded uppercase letters for matrices
bolded lowercase letters for vectors
unbolded uppercase letters for sets
unbolded lowercase letters for scalars

bellet

Thanks for the work, @hansen7 !
I have indicated a few simple changes to make to clarify the formulations, then I think we are good to merge

bellet · 2019-04-17T12:40:30Z

doc/supervised.rst

+    \,\,\mathbf{A}_{i,j}(1/n-1/n_l) \qquad y_i = y_j\end{aligned}\right.\\
+
+here :math:`\mathbf{A}_{i,j}` is the :math:`(i,j)`-th entry of the affinity
+matrix :math:`\mathbf{A}`:


the affinity matrix is not defined. You should make clear that it is given as input to the algorithm, and explain its semantics. This is key as it is the part which enforces the "locality"

bellet · 2019-04-17T12:41:33Z

doc/supervised.rst

+
+    \hat{y}_i = \frac{\sum_{j\neq i}y_jk_{ij}}{\sum_{j\neq i}k_{ij}}
+
+The tractable property has enabled the distance metric learning problem to 


I would suggest to remove this paragraph and the following equation, to be consistent with other algorithms for which we have not described the algorithm but only the problem formulation

bellet · 2019-04-17T12:41:49Z

doc/weakly_supervised.rst

+
+.. math::
+
+    \min_\mathbf{A} \textbf{KL}(p(\mathbf{x}; \mathbf{A}_0) || p(\mathbf{x}; 


it would be better to write the optimization problem in terms of the logdet divergence, which gives a more "concrete" formulation than the KL version

bellet · 2019-04-17T12:43:58Z

doc/weakly_supervised.rst

+
+.. math::
+
+    L(d(\mathbf{x}_a, \mathbf{x}_b) < d(\mathbf{x}_c, \mathbf{x}_d)) = 


We could get rid of the notation L, which is redundant with H. You could define directly H(d_\mathbf{M}(\mathbf{x}_a, \mathbf{x}_b), which is what you use in the final problem formulation. Of course you can rename H by L then ;-)

I think you have forgotten this one

nevermind, got confused

bellet · 2019-04-17T12:48:34Z

doc/weakly_supervised.rst

+where :math:`\mathbf{X}=[\mathbf{x}_1, \mathbf{x}_2, ..., \mathbf{x}_n]` is 
+the training data, incidence matrix :math:`\mathbf{K}_{ij} = 1` if 
+:math:`(\mathbf{x}_i, \mathbf{x}_j)` is a similar pair, otherwise -1. The 
+Laplacian matrix :math:`\mathbf{L}` is calculated from :math:`\mathbf{K}` 


I think it is better to give the definition of K, even though it requires to define D. Without this, it is hard to understand

To be concise you could write something like:
The Laplacian matrix :math:\mathbf{L}=\mathbf{D}-\mathbf{K} is calculated from :math:\mathbf{K} and :math:\mathbf{D}, a diagonal matrix whose entries are the sums of the row elements of :math:\mathbf{K}.

wdevazelhes

Good job @hansen7 ! This will be a great addition to metric-learn for the next release

You just need to put back the references of type .. _nca in the documentation otherwise the link "See User Guide won't work", and address @bellet 's comments

Otherwise, LGTM

wdevazelhes · 2019-04-18T07:28:12Z

doc/supervised.rst

+`NCA` is a distance metric learning algorithm which aims to improve the 
+accuracy of nearest neighbors classification compared to the standard 
+Euclidean distance. The algorithm directly maximizes a stochastic variant 
+of the leave-one-out k-nearest neighbors(KNN) score on the training set. 


Suggested change

of the leave-one-out k-nearest neighbors(KNN) score on the training set.

of the leave-one-out k-nearest neighbors (KNN) score on the training set.

wdevazelhes · 2019-04-18T07:39:59Z

doc/supervised.rst

@@ -41,17 +41,36 @@ the covariance matrix of the input data. This is a simple baseline method.

    .. [1] On the Generalized Distance in Statistics, P.C.Mahalanobis, 1936

+


You should let the reference for LMNN here (right now the link from the docstring to the user guide doesn't work), this way:

.. _lmnn: LMNN --------

wdevazelhes · 2019-04-18T07:40:11Z

doc/supervised.rst

@@ -80,16 +99,43 @@ The two implementations differ slightly, and the C++ version is more complete.
       -margin -nearest-neighbor-classification>`_ Kilian Q. Weinberger, John
       Blitzer, Lawrence K. Saul

+


wdevazelhes · 2019-04-18T07:40:22Z

doc/supervised.rst

@@ -116,16 +162,54 @@ classification.
    .. [2] Wikipedia entry on Neighborhood Components Analysis
       https://en.wikipedia.org/wiki/Neighbourhood_components_analysis

+


wdevazelhes · 2019-04-18T07:40:44Z

doc/supervised.rst

@@ -155,13 +239,54 @@ LFDA is solved as a generalized eigenvalue problem.
 MLKR


wdevazelhes · 2019-04-18T07:40:51Z

doc/weakly_supervised.rst

@@ -151,18 +151,50 @@ tuples you're working with (pairs, triplets...). See the docstring of the
 Algorithms
 ==================

+


wdevazelhes · 2019-04-18T07:41:01Z

doc/weakly_supervised.rst

@@ -196,8 +228,60 @@ programming.
 LSML


wdevazelhes · 2019-04-18T07:41:11Z

doc/weakly_supervised.rst

@@ -228,8 +312,31 @@ Residual
 SDML


wdevazelhes · 2019-04-18T07:41:18Z

doc/weakly_supervised.rst

@@ -263,14 +370,28 @@ L1-penalized log-determinant regularization
 RCA


wdevazelhes · 2019-04-18T07:41:25Z

doc/weakly_supervised.rst

@@ -301,21 +421,33 @@ of points that are known to belong to the same class.
    .. [3]'Learning a Mahalanobis metric from equivalence constraints', JMLR
       2005

+


hansen7 · 2019-04-27T23:03:41Z

Thanks for the guide! There are still some format improvements can be pursued such as alignments, btw, what is the plan for the next version, will there be new algorithms added?

wdevazelhes · 2019-04-29T15:47:06Z

Good job @hansen7 , LGTM
You should should just resolve the conflicts by merging master into your branch

btw, what is the plan for the next version, will there be new algorithms added?

No, there will be no new algorithms, the next version will be mostly about API changes (more scikit-learn like API, allowing to GridSearch the Weakly Supervised Algorithms for instance), as well as fixes on algorithms like SDML, and of course, a better documentation, a lot coming from this PR :)

hansen7 · 2019-04-29T17:37:00Z

No, there will be no new algorithms, the next version will be mostly about API changes (more scikit-learn like API, allowing to GridSearch the Weakly Supervised Algorithms for instance), as well as fixes on algorithms like SDML, and of course, a better documentation, a lot coming from this PR :)

@wdevazelhes Cool, can I also join the development for the development part..I have noticed that the next version of scikit-learn also add implementations for the metrics learn as well:
sklearn.neighbors.NeighborhoodComponentsAnalysis

bellet · 2019-04-30T08:35:19Z

Hi @hansen7, thanks for the update. Can you please resolve the conflicts and we are good to go.

You're very welcome to contribute more to metric-learn :-) Besides what's already mentioned in the current issues, adding more algorithms and improving scalability by adding stochastic optimization solvers would be very useful.

hansen7 · 2019-05-02T17:08:00Z

it seems there are some encoding issues, is it okay for me to just add # coding: utf-8 at the head of each script

bellet · 2019-05-03T10:39:07Z

Looking at the CI report, it seems that you have inserted a non-ASCII character (probably due to copy/paste) in line 4-5 of itml.py. I think it is what appears to be a non-standard hyphen in "Kullback-Leibler". Please try to simply replace by normal hyphen

bellet · 2019-05-03T13:47:35Z

Fixed. Merging

bellet · 2019-05-03T13:49:04Z

Thanks a lot, @hansen7 ! You are very welcome to contribute more :-)

hansen7 added 2 commits March 6, 2019 23:19

Update_Doc_hc

afe0fb2

Update lmnn.py, test if the issue was raised up by the unicode encode

2fcd8ca

wdevazelhes approved these changes Mar 8, 2019

View reviewed changes

Update .gitignore

ecedf1d

hansen7 closed this Mar 14, 2019

hansen7 deleted the hc branch March 14, 2019 14:12

hansen7 restored the hc branch March 14, 2019 14:14

bellet reviewed Mar 14, 2019

View reviewed changes

bellet reopened this Mar 14, 2019

Update with mispelled notations, wrong interpretations, and add more …

7f2f0f6

…details of sdml

bellet requested changes Mar 31, 2019

View reviewed changes

Test: fix nca's reference

abea174

hansen7 added 2 commits April 16, 2019 18:27

Update Doc and Docstring

ffc5e3e

Update Doc and Docstring

19a00d2

bellet approved these changes Apr 17, 2019

View reviewed changes

bellet requested changes Apr 17, 2019

View reviewed changes

wdevazelhes reviewed Apr 18, 2019

View reviewed changes

Minor Equation Update and Reference

70af371

bellet approved these changes Apr 30, 2019

View reviewed changes

Merge branch 'master' into hc

328d7f5

fix bad hyphen

c50fbaf

bellet merged commit d4badc8 into scikit-learn-contrib:master May 3, 2019

bellet changed the title ~~Update_Doc_hc~~ Add description of algorithms to the doc May 3, 2019

	k nearest neighbors(with same labels) of :math:`x_{i}`, :math:`y_{ij}=0`
	k nearest neighbors (with same labels) of :math:`x_{i}`, :math:`y_{ij}=0`

	is Hinge loss. In the optimization process, the second term is replaced
	is the Hinge loss. In the optimization process, the second term is replaced


		.. math::

		\min_\mathbf{M}\sum_{i, j}\eta_{ij}\|\|\mathbf{L}(x_i-x_j)\|\|^2 +

	\min_\mathbf{M}\sum_{i, j}\eta_{ij}\|\|\mathbf{L}(x_i-x_j)\|\|^2 +
	\min_\mathbf{L}\sum_{i, j}\eta_{ij}\|\|\mathbf{L}(x_i-x_j)\|\|^2 +


		.. math::

		\min_{\mathbf{M}} = \text{tr}((M_0 + \eta XLX^{T})\cdot M) - \log\det M


		\hat{y}_i = \frac{\sum_{j\neq i}y_jk_{ij}}{\sum_{j\neq i}k_{ij}}

		The tractable property has enabled the distance metric learning problem to


		.. math::

		\min_\mathbf{A} \textbf{KL}(p(\mathbf{x}; \mathbf{A}_0) \|\| p(\mathbf{x};


		.. math::

		L(d(\mathbf{x}_a, \mathbf{x}_b) < d(\mathbf{x}_c, \mathbf{x}_d)) =

	of the leave-one-out k-nearest neighbors(KNN) score on the training set.
	of the leave-one-out k-nearest neighbors (KNN) score on the training set.

		@@ -41,17 +41,36 @@ the covariance matrix of the input data. This is a simple baseline method.

		.. [1] On the Generalized Distance in Statistics, P.C.Mahalanobis, 1936

		@@ -80,16 +99,43 @@ The two implementations differ slightly, and the C++ version is more complete.
		-margin -nearest-neighbor-classification>`_ Kilian Q. Weinberger, John
		Blitzer, Lawrence K. Saul

		@@ -116,16 +162,54 @@ classification.
		.. [2] Wikipedia entry on Neighborhood Components Analysis
		https://en.wikipedia.org/wiki/Neighbourhood_components_analysis

		@@ -155,13 +239,54 @@ LFDA is solved as a generalized eigenvalue problem.
		MLKR

		@@ -151,18 +151,50 @@ tuples you're working with (pairs, triplets...). See the docstring of the
		Algorithms
		==================

		@@ -263,14 +370,28 @@ L1-penalized log-determinant regularization
		RCA

		@@ -301,21 +421,33 @@ of points that are known to belong to the same class.
		.. [3]'Learning a Mahalanobis metric from equivalence constraints', JMLR
		2005

Add description of algorithms to the doc #178

Add description of algorithms to the doc #178

Uh oh!

Conversation

hansen7 commented Mar 6, 2019 • edited by bellet Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bellet commented Mar 7, 2019

Uh oh!

wdevazelhes left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bellet commented Mar 14, 2019

Uh oh!

bellet left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

hansen7 commented Mar 15, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bellet commented Mar 22, 2019

Uh oh!

hansen7 commented Mar 22, 2019

Uh oh!

hansen7 commented Mar 28, 2019

Uh oh!

hansen7 commented Mar 28, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bellet left a comment

Choose a reason for hiding this comment

Uh oh!

hansen7 commented Mar 6, 2019 •

edited by bellet

Loading

hansen7 commented Mar 15, 2019 •

edited

Loading

hansen7 commented Mar 28, 2019 •

edited

Loading