
Commit 23d0746

wdevazelhes authored and bellet committed

[MRG] New api design (#139)
1 parent ac0e230 commit 23d0746

36 files changed (+3534 −704 lines)

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -5,3 +5,4 @@ dist/
 .coverage
 htmlcov/
 .cache/
+doc/auto_examples/*

README.rst

Lines changed: 2 additions & 21 deletions
@@ -34,27 +34,8 @@ package installed).
 
 **Usage**
 
-For full usage examples, see the `sphinx documentation`_.
-
-Each metric is a subclass of ``BaseMetricLearner``, which provides
-default implementations for the methods ``metric``, ``transformer``, and
-``transform``. Subclasses must provide an implementation for either
-``metric`` or ``transformer``.
-
-For an instance of a metric learner named ``foo`` learning from a set of
-``d``-dimensional points, ``foo.metric()`` returns a ``d x d``
-matrix ``M`` such that the distance between vectors ``x`` and ``y`` is
-expressed ``sqrt((x-y).dot(M).dot(x-y))``.
-Using scipy's ``pdist`` function, this would look like
-``pdist(X, metric='mahalanobis', VI=foo.metric())``.
-
-In the same scenario, ``foo.transformer()`` returns a ``d x d``
-matrix ``L`` such that a vector ``x`` can be represented in the learned
-space as the vector ``x.dot(L.T)``.
-
-For convenience, the function ``foo.transform(X)`` is provided for
-converting a matrix of points (``X``) into the learned space, in which
-standard Euclidean distance can be used.
+See the `sphinx documentation`_ for full documentation about installation, API,
+usage, and examples.
 
 **Notes**
 

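The removed README passage relates ``foo.metric()`` to scipy's ``pdist``. That equivalence can be sketched with plain numpy, using a hand-built positive-definite matrix ``M`` as a stand-in for a fitted learner's metric (no metric-learn call involved):

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.RandomState(0)
X = rng.rand(5, 3)           # 5 points in d=3 dimensions

# Stand-in for foo.metric(): any symmetric positive-definite matrix M.
A = rng.rand(3, 3)
M = A.T @ A + np.eye(3)

# Mahalanobis distances: sqrt((x - y) M (x - y)^T) for each pair of rows.
d_maha = pdist(X, metric='mahalanobis', VI=M)

# Equivalent view: factor M = L^T L and embed points as x.dot(L.T);
# plain Euclidean distance in the embedded space matches.
L = np.linalg.cholesky(M).T  # upper-triangular factor, M = L.T @ L
d_eucl = pdist(X @ L.T, metric='euclidean')

assert np.allclose(d_maha, d_eucl)
```

This is exactly the relation the README described between ``metric()``, ``transformer()``, and ``transform()``.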
bench/benchmarks/iris.py

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@
     'LMNN': metric_learn.LMNN(k=5, learn_rate=1e-6, verbose=False),
     'LSML_Supervised': metric_learn.LSML_Supervised(num_constraints=200),
     'MLKR': metric_learn.MLKR(),
-    'NCA': metric_learn.NCA(max_iter=700, learning_rate=0.01, num_dims=2),
+    'NCA': metric_learn.NCA(max_iter=700, num_dims=2),
     'RCA_Supervised': metric_learn.RCA_Supervised(dim=2, num_chunks=30,
                                                   chunk_size=2),
     'SDML_Supervised': metric_learn.SDML_Supervised(num_constraints=1500),

doc/conf.py

Lines changed: 4 additions & 0 deletions
@@ -7,6 +7,7 @@
     'sphinx.ext.viewcode',
     'sphinx.ext.mathjax',
     'numpydoc',
+    'sphinx_gallery.gen_gallery'
 ]
 
 templates_path = ['_templates']
@@ -31,3 +32,6 @@
 html_static_path = ['_static']
 htmlhelp_basename = 'metric-learndoc'
 
+# Option to only need single backticks to refer to symbols
+default_role = 'any'
+

doc/getting_started.rst

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
+###############
+Getting started
+###############
+
+Installation and Setup
+======================
+
+Run ``pip install metric-learn`` to download and install from PyPI.
+
+Alternately, download the source repository and run:
+
+- ``python setup.py install`` for default installation.
+- ``python setup.py test`` to run all tests.
+
+**Dependencies**
+
+- Python 2.7+, 3.4+
+- numpy, scipy, scikit-learn
+- (for running the examples only: matplotlib)
+
+**Notes**
+
+If a recent version of the Shogun Python modular (``modshogun``) library
+is available, the LMNN implementation will use the fast C++ version from
+there. The two implementations differ slightly, and the C++ version is
+more complete.
+
+
+Quick start
+===========
+
+This example loads the iris dataset, and evaluates a k-nearest neighbors
+algorithm on an embedding space learned with `NCA`.
+
+>>> from metric_learn import NCA
+>>> from sklearn.datasets import load_iris
+>>> from sklearn.model_selection import cross_val_score
+>>> from sklearn.pipeline import make_pipeline
+>>>
+>>> X, y = load_iris(return_X_y=True)
+>>> clf = make_pipeline(NCA(), KNeighborsClassifier())
+>>> cross_val_score(clf, X, y)

doc/index.rst

Lines changed: 12 additions & 84 deletions
@@ -2,103 +2,31 @@ metric-learn: Metric Learning in Python
 =======================================
 |License| |PyPI version|
 
-Distance metrics are widely used in the machine learning literature.
-Traditionally, practicioners would choose a standard distance metric
-(Euclidean, City-Block, Cosine, etc.) using a priori knowledge of
-the domain.
-Distance metric learning (or simply, metric learning) is the sub-field of
-machine learning dedicated to automatically constructing optimal distance
-metrics.
-
-This package contains efficient Python implementations of several popular
-metric learning algorithms.
-
-Supervised Algorithms
----------------------
-Supervised metric learning algorithms take as inputs points `X` and target
-labels `y`, and learn a distance matrix that make points from the same class
-(for classification) or with close target value (for regression) close to
-each other, and points from different classes or with distant target values
-far away from each other.
+Welcome to metric-learn's documentation !
+-----------------------------------------
 
 .. toctree::
-   :maxdepth: 1
-
-   metric_learn.covariance
-   metric_learn.lmnn
-   metric_learn.nca
-   metric_learn.lfda
-   metric_learn.mlkr
+   :maxdepth: 2
 
-Weakly-Supervised Algorithms
---------------------------
-Weakly supervised algorithms work on weaker information about the data points
-than supervised algorithms. Rather than labeled points, they take as input
-similarity judgments on tuples of data points, for instance pairs of similar
-and dissimilar points. Refer to the documentation of each algorithm for its
-particular form of input data.
+   getting_started
 
 .. toctree::
-   :maxdepth: 1
-
-   metric_learn.itml
-   metric_learn.lsml
-   metric_learn.sdml
-   metric_learn.rca
-   metric_learn.mmc
-
-Note that each weakly-supervised algorithm has a supervised version of the form
-`*_Supervised` where similarity constraints are generated from
-the labels information and passed to the underlying algorithm.
-
-Each metric learning algorithm supports the following methods:
-
-- ``fit(...)``, which learns the model.
-- ``transformer()``, which returns a transformation matrix
-  :math:`L \in \mathbb{R}^{D \times d}`, which can be used to convert a
-  data matrix :math:`X \in \mathbb{R}^{n \times d}` to the
-  :math:`D`-dimensional learned metric space :math:`X L^{\top}`,
-  in which standard Euclidean distances may be used.
-- ``transform(X)``, which applies the aforementioned transformation.
-- ``metric()``, which returns a Mahalanobis matrix
-  :math:`M = L^{\top}L` such that distance between vectors ``x`` and
-  ``y`` can be computed as :math:`\left(x-y\right)M\left(x-y\right)`.
-
-
-Installation and Setup
-======================
-
-Run ``pip install metric-learn`` to download and install from PyPI.
+   :maxdepth: 2
 
-Alternately, download the source repository and run:
+   user_guide
 
-- ``python setup.py install`` for default installation.
-- ``python setup.py test`` to run all tests.
-
-**Dependencies**
-
-- Python 2.7+, 3.4+
-- numpy, scipy, scikit-learn
-- (for running the examples only: matplotlib)
+.. toctree::
+   :maxdepth: 2
 
-**Notes**
+   Package Overview <metric_learn>
 
-If a recent version of the Shogun Python modular (``modshogun``) library
-is available, the LMNN implementation will use the fast C++ version from
-there. The two implementations differ slightly, and the C++ version is
-more complete.
+.. toctree::
+   :maxdepth: 2
 
-Navigation
-----------
+   auto_examples/index
 
 :ref:`genindex` | :ref:`modindex` | :ref:`search`
 
-.. toctree::
-   :maxdepth: 4
-   :hidden:
-
-   Package Overview <metric_learn>
-
 .. |PyPI version| image:: https://badge.fury.io/py/metric-learn.svg
    :target: http://badge.fury.io/py/metric-learn
 .. |License| image:: http://img.shields.io/:license-mit-blue.svg?style=flat

doc/introduction.rst

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+============
+Introduction
+============
+
+Distance metrics are widely used in the machine learning literature.
+Traditionally, practitioners would choose a standard distance metric
+(Euclidean, City-Block, Cosine, etc.) using a priori knowledge of
+the domain.
+Distance metric learning (or simply, metric learning) is the sub-field of
+machine learning dedicated to automatically constructing task-specific
+distance metrics from (weakly) supervised data.
+The learned distance metric often corresponds to a Euclidean distance in a new
+embedding space, hence distance metric learning can be seen as a form of
+representation learning.
+
+This package contains efficient Python implementations of several popular
+metric learning algorithms, compatible with scikit-learn. This makes it
+possible to use all the scikit-learn routines for pipelining and model
+selection with metric learning algorithms.
+
+
+Currently, each metric learning algorithm supports the following methods:
+
+- ``fit(...)``, which learns the model.
+- ``metric()``, which returns a Mahalanobis matrix
+  :math:`M = L^{\top}L` such that the distance between vectors ``x`` and
+  ``y`` can be computed as :math:`\sqrt{\left(x-y\right)^{\top}M\left(x-y\right)}`.
+- ``transformer_from_metric(metric)``, which returns a transformation matrix
+  :math:`L \in \mathbb{R}^{D \times d}`, which can be used to convert a
+  data matrix :math:`X \in \mathbb{R}^{n \times d}` to the
+  :math:`D`-dimensional learned metric space :math:`X L^{\top}`,
+  in which standard Euclidean distances may be used.
+- ``transform(X)``, which applies the aforementioned transformation.
+- ``score_pairs(pairs)``, which returns the distance between pairs of
+  points. ``pairs`` should be a 3D array-like of pairs of shape ``(n_pairs,
+  2, n_features)``, or it can be a 2D array-like of pair indicators of
+  shape ``(n_pairs, 2)`` (see section :ref:`preprocessor_section` for more
+  details).

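The relation between ``metric()`` and ``transform(X)`` described in the new introduction can be checked numerically. A minimal numpy sketch, where ``L`` and the points are random stand-ins rather than the output of a fitted learner:

```python
import numpy as np

rng = np.random.RandomState(42)
L = rng.rand(2, 3)   # transformation matrix mapping d=3 inputs to D=2
M = L.T @ L          # the Mahalanobis matrix that metric() would expose

x, y = rng.rand(3), rng.rand(3)

# metric() view: distance as sqrt((x - y)^T M (x - y))
d_metric = np.sqrt((x - y) @ M @ (x - y))

# transform() view: embed points as x L^T, then use plain Euclidean distance
d_embed = np.linalg.norm(x @ L.T - y @ L.T)

assert np.isclose(d_metric, d_embed)
```

Both views compute the same distance because the squared Euclidean norm of :math:`(x-y)L^{\top}` expands to :math:`(x-y)^{\top}L^{\top}L(x-y) = (x-y)^{\top}M(x-y)`.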
doc/metric_learn.nca.rst

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ Example Code
     X = iris_data['data']
     Y = iris_data['target']
 
-    nca = NCA(max_iter=1000, learning_rate=0.01)
+    nca = NCA(max_iter=1000)
     nca.fit(X, Y)
 
 References

doc/metric_learn.rst

Lines changed: 2 additions & 10 deletions
@@ -1,8 +1,8 @@
 metric_learn package
 ====================
 
-Submodules
-----------
+Module Contents
+---------------
 
 .. toctree::
 
@@ -16,11 +16,3 @@ Submodules
    metric_learn.nca
    metric_learn.rca
    metric_learn.sdml
-
-Module contents
----------------
-
-.. automodule:: metric_learn
-   :members:
-   :undoc-members:
-   :show-inheritance:

0 commit comments