Commit 1276040

[MRG] Improve docstrings: add them for Constraints class and methods and fix minor problems (#280)
* add docstrings for constraints class, pairs and chunks methods
* fix missing optional values and descriptions, uniformize
* fix indentation problems in docstring and uniformize
1 parent e739239 commit 1276040

12 files changed: 804 additions, 686 deletions


metric_learn/_util.py

Lines changed: 59 additions & 59 deletions
@@ -448,45 +448,45 @@ def _initialize_components(n_components, input, y=None, init='auto',
(Indentation-only change: the removed and added lines are textually identical, so the re-indented docstring is shown once.)

    The input labels (or not if there are no labels).

  init : string or numpy array, optional (default='auto')
    Initialization of the linear transformation. Possible options are
    'auto', 'pca', 'lda', 'identity', 'random', and a numpy array of shape
    (n_features_a, n_features_b).

    'auto'
      Depending on ``n_components``, the most reasonable initialization
      will be chosen. If ``n_components <= n_classes`` we use 'lda' (see
      the description of 'lda' init), as it uses labels information. If
      not, but ``n_components < min(n_features, n_samples)``, we use 'pca',
      as it projects data onto meaningful directions (those of higher
      variance). Otherwise, we just use 'identity'.

    'pca'
      ``n_components`` principal components of the inputs passed
      to :meth:`fit` will be used to initialize the transformation.
      (See `sklearn.decomposition.PCA`)

    'lda'
      ``min(n_components, n_classes)`` most discriminative
      components of the inputs passed to :meth:`fit` will be used to
      initialize the transformation. (If ``n_components > n_classes``,
      the rest of the components will be zero.) (See
      `sklearn.discriminant_analysis.LinearDiscriminantAnalysis`).
      This initialization is possible only if `has_classes == True`.

    'identity'
      The identity matrix. If ``n_components`` is strictly smaller than the
      dimensionality of the inputs passed to :meth:`fit`, the identity
      matrix will be truncated to the first ``n_components`` rows.

    'random'
      The initial transformation will be a random array of shape
      `(n_components, n_features)`. Each value is sampled from the
      standard normal distribution.

    numpy array
      n_features_b must match the dimensionality of the inputs passed to
      :meth:`fit` and n_features_a must be less than or equal to that.
      If ``n_components`` is not None, n_features_a must match it.

  verbose : bool
    Whether to print the details of the initialization or not.
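
The 'auto' rule above is easy to restate in code. Below is a minimal sketch of that selection logic, written against the quantities the docstring names (n_components, n_classes, n_features, n_samples); it mirrors the documented behavior, not the library's actual implementation:

    # Hypothetical helper restating the documented 'auto' rule; not the
    # real metric_learn code.
    def choose_auto_init(n_components, n_features, n_samples, n_classes=None):
        if n_classes is not None and n_components <= n_classes:
            return 'lda'       # uses label information
        if n_components < min(n_features, n_samples):
            return 'pca'       # projects onto high-variance directions
        return 'identity'      # possibly truncated to n_components rows

    print(choose_auto_init(n_components=2, n_features=10, n_samples=100,
                           n_classes=3))  # -> 'lda'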
@@ -606,26 +606,26 @@ def _initialize_metric_mahalanobis(input, init='identity', random_state=None,
(Indentation-only change; the re-indented block is shown once.)

    The input samples (can be tuples or regular samples).

  init : string or numpy array, optional (default='identity')
    Specification for the matrix to initialize. Possible options are
    'identity', 'covariance', 'random', and a numpy array of shape
    (n_features, n_features).

    'identity'
      An identity matrix of shape (n_features, n_features).

    'covariance'
      The (pseudo-)inverse covariance matrix (raises an error if the
      covariance matrix is not definite and `strict_pd == True`)

    'random'
      A random positive definite (PD) matrix of shape
      `(n_features, n_features)`, generated using
      `sklearn.datasets.make_spd_matrix`.

    numpy array
      A PSD matrix (or strictly PD if strict_pd==True) of
      shape (n_features, n_features), that will be used as such to
      initialize the metric, or set the prior.

  random_state : int or `numpy.RandomState` or None, optional (default=None)
    A pseudo random number generator object or a seed for it if int. If
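
The 'random' option above relies on an existing scikit-learn utility. Here is a minimal sketch of what that initialization produces; the eigenvalue check at the end is illustrative, not part of the library:

    import numpy as np
    from sklearn.datasets import make_spd_matrix

    # The 'random' init documented above: a random symmetric positive
    # definite matrix, which is always a valid Mahalanobis metric.
    n_features = 4
    M = make_spd_matrix(n_features, random_state=42)

    # All eigenvalues of an SPD matrix are strictly positive.
    assert np.all(np.linalg.eigvalsh(M) > 0)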

metric_learn/base_metric.py

Lines changed: 6 additions & 6 deletions
@@ -154,12 +154,12 @@ def transform(self, X):
(Indentation-only change; shown once.)

  Parameters
  ----------
  X : (n x d) matrix
    Data to transform.

  Returns
  -------
  transformed : (n x d) matrix
    Input data transformed to the metric space by :math:`XL^{\\top}`
  """
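
What transform computes is just the product named in the docstring. A minimal sketch with a stand-in for the fitted components_ matrix (the data and L here are made up for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 3))   # (n x d) input data
    L = rng.normal(size=(2, 3))   # stand-in for a fitted components_ (L)

    # transform() maps X into the learned metric space via X L^T.
    transformed = X @ L.T         # shape (5, 2)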

@@ -180,7 +180,7 @@ class MahalanobisMixin(six.with_metaclass(ABCMeta, BaseMetricLearner,
(Indentation-only change; shown once.)

  Attributes
  ----------
  components_ : `numpy.ndarray`, shape=(n_components, n_features)
    The learned linear transformation ``L``.
  """

  def score_pairs(self, pairs):
@@ -313,9 +313,9 @@ class _PairsClassifierMixin(BaseMetricLearner):
(Indentation-only change; shown once.)

  Attributes
  ----------
  threshold_ : `float`
    If the distance metric between two points is lower than this threshold,
    points will be classified as similar, otherwise they will be
    classified as dissimilar.
  """

  _tuple_size = 2  # number of points in a tuple, 2 for pairs
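
The threshold_ attribute defines the decision rule for pairs classifiers. A minimal sketch of that rule as described above (the function name and the +1/-1 encoding are illustrative, not the library's internals):

    import numpy as np

    def classify_pairs(distances, threshold):
        # Pairs with distance below the learned threshold_ are labeled
        # similar (+1), the rest dissimilar (-1).
        distances = np.asarray(distances)
        return np.where(distances < threshold, 1, -1)

    print(classify_pairs([0.2, 1.5, 0.7], threshold=0.8))  # [ 1 -1  1]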

metric_learn/constraints.py

Lines changed: 70 additions & 4 deletions
@@ -12,17 +12,60 @@
   class Constraints(object):
     """
-    Class to build constraints from labels.
+    Class to build constraints from labeled data.
 
-    See more in the :ref:`User Guide <supervised_version>`
+    See more in the :ref:`User Guide <supervised_version>`.
+
+    Parameters
+    ----------
+    partial_labels : `numpy.ndarray` of ints, shape=(n_samples,)
+      Array of labels, with -1 indicating unknown label.
+
+    Attributes
+    ----------
+    partial_labels : `numpy.ndarray` of ints, shape=(n_samples,)
+      Array of labels, with -1 indicating unknown label.
     """
+
     def __init__(self, partial_labels):
-      '''partial_labels : int arraylike, -1 indicating unknown label'''
       partial_labels = np.asanyarray(partial_labels, dtype=int)
       self.partial_labels = partial_labels
 
     def positive_negative_pairs(self, num_constraints, same_length=False,
                                 random_state=None):
+      """
+      Generates positive pairs and negative pairs from labeled data.
+
+      Positive pairs are formed by randomly drawing ``num_constraints`` pairs
+      of points with the same label. Negative pairs are formed by randomly
+      drawing ``num_constraints`` pairs of points with different labels.
+
+      In the case where it is not possible to generate enough positive or
+      negative pairs, a smaller number of pairs will be returned with a
+      warning.
+
+      Parameters
+      ----------
+      num_constraints : int
+        Number of positive and negative constraints to generate.
+
+      same_length : bool, optional (default=False)
+        If True, forces the number of positive and negative pairs to be
+        equal by ignoring some pairs from the larger set.
+
+      random_state : int or numpy.RandomState or None, optional (default=None)
+        A pseudo random number generator object or a seed for it if int.
+
+      Returns
+      -------
+      a : array-like, shape=(n_constraints,)
+        1D array of indicators for the left elements of positive pairs.
+
+      b : array-like, shape=(n_constraints,)
+        1D array of indicators for the right elements of positive pairs.
+
+      c : array-like, shape=(n_constraints,)
+        1D array of indicators for the left elements of negative pairs.
+
+      d : array-like, shape=(n_constraints,)
+        1D array of indicators for the right elements of negative pairs.
+      """
       random_state = check_random_state(random_state)
       a, b = self._pairs(num_constraints, same_label=True,
                          random_state=random_state)
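
For context, a small usage sketch of the API added above (the toy labels are made up; -1 marks an unknown label, per the class docstring):

    import numpy as np
    from metric_learn.constraints import Constraints

    y = np.array([0, 0, 1, 1, -1, 0, 1])  # -1 = unknown label
    cons = Constraints(y)
    a, b, c, d = cons.positive_negative_pairs(num_constraints=3,
                                              random_state=42)
    # (a[i], b[i]) share a label; (c[i], d[i]) have different labels.
    print(a, b, c, d)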
@@ -60,7 +103,30 @@ def _pairs(self, num_constraints, same_label=True, max_iter=10,
     def chunks(self, num_chunks=100, chunk_size=2, random_state=None):
       """
-      the random state object to be passed must be a numpy random seed
+      Generates chunks from labeled data.
+
+      Each of ``num_chunks`` chunks is composed of ``chunk_size`` points from
+      the same class drawn at random. Each point can belong to at most one
+      chunk.
+
+      In the case where there are not enough points to generate
+      ``num_chunks`` chunks of size ``chunk_size``, a ValueError will be
+      raised.
+
+      Parameters
+      ----------
+      num_chunks : int, optional (default=100)
+        Number of chunks to generate.
+
+      chunk_size : int, optional (default=2)
+        Number of points in each chunk.
+
+      random_state : int or numpy.RandomState or None, optional (default=None)
+        A pseudo random number generator object or a seed for it if int.
+
+      Returns
+      -------
+      chunks : array-like, shape=(n_samples,)
+        1D array of chunk indicators, where -1 indicates that the point does
+        not belong to any chunk.
       """
       random_state = check_random_state(random_state)
       chunks = -np.ones_like(self.partial_labels, dtype=int)
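
A matching usage sketch for chunks (same spirit as above; the printed assignment is one possible outcome, since chunk membership is drawn at random):

    import numpy as np
    from metric_learn.constraints import Constraints

    y = np.array([0, 0, 0, 1, 1, 1])
    cons = Constraints(y)

    # Two chunks of three same-class points each; -1 would mark points
    # left out of every chunk.
    chunk_ids = cons.chunks(num_chunks=2, chunk_size=3, random_state=42)
    print(chunk_ids)  # e.g. [0 0 0 1 1 1]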
