Skip to content

[MRG] Add documentation for supervised classes and add __init__ docstrings to doc #115

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

wdevazelhes
Copy link
Member

Fixes #22

Here is a small proposal for adding some documentation for supervised classes.

@wdevazelhes
Copy link
Member Author

Docstrings in __init__.py of classes currently does not appear in the doc, I wonder if we should put them in the class docstring (below the class name), or rather add some directives to sphinx not to ignore them

doc/index.rst Outdated
Supervised Algorithms
---------------------
Supervised metric learning algorithms take as inputs points `X` and targets
`y`, and learn to make points from the same class close to each other, and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

target class labels

Copy link
Member Author

@wdevazelhes wdevazelhes Aug 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe "target labels" only, to include MLKR ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just realized the rest of the sentence ("[...]from the same class[...]") didn't include MLKR so I'll update this too

@perimosocordiae
Copy link
Contributor

+1 for telling Sphinx to include the __init__ docstrings.

@wdevazelhes
Copy link
Member Author

wdevazelhes commented Aug 21, 2018

+1 for telling Sphinx to include the init docstrings.

According to this comment there are several ways to do that, I went for the one that includes the special-members because I found it easy to read and quite clean, but maybe option 1 (to include some code in conf.py), would be better, avoiding to add a line of code in every file ?

@wdevazelhes wdevazelhes changed the title [WIP] Add documentation for supervised classes [MRG] Add documentation for supervised classes and add __init__ docstrings to doc Aug 21, 2018
Copy link
Member

@bellet bellet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM apart from a few nitpicks. Since the documentation will need to be adapted when we incorporate the new API, it is probably not worth spending too much time on improving the doc right now.

doc/index.rst Outdated
metric_learn.nca
metric_learn.lfda
metric_learn.mlkr

Semi-Supervised Algorithms
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather use something like "Weakly supervised algorithms" to avoid confusion with the classic semi-supervised setting (i.e., a subset of points are labeled and other are not).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree
Done

doc/index.rst Outdated
Supervised Algorithms
---------------------
Supervised metric learning algorithms take as inputs points `X` and target
labels `y`, and learn to make points from the same class (for classification)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

learn a distance metric that makes points...

doc/index.rst Outdated

Semi-Supervised Algorithms
--------------------------
Semi-supervised algorithms do the same, but work on a weaker information about
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe avoid "do the same" to have a more self consistent section
quick suggestion: "Weakly supervised algorithms work on weaker information about the data points than supervised algorithms. Rather than labeled points, they take as input similarity judgments on tuples of data points, for instance pairs of similar and dissimilar points. Refer to the documentation of each algorithm for its particular form of input data."

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I agree this is clearer
Done

num_labeled : int, optional
number of labels to preserve for training
num_labeled : int, optional (default=np.inf)
number of labels to preserve for training. If np.inf (default),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is meant by "preserve for training" needs to be clarified?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about: "number of labeled points to keep for building pairs. Extra
labeled points will be considered unlabeled, and ignored as such.
Use np.inf (default) to use all labeled points." ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact IMO this parameter is useless. I will open an issue


`LSML_Supervised` creates quadruplets from labeled samples by taking two
samples from the same class, and two samples from different classes, and
concatenating them. This way it builds quadruplets where the two first
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

", and concatenating them" seems useless

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed
Done

doc/index.rst Outdated

Each metric supports the following methods:
Note that each semi-supervised algorithm has a supervised version of the form
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weakly

@bellet bellet merged commit 22f60dd into scikit-learn-contrib:master Aug 31, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants