Skip to content

DOC: Fix EX02 in pandas.Series.factorize #32901

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Apr 1, 2020

Conversation

farhanreynaldo
Copy link
Contributor

  • closes #xxxx
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

Related to #27977.

output of python scripts/validate_docstrings.py pandas.Series.factorize:

################################################################################
################################## Validation ##################################
################################################################################

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!
Unfortunately, this is failing CI. See my inline comment

@@ -556,7 +556,7 @@ def factorize(

>>> codes, uniques = pd.factorize(['b', 'b', 'a', 'c', 'b'])
>>> codes
array([0, 0, 1, 2, 0])
array([0, 0, 1, 2, 0], dtype=int64)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whether this is displayed by numpy depends on your machine / OS, I think (eg on my linux laptop, it is not displayed). So we cannot simply add it. If we want to have the doctests pass locally in such cases, it probably needs some ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @jorisvandenbossche!
Could you elaborate more on ...?
So actually it is not really problem because the difference arises due to OS running?

Copy link
Member

@jorisvandenbossche jorisvandenbossche Mar 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it's still a problem for people who want to run it locally and then see errors (like you did I suppose?).

About the ..., see https://docs.python.org/3/library/doctest.html#doctest.ELLIPSIS

I am not fully sure how we deal with this in other places (cc @datapythonista ?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's surprising that numpy changes the __repr__ depending on the platform. I would report it to them, may be there is a good reason, or may be it's just some old code.

For fixing our doctrsts, I'd check if the same happens with __str__, and use print if it doesn't. Otherwise, I'd simply add an ignore. It's not a great solution, since it's shown to the user, but it's what we've been doing. Better than leave failing tests IMO.

Copy link
Member

@jorisvandenbossche jorisvandenbossche Mar 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is a well known issue with numpy and doctests (I think, not fully sure, but eg in scikit-learn we regularly ran into similar issues as well, now I think back of it). AFAIU, it specifically comes up here because it are ints (and if you are on a platform where int64 is not the default, it shows the dtype).

Using

array([0, 0, 1, 2, 0]...)

should work I think. @farhanreynaldo can you try if what this it passes locally for you?

Using print(..) is indeed another option (print never shows the dtype).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have made the changes using ellipsis but the test still throws an error. After doing a quick look, it seems like we need to enable optionflags=doctest.ELLIPSIS.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can add that to doctest_optionflags in setup.cfg, I think

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem still arises eventhough I have added ellipsis to doctest_optionflags. Am I missing something here?

image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks correct, if you update this PR with the changes to setup.cfg and adding the ..., we can have a look and give it a try. But seems like it should be working.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me this works locally, but I am also on a 64bit machine, so that might not be a good indication.

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, and CI is green. Is this working for you locally @farhanreynaldo ?

@farhanreynaldo
Copy link
Contributor Author

lgtm, and CI is green. Is this working for you locally @farhanreynaldo ?

Yes, it's working now.

@jorisvandenbossche jorisvandenbossche merged commit f5b3d7c into pandas-dev:master Apr 1, 2020
@jorisvandenbossche
Copy link
Member

Thanks @farhanreynaldo !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants