DOC: Fix encoding of docstring validation for Windows #25466

kpapdac · 2019-02-27T21:04:20Z

On Windows, the validate_docstrings.py script fails because of an encoding error. It has been fixed here.

PR done in the London python sprints meetup.

codecov · 2019-02-27T22:00:48Z

Codecov Report

Merging #25466 into master will increase coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #25466      +/-   ##
==========================================
+ Coverage   91.74%   91.74%   +<.01%     
==========================================
  Files         173      173              
  Lines       52923    52923              
==========================================
+ Hits        48554    48555       +1     
+ Misses       4369     4368       -1

Flag	Coverage Δ
#multiple	`90.31% <ø> (ø)`	⬆️
#single	`41.73% <ø> (-0.01%)`	⬇️

Impacted Files	Coverage Δ
pandas/util/testing.py	`87.66% <0%> (+0.09%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fe1654f...639302e. Read the comment docs.

codecov · 2019-02-27T22:00:55Z

Codecov Report

Merging #25466 into master will decrease coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #25466      +/-   ##
==========================================
- Coverage   91.98%   91.98%   -0.01%     
==========================================
  Files         175      175              
  Lines       52374    52374              
==========================================
- Hits        48178    48174       -4     
- Misses       4196     4200       +4

Flag	Coverage Δ
#multiple	`90.53% <ø> (ø)`	⬆️
#single	`40.72% <ø> (-0.15%)`	⬇️

Impacted Files	Coverage Δ
pandas/io/gbq.py	`78.94% <0%> (-10.53%)`	⬇️
pandas/core/frame.py	`96.9% <0%> (-0.12%)`	⬇️

Powered by Codecov. Last update 94c8c94...5edb1e8. Read the comment docs.

WillAyd

Can you add a test case for the behavior?

kpapdac · 2019-02-28T00:24:52Z

running

.\scripts\validate_docstrings.py --errors=GL08 --format=json

returns

UnicodeEncodeError: 'charmap' codec can't encode character '\u2155' in position 254: character maps to <undefined>
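
For context, the 'charmap' codec named in this traceback is what Python reports for a legacy code page, which is the default on Windows when a file is opened in text mode without an explicit encoding. A minimal sketch of the failure, assuming the active code page was cp1252 (the thread does not say which one it was) and using a hypothetical helper `can_encode`:

```python
def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding`."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# U+2155 (VULGAR FRACTION ONE FIFTH) is the character named in the traceback.
one_fifth = "\u2155"

# cp1252 is an assumption here; it has no mapping for U+2155.
print(can_encode(one_fifth, "cp1252"))
# UTF-8 can represent every Unicode code point.
print(can_encode(one_fifth, "utf-8"))
```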

WillAyd · 2019-02-28T00:26:39Z

Sure but we need automated tests. You can place one in scripts/tests/test_validate_docstrings.py

WillAyd · 2019-03-06T18:29:22Z

@kpapdac can you merge master to resolve build failure and add a test for this?

kpapdac · 2019-03-06T20:49:59Z

yes, i'm on it!

kpapdac · 2019-03-07T06:24:52Z

ok, i gave it a go, i've never done this before but my best guess is to add the following to the TestDocstringClass class in scripts/tests/test_validate_docstrings.py

    @pytest.mark.parametrize('name', 'pandas.Series.str.isdecimal')
    def test_exit_status_for_write_validation_results_to_json(self, name):
        msg = 'UnicodeEncodeError in "{}"'.format(name)
        with pytest.raises(UnicodeEncodeError, match=msg):
            validate_docstrings.Docstring.validate_pep8(name)

For this specific function the test should fail as there is an encoding issue. I doubt that it works though as it passes the test without making any change to scripts\validate_docstrings.py. I have no idea how to move on with this, any help appreciated!

RjLi13 · 2019-03-10T01:57:02Z

Hey @kpapdac sorry if I'm misunderstanding, but shouldn't your written test pass if you make no change to scripts\validate_docstrings.py, since it raises a UnicodeEncodeError on Windows? I think your test should be checking that UnicodeEncodeError does not get raised when calling the validate_pep8 function on Windows.
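
The distinction drawn here can be sketched without pytest. Below, `write_docstring` is a hypothetical stand-in for the code under test, not the actual `validate_pep8` implementation, and cp1252 is assumed as the pre-fix platform default. The regression-style check calls the function with the problematic character and fails only if a UnicodeEncodeError escapes.

```python
import io


def write_docstring(content, encoding):
    """Hypothetical stand-in: write `content` to an in-memory text file."""
    buffer = io.BytesIO()
    with io.TextIOWrapper(buffer, encoding=encoding) as stream:
        stream.write(content)
        stream.flush()
        return buffer.getvalue()


def check_no_encode_error(encoding):
    """Regression-style check: pass only if no UnicodeEncodeError is raised."""
    try:
        write_docstring("one fifth: \u2155", encoding)
    except UnicodeEncodeError:
        return False
    return True


# Assumed pre-fix behaviour (legacy code page): the check fails.
print(check_no_encode_error("cp1252"))
# With an explicit UTF-8 encoding: the check passes.
print(check_no_encode_error("utf-8"))
```

A test written with `pytest.raises(UnicodeEncodeError)` asserts the opposite: it passes only while the bug is still present.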

TomAugspurger · 2019-03-11T19:49:21Z

scripts/validate_docstrings.py

@@ -543,8 +543,8 @@ def validate_pep8(self):
        application = flake8.main.application.Application()
        application.initialize(["--quiet"])

-        with tempfile.NamedTemporaryFile(mode='w') as file:
-            file.write(content)
+        with tempfile.NamedTemporaryFile(mode='wb') as file:


I'd prefer opening the NamedTemporaryFile with encoding='utf-8'.

Can you address comment from @TomAugspurger here?

I changed it as @TomAugspurger suggested.
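
A minimal sketch of this suggestion, assuming Python 3, where `tempfile.NamedTemporaryFile` accepts an `encoding` argument in text mode. The function name `roundtrip_via_tempfile` is hypothetical, and `w+` mode is used only so the sketch can read its own output back.

```python
import tempfile


def roundtrip_via_tempfile(content):
    """Write `content` to a UTF-8 temporary file and read it back."""
    with tempfile.NamedTemporaryFile(mode="w+", encoding="utf-8") as file:
        file.write(content)
        file.flush()  # make sure the text reaches the file on disk
        file.seek(0)
        return file.read()


text = "digits: \u00b3 \u2155"
print(roundtrip_via_tempfile(text) == text)
```

Opening in text mode with an explicit encoding leaves the `file.write(content)` call unchanged, whereas `mode='wb'` requires encoding the content by hand before writing.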

kpapdac · 2019-03-18T05:54:29Z

Hi @RjLi13 thank you for having a look at this and apologies for the delay. I think you're right. I changed the test to

    def missing_encoding_write_to_file(self):
        """
        Examples
        --------
        >>> try:
        ...   docstr = validate_docstrings.Docstring('pandas.Series.str.isdecimal')
        ...   result = docstr.validate_pep8()
        ...   next(result)
        ...   print(1)
        ... except:
        ...   0
        1
        """
        pass

so as to fail when the exception is raised. Now, when I change scripts\validate_docstrings.py as per the pull request, the test gets past the file.write(content.encode('utf-8')) part but somehow fails on application.run_checks([file.name]). It gives me a ValueError: write to closed file. Do you have any idea what's going wrong here? I appreciate your help!
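
The thread does not show what closed the file in this case. As a general illustration only, Python raises a ValueError when a write is attempted on a binary file object that has already been closed, for example after its `with` block has exited. The helper name `write_after_close` is hypothetical.

```python
import tempfile


def write_after_close():
    """Return the error raised by writing to an already-closed binary file."""
    with tempfile.NamedTemporaryFile(mode="wb") as file:
        file.write("\u2155".encode("utf-8"))
    # The with block has exited, so the file is closed (and deleted).
    try:
        file.write(b"more")
    except ValueError as err:
        return err
    return None


print(type(write_after_close()).__name__)
```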

RjLi13 · 2019-03-19T04:12:28Z

@kpapdac Push what you have so we can take a look at the code.

jreback · 2019-03-19T23:30:38Z

can you merge master

kpapdac · 2019-03-20T06:05:58Z

@jreback I did merge master 3 days ago, there should be no conflict now.
@RjLi13 I also pushed my changes in test_validate_docstrings.py

kpapdac · 2019-03-20T08:43:14Z

@jreback just to clarify I merged and pushed encode_error branch to my master and got a successful build from Travis. Is there anything else i should do to resolve this? Thanks

WillAyd · 2019-04-10T05:15:54Z

@kpapdac your comment above suggests you did but I don't see a test as part of this. Can you double check you included that?

kpapdac · 2019-04-14T17:29:13Z

hi @WillAyd, I added a test for the encoding issue in test_validate_docstrings.py in my master. The name of the test is test_encode_content_write_to_file.
Also apologies, I created a new pull request for this by mistake, so please ignore.
I'm a new contributor so please let me know if something doesn't make sense.

WillAyd · 2019-04-14T19:11:08Z

@kpapdac you'll need to add that test to this PR. Make it locally on your encode_error branch then do the following:

git fetch upstream
git merge upstream/master
git push origin encode_error

It should update this PR with your test

pep8speaks · 2019-04-14T20:52:07Z

Hello @kpapdac! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-05-06 19:50:57 UTC

kpapdac · 2019-04-16T05:04:16Z

thanks @WillAyd , does it now look ok?

WillAyd · 2019-04-17T20:22:08Z

scripts/tests/test_validate_docstrings.py

@@ -6,7 +6,7 @@
 import numpy as np
 import pandas as pd

-import validate_docstrings
+from scripts import validate_docstrings


I don't see how this is necessary - can you revert this (or let me know what I am missing)

no, my bad, it wasn't working for me when debugging, i reverted this.

WillAyd · 2019-04-17T20:22:40Z

scripts/tests/test_validate_docstrings.py

@@ -1052,6 +1052,11 @@ def test_raises_for_invalid_attribute_name(self, invalid_name):
        with pytest.raises(AttributeError, match=msg):
            validate_docstrings.Docstring(invalid_name)

+    @pytest.mark.parametrize('name', ['pandas.Series.str.isdecimal'])


If only using one value no need to parametrize - just use this locally in function

In the end I added a second test case to check.

WillAyd · 2019-04-17T20:23:31Z

scripts/validate_docstrings.py

@@ -543,8 +543,8 @@ def validate_pep8(self):
        application = flake8.main.application.Application()
        application.initialize(["--quiet"])

-        with tempfile.NamedTemporaryFile(mode='w') as file:
-            file.write(content)
+        with tempfile.NamedTemporaryFile(mode='wb') as file:


Can you address comment from @TomAugspurger here?

WillAyd · 2019-04-19T16:12:13Z

scripts/tests/test_validate_docstrings.py

@@ -5,7 +5,6 @@
 import pytest
 import numpy as np
 import pandas as pd
-


Can you also revert this? Probably fails isort or flake8 without it

WillAyd

Minor stuff. So both of the parametrized values were failing on Windows before, correct? Can't reproduce myself so just want to confirm the choice

WillAyd · 2019-04-19T20:09:21Z

scripts/tests/test_validate_docstrings.py

@@ -1052,6 +1052,11 @@ def test_raises_for_invalid_attribute_name(self, invalid_name):
        with pytest.raises(AttributeError, match=msg):
            validate_docstrings.Docstring(invalid_name)

+    @pytest.mark.parametrize('name', ['pandas.Series.str.isdecimal', 'pandas.Series.str.islower'])


Hmm I don't think our CI is hitting this directory but this would fail linting for being too long. Can you break lines here after the opening left bracket?

Yes, they both fail on Windows because their docstrings contain characters like '³' and '⅕'. I'll break the lines.
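
These characters presumably appear in the docstrings because they distinguish the string predicates from one another; that reading is an assumption, not something stated in the thread. The built-in `str` methods show the distinction:

```python
# '\u00b3' is SUPERSCRIPT THREE, '\u2155' is VULGAR FRACTION ONE FIFTH.
samples = {"superscript_three": "\u00b3", "one_fifth": "\u2155", "plain": "3"}

# For each sample: (isdecimal, isdigit, isnumeric)
results = {
    name: (char.isdecimal(), char.isdigit(), char.isnumeric())
    for name, char in samples.items()
}

for name, flags in results.items():
    print(name, flags)
```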

WillAyd · 2019-04-19T20:09:34Z

scripts/tests/test_validate_docstrings.py

@@ -1052,6 +1052,11 @@ def test_raises_for_invalid_attribute_name(self, invalid_name):
        with pytest.raises(AttributeError, match=msg):
            validate_docstrings.Docstring(invalid_name)

+    @pytest.mark.parametrize('name', ['pandas.Series.str.isdecimal', 'pandas.Series.str.islower'])
+    def test_encode_content_write_to_file(self, name):
+        docstr = validate_docstrings.Docstring(name).validate_pep8() # GH25466


Move comment one line up (OK to be standalone)

WillAyd

Looks good to me - @datapythonista any chance you can take a look at this one?

WillAyd · 2019-04-22T15:33:36Z

scripts/tests/test_validate_docstrings.py

@@ -1052,6 +1052,12 @@ def test_raises_for_invalid_attribute_name(self, invalid_name):
        with pytest.raises(AttributeError, match=msg):
            validate_docstrings.Docstring(invalid_name)

+    @pytest.mark.parametrize('name', [
+        'pandas.Series.str.isdecimal', 'pandas.Series.str.islower'])
+    def test_encode_content_write_to_file(self, name):  # GH25466


This comment should be on the next line. @kpapdac if you want to fix up otherwise can clean up before merging

yeah, I'm sorry, you said that above but I didn't get it..I'll fix it.

datapythonista

lgtm, added a couple of comments, but happy to get this merged as it is too

Thanks @kpapdac

kpapdac · 2019-04-27T05:56:54Z

thank you @datapythonista, @WillAyd for all the help

WillAyd · 2019-05-03T05:41:43Z

@kpapdac can you address comments from @datapythonista ? Should be able to get this in thereafter

WillAyd · 2019-05-07T01:26:58Z

Thanks @kpapdac !

kpapdac · 2019-05-07T04:19:07Z

Thank you!

DOC: Fix encoding of docstring validation for Windows

639302e

WillAyd requested changes Feb 27, 2019

WillAyd added Docs CI Continuous Integration Windows Windows OS labels Feb 27, 2019

jreback added this to the 0.25.0 milestone Mar 3, 2019

TomAugspurger reviewed Mar 11, 2019

kpapdac added a commit to kpapdac/pandas that referenced this pull request Mar 20, 2019

Attempt to add a test for PR pandas-dev#25466

cb67f01

kpapdac mentioned this pull request Apr 14, 2019

DOC: Add test for encoding of docstring validation for Windows #26084

Closed

kpapdac added a commit to kpapdac/pandas that referenced this pull request Apr 14, 2019

DOC: Add a test for encoding of docstring validation for Windows (PR pandas-dev#25466)

e9aebb2

kpapdac added 2 commits April 14, 2019 21:31

DOC: Add a test for encoding of docstring validation for Windows

52ad76f

merge to master

350fa92

DOC: Add a test for encoding of docstring validation for Windows

e05aa8b

WillAyd requested changes Apr 17, 2019

GH#25466 DOC:Fix encoding of docstring validation for Windows and add test

60d9df0

WillAyd requested changes Apr 19, 2019

kpapdac added 2 commits April 19, 2019 17:54

GH25466 revert empty line

1a4fe4f

Merge branch 'master' into encode_error

ff4076e

WillAyd requested changes Apr 19, 2019

kpapdac added 4 commits April 20, 2019 03:22

GH25466 add breakline and fix comment

0fd3d99

Merge branch 'encode_error' of https://github.com/kpapdac/pandas into encode_error

9fe7fbc

GH25466 add space before comment

70ee72a

Merge remote-tracking branch 'upstream/master' into encode_error

5f1d78a

WillAyd reviewed Apr 22, 2019

kpapdac added 2 commits April 26, 2019 08:54

Fix GH comment position

175a86f

Merge remote-tracking branch 'upstream/master' into encode_error

645ab84

datapythonista approved these changes Apr 26, 2019

kpapdac added 2 commits May 6, 2019 20:47

Formatting and change assert to check if list is empty

a9b09e3

Merge remote-tracking branch 'upstream/master' into encode_error

5edb1e8

WillAyd approved these changes May 7, 2019

WillAyd merged commit ca1a36a into pandas-dev:master May 7, 2019
