BUG: read_csv with dtype=bool[pyarrow] #53391

Merged: mroeschke merged 6 commits into pandas-dev:main from bug/csv/arrow/bool on May 31, 2023

mroeschke added the "IO CSV" (read_csv, to_csv) and "Arrow" (pyarrow functionality) labels on May 25, 2023
@@ -31,6 +31,7 @@ Bug fixes
- Bug in :func:`api.interchange.from_dataframe` was unnecessarily raising on bitmasks (:issue:`49888`)
- Bug in :func:`merge` when merging on datetime columns on different resolutions (:issue:`53200`)
- Bug in :func:`read_csv` raising ``OverflowError`` for ``engine="pyarrow"`` and ``parse_dates`` set (:issue:`53295`)
- Bug in :func:`read_csv` when defining ``dtype`` with ``bool[pyarrow]`` (:issue:`53390`)
Member
This is only for "c"/"python" parsers, right?

Might want to clarify that here.

Member Author
Yeah correct, updated.

False
True
"""
result = parser.read_csv(StringIO(data), dtype={"col": "bool[pyarrow]"})
Member
Can you add a test for dtype_backend=pyarrow as well?

Member Author
Sure, I modified test_EA_types to test this instead

lithomas1 modified the milestones: 2.0.2, 2.0.3 on May 26, 2023
lithomas1
Member

Ah sorry, I forgot the 2.0.2 release was today.

Can you move the whatsnew note to 2.0.3?

mroeschke merged commit 1a57d15 into pandas-dev:main on May 31, 2023
mroeschke deleted the bug/csv/arrow/bool branch on May 31, 2023 at 17:19

lumberbot-app bot commented May 31, 2023

Owee, I'm MrMeeseeks, Look at me.

There seems to be a conflict; please backport manually. Here are approximate instructions:

  1. Check out the backport branch and update it:
       git checkout 2.0.x
       git pull
  2. Cherry-pick the first parent branch of this PR on top of the older branch:
       git cherry-pick -x -m1 1a57d153732b888f98b42d87f590952b262641c3
  3. You will likely have some merge/cherry-pick conflicts here; fix them and commit:
       git commit -am 'Backport PR #53391: BUG: read_csv with dtype=bool[pyarrow]'
  4. Push to a named branch:
       git push YOURFORK 2.0.x:auto-backport-of-pr-53391-on-2.0.x
  5. Create a PR against branch 2.0.x. I would have named this PR:

"Backport PR #53391 on branch 2.0.x (BUG: read_csv with dtype=bool[pyarrow])"

And apply the correct labels and milestones.

Congratulations — you did some good work! Hopefully your backport PR will be tested by the continuous integration and merged soon!

Remember to remove the Still Needs Manual Backport label once the PR gets merged.

If these instructions are inaccurate, feel free to suggest an improvement.

mroeschke added a commit to mroeschke/pandas that referenced this pull request May 31, 2023
mroeschke added a commit that referenced this pull request May 31, 2023
* Backport PR #53391: BUG: read_csv with dtype=bool[pyarrow]

* Add xfail
topper-123 pushed a commit to topper-123/pandas that referenced this pull request Jun 5, 2023
* BUG: read_csv with dtype=bool[pyarrow]

* Use existing test instead

* Clarify whatsnew

* Move to 2.0.3
Daquisu pushed a commit to Daquisu/pandas that referenced this pull request Jul 8, 2023
* BUG: read_csv with dtype=bool[pyarrow]

* Use existing test instead

* Clarify whatsnew

* Move to 2.0.3
Labels
Arrow (pyarrow functionality), IO CSV (read_csv, to_csv)
Development

Successfully merging this pull request may close these issues.

BUG: Reading fails when dtype is defined with bool[pyarrow]
2 participants