DOC update DataFrame.to_csv write modes (#51839) #51881

Merged: mroeschke merged 8 commits into pandas-dev:main on Mar 17, 2023

Conversation

HamidrezaSK
Contributor

@HamidrezaSK HamidrezaSK commented Mar 10, 2023

@mroeschke mroeschke added Docs IO CSV read_csv, to_csv labels Mar 10, 2023
HamidrezaSK and others added 2 commits March 10, 2023 20:35
- 'a', open for writing, appending to the end of file if it exists.

Including 'b' or 't' in the mode parameter will inform Pandas whether
'path_or_buf' requires string or binary data. However, in most cases,
Member

I think path_or_buf needs backticks. I'm not sure whether the quoted strings should also be wrapped in backticks.

Member

@datapythonista datapythonista left a comment

Great job. Just one minor comment.

- 'x', open for exclusive creation, failing if the file already exists.
- 'a', open for writing, appending to the end of file if it exists.

Including 'b' or 't' in the mode parameter will inform Pandas whether
Member

We usually use pandas lowercase.

Comment on lines 3647 to 3648
`path_or_buf` requires string or binary data. However, in most cases,
this should not be necessary.
Member

This is a good point, but I don't think it makes sense to have binary CSV files. I think this came from a comment in the issue about using `b`, which makes sense in general, but I don't see how it could make sense for `to_csv` to save data in a binary file. Unless I'm missing something (correct me if I'm wrong), it's better to simply delete this.

Contributor Author

Thank you for your input.

I also think the same, but we had a discussion with @twoertwein earlier here #51881 (comment).

Could you please share your thoughts on this?

Member

If that part is more confusing than helpful, feel free to remove it

Member

Thanks for clarifying. I agree with these comments in general, but I don't think it's possible to use binary data in this particular case. CSV relies on the delimiter, the quote character and the line break to understand the structure. Binary data could contain any of those and break a CSV file. Unless I'm missing something, it's not that "b" shouldn't be used in most cases; it can never be used. To me it makes more sense to remove that comment, since I think it gives the impression that users can actually use binary CSVs in some cases, and some may start researching it. So, probably a bit confusing and distracting.

Feel free to disagree, but that's how I feel about that comment.

Remove the 'b' and 't' modes from the description.
Member

@datapythonista datapythonista left a comment

lgtm, thanks @HamidrezaSK

I'm not sure if we could remove the "open for ..." parts and be more concise, something like: 'w', truncate the file before writing; 'x', fail if the file already exists; 'a', append to the end of the file if it exists. Not sure if the beginning adds more noise than value. But either way, great addition, thanks!
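
For illustration, here is a minimal sketch of how the three modes discussed in this thread behave with `DataFrame.to_csv`. The DataFrame contents, the temporary directory, and the file name `out.csv` are assumptions made up for the example; they are not part of the PR.

```python
import os
import tempfile

import pandas as pd

# Hypothetical two-row frame, used only for this illustration.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.csv")

    # 'w' (the default): create the file, truncating it first if it exists.
    df.to_csv(path, mode="w", index=False)

    # 'a': append to the end of the existing file. header=False avoids
    # writing the column names a second time.
    df.to_csv(path, mode="a", index=False, header=False)

    with open(path) as f:
        lines = f.read().splitlines()

    # 'x': exclusive creation, which fails here because the file exists.
    try:
        df.to_csv(path, mode="x", index=False)
        x_failed = False
    except FileExistsError:
        x_failed = True

print(lines)
print(x_failed)
```

Note that appending with the default `header=True` would repeat the header row in the middle of the file, which is why the sketch passes `header=False` on the second write.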

Modify 'w', 'a', and 'x' write mode's description.
@mroeschke mroeschke added this to the 2.1 milestone Mar 17, 2023
Member

@mroeschke mroeschke left a comment

Nice. I think this explanation is a lot clearer.

@mroeschke mroeschke merged commit fb282b6 into pandas-dev:main Mar 17, 2023
@mroeschke
Member

Thanks @HamidrezaSK

Labels
Docs IO CSV read_csv, to_csv
Development

Successfully merging this pull request may close these issues.

DOC: Improve to_csv mode documentation
4 participants