DOC: read_csv default encoding is not documented #49881

Closed
@wjandrea

Description

Pandas version checks

  • I have checked that the issue still exists on the latest versions of the docs on main

Location of the documentation

pandas.read_csv

Documentation problem

The default encoding is not mentioned for read_csv.

I tried finding it in the source code, and I'm not sure I followed it correctly, but it looks like it's ultimately specified in pandas.io.common.get_handle():

pandas/pandas/io/common.py

Lines 697 to 698 in b7708f0

# Windows does not default to utf-8. Set to utf-8 for a consistent behavior
encoding = encoding or "utf-8"

Defaulting to UTF-8 differs from open(), which could be surprising. The open() documentation says:

"The default encoding is platform dependent (whatever locale.getencoding() returns)"

(But that's not to say it's a bad thing. UTF-8 is a fine default.)
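To make the contrast concrete, here is a minimal standard-library sketch. The helper name resolve_encoding and the file name sample.csv are hypothetical, chosen for illustration; the helper only mirrors the `encoding or "utf-8"` fallback quoted above and is not pandas' actual code path.

```python
import locale
import os
import tempfile


def resolve_encoding(encoding=None):
    # Hypothetical helper mirroring the quoted fallback:
    # any falsy value (None, "") becomes "utf-8".
    return encoding or "utf-8"


# The locale encoding that the open() documentation names as its default.
# locale.getencoding() only exists on Python 3.11+, so fall back on older versions.
_getencoding = getattr(locale, "getencoding", None)
platform_default = _getencoding() if _getencoding else locale.getpreferredencoding(False)

text = "café,naïve\n"
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.csv")
    # Write and read with an explicit encoding so the result does not
    # depend on the platform; newline="" avoids newline translation.
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(text)
    with open(path, encoding=resolve_encoding(None), newline="") as f:
        round_tripped = f.read()

print("locale encoding:", platform_default)
print("fallback encoding:", resolve_encoding(None))
```

On a system whose locale encoding is not UTF-8, the two printed values differ, which is the potential surprise described above.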

Suggested fix for documentation

Mention the default encoding, i.e. UTF-8. Or, if it's documented elsewhere, add a link.
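Until the docs say so, callers can avoid relying on the default by passing encoding explicitly. The sketch below assumes pandas is installed and that the default behaves as described in this issue (UTF-8 when encoding is omitted); the file names and sample data are made up for illustration.

```python
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as tmp:
    utf8_path = os.path.join(tmp, "utf8.csv")      # hypothetical file name
    latin1_path = os.path.join(tmp, "latin1.csv")  # hypothetical file name

    # Write raw bytes so the on-disk encoding is known exactly.
    with open(utf8_path, "wb") as f:
        f.write("name,city\nZoë,Kraków\n".encode("utf-8"))
    with open(latin1_path, "wb") as f:
        f.write("name,city\nZoë,Málaga\n".encode("latin-1"))

    # No encoding given: per this issue, pandas falls back to UTF-8.
    df_default = pd.read_csv(utf8_path)
    # Same file with the encoding spelled out.
    df_explicit = pd.read_csv(utf8_path, encoding="utf-8")
    # A non-UTF-8 file has to name its encoding.
    df_latin1 = pd.read_csv(latin1_path, encoding="latin-1")

print(df_default.equals(df_explicit))
print(df_latin1)
```

Spelling out encoding= keeps the call readable for anyone who expects read_csv to follow open() and use the locale encoding.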
