DOC: pandas.read_csv #48487

Closed
@ryanbaekr

Pandas version checks

  • I have checked that the issue still exists on the latest versions of the docs on main here

Location of the documentation

https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html

Documentation problem

read_csv interprets certain values as booleans even when the true_values and false_values parameters are set to None or an empty list. In my opinion, the documentation has two oversights here.

  1. Nothing in the documentation lists which string values are interpreted as booleans by default. I found true, True, TRUE, false, False, and FALSE, but there could be more; it would be nice to know.
  2. There is no direct way to force read_csv to treat a column as strings when all of its values could be interpreted as booleans.
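
For illustration, a minimal sketch of the inference described in item 1. The CSV text and column names are made up, and the example uses the default true_values and false_values rather than None or an empty list:

```python
import io

import pandas as pd

# Hypothetical sample data; the column names are invented for this example.
csv_text = "flag,label\nTrue,a\nFALSE,b\ntrue,c\n"

# No true_values / false_values are passed, so the defaults apply.
df = pd.read_csv(io.StringIO(csv_text))

# The 'flag' column comes back as booleans, not as the original strings.
print(df.dtypes)
print(df["flag"].tolist())
```

The three spellings in the flag column are among the six listed above; whether other spellings are also recognized is exactly what the documentation leaves open.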

Suggested fix for documentation

  1. The values that are interpreted as true and false no matter what should be documented.
  2. I managed to get around the issue by interpreting all columns as strings and then applying the to_numeric function to each column. If that is the only way to prevent those values from being interpreted as booleans, I think it should be noted on the page. I have an example of what I'm talking about here
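
A rough sketch of the workaround in item 2, not the linked example itself. The sample data and the helper name to_numeric_if_possible are hypothetical, and it assumes that pd.to_numeric raises on strings such as "True", so those columns are left as strings:

```python
import io

import pandas as pd

# Hypothetical sample data; the column names are invented for this example.
csv_text = "flag,count\nTrue,1\nFalse,2\n"

# Read every column as a string so nothing is inferred as boolean.
raw = pd.read_csv(io.StringIO(csv_text), dtype=str)


def to_numeric_if_possible(col):
    """Return col converted to a numeric dtype, or unchanged if that fails."""
    try:
        return pd.to_numeric(col)
    except (ValueError, TypeError):
        return col


# Convert column by column; non-numeric columns keep their string values.
df = raw.copy()
for name in df.columns:
    df[name] = to_numeric_if_possible(raw[name])

print(df.dtypes)
print(df["flag"].tolist())
```

The trade-off is that numeric inference has to be redone by hand after reading, which is why a note on the read_csv page would help.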
