Closed
Description
Pandas version checks
- I have checked that the issue still exists on the latest versions of the docs on main here
Location of the documentation
https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html
Documentation problem
There are certain values that read_csv interprets as booleans even if the true_values and false_values parameters are set to None or an empty list. In my opinion, the documentation has two oversights here.
- Nothing in the documentation lists which string values are interpreted as booleans by default. I found true, True, TRUE, false, False, and FALSE, but there could be more; it would be nice to know the full set.
- There is no direct way to force read_csv to keep a column as strings when all of its values could be interpreted as booleans.
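A minimal sketch of the behavior described above, for anyone trying to reproduce it. The CSV contents and the column names flag and label are made up for illustration; true_values and false_values are passed explicitly as None to show that the inference still happens.

```python
import io

import pandas as pd

# Hypothetical sample data: every value in "flag" is one of the spellings
# listed above (True/TRUE/true, False/FALSE/false).
csv_text = "flag,label\nTrue,a\nFALSE,b\ntrue,c\n"

# true_values/false_values are left at None, yet the column is still inferred.
df = pd.read_csv(io.StringIO(csv_text), true_values=None, false_values=None)

# On the pandas versions I have tried, "flag" comes back as a boolean column
# rather than as the original strings.
print(df.dtypes)
print(df["flag"].tolist())
```

Note that the original spellings (True vs. TRUE vs. true) are lost once the column has been converted.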
Suggested fix for documentation
- The values that are interpreted as true and false no matter what should be documented.
- I managed to get around the issue by reading all columns as strings and then applying the to_numeric function to each column that should be numeric. If that is the only way to prevent those values from being interpreted as booleans, I think it should be noted on the page. I have an example of what I'm talking about here
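A rough sketch of the workaround described above, not necessarily the exact code from the linked example. The sample data, the column names, and the numeric_columns list are assumptions made for illustration. The last part shows a per-column dtype mapping, which also appears to keep the values as strings on the pandas versions I am aware of; treat that as something to verify rather than as documented behavior.

```python
import io

import pandas as pd

# Hypothetical sample data with one boolean-looking column and one numeric column.
csv_text = "flag,count\nTrue,1\nFalse,2\n"

# Step 1: read every column as a string so nothing is inferred as boolean.
df = pd.read_csv(io.StringIO(csv_text), dtype=str)

# Step 2: convert back only the columns that are meant to be numeric.
# numeric_columns is an illustrative name, not a pandas parameter.
numeric_columns = ["count"]
for col in numeric_columns:
    df[col] = pd.to_numeric(df[col])

print(df["flag"].tolist())   # values stay as the original strings
print(df["count"].tolist())  # values are numbers again

# Possible alternative (assumption to verify): request str only for the
# affected column and let pandas infer the rest as usual.
df_alt = pd.read_csv(io.StringIO(csv_text), dtype={"flag": str})
print(df_alt["flag"].tolist())
```

Converting only a named list of columns avoids calling to_numeric on the boolean-looking column itself, where strings such as True are not numbers.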