DOC: Weird documentation (or behavior) of pd.read_csv escapechar= arg #23717

Closed
@keanpantraw

Problem description

The documentation for pandas.read_csv states:

escapechar : str (length 1), default None

    One-character string used to escape delimiter when quoting is QUOTE_NONE.

In practice, however, the following CSV (john.csv) parses correctly:

id, name
1, "Hello, my name is \"John\""

using this code:

import pandas as pd

df = pd.read_csv("john.csv", quotechar='"', escapechar="\\")
print(df.name) # prints Hello, my name is "John"
print(df.id) # prints 1

The same documentation also states that the default quoting mode is QUOTE_MINIMAL, so the example above is not running under QUOTE_NONE.

Trying the same with quoting=csv.QUOTE_NONE gives an incorrect result:

import csv

import pandas as pd

df = pd.read_csv("john.csv", quotechar='"', escapechar="\\", quoting=csv.QUOTE_NONE)
print(df.name) # prints my name is John""
print(df.id) # prints 1, "Hello

Trying the same without escapechar also gives an incorrect result:

import pandas as pd

df = pd.read_csv("john.csv", quotechar='"')
print(df.name) # prints Hello, my name is \John\""
print(df.id) # prints 1

It looks like the documentation is misleading here: it states that escapechar is used only with QUOTE_NONE, when in fact it also works in the default quoting mode (perhaps as a result of some bug fixes?).
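
For comparison, the standard library csv module behaves the same way when reading. Its documentation describes the QUOTE_NONE case for the writer and says that, on reading, escapechar removes any special meaning from the following character. The sketch below is not pandas itself. It parses the john.csv content from an in-memory string, and it assumes skipinitialspace=True because the sample has a space after each comma.

```python
import csv
import io

# Contents of john.csv from this report, held in memory for the sketch.
DATA = 'id, name\n1, "Hello, my name is \\"John\\""\n'


def parse(**dialect_options):
    """Parse DATA with the stdlib csv reader (not pandas) and return all rows."""
    # skipinitialspace=True is an assumption: the sample has a space after each comma.
    reader = csv.reader(io.StringIO(DATA), skipinitialspace=True, **dialect_options)
    return list(reader)


# Default quoting (QUOTE_MINIMAL) with an escape character: the escaped quotes are kept.
escaped = parse(quotechar='"', escapechar="\\")

# QUOTE_NONE: quotes are ordinary characters, so the comma inside the text splits the field.
unquoted = parse(quotechar='"', escapechar="\\", quoting=csv.QUOTE_NONE)

# No escape character: the backslashes stay in the data and the quoting is misread.
unescaped = parse(quotechar='"')

for label, rows in [("escapechar", escaped), ("QUOTE_NONE", unquoted), ("no escapechar", unescaped)]:
    print(label, rows[1])
```

Only the first call yields the row ['1', 'Hello, my name is "John"'], which is consistent with the pandas behavior reported above for the default quoting mode.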
