BUG/API: can't pass parameters to csv module via df.to_csv #4528

Closed
@brechea

I am trying to print a data frame as plain, strict TSV (i.e., no quoting and no escaping, because I know none of the fields will contain tabs). I wanted to use the "quoting" option, which is documented in pandas and is passed through to csv, together with the "quotechar" option, which is a csv option but is not documented in pandas. It doesn't work:

In [1]: import sys, csv

In [2]: from pandas import DataFrame

In [3]: data = {'col1': ['contents of col1 row1', 'contents " of col1 row2'], 'col2': ['contents of col2 row1', 'contents " of col2 row2'] }

In [4]: df = DataFrame(data)

In [5]: df.to_csv(sys.stdout, sep='\t', quoting=csv.QUOTE_NONE, quotechar=None)
        col1    col2
0       contents of col1 row1   contents of col2 row1
---------------------------------------------------------------------------
Error                                     Traceback (most recent call last)
<ipython-input-5-a30d32266fb4> in <module>()
----> 1 df.to_csv(sys.stdout, sep='\t', quoting=csv.QUOTE_NONE, quotechar=None)

/home/brechea/.local/lib/python2.6/site-packages/pandas-0.12.0-py2.6-linux-x86_64.egg/pandas/core/frame.pyc in to_csv(self, path_or_buf, sep, na_rep, float_format, cols, header, index, index_label, mode, nanRep, encoding, quoting, line_terminator, chunksize, tupleize_cols, **kwds)
   1409                                      tupleize_cols=tupleize_cols,
   1410                                      )
-> 1411         formatter.save()
   1412
   1413     def to_excel(self, excel_writer, sheet_name='sheet1', na_rep='',

/home/brechea/.local/lib/python2.6/site-packages/pandas-0.12.0-py2.6-linux-x86_64.egg/pandas/core/format.pyc in save(self)
    974
    975             else:
--> 976                 self._save()
    977
    978

/home/brechea/.local/lib/python2.6/site-packages/pandas-0.12.0-py2.6-linux-x86_64.egg/pandas/core/format.pyc in _save(self)
   1080                 break
   1081
-> 1082             self._save_chunk(start_i, end_i)
   1083
   1084     def _save_chunk(self, start_i, end_i):

/home/brechea/.local/lib/python2.6/site-packages/pandas-0.12.0-py2.6-linux-x86_64.egg/pandas/core/format.pyc in _save_chunk(self, start_i, end_i)
   1098         ix = data_index.to_native_types(slicer=slicer, na_rep=self.na_rep, float_format=self.float_format)
   1099
-> 1100         lib.write_csv_rows(self.data, ix, self.nlevels, self.cols, self.writer)
   1101
   1102 # from collections import namedtuple

/home/brechea/.local/lib/python2.6/site-packages/pandas-0.12.0-py2.6-linux-x86_64.egg/pandas/lib.so in pandas.lib.write_csv_rows (pandas/lib.c:13871)()

Error: need to escape, but no escapechar set
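
The same error can be reproduced with the standard csv module alone, which is consistent with quotechar=None not reaching the underlying writer. The following is a minimal sketch, assuming Python 3 and only the stdlib; the variable names are illustrative and not taken from pandas:

```python
import csv
import io

row = ['contents " of col1 row2', 'contents " of col2 row2']

# Case 1: QUOTE_NONE with the default quotechar ('"').
# The fields contain the quote character, so the writer wants to escape it
# and raises csv.Error because no escapechar is set.
buf = io.StringIO()
writer = csv.writer(buf, delimiter="\t", quoting=csv.QUOTE_NONE)
try:
    writer.writerow(row)
    error_message = None
except csv.Error as exc:
    error_message = str(exc)

# Case 2: QUOTE_NONE with quotechar=None.
# No character is treated as a quote, so the row is written verbatim.
buf = io.StringIO()
writer = csv.writer(buf, delimiter="\t", quoting=csv.QUOTE_NONE, quotechar=None)
writer.writerow(row)
plain_line = buf.getvalue()

print(error_message)
print(repr(plain_line))
```

Only the second call corresponds to what the to_csv() invocation above asks for; the first matches the behaviour actually observed.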

Adding the parameter

quotechar=kwds.get("quotechar")

to the

formatter = fmt.CSVFormatter(...

call in to_csv(), and making the corresponding changes to format.CSVFormatter's __init__() and save(), produces the expected output:

In [1]: import sys, csv

In [2]: from pandas import DataFrame

In [3]: data = {'col1': ['contents of col1 row1', 'contents " of col1 row2'], 'col2': ['contents of col2 row1', 'contents " of col2 row2'] }

In [4]: df = DataFrame(data)

In [5]: df.to_csv(sys.stdout, sep='\t', quoting=csv.QUOTE_NONE, quotechar=None)
        col1    col2
0       contents of col1 row1   contents of col2 row1
1       contents " of col1 row2 contents " of col2 row2

i.e., unescaped, unquoted tsv.

More generally, there are many reasons to want more control over the underlying csv writer, so a generic mechanism (as opposed to adding each parameter one by one) might be called for, e.g., allowing a csv dialect object, or at least a dictionary holding dialect attributes, to be passed in.
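
To illustrate the kind of generic mechanism meant here, the sketch below forwards a dialect plus a dictionary of dialect attributes straight to csv.writer. It uses only the stdlib; write_rows and strict_tsv are hypothetical names chosen for this example and are not part of pandas:

```python
import csv
import io


def write_rows(rows, buf, dialect="excel", **fmtparams):
    """Hypothetical helper: pass a dialect and any overrides through to csv.writer."""
    writer = csv.writer(buf, dialect=dialect, **fmtparams)
    writer.writerows(rows)


# A dictionary holding the dialect attributes for strict, unquoted TSV.
strict_tsv = {
    "delimiter": "\t",
    "quoting": csv.QUOTE_NONE,
    "quotechar": None,
    "lineterminator": "\n",
}

rows = [
    ["col1", "col2"],
    ['contents " of col1 row2', 'contents " of col2 row2'],
]

buf = io.StringIO()
write_rows(rows, buf, **strict_tsv)
generic_output = buf.getvalue()
print(generic_output, end="")
```

With this shape, a new writer option needs no change to the helper's signature; whether to_csv() should accept a dialect object, a dictionary, or both is left open.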
