
BUG: to_csv with chunksize mismatched formatting of dt64 #55481

Open
@jbrockmendel

Description

import pandas as pd

dti = pd.date_range("2016-01-01", periods=3, freq="D")
df = pd.DataFrame({"A": dti})
df.iloc[-1, -1] += pd.Timedelta(minutes=1)

res = df.to_csv(chunksize=1)

>>> print(res)
,A
0,2016-01-01
1,2016-01-02
2,2016-01-03 00:01:00

In io.formats.csvs we call _save_body, which iterates over chunks and calls _save_chunk on each one, so the format is determined chunk-by-chunk. But for dt64 values (also td64), most rendering contexts determine the format by looking at the entire array. As a result, chunks of the same column can be rendered with different formats, as in the output above.

Note that chunksize doesn't actually need to be specified to get the bad behavior; it is only used here to get a concise example.
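
To make the difference between the two strategies concrete, here is a minimal sketch using only the standard library. It is not the pandas implementation: the helper names (is_dates_only, render, render_per_chunk, render_whole_array) are hypothetical, and the "drop the time when every value is at midnight" rule is an assumption used to mimic the behavior shown above.

```python
from datetime import datetime, timedelta


def is_dates_only(values):
    """Return True when every timestamp falls exactly on midnight."""
    return all(v.time() == datetime.min.time() for v in values)


def render(values, dates_only):
    """Format all values with a single, already-decided format."""
    fmt = "%Y-%m-%d" if dates_only else "%Y-%m-%d %H:%M:%S"
    return [v.strftime(fmt) for v in values]


def render_per_chunk(values, chunksize):
    # Format is chosen independently for each chunk (the reported behavior).
    out = []
    for start in range(0, len(values), chunksize):
        chunk = values[start:start + chunksize]
        out.extend(render(chunk, is_dates_only(chunk)))
    return out


def render_whole_array(values, chunksize):
    # Format is chosen once from the full column, then reused for each chunk.
    dates_only = is_dates_only(values)
    out = []
    for start in range(0, len(values), chunksize):
        out.extend(render(values[start:start + chunksize], dates_only))
    return out


values = [datetime(2016, 1, 1) + timedelta(days=i) for i in range(3)]
values[-1] += timedelta(minutes=1)

per_chunk = render_per_chunk(values, chunksize=1)
whole = render_whole_array(values, chunksize=1)
print(per_chunk)  # mixed formats, one decision per chunk
print(whole)      # one format for the whole column
```

Deciding the format once, before chunking, is what keeps the column consistent in this sketch. Whether the actual fix should hoist the decision this way is left open here.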

Expected Behavior

>>> print(res)
,A
0,2016-01-01 00:00:00
1,2016-01-02 00:00:00
2,2016-01-03 00:01:00
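
For comparison, passing an explicit date_format to to_csv appears to produce the expected output, since the same format string is then applied to every value. This is only a sketch of a possible stopgap, assuming a fixed format is acceptable to the caller. It does not address the underlying inference mismatch.

```python
import pandas as pd

dti = pd.date_range("2016-01-01", periods=3, freq="D")
df = pd.DataFrame({"A": dti})
df.iloc[-1, -1] += pd.Timedelta(minutes=1)

# With an explicit date_format, every value is rendered with the same
# format string, so the output does not depend on per-chunk inference.
res_explicit = df.to_csv(chunksize=1, date_format="%Y-%m-%d %H:%M:%S")
print(res_explicit)
```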


Labels

Bug; Datetime (Datetime data dtype); IO CSV (read_csv, to_csv)
