Skip to content

BUG: Fix MultiIndex DataFrame to_csv() segfault #26355

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 16, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v0.25.0.rst
Original file line number Diff line number Diff line change
Expand Up @@ -336,7 +336,7 @@ Indexing
- Improved exception message when calling :meth:`DataFrame.iloc` with a list of non-numeric objects (:issue:`25753`).
- Bug in :meth:`DataFrame.loc` and :meth:`Series.loc` where ``KeyError`` was not raised for a ``MultiIndex`` when the key was less than or equal to the number of levels in the :class:`MultiIndex` (:issue:`14885`).
- Bug in which :meth:`DataFrame.append` produced an erroneous warning indicating that a ``KeyError`` will be thrown in the future when the data to be appended contains new columns (:issue:`22252`).
-
- Bug in which :meth:`DataFrame.to_csv` caused a segfault for a reindexed data frame, when the indices were single-level :class:`MultiIndex` (:issue:`26303`).


Missing
Expand Down
4 changes: 3 additions & 1 deletion pandas/core/indexes/multi.py
Original file line number Diff line number Diff line change
Expand Up @@ -946,7 +946,9 @@ def _format_native_types(self, na_rep='nan', **kwargs):
new_codes.append(level_codes)

if len(new_levels) == 1:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a comment here that this is a single-level MI

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do new_levels and new_codes look like here (from your test)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value of new_levels is [array(['1', '2', '3'], dtype='<U21')].
The value of new_codes is [FrozenNDArray([0, 2], dtype='int8')]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment in d47089e

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok use levels[0].take(codes[0])

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 159a828

return Index(new_levels[0])._format_native_types()
# a single-level multi-index
return Index(new_levels[0].take(
new_codes[0]))._format_native_types()
else:
# reconstruct the multi-index
mi = MultiIndex(levels=new_levels, codes=new_codes,
Expand Down
9 changes: 9 additions & 0 deletions pandas/tests/frame/test_to_csv.py
Original file line number Diff line number Diff line change
Expand Up @@ -1220,6 +1220,15 @@ def test_multi_index_header(self):
expected = tm.convert_rows_list_to_csv_str(expected_rows)
assert result == expected

def test_to_csv_single_level_multi_index(self):
# see gh-26303
index = pd.Index([(1,), (2,), (3,)])
df = pd.DataFrame([[1, 2, 3]], columns=index)
df = df.reindex(columns=[(1,), (3,)])
expected = ",1,3\n0,1,3\n"
result = df.to_csv(line_terminator='\n')
assert_almost_equal(result, expected)

def test_gz_lineend(self):
# GH 25311
df = pd.DataFrame({'a': [1, 2]})
Expand Down