Commit 5802686

BUG: to_string using wrong na_rep for ea dtype in multiindex (#47986)
1 parent 43c0508 commit 5802686

3 files changed: 22 additions and 1 deletion

doc/source/whatsnew/v1.5.0.rst

Lines changed: 1 addition & 0 deletions
@@ -1002,6 +1002,7 @@ I/O
 - Bug in :func:`read_sas` with certain types of compressed SAS7BDAT files (:issue:`35545`)
 - Bug in :func:`read_excel` not forward filling :class:`MultiIndex` when no names were given (:issue:`47487`)
 - Bug in :func:`read_sas` returned ``None`` rather than an empty DataFrame for SAS7BDAT files with zero rows (:issue:`18198`)
+- Bug in :meth:`DataFrame.to_string` using wrong missing value with extension arrays in :class:`MultiIndex` (:issue:`47986`)
 - Bug in :class:`StataWriter` where value labels were always written with default encoding (:issue:`46750`)
 - Bug in :class:`StataWriterUTF8` where some valid characters were removed from variable names (:issue:`47276`)
 - Bug in :meth:`DataFrame.to_excel` when writing an empty dataframe with :class:`MultiIndex` (:issue:`19543`)

pandas/core/indexes/multi.py

Lines changed: 7 additions & 1 deletion
@@ -54,6 +54,7 @@
     ensure_int64,
     ensure_platform_int,
     is_categorical_dtype,
+    is_extension_array_dtype,
     is_hashable,
     is_integer,
     is_iterator,
@@ -1370,7 +1371,7 @@ def format(

         stringified_levels = []
         for lev, level_codes in zip(self.levels, self.codes):
-            na = na_rep if na_rep is not None else _get_na_rep(lev.dtype.type)
+            na = na_rep if na_rep is not None else _get_na_rep(lev.dtype)

             if len(lev) > 0:
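The one-line change above matters because an extension dtype and its `.type` attribute carry different information. The following is a minimal sketch, not part of the commit, that inspects a level built the same way as in the new test. It assumes a pandas version in which `MultiIndex` levels keep the nullable `Int64` dtype; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# One-level MultiIndex backed by a nullable Int64 array with a missing value.
mi = pd.MultiIndex.from_arrays([pd.Series([pd.NA, 1], dtype="Int64")])
level = mi.levels[0]

ea_dtype = level.dtype          # the extension dtype object
scalar_type = level.dtype.type  # the NumPy scalar type behind it

print(ea_dtype, scalar_type)

# The pre-patch call passed only the scalar type, so the dictionary lookup
# could never see that the level was an extension array and fell back to
# the default string.
old_lookup = {np.datetime64: "NaT", np.timedelta64: "NaT"}.get(scalar_type, "NaN")
print(old_lookup)
```

Passing the dtype object itself lets the helper decide whether to consult the dtype's own missing-value sentinel.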

@@ -3889,6 +3890,11 @@ def sparsify_labels(label_list, start: int = 0, sentinel=""):


 def _get_na_rep(dtype) -> str:
+    if is_extension_array_dtype(dtype):
+        return f"{dtype.na_value}"
+    else:
+        dtype = dtype.type
+
     return {np.datetime64: "NaT", np.timedelta64: "NaT"}.get(dtype, "NaN")

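To try the patched lookup outside the pandas source tree, the helper can be approximated with public API only. This is a sketch under assumptions: `get_na_rep_sketch` is a hypothetical name, and it imports `is_extension_array_dtype` from `pandas.api.types` rather than from the internal module used in the diff.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_extension_array_dtype


def get_na_rep_sketch(dtype) -> str:
    """Standalone approximation of the patched helper (hypothetical name)."""
    if is_extension_array_dtype(dtype):
        # Extension dtypes expose their own missing-value sentinel.
        return f"{dtype.na_value}"
    # NumPy dtypes: look up the scalar type and default to "NaN".
    return {np.datetime64: "NaT", np.timedelta64: "NaT"}.get(dtype.type, "NaN")


nullable_int = get_na_rep_sketch(pd.Int64Dtype())
datetime_rep = get_na_rep_sketch(np.dtype("datetime64[ns]"))
float_rep = get_na_rep_sketch(np.dtype("float64"))

print(nullable_int, datetime_rep, float_rep)
```

The sketch follows the same branch order as the diff: extension dtypes are handled first, and everything else goes through the existing dictionary lookup unchanged.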

pandas/tests/frame/test_repr_info.py

Lines changed: 14 additions & 0 deletions
@@ -9,6 +9,7 @@
 import pytest

 from pandas import (
+    NA,
     Categorical,
     DataFrame,
     MultiIndex,
@@ -342,6 +343,19 @@ def test_frame_to_string_with_periodindex(self):
         # it works!
         frame.to_string()

+    def test_to_string_ea_na_in_multiindex(self):
+        # GH#47986
+        df = DataFrame(
+            {"a": [1, 2]},
+            index=MultiIndex.from_arrays([Series([NA, 1], dtype="Int64")]),
+        )
+
+        result = df.to_string()
+        expected = """      a
+<NA>  1
+1     2"""
+        assert result == expected
+
     def test_datetime64tz_slice_non_truncate(self):
         # GH 30263
         df = DataFrame({"x": date_range("2019", periods=10, tz="UTC")})
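The behaviour covered by the new test can also be checked interactively. The snippet below is a minimal reproduction using only the public pandas namespace; it assumes a pandas release that includes this fix (1.5 or later), and the variable names are illustrative.

```python
import pandas as pd

# Same construction as the new test: a one-level MultiIndex whose only
# level is a nullable Int64 array containing a missing value.
index = pd.MultiIndex.from_arrays([pd.Series([pd.NA, 1], dtype="Int64")])
df = pd.DataFrame({"a": [1, 2]}, index=index)

rendered = df.to_string()
print(rendered)

# The first data row starts with the label used for the missing index entry.
missing_label = rendered.splitlines()[1].split()[0]
print(missing_label)
```

On a release without the fix, the missing index entry would be rendered with the NumPy default label instead of the extension array's own sentinel.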
