MultiIndex to_string edge case Error after 0.23.0 upgrade #21180

Closed
@atlasstrategic

Code example

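import pandas as pd
import numpy as np

# Annual date index and one column of random data shared by both frames
index = pd.date_range('1970', '2018', freq='A')
data = np.random.randn(len(index))
# Each list of lists is read as the levels of a two-level column MultiIndex:
# a long top-level title over a short second-level label
columns1 = [
    ['This is a long title with > 37 chars.'],
    ['cat'],
]
columns2 = [
    ['This is a loooooonger title with > 43 chars.'],
    ['dog'],
]
df1 = pd.DataFrame(data=data, index=index, columns=columns1)
df2 = pd.DataFrame(data=data, index=index, columns=columns2)
# Concatenate side by side; displaying the result triggers the error
df = pd.concat([df1, df2], axis=1)
df.head()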

Output (using pandas 0.23.0)

>>> df.head()
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/home/david/.virtualenvs/thegrid-py3-venv/lib/python3.5/site-packages/pandas/core/base.py", line 82, in __repr__
    return str(self)
  File "/home/david/.virtualenvs/thegrid-py3-venv/lib/python3.5/site-packages/pandas/core/base.py", line 61, in __str__
    return self.__unicode__()
  File "/home/david/.virtualenvs/thegrid-py3-venv/lib/python3.5/site-packages/pandas/core/frame.py", line 663, in __unicode__
    line_width=width, show_dimensions=show_dimensions)
  File "/home/david/.virtualenvs/thegrid-py3-venv/lib/python3.5/site-packages/pandas/core/frame.py", line 1968, in to_string
    formatter.to_string()
  File "/home/david/.virtualenvs/thegrid-py3-venv/lib/python3.5/site-packages/pandas/io/formats/format.py", line 648, in to_string
    strcols = self._to_str_columns()
  File "/home/david/.virtualenvs/thegrid-py3-venv/lib/python3.5/site-packages/pandas/io/formats/format.py", line 539, in _to_str_columns
    str_columns = self._get_formatted_column_labels(frame)
  File "/home/david/.virtualenvs/thegrid-py3-venv/lib/python3.5/site-packages/pandas/io/formats/format.py", line 782, in _get_formatted_column_labels
    str_columns = _sparsify(str_columns)
  File "/home/david/.virtualenvs/thegrid-py3-venv/lib/python3.5/site-packages/pandas/core/indexes/multi.py", line 2962, in _sparsify
    prev = pivoted[start]
IndexError: list index out of range

Problem description

After upgrading pandas from 0.22.0 to 0.23.0, the code above raises the IndexError shown. The combined length of the two top-level column labels, "This is a long title with > 37 chars." and "This is a loooooonger title with > 43 chars.", appears to make the difference: if I shorten them so that their combined length is <= 80 characters, there is no error and the output is as expected.
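The 80-character threshold matches the default value of the pandas `display.width` option, so the failure may be tied to the point at which the repr starts wrapping columns. That link is an assumption, not something confirmed in this report. A minimal sketch for checking a set of labels against the current setting follows; the helper names `combined_label_width` and `exceeds_display_width` are hypothetical and not part of pandas.

```python
import pandas as pd


def combined_label_width(labels):
    """Return the total number of characters across the given column labels."""
    return sum(len(str(label)) for label in labels)


def exceeds_display_width(labels):
    """Return True if the labels' combined length is above display.width.

    display.width may be None (auto-detect); in that case no fixed limit
    is known, so this hypothetical helper returns False.
    """
    width = pd.get_option("display.width")
    return width is not None and combined_label_width(labels) > width


titles = [
    "This is a long title with > 37 chars.",
    "This is a loooooonger title with > 43 chars.",
]
print(combined_label_width(titles), pd.get_option("display.width"))
print(exceeds_display_width(titles))
```

This only compares raw label lengths with the option value. It does not reproduce how pandas measures or pads columns internally.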

Expected Output (using pandas 0.22.0)

>>> df.head()
           This is a long title with > 37 chars.  \
                                             cat   
1970-12-31                             -1.448415   
1971-12-31                              0.081324   
1972-12-31                             -0.018105   
1973-12-31                              0.902790   
1974-12-31                              0.668474   

           This is a loooooonger title with > 43 chars.  
                                                    dog  
1970-12-31                                    -1.448415  
1971-12-31                                     0.081324  
1972-12-31                                    -0.018105  
1973-12-31                                     0.902790  
1974-12-31                                     0.668474
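The expected behaviour above could be captured as a small regression check. This is a sketch under stated assumptions: it builds the column MultiIndex explicitly with `pd.MultiIndex.from_tuples` rather than from nested lists, uses only the five dates visible in the expected output, and passes `line_width=80` to `to_string` on the assumption that this mirrors the default console width. It has not been run against 0.23.0 itself.

```python
import numpy as np
import pandas as pd

# Two-level columns: a long top-level title over a short second-level label
columns = pd.MultiIndex.from_tuples([
    ("This is a long title with > 37 chars.", "cat"),
    ("This is a loooooonger title with > 43 chars.", "dog"),
])

# The five dates shown by df.head() in the expected output
index = pd.to_datetime([
    "1970-12-31", "1971-12-31", "1972-12-31", "1973-12-31", "1974-12-31",
])

values = np.random.randn(len(index))
df = pd.DataFrame(np.column_stack([values, values]), index=index, columns=columns)

# Rendering should succeed and wrap the frame instead of raising IndexError
text = df.to_string(line_width=80)

# Both levels of every column label should appear in the rendered text
for title, label in columns:
    assert title in text
    assert label in text
```

Calling `to_string` directly, rather than relying on `repr(df)`, keeps the check independent of terminal size detection.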

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-124-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_ZA.UTF-8
LOCALE: en_ZA.UTF-8

pandas: 0.23.0
pytest: None
pip: 10.0.1
setuptools: 32.3.1
Cython: None
numpy: 1.14.0
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2018.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: 2.5.3
xlrd: None
xlwt: None
xlsxwriter: 1.0.4
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: 2.7.4 (dt dec pq3 ext lo64)
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Labels: Output-Formatting (__repr__ of pandas objects, to_string), Regression (functionality that used to work in a prior pandas version)
