
Grouping, then resampling empty DataFrame leads to it losing column names and index #26411

Closed
@andrej


Code Sample

import pandas as pd

# Empty frame that still carries column names and a TimedeltaIndex.
empty_df = pd.DataFrame([], columns=["a", "b"], index=pd.TimedeltaIndex([]))

resampled_df = empty_df.groupby("a").resample(rule=pd.to_timedelta("00:00:01")).mean()

print(resampled_df)
print(resampled_df["b"])  # raises KeyError: 'b'

Problem description

After grouping and then resampling as shown above, the returned DataFrame loses its metadata, such as its column names and the type of its index. Accessing a column that existed (but was empty) before the grouping and resampling then raises a KeyError.

Perhaps somewhere in the code a "generic" empty data frame is returned without the attached information about column names and indices?
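
Until this is addressed, one possible workaround is to short-circuit on empty input. The sketch below is only an illustration, not how pandas handles the case internally: `group_resample_mean` is a hypothetical helper name, and it assumes that an empty frame with the original columns and index type is the result you want for empty input.

```python
import pandas as pd


def group_resample_mean(df, key, rule):
    """Hypothetical helper: group by `key`, resample by `rule`, take the mean."""
    if df.empty:
        # Assumption: for empty input, an empty frame that keeps the original
        # column names and index type is the desired result.
        return df.iloc[:0].copy()
    return df.groupby(key).resample(rule).mean()


empty_df = pd.DataFrame([], columns=["a", "b"], index=pd.TimedeltaIndex([]))
result = group_resample_mean(empty_df, "a", pd.to_timedelta("00:00:01"))

print(list(result.columns))
print(type(result.index).__name__)
```

With this guard, `result["b"]` is an empty Series rather than a KeyError. Note that the non-empty path returns a frame indexed by group and time, so the two branches differ in index structure; callers that need a uniform shape would have to handle that themselves.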

Expected Output

Empty DataFrame
Columns: [a, b]
Index: []
Series([], Name: b, dtype: object) 

or something similar. I would also expect empty_df["b"] == resampled_df["b"] to hold.
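
A small check along these lines could make the lost metadata explicit. This is a sketch only: `missing_columns` is a hypothetical helper, and `generic_empty` merely stands in for a result that came back without its column names.

```python
import pandas as pd


def missing_columns(before, after):
    """Hypothetical helper: columns present in `before` but absent from `after`."""
    return [col for col in before.columns if col not in after.columns]


before = pd.DataFrame([], columns=["a", "b"], index=pd.TimedeltaIndex([]))
generic_empty = pd.DataFrame()  # stands in for a result that lost its metadata

print(missing_columns(before, generic_empty))
print(missing_columns(before, before.copy()))
```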

Actual Output

Empty DataFrame
Columns: []
Index: []
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 2927, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/indexes/base.py", line 2659, in get_lo
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'b'  

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.93-linuxkit-aufs
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: C.UTF-8
LANG: C.UTF-8
LOCALE: None.None

pandas: 0.24.2
pytest: None
pip: 8.1.1
setuptools: 20.7.0
Cython: None
numpy: 1.16.2
scipy: 0.17.0
pyarrow: None
xarray: None
IPython: None
sphinx: 1.3.6
patsy: None
dateutil: 2.8.0
pytz: 2018.9
blosc: None
bottleneck: None
tables: 3.5.1
numexpr: 2.6.9
feather: None
matplotlib: 1.5.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: 3.5.0
bs4: 4.4.1
html5lib: 0.999
sqlalchemy: 1.3.1
pymysql: 0.7.2.None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None
