numpy ravel does not work on a list of DataFrames with specified column names #26247

Closed
@cod3licious

Code Sample

import numpy as np
import pandas as pd

x = np.random.randn(10,3)

# this works - list of DataFrames without column names
np.ravel([pd.DataFrame(batch.reshape(1,3)) for batch in x])

# this also works - single DataFrame with column names
np.ravel(pd.DataFrame(x[0].reshape(1,3), columns=["x1", "x2", "x3"]))

# this doesn't work (raises KeyError: 0) - list of DataFrames with column names
np.ravel([pd.DataFrame(batch.reshape(1,3), columns=["x1", "x2", "x3"]) for batch in x])

Problem description

Calling numpy.ravel on a list of DataFrames that have column names raises KeyError: 0 (full traceback below). A single DataFrame with column names, and a list of DataFrames without column names, both work.

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/anaconda/envs/python36/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2656             try:
-> 2657                 return self._engine.get_loc(key)
   2658             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 0

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-67-b9b3fcdd02a0> in <module>()
----> 1 np.ravel([pd.DataFrame(batch.reshape(1,3), columns=["x1", "x2", "x3"]) for batch in x])

~/anaconda/envs/python36/lib/python3.6/site-packages/numpy/core/fromnumeric.py in ravel(a, order)
   1572         return asarray(a).ravel(order=order)
   1573     else:
-> 1574         return asanyarray(a).ravel(order=order)
   1575 
   1576 

~/anaconda/envs/python36/lib/python3.6/site-packages/numpy/core/numeric.py in asanyarray(a, dtype, order)
    551 
    552     """
--> 553     return array(a, dtype, copy=False, order=order, subok=True)
    554 
    555 

~/anaconda/envs/python36/lib/python3.6/site-packages/pandas/core/frame.py in __getitem__(self, key)
   2925             if self.columns.nlevels > 1:
   2926                 return self._getitem_multilevel(key)
-> 2927             indexer = self.columns.get_loc(key)
   2928             if is_integer(indexer):
   2929                 indexer = [indexer]

~/anaconda/envs/python36/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2657                 return self._engine.get_loc(key)
   2658             except KeyError:
-> 2659                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2660         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2661         if indexer.ndim > 1 or indexer.size > 1:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 0
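
The traceback passes through DataFrame.__getitem__ with key 0, which suggests NumPy is indexing into each DataFrame as if it were a sequence. The sketch below only illustrates why that lookup would succeed with default integer column labels and fail with string labels. It is not a statement about NumPy's internals, and the variable names are illustrative.

```python
import pandas as pd

# Default columns are the integers 0, 1, 2, so df[0] is a valid column lookup.
default_cols = pd.DataFrame([[1.0, 2.0, 3.0]])
col0 = default_cols[0]

# With string column labels there is no column named 0, so the same lookup fails.
named_cols = pd.DataFrame([[1.0, 2.0, 3.0]], columns=["x1", "x2", "x3"])
try:
    named_cols[0]
    raised = False
except KeyError:
    raised = True

print(list(col0), raised)
```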

Expected Output

no error
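
A possible workaround sketch, assuming the goal is a flat 1-D array of all values in row order: convert each DataFrame to an ndarray before handing the list to NumPy, so NumPy never indexes into the DataFrames themselves. The names `frames`, `flat`, and `flat_concat` are illustrative, and DataFrame.to_numpy requires pandas 0.24 or later.

```python
import numpy as np
import pandas as pd

x = np.random.randn(10, 3)
frames = [
    pd.DataFrame(batch.reshape(1, 3), columns=["x1", "x2", "x3"])
    for batch in x
]

# Option 1: turn every frame into a plain ndarray first, then ravel the stack.
flat = np.ravel([df.to_numpy() for df in frames])

# Option 2: concatenate the frames in pandas, then flatten the resulting array.
flat_concat = pd.concat(frames, ignore_index=True).to_numpy().ravel()

print(flat.shape, np.array_equal(flat, flat_concat))
```

Both options avoid passing DataFrame objects inside a Python list to np.ravel, which is the call that fails in the report.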

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.8.final.0
python-bits: 64
OS: Darwin
OS-release: 18.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.24.2
pytest: 3.3.2
pip: 19.1
setuptools: 38.4.0
Cython: 0.27.3
numpy: 1.15.4
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.6.6
patsy: 0.5.0
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.4
feather: None
matplotlib: 3.0.1
openpyxl: 2.4.10
xlrd: 1.1.0
xlwt: 1.2.0
xlsxwriter: 1.0.2
lxml.etree: 4.1.1
bs4: 4.6.0
html5lib: 0.9999999
sqlalchemy: 1.2.1
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: 0.1.6
pandas_gbq: None
pandas_datareader: None
gcsfs: None

Labels: Compat (pandas objects compatibility with NumPy or Python functions), DataFrame (DataFrame data structure), Needs Tests (unit test(s) needed to prevent regressions), good first issue
