Code Sample
import numpy as np
import pandas as pd
x = np.random.randn(10,3)
# this works - list of DataFrames without column names
np.ravel([pd.DataFrame(batch.reshape(1,3)) for batch in x])
# this also works - single DataFrame with column names
np.ravel(pd.DataFrame(x[0].reshape(1,3), columns=["x1", "x2", "x3"]))
# this doesn't work - list of DataFrames with column names
np.ravel([pd.DataFrame(batch.reshape(1,3), columns=["x1", "x2", "x3"]) for batch in x])
Problem description
Calling numpy.ravel on a list of DataFrames that have named columns raises KeyError: 0. A list of DataFrames with default (integer) column labels works, and so does a single DataFrame with named columns.
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
~/anaconda/envs/python36/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2656 try:
-> 2657 return self._engine.get_loc(key)
2658 except KeyError:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 0
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
<ipython-input-67-b9b3fcdd02a0> in <module>()
----> 1 np.ravel([pd.DataFrame(batch.reshape(1,3), columns=["x1", "x2", "x3"]) for batch in x])
~/anaconda/envs/python36/lib/python3.6/site-packages/numpy/core/fromnumeric.py in ravel(a, order)
1572 return asarray(a).ravel(order=order)
1573 else:
-> 1574 return asanyarray(a).ravel(order=order)
1575
1576
~/anaconda/envs/python36/lib/python3.6/site-packages/numpy/core/numeric.py in asanyarray(a, dtype, order)
551
552 """
--> 553 return array(a, dtype, copy=False, order=order, subok=True)
554
555
~/anaconda/envs/python36/lib/python3.6/site-packages/pandas/core/frame.py in __getitem__(self, key)
2925 if self.columns.nlevels > 1:
2926 return self._getitem_multilevel(key)
-> 2927 indexer = self.columns.get_loc(key)
2928 if is_integer(indexer):
2929 indexer = [indexer]
~/anaconda/envs/python36/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2657 return self._engine.get_loc(key)
2658 except KeyError:
-> 2659 return self._engine.get_loc(self._maybe_cast_indexer(key))
2660 indexer = self.get_indexer([key], method=method, tolerance=tolerance)
2661 if indexer.ndim > 1 or indexer.size > 1:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 0
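The traceback ends in `DataFrame.__getitem__` with key `0`, which suggests that NumPy treats each DataFrame in the list as a sequence and indexes it with integers while building the array. The sketch below only reproduces that lookup in isolation; the variable names are illustrative and it makes no claim about NumPy's internal conversion logic.

```python
import numpy as np
import pandas as pd

row = np.zeros((1, 3))

# Default columns are the integers 0, 1, 2, so df[0] is a valid column lookup.
df_default = pd.DataFrame(row)
default_col = df_default[0]

# Named columns have no label 0, so the same lookup raises KeyError.
df_named = pd.DataFrame(row, columns=["x1", "x2", "x3"])
raised = False
try:
    df_named[0]
except KeyError:
    raised = True

print(type(default_col).__name__, raised)
```

If this is indeed the cause, it would explain why the list of unnamed DataFrames works: the integer lookups happen to match the default column labels.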
Expected Output
No error: the call should succeed, as it does for a list of DataFrames without column names.
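A possible workaround, as a minimal sketch rather than a fix: convert each DataFrame to a plain ndarray before handing the list to `np.ravel`, so NumPy never has to index the DataFrames themselves. `DataFrame.to_numpy()` is available from pandas 0.24; on older versions `.values` should serve the same purpose. The names `frames`, `flat` and `flat_concat` are illustrative.

```python
import numpy as np
import pandas as pd

x = np.random.randn(10, 3)
frames = [
    pd.DataFrame(batch.reshape(1, 3), columns=["x1", "x2", "x3"])
    for batch in x
]

# Option 1: extract the underlying arrays first, then flatten the stack.
flat = np.ravel([df.to_numpy() for df in frames])

# Option 2: concatenate into one DataFrame, then flatten its values.
flat_concat = pd.concat(frames, ignore_index=True).to_numpy().ravel()

print(flat.shape, flat_concat.shape)
```

Both options drop the column labels, which `np.ravel` would discard anyway.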
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.8.final.0
python-bits: 64
OS: Darwin
OS-release: 18.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.24.2
pytest: 3.3.2
pip: 19.1
setuptools: 38.4.0
Cython: 0.27.3
numpy: 1.15.4
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.6.6
patsy: 0.5.0
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.4
feather: None
matplotlib: 3.0.1
openpyxl: 2.4.10
xlrd: 1.1.0
xlwt: 1.2.0
xlsxwriter: 1.0.2
lxml.etree: 4.1.1
bs4: 4.6.0
html5lib: 0.9999999
sqlalchemy: 1.2.1
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: 0.1.6
pandas_gbq: None
pandas_datareader: None
gcsfs: None