Closed
Description
This came up in a first attempt to select all rows whose ID also appears in another dataset. The selection does not work as intended (see the last call, where isin() returns False for every row), but I think the describe() call shouldn't fail on the resulting empty DataFrame.
[column names changed in output; output from last call shortened]
In [47]: all
Out[47]:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 974757 entries, 0 to 974756
Data columns:
eid 974757 non-null values
number 974757 non-null values
a 972510 non-null values
b 974757 non-null values
c 929268 non-null values
d 922700 non-null values
e 974757 non-null values
dtypes: int64(1), object(6)
In [48]: subset = all[all["eid"].isin(other["eid"])]
In [49]: subset
Out[49]:
Int64Index([], dtype=int64)
Empty DataFrame
In [50]: subset.describe()
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-50-ff735ef04a17> in <module>()
----> 1 subset.describe()
C:\portabel\Python27\lib\site-packages\pandas\core\frame.pyc in describe(self, percentile_width)
4539 series = self[column]
4540 destat.append([series.count(), series.mean(), series.std(),
-> 4541 series.min(), series.quantile(lb), series.median(),
4542 series.quantile(ub), series.max()])
4543
C:\portabel\Python27\lib\site-packages\pandas\core\series.pyc in min(self, axis, out, skipna, level)
1320 if level is not None:
1321 return self._agg_by_level('min', level=level, skipna=skipna)
-> 1322 return nanops.nanmin(self.values, skipna=skipna)
1323
1324 @Substitution(name='maximum', shortname='max',
C:\portabel\Python27\lib\site-packages\pandas\core\nanops.pyc in f(values, axis, skipna, **kwds)
46 result = alt(values, axis=axis, skipna=skipna, **kwds)
47 except Exception:
---> 48 result = alt(values, axis=axis, skipna=skipna, **kwds)
49
50 return result
C:\portabel\Python27\lib\site-packages\pandas\core\nanops.pyc in _nanmin(values, axis, skipna)
179 or values.size == 0):
180 result = values.sum(axis)
--> 181 result.fill(np.nan)
182 else:
183 result = values.min(axis)
ValueError: cannot convert float NaN to integer
In [51]: all["eid"].isin(other["eid"])
Out[51]:
0 False
1 False
2 False
3 False
4 False
...
974752 False
974753 False
974754 False
974755 False
974756 False
Name: eid, Length: 974757
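One possible reason for isin() returning False everywhere is a type mismatch between the two eid columns, for example strings on one side and integers on the other. This is only an assumption, since the dtype of other["eid"] is not shown above. The sketch below uses small hypothetical frames (`all_df` instead of `all`, to avoid shadowing the builtin) to show the effect, plus a simple guard that skips describe() when the selection is empty.

```python
import pandas as pd

# Hypothetical stand-ins: IDs stored as strings in one frame, integers in the other.
all_df = pd.DataFrame({"eid": ["1", "2", "3"], "a": ["x", "y", "z"]})
other = pd.DataFrame({"eid": [2, 3, 4]})

# Mismatched types: no string equals an integer, so every row is False.
mask_mismatched = all_df["eid"].isin(other["eid"])

# Bringing both sides to one dtype first makes the membership test meaningful.
mask_aligned = all_df["eid"].astype("int64").isin(other["eid"])
subset = all_df[mask_aligned]

# Guard against an empty selection before summarising it.
summary = subset.describe() if not subset.empty else None

print(mask_mismatched.tolist())  # [False, False, False]
print(mask_aligned.tolist())     # [False, True, True]
```

The guard only works around the failure on the caller's side; it does not change the point that describe() on an empty frame should not raise.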