Description
Pandas raises a ValueError when assigning a value to multiple labels of a Series (or DataFrame) through .loc with a range(x) key where x > 1. The error is raised only when the Series (or DataFrame) has one million or more rows; shorter objects accept the same assignment.
import pandas as pd
for x in [5, 999999, 1000000]:
    s = pd.Series(index=range(x))
    print('series length =', len(s))
    # assigning value with range(1), always works
    s.loc[range(1)] = 42
    # reading values with range(x>1), always works
    _ = s.loc[range(2)]
    # assigning values with range(x>1), fails only when len >= 1 million
    s.loc[range(2)] = 42
Output:
series length = 5
series length = 999999
series length = 1000000
Traceback (most recent call last):
  File "<stdin>", line 9, in <module>
  File "/home/nekobon/.env_exp/lib/python3.4/site-packages/pandas/core/indexing.py", line 114, in __setitem__
    indexer = self._get_setitem_indexer(key)
  File "/home/nekobon/.env_exp/lib/python3.4/site-packages/pandas/core/indexing.py", line 109, in _get_setitem_indexer
    return self._convert_to_indexer(key, is_setter=True)
  File "/home/nekobon/.env_exp/lib/python3.4/site-packages/pandas/core/indexing.py", line 1042, in _convert_to_indexer
    return labels.get_loc(obj)
  File "/home/nekobon/.env_exp/lib/python3.4/site-packages/pandas/core/index.py", line 1692, in get_loc
    return self._engine.get_loc(_values_from_object(key))
  File "pandas/index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas/index.c:3979)
  File "pandas/index.pyx", line 145, in pandas.index.IndexEngine.get_loc (pandas/index.c:3680)
  File "pandas/index.pyx", line 464, in pandas.index._bin_search (pandas/index.c:9124)
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Tested on pandas 0.17.0 and Python 3.4.
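A possible workaround, sketched under the assumption that the failure is specific to passing a range object as the .loc key: materialize the range into a plain list before assigning. This has not been verified against pandas 0.17.0, and the helper name assign_first_n is hypothetical, introduced only for illustration.

```python
import numpy as np
import pandas as pd


def assign_first_n(s, n, value):
    """Hypothetical helper: set the labels 0..n-1 of `s` to `value`."""
    # Assumption: converting the range to a list makes .loc treat the key
    # as an ordinary list of labels instead of a single label lookup.
    s.loc[list(range(n))] = value
    return s


# Build a Series at the length where the report says the failure starts.
s = pd.Series(np.nan, index=range(1000000))
assign_first_n(s, 2, 42)
print(s.head(3))
```

If the labels are contiguous integers, a label slice such as s.loc[0:1] = 42 may be another option, since .loc slices include both endpoints.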