BUG: pd.Index(dtype=np.int64) cannot be used in ops with pd.Index(dtype=Int64Dtype()) #49576

Closed
@MikaelUmaN

Description

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

# This emits the warning: "pandas.Int64Index is deprecated and will be removed
# from pandas in a future version. Use pandas.Index with the appropriate dtype instead."
test_index_warning_thrown = pd.Int64Index([1, 2, 3])

# Creates a DataFrame whose column index gets the default type, that is, Int64Index.
a = pd.DataFrame([20, 30, 15, 2], index=[0, 1, 2, 3], columns=[1])

# Creates a DataFrame whose column index is built explicitly with pd.Index.
b = pd.DataFrame([20, 30, 15, 2], index=[0, 1, 2, 3], columns=pd.Index([1], dtype=pd.Int64Dtype()))

# This raises an error that does not seem reasonable. Something to do with slicing? See below.
a - b

Issue Description

Relates to: #49560.

  • When we create an explicit pd.Int64Index, we get a deprecation warning. When we just set the columns from a list, however, we still get an Int64Index.
  • When we explicitly create a pd.Index with an integer dtype (pd.Int64Dtype() in the example) to avoid the deprecation warning, we can no longer perform operations with other dataframes where pd.Int64Index is used.
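
The two column indexes from the reproducible example can be inspected directly. This is a minimal sketch that rebuilds the same `a` and `b` and only compares their column dtypes; it does not run `a - b`. The variable names `default_dtype` and `explicit_dtype` are introduced here for illustration.

```python
import pandas as pd

a = pd.DataFrame([20, 30, 15, 2], index=[0, 1, 2, 3], columns=[1])
b = pd.DataFrame(
    [20, 30, 15, 2],
    index=[0, 1, 2, 3],
    columns=pd.Index([1], dtype=pd.Int64Dtype()),
)

# Columns built from a plain list of ints use the NumPy-backed dtype.
default_dtype = str(a.columns.dtype)

# pd.Int64Dtype() is the nullable extension dtype, whose name has a capital "I".
explicit_dtype = str(b.columns.dtype)

print("a.columns dtype:", default_dtype)
print("b.columns dtype:", explicit_dtype)
```

If the two printed names differ in case, the indexes are not of the same dtype even though both hold the integer 1, which may be relevant to the failure shown below.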

Actual result is an error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File /opt/conda/lib/python3.9/site-packages/pandas/core/indexing.py:873, in _LocationIndexer._validate_tuple_indexer(self, key)
    872 try:
--> 873     self._validate_key(k, i)
    874 except ValueError as err:

File /opt/conda/lib/python3.9/site-packages/pandas/core/indexing.py:1483, in _iLocIndexer._validate_key(self, key, axis)
   1482 else:
-> 1483     raise ValueError(f"Can only index by location with a [{self._valid_types}]")

ValueError: Can only index by location with a [integer, integer slice (START point is INCLUDED, END point is EXCLUDED), listlike of integers, boolean array]

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
Cell In [9], line 1
----> 1 a-b

Expected Behavior

Add, subtract, divide, etc. should work as normal, since the types seem compatible. The docs tell you to create pd.Index instances, and these are all ints.
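
For comparison, here is a sketch of a possible workaround, not a confirmed fix for the reported error: build the column index with the NumPy dtype string `"int64"` instead of `pd.Int64Dtype()`. The name `b_numpy` is hypothetical and not part of the original report, and whether this avoids the deprecation warning on a given pandas version is an assumption to verify.

```python
import pandas as pd

a = pd.DataFrame([20, 30, 15, 2], index=[0, 1, 2, 3], columns=[1])

# Same data as `b` in the report, but the column index uses NumPy int64
# rather than the nullable Int64 extension dtype.
b_numpy = pd.DataFrame(
    [20, 30, 15, 2],
    index=[0, 1, 2, 3],
    columns=pd.Index([1], dtype="int64"),
)

# Both column indexes now share a dtype, so the frames align on column 1.
diff = a - b_numpy
print(diff)
```

Since both frames hold identical values, every entry of `diff` should be zero when the subtraction aligns as expected.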

Installed Versions

INSTALLED VERSIONS

commit : 91111fd
python : 3.9.7.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.72-microsoft-standard-WSL2
Version : #1 SMP Wed Oct 28 23:40:43 UTC 2020
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : None
LOCALE : en_US.UTF-8

pandas : 1.5.1
numpy : 1.21.4
pytz : 2021.3
dateutil : 2.8.2
setuptools : 52.0.0.post20210125
pip : 21.1.3
Cython : None
pytest : 7.2.0
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.9.1
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.2
IPython : 8.6.0
pandas_datareader: None
bs4 : 4.10.0
bottleneck : 1.3.5
brotli :
fastparquet : 0.8.3
fsspec : 2022.10.0
gcsfs : None
matplotlib : 3.5.3
numba : 0.56.3
numexpr : 2.7.3
odfpy : None
openpyxl : 3.0.10
pandas_gbq : None
pyarrow : 9.0.0
pyreadstat : None
pyxlsb : None
s3fs : 0.4.2
scipy : 1.9.3
snappy :
sqlalchemy : 1.4.43
tables : 3.7.0
tabulate : None
xarray : 2022.11.0
xlrd : None
xlwt : None
zstandard : None
tzdata : None

Labels

Bug; Index (Related to the Index class or subclasses); Numeric Operations (Arithmetic, Comparison, and Logical operations)
