Makes NumericIndex constructor dtype aware #29529

Merged: 18 commits, Nov 17, 2019
Changes from 3 commits
5 changes: 3 additions & 2 deletions pandas/core/indexes/base.py
@@ -4027,7 +4027,7 @@ def _string_data_error(cls, data):
)

@classmethod
-def _coerce_to_ndarray(cls, data):
+def _coerce_to_ndarray(cls, data, dtype=None):
"""
Coerces data to ndarray.

@@ -4047,7 +4047,8 @@ def _coerce_to_ndarray(cls, data):
# other iterable of some kind
if not isinstance(data, (ABCSeries, list, tuple)):
data = list(data)
-data = np.asarray(data)
+
+data = np.asarray(data, dtype=dtype)
return data

def _coerce_scalar_to_index(self, item):
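
A minimal sketch of why forwarding `dtype` to `np.asarray` matters for the values used in this PR's test. It uses plain NumPy only. The float64 round trip is an assumption about where precision can get lost when the dtype is left to inference; what NumPy actually infers for a mixed list like this may depend on the NumPy version, so the sketch requests float64 explicitly rather than relying on inference.

```python
import numpy as np

values = [7606741985629028552, 17876870360202815256]

# Passing the dtype up front keeps every integer exact.
explicit = np.asarray(values, dtype="uint64")

# Going through float64 (assumed here to be the lossy path) rounds to the
# nearest representable float, so the original integers are not recovered.
via_float = np.asarray(values, dtype="float64")
round_tripped = [int(x) for x in via_float]

print(explicit.tolist() == values)   # True
print(round_tripped == values)       # False
```

The first value is not a multiple of 1024, which is the float64 spacing in that range, so it cannot survive the round trip.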
2 changes: 1 addition & 1 deletion pandas/core/indexes/numeric.py
@@ -57,7 +57,7 @@ def __new__(cls, data=None, dtype=None, copy=False, name=None, fastpath=None):
return cls._simple_new(data, name=name)

# is_scalar, generators handled in coerce_to_ndarray
-data = cls._coerce_to_ndarray(data)
+data = cls._coerce_to_ndarray(data, dtype=dtype)
Member

I don't think this should rely on dtype being explicitly passed in (it's None by default), so maybe make this dtype=self._dtype for now, since the dtype parameter isn't being validated.
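
A rough sketch of what this suggestion could look like: fall back to a class-level default when the caller passes no dtype. The class, attribute, and method names (`UIntIndexSketch`, `_default_dtype`, `coerce`) are hypothetical stand-ins for illustration, not the code in this PR.

```python
import numpy as np


class UIntIndexSketch:
    # Hypothetical class-level default, standing in for the index's own dtype.
    _default_dtype = np.uint64

    @classmethod
    def coerce(cls, data, dtype=None):
        # Use the class default when no dtype is passed explicitly.
        if dtype is None:
            dtype = cls._default_dtype
        return np.asarray(data, dtype=dtype)


values = [7606741985629028552, 17876870360202815256]
arr = UIntIndexSketch.coerce(values)
print(arr.dtype)                # uint64
print(arr.tolist() == values)   # True
```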

Contributor Author

This initially made sense to me.

However, NumPy complicates this suggestion, since it casts the input when dtype is passed explicitly:

In [1]: import numpy as np
   ...: np.__version__
Out[1]: '1.17.3'

In [2]: np.array([1, "2"])
Out[2]: array(['1', '2'], dtype='<U21')

In [3]: np.array(["1", "2"])
Out[3]: array(['1', '2'], dtype='<U1')

In [4]: np.array(["1", "2"], dtype='int')
Out[4]: array([1, 2])
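
The session above matters because of the string check that follows this call in the diff (`issubclass(data.dtype.type, str)`). A minimal sketch of how an explicit dtype sidesteps that check, using a hypothetical helper name (`looks_like_strings`) that is not part of pandas:

```python
import numpy as np


def looks_like_strings(arr):
    # Hypothetical helper mirroring the dtype-based check shown in the diff.
    return issubclass(arr.dtype.type, str)


inferred = np.array(["1", "2"])            # NumPy infers a unicode dtype
cast = np.array(["1", "2"], dtype="int")   # NumPy parses the strings as ints

print(looks_like_strings(inferred))   # True
print(looks_like_strings(cast))       # False
```

Once the cast has happened, the resulting array carries no trace that the input was made of strings, so a check on the array's dtype alone cannot reject it.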

This test suggests that pandas keeps the same behavior. In 458c25e I tried checking the element types instead, but that consumes iterators as a side effect, which leads to the current errors.
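
To illustrate the iterator problem: inspecting the elements of a generator exhausts it, so nothing is left to build the array from. The sketch below shows the failure and one possible workaround, materializing the input once before checking it. The helper names (`has_strings`, `coerce_once`) are hypothetical, and this is not necessarily what the PR ended up doing.

```python
import numpy as np


def has_strings(items):
    # Element-wise check; iterating a generator here uses it up.
    return any(isinstance(x, str) for x in items)


gen = (i for i in range(3))
has_strings(gen)
leftover = list(gen)
print(leftover)  # [] because the check already consumed the generator


def coerce_once(data, dtype=None):
    # Materialize one-shot iterables first, then check, then convert.
    if not isinstance(data, (list, tuple, np.ndarray)):
        data = list(data)
    if has_strings(data):
        raise TypeError("string data is not allowed")
    return np.asarray(data, dtype=dtype)


result = coerce_once((i for i in range(3)), dtype="uint64")
print(result.tolist())  # [0, 1, 2]
```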

Contributor Author

@oguzhanogreden oguzhanogreden Nov 16, 2019

The NumPy behavior I reported above was brought up here but not acted on.


if issubclass(data.dtype.type, str):
cls._string_data_error(data)
8 changes: 8 additions & 0 deletions pandas/tests/indexes/test_base.py
@@ -2821,3 +2821,11 @@ def test_shape_of_invalid_index():

idx = pd.Index([0, 1, 2, 3])
assert idx[:, None].shape == (4, 1)


+def test_index_construction_respects_dtype():
+    index_list = [7606741985629028552, 17876870360202815256]
+    expected = np.asarray(index_list, dtype="uint64")
+    result = np.asarray(UInt64Index(index_list, dtype="uint64"), dtype="uint64")
+
+    tm.assert_equal(expected, result)
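
For readers on a recent pandas, where `UInt64Index` has been removed (pandas 2.0 and later), a roughly equivalent standalone check might look like the sketch below. It assumes that `pd.Index(..., dtype="uint64")` is an acceptable stand-in for the constructor exercised by the test above, and it uses NumPy's testing helper instead of `tm`.

```python
import numpy as np
import pandas as pd

values = [7606741985629028552, 17876870360202815256]

# Build the index with an explicit unsigned dtype.
idx = pd.Index(values, dtype="uint64")
expected = np.asarray(values, dtype="uint64")

# Raises AssertionError if any value was altered during construction.
np.testing.assert_array_equal(np.asarray(idx), expected)
```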