Commit 9c1c269

merge master
2 parents c34a863 + e878fdc commit 9c1c269

52 files changed, 334 additions and 199 deletions

doc/source/getting_started/index.rst

Lines changed: 1 addition & 1 deletion
@@ -398,7 +398,7 @@ data set, a sliding window of the data or grouped by categories. The latter is a
 <div class="card-body">

 Change the structure of your data table in multiple ways. You can :func:`~pandas.melt` your data table from wide to long/tidy form or :func:`~pandas.pivot`
-from long to wide format. With aggregations built-in, a pivot table is created with a sinlge command.
+from long to wide format. With aggregations built-in, a pivot table is created with a single command.

 .. image:: ../_static/schemas/07_melt.svg
 :align: center

doc/source/getting_started/intro_tutorials/10_text_data.rst

Lines changed: 1 addition & 1 deletion
@@ -199,7 +199,7 @@ names in the ``Name`` column. By using pandas string methods, the

 Next, we need to get the corresponding location, preferably the index
 label, in the table for which the name length is the largest. The
-:meth:`~Series.idxmax`` method does exactly that. It is not a string method and is
+:meth:`~Series.idxmax` method does exactly that. It is not a string method and is
 applied to integers, so no ``str`` is used.

 .. ipython:: python

doc/source/whatsnew/v1.1.0.rst

Lines changed: 9 additions & 3 deletions
@@ -174,15 +174,17 @@ Other API changes
 - Added :meth:`DataFrame.value_counts` (:issue:`5377`)
 - :meth:`Groupby.groups` now returns an abbreviated representation when called on large dataframes (:issue:`1135`)
 - ``loc`` lookups with an object-dtype :class:`Index` and an integer key will now raise ``KeyError`` instead of ``TypeError`` when key is missing (:issue:`31905`)
-- Using a :func:`pandas.api.indexers.BaseIndexer` with ``count``, ``skew``, ``cov``, ``corr`` will now raise a ``NotImplementedError`` (:issue:`32865`)
-- Using a :func:`pandas.api.indexers.BaseIndexer` with ``min``, ``max`` will now return correct results for any monotonic :func:`pandas.api.indexers.BaseIndexer` descendant (:issue:`32865`)
+- Using a :func:`pandas.api.indexers.BaseIndexer` with ``skew``, ``cov``, ``corr`` will now raise a ``NotImplementedError`` (:issue:`32865`)
+- Using a :func:`pandas.api.indexers.BaseIndexer` with ``count``, ``min``, ``max`` will now return correct results for any monotonic :func:`pandas.api.indexers.BaseIndexer` descendant (:issue:`32865`)
 - Added a :func:`pandas.api.indexers.FixedForwardWindowIndexer` class to support forward-looking windows during ``rolling`` operations.
 -

 Backwards incompatible API changes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 - :meth:`DataFrame.swaplevels` now raises a ``TypeError`` if the axis is not a :class:`MultiIndex`.
-Previously a ``AttributeError`` was raised (:issue:`31126`)
+Previously an ``AttributeError`` was raised (:issue:`31126`)
+- :meth:`DataFrame.xs` now raises a ``TypeError`` if a ``level`` keyword is supplied and the axis is not a :class:`MultiIndex`.
+Previously an ``AttributeError`` was raised (:issue:`33610`)
 - :meth:`DataFrameGroupby.mean` and :meth:`SeriesGroupby.mean` (and similarly for :meth:`~DataFrameGroupby.median`, :meth:`~DataFrameGroupby.std` and :meth:`~DataFrameGroupby.var`)
 now raise a ``TypeError`` if a not-accepted keyword argument is passed into it.
 Previously a ``UnsupportedFunctionCall`` was raised (``AssertionError`` if ``min_count`` passed into :meth:`~DataFrameGroupby.median`) (:issue:`31485`)
@@ -458,6 +460,7 @@ Datetimelike
 - Bug in :class:`Timestamp` arithmetic when adding or subtracting a ``np.ndarray`` with ``timedelta64`` dtype (:issue:`33296`)
 - Bug in :meth:`DatetimeIndex.to_period` not infering the frequency when called with no arguments (:issue:`33358`)
 - Bug in :meth:`DatetimeIndex.tz_localize` incorrectly retaining ``freq`` in some cases where the original freq is no longer valid (:issue:`30511`)
+- Bug in :meth:`DatetimeIndex.intersection` losing ``freq`` and timezone in some cases (:issue:`33604`)

 Timedelta
 ^^^^^^^^^
@@ -525,6 +528,7 @@ Indexing
 - Bug in `Series.__getitem__` with an integer key and a :class:`MultiIndex` with leading integer level failing to raise ``KeyError`` if the key is not present in the first level (:issue:`33355`)
 - Bug in :meth:`DataFrame.iloc` when slicing a single column-:class:`DataFrame`` with ``ExtensionDtype`` (e.g. ``df.iloc[:, :1]``) returning an invalid result (:issue:`32957`)
 - Bug in :meth:`DatetimeIndex.insert` and :meth:`TimedeltaIndex.insert` causing index ``freq`` to be lost when setting an element into an empty :class:`Series` (:issue:33573`)
+- Bug in :meth:`Series.__setitem__` with an :class:`IntervalIndex` and a list-like key of integers (:issue:`33473`)

 Missing
 ^^^^^^^
@@ -576,6 +580,8 @@ I/O
 - Bug in :meth:`read_excel` did not correctly handle multiple embedded spaces in OpenDocument text cells. (:issue:`32207`)
 - Bug in :meth:`read_json` was raising ``TypeError`` when reading a list of booleans into a Series. (:issue:`31464`)
 - Bug in :func:`pandas.io.json.json_normalize` where location specified by `record_path` doesn't point to an array. (:issue:`26284`)
+- :func:`pandas.read_hdf` has a more explicit error message when loading an
+unsupported HDF file (:issue:`9539`)

 Plotting
 ^^^^^^^^
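
The whatsnew hunk above mentions forward-looking windows for ``rolling`` operations. As a rough illustration of that idea only, and not of how pandas implements it, the pure-Python sketch below computes window bounds in which window ``i`` starts at position ``i`` and extends forward, clipped at the end of the data. The names ``forward_window_bounds`` and ``forward_rolling_sum`` are hypothetical, and the sketch ignores ``min_periods`` handling entirely.

```python
def forward_window_bounds(num_values, window_size):
    """Return (starts, ends) for forward-looking windows.

    Window i covers positions [i, i + window_size), clipped to num_values.
    Hypothetical helper for illustration; not a pandas function.
    """
    starts = list(range(num_values))
    ends = [min(start + window_size, num_values) for start in starts]
    return starts, ends


def forward_rolling_sum(values, window_size):
    """Sum each forward-looking window (no min_periods handling)."""
    starts, ends = forward_window_bounds(len(values), window_size)
    return [sum(values[start:end]) for start, end in zip(starts, ends)]
```

A backward-looking window would instead end at position ``i``; only the bounds computation differs, which is presumably why the bounds are factored out into an indexer object.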

pandas/_libs/parsers.pyx

Lines changed: 2 additions & 3 deletions
@@ -55,9 +55,7 @@ from pandas.core.dtypes.common import (
 is_bool_dtype, is_object_dtype,
 is_datetime64_dtype,
 pandas_dtype, is_extension_array_dtype)
-from pandas.core.arrays import Categorical
 from pandas.core.dtypes.concat import union_categoricals
-import pandas.io.common as icom

 from pandas.compat import _import_lzma, _get_lzma_file
 from pandas.errors import (ParserError, DtypeWarning,
@@ -1149,7 +1147,8 @@ cdef class TextReader:

 # Method accepts list of strings, not encoded ones.
 true_values = [x.decode() for x in self.true_values]
-cat = Categorical._from_inferred_categories(
+array_type = dtype.construct_array_type()
+cat = array_type._from_inferred_categories(
 cats, codes, dtype, true_values=true_values)
 return cat, na_count

pandas/core/algorithms.py

Lines changed: 2 additions & 8 deletions
@@ -29,7 +29,6 @@
 is_bool_dtype,
 is_categorical_dtype,
 is_complex_dtype,
-is_datetime64_any_dtype,
 is_datetime64_dtype,
 is_datetime64_ns_dtype,
 is_extension_array_dtype,
@@ -122,12 +121,7 @@ def _ensure_data(values, dtype=None):
 return ensure_object(values), "object"

 # datetimelike
-if (
-needs_i8_conversion(values)
-or is_period_dtype(dtype)
-or is_datetime64_any_dtype(dtype)
-or is_timedelta64_dtype(dtype)
-):
+if needs_i8_conversion(values) or needs_i8_conversion(dtype):
 if is_period_dtype(values) or is_period_dtype(dtype):
 from pandas import PeriodIndex

@@ -616,7 +610,7 @@ def factorize(
 values = _ensure_arraylike(values)
 original = values

-if is_extension_array_dtype(values):
+if is_extension_array_dtype(values.dtype):
 values = extract_array(values)
 codes, uniques = values.factorize(na_sentinel=na_sentinel)
 dtype = original.dtype

pandas/core/arrays/categorical.py

Lines changed: 1 addition & 1 deletion
@@ -370,7 +370,7 @@ def __init__(
 # we're inferring from values
 dtype = CategoricalDtype(categories, dtype.ordered)

-elif is_categorical_dtype(values):
+elif is_categorical_dtype(values.dtype):
 old_codes = (
 values._values.codes if isinstance(values, ABCSeries) else values.codes
 )

pandas/core/arrays/datetimelike.py

Lines changed: 4 additions & 8 deletions
@@ -638,8 +638,6 @@ def astype(self, dtype, copy=True):
 # 1. PeriodArray.astype handles period -> period
 # 2. DatetimeArray.astype handles conversion between tz.
 # 3. DatetimeArray.astype handles datetime -> period
-from pandas import Categorical
-
 dtype = pandas_dtype(dtype)

 if is_object_dtype(dtype):
@@ -667,7 +665,8 @@ def astype(self, dtype, copy=True):
 msg = f"Cannot cast {type(self).__name__} to dtype {dtype}"
 raise TypeError(msg)
 elif is_categorical_dtype(dtype):
-return Categorical(self, dtype=dtype)
+arr_cls = dtype.construct_array_type()
+return arr_cls(self, dtype=dtype)
 else:
 return np.asarray(self, dtype=dtype)

@@ -1177,10 +1176,7 @@ def _add_timedeltalike_scalar(self, other):
 # adding a scalar preserves freq
 new_freq = self.freq

-if new_freq is not None:
-# fastpath that doesnt require inference
-return type(self)(new_values, dtype=self.dtype, freq=new_freq)
-return type(self)(new_values, dtype=self.dtype)._with_freq("infer")
+return type(self)(new_values, dtype=self.dtype, freq=new_freq)

 def _add_timedelta_arraylike(self, other):
 """
@@ -1210,7 +1206,7 @@ def _add_timedelta_arraylike(self, other):
 mask = (self._isnan) | (other._isnan)
 new_values[mask] = iNaT

-return type(self)(new_values, dtype=self.dtype)._with_freq("infer")
+return type(self)(new_values, dtype=self.dtype)

 def _add_nat(self):
 """

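The ``astype`` hunk in this file (and the similar change in ``pandas/_libs/parsers.pyx``) stops importing ``Categorical`` directly and instead asks the dtype for its array class via ``construct_array_type()``. A minimal sketch of that pattern follows, using hypothetical ``ToyCategoricalDtype`` and ``ToyCategoricalArray`` stand-ins rather than the real pandas classes; the motivation (avoiding a direct import of the concrete array class) is an assumption, since the commit message does not state it.

```python
class ToyCategoricalArray:
    """Hypothetical stand-in for a concrete array class."""

    def __init__(self, values, dtype):
        self.values = list(values)
        self.dtype = dtype


class ToyCategoricalDtype:
    """Hypothetical stand-in for a dtype that knows its own array class."""

    @classmethod
    def construct_array_type(cls):
        # The dtype, not the caller, decides which array class to use.
        return ToyCategoricalArray


def astype_categorical(values, dtype):
    # Mirrors the shape of the new code path: look up the array class
    # through the dtype instead of naming it in an import.
    arr_cls = dtype.construct_array_type()
    return arr_cls(values, dtype=dtype)
```

With this shape, the calling module only needs the dtype object it was already handed, so it does not have to import the concrete array class.
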
pandas/core/arrays/datetimes.py

Lines changed: 4 additions & 4 deletions
@@ -22,8 +22,8 @@
 from pandas.errors import PerformanceWarning

 from pandas.core.dtypes.common import (
-_INT64_DTYPE,
 DT64NS_DTYPE,
+INT64_DTYPE,
 is_bool_dtype,
 is_categorical_dtype,
 is_datetime64_any_dtype,
@@ -404,7 +404,7 @@ def _generate_range(
 start = start.tz_localize(None)
 if end is not None:
 end = end.tz_localize(None)
-# TODO: consider re-implementing _cached_range; GH#17914
+
 values, _tz = generate_regular_range(start, end, periods, freq)
 index = cls._simple_new(values, freq=freq, dtype=tz_to_dtype(_tz))

@@ -698,7 +698,7 @@ def _add_offset(self, offset):
 # GH#30336 _from_sequence won't be able to infer self.tz
 return type(self)._from_sequence(result).tz_localize(self.tz)

-return type(self)._from_sequence(result)._with_freq("infer")
+return type(self)._from_sequence(result)

 def _sub_datetimelike_scalar(self, other):
 # subtract a datetime from myself, yielding a ndarray[timedelta64[ns]]
@@ -1963,7 +1963,7 @@ def sequence_to_dt64ns(
 if tz:
 tz = timezones.maybe_get_tz(tz)

-if data.dtype != _INT64_DTYPE:
+if data.dtype != INT64_DTYPE:
 data = data.astype(np.int64, copy=False)
 result = data.view(DT64NS_DTYPE)

pandas/core/base.py

Lines changed: 1 addition & 2 deletions
@@ -1143,8 +1143,7 @@ def _map_values(self, mapper, na_action=None):
 raise NotImplementedError
 map_f = lambda values, f: values.map(f)
 else:
-values = self.astype(object)
-values = getattr(values, "values", values)
+values = self.astype(object)._values
 if na_action == "ignore":

 def map_f(values, f):

pandas/core/dtypes/cast.py

Lines changed: 2 additions & 2 deletions
@@ -21,9 +21,9 @@
 from pandas.util._validators import validate_bool_kwarg

 from pandas.core.dtypes.common import (
-_INT64_DTYPE,
 _POSSIBLY_CAST_DTYPES,
 DT64NS_DTYPE,
+INT64_DTYPE,
 TD64NS_DTYPE,
 ensure_int8,
 ensure_int16,
@@ -954,7 +954,7 @@ def astype_nansafe(arr, dtype, copy: bool = True, skipna: bool = False):
 raise ValueError("Cannot convert NaT values to integer")
 return arr.view(dtype)

-if dtype not in [_INT64_DTYPE, TD64NS_DTYPE]:
+if dtype not in [INT64_DTYPE, TD64NS_DTYPE]:

 # allow frequency conversions
 # we return a float here!

pandas/core/dtypes/common.py

Lines changed: 1 addition & 4 deletions
@@ -60,17 +60,14 @@

 DT64NS_DTYPE = conversion.DT64NS_DTYPE
 TD64NS_DTYPE = conversion.TD64NS_DTYPE
-_INT64_DTYPE = np.dtype(np.int64)
+INT64_DTYPE = np.dtype(np.int64)

 # oh the troubles to reduce import time
 _is_scipy_sparse = None

 ensure_float64 = algos.ensure_float64
 ensure_float32 = algos.ensure_float32

-_ensure_datetime64ns = conversion.ensure_datetime64ns
-_ensure_timedelta64ns = conversion.ensure_timedelta64ns
-

 def ensure_float(arr):
 """

pandas/core/generic.py

Lines changed: 28 additions & 23 deletions
@@ -353,7 +353,7 @@ def _construct_axes_from_arguments(
 return axes, kwargs

 @classmethod
-def _get_axis_number(cls, axis):
+def _get_axis_number(cls, axis) -> int:
 axis = cls._AXIS_ALIASES.get(axis, axis)
 if is_integer(axis):
 if axis in cls._AXIS_NAMES:
@@ -366,7 +366,7 @@ def _get_axis_number(cls, axis):
 raise ValueError(f"No axis named {axis} for object type {cls.__name__}")

 @classmethod
-def _get_axis_name(cls, axis):
+def _get_axis_name(cls, axis) -> str:
 axis = cls._AXIS_ALIASES.get(axis, axis)
 if isinstance(axis, str):
 if axis in cls._AXIS_NUMBERS:
@@ -378,12 +378,12 @@ def _get_axis_name(cls, axis):
 pass
 raise ValueError(f"No axis named {axis} for object type {cls.__name__}")

-def _get_axis(self, axis):
+def _get_axis(self, axis) -> Index:
 name = self._get_axis_name(axis)
 return getattr(self, name)

 @classmethod
-def _get_block_manager_axis(cls, axis):
+def _get_block_manager_axis(cls, axis) -> int:
 """Map the axis to the block_manager axis."""
 axis = cls._get_axis_number(axis)
 if cls._AXIS_REVERSED:
@@ -590,7 +590,9 @@ def swapaxes(self: FrameOrSeries, axis1, axis2, copy=True) -> FrameOrSeries:
 if copy:
 new_values = new_values.copy()

-return self._constructor(new_values, *new_axes).__finalize__(
+# ignore needed because of NDFrame constructor is different than
+# DataFrame/Series constructors.
+return self._constructor(new_values, *new_axes).__finalize__( # type: ignore
 self, method="swapaxes"
 )

@@ -3217,7 +3219,8 @@ def _maybe_cache_changed(self, item, value) -> None:
 """
 The object has called back to us saying maybe it has changed.
 """
-self._mgr.set(item, value)
+loc = self._info_axis.get_loc(item)
+self._mgr.iset(loc, value)

 @property
 def _is_cached(self) -> bool_t:
@@ -3490,6 +3493,8 @@ class animal locomotion
 axis = self._get_axis_number(axis)
 labels = self._get_axis(axis)
 if level is not None:
+if not isinstance(labels, MultiIndex):
+raise TypeError("Index must be a MultiIndex")
 loc, new_ax = labels.get_loc_level(key, level=level, drop_level=drop_level)

 # create the tuple of the indexer
@@ -3548,8 +3553,6 @@ class animal locomotion
 result._set_is_copy(self, copy=not result._is_view)
 return result

-_xs: Callable = xs
-
 def __getitem__(self, item):
 raise AbstractMethodError(self)

@@ -3594,8 +3597,14 @@ def _iset_item(self, loc: int, value) -> None:
 self._clear_item_cache()

 def _set_item(self, key, value) -> None:
-self._mgr.set(key, value)
-self._clear_item_cache()
+try:
+loc = self._info_axis.get_loc(key)
+except KeyError:
+# This item wasn't present, just insert at end
+self._mgr.insert(len(self._info_axis), key, value)
+return
+
+NDFrame._iset_item(self, loc, value)

 def _set_is_copy(self, ref, copy: bool_t = True) -> None:
 if not copy:
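
The ``_set_item`` hunk above replaces a single label-based ``set`` call with a two-step flow: look up the label's position, then either overwrite in place or, if the label is missing, insert at the end. The sketch below mirrors only that control flow with a hypothetical ``ToyFrame`` backed by plain lists; it is not the pandas block manager, and ``list.index`` raising ``ValueError`` stands in for ``Index.get_loc`` raising ``KeyError``.

```python
class ToyFrame:
    """Hypothetical container with parallel lists of labels and columns."""

    def __init__(self):
        self.labels = []
        self.columns = []

    def _iset_item(self, loc, value):
        # Positional overwrite of an existing column.
        self.columns[loc] = value

    def _set_item(self, key, value):
        try:
            # list.index raises ValueError for a missing label; the real
            # code catches KeyError from Index.get_loc instead.
            loc = self.labels.index(key)
        except ValueError:
            # This item wasn't present, so append it at the end.
            self.labels.append(key)
            self.columns.append(value)
            return

        self._iset_item(loc, value)
```

Setting an existing label keeps its position, while a new label always lands last, which matches the "just insert at end" comment in the diff.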
@@ -7623,11 +7632,11 @@ def at_time(
 axis = self._get_axis_number(axis)

 index = self._get_axis(axis)
-try:
-indexer = index.indexer_at_time(time, asof=asof)
-except AttributeError as err:
-raise TypeError("Index must be DatetimeIndex") from err

+if not isinstance(index, DatetimeIndex):
+raise TypeError("Index must be DatetimeIndex")
+
+indexer = index.indexer_at_time(time, asof=asof)
 return self._take_with_is_copy(indexer, axis=axis)

 def between_time(
@@ -7706,16 +7715,12 @@ def between_time(
 axis = self._get_axis_number(axis)

 index = self._get_axis(axis)
-try:
-indexer = index.indexer_between_time(
-start_time,
-end_time,
-include_start=include_start,
-include_end=include_end,
-)
-except AttributeError as err:
-raise TypeError("Index must be DatetimeIndex") from err
+if not isinstance(index, DatetimeIndex):
+raise TypeError("Index must be DatetimeIndex")

+indexer = index.indexer_between_time(
+start_time, end_time, include_start=include_start, include_end=include_end,
+)
 return self._take_with_is_copy(indexer, axis=axis)

 def resample(
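
The ``at_time`` and ``between_time`` hunks above swap a ``try``/``except AttributeError`` wrapper for an explicit ``isinstance`` check before the call. The sketch below contrasts the two styles using hypothetical ``PlainIndex`` and ``TimeIndex`` classes, not the pandas index types. One visible difference, at least in this toy version, is that the explicit check raises a ``TypeError`` with no chained cause, and an unrelated ``AttributeError`` raised inside the indexer method would no longer be converted into a misleading ``TypeError``.

```python
from datetime import time


class PlainIndex:
    """Hypothetical index type without time-based lookups."""


class TimeIndex(PlainIndex):
    """Hypothetical index type that supports time-of-day lookups."""

    def __init__(self, times):
        self.times = list(times)

    def indexer_at_time(self, time_of_day):
        return [i for i, t in enumerate(self.times) if t == time_of_day]


def positions_at_time_eafp(index, time_of_day):
    # Old style: attempt the call and translate AttributeError.
    try:
        return index.indexer_at_time(time_of_day)
    except AttributeError as err:
        raise TypeError("Index must be TimeIndex") from err


def positions_at_time(index, time_of_day):
    # New style: validate the index type up front, then call.
    if not isinstance(index, TimeIndex):
        raise TypeError("Index must be TimeIndex")
    return index.indexer_at_time(time_of_day)
```

Both functions return the same positions for a valid index; they differ only in how the failure for an unsupported index is produced.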
