Commit 2afde21

author
luke
committed
Merge branch 'fix-agg-with-one-element-fuc-list-given-arg-kwagrs' of https://github.com/luke396/pandas into fix-agg-with-one-element-fuc-list-given-arg-kwagrs
2 parents be9135d + 98fbd39 commit 2afde21

141 files changed, +873 -1639 lines changed

.github/ISSUE_TEMPLATE/bug_report.yaml

Lines changed: 1 addition & 3 deletions
@@ -17,9 +17,7 @@ body:
 [latest version](https://pandas.pydata.org/docs/whatsnew/index.html) of pandas.
 required: true
 - label: >
-I have confirmed this bug exists on the [main branch]
-(https://pandas.pydata.org/docs/dev/getting_started/install.html#installing-the-development-version-of-pandas)
-of pandas.
+I have confirmed this bug exists on the main branch of pandas.
 - type: textarea
 id: example
 attributes:
attributes:

.github/workflows/python-dev.yml

Lines changed: 1 addition & 1 deletion
@@ -73,7 +73,7 @@ jobs:
 run: |
 python --version
 python -m pip install --upgrade pip setuptools wheel
-python -m pip install --pre --extra-index-url https://pypi.anaconda.org/scipy-wheels-nightly/simple numpy
+python -m pip install --extra-index-url https://pypi.anaconda.org/scipy-wheels-nightly/simple numpy
 python -m pip install git+https://github.com/nedbat/coveragepy.git
 python -m pip install versioneer[toml]
 python -m pip install python-dateutil pytz cython hypothesis>=6.34.2 pytest>=7.0.0 pytest-xdist>=2.2.0 pytest-cov pytest-asyncio>=0.17

doc/source/getting_started/install.rst

Lines changed: 0 additions & 16 deletions
@@ -201,22 +201,6 @@ Installing from source

 See the :ref:`contributing guide <contributing>` for complete instructions on building from the git source tree. Further, see :ref:`creating a development environment <contributing_environment>` if you wish to create a pandas development environment.

-Installing the development version of pandas
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Installing a nightly build is the quickest way to:
-
-* Try a new feature that will be shipped in the next release (that is, a feature from a pull-request that was recently merged to the main branch).
-* Check whether a bug you encountered has been fixed since the last release.
-
-You can install the nightly build of pandas using the scipy-wheels-nightly index from the PyPI registry of anaconda.org with the following command::
-
-pip install --pre --extra-index https://pypi.anaconda.org/scipy-wheels-nightly/simple pandas
-
-Note that first uninstalling pandas might be required to be able to install nightly builds::
-
-pip uninstall pandas -y
-
 Running the test suite
 ----------------------

doc/source/whatsnew/v2.0.0.rst

Lines changed: 28 additions & 40 deletions
@@ -83,44 +83,34 @@ be set to ``"pyarrow"`` to return pyarrow-backed, nullable :class:`ArrowDtype` (
 df_pyarrow = pd.read_csv(data, use_nullable_dtypes=True, engine="pyarrow")
 df_pyarrow.dtypes

-Copy-on-Write improvements
+Copy on write improvements
 ^^^^^^^^^^^^^^^^^^^^^^^^^^

-- A new lazy copy mechanism that defers the copy until the object in question is modified
-was added to the following methods:
-
-- :meth:`DataFrame.reset_index` / :meth:`Series.reset_index`
-- :meth:`DataFrame.set_index` / :meth:`Series.set_index`
-- :meth:`DataFrame.set_axis` / :meth:`Series.set_axis`
-- :meth:`DataFrame.rename_axis` / :meth:`Series.rename_axis`
-- :meth:`DataFrame.rename_columns`
-- :meth:`DataFrame.reindex` / :meth:`Series.reindex`
-- :meth:`DataFrame.reindex_like` / :meth:`Series.reindex_like`
-- :meth:`DataFrame.assign`
-- :meth:`DataFrame.drop`
-- :meth:`DataFrame.dropna` / :meth:`Series.dropna`
-- :meth:`DataFrame.select_dtypes`
-- :meth:`DataFrame.align` / :meth:`Series.align`
-- :meth:`Series.to_frame`
-- :meth:`DataFrame.rename` / :meth:`Series.rename`
-- :meth:`DataFrame.add_prefix` / :meth:`Series.add_prefix`
-- :meth:`DataFrame.add_suffix` / :meth:`Series.add_suffix`
-- :meth:`DataFrame.drop_duplicates` / :meth:`Series.drop_duplicates`
-- :meth:`DataFrame.reorder_levels` / :meth:`Series.reorder_levels`
-
-These methods return views when Copy-on-Write is enabled, which provides a significant
-performance improvement compared to the regular execution (:issue:`49473`).
-
-- Accessing a single column of a DataFrame as a Series (e.g. ``df["col"]``) now always
-returns a new object every time it is constructed when Copy-on-Write is enabled (not
-returning multiple times an identical, cached Series object). This ensures that those
-Series objects correctly follow the Copy-on-Write rules (:issue:`49450`)
-
-- The :class:`Series` constructor will now create a lazy copy (deferring the copy until
-a modification to the data happens) when constructing a Series from an existing
-Series with the default of ``copy=False`` (:issue:`50471`)
-
-Copy-on-Write can be enabled through
+A new lazy copy mechanism that defers the copy until the object in question is modified
+was added to the following methods:
+
+- :meth:`DataFrame.reset_index` / :meth:`Series.reset_index`
+- :meth:`DataFrame.set_index` / :meth:`Series.set_index`
+- :meth:`DataFrame.set_axis` / :meth:`Series.set_axis`
+- :meth:`DataFrame.rename_axis` / :meth:`Series.rename_axis`
+- :meth:`DataFrame.rename_columns`
+- :meth:`DataFrame.reindex` / :meth:`Series.reindex`
+- :meth:`DataFrame.reindex_like` / :meth:`Series.reindex_like`
+- :meth:`DataFrame.assign`
+- :meth:`DataFrame.drop`
+- :meth:`DataFrame.dropna` / :meth:`Series.dropna`
+- :meth:`DataFrame.select_dtypes`
+- :meth:`DataFrame.align` / :meth:`Series.align`
+- :meth:`Series.to_frame`
+- :meth:`DataFrame.rename` / :meth:`Series.rename`
+- :meth:`DataFrame.add_prefix` / :meth:`Series.add_prefix`
+- :meth:`DataFrame.add_suffix` / :meth:`Series.add_suffix`
+- :meth:`DataFrame.drop_duplicates` / :meth:`Series.drop_duplicates`
+- :meth:`DataFrame.reorder_levels` / :meth:`Series.reorder_levels`
+
+These methods return views when copy on write is enabled, which provides a significant
+performance improvement compared to the regular execution (:issue:`49473`). Copy on write
+can be enabled through

 .. code-block:: python

@@ -573,7 +563,8 @@ Deprecations
 ~~~~~~~~~~~~
 - Deprecated argument ``infer_datetime_format`` in :func:`to_datetime` and :func:`read_csv`, as a strict version of it is now the default (:issue:`48621`)
 - Deprecated :func:`pandas.io.sql.execute`(:issue:`50185`)
-- :meth:`Index.is_integer` has been deprecated. Use :func:`pandas.api.types.is_integer_dtype` instead (:issue:`50042`)
+-
+
 - :meth:`Index.is_floating` has been deprecated. Use :func:`pandas.api.types.is_float_dtype` instead (:issue:`50042`)

 .. ---------------------------------------------------------------------------
@@ -918,8 +909,6 @@ Timezones
 - Bug in :meth:`Series.astype` and :meth:`DataFrame.astype` with object-dtype containing multiple timezone-aware ``datetime`` objects with heterogeneous timezones to a :class:`DatetimeTZDtype` incorrectly raising (:issue:`32581`)
 - Bug in :func:`to_datetime` was failing to parse date strings with timezone name when ``format`` was specified with ``%Z`` (:issue:`49748`)
 - Better error message when passing invalid values to ``ambiguous`` parameter in :meth:`Timestamp.tz_localize` (:issue:`49565`)
-- Bug in string parsing incorrectly allowing a :class:`Timestamp` to be constructed with an invalid timezone, which would raise when trying to print (:issue:`50668`)
--

 Numeric
 ^^^^^^^
@@ -958,7 +947,6 @@ Indexing
 - Bug in :meth:`DataFrame.__setitem__` raising when indexer is a :class:`DataFrame` with ``boolean`` dtype (:issue:`47125`)
 - Bug in :meth:`DataFrame.reindex` filling with wrong values when indexing columns and index for ``uint`` dtypes (:issue:`48184`)
 - Bug in :meth:`DataFrame.loc` when setting :class:`DataFrame` with different dtypes coercing values to single dtype (:issue:`50467`)
-- Bug in :meth:`DataFrame.sort_values` where ``None`` was not returned when ``by`` is empty list and ``inplace=True`` (:issue:`50643`)
 - Bug in :meth:`DataFrame.loc` coercing dtypes when setting values with a list indexer (:issue:`49159`)
 - Bug in :meth:`Series.loc` raising error for out of bounds end of slice indexer (:issue:`50161`)
 - Bug in :meth:`DataFrame.loc` raising ``ValueError`` with ``bool`` indexer and :class:`MultiIndex` (:issue:`47687`)

pandas/_libs/tslib.pyx

Lines changed: 0 additions & 9 deletions
@@ -770,15 +770,6 @@ cdef _array_to_datetime_object(
 oresult[i] = "NaT"
 cnp.PyArray_MultiIter_NEXT(mi)
 continue
-elif val == "now":
-oresult[i] = datetime.now()
-cnp.PyArray_MultiIter_NEXT(mi)
-continue
-elif val == "today":
-oresult[i] = datetime.today()
-cnp.PyArray_MultiIter_NEXT(mi)
-continue
-
 try:
 oresult[i] = parse_datetime_string(val, dayfirst=dayfirst,
 yearfirst=yearfirst)
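
The branches removed from `_array_to_datetime_object` in this hunk handled the literal strings "now" and "today" before delegating to `parse_datetime_string`; the `parsing.pyx` changes in this same commit show the equivalent check inside `parse_datetime_string` itself. A minimal pure-Python sketch of that dispatch order is given here. The name `parse_with_special_cases` and the `datetime.fromisoformat` fallback are assumptions made for illustration only; they are not the pandas implementation, which falls back to its own dateutil-based parsing.

```python
from datetime import datetime


def parse_with_special_cases(date_string: str) -> datetime:
    """Hypothetical stand-in for the special-casing shown in the diff."""
    # The special-case strings are checked before any general parsing.
    if date_string == "now":
        return datetime.now()
    if date_string == "today":
        return datetime.today()
    # Assumption: ISO 8601 parsing stands in for the real fallback parser.
    return datetime.fromisoformat(date_string)


print(parse_with_special_cases("2023-01-15"))
print(parse_with_special_cases("now"))
```

Keeping the check in a single parsing function means every caller gets the same treatment of these two strings, instead of each caller having to remember to handle them first.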

pandas/_libs/tslibs/parsing.pyi

Lines changed: 4 additions & 3 deletions
@@ -2,6 +2,7 @@ from datetime import datetime

 import numpy as np

+from pandas._libs.tslibs.offsets import BaseOffset
 from pandas._typing import npt

 class DateParseError(ValueError): ...
@@ -11,9 +12,9 @@ def parse_datetime_string(
 dayfirst: bool = ...,
 yearfirst: bool = ...,
 ) -> datetime: ...
-def parse_datetime_string_with_reso(
-date_string: str,
-freq: str | None = ...,
+def parse_time_string(
+arg: str,
+freq: BaseOffset | str | None = ...,
 dayfirst: bool | None = ...,
 yearfirst: bool | None = ...,
 ) -> tuple[datetime, str]: ...

pandas/_libs/tslibs/parsing.pyx

Lines changed: 57 additions & 44 deletions
@@ -59,6 +59,7 @@ from pandas._libs.tslibs.np_datetime cimport (
 npy_datetimestruct,
 string_to_dts,
 )
+from pandas._libs.tslibs.offsets cimport is_offset_object
 from pandas._libs.tslibs.strptime import array_strptime
 from pandas._libs.tslibs.util cimport (
 get_c_string_buf_and_size,
@@ -256,10 +257,6 @@ def parse_datetime_string(
 Returns
 -------
 datetime
-
-Notes
------
-Does not handle "today" or "now", which caller is responsible for handling.
 """

 cdef:
@@ -278,6 +275,14 @@ def parse_datetime_string(
 if dt is not None:
 return dt

+# Handling special case strings today & now
+if date_string == "now":
+dt = datetime.now()
+return dt
+elif date_string == "today":
+dt = datetime.today()
+return dt
+
 try:
 dt, _ = _parse_dateabbr_string(date_string, _DEFAULT_DATETIME, freq=None)
 return dt
@@ -299,39 +304,20 @@ def parse_datetime_string(
 raise OutOfBoundsDatetime(
 f'Parsing "{date_string}" to datetime overflows'
 ) from err
-if dt.tzinfo is not None:
-# dateutil can return a datetime with a tzoffset outside of (-24H, 24H)
-# bounds, which is invalid (can be constructed, but raises if we call
-# str(dt)). Check that and raise here if necessary.
-try:
-dt.utcoffset()
-except ValueError as err:
-# offset must be a timedelta strictly between -timedelta(hours=24)
-# and timedelta(hours=24)
-raise ValueError(
-f'Parsed string "{date_string}" gives an invalid tzoffset, '
-"which must be between -timedelta(hours=24) and timedelta(hours=24)"
-)

 return dt


-def parse_datetime_string_with_reso(
-str date_string, str freq=None, dayfirst=None, yearfirst=None
-):
-# NB: This will break with np.str_ (GH#45580) even though
-# isinstance(npstrobj, str) evaluates to True, so caller must ensure
-# the argument is *exactly* 'str'
+def parse_time_string(arg, freq=None, dayfirst=None, yearfirst=None):
 """
 Try hard to parse datetime string, leveraging dateutil plus some extra
 goodies like quarter recognition.

 Parameters
 ----------
-date_string : str
-freq : str or None, default None
+arg : str
+freq : str or DateOffset, default None
 Helps with interpreting time string if supplied
-Corresponds to `offset.rule_code`
 dayfirst : bool, default None
 If None uses default from print_config
 yearfirst : bool, default None
@@ -342,21 +328,50 @@ def parse_datetime_string_with_reso(
 datetime
 str
 Describing resolution of parsed string.
-
-Raises
-------
-ValueError : preliminary check suggests string is not datetime
-DateParseError : error within dateutil
 """
+if type(arg) is not str:
+# GH#45580 np.str_ satisfies isinstance(obj, str) but if we annotate
+# arg as "str" this raises here
+if not isinstance(arg, np.str_):
+raise TypeError(
+"Argument 'arg' has incorrect type "
+f"(expected str, got {type(arg).__name__})"
+)
+arg = str(arg)
+
+if is_offset_object(freq):
+freq = freq.rule_code

 if dayfirst is None:
 dayfirst = get_option("display.date_dayfirst")
 if yearfirst is None:
 yearfirst = get_option("display.date_yearfirst")

+res = parse_datetime_string_with_reso(arg, freq=freq,
+dayfirst=dayfirst,
+yearfirst=yearfirst)
+return res
+
+
+cdef parse_datetime_string_with_reso(
+str date_string, str freq=None, bint dayfirst=False, bint yearfirst=False,
+):
+"""
+Parse datetime string and try to identify its resolution.
+
+Returns
+-------
+datetime
+str
+Inferred resolution of the parsed string.
+
+Raises
+------
+ValueError : preliminary check suggests string is not datetime
+DateParseError : error within dateutil
+"""
 cdef:
-datetime parsed
-str reso
+object parsed, reso
 bint string_to_dts_failed
 npy_datetimestruct dts
 NPY_DATETIMEUNIT out_bestunit
@@ -468,7 +483,7 @@ cpdef bint _does_string_look_like_datetime(str py_string):
 cdef object _parse_dateabbr_string(object date_string, datetime default,
 str freq=None):
 cdef:
-datetime ret
+object ret
 # year initialized to prevent compiler warnings
 int year = -1, quarter = -1, month
 Py_ssize_t date_len
@@ -490,8 +505,8 @@ cdef object _parse_dateabbr_string(object date_string, datetime default,
 except ValueError:
 pass

-if 4 <= date_len <= 7:
-try:
+try:
+if 4 <= date_len <= 7:
 i = date_string.index("Q", 1, 6)
 if i == 1:
 quarter = int(date_string[0])
@@ -538,21 +553,19 @@ cdef object _parse_dateabbr_string(object date_string, datetime default,
 ret = default.replace(year=year, month=month)
 return ret, "quarter"

-except DateParseError:
-raise
-except ValueError:
-# e.g. if "Q" is not in date_string and .index raised
-pass
+except DateParseError:
+raise
+except ValueError:
+pass

 if date_len == 6 and freq == "M":
 year = int(date_string[:4])
 month = int(date_string[4:6])
 try:
 ret = default.replace(year=year, month=month)
 return ret, "month"
-except ValueError as err:
-# We can infer that none of the patterns below will match
-raise ValueError(f"Unable to parse {date_string}") from err
+except ValueError:
+pass

 for pat in ["%Y-%m", "%b %Y", "%b-%Y"]:
 try:
pandas/_libs/tslibs/period.pyx

Lines changed: 2 additions & 4 deletions
@@ -88,7 +88,7 @@ from pandas._libs.tslibs.dtypes cimport (
 )
 from pandas._libs.tslibs.parsing cimport quarter_to_myear

-from pandas._libs.tslibs.parsing import parse_datetime_string_with_reso
+from pandas._libs.tslibs.parsing import parse_time_string

 from pandas._libs.tslibs.nattype cimport (
 NPY_NAT,
@@ -2589,9 +2589,7 @@ class Period(_Period):

 value = str(value)
 value = value.upper()
-
-freqstr = freq.rule_code if freq is not None else None
-dt, reso = parse_datetime_string_with_reso(value, freqstr)
+dt, reso = parse_time_string(value, freq)
 try:
 ts = Timestamp(value)
 except ValueError:
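
The call-site change in `Period` passes `freq` straight through, which works because `parse_time_string` (see the `parsing.pyx` diff in this commit) normalizes its own arguments: it converts string subclasses to an exact `str` and reduces an offset object to its `rule_code`. The following is a hedged, dependency-free sketch of that normalization. `NotQuiteStr`, `FakeOffset`, and `normalize_args` are hypothetical names; the sketch accepts any `str` subclass where the diff checks specifically for `np.str_`, and uses `hasattr` where the diff calls `is_offset_object`.

```python
from typing import Optional, Tuple


class NotQuiteStr(str):
    """Hypothetical str subclass standing in for numpy.str_."""


class FakeOffset:
    """Hypothetical offset-like object exposing only ``rule_code``."""

    def __init__(self, rule_code: str) -> None:
        self.rule_code = rule_code


def normalize_args(arg, freq=None) -> Tuple[str, Optional[str]]:
    # Exact-type check: subclasses are converted, other types are rejected.
    if type(arg) is not str:
        if not isinstance(arg, str):
            raise TypeError(
                "Argument 'arg' has incorrect type "
                f"(expected str, got {type(arg).__name__})"
            )
        arg = str(arg)
    # Assumption: any object with ``rule_code`` is treated as an offset.
    if hasattr(freq, "rule_code"):
        freq = freq.rule_code
    return arg, freq


arg, freq = normalize_args(NotQuiteStr("2020Q1"), FakeOffset("Q-DEC"))
print(type(arg).__name__, arg, freq)
```

Doing the conversion in one place lets callers such as `Period` hand over whatever they hold, at the cost of an untyped entry point in front of the strictly typed inner function.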

pandas/_testing/asserters.py

Lines changed: 1 addition & 2 deletions
@@ -15,7 +15,6 @@
 is_bool,
 is_categorical_dtype,
 is_extension_array_dtype,
-is_integer_dtype,
 is_interval_dtype,
 is_number,
 is_numeric_dtype,
@@ -1336,7 +1335,7 @@ def assert_indexing_slices_equivalent(ser: Series, l_slc: slice, i_slc: slice) -

 assert_series_equal(ser.loc[l_slc], expected)

-if not is_integer_dtype(ser.index):
+if not ser.index.is_integer():
 # For integer indices, .loc and plain getitem are position-based.
 assert_series_equal(ser[l_slc], expected)
