
Commit 4ad16c9

Merge remote-tracking branch 'upstream/master' into depr-cdt-ordered-none
2 parents adc0bca + f5cc078 commit 4ad16c9

152 files changed: +967 -875 lines changed

.travis.yml

Lines changed: 8 additions & 0 deletions
@@ -86,6 +86,14 @@ install:
 - ci/submit_cython_cache.sh
 - echo "install done"
 
+before_script:
+# display server (for clipboard functionality) needs to be started here,
+# does not work if done in install:setup_env.sh (GH-26103)
+- export DISPLAY=":99.0"
+- echo "sh -e /etc/init.d/xvfb start"
+- sh -e /etc/init.d/xvfb start
+- sleep 3
+
 script:
 - echo "script start"
 - source activate pandas-dev

ci/azure/windows.yml

Lines changed: 7 additions & 8 deletions
@@ -17,16 +17,15 @@ jobs:
 CONDA_PY: "37"
 
 steps:
-- task: CondaEnvironment@1
-inputs:
-updateConda: no
-packageSpecs: ''
-
-- script: |
-ci\\incremental\\setup_conda_environment.cmd
-displayName: 'Before Install'
+- powershell: Write-Host "##vso[task.prependpath]$env:CONDA\Scripts"
+displayName: Add conda to PATH
+- script: conda update -q -n base conda
+displayName: Update conda
+- script: conda env create -q --file ci\\deps\\azure-windows-$(CONDA_PY).yaml
+displayName: Create anaconda environment
 - script: |
 call activate pandas-dev
+call conda list
 ci\\incremental\\build.cmd
 displayName: 'Build'
 - script: |

ci/code_checks.sh

Lines changed: 4 additions & 9 deletions
@@ -169,15 +169,6 @@ if [[ -z "$CHECK" || "$CHECK" == "patterns" ]]; then
 invgrep -r -E --include '*.py' '(unittest(\.| import )mock|mock\.Mock\(\)|mock\.patch)' pandas/tests/
 RET=$(($RET + $?)) ; echo $MSG "DONE"
 
-# Check that we use pytest.raises only as a context manager
-#
-# For any flake8-compliant code, the only way this regex gets
-# matched is if there is no "with" statement preceding "pytest.raises"
-MSG='Check for pytest.raises as context manager (a line starting with `pytest.raises` is invalid, needs a `with` to precede it)' ; echo $MSG
-MSG='TODO: This check is currently skipped because so many files fail this. Please enable when all are corrected (xref gh-24332)' ; echo $MSG
-# invgrep -R --include '*.py' -E '[[:space:]] pytest.raises' pandas/tests
-# RET=$(($RET + $?)) ; echo $MSG "DONE"
-
 MSG='Check for wrong space after code-block directive and before colon (".. code-block ::" instead of ".. code-block::")' ; echo $MSG
 invgrep -R --include="*.rst" ".. code-block ::" doc/source
 RET=$(($RET + $?)) ; echo $MSG "DONE"
@@ -239,6 +230,10 @@ if [[ -z "$CHECK" || "$CHECK" == "doctests" ]]; then
 pytest -q --doctest-modules pandas/core/groupby/groupby.py -k"-cumcount -describe -pipe"
 RET=$(($RET + $?)) ; echo $MSG "DONE"
 
+MSG='Doctests datetimes.py' ; echo $MSG
+pytest -q --doctest-modules pandas/core/tools/datetimes.py
+RET=$(($RET + $?)) ; echo $MSG "DONE"
+
 MSG='Doctests top-level reshaping functions' ; echo $MSG
 pytest -q --doctest-modules \
 pandas/core/reshape/concat.py \

ci/deps/azure-windows-37.yaml

Lines changed: 2 additions & 0 deletions
@@ -1,9 +1,11 @@
 name: pandas-dev
 channels:
 - defaults
+- conda-forge
 dependencies:
 - beautifulsoup4
 - bottleneck
+- gcsfs
 - html5lib
 - jinja2
 - lxml

ci/incremental/setup_conda_environment.cmd

Lines changed: 0 additions & 23 deletions
This file was deleted.

ci/setup_env.sh

Lines changed: 0 additions & 6 deletions
@@ -118,16 +118,10 @@ echo "conda list"
 conda list
 
 # Install DB for Linux
-export DISPLAY=":99."
 if [ ${TRAVIS_OS_NAME} == "linux" ]; then
 echo "installing dbs"
 mysql -e 'create database pandas_nosetest;'
 psql -c 'create database pandas_nosetest;' -U postgres
-
-echo
-echo "sh -e /etc/init.d/xvfb start"
-sh -e /etc/init.d/xvfb start
-sleep 3
 else
 echo "not using dbs on non-linux"
 fi

doc/source/ecosystem.rst

Lines changed: 5 additions & 0 deletions
@@ -285,6 +285,11 @@ provides a familiar ``DataFrame`` interface for out-of-core, parallel and distri
 
 Dask-ML enables parallel and distributed machine learning using Dask alongside existing machine learning libraries like Scikit-Learn, XGBoost, and TensorFlow.
 
+`Koalas <https://koalas.readthedocs.io/en/latest/>`__
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Koalas provides a familiar pandas DataFrame interface on top of Apache Spark. It enables users to leverage multi-cores on one machine or a cluster of machines to speed up or scale their DataFrame code.
+
 `Odo <http://odo.pydata.org>`__
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

doc/source/user_guide/computation.rst

Lines changed: 6 additions & 5 deletions
@@ -865,7 +865,7 @@ which is equivalent to using weights
 
 The difference between the above two variants arises because we are
 dealing with series which have finite history. Consider a series of infinite
-history:
+history, with ``adjust=True``:
 
 .. math::
 
@@ -884,10 +884,11 @@ and a ratio of :math:`1 - \alpha` we have
 &= \alpha x_t + (1 - \alpha)[x_{t-1} + (1 - \alpha) x_{t-2} + ...]\alpha\\
 &= \alpha x_t + (1 - \alpha) y_{t-1}
 
-which shows the equivalence of the above two variants for infinite series.
-When ``adjust=True`` we have :math:`y_0 = x_0` and from the last
-representation above we have :math:`y_t = \alpha x_t + (1 - \alpha) y_{t-1}`,
-therefore there is an assumption that :math:`x_0` is not an ordinary value
+which is the same expression as ``adjust=False`` above and therefore
+shows the equivalence of the two variants for infinite series.
+When ``adjust=False``, we have :math:`y_0 = x_0` and
+:math:`y_t = \alpha x_t + (1 - \alpha) y_{t-1}`.
+Therefore, there is an assumption that :math:`x_0` is not an ordinary value
 but rather an exponentially weighted moment of the infinite series up to that
 point.

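Note on the computation.rst change above: the reworded passage says the ``adjust=True`` and ``adjust=False`` forms agree for a series with infinite history. The following is a minimal pure-Python sketch of that claim, not the pandas implementation. The function names ``ewm_adjusted`` and ``ewm_recursive``, the sample data and ``alpha = 0.5`` are all assumptions made for illustration and do not appear in the commit.

```python
def ewm_adjusted(xs, alpha):
    """adjust=True form: weighted average with weights (1 - alpha)**i."""
    out = []
    for t in range(len(xs)):
        weights = [(1 - alpha) ** i for i in range(t + 1)]
        # weights[i] multiplies the observation i steps in the past
        num = sum(w * xs[t - i] for i, w in enumerate(weights))
        out.append(num / sum(weights))
    return out


def ewm_recursive(xs, alpha):
    """adjust=False form: y_0 = x_0, y_t = alpha*x_t + (1 - alpha)*y_{t-1}."""
    out = [xs[0]]
    for x in xs[1:]:
        out.append(alpha * x + (1 - alpha) * out[-1])
    return out


# Hypothetical bounded sample data, chosen only for illustration.
xs = [float((7 * k) % 5) for k in range(60)]
alpha = 0.5

adjusted = ewm_adjusted(xs, alpha)
recursive = ewm_recursive(xs, alpha)

# Absolute gap between the two variants at each position.
gaps = [abs(a - r) for a, r in zip(adjusted, recursive)]
print(gaps[0], gaps[1], gaps[-1])
```

Both variants start at ``x_0``, differ visibly at early positions, and the gap shrinks as the history grows, which is the finite-history effect the passage describes.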
doc/source/user_guide/indexing.rst

Lines changed: 1 addition & 1 deletion
@@ -868,7 +868,7 @@ You can also set using these same indexers.
 
 .. ipython:: python
 
-df.at[dates[-1] + 1, 0] = 7
+df.at[dates[-1] + pd.Timedelta('1 day'), 0] = 7
 df
 
 Boolean indexing

doc/source/user_guide/io.rst

Lines changed: 17 additions & 4 deletions
@@ -2341,10 +2341,10 @@ round-trippable manner.
 .. ipython:: python
 
 df = pd.DataFrame({'foo': [1, 2, 3, 4],
-'bar': ['a', 'b', 'c', 'd'],
-'baz': pd.date_range('2018-01-01', freq='d', periods=4),
-'qux': pd.Categorical(['a', 'b', 'c', 'c'])
-}, index=pd.Index(range(4), name='idx'))
+'bar': ['a', 'b', 'c', 'd'],
+'baz': pd.date_range('2018-01-01', freq='d', periods=4),
+'qux': pd.Categorical(['a', 'b', 'c', 'c'])
+}, index=pd.Index(range(4), name='idx'))
 df
 df.dtypes
 
@@ -2864,6 +2864,19 @@ of sheet names can simply be passed to ``read_excel`` with no loss in performanc
 data = pd.read_excel('path_to_file.xls', ['Sheet1', 'Sheet2'],
 index_col=None, na_values=['NA'])
 
+``ExcelFile`` can also be called with a ``xlrd.book.Book`` object
+as a parameter. This allows the user to control how the excel file is read.
+For example, sheets can be loaded on demand by calling ``xlrd.open_workbook()``
+with ``on_demand=True``.
+
+.. code-block:: python
+
+import xlrd
+xlrd_book = xlrd.open_workbook('path_to_file.xls', on_demand=True)
+with pd.ExcelFile(xlrd_book) as xls:
+df1 = pd.read_excel(xls, 'Sheet1')
+df2 = pd.read_excel(xls, 'Sheet2')
+
 .. _io.excel.specifying_sheets:
 
 Specifying Sheets

doc/source/whatsnew/v0.11.0.rst

Lines changed: 1 addition & 6 deletions
@@ -238,14 +238,9 @@ Enhancements
 
 - support ``read_hdf/to_hdf`` API similar to ``read_csv/to_csv``
 
-.. ipython:: python
-:suppress:
-
-from pandas.compat import lrange
-
 .. ipython:: python
 
-df = pd.DataFrame({'A': lrange(5), 'B': lrange(5)})
+df = pd.DataFrame({'A': range(5), 'B': range(5)})
 df.to_hdf('store.h5', 'table', append=True)
 pd.read_hdf('store.h5', 'table', where=['index > 2'])

doc/source/whatsnew/v0.12.0.rst

Lines changed: 1 addition & 6 deletions
@@ -83,13 +83,8 @@ API changes
 ``iloc`` API to be *purely* positional based.
 
 .. ipython:: python
-:suppress:
 
-from pandas.compat import lrange
-
-.. ipython:: python
-
-df = pd.DataFrame(lrange(5), list('ABCDE'), columns=['a'])
+df = pd.DataFrame(range(5), index=list('ABCDE'), columns=['a'])
 mask = (df.a % 2 == 0)
 mask

doc/source/whatsnew/v0.25.0.rst

Lines changed: 16 additions & 7 deletions
@@ -41,6 +41,7 @@ Other Enhancements
 - :meth:`DataFrame.query` and :meth:`DataFrame.eval` now supports quoting column names with backticks to refer to names with spaces (:issue:`6508`)
 - :func:`merge_asof` now gives a more clear error message when merge keys are categoricals that are not equal (:issue:`26136`)
 - :meth:`pandas.core.window.Rolling` supports exponential (or Poisson) window type (:issue:`21303`)
+-
 
 .. _whatsnew_0250.api_breaking:
 
@@ -249,15 +250,18 @@ Other API Changes
 - Comparing :class:`Timestamp` with unsupported objects now returns :py:obj:`NotImplemented` instead of raising ``TypeError``. This implies that unsupported rich comparisons are delegated to the other object, and are now consistent with Python 3 behavior for ``datetime`` objects (:issue:`24011`)
 - Bug in :meth:`DatetimeIndex.snap` which didn't preserving the ``name`` of the input :class:`Index` (:issue:`25575`)
 - The ``arg`` argument in :meth:`pandas.core.groupby.DataFrameGroupBy.agg` has been renamed to ``func`` (:issue:`26089`)
+- The ``arg`` argument in :meth:`pandas.core.window._Window.aggregate` has been renamed to ``func`` (:issue:`26372`)
+- Most Pandas classes had a ``__bytes__`` method, which was used for getting a python2-style bytestring representation of the object. This method has been removed as a part of dropping Python2 (:issue:`26447`)
 
 .. _whatsnew_0250.deprecations:
 
 Deprecations
 ~~~~~~~~~~~~
 
+- The deprecated ``.ix[]`` indexer now raises a more visible FutureWarning instead of DeprecationWarning (:issue:`26438`).
 - Deprecated the ``units=M`` (months) and ``units=Y`` (year) parameters for ``units`` of :func:`pandas.to_timedelta`, :func:`pandas.Timedelta` and :func:`pandas.TimedeltaIndex` (:issue:`16344`)
 - The functions :func:`pandas.to_datetime` and :func:`pandas.to_timedelta` have deprecated the ``box`` keyword. Instead, use :meth:`to_numpy` or :meth:`Timestamp.to_datetime64` or :meth:`Timedelta.to_timedelta64`. (:issue:`24416`)
-- The default value ``ordered=None`` in :class:`~pandas.api.types.CategoricalDtype` has been deprecated in favor of ``ordered=False``. When converting between categorical types ``ordered=True`` must be explicitly passed in order to be preserved. (:issue:`26336`)
+
 
 .. _whatsnew_0250.prior_deprecations:
 
@@ -294,11 +298,10 @@ Bug Fixes
 ~~~~~~~~~
 
 
-
 Categorical
 ^^^^^^^^^^^
 
--
+- Bug in :func:`DataFrame.at` and :func:`Series.at` that would raise exception if the index was a :class:`CategoricalIndex` (:issue:`20629`)
 -
 -
 
@@ -318,7 +321,7 @@ Timedelta
 
 - Bug in :func:`TimedeltaIndex.intersection` where for non-monotonic indices in some cases an empty ``Index`` was returned when in fact an intersection existed (:issue:`25913`)
 - Bug with comparisons between :class:`Timedelta` and ``NaT`` raising ``TypeError`` (:issue:`26039`)
--
+- Bug when adding or subtracting a :class:`BusinessHour` to a :class:`Timestamp` with the resulting time landing in a following or prior day respectively (:issue:`26381`)
 
 Timezones
 ^^^^^^^^^
@@ -337,6 +340,7 @@ Numeric
 
 - Bug in :meth:`to_numeric` in which large negative numbers were being improperly handled (:issue:`24910`)
 - Bug in :meth:`to_numeric` in which numbers were being coerced to float, even though ``errors`` was not ``coerce`` (:issue:`24910`)
+- Bug in :meth:`to_numeric` in which invalid values for ``errors`` were being allowed (:issue:`26466`)
 - Bug in :class:`format` in which floating point complex numbers were not being formatted to proper display precision and trimming (:issue:`25514`)
 - Bug in error messages in :meth:`DataFrame.corr` and :meth:`Series.corr`. Added the possibility of using a callable. (:issue:`25729`)
 - Bug in :meth:`Series.divmod` and :meth:`Series.rdivmod` which would raise an (incorrect) ``ValueError`` rather than return a pair of :class:`Series` objects as result (:issue:`25557`)
@@ -374,7 +378,9 @@ Indexing
 - Improved exception message when calling :meth:`DataFrame.iloc` with a list of non-numeric objects (:issue:`25753`).
 - Bug in :meth:`DataFrame.loc` and :meth:`Series.loc` where ``KeyError`` was not raised for a ``MultiIndex`` when the key was less than or equal to the number of levels in the :class:`MultiIndex` (:issue:`14885`).
 - Bug in which :meth:`DataFrame.append` produced an erroneous warning indicating that a ``KeyError`` will be thrown in the future when the data to be appended contains new columns (:issue:`22252`).
--
+- Bug in which :meth:`DataFrame.to_csv` caused a segfault for a reindexed data frame, when the indices were single-level :class:`MultiIndex` (:issue:`26303`).
+- Fixed bug where assigning a :class:`arrays.PandasArray` to a :class:`pandas.core.frame.DataFrame` would raise error (:issue:`26390`)
+- Allow keyword arguments for callable local reference used in the :method:`DataFrame.query` string (:issue:`26426`)
 
 
 Missing
@@ -441,7 +447,9 @@ Groupby/Resample/Rolling
 - Bug in :meth:`pandas.core.groupby.GroupBy.idxmax` and :meth:`pandas.core.groupby.GroupBy.idxmin` with datetime column would return incorrect dtype (:issue:`25444`, :issue:`15306`)
 - Bug in :meth:`pandas.core.groupby.GroupBy.cumsum`, :meth:`pandas.core.groupby.GroupBy.cumprod`, :meth:`pandas.core.groupby.GroupBy.cummin` and :meth:`pandas.core.groupby.GroupBy.cummax` with categorical column having absent categories, would return incorrect result or segfault (:issue:`16771`)
 - Bug in :meth:`pandas.core.groupby.GroupBy.nth` where NA values in the grouping would return incorrect results (:issue:`26011`)
-
+- Bug in :meth:`pandas.core.groupby.SeriesGroupBy.transform` where transforming an empty group would raise error (:issue:`26208`)
+- Bug in :meth:`pandas.core.frame.DataFrame.groupby` where passing a :class:`pandas.core.groupby.grouper.Grouper` would return incorrect groups when using the ``.groups`` accessor (:issue:`26326`)
+- Bug in :meth:`pandas.core.groupby.GroupBy.agg` where incorrect results are returned for uint64 columns. (:issue:`26310`)
 
 Reshaping
 ^^^^^^^^^
@@ -453,8 +461,10 @@ Reshaping
 - Bug in :func:`pivot_table` where columns with ``NaN`` values are dropped even if ``dropna`` argument is ``False``, when the ``aggfunc`` argument contains a ``list`` (:issue:`22159`)
 - Bug in :func:`concat` where the resulting ``freq`` of two :class:`DatetimeIndex` with the same ``freq`` would be dropped (:issue:`3232`).
 - Bug in :func:`merge` where merging with equivalent Categorical dtypes was raising an error (:issue:`22501`)
+- bug in :class:`DataFrame` instantiating with a dict of iterators or generators (e.g. ``pd.DataFrame({'A': reversed(range(3))})``) raised an error (:issue:`26349`).
 - bug in :class:`DataFrame` instantiating with a ``range`` (e.g. ``pd.DataFrame(range(3))``) raised an error (:issue:`26342`).
 - Bug in :class:`DataFrame` constructor when passing non-empty tuples would cause a segmentation fault (:issue:`25691`)
+- Bug in :func:`Series.apply` failed when the series is a timezone aware :class:`DatetimeIndex` (:issue:`25959`)
 - Bug in :func:`pandas.cut` where large bins could incorrectly raise an error due to an integer overflow (:issue:`26045`)
 - Bug in :func:`DataFrame.sort_index` where an error is thrown when a multi-indexed DataFrame is sorted on all levels with the initial level sorted last (:issue:`26053`)
 - Bug in :meth:`Series.nlargest` treats ``True`` as smaller than ``False`` (:issue:`26154`)
@@ -474,7 +484,6 @@ Other
 - Bug in :func:`factorize` when passing an ``ExtensionArray`` with a custom ``na_sentinel`` (:issue:`25696`).
 - Allow :class:`Index` and :class:`RangeIndex` to be passed to numpy ``min`` and ``max`` functions.
 
-
 .. _whatsnew_0.250.contributors:
 
 Contributors

mypy.ini

Lines changed: 1 addition & 40 deletions
@@ -15,43 +15,4 @@ ignore_errors=True
 ignore_errors=True
 
 [mypy-pandas.core.indexes.timedeltas]
-ignore_errors=True
-
-[mypy-pandas.core.indexing]
-ignore_errors=True
-
-[mypy-pandas.core.internals.blocks]
-ignore_errors=True
-
-[mypy-pandas.core.ops]
-ignore_errors=True
-
-[mypy-pandas.core.panel]
-ignore_errors=True
-
-[mypy-pandas.core.resample]
-ignore_errors=True
-
-[mypy-pandas.core.reshape.merge]
-ignore_errors=True
-
-[mypy-pandas.core.reshape.reshape]
-ignore_errors=True
-
-[mypy-pandas.core.series]
-ignore_errors=True
-
-[mypy-pandas.core.util.hashing]
-ignore_errors=True
-
-[mypy-pandas.core.window]
-ignore_errors=True
-
-[mypy-pandas.io.pytables]
-ignore_errors=True
-
-[mypy-pandas.util._doctools]
-ignore_errors=True
-
-[mypy-pandas.util.testing]
-ignore_errors=True
+ignore_errors=True

pandas/_libs/parsers.pyx

Lines changed: 0 additions & 3 deletions
@@ -149,9 +149,6 @@ cdef extern from "parser/tokenizer.h":
 int skipinitialspace # ignore spaces following delimiter? */
 int quoting # style of quoting to write */
 
-# hmm =/
-# int numeric_field
-
 char commentchar
 int allow_embedded_newline
 int strict # raise exception on bad CSV */
