Commit 5b459d2

Resolved merge conflicts in pandas/io/pickle.py and doc/source/whatsnew/v1.0.0.rst.
2 parents 44087ec + 3577b5a commit 5b459d2

139 files changed

+1726
-1176
lines changed

.travis.yml

Lines changed: 19 additions & 16 deletions
@@ -30,31 +30,34 @@ matrix:
  - python: 3.5

  include:
- - dist: trusty
- env:
+ - env:
  - JOB="3.8" ENV_FILE="ci/deps/travis-38.yaml" PATTERN="(not slow and not network)"

- - dist: trusty
- env:
+ - env:
  - JOB="3.7" ENV_FILE="ci/deps/travis-37.yaml" PATTERN="(not slow and not network)"

- - dist: trusty
- env:
- - JOB="3.6, locale" ENV_FILE="ci/deps/travis-36-locale.yaml" PATTERN="((not slow and not network) or (single and db))" LOCALE_OVERRIDE="zh_CN.UTF-8"
+ - env:
+ - JOB="3.6, locale" ENV_FILE="ci/deps/travis-36-locale.yaml" PATTERN="((not slow and not network) or (single and db))" LOCALE_OVERRIDE="zh_CN.UTF-8" SQL="1"
+ services:
+ - mysql
+ - postgresql

- - dist: trusty
- env:
- - JOB="3.6, coverage" ENV_FILE="ci/deps/travis-36-cov.yaml" PATTERN="((not slow and not network) or (single and db))" PANDAS_TESTING_MODE="deprecate" COVERAGE=true
+ - env:
+ - JOB="3.6, coverage" ENV_FILE="ci/deps/travis-36-cov.yaml" PATTERN="((not slow and not network) or (single and db))" PANDAS_TESTING_MODE="deprecate" COVERAGE=true SQL="1"
+ services:
+ - mysql
+ - postgresql

  # In allow_failures
- - dist: trusty
- env:
- - JOB="3.6, slow" ENV_FILE="ci/deps/travis-36-slow.yaml" PATTERN="slow"
+ - env:
+ - JOB="3.6, slow" ENV_FILE="ci/deps/travis-36-slow.yaml" PATTERN="slow" SQL="1"
+ services:
+ - mysql
+ - postgresql

  allow_failures:
- - dist: trusty
- env:
- - JOB="3.6, slow" ENV_FILE="ci/deps/travis-36-slow.yaml" PATTERN="slow"
+ - env:
+ - JOB="3.6, slow" ENV_FILE="ci/deps/travis-36-slow.yaml" PATTERN="slow" SQL="1"

  before_install:
  - echo "before_install"

README.md

Lines changed: 1 addition & 1 deletion
@@ -124,7 +124,7 @@ Here are just a few of the things that pandas does well:
  and saving/loading data from the ultrafast [**HDF5 format**][hdfstore]
  - [**Time series**][timeseries]-specific functionality: date range
  generation and frequency conversion, moving window statistics,
- moving window linear regressions, date shifting and lagging, etc.
+ date shifting and lagging.


  [missing-data]: https://pandas.pydata.org/pandas-docs/stable/missing_data.html#working-with-missing-data

ci/azure/windows.yml

Lines changed: 2 additions & 1 deletion
@@ -31,7 +31,8 @@ jobs:
  - bash: |
  source activate pandas-dev
  conda list
- ci\\incremental\\build.cmd
+ python setup.py build_ext -q -i
+ python -m pip install --no-build-isolation -e .
  displayName: 'Build'
  - bash: |
  source activate pandas-dev

ci/code_checks.sh

Lines changed: 6 additions & 2 deletions
@@ -52,7 +52,7 @@ if [[ -z "$CHECK" || "$CHECK" == "lint" ]]; then
  black --version

  MSG='Checking black formatting' ; echo $MSG
- black . --check
+ black . --check
  RET=$(($RET + $?)) ; echo $MSG "DONE"

  # `setup.cfg` contains the list of error codes that are being ignored in flake8
@@ -104,7 +104,7 @@ if [[ -z "$CHECK" || "$CHECK" == "lint" ]]; then
  isort --version-number

  # Imports - Check formatting using isort see setup.cfg for settings
- MSG='Check import format using isort ' ; echo $MSG
+ MSG='Check import format using isort' ; echo $MSG
  ISORT_CMD="isort --recursive --check-only pandas asv_bench"
  if [[ "$GITHUB_ACTIONS" == "true" ]]; then
  eval $ISORT_CMD | awk '{print "##[error]" $0}'; RET=$(($RET + ${PIPESTATUS[0]}))
@@ -203,6 +203,10 @@ if [[ -z "$CHECK" || "$CHECK" == "patterns" ]]; then
  invgrep -R --include=*.{py,pyx} '\.__class__' pandas
  RET=$(($RET + $?)) ; echo $MSG "DONE"

+ MSG='Check for use of xrange instead of range' ; echo $MSG
+ invgrep -R --include=*.{py,pyx} 'xrange' pandas
+ RET=$(($RET + $?)) ; echo $MSG "DONE"
+
  MSG='Check that no file in the repo contains trailing whitespaces' ; echo $MSG
  INVGREP_APPEND=" <- trailing whitespaces found"
  invgrep -RI --exclude=\*.{svg,c,cpp,html,js} --exclude-dir=env "\s$" *
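
The new `xrange` pattern check follows the same "inverse grep" idea as the other pattern checks: the step fails when the pattern is found. The Python sketch below illustrates that idea only; `find_pattern` and `inverse_check` are hypothetical helper names, not part of `ci/code_checks.sh`, and the real `invgrep` shell function may behave differently in detail.

```python
import re
from pathlib import Path


def find_pattern(root, pattern, suffixes=(".py", ".pyx")):
    """Return (path, line_number, line) for every source line matching pattern."""
    regex = re.compile(pattern)
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Only scan regular files with the requested extensions.
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="replace")
            for lineno, line in enumerate(text.splitlines(), start=1):
                if regex.search(line):
                    hits.append((str(path), lineno, line.strip()))
    return hits


def inverse_check(root, pattern):
    """Return 0 when the pattern is absent (check passes), 1 when it is found."""
    return 1 if find_pattern(root, pattern) else 0
```

A caller could add the return value of `inverse_check("pandas", r"xrange")` to a running failure count, mirroring how the script accumulates `RET`.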

ci/incremental/build.cmd

Lines changed: 0 additions & 9 deletions
This file was deleted.

ci/run_tests.sh

Lines changed: 2 additions & 2 deletions
@@ -38,6 +38,6 @@ sh -c "$PYTEST_CMD"

  if [[ "$COVERAGE" && $? == 0 && "$TRAVIS_BRANCH" == "master" ]]; then
  echo "uploading coverage"
- echo "bash <(curl -s https://codecov.io/bash) -Z -c -F $TYPE -f $COVERAGE_FNAME"
- bash <(curl -s https://codecov.io/bash) -Z -c -F $TYPE -f $COVERAGE_FNAME
+ echo "bash <(curl -s https://codecov.io/bash) -Z -c -f $COVERAGE_FNAME"
+ bash <(curl -s https://codecov.io/bash) -Z -c -f $COVERAGE_FNAME
  fi

ci/setup_env.sh

Lines changed: 2 additions & 1 deletion
@@ -140,7 +140,8 @@ echo "conda list"
  conda list

  # Install DB for Linux
- if [ "${TRAVIS_OS_NAME}" == "linux" ]; then
+
+ if [[ -n ${SQL:0} ]]; then
  echo "installing dbs"
  mysql -e 'create database pandas_nosetest;'
  psql -c 'create database pandas_nosetest;' -U postgres

doc/redirects.csv

Lines changed: 1 addition & 1 deletion
@@ -777,7 +777,7 @@ generated/pandas.io.formats.style.Styler.to_excel,../reference/api/pandas.io.for
  generated/pandas.io.formats.style.Styler.use,../reference/api/pandas.io.formats.style.Styler.use
  generated/pandas.io.formats.style.Styler.where,../reference/api/pandas.io.formats.style.Styler.where
  generated/pandas.io.json.build_table_schema,../reference/api/pandas.io.json.build_table_schema
- generated/pandas.io.json.json_normalize,../reference/api/pandas.io.json.json_normalize
+ generated/pandas.io.json.json_normalize,../reference/api/pandas.json_normalize
  generated/pandas.io.stata.StataReader.data_label,../reference/api/pandas.io.stata.StataReader.data_label
  generated/pandas.io.stata.StataReader.value_labels,../reference/api/pandas.io.stata.StataReader.value_labels
  generated/pandas.io.stata.StataReader.variable_labels,../reference/api/pandas.io.stata.StataReader.variable_labels

doc/source/_static/favicon.ico

-3.81 KB
Binary file not shown.

doc/source/conf.py

Lines changed: 6 additions & 2 deletions
@@ -204,7 +204,11 @@
  # Theme options are theme-specific and customize the look and feel of a theme
  # further. For a list of options available for each theme, see the
  # documentation.
- # html_theme_options = {}
+ html_theme_options = {
+ "external_links": [],
+ "github_url": "https://github.com/pandas-dev/pandas",
+ "twitter_url": "https://twitter.com/pandas_dev",
+ }

  # Add any paths that contain custom themes here, relative to this directory.
  # html_theme_path = ["themes"]
@@ -228,7 +232,7 @@
  # The name of an image file (within the static path) to use as favicon of the
  # docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32
  # pixels large.
- html_favicon = os.path.join(html_static_path[0], "favicon.ico")
+ html_favicon = "../../web/pandas/static/img/favicon.ico"

  # If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
  # using the given strftime format.

doc/source/getting_started/overview.rst

Lines changed: 1 addition & 2 deletions
@@ -57,8 +57,7 @@ Here are just a few of the things that pandas does well:
  Excel files, databases, and saving / loading data from the ultrafast **HDF5
  format**
  - **Time series**-specific functionality: date range generation and frequency
- conversion, moving window statistics, moving window linear regressions,
- date shifting and lagging, etc.
+ conversion, moving window statistics, date shifting and lagging.

  Many of these principles are here to address the shortcomings frequently
  experienced using other languages / scientific research environments. For data

doc/source/reference/io.rst

Lines changed: 1 addition & 1 deletion
@@ -50,13 +50,13 @@ JSON
  :toctree: api/

  read_json
+ json_normalize

  .. currentmodule:: pandas.io.json

  .. autosummary::
  :toctree: api/

- json_normalize
  build_table_schema

  .. currentmodule:: pandas

doc/source/user_guide/io.rst

Lines changed: 10 additions & 12 deletions
@@ -2136,27 +2136,26 @@ into a flat table.

  .. ipython:: python

- from pandas.io.json import json_normalize
  data = [{'id': 1, 'name': {'first': 'Coleen', 'last': 'Volk'}},
  {'name': {'given': 'Mose', 'family': 'Regner'}},
  {'id': 2, 'name': 'Faye Raker'}]
- json_normalize(data)
+ pd.json_normalize(data)

  .. ipython:: python

  data = [{'state': 'Florida',
  'shortname': 'FL',
  'info': {'governor': 'Rick Scott'},
- 'counties': [{'name': 'Dade', 'population': 12345},
- {'name': 'Broward', 'population': 40000},
- {'name': 'Palm Beach', 'population': 60000}]},
+ 'county': [{'name': 'Dade', 'population': 12345},
+ {'name': 'Broward', 'population': 40000},
+ {'name': 'Palm Beach', 'population': 60000}]},
  {'state': 'Ohio',
  'shortname': 'OH',
  'info': {'governor': 'John Kasich'},
- 'counties': [{'name': 'Summit', 'population': 1234},
- {'name': 'Cuyahoga', 'population': 1337}]}]
+ 'county': [{'name': 'Summit', 'population': 1234},
+ {'name': 'Cuyahoga', 'population': 1337}]}]

- json_normalize(data, 'counties', ['state', 'shortname', ['info', 'governor']])
+ pd.json_normalize(data, 'county', ['state', 'shortname', ['info', 'governor']])

  The max_level parameter provides more control over which level to end normalization.
  With max_level=1 the following snippet normalizes until 1st nesting level of the provided dict.
@@ -2169,7 +2168,7 @@ With max_level=1 the following snippet normalizes until 1st nesting level of the
  'Name': 'Name001'}},
  'Image': {'a': 'b'}
  }]
- json_normalize(data, max_level=1)
+ pd.json_normalize(data, max_level=1)

  .. _io.jsonl:

@@ -4764,10 +4763,10 @@ Parquet supports partitioning of data based on the values of one or more columns
  .. ipython:: python

  df = pd.DataFrame({'a': [0, 0, 1, 1], 'b': [0, 1, 0, 1]})
- df.to_parquet(fname='test', engine='pyarrow',
+ df.to_parquet(path='test', engine='pyarrow',
  partition_cols=['a'], compression=None)

- The `fname` specifies the parent directory to which data will be saved.
+ The `path` specifies the parent directory to which data will be saved.
  The `partition_cols` are the column names by which the dataset will be partitioned.
  Columns are partitioned in the order they are given. The partition splits are
  determined by the unique values in the partition columns.
@@ -4829,7 +4828,6 @@ See also some :ref:`cookbook examples <cookbook.sql>` for some advanced strategi
  The key functions are:

  .. autosummary::
- :toctree: ../reference/api/

  read_sql_table
  read_sql_query
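
As a standalone illustration of the top-level `pd.json_normalize` call that this file's examples switch to, the sketch below flattens a trimmed copy of the `county` data. It assumes pandas 1.0 or later, where the function is exposed in the top-level namespace; the keyword names `record_path` and `meta` are used in place of the positional arguments shown in the diff.

```python
import pandas as pd

# Nested records modeled on the user-guide example in this diff (trimmed).
data = [
    {
        "state": "Florida",
        "shortname": "FL",
        "info": {"governor": "Rick Scott"},
        "county": [
            {"name": "Dade", "population": 12345},
            {"name": "Broward", "population": 40000},
        ],
    },
    {
        "state": "Ohio",
        "shortname": "OH",
        "info": {"governor": "John Kasich"},
        "county": [{"name": "Summit", "population": 1234}],
    },
]

# One output row per entry of "county"; the meta fields are copied from the
# parent record onto each of those rows. A nested meta path such as
# ["info", "governor"] becomes the dotted column "info.governor".
flat = pd.json_normalize(
    data,
    record_path="county",
    meta=["state", "shortname", ["info", "governor"]],
)
print(flat)
```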

doc/source/user_guide/text.rst

Lines changed: 13 additions & 2 deletions
@@ -74,6 +74,7 @@ These are places where the behavior of ``StringDtype`` objects differ from
  1. For ``StringDtype``, :ref:`string accessor methods<api.series.str>`
  that return **numeric** output will always return a nullable integer dtype,
  rather than either int or float dtype, depending on the presence of NA values.
+ Methods returning **boolean** output will return a nullable boolean dtype.

  .. ipython:: python

@@ -89,12 +90,22 @@ 1. For ``StringDtype``, :ref:`string accessor methods<api.series.str>`
  s.astype(object).str.count("a")
  s.astype(object).dropna().str.count("a")

- When NA values are present, the output dtype is float64.
+ When NA values are present, the output dtype is float64. Similarly for
+ methods returning boolean values.
+
+ .. ipython:: python
+
+ s.str.isdigit()
+ s.str.match("a")

  2. Some string methods, like :meth:`Series.str.decode` are not available
  on ``StringArray`` because ``StringArray`` only holds strings, not
  bytes.
-
+ 3. In comparision operations, :class:`arrays.StringArray` and ``Series`` backed
+ by a ``StringArray`` will return an object with :class:`BooleanDtype`,
+ rather than a ``bool`` dtype object. Missing values in a ``StringArray``
+ will propagate in comparision operations, rather than always comparing
+ unequal like :attr:`numpy.nan`.

  Everything else that follows in the rest of this document applies equally to
  ``string`` and ``object`` dtype.
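
The behavior described in the added documentation lines can be tried with a short sketch like the one below. It assumes pandas 1.0 or later, where the `"string"` dtype alias is available; the sample values are made up for illustration.

```python
import pandas as pd

# A StringDtype series with one missing value.
s = pd.Series(["a", None, "1"], dtype="string")

# A boolean-returning string method and a comparison against a scalar.
digits = s.str.isdigit()
equal_a = s == "a"

# Both results use a nullable boolean dtype, and the missing value
# propagates instead of being reported as False.
print(digits.dtype, equal_a.dtype)
print(equal_a.isna().tolist())
```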

doc/source/whatsnew/v0.25.0.rst

Lines changed: 1 addition & 1 deletion
@@ -170,7 +170,7 @@ which level to end normalization (:issue:`23843`):

  The repr now looks like this:

- .. ipython:: python
+ .. code-block:: ipython

  from pandas.io.json import json_normalize
  data = [{

doc/source/whatsnew/v1.0.0.rst

File mode changed from 100755 to 100644
Lines changed: 19 additions & 8 deletions
@@ -205,6 +205,9 @@ Other enhancements
  (:meth:`~DataFrame.to_parquet` / :func:`read_parquet`) using the `'pyarrow'` engine
  now preserve those data types with pyarrow >= 1.0.0 (:issue:`20612`).
  - The ``partition_cols`` argument in :meth:`DataFrame.to_parquet` now accepts a string (:issue:`27117`)
+ - :func:`to_parquet` now appropriately handles the ``schema`` argument for user defined schemas in the pyarrow engine. (:issue: `30270`)
+ - DataFrame constructor preserve `ExtensionArray` dtype with `ExtensionArray` (:issue:`11363`)
+

  Build Changes
  ^^^^^^^^^^^^^
@@ -252,10 +255,10 @@ To update, use ``MultiIndex.set_names``, which returns a new ``MultiIndex``.
  mi2 = mi.set_names("new name", level=0)
  mi2.names

- New repr for :class:`pandas.core.arrays.IntervalArray`
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ New repr for :class:`~pandas.arrays.IntervalArray`
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- - :class:`pandas.core.arrays.IntervalArray` adopts a new ``__repr__`` in accordance with other array classes (:issue:`25022`)
+ - :class:`pandas.arrays.IntervalArray` adopts a new ``__repr__`` in accordance with other array classes (:issue:`25022`)

  *pandas 0.25.x*

@@ -486,6 +489,7 @@ Documentation Improvements
  Deprecations
  ~~~~~~~~~~~~

+ - :meth:`Series.item` and :meth:`Index.item` have been _undeprecated_ (:issue:`29250`)
  - ``Index.set_value`` has been deprecated. For a given index ``idx``, array ``arr``,
  value in ``idx`` of ``idx_val`` and a new value of ``val``, ``idx.set_value(arr, idx_val, val)``
  is equivalent to ``arr[idx.get_loc(idx_val)] = val``, which should be used instead (:issue:`28621`).
@@ -495,7 +499,11 @@ Deprecations
  - The parameter ``numeric_only`` of :meth:`Categorical.min` and :meth:`Categorical.max` is deprecated and replaced with ``skipna`` (:issue:`25303`)
  - The parameter ``label`` in :func:`lreshape` has been deprecated and will be removed in a future version (:issue:`29742`)
  - ``pandas.core.index`` has been deprecated and will be removed in a future version, the public classes are available in the top-level namespace (:issue:`19711`)
- -
+ - :func:`pandas.json_normalize` is now exposed in the top-level namespace.
+ Usage of ``json_normalize`` as ``pandas.io.json.json_normalize`` is now deprecated and
+ it is recommended to use ``json_normalize`` as :func:`pandas.json_normalize` instead (:issue:`27586`).
+ - :meth:`DataFrame.to_stata`, :meth:`DataFrame.to_feather`, and :meth:`DataFrame.to_parquet` argument "fname" is deprecated, use "path" instead (:issue:`23574`)
+

  .. _whatsnew_1000.prior_deprecations:

@@ -571,7 +579,7 @@ or ``matplotlib.Axes.plot``. See :ref:`plotting.formatters` for more.
  - :meth:`Series.where` with ``Categorical`` dtype (or :meth:`DataFrame.where` with ``Categorical`` column) no longer allows setting new categories (:issue:`24114`)
  - :class:`DatetimeIndex`, :class:`TimedeltaIndex`, and :class:`PeriodIndex` constructors no longer allow ``start``, ``end``, and ``periods`` keywords, use :func:`date_range`, :func:`timedelta_range`, and :func:`period_range` instead (:issue:`23919`)
  - :class:`DatetimeIndex` and :class:`TimedeltaIndex` constructors no longer have a ``verify_integrity`` keyword argument (:issue:`23919`)
- - :func:`core.internals.blocks.make_block` no longer accepts the "fastpath" keyword(:issue:`19265`)
+ - ``pandas.core.internals.blocks.make_block`` no longer accepts the "fastpath" keyword(:issue:`19265`)
  - :meth:`Block.make_block_same_class` no longer accepts the "dtype" keyword(:issue:`19434`)
  - Removed the previously deprecated :meth:`ExtensionArray._formatting_values`. Use :attr:`ExtensionArray._formatter` instead. (:issue:`23601`)
  - Removed the previously deprecated :meth:`MultiIndex.to_hierarchical` (:issue:`21613`)
@@ -648,7 +656,7 @@ Performance improvements
  ~~~~~~~~~~~~~~~~~~~~~~~~

  - Performance improvement in indexing with a non-unique :class:`IntervalIndex` (:issue:`27489`)
- - Performance improvement in `MultiIndex.is_monotonic` (:issue:`27495`)
+ - Performance improvement in :attr:`MultiIndex.is_monotonic` (:issue:`27495`)
  - Performance improvement in :func:`cut` when ``bins`` is an :class:`IntervalIndex` (:issue:`27668`)
  - Performance improvement when initializing a :class:`DataFrame` using a ``range`` (:issue:`30171`)
  - Performance improvement in :meth:`DataFrame.corr` when ``method`` is ``"spearman"`` (:issue:`28139`)
@@ -703,6 +711,8 @@ Datetimelike
  - Bug in :attr:`Timestamp.resolution` being a property instead of a class attribute (:issue:`29910`)
  - Bug in :func:`pandas.to_datetime` when called with ``None`` raising ``TypeError`` instead of returning ``NaT`` (:issue:`30011`)
  - Bug in :func:`pandas.to_datetime` failing for `deques` when using ``cache=True`` (the default) (:issue:`29403`)
+ - Bug in :meth:`Series.item` with ``datetime64`` or ``timedelta64`` dtype, :meth:`DatetimeIndex.item`, and :meth:`TimedeltaIndex.item` returning an integer instead of a :class:`Timestamp` or :class:`Timedelta` (:issue:`30175`)
+ - Bug in :class:`DatetimeIndex` addition when adding a non-optimized :class:`DateOffset` incorrectly dropping timezone information (:issue:`30336`)

  Timedelta
  ^^^^^^^^^
@@ -749,7 +759,7 @@ Interval
  ^^^^^^^^

  - Bug in :meth:`IntervalIndex.get_indexer` where a :class:`Categorical` or :class:`CategoricalIndex` ``target`` would incorrectly raise a ``TypeError`` (:issue:`30063`)
- -
+ - Bug in ``pandas.core.dtypes.cast.infer_dtype_from_scalar`` where passing ``pandas_dtype=True`` did not infer :class:`IntervalDtype` (:issue:`30337`)

  Indexing
  ^^^^^^^^
@@ -813,7 +823,7 @@ Plotting
  - Bug in the ``xticks`` argument being ignored for :meth:`DataFrame.plot.bar` (:issue:`14119`)
  - :func:`set_option` now validates that the plot backend provided to ``'plotting.backend'`` implements the backend when the option is set, rather than when a plot is created (:issue:`28163`)
  - :meth:`DataFrame.plot` now allow a ``backend`` keyword arugment to allow changing between backends in one session (:issue:`28619`).
- - Bug in color validation incorrectly raising for non-color styles (:issue:`29122`).
+ - Bug in color validation incorrectly raising for non-color styles (:issue:`30163`).

  Groupby/resample/rolling
  ^^^^^^^^^^^^^^^^^^^^^^^^
@@ -833,6 +843,7 @@ Groupby/resample/rolling
  - Bug in :meth:`DataFrame.groupby` where ``any``, ``all``, ``nunique`` and transform functions would incorrectly handle duplicate column labels (:issue:`21668`)
  - Bug in :meth:`DataFrameGroupBy.agg` with timezone-aware datetime64 column incorrectly casting results to the original dtype (:issue:`29641`)
  - Bug in :meth:`DataFrame.groupby` when using axis=1 and having a single level columns index (:issue:`30208`)
+ - Bug in :meth:`DataFrame.groupby` when using nunique on axis=1 (:issue:`30253`)

  Reshaping
  ^^^^^^^^^
