
Commit 9b92536

Merge remote-tracking branch 'upstream/master' into mcmali-s3-pub-test
2 parents e308bf8 + d5139bb commit 9b92536


65 files changed: 926 additions, 451 deletions

.devcontainer.json

Lines changed: 3 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -17,7 +17,9 @@
1717
"python.linting.pylintEnabled": false,
1818
"python.linting.mypyEnabled": true,
1919
"python.testing.pytestEnabled": true,
20-
"python.testing.cwd": "pandas/tests"
20+
"python.testing.pytestArgs": [
21+
"pandas"
22+
]
2123
},
2224

2325
// Add the IDs of extensions you want installed when the container is created in the array below.

ci/deps/azure-36-locale.yaml

Lines changed: 0 additions & 2 deletions
@@ -15,7 +15,6 @@ dependencies:
1515

1616
# pandas dependencies
1717
- beautifulsoup4
18-
- gcsfs
1918
- html5lib
2019
- ipython
2120
- jinja2
@@ -31,7 +30,6 @@ dependencies:
3130
- pytables
3231
- python-dateutil
3332
- pytz
34-
- s3fs
3533
- scipy
3634
- xarray
3735
- xlrd

ci/deps/azure-37-locale.yaml

Lines changed: 0 additions & 1 deletion
@@ -27,7 +27,6 @@ dependencies:
2727
- pytables
2828
- python-dateutil
2929
- pytz
30-
- s3fs
3130
- scipy
3231
- xarray
3332
- xlrd

ci/deps/azure-windows-37.yaml

Lines changed: 3 additions & 2 deletions
@@ -15,7 +15,8 @@ dependencies:
1515
# pandas dependencies
1616
- beautifulsoup4
1717
- bottleneck
18-
- gcsfs
18+
- fsspec>=0.7.4
19+
- gcsfs>=0.6.0
1920
- html5lib
2021
- jinja2
2122
- lxml
@@ -28,7 +29,7 @@ dependencies:
2829
- pytables
2930
- python-dateutil
3031
- pytz
31-
- s3fs
32+
- s3fs>=0.4.0
3233
- scipy
3334
- sqlalchemy
3435
- xlrd

ci/deps/travis-36-cov.yaml

Lines changed: 3 additions & 2 deletions
@@ -18,7 +18,8 @@ dependencies:
1818
- cython>=0.29.16
1919
- dask
2020
- fastparquet>=0.3.2
21-
- gcsfs
21+
- fsspec>=0.7.4
22+
- gcsfs>=0.6.0
2223
- geopandas
2324
- html5lib
2425
- matplotlib
@@ -35,7 +36,7 @@ dependencies:
3536
- pytables
3637
- python-snappy
3738
- pytz
38-
- s3fs
39+
- s3fs>=0.4.0
3940
- scikit-learn
4041
- scipy
4142
- sqlalchemy

ci/deps/travis-36-locale.yaml

Lines changed: 0 additions & 2 deletions
@@ -16,7 +16,6 @@ dependencies:
1616
- blosc=1.14.3
1717
- python-blosc
1818
- fastparquet=0.3.2
19-
- gcsfs=0.2.2
2019
- html5lib
2120
- ipython
2221
- jinja2
@@ -33,7 +32,6 @@ dependencies:
3332
- pytables
3433
- python-dateutil
3534
- pytz
36-
- s3fs=0.3.0
3735
- scipy
3836
- sqlalchemy=1.1.4
3937
- xarray=0.10

ci/deps/travis-36-slow.yaml

Lines changed: 2 additions & 1 deletion
@@ -13,6 +13,7 @@ dependencies:
1313

1414
# pandas dependencies
1515
- beautifulsoup4
16+
- fsspec>=0.7.4
1617
- html5lib
1718
- lxml
1819
- matplotlib
@@ -25,7 +26,7 @@ dependencies:
2526
- pytables
2627
- python-dateutil
2728
- pytz
28-
- s3fs
29+
- s3fs>=0.4.0
2930
- scipy
3031
- sqlalchemy
3132
- xlrd

ci/deps/travis-37.yaml

Lines changed: 2 additions & 1 deletion
@@ -13,12 +13,13 @@ dependencies:
1313

1414
# pandas dependencies
1515
- botocore>=1.11
16+
- fsspec>=0.7.4
1617
- numpy
1718
- python-dateutil
1819
- nomkl
1920
- pyarrow
2021
- pytz
21-
- s3fs
22+
- s3fs>=0.4.0
2223
- tabulate
2324
- pyreadstat
2425
- pip

doc/redirects.csv

Lines changed: 5 additions & 5 deletions
@@ -269,11 +269,11 @@ generated/pandas.core.resample.Resampler.std,../reference/api/pandas.core.resamp
269269
generated/pandas.core.resample.Resampler.sum,../reference/api/pandas.core.resample.Resampler.sum
270270
generated/pandas.core.resample.Resampler.transform,../reference/api/pandas.core.resample.Resampler.transform
271271
generated/pandas.core.resample.Resampler.var,../reference/api/pandas.core.resample.Resampler.var
272-
generated/pandas.core.window.EWM.corr,../reference/api/pandas.core.window.EWM.corr
273-
generated/pandas.core.window.EWM.cov,../reference/api/pandas.core.window.EWM.cov
274-
generated/pandas.core.window.EWM.mean,../reference/api/pandas.core.window.EWM.mean
275-
generated/pandas.core.window.EWM.std,../reference/api/pandas.core.window.EWM.std
276-
generated/pandas.core.window.EWM.var,../reference/api/pandas.core.window.EWM.var
272+
generated/pandas.core.window.ExponentialMovingWindow.corr,../reference/api/pandas.core.window.ExponentialMovingWindow.corr
273+
generated/pandas.core.window.ExponentialMovingWindow.cov,../reference/api/pandas.core.window.ExponentialMovingWindow.cov
274+
generated/pandas.core.window.ExponentialMovingWindow.mean,../reference/api/pandas.core.window.ExponentialMovingWindow.mean
275+
generated/pandas.core.window.ExponentialMovingWindow.std,../reference/api/pandas.core.window.ExponentialMovingWindow.std
276+
generated/pandas.core.window.ExponentialMovingWindow.var,../reference/api/pandas.core.window.ExponentialMovingWindow.var
277277
generated/pandas.core.window.Expanding.aggregate,../reference/api/pandas.core.window.Expanding.aggregate
278278
generated/pandas.core.window.Expanding.apply,../reference/api/pandas.core.window.Expanding.apply
279279
generated/pandas.core.window.Expanding.corr,../reference/api/pandas.core.window.Expanding.corr

doc/source/getting_started/install.rst

Lines changed: 3 additions & 2 deletions
@@ -267,8 +267,9 @@ SQLAlchemy 1.1.4 SQL support for databases other tha
267267
SciPy 0.19.0 Miscellaneous statistical functions
268268
XLsxWriter 0.9.8 Excel writing
269269
blosc Compression for HDF5
270+
fsspec 0.7.4 Handling files aside from local and HTTP
270271
fastparquet 0.3.2 Parquet reading / writing
271-
gcsfs 0.2.2 Google Cloud Storage access
272+
gcsfs 0.6.0 Google Cloud Storage access
272273
html5lib HTML parser for read_html (see :ref:`note <optional_html>`)
273274
lxml 3.8.0 HTML parser for read_html (see :ref:`note <optional_html>`)
274275
matplotlib 2.2.2 Visualization
@@ -282,7 +283,7 @@ pyreadstat SPSS files (.sav) reading
282283
pytables 3.4.3 HDF5 reading / writing
283284
pyxlsb 1.0.6 Reading for xlsb files
284285
qtpy Clipboard I/O
285-
s3fs 0.3.0 Amazon S3 access
286+
s3fs 0.4.0 Amazon S3 access
286287
tabulate 0.8.3 Printing in Markdown-friendly format (see `tabulate`_)
287288
xarray 0.8.2 pandas-like API for N-dimensional data
288289
xclip Clipboard I/O on linux

doc/source/reference/window.rst

Lines changed: 6 additions & 6 deletions
@@ -8,7 +8,7 @@ Window
88

99
Rolling objects are returned by ``.rolling`` calls: :func:`pandas.DataFrame.rolling`, :func:`pandas.Series.rolling`, etc.
1010
Expanding objects are returned by ``.expanding`` calls: :func:`pandas.DataFrame.expanding`, :func:`pandas.Series.expanding`, etc.
11-
EWM objects are returned by ``.ewm`` calls: :func:`pandas.DataFrame.ewm`, :func:`pandas.Series.ewm`, etc.
11+
ExponentialMovingWindow objects are returned by ``.ewm`` calls: :func:`pandas.DataFrame.ewm`, :func:`pandas.Series.ewm`, etc.
1212

1313
Standard moving window functions
1414
--------------------------------
@@ -69,11 +69,11 @@ Exponentially-weighted moving window functions
6969
.. autosummary::
7070
:toctree: api/
7171

72-
EWM.mean
73-
EWM.std
74-
EWM.var
75-
EWM.corr
76-
EWM.cov
72+
ExponentialMovingWindow.mean
73+
ExponentialMovingWindow.std
74+
ExponentialMovingWindow.var
75+
ExponentialMovingWindow.corr
76+
ExponentialMovingWindow.cov
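The rename documented in this file can be checked from user code. The snippet below is a minimal sketch, assuming a pandas release that already includes the rename (1.1 or later); the variable names are illustrative only. It shows that ``.ewm`` itself is unchanged and only the class of the returned object has a new name.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0])

# .ewm() is called exactly as before; only the returned class was renamed.
ewm = s.ewm(span=3)
print(type(ewm).__name__)

# span=3 corresponds to alpha = 2 / (span + 1) = 0.5. With the default
# adjust=True, the second value is (2 + 0.5 * 1) / (1 + 0.5).
means = ewm.mean()
print(means)
```

Code that only calls ``.ewm(...).mean()`` and similar methods should not need changes; only direct references to the old class name are affected.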
7777

7878
Window indexer
7979
--------------

doc/source/user_guide/advanced.rst

Lines changed: 3 additions & 1 deletion
@@ -260,7 +260,9 @@ You don't have to specify all levels of the ``MultiIndex`` by passing only the
260260
first elements of the tuple. For example, you can use "partial" indexing to
261261
get all elements with ``bar`` in the first level as follows:
262262

263-
df.loc['bar']
263+
.. ipython:: python
264+
265+
df.loc['bar']
264266
265267
This is a shortcut for the slightly more verbose notation ``df.loc[('bar',),]`` (equivalent
266268
to ``df.loc['bar',]`` in this example).

doc/source/user_guide/computation.rst

Lines changed: 10 additions & 10 deletions
@@ -230,7 +230,7 @@ see the :ref:`groupby docs <groupby.transform.window_resample>`.
230230
The API for window statistics is quite similar to the way one works with ``GroupBy`` objects, see the documentation :ref:`here <groupby>`.
231231

232232
We work with ``rolling``, ``expanding`` and ``exponentially weighted`` data through the corresponding
233-
objects, :class:`~pandas.core.window.Rolling`, :class:`~pandas.core.window.Expanding` and :class:`~pandas.core.window.EWM`.
233+
objects, :class:`~pandas.core.window.Rolling`, :class:`~pandas.core.window.Expanding` and :class:`~pandas.core.window.ExponentialMovingWindow`.
234234

235235
.. ipython:: python
236236
@@ -777,7 +777,7 @@ columns by reshaping and indexing:
777777
Aggregation
778778
-----------
779779

780-
Once the ``Rolling``, ``Expanding`` or ``EWM`` objects have been created, several methods are available to
780+
Once the ``Rolling``, ``Expanding`` or ``ExponentialMovingWindow`` objects have been created, several methods are available to
781781
perform multiple computations on the data. These operations are similar to the :ref:`aggregating API <basics.aggregate>`,
782782
:ref:`groupby API <groupby.aggregate>`, and :ref:`resample API <timeseries.aggregate>`.
783783

@@ -971,7 +971,7 @@ Exponentially weighted windows
971971

972972
A related set of functions are exponentially weighted versions of several of
973973
the above statistics. A similar interface to ``.rolling`` and ``.expanding`` is accessed
974-
through the ``.ewm`` method to receive an :class:`~EWM` object.
974+
through the ``.ewm`` method to receive an :class:`~ExponentialMovingWindow` object.
975975
A number of expanding EW (exponentially weighted)
976976
methods are provided:
977977

@@ -980,11 +980,11 @@ methods are provided:
980980
:header: "Function", "Description"
981981
:widths: 20, 80
982982

983-
:meth:`~EWM.mean`, EW moving average
984-
:meth:`~EWM.var`, EW moving variance
985-
:meth:`~EWM.std`, EW moving standard deviation
986-
:meth:`~EWM.corr`, EW moving correlation
987-
:meth:`~EWM.cov`, EW moving covariance
983+
:meth:`~ExponentialMovingWindow.mean`, EW moving average
984+
:meth:`~ExponentialMovingWindow.var`, EW moving variance
985+
:meth:`~ExponentialMovingWindow.std`, EW moving standard deviation
986+
:meth:`~ExponentialMovingWindow.corr`, EW moving correlation
987+
:meth:`~ExponentialMovingWindow.cov`, EW moving covariance
988988

989989
In general, a weighted moving average is calculated as
990990

@@ -1090,12 +1090,12 @@ Here is an example for a univariate time series:
10901090
@savefig ewma_ex.png
10911091
s.ewm(span=20).mean().plot(style='k')
10921092
1093-
EWM has a ``min_periods`` argument, which has the same
1093+
ExponentialMovingWindow has a ``min_periods`` argument, which has the same
10941094
meaning it does for all the ``.expanding`` and ``.rolling`` methods:
10951095
no output values will be set until at least ``min_periods`` non-null values
10961096
are encountered in the (expanding) window.
10971097

1098-
EWM also has an ``ignore_na`` argument, which determines how
1098+
ExponentialMovingWindow also has an ``ignore_na`` argument, which determines how
10991099
intermediate null values affect the calculation of the weights.
11001100
When ``ignore_na=False`` (the default), weights are calculated based on absolute
11011101
positions, so that intermediate null values affect the result.
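The effect of ``min_periods`` and ``ignore_na`` described above can be illustrated with a three-element series containing one missing value. This is a minimal sketch with made-up data, assuming the default ``adjust=True``; the weights in the comments follow the rule stated in the text (absolute positions when ``ignore_na=False``).

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])

# min_periods=2: nothing is emitted until two non-null values have been seen,
# so only the last position gets a value.
with_min = s.ewm(alpha=0.5, min_periods=2).mean()

# ignore_na=False (default): weights use absolute positions, so the gap counts.
# Weights for 1.0 and 3.0 are (1 - 0.5)**2 and 1: (0.25 * 1 + 3) / 1.25.
by_position = s.ewm(alpha=0.5, ignore_na=False).mean()

# ignore_na=True: the gap is skipped, so the weights are (1 - 0.5) and 1:
# (0.5 * 1 + 3) / 1.5.
skipping_gap = s.ewm(alpha=0.5, ignore_na=True).mean()

print(with_min, by_position, skipping_gap, sep="\n\n")
```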

doc/source/user_guide/cookbook.rst

Lines changed: 19 additions & 0 deletions
@@ -1166,6 +1166,25 @@ Storing Attributes to a group node
11661166
store.close()
11671167
os.remove('test.h5')
11681168
1169+
You can create or load an HDFStore in-memory by passing the ``driver``
1170+
parameter to PyTables. Changes are only written to disk when the HDFStore
1171+
is closed.
1172+
1173+
.. ipython:: python
1174+
1175+
store = pd.HDFStore('test.h5', 'w', driver='H5FD_CORE')
1176+
1177+
df = pd.DataFrame(np.random.randn(8, 3))
1178+
store['test'] = df
1179+
1180+
# only after closing the store, data is written to disk:
1181+
store.close()
1182+
1183+
.. ipython:: python
1184+
:suppress:
1185+
1186+
os.remove('test.h5')
1187+
11691188
.. _cookbook.binary:
11701189

11711190
Binary files

doc/source/user_guide/timeseries.rst

Lines changed: 8 additions & 0 deletions
@@ -235,6 +235,8 @@ inferred frequency upon creation:
235235
236236
pd.DatetimeIndex(['2018-01-01', '2018-01-03', '2018-01-05'], freq='infer')
237237
238+
.. _timeseries.converting.format:
239+
238240
Providing a format argument
239241
~~~~~~~~~~~~~~~~~~~~~~~~~~~
240242

@@ -319,6 +321,12 @@ which can be specified. These are computed from the starting point specified by
319321
pd.to_datetime([1349720105100, 1349720105200, 1349720105300,
320322
1349720105400, 1349720105500], unit='ms')
321323
324+
.. note::
325+
326+
The ``unit`` parameter does not use the same strings as the ``format`` parameter
327+
that was discussed :ref:`above<timeseries.converting.format>`. The
328+
available units are listed on the documentation for :func:`pandas.to_datetime`.
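To make the distinction in this note concrete, the following sketch contrasts the two parameters. The epoch values are the ones used in the example above; the timestamp string and the ``strftime``-style format are illustrative choices, not taken from the original text.

```python
import pandas as pd

# unit= says how to interpret numbers counted from the epoch
# (here: milliseconds since 1970-01-01).
from_epoch = pd.to_datetime([1349720105100, 1349720105200], unit="ms")

# format= says how to parse strings, using strftime-style directives.
from_text = pd.to_datetime(["2012-10-08 18:15:05"], format="%Y-%m-%d %H:%M:%S")

print(from_epoch)
print(from_text)
```

A value such as ``"ms"`` is therefore a valid ``unit`` but not a ``format`` directive, and ``"%Y"`` is meaningful only to ``format``.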
329+
322330
Constructing a :class:`Timestamp` or :class:`DatetimeIndex` with an epoch timestamp
323331
with the ``tz`` argument specified will currently localize the epoch timestamps to UTC
324332
first then convert the result to the specified time zone. However, this behavior

doc/source/whatsnew/v0.25.0.rst

Lines changed: 1 addition & 1 deletion
@@ -1206,7 +1206,7 @@ Groupby/resample/rolling
12061206
- Bug in :meth:`pandas.core.groupby.GroupBy.agg` where incorrect results are returned for uint64 columns. (:issue:`26310`)
12071207
- Bug in :meth:`pandas.core.window.Rolling.median` and :meth:`pandas.core.window.Rolling.quantile` where MemoryError is raised with empty window (:issue:`26005`)
12081208
- Bug in :meth:`pandas.core.window.Rolling.median` and :meth:`pandas.core.window.Rolling.quantile` where incorrect results are returned with ``closed='left'`` and ``closed='neither'`` (:issue:`26005`)
1209-
- Improved :class:`pandas.core.window.Rolling`, :class:`pandas.core.window.Window` and :class:`pandas.core.window.EWM` functions to exclude nuisance columns from results instead of raising errors and raise a ``DataError`` only if all columns are nuisance (:issue:`12537`)
1209+
- Improved :class:`pandas.core.window.Rolling`, :class:`pandas.core.window.Window` and :class:`pandas.core.window.ExponentialMovingWindow` functions to exclude nuisance columns from results instead of raising errors and raise a ``DataError`` only if all columns are nuisance (:issue:`12537`)
12101210
- Bug in :meth:`pandas.core.window.Rolling.max` and :meth:`pandas.core.window.Rolling.min` where incorrect results are returned with an empty variable window (:issue:`26005`)
12111211
- Raise a helpful exception when an unsupported weighted window function is used as an argument of :meth:`pandas.core.window.Window.aggregate` (:issue:`26597`)
12121212

doc/source/whatsnew/v1.1.0.rst

Lines changed: 23 additions & 2 deletions
@@ -245,6 +245,22 @@ If needed you can adjust the bins with the argument ``offset`` (a Timedelta) tha
245245

246246
For a full example, see: :ref:`timeseries.adjust-the-start-of-the-bins`.
247247

248+
fsspec now used for filesystem handling
249+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
250+
251+
For reading and writing to filesystems other than local and reading from HTTP(S),
252+
the optional dependency ``fsspec`` will be used to dispatch operations (:issue:`33452`).
253+
This will give unchanged
254+
functionality for S3 and GCS storage, which were already supported, but also add
255+
support for several other storage implementations such as `Azure Datalake and Blob`_,
256+
SSH, FTP, Dropbox and GitHub. For docs and capabilities, see the `fsspec docs`_.
257+
258+
The existing capability to interface with S3 and GCS will be unaffected by this
259+
change, as ``fsspec`` will still bring in the same packages as before.
260+
261+
.. _Azure Datalake and Blob: https://github.com/dask/adlfs
262+
263+
.. _fsspec docs: https://filesystem-spec.readthedocs.io/en/latest/
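As a rough illustration of the dispatch described in this section, the sketch below round-trips a frame through ``memory://``, the in-memory filesystem that ships with ``fsspec``, so no credentials or network access are needed. It assumes ``fsspec`` is installed and a pandas version that routes such URLs through it; the path ``demo/frame.csv`` is a made-up name.

```python
import importlib.util

import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})

roundtrip = None
# fsspec is an optional dependency, so only try the URL path when it is present.
if importlib.util.find_spec("fsspec") is not None:
    # Any URL scheme other than local paths and http(s) is handed to fsspec;
    # an "s3://..." or "gcs://..." URL would take the same code path.
    df.to_csv("memory://demo/frame.csv", index=False)
    roundtrip = pd.read_csv("memory://demo/frame.csv")
    print(roundtrip.equals(df))
else:
    print("fsspec is not installed; skipping the URL round trip")
```

Replacing the ``memory://`` URL with a remote one should work the same way, provided the matching backend package (for example ``s3fs`` or ``gcsfs``) is installed.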
248264

249265
.. _whatsnew_110.enhancements.other:
250266

@@ -292,11 +308,13 @@ Other enhancements
292308
- :meth:`DataFrame.to_csv` and :meth:`Series.to_csv` now accept an ``errors`` argument (:issue:`22610`)
293309
- :meth:`groupby.transform` now allows ``func`` to be ``pad``, ``backfill`` and ``cumcount`` (:issue:`31269`).
294310
- :meth:`~pandas.io.json.read_json` now accepts the ``nrows`` parameter (:issue:`33916`).
311+
- :meth:`DataFrame.hist`, :meth:`Series.hist`, :meth:`core.groupby.DataFrameGroupBy.hist`, and :meth:`core.groupby.SeriesGroupBy.hist` have gained the ``legend`` argument. Set to True to show a legend in the histogram. (:issue:`6279`)
295312
- :func:`concat` and :meth:`~DataFrame.append` now preserve extension dtypes, for example
296313
combining a nullable integer column with a numpy integer column will no longer
297314
result in object dtype but preserve the integer dtype (:issue:`33607`, :issue:`34339`).
298315
- :meth:`~pandas.io.gbq.read_gbq` now allows disabling the progress bar (:issue:`33360`).
299316
- :meth:`~pandas.io.gbq.read_gbq` now supports the ``max_results`` kwarg from ``pandas-gbq`` (:issue:`34639`).
317+
- :meth:`DataFrame.cov` and :meth:`Series.cov` now support a new parameter ``ddof`` to support delta degrees of freedom as in the corresponding numpy methods (:issue:`34611`).
300318
- :meth:`DataFrame.to_html` and :meth:`DataFrame.to_string`'s ``col_space`` parameter now accepts a list of dict to change only some specific columns' width (:issue:`28917`).
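The ``ddof`` parameter of :meth:`DataFrame.cov` mentioned in the list above can be sketched as follows. The data are made up, and the comparison with ``numpy.cov`` is only there to show that the two agree; it assumes a pandas version that includes this change.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0], "y": [2.0, 4.0, 6.0, 9.0]})

# Default ddof=1 gives the sample covariance (divides by N - 1).
sample_cov = df.cov()

# ddof=0 gives the population covariance (divides by N).
population_cov = df.cov(ddof=0)

# numpy.cov treats rows as variables, hence the transpose.
numpy_cov = np.cov(df.to_numpy().T, ddof=0)

print(sample_cov)
print(population_cov)
```

With four rows, the two results differ by the factor ``N / (N - 1) = 4 / 3``.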
301319

302320
.. ---------------------------------------------------------------------------
@@ -700,7 +718,9 @@ Optional libraries below the lowest tested version may still work, but are not c
700718
+-----------------+-----------------+---------+
701719
| fastparquet | 0.3.2 | |
702720
+-----------------+-----------------+---------+
703-
| gcsfs | 0.2.2 | |
721+
| fsspec | 0.7.4 | |
722+
+-----------------+-----------------+---------+
723+
| gcsfs | 0.6.0 | X |
704724
+-----------------+-----------------+---------+
705725
| lxml | 3.8.0 | |
706726
+-----------------+-----------------+---------+
@@ -716,7 +736,7 @@ Optional libraries below the lowest tested version may still work, but are not c
716736
+-----------------+-----------------+---------+
717737
| pytables | 3.4.3 | X |
718738
+-----------------+-----------------+---------+
719-
| s3fs | 0.3.0 | |
739+
| s3fs | 0.4.0 | X |
720740
+-----------------+-----------------+---------+
721741
| scipy | 1.2.0 | X |
722742
+-----------------+-----------------+---------+
@@ -956,6 +976,7 @@ MultiIndex
956976
df.loc[(['b', 'a'], [2, 1]), :]
957977
958978
- Bug in :meth:`MultiIndex.intersection` was not guaranteed to preserve order when ``sort=False``. (:issue:`31325`)
979+
- Bug in :meth:`DataFrame.truncate` was dropping :class:`MultiIndex` names. (:issue:`34564`)
959980

960981
.. ipython:: python
961982

environment.yml

Lines changed: 3 additions & 1 deletion
@@ -98,7 +98,9 @@ dependencies:
9898

9999
- pyqt>=5.9.2 # pandas.read_clipboard
100100
- pytables>=3.4.3 # pandas.read_hdf, DataFrame.to_hdf
101-
- s3fs # pandas.read_csv... when using 's3://...' path
101+
- s3fs>=0.4.0 # file IO when using 's3://...' path
102+
- fsspec>=0.7.4 # for generic remote file operations
103+
- gcsfs>=0.6.0 # file IO when using 'gcs://...' path
102104
- sqlalchemy # pandas.read_sql, DataFrame.to_sql
103105
- xarray # DataFrame.to_xarray
104106
- cftime # Needed for downstream xarray.CFTimeIndex test
