Commit e273ead

Merge pull request #8 from pandas-dev/master
Updation of fork
2 parents da1ccc1 + b528be6 commit e273ead

159 files changed, with 2262 additions and 1415 deletions

README.md

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@ its way towards this goal.
  Here are just a few of the things that pandas does well:

  - Easy handling of [**missing data**][missing-data] (represented as
- `NaN`) in floating point as well as non-floating point data
+ `NaN`, `NA`, or `NaT`) in floating point as well as non-floating point data
  - Size mutability: columns can be [**inserted and
  deleted**][insertion-deletion] from DataFrame and higher dimensional
  objects

asv_bench/asv.conf.json

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@
  // The Pythons you'd like to test against. If not provided, defaults
  // to the current version of Python used to run `asv`.
  // "pythons": ["2.7", "3.4"],
- "pythons": ["3.6"],
+ "pythons": ["3.8"],

  // The matrix of dependencies to test. Each key is the name of a
  // package (in PyPI) and the values are version numbers. An empty

ci/deps/azure-37-locale.yaml

Lines changed: 1 addition & 0 deletions
@@ -21,6 +21,7 @@ dependencies:
  - lxml
  - matplotlib>=3.3.0
  - moto
+ - flask
  - nomkl
  - numexpr
  - numpy=1.16.*

ci/deps/azure-37-slow.yaml

Lines changed: 2 additions & 0 deletions
@@ -27,9 +27,11 @@ dependencies:
  - python-dateutil
  - pytz
  - s3fs>=0.4.0
+ - moto>=1.3.14
  - scipy
  - sqlalchemy
  - xlrd
  - xlsxwriter
  - xlwt
  - moto
+ - flask

ci/deps/azure-38-locale.yaml

Lines changed: 4 additions & 2 deletions
@@ -6,14 +6,15 @@ dependencies:

  # tools
  - cython>=0.29.16
- - pytest>=5.0.1,<6.0.0 # https://github.com/pandas-dev/pandas/issues/35620
+ - pytest>=5.0.1
  - pytest-xdist>=1.21
- - pytest-asyncio
+ - pytest-asyncio>=0.12.0
  - hypothesis>=3.58.0
  - pytest-azurepipelines

  # pandas dependencies
  - beautifulsoup4
+ - flask
  - html5lib
  - ipython
  - jinja2
@@ -32,6 +33,7 @@ dependencies:
  - xlrd
  - xlsxwriter
  - xlwt
+ - moto
  - pyarrow>=0.15
  - pip
  - pip:

ci/deps/azure-windows-37.yaml

Lines changed: 5 additions & 4 deletions
@@ -8,28 +8,29 @@ dependencies:
  # tools
  - cython>=0.29.16
  - pytest>=5.0.1
- - pytest-xdist>=1.21,<2.0.0 # GH 35737
+ - pytest-xdist>=1.21
  - hypothesis>=3.58.0
  - pytest-azurepipelines

  # pandas dependencies
  - beautifulsoup4
  - bottleneck
- - fsspec>=0.7.4
+ - fsspec>=0.8.0
  - gcsfs>=0.6.0
  - html5lib
  - jinja2
  - lxml
  - matplotlib=2.2.*
- - moto
+ - moto>=1.3.14
+ - flask
  - numexpr
  - numpy=1.16.*
  - openpyxl
  - pyarrow=0.15
  - pytables
  - python-dateutil
  - pytz
- - s3fs>=0.4.0
+ - s3fs>=0.4.2
  - scipy
  - sqlalchemy
  - xlrd

ci/deps/azure-windows-38.yaml

Lines changed: 5 additions & 1 deletion
@@ -8,15 +8,18 @@ dependencies:
  # tools
  - cython>=0.29.16
  - pytest>=5.0.1
- - pytest-xdist>=1.21,<2.0.0 # GH 35737
+ - pytest-xdist>=1.21
  - hypothesis>=3.58.0
  - pytest-azurepipelines

  # pandas dependencies
  - blosc
  - bottleneck
  - fastparquet>=0.3.2
+ - flask
+ - fsspec>=0.8.0
  - matplotlib=3.1.3
+ - moto>=1.3.14
  - numba
  - numexpr
  - numpy=1.18.*
@@ -26,6 +29,7 @@ dependencies:
  - pytables
  - python-dateutil
  - pytz
+ - s3fs>=0.4.0
  - scipy
  - xlrd
  - xlsxwriter

ci/deps/travis-37-arm64.yaml

Lines changed: 1 addition & 0 deletions
@@ -17,5 +17,6 @@ dependencies:
  - python-dateutil
  - pytz
  - pip
+ - flask
  - pip:
  - moto

ci/deps/travis-37-cov.yaml

Lines changed: 2 additions & 1 deletion
@@ -23,7 +23,8 @@ dependencies:
  - geopandas
  - html5lib
  - matplotlib
- - moto
+ - moto>=1.3.14
+ - flask
  - nomkl
  - numexpr
  - numpy=1.16.*

ci/deps/travis-37-locale.yaml

Lines changed: 2 additions & 1 deletion
@@ -21,13 +21,14 @@ dependencies:
  - jinja2
  - lxml=4.3.0
  - matplotlib=3.0.*
- - moto
  - nomkl
  - numexpr
  - numpy
  - openpyxl
  - pandas-gbq=0.12.0
+ - pyarrow>=0.17
  - psycopg2=2.7
+ - pyarrow>=0.15.0 # GH #35813
  - pymysql=0.7.11
  - pytables
  - python-dateutil

ci/deps/travis-37.yaml

Lines changed: 2 additions & 2 deletions
@@ -20,8 +20,8 @@ dependencies:
  - pyarrow
  - pytz
  - s3fs>=0.4.0
+ - moto>=1.3.14
+ - flask
  - tabulate
  - pyreadstat
  - pip
- - pip:
- - moto

doc/source/development/contributing.rst

Lines changed: 1 addition & 0 deletions
@@ -204,6 +204,7 @@ You will need `Build Tools for Visual Studio 2017
  You DO NOT need to install Visual Studio 2019.
  You only need "Build Tools for Visual Studio 2019" found by
  scrolling down to "All downloads" -> "Tools for Visual Studio 2019".
+ In the installer, select the "C++ build tools" workload.

  **Mac OS**

doc/source/development/roadmap.rst

Lines changed: 26 additions & 0 deletions
@@ -53,6 +53,32 @@ need to implement certain operations expected by pandas users (for example
  the algorithm used in, ``Series.str.upper``). That work may be done outside of
  pandas.

+ Consistent missing value handling
+ ---------------------------------
+
+ Currently, pandas handles missing data differently for different data types. We
+ use different types to indicate that a value is missing (``np.nan`` for
+ floating-point data, ``np.nan`` or ``None`` for object-dtype data -- typically
+ strings or booleans -- with missing values, and ``pd.NaT`` for datetimelike
+ data). Integer data cannot store missing data or are cast to float. In addition,
+ pandas 1.0 introduced a new missing value sentinel, ``pd.NA``, which is being
+ used for the experimental nullable integer, boolean, and string data types.
+
+ These different missing values have different behaviors in user-facing
+ operations. Specifically, we introduced different semantics for the nullable
+ data types for certain operations (e.g. propagating in comparison operations
+ instead of comparing as False).
+
+ Long term, we want to introduce consistent missing data handling for all data
+ types. This includes consistent behavior in all operations (indexing, arithmetic
+ operations, comparisons, etc.). We want to eventually make the new semantics the
+ default.
+
+ This has been discussed at
+ `github #28095 <https://github.com/pandas-dev/pandas/issues/28095>`__ (and
+ linked issues), and described in more detail in this
+ `design doc <https://hackmd.io/@jorisvandenbossche/Sk0wMeAmB>`__.
+
  Apache Arrow interoperability
  -----------------------------
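
The differing semantics described in the roadmap section added by the hunk above can be illustrated with a short sketch. It is not part of the commit; it assumes pandas 1.0 or later (for ``pd.NA`` and the nullable ``Int64`` dtype), and the variable names are purely illustrative.

```python
import numpy as np
import pandas as pd

# Plain integers with a missing entry are cast to float64, and the gap is np.nan.
cast_to_float = pd.Series([1, None, 3])

# Floating-point data: a comparison involving NaN evaluates to False.
floats = pd.Series([1.0, np.nan, 3.0])
float_cmp = floats == 1.0

# Nullable integer data: the gap is pd.NA, and it propagates through comparisons.
ints = pd.Series([1, None, 3], dtype="Int64")
int_cmp = ints == 1

# Datetimelike data uses pd.NaT as its missing value.
dates = pd.Series(pd.to_datetime(["2020-01-01", None]))

print(cast_to_float.dtype)       # float64
print(float_cmp.tolist())        # [True, False, False]
print(int_cmp.dtype)             # boolean
print(int_cmp.isna().tolist())   # [False, True, False]
print(dates.isna().tolist())     # [False, True]
```

The float comparison silently yields ``False`` for the missing entry, while the nullable comparison keeps it missing; this is the inconsistency the roadmap item proposes to resolve.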

doc/source/getting_started/overview.rst

Lines changed: 8 additions & 8 deletions
@@ -9,9 +9,9 @@ Package overview
  **pandas** is a `Python <https://www.python.org>`__ package providing fast,
  flexible, and expressive data structures designed to make working with
  "relational" or "labeled" data both easy and intuitive. It aims to be the
- fundamental high-level building block for doing practical, **real world** data
+ fundamental high-level building block for doing practical, **real-world** data
  analysis in Python. Additionally, it has the broader goal of becoming **the
- most powerful and flexible open source data analysis / manipulation tool
+ most powerful and flexible open source data analysis/manipulation tool
  available in any language**. It is already well on its way toward this goal.

  pandas is well suited for many different kinds of data:
@@ -21,7 +21,7 @@ pandas is well suited for many different kinds of data:
  - Ordered and unordered (not necessarily fixed-frequency) time series data.
  - Arbitrary matrix data (homogeneously typed or heterogeneous) with row and
  column labels
- - Any other form of observational / statistical data sets. The data actually
+ - Any other form of observational / statistical data sets. The data
  need not be labeled at all to be placed into a pandas data structure

  The two primary data structures of pandas, :class:`Series` (1-dimensional)
@@ -57,7 +57,7 @@ Here are just a few of the things that pandas does well:
  Excel files, databases, and saving / loading data from the ultrafast **HDF5
  format**
  - **Time series**-specific functionality: date range generation and frequency
- conversion, moving window statistics, date shifting and lagging.
+ conversion, moving window statistics, date shifting, and lagging.

  Many of these principles are here to address the shortcomings frequently
  experienced using other languages / scientific research environments. For data
@@ -101,12 +101,12 @@ fashion.

  Also, we would like sensible default behaviors for the common API functions
  which take into account the typical orientation of time series and
- cross-sectional data sets. When using ndarrays to store 2- and 3-dimensional
+ cross-sectional data sets. When using the N-dimensional array (ndarrays) to store 2- and 3-dimensional
  data, a burden is placed on the user to consider the orientation of the data
  set when writing functions; axes are considered more or less equivalent (except
  when C- or Fortran-contiguousness matters for performance). In pandas, the axes
  are intended to lend more semantic meaning to the data; i.e., for a particular
- data set there is likely to be a "right" way to orient the data. The goal,
+ data set, there is likely to be a "right" way to orient the data. The goal,
  then, is to reduce the amount of mental effort required to code up data
  transformations in downstream functions.

@@ -148,8 +148,8 @@ pandas possible. Thanks to `all of our contributors <https://github.com/pandas-d
  If you're interested in contributing, please visit the :ref:`contributing guide <contributing>`.

  pandas is a `NumFOCUS <https://www.numfocus.org/open-source-projects/>`__ sponsored project.
- This will help ensure the success of development of pandas as a world-class open-source
- project, and makes it possible to `donate <https://pandas.pydata.org/donate.html>`__ to the project.
+ This will help ensure the success of the development of pandas as a world-class open-source
+ project and makes it possible to `donate <https://pandas.pydata.org/donate.html>`__ to the project.

  Project governance
  ------------------

doc/source/getting_started/tutorials.rst

Lines changed: 1 addition & 1 deletion
@@ -94,4 +94,4 @@ Various tutorials
  * `Intro to pandas data structures, by Greg Reda <http://www.gregreda.com/2013/10/26/intro-to-pandas-data-structures/>`_
  * `Pandas and Python: Top 10, by Manish Amde <https://manishamde.github.io/blog/2013/03/07/pandas-and-python-top-10/>`_
  * `Pandas DataFrames Tutorial, by Karlijn Willems <https://www.datacamp.com/community/tutorials/pandas-tutorial-dataframe-python>`_
- * `A concise tutorial with real life examples <https://tutswiki.com/pandas-cookbook/chapter1>`_
+ * `A concise tutorial with real life examples <https://tutswiki.com/pandas-cookbook/chapter1/>`_

doc/source/reference/frame.rst

Lines changed: 1 addition & 0 deletions
@@ -343,6 +343,7 @@ Sparse-dtype specific methods and attributes are provided under the

  .. autosummary::
  :toctree: api/
+ :template: autosummary/accessor_method.rst

  DataFrame.sparse.from_spmatrix
  DataFrame.sparse.to_coo

doc/source/reference/series.rst

Lines changed: 1 addition & 0 deletions
@@ -522,6 +522,7 @@ Sparse-dtype specific methods and attributes are provided under the

  .. autosummary::
  :toctree: api/
+ :template: autosummary/accessor_method.rst

  Series.sparse.from_coo
  Series.sparse.to_coo

doc/source/user_guide/sparse.rst

Lines changed: 4 additions & 3 deletions
@@ -87,14 +87,15 @@ The :attr:`SparseArray.dtype` property stores two pieces of information
  sparr.dtype


- A :class:`SparseDtype` may be constructed by passing each of these
+ A :class:`SparseDtype` may be constructed by passing only a dtype

  .. ipython:: python

  pd.SparseDtype(np.dtype('datetime64[ns]'))

- The default fill value for a given NumPy dtype is the "missing" value for that dtype,
- though it may be overridden.
+ in which case a default fill value will be used (for NumPy dtypes this is often the
+ "missing" value for that dtype). To override this default an explicit fill value may be
+ passed instead

  .. ipython:: python
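
The default versus explicit fill value described in the reworded sparse.rst text can be sketched as follows. This is an illustration under assumptions, not part of the commit: the ``float64`` dtype, the sample values, and the variable names are chosen here only for demonstration.

```python
import numpy as np
import pandas as pd

# Passing only a dtype: the fill value defaults to the dtype's "missing" value (NaN for float64).
default_dtype = pd.SparseDtype(np.dtype("float64"))

# Passing an explicit fill value overrides that default.
explicit_dtype = pd.SparseDtype(np.dtype("float64"), fill_value=0.0)

# With fill_value=0.0, zeros are not stored; only the non-fill entries are kept.
arr = pd.arrays.SparseArray([0.0, 0.0, 1.5, 0.0], dtype=explicit_dtype)

print(default_dtype.fill_value)
print(explicit_dtype.fill_value)
print(arr.sp_values)  # the values that are actually stored
print(arr.density)    # fraction of entries that are stored
```

Choosing the fill value to match the most common entry in the data is what makes the sparse representation compact.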

doc/source/user_guide/timeseries.rst

Lines changed: 7 additions & 2 deletions
@@ -2319,13 +2319,18 @@ you can use the ``tz_convert`` method.
  Instead, the datetime needs to be localized using the ``localize`` method
  on the ``pytz`` time zone object.

+ .. warning::
+
+ Be aware that for times in the future, correct conversion between time zones
+ (and UTC) cannot be guaranteed by any time zone library because a timezone's
+ offset from UTC may be changed by the respective government.
+
  .. warning::

  If you are using dates beyond 2038-01-18, due to current deficiencies
  in the underlying libraries caused by the year 2038 problem, daylight saving time (DST) adjustments
  to timezone aware dates will not be applied. If and when the underlying libraries are fixed,
- the DST transitions will be applied. It should be noted though, that time zone data for far future time zones
- are likely to be inaccurate, as they are simple extrapolations of the current set of (regularly revised) rules.
+ the DST transitions will be applied.

  For example, for two dates that are in British Summer Time (and so would normally be GMT+1), both the following asserts evaluate as true:

doc/source/user_guide/visualization.rst

Lines changed: 2 additions & 0 deletions
@@ -668,6 +668,7 @@ A ``ValueError`` will be raised if there are any negative values in your data.
  plt.figure()

  .. ipython:: python
+ :okwarning:

  series = pd.Series(3 * np.random.rand(4),
  index=['a', 'b', 'c', 'd'], name='series')
@@ -742,6 +743,7 @@ If you pass values whose sum total is less than 1.0, matplotlib draws a semicirc
  plt.figure()

  .. ipython:: python
+ :okwarning:

  series = pd.Series([0.1] * 4, index=['a', 'b', 'c', 'd'], name='series2')

doc/source/whatsnew/index.rst

Lines changed: 1 addition & 0 deletions
@@ -24,6 +24,7 @@ Version 1.1
  .. toctree::
  :maxdepth: 2

+ v1.1.2
  v1.1.1
  v1.1.0

doc/source/whatsnew/v0.25.0.rst

Lines changed: 1 addition & 1 deletion
@@ -540,7 +540,7 @@ with :attr:`numpy.nan` in the case of an empty :class:`DataFrame` (:issue:`26397

  .. ipython:: python

- df.describe()
+ df.describe()

  ``__str__`` methods now call ``__repr__`` rather than vice versa
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
