Commit 5e84250

Merge remote-tracking branch 'upstream/master' into docfix-multiindex-set_levels
2 parents 5e4b57b + 95be077 commit 5e84250

51 files changed: +2171 -1709 lines changed

.github/workflows/ci.yml

Lines changed: 72 additions & 10 deletions
@@ -23,53 +23,53 @@ jobs:

     - name: Looking for unwanted patterns
       run: ci/code_checks.sh patterns
-      if: true
+      if: always()

     - name: Setup environment and build pandas
       run: ci/setup_env.sh
-      if: true
+      if: always()

     - name: Linting
       run: |
         source activate pandas-dev
         ci/code_checks.sh lint
-      if: true
+      if: always()

     - name: Dependencies consistency
       run: |
         source activate pandas-dev
         ci/code_checks.sh dependencies
-      if: true
+      if: always()

     - name: Checks on imported code
       run: |
         source activate pandas-dev
         ci/code_checks.sh code
-      if: true
+      if: always()

     - name: Running doctests
       run: |
         source activate pandas-dev
         ci/code_checks.sh doctests
-      if: true
+      if: always()

     - name: Docstring validation
       run: |
         source activate pandas-dev
         ci/code_checks.sh docstrings
-      if: true
+      if: always()

     - name: Typing validation
       run: |
         source activate pandas-dev
         ci/code_checks.sh typing
-      if: true
+      if: always()

     - name: Testing docstring validation script
       run: |
         source activate pandas-dev
         pytest --capture=no --strict scripts
-      if: true
+      if: always()

     - name: Running benchmarks
       run: |
@@ -87,11 +87,73 @@ jobs:
         else
           echo "Benchmarks did not run, no changes detected"
         fi
-      if: true
+      if: always()

     - name: Publish benchmarks artifact
       uses: actions/upload-artifact@master
       with:
         name: Benchmarks log
         path: asv_bench/benchmarks.log
       if: failure()
+
+  web_and_docs:
+    name: Web and docs
+    runs-on: ubuntu-latest
+    steps:
+
+    - name: Setting conda path
+      run: echo "::set-env name=PATH::${HOME}/miniconda3/bin:${PATH}"
+
+    - name: Checkout
+      uses: actions/checkout@v1
+
+    - name: Setup environment and build pandas
+      run: ci/setup_env.sh
+
+    - name: Build website
+      run: |
+        source activate pandas-dev
+        python web/pandas_web.py web/pandas --target-path=web/build
+
+    - name: Build documentation
+      run: |
+        source activate pandas-dev
+        doc/make.py --warnings-are-errors | tee sphinx.log ; exit ${PIPESTATUS[0]}
+
+    # This can be removed when the ipython directive fails when there are errors,
+    # including the `tee sphinx.log` in te previous step (https://github.com/ipython/ipython/issues/11547)
+    - name: Check ipython directive errors
+      run: "! grep -B1 \"^<<<-------------------------------------------------------------------------$\" sphinx.log"
+
+    - name: Merge website and docs
+      run: |
+        mkdir -p pandas_web/docs
+        cp -r web/build/* pandas_web/
+        cp -r doc/build/html/* pandas_web/docs/
+      if: github.event_name == 'push'
+
+    - name: Install Rclone
+      run: sudo apt install rclone -y
+      if: github.event_name == 'push'
+
+    - name: Set up Rclone
+      run: |
+        RCLONE_CONFIG_PATH=$HOME/.config/rclone/rclone.conf
+        mkdir -p `dirname $RCLONE_CONFIG_PATH`
+        echo "[ovh_cloud_pandas_web]" > $RCLONE_CONFIG_PATH
+        echo "type = swift" >> $RCLONE_CONFIG_PATH
+        echo "env_auth = false" >> $RCLONE_CONFIG_PATH
+        echo "auth_version = 3" >> $RCLONE_CONFIG_PATH
+        echo "auth = https://auth.cloud.ovh.net/v3/" >> $RCLONE_CONFIG_PATH
+        echo "endpoint_type = public" >> $RCLONE_CONFIG_PATH
+        echo "tenant_domain = default" >> $RCLONE_CONFIG_PATH
+        echo "tenant = 2977553886518025" >> $RCLONE_CONFIG_PATH
+        echo "domain = default" >> $RCLONE_CONFIG_PATH
+        echo "user = w4KGs3pmDxpd" >> $RCLONE_CONFIG_PATH
+        echo "key = ${{ secrets.ovh_object_store_key }}" >> $RCLONE_CONFIG_PATH
+        echo "region = BHS" >> $RCLONE_CONFIG_PATH
+      if: github.event_name == 'push'
+
+    - name: Sync web
+      run: rclone sync pandas_web ovh_cloud_pandas_web:dev
+      if: github.event_name == 'push'
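
Note on the "Check ipython directive errors" step: the leading `!` inverts grep's exit status, so the step fails exactly when the marker line appears in `sphinx.log`. A rough Python equivalent of that check is sketched below. The function name and the sample log text are hypothetical, and the regex accepts any run of dashes after `<<<` rather than the exact count used in the workflow.

```python
import re

# Assumption: the marker is "<<<" followed only by dashes, as in the grep
# pattern of the CI step; the exact number of dashes is not checked here.
MARKER = re.compile(r"^<<<-+$")


def find_ipython_errors(log_text):
    """Return (previous_line, marker_line) pairs, similar to `grep -B1`."""
    lines = log_text.splitlines()
    hits = []
    for i, line in enumerate(lines):
        if MARKER.match(line):
            previous = lines[i - 1] if i > 0 else ""
            hits.append((previous, line))
    return hits


# Synthetic log content, invented purely for illustration.
sample_log = "\n".join(
    [
        "synthetic unrelated line",
        "synthetic line before the marker",
        "<<<-----------------------------",
        "synthetic trailing line",
    ]
)

hits = find_ipython_errors(sample_log)
# Mirror the `!` inversion: any hit means the check should fail.
check_passed = not hits
print(hits)
print(check_passed)
```

In the workflow the same effect comes from the shell alone; the sketch only spells out the logic of matching the marker and reporting the line before it.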

.travis.yml

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 language: python
-python: 3.5
+python: 3.7

 # To turn off cached cython files and compiler cache
 # set NOCACHE-true

doc/source/user_guide/integer_na.rst

Lines changed: 28 additions & 0 deletions
@@ -15,6 +15,10 @@ Nullable integer data type
    IntegerArray is currently experimental. Its API or implementation may
    change without warning.

+.. versionchanged:: 1.0.0
+
+   Now uses :attr:`pandas.NA` as the missing value rather
+   than :attr:`numpy.nan`.

 In :ref:`missing_data`, we saw that pandas primarily uses ``NaN`` to represent
 missing data. Because ``NaN`` is a float, this forces an array of integers with
@@ -23,6 +27,9 @@ much. But if your integer column is, say, an identifier, casting to float can
 be problematic. Some integers cannot even be represented as floating point
 numbers.

+Construction
+------------
+
 Pandas can represent integer data with possibly missing values using
 :class:`arrays.IntegerArray`. This is an :ref:`extension types <extending.extension-types>`
 implemented within pandas.
@@ -39,6 +46,12 @@ NumPy's ``'int64'`` dtype:

    pd.array([1, 2, np.nan], dtype="Int64")

+All NA-like values are replaced with :attr:`pandas.NA`.
+
+.. ipython:: python
+
+   pd.array([1, 2, np.nan, None, pd.NA], dtype="Int64")
+
 This array can be stored in a :class:`DataFrame` or :class:`Series` like any
 NumPy array.

@@ -78,6 +91,9 @@ with the dtype.
 In the future, we may provide an option for :class:`Series` to infer a
 nullable-integer dtype.

+Operations
+----------
+
 Operations involving an integer array will behave similar to NumPy arrays.
 Missing values will be propagated, and the data will be coerced to another
 dtype if needed.
@@ -123,3 +139,15 @@ Reduction and groupby operations such as 'sum' work as well.

    df.sum()
    df.groupby('B').A.sum()
+
+Scalar NA Value
+---------------
+
+:class:`arrays.IntegerArray` uses :attr:`pandas.NA` as its scalar
+missing value. Slicing a single element that's missing will return
+:attr:`pandas.NA`
+
+.. ipython:: python
+
+   a = pd.array([1, None], dtype="Int64")
+   a[1]
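
Note on the documented behaviour: the snippet below is a minimal sketch of what the new "Construction" and "Scalar NA Value" passages describe, assuming pandas 1.0 or later is installed. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Assumes pandas >= 1.0, where IntegerArray stores missing values as pd.NA.
arr = pd.array([1, 2, np.nan, None, pd.NA], dtype="Int64")

# Every NA-like input (np.nan, None, pd.NA) ends up as a missing entry.
missing_mask = arr.isna().tolist()
print(missing_mask)

# Indexing a missing position returns the pd.NA singleton, not a float nan.
print(arr[2] is pd.NA)
print(arr[3] is pd.NA)

# Non-missing positions still hold integers.
print(int(arr[0]))
```

Checking identity with `is pd.NA` works because `pd.NA` is a singleton; an equality test such as `arr[2] == pd.NA` would itself evaluate to `pd.NA` rather than a boolean.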

doc/source/whatsnew/v1.0.0.rst

Lines changed: 60 additions & 0 deletions
@@ -357,6 +357,64 @@ The following methods now also correctly output values for unobserved categories

 As a reminder, you can specify the ``dtype`` to disable all inference.

+:class:`arrays.IntegerArray` now uses :attr:`pandas.NA`
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+:class:`arrays.IntegerArray` now uses :attr:`pandas.NA` rather than
+:attr:`numpy.nan` as its missing value marker (:issue:`29964`).
+
+*pandas 0.25.x*
+
+.. code-block:: python
+
+   >>> a = pd.array([1, 2, None], dtype="Int64")
+   >>> a
+   <IntegerArray>
+   [1, 2, NaN]
+   Length: 3, dtype: Int64
+
+   >>> a[2]
+   nan
+
+*pandas 1.0.0*
+
+.. ipython:: python
+
+   a = pd.array([1, 2, None], dtype="Int64")
+   a[2]
+
+See :ref:`missing_data.NA` for more on the differences between :attr:`pandas.NA`
+and :attr:`numpy.nan`.
+
+:class:`arrays.IntegerArray` comparisons return :class:`arrays.BooleanArray`
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Comparison operations on a :class:`arrays.IntegerArray` now returns a
+:class:`arrays.BooleanArray` rather than a NumPy array (:issue:`29964`).
+
+*pandas 0.25.x*
+
+.. code-block:: python
+
+   >>> a = pd.array([1, 2, None], dtype="Int64")
+   >>> a
+   <IntegerArray>
+   [1, 2, NaN]
+   Length: 3, dtype: Int64
+
+   >>> a > 1
+   array([False, True, False])
+
+*pandas 1.0.0*
+
+.. ipython:: python
+
+   a = pd.array([1, 2, None], dtype="Int64")
+   a > 1
+
+Note that missing values now propagate, rather than always comparing unequal
+like :attr:`numpy.nan`. See :ref:`missing_data.NA` for more.
+
 By default :meth:`Categorical.min` now returns the minimum instead of np.nan
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

@@ -728,6 +786,7 @@ Datetimelike
 - Bug in :class:`DatetimeIndex` addition when adding a non-optimized :class:`DateOffset` incorrectly dropping timezone information (:issue:`30336`)
 - Bug in :meth:`DataFrame.drop` where attempting to drop non-existent values from a DatetimeIndex would yield a confusing error message (:issue:`30399`)
 - Bug in :meth:`DataFrame.append` would remove the timezone-awareness of new data (:issue:`30238`)
+- Bug in :meth:`Series.cummin` and :meth:`Series.cummax` with timezone-aware dtype incorrectly dropping its timezone (:issue:`15553`)
 - Bug in :class:`DatetimeArray`, :class:`TimedeltaArray`, and :class:`PeriodArray` where inplace addition and subtraction did not actually operate inplace (:issue:`24115`)

 Timedelta
@@ -757,6 +816,7 @@ Numeric
 - Bug in :class:`NumericIndex` construction that caused :class:`UInt64Index` to be casted to :class:`Float64Index` when integers in the ``np.uint64`` range were used to index a :class:`DataFrame` (:issue:`28279`)
 - Bug in :meth:`Series.interpolate` when using method=`index` with an unsorted index, would previously return incorrect results. (:issue:`21037`)
 - Bug in :meth:`DataFrame.round` where a :class:`DataFrame` with a :class:`CategoricalIndex` of :class:`IntervalIndex` columns would incorrectly raise a ``TypeError`` (:issue:`30063`)
+- Bug in :class:`DataFrame` cumulative operations (e.g. cumsum, cummax) incorrect casting to object-dtype (:issue:`19296`)

 Conversion
 ^^^^^^^^^^
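
Note on the comparison change: the whatsnew entry says comparisons on an `IntegerArray` now return a `BooleanArray` and that missing values propagate. A minimal sketch of that behaviour, assuming pandas 1.0 or later, might look like this:

```python
import pandas as pd

# Assumes pandas >= 1.0.
a = pd.array([1, 2, None], dtype="Int64")
result = a > 1

# The result is a nullable BooleanArray, not a NumPy bool array.
print(type(result).__name__)
print(result.dtype)

# The missing input stays missing in the output instead of becoming False.
print(result[2] is pd.NA)

# Code that needs a plain boolean mask has to choose how to fill the gap;
# filling with False reproduces the 0.25.x result shown in the whatsnew entry.
mask = result.fillna(False)
print(mask.tolist())
```

Filling with `False` is only one possible policy; whether a missing comparison should count as a match depends on the caller.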

pandas/_config/config.py

Lines changed: 1 addition & 1 deletion
@@ -197,7 +197,7 @@ def __setattr__(self, key, val):
         else:
             raise OptionError("You can only set the value of existing options")

-    def __getattr__(self, key):
+    def __getattr__(self, key: str):
         prefix = object.__getattribute__(self, "prefix")
         if prefix:
             prefix += "."

pandas/compat/numpy/function.py

Lines changed: 0 additions & 7 deletions
@@ -169,13 +169,6 @@ def validate_clip_with_axis(axis, args, kwargs):
     return axis


-COMPRESS_DEFAULTS: "OrderedDict[str, Any]" = OrderedDict()
-COMPRESS_DEFAULTS["axis"] = None
-COMPRESS_DEFAULTS["out"] = None
-validate_compress = CompatValidator(
-    COMPRESS_DEFAULTS, fname="compress", method="both", max_fname_arg_count=1
-)
-
 CUM_FUNC_DEFAULTS: "OrderedDict[str, Any]" = OrderedDict()
 CUM_FUNC_DEFAULTS["dtype"] = None
 CUM_FUNC_DEFAULTS["out"] = None

pandas/core/arrays/boolean.py

Lines changed: 5 additions & 3 deletions
@@ -524,7 +524,7 @@ def astype(self, dtype, copy=True):
             na_value = np.nan
         # coerce
         data = self._coerce_to_ndarray(na_value=na_value)
-        return astype_nansafe(data, dtype, copy=None)
+        return astype_nansafe(data, dtype, copy=False)

     def value_counts(self, dropna=True):
         """
@@ -730,7 +730,6 @@ def all(self, skipna: bool = True, **kwargs):
     @classmethod
     def _create_logical_method(cls, op):
         def logical_method(self, other):
-
             if isinstance(other, (ABCDataFrame, ABCSeries, ABCIndexClass)):
                 # Rely on pandas to unbox and dispatch to us.
                 return NotImplemented
@@ -777,8 +776,11 @@ def logical_method(self, other):
     @classmethod
     def _create_comparison_method(cls, op):
         def cmp_method(self, other):
+            from pandas.arrays import IntegerArray

-            if isinstance(other, (ABCDataFrame, ABCSeries, ABCIndexClass, IntegerArray)):
+            if isinstance(
+                other, (ABCDataFrame, ABCSeries, ABCIndexClass, IntegerArray)
+            ):
                 # Rely on pandas to unbox and dispatch to us.
                 return NotImplemented
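
Note on the `return NotImplemented` change in `cmp_method`: returning `NotImplemented` from a comparison method tells Python to try the reflected method on the other operand, which is how the diff lets `IntegerArray` handle a comparison that starts on a `BooleanArray`. The sketch below shows the mechanism in plain Python. `Plain` and `Wrapper` are hypothetical stand-ins, not pandas classes.

```python
class Wrapper:
    """Hypothetical container that knows how to compare itself to Plain."""

    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        # Unbox a Plain operand, then compare element by element.
        if isinstance(other, Plain):
            other = other.values
        return [v == o for v, o in zip(self.values, other)]


class Plain:
    """Hypothetical container that defers to Wrapper for mixed comparisons."""

    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        if isinstance(other, Wrapper):
            # Defer: Python falls back to the reflected Wrapper.__eq__.
            return NotImplemented
        if isinstance(other, Plain):
            return self.values == other.values
        return NotImplemented


# Plain.__eq__ declines, so Wrapper.__eq__ produces the result.
result = Plain([1, 2]) == Wrapper([1, 3])
print(result)
```

Deferring this way keeps the elementwise comparison logic in one class instead of duplicating it on both sides.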
