Commit 8d20cb8

Merge branch 'master' into ndarray_tolerance
2 parents d6ec1d6 + 062f6f1 commit 8d20cb8

36 files changed: 520 additions and 307 deletions

MANIFEST.in

Lines changed: 1 addition & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -3,6 +3,7 @@ include LICENSE
33
include RELEASE.md
44
include README.rst
55
include setup.py
6+
include pyproject.toml
67

78
graft doc
89
prune doc/build

doc/source/overview.rst

Lines changed: 38 additions & 33 deletions
@@ -6,7 +6,11 @@
66
Package overview
77
****************
88

9-
:mod:`pandas` consists of the following things
9+
:mod:`pandas` is an open source, BSD-licensed library providing high-performance,
10+
easy-to-use data structures and data analysis tools for the `Python <https://www.python.org/>`__
11+
programming language.
12+
13+
:mod:`pandas` consists of the following elements
1014

1115
* A set of labeled array data structures, the primary of which are
1216
Series and DataFrame
@@ -21,27 +25,23 @@ Package overview
2125
* Memory-efficient "sparse" versions of the standard data structures for storing
2226
data that is mostly missing or mostly constant (some fixed value)
2327
* Moving window statistics (rolling mean, rolling standard deviation, etc.)
24-
* Static and moving window linear and `panel regression
25-
<http://en.wikipedia.org/wiki/Panel_data>`__
2628

27-
Data structures at a glance
28-
---------------------------
29+
Data Structures
30+
---------------
2931

3032
.. csv-table::
3133
:header: "Dimensions", "Name", "Description"
3234
:widths: 15, 20, 50
3335

34-
1, Series, "1D labeled homogeneously-typed array"
35-
2, DataFrame, "General 2D labeled, size-mutable tabular structure with
36-
potentially heterogeneously-typed columns"
37-
3, Panel, "General 3D labeled, also size-mutable array"
36+
1, "Series", "1D labeled homogeneously-typed array"
37+
2, "DataFrame", "General 2D labeled, size-mutable tabular structure with potentially heterogeneously-typed column"
3838

39-
Why more than 1 data structure?
40-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
39+
Why more than one data structure?
40+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
4141

4242
The best way to think about the pandas data structures is as flexible
4343
containers for lower dimensional data. For example, DataFrame is a container
44-
for Series, and Panel is a container for DataFrame objects. We would like to be
44+
for Series, and Series is a container for scalars. We would like to be
4545
able to insert and remove objects from these containers in a dictionary-like
4646
fashion.
4747

@@ -85,36 +85,41 @@ The first stop for pandas issues and ideas is the `Github Issue Tracker
8585
pandas community experts can answer through `Stack Overflow
8686
<http://stackoverflow.com/questions/tagged/pandas>`__.
8787

88-
Longer discussions occur on the `developer mailing list
89-
<http://groups.google.com/group/pystatsmodels>`__, and commercial support
90-
inquiries for Lambda Foundry should be sent to: support@lambdafoundry.com
88+
Community
89+
---------
9190

92-
Credits
93-
-------
91+
pandas is actively supported today by a community of like-minded individuals around
92+
the world who contribute their valuable time and energy to help make open source
93+
pandas possible. Thanks to `all of our contributors <https://github.com/pandas-dev/pandas/graphs/contributors>`__.
94+
95+
If you're interested in contributing, please
96+
visit `Contributing to pandas webpage <http://pandas.pydata.org/pandas-docs/stable/contributing.html>`__.
9497

95-
pandas development began at `AQR Capital Management <http://www.aqr.com>`__ in
96-
April 2008. It was open-sourced at the end of 2009. AQR continued to provide
97-
resources for development through the end of 2011, and continues to contribute
98-
bug reports today.
98+
pandas is a `NUMFocus <https://www.numfocus.org/open-source-projects/>`__ sponsored project.
99+
This will help ensure the success of development of pandas as a world-class open-source
100+
project, and makes it possible to `donate <https://pandas.pydata.org/donate.html>`__ to the project.
99101

100-
Since January 2012, `Lambda Foundry <http://www.lambdafoundry.com>`__, has
101-
been providing development resources, as well as commercial support,
102-
training, and consulting for pandas.
102+
Project Governance
103+
------------------
103104

104-
pandas is only made possible by a group of people around the world like you
105-
who have contributed new code, bug reports, fixes, comments and ideas. A
106-
complete list can be found `on Github <http://www.github.com/pandas-dev/pandas/contributors>`__.
105+
The governance process that pandas project has used informally since its inception in 2008 is formalized in `Project Governance documents <https://github.com/pandas-dev/pandas-governance>`__ .
106+
The documents clarify how decisions are made and how the various elements of our community interact, including the relationship between open source collaborative development and work that may be funded by for-profit or non-profit entities.
107+
108+
Wes McKinney is the Benevolent Dictator for Life (BDFL).
107109

108110
Development Team
109-
----------------
111+
-----------------
112+
113+
The list of the Core Team members and more detailed information can be found on the `people’s page <https://github.com/pandas-dev/pandas-governance/blob/master/people.md>`__ of the governance repo.
114+
110115

111-
pandas is a part of the PyData project. The PyData Development Team is a
112-
collection of developers focused on the improvement of Python's data
113-
libraries. The core team that coordinates development can be found on `Github
114-
<http://github.com/pydata>`__. If you're interested in contributing, please
115-
visit the `project website <http://pandas.pydata.org>`__.
116+
Institutional Partners
117+
----------------------
118+
119+
The information about current institutional partners can be found on `pandas website page <https://pandas.pydata.org/about.html>`__
116120

117121
License
118122
-------
119123

120124
.. literalinclude:: ../../LICENSE
125+

doc/source/whatsnew/v0.21.0.txt

Lines changed: 28 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -112,6 +112,7 @@ Other Enhancements
112112
^^^^^^^^^^^^^^^^^^
113113

114114
- The ``validate`` argument for :func:`merge` function now checks whether a merge is one-to-one, one-to-many, many-to-one, or many-to-many. If a merge is found to not be an example of specified merge type, an exception of type ``MergeError`` will be raised. For more, see :ref:`here <merging.validation>` (:issue:`16270`)
115+
- Added support for `PEP 518 <https://www.python.org/dev/peps/pep-0518/>`_ to the build system (:issue:`16745`)
115116
- :func:`Series.to_dict` and :func:`DataFrame.to_dict` now support an ``into`` keyword which allows you to specify the ``collections.Mapping`` subclass that you would like returned. The default is ``dict``, which is backwards compatible. (:issue:`16122`)
116117
- :func:`RangeIndex.append` now returns a ``RangeIndex`` object when possible (:issue:`16212`)
117118
- :func:`Series.rename_axis` and :func:`DataFrame.rename_axis` with ``inplace=True`` now return ``None`` while renaming the axis inplace. (:issue:`15704`)
@@ -272,6 +273,30 @@ named ``.isna()`` and ``.notna()``, these are included for classes ``Categorical
272273

273274
The configuration option ``pd.options.mode.use_inf_as_null`` is deprecated, and ``pd.options.mode.use_inf_as_na`` is added as a replacement.
274275

276+
.. _whatsnew_210.api.multiindex_single:
277+
278+
MultiIndex Constructor with a Single Level
279+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
280+
281+
The ``MultiIndex`` constructors no longer squeeze a MultiIndex with all
282+
length-one levels down to a regular ``Index``. This affects all the
283+
``MultiIndex`` constructors. (:issue:`17178`)
284+
285+
Previous behavior:
286+
287+
.. code-block:: ipython
288+
289+
In [2]: pd.MultiIndex.from_tuples([('a',), ('b',)])
290+
Out[2]: Index(['a', 'b'], dtype='object')
291+
292+
Length 1 levels are no longer special-cased. They behave exactly as if you had
293+
length 2+ levels, so a :class:`MultiIndex` is always returned from all of the
294+
``MultiIndex`` constructors:
295+
296+
.. ipython:: python
297+
298+
pd.MultiIndex.from_tuples([('a',), ('b',)])
299+
275300
.. _whatsnew_0210.api:
276301

277302
Other API Changes
@@ -357,6 +382,7 @@ Indexing
357382
- Allow unicode empty strings as placeholders in multilevel columns in Python 2 (:issue:`17099`)
358383
- Bug in ``.iloc`` when used with inplace addition or assignment and an int indexer on a ``MultiIndex`` causing the wrong indexes to be read from and written to (:issue:`17148`)
359384
- Bug in ``.isin()`` in which checking membership in empty ``Series`` objects raised an error (:issue:`16991`)
385+
- Bug in ``CategoricalIndex`` reindexing in which specified indices containing duplicates were not being respected (:issue:`17323`)
360386

361387
I/O
362388
^^^
@@ -404,6 +430,7 @@ Reshaping
404430
- Bug in :func:`crosstab` where passing two ``Series`` with the same name raised a ``KeyError`` (:issue:`13279`)
405431
- :func:`Series.argmin`, :func:`Series.argmax`, and their counterparts on ``DataFrame`` and groupby objects work correctly with floating point data that contains infinite values (:issue:`13595`).
406432
- Bug in :func:`unique` where checking a tuple of strings raised a ``TypeError`` (:issue:`17108`)
433+
- Bug in :func:`concat` where order of result index was unpredictable if it contained non-comparable elements (:issue:`17344`)
407434

408435
Numeric
409436
^^^^^^^
@@ -422,3 +449,4 @@ Categorical
422449
Other
423450
^^^^^
424451
- Bug in :func:`eval` where the ``inplace`` parameter was being incorrectly handled (:issue:`16732`)
452+
- Several ``NaT`` method docstrings (e.g. :func:`NaT.ctime`) were incorrect (:issue:`17327`)

pandas/_libs/src/ujson/python/objToJSON.c

Lines changed: 7 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -329,7 +329,7 @@ static Py_ssize_t get_attr_length(PyObject *obj, char *attr) {
329329
return ret;
330330
}
331331

332-
npy_int64 get_long_attr(PyObject *o, const char *attr) {
332+
static npy_int64 get_long_attr(PyObject *o, const char *attr) {
333333
npy_int64 long_val;
334334
PyObject *value = PyObject_GetAttrString(o, attr);
335335
long_val = (PyLong_Check(value) ?
@@ -338,15 +338,12 @@ npy_int64 get_long_attr(PyObject *o, const char *attr) {
338338
return long_val;
339339
}
340340

341-
npy_float64 total_seconds(PyObject *td) {
342-
// Python 2.6 compat
343-
// TODO(anyone): remove this legacy workaround with a more
344-
// direct td.total_seconds()
345-
npy_int64 microseconds = get_long_attr(td, "microseconds");
346-
npy_int64 seconds = get_long_attr(td, "seconds");
347-
npy_int64 days = get_long_attr(td, "days");
348-
npy_int64 days_in_seconds = days * 24LL * 3600LL;
349-
return (microseconds + (seconds + days_in_seconds) * 1000000.0) / 1000000.0;
341+
static npy_float64 total_seconds(PyObject *td) {
342+
npy_float64 double_val;
343+
PyObject *value = PyObject_CallMethod(td, "total_seconds", NULL);
344+
double_val = PyFloat_AS_DOUBLE(value);
345+
Py_DECREF(value);
346+
return double_val;
350347
}
351348

352349
static PyObject *get_item(PyObject *obj, Py_ssize_t i) {
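
The removed helper rebuilt the duration from the `days`, `seconds` and `microseconds` attributes as a Python 2.6 workaround; the replacement calls `total_seconds()` on the timedelta directly. As a rough illustration in Python rather than C (the name `legacy_total_seconds` is made up for this sketch), the two computations agree:

```python
from datetime import timedelta


def legacy_total_seconds(td):
    """Mirror the arithmetic of the removed C helper (hypothetical name)."""
    days_in_seconds = td.days * 24 * 3600
    total_micro = td.microseconds + (td.seconds + days_in_seconds) * 1000000.0
    return total_micro / 1000000.0


td = timedelta(days=1, seconds=2, microseconds=3)

# Both routes should give the same duration in seconds.
legacy = legacy_total_seconds(td)
direct = td.total_seconds()
print(legacy, direct)
```

The direct call is shorter and leaves the arithmetic to the standard library.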

pandas/_libs/tslib.pyx

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,4 @@
1+
# -*- coding: utf-8 -*-
12
# cython: profile=False
23

34
import warnings
@@ -3922,7 +3923,7 @@ for _method_name in _nat_methods:
39223923
def f(*args, **kwargs):
39233924
return NaT
39243925
f.__name__ = func_name
3925-
f.__doc__ = _get_docstring(_method_name)
3926+
f.__doc__ = _get_docstring(func_name)
39263927
return f
39273928

39283929
setattr(NaTType, _method_name, _make_nat_func(_method_name))
@@ -3934,7 +3935,7 @@ for _method_name in _nan_methods:
39343935
def f(*args, **kwargs):
39353936
return np.nan
39363937
f.__name__ = func_name
3937-
f.__doc__ = _get_docstring(_method_name)
3938+
f.__doc__ = _get_docstring(func_name)
39383939
return f
39393940

39403941
setattr(NaTType, _method_name, _make_nan_func(_method_name))
@@ -3952,7 +3953,7 @@ for _maybe_method_name in dir(NaTType):
39523953
def f(*args, **kwargs):
39533954
raise ValueError("NaTType does not support " + func_name)
39543955
f.__name__ = func_name
3955-
f.__doc__ = _get_docstring(_method_name)
3956+
f.__doc__ = _get_docstring(func_name)
39563957
return f
39573958

39583959
setattr(NaTType, _maybe_method_name,
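
The three one-line changes above make each factory look up the docstring by its own `func_name` argument instead of the loop variable `_method_name`. A minimal sketch of why that matters, assuming the surrounding code has roughly this shape (the `docs` dict and both factory names are invented for illustration; the real code calls `_get_docstring`):

```python
# Hypothetical stand-in for _get_docstring: docstrings keyed by method name.
docs = {"ctime": "doc for ctime", "weekday": "doc for weekday"}

# An earlier loop leaves its loop variable behind in the enclosing scope.
for _method_name in ["weekday"]:
    pass


def make_stub_stale(func_name):
    def f(*args, **kwargs):
        return None
    f.__name__ = func_name
    # Reads the leftover loop variable, not the name being wrapped.
    f.__doc__ = docs[_method_name]
    return f


def make_stub_fixed(func_name):
    def f(*args, **kwargs):
        return None
    f.__name__ = func_name
    # Uses the argument, so name and docstring always match.
    f.__doc__ = docs[func_name]
    return f


stale = make_stub_stale("ctime")
fixed = make_stub_fixed("ctime")
print(stale.__doc__)  # doc for weekday
print(fixed.__doc__)  # doc for ctime
```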

pandas/core/accessor.py

Lines changed: 35 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,35 @@
1+
# -*- coding: utf-8 -*-
2+
"""
3+
4+
accessor.py contains base classes for implementing accessor properties
5+
that can be mixed into or pinned onto other pandas classes.
6+
7+
"""
8+
9+
10+
class DirNamesMixin(object):
11+
_accessors = frozenset([])
12+
13+
def _dir_deletions(self):
14+
""" delete unwanted __dir__ for this object """
15+
return self._accessors
16+
17+
def _dir_additions(self):
18+
""" add addtional __dir__ for this object """
19+
rv = set()
20+
for accessor in self._accessors:
21+
try:
22+
getattr(self, accessor)
23+
rv.add(accessor)
24+
except AttributeError:
25+
pass
26+
return rv
27+
28+
def __dir__(self):
29+
"""
30+
Provide method name lookup and completion
31+
Only provide 'public' methods
32+
"""
33+
rv = set(dir(type(self)))
34+
rv = (rv - self._dir_deletions()) | self._dir_additions()
35+
return sorted(rv)
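
To see what the mixin does to `dir()`, here is a small sketch. The mixin body is copied from the diff above with docstrings trimmed; the `Demo` class and its `text_ops` / `date_ops` accessors are hypothetical and stand in for accessors such as `.str` or `.dt` that are only valid for some objects.

```python
class DirNamesMixin(object):
    _accessors = frozenset([])

    def _dir_deletions(self):
        return self._accessors

    def _dir_additions(self):
        rv = set()
        for accessor in self._accessors:
            try:
                getattr(self, accessor)
                rv.add(accessor)
            except AttributeError:
                pass
        return rv

    def __dir__(self):
        rv = set(dir(type(self)))
        rv = (rv - self._dir_deletions()) | self._dir_additions()
        return sorted(rv)


class Demo(DirNamesMixin):
    # Hypothetical class: accessors listed here are hidden from dir()
    # unless they can actually be retrieved on the instance.
    _accessors = frozenset(['text_ops', 'date_ops'])

    def __init__(self, is_text):
        self._is_text = is_text

    @property
    def text_ops(self):
        if not self._is_text:
            raise AttributeError("only valid for text data")
        return "text accessor"

    @property
    def date_ops(self):
        raise AttributeError("only valid for datetime data")


text_names = dir(Demo(is_text=True))
other_names = dir(Demo(is_text=False))
print('text_ops' in text_names, 'date_ops' in text_names)
print('text_ops' in other_names)
```

Every name in `_accessors` is first removed and then added back only if `getattr` succeeds, so tab completion offers an accessor only where it works.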

pandas/core/base.py

Lines changed: 3 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -19,6 +19,7 @@
1919
from pandas.util._decorators import (Appender, cache_readonly,
2020
deprecate_kwarg, Substitution)
2121
from pandas.core.common import AbstractMethodError
22+
from pandas.core.accessor import DirNamesMixin
2223

2324
_shared_docs = dict()
2425
_indexops_doc_kwargs = dict(klass='IndexOpsMixin', inplace='',
@@ -73,7 +74,7 @@ def __repr__(self):
7374
return str(self)
7475

7576

76-
class PandasObject(StringMixin):
77+
class PandasObject(StringMixin, DirNamesMixin):
7778

7879
"""baseclass for various pandas objects"""
7980

@@ -92,23 +93,6 @@ def __unicode__(self):
9293
# Should be overwritten by base classes
9394
return object.__repr__(self)
9495

95-
def _dir_additions(self):
96-
""" add addtional __dir__ for this object """
97-
return set()
98-
99-
def _dir_deletions(self):
100-
""" delete unwanted __dir__ for this object """
101-
return set()
102-
103-
def __dir__(self):
104-
"""
105-
Provide method name lookup and completion
106-
Only provide 'public' methods
107-
"""
108-
rv = set(dir(type(self)))
109-
rv = (rv - self._dir_deletions()) | self._dir_additions()
110-
return sorted(rv)
111-
11296
def _reset_cache(self, key=None):
11397
"""
11498
Reset cached properties. If ``key`` is passed, only clears that key.
@@ -141,7 +125,7 @@ class NoNewAttributesMixin(object):
141125
142126
Prevents additional attributes via xxx.attribute = "something" after a
143127
call to `self.__freeze()`. Mainly used to prevent the user from using
144-
wrong attrirbutes on a accessor (`Series.cat/.str/.dt`).
128+
wrong attributes on a accessor (`Series.cat/.str/.dt`).
145129
146130
If you really want to add a new attribute at a later time, you need to use
147131
`object.__setattr__(self, key, value)`.

pandas/core/common.py

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -629,3 +629,17 @@ def _random_state(state=None):
629629
else:
630630
raise ValueError("random_state must be an integer, a numpy "
631631
"RandomState, or None")
632+
633+
634+
def _get_distinct_objs(objs):
635+
"""
636+
Return a list with distinct elements of "objs" (different ids).
637+
Preserves order.
638+
"""
639+
ids = set()
640+
res = []
641+
for obj in objs:
642+
if not id(obj) in ids:
643+
ids.add(id(obj))
644+
res.append(obj)
645+
return res
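
Note that the new helper deduplicates by identity (`id`), not by equality. A short sketch of the difference, using a standalone copy of the function under the hypothetical name `get_distinct_objs`:

```python
def get_distinct_objs(objs):
    """Standalone copy of the helper above (name without the underscore)."""
    ids = set()
    res = []
    for obj in objs:
        if id(obj) not in ids:
            ids.add(id(obj))
            res.append(obj)
    return res


a = [1, 2]
b = [1, 2]  # equal to a, but a different object

# The repeated reference to a is dropped; b is kept although a == b.
result = get_distinct_objs([a, b, a])
print(len(result))
```

Working on identity also means the helper accepts unhashable objects, which a plain `set(objs)` would reject.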

pandas/core/frame.py

Lines changed: 6 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -67,7 +67,8 @@
6767
_dict_compat,
6868
standardize_mapping)
6969
from pandas.core.generic import NDFrame, _shared_docs
70-
from pandas.core.index import Index, MultiIndex, _ensure_index
70+
from pandas.core.index import (Index, MultiIndex, _ensure_index,
71+
_ensure_index_from_sequences)
7172
from pandas.core.indexing import (maybe_droplevels, convert_to_index_sliceable,
7273
check_bool_indexer)
7374
from pandas.core.internals import (BlockManager,
@@ -1155,9 +1156,9 @@ def from_records(cls, data, index=None, exclude=None, columns=None,
11551156
else:
11561157
try:
11571158
to_remove = [arr_columns.get_loc(field) for field in index]
1158-
1159-
result_index = MultiIndex.from_arrays(
1160-
[arrays[i] for i in to_remove], names=index)
1159+
index_data = [arrays[i] for i in to_remove]
1160+
result_index = _ensure_index_from_sequences(index_data,
1161+
names=index)
11611162

11621163
exclude.update(index)
11631164
except Exception:
@@ -3000,7 +3001,7 @@ def set_index(self, keys, drop=True, append=False, inplace=False,
30003001
to_remove.append(col)
30013002
arrays.append(level)
30023003

3003-
index = MultiIndex.from_arrays(arrays, names=names)
3004+
index = _ensure_index_from_sequences(arrays, names)
30043005

30053006
if verify_integrity and not index.is_unique:
30063007
duplicates = index.get_duplicates()
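
The body of `_ensure_index_from_sequences` is not part of this diff. Given the whatsnew entry about `MultiIndex` constructors no longer squeezing single-level results, it presumably returns a plain `Index` for one sequence and a `MultiIndex` otherwise. The sketch below is an assumption built on public pandas API only; the function name is hypothetical and this is not the library's implementation.

```python
import pandas as pd


def ensure_index_from_sequences_sketch(sequences, names=None):
    """Hypothetical stand-in for the private helper used in the diff."""
    if len(sequences) == 1:
        # One sequence: build a flat Index so callers such as set_index
        # keep getting a regular index for a single key.
        name = names[0] if names is not None else None
        return pd.Index(sequences[0], name=name)
    # Several sequences: one level per sequence.
    return pd.MultiIndex.from_arrays(sequences, names=names)


single = ensure_index_from_sequences_sketch([[1, 2, 3]], names=["a"])
multi = ensure_index_from_sequences_sketch([[1, 2], ["x", "y"]],
                                           names=["a", "b"])
print(type(single).__name__, type(multi).__name__)
```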

pandas/core/generic.py

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -192,8 +192,9 @@ def __unicode__(self):
192192

193193
def _dir_additions(self):
194194
""" add the string-like attributes from the info_axis """
195-
return set([c for c in self._info_axis
196-
if isinstance(c, string_types) and isidentifier(c)])
195+
additions = set([c for c in self._info_axis
196+
if isinstance(c, string_types) and isidentifier(c)])
197+
return super(NDFrame, self)._dir_additions().union(additions)
197198

198199
@property
199200
def _constructor_sliced(self):
