Commit 650ab15

Author: Matt Roeschke
Merge remote-tracking branch 'upstream/master' into calendarday_offset
2 parents 04b35af + 601d71f commit 650ab15

12 files changed: +287 -204 lines changed

doc/source/whatsnew/v0.24.0.txt

Lines changed: 2 additions & 1 deletion
@@ -530,6 +530,7 @@ Deprecations
 - :meth:`MultiIndex.to_hierarchical` is deprecated and will be removed in a future version (:issue:`21613`)
 - :meth:`Series.ptp` is deprecated. Use ``numpy.ptp`` instead (:issue:`21614`)
 - :meth:`Series.compress` is deprecated. Use ``Series[condition]`` instead (:issue:`18262`)
+- The signature of :meth:`Series.to_csv` has been made uniform with that of :meth:`DataFrame.to_csv`: the name of the first argument is now 'path_or_buf', the order of subsequent arguments has changed, and the 'header' argument now defaults to True. (:issue:`19715`)
 - :meth:`Categorical.from_codes` has deprecated providing float values for the ``codes`` argument. (:issue:`21767`)
 - :func:`pandas.read_table` is deprecated. Instead, use :func:`pandas.read_csv` passing ``sep='\t'`` if necessary (:issue:`21948`)

@@ -678,7 +679,7 @@ MultiIndex

 - Removed compatibility for :class:`MultiIndex` pickles prior to version 0.8.0; compatibility with :class:`MultiIndex` pickles from version 0.13 forward is maintained (:issue:`21654`)
 - :meth:`MultiIndex.get_loc_level` (and as a consequence, ``.loc`` on an object indexed by a :class:`MultiIndex`) will now raise a ``KeyError``, rather than returning an empty ``slice``, if asked for a label which is present in the ``levels`` but is unused (:issue:`22221`)
--
+- Fix ``TypeError`` in Python 3 when creating :class:`MultiIndex` in which some levels have mixed types, e.g. when some labels are tuples (:issue:`15457`)

 I/O
 ^^^

pandas/core/arrays/categorical.py

Lines changed: 4 additions & 1 deletion
@@ -2538,7 +2538,10 @@ def _factorize_from_iterable(values):
                                        ordered=values.ordered)
         codes = values.codes
     else:
-        cat = Categorical(values, ordered=True)
+        # The value of ordered is irrelevant since we don't use cat as such,
+        # but only the resulting categories, the order of which is independent
+        # from ordered. Set ordered to False as default. See GH #15457
+        cat = Categorical(values, ordered=False)
         categories = cat.categories
         codes = cat.codes
     return codes, categories
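
The comment added in this hunk states that `ordered` does not influence the categories produced. A minimal check of that claim, assuming a recent pandas and using the public `pd.Categorical` constructor rather than the private `_factorize_from_iterable` helper (the sample values are made up for illustration):

```python
import pandas as pd

# Hypothetical sample values; any sortable iterable would do.
values = ["b", "a", "c", "a"]

ordered_cat = pd.Categorical(values, ordered=True)
unordered_cat = pd.Categorical(values, ordered=False)

# The inferred categories and the integer codes are the same either way;
# only the `ordered` flag on the Categorical object itself differs.
same_categories = list(ordered_cat.categories) == list(unordered_cat.categories)
same_codes = list(ordered_cat.codes) == list(unordered_cat.codes)
print(same_categories, same_codes)
```

Since the helper only returns `codes` and `categories`, the flag never reaches the caller, which is consistent with the switch to `ordered=False` in the diff.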

pandas/core/frame.py

Lines changed: 0 additions & 101 deletions
@@ -1714,107 +1714,6 @@ def to_panel(self):

         return self._constructor_expanddim(new_mgr)

-    def to_csv(self, path_or_buf=None, sep=",", na_rep='', float_format=None,
-               columns=None, header=True, index=True, index_label=None,
-               mode='w', encoding=None, compression='infer', quoting=None,
-               quotechar='"', line_terminator='\n', chunksize=None,
-               tupleize_cols=None, date_format=None, doublequote=True,
-               escapechar=None, decimal='.'):
-        r"""Write DataFrame to a comma-separated values (csv) file
-
-        Parameters
-        ----------
-        path_or_buf : string or file handle, default None
-            File path or object, if None is provided the result is returned as
-            a string.
-        sep : character, default ','
-            Field delimiter for the output file.
-        na_rep : string, default ''
-            Missing data representation
-        float_format : string, default None
-            Format string for floating point numbers
-        columns : sequence, optional
-            Columns to write
-        header : boolean or list of string, default True
-            Write out the column names. If a list of strings is given it is
-            assumed to be aliases for the column names
-        index : boolean, default True
-            Write row names (index)
-        index_label : string or sequence, or False, default None
-            Column label for index column(s) if desired. If None is given, and
-            `header` and `index` are True, then the index names are used. A
-            sequence should be given if the DataFrame uses MultiIndex. If
-            False do not print fields for index names. Use index_label=False
-            for easier importing in R
-        mode : str
-            Python write mode, default 'w'
-        encoding : string, optional
-            A string representing the encoding to use in the output file,
-            defaults to 'ascii' on Python 2 and 'utf-8' on Python 3.
-        compression : {'infer', 'gzip', 'bz2', 'zip', 'xz', None},
-                      default 'infer'
-            If 'infer' and `path_or_buf` is path-like, then detect compression
-            from the following extensions: '.gz', '.bz2', '.zip' or '.xz'
-            (otherwise no compression).
-
-            .. versionchanged:: 0.24.0
-               'infer' option added and set to default
-        line_terminator : string, default ``'\n'``
-            The newline character or character sequence to use in the output
-            file
-        quoting : optional constant from csv module
-            defaults to csv.QUOTE_MINIMAL. If you have set a `float_format`
-            then floats are converted to strings and thus csv.QUOTE_NONNUMERIC
-            will treat them as non-numeric
-        quotechar : string (length 1), default '\"'
-            character used to quote fields
-        doublequote : boolean, default True
-            Control quoting of `quotechar` inside a field
-        escapechar : string (length 1), default None
-            character used to escape `sep` and `quotechar` when appropriate
-        chunksize : int or None
-            rows to write at a time
-        tupleize_cols : boolean, default False
-            .. deprecated:: 0.21.0
-               This argument will be removed and will always write each row
-               of the multi-index as a separate row in the CSV file.
-
-            Write MultiIndex columns as a list of tuples (if True) or in
-            the new, expanded format, where each MultiIndex column is a row
-            in the CSV (if False).
-        date_format : string, default None
-            Format string for datetime objects
-        decimal: string, default '.'
-            Character recognized as decimal separator. E.g. use ',' for
-            European data
-
-        """
-
-        if tupleize_cols is not None:
-            warnings.warn("The 'tupleize_cols' parameter is deprecated and "
-                          "will be removed in a future version",
-                          FutureWarning, stacklevel=2)
-        else:
-            tupleize_cols = False
-
-        from pandas.io.formats.csvs import CSVFormatter
-        formatter = CSVFormatter(self, path_or_buf,
-                                 line_terminator=line_terminator, sep=sep,
-                                 encoding=encoding,
-                                 compression=compression, quoting=quoting,
-                                 na_rep=na_rep, float_format=float_format,
-                                 cols=columns, header=header, index=index,
-                                 index_label=index_label, mode=mode,
-                                 chunksize=chunksize, quotechar=quotechar,
-                                 tupleize_cols=tupleize_cols,
-                                 date_format=date_format,
-                                 doublequote=doublequote,
-                                 escapechar=escapechar, decimal=decimal)
-        formatter.save()
-
-        if path_or_buf is None:
-            return formatter.path_or_buf.getvalue()
-
     @Appender(_shared_docs['to_excel'] % _shared_doc_kwargs)
     def to_excel(self, excel_writer, sheet_name='Sheet1', na_rep='',
                  float_format=None, columns=None, header=True, index=True,

pandas/core/generic.py

Lines changed: 111 additions & 2 deletions
@@ -1952,13 +1952,13 @@ def to_json(self, path_or_buf=None, orient=None, date_format=None,
             * Series

               - default is 'index'
-              - allowed values are: {'split','records','index'}
+              - allowed values are: {'split','records','index','table'}

             * DataFrame

               - default is 'columns'
               - allowed values are:
-                {'split','records','index','columns','values'}
+                {'split','records','index','columns','values','table'}

             * The format of the JSON string

@@ -9271,6 +9271,115 @@ def first_valid_index(self):
     def last_valid_index(self):
         return self._find_valid_index('last')

+    def to_csv(self, path_or_buf=None, sep=",", na_rep='', float_format=None,
+               columns=None, header=True, index=True, index_label=None,
+               mode='w', encoding=None, compression='infer', quoting=None,
+               quotechar='"', line_terminator='\n', chunksize=None,
+               tupleize_cols=None, date_format=None, doublequote=True,
+               escapechar=None, decimal='.'):
+        r"""Write object to a comma-separated values (csv) file
+
+        Parameters
+        ----------
+        path_or_buf : string or file handle, default None
+            File path or object, if None is provided the result is returned as
+            a string.
+            .. versionchanged:: 0.24.0
+               Was previously named "path" for Series.
+        sep : character, default ','
+            Field delimiter for the output file.
+        na_rep : string, default ''
+            Missing data representation
+        float_format : string, default None
+            Format string for floating point numbers
+        columns : sequence, optional
+            Columns to write
+        header : boolean or list of string, default True
+            Write out the column names. If a list of strings is given it is
+            assumed to be aliases for the column names
+            .. versionchanged:: 0.24.0
+               Previously defaulted to False for Series.
+        index : boolean, default True
+            Write row names (index)
+        index_label : string or sequence, or False, default None
+            Column label for index column(s) if desired. If None is given, and
+            `header` and `index` are True, then the index names are used. A
+            sequence should be given if the object uses MultiIndex. If
+            False do not print fields for index names. Use index_label=False
+            for easier importing in R
+        mode : str
+            Python write mode, default 'w'
+        encoding : string, optional
+            A string representing the encoding to use in the output file,
+            defaults to 'ascii' on Python 2 and 'utf-8' on Python 3.
+        compression : {'infer', 'gzip', 'bz2', 'zip', 'xz', None},
+                      default 'infer'
+            If 'infer' and `path_or_buf` is path-like, then detect compression
+            from the following extensions: '.gz', '.bz2', '.zip' or '.xz'
+            (otherwise no compression).
+
+            .. versionchanged:: 0.24.0
+               'infer' option added and set to default
+        line_terminator : string, default ``'\n'``
+            The newline character or character sequence to use in the output
+            file
+        quoting : optional constant from csv module
+            defaults to csv.QUOTE_MINIMAL. If you have set a `float_format`
+            then floats are converted to strings and thus csv.QUOTE_NONNUMERIC
+            will treat them as non-numeric
+        quotechar : string (length 1), default '\"'
+            character used to quote fields
+        doublequote : boolean, default True
+            Control quoting of `quotechar` inside a field
+        escapechar : string (length 1), default None
+            character used to escape `sep` and `quotechar` when appropriate
+        chunksize : int or None
+            rows to write at a time
+        tupleize_cols : boolean, default False
+            .. deprecated:: 0.21.0
+               This argument will be removed and will always write each row
+               of the multi-index as a separate row in the CSV file.
+
+            Write MultiIndex columns as a list of tuples (if True) or in
+            the new, expanded format, where each MultiIndex column is a row
+            in the CSV (if False).
+        date_format : string, default None
+            Format string for datetime objects
+        decimal: string, default '.'
+            Character recognized as decimal separator. E.g. use ',' for
+            European data
+
+        .. versionchanged:: 0.24.0
+           The order of arguments for Series was changed.
+        """
+
+        df = self if isinstance(self, ABCDataFrame) else self.to_frame()
+
+        if tupleize_cols is not None:
+            warnings.warn("The 'tupleize_cols' parameter is deprecated and "
+                          "will be removed in a future version",
+                          FutureWarning, stacklevel=2)
+        else:
+            tupleize_cols = False
+
+        from pandas.io.formats.csvs import CSVFormatter
+        formatter = CSVFormatter(df, path_or_buf,
+                                 line_terminator=line_terminator, sep=sep,
+                                 encoding=encoding,
+                                 compression=compression, quoting=quoting,
+                                 na_rep=na_rep, float_format=float_format,
+                                 cols=columns, header=header, index=index,
+                                 index_label=index_label, mode=mode,
+                                 chunksize=chunksize, quotechar=quotechar,
+                                 tupleize_cols=tupleize_cols,
+                                 date_format=date_format,
+                                 doublequote=doublequote,
+                                 escapechar=escapechar, decimal=decimal)
+        formatter.save()
+
+        if path_or_buf is None:
+            return formatter.path_or_buf.getvalue()
+

 def _doc_parms(cls):
     """Return a tuple of the doc parms."""

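The shared `to_csv` above converts a Series with `to_frame()` and then writes it with the DataFrame defaults, so a Series now gets a header row unless told otherwise. A small sketch of the user-visible effect, assuming a recent pandas release (the Series contents and the name `price` are invented for illustration):

```python
import pandas as pd

# Hypothetical data; the Series name becomes the column header.
s = pd.Series([1.5, 2.5], index=["a", "b"], name="price")

# With path_or_buf left as None, the CSV text is returned as a string.
with_header = s.to_csv()

# header=False reproduces the pre-0.24 Series default of no header row.
without_header = s.to_csv(header=False)

# Going through to_frame() explicitly should give the same text,
# mirroring the `df = ... self.to_frame()` line in the diff.
via_frame = s.to_frame().to_csv()

print(with_header)
print(without_header)
```

Code that relied on the old positional argument order or on the header being omitted would need to pass `header=False` and keyword arguments explicitly.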