Description
Code Sample, a copy-pastable example if possible
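df.to_csv('uncompressed.csv')  # df is any existing DataFrame; plain CSV, default line_terminator
df.to_csv('compressed-wrong-line-terminator.csv.gz')  # gzip inferred from extension, default line_terminator
df.to_csv('compressed-good-line-terminator.csv.gz', line_terminator='\n')  # gzip, explicit line_terminator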
Problem description
The effective default line_terminator differs between compressed and uncompressed output (Windows OS, pandas 0.24.1).
After decompressing the gzip file written with the default line_terminator, the result differs from the uncompressed file (compressed-wrong-line-terminator.csv vs. uncompressed.csv). Only with an explicit line_terminator='\n' is the decompressed file identical to the uncompressed one (compressed-good-line-terminator.csv.gz vs. uncompressed.csv).
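The comparison described above can be reproduced without decompressing by hand. The following is a minimal sketch using only the standard library; the helper names read_raw, same_content and count_terminators are made up for illustration and are not part of pandas.

```python
import gzip
from pathlib import Path


def read_raw(path):
    """Return the raw bytes of a file, decompressing it if it ends in .gz."""
    path = Path(path)
    if path.suffix == ".gz":
        with gzip.open(path, "rb") as fh:
            return fh.read()
    return path.read_bytes()


def same_content(plain_path, gz_path):
    """True if the plain file and the decompressed gzip file are byte-identical."""
    return read_raw(plain_path) == read_raw(gz_path)


def count_terminators(data):
    """Count CRLF pairs, bare LF and bare CR bytes in raw data."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,
        "lf": data.count(b"\n") - crlf,   # LF not preceded by CR
        "cr": data.count(b"\r") - crlf,   # CR not followed by LF
    }
```

With the files from the code sample, same_content('uncompressed.csv', 'compressed-wrong-line-terminator.csv.gz') is the check that fails in this report, and count_terminators(read_raw(...)) on each file shows which line endings were actually written.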
Note that passing line_terminator='\n' explicitly for a non-compressed file produces output that differs from the file written without it. The user is therefore forced to specify line_terminator explicitly, and only for compressed files.
This behavior is problematic, especially in the latest pandas version, where compression is inferred from the file extension: one would expect the line_terminator to be handled consistently under that inference as well.
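Until the defaults agree, the workaround implied above (an explicit line_terminator only for compressed output) could be wrapped in a small helper. This is only a sketch: to_csv_kwargs is a hypothetical name, and restricting it to the .gz suffix is an assumption, since gzip is the only compression tested in this report.

```python
from pathlib import Path

# Assumption: only gzip output was tested in this report, so only ".gz" is listed.
COMPRESSED_SUFFIXES = frozenset({".gz"})


def to_csv_kwargs(path, compressed_suffixes=COMPRESSED_SUFFIXES):
    """Return extra keyword arguments for DataFrame.to_csv for this path.

    Hypothetical helper, not part of pandas: it adds an explicit
    line_terminator only when the extension implies compressed output.
    """
    if Path(path).suffix.lower() in compressed_suffixes:
        return {"line_terminator": "\n"}
    return {}
```

Usage would be df.to_csv(path, **to_csv_kwargs(path)) on the setup described here (Windows, pandas 0.24.1). Later pandas releases spell this keyword lineterminator, so the returned key would need adjusting there.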
Expected Output
As stated above, the command on line 2 of the code sample (compressed-wrong-line-terminator.csv.gz), once its output is decompressed, is expected to produce the same file as the command on line 1 (uncompressed.csv).
However, only the command on line 3 (compressed-good-line-terminator.csv.gz, with line_terminator='\n'), once its output is decompressed, produces the same file as the command on line 1.
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.24.1
pytest: None
pip: 19.0.1
setuptools: 40.4.3
Cython: None
numpy: 1.15.2
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: None
patsy: 0.5.0
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 3.0.0
openpyxl: 2.5.12
xlrd: 1.2.0
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: 1.0.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None