Description
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
import pandas as pd
df = pd.DataFrame({'numbers': list(range(1, 10))})
df.to_json('gcs://test-bucket/test.json.gz')
Problem description
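DataFrame.to_json raises a TypeError when writing a gzip-compressed file to a gcs:// path. Writing to the same bucket without compression (a path without the .gz suffix) works fine.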
import pandas as pd...
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/workspaces/charybdis/test_pg.py in <module>
254
255 df = pd.DataFrame({'numbers': list(range(1, 10))})
----> 256 df.to_json('gcs://river-categorizer-data-us-central1/test.jsonl.gz')
/workspaces/charybdis/.venv/lib/python3.8/site-packages/pandas/core/generic.py in to_json(self, path_or_buf, orient, date_format, double_precision, force_ascii, date_unit, default_handler, lines, compression, index, indent, storage_options)
2463 indent = indent or 0
2464
-> 2465 return json.to_json(
2466 path_or_buf=path_or_buf,
2467 obj=self,
/workspaces/charybdis/.venv/lib/python3.8/site-packages/pandas/io/json/_json.py in to_json(path_or_buf, obj, orient, date_format, double_precision, force_ascii, date_unit, default_handler, lines, compression, index, indent, storage_options)
100 if path_or_buf is not None:
101 # apply compression and byte/text conversion
--> 102 with get_handle(
103 path_or_buf, "wt", compression=compression, storage_options=storage_options
104 ) as handles:
/workspaces/charybdis/.venv/lib/python3.8/site-packages/pandas/io/common.py in get_handle(path_or_buf, mode, encoding, compression, memory_map, is_text, errors, storage_options)
590 )
591 else:
--> 592 handle = gzip.GzipFile(
593 fileobj=handle, # type: ignore[arg-type]
594 mode=ioargs.mode,
/usr/local/lib/python3.8/gzip.py in __init__(self, filename, mode, compresslevel, fileobj, mtime)
202
203 if self.mode == WRITE:
--> 204 self._write_gzip_header(compresslevel)
205
206 @property
/usr/local/lib/python3.8/gzip.py in _write_gzip_header(self, compresslevel)
230
231 def _write_gzip_header(self, compresslevel):
--> 232 self.fileobj.write(b'\037\213') # magic header
233 self.fileobj.write(b'\010') # compression method
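The traceback shows get_handle being called with mode "wt" and the resulting handle being passed to gzip.GzipFile, which fails as soon as it writes the gzip magic bytes. One possible explanation, stated here as an assumption rather than a confirmed diagnosis, is that the remote file was opened in text mode, so it rejects bytes. The sketch below reproduces that mechanism with the standard library only. io.StringIO stands in for a text-mode handle, and gzip_header_error is a hypothetical helper name used for illustration.

```python
import gzip
import io


def gzip_header_error(fileobj):
    """Try to open a GzipFile for writing on top of fileobj.

    Returns the TypeError raised while the gzip header is written,
    or None if the handle accepted the header bytes.
    """
    try:
        # GzipFile writes the gzip header (bytes) to fileobj during __init__.
        gzip.GzipFile(fileobj=fileobj, mode="wb").close()
    except TypeError as exc:
        return exc
    return None


# A text-mode handle only accepts str, so the header write fails.
text_error = gzip_header_error(io.StringIO())

# A binary handle accepts the header bytes, so no error is raised.
binary_error = gzip_header_error(io.BytesIO())

print(type(text_error).__name__, binary_error)
```

If this is indeed what happens for gcs:// paths, opening the underlying handle in binary mode before wrapping it in GzipFile would avoid the error.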
Expected Output
No error: the gzip-compressed JSON file is written to the bucket.
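Until this is fixed, a possible workaround is to compress the JSON text in memory and write the resulting bytes through a handle opened in binary mode. The sketch below is a minimal illustration under that assumption. write_gzipped_text is a hypothetical helper, io.BytesIO stands in for the remote file, and the JSON string is a stand-in for the output of df.to_json() called without a path.

```python
import gzip
import io


def write_gzipped_text(text, binary_handle, encoding="utf-8"):
    """Gzip-compress text in memory and write the bytes to a binary handle.

    Returns the number of compressed bytes written.
    """
    payload = gzip.compress(text.encode(encoding))
    binary_handle.write(payload)
    return len(payload)


# Stand-in for df.to_json(), which returns a str when no path is given.
json_text = '{"numbers":{"0":1,"1":2,"2":3}}'

# Stand-in for a remote file opened in binary write mode.
buffer = io.BytesIO()
write_gzipped_text(json_text, buffer)

# Decompress again to check that the content survives the round trip.
round_trip = gzip.decompress(buffer.getvalue()).decode("utf-8")
```

With the setup from this report, the binary handle could come from fsspec.open('gcs://test-bucket/test.json.gz', 'wb') used as a context manager. That combination has not been tested against a real bucket here.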
Output of pd.show_versions()
pandas : 1.2.2
numpy : 1.20.1
pytz : 2021.1
dateutil : 2.8.1
pip : 20.2.2
setuptools : 49.6.0
Cython : None
pytest : 6.2.2
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.6.2
html5lib : None
pymysql : None
psycopg2 : 2.8.6 (dt dec pq3 ext lo64)
jinja2 : 2.11.3
IPython : 7.20.0
pandas_datareader: None
bs4 : 4.9.3
bottleneck : None
fsspec : 0.8.5
fastparquet : None
gcsfs : 0.7.2
matplotlib : 3.3.4
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 3.0.0
pyxlsb : None
s3fs : None
scipy : 1.6.0
sqlalchemy : 1.3.23
tables : None
tabulate : 0.8.7
xarray : None
xlrd : None
xlwt : None
numba : None