Unable to write JSON to S3 using to_json #28375

Closed
@juls858

Description

Code Sample, a copy-pastable example if possible

S3 paths work for reading and writing CSV, and reading JSON from an S3 path also works. However, writing a JSON file to an S3 path with the to_json method fails:

df.to_json(s3uri, orient='values')

Problem description

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
 in 
      1 final_users['user_id'].to_json(s3_path + 'users.json',
----> 2                                orient='values')

~/anaconda3/envs/qa-tool/lib/python3.7/site-packages/pandas/core/generic.py in to_json(self, path_or_buf, orient, date_format, double_precision, force_ascii, date_unit, default_handler, lines, compression, index)
   2271                             default_handler=default_handler,
   2272                             lines=lines, compression=compression,
-> 2273                             index=index)
   2274 
   2275     def to_hdf(self, path_or_buf, key, **kwargs):

~/anaconda3/envs/qa-tool/lib/python3.7/site-packages/pandas/io/json/json.py in to_json(path_or_buf, obj, orient, date_format, double_precision, force_ascii, date_unit, default_handler, lines, compression, index)
     66 
     67     if isinstance(path_or_buf, compat.string_types):
---> 68         fh, handles = _get_handle(path_or_buf, 'w', compression=compression)
     69         try:
     70             fh.write(s)

~/anaconda3/envs/qa-tool/lib/python3.7/site-packages/pandas/io/common.py in _get_handle(path_or_buf, mode, encoding, compression, memory_map, is_text)
    425         elif is_text:
    426             # Python 3 and no explicit encoding
--> 427             f = open(path_or_buf, mode, errors='replace', newline="")
    428         else:
    429             # Python 3 and binary mode

FileNotFoundError: [Errno 2] No such file or directory: 's3://bucket/folder/foo.json'
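
The last frame of the traceback suggests the cause: `_get_handle` passes the S3 URI straight to the built-in `open`, which treats it as a local path. A minimal sketch reproducing that behaviour without pandas or S3 access follows. The helper name `try_local_open` is hypothetical, and the exact exception subclass may vary by platform (on a POSIX system it is expected to be `FileNotFoundError`).

```python
def try_local_open(uri):
    """Open `uri` for writing with the built-in open(), using the same
    arguments as the last traceback frame, and return the error (or None)."""
    try:
        # Same call shape as the open() call shown in pandas/io/common.py above.
        with open(uri, "w", errors="replace", newline=""):
            pass
    except OSError as exc:
        # FileNotFoundError is a subclass of OSError.
        return exc
    return None


err = try_local_open("s3://bucket/folder/foo.json")
print(type(err).__name__, err)
```

Nothing in this call is S3-aware, so the string is resolved relative to the current working directory, where no `s3:` directory exists.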

Expected Output

None. The expected result is that no error is raised and the file is written to the S3 bucket.
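
A possible workaround in the meantime, sketched under assumptions rather than offered as a fix in pandas itself: call `to_json` without a path so that it returns the JSON as a string, then write that string through `s3fs` directly. The helper name `write_json_via_fs` and its `fs` parameter are hypothetical. When `fs` is omitted, the sketch assumes `s3fs` is installed and AWS credentials are configured.

```python
def write_json_via_fs(obj, path, fs=None, **to_json_kwargs):
    """Serialize `obj` with its to_json() method and write the result through
    a filesystem object instead of handing the S3 URI to to_json().

    `fs` is any object with an open(path, mode) method that returns a
    context-managed, writable binary handle. If omitted, an
    s3fs.S3FileSystem is created (assumes s3fs and AWS credentials).
    """
    if fs is None:
        import s3fs  # imported lazily so the helper can be tried without S3
        fs = s3fs.S3FileSystem()
    # Called without a path, to_json() returns the JSON document as a string.
    payload = obj.to_json(**to_json_kwargs).encode("utf-8")
    with fs.open(path, "wb") as handle:
        handle.write(payload)
    return len(payload)  # number of bytes written


# Usage (bucket and key are placeholders taken from the report):
#   write_json_via_fs(final_users['user_id'],
#                     's3://bucket/folder/users.json',
#                     orient='values')
```

Writing bytes in `"wb"` mode avoids relying on text-mode support in any particular `s3fs` version.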

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None

pandas: 0.24.2
pytest: None
pip: 19.2.2
setuptools: 41.0.1
Cython: None
numpy: 1.16.4
scipy: 1.3.1
pyarrow: None
xarray: None
IPython: 7.7.0
sphinx: None
patsy: 0.5.1
dateutil: 2.8.0
pytz: 2019.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 3.1.0
openpyxl: 2.6.2
xlrd: 1.2.0
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: 0.2.1
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

Metadata

Assignees

No one assigned

    Labels

    IO JSON (read_json, to_json, json_normalize); IO Network (Local or Cloud (AWS, GCS, etc.) IO Issues)