-
-
Notifications
You must be signed in to change notification settings - Fork 18.6k
Jsonlines append mode #35832
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Jsonlines append mode #35832
Changes from all commits
61818d2
e5c326f
9007656
e13c84b
6d292f7
4357a69
bb76a38
dc5c1f5
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2063,6 +2063,7 @@ def to_json( | |
index: bool_t = True, | ||
indent: Optional[int] = None, | ||
storage_options: StorageOptions = None, | ||
mode: str = "w", | ||
) -> Optional[str]: | ||
""" | ||
Convert the object to a JSON string. | ||
|
@@ -2156,6 +2157,13 @@ def to_json( | |
|
||
.. versionadded:: 1.2.0 | ||
|
||
mode : str, default 'w' | ||
If 'orient' is 'records' and 'lines' is 'True' enable option to append | ||
mode ('mode' is 'a') to a json file instead of overwriting. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. tests if anything else? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good catch, will do It |
||
Will throw ValueError if incorrect 'orient' and 'lines'. | ||
|
||
.. versionadded:: 1.2.0 | ||
|
||
Returns | ||
------- | ||
None or str | ||
|
@@ -2335,6 +2343,7 @@ def to_json( | |
index=index, | ||
indent=indent, | ||
storage_options=storage_options, | ||
mode=mode, | ||
) | ||
|
||
def to_hdf( | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -50,6 +50,7 @@ def to_json( | |
index: bool = True, | ||
indent: int = 0, | ||
storage_options: StorageOptions = None, | ||
mode: str = "w", | ||
): | ||
|
||
if not index and orient not in ["split", "table"]: | ||
|
@@ -68,6 +69,11 @@ def to_json( | |
if lines and orient != "records": | ||
raise ValueError("'lines' keyword only valid when 'orient' is records") | ||
|
||
if mode == "a" and (not lines or orient != "records"): | ||
raise ValueError( | ||
"'append mode' only valid when 'line' is True and 'orient' is records" | ||
) | ||
|
||
if orient == "table" and isinstance(obj, Series): | ||
obj = obj.to_frame(name=obj.name or "values") | ||
|
||
|
@@ -95,9 +101,19 @@ def to_json( | |
|
||
if lines: | ||
s = convert_to_line_delimits(s) | ||
try: | ||
add_new_line = ( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. umm, what is all this? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ensure that the file exists and is not empty before adding a new line at the end, so if the file is empty the code skips It. The try part is because errors can raise during the comparison (I copied It from pandas repo Itself, later I try to find the line) |
||
mode == "a" | ||
and os.path.exists(path_or_buf) | ||
and os.path.isfile(path_or_buf) | ||
and os.path.getsize(path_or_buf) | ||
) | ||
s = "\n" + s if add_new_line else s | ||
except (TypeError, ValueError): | ||
pass | ||
|
||
if isinstance(path_or_buf, str): | ||
fh, handles = get_handle(path_or_buf, "w", compression=compression) | ||
fh, handles = get_handle(path_or_buf, mode, compression=compression) | ||
try: | ||
fh.write(s) | ||
finally: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm not sure about
mode
as a keyword - maybeappend_lines
is better?I think mode is just confusing since it only works on a very small subset of user cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I can make this change. My choice of using mode was to keep same naming as previously established (e.g. CSVFormatter(...,mode=mode...) ).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mode is appropriate and consistent with other IO methods., but possibly restrict to w|a only, would put before storage_options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea I'm just wary of what users expect from this since "mode" is not going to be applicable to the majority of orients. For example, it wouldn't make sense to have
mode="a"
for the table orient because it would produce an invalid document structure that wayIn any case can start with this and add tests for the behavior we want to catch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'll keep mode for now, If in future bigger consensus tell me to change I still can change. From my side, I tend to favor consistency, because of this
mode="a"
sounds better to me.