Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd

x = pd.Series({'a': None, 'b': '012345', 'c': 1})
print(x)
print(
    pd.read_json(
        pd.Series(x).to_json(),
        typ="series",
        orient="records",
        keep_default_dates=True,
    )
)
a      None
b    012345
c         1
dtype: object
a                   NaT
b   1970-01-01 03:25:45
c   1970-01-01 00:00:01
dtype: datetime64[ns]
# Without the NA value
x = pd.Series({'b': '012345', 'c': 1})
print(x)
print(
    pd.read_json(
        pd.Series(x).to_json(),
        typ="series",
        orient="records",
        keep_default_dates=True,
    )
)
b    012345
c         1
dtype: object
b    12345
c        1
dtype: int64
Issue Description
When a Series contains a value that could be parsed as a date and another value that is NA, read_json
converts every value to a datetime. Without the NA value, the same data is read back as int64 instead.
Expected Behavior
Ideally none of the values would be parsed as dates unless I set keep_default_dates=False
or do not supply it.
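A possible workaround, which is not part of the original report, is sketched below. It assumes that `convert_dates=False`, a documented `read_json` parameter, also switches off the datetime inference for Series values; the resulting dtype may still differ from the original object dtype, so it is printed rather than stated here. The `io.StringIO` wrapper is an addition so the sketch does not depend on `read_json` accepting a literal JSON string.

```python
import io

import pandas as pd

# Same data as in the report: one NA value, one digit-only string, one integer.
x = pd.Series({"a": None, "b": "012345", "c": 1})

# Wrap the JSON text in a file-like object before handing it to read_json.
json_buffer = io.StringIO(x.to_json())

# Assumption: convert_dates=False skips the datetime inference for Series values.
result = pd.read_json(json_buffer, typ="series", convert_dates=False)

print(result)
print(result.dtype)
```

This only sidesteps the conversion for callers who never want dates; it does not address why the presence of an NA value changes the inferred dtype.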
Installed Versions
pandas : 1.5.1
numpy : 1.22.2
pytz : 2022.6
dateutil : 2.8.2
setuptools : 65.5.0
pip : 22.1.2
Cython : None
pytest : 6.2.5
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : 2.8.6
jinja2 : 3.1.2
IPython : 7.34.0
pandas_datareader: None
bs4 : None
bottleneck : None
brotli : None
fastparquet : None
fsspec : 2022.10.0
gcsfs : None
matplotlib : None
numba : 0.56.3
numexpr : None
odfpy : None
openpyxl : 3.0.10
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.9.3
snappy : None
sqlalchemy : 1.4.42
tables : None
tabulate : 0.9.0
xarray : None
xlrd : 2.0.1
xlwt : 1.3.0
zstandard : None
tzdata : None