to_datetime function not working with %Y.%m.%d %H format #21422

Closed
@johan12345

Code Sample, a copy-pastable example if possible

import pandas as pd
import datetime as dt
print(dt.datetime.strptime('2012.01.01 1', '%Y.%m.%d %H')) 
# -> 2012-01-01 01:00:00
print(pd.to_datetime('2012.01.01 1', format='%Y.%m.%d %H')) 
# -> ValueError: time data '2012.01.01 1' doesn't match format specified

Problem description

When to_datetime parses a date string that includes an hour but no minutes or seconds, using a format that otherwise resembles ISO 8601 (such as '%Y.%m.%d %H'), it raises a ValueError (see above). This is unexpected, because strptime parses the same string with the same format string without any problem.

I suspect the problem is in the _format_is_iso function of pandas._libs.tslibs.parsing, which only checks whether the ISO format starts with the given format, so this format is recognized as ISO-like. In that case the format passed to to_datetime is ignored and tslib.array_to_datetime is used to parse the date instead, which does not seem to handle this kind of format.
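
To illustrate the suspicion, here is a simplified model of a prefix-based check. It is not pandas' actual source: the function name looks_iso_like, the template string, and the separator lists are assumptions made for this sketch. It only shows why a "starts with" comparison would classify '%Y.%m.%d %H' as ISO-like.

```python
# Hypothetical, simplified model of the suspected check; NOT pandas' actual code.
# The template and separator sets below are assumptions for illustration only.
ISO_TEMPLATE = "%Y{date_sep}%m{date_sep}%d{time_sep}%H:%M:%S.%f"


def looks_iso_like(fmt, date_seps=("-", "/", "."), time_seps=(" ", "T")):
    """Return True if fmt is a prefix of any ISO-style template variant."""
    for date_sep in date_seps:
        for time_sep in time_seps:
            candidate = ISO_TEMPLATE.format(date_sep=date_sep, time_sep=time_sep)
            if candidate.startswith(fmt):
                return True
    return False


print(looks_iso_like("%Y.%m.%d %H"))  # True: a prefix of '%Y.%m.%d %H:%M:%S.%f'
print(looks_iso_like("%d.%m.%Y %H"))  # False: day-first order matches no template
```

If the real check behaves like this, an hour-only format passes the test even though the parser that runs afterwards may expect a minutes component.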

My current workaround is to modify the dates to also have a minutes component (append ':00' to every string) so that they can be parsed.
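
A minimal sketch of that workaround, assuming every input string ends with a bare hour. The helper name pad_minutes and the sample strings are made up for this example. To keep the sketch runnable without pandas, the padded strings are checked with strptime; in my case the same strings are then passed to to_datetime.

```python
from datetime import datetime


def pad_minutes(stamp):
    """Append ':00' so an hour-only timestamp also carries a minutes component."""
    return stamp + ":00"


raw = ["2012.01.01 1", "2012.01.02 13"]  # sample inputs, assumed to end with the hour
padded = [pad_minutes(s) for s in raw]

# Verify the padded strings against the extended format using the standard library.
parsed = [datetime.strptime(s, "%Y.%m.%d %H:%M") for s in padded]
print(parsed[0])  # 2012-01-01 01:00:00
```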

Expected Output

2012-01-01 01:00:00 (same as when using strptime)

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-6-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: de_DE.UTF-8
LOCALE: de_DE.UTF-8

pandas: 0.22.0
pytest: None
pip: 9.0.1
setuptools: 38.5.2
Cython: None
numpy: 1.14.2
scipy: 0.19.0
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.7.2
pytz: 2018.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.2.0
openpyxl: 2.5.0
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
