API: which "anchor point" for datetime properties of Periods ? #20324

Open
@jorisvandenbossche

In several docstring PRs for Period datetime properties, we ran into confusion about how the date/time behind those attributes is determined (start or end of the period?). E.g., see the discussion in #20277 (comment)

Small example to illustrate:

In [30]: p1 = pd.Period('2017-01-01', freq='D')

In [31]: p2 = pd.Period('2017-01-01', freq='M')

In [32]: p1
Out[32]: Period('2017-01-01', 'D')

In [33]: p2
Out[33]: Period('2017-01', 'M')

In [34]: p1.start_time
Out[34]: Timestamp('2017-01-01 00:00:00')

In [35]: p2.start_time
Out[35]: Timestamp('2017-01-01 00:00:00')

In [36]: p1.day
Out[36]: 1

In [37]: p2.day
Out[37]: 31

The discussion arose from how to word the summary line of such an attribute: "The day of the month" -> but which day of the period span? -> should this be "The day of the month of the start of the Period"? -> ah, no, because it is not always the start; it depends on the frequency.

In the above example, M is actually the freq string for "MonthEnd", and the datetime properties apparently use the end of the period as the date from which those values are calculated.
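To make this easier to check for other frequencies, here is a minimal sketch that compares `Period.day` against the day of `start_time` and `end_time`. The helper name `anchor_of_day` is hypothetical (it is not part of pandas); it only reports what is observed for one period and says nothing about how pandas computes the value internally.

```python
import pandas as pd


def anchor_of_day(period):
    """Report which end of the span ``period.day`` agrees with.

    Hypothetical helper for illustration only; not a pandas API.
    """
    matches_start = period.day == period.start_time.day
    matches_end = period.day == period.end_time.day
    if matches_start and matches_end:
        return "both"
    if matches_end:
        return "end"
    if matches_start:
        return "start"
    return "neither"


p1 = pd.Period("2017-01-01", freq="D")
p2 = pd.Period("2017-01-01", freq="M")

print(anchor_of_day(p1))  # both: a daily period starts and ends on the same day
print(anchor_of_day(p2))  # end: day is 31, matching end_time rather than start_time
```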

Questions:

  • How best to document this? Can we use a certain phrase in all docstrings?
  • Is there a way to know, given a certain freq, what the "anchor point" is? (I am using "anchor point" here; I don't know if we have existing terminology for that.) Is there a way to know whether the freq is an "End" frequency?
  • It's rather confusing behaviour, is this actually the behaviour we want?
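As a rough way to explore the second question, one could inspect the offset object attached to the Period. The sketch below is only a heuristic: the helper `looks_like_end_offset` is hypothetical, and it assumes that "End"-style offsets can be recognised by their class name (such as `MonthEnd`), which is a naming convention rather than a documented guarantee.

```python
import pandas as pd


def looks_like_end_offset(period):
    """Guess whether the period's freq is an "End"-style offset.

    Hypothetical helper: relies purely on the offset class name ending
    in "End" (e.g. MonthEnd), which is an assumption, not a pandas API.
    """
    return type(period.freq).__name__.endswith("End")


p1 = pd.Period("2017-01-01", freq="D")
p2 = pd.Period("2017-01-01", freq="M")

print(type(p2.freq).__name__)      # MonthEnd
print(looks_like_end_offset(p1))   # False
print(looks_like_end_offset(p2))   # True
```

A name-based check like this would not by itself say which date the datetime properties use, so it does not replace documenting the behaviour.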

cc @jreback @jbrockmendel @sinhrks
