Closed
Description
In SAS and Stata data, dates are represented as the number of days since 1960-01-01, while other statistical software uses different origin dates. With that in mind, it would be nice to be able to specify the origin date when converting. See also #3969.
It's a relatively simple thing, and not hard to work around, of course. However, I end up dealing with date formatting on just about every data set I import, and I imagine that lots of others do, too.
Currently, I do something like this:
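import datetime
import pandas as pd

# Offset between the Unix epoch (1970-01-01) and the SAS/Stata origin (1960-01-01)
EPOCH1960 = datetime.date(1970, 1, 1) - datetime.date(1960, 1, 1)
data = pd.read_stata('./data.dta')
# Interpret the values as days since 1970-01-01, then shift back to the 1960 origin
data['date'] = pd.to_datetime(data['date'], unit='D') - EPOCH1960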
In R, the as.Date() function takes an origin parameter for numeric types (see the manual). So, in R, the date part would simply be:
data$date <- as.Date(data$date, origin = '1960-01-01')
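For comparison, here is a rough sketch of the same idea in plain Python, using only the standard library. The helper name `days_to_date` and its `origin` argument are hypothetical and only illustrate the requested behaviour; they are not part of pandas.

```python
from datetime import date, timedelta


def days_to_date(days, origin=date(1960, 1, 1)):
    """Convert a day count relative to `origin` into a calendar date.

    Hypothetical helper: the default origin is the SAS/Stata one.
    """
    return origin + timedelta(days=days)


# Offset between the SAS/Stata origin and the Unix epoch,
# i.e. the correction subtracted in the workaround above.
EPOCH1960 = date(1970, 1, 1) - date(1960, 1, 1)

print(EPOCH1960.days)      # 3653
print(days_to_date(0))     # 1960-01-01
print(days_to_date(3653))  # 1970-01-01
```

Passing a different `origin` (for example `date(1970, 1, 1)`) covers software with another reference date. If your pandas version supports it, `pd.to_datetime` also accepts an `origin` argument together with `unit='D'`, which should make the manual offset unnecessary.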