Choose correct datatype when creating new dataframe column from old ones #8527

Closed
@djrobust

I am converting a Stata dataset to a DataFrame and then multiplying two columns to create a third one. Both source columns are read as int8, and the new column keeps that dtype instead of widening to hold the product.

For instance, this code

import pandas as pd

df = pd.read_stata(file)  # file: path to the Stata dataset
df['w_age_educ'] = df['w_age'] * df['weduc']
cols = ['w_age', 'weduc', 'w_age_educ']
print(df[cols].dtypes)
print(df[cols][:3])

would give me

w_age         int8
weduc         int8
w_age_educ    int8
dtype: object
   w_age  weduc  w_age_educ
0     44     14         104
1     34     13         -70
2     33     18          82
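
The values above are consistent with int8 wraparound: the true products (616, 442, 594) fall outside the int8 range of -128 to 127 and wrap modulo 256. A minimal sketch that reproduces this without the Stata file, assuming a hand-built frame with the same three rows (the construction is a stand-in, not the original dataset):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Stata data: same values, same int8 dtype.
df = pd.DataFrame({
    "w_age": np.array([44, 34, 33], dtype=np.int8),
    "weduc": np.array([14, 13, 18], dtype=np.int8),
})

# int8 * int8 stays int8, so results outside [-128, 127] wrap around.
df["w_age_educ"] = df["w_age"] * df["weduc"]
wrapped = df["w_age_educ"].tolist()

# The same wraparound computed by hand from the true integer products.
true_products = [a * b for a, b in zip([44, 34, 33], [14, 13, 18])]
by_hand = [((p + 128) % 256) - 128 for p in true_products]

print(df["w_age_educ"].dtype)  # int8
print(wrapped, by_hand)
```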

Is this a bug or intended behavior? If the latter, how can I get my desired product column?
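
One possible way to get the intended product, sketched here under the assumption that widening the operands is acceptable for the dataset: cast at least one column to a larger integer type before multiplying. The frame below is a hypothetical stand-in for the Stata data, and this is not necessarily the approach the maintainers would recommend.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Stata data: same values, same int8 dtype.
df = pd.DataFrame({
    "w_age": np.array([44, 34, 33], dtype=np.int8),
    "weduc": np.array([14, 13, 18], dtype=np.int8),
})

# Upcasting one operand is enough: int64 * int8 is promoted to int64,
# so the product is computed without wrapping.
df["w_age_educ"] = df["w_age"].astype("int64") * df["weduc"]

print(df.dtypes)
print(df)
```

The source columns keep their compact int8 storage; only the derived column uses the wider type.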
