iterrows: when upcasting to object, values are converted to python types #13468

Open
@jorisvandenbossche

I know iterrows is not the most recommended function, but I noticed a strange behaviour (triggered by a problem reported by a geopandas user: geopandas/geopandas#348). When using iterrows on a df with mixed dtypes (so the resulting Series has object dtype), the numeric values are converted to Python types, while with loc/iloc the numpy types are preserved:

In [254]: df = pd.DataFrame({'int':[0,1], 'float':[0.1,0.2], 'str':['a','b']})

In [255]: df
Out[255]:
   float  int str
0    0.1    0   a
1    0.2    1   b

In [256]: row1 = df.iloc[0]

In [257]: i, row2 = next(df.iterrows())

In [258]: row3 = next(df.itertuples())

In [260]: type(row1['float'])
Out[260]: numpy.float64

In [261]: type(row2['float'])
Out[261]: float

In [269]: type(row3.float)
Out[269]: numpy.float64

Is this intentional? It is a consequence of using self.values in the implementation: numpy converts numeric values to Python types when they are cast into an object array. And if so, is this worth documenting?
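The numpy side of this can be seen without pandas. The following is a minimal sketch, not the pandas implementation itself; the variable names are made up for illustration. It shows that casting a float64 array to object dtype yields Python floats, while numpy scalars assigned directly into an object array are stored unchanged:

```python
import numpy as np

floats = np.array([0.1, 0.2])        # float64 array
as_object = floats.astype(object)    # cast to object dtype

# Indexing the float64 array gives a numpy scalar,
# but after the cast the elements are plain Python floats.
print(type(floats[0]))
print(type(as_object[0]))

# An object array filled element by element keeps numpy scalars as they are.
kept = np.empty(2, dtype=object)
kept[0] = np.float64(0.1)
kept[1] = "a"
print(type(kept[0]))
```

So whether an object-dtype row holds Python or numpy scalars depends on how the object array was built, which would explain the difference between the two access paths.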

(Note: for the geopandas user it was actually the numpy types in an object-dtyped Series that caused the problem, because fiona couldn't handle numpy scalars in an object-dtyped column, but pandas is not to blame for that.)
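For downstream code that cannot handle numpy scalars, one possible workaround is to unwrap them before handing the values over. This is only a sketch: `to_python_scalar` is a hypothetical helper, not part of pandas, numpy, or fiona, and the `row` dict stands in for an object-dtyped row.

```python
import numpy as np

def to_python_scalar(value):
    """Hypothetical helper: unwrap numpy scalars, pass everything else through."""
    if isinstance(value, np.generic):
        return value.item()  # e.g. np.float64 -> float, np.int64 -> int
    return value

# Stand-in for a row of object dtype that still holds numpy scalars.
row = {"float": np.float64(0.1), "int": np.int64(0), "str": "a"}
clean = {key: to_python_scalar(val) for key, val in row.items()}

print({key: type(val).__name__ for key, val in clean.items()})
```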
