current implementation of DataFrame.apply where passed function has side effects is a real "gotcha"

From the documentation for DataFrame.apply:

> In the current implementation apply calls func twice on the first column/row to decide whether it can take a fast or slow code path. This can lead to unexpected behavior if func has side-effects, as they will take effect twice for the first column/row.
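A minimal sketch of how one might observe this, assuming a small three-column frame. The names `calls` and `record_and_sum` are made up for illustration. On a pandas version that has the documented behavior, the first column's label would be expected to show up twice in `calls`; versions without the double call should record each label once.

```python
import pandas as pd

calls = []  # hypothetical log: one entry per invocation of the function


def record_and_sum(col):
    # Side effect: remember the label of the column we were handed.
    calls.append(col.name)
    return col.sum()


df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})
result = df.apply(record_and_sum)

# The returned values are the same either way; only the side effect differs.
print(result.tolist())
# Three entries means one call per column; four means the first was called twice.
print(len(calls), calls)
```

The point is that the return value gives no hint anything is off. Only the side effect (here, the length of `calls`) reveals the extra invocation.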

I am sure there are well-thought-out reasons why it is currently implemented this way, but this behavior can be a real "gotcha". I grant that having side effects inside an apply function is not good standard practice, but I would argue that there are times when it is a good solution to a problem. (I can elaborate on this if needed.) Thankfully this behavior is documented, but I think it is reasonable to expect that most users will not always be mindful of secondary notes scattered throughout the documentation, and the source of problems caused by this behavior is not at all obvious.

While I don't understand the engineering issues regarding the fast/slow path, I would suggest that some better solution to the problem be introduced.
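In the meantime, one possible user-side workaround (a sketch only, not a change to pandas itself) is to skip `apply` when the function has side effects and iterate over the columns explicitly, so that each column is visited exactly once. The names `seen` and `record_and_sum` are hypothetical.

```python
import pandas as pd

seen = []  # hypothetical log of side effects


def record_and_sum(col):
    # Side effect: remember the label of the column we were handed.
    seen.append(col.name)
    return col.sum()


df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# DataFrame.items() yields (label, Series) pairs, one per column,
# so the function runs once per column regardless of any fast/slow path.
result = pd.Series({label: record_and_sum(col) for label, col in df.items()})

print(seen)
print(result.tolist())
```

This gives up whatever speed the fast path provides, which may or may not matter depending on the size of the frame.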


current implementation of DataFrame.apply where passed function has side effects is a real "gotcha" #10222
