Description
According to the docs (https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.apply.html):
"In the current implementation apply calls func twice on the first column/row to decide whether it can take a fast or slow code path. This can lead to unexpected behavior if func has side-effects..."
It is indeed in the docs, but it took me several hours to trace my bug back to this "feature".
So I think it would be cleaner either to fully support side effects in apply (e.g. by calling func on a copy of the first column/row in the testing phase) or to ban them completely, if that is technically possible.
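
To make the suggestion concrete, here is a minimal toy model in plain Python. It is not pandas code: `naive_apply`, `probe_on_copy_apply` and `bump` are hypothetical names invented for this illustration, and the "probe" call only stands in for the fast/slow path test described in the docs.

```python
import copy


def naive_apply(rows, func):
    """Toy model: probe func on the first row, then run it on every row."""
    if rows:
        func(rows[0])  # probe call on the real row; its result is discarded
    return [func(row) for row in rows]


def probe_on_copy_apply(rows, func):
    """Same toy model, but the probe call only sees a deep copy of the first row."""
    if rows:
        func(copy.deepcopy(rows[0]))  # mutations land on the throwaway copy
    return [func(row) for row in rows]


def bump(row):
    """A mutating func: increments the counter stored in the row, in place."""
    row["n"] += 1
    return row["n"]


naive_rows = [{"n": 0}, {"n": 0}]
naive_result = naive_apply(naive_rows, bump)

safe_rows = [{"n": 0}, {"n": 0}]
safe_result = probe_on_copy_apply(safe_rows, bump)

print(naive_result)  # the first row was bumped twice
print(safe_result)   # every row was bumped exactly once
```

Note that probing on a copy would only protect in-place mutation of the row/column itself. A func that touches outside state (appending to a global list, writing a file) would still be seen to run twice.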
I know there are plans to ban modification when using groupby.apply (#12653).
I don't see any issues with mutation inside a (non groupby) apply per se, but I may be wrong.
I should also point out that the note from the docs quoted above is not entirely accurate: if result_type is specified, the first row/column is not necessarily processed twice.
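
A rough way to check this on whatever pandas version is installed is to count the calls directly. The sketch below assumes pandas is importable; `count_calls` and `record` are hypothetical helper names. No expected counts are shown because they depend on the pandas version.

```python
import pandas as pd


def count_calls(df, **apply_kwargs):
    """Return how many times DataFrame.apply invoked the row function."""
    calls = []

    def record(row):
        calls.append(row.name)  # side effect: log the row label on every call
        return [row["a"], row["b"]]

    df.apply(record, axis=1, **apply_kwargs)
    return len(calls)


df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

plain_calls = count_calls(df)                          # default code path
expand_calls = count_calls(df, result_type="expand")   # result_type given

# With 3 rows, a count of 4 means the first row was processed twice.
print(plain_calls, expand_calls)
```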