Closed
Description
NumPy 1.17 introduced a new random module with faster PRNGs. It drops the strict guarantee of reproducible random streams, which allows some algorithmic improvements. In particular, the choice method is now a lot faster in the replace=False case. Would it make sense for random_state to return a np.random.Generator instead of a np.random.RandomState when the NumPy version is >= 1.17? The function is here: https://github.com/pandas-dev/pandas/blob/master/pandas/core/common.py#L408. This would automatically speed up the DataFrame.sample method, for instance.
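For reference, here is a minimal sketch of the call in question under both APIs. It assumes NumPy >= 1.17 is installed; the seed and sizes are arbitrary illustration values, and no timing figures are claimed. The two objects are not expected to return the same sample for the same seed.

```python
import numpy as np

seed = 12345          # arbitrary seed, for illustration only
population = 1_000    # arbitrary population size
k = 5                 # number of items to draw

# Legacy API: RandomState (Mersenne Twister based).
legacy = np.random.RandomState(seed)
legacy_sample = legacy.choice(population, size=k, replace=False)

# New API (NumPy >= 1.17): Generator, created via default_rng.
rng = np.random.default_rng(seed)
new_sample = rng.choice(population, size=k, replace=False)

# Both draw k distinct integers from range(population).
print(legacy_sample)
print(new_sample)
```

Both objects expose choice with the same replace keyword, which is why swapping the returned object could be transparent to a caller such as DataFrame.sample.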
I can write a PR, but I'm not sure how to handle the different NumPy versions. Should I just do the version check inside random_state, or is it something that needs to go inside numpy.compat?
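To make the question concrete, here is a rough sketch of what a version check inside the function might look like. The names random_state_sketch and HAS_GENERATOR are hypothetical, the version parsing is a simplification, and this is not the actual pandas implementation.

```python
import numbers

import numpy as np

# Assumption: comparing (major, minor) parsed from np.__version__ is enough
# to detect the new Generator API. HAS_GENERATOR is a hypothetical flag.
_NP_VERSION = tuple(int(part) for part in np.__version__.split(".")[:2])
HAS_GENERATOR = _NP_VERSION >= (1, 17)


def random_state_sketch(state=None):
    """Hypothetical stand-in for random_state; not the pandas function.

    Returns a Generator on NumPy >= 1.17 when given an int or None,
    and passes existing RandomState / Generator objects through unchanged.
    """
    # Always accept an existing legacy RandomState as-is.
    if isinstance(state, np.random.RandomState):
        return state

    if HAS_GENERATOR:
        # np.random.Generator is only referenced when it exists.
        if isinstance(state, np.random.Generator):
            return state
        if state is None or isinstance(state, numbers.Integral):
            return np.random.default_rng(state)
    elif state is None or isinstance(state, numbers.Integral):
        # Older NumPy: fall back to the legacy API.
        return np.random.RandomState(state)

    raise ValueError(
        "state must be an integer, a RandomState, a Generator, or None"
    )
```

Keeping the flag at module level would let it live either next to the function or in a compat module, which is the choice the question is about.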