I understand the benefit of the Series.str methods, which automatically handle NA, but the implementation seems really slow:
```
>>> s = pd.Series(['abcdefg', np.nan]*500000)
>>> timeit s.str[:5]
1 loops, best of 3: 2.55 s per loop
>>> timeit s.map(lambda row: row[:5], na_action='ignore')
1 loops, best of 3: 558 ms per loop
```
Looking at the code, the difference seems to be that Series.map with na_action='ignore' uses vectorized code to filter out the NA values up front, while Series.str uses the _na_map function, which wraps the call for each item in the Series in a try/except (non-vectorized).
Can I make a request to replace _na_map with something more like Series.map(na_action='ignore')?
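
To make the two strategies concrete, here is a rough sketch of what I mean. This is not the actual pandas code: `per_item_na_map` and `masked_na_map` are hypothetical names I made up for illustration, and the real `_na_map` and `Series.map` internals may well differ. The first function handles NA with a try/except around every element; the second computes the NA mask once and only calls the function on the non-missing values.

```python
import numpy as np
import pandas as pd


def per_item_na_map(func, values):
    # Hypothetical stand-in for the per-element approach:
    # call func on every item and fall back to NaN when it fails.
    out = np.empty(len(values), dtype=object)
    for i, x in enumerate(values):
        try:
            out[i] = func(x)
        except (TypeError, AttributeError):
            out[i] = np.nan
    return out


def masked_na_map(func, values):
    # Hypothetical stand-in for the mask-first approach:
    # find the missing values once, then apply func only to the rest.
    mask = pd.isna(values)
    out = np.empty(len(values), dtype=object)
    out[mask] = np.nan
    for i in np.flatnonzero(~mask):
        out[i] = func(values[i])
    return out


values = np.array(['abcdefg', np.nan] * 3, dtype=object)
a = per_item_na_map(lambda x: x[:5], values)
b = masked_na_map(lambda x: x[:5], values)
```

Both return the same result on this input; the second just avoids raising and catching an exception for every missing value. I have not benchmarked this sketch, so the timings above are the only measurements I can vouch for.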