Description
Currently, with df.interpolate(limit, limit_direction), I must choose one or both sides to fill when I limit the interpolation. What I would find more useful is an all-or-none strategy instead of filling only up to the limit count: a gap no longer than the limit is filled completely, and a longer gap is left untouched. That way I can fill short-term missing data and keep long-term missing data intact, to be filtered out afterwards. Demonstrated here:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame([[0,1,2,3],[1,np.nan,np.nan,np.nan],[np.nan,np.nan,np.nan,5],[3,4,5,6]],columns=list('abcd'))
>>> df
a b c d
0 0.0 1.0 2.0 3.0
1 1.0 NaN NaN NaN
2 NaN NaN NaN 5.0
3 3.0 4.0 5.0 6.0
>>> # Current options
>>> df.interpolate(axis=0,limit=1,limit_direction='forward')
a b c d
0 0.0 1.0 2.0 3.0
1 1.0 2.0 3.0 4.0
2 2.0 NaN NaN 5.0
3 3.0 4.0 5.0 6.0
>>> df.interpolate(axis=0,limit=1,limit_direction='backward')
a b c d
0 0.0 1.0 2.0 3.0
1 1.0 NaN NaN 4.0
2 2.0 3.0 4.0 5.0
3 3.0 4.0 5.0 6.0
>>> df.interpolate(axis=0,limit=1,limit_direction='both')
a b c d
0 0.0 1.0 2.0 3.0
1 1.0 2.0 3.0 4.0
2 2.0 3.0 4.0 5.0
3 3.0 4.0 5.0 6.0
>>> # What is desired
>>> interpolated_df = pd.DataFrame([[0,1,2,3],[1,np.nan,np.nan,4],[2,np.nan,np.nan,5],[3,4,5,6]],columns=list('abcd'))
>>> interpolated_df # NaNs in columns b and c are left unfilled because the gap (2) exceeds limit 1
a b c d
0 0 1.0 2.0 3
1 1 NaN NaN 4
2 2 NaN NaN 5
3 3 4.0 5.0 6
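Until something like this exists as an option, the desired behaviour can be approximated with existing calls. The following is only a rough sketch, not a proposed implementation: `interpolate_short_gaps` is a hypothetical helper name, and it assumes linear interpolation along the rows, with leading and trailing NaNs left alone (`limit_area="inside"`). It interpolates everything first, then puts NaN back wherever the original run of NaNs was longer than the limit.

```python
import numpy as np
import pandas as pd


def interpolate_short_gaps(df, limit):
    """Hypothetical helper: fill NaN runs of length <= limit, leave longer runs as NaN."""
    # Interpolate every interior gap first (assumption: linear, along axis 0).
    filled = df.interpolate(axis=0, limit_area="inside")
    for col in df.columns:
        is_na = df[col].isna()
        # Give each run of consecutive identical mask values its own id.
        run_id = (is_na != is_na.shift()).cumsum()
        # Length of the run that each row belongs to.
        run_len = is_na.groupby(run_id).transform("size")
        # Restore NaN where the original gap was longer than the limit.
        too_long = is_na & (run_len > limit)
        filled[col] = filled[col].where(~too_long)
    return filled


df = pd.DataFrame(
    [[0, 1, 2, 3],
     [1, np.nan, np.nan, np.nan],
     [np.nan, np.nan, np.nan, 5],
     [3, 4, 5, 6]],
    columns=list("abcd"),
)
result = interpolate_short_gaps(df, limit=1)
```

With the example frame above, columns a and d (gaps of length 1) are filled, while columns b and c (gaps of length 2) keep their NaNs. Unlike the hand-built `interpolated_df`, every column in `result` is float.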