TRACKER: milestones #44

In the order in which we should tackle them:

- [x] 1. Support a custom time index in ewm operations, using the time unit specified via `halflife` (No Pandas issue yet).
- [x] 2. DatetimeIndexer https://github.com/twosigma/pandas/issues/45
- [x] 3. Allow turning on numba by default (https://github.com/pandas-dev/pandas/issues/33966).
- [ ] 4. Enable full parallelism (`parallel=True`, use `numba.prange`) when doing `apply`
    - [ ] 4a. `groupby.apply` (Note: udf must be a reduction op until https://github.com/numba/numba/issues/4579 is solved)
    - [x] 4b. `rolling.apply`
    - [x] 4c. `groupby.rolling.apply`
    - [x] 4d. `groupby.agg`
    - [x] 4e. `groupby.transform`
- [x] 5. Ensure the read-only flag is set on all numpy arrays passed to numba (we currently copy the data before passing it to the numba func; is that still needed?)
- [x] 6. Investigate how much time and effort it would take to implement the full initial spec (with indexers/aggregators/kernels), without including it in any of the existing APIs. E.g. how hard would it be to implement something like `df._window(indexer, aggregator, kernel)` and `df.groupby._window(indexer, aggregator, kernel)`?
- [ ] 7. Implement handling of consecutive nans in ewma (e.g. only keep a nan if it is preceded by X other nans).
- [ ] 8. Implement step size in rolling operations (https://github.com/pandas-dev/pandas/issues/15354)
- [x] 9. Implement indexers in EWM, along with `.ewm(...).apply` (No pandas issue yet)
- [x] 10. Think about weighting calculations: https://github.com/pandas-dev/pandas/issues/34556 (https://github.com/pandas-dev/pandas/pull/37204)
- [ ] 11. merge_asof, where tolerance is an indexer; also investigate .reindex, which currently accepts a list-like for tolerance.
- [x] 12. Forward looking indexer: pandas-dev#34226
- [x] 13. dask backend in ibis: https://github.com/ibis-project/ibis/issues/2245
- [x] 14. https://github.com/pandas-dev/pandas/issues/35690: see whether we can answer these questions and determine whether they are bugs.
- [x] 15. Tablewise rolling: https://github.com/pandas-dev/pandas/issues/15095 (https://github.com/pandas-dev/pandas/pull/38417)
- [x] 16. benchmark doing a fixed window but with variable indexers (would help for step size too): https://github.com/pandas-dev/pandas/pull/36567
- [x] 17. investigate whether we can support ``.groupby(..).expanding()`` via indexers: https://github.com/pandas-dev/pandas/pull/37064
- [x] 18. investigate whether we can support ``.groupby(..).ewma()`` via indexers (https://github.com/pandas-dev/pandas/pull/37878)
  - [x] 18a: xref https://github.com/pandas-dev/pandas/issues/16037
- [x] 19. if 17 and 18 work out, we can remove the internal uses of .apply (the old way we implemented groupby / rolling) (https://github.com/pandas-dev/pandas/pull/39219)
- [x] 20. ~explore, for say `.rolling(...).mean/sum/max/min`, whether doing this in numba is faster, and whether we can / should call the cython function instead. (prob not worth it)~
- [ ] 21. investigate whether we can use common functions for aggregation, e.g. windowing: https://github.com/pandas-dev/pandas/blob/master/pandas/_libs/window/aggregations.pyx#L140, and groupby: https://github.com/pandas-dev/pandas/blob/master/pandas/_libs/groupby.pyx#L473
- [ ] 22. MultiIndex engine improvement for partial key matching (https://github.com/pandas-dev/pandas/issues/38650)
- [x] 23. online ewma (#46)
- [ ] 24. merge_asof with multi-key (#47)
- [x] 25. Tablewise ewma
- [x] 26. Implement ewm().sum() (https://github.com/pandas-dev/pandas/issues/13297)
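
For anyone new to the tracker, here is a rough, pure-Python sketch of the indexer/aggregator split that several items above (6, 8, 12, 16) revolve around, plus one plausible reading of the time-based `halflife` weighting from item 1. All function names here (`backward_bounds`, `forward_bounds`, `window_apply`, `time_decay_mean`) are hypothetical and made up for illustration; this is not the pandas implementation, and the weighting formula is an assumption about what "time unit specified via `halflife`" means.

```python
from typing import Callable, List, Sequence, Tuple

Bounds = Tuple[List[int], List[int]]


def backward_bounds(n: int, window: int, step: int = 1) -> Bounds:
    """Trailing windows: the window for row i covers rows (i - window, i]."""
    idx = range(0, n, step)  # step > 1 skips output rows (item 8)
    start = [max(0, i - window + 1) for i in idx]
    end = [i + 1 for i in idx]
    return start, end


def forward_bounds(n: int, window: int, step: int = 1) -> Bounds:
    """Forward-looking windows: the window for row i covers rows [i, i + window)."""
    idx = range(0, n, step)
    start = list(idx)
    end = [min(n, i + window) for i in idx]  # clip at the end of the data
    return start, end


def window_apply(
    values: Sequence[float],
    bounds: Bounds,
    aggregator: Callable[[Sequence[float]], float],
) -> List[float]:
    """Apply the aggregator to each half-open [start, end) slice of values."""
    start, end = bounds
    return [aggregator(values[s:e]) for s, e in zip(start, end)]


def time_decay_mean(
    values: Sequence[float], times: Sequence[float], halflife: float
) -> float:
    """Weighted mean as of the last timestamp.

    Assumption: an observation that is `halflife` time units old gets half
    the weight of the newest one, i.e. w_i = 0.5 ** ((t_last - t_i) / halflife).
    """
    t_last = times[-1]
    weights = [0.5 ** ((t_last - t) / halflife) for t in times]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


values = [1.0, 2.0, 3.0, 4.0, 5.0]
trailing = window_apply(values, backward_bounds(len(values), 2), sum)
forward = window_apply(values, forward_bounds(len(values), 2), sum)
stepped = window_apply(values, backward_bounds(len(values), 2, step=2), sum)
decayed = time_decay_mean([1.0, 3.0], times=[0.0, 1.0], halflife=1.0)
```

The point of the split is that the aggregator never needs to know how the bounds were produced, so fixed, forward-looking, stepped, or time-based windows can share the same aggregation code.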
