Closed
I've been trying to think about how to resolve our differences in #194 (comment), and I think such disagreements are bound to keep coming up.
I'd like to suggest a way around this, which is to have 2 levels of the standard:
- level 1: core functionality (what we have already), can be used to do heavy lifting
- level 2: in addition to level1, there are also methods which don't necessarily guarantee high performance (such as `GroupBy.__iter__`, `to_json`/`to_pylist`, or maybe even `.to_array_object`)
Implementations could then choose to provide level1-compliance, or level1 and level2 compliance. For example, we may get to:
- cudf: level1
- modin: level1
- pandas: level1, level2
- polars: level1, level2
Then, libraries that consume dataframes could declare which level of compliance they require. For example:
- scikit-learn: works with any level1-compliant dataframe
- feature-engine: works with any level1-compliant dataframe
- plotly: works with any level2-compliant dataframe
- altair: works with any level2-compliant dataframe
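To make the idea concrete, here is a rough sketch of how a provided level could be matched against a required level. Everything in it is hypothetical: the `Level` enum, the `PROVIDED` and `REQUIRED` tables, and `is_supported` are illustrative names, not part of the standard or of any library, and the levels just mirror the "we may get to" examples above rather than anyone's actual status.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical compliance levels; level2 includes everything in level1."""

    LEVEL1 = 1  # core functionality, suitable for heavy lifting
    LEVEL2 = 2  # level1 plus methods without performance guarantees


# Illustrative declarations only, copied from the examples in this issue.
PROVIDED = {
    "cudf": Level.LEVEL1,
    "modin": Level.LEVEL1,
    "pandas": Level.LEVEL2,
    "polars": Level.LEVEL2,
}

REQUIRED = {
    "scikit-learn": Level.LEVEL1,
    "feature-engine": Level.LEVEL1,
    "plotly": Level.LEVEL2,
    "altair": Level.LEVEL2,
}


def is_supported(consumer: str, implementation: str) -> bool:
    """Return True if the implementation provides at least the required level."""
    return PROVIDED[implementation] >= REQUIRED[consumer]


if __name__ == "__main__":
    for consumer in REQUIRED:
        works_with = [impl for impl in PROVIDED if is_supported(consumer, impl)]
        print(f"{consumer}: {', '.join(works_with)}")
```

Because the levels are nested, a single ordered comparison is enough: any level2 implementation automatically satisfies a level1 consumer.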
Thoughts?
Plotting was meant to be one of the concrete use cases, and I'd be disappointed if we had to give it up just because it doesn't do heavy lifting.