PERF: groupby nth w/o dropna can use the cython routines #7569

Closed
@jreback

Description

see #7568

  • nth is a fair bit slower than first/last, which call the cython routines. When you don't use dropna, nth can simply call the cython aggregation routines as well.
  • Side issue: move group_last_object/group_nth_object from algos.pyx to generate_code.py (simply move group_last/group_nth from the groupby template to the same template as group_count, which generates the object dtypes).
  • Trap and re-raise an error like ValueError buffer type mismatch in the cython trials. This error is raised when a built-in routine tries to use the cython routines but the function is not defined for that dtype. It SHOULD be defined for all dtypes, so this is a bug that is currently trapped: execution silently falls back to the python path instead of surfacing the error.
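
For reference, a minimal sketch of the relationship the first bullet relies on. The data and column names are made up for illustration. With no missing values in the column, nth(0) and nth(-1) pick the same values as first() and last(); note that first/last skip missing values while nth without dropna is purely positional, so the results can differ once NaNs are present. The index of the nth result differs between pandas versions, so only the values are compared here.

```python
import pandas as pd

# Illustrative data (not from the issue): no NaNs, and the keys
# first appear in sorted order so row order matches group order.
df = pd.DataFrame({
    "key": ["a", "a", "b", "b", "c"],
    "val": [1.0, 2.0, 3.0, 4.0, 5.0],
})
g = df.groupby("key")

# first()/last() are the aggregations the issue says are cython-backed.
first_vals = g.first()["val"].to_numpy()
last_vals = g.last()["val"].to_numpy()

# nth without dropna is positional: row 0 / row -1 of each group.
nth_first_vals = g.nth(0)["val"].to_numpy()
nth_last_vals = g.nth(-1)["val"].to_numpy()

# Without NaNs the selected values coincide.
same_first = (first_vals == nth_first_vals).all()
same_last = (last_vals == nth_last_vals).all()
print(same_first, same_last)
```

To compare speed on your own data, the same calls can be wrapped in timeit; no timings are claimed here.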

Metadata


    Labels

    Error Reporting, Groupby, Performance
