reporting format of benchmarks #12

Closed
@amueller

As discussed here: scikit-learn/scikit-learn#14247 (comment)

I think the current report is very hard to read.
It would help to state very clearly what the baseline is, that is, what a value of 1 means in all the plots: it's your own C++ implementation.

For a comparison with scikit-learn, I think plotting sklearn time / your C++ time would be easier to read, as it shows your speedup factor, not our slow-down factor.
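
A minimal sketch of that ratio, to make the direction explicit. It assumes both implementations are timed on the same task in seconds; the function name and the timings are placeholders for illustration, not measured results.

```python
def speedup(baseline_seconds, candidate_seconds):
    """Return how many times faster the candidate is than the baseline.

    Values above 1 mean the candidate is faster; 1 means parity.
    """
    if baseline_seconds <= 0 or candidate_seconds <= 0:
        raise ValueError("timings must be positive")
    return baseline_seconds / candidate_seconds


# Placeholder timings in seconds, not real benchmark numbers.
sklearn_seconds = 12.0
cpp_seconds = 3.0

print(f"speedup over scikit-learn: {speedup(sklearn_seconds, cpp_seconds):.1f}x")
```

With scikit-learn in the numerator, a value of 1 means "same as scikit-learn" and larger is better, which is usually what a reader expects from a speedup plot.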

Finally, I don't see the number of cores in your benchmark, which is pretty crucial since most of our implementations are single-threaded. Yes, that's a big issue on our side, but saying "we're 100x faster" without saying "on 100 CPUs instead of 1" is quite misleading.
It might be helpful to have a chart of speedup vs number of CPUs.
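
As a rough sketch of the data behind such a chart: given one single-threaded baseline timing and the candidate's timings at several core counts, one could tabulate the speedup and the speedup per core. The function name and all timings below are hypothetical placeholders, not results from any benchmark.

```python
def speedup_by_cores(baseline_seconds, seconds_by_cores):
    """Tabulate speedup against a single-threaded baseline per core count.

    seconds_by_cores maps number of cores -> candidate runtime in seconds.
    Returns a list of (cores, speedup, speedup_per_core), sorted by cores.
    """
    rows = []
    for cores in sorted(seconds_by_cores):
        factor = baseline_seconds / seconds_by_cores[cores]
        rows.append((cores, factor, factor / cores))
    return rows


# Placeholder timings in seconds, for illustration only.
baseline_seconds = 8.0
candidate_seconds = {1: 4.0, 2: 2.5, 4: 1.0}

print("cores  speedup  per-core")
for cores, factor, per_core in speedup_by_cores(baseline_seconds, candidate_seconds):
    print(f"{cores:5d}  {factor:7.2f}  {per_core:8.2f}")
```

The per-core column separates how much of the headline number comes from a faster implementation and how much comes from simply using more CPUs.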
