How to deal with multiple valid test results #149

Depending on the version of matplotlib, the way it was installed, the way other dependencies were installed, and a number of other hard-to-control factors, the results of our image comparison tests can vary slightly.
The differences are large enough to fall well outside the tolerance window, yet the results are still perfectly valid.

So we have a set of multiple valid baseline images for most of our tests, but no good way to check whether a result matches any one of them.

Right now, we are using our own fork of pytest-mpl, which does support testing against multiple baselines: https://github.com/OGGM/pytest-mpl
But maintaining it has been a constant pain, and it gets harder the further upstream pytest-mpl evolves.
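
To make the requirement concrete, here is a minimal sketch of the check we are after: a result passes if it is within tolerance of at least one baseline. This is not the code from our fork or from pytest-mpl. The helper names `rms_difference` and `matches_any_baseline`, the baseline names, and the tolerance value are all made up for illustration, and images are represented as flat lists of pixel values rather than decoded PNG files.

```python
import math
from typing import Dict, Optional, Sequence


def rms_difference(expected: Sequence[float], actual: Sequence[float]) -> float:
    """Root-mean-square difference between two pixel sequences (hypothetical helper)."""
    if len(expected) != len(actual):
        return math.inf  # images of different size can never match
    if not expected:
        return 0.0
    squared = sum((e - a) ** 2 for e, a in zip(expected, actual))
    return math.sqrt(squared / len(expected))


def matches_any_baseline(
    actual: Sequence[float],
    baselines: Dict[str, Sequence[float]],
    tol: float = 2.0,  # assumed threshold, chosen only for this example
) -> Optional[str]:
    """Return the name of the first baseline within `tol`, or None if none match."""
    for name, expected in baselines.items():
        if rms_difference(expected, actual) <= tol:
            return name
    return None


# Two equally valid baselines for the same test (illustrative data only).
baselines = {
    "variant_a": [0, 0, 255, 255],
    "variant_b": [0, 10, 245, 255],
}

result = [0, 9, 246, 255]  # close to variant_b, far from variant_a
matched = matches_any_baseline(result, baselines)
```

In a real test the pixel sequences would come from the rendered figure and the stored baseline files, and the test would fail only when `matches_any_baseline` returns `None`.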

This must be a general issue that affects more projects than just ours. What is the proper way to deal with it?
