Remove nightly builds of external extensions #16100

Conversation
While the idea of noticing inadvertent API breaks early by building some external extensions is nice in theory, it doesn't make much sense in practice:

* we would need to check many more extensions (e.g. parallel and uopz come to mind)
* we also should check external SAPIs for better coverage
* we also should run that on platforms other than Linux for better coverage
* if we introduce a deliberate API break, we need to disable the breaking extensions, and to re-enable the builds once the issue is fixed downstream (e.g. the redis build has been disabled in January, but apparently it has been fixed without us having noticed)
* all the extensions we currently test have their own GH Actions set up; those jobs are likely better suited to notice API breaks, since they can run the respective tests
* if an extension build breaks due to downstream issues, we're probably puzzled about what's going on, while downstream maintainers likely know what broke and how to fix it
* we would need nightly runs which are at least occasionally green; otherwise it is not that likely that anyone even looks at individual failures there (the last green nightly was more than a month ago)

All in all, the costs of actually doing these builds seem to outweigh the benefits. As such, we drop the nightly PECL builds.
I don't have a big preference either way.
Sure, but "there could be more tests" is not a great argument for dropping the existing tests.
That's true, but do they test against master? They might not even test against RC.
It was until we branched 8.4. I already notified Derick about the failure when it happened (he's on holiday until next week), and asked if we can prevent it from occurring every year.
Fair enough.
Ah, right, very good point regarding testing dev (RC should be done by extensions, though).
Thank you! However, with this I was referring to the overall nightly workflow (not the PECL job in particular). The more jobs fail, the less likely anyone looks into them, I'm afraid. After reconsideration, I don't think this single job is a real problem, so I'm closing this PR.
I completely agree. I've essentially been struggling with keeping nightly green since it was moved to GitHub. Apart from the obvious issues, the macOS build sometimes fails with this error:
I haven't found a solution to this problem, and unfortunately it's something GitHub support couldn't help me with. I already freed one CPU core, so that shouldn't be the cause. It might be a particular test, or set thereof, hogging memory or network. Unfortunately, the runner crashes without the output log being available, so it's not easy to identify which test is to blame. Anyway, long story short, I look at nightly daily because of these intermittent errors. If we can find a solution for them, I would be very happy ofc. It's also the reason I'm currently not logging this to some public channel on Slack. If the build was more reliable, that would be an option, so more people look at it and I don't have to ping them manually. 🙂
Looks like this is a common problem with self-hosted runners, but maybe it happens even for hosted runners; it is a GH issue. On the other hand, does this also happen for our push workflow? If not, maybe we're running too many nightly jobs, or it's just a bad time. Anyhow, I'm not overly concerned about an occasionally failing job (of course, that's annoying, though). But when I looked at yesterday's nightlies, I was a bit shocked to see so many failing jobs. PR #16107 and PR #16097 should be a good step.

Now I was looking at https://github.com/php/php-src/actions/runs/11088570760, and was surprised that only a few jobs had failed. On closer investigation I noticed that only branches with new commits are tested. In my opinion, that's not the best idea, since jobs may fail for other reasons (see e.g. PR #16107), and when a new release of a security branch is made, all of a sudden CI may be completely red. In my opinion, we need to keep CI green even for security branches.

And I'm completely aware that the two paragraphs are contradictory ("too many jobs" vs. "always run everything"). Maybe a solution is to have alternating nightlies (run one half of the jobs on one day, and the other half the next day); that might still be good enough to catch regressions.
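The alternating-nightlies idea could be sketched roughly like this (a hypothetical workflow, not the actual php-src setup; job names and the cron time are made up). A gating job decides which half of the matrix runs based on day-of-year parity:

```yaml
# Hypothetical sketch: alternate two halves of the nightly matrix by
# day-of-year parity. Job names and schedule are illustrative only.
name: nightly-alternating
on:
  schedule:
    - cron: '0 2 * * *'   # every night at 02:00 UTC (assumed time)
jobs:
  pick:
    runs-on: ubuntu-latest
    outputs:
      half: ${{ steps.parity.outputs.half }}
    steps:
      - id: parity
        run: |
          # %j is the day of year (001..366); 10# forces base-10 so that
          # leading zeros are not parsed as octal.
          if [ $((10#$(date -u +%j) % 2)) -eq 0 ]; then
            echo "half=A" >> "$GITHUB_OUTPUT"
          else
            echo "half=B" >> "$GITHUB_OUTPUT"
          fi
  half-a:
    needs: pick
    if: needs.pick.outputs.half == 'A'
    runs-on: ubuntu-latest
    steps:
      - run: echo "running half A of the nightly matrix"
  half-b:
    needs: pick
    if: needs.pick.outputs.half == 'B'
    runs-on: ubuntu-latest
    steps:
      - run: echo "running half B of the nightly matrix"
```

One trade-off of keying on day parity: a regression landing in the "wrong" half is only caught a day later, which is usually acceptable for nightly coverage.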
I've seen it on occasion, but it's more noticeable in nightly because nightly runs many more jobs.
Nightly is already a ton of work. Keeping security branches green all the time (not just when we do a security release) essentially means 5 years of support per branch (1 for development, 2 of active support, 2 of security support). Since CI branches often diverge, it's not just a matter of merging one branch lower. This is especially problematic when actual code changes are required to keep the build green, as we're not supposed to do that in security branches. Even Ubuntu LTS will run out for every other PHP release. Also note that with security support now being extended, we would essentially be going from 3 branches to 5 tested in nightly, increasing the number of jobs by 66%.
I've noticed that none of the extensions run CI for PHP 8.4 on Windows (some don't even for other OSes); I've submitted PRs about that.
I know. Thanks for working on that! However, 16:0 votes on https://wiki.php.net/rfc/travis_ci, and 34:0 on https://wiki.php.net/rfc/release_cycle_update, and apparently almost none of the voters care about that. Sad.
I wasn't aware of the Travis CI RFC/policy. It does say:
Which doesn't seem to include security branches. Don't get me wrong, I can see your point. Of course, it would be optimal to have all branches that still get releases passing. I'm just afraid this is going to be a significant pain point. I'd object to running all branches nightly; I don't think that's necessary, and it's wasteful. If we want to keep security branches green, it should be enough to run them weekly. That would reduce build times by a lot. After some time, I can check in my work logs how much time I invested in it. We might also try to get to the bottom of #16100 (comment) by logging to some external service, so that the logs persist. It's annoying, but I don't see any other way, especially since it occurs intermittently. I would appreciate help on these, though.
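A weekly run for security branches could be a separate, much smaller workflow along these lines (a hypothetical fragment; the branch names in the matrix and the schedule are assumptions, not the actual php-src configuration):

```yaml
# Hypothetical sketch: build only the security branches once a week
# instead of nightly. Branch names and timing are illustrative.
name: weekly-security-branches
on:
  schedule:
    - cron: '0 3 * * 6'   # Saturdays at 03:00 UTC (assumed time)
jobs:
  build:
    strategy:
      fail-fast: false            # one red branch shouldn't cancel the rest
      matrix:
        branch: ['PHP-8.1', 'PHP-8.2']   # assumed security branches
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ matrix.branch }}
      # ... build and test steps for the checked-out branch ...
```

Keeping this separate from the nightly workflow means the big nightly matrix stays at 3 branches, while security branches still get a regular (if less frequent) signal.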
After reconsideration, I think it would even be okay to have that. I'll try to help with getting nightly green; last night's run doesn't look too bad. And I doubt that we can resolve all these intermittent test failures anyway, especially when nightly runs so many jobs on hosted runners.
@iluuu1994, the nightly MSAN issue should now be fixed via 4a16940.
Thank you!
Note that I'm not sure which branch this needs to target. But maybe we want to keep these builds anyway, so that wouldn't matter.