Skip to content

Revert "Merge pull request #5837 from rust-lang/share-background-worker-connection" #5909

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 11, 2023

Conversation

Turbo87
Copy link
Member

@Turbo87 Turbo87 commented Jan 10, 2023

This reverts commit 8283354, reversing changes made to ba69940.

After deploying #5837 to production the update_downloads background job was triggering the "stuck background job" warning after a while, with the logs showing:

Job 1212273 failed to run: deadlock detected

With the on-call team we narrowed the issue down to #5837 and deployed the same revision with that PR reverted (aka. this branch). This apparently resolved the issue on production.

@jtgeibel any clue what might have caused this? any objections on reverting #5837 for now to keep master deployable? :)

…ound-worker-connection"

This reverts commit 8283354, reversing
changes made to ba69940.
@Turbo87 Turbo87 added C-bug 🐞 Category: unintended, undesired behavior A-backend ⚙️ labels Jan 10, 2023
@jtgeibel
Copy link
Member

Yeah, lets revert until we can track down the issue.

@jtgeibel jtgeibel merged commit 838ac1d into rust-lang:master Jan 11, 2023
@jtgeibel
Copy link
Member

Okay, I think I understand what is going on. The update_downloads job is long running and touches tons of rows of the crates table. With this change we are basically running all of this within a transaction which means that nothing is actually committed until the background job finishes. This leaves a huge window for conflicting updates and probably leads to these deadlocks.

For this endpoint we can ignore the existing connection and obtain our own. That will guarantee that we aren't in a transaction, and I don't think we have existing test coverage of this so hopefully no tests that start failing again 🤞. (I'm not sure what a test for this would look like, but I think we could reuse the unhealthy_database test infrastructure to test things like this if needed.)

I'll take a look at the other background jobs to see if they could have similar issues. I think I'm looking for any non-SELECT that could be pending for more than a few seconds before the job completes.

@Turbo87 Turbo87 deleted the revert-bg-worker-changes branch January 11, 2023 09:36
jtgeibel added a commit to jtgeibel/crates.io that referenced this pull request Jan 12, 2023
…er-changes"

This reverts commit 838ac1d, reversing
changes made to a732c1f.

This reintroduces changes that were originally landed in rust-lang#5837 and
reverted in rust-lang#5909.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-backend ⚙️ C-bug 🐞 Category: unintended, undesired behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants