
Possible mitigations for occasionally slow database requests #2153

Closed
@jtgeibel

Description

We've seen some encouraging results from moving some queries over to the read-only replica in #2073, but I think there are other improvements we could make to the performance of our primary database as well.

Problem description

Occasionally we see bursts of slow queries. Moving a few of our expensive queries to the read-only replica has helped, but that only shifts a few hundred queries per hour, and I don't believe we've actually addressed the root cause.

In particular, I'm surprised to see a lot of auto-vacuums occurring on the crates table: 2 to 3 per hour. We're also seeing frequent vacuuming of pg_toast_25513, which I'm guessing is the TOAST table associated with the crates table.
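Both guesses are straightforward to check from a psql session; a minimal sketch, relying only on the standard catalogs and statistics views:

    -- Find the table that owns pg_toast_25513 via the system catalogs.
    SELECT relname
    FROM pg_class
    WHERE reltoastrelid = 'pg_toast.pg_toast_25513'::regclass;

    -- Check how often autovacuum has actually run on crates.
    SELECT relname, autovacuum_count, last_autovacuum
    FROM pg_stat_user_tables
    WHERE relname = 'crates';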

What I expect is happening is that after the update_downloads background job runs, we end up with a lot of dead tuples in the crates table. This table includes several columns that can become wide (a quick way to check their actual stored sizes is sketched after the list):

  • character varying
    • description
    • homepage
    • documentation
    • readme - can become very large
    • repository
  • tsvector
    • textsearchable_index_col
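
As a quick sanity check on how wide these actually get, something like the following would do; pg_column_size reports the stored (possibly compressed) size of a value, and the column choices here are illustrative:

    -- Rough distribution of stored sizes for the widest columns.
    SELECT avg(pg_column_size(readme))::int AS avg_readme_bytes,
           max(pg_column_size(readme))      AS max_readme_bytes,
           max(pg_column_size(textsearchable_index_col)) AS max_tsv_bytes
    FROM crates;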

Proposed mitigations

Improve the update_downloads job

We currently grab all version_downloads that need to be processed, then loop over those results, committing one transaction for each. This means that if a popular crate has download requests for 10 different versions, we will commit to the crates table 10 times for that crate, quickly creating garbage in this table.

If we instead grouped the initial query by crate_id, we could batch the work for a crate into a single transaction.
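
A sketch of what that grouping could look like in plain SQL; the version_downloads column names used here (downloads, counted, processed) are assumptions from my reading of the schema, not a drop-in patch:

    -- Bump each crate's counter exactly once per run, instead of once
    -- per version with pending downloads.
    UPDATE crates
    SET downloads = crates.downloads + grouped.new_downloads
    FROM (
        SELECT v.crate_id, sum(vd.downloads - vd.counted) AS new_downloads
        FROM version_downloads vd
        JOIN versions v ON v.id = vd.version_id
        WHERE NOT vd.processed
        GROUP BY v.crate_id
    ) AS grouped
    WHERE crates.id = grouped.crate_id;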

More aggressive TOASTing on the crates table

The other part of the problem seems to be that rows in the crates table can be large. By default PG will compress these columns once a row exceeds 2kB, but a row can grow to 8kB before columns are evicted to the TOAST table. So we're creating a lot of garbage tuples, and those tuples may be large.

I don't see any obvious way to force these columns into TOAST. None of the 4 available column storage strategies seem to force a column into external storage. We could ALTER TABLE crates SET (toast_tuple_target = 256) or some other small value, but I'm not entirely certain this will actually push more data out to TOAST. It should make PG more aggressive about compressing the data, but I'm unclear on whether it will force anything into TOAST before hitting the 8kB maximum row length.
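For reference, the two levers discussed above would be written as follows; toast_tuple_target is a storage parameter available since PostgreSQL 11, and 256 is an experimental value rather than a recommendation:

    -- Lower the row size at which PG starts compressing and evicting
    -- columns; this only affects rows written after the change.
    ALTER TABLE crates SET (toast_tuple_target = 256);

    -- Per-column storage strategy: EXTENDED (the varchar default)
    -- allows compression and out-of-line storage, while EXTERNAL
    -- skips compression but still allows out-of-line storage.
    ALTER TABLE crates ALTER COLUMN readme SET STORAGE EXTERNAL;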

Importantly, the documentation does say "During an UPDATE operation, values of unchanged fields are normally preserved as-is; so an UPDATE of a row with out-of-line values incurs no TOAST costs if none of the out-of-line values change." This means that if we can force more data into TOAST then our updates of the download counts should not result in churn to that table. (Publishing new versions of a crate will update these values and will still cause some churn, but this is obviously much lower in volume than downloads.)

Leave more empty space in the crates table

I can't remember the specific knob that configures this, but I recall reading that there is a way to encourage PG to leave some extra space within each 8kB page. As I understand it, this increases the chance that PG can fit an updated row into the same page as the old one, which reduces the overhead of updating indexes.
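Presumably the knob in question is the fillfactor storage parameter; a sketch with an illustrative value:

    -- With fillfactor below 100, each heap page keeps some free space,
    -- so an updated row version can often stay on the same page (a HOT
    -- update) and the indexes don't need to be touched.
    ALTER TABLE crates SET (fillfactor = 90);
    -- Only newly written pages honor the setting; existing pages are
    -- repacked on the next table rewrite (e.g. VACUUM FULL or CLUSTER,
    -- both of which take heavy locks).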

Manually extract columns from the crates table

This would be a more invasive change, as we would have to update any queries that use these columns. It's also not clear to me exactly how we would go about that with minimal impact on the production database, but if the above solutions don't provide sufficient improvements then this is something we can explore.
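For concreteness, a hypothetical version of this split; the crate_readmes table and its shape are made up for illustration, not an existing migration:

    -- Move the widest column into its own table so that download-count
    -- updates never rewrite it.
    CREATE TABLE crate_readmes (
        crate_id integer PRIMARY KEY REFERENCES crates (id),
        readme   text
    );

    INSERT INTO crate_readmes (crate_id, readme)
    SELECT id, readme FROM crates WHERE readme IS NOT NULL;

    ALTER TABLE crates DROP COLUMN readme;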

cc @sgrif

Metadata

Labels

A-backend ⚙️, C-internal 🔧 (Nonessential work that would make the codebase more consistent or clear)
