Skip to content

Update pgo test to check connectivity on instances & endpoints ([ch5712]) #1076

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Nov 19, 2019

Conversation

jkatz
Copy link
Contributor

@jkatz jkatz commented Nov 2, 2019

Checklist:

  • Have you added an explanation of what your changes do and why you'd like them to be included?
  • Have you updated or added documentation for the change, as applicable?
  • Have you tested your changes on all related environments with successful results, as applicable?

Type of Changes:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

What is the current behavior? (link to any open issues here)

Previously, pgo test would check if each individual user available to a PostgreSQL cluster could authenticate with its credentials. While this certainly determines that one can make a connection to a PostgreSQL database managed by the Operator, it also presents several issues:

  • If one has many users available to the Operator, the number of connections made can rapidly multiply. Imagine 10 users across 5 replicas
  • This also means that the user's credentials must be stored in the Operator. Currently this is a plaintext password. If the connections are not being made over SSL, this creates the risk of password leakage.

What is the new behavior (if this is a feature change)?

Update the pgo test to do the following:

  • Provide an instance (pod) level readiness check. The pod readiness check already calls pg_isready, which checks to see if a PostgreSQL instance is available to accept connections
  • Check the service endpoints to see if they are ready to accept connections.

These two checks provide the equivalent behavior of what is currently happening today with pgo test but in a more efficient manner with less potential for information leakage.

The API endpoint now also serves content that is in a structured format.

Additionally, the event that was fired when "test event" is called is removed. This does not affect any lifecycle activity of a PostgreSQL cluster and as it is fired ad-hoc by a user, does not need to be propagated downstream.

Other information:

Here is a test case:

Client #1:

watch pgo test -n jkatz jkatz

Client #2:

pgo create namespace -n jkatz
# observe Client #1
pgo create cluster -n jkatz jkatz
# observe Client #1
pgo scale cluster -n jkatz jkatz
# observe Client #1

@jkatz jkatz requested a review from andrewlecuyer November 2, 2019 16:14
@jkatz jkatz self-assigned this Nov 2, 2019
Copy link
Collaborator

@andrewlecuyer andrewlecuyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not run into any issues when testing this PR, so functionally the change looks good. There is one suggestion that I have involving changing the header displayed in the CLI output from Endpoints to Services, but otherwise the other notes are simply some minor notes for cleaning up comments.

Jonathan S. Katz added 2 commits November 18, 2019 20:44
Previously, the PostgreSQL cluster test (from the command line, known as
`pgo test`) would check if each individual user available to a PostgreSQL
cluster could authenticate with its credentials. While this certainly
determines that one can make a connection to a PostgreSQL database managed by
the Operator, it also presents several issues:

- If one has many users available to the Operator, the number of connections
made can rapidly multiply. Imagine 10 users across 5 replicas
- This also means that the user's credentials must be stored in the Operator.
Currently this is a plaintext password. If the connections are _not_ being made
over SSL, this creates the risk of password leakage.

The test now performs two types of checks: instance (or pod) readiness checks,
and endpoint availability. Both of these are designed to check if a PostgreSQL
instance an be connected to and reached from the outside world. Specifically:

- The pod readiness check calls `pg_isready` which can quickly determine if a
PostgreSQL is accepting connections
- Check if the service endpoints responsible for serving the PostgreSQL are
reachable.

If both of the checks pass, then a PostgreSQL cluster is reachable.

This commit also does the following:

- BREAKING: Modify the API endpoint that is consumed by `pgo test` to deliver
the content in a structure manner
- BREAKING: Remove the "TestCluster" event. This is not a lifecycle event.
- Introduce the "Endpoints" Kubernetes construct to the `kubeapi` package
- Some refactoring, cleanup, and improved documentation around the connectivity
tests

Issue: [ch5712]
@jkatz jkatz force-pushed the pgo-test-optimization branch from 3d3d56c to c860a17 Compare November 19, 2019 04:52
@jkatz jkatz requested a review from andrewlecuyer November 19, 2019 04:52
@jkatz
Copy link
Contributor Author

jkatz commented Nov 19, 2019

@andrewlecuyer Thanks for the review. I've made modifications per your suggestions.

@jkatz jkatz merged commit b5305c4 into CrunchyData:master Nov 19, 2019
@jkatz jkatz deleted the pgo-test-optimization branch December 23, 2019 19:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants