Add longevity test plan and results #1113
Merged
9 commits
404ff96 Add longevity test plan and results (pleshakov)
bb50c6e Apply suggestions from code review (pleshakov)
d917a4e Update tests/longevity/results/1.0.0.md (pleshakov)
63cd53e Update tests/longevity/longevity.md (pleshakov)
dcf712d Refactor dir structure (pleshakov)
8b57e4b Mention where to look for logs and metrics (pleshakov)
38f804d Make it explicit that tester VMs are on Google Cloud (pleshakov)
8e0ae72 Fix linting (pleshakov)
6e9886d Merge branch 'main' into test/longevity (pleshakov)
@@ -0,0 +1,149 @@
# Longevity Test

This document describes how we test NGF for longevity.

<!-- TOC -->

- [Longevity Test](#longevity-test)
  - [Goals](#goals)
  - [Test Environment](#test-environment)
  - [Steps](#steps)
    - [Start](#start)
    - [Check the Test is Running Correctly](#check-the-test-is-running-correctly)
    - [End](#end)
  - [Analyze](#analyze)
  - [Results](#results)

<!-- TOC -->

## Goals

- Ensure that NGF successfully processes both control plane and data plane transactions over a period of time much
  greater than in our other tests.
- Catch bugs that could only appear over a period of time (like resource leaks).

## Test Environment

- A Kubernetes cluster with 3 nodes on GKE
  - Node: e2-medium (2 vCPU, 4GB memory)
  - Enabled GKE logging.
  - Enabled GKE Cloud monitoring with managed Prometheus service, with enabled:
    - system.
    - kube state - pods, deployments.
- Tester VMs on Google Cloud:
  - Configuration:
    - Debian
    - Install packages: tmux, wrk
  - Location - same zone as the Kubernetes cluster.
  - First VM - for sending HTTP traffic
  - Second VM - for sending HTTPS traffic
- NGF
  - Deployment with 1 replica
  - Exposed via a Service with type LoadBalancer, private IP
  - Gateway, two listeners - HTTP and HTTPS
  - Two apps:
    - Coffee - 3 replicas
    - Tea - 3 replicas
  - Two HTTPRoutes
    - Coffee (HTTP)
    - Tea (HTTPS)

## Steps

### Start

Test duration - 4 days.

1. Create a Kubernetes cluster on GKE.
2. Deploy NGF.
3. Expose NGF via a LoadBalancer Service with `"networking.gke.io/load-balancer-type":"Internal"` annotation to
   allocate an internal load balancer.
4. Apply the manifests which will:
   1. Deploy the coffee and tea backends.
   2. Configure HTTP and HTTPS listeners on the Gateway.
   3. Expose coffee via HTTP listener and tea via HTTPS listener.
   4. Create two CronJobs to re-rollout backends:
      1. Coffee - every minute for an hour every 6 hours
      2. Tea - every minute for an hour every 6 hours, 3 hours apart from coffee.
   5. Configure Prometheus on GKE to pick up NGF metrics.

   ```shell
   kubectl apply -f files
   ```

5. On both tester VMs, add an `/etc/hosts` entry that maps `cafe.example.com` to the External IP of the NGF Service
   (`10.128.0.10` in this case):

   ```text
   10.128.0.10 cafe.example.com
   ```

6. On both tester VMs, start a tmux session so that any launched command keeps running even if you disconnect from
   the VM:

   ```shell
   tmux
   ```

7. On the first VM, run wrk for 4 days against coffee over HTTP:

   ```shell
   wrk -t2 -c100 -d96h http://cafe.example.com/coffee
   ```

8. On the second VM, run wrk for 4 days against tea over HTTPS:

   ```shell
   wrk -t2 -c100 -d96h https://cafe.example.com/tea
   ```

Notes:

- The updated coffee and tea backends in cafe.yaml include extra configuration (a readiness probe and a preStop
  sleep) for zero-downtime upgrades, so that wrk on the tester VMs doesn't get 502 responses from NGF. Based on
  https://learnk8s.io/graceful-shutdown

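The CronJob timing in step 4 ("every minute for an hour every 6 hours") comes from the hour field of the two cron schedules, `*/6` for coffee and `3,9,15,21` for tea, combined with `*` in the minute field. The sketch below expands those hour fields to show which hours each rollout window covers. The helper name `expand_hours` is hypothetical, and it only handles the two field forms used in this test, not full cron syntax.

```shell
#!/bin/sh
# Expand a cron hour field into the list of matching hours.
# Supports only "*/N" and comma-separated lists (the forms used in this test).
expand_hours() {
  field=$1
  case "$field" in
    '*/'*)
      step=${field#\*/}          # strip the leading "*/" to get the step
      h=0
      out=""
      while [ "$h" -lt 24 ]; do
        out="${out:+$out }$h"    # append with a space separator
        h=$((h + step))
      done
      echo "$out"
      ;;
    *)
      echo "$field" | tr ',' ' '
      ;;
  esac
}

echo "coffee hours: $(expand_hours '*/6')"        # coffee hours: 0 6 12 18
echo "tea hours:    $(expand_hours '3,9,15,21')"  # tea hours:    3 9 15 21
```

Because the minute field is `*`, each listed hour is a 60-minute window with one rollout per minute, and the tea windows start 3 hours after the coffee windows.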

### Check the Test is Running Correctly

Check that you don't see any errors:

1. Check that GKE exports NGF pod logs to Google Cloud Operations Logging and Prometheus metrics to Google Cloud
   Monitoring.
2. Check that traffic is flowing - look at the access logs of NGINX in Google Cloud Operations Logging.
3. Check that the CronJobs can run by creating a Job from each of them manually:

   ```shell
   kubectl create job --from=cronjob/coffee-rollout-mgr coffee-test
   kubectl create job --from=cronjob/tea-rollout-mgr tea-test
   ```

If you see errors, double-check that you prepared the environment and launched the test correctly.
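
One way to confirm that the manually created Jobs actually succeeded is to wait for them to complete and then read their logs. The following is a minimal sketch, assuming the Job names `coffee-test` and `tea-test` from the commands above; the function name `check_rollout_jobs` and the 120-second timeout are arbitrary choices, not part of the test plan.

```shell
# Wait for each manually created Job to complete, then print its logs.
# Returns non-zero as soon as one Job fails to complete within the timeout.
check_rollout_jobs() {
  for job in coffee-test tea-test; do
    kubectl wait --for=condition=complete --timeout=120s "job/$job" || return 1
    kubectl logs "job/$job"
  done
}
```

Call `check_rollout_jobs` after creating the Jobs. Since the CronJobs run curl with `-v`, the logs should include the HTTP status of the PATCH request to the Kubernetes API.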

### End

- Remove CronJobs.

## Analyze

- Traffic
  - Tester VMs (clients)
    - When wrk stops, it prints its results. To attach to the tmux session running wrk,
      run `tmux attach -t 0`
    - Check for errors, latency, RPS
- Logs
  - Check the logs for errors in Google Cloud Operations Logging.
    - NGF
    - NGINX
- Check metrics in Google Cloud Monitoring.
  - NGF
    - CPU usage
      - NGINX
      - NGF
    - Memory usage
      - NGINX
      - NGF
    - NGINX metrics
    - Reloads
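
To make the "errors, latency, RPS" check on the wrk output less manual, the summary can be reduced to one line. The sketch below assumes the usual wrk summary layout (a `Latency` row under `Thread Stats`, a `Requests/sec:` line, and the `Socket errors:` and `Non-2xx or 3xx responses:` lines, which wrk prints only when such errors occurred). The function name `summarize_wrk` is hypothetical.

```shell
# Read wrk's summary from stdin and print average latency, requests/sec,
# the count of non-2xx/3xx responses, and any socket errors on one line.
summarize_wrk() {
  awk '
    $1 == "Latency" && $2 != "Distribution" { latency = $2 }
    $1 == "Socket" && $2 == "errors:"       { sock = $0; sub(/^ *Socket errors: */, "", sock) }
    /Non-2xx or 3xx responses:/             { non2xx = $NF }
    $1 == "Requests/sec:"                   { rps = $2 }
    END {
      printf "avg_latency=%s rps=%s non_2xx_3xx=%s socket_errors=\"%s\"\n",
        latency, rps, (non2xx == "" ? 0 : non2xx), (sock == "" ? "none" : sock)
    }'
}
```

For example, if the wrk output was copied from the tmux session into a file (the name `coffee-wrk.txt` is made up here), run `summarize_wrk < coffee-wrk.txt`.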

## Results

- [1.0.0](results/1.0.0/1.0.0.md)
@@ -0,0 +1,37 @@
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: coffee
spec:
  parentRefs:
  - name: gateway
    sectionName: http
  hostnames:
  - "cafe.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /coffee
    backendRefs:
    - name: coffee
      port: 80
---
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: tea
spec:
  parentRefs:
  - name: gateway
    sectionName: https
  hostnames:
  - "cafe.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /tea
    backendRefs:
    - name: tea
      port: 80
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
apiVersion: v1 | ||
kind: Secret | ||
metadata: | ||
name: cafe-secret | ||
type: kubernetes.io/tls | ||
data: | ||
tls.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUNzakNDQVpvQ0NRQzdCdVdXdWRtRkNEQU5CZ2txaGtpRzl3MEJBUXNGQURBYk1Sa3dGd1lEVlFRRERCQmoKWVdabExtVjRZVzF3YkdVdVkyOXRNQjRYRFRJeU1EY3hOREl4TlRJek9Wb1hEVEl6TURjeE5ESXhOVEl6T1ZvdwpHekVaTUJjR0ExVUVBd3dRWTJGbVpTNWxlR0Z0Y0d4bExtTnZiVENDQVNJd0RRWUpLb1pJaHZjTkFRRUJCUUFECmdnRVBBRENDQVFvQ2dnRUJBTHFZMnRHNFc5aStFYzJhdnV4Q2prb2tnUUx1ek10U1Rnc1RNaEhuK3ZRUmxIam8KVzFLRnMvQVdlS25UUStyTWVKVWNseis4M3QwRGtyRThwUisxR2NKSE50WlNMb0NEYUlRN0Nhck5nY1daS0o4Qgo1WDNnVS9YeVJHZjI2c1REd2xzU3NkSEQ1U2U3K2Vab3NPcTdHTVF3K25HR2NVZ0VtL1Q1UEMvY05PWE0zZWxGClRPL051MStoMzROVG9BbDNQdTF2QlpMcDNQVERtQ0thaEROV0NWbUJQUWpNNFI4VERsbFhhMHQ5Z1o1MTRSRzUKWHlZWTNtdzZpUzIrR1dYVXllMjFuWVV4UEhZbDV4RHY0c0FXaGRXbElweHlZQlNCRURjczN6QlI2bFF1OWkxZAp0R1k4dGJ3blVmcUVUR3NZdWxzc05qcU95V1VEcFdJelhibHhJZVVDQXdFQUFUQU5CZ2txaGtpRzl3MEJBUXNGCkFBT0NBUUVBcjkrZWJ0U1dzSnhLTGtLZlRkek1ISFhOd2Y5ZXFVbHNtTXZmMGdBdWVKTUpUR215dG1iWjlpbXQKL2RnWlpYVE9hTElHUG9oZ3BpS0l5eVVRZVdGQ2F0NHRxWkNPVWRhbUloOGk0Q1h6QVJYVHNvcUNOenNNLzZMRQphM25XbFZyS2lmZHYrWkxyRi8vblc0VVNvOEoxaCtQeDljY0tpRDZZU0RVUERDRGh1RUtFWXcvbHpoUDJVOXNmCnl6cEJKVGQ4enFyM3paTjNGWWlITmgzYlRhQS82di9jU2lyamNTK1EwQXg4RWpzQzYxRjRVMTc4QzdWNWRCKzQKcmtPTy9QNlA0UFlWNTRZZHMvRjE2WkZJTHFBNENCYnExRExuYWRxamxyN3NPbzl2ZzNnWFNMYXBVVkdtZ2todAp6VlZPWG1mU0Z4OS90MDBHUi95bUdPbERJbWlXMGc9PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg== | ||
tls.key: LS0tLS1CRUdJTiBQUklWQVRFIEtFWS0tLS0tCk1JSUV2UUlCQURBTkJna3Foa2lHOXcwQkFRRUZBQVNDQktjd2dnU2pBZ0VBQW9JQkFRQzZtTnJSdUZ2WXZoSE4KbXI3c1FvNUtKSUVDN3N6TFVrNExFeklSNS9yMEVaUjQ2RnRTaGJQd0ZuaXAwMFBxekhpVkhKYy92TjdkQTVLeApQS1VmdFJuQ1J6YldVaTZBZzJpRU93bXF6WUhGbVNpZkFlVjk0RlAxOGtSbjl1ckV3OEpiRXJIUncrVW51L25tCmFMRHF1eGpFTVBweGhuRklCSnYwK1R3djNEVGx6TjNwUlV6dnpidGZvZCtEVTZBSmR6N3Rid1dTNmR6MHc1Z2kKbW9RelZnbFpnVDBJek9FZkV3NVpWMnRMZllHZWRlRVJ1VjhtR041c09va3R2aGxsMU1udHRaMkZNVHgySmVjUQo3K0xBRm9YVnBTS2NjbUFVZ1JBM0xOOHdVZXBVTHZZdFhiUm1QTFc4SjFINmhFeHJHTHBiTERZNmpzbGxBNlZpCk0xMjVjU0hsQWdNQkFBRUNnZ0VBQnpaRE50bmVTdWxGdk9HZlFYaHRFWGFKdWZoSzJBenRVVVpEcUNlRUxvekQKWlV6dHdxbkNRNlJLczUyandWNTN4cU9kUU94bTNMbjNvSHdNa2NZcEliWW82MjJ2dUczYnkwaVEzaFlsVHVMVgpqQmZCcS9UUXFlL2NMdngvSkczQWhFNmJxdFRjZFlXeGFmTmY2eUtpR1dzZk11WVVXTWs4MGVJVUxuRmZaZ1pOCklYNTlSOHlqdE9CVm9Sa3hjYTVoMW1ZTDFsSlJNM3ZqVHNHTHFybmpOTjNBdWZ3ZGRpK1VDbGZVL2l0K1EvZkUKV216aFFoTlRpNVFkRWJLVStOTnYvNnYvb2JvandNb25HVVBCdEFTUE05cmxFemIralQ1WHdWQjgvLzRGY3VoSwoyVzNpcjhtNHVlQ1JHSVlrbGxlLzhuQmZ0eVhiVkNocVRyZFBlaGlPM1FLQmdRRGlrR3JTOTc3cjg3Y1JPOCtQClpoeXltNXo4NVIzTHVVbFNTazJiOTI1QlhvakpZL2RRZDVTdFVsSWE4OUZKZnNWc1JRcEhHaTFCYzBMaTY1YjIKazR0cE5xcVFoUmZ1UVh0UG9GYXRuQzlPRnJVTXJXbDVJN0ZFejZnNkNQMVBXMEg5d2hPemFKZUdpZVpNYjlYTQoybDdSSFZOcC9jTDlYbmhNMnN0Q1lua2Iwd0tCZ1FEUzF4K0crakEyUVNtRVFWNXA1RnRONGcyamsyZEFjMEhNClRIQ2tTazFDRjhkR0Z2UWtsWm5ZbUt0dXFYeXNtekJGcnZKdmt2eUhqbUNYYTducXlpajBEdDZtODViN3BGcVAKQWxtajdtbXI3Z1pUeG1ZMXBhRWFLMXY4SDNINGtRNVl3MWdrTWRybVJHcVAvaTBGaDVpaGtSZS9DOUtGTFVkSQpDcnJjTzhkUVp3S0JnSHA1MzRXVWNCMVZibzFlYStIMUxXWlFRUmxsTWlwRFM2TzBqeWZWSmtFb1BZSEJESnp2ClIrdzZLREJ4eFoyWmJsZ05LblV0YlhHSVFZd3lGelhNcFB5SGxNVHpiZkJhYmJLcDFyR2JVT2RCMXpXM09PRkgKcmppb21TUm1YNmxhaDk0SjRHU0lFZ0drNGw1SHhxZ3JGRDZ2UDd4NGRjUktJWFpLZ0w2dVJSSUpBb0dCQU1CVApaL2p5WStRNTBLdEtEZHUrYU9ORW4zaGxUN3hrNXRKN3NBek5rbWdGMU10RXlQUk9Xd1pQVGFJbWpRbk9qbHdpCldCZ2JGcXg0M2ZlQ1Z4ZXJ6V3ZEM0txaWJVbWpCTkNMTGtYeGh3ZEVteFQwVit2NzZGYzgwaTNNYVdSNnZZR08KditwVVovL0F6UXdJcWZ6dlVmV2ZxdStrMHlhVXhQOGNlcFB
IRyt0bEFvR0FmQUtVVWhqeFU0Ym5vVzVwVUhKegpwWWZXZXZ5TW54NWZyT2VsSmRmNzlvNGMvMHhVSjh1eFBFWDFkRmNrZW96dHNpaVFTNkN6MENRY09XVWxtSkRwCnVrdERvVzM3VmNSQU1BVjY3NlgxQVZlM0UwNm5aL2g2Tkd4Z28rT042Q3pwL0lkMkJPUm9IMFAxa2RjY1NLT3kKMUtFZlNnb1B0c1N1eEpBZXdUZmxDMXc9Ci0tLS0tRU5EIFBSSVZBVEUgS0VZLS0tLS0K |
@@ -0,0 +1,81 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coffee
spec:
  replicas: 3
  selector:
    matchLabels:
      app: coffee
  template:
    metadata:
      labels:
        app: coffee
    spec:
      containers:
      - name: coffee
        image: nginxdemos/nginx-hello:plain-text
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /
            port: 8080
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sleep", "15"]
---
apiVersion: v1
kind: Service
metadata:
  name: coffee
spec:
  ports:
  - port: 80
    targetPort: 8080
    protocol: TCP
    name: http
  selector:
    app: coffee
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tea
spec:
  replicas: 3
  selector:
    matchLabels:
      app: tea
  template:
    metadata:
      labels:
        app: tea
    spec:
      containers:
      - name: tea
        image: nginxdemos/nginx-hello:plain-text
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /
            port: 8080
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sleep", "15"]
---
apiVersion: v1
kind: Service
metadata:
  name: tea
spec:
  ports:
  - port: 80
    targetPort: 8080
    protocol: TCP
    name: http
  selector:
    app: tea
@@ -0,0 +1,92 @@
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rollout-mgr
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: rollout-mgr
  namespace: default
rules:
- apiGroups:
  - "apps"
  resources:
  - deployments
  verbs:
  - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: rollout-mgr
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: rollout-mgr
subjects:
- kind: ServiceAccount
  name: rollout-mgr
  namespace: default
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: coffee-rollout-mgr
  namespace: default
spec:
  schedule: "* */6 * * *" # every minute of hours 0, 6, 12, and 18
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: rollout-mgr
          containers:
          - name: coffee-rollout-mgr
            image: curlimages/curl:8.3.0
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -c
            args:
            - |
              TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
              RESTARTED_AT=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
              curl -X PATCH -s -k -v \
                -H "Authorization: Bearer $TOKEN" \
                -H "Content-type: application/merge-patch+json" \
                --data-raw "{\"spec\": {\"template\": {\"metadata\": {\"annotations\": {\"kubectl.kubernetes.io/restartedAt\": \"$RESTARTED_AT\"}}}}}" \
                "https://kubernetes/apis/apps/v1/namespaces/default/deployments/coffee?fieldManager=kubectl-rollout" 2>&1
          restartPolicy: OnFailure
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: tea-rollout-mgr
  namespace: default
spec:
  schedule: "* 3,9,15,21 * * *" # every minute of hours 3, 9, 15, and 21 (3 hours apart from coffee)
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: rollout-mgr
          containers:
          - name: tea-rollout-mgr
            image: curlimages/curl:8.3.0
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -c
            args:
            - |
              TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
              RESTARTED_AT=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
              curl -X PATCH -s -k -v \
                -H "Authorization: Bearer $TOKEN" \
                -H "Content-type: application/merge-patch+json" \
                --data-raw "{\"spec\": {\"template\": {\"metadata\": {\"annotations\": {\"kubectl.kubernetes.io/restartedAt\": \"$RESTARTED_AT\"}}}}}" \
                "https://kubernetes/apis/apps/v1/namespaces/default/deployments/tea?fieldManager=kubectl-rollout" 2>&1
          restartPolicy: OnFailure
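
The heavily escaped `--data-raw` argument in both CronJobs is a JSON Merge Patch that sets the `kubectl.kubernetes.io/restartedAt` annotation on the Deployment's pod template. This is the same annotation that `kubectl rollout restart` sets, so changing its value triggers a new rollout. The sketch below builds the same body with `printf` so it is easier to read; the helper name `restart_patch` is hypothetical, and the timestamp is passed as an argument to keep the output reproducible.

```shell
#!/bin/sh
# Print the JSON Merge Patch body used by the rollout CronJobs.
# $1 is the timestamp to store in the restartedAt annotation.
restart_patch() {
  printf '{"spec": {"template": {"metadata": {"annotations": {"kubectl.kubernetes.io/restartedAt": "%s"}}}}}\n' "$1"
}

# Same timestamp format as RESTARTED_AT in the CronJobs.
restart_patch "$(date -u +"%Y-%m-%dT%H:%M:%SZ")"
```

Only the pod template annotation changes between runs, which is why the Role above needs nothing beyond the `patch` verb on Deployments.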