Commit 4e293d7

Add architecture and design principles doc (#715)
Problem: There's no document that describes how NKG works or what our design principles are. Solution: Add an architecture and design principles doc.
1 parent 0151b8d commit 4e293d7

8 files changed

+194
-3
lines changed

CONTRIBUTING.md

Lines changed: 4 additions & 2 deletions
@@ -89,8 +89,10 @@ Before beginning development, familiarize yourself with the following documents:
8989
procedures. This document explains how to write and run unit tests, and how to manually verify changes.
9090
- [Pull Request Guidelines](/docs/developer/pull-request.md): A guide for both pull request submitters and reviewers,
9191
outlining guidelines and best practices to ensure smooth and efficient pull request processes.
92-
- [Go Style Guide](/docs/developer/go-style-guide.md) A coding style guide for Go. Contains best practices and
93-
conventions to follow when writing Go code for the project.
92+
- [Go Style Guide](/docs/developer/go-style-guide.md): A coding style guide for Go. Contains best practices and
93+
conventions to follow when writing Go code for the project.
94+
- [Architecture](/docs/architecture.md): A high-level overview of the project's architecture.
95+
- [Design Principles](/docs/developer/design-principles.md): An overview of the project's design principles.
9496

9597
## Contributor License Agreement
9698

README.md

Lines changed: 2 additions & 0 deletions
@@ -9,6 +9,8 @@ For a list of supported Gateway API resources and features, see the [Gateway API
99
> Warning: This project is actively in development (beta feature state) and should not be deployed in a production environment.
1010
> All APIs, SDKs, designs, and packages are subject to change.
1111
12+
Learn about our [design principles](/docs/developer/design-principles.md) and [architecture](/docs/architecture.md).
13+
1214
## Getting Started
1315

1416
1. [Quick Start on a kind cluster](docs/running-on-kind.md).

docs/architecture.md

Lines changed: 135 additions & 0 deletions
@@ -0,0 +1,135 @@
1+
# Architecture
2+
3+
This document provides an overview of the architecture and design principles of the NGINX Kubernetes Gateway. The target
4+
audience includes the following groups:
5+
6+
* *Cluster Operators* who would like to know how the software works and also better understand how it can fail.
7+
* *Developers* who would like to [contribute][contribute] to the project.
8+
9+
We assume that the reader is familiar with core Kubernetes concepts, such as Pods, Deployments, Services, and Endpoints.
10+
Additionally, we recommend reading [this blog post][blog] for an overview of the NGINX architecture.
11+
12+
[contribute]: https://github.com/nginxinc/nginx-kubernetes-gateway/blob/main/CONTRIBUTING.md
13+
14+
[blog]: https://www.nginx.com/blog/inside-nginx-how-we-designed-for-performance-scale/
15+
16+
## What is NGINX Kubernetes Gateway?
17+
18+
The NGINX Kubernetes Gateway is a component in a Kubernetes cluster that configures an HTTP load balancer according to
19+
Gateway API resources created by Cluster Operators and Application Developers.
20+
21+
> If you’d like to read more about the Gateway API, refer to the [Gateway API documentation][sig-gateway].
22+
23+
This document focuses specifically on the NGINX Kubernetes Gateway, also known as NKG, which uses NGINX as its data
24+
plane.
25+
26+
[sig-gateway]: https://gateway-api.sigs.k8s.io/
27+
28+
## NGINX Kubernetes Gateway at a High Level
29+
30+
To start, let's take a high-level look at the NGINX Kubernetes Gateway (NKG). The accompanying diagram illustrates an
31+
example scenario where NKG exposes two web applications hosted within a Kubernetes cluster to external clients on the
32+
internet:
33+
34+
![NKG High Level](/docs/images/nkg-high-level.png)
35+
36+
The figure shows:
37+
38+
* A *Kubernetes cluster*.
39+
* Users *Cluster Operator*, *Application Developer A* and *Application Developer B*. These users interact with the
40+
cluster through the Kubernetes API by creating Kubernetes objects.
41+
* *Clients A* and *Clients B* connect to *Applications A* and *B*, respectively. These applications have been deployed by
42+
the corresponding users.
43+
* The *NKG Pod*, [deployed by *Cluster Operator*](/docs/installation.md) in the Namespace *nginx-gateway*. This Pod
44+
consists of two containers: `NGINX` and `NKG`. The *NKG* container interacts with the Kubernetes API to retrieve the
45+
most up-to-date Gateway API resources created within the cluster. It then dynamically configures the *NGINX*
46+
container based on these resources, ensuring proper alignment between the cluster state and the NGINX configuration.
47+
* *Gateway AB*, created by *Cluster Operator*, requests a point where traffic can be translated to Services within the
48+
cluster. This Gateway includes a listener with a hostname `*.example.com`. Application Developers have the ability to
49+
attach their application's routes to this Gateway if their application's hostname matches `*.example.com`.
50+
* *Application A* with two Pods deployed in the *applications* Namespace by *Application Developer A*. To expose the
51+
application to its clients (*Clients A*) via the host `a.example.com`, *Application Developer A* creates *HTTPRoute A*
52+
and attaches it to `Gateway AB`.
53+
* *Application B* with one Pod deployed in the *applications* Namespace by *Application Developer B*. To expose the
54+
application to its clients (*Clients B*) via the host `b.example.com`, *Application Developer B* creates *HTTPRoute B*
55+
and attaches it to `Gateway AB`.
56+
* *Public Endpoint*, which fronts the *NKG* Pod. This is typically a TCP load balancer (cloud, software, or hardware)
57+
or a combination of such a load balancer with a NodePort Service. *Clients A* and *B* connect to their applications via
58+
the *Public Endpoint*.
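
The hostname check that decides whether a route such as *HTTPRoute A* may attach to the `*.example.com` listener of *Gateway AB* can be sketched as follows. This is a simplified illustration in Python, not NKG's Go implementation: the function name `hostname_matches` is hypothetical, and the sketch assumes that only the listener hostname carries a wildcard, and only as its leftmost label.

```python
def hostname_matches(listener_hostname: str, route_hostname: str) -> bool:
    """Return True if route_hostname falls under listener_hostname.

    Simplified assumption: only the listener may use a wildcard, and only
    as the leftmost label (for example "*.example.com").
    """
    listener = listener_hostname.lower().rstrip(".")
    route = route_hostname.lower().rstrip(".")
    if listener.startswith("*."):
        suffix = listener[1:]  # "*.example.com" -> ".example.com"
        # The wildcard must stand for at least one label, so the bare
        # domain "example.com" does not match "*.example.com".
        return route.endswith(suffix) and len(route) > len(suffix)
    return listener == route


if __name__ == "__main__":
    for host in ("a.example.com", "b.example.com", "example.com", "a.example.org"):
        print(host, hostname_matches("*.example.com", host))
```

Under these assumptions, `a.example.com` and `b.example.com` fall under the listener, while the bare domain and hosts in other domains do not.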
59+
60+
The connections related to client traffic are depicted by the yellow and purple arrows, while the black arrows represent
61+
access to the Kubernetes API. The resources within the cluster are color-coded based on the user responsible for their
62+
creation. For example, the Cluster Operator is denoted by the color green, indicating that they have created and manage
63+
all the green resources.
64+
65+
> Note: For simplicity, the diagram omits many necessary Kubernetes resources, such as Deployments and Services,
66+
> which the Cluster Operator and the Application Developers also need to create.
67+
68+
Next, let's explore the NKG Pod.
69+
70+
## The NGINX Kubernetes Gateway Pod
71+
72+
The NGINX Kubernetes Gateway Pod consists of three containers:
73+
74+
1. `nginx`: the data plane. Consists of an NGINX master process and NGINX worker processes. The master process controls
75+
the worker processes. The worker processes handle the client traffic and load balance the traffic to the backend
76+
applications.
77+
2. `nginx-gateway`: the control plane. Watches Kubernetes objects and configures NGINX.
78+
3. `busybox`: initializes the NGINX config environment.
79+
80+
These containers are deployed in a single Pod as a Kubernetes Deployment. The init container, `busybox`, runs before the
81+
`nginx` and `nginx-gateway` containers and creates directories and sets permissions for the NGINX process.
82+
83+
The `nginx-gateway`, or the control plane, is a [Kubernetes controller][controller], written with
84+
the [controller-runtime][runtime] library. It watches Kubernetes objects (Services, Endpoints, Secrets, and Gateway API
85+
CRDs), translates them to NGINX configuration, and configures NGINX. This configuration happens in two stages. First,
86+
NGINX configuration files are written to the NGINX configuration volume shared by the `nginx-gateway` and `nginx`
87+
containers. Next, the control plane reloads the NGINX process. This is possible because the two
88+
containers [share a process namespace][share], which allows the NKG process to send signals to the NGINX master process.
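
The two stages described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not NKG's actual Go code: the function names, the atomic write-then-rename approach, the `0o644` file mode, and the injectable `send_signal` parameter are all hypothetical choices made for the example. The only NGINX-specific fact it relies on is that the master process reloads its configuration on `SIGHUP`.

```python
import os
import signal
import tempfile
from pathlib import Path
from typing import Callable, Mapping


def write_config_files(conf_dir: Path, files: Mapping[str, str]) -> None:
    """Stage 1: write each config file into the shared volume.

    Each file is written to a temporary file in the same directory and then
    renamed into place, so a reader never sees a half-written file.
    """
    conf_dir.mkdir(parents=True, exist_ok=True)
    for name, content in files.items():
        fd, tmp_path = tempfile.mkstemp(dir=str(conf_dir), prefix=f".{name}.")
        with os.fdopen(fd, "w") as tmp:
            tmp.write(content)
        # mkstemp creates files with mode 0600; relaxing it is an assumption
        # so that a process running as a different user can read the file.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, conf_dir / name)


def reload_nginx(
    master_pid: int,
    send_signal: Callable[[int, int], None] = os.kill,
) -> None:
    """Stage 2: ask the NGINX master process to reload its configuration."""
    # SIGHUP is the signal NGINX documents for a configuration reload.
    send_signal(master_pid, signal.SIGHUP)


def apply_config(
    conf_dir: Path,
    files: Mapping[str, str],
    master_pid: int,
    send_signal: Callable[[int, int], None] = os.kill,
) -> None:
    """Run both stages in order: write the files first, then reload."""
    write_config_files(conf_dir, files)
    reload_nginx(master_pid, send_signal)
```

Sending the signal from a different container only works because the containers share a process namespace. How the control plane discovers the master process ID is not covered here, so the sketch takes it as a parameter.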
89+
90+
The diagram below provides a visual representation of the interactions between processes within the `nginx` and
91+
`nginx-gateway` containers, as well as external processes/entities. It showcases the connections and relationships between
92+
these components. For the sake of simplicity, the `busybox` init container is not depicted in the diagram.
93+
94+
![NKG pod](/docs/images/nkg-pod.png)
95+
96+
The following list provides a description of each connection, along with its corresponding type indicated in
97+
parentheses. To enhance readability, the suffix "process" has been omitted from the process descriptions below.
98+
99+
1. (HTTPS) *NKG* reads the *Kubernetes API* to get the latest versions of the resources in the cluster and writes to the
100+
API to update the handled resources' statuses and emit events.
101+
2. (File I/O) *NKG* generates NGINX *configuration* based on the cluster resources and writes it as `.conf` files to
102+
the mounted `nginx` volume, located at `/etc/nginx`. It also writes *TLS certificates* and *keys*
103+
from [TLS Secrets][secrets] referenced in the accepted Gateway resource to the volume at the
104+
path `/etc/nginx/secrets`.
105+
3. (File I/O) *NKG* writes logs to its *stdout* and *stderr*, which are collected by the container runtime.
106+
4. (Signal) To reload NGINX, *NKG* sends the [reload signal][reload] to the *NGINX master*.
107+
5. (File I/O) The *NGINX master* reads *configuration files* and the *TLS certificates and keys* referenced in the
108+
configuration when it starts or during a reload. These files, certificates, and keys are stored in the `nginx` volume
109+
that is mounted to both the `nginx-gateway` and `nginx` containers.
110+
6. (File I/O) The *NGINX master* writes to the auxiliary Unix sockets folder, which is mounted to the `nginx`
111+
container as the `var-lib-nginx` volume. The mounted path for this volume is `/var/lib/nginx`.
112+
7. (File I/O) The *NGINX master* sends logs to its *stdout* and *stderr*, which are collected by the container runtime.
113+
8. (File I/O) The *NGINX master* reads the NJS modules referenced in the configuration when it starts or during a
114+
reload. NJS modules are stored in the `njs-modules` volume that is mounted to the `nginx` container.
115+
9. (File I/O) An *NGINX worker* writes logs to its *stdout* and *stderr*, which are collected by the container runtime.
116+
10. (File I/O) The *NGINX master* reads the `nginx.conf` file from the mounted `nginx-conf` volume.
117+
This [file][conf-file] contains the global and http configuration settings for NGINX.
118+
11. (Signal) The *NGINX master* controls the [lifecycle of *NGINX workers*][lifecycle]: it creates workers with the new
119+
configuration and shuts down workers with the old configuration.
120+
12. (HTTP, HTTPS) A *client* sends traffic to and receives traffic from any of the *NGINX workers* on ports 80 and 443.
121+
13. (HTTP, HTTPS) An *NGINX worker* sends traffic to and receives traffic from the *backends*.
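
Connection 2 in the list above describes NKG generating NGINX configuration from cluster resources. A minimal sketch of that translation step is shown below, in Python for illustration only. The `Route` data model, the `render_route` function, and the layout of the generated file are hypothetical and do not reflect NKG's actual templates; only the NGINX directives used (`upstream`, `server`, `listen`, `server_name`, `location`, `proxy_pass`) are real.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Route:
    """Hypothetical, flattened view of one HTTPRoute rule."""

    hostname: str         # for example "a.example.com"
    path_prefix: str      # for example "/"
    upstream_name: str    # name given to the NGINX upstream block
    endpoints: List[str]  # backend Pod addresses in "ip:port" form


def render_route(route: Route) -> str:
    """Render one upstream block and one server block for the route."""
    servers = "\n".join(f"    server {ep};" for ep in route.endpoints)
    return (
        f"upstream {route.upstream_name} {{\n"
        f"{servers}\n"
        f"}}\n"
        f"\n"
        f"server {{\n"
        f"    listen 80;\n"
        f"    server_name {route.hostname};\n"
        f"\n"
        f"    location {route.path_prefix} {{\n"
        f"        proxy_pass http://{route.upstream_name};\n"
        f"    }}\n"
        f"}}\n"
    )


if __name__ == "__main__":
    # Example values are made up; they mirror Application A with two Pods.
    route_a = Route("a.example.com", "/", "app_a", ["10.0.0.1:8080", "10.0.0.2:8080"])
    print(render_route(route_a))
```

In this sketch each backend Pod becomes one `server` entry in the upstream block, which is how an NGINX worker would load balance traffic across the backends (connection 13).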
122+
123+
[controller]: https://kubernetes.io/docs/concepts/architecture/controller/
124+
125+
[runtime]: https://github.com/kubernetes-sigs/controller-runtime
126+
127+
[secrets]: https://kubernetes.io/docs/concepts/configuration/secret/#tls-secrets
128+
129+
[reload]: https://nginx.org/en/docs/control.html
130+
131+
[lifecycle]: https://nginx.org/en/docs/control.html#reconfiguration
132+
133+
[conf-file]: https://github.com/nginxinc/nginx-kubernetes-gateway/blob/main/deploy/manifests/nginx-conf.yaml
134+
135+
[share]: https://kubernetes.io/docs/tasks/configure-pod-container/share-process-namespace/

docs/developer/design-principles.md

Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
1+
# Design Principles
2+
3+
The aim of the NGINX Kubernetes Gateway is to become a fundamental infrastructure component within a Kubernetes cluster,
4+
serving as both an ingress and egress point for traffic directed towards the services (applications) running within or
5+
outside the cluster. Leveraging NGINX as a data plane technology, it harnesses the well-established reputation of NGINX
6+
as an open-source project widely recognized for its role as a web server, proxy, load balancer, and content cache. NGINX
7+
is renowned for its stability, high performance, security, and rich feature set, positioning it as a critical
8+
infrastructure tool. Notably, once properly configured and operational, NGINX requires minimal attention, making it
9+
reliable and steady software.
10+
11+
The NGINX Kubernetes Gateway aims to embody the same qualities as NGINX and become familiar, trustworthy and reliable
12+
software. The principles outlined below serve as a guide for engineering the NGINX Kubernetes Gateway with the intention
13+
of achieving this goal.
14+
15+
## Security
16+
17+
We are security first. We prioritize security from the outset, thoroughly evaluating each design and feature with a
18+
focus on security. We proactively identify and safeguard assets at the early stages of our processes, ensuring their
19+
protection throughout the development lifecycle. We adhere to best practices for secure design, including proper
20+
authentication, authorization, and encryption mechanisms.
21+
22+
## Availability
23+
24+
As a critical infrastructure component, we must be highly available. We design and review features with redundancy and
25+
fault tolerance in mind. We regularly test the NGINX Kubernetes Gateway's availability by simulating failure scenarios
26+
and conducting load testing. We work to identify potential weaknesses and bottlenecks, and address them to ensure high
27+
availability under various conditions.
28+
29+
## Performance
30+
31+
We must be highly performant and lightweight. We fine-tune the NGINX configuration to maximize performance without
32+
requiring custom configuration. We strive to minimize our memory and CPU footprint, enabling efficient resource
33+
allocation and reducing unnecessary processing overhead. We use profiling tools on our code to identify bottlenecks and
34+
improve performance.
35+
36+
## Resilience
37+
38+
We design with resilience in mind. This includes gracefully handling failures, such as pod restarts or network
39+
interruptions, as well as leveraging Kubernetes features like health checks, readiness probes, and container restart
40+
policies.
41+
42+
## Observability
43+
44+
We provide comprehensive logging, metrics, and tracing capabilities to gain insights into our behavior and performance.
45+
We prioritize Kubernetes-native observability tools like Prometheus, Grafana, and distributed tracing systems to help
46+
users monitor the health of NGINX Kubernetes Gateway and to assist in diagnosing issues.
47+
48+
## Ease of Use
49+
50+
NGINX Kubernetes Gateway must be easy and intuitive to use. This means that it should be easy to install, easy to
51+
configure, and easy to monitor. Its defaults should be sane and should lead to "out-of-box" success. The documentation
52+
should be clear and provide meaningful examples that customers can use to inform their deployments and configurations.

docs/developer/pull-request.md

Lines changed: 1 addition & 1 deletion
@@ -35,7 +35,7 @@
3535
follow-up in a timely manner with acceptance of the solution and explanation.
3636
- Do your best to review in a timely manner. However, code reviews are time-consuming; maximize their benefit by
3737
focusing on what’s highest value. This code review pyramid outlines a reasonable shorthand:
38-
![Code Review Pyramid](code-review-pyramid.jpeg)
38+
![Code Review Pyramid](/docs/images/code-review-pyramid.jpeg)
3939
- Always review for: design, API semantics, [DRY](https://en.wikipedia.org/wiki/Don%27t_repeat_yourself) and
4040
maintainability practices, bugs and quality, efficiency and correctness.
4141
- Code review checklist:

docs/images/nkg-high-level.png

136 KB

docs/images/nkg-pod.png

109 KB
