Description
If a PodSpecTemplate already has a node affinity term that uses node selectors, the AppWrapper controller injects the additional Autopilot term one level too high: it adds a new entry to nodeSelectorTerms instead of extending the existing term's matchExpressions. The result is an OR of the terms instead of the desired AND.
The Kubernetes documentation states:
If you specify multiple terms in nodeSelectorTerms associated with nodeAffinity types, then the Pod can be scheduled onto a node if one of the specified terms can be satisfied (terms are ORed).
If you specify multiple expressions in a single matchExpressions field associated with a term in nodeSelectorTerms, then the Pod can be scheduled onto a node only if all the expressions are satisfied (expressions are ANDed).
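The two rules quoted above can be sketched as a small evaluator. This is a simplified illustration, not the scheduler's code: it assumes terms are plain dicts as parsed from YAML, handles only the In, NotIn, Exists, and DoesNotExist operators, and ignores matchFields. The function names and the sample node labels are hypothetical.

```python
def expr_matches(expr, labels):
    """Evaluate one matchExpressions entry against a node's labels."""
    key, op, values = expr["key"], expr["operator"], expr.get("values", [])
    if op == "In":
        return key in labels and labels[key] in values
    if op == "NotIn":
        # A node without the label also satisfies NotIn.
        return key not in labels or labels[key] not in values
    if op == "Exists":
        return key in labels
    if op == "DoesNotExist":
        return key not in labels
    raise ValueError(f"operator not handled in this sketch: {op}")


def term_matches(term, labels):
    """All expressions inside one term must hold (AND)."""
    exprs = term.get("matchExpressions", [])
    # An empty term matches no nodes.
    return bool(exprs) and all(expr_matches(e, labels) for e in exprs)


def node_selector_matches(terms, labels):
    """A node is eligible if any one term holds (OR)."""
    return any(term_matches(t, labels) for t in terms)


hostname_expr = {
    "key": "kubernetes.io/hostname",
    "operator": "NotIn",
    "values": ["10.240.0.14", "10.240.128.16", "10.240.64.15"],
}
gpuhealth_expr = {
    "key": "autopilot.ibm.com/gpuhealth",
    "operator": "NotIn",
    "values": ["ERR", "TESTING", "EVICT"],
}

# Two separate terms (ORed) versus one term with two expressions (ANDed).
two_terms = [{"matchExpressions": [hostname_expr]},
             {"matchExpressions": [gpuhealth_expr]}]
one_term = [{"matchExpressions": [hostname_expr, gpuhealth_expr]}]

# Hypothetical node: hostname is allowed, but the GPU health label is ERR.
unhealthy_node = {
    "kubernetes.io/hostname": "10.240.0.20",
    "autopilot.ibm.com/gpuhealth": "ERR",
}

print(node_selector_matches(two_terms, unhealthy_node))  # True
print(node_selector_matches(one_term, unhealthy_node))   # False
```

With two separate terms the unhealthy node is still eligible, because the hostname term alone is satisfied. With both expressions in one term the node is rejected.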
Our bug is that we generate this PodSpecTemplate:
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/hostname
            operator: NotIn
            values:
            - 10.240.0.14
            - 10.240.128.16
            - 10.240.64.15
        - matchExpressions:
          - key: autopilot.ibm.com/gpuhealth
            operator: NotIn
            values:
            - ERR
            - TESTING
            - EVICT
but the correct YAML, with both expressions in a single term, is:
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/hostname
            operator: NotIn
            values:
            - 10.240.0.14
            - 10.240.128.16
            - 10.240.64.15
          - key: autopilot.ibm.com/gpuhealth
            operator: NotIn
            values:
            - ERR
            - TESTING
            - EVICT
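One way to produce the correct form is to append the injected expressions to every existing term rather than adding a new term. The sketch below is in Python for brevity and is not the AppWrapper controller's actual code. It assumes the pod spec is a plain dict as parsed from YAML, and the function name add_required_expressions is hypothetical.

```python
import copy


def add_required_expressions(pod_spec, extra_exprs):
    """Return a copy of pod_spec whose required node affinity also
    demands every expression in extra_exprs."""
    spec = copy.deepcopy(pod_spec)
    required = (
        spec.setdefault("affinity", {})
        .setdefault("nodeAffinity", {})
        .setdefault("requiredDuringSchedulingIgnoredDuringExecution", {})
    )
    terms = required.setdefault("nodeSelectorTerms", [])
    if not terms:
        # No existing constraint: the new expressions form the only term.
        terms.append({"matchExpressions": copy.deepcopy(extra_exprs)})
    else:
        # Terms are ORed, so the new expressions go into every term:
        # (A OR B) AND C == (A AND C) OR (B AND C).
        for term in terms:
            term.setdefault("matchExpressions", []).extend(
                copy.deepcopy(extra_exprs)
            )
    return spec


user_spec = {
    "affinity": {"nodeAffinity": {
        "requiredDuringSchedulingIgnoredDuringExecution": {
            "nodeSelectorTerms": [{"matchExpressions": [{
                "key": "kubernetes.io/hostname",
                "operator": "NotIn",
                "values": ["10.240.0.14", "10.240.128.16", "10.240.64.15"],
            }]}]
        }
    }}
}
autopilot_exprs = [{
    "key": "autopilot.ibm.com/gpuhealth",
    "operator": "NotIn",
    "values": ["ERR", "TESTING", "EVICT"],
}]

merged = add_required_expressions(user_spec, autopilot_exprs)
merged_terms = merged["affinity"]["nodeAffinity"][
    "requiredDuringSchedulingIgnoredDuringExecution"]["nodeSelectorTerms"]
print(len(merged_terms))                         # 1
print(len(merged_terms[0]["matchExpressions"]))  # 2
```

Appending to every term matters when the user supplied several terms: each alternative must still exclude unhealthy nodes, otherwise any one of the original terms would let the pod bypass the Autopilot constraint.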