Closed
Hello,
As requested by Tom Barnes, I am opening this issue about a NullPointerException (NPE) we are facing.
- We created the auxiliary image "demo-gregoan_lite_1.0.0" (simple WL domain composed of only 1 WL cluster)
- We used this auxiliary image and the WL domain is properly created (Lite)
- We created the auxiliary image "demo-gregoan_full_1.0.0" (complex WL domain composed of 2 WL clusters and one independent WL server)
- We used this auxiliary image and the WL domain is properly updated (Lite ---> Full using offline update)
@[2021-12-29T09:13:26.276000Z][introspectDomain.py:266][FINE] Printing file /tmp/introspect/gregoan-app/topology.yaml
>>> /tmp/introspect/gregoan-app/topology.yaml
domainValid: true
domain:
name: "gregoan-app"
adminServerName: "admin"
configuredClusters:
- name: "cluster-1"
dynamicServersConfig:
name: "NO_NAME_0"
serverTemplateName: "cluster-1-template"
calculatedListenPorts: False
serverNamePrefix: "cls1"
dynamicClusterSize: 9
maxDynamicClusterSize: 9
minDynamicClusterSize: 0
- name: "cluster-2"
dynamicServersConfig:
name: "NO_NAME_0"
serverTemplateName: "cluster-2-template"
calculatedListenPorts: False
serverNamePrefix: "cluster-2"
dynamicClusterSize: 9
maxDynamicClusterSize: 9
minDynamicClusterSize: 0
serverTemplates:
- name: "cluster-1-template"
listenPort: 1041
listenAddress: "gregoan-app-cluster-1-template"
clusterName: "cluster-1"
- name: "cluster-2-template"
listenPort: 1041
listenAddress: "gregoan-app-cluster-2-template"
clusterName: "cluster-2"
servers:
- name: "admin"
listenPort: 1041
listenAddress: "gregoan-app-admin"
- name: "server-1"
listenPort: 1041
listenAddress: "gregoan-app-server-1"
>>> EOF
- We updated the WL domain to go back to "demo-gregoan_lite_1.0.0" (Full ---> Lite, still using offline update)
>>> /tmp/introspect/gregoan-app/topology.yaml
domainValid: true
domain:
name: "gregoan-app"
adminServerName: "admin"
configuredClusters:
- name: "cluster-1"
dynamicServersConfig:
name: "NO_NAME_0"
serverTemplateName: "cluster-1-template"
calculatedListenPorts: False
serverNamePrefix: "cls1"
dynamicClusterSize: 9
maxDynamicClusterSize: 9
minDynamicClusterSize: 0
serverTemplates:
- name: "cluster-1-template"
listenPort: 1041
listenAddress: "gregoan-app-cluster-1-template"
clusterName: "cluster-1"
servers:
- name: "admin"
listenPort: 1041
listenAddress: "gregoan-app-admin"
>>> EOF
- The admin server and the remaining WL cluster restarted correctly, but the pods of the second cluster and of the independent WL server are still running from the K8s point of view
- They are no longer part of the model
- They are not visible in the WL domain console.
=> kubectl get pods -n gregoan
NAME READY STATUS RESTARTS AGE
gregoan-app-admin 1/1 Terminating 0 7m33s
gregoan-app-cls11 1/1 Running 0 5m59s
gregoan-app-cluster-21 1/1 Running 0 5m54s
gregoan-app-cluster-22 1/1 Running 0 4m1s
gregoan-app-server-1 1/1 Running 0 5m53s
=> kubectl get pods -n gregoan
NAME READY STATUS RESTARTS AGE
**gregoan-app-admin 1/1 Running 0 10m
gregoan-app-cls11 1/1 Running 0 9m5s**
gregoan-app-cluster-21 1/1 Running 0 16m
gregoan-app-cluster-22 1/1 Running 0 15m
gregoan-app-server-1 1/1 Running 0 16m
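The leftover pods in the listing above can be identified mechanically by comparing the running pods with the names the Lite topology accounts for. The following is only a minimal sketch, not operator code: it assumes pod names follow the pattern `<domainUID>-<serverName>` and that dynamic servers are named `<serverNamePrefix><index>` starting at 1, which is what the listings here suggest. The function names `expected_pod_names` and `stray_pods` are hypothetical.

```python
def expected_pod_names(domain_uid, servers, dynamic_clusters):
    """Return the pod names that a topology accounts for.

    Assumption: pod name = "<domainUID>-<serverName>", and dynamic servers
    are named "<serverNamePrefix><index>" with index 1..maxDynamicClusterSize.
    """
    names = {f"{domain_uid}-{server}" for server in servers}
    for prefix, max_size in dynamic_clusters:
        for index in range(1, max_size + 1):
            names.add(f"{domain_uid}-{prefix}{index}")
    return names


def stray_pods(running_pods, expected):
    """Return the running pods that the topology does not account for."""
    return sorted(pod for pod in running_pods if pod not in expected)


# Lite topology from the second topology.yaml: admin server plus cluster-1
# (serverNamePrefix "cls1", maxDynamicClusterSize 9).
lite_expected = expected_pod_names("gregoan-app", ["admin"], [("cls1", 9)])

# Pod names copied from the kubectl output above.
running = [
    "gregoan-app-admin",
    "gregoan-app-cls11",
    "gregoan-app-cluster-21",
    "gregoan-app-cluster-22",
    "gregoan-app-server-1",
]

print(stray_pods(running, lite_expected))
```

With the names above, the result is the two `cluster-2` pods and `gregoan-app-server-1`, which are the resources that stayed in `Running` state.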
- We deleted these extra resources by hand
- They were not re-created
- So the model is fine from our point of view
=> kubectl delete pods gregoan-app-cluster-21 gregoan-app-cluster-22 gregoan-app-server-1 -n gregoan
pod "gregoan-app-cluster-21" deleted
pod "gregoan-app-cluster-22" deleted
pod "gregoan-app-server-1" deleted
=> kubectl get pods -n gregoan
NAME READY STATUS RESTARTS AGE
gregoan-app-admin 1/1 Running 0 12m
gregoan-app-cls11 1/1 Running 0 11m
When we checked the Operator logs, we saw the following:
{
"timestamp":"2021-12-29T10:23:55.847065183Z",
"thread":25,"fiber":"fiber-239986-child-4",
"namespace":"gregoan",
"domainUID":"gregoan-app",
"level":"SEVERE",
"class":"oracle.kubernetes.operator.DomainProcessorImpl",
"method":"logThrowable",
"timeInMillis":1640773435847,
"message":"Exception thrown",
**"exception":"\njava.lang.NullPointerException\n",**
"code":"",
"headers":{},
"body":""
}
This error is logged many times:
=> kubectl logs weblogic-operator-856dbdb9c9-rgvqr -n weblogic-operator | grep NullPointerException | wc -l
1980
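To see which domain the exceptions belong to, the count can be broken down by `domainUID` instead of using a plain `grep`. This is a rough sketch under one assumption: the operator writes each log entry as a single JSON object per line (the entry above was reformatted for readability). The function name `count_npe_by_domain` is hypothetical.

```python
import json
from collections import Counter


def count_npe_by_domain(log_lines):
    """Count log entries whose "exception" field mentions an NPE, per domainUID.

    Assumption: each log entry is one JSON object on a single line.
    Lines that are not JSON objects are skipped.
    """
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if not line.startswith("{"):
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        exception_text = str(entry.get("exception") or "")
        if "NullPointerException" in exception_text:
            counts[entry.get("domainUID") or "<no domainUID>"] += 1
    return counts


# One entry in the shape of the log record shown above, on a single line.
sample_lines = [
    '{"namespace":"gregoan","domainUID":"gregoan-app","level":"SEVERE",'
    '"message":"Exception thrown",'
    '"exception":"\\njava.lang.NullPointerException\\n"}',
    "a line that is not JSON",
]

print(dict(count_npe_by_domain(sample_lines)))
```

The function accepts any iterable of lines, so the output of `kubectl logs` could be passed to it through `sys.stdin`.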
After deleting these extra resources, we no longer see the error in the logs.
- We don't know whether the extra resources failed to be deleted because of the NPE or for some other reason
- We hit this issue with versions 3.3.6 and 3.3.7
Regards.