docs: adr regarding the cr deserialization problem #1506

Merged
merged 4 commits into from
Oct 3, 2022
37 changes: 37 additions & 0 deletions adr/002-Custom-Resource-Deserialization-Problem.md
@@ -0,0 +1,37 @@
# Multi Version Custom Resources Deserialization Problem

## Status

accepted

## Context

When there are multiple versions of a custom resource, it can happen that a controller/informer tracking
such a resource runs into a deserialization problem, as shown
in [this integration test](https://github.com/java-operator-sdk/java-operator-sdk/blob/07aab1a9914d865364d7236e496ef9ba5b50699e/operator-framework/src/test/java/io/javaoperatorsdk/operator/MultiVersionCRDIT.java#L55-L55).
In the mentioned case, two versions of a custom resource are not compatible with each other. The informer receives
both, but naturally is not able to deserialize one of them.
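
The failure mode described above can be illustrated with a minimal, library-free sketch. It is not the JOSDK or Fabric8 code path: `SpecV1`, `deserializeV1`, and the `value`/`amount` field names are hypothetical, and the strict mapping is a stand-in for what a strictly configured deserializer might do when a field changes type between versions.

```java
import java.util.Map;

public class StrictDeserializationSketch {

    /** Hypothetical typed spec for version v1: 'value' must be an integer. */
    public static final class SpecV1 {
        public final int value;

        public SpecV1(int value) {
            this.value = value;
        }
    }

    /** Strict mapping: rejects a payload whose 'value' is not an integer. */
    public static SpecV1 deserializeV1(Map<String, Object> rawSpec) {
        Object value = rawSpec.get("value");
        if (!(value instanceof Integer)) {
            throw new IllegalArgumentException("cannot map 'value' to int: " + value);
        }
        return new SpecV1((Integer) value);
    }

    public static void main(String[] args) {
        // Payload shaped like the (hypothetical) v1 schema.
        Map<String, Object> v1Shaped = Map.of("value", 1);
        // Payload shaped like an incompatible (hypothetical) v2 schema.
        Map<String, Object> v2Shaped = Map.of("value", Map.of("amount", 1));

        System.out.println("v1 value: " + deserializeV1(v1Shaped).value);
        try {
            deserializeV1(v2Shaped);
        } catch (IllegalArgumentException e) {
            System.out.println("deserialization failed: " + e.getMessage());
        }
    }
}
```

The point of the sketch is only that the consumer is typed against one version while the payload has the structure of another, so the mapping cannot succeed without some form of conversion.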
Contributor

Can we clarify somewhere in the doc that the informer/watcher receives an event for a resource with a given Group Version Kind (GVK) but with the structure of a different (former or newer) version (for the same Group and Kind)?

This is what in the end might lead to a deserialization Exception if the Serializer is configured in a strict mode and some of the fields might have unmatched target types because there was a change in between the two versions of the same resource definition.

Note that the deserializer can be set to be more lenient by configuring the Serialization Unmatched Field Type module:

Serialization.UNMATCHED_FIELD_TYPE_MODULE.setRestrictToTemplates(true);

Collaborator Author

> Note that the deserializer can be set to be more lenient by configuring the Serialization Unmatched Field Type module:

I think this is something we want to avoid. Processing resources that are incomplete can, in some cases, lead to more confusion. But it might be good to put this into the docs as a note.

Collaborator Author

> Can we clarify somewhere in the doc that the informer/watcher receives an event for a resource with a given Group Version Kind (GVK) but with the structure of a different (former or newer) version (for the same Group and Kind)?

Added some explanation; not sure if you meant it this way.


How should the framework or the underlying informer behave?

Alternatives:

1. The informer should skip the resource and should continue to process the resources with the correct version.
Contributor

IMHO I think this shouldn't even be considered as an alternative. It might encourage users to ignore errors that in the long-term will definitely have an impact in their application's lifecycle.

Collaborator Author

Just adding it since it worked this way until now, and it was mentioned before in other conversations. But I completely agree with what you say.

2. Informer stops and makes a notification callback.
Collaborator

I do believe that this is the expected behavior, a possible improvement would be to include the GenericKubernetesResource that generated the issue in the Exception / Notification.

Collaborator Author

The problem is that at that point it is useless (or rather, it does not fit very well). The informer is already stopped, so even if the resource is fixed from the callback method, on the next start there might be another resource with the same problem. So it is one restart per problematic resource.

The solution here is to use conversion hooks.


## Decision

From the JOSDK perspective, it is fine if the informer stops and the user decides whether the whole operator should stop
(usually the preferred way). The reason is that this is clearly an issue at the platform level (not at the operator/controller
level). Thus, the controller should not receive such custom resources in the first place, so the problem should be
addressed at the platform level, possibly by introducing conversion hooks or by labeling the target resources.

## Consequences

If an informer stops on such a deserialization error, even explicitly restarting it won't solve the problem, since
it would fail again on the same error.
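
A small simulation may make this consequence concrete. It is a sketch under stated assumptions, not the Fabric8 informer API: `SketchInformer`, its `start()` method, the stop callback, and the `incompatible:` prefix that marks an undeserializable payload are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class InformerStopSketch {

    /** Hypothetical stand-in for an informer; not the Fabric8 API. */
    public static final class SketchInformer {
        private final List<String> clusterResources;
        private final Consumer<String> stopCallback;

        public SketchInformer(List<String> clusterResources, Consumer<String> stopCallback) {
            this.clusterResources = clusterResources;
            this.stopCallback = stopCallback;
        }

        /**
         * Lists all resources from scratch. Stops at the first payload that
         * cannot be deserialized, notifies the callback, and returns false.
         */
        public boolean start() {
            for (String raw : clusterResources) {
                if (raw.startsWith("incompatible:")) {
                    stopCallback.accept(raw);
                    return false;
                }
            }
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> resources = new ArrayList<>(List.of("ok:a", "incompatible:b", "ok:c"));
        List<String> notifications = new ArrayList<>();
        SketchInformer informer = new SketchInformer(resources, notifications::add);

        System.out.println("first start succeeded: " + informer.start());
        // Restarting without changing anything hits the same resource again.
        System.out.println("restart succeeded: " + informer.start());
        // Only removing (or converting) the offending resource lets the informer run.
        resources.remove("incompatible:b");
        System.out.println("start after fix succeeded: " + informer.start());
        System.out.println("stop notifications: " + notifications.size());
    }
}
```

In this sketch the restart fails for the same reason as the first start, because the offending resource is still listed. That is why the decision places the fix at the platform level rather than in restart logic.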

## Notes

The informer implementation if fabric8 client changed in this regard, before it was not stopping on deserialization
Collaborator

Suggested change
The informer implementation if fabric8 client changed in this regard, before it was not stopping on deserialization
The informer implementation in the Fabric8 client has changed in this regard starting with version 5.12.4. It was previously not stopping on deserialization

error, but as describe this change in behavior is completely acceptable.
Collaborator

Suggested change
error, but as describe this change in behavior is completely acceptable.
errors, but as described in this document, this change in behavior is completely acceptable.