Skip to content

tracking: CRD versioning macro #507

Closed
Closed
@fhennig

Description

@fhennig

This tracks the implementation of a first working version of the #[versioned] macro. Further improvements will be done as part of our normal development workflow as well as alongside #642.

General

Currently, all our CRDs are defined by Rust structs. This works really well for multiple reasons:

  • Structs can directly be used in the operator code
  • Can be serialized in YAML for for use in Helm or other deployments mechanisms
  • Deserialize directly from Kubernetes API calls

This strict type safety ensures we build code which will never crash during runtime due to invalid types. The future introduction of CRD versions warrants the same strict typing for different version of the CRD. Additionally, we want to be able to upgrade (and downgrade) between versions in a controlled and automated manner. Rust already provides traits for conversion of types: From<T>, TryFrom<T> and friends. A goal of this implementation should be support for this native mechanism.

Requirements

  • Common mechanisms for CRD versioning must live in a common repository
  • Different versions of the same struct without much overhead. This overhead includes:
    • The need to copy and paste the same struct for multiple versions
    • Manually keeping track of changes between versions and how to properly convert between them
    • Conversion should be a simple one method call, no need to manually assign fields from the old struct to the new one
  • Support for versioned field changes:
    • Add a new field
    • Rename a field
    • Deprecate a field
  • Field changes can be annotated with notes, hints, etc...
  • The derive macro must surface meaningful error messages
  • Handle recursive structs and enums

Vision

Initial Sketch

Adding the Derive Macro

The derive macro uses a simple and fitting name. A few examples are: Versioned, Version and Update. The macro additionally uses an attribute to customize the code generation. The name of that attribute will be the same as the derive macro name, just all lowercase letters.

// The derive macro is used just like any other derive macro like Debug,
// Deserialize or JsonSchema
#[derive(Debug, Versioned)]
pub struct MyCrd {
    foo: String,
    bar: usize,
}

// The macro can also be used on enums
#[derive(Debug, Versioned)]
pub enum MyEnum {
    Foo,
    Bar,
}

The macro provides a mechanism to provide traits which should be derived on the generated structs/enums. Attributes like #[serde] or #[schemars] will be passed through as well.

#[derive(Debug, Versioned)]
#[versioned(derives(Debug, PartialEq))]
#[serde(rename_all = "camelCase")]
pub struct MyCrd {
    foo: String,
}

Adding Versions

Each distinct version of a struct/enum needs to be declared via the accommodating attribute. The version field can be used as often as one would like. Each definition creates a new versioned version of the base struct/enum. Each version name must be unique, enforced by the macro and surfaced with easy to understand error message.

(Optional) Because we plan to publish this crate publicly for the Rust community, we don't mandate the use of Kubernetes versions as version names. Instead, the validation should be as generic as possible to allow usages outside of the context of Kubernetes. Initially we might want to only support SemVer and Kubernetes version formats. Each version must provide an implementation for PartialEq and PartialOrd.

(Optional) Users can customize the visibility of the generated structs.

#[derive(Debug, Versioned)]
// Default visibility is `pub`
#[versioned(
    version(name = "v1alpha1"),
    version(name = "v1beta1", visibility = "pub(crate)")
)]
pub struct MyCrd {
    foo: String,
    bar: usize,
}

#[versioned(
    version(name = "v1alpha1"),
    version(name = "v1beta1", visibility = "pub(crate)")
)]
#[derive(Debug, Versioned)]
pub enum MyEnum {
    Foo,
    Bar,
}

Each added version will generate a new struct or enum specific for that version. A basic example without any fields/variants (and semantics around them) could look like this:

pub mod v1alpha1 {
    pub struct MyCrd {
        // Fields will be discussed in sections below
    }
}

pub(crate) mod v1beta1 {
    pub struct MyCrd {
        // Fields will be discussed in sections below
    }
}

// ################################################

pub mod v1alpha1 {
    pub enum MyEnum {
        // Variants will be discussed in sections below
    }
}

pub(crate) mod v1beta1 {
    pub enum MyEnum {
        // Variants will be discussed in sections below
    }
}

The naming scheme uses Rust modules. Different version of the struct would then be separated by modules, like v1alpha1::MyCrd and v1beta1::MyCrd. Additionally, the macro provides stable modules, which for example always point to the latest version of the struct/enum.

pub mod v1alpha1 {
    pub struct MyCrd;
}

pub mod v1beta1 {
    pub struct MyCrd;
}

// The operator can always use latest::MyCrd
pub mod latest {
    pub use v1beta1::*;
}

Deprecation of Versions

CRDs can mark particular versions as deprecated. This can be achieved by providing the deprecated flag. Versioned structs will additionally include Rust's #[deprecated] attribute.

#[derive(Debug, Versioned)]
#[versioned(
    version(name = "v1alpha1", deprecated),
    version(name = "v1beta1")
)]
pub struct MyCrd {
    foo: String,
    bar: usize,
}

The resulting generated code would look similar to this:

pub mod v1alpha1 {
    #[deprecated]
    pub struct MyCrd {
        // Fields will be discussed in sections below
    }
}

pub mod v1beta1 {
    pub struct MyCrd {
        // Fields will be discussed in sections below
    }
}

Field Attributes

Each field can be attached with additional (but optional) attributes. These attributes indicate when the fields were added, renamed, deprecated and removed. Any field which doesn't have a attribute attached is presumed to exist since the earliest version (version v1alpha1 in the example below). This feature is one reason why versions need to implement PartialEq and PartialOrd. Any attribute field cannot be combined with any other field if the same version is used.

Add Fields

This attribute marks fields as added in a specific version. This field will not appear in generated structs of earlier versions.

#[derive(Debug, Versioned)]
#[versioned(
    version(name = "v1alpha1"),
    version(name = "v1beta1")
)]
pub struct MyCrd {
    #[versioned(added(since = "v1beta1"))]
    foo: String,
    bar: usize,
}

The resulting generated code would look similar to this:

pub mod v1alpha1 {
    pub struct MyCrd {
        bar: usize,
    }
}

pub mod v1beta1 {
    pub struct MyCrdV1Beta1 {
        foo: String,
        bar: usize,
    }
}

Adding required fields in CRDs is against the Kubernetes guidelines. Instead, CRD developers should use optional fields with a sane default value. There will be multiple pieces of Rust code to make this work:

  • Each field type needs to implement Default.
  • A custom default value can be provided using #[versioned(added(default = default_fn))].
    This value will be used when using the From<T> trait for conversion between versions.

The From<Old> for New implementation will look similar to this:

Without Custom Default Value
impl From<v1alpha1::MyCrd> for v1beta1::MyCrd {
    fn from(my_crd: v1alpha1::MyCrd) -> Self {
        Self {
            foo: Default::default(),
            bar: my_crd.bar,
        }
    }
}
With Custom Default Value
impl From<v1alpha1::MyCrd> for v1beta1::MyCrd {
    fn from(my_crd: v1alpha1::MyCrd) -> Self {
        Self {
            foo: default_fn(),
            bar: my_crd.bar,
        }
    }
}

Rename Fields

This attribute marks fields as renamed in a specific version. This field will use the original name up to the version it was renamed in. The From<T> trait implementation will move the value from the old field to the new field.

#[derive(Debug, Versioned)]
#[versioned(
    version(name = "v1alpha1"),
    version(name = "v1beta1")
)]
pub struct MyCrd {
    #[versioned(renamed(since = "v1beta1", to = "baz"))]
    foo: String,
    bar: usize,
}

The resulting generated code would look similar to this:

pub mod v1alpha1 {
    pub struct MyCrd {
        foo: String,
        bar: usize,
    }
}

pub mod v1beta1 {
    pub struct MyCrd {
        baz: String,
        bar: usize,
    }
}

The From<Old> for New implementation will look similar to this:

impl From<v1alpha1::MyCrd> for v1beta1::MyCrd {
    fn from(my_crd: v1alpha1::MyCrd) -> Self {
        Self {
            baz: my_crd.foo,
            bar: my_crd.bar,
        }
    }
}

Deprecate Fields

This attribute marks fields as deprecated in a specific version. Each deprecated field will be included in later versions but renamed to deprecated_<FIELD_NAME>. They will additionally include Rust's #[deprecated] attribute to indicate the deprecation in the Rust code. Conversions from earlier versions will move the value of the field to the deprecated field automatically.

#[derive(Debug, Versioned)]
#[versioned(
    version(name = "v1alpha1"),
    version(name = "v1beta1")
)]
pub struct MyCrd {
    #[versioned(deprecated(since = "v1beta1"))]
    foo: String,
    bar: usize,
}

The resulting generated code would look similar to this:

pub mod v1alpha1 {
    pub struct MyCrd {
        foo: String,
        bar: usize,
    }
}

pub mod v1beta1 {
    pub struct MyCrd {
        #[deprecated]
        deprecated_foo: String,
        bar: usize,
    }
}

The From<Old> for New implementation will look similar to this:

impl From<v1alpha1::MyCrd> for v1beta1::MyCrd {
    fn from(my_crd: v1alpha1::MyCrd) -> Self {
        #[allow(deprecated)]
        Self {
            deprecated_foo: my_crd.foo,
            bar: my_crd.bar,
        }
    }
}

Full Example

#[derive(Debug, Versioned)]
#[versioned(
    version(name = "v1alpha1"),
    version(name = "v1beta1")
)]
pub struct MyCrd {
    #[versioned(added(since = "v1beta1"))]
    foo: String,

    #[versioned(renamed(since = "v1beta1", to = "new_bar"))]
    bar: usize,

    #[versioned(deprecated(since = "v1beta1"))]
    baz: bool,
    gau: usize,
}

pub mod v1alpha1 {
    pub struct MyCrd {
        bar: usize,
        baz: bool,
        gau: usize,
    }
}

pub mod v1beta1 {
    pub struct MyCrd {
        foo: Option<String>,
        new_bar: usize,
        deprecated_baz: bool,
        gau: usize,
    }
}

impl From<v1alpha1::MyCrd> for v1beta1::MyCrd {
    fn from(my_crd: v1alpha1::MyCrd) -> Self {
        #[allow(deprecated)]
        Self {
            foo: Default::default(),
            new_bar: my_crd.bar,
            deprecated_baz: my_crd.baz,
            gau: my_crd.gau,
        }
    }
}

impl From<v1beta1::MyCrd> for v1alpha1::MyCrd {
    fn from(my_crd: v1beta1::MyCrd) -> Self {
        #[allow(deprecated)]
        Self {
            bar: my_crd.new_bar,
            baz: my_crd.deprecated_baz,
            gau: my_crd.gau,
        }
    }
}

Kubernetes API Version Crate

Initial Sketch

We want to parse raw Kubernetes (API) version strings into Rust structs. This enables us to more easily work with these versions. It enables sorting, equality checking and ordering. It additionally enables us to provide better error messages in case of invalid versioning.

// (<GROUP>/)<VERSION>
pub struct ApiVersion {
    pub group: Option<String>,
    pub version: Version,
}

// v<MAJOR>(<LEVEL>)
pub struct Version {
    pub major: u64,
    pub level: Option<Level>,
}

// beta<LEVEL_VERSION>/alpha<LEVEL_VERSION>
pub enum Level {
    Beta(u64),
    Alpha(u64),
}

A PoC implementation for this already exists.

This is implemented. See https://github.com/stackabletech/operator-rs/tree/main/crates/k8s-version

Open Questions / Ideas

  1. We may even want to limit the visibility of deprecated fields using pub(crate).
  2. We might want to make the conversion mechanism configurable for added fields via the builder pattern.
    This can be done by implementing the From<Old> for New manually instead of using the automatically derived one.
    Users can opt out of the auto-generation by providing the #[versioned(version(name = "v1beta1", skip("from")))]

References

Selection of Crates

Research

## Tasks
- [x] Our CRDs are already very big, how do we handle multiple versions (which basically double or triple the size) in terms of space requirements?

Implementation

## Tasks
- [ ] https://github.com/stackabletech/operator-rs/pull/764
- [ ] https://github.com/stackabletech/operator-rs/pull/784
- [ ] https://github.com/stackabletech/operator-rs/pull/793
- [ ] https://github.com/stackabletech/operator-rs/pull/790
- [ ] https://github.com/stackabletech/operator-rs/pull/804
- [ ] https://github.com/stackabletech/operator-rs/pull/813
- [ ] https://github.com/stackabletech/operator-rs/pull/820
- [ ] https://github.com/stackabletech/operator-rs/pull/844
- [x] Add support for nested versioned data structures
- [ ] https://github.com/stackabletech/operator-rs/pull/847
- [ ] https://github.com/stackabletech/operator-rs/pull/857
- [ ] https://github.com/stackabletech/operator-rs/pull/849
- [ ] https://github.com/stackabletech/operator-rs/pull/850
- [ ] https://github.com/stackabletech/operator-rs/pull/865
- [x] Add support for `serde` integration
- [ ] https://github.com/stackabletech/operator-rs/pull/876

Test Rounds

2024-09-06

#507 (comment)

### Fixes
- [ ] https://github.com/stackabletech/operator-rs/pull/859
- [ ] https://github.com/stackabletech/operator-rs/pull/860
- [ ] https://github.com/stackabletech/operator-rs/pull/864

2024-09-18

### Fixes
- [ ] https://github.com/stackabletech/operator-rs/pull/872
- [ ] https://github.com/stackabletech/operator-rs/pull/875
- [ ] https://github.com/stackabletech/operator-rs/pull/873
- [x] ~Handle URL replacements in CRDs (currently done)~ (Moved)
- [x] ~Utility methods for printing crd to stdout, etc... (see that trait impl)~ (Moved)

Acceptance criteria:

  • We know how to version our CRDs in Rust
  • We know how to write our conversion code between these versions
  • We have a PoC implementation/spike for a conversion webhook
  • We have a full-featured derive macro to handle CRD versioning
  • We have a general-purpose crate which can be used by others in the open-source Rust ecosystem

Metadata

Metadata

Assignees

Labels

Type

Projects

Status

Done

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions