-
Notifications
You must be signed in to change notification settings - Fork 64
Adding GPUs to Kind cluster #494
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding GPUs to Kind cluster #494
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with this change you are bypassing the whole e2e test. Why is this necessary ?
Yes, that was intentional. I'm testing something that involves patching the nodes created with kind and I wanted to make sure the Travis update shows what I want. I'm not trying to merge this yet, so that is why I haven't requested anyone's reviews. |
Thanks for the explanation, draft PR(s) are great for communicating WIP. |
I thought about a draft PR, but I didn't know if it would trigger the Travis build which is what I need. If it does, then I don't have any problems making this PR a draft while I do my work. |
I think it will work jus fine: #482 is still draft and CI build ran. |
Thanks @z103cb ! I made the PR a draft PR while I work on it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this is a very good step in the right direction, but I can't approve this PR for these reasons:
- There are missing error checking steps when the node are being extended
- I think that the extend-resources function should not exist, some of its behaviours should be moved to other functions:
- The cluster enhancements should be moved to the setup-mcad-env function, generally that is where the cluster cusimizations are done
- Executing the new tests should be moved into the kuttl-test function (adding the yaml to the array, should be sufficient).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Asside from the small nitpick around the use of =
instead of ==
, this is looking good. It can be merged once the correct operator is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be merged once the PR check passes.
@metalcycling Thanks for this PR, I would like to understand a bit more, who is mutating the node object when an AW with GPU resource requirement is dispatched? |
The node is already mutated by the time the AW comes in. The Kubernetes patching adds an extended resource to the list of |
This update patches the nodes created by
kind
by extending the resources to include GPUs. These GPUs are just there as a count and do now represent real GPUs. This version of the build system will allow tests to include GPUs even when the cluster doesn't have physical GPUs.