The cluster.wait_ready() command fails when the RayCluster doesn't have status #339

Closed
@roytman

Describe the Bug

cluster.wait_ready() checks the status of the AppWrapper and then the status of the RayCluster.
These checks are done by the helper functions _map_to_app_wrapper and _map_to_ray_cluster.
Both helpers read fields of the resource's status, which may not be defined yet.
PR #254 fixed the AppWrapper check, but it did not fix the RayCluster check, so the call fails with the following traceback:

  File "/usr/local/lib/python3.10/site-packages/codeflare_sdk/cluster/cluster.py", line 271, in wait_ready
    status, ready = self.status(print_to_console=False)
  File "/usr/local/lib/python3.10/site-packages/codeflare_sdk/cluster/cluster.py", line 237, in status
    cluster = _ray_cluster_status(self.config.name, self.config.namespace)
  File "/usr/local/lib/python3.10/site-packages/codeflare_sdk/cluster/cluster.py", line 521, in _ray_cluster_status
    return _map_to_ray_cluster(rc)
  File "/usr/local/lib/python3.10/site-packages/codeflare_sdk/cluster/cluster.py", line 572, in _map_to_ray_cluster
    if "state" in rc["status"]:
KeyError: 'status'
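
The failing lookup could be guarded along the following lines. This is only a minimal sketch, not the SDK's actual fix: the helper name `ray_cluster_state` is hypothetical, and it assumes `rc` is the plain dictionary returned for the RayCluster custom resource, as in the traceback above.

```python
from typing import Any, Dict, Optional


def ray_cluster_state(rc: Dict[str, Any]) -> Optional[str]:
    """Return the RayCluster state, or None when no status is reported yet.

    Hypothetical helper: `rc` is assumed to be the dict form of the
    RayCluster custom resource.
    """
    # A freshly created resource may have no "status" key at all,
    # or the key may be present but empty or None.
    status = rc.get("status") or {}
    return status.get("state")
```

With a guard like this, a resource without a status is treated as "not ready yet" rather than raising KeyError.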

Codeflare Stack Component Versions

Please specify the component versions in which you have encountered this bug.

Codeflare SDK: v0.7.0
MCAD:
Instascale:
Codeflare Operator:
Other:

Steps to Reproduce the Bug

  1. Create a cluster
  2. Call cluster.up()
  3. Immediately call cluster.wait_ready()
  4. See the KeyError: 'status' error shown in the traceback above

Expected Behavior

The code should safely wait until the RayCluster is ready.
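
As an illustration of the expected behavior, a polling loop could treat a missing status as "not ready yet" and keep waiting. This is a sketch under stated assumptions, not the SDK's implementation: `wait_until_ready` and its `fetch` callback are hypothetical names, and the ready state is assumed to be reported as the string "ready" in `status.state`.

```python
import time
from typing import Any, Callable, Dict, Optional


def wait_until_ready(
    fetch: Callable[[], Dict[str, Any]],
    timeout: Optional[float] = None,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `fetch` until the resource reports it is ready.

    `fetch` is a hypothetical callback returning the RayCluster resource
    as a dict. Returns True once ready, False if `timeout` seconds elapse.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        rc = fetch()
        # Tolerate a resource whose status has not been populated yet.
        state = (rc.get("status") or {}).get("state")
        if state == "ready":  # assumed value of the ready state
            return True
        if deadline is not None and time.monotonic() >= deadline:
            return False
        sleep(interval)
```

Passing the fetch function in as a parameter keeps the loop independent of any particular Kubernetes client, so it can be exercised with canned responses.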

Affected Releases

v0.6.1, v0.7.0, and the main branch.
