feat(infr): add scaling #5057

Open · wants to merge 5 commits into base: main
4 changes: 4 additions & 0 deletions menu/navigation.json
@@ -900,6 +900,10 @@
"label": "Monitor a deployment",
"slug": "monitor-deployment"
},
{
"label": "Configure autoscaling",
"slug": "configure-autoscaling"
},
{
"label": "Manage allowed IP addresses",
"slug": "manage-allowed-ips"
36 changes: 36 additions & 0 deletions pages/managed-inference/how-to/configure-autoscaling.mdx
@@ -0,0 +1,36 @@
---
meta:
title: How to scale Managed Inference deployments
description: This page explains how to scale Managed Inference deployments up or down.
content:
h1: How to scale Managed Inference deployments
paragraph: This page explains how to scale Managed Inference deployments up or down.
tags: managed-inference ai-data scaling
dates:
validation: 2025-06-03
posted: 2025-06-03
categories:
- ai-data
---

You can scale a Managed Inference deployment up or down to match the incoming load.

<Macro id="requirements" />

- A Scaleway account logged into the [console](https://console.scaleway.com)
- A [Managed Inference deployment](/managed-inference/quickstart/)
- [Owner](/iam/concepts/#owner) status or [IAM permissions](/iam/concepts/#permission) allowing you to perform actions in the intended Organization

## How to scale a Managed Inference deployment

1. Click **Managed Inference** in the **AI** section of the [Scaleway console](https://console.scaleway.com) side menu. A list of your deployments displays.
2. Click a deployment name or <Icon name="more" /> > **More info** to access the deployment dashboard.
3. Click the **Settings** tab and navigate to the **Scaling** section.
4. Click **Update node count** and select the desired number of nodes for your deployment.
<Message type="note">
High availability is only guaranteed with two or more nodes.
</Message>
5. Click **Update node count** to confirm the change.
<Message type="note">
Your deployment will be unavailable for 15-30 minutes while the node update is in progress.
</Message>
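The console steps above can also be expressed as an API update. The sketch below only builds the request description; the endpoint path, the `node_count` field name, and the `X-Auth-Token` header are illustrative assumptions, so check the official Managed Inference API reference before using them.

```python
import json

def build_update_request(region: str, deployment_id: str, node_count: int) -> dict:
    """Build a description of a PATCH request that changes a deployment's
    node count. URL and field names are assumptions, not the real schema."""
    if node_count < 1:
        raise ValueError("a deployment needs at least one node")
    return {
        "method": "PATCH",
        "url": (
            f"https://api.scaleway.com/inference/v1/regions/{region}"
            f"/deployments/{deployment_id}"
        ),
        "headers": {
            "X-Auth-Token": "<SCW_SECRET_KEY>",  # placeholder, never hard-code keys
            "Content-Type": "application/json",
        },
        # Two or more nodes are required for high availability.
        "body": json.dumps({"node_count": node_count}),
    }

req = build_update_request("fr-par", "example-deployment-id", 2)
print(req["method"], req["url"])
```

Remember that the deployment is unavailable while the node update is in progress, so schedule such calls outside peak traffic.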
10 changes: 7 additions & 3 deletions pages/managed-inference/how-to/create-deployment.mdx
@@ -28,12 +28,16 @@ dates:
</Message>
- Choose the geographical **region** for the deployment.
- Specify the GPU Instance type to be used with your deployment.
4. Enter a **name** for the deployment, and optional tags.
5. Configure the **network connectivity** settings for the deployment:
4. Choose the number of nodes for your deployment.
<Message type="note">
High availability is only guaranteed with two or more nodes.
</Message>
5. Enter a **name** for the deployment, and optional tags.
6. Configure the **network connectivity** settings for the deployment:
- Attach to a **Private Network** for secure communication and restricted availability. Choose an existing Private Network from the drop-down list, or create a new one.
- Set up **Public connectivity** to access resources via the public internet. Authentication by API key is enabled by default.
<Message type="important">
- Enabling both private and public connectivity will result in two distinct endpoints (public and private) for your deployment.
- Deployments must have at least one endpoint, either public or private.
</Message>
6. Click **Deploy model** to launch the deployment process. Once the model is ready, it will be listed among your deployments.
7. Click **Deploy model** to launch the deployment process. Once the model is ready, it will be listed among your deployments.