From d2dbd3cafc603fc3db45318edbb70c617ad3632e Mon Sep 17 00:00:00 2001
From: Benedikt Rollik
Date: Wed, 4 Jun 2025 15:55:35 +0200
Subject: [PATCH 1/2] feat(infr): add quantization configuration for custom
 model deployment

---
 pages/managed-inference/how-to/create-deployment.mdx | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/pages/managed-inference/how-to/create-deployment.mdx b/pages/managed-inference/how-to/create-deployment.mdx
index ad5ed35260..9732ff0aca 100644
--- a/pages/managed-inference/how-to/create-deployment.mdx
+++ b/pages/managed-inference/how-to/create-deployment.mdx
@@ -27,6 +27,10 @@ dates:
     Some models may require acceptance of an end-user license agreement. If prompted, review the terms and conditions and accept the license accordingly.
 
     - Choose the geographical **region** for the deployment.
+    - For custom models: Choose the model quantization.
+
+      Each model comes with a default quantization. Select lower bits quantization to improve performance and enable model to run on smaller GPU Nodes, while potentially reducing precision.
+
     - Specify the GPU Instance type to be used with your deployment.
 4. Enter a **name** for the deployment, and optional tags.
 5. Configure the **network connectivity** settings for the deployment:

From aeb6e60f3f4f16e87ae1fac9f574f7a353c53671 Mon Sep 17 00:00:00 2001
From: Benedikt Rollik
Date: Thu, 5 Jun 2025 14:57:39 +0200
Subject: [PATCH 2/2] Apply suggestions from code review

Co-authored-by: Rowena Jones <36301604+RoRoJ@users.noreply.github.com>
---
 pages/managed-inference/how-to/create-deployment.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pages/managed-inference/how-to/create-deployment.mdx b/pages/managed-inference/how-to/create-deployment.mdx
index 9732ff0aca..ecd53493ad 100644
--- a/pages/managed-inference/how-to/create-deployment.mdx
+++ b/pages/managed-inference/how-to/create-deployment.mdx
@@ -29,7 +29,7 @@ dates:
     - Choose the geographical **region** for the deployment.
     - For custom models: Choose the model quantization.
 
-      Each model comes with a default quantization. Select lower bits quantization to improve performance and enable model to run on smaller GPU Nodes, while potentially reducing precision.
+      Each model comes with a default quantization. Select lower bits quantization to improve performance and enable the model to run on smaller GPU nodes, while potentially reducing precision.
 
     - Specify the GPU Instance type to be used with your deployment.
 4. Enter a **name** for the deployment, and optional tags.
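
For readers who script deployments instead of using the console, the quantization choice this patch documents can in principle be set at creation time through the Managed Inference API. The sketch below is illustrative only: the endpoint path, the `node_type_name` value, and the shape of the `quantization` field are assumptions inferred from the patch's terminology, not confirmed by it. Check the official Scaleway API reference before relying on any of these names.

```python
# Hypothetical sketch: create a Managed Inference deployment with an
# explicit quantization setting via the Scaleway HTTP API.
# The endpoint path and the "quantization" field shape are ASSUMPTIONS
# based on the wording in the patch above, not verified API details.
import os

import requests

REGION = "fr-par"
API_URL = f"https://api.scaleway.com/inference/v1/regions/{REGION}/deployments"

payload = {
    "name": "my-custom-model-deployment",  # deployment name (step 4 in the doc)
    "project_id": os.environ["SCW_DEFAULT_PROJECT_ID"],
    # Placeholder ID for an imported custom model.
    "model_id": "11111111-2222-3333-4444-555555555555",
    "node_type_name": "L4",  # GPU Instance type; example value, not prescriptive
    # Lower-bit quantization lets the model fit on smaller GPU nodes,
    # trading some output precision for performance (assumed field shape).
    "quantization": {"bits": 8},
}

response = requests.post(
    API_URL,
    json=payload,
    headers={"X-Auth-Token": os.environ["SCW_SECRET_KEY"]},
    timeout=30,
)
response.raise_for_status()
print(response.json())
```

The tradeoff mirrors the patch text: fewer bits generally means a smaller memory footprint, so the model can run on smaller (and cheaper) GPU nodes, at some potential cost in precision.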