Commit 0e97c15

Authored by petermeansrock, mmcclean-aws, l2yao, jetterdj, and foxpro24
feat(sagemaker): add EndpointConfig L2 construct (#22865)
This is the second of three PRs to complete the implementation of RFC 431:
aws/aws-cdk-rfcs#431

related to #2809

----

### All Submissions:

* [x] Have you followed the guidelines in our [Contributing guide?](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md)

### Adding new Unconventional Dependencies:

* [ ] This PR adds new unconventional dependencies following the process described [here](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md/#adding-new-unconventional-dependencies)

### New Features

* [x] Have you added the new feature to an [integration test](https://github.com/aws/aws-cdk/blob/main/INTEGRATION_TESTS.md)?
* [x] Did you use `yarn integ` to deploy the infrastructure and generate the snapshot (i.e. `yarn integ` without `--dry-run`)?

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*

----

Co-authored-by: Matt McClean <mmcclean@amazon.com>
Co-authored-by: Long Yao <yl1984108@gmail.com>
Co-authored-by: Drew Jetter <60628154+jetterdj@users.noreply.github.com>
Co-authored-by: Murali Ganesh <59461079+foxpro24@users.noreply.github.com>
Co-authored-by: Abilash Rangoju <988529+rangoju@users.noreply.github.com>
1 parent 6be4cf6 commit 0e97c15

21 files changed: 3817 additions and 0 deletions

packages/@aws-cdk/aws-sagemaker/README.md

Lines changed: 39 additions & 0 deletions
@@ -156,3 +156,42 @@ import * as sagemaker from '@aws-cdk/aws-sagemaker';
```typescript
const bucket = new s3.Bucket(this, 'MyBucket');
const modelData = sagemaker.ModelData.fromBucket(bucket, 'path/to/artifact/file.tar.gz');
```

## Model Hosting

Amazon SageMaker provides model hosting services for model deployment, exposing an HTTPS endpoint
where your machine learning model is available to provide inferences.

### Endpoint Configuration

By using the `EndpointConfig` construct, you can define an endpoint configuration that can be used
to provision one or more endpoints. In this configuration, you identify one or more models to
deploy and the resources that you want Amazon SageMaker to provision. You define one or more
production variants, each of which identifies a model and describes the resources that you want
Amazon SageMaker to provision for it. If you are hosting multiple models, you also assign a variant
weight to specify how much traffic you want to allocate to each model. For example, suppose that
you want to host two models, A and B, and you assign traffic weight 2 for model A and 1 for
model B. Amazon SageMaker distributes two-thirds of the traffic to model A and one-third to
model B:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker';

declare const modelA: sagemaker.Model;
declare const modelB: sagemaker.Model;

const endpointConfig = new sagemaker.EndpointConfig(this, 'EndpointConfig', {
  instanceProductionVariants: [
    {
      model: modelA,
      variantName: 'variantA',
      initialVariantWeight: 2.0, // receives 2 / (2 + 1) of the traffic
    },
    {
      model: modelB,
      variantName: 'variantB',
      initialVariantWeight: 1.0, // receives 1 / (2 + 1) of the traffic
    },
  ],
});
```
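
The two-thirds / one-third split described above follows from dividing each variant's weight by the sum of all weights. The sketch below illustrates that arithmetic only; `WeightedVariant` and `trafficShares` are hypothetical names introduced for this example and are not part of `@aws-cdk/aws-sagemaker`, and the variant names are placeholders.

```typescript
// Hypothetical helper for illustration; not part of @aws-cdk/aws-sagemaker.
interface WeightedVariant {
  readonly variantName: string;
  readonly initialVariantWeight: number;
}

// Returns the fraction of traffic each variant receives:
// its own weight divided by the sum of all variant weights.
function trafficShares(variants: WeightedVariant[]): Map<string, number> {
  const total = variants.reduce((sum, v) => sum + v.initialVariantWeight, 0);
  if (total <= 0) {
    throw new Error('total variant weight must be positive');
  }
  const result = new Map<string, number>();
  for (const v of variants) {
    result.set(v.variantName, v.initialVariantWeight / total);
  }
  return result;
}

// Weights 2 and 1 give shares of 2/3 and 1/3 respectively.
const shares = trafficShares([
  { variantName: 'variantA', initialVariantWeight: 2.0 },
  { variantName: 'variantB', initialVariantWeight: 1.0 },
]);
```

Because only the ratio matters, weights of 2 and 1 behave the same as weights of 20 and 10.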
Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
import * as cdk from '@aws-cdk/core';

/**
 * Supported Elastic Inference (EI) instance types for SageMaker instance-based production variants.
 * EI instances provide on-demand GPU computing for inference.
 */
export class AcceleratorType {
  /**
   * ml.eia1.large
   */
  public static readonly EIA1_LARGE = AcceleratorType.of('ml.eia1.large');

  /**
   * ml.eia1.medium
   */
  public static readonly EIA1_MEDIUM = AcceleratorType.of('ml.eia1.medium');

  /**
   * ml.eia1.xlarge
   */
  public static readonly EIA1_XLARGE = AcceleratorType.of('ml.eia1.xlarge');

  /**
   * ml.eia2.large
   */
  public static readonly EIA2_LARGE = AcceleratorType.of('ml.eia2.large');

  /**
   * ml.eia2.medium
   */
  public static readonly EIA2_MEDIUM = AcceleratorType.of('ml.eia2.medium');

  /**
   * ml.eia2.xlarge
   */
  public static readonly EIA2_XLARGE = AcceleratorType.of('ml.eia2.xlarge');

  /**
   * Builds an AcceleratorType from a given string or token (such as a CfnParameter).
   * @param acceleratorType An accelerator type as a string
   * @returns A strongly typed AcceleratorType
   */
  public static of(acceleratorType: string): AcceleratorType {
    return new AcceleratorType(acceleratorType);
  }

  private readonly acceleratorTypeIdentifier: string;

  constructor(acceleratorType: string) {
    // Unresolved tokens (such as a CfnParameter value) are accepted without a prefix check.
    if (cdk.Token.isUnresolved(acceleratorType) || acceleratorType.startsWith('ml.')) {
      this.acceleratorTypeIdentifier = acceleratorType;
    } else {
      throw new Error(`accelerator type must start with 'ml.' (got ${acceleratorType})`);
    }
  }

  /**
   * Return the accelerator type as a string
   * @returns The accelerator type as a string
   */
  public toString(): string {
    return this.acceleratorTypeIdentifier;
  }
}
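
The class above follows an "enum-like class" pattern: named static constants cover the known values, while a static `of()` factory accepts any other string that passes validation. Below is a minimal standalone sketch of that pattern, assuming no CDK dependency. `MlTypeName` is a hypothetical name used only for illustration, and the token check (`cdk.Token.isUnresolved`) is deliberately omitted, so this is not the CDK class itself.

```typescript
// Standalone sketch of the enum-like class pattern; MlTypeName is a
// hypothetical stand-in and omits the CDK token check from the original.
class MlTypeName {
  // A predefined constant, built through the same factory as custom values.
  public static readonly EIA2_MEDIUM = MlTypeName.of('ml.eia2.medium');

  // Factory for values that have no predefined constant.
  public static of(name: string): MlTypeName {
    return new MlTypeName(name);
  }

  private constructor(private readonly name: string) {
    // Reject anything that does not carry the expected 'ml.' prefix.
    if (!name.startsWith('ml.')) {
      throw new Error(`type name must start with 'ml.' (got ${name})`);
    }
  }

  public toString(): string {
    return this.name;
  }
}

// A value without a predefined constant is still accepted if it is well formed.
const custom = MlTypeName.of('ml.custom.size');
```

The factory keeps the set of values open to new types without a library release, while the prefix check still catches obvious mistakes such as passing `eia2.medium`.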
