
Commit ba4946e

robm26 and switch180 authored
LSQL: relational migration merge - guide (#121)
* relational migration instructions initial
* LMIG -> LDMS, + LSQL naming
* LSQL content additions
* Changes for LSQL to add MySQL EC2 instance
* pulling in Rob's changes to template
* Fixed syntax error in UserData
* Changes alongside my review
* Rob's changes
* instruction updates for clarity
* image and instruction improvements
* updated images
* Final revision before merge

---------

Co-authored-by: Sean Shriver <seanshi@amazon.com>
1 parent 03f2c12 commit ba4946e


71 files changed (+1005, -5 lines)

content/authors.en.md

Lines changed: 5 additions & 1 deletion
@@ -14,11 +14,15 @@ weight: 100
 1. Daniel Yoder ([danielsyoder](https://github.com/danielsyoder)) - The brains behind amazon-dynamodb-labs.com and the co-creator of the design scenarios
 
 ### 2024 additions
-The Generative AI workshop LBED was released in 2024:
+The Generative AI workshop LBED was released in early 2024:
 1. John Terhune - ([@terhunej](https://github.com/terhunej)) - Primary author
 1. Zhang Xin - ([@SEZ9](https://github.com/SEZ9)) - Content contributor and original author of a lab that John used as the basis of LBED
 1. Sean Shriver - ([@switch180](https://github.com/switch180)) - Editor, tech reviewer, and merger
 
+The LSQL relational migration lab was released in late 2024:
+1. Robert McCauley - ([robm26](https://github.com/robm26)) - Primary author
+1. Sean Shriver - ([@switch180](https://github.com/switch180)) - Editor, tech reviewer, and merger
+
 ### 2023 additions
 The serverless event driven architecture lab was added in 2023:
 

content/change-data-capture/index.en.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 title: "LCDC: Change Data Capture for Amazon DynamoDB"
 chapter: true
 description: "200 level: Hands-on exercises with DynamoDB Streams and Kinesis Data Streams with Kinesis Analytics."
-weight: 40
+weight: 80
 ---
 In this workshop, you will learn how to perform change data capture of item level changes on DynamoDB tables using [Amazon DynamoDB Streams](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.html) and [Amazon Kinesis Data Streams](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/kds.html). This technique allows you to develop event-driven solutions that are initiated by alterations made to item-level data stored in DynamoDB.
 

content/index.en.md

Lines changed: 2 additions & 1 deletion
@@ -16,7 +16,8 @@ Prior expertise with AWS and NoSQL databases is beneficial but not required to c
 If you're brand new to DynamoDB with no experience, you may want to begin with *Hands-on Labs for Amazon DynamoDB*. If you want to learn the design patterns for DynamoDB, check out *Advanced Design Patterns for DynamoDB* and the *Design Challenges* scenarios.
 
 ### Looking for a larger challenge?
-The DynamoDB Immersion Day has a series of workshops designed to cover advanced topics. If you want to dig deep into streaming aggregations with AWS Lambda and DynamoDB Streams, consider LEDA. Or if you want an easier introduction CDC you can consider LCDC.
+The DynamoDB Immersion Day has a series of workshops designed to cover advanced topics. If you want to dig deep into streaming aggregations with AWS Lambda and DynamoDB Streams, consider LEDA. Or if you want an easier introduction to CDC, consider LCDC. Do you have a relational database to migrate to DynamoDB? We offer LSQL and an AWS DMS lab, LDMS; we highly recommend LSQL unless you have a need to use DMS.
+
 Do you want to integrate Generative AI to create a context-aware reasoning application? If so, consider LBED, a lab that takes a product catalog from DynamoDB and continuously indexes it into OpenSearch Service for natural language queries supported by Amazon Bedrock.
 
 Dive into the content:

content/hands-on-labs/rdbms-migration/index.en.md renamed to content/rdbms-migration/index.en.md

Lines changed: 2 additions & 2 deletions
@@ -1,10 +1,10 @@
 ---
-title: "5. LMIG: Relational Modeling & Migration"
+title: "LDMS: AWS DMS Migration"
 date: 2021-04-25T07:33:04-05:00
 weight: 50
 ---
 
-In this module, also classified as LMIG, you will learn how to design a target data model in DynamoDB for highly normalized relational data in a relational database.
+In this module, classified as LDMS, you will learn how to design a target data model in DynamoDB for highly normalized relational data in a relational database.
 The exercise also guides you step by step through a migration of an IMDb dataset from a self-managed MySQL database instance on EC2 to Amazon DynamoDB, a fully managed key-value database.
 At the end of this lesson, you should feel confident in your ability to design and migrate an existing relational database to Amazon DynamoDB.
 

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
---
title : "Application Refactoring"
weight : 40
---

## Updating the Client Application for DynamoDB
After you have chosen your DynamoDB table schema and migrated any historical data over,
you can consider what code changes are required so a new version of your app can call the DynamoDB
read and write APIs.

The web app we have been using includes forms and buttons to perform standard CRUD (Create, Read, Update, Delete) operations.

The web app makes HTTP calls to the published API using standard GET and POST methods against certain API paths.

1. In Cloud9, open the left nav and locate the file **app.py**
2. Double click to open and review this file

In the bottom half of the file you will see several small handler functions that
pass core read and write requests on to the **db** object's functions.

Notice the file contains a conditional import for the **db** object.

```python
if migration_stage == 'relational':
    from chalicelib import mysql_calls as db
else:
    from chalicelib import dynamodb_calls as db
```
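
To illustrate the pattern, here is a minimal sketch of what such a pass-through handler can look like in a Chalice app. The route paths and the **db** helper names (`get_record`, `new_record`) are assumptions for illustration only and may not match the lab's actual app.py.

```python
# Hypothetical sketch of handler functions that delegate to the db module.
from chalice import Chalice
from chalicelib import dynamodb_calls as db  # or mysql_calls, depending on migration_stage

app = Chalice(app_name='relational-migration')

@app.route('/customers/{cust_id}', methods=['GET'])
def get_customer(cust_id):
    # Hand the read request straight to the data-access layer
    return db.get_record('Customers', cust_id)

@app.route('/customers', methods=['POST'])
def new_customer():
    # Hand the JSON request body to the data-access layer for the write
    return db.new_record('Customers', app.current_request.json_body)
```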
Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
---
title : "DynamoDB-ready middle tier"
weight : 41
---

## Deploy a new DynamoDB-ready API

If you recall, we previously ran the command ```chalice deploy --stage relational```
to create the MySQL-ready middle tier.

We can repeat this to create a new API Gateway and Lambda stack, this time using the DynamoDB stage.

1. Within the Cloud9 terminal window, run:
```bash
chalice deploy --stage dynamodb
```
2. When this completes, find the new Rest API URL and copy it.
3. You can paste this into a new browser tab to test it. You should see a status message indicating
the DynamoDB version of the API is working.

We now need a separate browser to test out the full web app experience, since
the original browser has a cookie set to the relational Rest API.

4. If you have multiple browsers on your laptop, such as Edge, Firefox, or Safari,
open a different browser and navigate to the web app:

[https://amazon-dynamodb-labs.com/static/relational-migration/web/index.html](https://amazon-dynamodb-labs.com/static/relational-migration/web/index.html).

(You can also open the same browser in Incognito Mode for this step.)

5. Click the Target API button and paste in the new Rest API URL.
6. Notice the title of the page has updated to **DynamoDB App** in a blue color. If it isn't blue, you can refresh the page and see the color change.
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
---
title : "Testing and reviewing DynamoDB code"
weight : 42
---

## Test drive your DynamoDB application

1. Click Tables to see a list of available tables in the account. You should see the
Customers table, the vCustOrders table, and a few other tables used by separate workshops.

2. Click on the Customers table, then click the SCAN button to see the table's data.
3. Test the CRUD operations, such as get-item and the update and delete buttons in the data grid,
to make sure they work against the DynamoDB table.
4. Click on the Querying tab to display the form with GSI indexes listed.
5. On the idx_region GSI, enter 'North' and press GO.

![DynamoDB GSI Form](/static/images/relational-migration/ddb_gsi.png)

## Updating DynamoDB functions

Let's make a small code change to demonstrate the process of customizing the DynamoDB functions.

6. In the Cloud9 left nav, locate the chalicelib folder and open it.
7. Locate and open the file dynamodb_calls.py
8. Search for the text ```get_request['ConsistentRead'] = False```
9. Update this from False to True and click File/Save to save your work.
10. In the terminal prompt, redeploy:

```bash
chalice deploy --stage dynamodb
```

11. Return to the web app, click on the Customers table, enter cust_id value "0001", and click the GET ITEM button.
12. Verify a record was retrieved for you. This record was found using a strongly consistent read.
13. Feel free to extend the DynamoDB code to add new functions or modify existing ones.
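
For reference, the flag you just changed maps to the standard boto3 `get_item` parameter. The sketch below is illustrative only; the function, table, and key names are assumptions, not the exact contents of dynamodb_calls.py.

```python
# Illustrative sketch of a strongly consistent get_item request.
import boto3

dynamodb = boto3.client('dynamodb')

def get_customer(cust_id):
    get_request = {
        'TableName': 'Customers',
        'Key': {'cust_id': {'S': cust_id}},
    }
    get_request['ConsistentRead'] = True  # was False; True forces a strongly consistent read
    response = dynamodb.get_item(**get_request)
    return response.get('Item')
```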
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
---
title : "Data Migration"
weight : 30
---

## Transform, Extract, Convert, Stage, Import

Recall that our strategy for migrating table data into DynamoDB via S3 was
summarized in the :link[Workshop Introduction]{href="../introduction/index5" target=_blank}.

For each table or view that we want to migrate, we need a routine that will ```SELECT *``` from it
and convert the result set into DynamoDB JSON before writing it to an S3 bucket.

![Migration Flow](/static/images/relational-migration/migrate_flow.png)

For migrations of very large tables, we may choose to use purpose-built data tools like
AWS Glue, Amazon EMR, or AWS DMS. These tools can help you define and coordinate multiple
parallel jobs that perform the work to extract, transform, and stage data into S3.

In this workshop we use a Python script to demonstrate this ETL process.
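
As a rough sketch of the conversion step (not the lab's actual mysql_s3.py code), a row returned from ```SELECT *``` can be translated to DynamoDB JSON with boto3's TypeSerializer. The column names and values below are invented for illustration.

```python
# Illustrative sketch: convert one relational result row into a DynamoDB JSON item.
from decimal import Decimal
from boto3.dynamodb.types import TypeSerializer

serializer = TypeSerializer()

def row_to_dynamodb_json(row: dict) -> dict:
    item = {}
    for column, value in row.items():
        if isinstance(value, float):
            value = Decimal(str(value))  # DynamoDB numbers must be Decimal, not float
        item[column] = serializer.serialize(value)
    return {'Item': item}  # Import from S3 expects one {"Item": ...} object per record

print(row_to_dynamodb_json({'cust_id': '0001', 'name': 'Ada', 'balance': 100.5}))
# {'Item': {'cust_id': {'S': '0001'}, 'name': {'S': 'Ada'}, 'balance': {'N': '100.5'}}}
```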
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
---
title : "ETL Scripts"
weight : 31
---

## mysql_s3.py

A script called mysql_s3.py is provided that performs all the work to convert and load a query result
set into S3. We can run this script in preview mode by using the "stdout" parameter.

1. Run:
```bash
python3 mysql_s3.py Customers stdout
```
You should see results in DynamoDB JSON format:

![mysql_s3.py output](/static/images/relational-migration/mysql_s3_output.png)

2. Next, run it for our view:
```bash
python3 mysql_s3.py vCustOrders stdout
```
You should see similar output from the view results.

The script can write these to S3 for us. We just need to omit the "stdout" command line parameter.

3. Now, run the script without preview mode:
```bash
python3 mysql_s3.py Customers
```
You should see confirmation that objects have been written to S3:

![mysql_s3.py output](/static/images/relational-migration/mysql_s3_write_output.png)
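
When "stdout" is omitted, the script stages the converted items as objects in S3. A simplified sketch of that step is shown here; the bucket and key names are placeholders, not the ones the workshop provisions.

```python
# Simplified sketch of staging newline-delimited DynamoDB JSON into S3.
import json
import boto3

s3 = boto3.client('s3')

def stage_items(items, bucket='my-staging-bucket', key='migrations/Customers/part-0001.json'):
    body = '\n'.join(json.dumps(item) for item in items)  # one DynamoDB JSON object per line
    s3.put_object(Bucket=bucket, Key=key, Body=body.encode('utf-8'))

stage_items([
    {'Item': {'cust_id': {'S': '0001'}, 'name': {'S': 'Ada'}}},
    {'Item': {'cust_id': {'S': '0002'}, 'name': {'S': 'Grace'}}},
])
```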
Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
---
title : "Full Migration"
weight : 32
---

## DynamoDB Import from S3

The Import from S3 feature is a convenient way to have data loaded into a new DynamoDB table.
Learn more about this feature [here](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/S3DataImport.HowItWorks.html).

Import creates a brand new table and is not able to load data into an existing table.
Therefore, it is most useful during the one-time initial load of data during a migration.

## migrate.sh

A script is provided that performs multiple steps to coordinate a migration (a sketch of the final CLI call appears after this list):
* Runs **mysql_desc_ddb.py** and stores the result in a table definition JSON file
* Runs **mysql_s3.py** to extract, transform, and load data into an S3 bucket
* Uses the **aws dynamodb import-table** CLI command to request a new table, providing the bucket name and the table definition JSON file

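Roughly, that final call looks like the following. This is an illustration only: the bucket name, key prefix, and table definition file are placeholders, and migrate.sh fills in the real values for you.

```bash
# Illustrative only: migrate.sh assembles the real bucket, prefix, and table definition
aws dynamodb import-table \
    --s3-bucket-source S3Bucket=my-staging-bucket,S3KeyPrefix=migrations/Customers/ \
    --input-format DYNAMODB_JSON \
    --table-creation-parameters file://Customers.json
```
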
1. Run:
```bash
./migrate.sh Customers
```
The script should produce output as shown here:

![Migrate Output](/static/images/relational-migration/migrate_output.png)

Notice the ARN returned. This is the ARN of the Import job, not the new DynamoDB table.

The import will take a few minutes to complete.

2. Optional: You can check the status of an import job using this command, after setting the Import ARN on line two.

```bash
aws dynamodb describe-import \
    --import-arn '<paste ARN here>' \
    --output json --query '{"Status ":ImportTableDescription.ImportStatus, "FailureCode ":ImportTableDescription.FailureCode, "FailureMessage ":ImportTableDescription.FailureMessage }'
```

We can also check the import status within the AWS Console.

3. Click into the separate browser tab titled "AWS Cloud9" to open the AWS Console.
4. In the search box, type DynamoDB to visit the DynamoDB console.
5. From the left nav, click Imports from S3.
6. Notice your import is listed along with the current status.
![Import from S3](/static/images/relational-migration/import-from-s3.png)
7. Once the import has completed, you can click it to see a summary including item count and the size of the import.
8. On the left nav, click Tables.
9. In the list of tables, click on the Customers table.
10. On the top right, click Explore Table Items.
11. Scroll down until you see a grid with your imported data.

Congratulations! You have completed a relational-to-DynamoDB migration.
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
---
title : "VIEW migration"
weight : 33
---

## Migrating from a VIEW

In the previous step, you simply ran ```./migrate.sh Customers``` to perform a migration of this table
and its data to DynamoDB.

You can repeat this process to migrate the custom view vCustOrders.

1. Run:
```bash
./migrate.sh vCustOrders
```

The script assumes you want a two-part primary key of Partition Key and Sort Key, found in the two leading columns.

If you wanted a Partition Key-only table instead, you could specify this like so:

```bash
./migrate.sh vCustOrders 1
```

But don't run this command. If you do, the S3 Import will fail, because you already created a vCustOrders table in step 1 and Import cannot replace an existing table.
You could create another view with a different name and import that, or delete the vCustOrders table
from the DynamoDB console before attempting another migration.
A Partition Key-only table is also not advisable here, since this particular dataset is not unique by just the first column.

![View output](/static/images/relational-migration/view_result.png)

::alert[Import will write all the records it finds in the bucket to the table. If a duplicate record is encountered, it will simply overwrite it. Please be sure that your S3 data does not contain any duplicates based on the Key(s) of the new table you define.]{header="Note:"}
Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
---
title : "SQL Transformation Patterns for DynamoDB"
weight : 34
---

## Shaping Data with SQL

Let's return to the web app and explore some techniques you can use to shape and enrich your relational
data before importing it to DynamoDB.

1. Within the web app, refresh the browser page.
2. Click on the Querying tab.
3. Notice the set of SQL Sample buttons below the SQL editor.
4. Click button one.
The OrderLines table has a two-part primary key, as is common with DynamoDB. We can think of the returned dataset as an Item Collection.
5. Repeat by clicking each of the other sample buttons. Check the comment at the top of each query, which summarizes the technique being shown.

![SQL Samples](/static/images/relational-migration/sparse.png)

Notice the final two sample buttons. These demonstrate alternate ways to combine data from multiple tables.
We already saw how to combine tables with a JOIN operator, resulting in a denormalized data set.

The final button shows a different approach to combining tables, without using JOIN.
You can use a UNION ALL between multiple SQL queries to stack datasets together as one.
When we arrange table data like this, we describe each source table as an entity, and so the single DynamoDB
table will be overloaded with multiple entities. Because of this, we can set the partition key and sort key
names to generic values of PK and SK, and add some decoration to the key values so that it's clear what type
of entity a given record represents.

![Stacked entities](/static/images/relational-migration/stacked.png)
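
As a sketch of this stacking pattern, a UNION ALL query with decorated generic keys might look like the following. The table and column names here are illustrative assumptions, not necessarily the workshop's schema.

```sql
-- Illustrative sketch: stack two entities into one result set with generic PK/SK
SELECT CONCAT('CUST#', cust_id)   AS PK,
       'PROFILE'                  AS SK,
       cust_name                  AS name,
       NULL                       AS order_date
  FROM Customers
UNION ALL
SELECT CONCAT('CUST#', cust_id)   AS PK,
       CONCAT('ORDER#', order_id) AS SK,
       NULL                       AS name,
       order_date
  FROM Orders;
```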
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
---
title : "Custom VIEWs"
weight : 35
---

## Challenge: Create New Views

The SQL editor window is provided so that you have an easy way to run queries and
experiment with data transformation techniques.

Using the sample queries as a guide, see how many techniques you can combine in a single query.
Look for opportunities to align attributes across the table so that they can be queried by a GSI.
Consider using date fields in column two, so that they become Sort Key values and can be queried with
DynamoDB range queries.

1. When you have a SQL statement you like, click the CREATE VIEW button.
2. In the prompt, enter a name for your new view. This will add a CREATE VIEW statement to the top of your query.
3. Click RUN SQL to create the new view.
4. Refresh the page, and your view should appear as a button next to the vCustOrders button.
