Created the Global Tables module #48

Open · wants to merge 3 commits into base: newlabs
55 changes: 55 additions & 0 deletions content/design-patterns/ex9globaltables/Step1.en.md
@@ -0,0 +1,55 @@
+++
title = "Step 1 - Create the recommendations table as a global table"
date = 2019-12-02T10:50:03-08:00
weight = 1
+++


Run the following AWS CLI command to create the `recommendations` table in US West (Oregon).
```bash
aws dynamodb create-table --table-name recommendations \
--attribute-definitions AttributeName=customer_id,AttributeType=S AttributeName=category_id,AttributeType=S \
--key-schema AttributeName=customer_id,KeyType=HASH AttributeName=category_id,KeyType=RANGE \
--billing-mode PAY_PER_REQUEST \
--stream-specification StreamEnabled=true,StreamViewType=NEW_AND_OLD_IMAGES \
--region us-west-2 \
--tags Key=workshop-design-patterns,Value=targeted-for-cleanup
```
Next, add a replica in US East (N. Virginia). The following `update-table` command creates an identical `recommendations` replica table in that Region.
```bash
aws dynamodb update-table --table-name recommendations --region us-west-2 --cli-input-json \
'{
"ReplicaUpdates":
[
{
"Create": {
"RegionName": "us-east-1"
}
}
]
}'
```
Run the following command to wait until the replica table in US East (N. Virginia) becomes active.
```bash
aws dynamodb wait table-exists --table-name recommendations --region us-east-1
```
You can view the list of replicas by using the `describe-table` command.
```bash
aws dynamodb describe-table --table-name recommendations --region us-west-2
```
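If you prefer to check replica status programmatically, here is a minimal boto3 sketch (the helper names are ours; it assumes the `Replicas` list that `DescribeTable` returns for a version 2019.11.21 global table, and the demo function requires AWS credentials):

```python
def replica_statuses(describe_response):
    """Extract (Region, status) pairs from a DescribeTable response."""
    replicas = describe_response["Table"].get("Replicas", [])
    return [(r["RegionName"], r["ReplicaStatus"]) for r in replicas]

def print_replica_statuses(table_name="recommendations"):
    """Call DescribeTable in us-west-2 and print each replica's status."""
    import boto3  # requires AWS credentials
    client = boto3.client("dynamodb", region_name="us-west-2")
    response = client.describe_table(TableName=table_name)
    for region, status in replica_statuses(response):
        print(region, status)
```

Once the replica is active, calling `print_replica_statuses()` from a configured environment should print something like `us-east-1 ACTIVE`.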
Let's take a closer look at the `create-table` command. You are creating a table named `recommendations`. The partition key on the table is `customer_id`. The sort key is `category_id`, which contains the movie genre, such as Drama or Comedy.

#### Table: `recommendations`

- Key schema: HASH, RANGE (partition and sort key)
- Table is created in on-demand capacity mode

| Attribute Name (Type) | Special Attribute? | Attribute Use Case | Sample Attribute Value |
| ------------- |:-------------:|:-------------:| -----:|
| customer_id (STRING) | Partition Key | Customer ID | `1` |
| category_id (STRING) | Sort Key | Category ID | `Drama` |
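The same table definition can also be built with boto3. A minimal sketch, mirroring the CLI flags above (the helper names are ours; the demo function requires AWS credentials):

```python
def create_table_params(table_name):
    """Build CreateTable parameters matching the CLI command in Step 1."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": "customer_id", "AttributeType": "S"},
            {"AttributeName": "category_id", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "customer_id", "KeyType": "HASH"},
            {"AttributeName": "category_id", "KeyType": "RANGE"},
        ],
        "BillingMode": "PAY_PER_REQUEST",
        "StreamSpecification": {
            "StreamEnabled": True,
            "StreamViewType": "NEW_AND_OLD_IMAGES",
        },
        "Tags": [{"Key": "workshop-design-patterns", "Value": "targeted-for-cleanup"}],
    }

def create_recommendations_table():
    """Create the table in US West (Oregon)."""
    import boto3  # requires AWS credentials
    client = boto3.client("dynamodb", region_name="us-west-2")
    return client.create_table(**create_table_params("recommendations"))
```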

Review the `recommendations` table in the DynamoDB console (as shown in the following screenshot) by choosing the **recommendations** table and then choosing the **Global tables** tab.

![Recommendations table](/images/awsconsole9a.png)

70 changes: 70 additions & 0 deletions content/design-patterns/ex9globaltables/Step2.en.md
@@ -0,0 +1,70 @@
+++
title = "Step 2 - Load data into the global table and query the replica"
date = 2019-12-02T10:50:03-08:00
weight = 2
+++


Insert a new item to the `recommendations` table in US West (Oregon).

```bash
aws dynamodb put-item \
--table-name recommendations \
--item '{"customer_id": {"S":"99"},"category_id": {"S":"Drama"}}' \
--region us-west-2
```
Wait for a second, and then query the replica in US East (N. Virginia).

```bash
aws dynamodb get-item \
--table-name recommendations \
--key '{"customer_id": {"S":"99"},"category_id": {"S":"Drama"}}' \
--region us-east-1
```
Now, run the script that sequentially writes items to the local Region and queries the remote Region, measuring the replication time. It does this for each of 10 items.

```bash
python load_recommendations_sequentially.py recommendations ./data/recommendations.csv
```

The sample `recommendations.csv` record looks like the following:
```csv
001,Drama, Argo
```
In addition to the `customer_id` and `category_id`, each record now includes a movie title. The script reads each record from the CSV file and puts the item into the DynamoDB table in the US West (Oregon) Region. Immediately afterward, it queries the replica table in the US East (N. Virginia) Region for that key, which returns an empty result. It waits for a second and tries again; this time the replica returns the item for the newly inserted customer ID. The following output shows this pattern for a few items.
Output:
```txt
88e9fe579ead:design-patterns ssarma$ python load_recommendations_sequentially.py recommendations ./data/recommendations.csv
[]
Current time: 1611813327.91749
[{'category_id': 'Drama', 'customer_id': '001', 'title': ' Argo'}]
Current time: 1611813329.044519

[]
Current time: 1611813329.2009711
[{'category_id': 'Thriller', 'customer_id': '002', 'title': 'The Last Seven'}]
Current time: 1611813330.320935

[]
Current time: 1611813330.476702
[{'category_id': 'Comedy', 'customer_id': '003', 'title': "The Night They Raided Minsky's"}]
Current time: 1611813331.594492

[]
Current time: 1611813331.7503822
[{'category_id': 'Thriller', 'customer_id': '004', 'title': 'The Final Destination'}]
Current time: 1611813332.870115
```
The output confirms that all 10 items have been inserted into the table.

You can review the replication metrics for the `recommendations` table in the DynamoDB console (as shown in the following screenshot) by choosing the **recommendations** table and then choosing the **Monitor** tab.

![Recommendations table](/images/awsconsole9b.png)

Scroll down to the Latency section to see the Get, Put, and Query latency metrics.

![Recommendations table](/images/awsconsole9c.png)

You can use Amazon CloudWatch to monitor the behavior and performance of a global table. DynamoDB publishes the `ReplicationLatency` metric for each replica in the global table.

`ReplicationLatency` is the elapsed time between when an item is written to a replica table and when that item appears in another replica in the global table. It is expressed in milliseconds and is emitted for every source and destination Region pair.

During normal operation, `ReplicationLatency` should be fairly constant. An elevated value could indicate that updates from one replica are not propagating to the other replica tables in a timely manner. Over time, this could result in the other replica tables falling behind because they no longer receive updates consistently. In this case, verify that the read capacity units (RCUs) and write capacity units (WCUs) are identical for each of the replica tables. In addition, when choosing WCU settings, follow the recommendations in Best Practices and Requirements for Managing Capacity.

`ReplicationLatency` can also increase if an AWS Region becomes degraded and you have a replica table in that Region. In this case, you can temporarily redirect your application's read and write activity to a different AWS Region.
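As a sketch of pulling that metric yourself: `ReplicationLatency` lives in the `AWS/DynamoDB` CloudWatch namespace with `TableName` and `ReceivingRegion` dimensions. The parameter-builder helper below is ours, and the demo function requires AWS credentials.

```python
from datetime import datetime, timedelta, timezone

def replication_latency_query(table_name, receiving_region):
    """Build GetMetricStatistics parameters for the ReplicationLatency metric."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/DynamoDB",
        "MetricName": "ReplicationLatency",
        "Dimensions": [
            {"Name": "TableName", "Value": table_name},
            {"Name": "ReceivingRegion", "Value": receiving_region},
        ],
        "StartTime": end - timedelta(minutes=30),
        "EndTime": end,
        "Period": 60,               # one datapoint per minute
        "Statistics": ["Average"],  # milliseconds
    }

def print_replication_latency(table_name="recommendations"):
    """Print recent average replication latency from Oregon to N. Virginia."""
    import boto3  # requires AWS credentials
    cloudwatch = boto3.client("cloudwatch", region_name="us-west-2")
    stats = cloudwatch.get_metric_statistics(
        **replication_latency_query(table_name, "us-east-1"))
    for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
        print(point["Timestamp"], point["Average"])
```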

For more information, see DynamoDB Metrics and Dimensions.
45 changes: 45 additions & 0 deletions content/design-patterns/ex9globaltables/Step3.en.md
@@ -0,0 +1,45 @@
+++
title = "Step 3 - Write to both regions and see the occasional conflict resolution"
date = 2019-12-02T10:50:04-08:00
weight = 3
+++

You can run the following `parallel` command to write to both Regions at the same time. After replication settles, the `region` attribute on each of the 10 items records which Region won the conflict resolution process.

```bash
parallel --jobs 2 < tasks.txt
```

The script should give you output that looks like the following.
```txt
88e9fe579ead:design-patterns ssarma$ parallel --jobs 2 < tasks.txt
[{'category_id': 'Drama', 'customer_id': '001', 'region': 'West', 'title': ' Argo'}]
Current time: 1611816863.0019908
[{'category_id': 'Drama', 'customer_id': '001', 'region': 'East', 'title': ' Argo'}]
Current time: 1611816864.047831

[{'category_id': 'Thriller', 'customer_id': '002', 'region': 'West', 'title': 'The Last Seven'}]
Current time: 1611816864.1282911
[{'category_id': 'Thriller', 'customer_id': '002', 'region': 'East', 'title': 'The Last Seven'}]
Current time: 1611816865.172729

[{'category_id': 'Comedy', 'customer_id': '003', 'region': 'West', 'title': "The Night They Raided Minsky's"}]
Current time: 1611816865.252855
[{'category_id': 'Comedy', 'customer_id': '003', 'region': 'West', 'title': "The Night They Raided Minsky's"}]
Current time: 1611816866.297246

[{'category_id': 'Thriller', 'customer_id': '004', 'region': 'West', 'title': 'The Final Destination'}]
Current time: 1611816866.377374
[{'category_id': 'Thriller', 'customer_id': '004', 'region': 'West', 'title': 'The Final Destination'}]
Current time: 1611816867.41737
```
You can review the transaction conflict errors metrics for the `recommendations` table in the DynamoDB console (as shown in the following screenshot) by choosing the **recommendations** table and then choosing the **Monitor** tab.

![Recommendations table](/images/awsconsole9b.png)

Scroll down to the Transactions section to see the Transaction conflict errors chart. The chart should say No data available, because DynamoDB performs the conflict resolution automatically: global tables reconcile concurrent writes with a last-writer-wins strategy rather than surfacing transaction conflicts.
![Recommendations table](/images/awsconsole9d.png)

#### Summary

Congratulations, you have completed this exercise and demonstrated how global tables replicate data across Regions and resolve conflicts. Use DynamoDB global tables to run applications that read and write in multiple AWS Regions. In the next exercise, you will learn how transactions work in DynamoDB.
20 changes: 20 additions & 0 deletions content/design-patterns/ex9globaltables/_index.en.md
@@ -0,0 +1,20 @@
+++
title = "Global Tables"
date = 2019-12-02T10:17:33-08:00
weight = 5
chapter = true
pre = "<b>Exercise 9: </b>"
description = "Explore how to create global tables and how the replication works across regions."
+++

A DynamoDB global table is a collection of one or more replica tables, one replica per Region, all owned by a single AWS account, that DynamoDB treats as a single unit. Every replica has the same table name and the same primary key schema, and stores the same set of data items. When an application writes data to a replica table in one Region, DynamoDB propagates the write to the replica tables in the other AWS Regions automatically. In a global table, a newly written item is usually propagated to all replica tables within a second. You can add replica tables to the global table so that it becomes available in additional Regions.

Use Version 2019.11.21 (Current) of global tables along with on-demand capacity. Using on-demand capacity ensures that you always have sufficient capacity to perform replicated writes to all Regions of the global table. The number of replicated write request units is equal in all Regions of the global table. For example, if you expect 10 writes per second to your replica table in N. Virginia, you should also expect to consume 10 replicated write request units in each of the other Regions of the global table.
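The capacity math above can be sketched as follows (a simplification that assumes items of 1 KB or less; larger items consume proportionally more write units):

```python
def replicated_write_units(writes_per_second, num_regions):
    """Every Region of the global table performs each replicated write,
    so each Region consumes the same write rate; the total consumed
    across the table scales with the number of Regions."""
    per_region = writes_per_second
    return {"per_region": per_region, "table_total": per_region * num_regions}

# 10 writes/second to a two-Region global table (Oregon + N. Virginia):
print(replicated_write_units(10, 2))  # {'per_region': 10, 'table_total': 20}
```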

When you use provisioned capacity mode, you manage your auto scaling policy with `UpdateTableReplicaAutoScaling`. Minimum and maximum throughput and target utilization are established globally for the table and passed to all replicas of the table. For details about auto scaling and DynamoDB, see Managing Throughput Capacity Automatically with DynamoDB Auto Scaling.

When you are using Version 2019.11.21 (Current) of global tables and you also use the Time to Live feature, DynamoDB replicates TTL deletes to all replica tables. The initial TTL delete does not consume write capacity in the Region in which the TTL expiry occurs. However, the replicated TTL delete consumes a replicated write capacity unit (provisioned capacity) or a replicated write request unit (on-demand capacity) in each of the replica Regions, and applicable charges apply.


Transactional operations provide atomicity, consistency, isolation, and durability (ACID) guarantees only within the Region where the write is originally made. Transactions are not supported across Regions in global tables. For example, if you have a global table with replicas in the US West (Oregon) and US East (N. Virginia) Regions and perform a TransactWriteItems operation in the US West (Oregon) Region, you may observe partially completed transactions in the US East (N. Virginia) Region as changes are replicated. Changes are replicated to other Regions only after they have been committed in the source Region.
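A minimal sketch of that single-Region scope (the builder helper is ours; the demo function issues `TransactWriteItems` in US West (Oregon) only and requires AWS credentials — replicas in other Regions receive the items through ordinary replication, not transactionally):

```python
def transact_put_request(table_name, items):
    """Build TransactWriteItems input for a batch of puts; the all-or-nothing
    guarantee holds only in the Region where the call is made."""
    return {
        "TransactItems": [
            {"Put": {"TableName": table_name, "Item": item}} for item in items
        ]
    }

def put_recommendations_transactionally(items):
    """Write a batch of recommendation items atomically in us-west-2."""
    import boto3  # requires AWS credentials
    client = boto3.client("dynamodb", region_name="us-west-2")
    return client.transact_write_items(**transact_put_request("recommendations", items))
```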


If the customer managed key (CMK) used to encrypt a replica is inaccessible, DynamoDB removes this replica from the replication group 20 hours after detecting the AWS KMS key as inaccessible. The replica is not deleted, but replication to and from this Region stops.
Similarly, if you disable an AWS Region, DynamoDB removes the replica in that Region from the replication group 20 hours after detecting the Region as inaccessible. The replica is not deleted, but replication to and from this Region stops.
3 changes: 2 additions & 1 deletion content/reference-materials/_index.en.md
@@ -26,5 +26,6 @@ DynamoDB Related Tools:
- **[EMR-DynamoDB-Connector: Access data stored in Amazon DynamoDB with Apache Hadoop, Apache Hive, and Apache Spark](https://github.com/awslabs/emr-dynamodb-connector)**

Online Training Courses:
- **[Linux Academy: Amazon DynamoDB Deep Dive](https://linuxacademy.com/course/dynamo-db-deep-dive/)**
- **[A Cloud Guru: Amazon DynamoDB Deep Dive](https://acloudguru.com/course/amazon-dynamodb-deep-dive/)**
- **[A Cloud Guru: Amazon DynamoDB Data Modeling](https://acloudguru.com/course/amazon-dynamodb-data-modeling/)**
- **[edX: Amazon DynamoDB: Building NoSQL Database-Driven Applications](https://www.edx.org/course/amazon-dynamodb-building-nosql-database-driven-app)**
3 changes: 3 additions & 0 deletions design-patterns/cloudformation/UserData.sh
@@ -114,6 +114,9 @@ function configure_python_and_install
yum install -y python36
alternatives --set python /usr/bin/python3.6

# This is used for the exercise with Global Tables
yum install -y parallel

log Installing workshop requirements.
/usr/bin/pip-3.6 install -r /home/ec2-user/workshop/requirements.txt

10 changes: 10 additions & 0 deletions design-patterns/data/recommendations.csv
@@ -0,0 +1,10 @@
001,Drama, Argo
002,Thriller,The Last Seven
003,Comedy,The Night They Raided Minsky's
004,Thriller,The Final Destination
005,Comedy,Page Miss Glory
006,Mystery,Sauna
007,Drama,The Last Kiss
008,Comedy,The Monster
009,Thriller,L: Change the World
010,Fantasy,Toys
61 changes: 61 additions & 0 deletions design-patterns/load_recommendations_sequentially.py
@@ -0,0 +1,61 @@
from __future__ import print_function # Python 2/3 compatibility
import boto3
import time
from boto3.dynamodb.conditions import Key, Attr
import csv
import sys
from lab_config import boto_args

def import_csv(tableName, fileName):
dynamodb = boto3.resource(**boto_args)
dynamodb_table = dynamodb.Table(tableName)
dynamodb_gt = boto3.resource('dynamodb', region_name='us-east-1')
global_table = dynamodb_gt.Table(tableName)
count = 0

time1 = time.time()
with open(fileName, 'r', encoding="utf-8") as csvfile:
myreader = csv.reader(csvfile, delimiter=',')
for row in myreader:
count += 1
newRecommendation = {}
#primary keys
newRecommendation['customer_id'] = row[0]
newRecommendation['category_id'] = row[1]
newRecommendation['title'] = row[2]

item = dynamodb_table.put_item(Item=newRecommendation)

response = global_table.query(
KeyConditionExpression=Key('customer_id').eq(row[0]) & Key('category_id').eq(row[1])
)

print(response['Items'])
print("Current time: %s" % time.time())

time.sleep(1)

response = global_table.query(
KeyConditionExpression=Key('customer_id').eq(row[0]) & Key('category_id').eq(row[1])
)

print(response['Items'])
print("Current time: %s\n" % time.time())

if count % 100 == 0:
time2 = time.time() - time1
print("recommendations count: %s in %s" % (count, time2))
time1 = time.time()
return count

if __name__ == "__main__":
args = sys.argv[1:]
tableName = args[0]
fileName = args[1]

begin_time = time.time()
count = import_csv(tableName, fileName)

# print summary
print('RowCount: %s, Total seconds: %s' %(count, (time.time() - begin_time)))

2 changes: 2 additions & 0 deletions design-patterns/tasks.txt
@@ -0,0 +1,2 @@
python write_recommendations_to_west.py recommendations ./data/recommendations.csv
python write_recommendations_to_east.py recommendations ./data/recommendations.csv
59 changes: 59 additions & 0 deletions design-patterns/write_recommendations_to_east.py
@@ -0,0 +1,59 @@
from __future__ import print_function # Python 2/3 compatibility
import boto3
import time
from boto3.dynamodb.conditions import Key, Attr
import csv
import sys
from lab_config import boto_args

def import_csv(tableName, fileName):
dynamodb_east = boto3.resource('dynamodb', region_name='us-east-1')
east_table = dynamodb_east.Table(tableName)
count = 0

time1 = time.time()
with open(fileName, 'r', encoding="utf-8") as csvfile:
myreader = csv.reader(csvfile, delimiter=',')
for row in myreader:
count += 1
newRecommendation = {}
#primary keys
newRecommendation['customer_id'] = row[0]
newRecommendation['category_id'] = row[1]
newRecommendation['title'] = row[2]

newRecommendation['region'] = 'East'
item = east_table.put_item(Item=newRecommendation)

response = east_table.query(
KeyConditionExpression=Key('customer_id').eq(row[0]) & Key('category_id').eq(row[1])
)

print(response['Items'])
print("Current time: %s" % time.time())

time.sleep(1)

response = east_table.query(
KeyConditionExpression=Key('customer_id').eq(row[0]) & Key('category_id').eq(row[1])
)

print(response['Items'])
print("Current time: %s\n" % time.time())

if count % 100 == 0:
time2 = time.time() - time1
print("recommendations count: %s in %s" % (count, time2))
time1 = time.time()
return count

if __name__ == "__main__":
args = sys.argv[1:]
tableName = args[0]
fileName = args[1]

begin_time = time.time()
count = import_csv(tableName, fileName)

# print summary
print('RowCount: %s, Total seconds: %s' %(count, (time.time() - begin_time)))