#DBcluster | Explore Tumblr posts and blogs

globalmediacampaign · 5 years ago

Text

Amazon DocumentDB (with MongoDB compatibility) read autoscaling

Amazon Document DB (with MongoDB compatibility) is a fast, scalable, highly available, and fully managed document database service that supports MongoDB workloads. Its architecture supports up to 15 read replicas, so applications that connect as a replica set can use driver read preference settings to direct reads to replicas for horizontal read scaling. Moreover, as read replicas are added or removed, the drivers adjust to automatically spread the load over the current read replicas, allowing for seamless scaling. Amazon DocumentDB separates storage and compute, so adding and removing read replicas is fast and easy regardless of how much data is stored in the cluster. Unlike other distributed databases, you don’t need to copy data to new read replicas. Although you can use the Amazon DocumentDB console, API, or AWS Command Line Interface (AWS CLI) to add and remove read replicas manually, it’s possible to automatically change the number of read replicas to adapt to changing workloads. In this post, I describe how to use Application Auto Scaling to automatically add or remove read replicas based on cluster load. I also demonstrate how this system works by modifying the load on a cluster and observing how the number of read replicas change. The process includes three steps: Deploy an Amazon DocumentDB cluster and required autoscaling resources. Generate load on the Amazon DocumentDB cluster to trigger a scaling event. Monitor cluster performance as read scaling occurs. Solution overview Application Auto Scaling allows you to automatically scale AWS resources based on the value of an Amazon CloudWatch metric, using an approach called target tracking scaling. Target tracking scaling uses a scaling policy to define which CloudWatch metric to track, and the AWS resource to scale, called the scalable target. When you register a target tracking scaling policy, Application Auto Scaling automatically creates the required CloudWatch metric alarms and manages the scalable target according to the policy definition. The following diagram illustrates this architecture. Application Auto Scaling manages many different AWS services natively, but as of this writing, Amazon DocumentDB is not included among these. However, you can still define an Amazon DocumentDB cluster as a scalable target by creating an Auto Scaling custom resource, which allows our target tracking policy to manage an Amazon DocumentDB cluster’s configuration through a custom HTTP API. This API enables the Application Auto Scaling service to query and modify a resource. The following diagram illustrates this architecture. We create the custom HTTP API with two AWS services: Amazon API Gateway and AWS Lambda. API Gateway provides the HTTP endpoint, and two Lambda functions enable Application Auto Scaling to discover the current number of read replicas, and increase or decrease the number of read replicas. One Lambda function handles the status query (a GET operation), and the other handles adjusting the number of replicas (a PATCH operation). Our complete architecture looks like the following diagram. Required infrastructure Before we try out Amazon DocumentDB read autoscaling, we create an AWS CloudFormation stack that deploys the following infrastructure: An Amazon Virtual Private Cloud (VPC) with two public and two private subnets to host our Amazon DocumentDB cluster and other resources. An Amazon DocumentDB cluster consisting of one write and two read replicas, all of size db.r5.large. A jump host (Amazon Elastic Compute Cloud (Amazon EC2)) that we use to run the load test. It lives in a private subnet and we access it via AWS Systems Manager Session Manager, so we don’t need to manage SSH keys or security groups to connect. The autoscaler, which consists of a REST API backed by two Lambda functions. A preconfigured CloudWatch dashboard with a set of useful charts for monitoring the Amazon DocumentDB write and read replicas. Start by cloning the autoscaler code from its Git repository. Navigate to that directory. Although you can create the stack on the AWS CloudFormation console, I’ve provided a script in the repository to make the creation process easier. Create an Amazon Simple Storage Service (Amazon S3) bucket to hold the CloudFormation templates: aws s3 mb s3:// On the Amazon S3 console, enable versioning for the bucket. We use versions to help distinguish new versions of the Lambda deployment packages. Run a script to create deployment packages for our Lambda functions: ./scripts/zip-lambda.sh Invoke the create.sh script, passing in several parameters. The template prefix is the folder in the S3 bucket where we store the Cloud Formation templates. ./scripts/create.sh For example, see the following code: ./scripts/create.sh cfn PrimaryPassword docdbautoscale us-east-1 The Region should be the same Region in which the S3 bucket was created. If you need to update the stack, pass in –update as the last argument. Now you wait for the stack to create. When the stack is complete, on the AWS CloudFormation console, note the following values on the stack Outputs tab: DBClusterIdentifier DashboardName DbEndpoint DbUser JumpHost VpcId ApiEndpoint When we refer to these later on, they appear in brackets, like Also note your AWS Region and account number. Register the autoscaler: cd scripts python register.py Autoscaler design The autoscaler implements the custom resource scaling pattern from the Application Auto Scaling service. In this pattern, we have a REST API that offers a GET method to obtain the status of the resource we want to scale, and a PATCH method that updates the resource. The GET method The Lambda function that implements the GET method takes an Amazon DocumentDB cluster identifier as input and returns information about the desired and actual number of read replicas. The function first retrieves the current value of the desired replica count, which we store in the Systems Manager Parameter Store: param_name = "DesiredSize-" + cluster_id r = ssm.get_parameter( Name= param_name) desired_count = int(r['Parameter']['Value']) Next, the function queries Amazon DocumentDB for information about the read replicas in the cluster: r = docdb.describe_db_clusters( DBClusterIdentifier=cluster_id) cluster_info = r['DBClusters'][0] readers = [] for member in cluster_info['DBClusterMembers']: member_id = member['DBInstanceIdentifier'] member_type = member['IsClusterWriter'] if member_type == False: readers.append(member_id) It interrogates Amazon DocumentDB for information about the status of each of the replicas. That lets us know if a scaling action is ongoing (a new read replica is creating). See the following code: r = docdb.describe_db_instances(Filters=[{'Name':'db-cluster-id','Values': [cluster_id]}]) instances = r['DBInstances'] desired_count = len(instances) - 1 num_available = 0 num_pending = 0 num_failed = 0 for i in instances: instance_id = i['DBInstanceIdentifier'] if instance_id in readers: instance_status = i['DBInstanceStatus'] if instance_status == 'available': num_available = num_available + 1 if instance_status in ['creating', 'deleting', 'starting', 'stopping']: num_pending = num_pending + 1 if instance_status == 'failed': num_failed = num_failed + 1 Finally, it returns information about the current and desired number of replicas: responseBody = { "actualCapacity": float(num_available), "desiredCapacity": float(desired_count), "dimensionName": cluster_id, "resourceName": cluster_id, "scalableTargetDimensionId": cluster_id, "scalingStatus": scalingStatus, "version": "1.0" } response = { 'statusCode': 200, 'body': json.dumps(responseBody) } return response The PATCH method The Lambda function that handles a PATCH request takes the desired number of read replicas as input: {"desiredCapacity":2.0} The function uses the Amazon DocumentDB Python API to gather information about the current state of the cluster, and if a scaling action is required, it adds or removes a replica. When adding a replica, it uses the same settings as the other replicas in the cluster and lets Amazon DocumentDB choose an Availability Zone automatically. When removing replicas, it chooses the Availability Zone that has the most replicas available. See the following code: # readers variable was initialized earlier to a list of the read # replicas. reader_type and reader_engine were copied from # another replica. desired_count is essentially the same as # desiredCapacity. if scalingStatus == 'Pending': print("Initiating scaling actions on cluster {0} since actual count {1} does not equal desired count {2}".format(cluster_id, str(num_available), str(desired_count))) if num_available < desired_count: num_to_create = desired_count - num_available for idx in range(num_to_create): docdb.create_db_instance( DBInstanceIdentifier=readers[0] + '-' + str(idx) + '-' + str(int(time.time())), DBInstanceClass=reader_type, Engine=reader_engine, DBClusterIdentifier=cluster_id ) else: num_to_remove = num_available - desired_count for idx in range(num_to_remove): # get the AZ with the most replicas az_with_max = max(reader_az_cnt.items(), key=operator.itemgetter(1))[0] LOGGER.info(f"Removing read replica from AZ {az_with_max}, which has {reader_az_cnt[az_with_max]} replicas") # get one of the replicas from that AZ reader_list = reader_az_map[az_with_max] reader_to_delete = reader_list[0] LOGGER.info(f"Removing read replica {reader_to_delete}") docdb.delete_db_instance( DBInstanceIdentifier=reader_to_delete) reader_az_map[az_with_max].remove(reader_to_delete) reader_az_cnt[az_with_max] -= 1 We also store the latest desired replica count in the Parameter Store: r = ssm.put_parameter( Name=param_name, Value=str(desired_count), Type='String', Overwrite=True, AllowedPattern='^d+$' ) Defining the scaling target and scaling policy We use the boto3 API to register the scaling target. The MinCapacity and MaxCapacity are set to 2 and 15 in the scaling target, because we always want at least two read replicas, and 15 is the maximum number of read replicas. The following is the relevant snippet from the registration script: # client is the docdb boto3 client response = client.register_scalable_target( ServiceNamespace='custom-resource', ResourceId='https://' + ApiEndpoint + '.execute-api.' + Region + '.amazonaws.com/prod/scalableTargetDimensions/' + DBClusterIdentifier, ScalableDimension='custom-resource:ResourceType:Property', MinCapacity=2, MaxCapacity=15, RoleARN='arn:aws:iam::' + Account + ':role/aws-service-role/custom-resource.application-autoscaling.amazonaws.com/AWSServiceRoleForApplicationAutoScaling_CustomResource' ) The script also creates the autoscaling policy. There are several important configuration parameters in this policy. I selected CPU utilization on the read replicas as the target metric (CPU utilization is not necessarily the best metric for your workload’s scaling trigger; other options such as BufferCacheHitRatio may provide better behavior). I set the target value at an artificially low value of 5% to more easily trigger a scaling event (a more realistic value for a production workload is 70–80%). I also set a long cooldown period of 10 minutes for both scale-in and scale-out to avoid having replicas added or removed too frequently. You need to determine the cooldown periods that are most appropriate for your production workload. The following is the relevant snippet from the script: response = client.put_scaling_policy( PolicyName='docdbscalingpolicy', ServiceNamespace='custom-resource', ResourceId='https://' + ApiEndpoint + '.execute-api.' + Region + '.amazonaws.com/prod/scalableTargetDimensions/' + DBClusterIdentifier, ScalableDimension='custom-resource:ResourceType:Property', PolicyType='TargetTrackingScaling', TargetTrackingScalingPolicyConfiguration={ 'TargetValue': 5.0, 'CustomizedMetricSpecification': { 'MetricName': 'CPUUtilization', 'Namespace': 'AWS/DocDB', 'Dimensions': [ { 'Name': 'Role', 'Value': 'READER' }, { 'Name': 'DBClusterIdentifier', 'Value': DBClusterIdentifier } ], 'Statistic': 'Average', 'Unit': 'Percent' }, 'ScaleOutCooldown': 600, 'ScaleInCooldown': 600, 'DisableScaleIn': False } ) Generating load I use the YCSB framework to generate load. Complete the following steps: Connect to the jump host using Session Manager: aws ssm start-session --target Install YCSB: sudo su - ec2-user sudo yum -y install java curl -O --location https://github.com/brianfrankcooper/YCSB/releases/download/0.17.0/ycsb-0.17.0.tar.gz tar xfvz ycsb-0.17.0.tar.gz cd ycsb-0.17.0 Run the load tester. We use workloadb, which is a read-heavy workload: ./bin/ycsb load mongodb -s -P workloads/workloadb -p recordcount=10000000 -p mongodb.url=”mongodb://:@:27017/?replicaSet=rs0&readPreference=secondaryPreferred&retryWrites=false” > load.dat ./bin/ycsb run mongodb -threads 10 -target 100 -s -P workloads/workloadb -p recordcount=10000000 -p mongodb.url=”mongodb://< PrimaryUser>:@:27017/?replicaSet=rs0&readPreference=secondaryPreferred&retryWrites=false” > run.dat These two commands load data in the Amazon DocumentDB database and run a read-heavy workload using that data. Monitoring scaling activity and cluster performance The CloudFormation stack created a CloudWatch dashboard that shows several metrics. The following screenshot shows the dashboard for the writer node. The following screenshot shows the dashboard for the read replicas. As YCSB runs, watch the dashboard to see the load increase. When the CPU load on the readers exceeds our 5% target, the autoscaler should add a read replica. We can verify that by checking the Amazon DocumentDB console and observing the number of instances in the cluster. Cleaning up If you deployed the CloudFormation templates used in this post, consider deleting the stack if you don’t want to keep the resources. Conclusion In this post, I showed you how to use a custom Application Auto Scaling resource to automatically add or remove read replicas to an Amazon DocumentDB cluster, based on a specific performance metric and scaling policy. Before using this approach in a production setting, you should decide which Amazon DocumentDB performance metric best reflects when your workload needs to scale in or scale out, determine the target value for that metric, and settle on a cooldown period that lets you respond to cluster load without adding or removing replicas too frequently. As a baseline, you could try a scaling policy that triggers a scale-up when CPUUtilization is over 70% or FreeableMemory is under 10%. About the Author Randy DeFauw is a principal solutions architect at Amazon Web Services. He works with the AWS customers to provide guidance and technical assistance on database projects, helping them improve the value of their solutions when using AWS. https://aws.amazon.com/blogs/database/amazon-documentdb-with-mongodb-compatibility-read-autoscaling/

0 notes

awsexchage · 7 years ago

Photo

CloudFormation で作った S3 バケットにおいて, オブジェクトが入っている状態でスタックを削除しようとすると軒並みエラーになるので, その対処方法について検討した #ただそれだけ https://ift.tt/2JMe6oy

どうも

どういうことなの？

じゃあ, どうすれば良いのか — DeletionPolicy で Retain を設定して, スタックの削除とは切り離す — カスタムリソースを利用して, Lambda ファンクションでオブジェクトを削除してからバケットを削除する　　—– カスタムリソースについて　　—– オブジェクトをまるっと削除する Lambda ファンクション (1) 　　—– オブジェクトをまるっと削除する Lambda ファンクション (2) 　　—– スタックを削除する　　—– 懸念点

以上

どうも

#ただそれだけの CloudFormation 初心者, 川原です.

EC2 や ELB, そして, ELB のログを記録する S3 バケットを含むスタックを削除しようとすると, 軒並み以下のようなエラーとなってスタックの削除が失敗に終わる.

Target group 'arn:aws:elasticloadbalancing:ap-northeast-1:123456789012:targetgroup/oreno-api-debug-alb-target/d429742cee94a2c2' is currently in use by a listener or a rule (Service: AmazonElasticLoadBalancingV2; Status Code: 400; Error Code: ResourceInUse; Request ID: 66f8369a-6636-11e8-9cc4-23b879baae28)

えええーっとなって, 慌てて手動でバケットを削除してから, 改めてスタックを削除すると削除は成功で終わる.

これを回避する方法って無いのかなーと思ったのでメモ.

どういうことなの？

DeletionPolicy 属性というドキュメントを読むと, 以下のように記述されている.

削除

AWS CloudFormation はスタックの削除時にリソースと (該当する場合) そのすべてのコンテンツを削除します。この削除ポリシーは、あらゆるリソースタイプに追加することができます。デフォルトでは、DeletionPolicy を指定しない場合、リソースは削除されます。ただし、以下の点を考慮する必要があります。

AWS::RDS::DBCluster リソースの場合、デフォルトポリシーは Snapshot です。

DBClusterIdentifier プロパティを指定しない AWS::RDS::DBInstance リソースの場合、デフォルトポリシーは Snapshot です。

Amazon S3 バケットでは、削除を成功させるためにはバケットのすべてのオブジェクトを削除する必要があります。

なるほど, S3 バケットの中身は削除しなければ, スタックを削除する際に一緒に S3 バケットを削除することは出来ないとのこと.

じゃあ, どうすれば良いのか

DeletionPolicy で Retain を設定して, スタックの削除とは切り離す

AWS CloudFormation におけるリソースの削除処理の方法は、DeletionPolicy 属性で指定します。 DeletionPolicy 属性 - AWS CloudFormation - docs.aws.amazon.com

docs.aws.amazon.com

以下のように, DeletionPolicy に Retain を定義する. i

"ALBLOGBUCKET": { "Type": "AWS::S3::Bucket", "DeletionPolicy" : "Retain", "Properties": { "BucketName": { "Fn::Join" : [ "", [{ "Ref": "Project" }, "-", { "Ref": "Env" }, "-alb-log"]]} } },

Retain を設定することで, スタックを削除する際にも S3 バケット自体は削除せずに, 個別に手動 (AWS CLI 等) で S3 バケットを削除する.

カスタムリソースを利用して, Lambda ファンクションでオブジェクトを削除してからバケットを削除する

CloudFormation テンプレートでは, AWS::CloudFormation::CustomResource または Custom::String リソースタイプを使用して、カスタムリソースを指定することが出来る. また, スタックイベントに応じて Lambda ファンクションを実行することが出来る.

AWS CloudFormation スタックに AWS 以外のリソースを格納できるよう、AWS::CloudFormation::CustomResource リソースを使用してカスタムリソースを指定��ます。 AWS::CloudFormation::CustomResource - AWS CloudFormation - docs.aws.amazon.com

docs.aws.amazon.com

これを利用してスタックを削除する際に Lambda ファンクションを呼び出してバケット内のオブジェクトを削除してからバケットを削除するという方法を検討する. この方法は, 以下の記事を参考にさせて頂いた.

Is there any way to force CloudFormation to delete a non-empty S3 Bucket? Can I force CloudFormation to delete non-empty S3 Bucket? - Stack Overflow

stackoverflow.com

カスタムリソースについて

改めて, カスタムリソースについて, 以下のドキュメントを参考に整理してみる.

AWS CloudFormation スタックに AWS 以外のリソースを格納できるよう、AWS::CloudFormation::CustomResource リソースを使用してカスタムリソースを指定します。カスタムリソース - AWS CloudFormation - docs.aws.amazon.com

docs.aws.amazon.com

カスタムリソースとは, スタックを作成, 更新, 削除する度に CloudFormation がカスタムリソースで定義されたロジック (Lambda ファンクション等) を実行することが出来る機能で, 以下の三者が関連して動作している. (ドキュメントより引用しているが, 内容を理解し辛かったので意訳も入っている)

登場人物役割 template developer 要するに CloudFormation のテンプレートで, カスタムリソースを定義する cusom resource privider CloudFormation から要求される処理と応答を行う (例えば, Lambda ファンクションとか), テンプレート内では ServiceToken の値として指定する CloudFormation スタックオペレーション中にテンプレートで指定されたリクエストを ServiceToken に送信し, スタックオペレーションを進める前に応答を待機する

テンプレート内ではカスタムリソースを以下のように定義する.

MyCustomResource: Type: "Custom::TestLambdaCrossStackRef" Properties: ServiceToken: !Sub | arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:${LambdaFunctionName} StackName: Ref: "NetworkStackName"

オブジェクトをまるっと削除する Lambda ファンクション (1)

Lambda ファンクションを実装するにあたって, CloudFormation 側から送信されるイベントと, Lambda ファンクションが実行された後に CloudFormation に返す情報が必要となる. これらの情報についても, ドキュメントに明記されており, 例えば, スタック削除の場合には, 以下のようなイベント (ドキュメントより引用) が CloudFormation から送信される.

{ "RequestType" : "Delete", "RequestId" : "unique id for this delete request", "ResponseURL" : "pre-signed-url-for-delete-response", "ResourceType" : "Custom::MyCustomResourceType", "LogicalResourceId" : "name of resource in template", "StackId" : "arn:aws:cloudformation:us-east-2:namespace:stack/stack-name/guid", "PhysicalResourceId" : "custom resource provider-defined physical id", "ResourceProperties" : { "key1" : "string", "key2" : [ "list" ], "key3" : { "key4" : "map" } } }

また, Lambda ファンクションが実行された際に CloudFormation に返す情報は以下のような内容となっている. こちらも, スタック削除時のメッセージとなる.

成功した場合.

{ "Status" : "SUCCESS", "RequestId" : "unique id for this delete request (copied from request)", "LogicalResourceId" : "name of resource in template (copied from request)", "StackId" : "arn:aws:cloudformation:us-east-2:namespace:stack/stack-name/guid (copied from request)", "PhysicalResourceId" : "custom resource provider-defined physical id" }

失敗した場合.

{ "Status" : "FAILED", "Reason" : "Required failure reason string", "RequestId" : "unique id for this delete request (copied from request)", "LogicalResourceId" : "name of resource in template (copied from request)", "StackId" : "arn:aws:cloudformation:us-east-2:namespace:stack/stack-name/guid (copied from request)", "PhysicalResourceId" : "custom resource provider-defined physical id" }

これらのイベントメッセージをやり取り出来るように Lambda ファンクションを実装する必要がある.

オブジェクトをまるっと削除する Lambda ファンクション (2)

以下のリポジトリにアップした. serverless framework でデプロイ出来るようにしてある.

Contribute to CleanupBucketOnDelete development by creating an account on GitHub. inokappa/CleanupBucketOnDelete - GitHub

github.com

テンプレートは以下のように書くことで, バケットを削除する前にカスタムリソースが呼ばれて, バケットの中身を削除した後にバケットが削除されるようになる.

AWSTemplateFormatVersion: "2010-09-09" Description: "Clean up Bucket on CloudFormation stack delete demo." Parameters: S3BucketName: Type: String CleanUpBucketFunction: Type: String Resources: BucketResource: Type: AWS::S3::Bucket Properties: BucketName: !Ref S3BucketName CleanupBucketOnDelete: Type: Custom::CleanupBucket Properties: ServiceToken: Fn::Join: - "" - - "arn:aws:lambda:" - Ref: AWS::Region - ":" - Ref: AWS::AccountId - ":function:" - !Ref CleanUpBucketFunction BucketName: !Ref S3BucketName DependsOn: BucketResource

スタックを削除する

上記に掲載した CloudFormation テンプレートを利用して, オブジェクトが入っている場合と入っていない場合でバケットの削除を試してみる. このテンプレートを利用することで, oreno-sample-bucket という名前のバケットを作成する.

$ ./deploy.sh aws-profile oreno-sample-bucket demo create { "StackId": "arn:aws:cloudformation:ap-northeast-1:123456789012:stack/demo-oreno-sample-bucket/1387e400-6680-11e8-9468-50fa13f2a811" } Create Stack Success.

一応, バケットが作成出来たかを確認してみる.

$ aws s3api list-buckets --query=Buckets[].Name | grep "oreno-sample-bucket" "oreno-sample-bucket",

バケットにオブジェクトを放り込んでみる.

$ aws s3 cp test.txt s3://oreno-sample-bucket/ upload: ./test.txt to s3://oreno-sample-bucket/test.txt $ aws s3 ls s3://oreno-sample-bucket/ 2018-06-03 01:20:40 0 test.txt

AWS CLI でバケットの削除を試みてみる.

$ aws s3 rb s3://oreno-sample-bucket remove_bucket failed: s3://oreno-sample-bucket An error occurred (BucketNotEmpty) when calling the DeleteBucket operation: The bucket you tried to delete is not empty

上記のようにエラーとなる. これまで書いたように, バケットを空にする必要がある.

では, この状態でスタックを削除してみる.

$ ./deploy.sh aws-profile oreno-sample-bucket demo delete Delete Stack Success.

正常にスタックの削除が完了した. Lambda から CloudFormation には, 以下のようなイベントが送信されている (CloudWatch Logs にダンプした).

{ "Status": "SUCCESS", "Reason": "Log stream name: 2018/06/02/[$LATEST]018724fd1cc242deade18d90a525daf0", "PhysicalResourceId": "2018/06/02/[$LATEST]018724fd1cc242deade18d90a525daf0", "StackId": "arn:aws:cloudformation:ap-northeast-1:123456789012:stack/demo-oreno-sample-bucket/0580ea80-667d-11e8-80c3-50a68a175a82", "RequestId": "53fdeee6-feca-46b2-8073-94751f752386", "LogicalResourceId": "CleanupBucketOnDelete", "Data": {} }

懸念点

Lambda ファンクションを用いて, オブジェクトを削除するこの方法だが, 一点だけ気になることがある. それは, オブジェクトの数がとてつもなく多い場合に削除に時間が掛かってしまい, スタックの削除自体が正常に終了しない可能性がある. これはどうしても回避することは出来ないので, やはり, テンプレートに DeletionPolicy に Retain に設定して, 手動でバケットを削除する方法が良いと思��れる.

以上

CloudFormation 作った S3 バケットにおいて, オブジェクトが入っている状態でスタックを削除する方法について考察した. シンプルなのは DeletionPolicy に Retain を定義しておいて, 後から手動で S3 バケットを削除する, もしくは, Lambda ファンクションを実装する必要があるけど, カスタムリソースを利用することで, 一気通貫で削除することが出来ることが解った. いずれにせよひと手間かかるのは変わらないし, オブジェクト数次第では, カスタムリソースでは対応しきれない可能性がある為, 適材適所で使い分けるようにしたい.

また, CloudFormation について, カスタムリソースの存在を知れたのはとても良かった. うまく活用していきたいと思う.

元記事はこちら

「CloudFormation で作った S3 バケットにおいて, オブジェクトが入っている状態でスタックを削除しようとすると軒並みエラーになるので, その対処方法について検討した #ただそれだけ」

June 18, 2018 at 02:00PM

#AWS #Official #Blog

0 notes

globalmediacampaign · 5 years ago

Text

Achieving minimum downtime for major version upgrades in Amazon Aurora for PostgreSQL using AWS DMS

AWS provides two managed PostgreSQL options: Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL. When Amazon RDS or Aurora support a new major version of a database engine, for example, PostgreSQL 10 to 11, you can upgrade your DB instances to the new version. Major version upgrades can contain database changes that may not be backward-compatible with existing applications. For more information, see Upgrading the PostgreSQL DB Engine for Aurora PostgreSQL and Best practices for upgrading Amazon RDS to major and minor versions of PostgreSQL. Both Amazon RDS and Aurora provide an option to manually initiate a major version upgrade by modifying your DB instance. This is also known as an in-place upgrade and requires downtime for your applications during the upgrade process. Additionally, you must restore the latest backup in the event of any issues with the upgrade. Therefore, this option may not be desirable for all workload types. An alternative approach is using AWS Database Migration Service (DMS) for major version upgrades. AWS DMS uses PostgreSQL logical replication for near-real-time synchronization of data between two major versions. You should only use AWS DMS if you meet the following requirements: There is no in-place upgrade option available for a particular major version You need to upgrade a few or selective databases in an Aurora cluster You want to minimize the downtime required for the upgrade process and have a faster rollback option to the older instance in case of any issues with cutover The AWS DMS solution requires more AWS resources and additional planning as opposed to the in-place upgrade option. As DMS is based on outbound logical replication solution, there will be an increase in the load in terms of CPU utilization, Read and Write IOPS at the source. The increase may vary based on factors like change activity, number of active tasks during the migration. As a best practice, please test the DMS procedure in a non-production environment and make sure you have sized your source environment to cater to this additional load. Also, you need to have optimal configuration for the number of concurrent tasks, DMS replication instance, and necessary monitoring. This post walks you through upgrading your Aurora PostgreSQL 10 database to Aurora PostgreSQL 11 using AWS DMS with minimal downtime. The walkthrough provides many reusable artifacts that can help you to get started quickly. Although this post uses Aurora, the instructions are also valid for an Amazon RDS for PostgreSQL instance. Solution overview At a high level, this post includes the following steps: Setting up the necessary AWS resources using AWS CloudFormation Setting up the source environment based on Aurora PostgreSQL 10.x version and loading the sample data Setting up the target environment based on Aurora PostgreSQL 11.x Performing schema migration using native PostgreSQL utilities Setting up data migration (full load and change data capture (CDC)) using AWS DMS. Monitoring, testing, and cutover to the target environment The following diagram illustrates the high-level architecture. Prerequisites Before getting started, you must have the following prerequisites: An AWS account with administrator IAM privilege. You can also use the AWS managed policy Administrator. Familiarity with the following AWS services: Amazon Aurora for PostgreSQL AWS Cloud9 AWS CloudFormation AWS CLI AWS Database Migration Service (DMS) Amazon RDS for PostgreSQL Amazon VPC Experience using psql. Setting up the environment To set up your environment, complete the following steps: Download the CloudFormation template Aurora_PostgreSQL_DBLaunch.yaml from the GitHub repo. Launch it on the AWS Management Console. Name the stack apgupgrade. Specify the source and target database configurations. Leave other values at their default. This post uses the US East (N. Virginia) Region, but you can use any preferred Region. Make sure that while using AWS CLI commands, you set AWS_DEFAULT_REGION to your preferred Region. The CloudFormation stack deploys the following resources: VPC (10.0.0.0/25) Internet Gateway Two private subnets Two public subnets AWS Cloud9 environment Aurora cluster with PostgreSQL compatibility 10.x (aurora-source) and 11.x (aurora-target) in a VPC, launched with a single db.r5.large instance (Writer instance) only Sample database with the name demo .You can refer to AWS CloudFormation documentation to know more about stack creation process. The CloudFormation stack creation takes approximately 15 minutes. You can see the progress by looking in the events section. Record the Aurora master user name pgadmin. Default password for the master user is auradmin. This post uses the AWS Cloud9 IDE to run SQL scripts and load data. You can also launch an Amazon EC2 instance in the same VPC. After the launch is successful, log in to the AWS Cloud9 environment. Install the PostgreSQL client tools. Clone the AWS DMS sample repo from GitHub and load the data into the source database. See the following code: cd ~/environment git clone https://github.com/aws-samples/aws-database-migration-samples.git Navigate to the /PostgreSQL/sampledb/v1 directory. Configure the environment variables for the source and target Aurora endpoints. See the following code: export AURORA_SOURCE_EP= export AURORA_TARGET_EP= You can obtain the cluster endpoint names from the AWS CloudFormation output section. Log in as the master user to aurora-source using the psql utility and set up the sample data. The following code creates the schema dms_sample and loads schema objects and data: cd ~/environment/aws-database-migration-samples/PostgreSQL/sampledb/v1 psql -h $AURORA_SOURCE_EP -U pgadmin -d demo -f install-postgresql.sql The default installation takes up to 30–45 minutes and loads approximately 7 GB of data. You may see some psql errors such as role does not exist or psql: install-postgresql.sql:30: INFO: pg_hint_plan: hint syntax error at or near APPEND. You can ignore them. Verify the tables are set up properly and the data load is complete. To verify the list of tables and their sizes, run the following psql command: psql -h $AURORA_SOURCE_EP -U pgadmin -d demo alter database demo set search_path="$user","dms_sample","public"; dt+ dms_sample.* Clone the GitHub repo that contains the scripts and SQLs used by this post. See the following code: cd ~/environment git clone https://github.com/aws-samples/amazon-aurora-postgresql-upgrade Setting up the source environment It is important to thoroughly review the prerequisites, limitations, and best practices when you configure your source environment. This post highlights a few important considerations. For more information, see Using a PostgreSQL Database as a Source for AWS DMS. Enabling logical replication Enable logical replication by updating rds.logical_replication=1 in the aurora-source cluster parameter group and reboot the instance. For more information, see Using PostgreSQL Logical Replication with Aurora. See the following code: # Get the parameter group name and Instance details for Aurora cluster aws rds describe-db-clusters --db-cluster-identifier "aurora-source" --query "DBClusters[*].[DBClusterIdentifier,DBClusterMembers[0].DBInstanceIdentifier,DBClusterParameterGroup]" --output table # Set the rds.logical_replication to 1 for enabling replication aws rds modify-db-cluster-parameter-group --db-cluster-parameter-group-name --parameters "ParameterName=rds.logical_replication,ParameterValue=1,ApplyMethod=pending-reboot" # Reboot the instance aws rds reboot-db-instance --db-instance-identifier Datatype considerations AWS DMS doesn’t support all PostgreSQL datatypes when migrating data from PostgreSQL to PostgreSQL. As of the date this blog post was published, you can’t migrate composite datatypes and timestamps with time zones. Additionally, AWS DMS streams some data types as strings if the data type is unknown. Some data types, such as XML and JSON, can successfully migrate as small files, but can fail if they are large documents. If you have tables with such datatypes, you should use native PostgreSQL replication tools like pg_dump or Publisher/Subscriber logical replication to migrate such tables. For more information, see Migrating from PostgreSQL to PostgreSQL Using AWS DMS. See the following code: cd ~/environment/amazon-aurora-postgresql-upgrade/DMS psql -h $AURORA_SOURCE_EP -U pgadmin -d demo -f SourceDB/dms_unsupported_datatype.sql You do not see rows for the preceding query because you don’t have any unsupported datatypes in your setup. Another key consideration is to identify tables with NUMERIC data type without precision and scale. When transferring data that is a NUMERIC data type but without precision and scale, AWS DMS uses NUMERIC (28,6) (a precision of 28 and scale of 6) by default. For example, the value 0.611111104488373 from the source is converted to 0.611111 on the PostgreSQL target. See the following code: select table_schema,table_name,column_name,data_type from information_schema.columns where data_type ='numeric' and numeric_scale is null; You should evaluate the impact of this precision issue for your workload and adjust the precision for the table. If your application must retain the precision and scale on the target database, you need to modify the source database by using ALTER TABLE. The ALTER TABLE command is a data definition language (DDL) command that acquires an exclusive lock on the table and is held until the end of the transaction, which causes database connections to pile up and leads to application outage, especially for large tables. Therefore, roll out such changes during a maintenance window after careful analysis. If this is not an issue for your workload, you can replicate with the as-is setup, with the caveat that you cannot enable AWS DMS validation for the table involved. For more information, see Validating AWS DMS Tasks. Missing primary keys A captured table must have a primary key. If a table doesn’t have a primary key, AWS DMS ignores DELETE and UPDATE record operations for that table. You also need a primary key for CDC and data validation purposes. To identify tables that don’t have primary keys, enter the following code: cd ~/environment/amazon-aurora-postgresql-upgrade/DMS psql -h $AURORA_SOURCE_EP -U pgadmin -d demo -f SourceDB/missing_pk.sql table_schema | table_name --------------+------------------ dms_sample | mlb_data dms_sample | nfl_data dms_sample | nfl_stadium_data dms_sample | seat To manage such tables, consider the following suggestions: Identify any column that can serve as a primary key. This could be a column with a unique index and no null constraint. If no such key exists, try adding a surrogate key by adding a column like GUID. Another option is to add all columns present in the table. If the table receives only inserts and doesn’t accept any updates or deletes (for example, you are using it as a history table), then you can leave it as is and DMS will copy the inserts. In this walkthrough, you create primary keys on all the tables (except seat) to make sure CDC and AWS DMS validation occurs. Because you skip the primary key creation step for the seat table, you may notice that AWS DMS reports the validation state as no primary key and doesn’t perform any data validation on this table. See the following code: set search_path='dms_sample'; alter table mlb_data add primary key (mlb_id); alter table nfl_data add primary key (position ,name,team); alter table nfl_stadium_data add primary key(stadium,team); Data definition language propagation You can replicate the DDL statements with AWS DMS, but there are exceptions. For example, when using CDC, AWS DMS does not support TRUNCATE operations. For more information, see Limitations on Using a PostgreSQL Database as a Source for AWS DMS. As a best practice, you should apply the DDL statements during the maintenance window on the source and target database manually. Based on your DDL strategy, make sure to turn it on or off by configuring the extra connection attribute captureDDLs during endpoint creation and AWS DMS task policy settings. For more information, see Task Settings for Change Processing DDL Handling. Other considerations before the upgrade PostgreSQL version 11 contains several changes that may affect compatibility with previous releases. For example, the column relhaspkey has been deprecated in the pg_class catalog, and you should use the pg_index catalog instead to check primary keys. If you have any custom monitoring queries that are dependent on such columns, you need to amend them accordingly. For more information on changes that may affect compatibility with previous releases, see the particular major version release notes (such as the PostgreSQL 11.0 release notes on the PostgreSQL website). If you are using extensions like pg_stat_statements or pg_hint_plan, you need to create them manually in the target database. While creating them, make sure to check if there is a version mismatch between the source and target database and look for release notes. For example, pg_repack added support for PostgreSQL 11 in 1.4.4, which means you must upgrade your pg_repack client to 1.4.4. You can verify the installed extensions using the following code: psql> dx After you set up the source environment and validate that it’s ready, you can set up the target environment. Setting up the target environment It is important to carefully review the prerequisites, limitations, and best practices when configuring your target environment. For more information, see Using a PostgreSQL Database as a Target for AWS Database Migration Service. In this section, you perform the schema migration and required parameters in the environment. This post provides a custom parameter group based on PostgreSQL version 11. You can customize this further based on your existing configurations. Set up session_replication_role=replica to temporarily disable all triggers from the instance until the migration is complete. See the following code: # Identify instance id for aurora-target cluster aws rds describe-db-clusters --db-cluster-identifier "aurora-target" --query "DBClusters[*].[DBClusterIdentifier,DBClusterMembers[0].DBInstanceIdentifier, DBClusterParameterGroup]" --output table # Check DB parameter group aws rds describe-db-instances --db-instance-identifier --query "DBInstances[*].[DBClusterIdentifier,DBInstanceIdentifier, DBParameterGroups[0].DBParameterGroupName,DBParameterGroups[0].ParameterApplyStatus]" --output table # Modify session_replication_role setting to replica aws rds modify-db-parameter-group --db-parameter-group-name --parameters "ParameterName=session_replication_role,ParameterValue=replica, ApplyMethod=immediate" # Make sure the ParameterApplyStatusvalue changes from applying to in-sync aws rds describe-db-instances --db-instance-identifier --query "DBInstances[*].[DBClusterIdentifier,DBInstanceIdentifier, DBParameterGroups[0].DBParameterGroupName,DBParameterGroups[0].ParameterApplyStatus]" --output table # Verify that database parameter is set in target database demo=> show session_replication_role ; session_replication_role -------------------------- replica (1 row) To proceed with this migration, you must first create the schema and associated objects in the target database. Schema migration includes two steps: migrate the users, roles, and grants, and migrate the schema definitions. Because this is a homogeneous migration, you use native PostgreSQL tools such as pg_dumpall and pg_dump. Migrating users, roles, and system grants Use pg_dumpall to dump global objects that are common to all databases. This includes information about database roles and properties such as access permissions that apply to whole databases (pg_dump does not export these objects). See the following code: cd ~/environment cd amazon-aurora-postgresql-upgrade/DMS/SourceDB pg_dumpall -h $AURORA_SOURCE_EP -g -U pgadmin -f db_roles.sql --no-role-password Amazon RDS for PostgreSQL and Aurora PostgreSQL block access to the catalog table pg_authid. Therefore, you have to use the —no—role-password option in pg_dumpall to dump the user and roles definition. Additionally, you need to use the PostgreSQL 10 client for this. The db_roles.sql file has all the user and role information, including rdsadmin and other rds_* roles. Identify relevant users for your environment and exclude unwanted users and tablespace definitions from the SQL script and run this on the target database. This makes sure you have consistent users and roles set up in both the source and target environment. Because the passwords are not exported, you must synchronize the passwords manually using alter user commands. If you store your credentials in AWS Secrets Manager or use IAM authentication, set up the necessary credentials and permissions for the target environment. psql -h $AURORA_TARGET_EP -U pgadmin -d demo -c "CREATE USER dms_user WITH PASSWORD 'dms_user'" Migrating schema objects To copy the schema DDL, you use the pg_dump command. See the following code: cd ~/environment/amazon-aurora-postgresql-upgrade/DMS/SourceDB pg_dump –h $AURORA_SOURCE_EP -d demo --schema-only -U pgadmin -f pg_schema.sql The pg_schema.sql file includes all DDL statements. As a best practice, you should create objects in the following order for efficient data loading: Create sequences, tables, and primary key constraints for initiating full load. After the full load is complete, you can have tasks stop before applying CDC changes. You can manage this in the AWS DMS task settings. Create additional secondary indexes and other remaining objects. This approach makes sure that secondary indexes do not slow down the full load process. This post has already extracted the dms_sample schema-related DDLs from the pg_schema.sql script and created multiple DDL files based on the object type. You can find these SQL files in the TargetDB directory. To create the dms_sample schema in the target environment, enter the following code: cd ~/environment/amazon-aurora-postgresql-upgrade/DMS/ psql -h $AURORA_TARGET_EP -U pgadmin -d demo -f ./TargetDB/create_table.sql psql -h $AURORA_TARGET_EP -U pgadmin -d demo -f ./TargetDB/create_pk_constraint.sql Verify that the schema objects are in the target before proceeding to database migration. Database migration In this step, you create an AWS DMS replication instance (AWS DMS version 3.3.1), source, and target endpoints via AWS CloudFormation using the stack name DMSRepforBlog. Because you use AWS CLI commands to create replication tasks, you must create two roles: dms-vpc-role and dms-cloudwatch-logs-role. For instructions, see Creating the IAM Roles to Use With the AWS CLI and AWS DMS API. Launch the CloudFormation template with the following code: cd ~/environment/amazon-aurora-postgresql-upgrade/DMS/src # Install jq utility sudo yum install jq -y # Set AURORA_DB_CFSTACK_NAME to Aurora PostgreSQL launch CloudFormation stack name. export AURORA_DB_CFSTACK_NAME="" echo $AURORA_DB_CFSTACK_NAME # Verify above stack name matches the Aurora stack created in earlier step export AWS_DEFAULT_REGION="us-east-1" #Source Endpoint Information SrcRDSEndPoint=$(aws cloudformation describe-stacks --stack-name $AURORA_DB_CFSTACK_NAME | jq -r '.Stacks[].Outputs[] | select(.OutputKey=="SrcRDSEndPoint") | .OutputValue') #Target Endpoint Information TgtRDSEndPoint=$(aws cloudformation describe-stacks --stack-name $AURORA_DB_CFSTACK_NAME | jq -r '.Stacks[].Outputs[] | select(.OutputKey=="TgtRDSEndPoint") | .OutputValue') #Subnet Information SubnetID1=$(aws cloudformation describe-stacks --stack-name $AURORA_DB_CFSTACK_NAME | jq -r '.Stacks[].Outputs[] | select(.OutputKey=="SubnetID1") | .OutputValue') SubnetID2=$(aws cloudformation describe-stacks --stack-name $AURORA_DB_CFSTACK_NAME | jq -r '.Stacks[].Outputs[] | select(.OutputKey=="SubnetID2") | .OutputValue') #Security Group Information RepSecurityGroup=$(aws cloudformation describe-stacks --stack-name $AURORA_DB_CFSTACK_NAME | jq -r '.Stacks[].Outputs[] | select(.OutputKey=="RDSSecurityGrp") | .OutputValue') export SrcDBUsername="pgadmin" export SrcDBPassword="auradmin" export TgtDBUsername="pgadmin" export TgtDBPassword="auradmin" # Launch the Cloudformation Stack to create DMS replication instance aws cloudformation create-stack --stack-name DMSRepforBlog --template-body file://DMSRepInstance.yaml --parameters ParameterKey=RepAllocatedStorage,ParameterValue=100 ParameterKey=RepMultiAZ,ParameterValue=false ParameterKey=RepSecurityGroup,ParameterValue=$RepSecurityGroup ParameterKey=ReplInstanceType,ParameterValue=dms.r4.2xlarge ParameterKey=SrcDBUsername,ParameterValue=$SrcDBUsername ParameterKey=SrcDBPassword,ParameterValue=$SrcDBPassword ParameterKey=SrcDatabaseConnection,ParameterValue=$SrcRDSEndPoint ParameterKey=SrcEngineType,ParameterValue=aurora-postgresql ParameterKey=Subnets,ParameterValue="$SubnetID1 , $SubnetID2" ParameterKey=TgtDBUsername,ParameterValue=$TgtDBUsername ParameterKey=TgtDBPassword,ParameterValue=$TgtDBPassword ParameterKey=TgtDatabaseConnection,ParameterValue=$TgtRDSEndPoint ParameterKey=TgtEngineType,ParameterValue=aurora-postgresql Stack creation takes up to 5 minutes. When it is complete, test the connection for the source and target with the following code: Set AWSDMS_CFSTACK_NAME to DMS replication Cloudformation stack name AWSDMS_CFSTACK_NAME="DMSRepforBlog" export AWS_DEFAULT_REGION="us-east-1" #Set variable to replication instance arn DMSREP_INSTANCE_ARN=$(aws cloudformation describe-stacks --stack-name $AWSDMS_CFSTACK_NAME | jq -r '.Stacks[].Outputs[] | select(.OutputKey=="ReplicationInstanceArn") | .OutputValue') # Set source database end point arn DB_SRC_ENDPOINT=$(aws cloudformation describe-stacks --stack-name $AWSDMS_CFSTACK_NAME | jq -r '.Stacks[].Outputs[] | select(.OutputKey=="SrcEndpointArn") | .OutputValue') # Set target database end point arn DB_TGT_ENDPOINT=$(aws cloudformation describe-stacks --stack-name $AWSDMS_CFSTACK_NAME | jq -r '.Stacks[].Outputs[] | select(.OutputKey=="TgtEndpointArn") | .OutputValue') After you get the ARN for the replication instance and endpoints, you can proceed with testing your connection. See the following code: #Test source DB connection aws dms test-connection --replication-instance-arn ${DMSREP_INSTANCE_ARN} --endpoint-arn ${DB_SRC_ENDPOINT} # Ensure status changes from testing to successful (takes about ~1 min) aws dms describe-connections --filter Name=endpoint-arn,Values=${DB_SRC_ENDPOINT} Name=replication-instance-arn,Values=${DMSREP_INSTANCE_ARN} --output table #Repeat same steps for target DB aws dms test-connection --replication-instance-arn ${DMSREP_INSTANCE_ARN} --endpoint-arn ${DB_TGT_ENDPOINT} # Ensure status changes from testing to successful (takes about ~1 min) aws dms describe-connections --filter Name=endpoint-arn,Values=${DB_TGT_ENDPOINT} Name=replication-instance-arn,Values=${DMSREP_INSTANCE_ARN} --output table To refresh the schemas’ source endpoint, enter the following code: # Below command is an asynchronous operation and can take several minutes aws dms refresh-schemas --endpoint-arn $DB_SRC_ENDPOINT --replication-instance-arn $DMSREP_INSTANCE_ARN --output table # Check status aws dms describe-refresh-schemas-status --endpoint-arn $DB_SRC_ENDPOINT --output table Setting up AWS DMS AWS DMS supports full load and CDC for this migration. Alternatively, you can use native tools such as pg_dump for full load and AWS DMS for CDC only. This post uses AWS DMS to carry out both full load and CDC. In this walkthrough, you create two tasks: one for the large tables and the other for the rest of the tables to improve the parallelism using different task settings. Transactional consistency is maintained within a task, so tables in separate tasks don’t participate in common transactions. During the migration, the full load of large tables (in terms of rows or physical size of the table) can take a considerable time. The following are best practices for AWS DMS full load: Identify the large tables and divide them into multiple chunks based on numeric primary keys. Split these chunks into multiple tasks across the number of DMS instances. To copy large partitioned tables, manually split the tasks or use auto-partition available in AWS DMS from version 1.2 or higher. Choose the appropriate replication instance class and number of replication instances after doing a couple of iterations in a non-production environment. For more information, see Choosing the Optimum Size for a Replication Instance. Plan your full load during off-peak hours at the source to reduce the burden and minimize the storage requirements to keep the database changes in Write Ahead Logs (WALs). Turn the auto vacuum off on the target database and turn it on after the full load is complete. For more information, see Improving the Performance of an AWS DMS Migration. You use AWS CLI commands to create two DMS tasks: Task 1 includes person, sporting_event_ticket, and ticket_purchase_hist because they have foreign key relationships Task 2 includes the remaining tables The DMS replication instance CloudFormation template Outputs tab has values for replication and endpoint ARNs. You can also use the jq utility to extract the required ARN. See the following code: # Switch to directory hosting table mapping and task setting files cd ~/environment/amazon-aurora-postgresql-upgrade/DMS/Migration # Task 1 export task_identifier=dms-sample-task1-full-load-cdc aws dms create-replication-task --replication-task-identifier ${task_identifier} --source-endpoint-arn ${DB_SRC_ENDPOINT} --target-endpoint-arn ${DB_TGT_ENDPOINT} --replication-instance-arn ${DMSREP_INSTANCE_ARN} --migration-type full-load-and-cdc --table-mappings 'file://table-mapping-task1.json' --replication-task-settings 'file://tasksetting.json' DMS_TASK_ARN1=$(aws dms describe-replication-tasks | jq -r '.ReplicationTasks[]|select(.ReplicationTaskIdentifier=="dms-sample-task1-full-load-cdc")|.ReplicationTaskArn') # Task 2 export task_identifier=dms-sample-task2-full-load-cdc aws dms create-replication-task --replication-task-identifier ${task_identifier} --source-endpoint-arn ${DB_SRC_ENDPOINT} --target-endpoint-arn ${DB_TGT_ENDPOINT} --replication-instance-arn ${DMSREP_INSTANCE_ARN} --migration-type full-load-and-cdc --table-mappings 'file://table-mapping-task2.json' --replication-task-settings 'file://tasksetting.json' DMS_TASK_ARN2=$(aws dms describe-replication-tasks | jq -r '.ReplicationTasks[]|select(.ReplicationTaskIdentifier=="dms-sample-task2-full-load-cdc")|.ReplicationTaskArn') Make sure you are following the specific instructions for configuring tasks and endpoints based on the AWS DMS engine version. For example, if you use an AWS DMS version lower than 3.1.0, you must do some additional configuration at the source. You can start the tasks when they are ready. To check their status, use the following code, and make sure that ReplicationTaskStatus moves from creating to ready: aws dms describe-replication-tasks --filters Name=replication-instance-arn,Values=${DMSREP_INSTANCE_ARN} --query "ReplicationTasks[:].{ReplicationTaskIdentifier:ReplicationTaskIdentifier,ReplicationTaskArn:ReplicationTaskArn,ReplicationTaskStatus:Status,ReplicationTFullLoadPercent:ReplicationTaskStats.FullLoadProgressPercent}" --output table # Start tasks aws dms start-replication-task --replication-task-arn ${DMS_TASK_ARN1} --start-replication-task-type start-replication aws dms start-replication-task --replication-task-arn ${DMS_TASK_ARN2} --start-replication-task-type start-replication # Track progress aws dms describe-replication-tasks --filters Name=replication-instance-arn,Values=${DMSREP_INSTANCE_ARN} --query "ReplicationTasks[:].{ReplicationTaskIdentifier:ReplicationTaskIdentifier,ReplicationTaskArn:ReplicationTaskArn,ReplicationTaskStatus:Status,ReplicationTFullLoadPercent:ReplicationTaskStats.FullLoadProgressPercent}" --output table Note: After the task starts, you can verify if AWS DMS created the replication slots in the aurora-source cluster by running the following command (select * from pg_replication_slots). There is one slot created per task. This is true only for full load and CDC tasks. If you are using CDC-only tasks, you must create the slot manually and specify it in the source endpoint configuration. Make sure you clean up the tasks and any manual slots after they have completed or when no longer required. # Track progress describing statistics at table level aws dms describe-table-statistics --replication-task-arn ${DMS_TASK_ARN1} --output table aws dms describe-table-statistics --replication-task-arn ${DMS_TASK_ARN2} --output table After the full load is complete, the tasks are configured to stop. You can now create additional indexes and remaining objects. See the following code: cd ~/environment/amazon-aurora-postgresql-upgrade/DMS psql -h $AURORA_TARGET_EP -U pgadmin -d demo -f ./TargetDB/create_secondary_index.sql psql -h $AURORA_TARGET_EP -U pgadmin -d demo -f ./TargetDB/create_fk_constraints.sql psql -h $AURORA_TARGET_EP -U pgadmin -d demo -f ./TargetDB/create_view.sql psql -h $AURORA_TARGET_EP -U pgadmin -d demo -f ./TargetDB/create_function.sql psql -h $AURORA_TARGET_EP -U pgadmin -d demo -f ./TargetDB/object_grants.sql You have to resume tasks post-index creation for continuous replication. See the following code: aws dms start-replication-task --replication-task-arn ${DMS_TASK_ARN1} --start-replication-task-type resume-processing aws dms start-replication-task --replication-task-arn ${DMS_TASK_ARN2} --start-replication-task-type resume-processing You can now perform DML activity on the source and see AWS DMS replicate it to the target. See the following code: psql -h $AURORA_SOURCE_EP -U pgadmin -d demo -c "select dms_sample.generateticketactivity(1000)" psql -h $AURORA_SOURCE_EP -U pgadmin -d demo -c "select dms_sample.generatetransferactivity(100)" You can monitor the task progress via the console. The following screenshot shows the Database migration tasks page. The validation state for all the tables is Validated, except for the seat table, which has the state No primary key. See the following screenshot. Monitoring the migration You can monitor the progress of the migration using Amazon CloudWatch metrics. For more information, see Monitoring AWS DMS Tasks. As a best practice, make sure that you set up CloudWatch alarms for CDCLatencySource and CDCLatencyTarget. This helps to get timely alerts on replication lag and take appropriate action. For more information about sample AWS CLI commands and setting up monitoring for CDCLatencySource and CDCLatencyTarget, see the DMS Task Monitoring file on GitHub. You can also monitor the TransactionLogsDiskUsage and CPUUtilization metrics on the aurora-source cluster. For more information, see Monitoring an Amazon Aurora DB Cluster. Troubleshooting During the migration of complex and multi-terabyte databases, you might face issues like errors or slow data copy processing. In those scenarios, you can refer to the following posts, which cover various troubleshooting techniques and best practices for using AWS DMS: Debugging Your AWS DMS Migrations: What to Do When Things Go Wrong (Part 1) Debugging Your AWS DMS Migrations: What to Do When Things Go Wrong (Part 2) Debugging Your AWS DMS Migrations: What to Do When Things Go Wrong? (Part 3) Testing your application After the replication is caught up via AWS DMS, you can start testing your application by connecting to the aurora-target cluster. Before you proceed with testing, gather statistics for the database using utilities such as vacuumdb or analyze. For example, see the following code: vaccumDB -d demo -vZ -h -U pgadmin -p 5432 You can use the following testing strategies: For read-only testing, point your application directly to the aurora-target cluster while the replication is on. For write or any kind of stress testing on the target, create a DB cluster snapshot or clone the database of the aurora-target cluster. This way, you don’t have to break the replication process. Also, make sure you enable triggers, indexes, and any custom parameters before testing. It’s important to test your workload thoroughly against the new version to see if there are any incompatibility or performance issues. Review the existing custom parameters from the source, test, and configure it appropriately in the target environment. Cutover After the testing is complete from both a functional and non-functional standpoint, you are ready to point your applications to the new major version environment. Make sure you have defined a success criteria for cutover along with a rollback plan. This post provides the following checklist as a reference; you should tailor it to your environment. Complete the checklist in the following order: Make sure that all users and roles are set up with appropriate permissions in aurora-target. Verify the database object counts match (such as indexes and functions) in the source and target environment. Refer to sample queries in repo to perform object comparison. Set up your monitoring on target databases, such as CloudWatch metrics, to be the same as the source. Check for any long-running transactions on the source and stop or terminate them with pg_stat_activity. See the following code: cd ~/environment/amazon-aurora-postgresql-upgrade/DMS psql -h $AURORA_SOURCE_EP -d demo -U pgadmin -f ./SourceDB/longrunningsess.sql Stop all the applications that write to aurora-source. This is the point at which your downtime starts. Make that there is no AWS DMS replication delay by monitoring CloudWatch metrics CDCLatencySource and CDCLatencyTarget. Make that there are no data validation errors in AWS DMS. You can also verify the row counts for the key tables using SQL commands. Stop the AWS DMS task. Update the aurora-target cluster parameter group and set session_replication_role=origin and apply immediately. Make sure that sequences (if any) are in sync with the current value at the source by manually adjusting the last value. See the following code: select 'select setval('||quote_literal(schemaname||'.'||sequencename)||','||last_value||',true);' from pg_sequences; Modify your application to point to the aurora-target endpoint and start the application. Conduct necessary tests to verify that the application is working correctly. Enable the target for production use. This is the point at which your downtime stops. Monitor your target environment and logs for any issues. Delete the AWS DMS tasks and clean up the AWS resources after successful testing. Conclusion This post shared step-by-step instructions for migrating your data between Aurora and Amazon RDS for PostgreSQL major versions and key considerations for your planning and execution. As a best practice, review AWS DMS, Aurora, and PostgreSQL documentation for the latest information on major version upgrades. This post also provided the code templates, SQL scripts, task settings, and best practices that you can use to migrate your databases quickly. If you have any questions or comments about this post, please share your thoughts in the comments section. About the Authors Gowri Balasubramanian is a Principal Database Solutions Architect at Amazon Web Services. He works with AWS customers to provide guidance and technical assistance on both relational and NoSQL database services, helping them improve the value of their solutions when using AWS. Amit Bansal is a Senior Consultant with the Professional Services team at Amazon Web Services. He focuses on database migrations to AWS and works with customers to design and implement Amazon RDS, Aurora and Redshift architectures. HariKrishna Boorgadda is a Senior Consultant with the Professional Services team at Amazon Web Services. He focuses on database migrations to AWS and works with customers to design and implement Amazon RDS and Aurora architectures. https://probdm.com/site/MjIyNTc

0 notes

globalmediacampaign · 5 years ago

Text

Using IAM authentication to connect with pgAdmin Amazon Aurora PostgreSQL or Amazon RDS for PostgreSQL

Amazon Relational Database Service (RDS) enables you to use AWS Identity and Access Management (IAM) to manage database access for Amazon RDS for PostgreSQL database instances and Amazon Aurora PostgreSQL clusters. Database administrators can associate database users with IAM users and roles. With IAM database authentication, you don’t need to use a password when you connect to a database cluster. Instead, you use an authentication token. An authentication token is a unique string of characters that Aurora generates on request, which uses AWS Signature Version 4. Each token has a lifetime of 15 minutes. You don’t need to store user credentials in the database, because authentication is managed externally using IAM. You can also still use password authentication. For more information, see Client Authentication on the PostgreSQL documentation website. This post shows you how to use IAM authentication with tools you might already be using to connect to your Aurora PostgreSQL cluster. The steps will work equally well on your Amazon RDS for PostgreSQL instance. You can follow along using the provided commands to provision resources and configure your environment for IAM authentication. The post also walks you through connecting to the cluster using either the psql command line tool or pgAdmin using IAM credentials. Prerequisites RDS supports Secure Socket Layer (SSL) encryption for PostgreSQL database instances. You can use SSL to encrypt a PostgreSQL connection between your applications and your PostgreSQL database instances. It is highly recommended to enable SSL certificate verification. For more information, see Using SSL with a PostgreSQL DB Instance. You must download the certificate from the Amazon S3 bucket that the user guide identifies. Additionally, before you create an Aurora database cluster, you must set up your environment for Amazon Aurora. Setup You can use your existing Aurora PostgreSQL cluster or RDS for PostgreSQL database and enable IAM authentication, or you can create a new one. If you don’t have one, you can provision an Aurora PostgreSQL cluster through the AWS Management Console, AWS CLI, AWS SDK, or by using an AWS CloudFormation template. This post uses AWS CLI to create a new Aurora PostgreSQL cluster. Creating a database If you don’t already have an Aurora PostgreSQL cluster or RDS PostgreSQL instance, you must create one. Configure your database with a security group that allows entry from your client machine. Use the following CLI command: aws rds create-db-cluster --db-cluster-identifier --engine aurora-postgresql --master-username --master-user-password --db-subnet-group-name --vpc-security-group-ids Replace the placeholders for cluster name, user name, password, subnet name and security group. If you already have an Aurora PostgreSQL database that you want to work with, you can skip this step. The preceding code creates a database cluster. If you use the console to create a database cluster, then RDS automatically creates the primary instance (writer) for your database cluster. If you use the AWS CLI to create a database cluster, you must explicitly create the primary instance for your database cluster. See the following code: aws rds create-db-instance --db-instance-identifier --db-cluster-identifier --engine aurora-postgresql --db-instance-class db.r4.large Replace the placeholders for instance name and cluster name. For more information, see Creating an Amazon Aurora DB Cluster. Enabling IAM authentication By default, IAM database authentication is disabled on database instances and database clusters. You can enable IAM database authentication (or disable it again) using the console, the AWS CLI, or the RDS API. For more information, see Enabling and Disabling IAM Database Authentication. To enable IAM authentication from the command line, you must know your cluster name. You can find the name on the RDS console or in the output values of the describe-db-clusters AWS CLI command. See the following code: aws rds describe-db-clusters --query "DBClusters[*].[DBClusterIdentifier]" The following command enables IAM authentication on the cluster. aws rds modify-db-cluster --db-cluster-identifier --apply-immediately --enable-iam-database-authentication Replace the placeholder for the cluster name. IAM resources for IAM database access This post attaches a policy with an action of rds-db:connect to a single IAM user. The following diagram illustrates this workflow. You can construct other Amazon Resource Names (ARNs) to support various access patterns and attach the policies to multiple users or roles. For more information, see Creating and Using an IAM Policy for IAM Database Access. Policy To allow an IAM user or role to connect to your database instance or database cluster, you must create an IAM policy. After that, attach the policy to an IAM user or role. For more information, see Create and Attach Your First Customer Managed Policy. You construct the policy document from the following four key pieces of data: The Region of your cluster Your AWS account number The database resource ID Your database user name See the following code: { "Version" : "2012-10-17", "Statement" : [ { "Effect" : "Allow", "Action" : ["rds-db:connect"], "Resource" : ["arn:aws:rds-db:us-east-1:123456789012:dbuser:db-ABCDEFGHIJKL01234/mydbuser"] } ] } Specify an ARN that describes one database user account in one database instance using the following format: arn:aws:rds-db:::dbuser:/ In the preceding example code, the following elements are customized for the environment: us-east-1 – The Region 123456789012 – The AWS account ID db-ABCDEFGHIJKL01234 – The identifier for the DB instance mydbuser – The name of the database account to associate with IAM authentication. A resource ID is the identifier for the database instance. This identifier is unique to a Region and never changes. In the example policy, the identifier is db-ABCDEFGHIJKL01234. To find a DB instance resource ID in the console for RDS, Choose Configuration. The resource ID is located in the Configuration section. Alternatively, you can use the AWS CLI command to list the identifiers and resource IDs for all of your database instances in the current Region. See the following code: aws rds describe-db-instances --query "DBInstances[*].[DBInstanceIdentifier,DbiResourceId]” An IAM administrator user can access DB instances without explicit permissions in an IAM policy. For more information, see Create an IAM User. To restrict administrator access to DB instances, you can create an IAM role with the appropriate, lesser-privileged permissions and assign it to the administrator. Don’t confuse the rds-db: prefix with other RDS API operation prefixes that begin with rds:. You use the rds-db: prefix and the rds-db:connect action only for IAM database authentication. They aren’t valid in any other context. As of this writing, the IAM console displays an error for policies with the rds-db:connect action. You can ignore this error. If resource-id is set to * instead of the explicit resource ID, you can use the same policy for all databases in a Region. If it is explicit, you need a new policy for all read replicas or connecting to a restored backup. There is a trade-off with strict authorization control by not locking down the policy to a single cluster, but this feature can help to reduce effort. This post creates a new IAM user and attaches the policy to the new IAM user using the following AWS CLI commands. You do not need a console password or access keys for this feature. The user from the following example code has neither: aws iam create-user --user-name mydbuser aws iam attach-user-policy --policy-arn arn:aws:iam:123456789012:policy/database-login-mydbuser --user-name mydbuser Creating a database user After you create your IAM user and attach your IAM policy to the user, create a database user with the same name that you specified in the policy. To use IAM authentication with PostgreSQL, connect to the database cluster, create the database user, and grant them the rds_iam role. You can connect as any user that has CREATE USER permissions and execute the following statements: CREATE USER mydbuser WITH LOGIN; GRANT rds_iam TO mydbuser; Connecting With IAM database authentication, you use an authentication token when you connect to your database cluster. An authentication token is a string of characters that you use instead of a password. After you generate an authentication token, it’s valid for 15 minutes before it expires. If you try to connect using an expired token, the connection request is denied. Every IAM authentication token must be accompanied by a valid signature, that uses Signature Version 4. For more information, see Signature Version 4 Signing Process. The AWS CLI and the AWS SDK for Java can automatically sign each token you create. You can use the AWS CLI to generate the connection token. After you have a signed IAM authentication token, you can connect to an Amazon RDS database instance or an Aurora database cluster. Generating Token The authentication token consists of several hundred characters so it can be unwieldy on the command line. One way to work around this is to save the token to an environment variable, and use that variable when you connect. The following example code shows how to use the AWS CLI to get a signed authentication token using the generated-db-auth-token command, and store it in a PGPASSWORD environment variable: export RDSHOST="mypostgres-cluster.cluster-abcdefg222hq.us-east-1.rds.amazonaws.com" export PGPASSWORD="$(aws rds generate-db-auth-token --hostname $RDSHOST --port 5432 --region us-east-1 --username mydbuser)" In the preceding example code, the parameters to the generate-db-auth-token command are as follows: –hostname– The host name of the DB cluster (cluster endpoint) that you want to access. –port– The port number used for connecting to your DB cluster. –region– The Region in which the DB cluster is running. –username– The database account that you want to access. Connecting to the cluster using psql For the general format for using psql to connect, see the following code: psql "host=hostName port=portNumber sslmode=sslMode sslrootcert=certificateFile dbname=dbName user=userName" The parameters are as follows: host – The host name of the database cluster (cluster endpoint) that you want to access. port – The port number used for connecting to your database cluster. sslmode – The SSL mode to use. For more information, see Using SSL with a PostgreSQL database Instance on the PostgreSQL documentation website. It is recommended to use sslmode to verify-full or verify-ca. When you use sslmode=verify-full, the SSL connection verifies the DB instance endpoint against the endpoint in the SSL certificate. You can use verify-full with RDS PostgreSQL and Aurora PostgreSQL cluster and instance endpoints. For Aurora PostgreSQL reader and custom endpoints, use verify-ca. sslrootcert – The SSL certificate file that contains the public key. For more information, see Using SSL with a PostgreSQL database Instance. dbname – The database that you want to access. user – The database account that you want to access. The following example code shows using the command to connect, uses the environment variables that you set when you generated the token in the previous section: psql "host=$RDSHOST port=5432 sslmode=verify-full sslrootcert=/sample_dir/rds-combined-ca-bundle.pem dbname=dbName user= mydbuser" Connect to the cluster using pgAdmin You can use the open-source tool pgAdmin to connect to a PostgreSQL database instance. Complete the following steps: Find the endpoint (DNS name) and port number for your database Instance. On the RDS console and then choose Databases. From the list of your database instances, choose the PostgreSQL database instance name. On the Connectivity & security tab, record the endpoint and port number. You need both the endpoint and the port number to connect to the database instance. The following screenshot shows the endpoint and port number from the database instance details. Install pgAdmin from the pgAdmin You can download and use pgAdmin without having a local instance of PostgreSQL on your client computer. Launch the pgAdmin application on your client computer. Under Dashboard, choose Add New Server. In the Create – Serversection, under General, for Name, enter a name to identify the server in pgAdmin. Deselect Connect now? Under Connection, For Host, enter the endpoint. For example, this post enters mypostgresql.abcdefg222hq.us-east-1.rds.amazonaws.com. For Port, enter the assigned port. For Username, enter the user name that you entered when you created the database instance. As on optional but recommended step, under SSL, change the SSL mode. As part of this optional step, also enter the path of the certificate (downloaded earlier for SSL certificate verification) based on SSL mode selected. It is recommended to use sslmode to verify-full or verify-ca. When you use sslmode=verify-full, the SSL connection verifies the DB instance endpoint against the endpoint in the SSL certificate. You can use verify-full with RDS PostgreSQL and Aurora PostgreSQL cluster and instance endpoints. For Aurora PostgreSQL reader and custom endpoint use option verify-ca. Choose Save. For information about troubleshooting, see Troubleshooting Connection Issues. After you create the server, connect to it using the temporary token that AWS CLI returned to get a signed authentication token using the generated-db-auth-token command. To access a database in the pgAdmin browser, choose Servers. Choose, the database instance. Choose, Databases. Choose the database instance’s database name. To open a panel where you can enter SQL commands, under Tools, choose Query Tool. Limitations There are limitations when you use IAM database authentication. Your application must generate an authentication token. Your application uses that token to connect to the database cluster. If you exceed the limit of maximum new connections per second, the extra overhead of IAM database authentication can cause connection throttling. For more information, see IAM Database Authentication for MySQL and PostgreSQL. Conclusion There are many advantages to using IAM authentication with your RDS for PostgreSQL and Aurora PostgreSQL databases. IAM database authentication eliminate the need to manage database-specific user credentials on your end. You do not need to maintain database-specific passwords, you can simply use IAM credentials to authenticate to database accounts. Network traffic to and from the database is encrypted using SSL. You can use IAM to centrally manage access to your database resources, instead of managing access individually on each database cluster. For applications running on Amazon EC2, you can use profile credentials specific to your EC2 instance to access your database instead of a password, for greater security. This post showed you how to use IAM authentication instead of a password with tools such as the psql command line tool and pgAdmin. You can adapt these instructions and processes to other tools. PostgreSQL Database Management System (formerly known as Postgres, then as Postgres95) Portions Copyright © 1996-2020, The PostgreSQL Global Development Group Portions Copyright © 1994, The Regents of the University of California pgAdmin Copyright (c) 2002 – 2017, The pgAdmin Development Team About the Author Ajeet Tewari is a Solutions Architect for Amazon Web Services. He works with enterprise customers to help them navigate their journey to AWS. His specialties include architecting and implementing highly scalable distributed systems and leading strategic AWS initiatives. https://probdm.com/site/MTQ2NzQ

0 notes