AWS Certified Architect Associate Database Study Sheet

Amazon Databases

RDS

Amazon RDS (Relational Database Service) simplifies the setup, scaling, and operation of a relational database in AWS. It lets users spend more time on the application itself, while RDS offloads administrative tasks such as backups, patching, scaling, and replication.

Currently supported engines: MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, and Amazon Aurora. RDS is built on Amazon Elastic Block Store (EBS) and can scale up to 4 to 6 TB of provisioned storage and up to 30,000 IOPS.

Amazon RDS supports three Storage types:

- Magnetic: cost-effective storage, ideal for apps with light I/O requirements

- General Purpose (SSD): faster than magnetic and able to burst to meet spikes; good for small to medium-sized DBs

- Provisioned IOPS (SSD): designed for I/O-intensive workloads that depend on random I/O throughput

Min Size for SSD EBS: 1 GiB

Max Size for SSD EBS: 16 TiB

Amazon Aurora DB:

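Commercial-grade database that aims for the cost-effectiveness of open source databases (Aurora itself is not open source; it is compatible with open source engines such as MySQL). Up to 5x the performance of MySQL. An Aurora cluster consists of a primary instance for read/write traffic and Amazon Aurora Replicas, which are read-only. Aurora scaling: 2 copies of your data in each AZ, across a minimum of three Availability Zones, for 6 copies of your data.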

Backups and Restore

RPO – Recovery Point Objective: the maximum period of data loss, measured in time, that is acceptable in the event of a failure or outage.

RTO – Recovery Time Objective: the maximum amount of downtime that is permitted while recovering from backup and returning to normal operations.
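The two definitions above can be turned into a quick sanity check. This is a minimal sketch, not an AWS API: the function names and the example targets (1 hour RPO, 4 hour RTO) are assumptions chosen for illustration.

```python
from datetime import timedelta


def meets_rpo(recovery_point_interval: timedelta, rpo: timedelta) -> bool:
    """Worst-case data loss is the gap between two consecutive recovery points."""
    return recovery_point_interval <= rpo


def meets_rto(estimated_restore_time: timedelta, rto: timedelta) -> bool:
    """Downtime is acceptable only if the restore finishes within the RTO."""
    return estimated_restore_time <= rto


# Hypothetical targets: lose at most 1 hour of data, be back within 4 hours.
rpo = timedelta(hours=1)
rto = timedelta(hours=4)

print(meets_rpo(timedelta(hours=24), rpo))   # daily backups only
print(meets_rpo(timedelta(minutes=5), rpo))  # a recovery point every 5 minutes
print(meets_rto(timedelta(hours=2), rto))    # restore estimated at 2 hours
```

A daily backup on its own cannot satisfy a one-hour RPO, because up to a day of changes could be lost.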

Automated backups feature for RDS: enables point-in-time recovery of a DB instance. RDS takes a full daily backup (during your preferred backup window) and also captures transaction logs. Daily backups are retained for a default period of 7 days when the instance is created in the console (1 day via the API or CLI); the maximum retention period is 35 days. The backup occurs during a predefined 30-minute window.
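To make the retention window concrete, here is a simplified model of how far back a point-in-time restore could reach. The function names are hypothetical, and the model ignores details such as the short lag before the latest restorable time; only the 35-day maximum comes from the notes above.

```python
from datetime import datetime, timedelta

MAX_RETENTION_DAYS = 35  # maximum retention period noted above


def earliest_restore_time(now: datetime, retention_days: int) -> datetime:
    """Oldest point in time still covered by automated backups (simplified)."""
    if not 1 <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention_days must be between 1 and 35")
    return now - timedelta(days=retention_days)


def can_restore_to(target: datetime, now: datetime, retention_days: int) -> bool:
    """True if the target time falls inside the retention window."""
    return earliest_restore_time(now, retention_days) <= target <= now


now = datetime(2024, 3, 31, 12, 0)
print(earliest_restore_time(now, 7))
print(can_restore_to(datetime(2024, 3, 20), now, 7))   # older than 7 days
print(can_restore_to(datetime(2024, 3, 20), now, 35))  # inside a 35-day window
```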

** When you delete an RDS instance, all automated backups are deleted by default **

You are given the chance to create a final snapshot when you delete an RDS instance.

Manual snapshots, however, are not deleted.

Manual snapshots: can be taken at any time. They can only be restored to the point in time at which they were created, and they are kept until you explicitly delete them.

High Availability and Multi-AZ

Multi-AZ deployments let you run a DB deployment across multiple Availability Zones. The goal is increased availability, not performance. Failover in the event of an outage is fully automatic and requires no administrative intervention. Data is replicated from the primary DB instance to the standby instance using synchronous replication. On failover, the DNS record for the DB endpoint is updated to point to the new primary, so applications keep using the same endpoint name.

Amazon RDS will initiate a failover in the event of:

- Loss of availability in the primary AZ

- Loss of network connectivity to the primary

- Compute unit failure on the primary DB

- Storage failure on the primary DB
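The failover behavior can be pictured with a toy model. This is only an illustrative sketch: the class, its attributes, and the AZ names are made up, and nothing here calls AWS. It shows why a synchronous standby can take over without losing committed writes.

```python
class MultiAZDeployment:
    """Toy model of a primary and a standby in two Availability Zones."""

    def __init__(self, primary_az: str, standby_az: str):
        self.primary = {"az": primary_az, "data": {}}
        self.standby = {"az": standby_az, "data": {}}
        self.endpoint_target = primary_az  # where the DB endpoint resolves

    def write(self, key, value):
        # Synchronous replication: the write counts as done only after
        # both the primary and the standby have stored it.
        self.primary["data"][key] = value
        self.standby["data"][key] = value

    def read(self, key):
        return self.primary["data"].get(key)

    def failover(self):
        # The standby is promoted and the endpoint is re-pointed to it.
        self.primary, self.standby = self.standby, self.primary
        self.endpoint_target = self.primary["az"]


db = MultiAZDeployment("az-a", "az-b")
db.write("order-1", "paid")
db.failover()
print(db.endpoint_target, db.read("order-1"))  # az-b paid
```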

Read Replicas for Increased Performance and Horizontal Scaling

  • Read replicas are not for availability; they are for increased read performance
  • Scale beyond the capacity of a single DB instance for read-heavy workloads
  • Handle read traffic while the source DB instance is unavailable
  • Offload reporting queries to a replica instead of the primary
  • Changes on the primary are copied to replicas using asynchronous replication
  • Read replicas are available for these three RDS engines: MySQL, MariaDB, and PostgreSQL
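Because replication to a read replica is asynchronous, a read from the replica can briefly return stale data. The following toy model sketches that effect; the classes and the `replicate` helper are hypothetical and stand in for the engine's own replication.

```python
class Primary:
    def __init__(self):
        self.data = {}
        self.pending = []  # changes not yet shipped to the replica

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))


class ReadReplica:
    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key)


def replicate(primary: Primary, replica: ReadReplica) -> None:
    """Apply queued changes to the replica some time after the write."""
    for key, value in primary.pending:
        replica.data[key] = value
    primary.pending.clear()


primary, replica = Primary(), ReadReplica()
primary.write("report-total", 42)
stale = replica.read("report-total")  # None: the change has not arrived yet
replicate(primary, replica)
fresh = replica.read("report-total")  # 42 once replication catches up
print(stale, fresh)
```

This lag is one reason replicas suit reporting and other read-heavy work rather than reads that must see the latest write.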

 Multi-AZ RDS instances + Backups:

When Multi-AZ is used on an RDS instance, I/O is not suspended on the primary during a backup, since backups are taken from the standby.

AWS DB Security

 

Use IAM policies with fine-grained access that limit what DB administrators can do

Deploy RDS instances into a VPC private subnet

Restrict access to the DB using network ACLs

Restrict access with Security Groups

Rotate Keys and Passwords

Amazon Redshift Data Warehouse

OLTP – Online Transaction Processing: operations that frequently write and change data. These are the actions performed on standard DBs.

OLAP – Online Analytical Processing: used for data warehouses. Complex queries run against large datasets. "For example, where online transaction processing (OLTP) applications typically store data in rows, Amazon Redshift stores data in columns, using specialized data compression encodings for optimum memory usage and disk I/O."

Amazon Redshift is a fast, powerful, fully managed, petabyte-scale data warehouse service in the cloud. It gives fast querying over structured data using standard SQL commands, supporting interactive querying over large datasets.
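The row versus column distinction quoted above can be illustrated in a few lines. This is a conceptual sketch with made-up order data, not how Redshift is implemented: an analytical query that needs one field touches every whole record in a row layout but only one list in a column layout.

```python
# Row layout: each record is stored together (typical for OLTP).
rows = [
    {"order_id": 1, "region": "east", "amount": 120.0},
    {"order_id": 2, "region": "west", "amount": 80.0},
    {"order_id": 3, "region": "east", "amount": 50.0},
]

# Column layout: each attribute is stored together (typical for OLAP).
columns = {name: [row[name] for row in rows] for name in rows[0]}


def total_from_rows(rows):
    # Has to visit every full record just to read one field.
    return sum(row["amount"] for row in rows)


def total_from_columns(columns):
    # Reads only the "amount" column; other columns are never touched.
    return sum(columns["amount"])


print(total_from_rows(rows), total_from_columns(columns))
```

Values in a single column also share a type and tend to repeat, which is what makes the compression encodings mentioned in the quote effective.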

 

NoSQL Databases and Amazon DynamoDB

 

Traditional DB: tables have a predefined schema, with a table name, primary key, column names, and data types.

NoSQL DBs are non-relational databases; data is not stored in traditional tables. Example formats:

- Document DBs

- Graph stores

- Key/value stores

- Wide column stores

DynamoDB is an AWS NoSQL service. It is fully managed and extremely fast, with predictable performance achieved by automatically distributing the data and traffic for a table over multiple partitions. All data is stored on high-performance SSD drives. DynamoDB protects data by replicating it across multiple AZs within an AWS Region.

DynamoDB only requires that you define a primary key attribute; you do not need to define other attribute names and data types in advance. Each attribute in an item is a key/value pair and can be single-valued or multi-valued.

{
CarName = "Red5"
CarVendor = "Suzuki"
CarVIN = "12345678890abcdefg"
}
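In DynamoDB's low-level API, each attribute value is tagged with a type descriptor such as `S` (string), `N` (number), or `SS` (string set). The helper below is a hypothetical sketch that converts the car item into that shape; it covers only a few types, and the `Colors` attribute is an invented example of a multi-valued attribute.

```python
def to_attribute_value(value):
    """Convert a plain Python value to a DynamoDB-style typed value (subset)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers are sent as strings
    if (isinstance(value, (set, frozenset)) and value
            and all(isinstance(v, str) for v in value)):
        return {"SS": sorted(value)}  # multi-valued string set
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


car = {
    "CarName": "Red5",
    "CarVendor": "Suzuki",
    "CarVIN": "12345678890abcdefg",
    "Colors": {"red", "white"},  # hypothetical multi-valued attribute
}

item = {name: to_attribute_value(value) for name, value in car.items()}
print(item)
```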

Eventually consistent reads: when data is read, the response may not reflect the results of a recently completed write.

Strongly consistent reads: when this type of read is requested, DynamoDB returns a response that reflects the most up-to-date writes.
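In the DynamoDB `GetItem` API, the choice between the two read types is made per request with the `ConsistentRead` parameter, which defaults to eventually consistent. The sketch below only builds the request parameters so that it runs without AWS access; the helper name, the `Cars` table, and the use of `CarVIN` as the primary key are assumptions for illustration.

```python
def build_get_item_request(table_name, key, strongly_consistent=False):
    """Build keyword arguments for a DynamoDB GetItem call."""
    return {
        "TableName": table_name,
        "Key": key,
        "ConsistentRead": strongly_consistent,
    }


# Hypothetical table and key, reusing the car item from earlier.
key = {"CarVIN": {"S": "12345678890abcdefg"}}

eventual = build_get_item_request("Cars", key)
strong = build_get_item_request("Cars", key, strongly_consistent=True)

print(eventual["ConsistentRead"], strong["ConsistentRead"])

# With boto3 installed and credentials configured, the request could be sent as:
#   import boto3
#   response = boto3.client("dynamodb").get_item(**strong)
```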

 
