On the 8th of November I went to this event, organized in Padua; surprisingly, the room was really packed!
This was the schedule:
- AWS Introduction and History
- AWS Infrastructure Services – Amazon EC2, Amazon S3
- AWS Infrastructure Services – Amazon EBS, Amazon VPC
- Security, Identity and Access Management
- AWS Databases: Amazon DynamoDB and Amazon RDS
- AWS Elasticity and Management Tools + Demo
1. AWS Introduction and History
The AWS experts gave an introduction to AWS, explaining how the services are grouped and how the infrastructure is layered:
Regions are delimited geographic areas, such as Dublin or Frankfurt. You might want your data to stay inside a specific region for legal reasons.
Availability Zones (AZs) sit inside the regions; they contain the Amazon data centers and do not share risk profiles, such as the same energy supplier or proximity to the same lake.
Edge Locations are used to serve content faster, such as static files like jquery.js.
Computing, Networking and Storage are the three biggest service families, and AWS also supports many databases.
2. AWS Infrastructure Services – Amazon EC2, Amazon S3
Then they showed us the AWS Management Console, a visual tool used to manage all the Amazon Web Services.
The first service is EC2, which lets you create virtual machines. These can be Windows or Linux machines, you can customize their size and administrative permissions, and the good part is that you pay only for what you use. The templates these virtual machines are launched from are called Amazon Machine Images (AMIs).
The snapshots of the virtual machines' disks are stored on S3, another Amazon service, for storage. You can use two types of storage:
- The Instance Store, which is volatile: its data is lost if the machine crashes or is stopped. It is limited to 10 GB.
- EBS, which is persistent and whose volumes can go up to 16 TB.
There is an easy way to retrieve instance metadata: you just have to fetch a special URL from inside the instance, as described here.
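As a sketch (plain Python standard library; the endpoint only answers from inside a running EC2 instance, since 169.254.169.254 is a link-local address):

```python
import urllib.request

# Well-known link-local endpoint for EC2 instance metadata.
METADATA_BASE = "http://169.254.169.254/latest/meta-data/"

def metadata_url(key):
    """Build the URL for a metadata key, e.g. 'instance-id' or 'local-ipv4'."""
    return METADATA_BASE + key

def fetch_metadata(key, timeout=2):
    """Fetch a metadata value; this only works from inside an EC2 instance."""
    with urllib.request.urlopen(metadata_url(key), timeout=timeout) as resp:
        return resp.read().decode()
```

On an instance, `fetch_metadata("instance-id")` would return something like the instance's identifier.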
You can also perform some tasks at machine boot, such as launching a script to update the software.
The billing can be:
- on demand: based on the instance size and the running time
- reserved: you commit to an instance for a period of time at a discount
- scheduled: the instance runs only at scheduled times
- spot: the instance runs only while its price stays below your maximum bid per hour
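The spot rule in particular boils down to a simple predicate. A toy illustration (not the AWS API; the prices are made up):

```python
def should_run_spot(current_spot_price, max_bid):
    """Run the workload only while the spot price stays below our bid ($/hour)."""
    return current_spot_price < max_bid

print(should_run_spot(0.03, 0.05))  # True  -> instance keeps running
print(should_run_spot(0.07, 0.05))  # False -> instance is reclaimed
```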
Then they talked about S3, a way to store content on AWS. Once stored, files can be accessed via HTTP or HTTPS, so they can serve the static part of websites. The billing is simpler than EC2: you pay for the space you actually use, unless you choose Glacier, which is cheaper but delays the availability of the files.
S3 also supports file versioning, but this feature must be enabled manually by the user.
Finally, S3 is object based: every file lives inside a bucket (by default up to 100 buckets per account), and a bucket is something like a top-level folder whose name is globally unique, since it becomes part of the object's URL.
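For example, since the bucket name is part of the public URL, an object's HTTP address can be built like this (a sketch; `my-photos` and the key are made-up names, and this is just one of the URL styles S3 supports):

```python
def s3_url(bucket, key, region="eu-west-1"):
    """Virtual-hosted-style URL for an S3 object."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{key}"

print(s3_url("my-photos", "2016/padua.jpg"))
# https://my-photos.s3.eu-west-1.amazonaws.com/2016/padua.jpg
```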
3. AWS Infrastructure Services – Amazon EBS, Amazon VPC
EBS stands for Elastic Block Store; it provides block storage for EC2. Volumes have their own lifecycle and come in several flavours, such as magnetic (limited to 1 TB) and general purpose SSD (up to 16 TB); the latter is more performant and also more expensive.
EBS differs from S3 in several ways: first of all, EBS is billed on the capacity you provision, not on how much you actually use; moreover, an EBS volume lives in a single Availability Zone and cannot be accessed from the Internet.
VPC stands for Virtual Private Cloud; it lets you choose the machines' private IP addresses, and a default VPC comes with all new accounts. Subnet masks from /16 to /28 can be used. Subnets can be public or private; usually the public ones host the web servers while the private ones host the databases. You can split the subnets across two AZs for reliability.
If a machine inside a private subnet needs to connect to the Internet, it must go through a NAT instance; Amazon offers a managed service called NAT Gateway for this.
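Python's standard `ipaddress` module makes the /16 to /28 range concrete. Here a made-up /16 VPC block is split into /24 subnets, one for a public tier and one for a private tier:

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")   # hypothetical VPC CIDR block
public, private, *_ = vpc.subnets(new_prefix=24)

print(public)    # 10.0.0.0/24 -> e.g. web servers
print(private)   # 10.0.1.0/24 -> e.g. databases
print(vpc.num_addresses, public.num_addresses)  # 65536 256
```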
Amazon provides two more levels of security:
- Security Groups: they are stateful, and their rules are evaluated as a whole, without any order.
- ACLs (Access Control Lists): they are stateless, and their rules are applied in order.
There is also a service called Route 53 that provides DNS.
4. Security, Identity and Access Management
AWS has received many security certifications and is currently used by many public institutions and private companies.
One way to improve security is to use bastions: SSH endpoints placed in front of a server, so that the server sends and receives data only through the bastion.
After we create our master (root) account on AWS, the advice is to never use it for day-to-day work; we should instead create administrative users with IAM (Identity and Access Management).
Besides this, we can federate with Google or other identity providers for access, or manage our platform via code using one of the many SDKs AWS provides.
To customize user permissions we can create policies in JSON format; with these we state what users can or cannot do. Permissions can be granted to users, groups and roles. Roles are assumed by machines and are identified by a token that expires automatically after some time. It is also possible to work cross-account: user B can be given a role in account A, so he can manage it without logging out and back in.
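A minimal sketch of such a policy, built here as a Python dict and serialized to JSON (the bucket name is made up; `Version`, `Statement`, `Effect`, `Action` and `Resource` are the standard IAM policy keys):

```python
import json

# Hypothetical policy: read-only access to a single S3 bucket.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": [
            "arn:aws:s3:::example-bucket",
            "arn:aws:s3:::example-bucket/*",
        ],
    }],
}

print(json.dumps(policy, indent=2))
```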
IAM works only within AWS; it cannot be used by other applications: for instance, you cannot use it to log into a running WordPress instance.
5. AWS Databases: Amazon DynamoDB and Amazon RDS
RDS is the AWS service for working with relational databases. It is an interface to engines like MySQL, SQL Server, Amazon Aurora, Oracle … It runs in a single AZ, but it can be configured to replicate automatically to another one, which is then used automatically and transparently as a failover. It is billed on demand and provides automatic backups or manual snapshots. From a snapshot you can create a replica in another region.
DynamoDB is a NoSQL database, similar to MongoDB. It runs only on SSD disks and is completely managed; you can even increase or decrease the provisioned read and write units. You can define indexes on table attributes, such as LSIs (Local Secondary Indexes) and GSIs (Global Secondary Indexes), to make searches faster.
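As a rough analogy in plain Python (not the DynamoDB API): a secondary index is just an extra lookup table maintained next to the main one, trading a little work on every write for fast reads on a non-key attribute:

```python
table = {}           # primary key -> item
index_by_city = {}   # secondary index on the non-key attribute "city"

def put_item(pk, item):
    """Store the item and keep the secondary index in sync."""
    table[pk] = item
    index_by_city.setdefault(item["city"], set()).add(pk)

def query_by_city(city):
    """Fast lookup by 'city' without scanning the whole table."""
    return [table[pk] for pk in index_by_city.get(city, set())]

put_item("u1", {"name": "Anna", "city": "Padua"})
put_item("u2", {"name": "Marco", "city": "Rome"})
print(query_by_city("Padua"))  # [{'name': 'Anna', 'city': 'Padua'}]
```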
6. AWS Elasticity and Management Tools + Demo
In the last session we were shown how AWS can scale automatically in case of heavy traffic or computation. First of all you must define a metric, like CPU usage, web traffic or latency; the service that collects these is CloudWatch. Then you can define an action to trigger when a machine stays over a threshold for a defined amount of time, for example:
- if CPU usage exceeds 90% for 10 minutes, launch another machine;
- if CPU usage goes below 10% for 10 minutes, shut one down.
It is possible to define a minimum and a maximum number of machines that can run at the same time.
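The scaling rule above, including the minimum/maximum bounds, can be sketched as a pure function (toy thresholds taken from the example; in reality CloudWatch alarms and Auto Scaling handle this):

```python
def desired_machines(current, cpu_pct, minutes_over, n_min=1, n_max=10):
    """Return the new machine count given sustained CPU usage."""
    if cpu_pct > 90 and minutes_over >= 10:
        return min(current + 1, n_max)   # scale out, capped at n_max
    if cpu_pct < 10 and minutes_over >= 10:
        return max(current - 1, n_min)   # scale in, floored at n_min
    return current

print(desired_machines(3, 95, 12))  # 4
print(desired_machines(3, 5, 12))   # 2
print(desired_machines(3, 50, 12))  # 3
```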
AWS also provides a load-balancing service called Elastic Load Balancing, used to distribute traffic. It pings the machines to determine their status (Health Check) and know whether each one is available.
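A toy sketch of what the load balancer does with those health checks: rotate requests over only the machines that passed the last check (round robin is one common strategy; the addresses are made up):

```python
import itertools

machines = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
healthy = {"10.0.0.11": True, "10.0.0.12": False, "10.0.0.13": True}

def targets():
    """Cycle round-robin over the machines that passed the health check."""
    pool = [m for m in machines if healthy[m]]
    return itertools.cycle(pool)

rr = targets()
print([next(rr) for _ in range(4)])
# ['10.0.0.11', '10.0.0.13', '10.0.0.11', '10.0.0.13']
```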