AWS Infrastructure Offerings

Having recently been confronted with the AWS learning curve, I quickly came to the realization that I was in over my head:

I don't know what a security group is, and at this point I'm too afraid to ask

AWS is ubiquitous in the infrastructure world, but an application developer like me can mostly get by without understanding its intricacies and subtleties. Ironically, engineers who work at some of the largest Internet companies in the world (e.g. Google, Microsoft, Facebook) may also find themselves lost when having to rely on public cloud offerings, because their experience lies with homegrown technologies.

The AWS reference documentation is well-written and plentiful, but lacks a narrative that takes the reader from basic to slightly more advanced concepts. This post is the result of my own learning experience and seeks to be a pragmatic, progressive introduction to AWS. In other words, this document is selective as to the technologies it covers and (lightly) opinionated. Feedback is welcome.

Regions

A region is a geographical area of the world where AWS servers are co-located. Regions have a broad location which is reflected in their name, e.g. “ap-south-1” (Asia Pacific South) or “us-west-2” (West Coast of the United States). Regions allow you to deploy your application to a location physically close to your users, with the goal of reducing the network roundtrip. The full list of AWS regions can be found in the AWS documentation.

Availability Zones

An availability zone (abbreviated AZ) is a physical subdivision of a region that has its own power source and network links to the public Internet, each of which is fully independent from those of other AZs in the same region. As their name indicates, AZs are intended to provide resilience within a region: running your application across multiple AZs should allow your users to keep using it when one AZ goes offline.

Even though they are physically independent, you should be able to make network requests across AZs without incurring a performance penalty: AZs are in close proximity and are interconnected by dedicated high-throughput, low-latency network links. Regions provide between 2 and 5 AZs each.

EC2

An Elastic Compute Cloud (EC2) instance is a virtual machine which provides core resources such as CPU, RAM, mass storage and networking. It is the base unit for running application code on the AWS platform. Unlike physical servers, EC2 instances are ephemeral: once an instance is terminated, it cannot be turned back on and, if needed, a distinct instance must be provisioned to replace it.

Types

EC2 instances have a type, which determines the amounts and attributes of each resource. Over time, the number of instance types has grown to cover a wide range of use cases. The nomenclature is fairly confusing and intricate, so you need to reason in terms of what the application code you run is actually going to need:

Comparing EC2 instances should be a science, but it turns out to be more of an art. As is the case when you provision a machine in the real world, there is a ton of flexibility and tradeoffs to consider. At a high level, you should first determine the primary bottleneck of your application and choose an appropriate letter-type, then benchmark it across a range of actual types to determine which one is right.

CPU

Not all EC2 instance classes run on the same CPU, so AWS uses the vCPU as a means to compare them — a vCPU generally corresponds to having exclusive access to a single thread of the underlying physical CPU. But there are exceptions to this rule: for instance, the T2 type provides a physical core as opposed to a single thread. More variables also need to be considered:

Mass storage

As noted earlier, an instance that was shut down cannot be brought back up. This is a particular concern if the instance has stored data locally that needs to be persisted past its termination. The top-level dichotomy in terms of mass storage is EBS v. local instance store:

Provisioning and Pricing

The default mode of provisioning EC2 instances is called “on-demand” and is the most intuitive one. The instance is provisioned on the fly and runs until it is shut down. The cost is based on the amount of time the instance was running: the amount of computing resources consumed is linear with the cost a customer pays.

Reserved instances provide the same range of options as the on-demand offering, but offer a lower cost in return for a lasting commitment: the customer declares a need for a given number of instances for up to 3 years. By being able to anticipate demand, AWS can deliver the supply more efficiently, and pass down the savings.

Spot is a different mode of provisioning, where instances that are neither reserved nor used on-demand are made available at a discounted price to workloads that are either stateless or highly fault-tolerant.

Spot instances run at a lower priority than on-demand or reserved instances: when a request is made for a number of spot instances, the request may be only partially fulfilled if there isn’t enough supply. Conversely, AWS reserves the right to reclaim Spot instances if they are needed to fulfill on-demand requests. This is why Spot instances make sense for workloads that can be interrupted rapidly: in exchange for decreased availability, Spot instances come at a very large discount compared to either of the other modes.

These provisioning modes are not exclusive to one another: it’s actually a common practice to “pad” a pool of reserved instances with on-demand or spot instances when the application is confronted with a spike that couldn’t be anticipated, e.g. large one-off migrations.
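The economics of mixing provisioning modes can be sketched with a few lines of arithmetic. The hourly prices below are hypothetical (not actual AWS pricing); they only illustrate the relative ordering of the three modes:

```python
# Hypothetical hourly prices (NOT actual AWS pricing), illustrating the
# relative cost of the three provisioning modes.
RESERVED = 0.06    # $/hour, discounted in return for a lasting commitment
ON_DEMAND = 0.10   # $/hour, pay-as-you-go baseline
SPOT = 0.03        # $/hour, deep discount, may be reclaimed at any time

def fleet_hourly_cost(reserved, on_demand, spot):
    """Blended hourly cost of a fleet mixing the three provisioning modes."""
    return reserved * RESERVED + on_demand * ON_DEMAND + spot * SPOT

# A steady base of 10 reserved instances, "padded" with 2 on-demand and
# 8 spot instances during an unanticipated spike:
mixed = fleet_hourly_cost(10, 2, 8)
all_on_demand = fleet_hourly_cost(0, 20, 0)
print(f"mixed fleet: ${mixed:.2f}/h vs all on-demand: ${all_on_demand:.2f}/h")
```

Even in this toy example, the mixed fleet runs the same 20 instances at roughly half the all-on-demand price, at the cost of the 8 spot instances being reclaimable at any time.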

AMI

An Amazon Machine Image (AMI) is a data bundle which packs the software required to launch an EC2 instance. Several operating system vendors (e.g. Microsoft, RedHat, Ubuntu) publish their operating systems as AMIs, which makes them trivial to use on AWS. Some AMIs may be subject to licensing restrictions or need to be purchased directly from the vendor.

It is common practice to build custom AMIs, which allows for the application to be installed and baked into the image ahead of it running in production. The resulting image may be kept private in case it includes proprietary software.

The typical build process of an AMI consists of provisioning an EC2 instance with a public AMI to use as a base, installing any custom software needed at runtime, and finally registering the resulting volume as an AMI. The process differs depending on whether you intend to use the AMI to image an EBS volume or a local instance store.

Launch Configuration and Templates

Provisioning a single EC2 instance is a tedious task, as you need to pick a wide range of parameters: region, AZ, type and AMI are required, and the many other options and parameters are cumbersome to specify every time. Since launching EC2 instances is something that can happen several times a day even for medium-sized setups, there is a clear need for that process to be automated.

Launch configurations capture the set of parameters needed to launch an instance. Launch templates serve the same purpose as launch configurations but are more capable in that they can be versioned. Modifying a template creates a new version where parameters may be changed, added, or removed. You can then launch EC2 instances from any version, but only one version can be the default one at a given time. Templates are the preferred way to launch EC2 instances in Auto-Scaling Groups.
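The versioning behavior of launch templates can be modeled in a few lines. This is a toy sketch, not the AWS API: the class and parameter names are illustrative.

```python
# Toy model (NOT the AWS API) of launch template versioning: every
# modification creates a new version, and exactly one version is the
# default at any given time.
class LaunchTemplate:
    def __init__(self, **params):
        self.versions = [dict(params)]  # version numbers are 1-based
        self.default_version = 1

    def add_version(self, **changes):
        """Create a new version from the latest one; a parameter set to
        None is removed, others are changed or added."""
        new = {**self.versions[-1], **changes}
        new = {k: v for k, v in new.items() if v is not None}
        self.versions.append(new)
        return len(self.versions)  # the new version number

    def launch_params(self, version=None):
        """Parameters an instance launched from `version` would use;
        defaults to the template's default version."""
        v = version or self.default_version
        return self.versions[v - 1]

tpl = LaunchTemplate(ami="ami-12345678", instance_type="m5.large")
v2 = tpl.add_version(instance_type="m5.xlarge")  # bump the instance type
tpl.default_version = v2  # new launches now use the bigger type...
# ...but version 1 remains available: tpl.launch_params(version=1)
```

The key property is that older versions stay launchable: rolling back a bad change is just a matter of pointing the default back at a previous version.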

Auto-Scaling Group

An Auto-Scaling Group (ASG) is tasked with managing a set of EC2 instances, with the goal of adjusting the number of instances up and down, as configured by a scaling policy. ASGs have the following properties and constraints:

The scaling policy of an ASG determines when and how the ASG will grow or shrink the pool of EC2 instances it manages. The simplest policy is a static one, where the upper and lower bounds are equal: in this case, the ASG will simply monitor the health of instances and bring up a new instance only when a previous one is deemed unhealthy. Beyond this simple use case, scaling policies can be configured in a variety of ways:

Lastly, be aware that multiple scaling policies can be active within a single ASG at once. When they conflict, AWS honors the one that keeps capacity the highest.
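That conflict-resolution rule is simple enough to sketch. Assuming each policy independently proposes a desired capacity, the winner is the largest proposal, clamped to the ASG's bounds:

```python
def desired_capacity(current, policy_proposals, min_size, max_size):
    """Resolve conflicting scaling policies: the proposal that keeps
    capacity the highest wins, clamped to the ASG's min/max bounds.
    With no active proposal, capacity is left unchanged."""
    target = max(policy_proposals) if policy_proposals else current
    return max(min_size, min(max_size, target))

# Three policies fire at once, proposing 4, 9 and 6 instances;
# the ASG is bounded to [2, 8], so it settles on 8.
print(desired_capacity(current=5, policy_proposals=[4, 9, 6],
                       min_size=2, max_size=8))
```

The clamping step matters: a runaway metric can never push the group past its configured maximum, nor can an aggressive scale-in drop it below the minimum.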

ASG v. Application Service
  1. A single EC2 instance has the ability to concurrently run as many services as the hardware can support, e.g. a web frontend, a database and a Kafka node.
  2. When an EC2 instance is managed by an ASG, all the applications running on the instance will scale up and down in the same way.
  3. However, one of the key value propositions of splitting complex applications into multiple services is to allow each to scale independently of the others: the Web tier presumably doesn't have the same scaling characteristics as the backing database.
  4. It is a common approach to define one ASG per service type, and to specialize the instance types and setup for the specific task they run.

Elastic Load Balancing

An Elastic Load Balancer (ELB) is tasked with fronting and dispatching network traffic. An ELB dispatches traffic to targets in a target group, which in the most common case is an ASG. An ELB can also distribute traffic to a set of IP addresses, or to AWS Lambda functions. ELBs can operate at two levels of the networking model:

Because it specializes in handling and routing network traffic, an ELB can augment an ASG in a couple of ways:

Additionally, the routing logic of an ELB can be configured in a few ways:

Lastly, and importantly, an ELB may be reachable from the public Internet but its IP shouldn’t be expected to remain static over time. All ELBs will be provisioned with a publicly resolvable DNS name (*.elb.amazonaws.com), which will remain constant throughout the lifespan of the ELB. This record can be used directly or can be CNAME’d to a more user-friendly form (api.example.com → api-1234567890.ap-south-1.elb.amazonaws.com).

Virtual Private Cloud

A Virtual Private Cloud (VPC) is a bit of an umbrella term for a set of related capabilities that are intended to isolate and protect the network traffic to and from EC2 instances. Even though AWS resources are generally shared between AWS customers, a VPC provides a logical separation of your AWS resources from other accounts. A VPC can span multiple availability zones but is limited to a single region.

There is a pre- and post-VPC world in AWS, but VPCs have been the default networking stack for EC2 instances since 2013. Concepts and resources introduced prior to the advent of VPCs are generally labelled as “Classic” in the AWS terminology. The rest of this post will focus on the VPC stack and avoid discussing the legacy stack.

Addresses, CIDR Block and Subnets

When it comes to networking, half of the problem is addressing: a VPC is assigned a range of at most 2¹⁶ private IP addresses, which cannot be routed over the public Internet. This range is broken down into disjoint subnets, each of which maps to one Availability Zone.

The common notation for these ranges is the CIDR notation, where a range is written as an IP address followed by a slash and a mask, e.g. 192.0.2.0/24. The value of the mask tells how many addresses are in the block. In the case of IPv4, which uses 32-bit addresses, a mask of 24 bits leaves you 8 variable bits: 2⁸ = 256 addresses. The lower the value of the mask, the larger the block.
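This mask arithmetic can be checked with Python's standard ipaddress module:

```python
import ipaddress

# A /24 leaves 32 - 24 = 8 variable bits, i.e. 2**8 = 256 addresses.
block = ipaddress.ip_network("192.0.2.0/24")
print(block.num_addresses)

# The lower the mask, the larger the block: a /16 holds 2**16 addresses.
print(ipaddress.ip_network("10.0.0.0/16").num_addresses)
```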

On AWS, it is common to have VPCs with a 172.x.0.0/16 address space, which in turn is divided into multiple /20 subnets – for instance, with 4 availability zones:

Name      CIDR           Number of Addresses   First Address   Last Address
VPC       172.x.0.0/16   65,536                172.x.0.0       172.x.255.255
Subnet 1  172.x.0.0/20   4,096                 172.x.0.0       172.x.15.255
Subnet 2  172.x.16.0/20  4,096                 172.x.16.0      172.x.31.255
Subnet 3  172.x.32.0/20  4,096                 172.x.32.0      172.x.47.255
Subnet 4  172.x.48.0/20  4,096                 172.x.48.0      172.x.63.255
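The same split can be computed with the ipaddress module (fixing "x" to 16 here, purely for the sake of a concrete example):

```python
import ipaddress

# Split a /16 VPC block into /20 subnets; "x" is fixed to 16 for the example.
vpc = ipaddress.ip_network("172.16.0.0/16")
subnets = list(vpc.subnets(new_prefix=20))[:4]  # keep one per AZ (4 AZs)

for net in subnets:
    # network, address count, first and last address of each subnet
    print(net, net.num_addresses, net[0], net[-1])
```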

Beyond private IPs, a Subnet can also be configured to supply a public IP address to the instances that are launched in it. If so, the network interface of the instance will be given a random publicly routable IP address from AWS’s pool. If you prefer that instances not get a public IP address, this behavior can be overridden in the launch template.

Route Tables and Internet Gateways

After addressing, the other half of networking is routing. VPCs have a routing table which governs the way traffic is routed within the VPC, as well as ingress and egress traffic. Intra-VPC traffic is kept within the VPC through a local route in the routing table: this route cannot be changed.

Egress traffic to the public Internet is generally directed at an Internet Gateway (IGW), which interfaces the VPC with the public Internet. Internet Gateways have two purposes:

By default, a VPC will have a route to the IGW that applies to all subnets in the VPC - such subnets are called public subnets. When that route is not present for a given subnet, the subnet is fully isolated from the Internet and is deemed private.

Even though directly accepting ingress traffic through an Internet Gateway is feasible, it’s a more common pattern to use ELBs to front the traffic and dispatch it to an ASG where the instances do not need to be provisioned with a public IP address.

Security Groups and Network ACLs

VPCs allow for fine-grained control over the traffic that is allowed to flow to and from instances. Network ACLs apply to one or many subnets and act like a standard firewall: they are tasked with filtering packets. They consist of matching rules that are evaluated in order: the first rule that matches the traffic determines whether the packet is allowed to proceed. ACLs for ingress and egress traffic are defined separately.
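The first-match semantics can be sketched as follows. This is an illustrative model, not the AWS API, and it matches on destination port only for brevity (real NACL rules also match on protocol and CIDR range):

```python
# Minimal sketch of Network ACL evaluation: rules are walked in order of
# rule number and the FIRST match decides the packet's fate.
def evaluate_acl(rules, port):
    for rule in rules:  # assumed sorted by rule number
        lo, hi = rule["port_range"]
        if lo <= port <= hi:
            return rule["action"]
    return "deny"  # NACLs end with an implicit catch-all deny

# Allow HTTPS in, deny everything else (the explicit deny-all at rule 200
# mirrors how AWS displays the terminal rule).
ingress_rules = [
    {"number": 100, "port_range": (443, 443), "action": "allow"},
    {"number": 200, "port_range": (0, 65535), "action": "deny"},
]
```

Because evaluation stops at the first match, rule ordering is significant: swapping the two rules above would deny all traffic, HTTPS included.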

Security Groups (SGs) also offer filtering capabilities but they act at the instance level: a given Security Group may only apply to select instances in a subnet, and they may span over instances across multiple subnets. Importantly, SGs also offer the flexibility to match traffic based on other SGs. This allows grouping instances by “type” on a network basis and enforcing a given network topology between different groups.

For instance, say you have a Web frontend which relies on two backend services: Backend Service 1 and Backend Service 2. Each backend service uses the same Database server. The Web frontend shouldn’t talk to the database directly. Additionally, let’s assume that Backend Service 2 is of a higher tier of importance than Backend Service 1, and should therefore avoid making calls to it:

Component          Security Group   Rules
Web frontend       sg-web           Allow egress to sg-be1
                                    Allow egress to sg-be2
Backend Service 1  sg-be1           Allow ingress from sg-web
                                    Allow egress to sg-db
                                    Allow egress to sg-be2
Backend Service 2  sg-be2           Allow ingress from sg-web
                                    Allow ingress from sg-be1
                                    Allow egress to sg-db
Database           sg-db            Allow ingress from sg-be1
                                    Allow ingress from sg-be2

Note that Security Group rules can only allow traffic: anything that doesn’t match an allow rule is implicitly denied, which is what keeps the Web frontend away from the Database and Backend Service 2 away from Backend Service 1. Some of these rules are redundant and could therefore be omitted, but the principle remains: Security Groups are powerful in that they can apply at once to multiple instances of the same kind.
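The topology can be expressed compactly as a sketch in code, using the fact that SG rules reference other SGs. This models ingress rules only (assuming egress stays at its default allow-all) and treats unmatched traffic as implicitly denied:

```python
# Ingress rules of the example topology: each Security Group lists the
# groups allowed to open connections to it. Anything not listed is
# implicitly denied (SG rules are allow-only).
ingress = {
    "sg-web": set(),                  # fronted by an ELB, not modeled here
    "sg-be1": {"sg-web"},
    "sg-be2": {"sg-web", "sg-be1"},
    "sg-db":  {"sg-be1", "sg-be2"},
}

def is_allowed(src, dst):
    """True if an instance in group `src` may connect to one in `dst`."""
    return src in ingress[dst]
```

A nice property of this encoding is that it stays valid as instances come and go: the rules name groups, not individual IP addresses.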

Default VPCs

As mentioned earlier, VPCs have become the default mode of managing EC2 instances and network setups. When you create a new account, a default VPC and subnets are provisioned in every AWS region. In each region, this VPC consists of a /16 CIDR block (65K private addresses), divided into as many /20 subnets (4K private addresses) as there are AZs.

All subnets will be public ones, and a default route will allow them to reach the Internet through the Internet Gateway. There will be a default Network ACL allowing all traffic to flow, and a default Security Group as well. You can choose to either take the default VPC and tune it to your needs, or to provision a new VPC from scratch.

Elastic IP Address

An Elastic IP Address (EIP) represents a single public IP address that is routable and reachable from the public Internet. An EIP can be provisioned within your account and assigned to an instance, which then becomes accessible on the Internet.

Unlike instances, which are ephemeral, EIPs are static and can be freely attached to a given instance, then detached and moved to a different one. When an EIP is attached to an instance, the public IP of that instance changes to be that specified by the EIP. EIPs will retain their associated public address until they are explicitly released.

The goal of EIPs is to provide resilience: in case of an issue with the underlying instance or the application it runs, the EIP can be reassigned to another instance running the same application. The new instance will immediately begin receiving connections directed at that IP (with the caveat that previous connections will be interrupted and need to be re-established).
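The failover flow can be sketched as a toy model (the class and instance IDs below are illustrative, not the AWS API):

```python
# Toy model of EIP failover: the address is stable, the instance behind
# it is not. 203.0.113.0/24 is a documentation-reserved range.
class ElasticIP:
    def __init__(self, address):
        self.address = address    # fixed for the lifetime of the EIP
        self.instance = None      # currently associated instance, if any

    def associate(self, instance_id):
        """Point the public address at a (new) instance; traffic to
        self.address now reaches that instance."""
        self.instance = instance_id

eip = ElasticIP("203.0.113.10")
eip.associate("i-primary")
# The primary fails; move the address to a standby running the same app.
eip.associate("i-standby")
```

Clients keep connecting to 203.0.113.10 throughout; only the instance answering behind it changes.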

Note that EIPs can be created for free, but will incur charges when they are not attached to a running instance (because they consume an address that is left unused).

Elastic Network Interfaces

An Elastic Network Interface (ENI) augments the concept of an EIP by adding more settings such as private IP addresses and Security Groups, while still remaining decoupled from instances. Just like EIPs, ENIs can be freely attached to and detached from instances.

An ENI can only be mapped to one instance at a time, but an instance can have multiple ENIs at once: the limit depends on the instance type. All EC2 instances come with a primary network interface, which is tied to the instance and cannot be detached.

An ENI possesses the following attributes:

An ENI can only be defined within a VPC, and more specifically within a Subnet: it will use private IPs that are in the CIDR block of that Subnet. You can choose to automatically obtain a private IP from the pool of the subnet, or specify one explicitly. If the Subnet is configured to allocate a public IP for each instance, creating an ENI will also give it a random public IP from AWS’s pool of public IP addresses.

Conversely, an ENI can only be associated with Security Groups from the VPC it belongs to.

Finally, an ENI can be attached to an Elastic IP address: this allows the same kind of failover scenario as with plain EIPs with the added benefit that the ENI also carries the Security Groups to the new instance.