Historically at OVO all product teams shared the same AWS account for hosting their services. This made sense at the time, when the organisation was much smaller and people didn't have much experience with AWS. Nowadays we have moved to a model in which each team has one AWS account for its production services and at least one more for hosting non-production environments.
Giving each team their own accounts makes a lot of sense in terms of autonomy, reliability and security. However, as teams made the transition from the old world to the new, a lot of them struggled to get started. It can be quite daunting to be handed a completely bare AWS account and told "set it up however you like".
With that in mind, I've put together an AWS account template that should be a good starting point for most use cases. I'll guide you through the main features using diagrams, and at the end I'll provide a link to a CloudFormation template that encodes everything you've seen.
It's worth pointing out that this template is merely a reference to help get you started. It's likely that you'll need to tweak it to suit your particular needs.
The diagram above shows the scope of the template. It contains a VPC with some public and private subnets, a VPN tunnel to your corporate network and a peering connection to a VPC in another team's AWS account.
Note that the region (eu-west-1) and the various IP ranges in the diagrams were chosen arbitrarily for the sake of concreteness, so you'll need to change them as appropriate to your environment.
As an aside, if you're planning to set up multiple AWS accounts for your organisation, it's a good idea to keep a central record of their IP range allocations (a shared spreadsheet is enough). This avoids ending up with overlapping IP ranges in different accounts, which would stop you from peering their VPCs together in future.
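To make the overlap check concrete, here is a minimal sketch using Python's standard `ipaddress` module. The account names and the third range are hypothetical, invented purely to show what a clash looks like. In practice the allocations would come from wherever you keep your central record.

```python
import ipaddress
from itertools import combinations

# Hypothetical allocations, for illustration only.
ALLOCATIONS = {
    "team-a-prod": "10.128.0.0/19",
    "team-b-prod": "10.129.0.0/19",
    "team-c-prod": "10.128.16.0/20",  # deliberately overlaps team-a-prod
}


def find_overlaps(allocations):
    """Return sorted pairs of account names whose CIDR ranges overlap."""
    networks = {
        name: ipaddress.ip_network(cidr) for name, cidr in allocations.items()
    }
    return sorted(
        (a, b)
        for a, b in combinations(sorted(networks), 2)
        if networks[a].overlaps(networks[b])
    )


if __name__ == "__main__":
    for a, b in find_overlaps(ALLOCATIONS):
        print(f"overlap: {a} and {b}")
```

Running a check like this whenever a new range is allocated catches clashes before anybody tries to peer the VPCs.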
Public subnets are where you put any resources that you want to be accessible from the outside world, such as internet-facing load balancers for your services.
Private subnets, on the other hand, host anything that should not be directly accessible. This includes your service's EC2 instances, Fargate tasks, Lambdas and databases.
Depending on your security needs you may also want to have a third "super duper private" layer of subnets, completely isolated from the outside world, where you put your databases.
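If you want to experiment with how a VPC range divides into subnet tiers, the following sketch may help. It uses the example VPC range from the diagrams. The /22 subnet size and the choice of three subnets per tier (for example, one per availability zone) are assumptions made for illustration, not values taken from the template.

```python
import ipaddress


def plan_subnets(vpc_cidr, new_prefix, tiers, per_tier):
    """Carve the VPC range into equal blocks and hand them out tier by tier."""
    blocks = list(ipaddress.ip_network(vpc_cidr).subnets(new_prefix=new_prefix))
    needed = len(tiers) * per_tier
    if needed > len(blocks):
        raise ValueError(f"need {needed} subnets but only {len(blocks)} fit")
    plan = {}
    for tier_index, tier in enumerate(tiers):
        for n in range(per_tier):
            plan[(tier, n)] = str(blocks[tier_index * per_tier + n])
    return plan


if __name__ == "__main__":
    # Assumed layout: two tiers, three /22 subnets each.
    layout = plan_subnets("10.128.0.0/19", 22, ["public", "private"], 3)
    for (tier, n), cidr in layout.items():
        print(f"{tier}-{n}: {cidr}")
```

A /19 only holds eight /22 blocks, so adding a third tier with these sizes raises an error. That is a useful prompt to pick smaller subnets or a larger VPC range before you create anything.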
So what makes a subnet "public" or "private"? Two things:
- Resources created in public subnets are given a public IP address by default, so they are addressable from the outside world.
- Public and private subnets have different Network ACL rules which mean that public subnets accept traffic from the outside world and private ones don't.
Resources in public subnets access the internet through an Internet Gateway. This lives inside your AWS account but not inside the VPC - it needs to be explicitly "attached" to the VPC where you want to use it.
Resources in private subnets access the internet differently, because they don't have public IP addresses. They use a NAT Gateway, which you create in one of your public subnets.
Each subnet uses a route table to work out how to route traffic. These route tables will refer to the Internet Gateway or NAT Gateway as appropriate.
Depending on your needs, you might want a VPN tunnel between your corporate network and your VPC. This can be useful for a few reasons:
- If you have dashboards or other tooling that you don't want to expose to the internet, you can front them with an internal load balancer and access them from the office via the VPN.
- If you need SSH access to EC2 instances, you can use the VPN tunnel. However, it's not good practice to SSH into instances regularly as there should be no need. If you are using Fargate or Lambda as your compute platform, then of course there is nothing to SSH into anyway.
- If the services running in your AWS account need to talk to any services running in an on-premises datacenter, they can do this via the VPN.
Setting up the VPN will require some work by your networks team. Once you've set everything up on the AWS side, you'll download a configuration file, send it to them, and they'll set things up on their side. This configuration file contains a shared secret, so make sure to send it securely.
You might need to peer your VPC with a VPC in another team's account so that you can access their resources without going over the public internet.
One account has to request a peering connection, and the other account accepts it. This manual process can be automated using CloudFormation but it is a little bit fiddly:
- Run this CloudFormation template in the other account to create a cross-account role
- Run some CloudFormation in your account to request a peering connection
- CloudFormation will assume the cross-account role in the other AWS account and automatically accept the peering connection
One thing that catches everybody out the first time they set up a peering connection is the fact that routing needs to be configured in both VPCs. You need an entry in your route table so that you can make requests to services in the other account, and the other account needs a corresponding entry in its route table so that the response can be routed back to you.
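The two-sided requirement can be expressed as a small check. This is only a sketch: route tables are modelled as plain dictionaries from destination CIDR to a target label, and the `vpc-peering` label is an illustrative placeholder rather than a real AWS identifier.

```python
PEERING = "vpc-peering"  # illustrative label, not a real resource ID


def missing_peering_routes(local_cidr, local_routes, peer_cidr, peer_routes):
    """List the route table entries still needed for traffic to flow both ways."""
    missing = []
    if local_routes.get(peer_cidr) != PEERING:
        missing.append(f"local route table: {peer_cidr} -> {PEERING}")
    if peer_routes.get(local_cidr) != PEERING:
        missing.append(f"peer route table: {local_cidr} -> {PEERING}")
    return missing


if __name__ == "__main__":
    # Requests can leave our VPC, but the peer has no route for the responses.
    ours = {"10.129.0.0/19": PEERING}
    theirs = {}
    for problem in missing_peering_routes(
        "10.128.0.0/19", ours, "10.129.0.0/19", theirs
    ):
        print(problem)
```

In the example, only the peer's route table is reported as incomplete. This is the classic symptom where requests appear to hang because the responses have nowhere to go.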
Let's summarise what the route tables look like for the public and private subnets.
Public subnets:
- 10.128.0.0/19 (inside the VPC) - Local routing, no route table entry is needed for this
- 0.0.0.0/0 (everywhere else) - use the Internet Gateway
Private subnets:
- 10.128.0.0/19 (inside the VPC) - Local routing - again, no route table entry needed
- 10.0.0.0/9 (corporate network) - use the VPN Gateway
- 10.129.0.0/19 (other AWS account) - use the VPC peering connection
- 0.0.0.0/0 (everywhere else) - use the NAT Gateway
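To see how these entries interact, here is a small model of route selection: the most specific matching route (longest prefix) wins. The CIDR ranges are the example ones from this post, while the target names are just illustrative labels.

```python
import ipaddress

# Example ranges from this post; target names are illustrative labels.
PUBLIC_ROUTES = {
    "10.128.0.0/19": "local",
    "0.0.0.0/0": "internet-gateway",
}
PRIVATE_ROUTES = {
    "10.128.0.0/19": "local",
    "10.0.0.0/9": "vpn-gateway",
    "10.129.0.0/19": "vpc-peering",
    "0.0.0.0/0": "nat-gateway",
}


def route_for(destination, routes):
    """Pick the most specific (longest-prefix) route containing the destination."""
    address = ipaddress.ip_address(destination)
    matches = [
        ipaddress.ip_network(cidr)
        for cidr in routes
        if address in ipaddress.ip_network(cidr)
    ]
    if not matches:
        raise LookupError(f"no route to {destination}")
    best = max(matches, key=lambda network: network.prefixlen)
    return routes[str(best)]


if __name__ == "__main__":
    for ip in ("10.128.5.9", "10.20.0.1", "10.129.1.1", "52.95.110.1"):
        print(ip, "->", route_for(ip, PRIVATE_ROUTES))
```

Every address matches the `0.0.0.0/0` catch-all, so it is the prefix length that sends corporate and peered traffic to the right place rather than out through the NAT Gateway.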
Network ACLs are similar to firewall rules in a traditional datacenter. They are the first line of network security, providing a foundation on top of which you can use security groups to add more nuanced rules.
It's worth pointing out that, unlike security groups, Network ACLs are stateless. That means you need to provide separate rules for requests (e.g. an HTTPS request on port 443) and their responses (e.g. an HTTPS response on an ephemeral port between 1024 and 65535).
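The following toy model illustrates what "stateless" means in practice. It is a deliberate simplification that matches on direction and port only, ignoring protocol and IP ranges. The rule numbers and the client port are arbitrary values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    number: int     # lower numbers are evaluated first
    egress: bool    # False = inbound rule, True = outbound rule
    port_from: int
    port_to: int
    allow: bool


def is_allowed(rules, egress, port):
    """Evaluate rules in ascending number order; first match wins, default deny."""
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.egress == egress and rule.port_from <= port <= rule.port_to:
            return rule.allow
    return False


def https_round_trip(rules, client_port=50000):
    """The request must pass inbound on 443 AND the response must pass outbound."""
    request_ok = is_allowed(rules, egress=False, port=443)
    response_ok = is_allowed(rules, egress=True, port=client_port)
    return request_ok and response_ok


# Only the request is allowed: the response is dropped on the way out.
INBOUND_ONLY = [Rule(100, False, 443, 443, True)]

# Adding an outbound rule for ephemeral ports lets the response through.
WITH_RESPONSE = INBOUND_ONLY + [Rule(100, True, 1024, 65535, True)]

if __name__ == "__main__":
    print("inbound rule only:", https_round_trip(INBOUND_ONLY))
    print("with outbound ephemeral rule:", https_round_trip(WITH_RESPONSE))
```

A security group would allow the response automatically because it tracks connections. With a Network ACL, you have to add the second rule yourself.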
The Network ACL rules for the public and private subnets are summarised below.
Public subnets:
- Allow HTTP(S) requests from outside world to load balancers
- Allow load balancers to respond on ephemeral ports (1024-65535)
- Allow services inside VPC to make TCP requests to outside world and receive responses on ephemeral ports
Private subnets:
- Allow incoming traffic from public subnets (e.g. load balancers)
- Allow services to make TCP requests to outside world (e.g. to download Docker images)
- Allow outside world to respond on ephemeral ports
- Allow SSH connections (port 22) from corporate network
Note that nowadays we don't need to open any ports for NTP (Network Time Protocol). AWS provides the Amazon Time Sync Service, which is available inside the VPC.
This gist contains two CloudFormation templates.
cross-account-peering-role.yml is the template mentioned in the "VPC peering" section above. This template should be run in the other AWS account. It sets up a cross-account role so that CloudFormation can automatically accept the VPC peering connection.
vpc.yml sets up everything you've seen above: the VPC, subnets, route tables, VPN connection, VPC peering connection, etc.
Good luck and happy DevOps-ing!