Workloads commonly span multiple environments, which can include cloud environments and, often, existing data center infrastructure. For this reason, planning should cover network considerations such as intrasystem and intersystem connectivity, public IP address management, and domain name resolution. In this article, we’ll discuss five AWS best practices for planning your network topology successfully.
Using highly available network connectivity for your workload public endpoints
All endpoints in your infrastructure, and the routing to them, must be highly available. This can be achieved in a number of ways: DNS, content delivery networks (CDNs), API Gateway, load balancing, and reverse proxies can all contribute.
It’s key that all users of the workload have highly available connectivity. The following services all provide highly available public-facing endpoints: Amazon Route 53, AWS Global Accelerator, Amazon CloudFront, Amazon API Gateway, and Elastic Load Balancing (ELB). If your users access your application over the internet, as is often the case, use service API operations to confirm that internet gateways are used correctly, and check that the route table entries for the subnets hosting your application endpoints are correct. If, instead, your users access the application through your on-premises environment, make sure that connectivity between AWS and that environment is also highly available.
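As a rough illustration of the route table check described above, the following Python sketch flags subnets that have no active default route to an internet gateway. It assumes the input is the `RouteTables` list returned by the EC2 `describe_route_tables` API operation (for example via boto3) and that all tables belong to a single VPC. The function name `subnets_without_internet_route` is hypothetical and only serves this example.

```python
def subnets_without_internet_route(route_tables, subnet_ids):
    """Return the subnet IDs that lack an active 0.0.0.0/0 route to an internet gateway.

    Assumption: `route_tables` has the shape of the "RouteTables" list from
    EC2 describe_route_tables, and every table belongs to the same VPC.
    """
    main_table = None
    table_by_subnet = {}
    for table in route_tables:
        for association in table.get("Associations", []):
            if association.get("Main"):
                main_table = table
            elif "SubnetId" in association:
                table_by_subnet[association["SubnetId"]] = table

    def has_igw_default_route(table):
        if table is None:
            return False
        return any(
            route.get("DestinationCidrBlock") == "0.0.0.0/0"
            and route.get("GatewayId", "").startswith("igw-")
            and route.get("State") == "active"
            for route in table.get("Routes", [])
        )

    # Subnets without an explicit association fall back to the VPC's main route table.
    return [
        subnet_id
        for subnet_id in subnet_ids
        if not has_igw_default_route(table_by_subnet.get(subnet_id, main_table))
    ]
```

Passing the IDs of the subnets that host your public endpoints returns those that cannot reach the internet through an internet gateway. An empty list means every listed subnet has the expected route.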
When it comes to managing the domain names of your application endpoints, you should ensure you’re using highly available DNS. You can use Route 53 to manage your domain names or find an AWS Marketplace appliance that meets your specific requirements.
Useful resources:
What is Elastic Load Balancing?
Provisioning redundant connectivity between your private networks in the cloud and on-premises environments
To ensure redundant connectivity, use multiple AWS Direct Connect (DX) connections or VPN tunnels between separately deployed private networks. For high availability, terminate those connections at multiple DX locations. If you’re using multiple AWS Regions, ensure redundancy in at least two of them.
One of the main things to take into account when planning your network topology is ensuring that a redundant connection to your on-premises environment is available at all times. Depending on your availability needs, you may need redundant connections to multiple AWS Regions. Use service API operations to verify that Direct Connect circuits are used correctly.
It’s best practice to capture your current connectivity, for example Direct Connect connections, virtual private gateways, and AWS Marketplace appliances. Use service API operations to query the configuration of your Direct Connect connections and to collect the virtual private gateways that route tables reference.
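One way to turn the Direct Connect query above into a redundancy check is sketched below in Python. It assumes the input is the `connections` list returned by the Direct Connect `describe_connections` API operation, whose entries include `connectionId`, `connectionState`, and `location`. The function name and the shape of the summary it returns are hypothetical choices for this example.

```python
from collections import defaultdict


def direct_connect_redundancy(connections):
    """Group available Direct Connect connections by DX location.

    Assumption: `connections` has the shape of the "connections" list from
    the Direct Connect describe_connections API operation.
    """
    by_location = defaultdict(list)
    for connection in connections:
        # Only connections that are currently usable count towards redundancy.
        if connection.get("connectionState") == "available":
            by_location[connection["location"]].append(connection["connectionId"])

    return {
        "locations": dict(by_location),
        # True when available connections terminate at two or more DX locations.
        "multi_location": len(by_location) >= 2,
    }
```

If `multi_location` is false, every available connection terminates at the same DX location (or none is available), so a failure at that location would cut off the on-premises link.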
Useful resources:
AWS Direct Connect Resiliency Recommendations
Using Redundant Site-to-Site VPN Connections to Provide Failover
Ensuring IP subnet allocation accounts for expansion and availability
Take workload requirements into account, including future expansion and the allocation of IP addresses to subnets across Availability Zones. Amazon VPC IP address ranges need to be large enough to accommodate these requirements, including load balancers, EC2 instances, and container-based applications.
Your network topology plans should accommodate future growth, regulatory compliance, and integration with others. Growth is easily underestimated at the outset, regulatory requirements can change, and acquisitions or private network connections can be difficult to implement without proper planning.
AWS accounts and Regions should be selected based on your service, latency, regulatory, and disaster recovery (DR) requirements. First, identify your needs for regional VPC deployments: determine whether you’re deploying multi-VPC connectivity and whether regulatory requirements call for segregated networking. You should also clearly identify the size of your VPCs, making them as large as possible. Remember that the initial CIDR block allocated to your VPC cannot be changed or deleted. You can add additional non-overlapping CIDR blocks to the VPC, though this risks fragmenting your address ranges. While planning, allow for the use of Elastic Load Balancers, Auto Scaling groups, concurrent AWS Lambda invocations, and service endpoints.
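To make the sizing exercise more concrete, here is a minimal Python sketch that carves a VPC range into equally sized subnets, one per Availability Zone, and keeps the remaining blocks spare for growth. The CIDR range, the Availability Zone names, and the function names are illustrative assumptions, not recommendations. The only AWS-specific fact used is that AWS reserves five IP addresses in every subnet.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # AWS reserves the first four addresses and the last one


def plan_subnets(vpc_cidr, availability_zones, subnet_prefix):
    """Assign one subnet block per Availability Zone and return the unused blocks."""
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = list(vpc.subnets(new_prefix=subnet_prefix))
    if len(blocks) < len(availability_zones):
        raise ValueError("VPC range is too small for one subnet per Availability Zone")
    assigned = dict(zip(availability_zones, blocks))
    spare = blocks[len(availability_zones):]
    return assigned, spare


def usable_addresses(subnet):
    """Number of addresses left for workloads after the AWS-reserved ones."""
    return subnet.num_addresses - AWS_RESERVED_PER_SUBNET
```

For example, `plan_subnets("10.0.0.0/16", ["us-east-1a", "us-east-1b", "us-east-1c"], 20)` assigns three /20 blocks and leaves thirteen spare /20 blocks for later expansion. Each /20 subnet offers 4,091 usable addresses.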
Useful resources:
Single Region Multi-VPC Connectivity
Choosing hub-and-spoke topologies over many-to-many mesh
A hub-and-spoke model, like the one provided by AWS Transit Gateway, should be used if you have more than two network address spaces connected by VPC peering, AWS Direct Connect, or VPN. If only two networks are involved, you can simply connect them to each other. However, as the number of networks grows, the complexity of such meshed connections becomes untenable. AWS Transit Gateway provides an easy-to-maintain hub-and-spoke model that routes traffic across your multiple networks.
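The growth in complexity can be quantified with a little arithmetic, sketched here in Python. A full mesh of n networks needs one connection per pair of networks, n(n-1)/2 in total, while a hub-and-spoke model needs one attachment per network. This is an idealized count that ignores route table entries and bandwidth. The function names are hypothetical.

```python
def full_mesh_links(network_count):
    """Connections needed so that every network is linked directly to every other."""
    return network_count * (network_count - 1) // 2


def hub_and_spoke_links(network_count):
    """Attachments needed when every network connects only to a central hub."""
    return network_count
```

With two networks there is a single connection in a mesh, which is why connecting them directly is reasonable. With ten networks a full mesh needs 45 connections, whereas a hub needs only ten attachments.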
Enforcing non-overlapping private IP address ranges in all private address spaces where they are connected
IP address ranges of VPCs must not overlap when the VPCs are peered or connected via VPN. Similarly, avoid IP address conflicts between your VPCs and on-premises environments or other cloud providers. In addition, you need a reliable way to allocate private IP address ranges whenever they’re needed.
It’s best practice to monitor and manage your CIDR use. Evaluate your potential usage on AWS, add CIDR ranges to existing VPCs, and create VPCs to allow for planned growth. Use service API operations to collect current CIDR consumption (VPCs, subnets, and so on). Likewise, capture your current subnet usage: use service API operations to collect the subnets per VPC in each Region, record the current usage, determine whether any overlapping IP ranges have been created, and calculate the spare capacity.
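Once the CIDR blocks have been collected, for instance with the EC2 `describe_vpcs` and `describe_subnets` API operations, the overlap check and the spare-capacity calculation can be done with Python’s standard `ipaddress` module. The sketch below is one possible approach. The function names and the labels used for the networks are hypothetical, and `spare_addresses` assumes the subnets lie inside the VPC range and do not overlap each other.

```python
import ipaddress
from itertools import combinations


def find_overlaps(cidrs_by_name):
    """Return every pair of network names whose CIDR ranges overlap."""
    networks = {
        name: ipaddress.ip_network(cidr) for name, cidr in cidrs_by_name.items()
    }
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


def spare_addresses(vpc_cidr, subnet_cidrs):
    """Addresses in the VPC range that are not yet allocated to any subnet.

    Assumption: the subnets lie inside the VPC range and do not overlap.
    """
    vpc = ipaddress.ip_network(vpc_cidr)
    allocated = sum(ipaddress.ip_network(cidr).num_addresses for cidr in subnet_cidrs)
    return vpc.num_addresses - allocated
```

Calling `find_overlaps` with the CIDR blocks of every VPC and on-premises network you plan to connect lists the conflicting pairs before you set up peering or a VPN. `spare_addresses` shows how much of a VPC range is still unallocated.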