Horizontal, Vertical & Autoscaling with CloudFormation & Kubernetes (Part 1)
— CloudFormation, Kubernetes, IaC, EC2, AWS, Load Balancing, Microservices, Prometheus, Grafana — 7 min read
Horizontal scaling and vertical scaling are two different strategies used to increase the capacity and performance of a system.
Horizontal Scaling (Scale Out)
Horizontal scaling involves adding more instances of a system or components, such as servers, to a distributed network. This approach is often referred to as "scaling out."
How does it work?
- Add More Machines: Increase capacity by adding more servers or nodes to the existing system.
- Distributed Load: The workload is distributed across multiple servers or instances.
- Load Balancing: Load balancers are typically used to distribute traffic evenly among the instances.
Pros:
- Fault Tolerance: Improves fault tolerance because if one server fails, others can take over the load.
- Cost-Effective: Can be more cost-effective since you can use commodity hardware.
- Scalability: Capacity can keep growing by adding more machines, without the hard ceiling imposed by a single server's hardware.
- Flexibility: Allows for incremental upgrades and expansions.
Cons:
- Complexity: Can add complexity to the system in terms of configuration and management.
- Data Consistency: Maintaining data consistency and state synchronization across multiple nodes can be challenging.
- Network Latency: May introduce network latency as data needs to be communicated between different servers.
Use Cases:
- Web applications with varying and unpredictable traffic patterns.
- Distributed databases and microservices architectures.
- Systems that require high availability and fault tolerance.
Vertical Scaling (Scale Up)
Vertical scaling involves adding more power (CPU, RAM, disk space) to an existing server or system. This approach is often referred to as "scaling up."
How does it work?
- Increase Resources: Enhance the capacity of the existing server by upgrading its hardware components.
- Single Instance: The system continues to run on a single server or a limited number of servers, each with higher capacity.
Pros:
- Simplicity: Easier to implement and manage since you are only upgrading existing hardware.
- Consistency: There is no need to handle distributed data or load balancing, so maintaining a consistent state is simpler.
- Lower Latency: Reduced network latency as all operations are handled within a single machine.
Cons:
- Downtime: Typically requires downtime to upgrade hardware components.
- Limitations: There is a physical limit to how much you can upgrade a single machine.
- Cost: Can be more expensive due to the cost of high-end hardware.
Use Cases:
- Applications with steady and predictable workloads.
- Systems that require high performance from a single instance, such as databases or enterprise applications.
- Environments where simplicity and ease of management are prioritized over fault tolerance.
Horizontal Scaling: AWS Auto Scaling Group
This CloudFormation template creates an Auto Scaling Group (ASG) that horizontally scales EC2 instances based on their average CPU utilization.
AWSTemplateFormatVersion: "2010-09-09"
Description: "AWS CloudFormation Template for Horizontal Scaling with Auto Scaling Group"

Parameters:
  InstanceType:
    Type: String
    Default: t2.micro
    Description: EC2 instance type

Resources:
  # Define the Launch Configuration. Specifies the AMI ID and instance type for instances in the Auto Scaling Group.
  LaunchConfig:
    Type: AWS::AutoScaling::LaunchConfiguration
    Properties:
      ImageId: ami-0c55b159cbfafe1f0 # Replace with a valid AMI ID in your region
      InstanceType: !Ref InstanceType
      SecurityGroups:
        - !Ref InstanceSecurityGroup

  # Define the Auto Scaling Group. Manages a group of instances, adjusting the number of instances based on demand.
  # It is set with a minimum size of 1, a maximum size of 5, and a desired capacity of 2.
  AutoScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      LaunchConfigurationName: !Ref LaunchConfig
      MinSize: 1
      MaxSize: 5
      DesiredCapacity: 2
      VPCZoneIdentifier:
        - subnet-12345678 # Replace with your subnet ID
      TargetGroupARNs:
        - !Ref TargetGroup
      MetricsCollection:
        - Granularity: "1Minute"
      HealthCheckType: "EC2"
      HealthCheckGracePeriod: 300

  # Define the Scaling Policy. Uses target tracking scaling to adjust the number of instances based on average CPU utilization.
  ScaleUpPolicy:
    Type: AWS::AutoScaling::ScalingPolicy
    Properties:
      AutoScalingGroupName: !Ref AutoScalingGroup
      PolicyType: "TargetTrackingScaling"
      TargetTrackingConfiguration:
        PredefinedMetricSpecification:
          PredefinedMetricType: ASGAverageCPUUtilization
        TargetValue: 50.0

  # Define the Security Group. Allows SSH access to the instances.
  InstanceSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: "Enable SSH access"
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          CidrIp: 0.0.0.0/0

  # Define the Target Group for the Load Balancer. Used by the load balancer to route traffic to the instances.
  TargetGroup:
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    Properties:
      VpcId: vpc-12345678 # Replace with your VPC ID
      Port: 80
      Protocol: HTTP
      HealthCheckProtocol: HTTP
      HealthCheckPort: "80"
      HealthCheckPath: "/"
      Matcher:
        HttpCode: "200"
      TargetType: instance

Outputs:
  AutoScalingGroupName:
    Description: "Auto Scaling Group Name"
    Value: !Ref AutoScalingGroup
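The template above builds the group on AWS::AutoScaling::LaunchConfiguration. AWS documentation recommends launch templates over launch configurations for new Auto Scaling groups, so the following is a minimal sketch of the same two resources rewritten around AWS::EC2::LaunchTemplate. It assumes the rest of the template (parameter, security group, target group, scaling policy) stays unchanged; the logical name LaunchTemplate is an illustrative choice, not something from the original template.

```yaml
Resources:
  # Assumed replacement for LaunchConfig: same AMI, instance type, and security group.
  LaunchTemplate:
    Type: AWS::EC2::LaunchTemplate
    Properties:
      LaunchTemplateData:
        ImageId: ami-0c55b159cbfafe1f0 # Replace with a valid AMI ID in your region
        InstanceType: !Ref InstanceType
        SecurityGroupIds:
          - !GetAtt InstanceSecurityGroup.GroupId

  # Same group as before, but pointing at the launch template instead of LaunchConfig.
  AutoScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      LaunchTemplate:
        LaunchTemplateId: !Ref LaunchTemplate
        Version: !GetAtt LaunchTemplate.LatestVersionNumber
      MinSize: 1
      MaxSize: 5
      DesiredCapacity: 2
      VPCZoneIdentifier:
        - subnet-12345678 # Replace with your subnet ID
      TargetGroupARNs:
        - !Ref TargetGroup
      HealthCheckType: "EC2"
      HealthCheckGracePeriod: 300
```

Referencing LatestVersionNumber means that a change to the launch template data is picked up by the group on the next stack update.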
Vertical Scaling: Modify EC2 Instance Type
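This CloudFormation template creates a single EC2 instance and demonstrates vertical scaling: to scale up or down, update the stack with a different value for the InstanceType parameter. Changing the type of an EBS-backed instance stops and restarts it, so expect a brief interruption.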
AWSTemplateFormatVersion: "2010-09-09"
Description: "AWS CloudFormation Template for Vertical Scaling by Modifying EC2 Instance Type"

# Instance type parameter. Allows changing the instance type to scale vertically by selecting different instance sizes (e.g., t2.micro, t2.small, t3.medium).
Parameters:
  InstanceType:
    Type: String
    Default: t2.micro
    AllowedValues: [t2.micro, t2.small, t2.medium, t2.large, t3.micro, t3.small, t3.medium, t3.large]
    Description: EC2 instance type

Resources:
  # Define the EC2 Instance. Creates a single EC2 instance with a specified instance type.
  EC2Instance:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: !Ref InstanceType
      ImageId: ami-0c55b159cbfafe1f0 # Replace with a valid AMI ID in your region
      SecurityGroups:
        - !Ref InstanceSecurityGroup

  # Define the Security Group. Allows SSH access to the instance.
  InstanceSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: "Enable SSH access"
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          CidrIp: 0.0.0.0/0

Outputs:
  InstanceId:
    Description: "EC2 Instance ID"
    Value: !Ref EC2Instance
  InstanceType:
    Description: "EC2 Instance Type"
    Value: !Ref InstanceType
AWS Auto Scaling
AWS Auto Scaling primarily facilitates horizontal scaling by automatically adjusting the number of Amazon EC2 instances in an Auto Scaling group. It adds or removes instances based on current demand, so that you have the right amount of compute capacity to handle your application's load. When the group is combined with Elastic Load Balancing (ELB), the load balancer distributes incoming application traffic across the group's instances, which improves availability and reliability.
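The horizontal scaling template earlier defines a target group but no load balancer in front of it. The fragment below is a minimal sketch of the missing pieces, assuming an Application Load Balancer. The logical names LoadBalancerSecurityGroup, LoadBalancer, and HttpListener and the second subnet ID are placeholders introduced for illustration (an Application Load Balancer needs subnets in at least two Availability Zones).

```yaml
Resources:
  # Hypothetical security group that lets HTTP traffic reach the load balancer.
  LoadBalancerSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: "Allow HTTP to the load balancer"
      VpcId: vpc-12345678 # Replace with your VPC ID
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          CidrIp: 0.0.0.0/0

  # Application Load Balancer spanning two subnets in different Availability Zones.
  LoadBalancer:
    Type: AWS::ElasticLoadBalancingV2::LoadBalancer
    Properties:
      Type: application
      Scheme: internet-facing
      Subnets:
        - subnet-12345678 # Replace with your subnet IDs
        - subnet-87654321
      SecurityGroups:
        - !GetAtt LoadBalancerSecurityGroup.GroupId

  # Listener that forwards HTTP requests to the target group the ASG registers with.
  HttpListener:
    Type: AWS::ElasticLoadBalancingV2::Listener
    Properties:
      LoadBalancerArn: !Ref LoadBalancer
      Port: 80
      Protocol: HTTP
      DefaultActions:
        - Type: forward
          TargetGroupArn: !Ref TargetGroup
```

Note that the instance security group in that template only opens port 22, so it would also need an ingress rule for port 80 before traffic and health checks from the load balancer could reach the instances.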
AWS also lets you change an EC2 instance's type to get more CPU, memory, or storage. This form of vertical scaling is performed manually, usually in response to performance metrics. ELB can support it by routing traffic to instances that have been resized to larger instance types.
Kubernetes
Kubernetes facilitates horizontal scaling through the Horizontal Pod Autoscaler (HPA), which automatically scales the number of pods in a deployment or replica set based on observed CPU utilization or other custom metrics. HPA ensures that your application can handle varying loads by adding or removing pods as needed. The Cluster Autoscaler component can adjust the number of nodes in a Kubernetes cluster based on the resource requests of the pods. If pods cannot be scheduled due to insufficient resources, the cluster autoscaler will add more nodes to the cluster.
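As a rough illustration of the HPA described above, the manifest below targets 50% average CPU utilization, mirroring the target used in the CloudFormation scaling policy. The Deployment name web and the object name web-hpa are hypothetical. The sketch also assumes that a metrics source such as metrics-server is running in the cluster and that the pods declare CPU requests, since utilization is measured relative to those requests.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web # hypothetical Deployment to scale
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50 # percent of the pods' CPU requests
```

The controller computes the desired replica count as ceil(currentReplicas × currentMetricValue / desiredMetricValue). For example, 2 replicas averaging 100% CPU against the 50% target give ceil(2 × 100 / 50) = 4 replicas, which is within the maxReplicas limit of 5.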
Kubernetes supports vertical scaling through the Vertical Pod Autoscaler (VPA), which automatically adjusts the CPU and memory requests and limits for containers running in pods. VPA can recommend or automatically apply changes to the resource requests based on historical data and current usage patterns. While Kubernetes can manage the scaling of pods and of the resources within pods, the nodes themselves can also be upgraded (scaled vertically) by changing the underlying VM or hardware specifications. This is typically managed through the cloud provider (e.g., resizing the VMs behind GKE, AKS, or EKS).
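A minimal sketch of a VPA object in recommendation-only mode follows. VPA is not part of a default Kubernetes installation, so this assumes the VPA components from the Kubernetes autoscaler project are installed in the cluster. The names web and web-vpa and the resource bounds are hypothetical values chosen for illustration.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-vpa # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web # hypothetical Deployment to observe
  updatePolicy:
    updateMode: "Off" # only compute recommendations; "Auto" would apply them
  resourcePolicy:
    containerPolicies:
      - containerName: "*" # apply the bounds to every container in the pod
        minAllowed:
          cpu: 100m
          memory: 128Mi
        maxAllowed:
          cpu: "1"
          memory: 1Gi
```

With updateMode set to "Off", the recommendations appear in the object's status (for example via kubectl describe vpa web-vpa) and nothing is changed automatically. The VPA project advises against running VPA and HPA against the same CPU or memory metric for one workload.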
Comparison and Use Cases
AWS Auto Scaling provides horizontal scaling for EC2 instances, ensuring that your application can handle varying loads by adjusting the number of instances. Vertical scaling is also possible by resizing instances manually.
- Use Case: Ideal for automatically scaling EC2 instances based on demand. It's best suited for applications deployed on EC2 where you need to ensure that there is enough compute capacity to handle traffic spikes.
- Scaling Method: Primarily horizontal scaling with some support for manual vertical scaling.
Kubernetes offers robust support for both horizontal and vertical scaling. Horizontal Pod Autoscaler and Cluster Autoscaler handle horizontal scaling, while Vertical Pod Autoscaler manages vertical scaling. Kubernetes provides a flexible and powerful platform for scaling containerized applications.
- Use Case: Ideal for containerized applications where you need fine-grained control over the scaling of individual components (pods). Kubernetes excels in microservices architectures and large-scale applications requiring dynamic scaling and orchestration.
- Scaling Method: Supports both horizontal and vertical scaling of pods and nodes, providing a comprehensive solution for managing resource allocation and application performance.