[AWS Cost Saving] Use a combination of on-demand and spot instances in Auto-Scaling

While reading about spot instances in auto-scaling, I came up with an idea for using them in our production auto-scaling setup.

Before going further, let’s identify the possible issues with spot instances:

  • Whenever the spot price goes above our bid price, we lose all our spot instances.
  • If we keep raising our bid price to follow the spot price, we may end up paying more than the price of on-demand instances. There are cases where, in order to retain spot instances, people bid too high and spot prices touched $1000. The image below shows that spot users have spent more than $4 per hour for an m1.large instance, while the on_demand_price is just $0.240 per hour.


  • The spot price varies from one availability zone to another.


To tackle these issues, I came up with the following solution.

Create two auto-scaling groups:

Let’s create two auto-scaling groups: one with on-demand instances (running a minimal number of instances, i.e., min-size = max-size = 2) and another with spot instances (running the majority of our instances). Both auto-scaling groups can be attached to the same Elastic Load Balancer. Because the load balancer distributes requests in round-robin fashion, requests are served by both on-demand and spot instances.
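To get a feel for how traffic splits across the two groups, here is a minimal Python sketch that assumes plain round-robin over every registered instance. The instance names, the `serve` helper and the count of six spot instances are hypothetical; only the two on-demand instances come from the setup above.

```python
from collections import Counter
from itertools import cycle, islice

on_demand = ["od-1", "od-2"]                 # min-size = max-size = 2, as in the text
spot = [f"spot-{i}" for i in range(1, 7)]    # hypothetical: six spot instances


def serve(requests, instances):
    """Count requests per group under simple round-robin (an assumption)."""
    counts = Counter()
    for instance in islice(cycle(instances), requests):
        group = "on_demand" if instance.startswith("od-") else "spot"
        counts[group] += 1
    return dict(counts)


print(serve(800, on_demand + spot))
```

With 2 on-demand and 6 spot instances, a quarter of the requests land on on-demand instances. That quarter is also the capacity that survives if every spot instance is lost at once.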

Let’s handle scale-up:

For the on-demand auto-scaling group, we can set a scale-up policy that triggers when average CPU utilization crosses 70%, while for the spot auto-scaling group we can set a scale-up policy that triggers when average CPU utilization crosses 60%. Because the spot group reaches its threshold first, scale-up activity always happens in the spot auto-scaling group, and we don’t pay more.

Let’s handle scale down:

For scale-down, the on-demand auto-scaling group will scale down when average CPU utilization touches 20%, while the spot auto-scaling group will scale down when it touches 30%. Since the on-demand auto-scaling group has an essentially fixed number of instances, scale-down will also happen in the spot auto-scaling group.
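The effect of the staggered thresholds can be checked with a small Python sketch. It is only an illustration: the `THRESHOLDS` table and the `scaling_action` helper are hypothetical names, and the numbers assume the spot group scales up at 60% and down at 30% while the on-demand group uses 70% and 20%.

```python
# CPU thresholds (percent) per group; names are illustrative, values from the text.
THRESHOLDS = {
    "on_demand": {"scale_up": 70.0, "scale_down": 20.0},
    "spot": {"scale_up": 60.0, "scale_down": 30.0},
}


def scaling_action(group, avg_cpu):
    """Return the action a group's policy would take at a given average CPU."""
    limits = THRESHOLDS[group]
    if avg_cpu > limits["scale_up"]:
        return "scale_up"
    if avg_cpu <= limits["scale_down"]:
        return "scale_down"
    return "hold"


for cpu in (65.0, 25.0):
    print(cpu, {group: scaling_action(group, cpu) for group in THRESHOLDS})
```

At 65% only the spot group scales up, and at 25% only the spot group scales down, so the on-demand group stays at its fixed size unless the spot group cannot absorb the change.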

How to avoid losing spot instances and not end up paying more when spot prices > on-demand prices?

To achieve these objectives, I came up with the following pseudocode:

Step 1: Identify the availability zones and their spot prices by parsing the output of “ec2-describe-spot-price-history”.

Step 2: Pick the maximum spot price from the output of #1.

Step 3: Compute our bid price as 20% above the spot price, i.e., bid_price = spot_price + 20% of spot_price = 1.2 * spot_price.

Step 4: Using our bid price, create a spot auto-scaling group – attach it to our load balancer.

Step 5: Keep performing #1 every minute. If the difference between bid_price and spot_price is less than 20%, update the bid price: execute #3, create a new launch configuration, and update the auto-scaling group with the new launch configuration.

Step 6: Keep performing #1 every minute. If the difference between on_demand_price and spot_price is less than 20%, reduce the number of spot_instances and spin up the same number of on_demand_instances. For example, reduce the number of spot_instances by 2 and spin up 2 on_demand_instances. Make sure that 50% of our instances are running as on_demand_instances and the other 50% as spot_instances.

If the gap between on_demand_price and bid_price is less than 10%, make sure 75% of the load is handled by on_demand_instances, i.e., one-fourth of our instances are spot_instances and the rest are on_demand_instances. This can be achieved using the min_size, max_size and desired capacity parameters.

If the bid_price is greater than the on_demand_price, set the min_size and desired capacity of the spot_instances auto-scaling group to zero. This will make sure all instances are on_demand_instances and we are not paying more than the price of on_demand_instances.
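The arithmetic in Steps 2, 3, 5 and 6 can be sketched as a few pure Python functions. This is a minimal illustration under stated assumptions, not a definitive implementation: the function names and the zone names in the usage note are hypothetical, the "difference" is measured as a fraction of the spot price (Step 5) or of the on-demand price (Step 6), Step 6 compares the spot price against the on-demand price throughout, and fetching the price history from AWS is left out.

```python
from decimal import Decimal

MARGIN = Decimal("0.20")  # the 20% headroom used in Steps 3 and 5


def max_spot_price(prices_by_zone):
    """Step 2: highest current spot price across availability zones."""
    return max(prices_by_zone.values())


def bid_price(spot_price, margin=MARGIN):
    """Step 3, read as 'spot price plus 20% of the spot price'."""
    return spot_price * (1 + margin)


def needs_rebid(bid, spot_price, margin=MARGIN):
    """Step 5: True when the bid is less than 20% above the spot price."""
    return (bid - spot_price) / spot_price < margin


def min_on_demand_share(on_demand_price, spot_price):
    """Step 6: minimum fraction of the fleet to run as on-demand instances."""
    if spot_price >= on_demand_price:
        return Decimal("1")       # spot is no longer cheaper: all on-demand
    gap = (on_demand_price - spot_price) / on_demand_price
    if gap < Decimal("0.10"):
        return Decimal("0.75")
    if gap < Decimal("0.20"):
        return Decimal("0.50")
    return Decimal("0")           # only the fixed on-demand baseline is needed
```

For example, with hypothetical prices `{"zone-a": Decimal("0.026"), "zone-b": Decimal("0.031")}`, `max_spot_price` picks 0.031 and `bid_price` returns 0.0372. If the spot price later rises to 0.035, `needs_rebid` returns True. Against the $0.240 on-demand price quoted earlier, a spot price of 0.200 gives a minimum on-demand share of 50%, 0.230 gives 75%, and 0.250 gives 100%.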

We reduce the number of spot instances gradually as the gap between the spot and on-demand prices falls below 20% and then 10% because the auto-scaling process takes time (around 2-3 minutes) to launch an instance, attach it to the load balancer, pass health checks and start serving traffic. If we shut down all spot instances at once when spot_price > on_demand_price, the end-user experience would suffer. To avoid that, we suggest increasing or decreasing the number of instances in each group gradually.
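One way to picture the gradual shift is to plan it as a sequence of small moves. The Python sketch below is an assumption-laden illustration: `plan_shift` is a hypothetical helper, the batch size of 2 is borrowed from the example in Step 6, and the wait for new on-demand instances to pass health checks between moves is not modeled.

```python
def plan_shift(spot_count, on_demand_count, target_on_demand, batch=2):
    """Plan desired capacities while moving instances from spot to on-demand.

    Returns a list of (spot, on_demand) pairs, one per step. In practice each
    step would wait until the new on-demand instances are serving traffic
    before the matching spot instances are removed.
    """
    steps = []
    while on_demand_count < target_on_demand and spot_count > 0:
        move = min(batch, target_on_demand - on_demand_count, spot_count)
        on_demand_count += move   # bring up on-demand capacity first
        spot_count -= move        # then release the same number of spot instances
        steps.append((spot_count, on_demand_count))
    return steps


print(plan_shift(6, 2, 4))   # [(4, 4)]
print(plan_shift(6, 2, 8))   # [(4, 4), (2, 6), (0, 8)]
```

The total number of instances stays constant at every step, so capacity behind the load balancer does not drop while the mix changes.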

For further details, refer to the link below.


