Every Cloud(Front) has a silver lining

A deep dive into some of the different options when configuring CloudFront to get the best out of it for your application.

Kirsty McDonald

Background

In a recent client engagement we were tasked with setting up Moodle, an open source PHP learning management system, on AWS. Our client will have users from all over Australia accessing video content via Moodle, so we decided that CloudFront would act as the content delivery network (CDN) service. AWS has edge locations in Melbourne, Sydney and Perth, and having more local edge locations helps to deliver content with lower latency.

After quickly building some lovely CloudFormation templates with all of our required AWS services, it became very apparent that CloudFront was not doing the CloudFront-y things it should: nothing was getting cached at the relevant edge locations, so content delivery was not getting any faster…hmmm.


Whitelisting Headers

One thing we had not considered in the initial configuration was whitelisting the headers. As the default behaviour for CloudFront is to not cache based on headers, it meant that no caching was happening.

To overcome this, we changed the CloudFormation template to whitelist the host header, and - hurrah - we began to see some initial caching on objects!

While this didn’t fix everything (it was soon realised that the application was emitting Cache-Control headers for dynamic content so that wasn’t going to get cached by default), the static content was now being cached.

Sample CloudFormation code:

```
DefaultCacheBehavior:
  AllowedMethods:
    - DELETE
    - GET
    - HEAD
    - OPTIONS
    - PATCH
    - POST
    - PUT
  DefaultTTL: 86400            # one day, in seconds
  ForwardedValues:
    QueryString: true
    Headers:
      - "Host"                 # the only whitelisted header
    Cookies:
      Forward: all
  TargetOriginId: elb
  ViewerProtocolPolicy: redirect-to-https
  Compress: true
```

Because a lot of the headers that were previously going through to Moodle were now no longer being captured (as we had only whitelisted the host header), we then enabled CloudFront logging in S3.
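As a rough sketch, enabling CloudFront logging to S3 in the same template might look something like the fragment below. The bucket name and prefix are placeholders rather than values from this project, and the bucket itself is assumed to already exist.

```yaml
DistributionConfig:
  Logging:
    # Hypothetical log bucket, referenced by its S3 domain name
    Bucket: example-cloudfront-logs.s3.amazonaws.com
    # Optional key prefix for the log files (placeholder value)
    Prefix: moodle/
    # Set to true to record cookies in the access logs
    IncludeCookies: false
```

The resulting log files include a result type for each request (for example hit, miss or error), which is handy when working out why a particular object is not being cached.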


Troubleshooting

After switching all this on, our client was confused as to why not all content was being cached. This was due to a few things:

  • The Cache-Control headers discussed above
  • A lot of the content loaded on the page had unique IDs (e.g. logins) in the URL, meaning that it wasn’t cacheable, as caching is based on URL path
  • Sometimes an asset had not been requested widely enough for every edge location to have its own cached copy

[Figure: CloudFront caching viewer results]


Custom Behaviours

To try and increase the amount of content being cached, we investigated custom behaviours. Custom behaviours use path patterns to match requests (for example, anything ending with .js, .css, .png or .jpg could have a blanket caching behaviour which matches before the default and ignores the rest of the path).

This means when CloudFront gets a request, the path is compared with path patterns in the order in which cache behaviours are listed. The first match determines which cache behaviour is applied to the request. There were several custom behaviours set up on different file types to try and improve caching.

The CloudFormation template would look like this when setting up one custom behaviour:

```
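DistributionConfig:
  CacheBehaviors:
    - PathPattern: "*.jpg"
      TargetOriginId: elb
      ViewerProtocolPolicy: redirect-to-https
      ForwardedValues:         # needed when no cache policy is attached
        QueryString: false     # assumed value for static images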
```
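Extending this to several file types, a minimal sketch might look like the fragment below. It is illustrative only: the TTL and the choice not to forward cookies or query strings are assumptions that suit static assets, not the settings used on this project.

```yaml
DistributionConfig:
  CacheBehaviors:
    # Behaviours are evaluated top to bottom; the first matching PathPattern wins.
    - PathPattern: "*.css"
      TargetOriginId: elb
      ViewerProtocolPolicy: redirect-to-https
      DefaultTTL: 86400        # assumed: one day, in seconds
      ForwardedValues:
        QueryString: false     # assumed: query strings ignored for stylesheets
        Cookies:
          Forward: none        # assumed: cookies not needed for static files
    - PathPattern: "*.js"
      TargetOriginId: elb
      ViewerProtocolPolicy: redirect-to-https
      DefaultTTL: 86400
      ForwardedValues:
        QueryString: false
        Cookies:
          Forward: none
  # Requests that match none of the patterns fall through to DefaultCacheBehavior.
```

Not forwarding cookies or query strings keeps the cache key small, so more requests can share the same cached object. That only makes sense where the origin returns the same file regardless of who is asking.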


Caching Error Messages

With all this tweaking, we thought we’d finally start seeing some better caching stats in the dashboard. Instead, as testing began, we saw a large number of 504 gateway time-out errors start to appear (likely due to overloading the EC2 instances). These had popped up occasionally in testing previously, but not at this level!

The only thing that had been changed in CloudFormation was CloudFront, but we didn’t think caching would cause these errors. As it turns out, CloudFront caches HTTP 4xx and 5xx error responses for a short period by default (10 seconds according to current AWS documentation). We lowered the caching time for 504 errors to 0 seconds to test, and to our relief the large number of 504 errors was no longer a problem.
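For reference, the 504 change described above can be expressed in CloudFormation roughly as follows. This is a sketch showing only the relevant property, on the assumption that it sits in the same DistributionConfig as the cache behaviours.

```yaml
DistributionConfig:
  CustomErrorResponses:
    # Do not cache 504 responses at the edge, so a transient origin
    # time-out is not replayed to later viewers.
    - ErrorCode: 504
      ErrorCachingMinTTL: 0
```

A TTL of 0 means every request that would have received a cached error goes back to the origin. That is fine for testing, but worth revisiting if the origin is genuinely overloaded.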


Summary

With its ability to provide significant performance improvements for end users at a fraction of the cost of tuning up the rest of your AWS stack, CloudFront is an extremely useful tool. Once you have an understanding of how your application is trying to cache items, it gives you a much better starting point to begin configuring CloudFront to get the best results.
