There are many CDN players in the market. Have you ever wondered how you could build your own CDN that is cost efficient and still performs well? In this post we are going to see how you can build your own CDN in a reasonably optimal way.
The CDN that we will be building should meet the following goals:
- Fault tolerance
- Good performance
- Cost optimization
We will not try to reinvent the wheel; instead, we will build the CDN from existing tools, software, and services. We are going to use the following resources:
- Varnish [For caching the content]
- Amazon Route 53 [For intelligent routing of traffic]
- DigitalOcean [For VMs in datacenters across the globe]
- CentOS 7.5 [To be used as the OS on the VMs that run Varnish]
We will be using Varnish because it is widely used, free software that handles caching and delivery with good performance. Amazon's Route53 will be used for mapping and global load balancing. Route53 offers different ways to distribute traffic; for this POC, we will use geo-based load balancing. The Varnish software will run on DigitalOcean VMs.
Since this is just a POC and not a production-grade network, we will only use HTTP for delivery. If we wanted to use HTTPS, we might need to put nginx on port 443 for SSL termination.
Step 1: Design of CDN POP distribution.
We will have multiple servers deployed in different geographies. Below are the DigitalOcean datacenters available; we will use them to run our VMs.
Step 2: Making distribution groups and server deployment.
I will be using 4 servers for 4 groups:
- India server for Indian users
- Singapore server for users in the rest of Asia
- Amsterdam server for Europe
- New York server for North and South America
- Anyone who does not fit any of the above groups is mapped to the default New York server.
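The grouping above can be sketched as a small lookup function. This is only an illustration of the policy, not how Route53 implements it; the POP names and the `pick_pop` function are hypothetical, and it assumes the user's ISO country code and two-letter continent code are already known.

```python
# Hypothetical POP names; the groups mirror the list above.
POPS = {
    "india": "pop-india",
    "asia": "pop-singapore",
    "europe": "pop-amsterdam",
    "americas": "pop-new-york",
}
DEFAULT_POP = POPS["americas"]  # New York is the default location


def pick_pop(country_code, continent_code):
    """Return the POP for a user, given ISO country and continent codes."""
    # The country rule is checked first so India is not caught by the Asia rule.
    if country_code == "IN":
        return POPS["india"]
    if continent_code == "AS":
        return POPS["asia"]
    if continent_code == "EU":
        return POPS["europe"]
    if continent_code in ("NA", "SA"):
        return POPS["americas"]
    # Anyone who fits none of the groups goes to the default server.
    return DEFAULT_POP
```

For example, `pick_pop("JP", "AS")` picks the Singapore POP, while a user from Africa falls through to New York.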
Below is the mapping done in Route53.
DigitalOcean servers:
For caching and delivering the content, we have 4 servers deployed at 4 locations. One server is used as the origin server, which is the actual server hosting the web application.
You may also configure health checks so that if any of these servers is not reachable, an alternative server handles the request.
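The failover idea can be sketched as follows. This is not Route53's health-check mechanism, only a minimal illustration of the logic: probe the preferred server and fall back to the alternatives in order. The function names and the plain TCP probe on port 80 are assumptions made for the example.

```python
import socket


def is_reachable(host, port=80, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def choose_server(preferred, alternatives, probe=is_reachable):
    """Return the first reachable host, trying the preferred one first."""
    for host in [preferred, *alternatives]:
        if probe(host):
            return host
    # No server answered the probe.
    return None
```

The `probe` parameter exists so the selection logic can be exercised without touching the network, for example by passing a function that reads a table of known server states.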
Step 3: Configure the domain to point to the DNS load balancer.
We will be using the domain http://www.cdntest.xyz for testing the performance.
In this setup, DNS maps each end user to a CDN server according to the policy we have defined. Mapping is the core of a CDN. Because the Internet is so dynamic, the nearest server will not always give the best performance: other factors may affect the route, so the closest server might not be optimal. For this POC, however, we have defined a static map. A map can be based on different parameters, such as latency, distance, or load distribution. If you want the best mapping algorithm, you need to take multiple factors into consideration before choosing a path. The best path may also not stay best for long, so your mapping should be dynamic for the best results. That is, roughly, how Akamai's mapping works.
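To make the difference between a static and a dynamic map concrete, here is a minimal sketch of latency-based selection. It is an assumption-laden illustration, not Akamai's or Route53's algorithm: the `best_pop` function is hypothetical, and it expects you to supply recent round-trip-time samples (in milliseconds) per POP from your own measurements.

```python
from statistics import median


def best_pop(samples):
    """Pick the POP with the lowest median latency.

    samples maps a POP name to a list of recent RTT measurements in ms.
    POPs without any measurements are ignored.
    """
    scored = {pop: median(rtts) for pop, rtts in samples.items() if rtts}
    if not scored:
        return None
    # The median is used so that a single slow sample does not flip the choice.
    return min(scored, key=scored.get)
```

Re-running this selection as new measurements arrive is what makes the map dynamic; a real system would also weigh load and server health.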
Let's look at the site's performance without the CDN, using webpagetest.org:
1. From: New York, NY USA
2. From: Amsterdam, NL
3. From: Singapore
4. From: Mumbai, India
The following numbers were observed when testing through the CDN we configured:
1. From: New York, NY USA
2. From: Amsterdam, NL
3. From: Singapore
4. From: Mumbai, India
We observed that when the CDN servers are used for the test website, the page load time improved considerably from all four test locations.
Further Cost Optimizations:
In this POC, we used Amazon Route53 for geo-based distribution. You may also use Cedexis data to design your own algorithm for routing traffic. NS1 and Akamai GTM offer similar services. What I demonstrated in the POC using Route53 can also be achieved with BIND. This article explains the BIND setup for IP resolution with GeoIP; it uses the MaxMind database to find the location of IP addresses. You can save a substantial amount of money if you configure the load balancer with BIND. With the MaxMind integration, BIND becomes geo-aware, so you can resolve each request to the server in the same geographic location as the user. Most of the time, the closest server will be a reasonable choice.
I chose DigitalOcean for this demo because its API lets you easily scale up and down and save on cost. Depending on your budget, you may go with other providers. You can also add load balancers in each POP/datacenter to scale out.
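As a rough sketch of what scaling through the API could look like, the snippet below builds (but does not send) a request to create a droplet with the DigitalOcean API v2. The helper name is hypothetical, and the default size and image slugs are assumptions that may not match what is currently offered; check the API documentation for the valid values.

```python
import json
import urllib.request

API_URL = "https://api.digitalocean.com/v2/droplets"


def build_create_droplet_request(token, name, region,
                                 size="s-1vcpu-1gb", image="centos-7-x64"):
    """Build a POST request that would create one droplet.

    The size and image defaults are assumed slugs, used for illustration only.
    """
    body = json.dumps({
        "name": name,
        "region": region,
        "size": size,
        "image": image,
    }).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would actually create the VM (and incur cost), so the sketch deliberately stops at building the request.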
A few advantages of using Varnish:
- You may use it free of cost.
- Varnish 6 also supports compression.
- It is open source, so you may customize it if required.
- You can write your own rules in VCL files.
- Recent Varnish releases are very powerful, and help is available online in the form of tools, docs, and forums.
For now, I will stop here. I hope you will be able to build your own CDN after going through this post. Next time we will talk about MultiCDN and try to build one. Stay tuned!