We often receive calls from potential customers who are looking to create a multimedia-based application or website. To our surprise, almost all of them think that they need a CDN (Content Delivery Network) to provide the multimedia file service to their customers. However, this is generally not true, in large part because the media players on modern operating systems are capable of compensating for all but the worst network latencies and performance – and the worst latencies are often in the “last mile” connection between the viewer and the internet, which cannot be addressed by the CDN. The fact that CDNs are so popular is a testament to the power of their marketing, not the necessity of their technology. Let’s take a look at an example of why a CDN won’t deliver any benefit in many cases.
In our picture below, we have a typical multimedia user in New York City, watching content served by our servers at the MAE-West Network Access Point (NAP) in San Jose, California. We chose MAE-West for our data center because it is an internet hub with the best connections to other hubs around the country and the world, which minimizes the latency (delay) for data traveling back and forth between the user’s computer and our servers. This is important for interactive applications where the client PC has a lot of short conversations with the server. Our experience building the data center for NetSuite has borne this out: they still have only one data center to serve their customers worldwide after 10 years in operation. We shall see that this advantageous location on the internet is part of the reason why our customers can usually do without a CDN.
The user is connected to the internet through their ISP by a typical 384kbps DSL line. Other types of ISPs which offer faster service may improve the viewing experience, but we have found that despite claims of tens of megabytes of download speed, most ISPs cannot actually deliver that speed during prolonged media downloads. So, unless the internet is severely congested, the speed of the download is determined by the “last mile” connection, that is, the user’s connection to the public internet through their ISP. Internet congestion is relatively rare, since the internet protocol allows data to travel through multiple paths to get to a destination.
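
The last-mile argument is simple arithmetic: a download can go no faster than the slowest link on its path. Here is a minimal sketch in Python. The function names, the 1 Gbps server uplink, and the 1,000,000-byte payload are assumptions made for illustration; only the 384kbps DSL figure comes from the example above.

```python
def bottleneck_bps(*link_speeds_bps):
    """Return the speed of the slowest link on the path, in bits per second."""
    if not link_speeds_bps:
        raise ValueError("at least one link speed is required")
    return min(link_speeds_bps)


def download_time_s(size_bytes, *link_speeds_bps):
    """Seconds needed to move size_bytes across the slowest link on the path."""
    return size_bytes * 8 / bottleneck_bps(*link_speeds_bps)


if __name__ == "__main__":
    server_uplink_bps = 1_000_000_000  # assumed 1 Gbps uplink at the data center
    dsl_bps = 384_000                  # last-mile DSL line from the example
    seconds = download_time_s(1_000_000, server_uplink_bps, dsl_bps)
    print(f"{seconds:.1f} s to deliver 1,000,000 bytes")
```

However fast the server side is made, the result does not change until the DSL line itself gets faster.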

To start viewing the video, the user clicks “play” on the hosted website, and a request for the media travels across the internet to our server at our point of presence (POP), which takes a few moments to find the media and start sending it. Then the first packet wends its way back across the internet to the user’s computer. This whole process takes 0.052 seconds. After that, the media player software on the user’s computer receives packets of the video until it has stored up (“buffered”) enough to play the video without any interruptions, which happens at 20.91 seconds. This delay is usually determined by a desired maximum and minimum buffer size that the player calculates. After this point, playback begins, but the buffer continues to fill while the “whoa” message makes its way back to the server and the stream stops. Then, playback empties the buffer until the minimum size is reached, and the player requests more video data from the server. This cycle repeats until all the video data is sent.

Now, what does the CDN bring to the table? A CDN places the video on a server “near” the user, often in the same data center as the ISP equipment serving the user. Actually, since the CDN doesn’t know where the next user will be, it has to place the video on many servers around the country, which becomes expensive very quickly since it will store the same data many times.

When the user clicks “play”, the request is directed to the nearby CDN server, which can get the first packets of data to the user’s computer without the delays of sending it across long stretches of the internet: the CDN has reduced the latency dramatically. However, reducing the latency doesn’t improve the user’s experience very much: since most of the delay before play begins is the 20.8 seconds to fill the buffer, the video begins to play only imperceptibly faster with the CDN than with a single-point hosted solution such as ENKI, colocation, or another cloud vendor.
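
To put a number on “imperceptibly faster”, here is a back-of-the-envelope calculation in Python. The 0.052-second and 20.8-second figures come from the example above; the 0.005-second first-packet delay for the CDN is an assumed value chosen for illustration, as is the function name.

```python
def startup_delay_s(first_packet_s, buffer_fill_s):
    """Time from clicking play to the first frame: latency plus buffer fill."""
    return first_packet_s + buffer_fill_s


BUFFER_FILL_S = 20.8            # time to fill the buffer over the last mile
SINGLE_POP_LATENCY_S = 0.052    # first-packet delay from the single POP
CDN_LATENCY_S = 0.005           # assumed first-packet delay from a nearby CDN server

single_pop = startup_delay_s(SINGLE_POP_LATENCY_S, BUFFER_FILL_S)
cdn = startup_delay_s(CDN_LATENCY_S, BUFFER_FILL_S)
saving_s = single_pop - cdn
saving_pct = 100 * saving_s / single_pop

if __name__ == "__main__":
    print(f"single POP: {single_pop:.3f} s, CDN: {cdn:.3f} s")
    print(f"saving: {saving_s:.3f} s ({saving_pct:.2f}% of the wait)")
```

Under these assumptions the CDN saves 0.047 seconds, or roughly 0.2% of the wait before playback begins.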

If the latencies of the internet connection are sufficiently large, for example when the provider is isolated from NAPs (network access points) such as MAE-West (represented by the larger network hubs in the picture), they can lead to buffer underrun, which causes the video playback to pause. The different latencies from a variety of vendors, and their consequences, are compared in the chart below.

Some media player software uses adaptive buffer sizing to reduce the chance that this can happen. It watches how the buffer is filling up: if the buffer is filling too slowly or latencies are large, the software will buffer more of the video before playing; if the buffer fills quickly or latencies are small, it will buffer less video before starting playback.
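
One way to picture adaptive buffer sizing is the heuristic below. It is an illustrative sketch only: the function name, the formula, and every default value are our assumptions, and real players use their own (often unpublished) rules.

```python
def target_buffer_s(measured_bps, video_bps, latency_s,
                    base_buffer_s=5.0, floor_s=2.0, ceiling_s=30.0,
                    latency_weight=10.0):
    """Pick how many seconds of video to buffer before starting playback.

    Hypothetical heuristic: buffer more when the link has little headroom
    over the video bitrate or when latency is high, and less otherwise.
    """
    if measured_bps <= 0:
        return ceiling_s  # nothing is arriving yet, so be as cautious as allowed
    headroom = measured_bps / video_bps          # above 1 means the link outpaces playback
    target = base_buffer_s / headroom + latency_weight * latency_s
    return max(floor_s, min(ceiling_s, target))  # clamp to the allowed range


if __name__ == "__main__":
    fast = target_buffer_s(measured_bps=3_000_000, video_bps=300_000, latency_s=0.02)
    slow = target_buffer_s(measured_bps=330_000, video_bps=300_000, latency_s=0.3)
    print(f"fast, low-latency link: buffer {fast:.1f} s before playing")
    print(f"slow, high-latency link: buffer {slow:.1f} s before playing")
```

The clamp keeps the player from starting with almost nothing buffered on a very fast link, and from making the viewer wait indefinitely on a very slow one.
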
The CDN, by bypassing the public internet, can also reduce the effects of public internet congestion. Excess congestion lowers the transfer speed and can cause buffer underruns. However, as mentioned above, we see congestion relatively rarely. A more common problem that slows downloads is that the ISP’s infrastructure is overloaded due to excessive cost-cutting and lack of investment. This is especially prevalent with cable internet services, where everyone shares a “party line.” The CDN cannot help with this problem since it occurs between the CDN’s server and the user.
Since a CDN can be very expensive (10 times your hosting cost or more), when does it become a necessity? As your service grows, you will first run into bandwidth limitations if your software architecture runs the multimedia stream through a single server or database. However, this is not a reason to switch to a CDN, since you can rearchitect your application to use a federated file storage or delivery method with multiple servers collaborating to provide the streams. The next barrier to cross is the bandwidth limitation of your provider’s POP. Many services have solved this problem by rewriting their software again to allow deployment to multiple POPs, spreading the media files among the different locations, but sharing directory information so that the user’s browser can find the file in the correct location. However, if your total volume grows beyond what a few POPs can serve, you will need to move the files closer to the users. And that’s when a CDN becomes imperative.
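
The multi-POP arrangement with shared directory information might look like the following sketch. The class, its methods, the hash-based placement rule, and the example.com hostnames are all hypothetical; a real service would keep the directory in a shared database rather than in memory.

```python
import hashlib


class MediaDirectory:
    """Shared directory that records which POP holds each media file."""

    def __init__(self, pop_urls):
        if not pop_urls:
            raise ValueError("at least one POP is required")
        self._pops = list(pop_urls)
        self._locations = {}  # media_id -> base URL of the POP storing it

    def place(self, media_id):
        """Assign a file to a POP using a stable hash, and record the choice."""
        digest = hashlib.sha256(media_id.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._pops)
        self._locations[media_id] = self._pops[index]
        return self._locations[media_id]

    def locate(self, media_id):
        """Return the URL the user's browser should fetch the file from."""
        return f"{self._locations[media_id]}/media/{media_id}"


if __name__ == "__main__":
    directory = MediaDirectory([
        "https://pop-west.example.com",  # hypothetical POP hostnames
        "https://pop-east.example.com",
    ])
    directory.place("clip-0001.mp4")
    print(directory.locate("clip-0001.mp4"))
```

Because the website looks up the location before handing the browser a URL, files can be spread across POPs or moved between them without the user noticing.
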
For most of our customers, this will never happen, and even those who do reach that point will likely be able to go for years without needing a CDN. As an example, we have hosted a multimedia file sharing company that grew to 200,000 users with 600Mb/sec average throughput. They were able to serve all these users through two 4-core virtual instances (one running Java and one running PostgreSQL). Their cost was a tiny fraction of what a CDN would have charged them, and they served customers throughout North America, Latin America, and even Europe. (Note that since media players are relatively insensitive to latency, even serving customers on another continent from a single POP is quite feasible.)
