Architecture

Docker 2015 Recap – The good, the bad and the ugly.


Year in review

There is no question in my mind that Docker was, by far, the most disruptive technology of 2015. What barely registered on the radar for many in 2014 became something shiny in Q1, advanced to thinking material in Q2, reached “you have to be crazy to run that in production” by Q3, and is now on everyone’s three-year plan.

To make a long story short:

  1. Containers were a great idea 12 years ago, and Docker has brought them to Linux while managing to improve on the concept.
  2. That said, the actual implementation of containers in Linux is a disaster. If you can’t use Triton or another container technology, be prepared to step on a lot of mines.
  3. In proper hands and under proper circumstances (dev, QA, limited production use), Docker is already useful, so use it.
  4. Though it’s production-capable, it’s not production-ready. Parental supervision required.

Getting some perspective

Sun released Solaris 10 and Zones around 2005. The performance advantages of OS level virtualization over hardware level virtualization were and are obvious, but the real innovation came from the combination of Zones with complementary technologies like ZFS. On the simplest level, quota support, delegated datasets, LOFI mounts, and snapshotting capabilities added layers upon layers of functionality to containers.

By 2008, the only systems I had in dev, staging, or production that didn’t run on top of zones were Oracle RAC nodes. Every piece of code and every application server was deployed by sending an incremental ZFS snapshot of the container in staging to the servers in production. Effectively every server in production was running block level clones of the containers in staging.

It shouldn’t surprise anyone in the tech industry that we’ve come full circle as technology loves to do. Docker brings almost this exact pattern of shipping containers based on incremental file system changes to Linux but with some compelling twists and, yes, some troubling flaws.

The good

Container as an artifact

Developers rarely keep a clean house and by that I mean a clean working environment. Working on multitudes of projects, they’ll often have multiple versions of a runtime installed, not to mention any number of libraries. When software is written in a dirty environment, dependencies can be forgotten or met unintentionally, and that leads to “worked on my laptop” syndrome. One of the major advantages of containers is the filesystem-level isolation of processes. When developers work properly with containers, their software, runtimes, and dependencies are all isolated from any other project. The same code and dependencies that run in dev go to staging, and then possibly even to production.

Container as a process

One of the most brilliant conceptual changes Docker has, in contrast to what Sun had made possible with Zones and ZFS, is the idea of a container as a process. I don’t know if this was originally intended by design or a happy consequence of building containerization on top of cgroups, but it’s groundbreaking for several reasons:

  1. Running a single process per container has significantly lower resource requirements than running a traditional fat container like a Solaris Zone.
  2. Running a single process per container exposes significantly fewer attack vectors than running a traditional fat container.
  3. The paradigm of a container accepting STDIN and returning STDOUT/STDERR like a normal process brings the UNIX philosophy to new heights, allowing not just software but even very complex processes to be packaged as containers and piped together (see the sketch just after this list).
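To make the “container as a process” point concrete, here is a minimal sketch of piping data through a container exactly like a UNIX filter. This is my own illustration rather than anything from the original workflow; it assumes a local Docker CLI and the stock alpine image, and uses Python only to drive the pipe.

```python
import subprocess

# Pipe text through a container exactly like a UNIX filter:
# STDIN goes into the containerized process, STDOUT comes back out.
# Assumes a local Docker CLI and the "alpine" image are available.
text = "zones\ncontainers\ndocker\n"

result = subprocess.run(
    ["docker", "run", "--rm", "-i", "alpine", "sort"],
    input=text,            # becomes the container's STDIN
    capture_output=True,
    text=True,
    check=True,
)

print(result.stdout)       # sorted lines, produced inside the container
```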

API

While the lean, process-model container is one of the best innovations Docker brings to the table, it is not without its downsides. Docker containers are poorly suited for services which require multiple running processes, involve complex configurations, and/or work with many files/output channels. Mounting configuration, data, or log files from the Docker host can work around some of these issues, but that strays from the general goal of having a self-contained, self-reliant image which can be run anywhere.

The alternative is to pass some unwieldy combination of environment variables to the container at runtime. On the command line, this is infuriating after a while, but when managed by API, it becomes much more tractable. The API also makes it fairly straightforward to replace cloud resources with Docker resources, so configuration management tools like Chef, Puppet, Ansible, and Vagrant have already been able to provide reasonably mature support despite the still-growing adoption levels.
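As an illustration of configuring a container through the API rather than an ever-growing command line, here is a minimal sketch using the Docker SDK for Python (docker-py). The SDK itself, the image name, and the variable values are my assumptions, not part of any setup described above.

```python
import docker

# Talk to the local Docker daemon over its API instead of composing
# an unwieldy `docker run -e ... -e ...` command line by hand.
client = docker.from_env()

config = {
    "DB_HOST": "db.internal.example",   # hypothetical values, for illustration only
    "DB_USER": "app",
    "LOG_LEVEL": "info",
}

container = client.containers.run(
    "myorg/myservice:1.2.3",   # hypothetical image name
    detach=True,
    environment=config,        # configuration injected at runtime via the API
)
print(container.short_id)
```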

That said, the most innovative use of the Docker API has been Joyent’s Triton offering. By creating a virtual Docker endpoint as a gateway to the Joyent public cloud, they have become the only service that I’m aware of which runs Docker containers on bare metal without maintaining dedicated Docker Host machines. Having built and run a production SaaS based on Docker, I can’t stress enough how much of a game changer that is in terms of optimal resource utilization, and consequently in terms of price.

Layers

While ZFS technically enabled Zones to be built from incremental changes to a file system, there was really very little, if any, integration there. I suppose you could name snapshots, but I don’t think there was any way to get a history of the layers involved in creating the container like there is with Docker.

With ZFS+deduplication you could also achieve something along the lines of Docker’s caching and reuse of layers. To be honest, in most cases I preferred not to enable deduplication and to have another copy of my data in production. In development, however, if you want to run 100 containers based on Ubuntu, you are probably more than happy to reuse the Ubuntu layer.

Another compelling pattern for layer reuse is the data volume container. Assume a service relying on a database: you could store the initial database in an image and then run any number of data volume containers based on that image. Each container will contain a private version of the data but only store the changes from the original image, which is excellent for development, automated testing, etc.
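A minimal sketch of that pattern, again using the Docker SDK for Python (an assumption on my part; the image names are hypothetical): every container started from an image that carries the seeded data shares the read-only layers and only stores its own changes.

```python
import docker

client = docker.from_env()

# Hypothetical image that ships a seeded database inside its filesystem,
# so the data participates in the image's read-only layers.
SEED_IMAGE = "myorg/postgres-seeded:2015-12-01"

# Each container below reuses the shared seed layers; its writable
# copy-on-write layer stores only the changes made during the run.
for i in range(3):
    client.containers.run(
        SEED_IMAGE,
        detach=True,
        name=f"test-db-{i}",
    )
```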

With a properly constructed Dockerfile, layer caching can also significantly reduce the time spent downloading images that share already-downloaded layers, updating containers with new changes, etc. That comes with the caveat that layer caching is not infallible, and you may easily miss updating critical pieces of your software if you do it wrong; for examples, see here and here.
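When you suspect a stale cached layer, one blunt workaround is to force a clean rebuild. Sketched here with the Docker SDK for Python (my assumption; the tag is hypothetical):

```python
import docker

client = docker.from_env()

# Force a rebuild that ignores the layer cache and re-pulls the base
# image, e.g. when an old `apt-get update` result is baked into the cache.
image, build_log = client.images.build(
    path=".",                        # directory containing the Dockerfile
    tag="myorg/myservice:latest",    # hypothetical tag
    nocache=True,                    # ignore previously cached layers
    pull=True,                       # refresh the base image as well
)
print(image.id)
```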

…the bad, and the ugly

The ugly truth is that the Linux container technology underlying Docker is incapable of providing true containerization. Unlike Solaris Zones, which were designed from the beginning to provide complete isolation between containers, Linux containers are really just a combination of loosely coupled kernel features doing things they can, but weren’t necessarily meant to do, cobbled together with a bunch of proverbial duct tape.

Once, after a service upgrade, we experienced transient failures that crippled our system. It took about 6 hours to find a rogue instance of the previous application that had remained running even though the “container” it was running in had been deleted. Since there is no real container entity in Linux, you will often end up with orphaned file system mounts and, less often, broken iptables rules and rogue processes.

Another major issue is the lack of “C4I” solutions/patterns in the Docker space. In a fat host/VM environment, services rely on any number of base services provided by the host, including DNS resolution, email, logging, time synchronization, firewall, etc. In the lean environments of process-model containers, there is nothing to rely on. In many cases, organizations fall back to providing non-containerized solutions from the host. In other cases, privileged agent containers are deployed alongside application containers. Many times a combination of both is necessary, but none of these solutions is optimal.

Most importantly, Docker, when used improperly, can add numerous attack surfaces to any system. You must be extremely careful with access to the Docker daemon, whether via TCP or via socket. Helpers like docker-machine are so focused on providing the easiest out-of-box experience that it’s too easy to do something wrong. You must be vigilant regarding the trustworthiness and the maintenance of container contents. With Docker Hub and Docker, the instant gratification of pulling and running a basic instance of any software can quickly erase any common sense regarding the trustworthiness of the software inside. Has the software been updated, or am I opening myself to known exploits? Has the software been back-doored? Will the container be mining bitcoins for someone else using my resources?

In summary

Docker has both good and bad points, but the situation is surely improving with time. If Linux containers continue to be developed as a disconnected combination of resources and their controls, I wouldn’t expect a mature Linux-based solution for at least 10 years.

In the long run, a better and faster alternative would be to keep Docker’s conceptual innovations and API while replacing the underlying technology with adaptations of mature container solutions like Zones. Joyent is already playing a visionary role on that front. Their focus on cloud, however, means that there are no plans in sight to support a more standard zones-based Docker Host. I wish someone in the space would pick up the gauntlet there.

In short, if you can run on Triton/Solaris derivatives with Zones, that would be my first pick. If you have to run on Linux, tread carefully and resist the temptation for instant gratification; Docker on Linux can already be useful in many cases.

Triton Bare Metal Containers FTW!


If you haven’t heard of Docker, I’m not sure what cave you’ve been living in, but here’s the short story:

Hardware-level virtualization, like you are used to (VMware, VirtualBox, KVM, Xen, Amazon, Rackspace, Azure, etc.), is slow and awful. Virtualization at the OS level, where processes share isolated access to a single operating system kernel, is much more efficient and therefore awesome. Another way of saying OS-level virtualization is “containers,” such as have existed in FreeBSD and Solaris for over a decade.

Docker is hacking together a whole slew of technologies (cgroups, union filesystems, iptables, etc.) to finally bring the concept of containers to the Linux masses. Along the way, they’ve also managed to evolve the concept a little, adding the idea of a container as a very portable unit which runs as a process.

Instead of managing dependencies across multiple environments and platforms, an ideal Docker container encapsulates all the runtime dependencies for a service or process. Instead of including a full root file system, the ideal Docker container could be as small as a single binary. Taken hand in hand with the growing interest in developing reusable microservices, we have an amazing tech revolution on our hands.

That is not to say that everything in the Docker world is all roses. I did say that Docker is hacking together a slew of technologies, so the tidy picture that marketing and demos portray is not what you are likely to start with. Either way, tons of containers per host is great until you realize you are lugging them around on a huge, slow whale of a cargo ship.

  1. Currently, Docker’s level of isolation between containers is not that great, so security and noisy neighbors are issues.
  2. Containers on the same host can’t easily listen on the same ports, so you may have to do some spaghetti networking (see the sketch after this list).
  3. On top of that, if you are running Docker in Amazon, Google, Azure, etc., you are missing the whole point, which was to escape the hardware-level virtualization.
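To illustrate the port juggling in point 2, here is a small sketch with the Docker SDK for Python (my assumption; images and ports are arbitrary): two containers that both listen on port 80 internally have to be published on different host ports, and something else has to keep track of which is which.

```python
import docker

client = docker.from_env()

# Both web containers listen on port 80 internally; on a shared Docker
# host they must be published on different host ports.
web_a = client.containers.run("nginx", detach=True, name="web-a",
                              ports={"80/tcp": 8080})
web_b = client.containers.run("nginx", detach=True, name="web-b",
                              ports={"80/tcp": 8081})

# A proxy or service-discovery layer now has to know that web-a lives on
# host:8080 and web-b on host:8081 -- the "spaghetti" referred to above.
print(web_a.name, web_b.name)
```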

Joyent to the rescue!

Joyent is the only container based cloud provider that I’m aware of. They have been running the vast majority of my cloud instances (possibly yours as well) on OS level virtualization for years now (years before Docker was a twinkle in Shamu’s eye). As such, they are possibly the most experienced and qualified technical leaders on the subject.

They run a customized version of Illumos, an OpenSolaris derivative, with extremely efficient zone technology for their containers. In January (Linux and Solaris are Converging but Not the Way you Think), I wrote about the strides Joyent made allowing Linux binaries to run inside Illumos zones.

Triton May Qualify as Witchcraft

The love child of that work, announced as GA last week, was Triton: a Docker API compliant (for the most part) service running zone-based Docker containers on bare metal in the cloud. If running on bare metal weren’t enough of an advantage, Joyent completely abstracted away the notion of the Docker host (i.e. the cargo ship). Instead, your Docker client speaks to an API endpoint which schedules your bare metal containers transparently across the entire cloud.

Addressing each of the points I mentioned above:

  1. Zones in Illumos/Joyent provide complete isolation, as opposed to Linux-based containers, so no security or noisy neighbor problems.
  2. Every container gets its own public IP address with a full range of ports, so no spaghetti networking.
  3. No Docker host and no HW virtualization, so every container is running full speed on bare metal.

Going back to the boat analogy: in most clouds, Docker containers on Linux are still lugged around on that huge, slow cargo ship, while Docker containers as zones on bare metal in Joyent are not.

Enough of the FUD

I’m not a big fan of marketing FUD so I’ve been kicking the tires on the beta version of Triton for a while. Now with the GA, here are the pros and cons.

Pros:

  1. Better container isolation
  2. Better networking support
  3. Better performance
  4. No overhead managing a Docker host
  5. Great pricing (per-minute billing and much lower prices)
  6. User friendly tooling in the portal, including log support and running commands on containers using docker exec.

Cons:

  1. The API still isn’t fully supported, so things like build and push don’t work. You can mostly work around this using a Docker registry (see the sketch after this list).
  2. Lack of a Docker Host precludes using some of the patterns that have emerged for logging, monitoring, and sharing data between containers.
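The registry workaround from point 1, sketched with the Docker SDK for Python (my assumption; the registry URL, tag, and Triton endpoint are hypothetical placeholders): build and push against a local daemon, then run the pushed image through the Triton endpoint.

```python
import docker

# Build and push happen against a local Docker daemon and a private
# registry (hypothetical URL); only `run` needs to go through Triton.
local = docker.from_env()

image, _ = local.images.build(path=".", tag="registry.example.com/myapp:1.0")
local.images.push("registry.example.com/myapp", tag="1.0")

# With the Docker client pointed at the Triton endpoint (hypothetical URL
# below), the same pushed image can then be run on bare metal:
# triton = docker.DockerClient(base_url="tcp://us-east-1.docker.example.com:2376", tls=True)
# triton.containers.run("registry.example.com/myapp:1.0", detach=True)
```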

Summary

Docker is a game changer but it is far from ready for prime time. Triton is the best choice available today for running container based workloads in general, and for production Docker workloads specifically.

At Codefresh, we’re working to give you the benefits of containers in your development workflow without the headache of actually keeping the ship afloat yourselves.  Sign up for our free Beta service to see how simple we can make it for you to run your code and/or Docker containers in a clean environment. Take a look at my getting started walkthrough or contact me if you have any questions.

How To Setup a Multi-Platform Website With Cedexis


With the recent and ongoing DDoS attacks against GitHub, many sites hosted as GitHub Pages have been scrambling to find alternative hosting. I began writing this tutorial long before the attacks, but the configuration you find here is exactly what you need to serve static content from multiple providers like GitHub Pages, Divshot, Amazon S3, etc.

In my previous article, I introduced the major components of Cedexis and how they fit together to build great multi-provider solutions. If you haven’t read it, I suggest taking a quick look to get familiar with the concepts.

In this article, I’ll show you, step-by-step, how to build a robust multi-provider, multi-platform website using the following Cedexis components:

  • Radar- provides real-time user measurements of provider performance and availability.
  • Sonar- provides lightweight efficient service availability health-checks.
  • OpenMix- DNS traffic routing based on the data from Radar and Sonar.
  • Portal- UI for configuration and reporting.

I’ll also briefly cover some of the reports available in the portal.

Basic Architecture

OpenMix applications all have the same basic architecture and the same basic building blocks.

Preparations

For the purpose of the tutorial, I’ve built a demo site using the Amazon S3 and Divshot static web hosting services. I’ve already uploaded my content to both providers and made sure that they are responding.

Both of these services provide us a DNS hostname with which to access the sites.

Amazon S3 is part of the standard community Radar measurements, but as a recently launched service, Divshot hasn’t made the list yet. By adding test objects, we can eventually enable private Radar measurements instead.

Download the test objects from here:

I’ve uploaded them to a cedexis directory inside my web root on Divshot.

Configuring the Platforms

Platforms are, essentially, named data sets. Inside the OpenMix Application, we assign them to a service endpoint and the data they contain influences how Cedexis routes traffic. In addition, the platforms become dimensions by which we slice and dice data in the reports.

We need to define platforms for S3 and Divshot in Cedexis and connect each platform to their relevant data sources (Radar and Sonar).

Log in to the Cedexis portal here and click on the Add Platform button in the Platforms section.

We’ll find the community platform for Amazon S3 under the Cloud Storage category. It means that S3 performance will be monitored automatically by Cedexis’ Community Radar. You can leave the defaults on this screen:

After clicking next, we’ll get the opportunity to set up custom Radar and Sonar settings for this platform. We want to enable Sonar to make sure there are no problems with our specific S3 bucket which community Radar might not catch.

We’ll enable Sonar polling every 60 seconds (the default), and for the test URL I’ve put the homepage of the site.

We’ll save the platform and create another:

Divshot is somewhere in between Cloud Storage and Cloud Compute. It’s really only hosting static content so I’ve chosen the Cloud Storage category, but there is no real difference from Cedexis’ perspective. If they eventually add Divshot to their community metrics, it might end up in a different category.

Since Divshot isn’t one of the pre-configured Cloud Storage platforms, choose the platform “Other”.

The report name is what will show up in Cedexis charts when you want to analyze the data from this platform.

The OpenMix alias is how OpenMix applications will refer to this platform. Notice that I’ve called it divshot_production. That is because Divshot provides multiple environments for development, staging, and QA. In the future, we may define platforms for other environments as well.

Since there are no community Radar measurements for Divshot, we prepare private measurements of our own in the next step.

We are going to add three types of probes using the test objects which we downloaded above.

First, the HTTP Response Time URL probe (if you are serving your content over SSL, as you should, choose the HTTPS Response Time URL probe instead). The HTTP Response Time probe should use the Small Javascript Timing Object.

Click Add probe at the bottom left of the dialog to add the next probes.

In addition to the response time probe, we will add the Cold Start and Throughput probes to cover all our bases.

The Cold Start probe should also use the small test object.

The Throughput probe needs the large test object.

Make sure the test objects are correct in the summary page before setting up Sonar.

Configure the Sonar settings for Divshot similarly to those from S3 with the exception of using the homepage from Divshot for the test URL. Then click ‘Complete’ to save the platform and we are done setting up platforms for now.

A nice thing about platforms is that they are decoupled from the OpenMix applications. That means you can re-use a platform across multiple OpenMix applications with completely different logic. It also means you can slice and dice your data using application and platform as separate dimensions.

For example, if we had applications running in multiple datacenters, we would be interested to know about the performance and availability of each data center across all our applications. Conversely, we would also want to know if a specific application performs better in one data center than another. Cedexis hands us this data on a silver platter.

Our First OpenMix Application

Open the Application Configuration option under the OpenMix tab and click on the plus in the upper right corner to add a new application.

We’re going to select the Optimal RTT quick app for the application type. This app will send traffic to the platform with the best response time according to the information Cedexis has on the user making the request.

Define the fallback host. Note that this does not have to be one of the defined platforms. This host will be used in case the application logic fails or there is a system issue within Cedexis. In this case, I trust S3 slightly more than Divshot so I’ll configure it as the fallback host.

In the second configuration step, I’ve left the default TTL of 20 seconds. This means that DNS resolvers will re-check every 20 seconds whether a different provider should be used to answer requests. Once Cedexis detects a problem, the maximum time before users are directed to a different provider should be approximately the same as this value.

In my experience, 20 seconds is a good value to use. It is long enough that users can browse one or two pages of a site before doing any additional DNS lookups and it is short enough to react to shorter downtimes.

Increasing this value will result in fewer requests to Cedexis. To save money, consider automatically changing TTLs via RESTful Web Services. Use lower TTLs during peaks, where performance could be more erratic, and use longer TTLs during low traffic periods to save request costs.

On the third configuration step, I’ve left all the defaults. The Optimal RTT quick app will filter out any platforms which have less than 80% availability before considering them as choices for sending traffic.

Depending on the quality of your services, you may decide to lower this number, but you probably do not want it any higher. Why not eliminate any platform that isn’t reporting 100% available? The answer is that RUM measurements rely on the sometimes poor-quality home networks of your users and, as a result, can be extremely finicky. Expecting 100% availability from a high traffic service is unrealistic, and leaving a threshold of 80% will help reduce thrashing and unwanted use of the fallback host.

Regarding eDNS, you pretty much always want this enabled since many people have switched to using public DNS resolvers like Google DNS instead of the resolvers provided by their ISPs.

Shared resolvers break assumptions made by traffic optimization solutions and the eDNS standard works around this problem, passing information about the request origin to the authoritative DNS servers. Cedexis has supported eDNS from the beginning but many services still don’t.

In the final step, we will configure the service endpoints for each of the platforms we defined.

In our case, we are just associating the hostname aliases that Amazon and Divshot gave us with the correct platform and its Radar/Sonar data.

In a more complicated setup, you might have a platform per region of a cloud and service endpoints with different aliases or CNAMEs across each region.

Note that each platform in the application has an “Enabled” checkbox. This makes it easy to go into an application and temporarily stop sending traffic to a specific platform. It is very useful for avoiding downtime in case of maintenance windows, migrations, or intermittent problems with a provider.

Choose the Add Platform button on the bottom left corner of the dialog to add the second platform, not the Complete button on the bottom right.

Define the Divshot platform like we did for S3 and click Complete.

You should get a success message with the CNAME for the Cedexis application alias. Click “Publish” to activate the OpenMix application right away.

Alternatively, clicking “Done” would leave the application configured but inactive. When editing applications, you will get a similar dialog. Leaving changes saved but unpublished can be a useful way to stage changes to be activated later with the push of a button.

Building a Custom OpenMix Application

The application we just created will work, but it doesn’t take advantage of the Sonar data that we configured. To consider the Sonar metrics, we will create a custom OpenMix app and by custom, I mean copy and paste the code from Cedexis’ GitHub repository. If you’re squeamish about code, talk to your account manager and I’m sure he’ll be able to help you.

The specific code we are going to adapt can be found here (Note: I’ve fixed the link to a specific revision of the repo to make sure the instructions match, but you might choose to take the latest revision.)

We only need to modify the definitions at the beginning of the script.

Let’s create a new application using the new script. Then we can switch back and forth between them if we want. We’ll start by duplicating the original app. Expand the app’s row and click Duplicate.

Switch the application type to that of a Custom Javascript app, modify the name and description to reflect that we will use Sonar data and click next.

Leave the fallback and TTL as is on the next screen.

In the third configuration step, we’ll be asked to upload our custom application.
Choose the edited version of the script and click through to complete the process.

As before, publish the application to activate it.

Adding Radar Support to our Service

At this point, Cedexis is already gathering availability data on both our platforms via Sonar. Since we used the community platform for S3, we also have performance data for that. To finish implementing private Radar for Divshot, we must include the Radar tag in our pages so our users start reporting on their performance.

We get our custom JavaScript tag from the portal (Note: If you want to get really advanced, you can go into the tag configuration section and set some custom parameters to control the behavior of the Radar tag, for example, how many tests to run on each page load, etc.)
Copy the tag to your clipboard and add it in your pages on all platforms.

Going Live

Before we go live, we should really test out the application with some manual DNS requests, disabling and enabling platforms, to see that the responses change, etc.
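One simple way to run those manual checks is to query the application alias directly. Here is a minimal sketch using the dnspython package (an assumption on my part; the alias below is a placeholder, substitute the CNAME the portal gave you):

```python
import dns.resolver  # assumes the dnspython package is installed

# Placeholder OpenMix application alias -- use the CNAME from the portal.
APP_CNAME = "2-01-xxxx-xxxx.cdx.cedexis.net"

# Query repeatedly (or from different networks) and watch which platform
# comes back, before and after disabling a platform in the application.
for rdtype in ("CNAME", "A"):
    try:
        answer = dns.resolver.resolve(APP_CNAME, rdtype)
        for record in answer:
            print(rdtype, record.to_text())
    except dns.resolver.NoAnswer:
        print(rdtype, "no records returned")
```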

Once we’re satisfied, the last step is to change the DNS records to point at the CNAME of the OpenMix application that we want to use. I’ll set the DNS to point at our Sonar enabled application.

A useful service to check how our applications are working is https://www.whatsmydns.net/. This will show how our application CNAMEs resolve from locations around the world. For example, if I check the CNAME resolution for the application we just created, I get the following results:

By and large, the Western Hemisphere prefers Divshot while the Eastern Hemisphere prefers Amazon S3 in Europe. This is completely understandable. Interestingly, there are exceptions in both directions. For example, in this test, OpenMix refers the TEANET resolver from Italy to Divshot while the Level 3 resolver in New York is referred to Amazon S3 in Europe. If you refresh the test every so often, you will see that the routings change.

Since this demo site isn’t getting any live traffic, I’ve generated traffic to show off the reports. First, the dashboard, which gives you a quick overview of your Cedexis traffic on login. Here we see that the majority of the traffic came from North America, and a fair amount came from Europe as well. We also see that, for the most part, the traffic was split evenly between the two platforms.

To examine how Cedexis is routing our traffic, we look at the OpenMix Decision Report. I’ve added a secondary dimension of ‘Platform’ to see how the decisions have been distributed. You see that sometimes Amazon S3 is preferred and other times Divshot.

To figure out why requests were routed one way or the other, we can drill down into the data using the other reports. First, let’s check the availability stats in the Radar Performance Report. For the sake of demonstration, I’ve drilled down into availability per continent. In Asia, we see shoddy availability from Divshot but Amazon S3 isn’t perfect either. Since we didn’t see much traffic from Asia, this probably didn’t affect the traffic distribution. Theoretically, a burst of Asian traffic would result in more traffic going to Amazon.

In Europe, Divshot generally showed better availability than Amazon, reporting 100% except for a brief outage.

In North America, we see a similar graph. As is to be expected, the availability of Amazon S3 in Europe is lower and less stable when measured from North America. Divshot shows 100% availability, which is also expected.

It’s important to note that the statistics here are skewed because we are comparing our private platform, measured only by our users to the community data from S3. The community platform collects many more data points in comparison to our private platform and it’s almost impossible for it to show 100% availability. This is also why we chose an 80% availability threshold when we built the OpenMix Application.

Next let’s look at the performance reports for response times of each platform. With very little traffic from Asia, the private performance measurements for Divshot are pretty erratic. With more traffic, the graph should stabilize into something more meaningful.

The graph for Europe behaves as expected showing Amazon S3 outperforming Divshot consistently.

The graph for North America also behaves as expected, with Divshot consistently outperforming Amazon S3.

So we’ve seen some basic data on how our traffic performs. Cedexis doesn’t stop there. We can also take a look at how our traffic could perform if we add a new provider. Let’s see how we could improve performance in North America by adding other community platforms to our graph.

I’ve added Amazon S3 in US-East, which shows almost a 30ms advantage over Divshot, though our private measurements still need to be taken with a grain of salt with so little traffic behind them. Even better performance comes from Joyent US-East. Using Joyent will require us to do more server management, but if we really care about performance, Cedexis shows that it will provide a major improvement.

Summary

To recap, in this tutorial I’ve demonstrated how to set up a basic OpenMix application to balance traffic between two providers. Balancing between multiple CDNs or implementing a hybrid Datacenter+CDN architecture is just as easy.

Cedexis is a great solution for setting up a multi-provider infrastructure. With the ever-growing Radar community generating performance metrics from around the globe, Cedexis provides both real-time situational awareness for your services and operational intelligence so you can make well-informed decisions for your business.

Unpacking Cedexis; Creating Multi-CDN and Multi-Cloud applications


Technology is incredibly complex and, at the same time, incredibly unreliable. As a result, we build backup measures into the DNA of everything around us. Our laptops switch to battery when the power goes out. Our cellphones switch to cellular data when they lose connection to WiFi.

At the heart of this resilience, technology is constantly choosing between the available providers of a resource with the idea that if one provider becomes unavailable, another provider will take its place to give you what you want.

When the consumers are tightly coupled with the providers, for example, a laptop consuming power at Layer 1 or a server consuming LAN connectivity at Layer 2, the choices are limited and the objective, when choosing a provider, is primarily one of availability.

Multiple providers for more than availability.

As multi-provider solutions make their way up the stack, however, additional data and time to make decisions enable choosing a provider based on other objectives like performance. Routing protocols, such as BGP, operate at Layer 3. They use path selection logic, not only to work around broken WAN connections but also to prefer paths with higher stability and lower latency.

As pervasive and successful as the multi-provider pattern is, many services fail to adopt a full stack multi-provider strategy. Cedexis is an amazing service which has come to change that by making it trivial to bring the power of intelligent, real-time provider selection to your application.

I first implemented Multi-CDN using Cedexis about 2 years ago. It was a no-brainer to go from Multi-CDN to Multi-Cloud. The additional performance, availability, and flexibility for the business became more and more obvious over time. Having a good multi-provider solution is key in cloud-based architectures and so I set out to write up a quick how-to on setting up a Multi-Cloud solution with Cedexis; but first you need to understand a bit about how Cedexis works.

Cedexis Unboxed

Cedexis has a number of components:

  1. Radar
  2. OpenMix
  3. Sonar
  4. Fusion

OpenMix

OpenMix is the brain of Cedexis. It looks like a DNS server to your users but, in fact, it is a multi-provider logic controller. In order to setup multi-provider solutions for our sites, we build OpenMix applications. Cedexis comes with the most common applications pre-built but the possibilities are pretty endless if you want something custom. As long as you can get the data you want into OpenMix, you can make your decisions based on that data in real time.

Radar

Radar is where Cedexis really turned the industry on its head. Radar uses a JavaScript tag to crowdsource billions of Real User Monitoring (RUM) metrics in real time. Each time a user visits a page with the Radar tag, they take a small number of random performance measurements and send the data back to Cedexis for processing.

The measurements are non-intrusive. They only happen several seconds after your page has loaded and you can control various aspects of what and how much gets tested by configuring the JS tag in the portal.

It’s important to note that Radar has two types of measurements:

  1. Community
  2. Private.

Community Radar

Community measurements are made against shared endpoints within each service provider. All Cedexis users that implement the Radar tag and allow community measurements get free access to the community Radar statistics. The community includes statistics for the major cloud compute, cloud storage, and CDN providers, making Radar the first place I go to research developments and trends in the Cloud/CDN markets.

Community Radar is the fastest and easiest way to use Cedexis out of the box and the community measurements also have the most volume so they are very accurate all the way up to the “door” of your service provider. They do have some disadvantages though.

The community data doesn’t account for performance changes specific to each of the provider’s tenants. For example, community Radar for Amazon S3 will gather RUM data for accessing a test bucket in the specified region. This data assumes that within the S3 region all the buckets perform equally.

Additionally, there are providers which may opt out of community measurements, so you might not have community data for some providers at all. In that case, I suggest you try to connect your account managers and get them included. Sometimes it is just a question of demand.

Private Radar

Cedexis has the ability to configure custom Radar measurements as well. These measurements will only be taken by your users, the ones using your JS tag.

Private Radar lets you monitor dedicated and other platforms which aren’t included in the community metrics. If you have enough traffic, private Radar measurements have the added bonus of being specific to your user base and of measuring your specific application so the data can be even more accurate than the community data.

The major disadvantage of private Radar is that low volume metrics may not produce the best decisions. With that in mind, you will want to supplement your data with other data sources. I’ll show you how to set that up.

Situational Awareness

More than just a research tool, Radar makes all of these live metrics available for decision-making inside OpenMix. That means we can make much more intelligent choices than we could with less precise technologies like Geo-targeting and Anycast.

Most people using Geo-targeting assume that being geographically close to a network destination is also closer from the networking point of view. In reality, network latency depends on many factors like available bandwidth, number of hops, etc. Anycast can pick a destination with lower latency, but it’s stuck way down in Layer 3 of the stack with no idea about application performance or availability.

With Radar, you get real-time performance comparisons of the providers you use, from your user’s perspectives. You know that people on ISP Alice are having better performance from the East coast DC while people on ISP Bob are having better performance from the Midwest DC even if both these ISPs are serving the same geography.

Sonar

Whether you are using community or low volume private Radar measurements, you ideally want to try and get more application specific data into OpenMix. One way to do this is with Sonar.

Sonar is a synthetic polling tool which will poll any URL you give it from multiple locations and store the results as availability data for your platforms. For the simplest implementation, you need only an address that responds with an OK if everything is working properly.

If you want to get more bang for your buck, you can make that URL an intelligent endpoint so that if your platform is nearing capacity, you can pretend to be unavailable for a short time to throttle traffic away before your location really has issues.

You can also use the Sonar endpoints as a convenient way to automate diverting traffic for maintenance windows. No splash pages required.
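Here is a minimal sketch of such an “intelligent” Sonar endpoint, using only the Python standard library (the load threshold and the maintenance flag file are hypothetical): it answers 200 while the host is healthy and 503 when it is overloaded or deliberately drained for maintenance.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
import os

MAX_LOAD = 8.0                                # hypothetical capacity threshold
MAINTENANCE_FLAG = "/etc/myapp/maintenance"   # hypothetical flag file

class SonarHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        load1, _, _ = os.getloadavg()         # 1-minute load average (Unix)
        healthy = load1 < MAX_LOAD and not os.path.exists(MAINTENANCE_FLAG)
        status, body = (200, b"OK") if healthy else (503, b"BUSY")
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), SonarHandler).serve_forever()
```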

Fusion

Fusion is really another amazing piece of work from Cedexis. As you might guess from its name, Fusion is where Cedexis glues services together and it comes in two flavors:

  1. Global Purge
  2. Data Feeds

Global Purge

By nature, one of the most apropos uses of Cedexis is to marry multiple CDN providers for better performance and stability. Every CDN has countries where it is better and countries where it is worse. In addition, maintenance windows in a CDN provider can be devastating for performance even though they usually won’t cause downtime.

The downside of a Multi-CDN approach is the overhead involved in managing each of the CDNs and most often that means purging content from the cache. Fusion allows you to connect to multiple supported CDN providers (a lot of them) and purge content from all of them from one interface inside Cedexis.

While this is a great feature, I have to add that you shouldn’t be using it. Purging content from a CDN is very Y2K; you should be using versioned resources with far-future expiry headers to get the best performance out of your sites, so that you never have to purge content from a CDN ever again.
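For what versioned resources look like in practice, here is a tiny sketch (my own illustration, not a Cedexis feature): the asset filename embeds a content hash, so a changed file gets a new URL and the old one can safely be cached with far-future expiry headers.

```python
import hashlib
import pathlib
import shutil

def versioned_copy(path: str) -> str:
    """Copy e.g. app.css to app.<hash>.css so the URL changes whenever
    the content changes and far-future expiry headers become safe."""
    src = pathlib.Path(path)
    digest = hashlib.sha1(src.read_bytes()).hexdigest()[:10]
    dst = src.with_name(f"{src.stem}.{digest}{src.suffix}")
    shutil.copyfile(src, dst)
    return dst.name

# Example: versioned_copy("static/app.css") -> "app.3f9c1a2b7d.css"
```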

Data Feeds

This is the really great part. Fusion lets you import data from basically anywhere to use in your OpenMix decision making process. Built in, you will find connections to various CDN and monitoring services, but you can also work with Cedexis to setup custom Fusion integrations so the sky’s the limit.

With Radar and Sonar, we have very good data on performance and availability (time and quality) both from the real user perspective and a supplemental synthetic perspective. To really optimize our traffic we need to account for all three corners of the Time, Cost, Quality triangle.

With Fusion, we can introduce cost as a factor in our decisions. Consider a company using multiple CDN providers, each with a minimum monthly commitment of traffic. If we directed traffic based on performance alone, we might not meet the monthly commitment on one provider and still be required to pay for traffic we didn’t actually send. Fusion provides usage statistics for each CDN and allows OpenMix to divert traffic so that we optimize our spending.
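As a toy illustration of that trade-off (my own sketch, not Cedexis’ actual algorithm; all figures are hypothetical): among CDNs whose measured performance is close enough to call equivalent, send traffic to the one furthest behind its monthly commitment.

```python
# Hypothetical usage/commit figures (GB this month) for two CDNs whose
# measured performance is close enough to treat as equivalent.
cdns = {
    "cdn_a": {"commit_gb": 500_000, "used_gb": 210_000, "rtt_ms": 38},
    "cdn_b": {"commit_gb": 200_000, "used_gb": 185_000, "rtt_ms": 41},
}

def pick_cdn(cdns, rtt_tolerance_ms=5):
    best_rtt = min(c["rtt_ms"] for c in cdns.values())
    # Keep only CDNs within the performance tolerance of the fastest one.
    candidates = {name: c for name, c in cdns.items()
                  if c["rtt_ms"] - best_rtt <= rtt_tolerance_ms}
    # Among those, prefer whichever is furthest from its monthly commit,
    # so no minimum is paid for without being used.
    return max(candidates,
               key=lambda name: candidates[name]["commit_gb"] - candidates[name]["used_gb"])

print(pick_cdn(cdns))   # -> "cdn_a"
```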

Looking Forward

With all the logic we can build into our infrastructure using Cedexis, it could almost be a fire and forget solution. That would, however, be a huge waste. The Internet is always evolving. Providers come and go. Bandwidth changes hands.

Cedexis reports provide operational intelligence on alternative providers without any of the hassle involved in a POC. Just plot the performance of the provider you’re interested in against the performance of your current providers and make an informed decision to further improve your services. When something better comes along, you’ll know it.

The Nitty Gritty

Keep an eye out for the next article where I’ll do a step by step walk-through on setting up a Multi-Cloud solution using Cedexis. I’ll cover almost everything mentioned here, including Private and Community Radar, Sonar, Standard and Custom OpenMix Applications, and Cedexis Reports.

Virtual Block Storage Crashed Your Cloud Again :(


You know it’s bad when you start writing an incident report with the words “The first 12 hours.” You know you need a stiff drink, possibly a career change, when you follow that up with phrases like “this was going to be a lengthy outage…”, “the next 48 hours…”, and “as much as 3 days”.

That’s what happened to huge companies like Netflix, Heroku, Reddit, Hootsuite, Foursquare, Quora, and Imgur the week of April 21, 2011. Amazon AWS went down for over 80 hours, leaving them and others up a creek without a paddle. The root cause of this cloud-tastrify echoed loud and clear.

Heroku said:

The biggest problem was our use of EBS drives, AWS’s persistent block storage solution… Block storage is not a cloud-friendly technology. EC2, S3, and other AWS services have grown much more stable, reliable, and performant over the four years we’ve been using them. EBS, unfortunately, has not improved much, and in fact has possibly gotten worse. Amazon employs some of the best infrastructure engineers in the world: if they can’t make it work, then probably no one can.

Reddit said:

Amazon had a failure of their EBS system, which is a data storage product they offer, at around 1:15am PDT. This may sound familiar, because it was the same type of failure that took us down a month ago. This time however the failure was more widespread and affected a much larger portion of our servers

While most companies made heartfelt resolutions to get off of EBS, Netflix was clear to point out that they never trusted EBS to begin with:

When we re-designed for the cloud this Amazon failure was exactly the sort of issue that we wanted to be resilient to. Our architecture avoids using EBS as our main data storage service…

Fool me once…

As Reddit mentioned in their postmortem, AWS had similar EBS problems twice before on a smaller scale in March. After an additional 80+ hours of downtime, you would expect companies to learn their lesson, but the facts are that these same outages continue to plague clouds using various types of virtual block storage.

In July 2012, AWS experienced a power failure which resulted in a huge number of possibly inconsistent EBS volumes and an overloaded control plane. Some customers experienced almost 24 hours of downtime.

Heroku, under the gun again, said:

Approximately 30% of our EC2 instances, which were responsible for running applications, databases and supporting infrastructure (including some components specific to the Bamboo stack), went offline…
A large number of EBS volumes, which stored data for Heroku Postgres services, went offline and their data was potentially corrupted…
20% of production databases experienced up to 7 hours of downtime. A further 8% experienced an additional 10 hours of downtime (up to 17 hours total). Some Beta and shared databases were offline for a further 6 hours (up to 23 hours total).

AppHarbor had similar problems:

EC2 instances and EBS volumes were unavailable and some EBS volumes became corrupted…
Unfortunately, many instances were restored without associated EBS volumes required for correct operation. When volumes did become available they would often take a long time to properly attach to EC2 instances or refuse to attach altogether. Other EBS volumes became available in a corrupted state and had to be checked for errors before they could be used.
…a software bug prevented this fail-over from happening for a small subset of multi-az RDS instances. Some AppHarbor MySQL databases were located on an RDS instance affected by this problem.

The saga continues for AWS, who continued to have problems with EBS later in 2012. They detail, ad nauseam, how a small DNS misconfiguration triggered a memory leak which caused a silent cascading failure of all the EBS servers. As usual, the EBS failures impacted API access and RDS services. Yet again, Multi-AZ RDS instances didn’t fail over automatically.

Who’s using Virtual Block Storage?

Amazon EBS is just one very common example of Virtual Block Storage and by no means the only one to fail miserably.

Azure stores the block devices for all their compute nodes as Blobs in their premium or standard storage services. Back in November, a bad update to the storage service sent some of their storage endpoints into infinite loops, denying access to many of these virtual hard disks. The bad software was deployed globally and caused more than 10 hours of downtime across 12 data centers. According to the post, some customers were still being affected as much as three days later.

HP Cloud provides virtual block storage based on OpenStack Cinder. See related incident reports here, here, here, here, here. I could keep going back in time, but I think you get the point.

Also based on Cinder, Rackspace offers their Cloud Block Storage product. Their solution has some proprietary component they call Lunr, as detailed in this Hacker News thread, so you can hope that Lunr is more reliable than other implementations. Still, Rackspace had major capacity issues spanning over two weeks back in May of last year, and I shudder to think what would have happened if anything went wrong while capacity was low.

Storage issues are so common and take so long to recover from in OpenStack deployments, that companies are deploying alternate cloud platforms as a workaround while their OpenStack clouds are down.

What clouds won’t ruin your SLA?

Rackspace doesn’t force you to use their Cloud Block Storage, at least not yet, so unless they are drinking their own kool-aid in ways they shouldn’t be, you are hopefully safe there.

Digital Ocean also relies on local block storage by design. They are apparently considering other options but want to avoid an EBS-like solution for the reasons I’ve mentioned. While their local storage isn’t putting you at risk of a cascading failure, they have been reported to leak your data to other customers if you don’t destroy your machines carefully. They also have other fairly frequent issues which take them out of the running for me.

The winning horse

As usual, Joyent shines through on this. For many reasons, the SmartDataCenter platform, behind both their public cloud and open source private cloud solutions, supports only local block storage. For centralized storage, you can use NFS or CIFS if you really need to, but you will not find virtual block storage or even SAN support.

Joyent gets some flak for this opinionated architecture, occasionally even from me, but they don’t corrupt my data or crash my servers because some virtual hard disk has gone away or some software upgrade has been foolishly deployed.

With their recently released Docker and Linux Binary support, Joyent is really leading the pack with on-metal performance and availability. I definitely recommend hitching your wagon to their horse.

The Nooooooooooooooo! button

If it’s too late and you’re only finding this article post cloud-tastrify, I refer you to the ever amusing Nooooooooooooooo! button for some comic relief.