Tag: solaris

Docker 2015 Recap – The good, the bad and the ugly.

docker 2015

Year in review

There is no question in my mind that Docker was, by far, the most disruptive technology of 2015. What barely registered on the radar for many in 2014, became something shiny in Q1, advanced to thinking material in Q2, reached “you have to be crazy to run that in production” by Q3, and is now on everyone’s three year plan.

To make a long story short:

  1. Containers were a great idea 12 years ago and Docker has brought containers to Linux and managed to improve on the concept.
  2. That said, the actual implementation of containers in Linux is a disaster. If you can’t use Triton or another container technology, be prepared to step on a lot of mines.
  3. In proper hands and under proper circumstances (dev, qa, limited production use), Docker is already useful so use it.
  4. Though it’s production capable, it’s not production ready. Parental supervision required.

Getting some perspective

Sun released Solaris 10 and Zones around 2005. The performance advantages of OS level virtualization over hardware level virtualization were and are obvious but the real innovation came the combination of Zones with complementary technologies like ZFS. On the simplest level, quota support, delegated datasets, LOFI mounts, and snapshotting capabilities added layers upon layers of  functionality to containers.

By 2008, the only systems I had in dev, staging, or production that didn’t run on top of zones were Oracle RAC nodes. Every piece of code and every application server was deployed by sending an incremental ZFS snapshot of the container in staging to the servers in production. Effectively every server in production was running block level clones of the containers in staging.

It shouldn’t surprise anyone in the tech industry that we’ve come full circle as technology loves to do. Docker brings almost this exact pattern of shipping containers based on incremental file system changes to Linux but with some compelling twists and, yes, some troubling flaws.

The good

Container as an artifact

Developers rarely keep a clean house and by that I mean a clean working environment. Working on multitudes of projects they’ll often have multiple versions of a runtime installed, not to mention any number of libraries. When software is written in a dirty environment, dependencies can be forgotten or met unintentionally and it leads to “worked on my laptop syndrome.” One of the major advantages of containers is the filesystem level isolation for the processes. When developers work properly with containers, their software, runtimes, and dependencies are all isolated from any other project. The same code and dependencies running in dev, go to staging, and then possibly even to production.

Container as a process

One of the most brilliant conceptual changes Docker has, in contrast to what Sun had made possible with Zones and ZFS, is the idea of a container as a process. I don’t know if this was originally intended by design or a happy consequence of building containerization on top of cgroups but it’s ground breaking for several reasons:

  1. Running a single process per container has significantly less resource requirements than running a traditional fat container like a Solaris Zone.
  2. Running a single process per container has significantly less attack vectors than running a traditional fat container.
  3. The paradigm of a container accepting STDIN and returning STDOUT/STDERR like a normal process brings the UNIX philosophy to new heights, allowing, not just software, but even very complex processes to be packaged as containers and piped together.

API

While the lean, process model container is one of the best innovations Docker brings to the table, it is not without it’s downsides. Docker containers are poorly suited for services which require multiple running processes, involve complex configurations, and/or work with many files/output channels. Mounting configuration, data, or log files from the Docker host can work around some of these issues but that strays from the general goal of having a self contained, self reliant image which can be run anywhere.

The alternative is to pass some unwieldy combination of environment variables to the container at runtime. On the command line, this is infuriating after a while but when managed by API, it becomes much more tractable. The API also makes it fairly straightforward to replace cloud resources with Docker resources so configuration management tools like Chef, Puppet, Ansible, and Vagrant have already been able to provide reasonably mature support despite the still growing adoption levels.

That said, the most innovative use of the Docker API has been Joyent’s Triton offering. By creating a virtual Docker endpoint as a gateway to the Joyent public cloud, they have become the only service, that I’m aware of, which runs Docker containers on bare metal without maintaining dedicated Docker Host machines. Having built and run a production SAAS based on Docker, I can’t stress enough how much of a game changer that is in terms of optimal resource utilization, and consequently in terms of price.

Layers

While ZFS technically enabled Zones to be built from incremental changes to a file system, there was really very little, if any integration there. I suppose you could name snapshots but I don’t think there was any way to get a history of the layers involved in creating the container like there is with Docker.

With ZFS+deduplication you could also achieve something along the lines of Docker’s caching and reuse of layers. To be honest, in most cases I preferred not to enable deduplication and to have another copy of my data in production. In development, however, if you want to run 100 containers based on Ubuntu, you are probably more than happy to reuse the Ubuntu layer.

Another compelling pattern for layer reuse is the data volume container. Assuming a service relying on a database, you could store the initial database in an image and then run any number of data volume containers based on that image. Each container will contain a private version of the data but only store the changes from the original image which is excellent for development, automated testing, etc.

With the properly constructed Dockerfile, caching layers can also significantly reduce the amount of time involved in downloading images based on layers which have already been downloaded, updating containers with new changes, etc. That comes with several caveats that layer caching is not infallible and you may easily miss updating critical pieces of your software if you do it wrong, for examples see here and here.

…the bad, and the ugly

The ugly truth is that the Linux container technology underlying Docker is incapable of providing true containerization. Unlike Solaris zones which were designed from the beginning to provide complete isolation between containers, Linux containers are really just a combination of loosely coupled kernel features doing things they can, but weren’t necessarily meant to do, hob-cobbled together with a bunch of proverbial duct tape.

Once, after a service upgrade, we experienced transient failures that crippled our system. It took about 6 hours to find a rogue instance of the previous application that had remained running although the “container” it was running was deleted. Since there is no real container entity in Linux, you will often end up with orphaned file system mounts, and less often, broken iptables rules and rogue processes.

Another major issue is the lack of “C4I” solutions/patterns in the Docker space. In a fat host/VM environment, services rely on any number of base services provided by the host including DNS resolution, email, logging, time synchronization, firewall, etc. In the lean environments of process model containers, there is nothing to rely on. In many cases, organizations fall back to providing non-containerized solutions from the host. In other cases, deploying privileged agent containers alongside application containers is used. Many times a combination of both is necessary but none of these solutions is optimal.

Most importantly, Docker when used improperly can add numerous attack surfaces to any system. You must be extremely careful with access to the docker daemon whether via TCP or via socket. Helpers like docker-machine are so focused on providing the easiest out-of-box experience that it’s too easy to do something wrong. You must be vigilant regarding the trustworthiness and the maintenance of container contents. With DockerHub and Docker, the instant gratification of pulling and running a basic instance of any software can quickly erase any common sense regarding the trustworthiness of the software inside. Has the software been updated or am I opening myself to known exploits? Has the software been back-doored? Will the container be mining bitcoins for someone else using my resources?

In summary

Docker has both good and bad points but the situation is surely improving with time. If Linux containers continue to be developed as a disconnected combination of resources and their controls, I wouldn’t expect a mature Linux based solution for at least 10 years.

For the long run, a better and faster alternative would be to keep Docker’s conceptual innovations and API while replacing the underlying technology with adaptations of mature container solutions like Zones. Joyent is already playing a visionary role on that front. Their focus on cloud, however, means that there are no plans in sight to support a more standard zones based Docker Host. I wish someone in the space would pick up the gauntlet there.

In short, if you can run on Triton/Solaris derivatives with Zones, that would be my first pick. If you have to run on Linux, tread carefully, resist the temptation for instant gratification,  and Docker on Linux can already be useful in many cases.

Triton Bare Metal Containers FTW!

triton

If you haven’t heard of Docker, I’m not sure what cave you’ve been living in but here’s the short story:

Hardware level virtualization, like you are used to (VMWare, Virtualbox, KVM, Xen, Amazon, Rackspace, Azure, etc.) is slow and awful. Virtualization at the OS level, where processes share isolated access to a single Operating System kernel, is much more efficient and therefore awesome. Another way of saying OS level virtualization is “containers” such as have existed in FreeBSD and Solaris for over a decade.

Docker is hacking together a whole slew of technologies (cgroups, union filesystems, iptables, etc.) to finally bring the concept of containers to the Linux masses. Along the way, they’ve also managed to evolve the concept a little adding the idea of running a container as very portable unit which runs as a process.

Instead of managing dependencies across multiple environments and platforms, an ideal Docker container encapsulates all the runtime dependencies for a service or process. Instead including a full root file system, the ideal Docker container could be as small as a single binary. Taking that hand in hand with the growing interest in developing reusable microservices, we have an amazing tech revolution on our hands.

That is not to say that everything in the Docker world is all roses. I did say that Docker is hacking together a slew of technologies so while marketing and demos portray:

You are more likely to start with something like this:

Either way, tons of containers per host is great until you realize you are lugging them around on a huge, slow, whale of a cargo ship.

  1. Currently, Docker’s level of isolation between containers is not that great so security and noisy neighbors are issues.
  2. Containers on the same host can’t easily run on the same ports so you may have to do some spaghetti networking.
  3. On top of that, if you are running Docker in Amazon, Google, Azure, etc. you are missing the whole point which was to escape the HW level virtualization.

Joyent to the rescue!

Joyent is the only container based cloud provider that I’m aware of. They have been running the vast majority of my cloud instances (possibly yours as well) on OS level virtualization for years now (years before Docker was a twinkle in Shamu’s eye). As such, they are possibly the most experienced and qualified technical leaders on the subject.

They run a customized version of Illumos, an OpenSolaris derivative, with extremely efficient zone technology for their containers. In January (Linux and Solaris are Converging but Not the Way you Think), I wrote about the strides Joyent made allowing Linux binaries to run inside Illumos zones.

Triton May Qualify as Witchcraft

The love child of that work, announced as GA last week, was Triton- a Docker API compliant (for the most part) service running zone based Docker containers on bare metal in the cloud. If running on bare metal weren’t enough of an advantage, Joyent completely abstracted away the notion of the Docker host (ie. the cargo ship). Instead, your Docker client speaks to an API endpoint which schedules your bare metal containers transparently across the entire cloud.

Addressing each of the points I mentioned above:

  1. Zones in Illumos/Joyent provide complete isolation as opposed to Linux based containers so no security or noisy neighbor problems.
  2. Every container gets it’s own public ip address with a full range of ports so no spaghetti networking
  3. No Docker host and no HW virtualization so every container is running full speed on bare metal

Going back to the boat analogy, if Docker containers on Linux in most clouds looks like this:

Docker containers as zones on bare metal in Joyent look like this:

Enough of the FUD

I’m not a big fan of marketing FUD so I’ve been kicking the tires on the beta version of Triton for a while. Now with the GA, here are the pros and cons.

Pros:

  1. Better container isolation
  2. Better networking support
  3. Better performance
  4. No overhead managing a Docker host
  5. Great pricing (per minute and much lower pricing)
  6. User friendly tooling in the portal, including log support and running commands on containers using docker exec.

Cons:

  1. The API still isn’t fully supported so things like build and push don’t work. You can mostly work around this using a docker registry.
  2. Lack of a Docker Host precludes using some of the patterns that have emerged for logging, monitoring, and sharing data between containers.

Summary

Docker is a game changer but it is far from ready for prime time. Triton is the best choice available today for running container based workloads in general, and for production Docker workloads specifically.

At Codefresh, we’re working to give you the benefits of containers in your development workflow without the headache of actually keeping the ship afloat yourselves.  Sign up for our free Beta service to see how simple we can make it for you to run your code and/or Docker containers in a clean environment. Take a look at my getting started walkthrough or contact me if you have any questions.

Linux and Solaris are Converging but Not the Way You Imagined

tux

In case you haven’t been paying attention, Linux is in a mad dash to copy everything that made Solaris 10 amazing when it launched in 2005. Everyone has recognized the power of Zones, ZFS and DTrace but licensing issues and the sheer effort required to implement the technologies has made it a long process.

ZFS

ZFS is, most probably, the most advanced file system in the world. The creators of ZFS realized, before anyone else, that file systems weren’t built to handle the amounts of data that the future would bring.

Work to port ZFS to Linux began in 2008 and a stable port of ZFS from Illumos was announced in 2013. That said, even 2 years later, the latest release still hasn’t reached feature parity with ZFS on Illumos. With developers preferring to develop OpenZFS on Illumos and the licensing issues preventing OpenZFS from being distributed as part of the Linux Kernel, it seems like ZFS on Linux (ZOL) may be doomed to playing second fiddle.

DTrace

DTrace is the most advanced tool in the world for debugging and monitoring live systems. Originally designed to help troubleshoot performance and other bugs in a live Solaris kernel, it quickly became extremely useful in debugging userland programs and run times.

Oracle has been porting DTrace since at least 2011 and while they both own the original and have prioritized the most widely used features, they still haven’t caught up to the original.

Zones

Solaris Zones are Operating System level virtual machines. They are completely isolated from each other but all running on the same kernel so there is only one operating system in memory. Zones have great integration with ZFS, DTrace, and all the standard system monitoring tools which makes it very easy to support and manage servers with hundreds of Zones running on them. Zones also natively support a mechanism called branding which allows the kernel to provide different interfaces to the guest zone. In Oracle Solaris, this is used to support running zones from older versions of Solaris on a machine running a newer OS.

Linux containers of some type or another have been around for a while, but haven’t gotten nearly as mature as Zones. Recently, the continued failure of traditional hypervisors to provide bare metal performance in the cloud, coupled with the uptake of Docker, has finally gotten the world to realize the tremendous benefits of container based virtualization like Zones.

The current state of containers in Linux is extremely fractured with at least 5 competing projects that I know of. LXC, initially released in 2008, seems to be the favorite but historically had serious privilege separation issues but has gotten a little better if you can meet all the system requirements.

Joyent has been waiting at the finish line.

While Linux users wait and wait for mature container solutions, full OS and application visibility, and a reliable and high performance file system, Joyent has been waiting to make things a whole lot easier.

About a year ago, David Mackay showed some interest in Linux Branded Zones, work which had been abandoned in Illumos. In the spring of 2014, Joyent started work on resurrecting lx-zones and in September, they presented their work. They already have working support for 32 bit and some 64 bit Linux binaries in Linux branded SmartOS Zones. As part of the process, they are porting some of the main Linux libraries and facilities to native SmartOS which will make porting Linux code to SmartOS much easier.

The upshot of it is that you can already get ZFS, Dtrace, and Linux apps inside a fully isolated, high performance, SmartOS zone. With only 9 months or so of work behind it, there are still some missing pieces to the Linux support, but, considering how long Linux has been waiting, I’m pretty sure SmartOS will reach feature parity with Linux a lot faster than Linux will reach feature parity with SmartOS.

SmartDataCenter, the Open Cloud Platform that Actually Already Works

sdc

For years enterprises have tried to make OpenStack work and failed miserably. Considering how many heads have broken against OpenStack, maybe they should have called it OpenBrick.

Before I dive into the details, I’ll cut to the chase. You don’t have to break your heads on cloud anymore. Joyent have open sourced (as in get it on Github) their cloud management platform.

It’s free if you want (install it on your laptop, install it on a server). It’s supported if you want. Best of all, it actually works outside of a lab or CI test suite. It’s what Joyent runs in production for all their public cloud customers (I admit to being one of the satisfied ones). It’s also something they have been licensing out to other cloud providers for years.

Now for the deep dive.

What’s wrong with OpenStack?

First off, it isn’t a cloud in a box, which is what most people think it is. In 2013,Gartner called out OpenStack for consciously misrepresenting what OpenStack actually provides:

no one in three years stood up to clarify what OpenStack can and cannot do for an enterprise.

In case you’re wondering, the analyst also quoted Ebay’s chief engineer on the true nature of OpenStack:

… an instance of an OpenStack installation does not make a cloud. As an operator you will be dealing with many additional activities not all of which users see. These include infra onboarding, bootstrapping, remediation, config management, patching, packaging, upgrades, high availability, monitoring, metrics, user support, capacity forecasting and management, billing or chargeback, reclamation, security, firewalls, DNS, integration with other internal infrastructure and tools, and on and on and on. These activities are bound to consume a significant amount of time and effort. OpenStack gives some very key ingredients to build a cloud, but it is not cloud in a box.

The analyst made it clear that:

vendors get this difference, trust me.

Other insiders put the situation into similar terms:

OpenStack has some success stories, but dead projects tell no tales. I have seen no less than 100 Million USD spent on bad OpenStack implementations that will return little or have net negative value.Some of that has to be put on the ignorance and arrogance of some of the organizations spending that money, but OpenStack’s core competency, above all else, has been marketing and if not culpable, OpenStack has at least been complicit.

The motive behind the deception is clear. OpenStack is like giving someone a free Ferrari, pink slip and all but keeping the keys. You get pieces of a cloud but no way to run it. Once you have put all your effort into installing OpenStack and you realize what’s missing, you are welcome to turn to any one of the vendors backing OpenStack for one of their packaged cloud platforms.

OpenStack is a foot in the door. It’s a classic bait and switch but even after years, no one is admitting it. Instead, blue chip companies fight to steer OpenStack into the direction that suits them and their corporate offerings.

What’s great about SmartDataCenter?

It works.

The keys are in the ignition. You should probably stop reading this article and install it already. You are likely to get a promotion for listening to me 😉

Great Technology

SmartDataCenter was built on really great technologies like SmartOS (fork of Solaris), Zones, ZFS, and DTrace. Most of these technologies are slowly being ported to Linux but they are already 10 years mature in SDC.

  • Being based on a fork of Solaris brings you baked in enterprise ready features like IPSEC, IPF, RBAC, SMF, Resource management and capping, System auditing, Filesystem monitoring, etc.
  • Zones are the big daddy of container technology guaranteeing you the best on-metal performance for your cloud instances. If you are running a native SmartOS guest, you get the added benefit of CPU bursting and live machine re-size (no reboot, or machine pause necessary).
  • ZFS is the most reliable, high performance, file system in the world and is constantly improving.
  • DTrace is the secret to low level visibility with zero to no overhead. In cloud deployments where visibility is usually close to zero, this is an amazing feature. It’s even more amazing as the cloud operator.

Focus

SDC was built for one thing by one company, to replace the data centers of the past. It says so in the name. With one purpose, SDC has been built to be veryopinionated about what it does and how it does it. This gives SDC a tremendous amount of focus, something sorely lacking from would-be competition like OpenStack.

Lastly, it works.

Sun Oracle Webcast Wrap Up

Last night I watched almost the entire 5 hour live webcast announcing Oracle’s strategies regarding the Sun Microsystems acquisition. As a near-evangelist for Sun and Solaris, I’m very happy with the deal finally going through and even happier that most of what Oracle said makes sense to me as a customer.

What I liked:

  • The clear commitment to the SPARC roadmap especially the T series. I honestly don’t know what I would have done if the T series servers disappeared. I’m very happy that they put raising the clock speed into the roadmap because some applications just can’t be deployed on these servers.
  • The clear commitment to making waves in Enterprise Storage. NetApp was specifically mentioned and obviously the 7000 series arrays are best suited to compete with the NetApp arrays but I hope they will draw some EMC blood as well. I like the plans for integrating backup capabilities.
  • The plans to integrate really great Solaris tech into Oracle applications like DTrace, and RBAC
  • The plans to offer direct support. Honestly this was one of the most annoying parts of working with Sun was having to work with different support providers in every location.
  • The plans to change the supply chain and ship direct- no more out of stock excuses.
  • The plans to integrate Ops Center with Oracle Enterprise Manager.
  • Larry Ellison’s stand up comedy
  • And completely unrelated- the flashing disk lights on the Exadata V2 🙂

I didn’t like:

  • The obvious cut planned for the x64 line of hardware. While they are keeping x64 where convenient (storage appliances, database machines, various other “clusters”) it looks like Oracle has no plans for dealing in x64 server business as a server business. I’m not a big user of the x64 stuff for servers but Sun doesn’t really offer anything reasonable for entry level anymore except the x64 line. This brings me to my next point-
  • The SPARC roadmap is slightly sucky as in how much processing power do you really want inside a single box.  According to the roadmap, their next plan is to double the amount of cores in a T3 processor so you’ll have one cpu with 16 cores and 128 threads. Their going to put two in a machine? four?  Here is how I see the servers they have today:
    • T1000- useless poorly designed server
    • T2000- ok server but a waste of rack space at 2RU
    • T5120- ok server but a waste of rack space considering I could put a T5140 in the same space
    • T5220- worse than the T5120 at 2RU
    • T5140- The best server ever built with exactly the right amount of everything
    • T5240- 2RU again???
    • T5440- I could serve ~8.64 billion web requests per day from one of these but I’d need a 1.6Gbit uplink and two servers for redundancy = 8RU, or else use 4 T5140 machines, deliver the same performance, and use 4RU?- maybe 5RU including n+1 redundancy.
    • NONEXISTANT – little SPARC machine for backup/monitoring/insert your SPARC only app that doesn’t deserve a minimum of 32 threads and 2RU  here.

    At some point, you just want more smaller machines for less points of failure. I really have uses for low end SPARC machines and they don’t make them any more.

  • I don’t really like the “server phone home” idea.
  • No mention of OpenSolaris- I’m not really a user but I didn’t like that it wasn’t mentioned- What does that mean??
  • No mention of Webstack. I really like Sun Webstack as an idea. I’m not sure what is happening to it now?
  • No mention of how Oracle will be combining the knowledge bases? Sunsolve? Bigadmin? docs.sun.com? forums.sun.com (looks like this already had an Oracle makeover :?)

One thing I’m not sure about is the integration of Sun virtualization technologies into Oracle VM. On one hand it sounds good, on the other hand, I think this was the only part of the presentation where I noticed there were no due dates. Virtualization is super important to me so I really want to know where things stand.

Obviously, it is easy to  get up and say everything will integrate but doing it is much harder. Just getting past the internal politics of this will be a major issue. Now we can only wait and see if Oracle can pull it off.

I used to get upset with “Oracle people” for always thinking that Oracle was the solution to every problem. If they pull off this acquisition, I much just become an “Oracle person” myself.