Tag: FreeBSD

Triton Bare Metal Containers FTW!

If you haven’t heard of Docker, I’m not sure what cave you’ve been living in, but here’s the short story:

Hardware-level virtualization, the kind you are used to (VMware, VirtualBox, KVM, Xen, Amazon, Rackspace, Azure, etc.), is slow and awful. Virtualization at the OS level, where processes share isolated access to a single operating system kernel, is much more efficient and therefore awesome. Another name for OS-level virtualization is “containers,” which have existed in FreeBSD and Solaris for over a decade.

Docker is hacking together a whole slew of technologies (cgroups, union filesystems, iptables, etc.) to finally bring the concept of containers to the Linux masses. Along the way, they’ve also managed to evolve the concept a little, adding the idea of the container as a very portable unit which runs as a process.

Instead of managing dependencies across multiple environments and platforms, an ideal Docker container encapsulates all the runtime dependencies for a service or process. Instead of including a full root file system, the ideal Docker container could be as small as a single binary. Take that hand in hand with the growing interest in developing reusable microservices, and we have an amazing tech revolution on our hands.
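
To make that concrete, here is a minimal sketch (the image and binary names are made up): a statically linked binary dropped onto an empty base image is a complete, runnable container.

    # Dockerfile (illustrative): nothing but a statically linked binary
    FROM scratch
    COPY myservice /myservice
    ENTRYPOINT ["/myservice"]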

That is not to say that everything in the Docker world is all roses. I did say that Docker is hacking together a slew of technologies, so while marketing and demos portray this:

You are more likely to start with something like this:

Either way, tons of containers per host are great until you realize you are lugging them around on a huge, slow whale of a cargo ship.

  1. Currently, Docker’s level of isolation between containers is not that great, so security and noisy neighbors are issues.
  2. Containers on the same host can’t easily listen on the same ports, so you may have to do some spaghetti networking.
  3. On top of that, if you are running Docker in Amazon, Google, Azure, etc., you are missing the whole point, which was to escape hardware-level virtualization.

Joyent to the rescue!

Joyent is the only container-based cloud provider that I’m aware of. They have been running the vast majority of my cloud instances (possibly yours as well) on OS-level virtualization for years now (years before Docker was a twinkle in Shamu’s eye). As such, they are possibly the most experienced and qualified technical leaders on the subject.

They run a customized version of Illumos, an OpenSolaris derivative, with extremely efficient zone technology for their containers. In January (Linux and Solaris are Converging but Not the Way you Think), I wrote about the strides Joyent made allowing Linux binaries to run inside Illumos zones.

Triton May Qualify as Witchcraft

The love child of that work, announced as GA last week, was Triton: a Docker-API-compliant (for the most part) service running zone-based Docker containers on bare metal in the cloud. If running on bare metal weren’t enough of an advantage, Joyent has completely abstracted away the notion of the Docker host (i.e., the cargo ship). Instead, your Docker client speaks to an API endpoint, which schedules your bare-metal containers transparently across the entire cloud.
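
In practice, that means pointing the stock Docker client at Triton instead of at a local daemon. A rough sketch follows; the endpoint URL and certificate path are illustrative, since Joyent’s setup tooling generates the real values for your account.

    # Point the standard Docker client at the Triton endpoint
    # (values are illustrative, not the real ones for your account).
    export DOCKER_HOST=tcp://us-east-1.docker.joyent.com:2376
    export DOCKER_CERT_PATH=$HOME/.sdc/docker/myaccount
    export DOCKER_TLS_VERIFY=1

    # This lands on bare metal somewhere in the data center;
    # there is no host to choose or manage.
    docker run -d -p 80:80 nginx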

Addressing each of the points I mentioned above:

  1. Zones in Illumos/Joyent provide complete isolation, as opposed to Linux-based containers, so there are no security or noisy-neighbor problems.
  2. Every container gets its own public IP address with a full range of ports, so no spaghetti networking.
  3. There is no Docker host and no hardware virtualization, so every container runs at full speed on bare metal.

Going back to the boat analogy, if Docker containers on Linux in most clouds look like this:

Docker containers as zones on bare metal in Joyent look like this:

Enough of the FUD

I’m not a big fan of marketing FUD, so I’ve been kicking the tires on the beta version of Triton for a while. Now that it has gone GA, here are the pros and cons.

Pros:

  1. Better container isolation
  2. Better networking support
  3. Better performance
  4. No overhead managing a Docker host
  5. Great pricing (billed per minute, and much lower overall)
  6. User friendly tooling in the portal, including log support and running commands on containers using docker exec.

Cons:

  1. The API still isn’t fully supported, so things like docker build and docker push don’t work. You can mostly work around this using a Docker registry (see the sketch after this list).
  2. The lack of a Docker host precludes some of the patterns that have emerged for logging, monitoring, and sharing data between containers.
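
Here is a sketch of that registry workaround (registry and image names are placeholders): build and push from any machine running a full Docker daemon, then run the published image against the Triton endpoint.

    # On a machine with a local Docker daemon: build and publish the image.
    docker build -t registry.example.com/myapp:1.0 .
    docker push registry.example.com/myapp:1.0

    # With DOCKER_HOST pointed at Triton: pull and run the published image.
    docker run -d registry.example.com/myapp:1.0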

Summary

Docker is a game changer, but it is far from ready for prime time. Triton is the best choice available today for running container-based workloads in general, and for production Docker workloads specifically.

At Codefresh, we’re working to give you the benefits of containers in your development workflow without the headache of actually keeping the ship afloat yourselves. Sign up for our free Beta service to see how simple we can make it for you to run your code and/or Docker containers in a clean environment. Take a look at my getting started walkthrough or contact me if you have any questions.

Portsnap, Apache Configurations, and CGI – Questions Answered

  1. Explain the importance of installing and running portsnap after installing a current version of FreeBSD.

    Portsnap is a system for securely distributing the FreeBSD ports tree. Approximately once an hour, a “snapshot” of the ports tree is generated, repackaged, and cryptographically signed. The resulting files are then distributed via HTTP.

    The first time portsnap is run, it will need to download a compressed snapshot of the entire ports tree (portsnap fetch) and then a “live” copy of the ports tree can be extracted into /usr/ports/ (portsnap extract). This is necessary even if a ports tree has already been created in that directory (e.g., by using CVSup), since it establishes a baseline from which portsnap can determine which parts of the ports tree need to be updated later.

    Initializing portsnap as soon as possible will ensure the most secure and up to date software installations on your machine and will prevent a long download of the initial compressed tree when you need it later.

    After the initialization of portsnap, it is recommended to put ‘portsnap cron’ in a cron job to fetch updates regularly. Then run ‘portsnap update’ before using the ports system. Putting ‘portsnap update’ itself in cron is not recommended, since it can cause problems if it runs while the ports tree is in use.
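
    The whole cycle looks something like this (the crontab line is illustrative and belongs in /etc/crontab):

        # First run only: download and extract the full ports tree.
        portsnap fetch
        portsnap extract

        # /etc/crontab entry to download updates regularly; 'portsnap cron'
        # waits a random interval to avoid hammering the mirrors.
        0 3 * * *   root   /usr/sbin/portsnap cron

        # Apply whatever has been downloaded before working with the ports tree.
        portsnap update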

  2. Explain the role of configuration files in Unix applications. In Apache version 2.2, the configuration files have been modularized. What are the advantages and disadvantages of using a modular approach to configuration files?

    Configuration files allow you to control the settings and parameters of a service or program by editing, in most cases, a simple text file. Well known configuration files include /etc/hosts (local hostname to IP resolution), /etc/nsswitch.conf (name service configuration), and /etc/resolv.conf (DNS server configuration). Samba uses a configuration file which looks more like a Windows .ini file, while Apache uses its own XML-ish format.

    Apache’s use of modular configuration files is not new in version 2.2 as can be seen here: http://httpd.apache.org/docs/1.3/mod/core.html#include

    More likely, you are used to a binary distribution of Apache which has split the configuration into several files/directories and included them from the base configuration. The reason for doing this is convenience: it is easy to manage multiple Apache servers with similar settings by creating a basic shared configuration for all of them and modifying only the subset of the configuration that sits in an included file.

    It is also popular to use a set of prepared configuration files (module configurations/vhost configurations) in a directory marked “_available” and symlink them into a directory called “_enabled” which is included in the Apache configuration. This provides a quick on/off mechanism for certain configurations.
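
    A sketch of both ideas (paths are illustrative; FreeBSD’s Apache 2.2 port keeps its configuration under /usr/local/etc/apache22, and the directory names follow the convention described above):

        # In httpd.conf, include a whole directory of extra configuration:
        #   Include etc/apache22/sites-enabled/*.conf

        # Keep prepared configurations in sites-available and symlink the ones
        # you want active into sites-enabled.
        ln -s /usr/local/etc/apache22/sites-available/example.conf \
              /usr/local/etc/apache22/sites-enabled/example.conf

        # Turning the site off again is just a matter of removing the symlink.
        rm /usr/local/etc/apache22/sites-enabled/example.conf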

  3. Explain the use of “directives” in configuration files. Provide an example of two directives found in an Apache configuration file and detail what each accomplishes.

    “Directive” is a term which is pretty specific to Apache. Each directive controls some part of the configuration, and Apache has roughly 410 of them. Each one is characterized by the syntax of the arguments it accepts, its default value if there is one, the context in which it can be used (server, virtual host, directory, etc.), what overrides must be in place for it to be used in a .htaccess file, its status, its module, and its compatibility.

    Examples:

    ServerName is a directive which sets the request scheme, hostname and port that the server uses to identify itself. http://httpd.apache.org/docs/2.2/mod/core.html#servername

    The ServerAdmin directive sets the contact address that the server includes in any error messages it returns to the client. http://httpd.apache.org/docs/2.2/mod/core.html#serveradmin
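
    Both often appear together in a virtual host definition, for example (hostnames and paths are illustrative):

        # Fragment of an Apache 2.2 configuration file
        <VirtualHost *:80>
            ServerName   www.example.org
            ServerAdmin  webmaster@example.org
            DocumentRoot "/usr/local/www/example"
        </VirtualHost>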

  4. What is meant by “overrides”? Provide an example of an override found in an Apache configuration file and detail what is accomplished by the override.

    An override allows a directive to be overridden by directives in a .htaccess file located in one of the web content directories.

    An example of an override is AuthConfig. This override allows the .htaccess file in a directory to change the Apache configuration of that directory in terms of authentication (either allow or deny access, specify users, etc.). http://httpd.apache.org/docs/2.2/mod/core.html#allowoverride
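
    For example (paths are illustrative), the main configuration grants the override for one directory, and a .htaccess file inside that directory can then require a login:

        # In the main configuration: let .htaccess files in this directory
        # set authentication directives.
        <Directory "/usr/local/www/apache22/data/private">
            AllowOverride AuthConfig
        </Directory>

        # In /usr/local/www/apache22/data/private/.htaccess:
        AuthType Basic
        AuthName "Private area"
        AuthUserFile "/usr/local/etc/apache22/htpasswd"
        Require valid-user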

  5. Define what is meant by a Common Gateway Interface, how it is used in websites, and the methods for providing one. Discuss the advantages and disadvantages of providing this functionality on a web site.

    CGI is an older standard which allows a web server like Apache to pass request parameters to an external program and use the program’s output as the response. Before scripting languages like PHP or Perl were built straight into web server modules, this was the only way to serve dynamically generated content.

    CGI is generally not a great solution today, although it is still used. Its performance is poor due to the need to start a completely new process for each request, so CGIs generally take more memory and require more open processes on the server. Perl has a CGI module which makes writing a CGI script fairly easy.

    More common today is the use of FastCGI which requires a different interface from the external program. FastCGI keeps a number of external programs running to improve performance. Many recommend running PHP as a FastCGI program in order to take advantage of Apache’s newer multi-threaded MPM.
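
    As a sketch (paths are illustrative): a ScriptAlias directive such as ScriptAlias /cgi-bin/ "/usr/local/www/apache22/cgi-bin/" maps a URL prefix to a directory of executables, and a minimal CGI program just prints HTTP headers, a blank line, and a body.

        #!/bin/sh
        # /usr/local/www/apache22/cgi-bin/hello.cgi (illustrative; must be
        # executable). Apache passes request data in environment variables
        # such as QUERY_STRING.
        echo "Content-Type: text/plain"
        echo ""
        echo "Hello from CGI"
        echo "Query string was: ${QUERY_STRING}"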

No ZFS Support for EMC Replication Manager

As I originally blogged, I was hoping to use EMC snapshots to perform server-less/network-less backups. EMC provides two main tools for managing snapshots in this type of situation:

  • EMC Replication Manager
  • EMC PowerSnap Networker Module

The PowerSnap Module supposedly automates taking snapshots for the purpose of backups, while Replication Manager supposedly provides a much more robust package.

With Replication Manager you might create a policy to take a snapshot every five minutes, keep the last 10, and use those for backups whenever necessary.

To make a long story short, Replication Manager is useless for LUNs with ZFS. According to EMC, this won’t change in the near future. PowerSnap also has no support for taking snapshots of LUNs with ZFS on them so basically EMC has no server-less backup offerings for Solaris with ZFS.

As an IT guy, I consider ZFS the best thing that has happened to file systems in the last 10 years, and it is only getting better. ZFS is already standard in FreeBSD and NetBSD. Linux supports ZFS over FUSE due to license issues, but I’m confident those will be solved. The file system is platform independent, meaning you can move the data transparently between Intel and SPARC architectures. Deduplication has just been added to the feature set, and disk encryption is on its way.

As a Solaris admin, I really can’t figure out why EMC would decide to cut off their own foot like this. It is clear that UFS will remain for legacy and backwards compatibility but ZFS is the future. Not planning to support ZFS is like not planning to support Solaris.

The only explanation I can see is that EMC views Sun, Solaris, and ZFS as enough of a threat that they are strategically trying to limit options. For operations local to a server, ZFS has largely replaced the need for heavy hardware like EMC on the SAN. Some would argue that ZFS RAID + JBOD is better than ZFS + RAID on EMC. You can do the snapshots without the EMC. On a simple level, you can send snapshots asynchronously to another system, similar to MirrorView, without the EMC. You can do deduplication without the EMC. Now, with Sun’s Flash Cache technology, which integrates with ZFS, you can get the performance without the EMC. Along the same lines, you see Sun changing the rules of the storage/database game with solutions like Exadata V2. The integration of Zones with ZFS may be challenging VMware on the virtualization front, especially with the serious advantage Sun’s CoolThreads servers have in terms of consolidation.
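
To make the snapshot and replication point concrete, here is roughly what it looks like with plain ZFS (pool, dataset, and host names are made up):

    # Snapshot a dataset, then ship it to another machine over ssh.
    zfs snapshot tank/data@monday
    zfs send tank/data@monday | ssh backuphost zfs receive backup/data

    # Later runs only need to send the changes since the previous snapshot.
    zfs snapshot tank/data@tuesday
    zfs send -i tank/data@monday tank/data@tuesday | \
        ssh backuphost zfs receive backup/data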

That said, I still prefer to offload this work to dedicated storage hardware for the time being and probably in the future. If EMC chooses not to support ZFS, they will only force us not to buy EMC arrays. We will stop buying disks, stop buying tools, etc.

Instead, they should be providing better support for ZFS, integrating with ZFS to get better performance, and providing tools which make EMC the preferred disk array behind a ZFS filesystem.