Tag Archive for sysadmin

Sun Oracle Webcast Wrap Up

Last night I watched almost the entire five-hour live webcast announcing Oracle’s strategies regarding the Sun Microsystems acquisition. As a near-evangelist for Sun and Solaris, I’m very happy with the deal finally going through, and even happier that most of what Oracle said makes sense to me as a customer.

What I liked:

  • The clear commitment to the SPARC roadmap, especially the T series. I honestly don’t know what I would have done if the T series servers disappeared. I’m very happy that they put raising the clock speed into the roadmap, because at current clock speeds some applications just can’t be deployed on these servers.
  • The clear commitment to making waves in Enterprise Storage. NetApp was specifically mentioned and obviously the 7000 series arrays are best suited to compete with the NetApp arrays but I hope they will draw some EMC blood as well. I like the plans for integrating backup capabilities.
  • The plans to integrate really great Solaris tech, like DTrace and RBAC, into Oracle applications.
  • The plans to offer direct support. Honestly, one of the most annoying parts of working with Sun was having to work with different support providers in every location.
  • The plans to change the supply chain and ship direct- no more out of stock excuses.
  • The plans to integrate Ops Center with Oracle Enterprise Manager.
  • Larry Ellison’s stand up comedy
  • And completely unrelated- the flashing disk lights on the Exadata V2 🙂

I didn’t like:

  • The obvious cut planned for the x64 line of hardware. While they are keeping x64 where convenient (storage appliances, database machines, various other “clusters”), it looks like Oracle has no plans to treat x64 servers as a server business in their own right. I’m not a big user of the x64 stuff for servers, but Sun doesn’t really offer anything reasonable for entry level anymore except the x64 line. This brings me to my next point-
  • The SPARC roadmap is slightly sucky, as in: how much processing power do you really want inside a single box? According to the roadmap, their next plan is to double the number of cores in a T3 processor so you’ll have one CPU with 16 cores and 128 threads. They’re going to put two in a machine? Four? Here is how I see the servers they have today:
    • T1000- useless poorly designed server
    • T2000- ok server but a waste of rack space at 2RU
    • T5120- ok server but a waste of rack space considering I could put a T5140 in the same space
    • T5220- worse than the T5120 at 2RU
    • T5140- The best server ever built with exactly the right amount of everything
    • T5240- 2RU again???
    • T5440- I could serve ~8.64 billion web requests per day from one of these but I’d need a 1.6Gbit uplink and two servers for redundancy = 8RU, or else use 4 T5140 machines, deliver the same performance, and use 4RU?- maybe 5RU including n+1 redundancy.
    • NONEXISTENT – little SPARC machine for backup/monitoring/insert your SPARC-only app that doesn’t deserve a minimum of 32 threads and 2RU here.

    At some point, you just want more, smaller machines for fewer points of failure. I really have uses for low-end SPARC machines and they don’t make them anymore.

  • I don’t really like the “server phone home” idea.
  • No mention of OpenSolaris- I’m not really a user but I didn’t like that it wasn’t mentioned- What does that mean??
  • No mention of Webstack. I really like Sun Webstack as an idea. I’m not sure what is happening to it now?
  • No mention of how Oracle will be combining the knowledge bases? Sunsolve? Bigadmin? docs.sun.com? forums.sun.com (looks like this already had an Oracle makeover :?)
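
The T5440 figures in the server list above can be sanity-checked with a little arithmetic. This is only a back-of-the-envelope sketch: the rack unit sizes are the ones implied by the list (8RU for two T5440s, 4RU for four T5140s), and it assumes a POSIX shell with arithmetic expansion, such as ksh or bash.

```shell
# Back-of-the-envelope check of the T5440 numbers quoted above.
requests_per_day=8640000000         # ~8.64 billion requests per day
seconds_per_day=86400
uplink_bits_per_second=1600000000   # 1.6 Gbit/s uplink

# Sustained request rate, and the bytes per request that would fill the uplink.
requests_per_second=$((requests_per_day / seconds_per_day))
bytes_per_request=$((uplink_bits_per_second / 8 / requests_per_second))

# Rack space, using the sizes implied by the list (4RU T5440, 1RU T5140).
two_t5440_ru=$((2 * 4))
four_t5140_ru=$((4 * 1))
five_t5140_ru=$((5 * 1))            # four machines plus one for n+1

echo "requests per second:       $requests_per_second"
echo "bytes per request:         $bytes_per_request"
echo "2 x T5440 rack units:      $two_t5440_ru"
echo "4 x T5140 rack units:      $four_t5140_ru"
echo "5 x T5140 rack units:      $five_t5140_ru"
```

In other words, the estimate works out to 100,000 requests per second at roughly 2 KB each.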

One thing I’m not sure about is the integration of Sun virtualization technologies into Oracle VM. On one hand it sounds good, on the other hand, I think this was the only part of the presentation where I noticed there were no due dates. Virtualization is super important to me so I really want to know where things stand.

Obviously, it is easy to get up and say everything will integrate, but doing it is much harder. Just getting past the internal politics of this will be a major issue. Now we can only wait and see if Oracle can pull it off.

I used to get upset with “Oracle people” for always thinking that Oracle was the solution to every problem. If they pull off this acquisition, I might just become an “Oracle person” myself.

No ZFS Support for EMC Replication Manager

As I originally blogged, I was hoping to use EMC snapshots to perform server-less/network-less backups. EMC provides two main tools for managing snapshots in this type of situation:

  • EMC Replication Manager
  • EMC PowerSnap Networker Module

The PowerSnap Module supposedly automates taking snapshots for the purpose of backups, while Replication Manager supposedly provides a much more robust package.

With Replication Manager you might create a policy to take a snapshot every five minutes, keep the last 10, and use those for backups whenever necessary.
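
To make the "keep the last 10" idea concrete, here is a minimal sketch of the retention logic only. It is not Replication Manager's interface: the function name `snapshots_to_expire` and the snapshot names are hypothetical, and it assumes a POSIX awk.

```shell
# Hypothetical helper: reads snapshot names on stdin, oldest first,
# and prints the ones that fall outside the "keep the newest N" window.
snapshots_to_expire() {
  keep=$1
  awk -v keep="$keep" '
    { names[NR] = $0 }
    END { for (i = 1; i <= NR - keep; i++) print names[i] }
  '
}

# Example: twelve five-minute snapshots exist, the policy keeps ten,
# so the two oldest are reported for expiry.
i=1
while [ "$i" -le 12 ]; do
  echo "snap-$i"
  i=$((i + 1))
done | snapshots_to_expire 10
```

A real policy engine would also need to skip snapshots that are still mounted or in use by a running backup.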

To make a long story short, Replication Manager is useless for LUNs with ZFS. According to EMC, this won’t change in the near future. PowerSnap also has no support for taking snapshots of LUNs with ZFS on them so basically EMC has no server-less backup offerings for Solaris with ZFS.

Speaking as an IT guy in general, ZFS is the best thing that has happened to file systems in the last 10 years, and it is only getting better. ZFS is already standard in FreeBSD and NetBSD. Linux supports ZFS only over FUSE due to license issues, but I’m confident those will be solved. The file system is platform independent, meaning you can move the data transparently between Intel and SPARC architectures. Deduplication has just been added to the feature set and disk encryption is on its way.

As a Solaris admin, I really can’t figure out why EMC would decide to cut off their own foot like this. It is clear that UFS will remain for legacy and backwards compatibility but ZFS is the future. Not planning to support ZFS is like not planning to support Solaris.

The only possibility that I can see is that EMC sees Sun, Solaris, and ZFS as enough of a threat that they are strategically trying to limit options. For operations local to a server, ZFS has largely replaced the need for heavy hardware like EMC on the SAN. Some would argue that ZFS RAID + JBOD is better than ZFS + RAID on EMC. You can do the snapshots without the EMC. On a simple level, you can send snapshots asynchronously to another system, similar to MirrorView, without the EMC. You can do deduplication without the EMC. Now with Sun’s Flash Cache technology, which integrates with ZFS, you can get the performance without the EMC. Along the same lines, you see Sun changing the rules of the storage/database game with solutions like Exadata V2. The integration of Zones with ZFS may be challenging VMware on the virtualization front, especially with the serious advantage Sun’s CoolThreads servers have in terms of consolidation.
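
As a rough illustration of sending snapshots to another system with plain ZFS, the sketch below prints the commands rather than running them. The dataset, snapshot, and host names are hypothetical, and it assumes the receiving side already holds the previous snapshot unmodified, which an incremental receive requires.

```shell
# Dry-run sketch: prints the snapshot and incremental send/receive commands.
# All names passed in are hypothetical placeholders.
print_replication_commands() {
  dataset=$1
  previous=$2
  current=$3
  target_host=$4

  # Take the new point-in-time snapshot.
  echo "zfs snapshot ${dataset}@${current}"
  # Send only the changes since the previous snapshot to the remote pool.
  echo "zfs send -i ${dataset}@${previous} ${dataset}@${current} | ssh ${target_host} zfs receive ${dataset}"
}

print_replication_commands tank/data monday tuesday backuphost
```

Run from cron, something along these lines gives asynchronous replication, though without the consistency guarantees and management tooling of an array-level product.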

That said, I still prefer to offload this work to dedicated storage hardware for the time being and probably in the future. If EMC chooses not to support ZFS, they will only force us not to buy EMC arrays. We will stop buying disks, stop buying tools, etc.

Instead, they should be providing better support for ZFS, integrating with ZFS to get better performance, providing tools which make EMC the preferred disk array behind a ZFS filesystem.

Making Path Persistent

I’ve been paying a lot of attention to this site since I switched platforms, and somehow people are finding some fairly irrelevant content here for the search terms “making path persistent in solaris 10”, so I figured I’d better put some real answers up.

It is hard to know exactly what kind of path they had in mind- were they referring to the standard PATH variable which lists the directories in which to search for executables or were they referring to something more complicated?

You can make the executable search PATH variable persistent in several ways:

  1. On the system level you can set it in the /etc/profile file. It will affect every user whose login shell reads that file (sh, ksh, bash), except maybe root.
  2. On a per user level, or for the user root, you can set the PATH in the .profile file in the user’s home directory.
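
Either file takes the same few lines. The sketch below is one way to write them; the directory /opt/example/bin is a made-up placeholder. The assignment and the export are kept as separate statements because the traditional Bourne shell does not accept the combined `export PATH=...` form.

```shell
# Hypothetical directory used for illustration; substitute your own.
EXTRA_DIR=/opt/example/bin

# Append EXTRA_DIR only if it is not already listed, so that sourcing
# the profile more than once does not keep growing PATH.
case ":$PATH:" in
  *":$EXTRA_DIR:"*) ;;
  *) PATH="$PATH:$EXTRA_DIR" ;;
esac
export PATH
```

The change takes effect at the next login, or immediately in the current shell if you source the file with `. ./.profile`.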

Caveats on Using Snapshots for Server-less Backups

Whether you are dealing with disk I/O for reading the data from the disks, CPU for compressing or encrypting the data (or both- remember to compress and then encrypt!), or network for transferring the data to a backup server, the added load of a backup on your production servers is unwelcome. For this reason, the period of time during which backups can be made, a.k.a. the backup window, may be limited- even severely.

You may say, “It only takes me X hours to do a full backup of everything”, but over time backup windows are notorious for becoming too small. Backups get split over multiple days, technologies get upgraded, etc. When planning a backup strategy, my approach is to eliminate the backup window altogether- that is, do whatever you can to take the backup off the production hardware.

Storage snapshots are one method for taking the production servers out of the backup equation. By creating a consistent, point-in-time snapshot on your storage and mounting it on your backup server, you can back up your data using your backup server’s resources while your production servers continue as usual.

Caveats of this method in general are:

  1. Most snapshot technologies are some form of “Copy On Write”. This means that after you take a snapshot, whenever an area of the disk is about to be overwritten, its original data is first copied somewhere else for safekeeping and only then overwritten.
    • This may cause a performance hit on your production system as you are generating extra IO on every write.
    • As long as the data being used in production has not changed significantly from the snapshot, your backups will still be sending the majority of their read operations to the same physical disks being used by production so this doesn’t relieve the backup load on the storage as much as it relieves the load on the servers.
  2. Key word is “consistent”.
    • You do not want to be where KDE developers were when ext4 was released. Depending on the applications or systems you are trying to back up, you may need to “quiet” them first (FLUSH TABLES WITH READ LOCK, ALTER TABLESPACE <tablespacename> BEGIN BACKUP, etc.).
    • If your application, e.g. Oracle Database, uses datafiles or ASM spread over several LUNs, then all your storage-level snapshots probably need to be taken together in order for the DB itself to remain consistent. For more, look at “Consistency Groups.”
  3. Once you have the snapshot, your backup server needs to see the snapshot LUN and be able to mount the filesystem on the LUN. If your backup server doesn’t run the same operating system as your production servers, this may be an issue. E.g., try convincing a Windows server to mount a ZFS pool (I dare you).

Anyway- these are just some things to look out for when you want to use storage-level snapshots to back up servers without loading the production systems themselves. In another post I’ll touch on some EMC infrastructure specifics to look out for.

Webservd Default Home Directory

Someone currently building an internal development environment required some integration between servers using SSH and the webservd user.

He came to me when he saw that the default home directory for the webservd user is /. He didn’t want to create a /.ssh/authorized_keys file and I didn’t blame him. My first reaction was to change the home directory, but I didn’t want to break something, so I opened up Google and found something incredible.

DISCLAIMER: The following is quoted from documentation at docs.sun.com (emphasis is mine). I do not recommend you actually follow its instructions:

If the runtime user of the OpenSSO Enterprise web container instance is a non-root user, this user must be able to write to its own home directory.

For example, if you are installing Sun Java System Web Server, the default runtime user for the Web Server instance is webservd. On Solaris systems, the webservd user has the following entry in the /etc/passwd file:

webservd:x:80:80:WebServer Reserved UID:/:

The webservd user does not have permission to write to its default home directory (/). Therefore, you must change the permissions to allow the webservd user to write to its default home directory. Otherwise, the webservd user will encounter an error after you configure OpenSSO Enterprise using the Configurator.

Did someone actually write in documentation to give the webservd user write access to / ?!?!? What were they thinking?
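
For reference, the home directory is the sixth colon-separated field of a passwd entry. A quick way to read it, shown here against the line quoted above rather than a live system:

```shell
# The passwd entry quoted in the documentation above.
entry='webservd:x:80:80:WebServer Reserved UID:/:'

# Field 6 is the home directory; field 7 (the login shell) is empty here.
home_dir=$(printf '%s\n' "$entry" | awk -F: '{ print $6 }')
echo "webservd home directory: $home_dir"
```

This confirms that the home directory really is the root of the filesystem, which is exactly why making it writable for webservd is such a bad idea.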