Month: November 2009

Xlib: PuTTY X11 proxy: wrong authentication protocol attempted

While setting up some developers with remote SmartSVN via X over SSH using Plink, I ran into the following error:

Xlib: PuTTY X11 proxy: wrong authentication protocol attempted

SmartSVN couldn’t connect to the tunneled X server display. I was extremely confused since I’d been using X tunneling successfully with SecureCRT. After a bit of googling, it turns out that the “wrong authentication protocol attempted” part of the message is misleading. You can get this message when the client side has the wrong magic cookie, or has no cookie at all, which was apparently my case.

In my case, the developers are authenticated against Active Directory via Samba/Winbind. Their home directories do not exist until the first time they log in via SSH. When using XForwarding over SSH, the ssh daemon on the server usually handles setting DISPLAY and the authentication cookies, but here it was trying to store the cookies before the user’s home directory had been created.

With some more digging, I found the $HOME/.ssh/rc and /etc/ssh/sshrc files, which allow you to replace the standard XForwarding process with your own custom process. Paraphrased from the sshd man page:

The primary purpose of $HOME/.ssh/rc is to run any initialization routines that might be needed before the user’s home directory becomes accessible; AFS is a particular example of such an environment…

If X11 forwarding is in use, it will receive the proto cookie pair in its standard input and DISPLAY in its environment. The script must call xauth because sshd will not run xauth automatically to add X11 cookies…

This file will probably contain some initialization code followed by something similar to:

if read proto cookie && [ -n "$DISPLAY" ]; then
  if [ `echo $DISPLAY | cut -c1-10` = 'localhost:' ]; then
    # X11UseLocalhost=yes
    echo add unix:`echo $DISPLAY |
    cut -c11-` $proto $cookie
  else
    # X11UseLocalhost=no
    echo add $DISPLAY $proto $cookie
  fi | xauth -q -
fi

If this file does not exist, /etc/ssh/sshrc is run, and if that does not exist, xauth is used to store the cookie…

/etc/ssh/sshrc : Similar to $HOME/.ssh/rc. This can be used to specify machine-specific login-time initializations globally.

I pretty much cut and pasted the code from the man page, with two changes:

  1. I used the full path to the xauth binary in the second to last line.
  2. I added the process to create the user’s home directory before the xauth.
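A minimal sketch of what such an /etc/ssh/sshrc could look like, combining the man page logic with the two changes above. Several things here are assumptions rather than details from this setup: the xauth location (/usr/bin/xauth), the helper name xauth_add_line, and the plain mkdir used to create the home directory. sshd normally runs this file with the user's own privileges, so the mkdir only works if the user may write to the parent directory; otherwise replace it with whatever site-specific mechanism creates home directories.

```shell
#!/bin/sh
# Sketch of /etc/ssh/sshrc. Paths and helper names are assumptions.

XAUTH=/usr/bin/xauth    # assumed location; use the full path on your system

# Print the "add" command that xauth expects.
# $1 = DISPLAY value, $2 = protocol name, $3 = cookie
xauth_add_line() {
    prefix=`echo "$1" | cut -c1-10`
    if [ "$prefix" = 'localhost:' ]; then
        # X11UseLocalhost=yes: store the cookie under unix:<number>
        number=`echo "$1" | cut -c11-`
        echo "add unix:$number $2 $3"
    else
        # X11UseLocalhost=no: store the cookie under the display name as-is
        echo "add $1 $2 $3"
    fi
}

# Step 1 (placeholder): make sure the home directory exists before xauth
# tries to write the .Xauthority file into it.
if [ -n "${HOME:-}" ] && [ ! -d "$HOME" ]; then
    mkdir -p "$HOME"
fi

# Step 2: sshd passes "proto cookie" on standard input when X11
# forwarding is in use, and DISPLAY in the environment.
if [ -n "${DISPLAY:-}" ] && read proto cookie; then
    xauth_add_line "$DISPLAY" "$proto" "$cookie" | "$XAUTH" -q -
fi
```

Keeping the line-building logic in its own function makes it possible to check the output by hand (for example, xauth_add_line localhost:10.0 MIT-MAGIC-COOKIE-1 abcd) without touching any real .Xauthority file.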

That done, Plink was able to set up the XForwarding tunnel without a problem. I still can’t explain why Plink failed in the first place while SecureCRT had no problems with the home directories appearing later in the login process.

Additional Reading – X Protocol background:

X is a client-server protocol. The client program connects to a DISPLAY (usually defined in the similarly named environment variable) which represents the server displaying the GUI. Technically the display can refer to an X server on the same machine, on a remote machine on the same LAN, or even a server located across the Internet.

In order for a client to successfully connect to a display, the client needs to be authorized using one of host authentication, cookie authentication, or user authentication. Host authentication allows all connections to an X server from one or more hosts/IP addresses. This is extremely insecure and should not be used. User authentication requires the client to authenticate as a user (using Kerberos for example) with authorization to access the X server. The most common authentication used with X servers is cookie authentication, which basically uses a pre-shared key to authenticate clients. If your client knows the key, it gets in. If not, not.

In most cases, i.e. every Linux desktop installation, the X server and client are on the same machine, so both can easily read the cookie in the user’s home directory. In the case of a remote connection (a purely X protocol connection between client and server over the network), the user has to copy the cookie from the server side to the client side using the xauth utility. Since the advent of SSH and XForwarding, this manual process has pretty much gone out to pasture. The ssh client and ssh daemon are now mostly responsible for setting up authentication on the tunneled X connection, although in cases like the one above, administrators might have to help things along.

Issues Running Java Programs Remotely via X

Recently I started using SmartSVN running remotely on a Solaris 10 server and displaying on my Windows XP machine via Xming (the free Xserver). I quickly ran into some performance and usability issues which I hope to have solved.

  • Performance:
    1. Xming uses OpenGL for rendering by default. I added the following option to the java command line: -Dsun.java2d.opengl=true
      I’m not sure this technically made a difference but a developer claimed a 300% performance increase??
    2. Since I am working over a LAN, I disabled compression for the SSH connection being used to tunnel X and set my preferred cipher to Blowfish. Again, I’m not sure this made a difference, but theoretically it should be faster.
  • Usability:
    1. There is an apparently known issue with running Java programs via X which causes all sorts of problems in painting the windows correctly. In my case, the mouse position was being shown in one place but the menus were being highlighted about 50 pixels lower. It took me a while to find it in this post about Swing on Remote X and this related post about cursors and menus in NetBeans. Of course, after finding it in Google, I found the following comments in the SmartSVN start script:
      # If you experience problems, e.g. incorrectly painted windows,
      # try to uncomment one of the following two lines
      #export AWT_TOOLKIT=MToolkit
      #export AWT_TOOLKIT=XToolkit

      Uncommenting the export AWT_TOOLKIT=MToolkit line did the trick. Now the menus are much more responsive and the cursors show in the correct position.
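Before editing the start script, the two toolkit settings can be compared straight from the shell. This is a small sketch assuming a POSIX shell; run_with_toolkit is a hypothetical helper name, and ./smartsvn.sh is a placeholder for wherever the SmartSVN start script actually lives.

```shell
# Run a command with AWT_TOOLKIT set for that one invocation only,
# leaving the calling shell's environment untouched.
run_with_toolkit() {
    toolkit=$1
    shift
    env AWT_TOOLKIT="$toolkit" "$@"
}

# Example (the script path is a placeholder):
#   run_with_toolkit MToolkit ./smartsvn.sh
#   run_with_toolkit XToolkit ./smartsvn.sh
```

Because the variable is only set for the child process, it is easy to switch back and forth between the two toolkits and see which one paints correctly over the remote display.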

I’m not sure if the problem here is in XToolkit or in SmartSVN. I’m also not sure what this will mean in future versions of SmartSVN. Apparently JDK7 will not support MToolkit so either XToolkit has to work properly by then or SmartSVN has to be fixed to use XToolkit properly.

EMC Replication Manager in Solaris

UPDATE: No ZFS Support for Replication Manager in the near future

Storage-level snapshots can be used to run backups without directly requiring resources from the original host.

EMC Replication Manager coordinates the creation of application-consistent snapshots across all the hosts in your network. It handles scheduling the creation and expiration of snapshots, mounting and unmounting them on backup servers, etc. from a single console.

Although it is not tightly integrated into EMC Networker like the similar Networker PowerSnap module, it can be used to start a backup process after taking a new snapshot and it has the capability to manage snapshots unrelated to backups from a GUI.

While the data sheet claims support for Solaris, there are several caveats which I have run into.

  1. There is no mention of ZFS support in the data sheet and apparently, there is no support in the software either. One would expect this to be a non-question since ZFS has been part of Solaris since 2006.
  2. The data sheet is missing the word “SPARC” next to the word Solaris. There is no support for x86.

Honestly, this has put a dent in my plans since my backup server is an x86 box. I’m hoping the lack of ZFS support will work out as long as we can script any FS-specific magic we need. I don’t have the option of running something like Linux on it (just to get the software working) because I wouldn’t even be able to mount the ZFS filesystems, let alone back them up.

In the meantime, I’ll have to move my backups to a SPARC server and considering the lack of low end SPARC machines, I’ll have to allocate something way too expensive to be a backup server.

Caveats on Using Snapshots for Server-less Backups

Whether you are dealing with disk I/O for reading the data from the disks, CPU for compressing or encrypting the data (or both; remember to compress first and then encrypt!), or network for transferring the data to a backup server, the added load of a backup on your production servers is unwelcome. For this reason, the period of time during which backups can be made, a.k.a. the backup window, may be limited, even severely.

You may say, “It only takes me X hours to do a full backup of everything”, but over time backup windows are notorious for becoming too small. Backups get split over multiple days, technologies get upgraded, etc. When planning a backup strategy, my approach is to eliminate the backup window altogether; that is, do whatever you can to take the backup off the production hardware entirely.

Storage snapshots are one method for taking the production servers out of the backup equation. By creating a consistent, point-in-time snapshot on your storage and mounting it on your backup server, you can back up your data using your backup server’s resources while your production servers continue as usual.

Caveats of this method in general are:

  1. Most snapshot technologies are some form of “Copy On Write”. This means that after you take a snapshot, the data from any area written to on the disks will first be copied somewhere else for safe keeping and then be overwritten.
    • This may cause a performance hit on your production system as you are generating extra IO on every write.
    • As long as the data being used in production has not changed significantly from the snapshot, your backups will still be sending the majority of their read operations to the same physical disks being used by production so this doesn’t relieve the backup load on the storage as much as it relieves the load on the servers.
  2. Key word is “consistent”.
    • You do not want to be where KDE developers were when ext4 was released. Depending on the applications or systems you are trying to back up, you may need to quiesce them first (FLUSH TABLES WITH READ LOCK, ALTER TABLESPACE <tablespacename> BEGIN BACKUP, etc.)
    • If your application, e.g. Oracle Database, uses datafiles or ASM spread over several LUNs, then all your storage-level snapshots probably need to be taken together in order for the DB itself to remain consistent. For more, look at “Consistency Groups.”
  3. Once you have the snapshot, your backup server needs to see the snapshot LUN and be able to mount the filesystem on the LUN. If your backup server doesn’t run the same operating system as your production servers, this may be an issue. For example, try convincing a Windows server to mount a ZFS pool (I dare you).
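
The ordering implied by these caveats (quiesce, snapshot, release, then back up from the snapshot) can be sketched as a small shell driver. Every hook below is a placeholder that only prints its step name; the function names are hypothetical, and the real commands depend entirely on your application and storage array.

```shell
#!/bin/sh
# Placeholder hooks: replace each body with site-specific commands.
quiesce_app()          { echo "quiesce"; }   # e.g. put the DB in backup mode
take_snapshot()        { echo "snapshot"; }  # e.g. snap all LUNs in one consistency group
resume_app()           { echo "resume"; }    # e.g. end backup mode
backup_from_snapshot() { echo "backup"; }    # runs against the mounted snapshot

snapshot_backup() {
    quiesce_app || return 1

    # Remember whether the snapshot worked, but do not bail out yet.
    if take_snapshot; then
        snap_ok=0
    else
        snap_ok=1
    fi

    # Always release the application, even if the snapshot failed,
    # so production is never left quiesced.
    resume_app || return 1

    [ "$snap_ok" -eq 0 ] || return 1

    # Only now start the slow part; production is already back to normal.
    backup_from_snapshot
}
```

The point of the structure is that the application is only held for the few moments the snapshot takes, not for the duration of the backup. One thing to watch with MySQL: the lock taken by FLUSH TABLES WITH READ LOCK is released when the client session that issued it ends, so the quiesce and release steps have to happen inside a single session rather than as two separate client invocations.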

Anyway, these are just some things to look out for when you want to use storage-level snapshots to back up servers without loading the production systems themselves. In another post I’ll touch on some EMC infrastructure specifics to look out for.