Monthly Archives: November 2008

RAID 10 vs RAID 5: Performance, Cost, Space, and HA

DISCLAIMER: I am not a SAN storage expert but I have spent a lot of time looking into SAN storage systems from the business side and I thought I’d share some of my conclusions.

It seems that the proverbial question is how to balance the performance, cost, usable space, and availability of a storage solution. Any DBA will ask you to give him RAID 10 on small fast disks. Anyone paying the bills will ask “Why can’t I use half the disks I bought?”

I took a couple hours with your friendly neighborhood spreadsheet and did the math. I base my calculations on EMC Clariion storage and tried to follow the EMC best practices guide as much as possible.

Following the best practices, I start my calculations from a required performance level: total IOPS, read percentage, and write percentage.
Then, using the following formulas, I calculate the actual disk IOPS needed to deliver the requested performance:

  • RAID 5 (4+1 Groups)
    Disk IOPS = (Read % * Required IOPS) +
                    (Write % * RAID5 write penalty * Required IOPS)
  • RAID 10
    Disk IOPS = (Read % * Required IOPS) +
                    (Write % * RAID10 write penalty * Required IOPS)

The RAID 5 write penalty in a 4+1 RAID group is 4 while the RAID 10 write penalty is 2.
Before you even put this in a spreadsheet you know what it will tell you:

  • In a 100% Read Only environment RAID 5 and RAID 10 will give the same performance. RAID 5 may use fewer disks to do it but not necessarily.
  • In a 100% Write Only environment, RAID 5 will require twice as many disk IOPS and almost twice the number of disks.
  • Anywhere in between those two extremes, the more writes are required, the fewer disks RAID 10 needs compared to RAID 5 to reach the same performance.
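The two formulas and the three cases above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name `disk_iops` and the 2000 IOPS workload are my own choices for illustration, and the write penalties are the ones quoted in the text.

```python
# Write penalties as quoted in the text: disk I/Os generated per host write.
WRITE_PENALTY = {"raid5": 4, "raid10": 2}


def disk_iops(required_iops, read_fraction, raid_level):
    """Disk IOPS needed to serve `required_iops` host IOPS.

    read_fraction is the read share as a number between 0.0 and 1.0;
    the write share is whatever remains.
    """
    write_fraction = 1.0 - read_fraction
    penalty = WRITE_PENALTY[raid_level]
    return (read_fraction * required_iops
            + write_fraction * penalty * required_iops)


# Illustrative workload: 2000 host IOPS at three read/write mixes.
for read_fraction in (1.0, 0.7, 0.0):
    r5 = disk_iops(2000, read_fraction, "raid5")
    r10 = disk_iops(2000, read_fraction, "raid10")
    print(f"{read_fraction:.0%} read: RAID 5 = {r5:.0f}, RAID 10 = {r10:.0f}")
```

At 100% read the two results are identical, and at 100% write RAID 5 needs exactly twice the disk IOPS of RAID 10, which is where the bullets above come from.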

If we stop there, it doesn’t seem like there is any point in using RAID 5, since even in the best case scenario there is only a chance that we will use fewer disks. That is where the cost and space effectiveness issues come in.

  • Space Effective Storage Allocation

If I want 2000 IOPS, 100% Read Only, I can do that using 15 x 146GB 15k RPM disks in RAID 5 or in RAID 10. In RAID 5 I will get ~1.5TB net space while in RAID 10 I will get ~1TB.

  • Cost Effective Storage Allocation

So far, we have compared different RAID types using the same size and speed disks, and we saw that theoretically we can use fewer disks to reach the same performance but at the expense of usable disk space.

If we use bigger disks for the RAID 10, does it make up for the lost space? What effect does using RAID 10 with fewer large disks as opposed to RAID 5 with lots of smaller disks have on the cost of my solution?

That brings us back to the spreadsheet. Using the required disk IOPS we can figure out the required number of physical disks of each type. For the sake of comparison I use the following information which I found on the Internet (your mileage may vary):

  • 146GB 4GbFC 15k RPM, 140 IOPS, $1256
  • 300GB 4GbFC 10k RPM, 120 IOPS, $1348
  • 1TB 4Gb SATA II 7.2k RPM, 80 IOPS, $2088

For each of these I calculate the minimum number of physical disks required to reach the required IOPS with the required read/write profile for both RAID 10 and RAID 5. Then I figure in the RAID group sizes and calculate the usable disk space.
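As a rough sketch, that step might look like the following in Python. The rounding up to whole RAID groups (5 disks for RAID 5 4+1, 2 disks for a RAID 10 mirror pair) and the use of raw disk capacity instead of formatted capacity are my assumptions, as are the function names and the example workload; the actual spreadsheet may have handled these differently.

```python
import math


def disks_needed(required_iops, read_fraction, write_penalty,
                 iops_per_disk, group_size):
    """Smallest disk count, in whole RAID groups, that covers the disk IOPS."""
    disk_iops = (read_fraction * required_iops
                 + (1.0 - read_fraction) * write_penalty * required_iops)
    by_iops = math.ceil(disk_iops / iops_per_disk)   # disks for performance
    groups = math.ceil(by_iops / group_size)         # round up to full groups
    return groups * group_size


def usable_tb(disk_count, disk_tb, group_size, data_disks_per_group):
    """Usable space in TB, using raw capacity (an assumption)."""
    return disk_count // group_size * data_disks_per_group * disk_tb


# Illustrative workload: 2000 host IOPS at 50% read on the
# 146GB 15k RPM disk (140 IOPS per disk, from the list above).
raid5 = disks_needed(2000, 0.5, 4, 140, group_size=5)
raid10 = disks_needed(2000, 0.5, 2, 140, group_size=2)
print("RAID 5: ", raid5, "disks,", usable_tb(raid5, 0.146, 5, 4), "TB")
print("RAID 10:", raid10, "disks,", usable_tb(raid10, 0.146, 2, 1), "TB")
```

With these assumptions the mixed workload needs noticeably more RAID 5 disks than RAID 10 disks, but the RAID 5 layout also ends up with far more usable space, which is the trade-off the rest of this post is about.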

Using the prices above, I calculate the price per TB of disk space in each RAID configuration and find:

  • 146GB, RAID 5 (4+1): $11.91K/TB
  • 300GB, RAID 5 (4+1): $6.35K/TB
  • 1TB, RAID 5 (4+1): $2.87K/TB
  • 146GB, RAID 10: $19.01K/TB
  • 300GB, RAID 10: $10.15K/TB
  • 1TB, RAID 10: $4.59K/TB
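The price per TB calculation can be sketched as below. The function name is my own, and I plug in raw disk capacities, so the results come out somewhat lower than the figures listed above; my guess is that the spreadsheet used formatted capacities, but that is an assumption. What does not depend on capacity is the ratio between the two layouts: RAID 10 pays for 2 disks per disk of data and RAID 5 (4+1) pays for 5 disks per 4 disks of data, so RAID 10 costs 1.6 times as much per TB on the same disk.

```python
def price_per_tb(disk_price, disk_tb, disks_in_group, data_disks_in_group):
    """Dollars per usable TB for one RAID group."""
    group_cost = disk_price * disks_in_group
    group_usable_tb = disk_tb * data_disks_in_group
    return group_cost / group_usable_tb


# (price in dollars, raw capacity in TB) from the disk list above.
DISKS = {
    "146GB": (1256, 0.146),
    "300GB": (1348, 0.300),
    "1TB": (2088, 1.000),
}

for name, (price, tb) in DISKS.items():
    r5 = price_per_tb(price, tb, disks_in_group=5, data_disks_in_group=4)
    r10 = price_per_tb(price, tb, disks_in_group=2, data_disks_in_group=1)
    print(f"{name}: RAID 5 (4+1) ${r5 / 1000:.2f}K/TB, "
          f"RAID 10 ${r10 / 1000:.2f}K/TB, ratio {r10 / r5:.2f}")
```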

What is really interesting here is how close the 300GB RAID 10 is to the 146GB RAID 5! Is this a coincidence?

Looking at the IOPS/TB relationship and $K/IOPS, we find that the ratios are dependent on the read/write profile of the required IOPS. Given the similar Price/TB of 300GB RAID 10 and 146GB RAID 5, I look there for a price/performance/disk space sweet spot.

The following table shows the difference between 146GB RAID 5 IOPS/TB and 300GB RAID 10 IOPS/TB.
Each column represents a different Read percentage (the Write percentage is the remainder).
Negative numbers mean that for this Read percentage and IOPS requirement, RAID 10 gives more IOPS/TB of disk. Positive numbers mean that RAID 5 gives better IOPS/TB.

What you see from this is that for any read workload under 70%, you will get more IOPS/TB from 300GB 10k RPM disks using RAID 10 than you will with RAID 5 on 146GB 15k RPM disks.
Even if you hit 80%, RAID 5 will gain less than 100 IOPS over the RAID 10 configuration and you are still better off paying less for your disks; let the cache do its job. Combine all this with our previous conclusion, that the 300GB RAID 10 configuration is ~$1.75K less expensive per TB, and I say you have a winner.
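One possible way to reproduce this kind of comparison is sketched below. It is my own reading of "IOPS/TB", not necessarily the spreadsheet's: host IOPS that one RAID group can sustain at a given read share, divided by that group's usable space. Raw capacities are assumed, and the per-disk IOPS figures and write penalties are the ones quoted earlier. The sign convention follows the text: negative means RAID 10 gives more IOPS/TB.

```python
def host_iops_per_tb(read_fraction, disk_iops, disk_tb,
                     disks_in_group, data_disks_in_group, write_penalty):
    """Host IOPS one RAID group can sustain, per usable TB (raw capacity)."""
    backend_iops = disk_iops * disks_in_group
    usable_tb = disk_tb * data_disks_in_group
    # Disk I/Os generated per host I/O at this read/write mix.
    load_factor = read_fraction + (1.0 - read_fraction) * write_penalty
    return backend_iops / load_factor / usable_tb


def raid5_146(read_fraction):
    # 146GB 15k RPM disks, 140 IOPS each, RAID 5 (4+1), write penalty 4.
    return host_iops_per_tb(read_fraction, 140, 0.146, 5, 4, 4)


def raid10_300(read_fraction):
    # 300GB 10k RPM disks, 120 IOPS each, RAID 10 mirror pair, write penalty 2.
    return host_iops_per_tb(read_fraction, 120, 0.300, 2, 1, 2)


for pct in range(50, 101, 10):
    r = pct / 100
    diff = raid5_146(r) - raid10_300(r)
    print(f"{pct}% read: RAID 5 minus RAID 10 = {diff:+.0f} IOPS/TB")
```

Under these assumptions the sign flips somewhere in the high 60s of read percentage, and at 80% read RAID 5 is ahead by less than 100 IOPS/TB, which is in line with the conclusions above.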

Network Interface Utilization in Solaris

A friend asked me how he could see the network utilization in Solaris. It seems like a fairly simple request but for some reason the answer is not one simple command away.

In Linux I would instinctively go straight to iptraf. I don’t know if iptraf is the tool of choice these days but I’m pretty sure it is an apt-get away if not already installed.

If you are a DTrace wizard, you could whip something up. Maybe you could get the information from one of the DTraceToolkit scripts if they’re installed. The DTraceToolkit scripts I’ve seen seem to give too much information, as most of them concentrate on telling you not only whether the network is loaded but also what is loading it.

For the sake of practice I wrote the following script:

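#!/usr/bin/perl -w
# Print the utilization of one network interface roughly once per second.
print "Interface: ";
$if=<>;
chomp($if);
# dladm reports the link speed in Mbps; convert it to bytes per second.
$max=`dladm show-dev -p $if | awk -F= '{print \$3}' | awk '{print \$1*1000*1000/8}'`;
chomp($max);
print "Max speed: ",$max,"\n";
# Split e.g. "e1000g0" into driver "e1000g" and instance "0". The anchors
# keep digits inside the driver name from being taken as the instance.
$if=~m/^([a-z0-9]+?)(\d+)$/ or die "Cannot parse interface name: $if\n";
($module,$instance)=($1,$2);
$last_rbytes=0;
$last_obytes=0;
while(1){
# The indexing below assumes kstat prints obytes, then rbytes, then a blank line.
@kstat=`kstat ${module}:${instance}:mac:/[or]bytes\$/ |awk '{print \$2}'`;
chomp(@kstat);
if($last_rbytes!=0){
# Bytes received plus bytes sent since the last sample, as a share of $max.
printf("%02d%%\n",
(($kstat[$#kstat-1]-$last_rbytes)+
($kstat[$#kstat-2]-$last_obytes))/$max*100);
}
$last_rbytes=$kstat[$#kstat-1];
$last_obytes=$kstat[$#kstat-2];
sleep 1;
};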

This script will ask you which interface you want to watch and then print out the utilization percentage on a new row every ~second.
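The arithmetic behind each printed row can be shown on its own, without calling kstat. The Python sketch below is illustrative only: the function names are mine, and it assumes that rbytes and obytes are 32-bit counters that wrap around, which the script itself does not handle.

```python
COUNTER_MODULUS = 2 ** 32  # assumption: rbytes/obytes are 32-bit counters


def counter_delta(previous, current, modulus=COUNTER_MODULUS):
    """Bytes counted between two samples, tolerating one counter wrap."""
    return (current - previous) % modulus


def utilization_percent(prev, cur, link_mbps, interval_s=1.0):
    """Received plus sent bytes as a percentage of the link speed.

    prev and cur are (rbytes, obytes) tuples taken interval_s seconds apart;
    link_mbps is the link speed in megabits per second.
    """
    moved = (counter_delta(prev[0], cur[0])
             + counter_delta(prev[1], cur[1]))
    capacity_bytes = link_mbps * 1_000_000 / 8 * interval_s
    return moved / capacity_bytes * 100


# Two hypothetical samples one second apart on a 1000 Mbps link.
print(utilization_percent((1_000, 2_000), (12_501_000, 2_000), 1000))
```

Because received and sent bytes are added together and compared against a single link speed, a full duplex link that is busy in both directions can report more than 100% with this method.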

On a side note, it seems strange to me that the received bytes are stored in kstat as rbytes while the transmitted bytes are stored in obytes. The only answer I can come up with is that if they had chosen ibytes (in bytes) instead of rbytes, then the ‘i’ and ‘o’ might become interchanged in typos since they are next to each other on the keyboard. If they had chosen tbytes (transmitted bytes), the same situation occurs: ‘r’ is next to ‘t’. Still, as a friend pointed out, they could have used sbytes (sent bytes), which makes more sense than obytes.

Top on Solaris

Recently, I was asked to give some advice on an integration project involving some Solaris web servers. One of the parties asked to have the top command installed.
Now I know and love top for Linux, but using top on Solaris is a waste in my opinion. Solaris comes with the prstat command built in, so why use something else?
Of course he answered that top was standard for him and he was used to it, but I felt obliged to convince him otherwise, so I dug around and found some proof 🙂

Brendan Gregg wrote up a great piece comparing top vs prstat using DTrace on his website:
http://www.brendangregg.com/DTrace/prstatvstop.html.

In summary, he finds the following:

  • Top uses more system calls than prstat
  • Top opens and closes the psinfo file over and over while prstat only opens it once and saves the file handle
  • Top takes more CPU time to do its job than prstat due to the overhead in the extra system calls and code differences
  • When top uses the CPU it uses it for longer than prstat
  • Most of the issues top has compared to prstat are connected to the number of processes running on the server so the more processes running, the worse top will perform compared to prstat

Aside from the performance issues, prstat can also give you project and zone related information, which I doubt top knows about.

In short, top is great for Linux but if you are going to use Solaris, use prstat!