Software RAID on old vs. new CPUs

The Linux kernel includes several software RAID algorithms and selects whichever one is fastest on your CPU. Isn't that always the same algorithm, then? No, definitely not. Newer CPUs have additional instructions that help speed things up. And it's not just clock speed that matters: memory bandwidth plays an important role too.
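To make the "benchmark, then pick the fastest" idea concrete, here is a minimal sketch in Python. It is not the kernel's code: the two parity functions and the `pick_fastest` helper are hypothetical names invented for illustration, and the real kernel times hand-tuned MMX/SSE routines rather than anything like this.

```python
import time

def xor_bytewise(blocks):
    """Reference implementation: XOR the blocks one byte at a time."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

def xor_bigint(blocks):
    """'Wider registers': XOR whole blocks at once as big integers."""
    acc = 0
    for block in blocks:
        acc ^= int.from_bytes(block, "little")
    return acc.to_bytes(len(blocks[0]), "little")

def pick_fastest(candidates, blocks, rounds=5):
    """Time every candidate on the same data.

    Returns the name of the fastest candidate and a dict of
    measured throughputs in MB/s for all candidates.
    """
    total_mb = sum(len(b) for b in blocks) * rounds / 1e6
    results = {}
    for name, func in candidates.items():
        start = time.perf_counter()
        for _ in range(rounds):
            func(blocks)
        # Guard against a zero reading on very coarse timers.
        elapsed = max(time.perf_counter() - start, 1e-9)
        results[name] = total_mb / elapsed
    best = max(results, key=results.get)
    return best, results

if __name__ == "__main__":
    data = [bytes([n]) * 4096 for n in (1, 2, 4, 8)]
    candidates = {"bytewise": xor_bytewise, "bigint": xor_bigint}
    best, results = pick_fastest(candidates, data)
    for name, speed in results.items():
        print(f"{name:10s}: {speed:10.1f} MB/s")
    print("using function:", best)
```

Which candidate wins depends on the machine running it, which is exactly the point: the choice is measured at run time rather than hard-coded.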

  • On an old Pentium II Xeon 450 MHz, raid5 uses p5_mmx, and raid6 uses mmxx2. Software raid6 calculations are 72% slower than raid5.
  • On a Pentium IV Xeon 1.5 GHz, raid5 uses pIII_sse, and raid6 uses sse2x2. Software raid6 calculations are 12% slower than raid5.
  • On an AMD Athlon XP2000+ (1.6 GHz), raid5 uses pIII_sse, raid6 uses sse1x2. Software raid6 calculations are 42% faster than raid5.
On 64-bit systems, the relevant instructions have not changed between CPU generations so far:
  • On an AMD Athlon64 XP3400 (2.4 GHz), raid5 uses generic_sse, raid6 uses sse2x4 (raid6 44% slower than raid5).
  • On a Xeon 5160 3GHz, raid5 uses generic_sse, raid6 uses sse2x4 (raid6 15% slower than raid5).
  • Same algorithms on a Xeon X5450 3GHz (raid6 20% slower than raid5).
  • Same algorithms on a Xeon E5430 2.66GHz (raid6 18% slower than raid5).
  • Same algorithms again on a Xeon X5650 2.66GHz (raid6 15% slower than raid5).
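The algorithm names and speeds above come from the kernel log. As a rough sketch, the selection lines can be pulled out of `dmesg` output and turned into a "percent slower" figure like this. The two line formats are assumptions modelled on kernel messages of this era (newer kernels word them differently), the sample log is built from the Atom figures reported in the comments below, and the definition of "slower" used here (relative to raid5 throughput) is also an assumption, not necessarily how the percentages above were derived.

```python
import re

# Assumed log formats; adjust the patterns to what your kernel prints.
RAID5_RE = re.compile(r"raid5: using function: (\S+) \(([\d.]+) MB/sec\)")
RAID6_RE = re.compile(r"raid6: using algorithm (\S+) \(([\d.]+) MB/s\)")

def parse_selection(log_text):
    """Return {level: (algorithm, MB/s)} for each level found in the log."""
    result = {}
    for level, pattern in (("raid5", RAID5_RE), ("raid6", RAID6_RE)):
        match = pattern.search(log_text)
        if match:
            result[level] = (match.group(1), float(match.group(2)))
    return result

def percent_slower(raid5_mbs, raid6_mbs):
    """How much slower raid6 is, as a percentage of raid5 throughput.

    A negative result means raid6 was measured as faster.
    """
    return 100.0 * (raid5_mbs - raid6_mbs) / raid5_mbs

# Sample built from the Atom figures quoted in the comments.
SAMPLE_LOG = """\
raid5: using function: pIII_sse (5120.000 MB/sec)
raid6: using algorithm sse2x2 (1082 MB/s)
"""

if __name__ == "__main__":
    found = parse_selection(SAMPLE_LOG)
    for level, (algorithm, speed) in found.items():
        print(f"{level}: {algorithm} at {speed:.0f} MB/s")
    if "raid5" in found and "raid6" in found:
        diff = percent_slower(found["raid5"][1], found["raid6"][1])
        print(f"raid6 is {diff:.0f}% slower than raid5")
```

On a real system you would feed it the output of `dmesg` instead of `SAMPLE_LOG`. If a pattern does not match, the level is simply missing from the result, which is what happens on kernels that do not log the selection.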


deinoscloud said…
Very interesting!
Do you know which algorithms are used when running on an Intel® Atom™ Processor D510 1.66GHz (Dual-Core) 32Bit?

bert said…
@deinoscloud: the QNAP Linux kernel doesn't report it in dmesg, apparently. I did try to find that info, of course :-) What did you expect? ;-)

bert said…
@deinoscloud: found it: on the Atom CPU in my QNAP, raid5 uses pIII_sse, measured at 5120 MB/s, and raid6 uses sse2x2, measured at 1082 MB/s.
These are the same algorithms as on the 32-bit Pentium IV CPU I tested. Performance for raid5 is 2.5x higher for comparable clock speed, but raid6 performance is only half.
The raid5 performance is far above
Of course, being a NAS device, the CPU in the QNAP is basically free to do this, so the CPU overhead is not slowing down other tasks, as long as it doesn't become a bottleneck. And with 4 cores available to spread other tasks over, I don't see it becoming a bottleneck soon.

deinoscloud said…
Awesome. Thx for the info!

