Friday, May 29, 2009

IET to ESX multipathing FAIL

A software-implemented iSCSI initiator (in VMware ESX) on one side, the iSCSI Enterprise Target (IET, on CentOS 5) software on the other side, and a separate storage VLAN to connect the two. Works great in my test lab.
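On the IET side, such a setup needs little more than a target definition in /etc/ietd.conf. A minimal sketch, with a made-up IQN and backing device rather than the ones from my lab:

    # /etc/ietd.conf -- minimal IET target definition (example values)
    Target iqn.2009-05.lab.example:centos5.lun0
        # fileio goes through the page cache; blockio is the alternative
        Lun 0 Path=/dev/vg0/esxlun,Type=fileio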

But then I wanted to add multipathing to the setup. The plan: create a second VLAN and give both the IET server and the iSCSI client a new interface in that VLAN, on a new IP subnet. That gives the client two ways to reach the server, thereby introducing multipathing!
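On the CentOS side, the second path is just another interface file under /etc/sysconfig/network-scripts; the device name and addresses below are hypothetical examples, with the first path assumed to live in a different subnet. The ESX side gets a matching VMkernel port in the new VLAN.

    # /etc/sysconfig/network-scripts/ifcfg-eth2 -- second storage interface
    # (example device name and addresses)
    DEVICE=eth2
    BOOTPROTO=none
    ONBOOT=yes
    IPADDR=10.0.2.10
    NETMASK=255.255.255.0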

Unfortunately, IET spewed error messages at high speed when I did this: I got "kernel: iscsi_trgt: Abort Task (01) issued on tid:1 lun:0 by sid:26459747326427648 (Unknown Task)" about 4000 times per second.

Restarted everything without the second storage VLAN and without the new interfaces, and now all is well again.

Friday, May 15, 2009

Link aggregation between CentOS 5 and a SLM2024

It's been a while since I made time to try something new. This week, I finally took something off the "need to try this" list: link aggregation. I've had a gigabit Ethernet switch with link aggregation support for about a year now, and my main Linux box has 3 gigE NICs, but I was still using only one of them. Time for a change.

Google found me some good documentation for channel bonding on CentOS 5. Manually editing ifcfg-eth{0,1,2}, ifcfg-bond0, and modprobe.conf is all that's required. That worked, but the default bonding mode is "balance-rr", the simplest load-balancing algorithm. What I wanted was full IEEE 802.3ad link aggregation, mode 4 of the bonding module.
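Roughly, the configuration boils down to the snippets below. The IP address is an example; miimon sets the link-monitoring interval in milliseconds.

    # /etc/modprobe.conf -- load the bonding driver in 802.3ad mode
    alias bond0 bonding
    options bond0 mode=4 miimon=100

    # /etc/sysconfig/network-scripts/ifcfg-bond0 -- the aggregate interface
    DEVICE=bond0
    BOOTPROTO=none
    ONBOOT=yes
    IPADDR=192.168.1.10
    NETMASK=255.255.255.0

    # /etc/sysconfig/network-scripts/ifcfg-eth0 -- repeat for eth1 and eth2
    DEVICE=eth0
    BOOTPROTO=none
    ONBOOT=yes
    MASTER=bond0
    SLAVE=yes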

During testing, I got fooled into believing that "service network restart" unloads and reloads the bonding module. It doesn't; I should have tested with "service network stop; rmmod bonding; service network start" from the start. Lesson learned. I configured the switch into LACP mode (dynamic link aggregation instead of static), and I was set for some bandwidth testing.
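For the record, this is the cycle that actually picks up a new bonding mode, plus a quick check of what the driver reports:

    # restart networking with a freshly loaded bonding module
    service network stop
    rmmod bonding
    service network start

    # mode 4 shows up as "IEEE 802.3ad Dynamic link aggregation"
    cat /proc/net/bonding/bond0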

I tried a couple of different bandwidth eaters (flood ping, NFS reads), but they didn't really stress the configuration. In comes netcat: "nc -l 5555 > /dev/null" on one side and "nc myserver 5555 < /dev/zero" on the other, and you get a gigabit stream of data in no time. Using dstat and a couple of netcats, the current record stands at more than 200 MB/s. Mission accomplished!
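A single TCP stream never exceeds one physical link under 802.3ad, so it takes several netcat pairs (spread over different hosts or ports, depending on the transmit hash policy) to push the aggregate past gigabit speed. A sketch of the test; the second port number is an arbitrary addition:

    # receiving side: one listener per stream, discarding the data
    nc -l 5555 > /dev/null &
    nc -l 5556 > /dev/null &

    # sending side: one zero-filled stream per listener
    nc myserver 5555 < /dev/zero &
    nc myserver 5556 < /dev/zero &

    # watch the aggregate throughput on the bond
    dstat -n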