An introduction to Channelbonding can be found here.
Description
Why?
Round-robin is the only bonding mode that can deliver more than the speed of a single interface for a single TCP connection. But it does not work as advertised. The problem is the switches, which implement a different transmission policy for a trunk/etherchannel: to prevent out-of-order delivery, a MAC-based policy is used. Thus traffic from one node is always received on the same single port/interface of the other node.
One can work around this by using two or more (depending on the number of interfaces bonded together) physically separated networks. This way the switch does not need to know about the channel bonding and has only one port assigned per MAC address.
With more than 2 interfaces per node in a bond, this setup becomes very complex.
A possible solution: using VLANs instead of physically separated networks.
Design
Read the VLAN description if you are not familiar with VLANs.
%IMAGE{"BondingVLANs2.jpg|thumb|300px|Diagram: Mixed Environment 1 and 2 nics"}%
%IMAGE{"BondingVLANs4.jpg|thumb|300px|Diagram: Mixed Environment 1,2 and 4 nics"}%
n interfaces/node
For n interfaces per node the layout is quite simple: n VLANs are created on the switch. On each physical interface a virtual interface is created for a unique VLAN (the same VLANs as on the switch). Then these virtual interfaces are bonded together.
Finally the switch is configured to deliver tagged packets of a single VLAN only to the single network port of the node which has a VLAN interface for that VLAN.
This way a packet sent over interface 'b' is always received at interface 'b' of the target node! As packets are spread over all available interfaces on transmission, packets arrive on all interfaces on the receive side. The speed should scale linearly with the number of interfaces. For each group of 'b' interfaces a kind of direct 'pipe' is created.
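As a minimal sketch for n = 2, assuming VLAN IDs 10 and 20 (chosen here for illustration) and a bond0 device that has already been set up as described in the Setup section below:
#vconfig add eth0 10
#vconfig add eth1 20
#ifenslave bond0 eth0.10 eth1.20
On the switch side, VLAN 10 is then delivered tagged only to the ports connected to the eth0 interfaces, and VLAN 20 only to the eth1 ports.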
Mixed Environments
Physically separated networks would not be flexible enough to accommodate nodes with different numbers of interfaces.
Using VLANs, different interface configurations are not a problem (as long as the number of interfaces is even or 1).
If n is the maximum number of interfaces per node on the network, n VLANs are created on the switch.
- For a node with n interfaces the configuration is the same as above.
- On a node with fewer interfaces, the n VLANs of the switch are evenly distributed over the physical interfaces, and these virtual interfaces are bonded together afterwards. This way more than one VLAN interface shares a single physical link, but as traffic is delivered equally to all of them, performance will not suffer. Benefit: the server can use its maximum bandwidth to serve multiple nodes at the same time, as traffic is spread equally over all available pipes.
- On a node with a single link, all VLANs are created on top of the same physical interface (see the sketch below). This causes some overhead, but the overall network performance is improved by equal usage of all pipes.
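As an example of the single-link case, assuming n = 4 with hypothetical VLAN IDs 10-40, all four VLANs are stacked on the one physical interface:
#vconfig add eth0 10
#vconfig add eth0 20
#vconfig add eth0 30
#vconfig add eth0 40
#ifenslave bond0 eth0.10 eth0.20 eth0.30 eth0.40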
The design is illustrated in the two graphics on the right.
Notes
- All interfaces in the bond share the same MAC address, so all VLAN interfaces have the same MAC address. As the VLAN interfaces sit on top of the physical interfaces, the MAC addresses of the physical devices remain unchanged. To receive packets for the VLAN devices (which all carry the MAC address of one physical interface), the physical interfaces have to switch to promiscuous mode, in which packets addressed to other MAC addresses are no longer dropped by the driver. The filtering task then moves to the kernel, largely increasing the overhead. To prevent this behaviour, all physical interfaces have to inherit the MAC address of the first physical device.
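A one-line sketch of this workaround (00:11:22:33:44:55 is a placeholder for eth0's real MAC address):
#ifconfig eth1 hw ether 00:11:22:33:44:55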
Setup
Linux
Setting up a working bond device using virtual interfaces is not trivial. There are many settings that have to be applied in the right order.
Calling sequence
- Clean up!
- Load the bonding module: #modprobe bonding mode=0 miimon=100
- Check if eth0 is configured via DHCP
- Get the IP and MAC address of eth0
- Change eth1's MAC address to eth0's
- Change bond0's MAC address to eth0's
- Configure bond0's IP and MAC address according to eth0's and bring it up
- Create the VLANs: #vconfig add eth0 10 to #vconfig add eth1 40
- Add the virtual interfaces to the bond: #ifenslave bond0 eth0.10 to eth1.30
The system should now be working.
Script
The script's tasks:
- Get the IP address of eth0; it will be used to generate the bond's IP.
- Get the MAC address of eth0.
- Configure bond0 and eth1.
- Create the VLANs.
- Attach the VLAN interfaces to the bond.
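A rough sketch of such a script, assuming two NICs (eth0/eth1), the VLAN IDs from the calling sequence above, and that the bond simply reuses eth0's address; the netmask is a placeholder:

#!/bin/sh
# Sketch: bond two NICs via per-link VLAN interfaces (round-robin mode)
modprobe bonding mode=0 miimon=100   # round-robin, link check every 100 ms
modprobe 8021q                       # 802.1Q VLAN tagging support

# Get eth0's IP and MAC address (net-tools ifconfig output format)
IP=$(ifconfig eth0 | awk '/inet /{print $2}' | cut -d: -f2)
MAC=$(ifconfig eth0 | awk '/HWaddr/{print $5}')

# Let eth1 inherit eth0's MAC so no interface needs promiscuous mode (see Notes)
ifconfig eth1 hw ether $MAC

# Move the address to bond0 and bring everything up
ifconfig eth0 0.0.0.0 up
ifconfig eth1 0.0.0.0 up
ifconfig bond0 hw ether $MAC
ifconfig bond0 $IP netmask 255.255.255.0 up

# Create the VLAN interfaces, two per physical link
vconfig add eth0 10
vconfig add eth0 20
vconfig add eth1 30
vconfig add eth1 40

# Attach the VLAN interfaces to the bond
ifenslave bond0 eth0.10 eth0.20 eth1.30 eth1.40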
Solaris
As we were not able to get either VLANs or round-robin working on the Thumper, no setup info is available right now.
Problems
- Out-of-order delivery (solutions: a bigger TCP window, better reordering in the stack)
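A sketch of this kind of tuning with standard Linux sysctls (the values are illustrative, not measured recommendations):
#sysctl -w net.ipv4.tcp_reordering=20 (tolerate more reordering before assuming packet loss)
#sysctl -w net.core.rmem_max=16777216
#sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216" (min/default/max receive buffer in bytes)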
TO DO
Optimization
TO DO
Visit the NFS page for a first performance analysis.
The following values have to be checked for a performance analysis:
- Interrupts generated by out-of-order delivery
- Kernel CPU usage (VLAN tagging + bonding module)
- Memory usage (increased TCP buffers for better performance)
- Network throughput
- RTT, latency
First Overview
Performance compared to a single link:
| Configuration | Application | CPU sys | Memory usage | Interrupts/s | Throughput | RTT |
| 1 Link | netperf (receive side) | ~3% | | ~30k | 940 Mbit/s | 100 µs |
| 1 Link | NFS-mounted ramdisk (client) | | | | 120 MB/s | |
| 2 Links (bond + VLANs) | netperf (receive side) | ~10% | | ~45k | 1.90 Gbit/s | 130 µs |
| 2 Links (bond + VLANs) | NFS-mounted ramdisk (client) | | | | 230 to 245 MB/s | |
There seems to be somewhat more kernel load when using channel bonding and VLANs. The throughput scales linearly with the number of links. Interrupts seem to be a minor problem.
The CPU percentage is measured for the whole system; 100% means all CPUs are busy.
Measurement tools:
- top (kernel CPU usage: the 'sys' value)
- vmstat 1 (interrupts and context switches, as well as memory load)
- ifstat (traffic)
- netperf (synthetic TCP performance)
- dd reading from an NFS-mounted RAM disk (real TCP performance)
- ping for a first RTT impression
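Example invocations (the address 192.168.1.1, the mount point and the file name are placeholders; netperf needs a running netserver on the remote node):
#vmstat 1
#ifstat 1
#netperf -H 192.168.1.1
#dd if=/mnt/nfs/testfile of=/dev/null bs=1M
#ping -c 10 192.168.1.1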
In general it is better not to use tools such as vmstat, but to read the information directly from the system statistics located in the /proc filesystem (especially /proc/stat, /proc/interrupts and /proc/meminfo).
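For example:
#head -1 /proc/stat (aggregate CPU time in user/nice/system/idle jiffies)
#grep eth /proc/interrupts (per-interface interrupt counters)
#grep -e MemFree -e Buffers /proc/meminfo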