This is Corona time, and Corona drives virtualization like nothing did before. I recently had to fix issues with remote access. Usually, the system had to handle around 1,500 users, but now the number of users had increased to 15,000, ten times as many. The existing MPX-11500 could not handle all these connections; latency was way too high.
Our customer needed a quick resolution, as every day cost money. Our first suggestion, of course, was to reduce the strength of encryption, but they didn’t agree. So we needed to find something else.
Another constraint: we must not interfere with currently connected users, and there was no service window.
A quick and dirty solution. With obstacles
This customer had two Citrix ADC MPX boxes, set up as an HA pair: one active, the other passive. Active/passive wastes 50% of the hardware. On the other hand, we saw reasonably good performance on the active box with up to 8,000 users. Two boxes at roughly 8,000 users each should cover the 15,000 users, so we decided to go active/active.
The downside of active/active
Active/active is not redundant if you take a closer look. The performance will go down dramatically if one of these boxes goes down. That’s why we usually avoid active/active. It is not a long term solution, but hey, it’s corona time, far from normal. It had to be a quick solution, so reliability was a minor concern.
The Citrix ADC cluster?
To be 100% honest: I hate clusters! The main reason: a cluster supports almost all features, but not all of them, and it’s hardly used, so hardly tested. In practice, this means you always lack at least one feature you need, and another one won’t work as expected. So I didn’t consider going for a cluster.
The 2nd method?
This is a load-balanced solution: two Citrix ADCs (NetScalers), load-balanced by “something” else. In this case, it’s a firewall with (limited) load-balancing capabilities.
So there are three steps:
- “Divorcing” this HA pair
- setting up load balancing (not covered here, it’s a matter of the firewall)
- setting up the 2nd box
“Divorcing” a Citrix ADC (NetScaler) HA pair?
You might think it’s pretty easy. Just open Citrix ADC’s (NetScaler’s) GUI, navigate to System → HA, click the secondary box, and click remove.
A damn bad idea. Doing so would bring the former secondary box into an active state immediately. All SNIPs and VIPs would be duplicated, and IP address conflicts would break current connections. So that’s not an option.
We created an ssh connection to the active box and executed
from the command line. This gives us a complete list of IP addresses currently in use. We have to find new IPs for all SNIPs and VIPs.
Next, we downloaded the current ns.conf from the passive Citrix ADC (NetScaler). (If you don’t know where it is: /flash/nsconfig/ns.conf.)
While we leave the NSIP in place (set ns config -IPAddress 10.0.20.100 -netmask 255.255.255.128), we have to replace all the other addresses. I use search and replace for tasks like that, as search and replace catches every single occurrence.
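If you prefer a script over an editor, the search and replace could be sketched roughly like this. It is only an illustration under assumptions: the addresses in IP_MAP, the sample config lines, and the helper name remap_ips are made up for this example and are not from the customer’s ns.conf. The one design point worth copying is that whole addresses are matched, so 10.0.20.10 is never replaced inside 10.0.20.101, and everything happens in a single pass, so a new address is never replaced a second time.

```python
import re

# Hypothetical mapping: address used by the active box -> new, unused address
# for the former secondary box. All values are examples only.
IP_MAP = {
    "10.0.20.101": "10.0.20.111",  # a SNIP (assumed)
    "10.0.20.10": "10.0.20.20",    # a VIP (assumed)
}
NSIP = "10.0.20.100"  # the NSIP stays untouched

# A dotted quad that is not glued to further digits or dots on either side.
IPV4 = re.compile(r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?!\.?\d)")


def remap_ips(config_text, ip_map, keep=()):
    """Return (new_text, unmapped).

    unmapped holds every address that was neither replaced nor listed in
    keep. Netmasks and gateways show up there too; review the list by hand.
    """
    unmapped = set()

    def swap(match):
        ip = match.group(0)
        if ip in ip_map:
            return ip_map[ip]
        if ip not in keep:
            unmapped.add(ip)
        return ip

    return IPV4.sub(swap, config_text), unmapped


# Illustrative sample lines, not a real configuration.
sample = "\n".join([
    "set ns config -IPAddress 10.0.20.100 -netmask 255.255.255.128",
    "add ns ip 10.0.20.101 255.255.255.128 -type SNIP",
    "add lb vserver web HTTP 10.0.20.10 80",
    "add route 0.0.0.0 0.0.0.0 10.0.20.1",
])
new_text, unmapped = remap_ips(sample, IP_MAP, keep={NSIP})
print(new_text)
print("review by hand:", sorted(unmapped))
```

The list of unmapped addresses is the useful part: anything in there that is a SNIP or VIP was forgotten in the mapping and would cause a conflict after the reboot.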
Next, I have to search for MAC addresses. There may be MAC addresses in an ns.conf. One of the best things about Notepad++ is searching with regular expressions. So I set the search mode to Regular Expression and the search string to a regex matching a MAC address. It does not need to be an exact match, so \w\w:\w\w:\w\w (groups of 2 word characters, separated by colons) would be sufficient.
Typically, you find MAC addresses on link aggregation channels, but there may be more use cases, such as virtual MAC addresses.
set channel LA/1 -lamac 01:23:45:67:89:ab -throughput 0 -lrMinThroughput 0 -bandwidthHigh 0 -bandwidthNormal 0
In my example, remove -lamac 01:23:45:67:89:ab. Not doing so leads to duplicated MAC addresses with unpredictable, but rather odd, results!
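The same check can be scripted. The following is a minimal sketch, assuming the ns.conf has been read into a string; the helper names lines_with_mac and strip_lamac are hypothetical. It uses a stricter pattern than the loose editor search (six hex pairs instead of three groups of word characters) and only removes the -lamac option; any other line containing a MAC address is just reported so you can decide by hand.

```python
import re

# Six colon-separated hex pairs, e.g. 01:23:45:67:89:ab.
MAC = re.compile(r"\b(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}\b")

# The -lamac option, its value, and the blanks in front of it.
LAMAC_OPTION = re.compile(r"[ \t]+-lamac[ \t]+(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}\b")


def lines_with_mac(config_text):
    """Return (line_number, line) for every line that contains a MAC address."""
    return [
        (number, line)
        for number, line in enumerate(config_text.splitlines(), start=1)
        if MAC.search(line)
    ]


def strip_lamac(config_text):
    """Remove every '-lamac <mac>' option and leave the rest of the line as is."""
    return LAMAC_OPTION.sub("", config_text)


channel_line = (
    "set channel LA/1 -lamac 01:23:45:67:89:ab -throughput 0 "
    "-lrMinThroughput 0 -bandwidthHigh 0 -bandwidthNormal 0"
)
print(lines_with_mac(channel_line))
print(strip_lamac(channel_line))
```

Run lines_with_mac once more on the result; if it still reports lines, those MAC addresses need a manual decision.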
The divorce of a pair of Citrix ADCs (NetScalers)
Now we are done with creating a new ns.conf, and we can upload it to the secondary Citrix ADC (NetScaler). And now, let’s reboot the box.
We now have two independent Citrix ADCs (NetScalers), using different IPs and not interfering with each other.
Wait, there is something missing:
Don’t forget to go to the former active box and then to: System → HA, click the “unknown” (formerly secondary) Citrix ADC (NetScaler), and click delete.
Now it’s finished. Should not take longer than an hour. Should not affect any of the active user sessions.
The last step is setting up load balancing on your firewall. Don’t forget about persistence, probably based on source IP (SSL session ID would be possible as well), or user sessions will be randomly disconnected. This has to be done on your firewalls or routers.
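To show why persistence matters, here is a small sketch of source-IP persistence. It is not how any particular firewall implements it (many keep a session table instead of hashing); the backend names and the function pick_backend are assumptions made up for this illustration. The point is only that the same client address always ends up on the same box, so an established session is never sent to the Citrix ADC that knows nothing about it.

```python
import hashlib
import ipaddress

# Hypothetical names for the two now independent boxes.
BACKENDS = ["adc-1", "adc-2"]


def pick_backend(source_ip, backends=BACKENDS):
    """Map a client address to one backend, the same one on every call."""
    packed = ipaddress.ip_address(source_ip).packed
    digest = hashlib.sha256(packed).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


# The same client is always sent to the same box.
client = "203.0.113.7"
print(client, "->", pick_backend(client), pick_backend(client))
```

Without such a rule, a plain round robin would send every second connection of a user to the other box, which is exactly the random disconnect described above.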
Hi Johannes, in this case, why did you not consider GSLB?
(maybe because it was faster to go to Firewall load balancing?)
Hi Makunouchi, GSLB won’t help in most cases, as it would need two IPs on the outside.
GSLB is basically a trick about DNS, isn’t it? It would distribute the load across 2 IPs. Sure, if you have plenty of IP addresses, GSLB would be good. However, most companies don’t have that many IPs. IPs are expensive. (I currently own 14 IPs and have to pay 50 € per month. I host a NetScaler test and lab environment at https://wonderkitchen.network with IPs 126.96.36.199-48, together with some websites and other services.) That’s quite costly!
My customer didn’t have a second IP, so I had to find a way to deal with just one IP. And load balancing from the firewall was a quick and dirty solution. They removed this solution after some weeks, as they bought bigger boxes.