I was recently asked to teach Citrix SD-WAN. My first thought was: wtf? I asked Google, and Google, knowing everything, spoke to me in infinite wisdom: Citrix SD-WAN’s previous name is Branch Repeater. And Branch Repeater, as I already knew, was once the new name for WanScaler (a product I have been certified on, but never used in real life). Meanwhile the product has been rebranded again and is now called Citrix NetScaler SD-WAN. So is it just a rebranded product?
WanScaler once was a great product: it cached WAN traffic and thereby prevented content from traversing the WAN multiple times. “Compression rates” of 1:100,000 were possible, and probably still would be, if … Yes, if we didn’t encrypt everything nowadays. Caching and encryption don’t go together well, never did and never will. That’s why I haven’t heard much about WanScaler or CloudBridge recently. It has its reason to exist, mainly in ICA environments, but it never became a big success.
But I do what people want me to do, so I started reading up on it. I built my own test environment consisting of 2 SD-WANs, 2 WanEms, a server and a client. And I found absolutely thrilling information! It is not just the 3rd re-branding of a product with a very limited area of application; instead it is a brand new approach to WAN: Software Defined WAN, SD-WAN.
There are two different types of appliances: the classical WanScalers, Repeaters, or whichever name you want to use (SD-WAN WO [WAN optimization]), and the brand new software defined WAN type (SD-WAN SE [Standard Edition]). In addition there is a blended version, called NetScaler SD-WAN Enterprise of course, which is worth thinking about!
What’s so totally new about it?
I will just focus on software defined WAN (SD-WAN SE). SD-WAN is hyped nowadays. Gartner says:
By end of 2019, 30% of enterprises will use SD-WAN products in all their branches, up from less than 1% today.
That’s an ambitious forecast! And Citrix is right there, one of just 5 solutions currently on the market. Gartner:
Organisations looking for WAN optimization or dynamic selection capabilities should consider this vendor, especially when Citrix applications are also present
What does “dynamic selection capabilities” mean?
It’s all about finding out where to send packets to. Still not clear?
Current deployments:
Usually we have an MPLS connection to branches. MPLS is fast, has low jitter and is reliable; guaranteed SLAs of 99.9% are usual (this means: less than 1 hour down per month), and in real life European SLAs will be even much higher. It’s very common to bundle MPLS with a mobile (LTE/4G) connection in active/passive mode, as a last mile outage due to construction work is very likely to happen. This will result in an SLA of 99.999%, meaning: 5 minutes of downtime per year. In addition we usually also have Internet connections in place (with much lower SLAs of about 98-99%, 7 to 15 hours down per month).
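If you want to check these downtime numbers yourself, here is a minimal sketch of the arithmetic. The 30-day month and the function name `downtime_hours` are my own assumptions for illustration, nothing here comes from Citrix or Gartner.

```python
HOURS_PER_MONTH = 30 * 24   # assumption: a 30-day month
HOURS_PER_YEAR = 365 * 24   # assumption: a non-leap year


def downtime_hours(availability_percent: float, period_hours: float) -> float:
    """Hours of downtime an SLA allows within the given period."""
    return (1 - availability_percent / 100) * period_hours


if __name__ == "__main__":
    mpls = downtime_hours(99.9, HOURS_PER_MONTH) * 60
    bundle = downtime_hours(99.999, HOURS_PER_YEAR) * 60
    inet_best = downtime_hours(99, HOURS_PER_MONTH)
    inet_worst = downtime_hours(98, HOURS_PER_MONTH)
    print(f"MPLS 99.9%:          {mpls:.1f} minutes per month")
    print(f"MPLS + LTE 99.999%:  {bundle:.1f} minutes per year")
    print(f"Internet 98-99%:     {inet_best:.1f} to {inet_worst:.1f} hours per month")
```

This gives roughly 43 minutes per month for 99.9%, a bit more than 5 minutes per year for 99.999%, and 7.2 to 14.4 hours per month for 98-99%.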
What’s wrong about it?
It’s as simple as that: we have 3 connections, one costly, one moderate and one cheap. And we only use the costly one for WAN transfer. If we need to upgrade (Gartner speaks about a 15% increase of WAN traffic per year, so there are always upgrades coming up), we need to upgrade the most costly one. It’s a damn expensive solution.
Can we make things better?
Yes we can! SD-WAN is a solution that allows all three of these connections to be used at the same time. You’re thinking of link load balancing? You’re totally wrong! Keep on reading.
A Citrix NetScaler SD-WAN is a virtual WAN solution. Tracerouting your WAN from inside, you would just see a single hop, not two of them, so technically it has to be considered a tunnel. It’s a tunnel aggregating all three of these connections. But the really important thing is: it’s a UDP based tunnel using UDP port 4980.
What’s great about a UDP based tunnel? It’s just a less reliable tunnel, isn’t it?
Yes and no. UDP is not reliable at all. That’s true. UDP does not have connections like TCP has. No sequence numbers, no acknowledgement numbers. If we need reliability, we have to add it on a higher layer. However, using a stateless protocol, we can send one packet of a single TCP connection on MPLS and the next one on GSM. Even more: a packet and its acknowledgement don’t even need to use the same connection. And it’s easy to avoid congestion, because we can dynamically move away from a link if its quality is decreasing.
And that’s what it is: We have a tunnel between data centre and branch office. And the tunnel intelligently and dynamically selects the best matching connection for a certain kind of traffic. We can hardly predict where a packet will flow. It’s policy based, so we can assign certain quality requirements to each kind of traffic.
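To make “adding reliability on a higher layer” a bit more tangible: if packets of one stream travel over different links, they will arrive out of order, so the receiving end has to put them back in sequence. The following is only a sketch of that general idea. The class `ReorderBuffer` and the per-packet sequence number are my own assumptions for illustration; I don’t know what the Citrix tunnel header really looks like.

```python
class ReorderBuffer:
    """Releases payloads in sequence order, whatever link they arrived on."""

    def __init__(self, first_seq: int = 0):
        self.next_seq = first_seq   # next sequence number we may deliver
        self.pending = {}           # seq -> payload, waiting for a gap to close

    def receive(self, seq: int, payload: str) -> list:
        """Store one packet; return every payload that is now deliverable."""
        if seq < self.next_seq or seq in self.pending:
            return []               # already delivered or already buffered
        self.pending[seq] = payload
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready


if __name__ == "__main__":
    buf = ReorderBuffer()
    # Packet 1 took the fast link and overtook packet 0 on the slow one.
    print(buf.receive(1, "second"))   # nothing deliverable yet
    print(buf.receive(0, "first"))    # both are released, in order
```

A real implementation would also need timeouts and retransmission for packets that never show up; this sketch leaves that out.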
Asymmetric connections? What about our firewall?
It’s all based on UDP. There is no connection on layer 4. Our sessions are layer 7 only, and a firewall is an L3/4 device. The only thing you have to do: allow UDP 4980 originating from all branches to traverse your firewall!
Which connection is the best one?
It depends (this is the universal answer of every architect to all kinds of questions).
Think about ICA? It’s the one with the least latency. User experience is very sensitive to latency. Jitter? We don’t care much about it, as long as we can keep latency below a certain value.
SIP (VOIP telephony)? Latency is not a big issue. Latency above a certain amount will make our phone calls less interactive, but everything below 200ms would be fine. Instead it’s jitter we have to care about. Jitter distorts spoken words, so they are harder to understand. And we are concerned about packet loss of course, as packet loss has an even stronger impact on intelligibility than jitter.
TCP based videos? Well, neither latency nor jitter is a problem, as we usually buffer videos. Even packet loss is not a big problem, as long as it doesn’t exceed a certain value. Instead it’s mere bandwidth. Videos nowadays fill up our costly WAN links, causing congestion and packet loss.
File transfer? While browsing directories, latency clearly is an issue, but this is not true for the upload or download of files. WanScalers always had their method of dealing with these latency issues, and it didn’t disappear in the Enterprise or WO version. Again, it’s just bandwidth. However, we would prioritize it a bit higher than video, as less speed immediately impacts user experience.
So every type of connection has its own, very special, requirements. Pure prioritization won’t be sufficient.
All our connections are pooled into the so-called SD-WAN network connection. Our NetScaler SD-WAN box will continuously monitor all possible connections and select the best matching one for every TCP stream, even for every single TCP packet. And even more: We have no need for symmetry. We could send an ICA packet on MPLS while we get the acknowledgement packet over the internet! So asymmetric connections (like ADSL or cable TV) are no longer an issue.
I did a short survey among my customers: SIP traffic is not increasing very much. ICA traffic is increasing slowly, while bulk traffic, mainly HTTP and videos, is exploding. Gartner calculates with annual growth of 15% continuing until 2019. So why upgrade our expensive MPLS connections and not use cheap internet instead?
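Here is how I picture policy based selection, as a rough sketch only. The names `Link`, `Policy` and `pick_link`, the policy table and all the measurement values are made up for illustration; the only number taken from above is the 200 ms latency limit for SIP. The real appliance surely does something more sophisticated.

```python
from dataclasses import dataclass


@dataclass
class Link:
    name: str
    latency_ms: float
    jitter_ms: float
    loss_pct: float
    free_mbps: float


@dataclass
class Policy:
    prefer: str                           # the metric this traffic cares about most
    max_latency_ms: float = float("inf")  # hard limit, if there is one


# Hypothetical policies, one per traffic type discussed above.
POLICIES = {
    "ica": Policy(prefer="latency_ms"),
    "sip": Policy(prefer="jitter_ms", max_latency_ms=200),
    "video": Policy(prefer="free_mbps"),
    "file": Policy(prefer="free_mbps"),
}


def pick_link(links, policy):
    """Pick the link that currently matches the policy best."""
    usable = [l for l in links if l.latency_ms <= policy.max_latency_ms]
    candidates = usable or links          # fall back if no link meets the limit
    if policy.prefer == "free_mbps":
        return max(candidates, key=lambda l: l.free_mbps)
    return min(candidates, key=lambda l: getattr(l, policy.prefer))


if __name__ == "__main__":
    links = [
        Link("mpls", latency_ms=20, jitter_ms=2, loss_pct=0.0, free_mbps=5),
        Link("internet", latency_ms=15, jitter_ms=12, loss_pct=0.5, free_mbps=80),
        Link("lte", latency_ms=45, jitter_ms=8, loss_pct=1.0, free_mbps=20),
    ]
    for traffic, policy in POLICIES.items():
        print(traffic, "->", pick_link(links, policy).name)
```

Because the decision is made again for every packet with fresh measurements, an ICA session in this sketch moves from the internet link to MPLS as soon as the internet latency climbs above the MPLS latency.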
How often do we measure latency?
At least every 50 ms on an idle connection. The SD-WAN protocol sends measurement information with each and every packet it transmits. So the busier your connection is, the more measurement data we transmit.
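How such a stream of samples could be turned into a current latency and jitter value is shown in the sketch below. I don’t know the algorithm Citrix uses. I simply borrowed two well-known smoothing rules: the 1/8 gain TCP uses for its smoothed round trip time (RFC 6298) and the 1/16 gain RTP uses for interarrival jitter (RFC 3550). The class name `LinkMonitor` is hypothetical.

```python
class LinkMonitor:
    """Keeps smoothed latency and jitter estimates for one link."""

    def __init__(self, alpha: float = 1 / 8):
        self.alpha = alpha        # smoothing gain for latency (as in RFC 6298)
        self.latency_ms = None    # smoothed latency, unknown until first sample
        self.jitter_ms = 0.0      # smoothed jitter (RFC 3550 style)
        self._last_sample = None

    def update(self, sample_ms: float) -> None:
        """Feed one latency sample, e.g. one per received packet."""
        if self.latency_ms is None:
            self.latency_ms = sample_ms
        else:
            variation = abs(sample_ms - self._last_sample)
            self.jitter_ms += (variation - self.jitter_ms) / 16
            self.latency_ms += self.alpha * (sample_ms - self.latency_ms)
        self._last_sample = sample_ms


if __name__ == "__main__":
    monitor = LinkMonitor()
    for sample in (20, 21, 19, 60, 62, 61):   # made-up samples in ms
        monitor.update(sample)
        print(f"sample={sample} latency={monitor.latency_ms:.1f} jitter={monitor.jitter_ms:.1f}")
```

With one sample every 50 ms on an idle link, and many more on a busy one, such estimates follow a degrading link quickly, which is the whole point.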
Security
Sending data over the internet is a risky thing. Citrix NetScaler SD-WAN uses 256-bit AES IPsec. No data will traverse the internet unencrypted. The crypto library supports NSA Suite B; I think this is sufficient.
There is another aspect. If I want to hack into a connection, I need to pick up every single packet of this very data stream. I face dramatically more overhead if a single packet is missing. How can I collect all packets, if packets are distributed randomly over several very different connections, provided by different providers? This would even be challenging for these intelligence services all over the world! How could they ever reassemble a TCP stream? In my opinion this is a strong plus on security, even a plus over MPLS.
Resilience
We continuously measure the quality of a connection. If we see any parameter changing (latency, jitter, congestion, bandwidth), we will immediately change our assessment. The internet can have both lower latency and lower jitter than MPLS. But this might change within milliseconds. It’s important to react. SD-WAN does so immediately!
SIP (VOIP) is very sensitive to both packet loss and jitter. At the same time it’s not a top bandwidth waster. We could therefore easily duplicate data over two or more lines and send it simultaneously. The data arriving first will be forwarded to the user, the other packets will be discarded. This reduces the risk of packet loss and, at the same time, reduces both jitter and latency. This is another great feature of Citrix NetScaler SD-WAN, and it is turned on by default for SIP. Isn’t it a great opportunity for our VOIP calls?
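The “first copy wins” idea can be sketched in a few lines. Again this is only an illustration under my own assumptions: the class `Deduplicator`, the function `deliver_first_copies`, the sequence numbers and all arrival times are made up, and a real implementation would forget old sequence numbers instead of remembering them forever.

```python
class Deduplicator:
    """Forwards the first copy of each packet and drops later copies."""

    def __init__(self):
        self.seen = set()   # sequence numbers already forwarded

    def receive(self, seq: int, payload: str):
        if seq in self.seen:
            return None     # a copy already arrived over another link
        self.seen.add(seq)
        return payload


def deliver_first_copies(arrivals):
    """arrivals: (arrival_ms, link, seq, payload); lost copies are simply absent."""
    receiver = Deduplicator()
    delivered = []
    for arrival_ms, link, seq, payload in sorted(arrivals):
        if receiver.receive(seq, payload) is not None:
            delivered.append((seq, link, arrival_ms))
    return delivered


if __name__ == "__main__":
    arrivals = [
        (20, "mpls", 1, "voice-1"), (15, "internet", 1, "voice-1"),
        (41, "mpls", 2, "voice-2"),   # the internet copy of packet 2 got lost
        (62, "mpls", 3, "voice-3"), (58, "internet", 3, "voice-3"),
    ]
    for seq, link, arrival_ms in deliver_first_copies(arrivals):
        print(f"packet {seq} delivered from {link} at {arrival_ms} ms")
```

In this made-up run packet 2 survives although one copy was lost, and packets 1 and 3 are delivered from whichever link happened to be faster at that moment.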
There is a great video on YouTube about resilience. It’s a marketing video, taken at Synergy 2016, but I like it, as it truly shows how it works.
More chances to save costs
We currently use MPLS because it is secure, available and offers guaranteed SLAs. With Citrix NetScaler SD-WAN we don’t need to worry about security, as it’s built into SD-WAN. However, the SLAs of internet connections are not the best. So what about using three internet connections at the same time instead of MPLS: three connections based on very different technologies, such as a GSM based, a cable based and a DSL based one? This would offer SLAs above MPLS and, at the same time, be far less costly. I think evaluating this would be worthwhile!
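A quick back-of-the-envelope check of this idea, as a sketch: if the three links fail independently of each other, we are only down when all three are down at the same time. Independence is a strong assumption of mine (one excavator can cut several cables in the same duct), which is exactly why the links should use very different technologies. The function name `combined_availability` is hypothetical.

```python
from math import prod


def combined_availability(availabilities):
    """Availability of parallel links, assuming they fail independently."""
    return 1 - prod(1 - a for a in availabilities)


if __name__ == "__main__":
    for single in (0.98, 0.99):
        combined = combined_availability([single] * 3)
        print(f"3 links at {single:.0%} each: {combined:.4%} combined")
```

Under that assumption even three links at 98% each end up at about 99.9992%, well above the 99.9% usually guaranteed for MPLS.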
Links
A few weeks ago I spoke to guys from the Danish government who are evaluating Citrix NetScaler SD-WAN. They are very interested in SD-WAN, mainly because they already have a department using it: Danish AgriFish. It’s all about costs and random disruption of ICA connections, and it works perfectly well. AgriFish is enthusiastic about it, and other governmental authorities will follow. I link this (Citrix) success story here. I’m sceptical about success stories (never trust statistics you didn’t fake yourself), but the guys I met face to face could confirm this AgriFish one, so I can trust it!