With Exchange 2010 come many advantages in the HA realm. One of them is that clients now connect to the Client Access Server for RPC. This means that when a Mailbox Server does a *over (a failover or a switchover), the user stays connected to their RPC endpoint. You can also create a Client Access Array, which load balances that RPC endpoint across your CAS Servers. Lots of information on the RPC Client Access Server here and here. So what options are available for load balancing this new RPC Client Access Array and, at the same time, load balancing all our other services? And what are the pros/cons of each method? If you want to know, read on…
Exchange Load Balancing Options
In Exchange 2007, if you wanted any type of HA, you needed at least four servers: 2 for the CCR nodes and 2 for the HUB/CAS nodes. The reason you could not get by with only two servers is that CCR nodes were limited to the Mailbox role, and for an Exchange site to be operational, it must always contain at least the HUB, CAS, and MBX roles. In Exchange 2010, more options are available. You now have something called Database Availability Groups (DAGs). DAG members can hold multiple Exchange roles (HUB/CAS/MBX/UM) but still may not hold the Edge Transport role.
There is a problem, though. A Windows limitation prevents you from installing Windows Network Load Balancing on a server that also runs Failover Clustering, which every DAG member does. So while we can now get by with only 2 Exchange 2010 Servers, we need another way to load balance the CAS role to provide High Availability for the following CAS services:
- Outlook Web App (formerly Outlook Web Access) (HTTP Traffic)
- Exchange Control Panel (HTTP Traffic)
- Exchange Web Services (HTTP Traffic)
- Exchange ActiveSync (HTTP Traffic)
- Autodiscover (HTTP Traffic)
- Offline Address Book (HTTP Traffic)
- Outlook Anywhere (HTTP Traffic)
- RPC Client Access (RPC Traffic)
There are a few options for load balancing. The first is to use ISA. The problem here is that ISA can only load balance HTTP-based traffic. If you take a look at the bulleted list above, you can see that the RPC Client Access Service is RPC traffic, which means ISA cannot load balance it. That leaves us with a few load balancing options:
- 2 Multi-Role DAG Members and Hardware Load Balancers – Utilize 2 Multi-Role DAG Members (MBX/HUB/CAS). Use a hardware load balancer to load balance all of the bulleted items above, including the RPC Client Access Service via an RPC Client Access Array, which load balances port 135 for the RPC Endpoint Mapper plus the dynamic RPC ports (1024-65535). Since the goal is High Availability, you would most likely want 2 hardware load balancers so the load balancer itself is not a single point of failure.
- 2 DAG Members, 2 HUB/CAS Servers, and Windows Network Load Balancing – Utilize 2 DAG Members (MBX). Use 2 HUB/CAS Servers with Windows Network Load Balancing. Windows Network Load Balancing will load balance all of the bulleted items above, including the RPC Client Access Service via an RPC Client Access Array, which load balances port 135 for the RPC Endpoint Mapper plus the dynamic RPC ports (1024-65535).
- 2 DAG Members and DNS Round Robin – Use 2 Multi-Role DAG Members (MBX/HUB/CAS). Use DNS Round Robin to achieve a “poor man’s” type of load balancing. With this scenario, you will not have automatic failover for the RPC Client Access Service. You essentially create two A records for the RPC Client Access Array: one pointing to the first multi-role DAG member and one pointing to the second. You will most likely want to lower the TTL on these DNS records to 5 minutes so that if a failure does happen, you can remove one of the A records and the clients will flush their DNS cache within 5 minutes.
- 2 DAG Members, ISA/TMG/UAG, and either Hardware Load Balancing or DNS Round Robin – Use 2 Multi-Role DAG Members (MBX/HUB/CAS). Use ISA/TMG/UAG to load balance all the HTTP items from the bulleted list above. The issue is that in Exchange 2010, users connect to the Client Access Server for their RPC endpoint for mailbox access. To make this redundant, we create an RPC Client Access Array, which can be load balanced through a hardware load balancer, DNS Round Robin, or Windows Network Load Balancing. ISA/TMG/UAG cannot load balance non-HTTP traffic, so you can still use it to load balance all HTTP traffic, but you would still need a hardware load balancer, DNS Round Robin, or Windows Network Load Balancing for the RPC Client Access Array. The example picture below shows the use of UAG mixed with a hardware load balancer.
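Whichever option you choose, the mechanics for the RPC Client Access Array are the same: the array’s FQDN resolves to something that forwards each connection to a Client Access Server believed to be healthy. A minimal sketch in Python (server names are made up, and real load balancers probe health far more thoroughly than this):

```python
# Toy model of health-aware round robin for a two-member CAS array.
# The hostnames below are illustrative, not from any real deployment.
CAS_SERVERS = ["cas01.contoso.local", "cas02.contoso.local"]

def next_healthy_server(healthy, state={"i": 0}):
    """Round-robin over only the servers currently marked healthy.
    `state` is a mutable default used as a simple call counter."""
    candidates = [s for s in CAS_SERVERS if healthy.get(s)]
    if not candidates:
        raise RuntimeError("no Client Access Server available")
    state["i"] += 1
    return candidates[state["i"] % len(candidates)]

health = {"cas01.contoso.local": True, "cas02.contoso.local": True}
first, second = next_healthy_server(health), next_healthy_server(health)
assert first != second  # with both healthy, connections alternate

health["cas01.contoso.local"] = False  # a health probe fails
assert next_healthy_server(health) == "cas02.contoso.local"
```

The last two lines show the property DNS Round Robin lacks: when a probe marks a server down, new connections immediately stop going to it.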
Exchange Load Balancing Options and their benefits
Taking a look at the above list, we can choose among several options, including Windows Network Load Balancing, Hardware Load Balancing, and DNS Round Robin. Each has its pros and cons in terms of cost and functionality.
Hardware Load Balancing
Hardware Load Balancers have the best functionality from a client-to-server affinity perspective, depending on the vendor used. For example, we can use multiple affinity methods and fall back to another method if our preferred one fails. We might set up our hardware load balancers to use the following affinity methods, in order of preference:
- Existing Browser-Based Cookie
- Cookie created by the Hardware Load Balancer
- SSL Session ID
- Source IP
The goal here is to make sure that every user is load balanced evenly and that automatic failover can occur quickly and smoothly.
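As a rough sketch of that fallback chain, the logic looks something like the following (the request field names here are invented for illustration and do not correspond to any vendor’s actual configuration):

```python
# Hypothetical affinity fallback chain for a hardware load balancer.
SERVERS = ["cas01", "cas02"]

def pick_server(request, session_table):
    """Try each affinity method in order of preference; fall back to the
    next method whenever the current one yields no usable key."""
    for method in ("app_cookie", "lb_cookie", "ssl_session_id", "source_ip"):
        key = request.get(method)
        if key is not None:
            # First request with this key pins it to a server; later
            # requests carrying the same key land on the same CAS.
            if key not in session_table:
                session_table[key] = SERVERS[len(session_table) % len(SERVERS)]
            return session_table[key]
    return SERVERS[0]  # no affinity information at all

sessions = {}
a = pick_server({"lb_cookie": "abc"}, sessions)
b = pick_server({"lb_cookie": "abc"}, sessions)          # same cookie, same CAS
c = pick_server({"ssl_session_id": "sess-9"}, sessions)  # falls through to SSL ID
assert a == b
```

The point of the ordering is resilience: if a client strips cookies, the balancer can still keep affinity via the SSL session ID, and failing that, via the source IP.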
Windows Network Load Balancing
Windows Network Load Balancing does not offer functionality as good as a Hardware Load Balancer’s from a client-to-server affinity perspective. There is only one affinity method: Source IP. The downside shows up when many connections arrive from a single NAT’d source IP: because Source IP is the only affinity method available, all of those connections end up hitting the same Client Access Server.
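A toy model makes the NAT problem concrete (the hash below is a stand-in, not WNLB’s actual algorithm, but it has the same defining property: one source IP always maps to one server):

```python
# Sketch of source-IP-only affinity, WNLB-style.
SERVERS = ["cas01", "cas02"]

def wnlb_pick(source_ip):
    # Stand-in hash: deterministic mapping from source address to server.
    return SERVERS[sum(source_ip.encode()) % len(SERVERS)]

# 500 clients behind one NAT device all present the same source IP...
nat_choices = {wnlb_pick("203.0.113.10") for _ in range(500)}
assert len(nat_choices) == 1  # ...so every one of them hits the same CAS
```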
Most likely, if you don’t need more than 8 CAS Servers, Windows Network Load Balancing will suffice for your needs. It’s cheap, comes with Windows, and does its job.
ISA Server, TMG, or UAG
ISA/TMG/UAG Servers do have more capabilities than Windows Network Load Balancing. The one downside is that they cannot load balance RPC traffic. Because of that, you can still use ISA/TMG/UAG to load balance your HTTP traffic, but you’ll still need a Hardware Load Balancer or Windows Network Load Balancing to load balance your RPC Client Access Array.
ISA/TMG/UAG scales better than Windows Network Load Balancing but not as well as a Hardware Load Balancer, and it costs less than a Hardware Load Balancer but more than Windows Network Load Balancing. ISA/TMG/UAG can also use load-balancer-created cookies as well as Source IP affinity, depending on the protocol it is publishing.
Another upside to using ISA/TMG/UAG is pre-authentication. If the server to which a client has affinity goes down, ISA/TMG/UAG still holds the user’s authentication context and automatically re-authenticates against the new Client Access Server.
DNS Round Robin
DNS Round Robin scales just as high as Hardware Load Balancing because connections go directly to the Client Access Servers. If anything, it scales the highest, since nothing sits in the middle handling the connections. It’s also free to use! But in this case, free is not necessarily good, because you lose a lot of functionality. Hardware Load Balancers, Windows Network Load Balancing, and ISA/TMG/UAG can all detect a server failure, automatically stop sending traffic to that server, and direct it to a server that is operational.
DNS Round Robin has no automatic server failure detection. If a host goes down, an administrator has to notice, remove the DNS A/HOST record for the failed server, and then clients must wait for the TTL on the old record to expire before they begin connecting to an operational server. So you save a lot of money going with this option, but you lose all automation and gain downtime instead.
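That TTL-bounded recovery can be sketched as a small simulation, using a simplified caching resolver and the 5-minute TTL suggested earlier (the resolver model is deliberately minimal and not how any real stub resolver is implemented):

```python
# Why DNS Round Robin recovery is bounded by the record TTL: a client
# caches whichever A record it received until the TTL expires, even if
# that server has since failed and its record has been removed.
TTL = 300  # seconds: the 5-minute TTL suggested above

class CachingResolver:
    def __init__(self):
        self.cached = None  # (address, expiry_time)

    def resolve(self, zone_records, now):
        if self.cached and now < self.cached[1]:
            return self.cached[0]       # still within TTL: reuse the cache
        address = zone_records[0]       # simplistic "round robin" pick
        self.cached = (address, now + TTL)
        return address

records = ["cas01", "cas02"]
client = CachingResolver()
assert client.resolve(records, now=0) == "cas01"

# t=60s: cas01 fails and the administrator deletes its A record...
records.remove("cas01")
# ...but the client keeps using the cached (dead) address until the TTL expires.
assert client.resolve(records, now=60) == "cas01"
assert client.resolve(records, now=301) == "cas02"  # cache expired: recovers
```

With a 5-minute TTL, the worst-case window of downtime for a given client is roughly the TTL plus however long it takes the administrator to notice and pull the record.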