I ran into an issue with two OCS 2007 R2 Front End Servers behind an F5 load balancer. After we configured the Communicator 2007 R2 Response Group Tab, our Communicator clients kept getting “This service is temporarily unavailable”. For those who are unfamiliar with this tab, it is a web-based extension to the Communicator interface that allows users to log in and out of groups. For more information about the Response Group Tab, click here.
If we take a look at the F5 documentation for OCS 2007 R2 here, we see the following configuration requirements for the Response Group Service (RGS):
This says that our client connects to the load balancer over 5071 TCP in order to communicate with the RGS. This is incorrect! 5071 TCP is indeed used for Response Group Service communication, but only for Front End to Front End server communication. The RGS has something called a Match Making service. The piece on the client is just a website that communicates with the load balancer over port 443. So, the million dollar question… why do we get a service unavailable?
When you’re dealing with the RGS, each Front End Server has a Match Making service. From the TechNet documentation:
Each Front End Server has a Match Making service, which is an internal service that is responsible for queuing calls and finding available agents. Only one Match Making service per pool is active at a time–the others are passive. If a Front End Server with the active Match Making service becomes unavailable, one of the passive Match Making services becomes active. The Response Group Service does its best to make sure that call routing and queuing continues uninterrupted. However, there may be instances when active calls are lost as a result of the transition. Any calls that are in transfer when the service transition occurs are lost. If the transition is due to the Front End Server going down, any calls currently being handled by the active Match Making service on that Front End Server are also lost.
The Match Making service is what uses 5071 TCP. But as stated earlier, this is only for Front End Server to Front End Server communication. Our Front End Servers need to be able to communicate with each other without traversing the load balancer. This means that each server must be able to query DNS, get the IP of the other server, and then communicate with that IP directly over 5071. This is the key to why we were encountering the issue.
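The check described above (resolve the peer's name, then connect to the resulting IP on 5071) can be sketched in a few lines of Python run from one Front End Server. This is only a minimal sketch: the function names are my own, and the peer FQDN in the comment is a made-up placeholder. Note that a successful TCP connection does not prove the traffic bypassed the load balancer; the useful part is seeing which IP the name actually resolves to.

```python
import socket

RGS_PORT = 5071  # Match Making port discussed in this post


def resolve_ipv4(hostname):
    """Return the sorted, unique IPv4 addresses this machine resolves hostname to."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def can_connect(ip, port=RGS_PORT, timeout=3.0):
    """Return True if a TCP connection to ip:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe_peer(hostname, port=RGS_PORT, timeout=3.0):
    """Map each address hostname resolves to onto whether the port is reachable."""
    return {ip: can_connect(ip, port, timeout) for ip in resolve_ipv4(hostname)}


# Example call (hypothetical FQDN, replace with the other Front End Server):
# print(probe_peer("fe02.example.local"))
```

If the address that comes back is not on the same segment as the local Front End IP, the traffic is likely leaving through the default gateway, which in our case was the F5.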
Sometimes, depending on the environment, servers behind load balancers will have multiple IPs assigned: one for connectivity from the load balancer and another for other server operations such as management. These Front End Servers had their default gateways set to the F5, and the Front End IPs that were used for the F5 were on different segments. The problem here is that when one Front End tried to communicate with the other, it would query DNS, get the IP, and route to the F5. The F5 would then route it back, but the receiving Front End Server saw the traffic coming from the load balancer and treated it as an unauthorized server for RGS requests. This is why Communicator would see the service as unavailable.
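To make the segment problem concrete, here is a small sketch using Python's standard `ipaddress` module. The addresses and the /24 prefix are invented for illustration and are not the ones from our environment. Two hosts that are not in the same network have to hand their traffic to a gateway, which is how the F5 ended up in the middle of the conversation.

```python
import ipaddress


def same_segment(ip_a, ip_b, prefix_len):
    """Return True if both addresses fall in the same network for this prefix length."""
    net_a = ipaddress.ip_interface(f"{ip_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{ip_b}/{prefix_len}").network
    return net_a == net_b


# Hypothetical Front End IPs on different /24 segments: traffic goes via the gateway.
print(same_segment("10.0.1.11", "10.0.2.12", 24))  # False

# Hypothetical Front End IPs on the same /24 segment: traffic is delivered directly.
print(same_segment("10.0.1.11", "10.0.1.12", 24))  # True
```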
There are a few ways to fix this issue:
- Modify the hosts file on each Front End Server so they communicate with the correct IPs, which are on the same segment.
- Rework your load balancing configuration so each Front End Server uses only one IP, the one the load balancer sends traffic to, and make sure the Front End IPs can talk to each other directly.
- Modify DNS so all traffic destined for the FQDN of a Front End Server goes directly to the Front End IP that is on the same segment as the other Front End IP.
- If you must keep both Front End Servers on separate subnets and have them route through the load balancer, modify the load balancer, if possible, so the requests appear to come from the original host that sent them instead of from the load balancer.
When it comes down to it, you just need to make sure that when one Front End Server talks to another, the traffic appears to come from that Front End Server instead of the load balancer, so that it is seen as an authorized host for RGS requests over 5071 TCP.
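As a way to picture that last point, here is a toy model of a source-address check. To be clear, this is not how OCS actually implements its authorization, and all of the addresses are made up. It only illustrates why a request that arrives with the load balancer's address as its source looks like it came from an unknown host.

```python
import ipaddress


def is_authorized_source(source_ip, pool_member_ips):
    """Toy check: is the connection's source address one of the pool members?"""
    source = ipaddress.ip_address(source_ip)
    return any(source == ipaddress.ip_address(member) for member in pool_member_ips)


# Hypothetical addresses for the two Front End Servers and the F5.
pool_members = ["10.0.1.11", "10.0.2.12"]
f5_address = "10.0.1.1"

# Request that arrives with the peer's own address as its source: accepted.
print(is_authorized_source("10.0.2.12", pool_members))  # True

# Same request after the load balancer replaced the source with its own address: rejected.
print(is_authorized_source(f5_address, pool_members))  # False
```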