I seem to encounter this issue quite often and felt the topic warrants a dedicated blog post. The basic point of this post is that you cannot have a default gateway on more than one NIC of a multihomed server! Well, technically you can, but things won’t work correctly. I am not saying that you cannot have multiple Default Gateways on a single NIC; that is quite possible, and Windows will assign metrics so that one Default Gateway is given priority over another, which provides redundancy. What I am saying is that you should not have a Default Gateway on one NIC and then assign another Default Gateway on a different NIC.
Any time I have seen multihomed servers (OCS Edge, Exchange Edge, ISA, etc.) malfunctioning, the first thing I’ll do is run ROUTE PRINT. Quite often, I’ll see several lines that display:
0.0.0.0
0.0.0.0
0.0.0.0
0.0.0.0
That instantly tells me that multiple Default Gateways are assigned. You should only be seeing one line with 0.0.0.0. The entire point of a Default Gateway is that it is the last resort for where to send a packet when no more specific route matches. Now with that in mind, does it make any sense to have multiple last resorts? No!
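As a rough sketch of the check described above, the snippet below scans route-table text for 0.0.0.0 rows and flags the case where they sit on more than one interface. The sample table, the 10.10.10.1 and 192.168.200.1 gateways, and the function names are all made up for illustration; real ROUTE PRINT output has more sections, so treat this as a starting point rather than a finished parser.

```python
# Hypothetical excerpt of the IPv4 route table (all addresses are illustrative).
SAMPLE = """\
Network Destination        Netmask          Gateway        Interface  Metric
          0.0.0.0          0.0.0.0       10.10.10.1     10.10.10.100     10
          0.0.0.0          0.0.0.0    192.168.200.1  192.168.200.100     10
       10.10.10.0    255.255.255.0          On-link     10.10.10.100    266
    192.168.200.0    255.255.255.0          On-link  192.168.200.100    266
"""


def default_routes(route_table_text):
    """Return (gateway, interface) pairs for every 0.0.0.0/0 row."""
    found = []
    for line in route_table_text.splitlines():
        fields = line.split()
        # A route row has five columns: destination, netmask, gateway,
        # interface, metric. The header splits into six words and is skipped.
        if len(fields) != 5:
            continue
        destination, netmask, gateway, interface, _metric = fields
        if destination == "0.0.0.0" and netmask == "0.0.0.0":
            found.append((gateway, interface))
    return found


def has_conflicting_gateways(route_table_text):
    """True when default routes exist on more than one interface."""
    interfaces = {iface for _gw, iface in default_routes(route_table_text)}
    return len(interfaces) > 1


if __name__ == "__main__":
    for gateway, interface in default_routes(SAMPLE):
        print(f"default route via {gateway} on {interface}")
    print("conflict:", has_conflicting_gateways(SAMPLE))
```

Note that the conflict check compares interfaces, not gateways, so several Default Gateways on the same NIC would not be flagged.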
So please, put the Default Gateway on only one NIC. For OCS, I typically put it on the Access Edge NIC. For Exchange Edge/ISA, I put it on the Internet-facing NIC. OK, so you may be thinking: my external router doesn’t allow RDP traffic, so how am I going to manage my box from the inside if the RDP packets will be blocked at the external firewall? What I always do on an Edge Server (and you should also be doing this on any multihomed DMZ/Edge Server, including ISA) is create static routes so that any traffic destined for your internal network goes out through your internal NIC. It’s essentially creating a fake Default Gateway on your Internal NIC that applies only to specific subnets (your internal subnets).
So let’s say you’re setting up an OCS Edge Server and it has 4 NICs:
Access Edge – 10.10.10.100/24 (DMZ Subnet) – Default Gateway Assigned here
Web Conferencing Edge – 10.10.10.101/24 (DMZ Subnet)
Audio / Video Edge – 10.10.10.102/24 (DMZ Subnet)
Internal NIC – 192.168.200.100/24 (Internal Network)
So how can we get all internal traffic to go out directly through the Internal NIC even though the Default Gateway is assigned to the Access Edge? As stated before, we’ll create a static route. Let’s say your internal router is 192.168.200.1. We’ll create a static route using the following syntax:
route add 192.168.200.0 mask 255.255.255.0 192.168.200.1 -p
So anything destined for the 192.168.200.x network (because the mask is 255.255.255.0) will be routed to the gateway at 192.168.200.1. And Windows is smart enough to see that 192.168.200.1 is on the same subnet as your 192.168.200.100 NIC and to assign that NIC as the interface the traffic should go out of. Problem solved!
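To see why the static route wins over the Default Gateway, here is a minimal sketch of most-specific-match route selection using Python’s standard ipaddress module. It is a simplification: it ignores metrics and on-link routes, the 10.10.10.1 DMZ gateway is an assumed address that does not appear in the example above, and the function names are hypothetical.

```python
import ipaddress

# (destination network, gateway) pairs. 10.10.10.1 is an ASSUMED DMZ gateway.
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), ipaddress.ip_address("10.10.10.1")),
    (ipaddress.ip_network("192.168.200.0/24"), ipaddress.ip_address("192.168.200.1")),
]

# Two of the local NICs from the example: Access Edge and Internal.
NICS = [
    ipaddress.ip_interface("10.10.10.100/24"),
    ipaddress.ip_interface("192.168.200.100/24"),
]


def pick_gateway(destination):
    """Return the gateway of the most specific route that contains destination."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, gw) for net, gw in ROUTES if dest in net]
    # The 0.0.0.0/0 route matches everything, so matches is never empty here.
    _net, gateway = max(matches, key=lambda pair: pair[0].prefixlen)
    return gateway


def outgoing_nic(gateway):
    """Return the address of the local NIC whose subnet contains the gateway."""
    for nic in NICS:
        if gateway in nic.network:
            return nic.ip
    return None


def next_hop(destination):
    """Return (gateway, local NIC address) as strings for a destination."""
    gateway = pick_gateway(destination)
    return str(gateway), str(outgoing_nic(gateway))


if __name__ == "__main__":
    print(next_hop("192.168.200.50"))  # internal destination
    print(next_hop("8.8.8.8"))         # anything else falls to the default route
```

An internal destination matches both routes, but the /24 is more specific than 0.0.0.0/0, so it goes to 192.168.200.1 and leaves through the Internal NIC.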
Now what if you have a bunch of internal subnets that have similar address ranges? Simple! Supernet your internal networks!
route add 192.168.0.0 mask 255.255.0.0 192.168.200.1 -p
This supernet basically says: send anything that’s 192.168.x.x to the 192.168.200.1 gateway (only the first 2 octets are matched since you’re using a mask of 255.255.0.0, otherwise known as /16). And again, Windows is smart enough to see that 192.168.200.1 is on the same subnet as your 192.168.200.100 NIC and to assign that NIC as the interface the traffic should go out of. So if you have a 192.168.200.x, a 192.168.199.x, or a 192.168.198.x network, all those packets will route to the 192.168.200.1 router, which will then send each packet to the appropriate subnet. Problem solved!
And the -p stands for persistent. It means that the static route will survive a reboot.
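If you want to double-check which addresses the supernet route above covers before adding it, a quick sketch with Python’s standard ipaddress module looks like this. The test addresses are arbitrary examples and the helper name is hypothetical.

```python
import ipaddress

# Netmask form as used in the route command; equivalent to 192.168.0.0/16.
SUPERNET = ipaddress.ip_network("192.168.0.0/255.255.0.0")


def covered(address):
    """True if the supernet route would match this destination address."""
    return ipaddress.ip_address(address) in SUPERNET


if __name__ == "__main__":
    print("prefix length:", SUPERNET.prefixlen)
    for addr in ("192.168.198.7", "192.168.199.20", "192.168.200.50", "10.10.10.100"):
        print(addr, "covered" if covered(addr) else "not covered")
    # An individual internal subnet is contained in the supernet.
    print(ipaddress.ip_network("192.168.199.0/24").subnet_of(SUPERNET))
```

The DMZ address is not covered, so traffic to it is unaffected by the supernet route.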
All the above applies to ISA as well. Let’s say you’re doing LDAPS authentication which uses port 636. Your external router may not allow 636. So by creating the static route to your internal network, the LDAPS traffic won’t be going through your external router and be blocked. It instead will go through your internal router which would most likely be allowing it as Internal Routers are more relaxed in their restrictions.
One thing to take into consideration: if you are in an environment where Default Gateways are assigned to all NICs and you reconfigure your server properly with a Default Gateway on only one NIC, some traffic will start taking a different path. Make sure that any services on your server that reach out over the Internet, such as remote backup, are allowed through on the ports they require, or those services will start failing.
Good to know. Thanks for following up Kamil.
Hi Elan,
I have found the reason for my problem. You need to enable the weak host model on Windows 2008 to make Edge work properly:
http://technet.microsoft.com/en-us/magazine/2007.09.cableguy.aspx
Yep, I have done this with OCS R2 Edge as well on Server 2008. Not sure what may be causing it. I have seen a forum thread where people configuring NLB on Server 2008 required gateways on both NICs for things to work properly but didn’t have to on Server 2003. I wonder if there’s some sort of connection there where if something is a specific way on Server 2008 then it will work fine where if something is another way it won’t work fine. Not sure, I’d have to have that issue myself to troubleshoot and see what’s going on.
Hi Elan,
I have the Edge R2 version set up on Windows 2008 with four NICs: three public IPs and one internal.
When I set up only one gateway on the Access Edge NIC and add routes to the internal networks through the Internal NIC, traffic from/to the Access Edge and from/to the Internal NIC works fine, but the Web Conferencing Edge and A/V Edge do not work at all. Live Meeting does not start externally; I must have default gateways set up on all public NICs to start Live Meeting externally. Have you checked your setup on Windows 2008?
Again, thanks for your time and discussion here. I’ll post back if I learn anything else!
You probably already know what I’m about to say, but figured I’d say it anyways. It’s in regards to your Cluster not detecting what site it’s in. One thing to keep in consideration is that even though with Server 2008 you can have 4 subnets, both the subnets that the CMS will use (your corporate subnets for the CMS) must be part of the same Site (Stretched AD Site).
So if you have 10.10.10.2/24 for your Node1 in Datacenter A and 10.10.20.2/24 for your Node2 in Datacenter B, the 10.10.10.x and 10.10.20.x subnet have to be a part of the same AD Site. For your private NICs, they don’t have to really be a part of the same AD Site but will obviously have to route to each other.
No, I’m not with PSS and don’t work for Microsoft. Sorry. And ya, I do agree an article on Multi-Subnet Clusters would be a good article.
Thanks for your response.
You work with PSS, right? Perhaps you could take a look at SRX090114600383. I just glanced at your SCC link, and what it’s lacking is Server 2008’s ability to use routable public and private networks. So in a 2 node cluster, you could have 4 subnets. I could really use a good example of a properly configured SCC multi-subnet cluster (routes and addresses used on each interface).
You most likely understand the IPEnableRouter registry key better than I do, as you’ve been going through this whereas I’ve never had the need for it. I do know that enabling IPEnableRouter allows packets generated outside of your server to take advantage of your routing table (that is, the server forwards them between its interfaces), whereas with it off only packets generated on the local system itself can use the routing table.
The issue with the 2 NICs enabled causing issues does sound a bit odd. I do have my Server 2008 SCC article which will show you how I configured my Cluster. But… it’s nothing special. I have seen in Andy Grogan’s blog that he ran into an issue before where NLB just wouldn’t work until both NICs had the Default Gateway assigned. In the case of a Cluster/NLB, it wouldn’t be as big of an issue as an Edge Server because both Default Gateways should be hitting internal routers that allow the same type of traffic and as long as both Default Gateways know how to route to each of the subnets, it “should” be fine. Because of this, I renamed the title of the article from “Default Gateways and Multihomed Boxes” to “Default Gateways and Multihomed Edge Boxes.”
The Server 2008 Cluster article I alluded to above can be found here.
Can you describe when to use the IpEnableRouter registry key? There are conflicting articles that discuss this being enabled (and disabled) in 2003, but that it is disabled in 2008.
I have a situation where a multi-subnet cluster requires two NICs for redundant communication with each other. The cluster isn’t properly detecting the AD site when both NICs are enabled, and I’m having a hard time figuring out the right combination of routes, use of the IpEnableRouter entry and the default gateway for each. My environment works fine when the 2nd NIC is disabled, but this obviously fails the network validation as it is no longer redundant.
I have another scenario as well. I’ve got nodes that communicate with each other via NICs with 128x addresses, but these nodes also receive traffic from the internet via a content switch connected to NIC2 using a 10x address. I got this working by assigning the Default GW on the 128 NIC and enabling the IpEnableRouter key. But I’m not sure this is the right way to approach it. Furthermore, I think this only worked because the switch both NICs were connected to shared the same Layer 2 space (VLAN). If I select static routes over a default gateway, how does the server decide which NIC to use? If 128x and 10x are directly connected, that’s an easy one. But what if the destination is 70x? Both NICs need to be able to reach this location depending on whether it is traffic through the content switch or directly to the node.