A common question I see out there is whether the RPC Client Access service (including Client Access Server arrays) can access databases in other sites. The answer is yes. Let’s take a look at a couple of scenarios.
Scenario #1 – Full Site Failure
Let’s say you have a Client Access Server Array called array.domain.com, and your Primary Site goes down. As part of the manual site switchover process, you must update the DNS records in your Primary Site to point to the CAS infrastructure at your DR Site. One of the several DNS records you change is the one for the CAS Array: you change array.domain.com to point to DRSiteCAS instead of PrimarySiteCAS. Once the client’s cached DNS record expires (a TTL of 5 minutes is recommended for DNS records in site resilient solutions), the client will start connecting to DRSiteCAS, which will then access the database in the DR Site.
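As an illustration of that DNS change, something like the following could be run against a Windows DNS server using dnscmd (a sketch only; the DNS server name and IP addresses here are hypothetical placeholders):

```powershell
# Remove the old CAS array record pointing at PrimarySiteCAS
dnscmd dns01 /RecordDelete domain.com array A 10.0.1.10 /f

# Re-add the record with a 5-minute (300-second) TTL pointing at DRSiteCAS
dnscmd dns01 /RecordAdd domain.com array 300 A 10.0.2.10
```

The short TTL is what keeps the client outage window small; with a 5-minute TTL, clients begin resolving the new address within minutes of the change.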
Scenario #2 – Server Failure(s) in Primary Site and Disabling Automatic Activation for Databases and Servers
In the case where all database copies go down in the Primary Site, your databases can automatically fail over to the DR Site as long as you allow automatic activation on the DR servers (yes, you can turn off automatic activation on both databases and servers) and as long as you still have majority for quorum. In this scenario, the RPC Client Access service (and array) can access the mailbox databases that are mounted in the DR Site.
As I alluded to above, it is possible to turn off automatic activation on databases and servers, using something called the Database Activation Policy. Let’s say you wanted to prevent a specific database copy from being considered in the Automatic Activation Process.
You can use the following command to prevent the database from being considered in the Automatic Activation Process:
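A sketch using the Exchange 2010 Suspend-MailboxDatabaseCopy cmdlet, assuming a database named DB1 with a copy on server MBX2:

```powershell
# -ActivationOnly suspends activation while replication continues;
# this copy will no longer be considered by the Automatic Activation Process.
Suspend-MailboxDatabaseCopy -Identity DB1\MBX2 -ActivationOnly
```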
This example resumes the copy of the database DB1 on the server MBX2 for automatic activation:
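A sketch using the matching Resume-MailboxDatabaseCopy cmdlet with the same DB1/MBX2 names:

```powershell
# Resume the copy so it is once again eligible for automatic activation.
Resume-MailboxDatabaseCopy -Identity DB1\MBX2
```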
This is also possible to do at the mailbox server level using the Set-MailboxServer cmdlet. You can use the following command to prevent any databases on a specific mailbox server from being considered in the Automatic Activation Process:
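A sketch using Set-MailboxServer, assuming a server named MailboxServer:

```powershell
# Block automatic activation of every database copy on this server.
Set-MailboxServer -Identity MailboxServer -DatabaseCopyAutoActivationPolicy Blocked
```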
This example resumes all database copies on the mailbox server “MailboxServer” for automatic activation:
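A sketch of the reverse operation on the same server:

```powershell
# Allow database copies on this server to be automatically activated again.
Set-MailboxServer -Identity MailboxServer -DatabaseCopyAutoActivationPolicy Unrestricted
```

Note that DatabaseCopyAutoActivationPolicy also supports an IntrasiteOnly value, which restricts automatic activation to copies within the server’s own Active Directory site.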
Let’s say we have 6 DAG servers, 4 in the Primary Site and 2 in the DR Site, with no modifications to the automatic activation policy (so DAG servers in the DR Site can automatically mount databases). Let’s also say that a lack of storage funds prohibits us from having mailbox database copies on all servers. So PrimarySiteMBX01 and PrimarySiteMBX02 in the Primary Site are mirrored in terms of mailbox database copies, and PrimarySiteMBX03 and PrimarySiteMBX04 are likewise mirrored. In addition, PrimarySiteMBX01 and PrimarySiteMBX02 are mirrored with SecondarySiteMBX0102 in the DR Site, and PrimarySiteMBX03 and PrimarySiteMBX04 are mirrored with SecondarySiteMBX0304 in the DR Site.
To make this a bit clearer, the following image shows the database distribution. You can see there are 6 nodes and 3 copies of each database.
Should PrimarySiteMBX01 and PrimarySiteMBX02 go down (as illustrated below), SecondarySiteMBX0102 can automatically mount the databases because we still have majority for quorum. In this case, the RPC Client Access Array in the Primary Site will still be able to provide mailbox access to the databases mounted on SecondarySiteMBX0102 in the DR Site. This is one of the things I like about Exchange 2010 High Availability: if your Primary Site copies go down, you can allow the copy in the DR Site to activate automatically (provided the Database Activation Policy described above allows it to mount), whereas in Exchange 2007 you had to manually activate any SCR copy.
Exchange 2007 and Exchange 2010 clusters both use Majority Node Set clustering. This means that a majority of your votes (server votes and/or 1 file share witness) need to be up and running. With DAGs, if you have an odd number of DAG nodes in the same DAG (cluster), you have an odd number of votes, so you don’t have a witness. If you have an even number of DAG nodes, you will have a file share witness: should half of your nodes go down, the witness acts as that extra +1 vote.
So in this scenario, we have 6 votes from the servers plus 1 vote from the file share witness, totaling 7 votes. This means we can have up to 3 voters fail and our cluster will still be online: with 7 votes, losing 3 leaves us with 4, which still satisfies the majority rule (more than half of the votes). Because of this, we still have majority, and our quorum and cluster remain fully operational.
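The vote arithmetic above can be sanity-checked with a quick calculation (a sketch; the 6-node-plus-witness figures come from this scenario):

```powershell
$nodes       = 6                              # DAG member votes
$witness     = 1                              # file share witness (even node count)
$votes       = $nodes + $witness              # 7 total votes
$majority    = [math]::Floor($votes / 2) + 1  # 4 votes needed to hold quorum
$maxFailures = $votes - $majority             # 3 voters may fail
```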
Now, when exactly would we have to do a manual switchover? Well, there are a couple of cases. The first is a complete outage of your Primary Datacenter, due to power failure, environmental disaster, etc. The other is when all Primary Datacenter DAG members go down, or just enough servers go down to lose quorum (again, a majority of voters must be up, which means that if we lose more than 3 voters, including the FSW, our entire cluster goes offline). In this case, you’ll have to do a manual datacenter switchover. You’ll move all services over to the secondary datacenter, including changing the RPC Client Access Server FQDN to point to the single CAS server or the standby VIP that publishes RPC across multiple Secondary Datacenter CAS servers.
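The mailbox side of a manual datacenter switchover can be sketched with the Exchange 2010 datacenter switchover cmdlets (the DAG name and Active Directory site names here are hypothetical placeholders):

```powershell
# Mark the failed Primary Site DAG members as stopped; run against a
# surviving server, with -ConfigurationOnly when the primary servers
# are unreachable.
Stop-DatabaseAvailabilityGroup -Identity DAG1 -ActiveDirectorySite PrimarySite -ConfigurationOnly

# Shrink the cluster to the surviving DR Site members and bring the
# databases online there.
Restore-DatabaseAvailabilityGroup -Identity DAG1 -ActiveDirectorySite DRSite
```

After the databases are mounted in the DR Site, the DNS changes described in Scenario #1 redirect clients to the DR Site CAS infrastructure.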