I was reminded today of a question I used to see a lot in the forums: can you just point your clients at the DAG IP? I don't see it so much anymore, but perhaps a refresher is in order.
Granted, it seems almost brilliant to simply configure all the URLs and connection points to the DAG IP. After all, it does say its use is “Cluster and Client” 😛, and if that means there is no need to worry about load balancing because Exchange will handle it, then why not? Well, here is why not:
- There is no Exchange dependency on the Cluster IP being online. Both Exchange 2013 and Exchange 2016 support IP-Less Database Availability Groups. The cluster IP can go offline and Exchange will run just fine. The only real reason to assign a Cluster IP address is if you are using backup software or another 3rd party application that requires it. If you run Exchange with the Preferred Architecture recommendations, you won’t be doing backups anyway!
- If the Cluster name goes offline and the IP with it, Managed Availability won’t attempt to bring it online. That requires manual intervention. Yuck.
- The Cluster IP is held by a specific mailbox server in the DAG at any one time – meaning all client connections will go through that multi-role server and no others.
- If the quorum owner moves to another server, the Cluster IP moves with it, and there is no guarantee that the clients will handle that gracefully.
- The only way to keep end-user client traffic off a particular server in this scenario is to pause or stop the cluster service on that server.
- IT’S NOT SUPPORTED!
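If you suspect that some URLs in an environment were set this way, a quick audit is easy to script. The following is only a sketch: the function name, the sample URLs, the addresses, and the `resolved` lookup table are all made up for illustration, and you would need to feed it whatever URL list and DNS results you collect yourself.

```python
from urllib.parse import urlsplit


def urls_pointing_at_dag(urls, dag_addresses, resolved=None):
    """Return the URLs whose host is the DAG name/IP, or resolves to it.

    urls          -- iterable of client-facing URLs (hypothetical input)
    dag_addresses -- the DAG name(s) and cluster IP(s) to look for
    resolved      -- optional {hostname: [ip, ...]} table you built yourself;
                     no DNS lookup is performed here
    """
    resolved = resolved or {}
    dag = {a.lower() for a in dag_addresses}
    flagged = []
    for url in urls:
        host = (urlsplit(url).hostname or "").lower()
        # Compare both the literal host and anything it is known to resolve to.
        targets = {host} | {ip.lower() for ip in resolved.get(host, [])}
        if targets & dag:
            flagged.append(url)
    return flagged


if __name__ == "__main__":
    sample = [
        "https://mail.contoso.com/owa",
        "https://dag1.contoso.com/ews/exchange.asmx",
        "https://10.0.0.50/mapi",
    ]
    for bad in urls_pointing_at_dag(sample, ["10.0.0.50", "dag1.contoso.com"]):
        print("Points at the DAG:", bad)
```

Anything it flags should be moved to a properly load-balanced namespace instead.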
If you are using lagged copies, you have hopefully enabled Replay Lag Manager as well. Once you do so, be aware of the implications. Most notably:
“consider an environment where a given database has 4 copies (3 highly available copies and 1 lagged copy), and the default setting is used for ReplayLagManagerNumAvailableCopies. If a non-lagged copy is out-of-service for any reason (for example, it is suspended, etc.) then the lagged copy will automatically play down its log files in 24 hours.”
To repeat: by default, if a non-lagged copy is out of service for more than a day, the lagged copy of that database will play down its logs and essentially become an HA copy.
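To make the rule concrete, here is a toy model of it. This is a simplified sketch based only on the behavior quoted above, not Exchange's actual logic: the class, the function, and the constants are my own names, the threshold of 3 is inferred from the quoted example, and the real feature has other triggers that are ignored here.

```python
from dataclasses import dataclass
from datetime import timedelta

# Assumptions taken from the quoted example: 3 available copies, 24 hours.
NUM_AVAILABLE_COPIES_DEFAULT = 3
PLAY_DOWN_DELAY = timedelta(hours=24)


@dataclass
class DatabaseCopy:
    server: str
    lagged: bool
    healthy: bool  # False if the copy is suspended, failed, and so on


def lagged_copy_will_play_down(copies, shortfall_duration,
                               num_available_copies=NUM_AVAILABLE_COPIES_DEFAULT):
    """True if, in this simplified model, the lagged copy would play down."""
    # Only healthy, non-lagged (HA) copies count as available.
    available = sum(1 for c in copies if c.healthy and not c.lagged)
    return (available < num_available_copies
            and shortfall_duration >= PLAY_DOWN_DELAY)


if __name__ == "__main__":
    copies = [
        DatabaseCopy("EX1", lagged=False, healthy=False),  # suspended
        DatabaseCopy("EX2", lagged=False, healthy=True),
        DatabaseCopy("EX3", lagged=False, healthy=True),
        DatabaseCopy("EX4", lagged=True, healthy=True),
    ]
    print(lagged_copy_will_play_down(copies, timedelta(hours=25)))
```

With one HA copy suspended, only two copies count as available, so the model reports a play-down once the shortfall has lasted a day.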
So consider this scenario: the servers hold a mix of HA and lagged copies on the same drives. One server encounters a hardware issue, so you suspend all the database copies on it and block activation until you can fix the problem. That seems OK, because there are still 3 healthy copies of each database on other servers. But here is the catch: they have to be 3 HA copies. If it's two HA copies and one lagged copy, then log play-down will kick off on those lagged copies after 24 hours if you haven't changed the default, and there go the suspenders you were counting on in case the belt fails.
It sounds obvious, but it could bite you if you aren't paying attention: two days later you suddenly realize that the replay queue lengths of all the affected databases are at zero. Stay safe out there.
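If you would rather not find out two days later, a small check over your copy status numbers can flag it. This is a minimal sketch that assumes you have already collected the status somehow (for example from `Get-MailboxDatabaseCopyStatus` output) into a list of dictionaries; the key names and sample values below are hypothetical.

```python
def drained_lagged_copies(statuses):
    """Return names of lagged copies whose replay queue has dropped to zero.

    statuses -- list of dicts with the hypothetical keys
                "name", "lagged" and "replay_queue_length"
    """
    return [
        s["name"]
        for s in statuses
        if s["lagged"] and s["replay_queue_length"] == 0
    ]


if __name__ == "__main__":
    sample = [
        {"name": r"DB01\EX2", "lagged": False, "replay_queue_length": 0},
        {"name": r"DB01\EX4", "lagged": True, "replay_queue_length": 0},
        {"name": r"DB02\EX4", "lagged": True, "replay_queue_length": 5120},
    ]
    for name in drained_lagged_copies(sample):
        print("Lagged copy has no lag left:", name)
```

A zero queue on an HA copy is normal; on a lagged copy it means the lag is gone.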
Note that in Exchange 2016 CU1, Replay Lag Manager is enabled by default, along with other goodies!
As for what happened to the cast of “Leave it to Beaver”, well, not much really.