Potential for database corruption after installing Exchange 2007 SP3 RU3
I recently posted about the availability of Update Rollup 3 for Exchange 2007 SP3 and Exchange 2010 SP1, and then followed up with a post about an issue affecting some customers who have RIM BlackBerry devices connecting to Exchange 2010 SP1 with RU3.
Over the weekend, the Exchange Product Group was made aware of an issue which may lead to database corruption if you are running Exchange 2007 Service Pack 3 with Update Rollup 3.
The issue was introduced in Exchange 2007 SP3 RU3 by a change in how the database is grown during transaction log replay when new data is written to the database file and no free pages are available to be consumed. It is of specific concern in two scenarios, which can occur separately or together:
- When transaction log replay is performed by the Replication Service to keep the passive database copy up to date.
- When a database is not cleanly shut down and recovery occurs.
When this issue occurs, events similar to the following are logged in the Application event log of the Mailbox server:
- Event Type: Error
Event Source: ESE
Event Category: Logging/Recovery
Event ID: 454
Description: Microsoft.Exchange.Cluster.ReplayService (12716) Recovery E20 SG1\DB1: Database recovery/restore failed with unexpected error -4001.
- Event Type: Error
Event Source: MSExchangeRepl
Event Category: Service
Event ID: 2097
Description: The Microsoft Exchange Replication Service encountered an unexpected Extensible Storage Engine (ESE) exception in storage group 'SG1\DB1'. The ESE exception is a read was issued to a location beyond EOF (writes will expand the file) (-4001) ().
- Event Type: Error
Event Source: MSExchangeRepl
Event Category: Service
Event ID: 2095
Description: Log file D:\logs\SG1\E200006AFAE.log in SG1\DB1 could not be replayed. Re-seeding the passive node is now required. Use the Update-StorageGroupCopy cmdlet in the Exchange Management Shell to perform a re-seed operation
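If you export the Application event log and want to check it for the three events above, a filter along the following lines may help. This is only a sketch: the record layout (dictionaries with `source`, `event_id`, and `description` keys) and the function names are assumptions made for illustration, not the output format of any particular export tool. Only the event sources, event IDs, and the -4001 error code come from the events quoted above.

```python
import re

# Event source / event ID pairs taken from the events quoted in this post.
SIGNATURES = {
    ("ESE", 454),
    ("MSExchangeRepl", 2097),
    ("MSExchangeRepl", 2095),
}

# ESE error code reported in events 454 and 2097 (read beyond EOF).
ESE_ERROR = re.compile(r"-4001\b")


def find_ru3_symptoms(records):
    """Return the records whose source and event ID match the listed events.

    `records` is a hypothetical list of dicts with the keys
    'source', 'event_id', and 'description'.
    """
    return [
        rec for rec in records
        if (rec.get("source"), rec.get("event_id")) in SIGNATURES
    ]


def mentions_eof_error(record):
    """True if the event description contains the -4001 error code."""
    return bool(ESE_ERROR.search(record.get("description", "")))


# Example with made-up records shaped like the events above.
sample = [
    {"source": "ESE", "event_id": 454,
     "description": "Database recovery/restore failed with unexpected error -4001."},
    {"source": "MSExchangeRepl", "event_id": 2095,
     "description": "Log file could not be replayed. Re-seeding the passive node is now required."},
    {"source": "ESE", "event_id": 100, "description": "unrelated"},
]
matches = find_ru3_symptoms(sample)
```

Event 2095 does not carry the error code itself, so the sketch matches on source and event ID first and treats the -4001 check as a separate, optional confirmation.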
While only a small number of customers have been affected to date, the Product Group believes the risk is significant enough to recommend that all customers uninstall Exchange 2007 SP3 RU3 from all Mailbox servers and transport servers. Uninstalling the rollup reverts the system to the previously installed version. The Product Group has also removed the Exchange 2007 SP3 RU3 download from the Microsoft Download Center and from Microsoft Update until a new version of the rollup can be produced.
It is strongly recommended that you take the actions below to avoid data loss or an outage.
For environments leveraging Cluster Continuous Replication (CCR) and/or Standby Continuous Replication (SCR)
If you see the events listed above in your environment, you must take the following steps to restore your high-availability configuration:
- Roll back the CCR Mailbox server hosting the passive database copies and any SCR target Mailbox servers to the previously installed version (e.g., Exchange 2007 SP3 RU2) by uninstalling RU3.
- Re-seed all database copies on the CCR Mailbox server and any SCR target Mailbox servers hosting the passive database copies.
- Verify that the database copy status is healthy for all passive copies.
- Perform a switchover and roll back the remaining CCR Mailbox server to the previously installed version (e.g., Exchange 2007 SP3 RU2).
If you are not seeing these events in an environment with continuous replication enabled, we recommend the following steps:
- Roll back the CCR Mailbox server hosting the passive database copies and any SCR target Mailbox servers to the previously installed version (e.g., Exchange 2007 SP3 RU2) by uninstalling RU3.
- Perform a switchover and roll back the remaining CCR Mailbox server to the previously installed version (e.g., Exchange 2007 SP3 RU2).
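The two CCR/SCR procedures above differ only in whether a re-seed and health check are needed between the two rollbacks. As a rough sketch for a runbook or checklist script, that ordering can be written down as follows. The function name and the step wording are illustrative assumptions; the code only prints a checklist and does not touch any server.

```python
from typing import List


def ccr_rollback_plan(events_seen: bool) -> List[str]:
    """Return the ordered CCR/SCR rollback steps described in this post.

    `events_seen` is True when events 454, 2097, or 2095 have been
    logged, in which case the passive copies must be re-seeded and
    verified before the switchover.
    """
    steps = [
        "Roll back the passive CCR node and any SCR targets by uninstalling RU3",
    ]
    if events_seen:
        steps.append("Re-seed all passive database copies")
        steps.append("Verify that copy status is healthy for all passive copies")
    steps.append("Perform a switchover, then roll back the remaining CCR node")
    return steps


# Print the checklist for an environment where the events were logged.
for number, step in enumerate(ccr_rollback_plan(events_seen=True), start=1):
    print(f"{number}. {step}")
```

In both cases the passive side is rolled back first, so that the node taking over after the switchover is no longer running RU3.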
For environments leveraging Single Copy Clusters (SCC)
- Roll back passive nodes within the SCC environment to the previously installed version (e.g., Exchange 2007 SP3 RU2) by uninstalling RU3.
- Perform a switchover and roll back the remaining SCC Mailbox server nodes to the previously installed version (e.g., Exchange 2007 SP3 RU2).
- Restore and recover any damaged databases leveraging a last known good backup.
For environments leveraging standalone Mailbox servers
- Roll back the standalone Mailbox servers to the previously installed version (e.g., Exchange 2007 SP3 RU2) by uninstalling RU3.
- Restore and recover any damaged databases leveraging a last known good backup.
For Hub Transport and Edge Transport servers
- Roll back the standalone transport servers to the previously installed version (e.g., Exchange 2007 SP3 RU2) by uninstalling RU3.
- Recover damaged mail.que databases by following the steps in Working with the Queue Database on Transport Servers.