This article is really about SQL.
We received a SCOM alert yesterday that the Lync Front End service (RTCSRV.exe) had stopped on one of our SBAs. Sadly, this is a fairly routine occurrence, as the combination of Windows + Lync + SCOM + SCCM (especially SCCM) can overwhelm the meager 4GB of memory on our SBAs.
So I RDP’ed to the SBA and tried to start the Front End service. It failed. Unusually, the Mediation service and Centralized Logging Service were also stopped.
So I did Step 1 in my troubleshooting routine and opened up Event Viewer and looked at the Lync logs. I saw a bunch of failed logins to the XDS database. “Got it”, I thought to myself. “The SQL Service must have stopped”. But when I returned to Services, the SQL Service was indeed running. Just for good measure I restarted it. After the restart, none of the services would start. So that didn’t fix it.
I dug deeper into the event logs and came across the following 2 entries in the Application log.
That doesn’t sound good. It seems some corruption might have crept into the XDS database.
On an SBA, the XDS database holds the replicated copy of the CMS. If Lync can’t read this database on startup, it won’t know what its configuration is. So without this database, Lync won’t start.
The next step was to “look” at the database via SQL Server Management Studio. Except, due to firewalls, I couldn’t connect to SQL on the SBA from our main Lync pool SQL Servers. So I went about installing the SQL Server support tools onto the SBA.
Finding the download was a lot harder than it should be. But I eventually found it on this page. I downloaded the SQL Server 2012 Management Studio x64 version and got that installed.
I could now look at the status of the database. Can’t say I like what I saw.
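If you would rather check the status from a query window than from the Object Explorer tree, something like the following should show the same information. This is only a sketch: it assumes the database is named xds, as it is in this post, and that you are connected to the SQL instance on the SBA.

```sql
-- Sketch: show the current state of the xds database.
-- state_desc reports values such as ONLINE, SUSPECT, RECOVERY_PENDING or EMERGENCY;
-- user_access_desc reports MULTI_USER, SINGLE_USER or RESTRICTED_USER.
SELECT name,
       state_desc,
       user_access_desc
FROM sys.databases
WHERE name = N'xds';
```

Running the same query after each step below is a quick way to see whether a command actually changed anything.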
As I am not at all skilled with SQL, I brought in one of our SQL Admins to help with this.
We went through the following:
Alter Database [xds] set emergency;
GO
DBCC CHECKDB (xds);
GO
DBCC did not report any errors. So we tried:
Alter Database [xds] set online
That didn’t work.
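When a database refuses to come online even though DBCC looks clean, it can help to cut the DBCC output down to errors only and to search the SQL Server error log for the database name. The following is a rough sketch of how that might look, not what we ran at the time. It assumes the database is named xds, and note that sp_readerrorlog is an undocumented procedure, so treat its parameters as an assumption.

```sql
-- Sketch: re-run the consistency check, suppressing informational messages
-- and listing every error found rather than a truncated set per object.
DBCC CHECKDB (N'xds') WITH NO_INFOMSGS, ALL_ERRORMSGS;
GO

-- Sketch: search the current SQL Server error log (log 0, type 1) for entries
-- that mention xds. sp_readerrorlog is undocumented; the parameters shown are
-- the commonly used ones and are an assumption here.
EXEC sp_readerrorlog 0, 1, N'xds';
GO
```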
We tried a bunch of things, such as taking the DB offline and then back online.
Based on this error, I ran a chkdsk scan to see if there were bad sectors on the drive or any other disk errors.
Chkdsk reported no issues.
We ran the following but it didn’t help either.
Alter Database [xds] set single_user with rollback immediate
We ran this command too but it didn’t do anything:
EXEC sp_resetstatus 'xds'
Finally we ran the commands below, which fixed the problem. The risk is that they could lose data. As this database is just a copy of the CMS, we went ahead with it. That, and we didn’t have any other options.
Alter Database [xds] set emergency;
GO
DBCC CHECKDB ('xds',REPAIR_ALLOW_DATA_LOSS);
GO
That finally mounted the xds database, and once it was mounted, the Lync services on the SBA started successfully.
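For reference, the repair could be put together as a single script along the following lines. This is a sketch rather than a transcript of our session. SQL Server requires the database to be in single-user mode for the DBCC repair options, which in our case had already been set by the earlier single_user command; the script makes that step explicit. The database name xds is from this post, and the final switch back to multi-user is an assumption about what you would want afterwards. REPAIR_ALLOW_DATA_LOSS can discard data, so it only makes sense for a database that can be rebuilt or replicated again, like this copy of the CMS.

```sql
-- Sketch of a last-resort repair; may lose data. Assumes the database is named xds.

-- Put the database into emergency mode so it can be accessed despite the damage.
ALTER DATABASE [xds] SET EMERGENCY;
GO

-- DBCC repair options require single-user mode; roll back any open connections.
ALTER DATABASE [xds] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
GO

-- Attempt the repair, reporting errors only.
DBCC CHECKDB (N'xds', REPAIR_ALLOW_DATA_LOSS) WITH NO_INFOMSGS, ALL_ERRORMSGS;
GO

-- Assumption: return the database to normal multi-user access once repaired.
ALTER DATABASE [xds] SET MULTI_USER;
GO

-- Confirm the resulting state.
SELECT name, state_desc, user_access_desc
FROM sys.databases
WHERE name = N'xds';
GO
```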