Tuesday, December 23, 2014

AD Site Availability Degraded / AD Site Performance Health Degraded

After deploying the Active Directory Management Packs, one of our domain controllers began generating a flood of alerts. I could not find anything online that really dealt with the alert, and the warning from this type of event was not listed on eventid.net either. It is all figured out now, and here is the solution to the perplexing problem I encountered.

You could also title this, "How to Perform an Online/Offline Defragmentation of your Health Service Store in System Center".

Problem Description:

First, the SCOM console began to fill up with AD Site Availability Health Degraded and AD Site Performance Health Degraded critical alerts from the Active Directory Management Packs.

AD Site Availability Health Degraded and AD Site Performance Health Degraded

On the offending domain controller, I saw the Application event log filling up with the following warning:
 

 
The contents of the warning were as follows:
HealthService (1704) A significant portion of the database buffer cache has been written out to the system paging file. This may result in severe performance degradation. See help link for complete details of possible causes.
Log Name: Application | Source: ESENT | Event ID: 906
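To check whether a machine is still logging this warning without scrolling through Event Viewer, the built-in wevtutil tool can filter the Application log by source and event ID. The following is a minimal sketch, assuming Python is available on the server; the function names are hypothetical and only the wevtutil switches themselves are standard.

```python
import subprocess

# XPath filter for the warning described above: source ESENT, event ID 906.
ESENT_906_QUERY = "*[System[Provider[@Name='ESENT'] and (EventID=906)]]"


def build_query_command(count=5):
    """Return the wevtutil command that lists the newest `count` matching events."""
    return [
        "wevtutil", "qe", "Application",
        f"/q:{ESENT_906_QUERY}",
        f"/c:{count}",   # maximum number of events to return
        "/rd:true",      # reverse direction: newest events first
        "/f:text",       # human-readable text instead of XML
    ]


def recent_esent_906(count=5):
    """Run the query on the local machine (Windows only) and return its text output."""
    result = subprocess.run(
        build_query_command(count), capture_output=True, text=True, check=True
    )
    return result.stdout


# Show the command line without running it.
print(subprocess.list2cmdline(build_query_command()))
```

Calling recent_esent_906() on the affected domain controller would return the most recent matching events, which is a quick way to confirm later that the warnings have stopped.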

Troubleshooting:

Initially I suspected that an application or process was going bonkers on the server, consuming memory and either starving the SCOM agent of resources or causing it to malfunction. I loaded the Sysinternals Process Monitor utility to see what was happening when these events fired, since they typically occurred only a few minutes apart. The capture showed a significant amount of file activity from the Health Service against C:\Program Files\Microsoft Monitoring Agent\Agent\Health Service State\Health Service Store\HealthServiceStore.edb. No other process running at the time of these warnings, or of the corresponding alerts in the Operations Manager console, could account for the issues on the system.
 
 
SCOM HealthService | ReadFile | C:\Program Files\Microsoft Monitoring Agent\Agent\Health Service State\Health Service Store\HealthServiceStore.edb
 
With the Health Service database as the smoking gun, I started with some quick online maintenance from within the console.
 
In the Operations Manager Console, I started by browsing to the Operations Manager folder, then Agent Details and selecting the Agents by Version view.
Management Console Tree -> Operations Manager -> Agent Details -> Agents By Version
 
 
Selecting the offending computer brought up the Health Service Tasks I could perform; Start Online Store Maintenance was the one I was looking for.
Management Console Health Service Task for Health Service Database Maintenance | Start Online Store Maintenance
 
Final Solution:

Unfortunately, the online store maintenance was not enough to clear the errors and warnings I was encountering, so I opted for an offline defragmentation of the Health Service Store database. Perform the following if the local warnings persist on the client system.
 
  • Log in to the offending client system via console or RDP
  • Open an administrative command prompt
  • Change directory to "C:\Program Files\Microsoft Monitoring Agent\Agent\Health Service State\Health Service Store"
  • Stop the Microsoft Monitoring Agent service, either from the services console (services.msc) or from the command prompt (net stop "Microsoft Monitoring Agent")
  • Run esentutl /r edb to replay the transaction logs and bring the database to a clean state (without this, you likely won't be able to perform a defragmentation)
  • Next, run esentutl /d HealthServiceStore.edb
Running esentutl /d HealthServiceStore.edb in order to compact and defragment the health service database after log spewing occurred from loading the Active Directory management packs

When this completed, my HealthServiceStore.edb file had shrunk from 174 MB to 27 MB, and both the warnings in the local Application event log and the critical health alerts in the System Center Operations Manager console went away.
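The offline steps above can be sketched as a small script for anyone who has to repeat them on several agents. This is only a sketch under stated assumptions: the function names are hypothetical, the path and service name are the ones from this post, and the final net start is my addition, since the agent has to be started again after the defragmentation. By default it only prints the commands.

```python
import subprocess

# Directory and service display name are taken from the steps above.
STORE_DIR = (
    r"C:\Program Files\Microsoft Monitoring Agent\Agent"
    r"\Health Service State\Health Service Store"
)
SERVICE = "Microsoft Monitoring Agent"


def build_steps():
    """Return the commands in the order they should run."""
    return [
        ["net", "stop", SERVICE],
        ["esentutl", "/r", "edb"],                     # recovery: replay the logs
        ["esentutl", "/d", "HealthServiceStore.edb"],  # offline defragmentation
        # Assumption: restart the agent afterwards (not shown in the steps above).
        ["net", "start", SERVICE],
    ]


def run_steps(dry_run=True):
    """Print each command; execute it from STORE_DIR only when dry_run is False."""
    for cmd in build_steps():
        print(subprocess.list2cmdline(cmd))
        if not dry_run:
            subprocess.run(cmd, cwd=STORE_DIR, check=True)


def percent_saved(before_mb, after_mb):
    """Percentage of file size reclaimed by the defragmentation."""
    return round(100 * (before_mb - after_mb) / before_mb, 1)


run_steps()  # dry run: only prints the commands
```

Calling run_steps(dry_run=False) from an elevated prompt on the affected machine would execute the commands in order and stop at the first failure. For the file sizes reported here, percent_saved(174, 27) works out to a reduction of roughly 84 percent.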