Thursday, October 31, 2013

Automated Discovery and Troubleshooting of Gray State Systems in System Center 2012 (Part-1)

I recently came across a rash of clients and internal systems at the office where monitored devices, for whatever reason, had gone into a gray state. I needed a way to quickly discover these systems and, ideally, run a script that would take some basic actions to remediate or troubleshoot these agents. In this first post, I'll give the full code necessary to get the gray agent discovery running. In the second post, I'll give a PowerShell script that detects the grayed-out agents, stops the HealthService, clears the agent health directory, and then starts the HealthService again automatically.

I came across three lines of code in the following blog post, which pointed me in the right direction. However, the code did not work correctly as provided.

http://www.bictt.com/blogs/bictt.php/2011/05/27/scom-trick-14-troubleshoot-grey

$WCC = get-monitoringclass -name "Microsoft.SystemCenter.Agent"
$MO = Get-MonitoringObject -monitoringclass:$WCC | where {$_.IsAvailable -eq $false}
$MO | select DisplayName


With just that code, I would receive the following error screen:

[Screenshot: an error that is common when using the PowerShell Get-MonitoringClass cmdlet without specifying the appropriate variables for the script to connect to the System Center 2012 management server]


If you update the code to include the following snap-in, path, and connection to your System Center management server, it will function properly. Running it should output a list of computers whose agent status is gray. All of this code should be included in your PowerShell script:


# FQDN of your SCOM management server (replace the placeholder)
$RMSFQDN = "<your SCOM management server FQDN>"
$Name = "Microsoft.EnterpriseManagement.OperationsManager.Client"
$ModuleLoaded = Get-PSSnapin $Name -ErrorAction SilentlyContinue

# Load the Operations Manager snap-in if it is not already loaded
If (-not $ModuleLoaded)
{
    Add-PSSnapin $Name
}

# Connect to the management group and switch to the monitoring provider
New-ManagementGroupConnection -ConnectionString $RMSFQDN
Set-Location "OperationsManagerMonitoring::"

# Find agents that are not available (gray state)
$AgentClass = Get-MonitoringClass -Name "Microsoft.SystemCenter.Agent"
$MO = Get-MonitoringObject -MonitoringClass:$AgentClass | Where-Object {$_.IsAvailable -eq $false}

# List the display names of the gray agents
$MO | Select-Object DisplayName


Also, review this link for a comprehensive list of WMI hotfixes for various platforms:

http://support.microsoft.com/kb/2591403

Updated 11-15-2013: Review this link for hotfixes for agent-based systems: http://support.microsoft.com/kb/2843219