Wednesday, May 23, 2012

SCOM - Putting Systems in Maintenance Mode through Citrix

One of my goals is to let application developers and other staff put their systems into maintenance mode without having to open the Management Console or get me involved. When they push code updates or do other system maintenance, it is handy to give them a quick, basic way to do this. There are also times when it is nice to be able to put systems into maintenance mode from a cell phone, tablet or other device at a remote location, which is where Citrix comes in. I put the following code into a file and publish that script through Citrix to the users who may need it.

Here is the script:

# Enter the FQDN of your SCOM management server in this variable
$RMSFQDN = "my-managementserver.mydomain.internal"

# Enter the internal DNS suffix for your environment
$DNS = "mydomain.internal"

$Name = "Microsoft.EnterpriseManagement.OperationsManager.Client"

$ModuleLoaded = Get-Pssnapin $Name -ErrorAction SilentlyContinue

If (-not $ModuleLoaded)
{
add-pssnapin "Microsoft.EnterpriseManagement.OperationsManager.Client";
}

New-ManagementGroupConnection -ConnectionString $RMSFQDN
Set-Location "OperationsManagerMonitoring::";

$startTime = [System.DateTime]::Now

# You can change the default time for how long systems should be in maintenance
$Hours = 3

$endTime = $startTime.AddHours($Hours)

$comment = "Computer Maintenance"

While ($true)
{
 $computerPrincipalName = Read-Host "Enter the computer name to put into maintenance (enter 'done' to finish maintenance mode)"
 # PowerShell comparisons are case-insensitive, so this catches done, Done and DONE.
 # Leaving the loop here keeps "done" from being looked up as a computer name.
 if ($computerPrincipalName -eq "done") { break }
 $computerClass = get-monitoringclass -name:Microsoft.SystemCenter.ManagedComputer
 $computerCriteria = "PrincipalName='" + $computerPrincipalName + "." + $DNS + "'"
 write-host $computerPrincipalName
 $computer = get-monitoringobject -monitoringclass:$computerClass -criteria:$computerCriteria

 if($computer -eq $null)
 {
  $unixClass = get-monitoringclass -name "Microsoft.Unix.Computer"
  $monObject = Get-MonitoringObject -monitoringclass:$unixClass
  $computer = $monObject | where {$_.displayname -eq $computerPrincipalName}
 }
 ELSE
 {
  $computerClass = get-monitoringclass -name:Microsoft.Windows.Computer
  $computer = get-monitoringobject -monitoringclass:$computerClass -criteria:$computerCriteria
 }
 if($computer.InMaintenanceMode -eq $false)
 {
  "Putting " + $computerPrincipalName + " into maintenance mode"
  New-MaintenanceWindow -startTime:$startTime -endTime:$endTime -comment:$comment -monitoringObject:$computer
 }
}
stop-process -Id $PID

Wednesday, May 16, 2012

Monitoring Theory (Murphy's Law of Monitoring)

Yeah, I just created a new monitoring theory. OK, it might not be new, but it is the philosophy behind how I often configure System Center as the monitoring platform of choice. As systems admins, we all hate the late-night alerts that essentially mean nothing. So how do you prevent those alerts from hitting your phone while you sleep? How should monitoring be approached? Do you care that once a day, a server's CPU spikes and generates a critical alert?

I take this approach to alerting and monitoring: transient problems should be collated over a period of time to see if there is a long-term trend that needs addressing, while alerts, especially after hours, should consist of site, server or service down plus hard disk space issues. Ultimately, are these not the events that will have the corporate director or customer calling you in the morning to chew your head off? So start your approach there, whether by subscribing to those alerts individually or by lowering the severity of other alerts (CPU utilization, disk slowness, etc.).

As for that transient information, create an SLA report with a goal somewhere between 90% and 95%. Why? Trying to get a server to 99% acceptable values for CPU utilization could get expensive and would likely be a waste of resources when the server is not busy the other 80% of the time. What the SLA gives you is a trending value for an alert that you do not necessarily care about if it happens once or twice. However, if it is consistent enough over a month or quarter, you may want to look at upgrade planning or optimization of the server. The SLA reports also let you look at the health of all monitored devices at once, instead of reacting to ad hoc alerts that come in for a transient condition.
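
To make the trending idea concrete, here is a minimal sketch in plain PowerShell. It is not how SCOM builds its SLA reports; it only illustrates the arithmetic of comparing "percentage of acceptable samples" against a goal. The function name Get-SlaCompliance, the sample readings and the 90% CPU threshold are all assumptions made up for the example.

```powershell
# Hypothetical helper (not part of SCOM): percentage of samples at or below a threshold.
function Get-SlaCompliance {
    param(
        [double[]]$Samples,   # e.g. CPU utilization readings, in percent
        [double]$Threshold    # a reading at or below this value counts as acceptable
    )
    if ($Samples.Count -eq 0) { return 0 }
    $good = @($Samples | Where-Object { $_ -le $Threshold }).Count
    return [math]::Round(100 * $good / $Samples.Count, 1)
}

# Made-up readings: a single spike among ten samples.
$cpu = 12, 15, 98, 20, 18, 22, 17, 14, 19, 16
$compliance = Get-SlaCompliance -Samples $cpu -Threshold 90
$target = 90   # SLA goal, somewhere in the 90-95% range

if ($compliance -lt $target) {
    "Compliance $compliance% is below the $target% goal: look at upgrades or tuning"
} else {
    "Compliance $compliance% meets the $target% goal: treat the spike as transient"
}
```

The point of the design is that a single spike never pages anyone; it only moves the percentage, and you act when the percentage stays under the goal for the reporting period.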

Monday, March 26, 2012

Powershell script to stop system maintenance mode

In the event you want to pull a computer out of maintenance mode early, here is the script for ending maintenance mode. The values it prompts for could be hard coded as variables, or you could add a param block to the top of the script and pass them in as parameters.


$RMSFQDN = Read-Host "Enter the FQDN of your management server"

$Name = "Microsoft.EnterpriseManagement.OperationsManager.Client"

$ModuleLoaded = Get-Pssnapin $Name -ErrorAction SilentlyContinue

If (-not $ModuleLoaded)
{
add-pssnapin "Microsoft.EnterpriseManagement.OperationsManager.Client";
}

New-ManagementGroupConnection -ConnectionString $RMSFQDN
Set-Location "OperationsManagerMonitoring::";

$computerClass = get-monitoringclass -name:Microsoft.SystemCenter.ManagedComputer
$computerPrincipalName = Read-Host "Enter the FQDN computer name for ending maintenance:"
$computerCriteria = "PrincipalName='" + $computerPrincipalName + "'"
$computer = get-monitoringobject -monitoringclass:$computerClass -criteria:$computerCriteria


if($computer -eq $null)
{
$unixClass = get-monitoringclass -name "Microsoft.Unix.Computer"
$monObject = Get-MonitoringObject -monitoringclass:$unixClass
$computer = $monObject | where {$_.displayname -eq $computerPrincipalName}
}
ELSE
{
$computerClass = get-monitoringclass -name:Microsoft.Windows.Computer
$computer = get-monitoringobject -monitoringclass:$computerClass -criteria:$computerCriteria
}


if($computer.InMaintenanceMode -eq $true)
{
"Stopping maintenance mode for: " + $computerPrincipalName
$computer.StopMaintenanceMode([DateTime]::Now.ToUniversalTime(),[Microsoft.EnterpriseManagement.Common.TraversalDepth]::Recursive);
}
ELSE
{
"The computer " + $computerPrincipalName + " is not in maintenance mode."
}
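
As a rough sketch of the parameter idea mentioned before the script, the prompts can be replaced with a param block. The parameter names ManagementServer and ComputerName, the function name, and the host web01 are my own placeholders, not anything SCOM requires. I have wrapped the block in a function so it can be tried on its own; in a .ps1 file the same param block would be the first statement of the script.

```powershell
# Hypothetical wrapper; names are illustrative only.
function Get-StopMaintenanceInput {
    param(
        [Parameter(Mandatory = $true)]
        [string]$ManagementServer,

        [Parameter(Mandatory = $true)]
        [string]$ComputerName
    )

    # Build the same criteria string the script passes to get-monitoringobject.
    New-Object PSObject -Property @{
        ManagementServer = $ManagementServer
        Criteria         = "PrincipalName='" + $ComputerName + "'"
    }
}

$in = Get-StopMaintenanceInput -ManagementServer "my-managementserver.mydomain.internal" -ComputerName "web01.mydomain.internal"
$in.Criteria
```

Because both parameters are marked Mandatory, an interactive PowerShell session still prompts for any value that is left off the command line, so the script can be used either way.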

Thursday, March 22, 2012

Comprehensive Powershell Script for System Maintenance Mode

The script I posted yesterday was a quick and dirty script to put a Windows host into maintenance mode. The problem is that it doesn't work on agentless systems or Linux/Unix, so I took care of that. The following script works for Windows hosts, agentless hosts and Linux/Unix. The caveat is that you have to enter the fully qualified domain name of the host. Alternatively, you could alter the script to append your DNS suffix to the host name to save some keystrokes. Enjoy!


$RMSFQDN = Read-Host "Enter the FQDN of your Management Server"
$Name = "Microsoft.EnterpriseManagement.OperationsManager.Client"

$ModuleLoaded = Get-Pssnapin $Name -ErrorAction SilentlyContinue

If (-not $ModuleLoaded)
{
add-pssnapin "Microsoft.EnterpriseManagement.OperationsManager.Client";
}

New-ManagementGroupConnection -ConnectionString $RMSFQDN
Set-Location "OperationsManagerMonitoring::";

$startTime = [System.DateTime]::Now
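# Read-Host returns a string and adds its own colon; the cast fails early on non-numeric input
[double]$Hours = Read-Host "Enter the number of hours to put into maintenance"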
$endTime = $startTime.AddHours($Hours)
$comment = "Computer Maintenance"



$computerClass = get-monitoringclass -name:Microsoft.SystemCenter.ManagedComputer
$computerPrincipalName = Read-Host "Enter the FQDN computer name"
$computerCriteria = "PrincipalName='" + $computerPrincipalName + "'"
$computer = get-monitoringobject -monitoringclass:$computerClass -criteria:$computerCriteria


if($computer -eq $null)
{
$unixClass = get-monitoringclass -name "Microsoft.Unix.Computer"
$monObject = Get-MonitoringObject -monitoringclass:$unixClass
$computer = $monObject | where {$_.displayname -eq $computerPrincipalName}
}
ELSE
{
$computerClass = get-monitoringclass -name:Microsoft.Windows.Computer
$computer = get-monitoringobject -monitoringclass:$computerClass -criteria:$computerCriteria
}

if($computer.InMaintenanceMode -eq $false)
{
"Putting " + $computerPrincipalName + " into maintenance mode"
New-MaintenanceWindow -startTime:$startTime -endTime:$endTime -comment:$comment -monitoringObject:$computer
}
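
If you want to go the DNS suffix route, something along these lines could sit in front of the lookup. This is only a sketch: the function name Resolve-HostFqdn is made up, and it assumes that any name containing a dot is already fully qualified.

```powershell
# Hypothetical helper: append the suffix only when the name has no dot in it.
function Resolve-HostFqdn {
    param(
        [string]$Name,
        [string]$DnsSuffix
    )
    $Name = $Name.Trim()
    if ($Name.Contains(".")) { return $Name }   # assumed to be an FQDN already
    return $Name + "." + $DnsSuffix
}

# Both calls produce the same fully qualified name.
Resolve-HostFqdn -Name "web01" -DnsSuffix "mydomain.internal"
Resolve-HostFqdn -Name "web01.mydomain.internal" -DnsSuffix "mydomain.internal"
```

The result would then take the place of $computerPrincipalName when building $computerCriteria, so users can type either the short name or the full name.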

Wednesday, March 21, 2012

Scripting the Maintenance Window for a System through Powershell

Below is a PowerShell script I put together from a number of examples out there. While looking for something quick and easy to use, I ran into scripts that were either overly complex or didn't quite work right.

Ultimately, in our environment, I hard coded many of the variables to make the process quick and easy (Basically everything but the computer name). For the purposes of this script, it prompts for everything but the time window. I have removed the comments for the time being until I can get it to show up nicely in the blog.


$RMSFQDN = Read-Host "Please enter the FQDN of your management server:"
add-pssnapin "Microsoft.EnterpriseManagement.OperationsManager.Client";
New-ManagementGroupConnection -ConnectionString $RMSFQDN
Set-Location "OperationsManagerMonitoring::";

$startTime = [System.DateTime]::Now

$Hours = 1

$endTime = $startTime.AddHours($Hours)

$strComputerName = Read-Host "Enter the computer name to be put into maintenance mode:"

$objComputer = Get-Agent | Where-Object {$_.Name -match $strComputerName}

$comment = Read-host "Please provide a comment for putting the system in maintenance mode:"

$objComputer.HostComputer | New-MaintenanceWindow -StartTime:$startTime -EndTime:$endTime -Comment:$Comment
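
One caveat with the Where-Object filter in this script: -match is an unanchored regular expression match, so a short name can pick up more than one agent. The sketch below shows the difference using made-up host names in place of real Get-Agent output; it assumes the agent Name property holds the FQDN, which may not hold in every environment.

```powershell
# Made-up agent names standing in for the Name property returned by Get-Agent.
$agentNames = "web1.mydomain.internal", "web10.mydomain.internal", "sql1.mydomain.internal"
$strComputerName = "web1"

# -match is an unanchored regex match, so "web1" also matches "web10".
$loose = @($agentNames | Where-Object { $_ -match $strComputerName })

# Anchor and escape the input so only the host "web1" (short name or FQDN) matches.
$pattern = "^" + [regex]::Escape($strComputerName) + '(\.|$)'
$exact = @($agentNames | Where-Object { $_ -match $pattern })

"Loose matches: " + $loose.Count
"Exact matches: " + $exact.Count
```

With the loose filter, every matching host would be piped into New-MaintenanceWindow, which may or may not be what you want.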

Wednesday, February 22, 2012

Duplicate name on network warning in SCOM

Came across this lovely gem recently. Troubleshooting steps didn't really help, at all.

A duplicate name has been detected on the TCP network. The IP address of the machine that sent the message is in the data. Use nbtstat -n in a command window to see which name is in the Conflict state.

The servers in this case were in the DMZ, weren't in a domain and definitely did not have duplicate names. They did have multiple network adapters, but that shouldn't have caused the issue. I finally realized the systems had been converted from physical (P2V) into our virtual environment.

I dropped to a command prompt and ran "set devmgr_show_nonpresent_devices=1" to ensure Device Manager would show all hidden devices. From that same prompt, I ran devmgmt.msc to open Device Manager and then selected the "Show hidden devices" option from the View menu. Going through the list, the old network adapters were still in there. After removing those adapters and rebooting, the warning did not come back.

http://support.microsoft.com/kb/315539 explains how to show hidden devices in Windows.

Friday, February 17, 2012

WMI Warnings Conquered

Had my final battle with WMI and am happy to say, I won. This time the issue was caused by agentless monitoring problems. I had followed the usual steps, starting with the encryption issue in the mof files, followed by updating WMI and then WSH. However, those steps did not solve the problem with my agentless systems constantly reporting back WMI errors on monitoring.

Here is the concoction I came up with that finally stopped the last of the warnings in the console.

These were Windows 2003 servers I was attempting to monitor. I started by adding the SCOM action account to the DCOM group on the local server, and I also added the SCOM management server's computer account. I had noticed that the server and action account were both showing up in the security logs on the local server I was trying to monitor.

Next, I tweaked my Component Services permissions, granting the action account and the computer account access under both the "Access Permissions" and "Launch and Activation Permissions" options in COM Security.

I installed "Remote DTC" as a Windows feature and allowed remote clients to connect for network DTC access.

Finally, I added the action account and management server computer account with full permissions over the following registry key:
HKLM\Software\Microsoft\WBEM

Upon rebooting, I received no further WMI errors. I will post a more detailed, blow-by-blow account of the things to try when troubleshooting WMI warnings in System Center.