MOM Is Still Watching You
It takes a powerful server to watch over an entire network. Is Microsoft Operations Manager up to the task?
- By Bill Heldman
- 12/01/2001
8:05 a.m. It’s Gloria in purchasing. She can’t
log on to the network. She phrases the problem to you this way: “The Internet
must be down because I can’t sign on. It just won’t accept my password.”
8:10 a.m. You pull up your Microsoft Operations
Manager (MOM) console in order to see if any new alerts have been put
up for the network. Sure enough, after drilling into the MOM hierarchy
a bit, you find that the DHCP service is reporting that it’s out of IP
addresses.
8:15 a.m. You add IP addresses using the
DHCP admin console and ask Gloria if she’s now able to log on. She is.
You lean back in your Sam’s Club BackSaver chair and let out a satisfied
sigh. You’re a hero.
Well, not quite a hero. Gloria expects the network, including her logon,
the Internet, Internet e-mail and regular e-mail, and all of her applications
to be up 24/7. You’re glad MOM was there to alert you that you were out
of IP addresses. But how could you use MOM to help you proactively if
this problem crops up again—perhaps by automatically creating some new
addresses for you?
Product Information
Microsoft Operations Manager 2000
$850/processor for MOM; $950/processor for MOM Application Management Pack
Microsoft Corp.
Redmond, Washington
www.microsoft.com/mom/
Unfortunately, you can’t do that with this first release. MOM’s current
function is event and performance monitoring, rather than proactively
handling crisis situations. MOM can collect and analyze system events,
compile them until a pre-configured threshold counter is met and then
alert you, but it can’t actively solve problems you’re having with your
network. (MOM alerts can execute scripts—but there might not be any action
that you could script to handle the case of DHCP running out of IP addresses.)
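To make the "compile events until a threshold is met, then alert" idea concrete, here's a minimal Python sketch. It isn't MOM code and doesn't use any MOM API; the class name, the computer name DHCP01 and the event ID are placeholders I've made up for illustration.

```python
from collections import defaultdict, deque


class ThresholdAlerter:
    """Alert once `threshold` matching events arrive within `window_seconds`."""

    def __init__(self, threshold, window_seconds):
        self.threshold = threshold
        self.window_seconds = window_seconds
        # (computer, event_id) -> timestamps of recent matching events
        self._seen = defaultdict(deque)

    def record(self, computer, event_id, timestamp):
        """Record one event; return an alert string if the threshold is met, else None."""
        times = self._seen[(computer, event_id)]
        times.append(timestamp)
        # Drop events that have aged out of the window.
        while times and timestamp - times[0] > self.window_seconds:
            times.popleft()
        if len(times) >= self.threshold:
            times.clear()  # the next alert needs a fresh run of events
            return (f"ALERT: {computer} logged event {event_id} "
                    f"{self.threshold} times in {self.window_seconds}s")
        return None


# Hypothetical example: three identical events from one server inside a minute.
alerter = ThresholdAlerter(threshold=3, window_seconds=60)
results = [alerter.record("DHCP01", 1020, t) for t in (0, 10, 20)]
print(results[-1])
```

The first two events return nothing; the third trips the alert. That's the whole trick: the administrator hears about the pattern, not about every individual event.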
Systems Management vs. Operations Management
As network implementations grow and become more complex, a plethora of
management issues surface—things like how to make sure thousands of PCs
are all running the same version of software or how to know when a server
crashes. Today we have a very robust product, Microsoft Systems Management
Server (SMS), that allows us to solve the former problem, but the tools
we have to monitor the latter aren’t as granular as we’d like. SMS handles
change and configuration management. But what it doesn’t do—and never
has—is monitor things like non-SMS server services, printing functions,
server hardware and other key elements of your network, and then report
back to administrators via some alerting mechanism that a system is in
trouble. This latter capability is called operations management.
The folks at Microsoft had the change and configuration management thing
handled with SMS, but weren’t doing anything with operations management.
Now they’ve plugged this hole in their systems management software with
MOM. Microsoft purchased the code that underlies MOM from NetIQ (www.netiq.com),
a vendor actively involved in Windows NT- and 2000-based metrics and operations
management monitoring.
MOM System Requirements
You can install MOM on a single computer or distribute its components
among multiple computers. When you do a single-computer installation,
you install the MOM administrator console (which, of course, uses MMC
as its interface); MOM reporting; and the Web console (which allows administrators
to pull up the MOM information from a browser on any computer). A MOM
server configured in this manner has some minimum recommended hardware
requirements:
- 550 MHz Intel Pentium III processor.
- Minimum of 5GB of free disk space.
- 512MB of RAM.
- Video adapter capable of rendering 256 colors or more at 800x600
resolution.
- Minimum of 10MB/second network speed.
- Windows 2000 Server or Advanced Server running Service Pack 2.
Microsoft minimum requirements are just that—bare minimums. As such,
they need to be beefed up by at least 50 percent. If I were building a
production MOM box, I’d put it on a two- or four-way Pentium III 1GHz
or better with at least a full gigabyte of RAM and as much disk space
as I could cram into the box. Better yet, I’d consider splitting the load
out to multiple servers. There’s no need to install the software on a
domain controller—Win2K Server running as a member server is fine for
MOM.
You also need to tell MOM what to monitor. MOM’s monitoring rules are
contained in knowledge modules—software components that run atop the MOM
engine and contain Knowledge Base articles, preset threshold values, event
IDs and other pertinent information designed to keep track of a particular
component. I’ll review the included and optional add-on knowledge modules
later in this article.
Computers that are being monitored must run an agent and, thus, also
have minimum requirements. A monitored computer needs to have at least
a 200MHz Pentium CPU, 100MB of free disk space, 64MB of RAM, and Windows
NT 4.0 SP4 or Win2K. MOM also contains a modicum of Unix support; it can
read Syslog files shipped from a Unix computer.
MOM Installation
One of the interesting features you see when you first run the MOM setup
program is the Prerequisite Verifier, which takes a gander at your system
and says, “Hey, you need to install or upgrade these things before I can
install the product.” Figure 1 shows the Prerequisite Verifier in action.
One feature of particular interest is the clickable link at the bottom
of the Prerequisite Verifier that will carry out the needed configuration
actions. The instructions in the details pane of the Prerequisite Verifier
are quite good and tell you exactly what you need to do. I simply cut
and pasted them into a Word document, then printed out the whole thing.
Figure 1. The Prerequisite Verifier checks a variety of software before it will let you begin a new MOM installation.
MOM uses a SQL Server database to store all the information it collects
about the systems you choose to monitor. The full edition of SQL Server
2000 is the preferred database solution, but MOM comes with the MSDE version
to use for evaluation. MOM uses Microsoft Access 2000 for its reports;
you can use the included Runtime edition or install the full product (which
you’ll need if you want to modify or create reports).
MOM’s estimated retail price is based solely on the number of processors
in each MOM computer, at $850 per processor. There’s an optional Application
Management Pack that I’ll talk about in a minute. If you decide to purchase
this, you’ll pay another $950 per processor. There’s no need to purchase
a license for the computers that you’re monitoring.
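For budgeting, the arithmetic is simple enough to sketch in a few lines of Python. The figures are the estimated retail prices quoted above; the function name is my own, and your actual cost under a volume licensing agreement will likely differ.

```python
MOM_PER_PROCESSOR = 850        # USD, estimated retail price quoted above
APP_PACK_PER_PROCESSOR = 950   # USD, optional Application Management Pack


def license_cost(processors, with_app_pack=False):
    """Estimated retail cost for the processors in your MOM server(s)."""
    per_cpu = MOM_PER_PROCESSOR
    if with_app_pack:
        per_cpu += APP_PACK_PER_PROCESSOR
    return processors * per_cpu


print(license_cost(4))                      # four-way MOM box, base product
print(license_cost(4, with_app_pack=True))  # same box plus the Application Management Pack
```

A four-way MOM server comes to $3,400 for the base product and $7,200 with the Application Management Pack. Monitored computers add nothing to the total.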
What Comes with MOM?
Great question! Remember that MOM’s job is to watch Windows-based computers’
event logs, read the events posted there, and then alert the administrator
of that event and possibly even perform some action. Here are the things
that MOM brings to the table:
- Event Management. MOM watches multiple computers and aggregates
their events into a central database. Because of this capability, it’s
possible for administrators to get an overview of how the server farm
is performing (metrics) or to drill down on a specific computer to gather
more information about an event (alerting).
- Rules. The administrator can create rules that can perform
a specified function when an event occurs. A pretty cool thing that
rules can do is hook up with a Microsoft Knowledge Base (KB) article
referencing the difficulty you’re having. Figure 2 shows a DNS dynamic
lookup rule that points to a particular KB article. Some of the KB references
in the rules that Microsoft supplies are pretty generic; others are
in-depth and right to the heart of the problem.
- Alerts. Administrators may set up alerts that examine a single
computer event, a string of events from a given computer, or a string
of events from several computers. They can be set up with varying severity
levels. You can drill down into an alert to pinpoint the data that led
up to it, and alerts can be set up to e-mail or page administrators.
They can also send Simple Network Management Protocol (SNMP)
traps or run a script that redirects their information
to other management systems (such as Hewlett-Packard’s OpenView, CA
Unicenter, IBM Tivoli, or BMC Patrol). You can also view alerts directly
in the MOM management console, as shown in Figure 3.
- Performance. MOM can be configured to monitor performance
thresholds by using System Monitor counters. This kind of information
is not only useful for alerting, but also for capacity planning and
historical tracking of system behavior. Thresholds can be set for local
events or you can aggregate the events into a system-wide collection.
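The relationship between events, rules and alerts can be sketched roughly as follows. This is a hypothetical Python model of the concepts in the list above, not MOM's actual rule format; the field names, server names and event IDs are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    computer: str
    event_id: int


@dataclass(frozen=True)
class Rule:
    name: str
    event_id: int
    severity: str            # e.g. "warning", "error", "critical"
    min_events: int = 1      # how many matching events are needed
    min_computers: int = 1   # how many distinct computers must report them


def evaluate(rule, events):
    """Return an alert dict if `events` satisfy `rule`, else None."""
    matches = [e for e in events if e.event_id == rule.event_id]
    computers = sorted({e.computer for e in matches})
    if len(matches) >= rule.min_events and len(computers) >= rule.min_computers:
        # Keep the matching events so the alert can be drilled into later.
        return {"rule": rule.name, "severity": rule.severity,
                "computers": computers, "evidence": matches}
    return None


# Hypothetical events collected from three servers.
events = [Event("SQL01", 9002), Event("SQL02", 9002), Event("WEB01", 6008)]
single = Rule("Log full", 9002, "error")
farm = Rule("Log full across farm", 9002, "critical", min_events=2, min_computers=2)

print(evaluate(single, events)["severity"])
print(evaluate(farm, events)["computers"])
```

The same handful of events can satisfy a single-computer rule at one severity and a multi-computer rule at a higher one, which is the reason for aggregating everything into a central database in the first place.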
Figure 2. Some events reference the Microsoft Knowledge Base directly for more information.
Figure 3. Some alerts in the MOM management console. Most of these alerts relate to a SQL Server database that was running out of space.
Management packs (collections of knowledge modules) are the brains behind
the MOM operation. They’re preconfigured rule-sets and Knowledge Base
articles that snap into MOM and provide administrators a foundation for
their network monitoring. The rules can be modified according to your
specific needs. With the base MOM product, you receive Management Packs
that can be set up to monitor these components:
- Win2K
- Active Directory
- File Replication Service (FRS)
- DNS
- WINS
- IIS
- DHCP
- RRAS
- Microsoft Transaction Server (MTS)
- Microsoft Message Queuing (MSMQ)
- Microsoft Distributed Transaction Coordinator (MSDTC)
- SMS
- MOM
- Terminal Server
- Windows NT 4.0 systems logs
You can also purchase the optional Application Management Pack that covers
various BackOffice and .NET server products:
- Exchange Server 2000
- SQL Server 2000 and 7.0
- Exchange 5.5
- Site Server 3.0
- Proxy Server 2.0
- SNA Server 4.0
Other vendors can write application-management packs that snap into MOM,
as well. NetIQ has been actively involved in this area and offers eXtended
Management Packs (XMPs) for MOM to monitor Oracle, Web services and antivirus
software, as well as extended capabilities for monitoring Windows networks.
NetIQ agents use each application’s API to extract more information than
is possible from simply reading a system’s event logs and System Monitor
counters, which is really all that MOM does today. As a result, NetIQ’s
XMPs (as well as other third-party offerings) will be the real added value
that makes MOM go over the top.
The Down Side of Uptime
Many organizations depend on server uptime as a key
metric. If you present a report to a group of people
interested in the uptime of a given set of servers and
you fail to include in your report that certain key
server services were down for a brief time during your
reporting period, are you lying when you say the server
was up?
Think about it this way. You're running an Exchange
server that gets a lot of use during the work week.
You check the server each morning: Yep, still up and
running. One morning you get a call telling you that
e-mail isn't working. You go to the server and, horrified,
figure out that one of the Exchange server services
has stopped, along with its dependent services, for
no apparent reason. Your heart skips a beat. What if
this thing's in the tank? Fortunately, you're able to
restart the services OK and get on with your life. The
logs reveal that the thing halted in the middle of the
night. E-mail services have been stopped for more than
five hours.
Does that episode count as a server outage? The server
was up the entire time, but what about its services?
Do you see the distinction? When you present your reports,
it doesn't mean diddly that your servers were up if
the crucial services they were supposed to run weren't
running for whatever reason. You technically still had
a server outage on your hands because (and here's the
key part) clients couldn't access the computer. That's
the real deal with uptime. Were clients able to utilize
the server? If not, even though the server was perfectly
operational the whole time, in actuality you had a server
outage on your hands. (Note that clients, in this context,
could include another server that needs to access your problem
box to do, say, a database lookup. If the service is
down, the server's out and your uptime reports need
to reflect it that way.)
I've actually had people argue this uptime point
with me, even though it seems so blatantly obvious.
If the spring were broken on your garage door but the
electric garage door opener was perfectly operational,
would you say that you could still use the garage door?
No! It doesn't matter that the system is running if
a key operational component is down.
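To put a number on the Exchange example, here's a rough Python sketch. The 30-day reporting period is my assumption; the five-hour outage is the one described above.

```python
def availability(period_hours, outage_hours):
    """Percentage of the period during which clients could use the service."""
    return 100.0 * (period_hours - outage_hours) / period_hours


period = 30 * 24                            # assumed 30-day reporting period, in hours
server_uptime = availability(period, 0)     # the box itself never went down
service_uptime = availability(period, 5)    # but e-mail was stopped for five hours

print(f"Server:  {server_uptime:.2f}%")
print(f"Service: {service_uptime:.2f}%")
```

The server report says 100 percent. The number your users experienced is closer to 99.31 percent, and that's the one that belongs in the report.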
MOM also includes a Reporting tool with the capability of viewing reports
in chart or HTML format. Load-balancing and redundancy are fully supported.
MOM’s server/agent architecture keeps network traffic down yet provides
a method for central data collection. You can push the agents to targeted
groups of computers via an intelligent push and install mechanism, thus
reducing the amount of administrative overhead.
When you start MOM to view the alerts for various systems, you’re taken
to the default MOM node and given a complete system overview, similar
to what you see in Figure 4. In the details pane of this view you’re given
an “executive overview” of the status of the system, including the number
of alerts you’ve not yet addressed, the performance monitors you’re watching,
the events you’ve received, computers you’re monitoring, and so on. Note
the excessive number of rules that are processed, by default, within the
MOM system.
Figure 4. The default node in the MOM management console, showing an overload of rules to be monitored.
Minor MOM Annoyances
There are some annoying things about MOM in this release. MOM is extremely
verbose and can generate heavy CPU usage and network traffic, as well
as copious output. It’s difficult to manage rules due to the depth of
the hierarchy that contains them. And installing a knowledge module enables
all the rules in that module, making it easy to swamp yourself in data
and bog down your network. My advice: Limit the number of knowledge modules
you install.
MOM isn’t integrated with SMS, apart from a knowledge module that can
monitor SMS computers. MOM personnel in Redmond tell me that there are
plans to allow for the integration of SMS with MOM sometime in the future.
Somebody Else's Code Under the Sheets
As I mentioned in the main article, MOM is based on
code that Microsoft purchased from NetIQ. Lest you think
that buying and repackaging code is something
new and sneaky on Microsoft's part, think again. Microsoft
has been at this kind of thing for a long time. For
example:
- NTBACKUP.EXE, that old tape backup stalwart
that shipped clear back with NT 3.5, was actually
written by Veritas.
- The Windows 2000 disk defragmenter code was
taken from a great third-party NT defragmentation tool
called Diskeeper.
- Windows Terminal Services is a little brother
to Citrix MetaFrame. (Microsoft has a sort of symbiotic
relationship going with Citrix. Citrix is the only
developer authorized to bundle its code directly over
the NT kernel. When you buy MetaFrame, you're buying
Microsoft OS code disguised as Citrix; when you run
Terminal Services, you're running highly scaled-down
Citrix code.)
Some would even say that Windows XP's icons look an
awful lot like Mac OS X's, but I'm not gonna go there! Others
would point out that Microsoft purchases lots of
code: FrontPage, PowerPoint, Visio and a bunch of games,
for starters. In fact, an anti-Microsoft site, www.vcnet.com/bms/departments/catalog/yrcatalog.shtml,
manages to point out an entire shopping list of code
that was developed by someone else and then purchased
by Microsoft.
My guess is that a lot of people who start small software
development companies would like nothing more than for
Microsoft to nail down a deal with them that made the
company financially solid and its owners millionaires.
I also offer that Microsoft has been pretty good at
taking a software product that was initially developed
with some sort of standalone use in mind and then integrating
it into the appropriate Microsoft suite. So, even if
you believe that Microsoft isn't any good at writing
code (a claim I don't believe and would never defend),
you've got to admit that Microsoft is awfully good at
making the code work well and become a viable part of
the company's offerings.
The Long and Short of It
So, should you rush to install MOM on your own network?
If you’ve already invested in NetIQ’s AppManager suite of products, stay
there until MOM has been through the second release cycle and some of
the third-party offerings have been released and tweaked. AppManager’s
current functionality beats that of the just-released MOM.
If you haven’t invested yet in NetIQ and its plethora of Windows-based
management products and you have a favorable licensing agreement with
Microsoft, such as the Select or Enterprise Agreement programs, then it
may be to your benefit to investigate the use of MOM instead of AppManager.
Even then, you should consider purchasing XMPs for the applications you
want to monitor within your MOM system as they’re released. Be sure to
leave adequate room in your budget to cover these additional purchases.
Some Microsoft products are excellent in their first release and get
better and better as service packs and new revisions come out. I think
Exchange and ISA Server are great examples of this. But other Microsoft
code seems to come out the door not quite ready for prime time. SMS 2.0
was a great example of this: Microsoft released the product right at Y2K
crush time and it wasn’t until SP2 for SMS (now at SP3) that the product
really got to where it needed to be. MOM is somewhere in the middle. It’s
not the fully robust code that I’d like to see; but neither is it as buggy
as SMS 2.0 was when it first shipped. If you’re anxious to get into the
operations management game, then you can safely go forward with MOM, but
I’d anticipate fairly fast service pack releases coupled with a rush of
XMPs.
Personally, I don’t think MOM buys you a heck of a lot at this juncture
because it’s more about event and System Monitor counter consolidation
than it is about operations management. That said, let me speak out of
both sides of my mouth and say that if you’re willing to invest the time
to install, tune and understand what MOM’s telling you, you’ll end up
with a system that—in 15 minutes—can give you your entire network’s heartbeat.
And that may be worth the price of admission.