In-Depth
Troubleshooting Under Pressure
The root of your problems may lie in your event logs.
I was working on the help desk of a regional public
transportation agency, and we were implementing a standardization project
to move people away from DOS, Windows 3.1, Windows 95, and mainframes
to a Windows NT Workstation base. Contractors were doing the actual rollout,
removing the old hardware, distributing the new machines and configuring
the users.
We added a new station in one of the maintenance garages. Previously
there was only a mainframe terminal; now those users were getting a shiny
new PC. New Ethernet cabling was run, the new box put in place and the
old terminal removed.
The maintenance shop supervisor returned from user training, perhaps
not eager to put these new skills to work, but at least ready to try.
But the new workstation displayed an error dialog saying it was unable to
log in to the domain. Of course, with NT, if you can’t log in, you can’t do
anything.
Because of the new wiring, we immediately suspected the physical connection
and called the cabling contractor to come out and check it. The contractor
said the wiring was just fine and left (probably saying unkind things
about us). Of course, a simple check of the link LED on the NIC would
have shown us that it had a good connection to the hub.
Next, we sent out the primary contractor to fix the problem. He tried
changing out the NIC, unloading and reloading the driver and loading and
unloading components, all with no success.
We sent out another technician the next morning. He repeated many of
the same troubleshooting steps, with exactly the same results: No ability
to log in, and no way to work on the computer at all.
The shop supervisor, who fit the stereotype of a union bus maintenance
shop supervisor—forearms like Popeye—wasn’t happy.
This is where I got involved. The supervisor was asking politely if we
would consider returning the terminal he had previously, so he could get
some work done. I grabbed one of the terminals and drove out to the site.
I was met there by the supervisor, who was holding a three-foot crescent
wrench in his hand. He never left the office while I was there. Considering
the wrench and who was holding it, I didn’t want to leave without fixing
the problem.
I logged on locally to the machine and checked the System Event Log.
There were about 300 messages, all identical: A duplicate name was found
and all network functions were disabled. We were using computer names
that were five-digit numbers; they shouldn’t have had any duplicates,
but this one had a transposition error in it. I changed the computer’s
name to what it should have been, rebooted, and helped the shop supervisor
log on. Total time on site: Maybe 10 minutes.
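A duplicate like this one can also be caught before a machine ever reaches the floor. The following is a minimal sketch in Python, not the tooling we actually used: it assumes you keep a list of assigned names and a list of names as deployed, and every function name and sample number in it is hypothetical. It flags names deployed more than once and then suggests which assigned-but-missing name is one adjacent digit swap away.

```python
from collections import Counter


def find_duplicate_names(deployed_names):
    """Return the names that appear more than once, sorted."""
    counts = Counter(deployed_names)
    return sorted(name for name, n in counts.items() if n > 1)


def is_adjacent_transposition(a, b):
    """True if b is a with exactly one pair of adjacent characters swapped."""
    if len(a) != len(b) or a == b:
        return False
    diffs = [i for i in range(len(a)) if a[i] != b[i]]
    return (
        len(diffs) == 2
        and diffs[1] == diffs[0] + 1
        and a[diffs[0]] == b[diffs[1]]
        and a[diffs[1]] == b[diffs[0]]
    )


def likely_intended_names(duplicate, assigned_names, deployed_names):
    """Assigned names not yet deployed that are one swap from the duplicate."""
    deployed = set(deployed_names)
    return sorted(
        name
        for name in assigned_names
        if name not in deployed and is_adjacent_transposition(duplicate, name)
    )


# Hypothetical sample data: "10243" was mistyped as "10234" during setup.
assigned = ["10234", "10243", "10250"]
deployed = ["10234", "10234", "10250"]

for dup in find_duplicate_names(deployed):
    print(dup, "is duplicated; possibly meant:",
          likely_intended_names(dup, assigned, deployed))
```

The check only covers swaps of neighboring digits, which is the kind of error described here; other typos would need a broader comparison.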
The moral: Check the event logs first. The other technicians had experience
with Windows 9x, but not NT, so they didn’t know about the Event Log service.
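When a log holds hundreds of entries, grouping identical ones makes the dominant error obvious at a glance. The sketch below, in Python, assumes the events have already been exported into a list of dictionaries; the field names (`source`, `event_id`, `message`), the source names, and the IDs are all hypothetical placeholders, not real Event Log values.

```python
from collections import Counter


def summarize_events(events, top=5):
    """Count identical events and return the most frequent ones first."""
    counts = Counter(
        (e["source"], e["event_id"], e["message"]) for e in events
    )
    return counts.most_common(top)


# Hypothetical export: one error repeated 300 times plus unrelated noise.
sample = (
    [{"source": "ExampleNet", "event_id": 1001,
      "message": "A duplicate name was found"}] * 300
    + [{"source": "ExampleDisk", "event_id": 2002,
        "message": "Routine check completed"}] * 2
)

for (source, event_id, message), count in summarize_events(sample):
    print(f"{count:4d}  {source}  {event_id}  {message}")
```

With a summary like this, 300 copies of one message sort to the top, which is the same signal that stood out when scrolling through the log by hand.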
About the Author
James D. Pollock, MCSE, MCSA, successfully escaped from the maintenance
garage and now works as a systems administrator at Pioneer Pacific College,
where he’s also a senior instructor teaching Microsoft networking.