Repairing systems with old ERDs can spell trouble. If you don’t want to reinstall the machines and restore from tape, try these steps.
Last-Ditch Fix-Its
Last month I
reviewed the basics of when to use the ERD and the necessary
steps to perform a straightforward emergency repair. But
what happens after you follow all the steps and that lovely
blue screen still stares you in the face? Is your only
hope a new installation followed by a tape restore process?
Maybe not.
The main underlying issue with most emergency repair
processes is the SETUPDD.SYS file, which is located on
Setup Disk 2 of the three Windows NT installation diskettes.
SETUPDD.SYS initiates and controls the emergency repair
process. The original version of this file uses the version
resource dates between files to make the final determination
of whether to replace a file on the system with an original
file from the NT installation disk. The problem with this
method is that after Service Pack updates, most of the
files will have more recent version resource dates and,
therefore, won’t be replaced—even if they need
to be because of corruption or mismatch.
This problem was recognized with the first Service Pack
release, and it was corrected with SP2 by modifying SETUPDD.SYS
to disable the version resource date comparison function.
This means that the SP2 and later versions of SETUPDD.SYS
only use the CRC values that are located in the SETUP.LOG
to compare against the files on the system to determine
if they should be replaced. To resolve this fundamental
problem, you must copy the SETUPDD.SYS file from SP2 or
greater to the original Setup Disk 2 of the three NT installation
diskettes.
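To make the difference between the two behaviors concrete, here is a minimal Python sketch of the decision as described above. It is an illustration only: the function names are hypothetical, the CRC values are treated as plain hex strings, and the date rule is a simplified reading of how the original SETUPDD.SYS behaves, not its actual code.

```python
from datetime import date

def should_replace_original(logged_crc, actual_crc, installed_date, media_date):
    """Pre-SP2 behavior as described in the text (simplified assumption)."""
    # A file whose version resource date is newer than the installation
    # media is left alone, even if its checksum no longer matches.
    if installed_date > media_date:
        return False
    return logged_crc.strip().lower() != actual_crc.strip().lower()

def should_replace_crc_only(logged_crc, actual_crc):
    """SP2-and-later behavior as described: only the SETUP.LOG CRC matters."""
    return logged_crc.strip().lower() != actual_crc.strip().lower()

# A corrupted file that a Service Pack updated after the original install:
# (the CRC strings and dates below are made-up sample values)
print(should_replace_original("aaaa", "bbbb", date(1999, 1, 1), date(1996, 8, 1)))
print(should_replace_crc_only("aaaa", "bbbb"))
```

The first call shows the trap: the newer date wins, so the damaged file stays put. The second call flags it for replacement because only the checksum is consulted.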
The Impact of Service Packs
Beginning with SP3, NT 4.0 really should have been called
NT 4.1 instead of just another Service Pack bug fix. SP3
put in place many fundamental components for the then-NT
5.0 project, a.k.a. Windows 2000. This evolution has only
intensified with the subsequent SP4 and SP5 modifications
to the underlying system. Most of these changes don’t
affect the repair process, and they probably aren’t
even used in most NT implementations. For example, DCOM-based
applications are only now beginning to roll out of software
shops, but DCOM’s foundational underpinnings have been in
NT since SP3. However, the changes that do directly affect
our current subject are the ones related to Logon and
modifications to the System and Security Registry hives.
As I mentioned last month, many people never update
the ERD after the initial installation of the system.
This will come back to haunt you. If you complete an emergency
repair on an SP3 or greater system with a pre-SP3 ERD
that you diligently modified with the proper SETUPDD.SYS,
you’ll be in emergency repair hell… well, at
least purgatory. When the repair is complete, your system
files will be in a pre-SP3 state with the System and Security
Registry hives already altered by the original SP3 application.
Remember, while Service Packs can make complex changes
to the structure of any component, the ERD process only
copies files. When you try to reboot the system, the pre-SP3
files that need to read the SAM won’t be able to
access the SP3-altered Security and System Registry hives.
The same is true when you run SYSKEY against the SAM.
If you’ve lost your key, you’ll need to repair
from a pre-SP3 repair disk. (KnowledgeBase article Q143475,
“Windows NT System Key Permits Strong Encryption
of the SAM,” discusses this in more detail.)
This problem has grown with the arrival of SP4. The original
SP3 files that needed to be considered were SAMSRV.DLL,
SAMLIB.DLL, and WINLOGON.EXE. With SP4, you need to add
LSASRV.DLL, SERVICES.EXE, and MSV1_0.DLL to the files
that are needed to access the underlying architecture
that has been modified during the Service Pack system
modification process.
First Aid for Old ERDs
If you find yourself with an original ERD and a later
Service Pack system that needs the emergency repair process,
first slap yourself for not keeping your ERD up to date.
Then, perform the following abbreviated steps.
- Make a duplicate copy of the Emergency Repair Disk
before modifying it, because this procedure may keep
the repair process from fixing other problems.
- Remove the attributes from the SETUP.LOG file by
typing the following at the command prompt:
attrib -r -h -s a:\SETUP.LOG
- Add the following lines under the [Files.WinNt]
section of the SETUP.LOG file (each entry is a single
line, and the quotation marks must be plain straight quotes):
\%Systemroot%\System32\Samsrv.dll = "samsrv.dll","30ec0","\","nt40 repair disk","samsrv.dll"
\%Systemroot%\System32\Samlib.dll = "samlib.dll","f993","\","nt40 repair disk","samlib.dll"
\%Systemroot%\System32\Winlogon.exe = "winlogon.exe","3c2eb","\","nt40 repair disk","winlogon.exe"
\%Systemroot%\system32\lsasrv.dll = "LSASRV.DLL","2e7c7","\","nt40 repair disk","lsasrv.dll"
\%Systemroot%\system32\services.exe = "SERVICES.EXE","2e740","\","nt40 repair disk","services.exe"
\%Systemroot%\system32\msv1_0.dll = "MSV1_0.DLL","cca6","\","nt40 repair disk","msv1_0.dll"
- Copy SAMSRV.DLL, SAMLIB.DLL, WINLOGON.EXE, LSASRV.DLL,
SERVICES.EXE, and MSV1_0.DLL from the NT 4.0 Service
Pack 4 media to the root folder of the Emergency Repair
Disk.
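If you have several disks to fix, typing these entries by hand invites typos. The SETUP.LOG edit from the steps above can be sketched in Python as follows. Treat it as a rough sketch under assumptions: the helper names are hypothetical, the entry layout simply mirrors the lines listed above, and the CRLF line endings are my assumption about how the file is stored. Run it against a copy of the disk, never the original.

```python
def format_entry(path, filename, crc, source="nt40 repair disk"):
    """Build one [Files.WinNt] line in the layout shown in the steps above."""
    return f'{path} = "{filename}","{crc}","\\","{source}","{filename}"'

def add_entries(log_text, entries, section="[Files.WinNt]"):
    """Return log_text with the entries inserted directly under the section header."""
    out = []
    inserted = False
    for line in log_text.splitlines():
        out.append(line)
        if not inserted and line.strip().lower() == section.lower():
            out.extend(entries)
            inserted = True
    if not inserted:
        raise ValueError(f"section {section} not found in SETUP.LOG text")
    # CRLF line endings are an assumption about the on-disk format.
    return "\r\n".join(out) + "\r\n"

# Two of the entries from the list above, as sample input.
new_lines = [
    format_entry(r"\%Systemroot%\System32\Samlib.dll", "samlib.dll", "f993"),
    format_entry(r"\%Systemroot%\system32\msv1_0.dll", "MSV1_0.DLL", "cca6"),
]
sample_log = '[Paths]\r\nTargetDirectory = "\\WINNT"\r\n[Files.WinNt]\r\n'
print(add_entries(sample_log, new_lines))
```

Inserting immediately under the section header keeps the sketch simple. The function refuses to guess if the section is missing, which is safer than appending the lines somewhere Setup won't look.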
For a complete description of the steps involved as well
as other methods for resolving this problem, read KnowledgeBase
article Q196603, “Repair Windows NT after Installation
of Service Pack 4,” from Microsoft’s Web site.
Similarly, KnowledgeBase article Q146887, “Repairing
Windows NT after the Application of Service Pack 3,”
describes the process to repair an SP3 installation.
The previous procedure introduces a cool concept. If
you modify the [Files.WinNt] section of SETUP.LOG appropriately,
you can force the emergency repair process to copy specific
files to the system beyond those in the original installation.
As outlined previously, the [Files.WinNt] section of SETUP.LOG
contains a reference to every NT system file, its directory
path on the system, and its CRC value. This concept and
the steps involved are outlined in KnowledgeBase article
Q164471, “Replacing System Files Using a Modified
Emergency Repair Disk.”
Automating ERDs
Keeping ERDs up to date is a tedious task because it
must be performed at each individual workstation. In many
cases administrators simply can’t keep up with the
routine task while tending to the myriad of other tasks
that scream for help. As usual, the loudest voice gets
the time and attention, and an Emergency Repair Disk isn’t
a loud enough voice until you’re confronted with
a corrupted system file.
A seemingly simple way to keep ERDs up to date is to
write a batch file that executes RDISK /S and can be run
from a shortcut in the Startup menu or placed on the desktop.
The benefit of this is that users can keep their ERDs
updated regularly. The drawback is that most users will
probably never execute the batch file. Another way to
keep the ERD information up to date is to periodically place
the RDISK command in a logon script on the domain controllers.
If you try this, make sure you use the /S- switch instead
of only the /S switch. When you add the hyphen to the
/S switch, RDISK will compile all of the Registry information
and copy it to the \%Systemroot%\repair directory, but
it won’t ask users if they want to create an ERD.
Of course, this doesn’t create or update the physical
ERD, but at least the correct, updated information will
be stored in the \%Systemroot%\repair directory of the
machine.
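One way to avoid running RDISK at every single logon is to refresh the repair directory only when its contents are stale. Here is a minimal Python sketch of that idea. The function names, the 30-day threshold, and the availability of a scripting runtime on the workstation are all my assumptions; the only piece taken from the discussion above is the RDISK /S- command itself, and the same logic could live in a batch file.

```python
import os
import subprocess
import time

def repair_info_age_days(repair_dir, now=None):
    """Days since the newest file in repair_dir changed, or None if unknown."""
    now = time.time() if now is None else now
    try:
        mtimes = [os.path.getmtime(os.path.join(repair_dir, name))
                  for name in os.listdir(repair_dir)]
    except OSError:
        return None  # directory missing or unreadable
    if not mtimes:
        return None  # nothing saved yet
    return (now - max(mtimes)) / 86400

def refresh_if_stale(repair_dir, max_age_days=30, runner=subprocess.run):
    """Run RDISK /S- when the repair info is missing or older than max_age_days."""
    age = repair_info_age_days(repair_dir)
    if age is not None and age < max_age_days:
        return False
    # /S- updates the repair directory without prompting the user for a diskette.
    runner(["rdisk", "/s-"])
    return True
```

The runner parameter exists so the logic can be tried out on a test machine with a stand-in function before it ever launches RDISK for real.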
This lack of ERD management has left a hole in the administration
of NT that’s been filled by third-party software
developers. One interesting company is Aelita Software,
which developed a product called Aelita ERDisk. (MCP Magazine
provided a full-length review of this product in the October
1999 issue, so I’ll only mention it briefly here.)
The main benefit of ERDisk is that it creates ERDs for
any number of machines on the network remotely from a
central location. In addition to updating the \%Systemroot%\repair
directory on the remote machine, the ERD information is also
copied across the network and stored in a central location
identified by individual workstation (see Figure 1). This
storage organization can also be extended to groupings
by department, location, or whatever you prefer.
Figure 1. In addition to updating
the \%Systemroot%\repair directory on a remote machine,
the ERD information is also copied across the network
and stored in a central location identified by individual
workstation.
Aelita ERDisk solves some of the major logistical problems
of managing the ERD process and maintaining the information
on a large network. I highly recommend that you obtain
an evaluation copy of the software and see if it makes sense
for your network.
Additional Information
For more information about Aelita ERDisk, read the product
review published in the October 1999 issue of MCP Magazine.
Other resources include:
- KnowledgeBase article Q143475, "Windows NT System Key
Permits Strong Encryption of the SAM."
- KnowledgeBase article Q196603, "Repair Windows NT after
Installation of Service Pack 4."
Diskless Emergency Repairs
The final issue applies mainly to Domain Controllers.
As you know, when you add user accounts and machines to
the domain controllers, the SAM grows in size to accommodate
the entries. Each user account consumes at least 1K, and
each workstation machine consumes at least 0.5K of space.
In a large organization, the SAM will grow to several
megabytes and won’t fit on the ERD, because the information
can’t span multiple disks. The RDISK /S option copies
the SAM and security information to the \%Systemroot%\repair
directory. If the SAM is too large, the attempt to write
it to an ERD fails. The proper information will still
exist in the \%Systemroot%\repair directory, however.
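The arithmetic is easy to run for your own domain. The sketch below uses the per-account minimums quoted above and compares the result with the nominal 1,440K capacity of a 1.44MB diskette. The capacity figure and the function names are my assumptions, and because the numbers are minimums and the ERD holds other files as well, treat a "fits" answer as "not ruled out" rather than a guarantee.

```python
FLOPPY_KB = 1440  # nominal capacity of a 1.44MB diskette (assumption); usable space is a bit less

def min_sam_kb(users, machines):
    """Lower bound on SAM size: at least 1K per user and 0.5K per machine account."""
    return users * 1.0 + machines * 0.5

def may_fit_on_erd(users, machines, capacity_kb=FLOPPY_KB):
    """False means the SAM alone is already too big for a single diskette."""
    return min_sam_kb(users, machines) <= capacity_kb

print(min_sam_kb(2000, 1500))      # 2750.0
print(may_fit_on_erd(2000, 1500))  # False
print(may_fit_on_erd(500, 400))    # True
```

A domain with 2,000 users and 1,500 workstations needs at least 2,750K for the SAM, so the diskette is out of the question and the repair directory becomes your only copy.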
This means you can still perform an emergency repair
without the ERD. Simply follow the procedure I outlined
last month, except that when the process asks if you have
an ERD, select No. Setup will look for the \%Systemroot%\repair
directory based upon the BOOT.INI file and proceed with
the repair process normally.
Prevention is the Best Cure
The No. 1 rule is to keep your ERDs updated on a regular
basis. The second rule is to avoid relying on the ERD
as a backup method. Only use it to bring your system to
a bootable state so you can run your backup software for the
appropriate restorations. And the third rule is to perform various
types of ERD processes on a non-production system before
the need arises. This way you aren’t stuck learning
on a disabled production machine while the eyes of frustrated
users burn holes into your head.