Tech Line

Disk Signature Disaster

Saving yourself some pain when replacing failed cluster disks.

Chris: We recently had a hard disk fail on our Windows 2003 cluster and it was an absolute nightmare. I replaced the failed disk and could not get the cluster server to recognize the new disk in order to restore the missing disk files from backup. I assigned the same drive letter to the new disk, but each time I tried to bring the disk resource online, it failed.

Since we were in a pinch, we decided to scrap the cluster and start again from scratch. After rebuilding the cluster, we were able to restore files to the two cluster virtual servers from backup. I'm sure there has to be an easier way to recover a failed cluster. What else could I have done?
— James

 

 


After talking with James, I learned that his cluster ran two virtual servers: a virtual file server and a virtual print server. Each virtual server resided in its own group on the cluster. With this relatively simple setup, rebuilding his cluster did not take long. Still, since the point of having a cluster is high availability, taking down an entire cluster is never the best option. James ran into this problem because of how Microsoft Cluster Service (MSCS) treats disk signatures.

MSCS identifies physical disk resources by the disk signature that Windows writes to each physical disk when the disk is initialized. If you replace a physical disk within the cluster, the Cluster service will see the original disk as failed and will not see the new disk at all. For the new disk to be treated as the original disk, the signature recorded for the original disk in the cluster configuration must match the new disk's signature. While a few tools can do this, by far the easiest way to associate the new disk with the failed disk is to run the Server Cluster Recovery Utility.
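To make the idea of a disk signature concrete, here is a minimal Python sketch that pulls the signature out of the first sector of a disk. It assumes an MBR-partitioned disk, where the 32-bit signature sits at byte offset 0x1B8 of sector 0. The function names are my own for illustration and are not part of any Microsoft tool.

```python
import struct

# Assumption: MBR-partitioned disk. The 4-byte disk signature is stored
# little-endian at offset 0x1B8, and a valid MBR sector ends in 0x55 0xAA.
MBR_SIGNATURE_OFFSET = 0x1B8
MBR_BOOT_MARKER = b"\x55\xaa"


def read_mbr_disk_signature(sector0: bytes) -> int:
    """Return the 32-bit disk signature from the first sector of an MBR disk."""
    if len(sector0) < 512:
        raise ValueError("need at least the first 512 bytes of the disk")
    if sector0[510:512] != MBR_BOOT_MARKER:
        raise ValueError("sector does not end with the 0x55AA boot marker")
    (signature,) = struct.unpack_from("<I", sector0, MBR_SIGNATURE_OFFSET)
    return signature


def format_signature(signature: int) -> str:
    """Render the signature as eight uppercase hex digits."""
    return f"{signature:08X}"


# Illustrative use with a fabricated sector (not data from a real disk).
fake_sector = bytearray(512)
fake_sector[MBR_SIGNATURE_OFFSET:MBR_SIGNATURE_OFFSET + 4] = struct.pack("<I", 0x1234ABCD)
fake_sector[510:512] = MBR_BOOT_MARKER
print(format_signature(read_mbr_disk_signature(bytes(fake_sector))))
```

The 512 bytes could come from a disk image or from a raw read of the device. A freshly initialized replacement disk gets a new signature, so the value read from it will differ from the one read from the disk it replaces.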

The Server Cluster Recovery Utility is included in the Windows Server 2003 Resource Kit and can be downloaded from Microsoft at http://www.microsoft.com/downloads. This tool is especially useful when replacing a shared cluster disk or in a disaster recovery scenario when a cluster is being rebuilt using new physical disk resources. Oftentimes, after a cluster quorum is restored, physical disk resources will still not be able to come online. That's because the signature for the disks stored in the cluster configuration does not match the signature of the new disks. In these instances, the Server Cluster Recovery Utility can be used to return the disks to a usable state.
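The mismatch described above can be pictured with a toy Python model. This is only an illustration of the matching rule, not how the Cluster service is implemented. The resource names and signature values are made up, and `find_unmatched_resources` is a hypothetical helper.

```python
def find_unmatched_resources(recorded, present):
    """Return names of disk resources whose recorded signature is not on any attached disk.

    recorded: dict mapping a resource name to the signature stored in the
              (modeled) cluster configuration.
    present:  iterable of signatures read from the disks currently attached.
    """
    present_set = set(present)
    return sorted(name for name, sig in recorded.items() if sig not in present_set)


# Hypothetical example: the disk behind "Disk F:" was replaced, so the
# attached disk carries a new signature the configuration has never seen.
recorded = {"Disk Q:": 0x1A2B3C4D, "Disk F:": 0x5E6F7081}
present = [0x1A2B3C4D, 0x99AA11BB]
print(find_unmatched_resources(recorded, present))  # prints ['Disk F:']
```

Notice that drive letters never enter the comparison. That is why assigning the old drive letter to the new disk, as James tried, does not help the resource come online.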

To use the Server Cluster Recovery Utility, first install the replacement disk and use Disk Management to initialize and format the new disk as NTFS. Then go to Cluster Administrator and create a new resource for the newly added physical disk. Here are the steps:

  1. In Cluster Administrator, right-click the Resources container, select New, and then click Resource.
  2. In the New Resource dialog box, enter a name for the new resource, select "Physical Disk" as the resource type, and then select the group to associate the resource with.
  3. Select the possible owners for the disk (same as original disk) and click Next.
  4. In the Dependencies dialog box, click Next.
  5. The newly added disk should be displayed in the Disk drop-down menu. Select the disk and click Finish.

With the newly installed disk associated with the cluster, you can now use it to replace the failed disk resource. To do this, first ensure that the Windows Server 2003 Resource Kit Tools are installed on the node you plan to perform the procedure on and then follow these steps:

  1. Run clusterrecovery.exe to open the Server Cluster Recovery Utility.
  2. Once the tool opens, enter the name of your cluster in the Cluster Name field. Then select the "Replace a physical disk resource" radio button and click Next.
  3. Select the original (failed) disk in the "Old physical disk resource" drop-down menu and then select the new physical disk from the "New physical disk resource" drop-down menu. Then click Replace.
  4. Next you are given a friendly reminder from the Server Cluster Recovery Utility to delete the original disk resource and then change the drive letter of the new disk resource so that it matches the drive letter assigned to the original (failed) disk. Click OK.
  5. Click Exit to close the Server Cluster Recovery Utility.
  6. In Cluster Administrator, locate the failed disk resource. The failed disk resource will be easy to spot in Cluster Administrator because it will have the word "(lost)" next to its name. Right-click on the lost resource and select Delete. When prompted to confirm, click Yes.
  7. Use Disk Management to change the drive letter associated with the new disk.

At this point, you can bring the virtual server resources back online and restore the original virtual server data from backup.

Note that some resources may fail to come online. For example, a File Share resource will fail if the original folder that the resource is associated with is not present. After the backup is restored, you will be able to bring all resources in the group (virtual server) online. Also keep in mind that, depending on how your enterprise backup software is configured, you'll most likely need to reinstall your backup agent software on the virtual server in order to perform the restore.
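If you want a quick check before bringing File Share resources online, a small Python sketch like the following can list which shared folders are still missing after the restore. The folder paths are hypothetical placeholders, and the script only checks the file system. It does not talk to the Cluster service.

```python
from pathlib import Path


def missing_share_folders(share_paths):
    """Return the folders from share_paths that do not exist yet, in input order."""
    return [p for p in share_paths if not Path(p).is_dir()]


# Hypothetical folder list; substitute the folders your File Share resources use.
shares = [r"F:\Shares\Finance", r"F:\Shares\Engineering"]
for folder in missing_share_folders(shares):
    print(f"Not restored yet: {folder}")
```

An empty result suggests the folders backing the shares are in place. Any folder still listed is a likely reason for a File Share resource to fail.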

Before the days of the Server Cluster Recovery Utility, cluster disk recovery was fraught with pain. As soon as I would hear of a problem, my mind would instantly fill up with the burnt tooth smell that serves as an ominous sign at most dentists' offices. Now when I hear of a cluster disk failure, I just smile from ear to ear. This could mean either that I'm comforted by the ease of the Server Cluster Recovery Utility, or that my sanity is starting to return!

About the Author

Chris Wolf is a Microsoft MVP for Windows - Virtual Machine and is an MCSE, MCT, and CCNA. He's a Senior Analyst for Burton Group who specializes in the areas of virtualization solutions, high availability, storage and enterprise management. Chris is the author of Virtualization: From the Desktop to the Enterprise (Apress) and Troubleshooting Microsoft Technologies (Addison Wesley), and a contributor to the Windows Server 2003 Deployment Kit (Microsoft Press).
