
SANs for the Masses

If you had previously dismissed storage networks as being too costly and complex for your organization, it may be time for another look.

When faced with a growing Christmas dilemma, Frank Costanza created a "Festivus for the rest of us." When faced with a growing storage dilemma, some network administrators have almost had to resort to the same level of creativity. Regardless of the size of your company, odds are that managing data has become increasingly difficult. Data volume continues to grow, yet the methods you may use to protect data are the same as they were five years ago.

At this point, I'm sure someone out there is thinking, "Well, if it ain't broke, don't fix it!" In data recovery circles, the word "broke" is usually associated with missing backup data or backups that take longer than expected to restore.

Microsoft once said that an administrator is only two failed restores away from a pink slip. Ultimately, this means that if you wait for recovery failures before making adjustments to your strategy, your adjustments may be helping someone else while you're off looking for a new job.

If you're at a small to mid-sized organization that never had the budget, or the assumed number of servers, to justify an investment in a storage area network (SAN), you may want to take another look at this technology.

SAN Benefits
Storage networking literally involves creating a network for your storage resources. To understand this, consider the pre-SAN storage infrastructure of nearly all organizations. Before SANs, each server represented its own data island. For any person, service, or application to access the data, the request had to traverse the LAN. Because of this type of access, backup and restore jobs always had to run at night so as not to bog down the LAN during the day. Sure, other factors also dictate that backups run during off-peak hours, such as the CPU, memory, and disk overhead associated with backup jobs. Even if a backup was local, the resource consumption on the server would be noticeable to the end user.

Aside from removing backup and restore traffic from the LAN, SANs also give you the ability to pool all of your storage resources together so that they can be used by multiple servers on your network. For example, if you're backing up to tape, odds are that only one or two servers are able to directly connect to a tape library via a SCSI connection. With a SAN, you could connect all of your servers to the same library, if desired. This gives you a much greater level of flexibility in managing your storage.

The same approach can be applied to disk storage arrays. If you connect disk arrays to a SAN, you can allocate individual disks in an array to a specific server. Many IT managers and administrators find themselves "throwing more storage" at a scalability problem, and the usual result is that some servers wind up with over-allocated storage that's never used. With centrally managed, pooled storage on a SAN, you have the ability to allocate and reallocate storage resources as your servers' storage requirements change.
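
If you like to see ideas in code, here's a minimal Python sketch of the bookkeeping behind pooled storage. The server names and capacities are made up, and real SAN management tools do this (and much more) for you; it's only meant to show the allocate-and-reclaim cycle described above:

# Toy model of a shared SAN storage pool; server names and sizes are made up.
class StoragePool:
    def __init__(self, total_gb):
        self.total_gb = total_gb
        self.allocations = {}                 # server name -> GB allocated

    def free_gb(self):
        return self.total_gb - sum(self.allocations.values())

    def allocate(self, server, gb):
        if gb > self.free_gb():
            raise ValueError("not enough free capacity in the pool")
        self.allocations[server] = self.allocations.get(server, 0) + gb

    def reclaim(self, server, gb):
        self.allocations[server] = max(0, self.allocations.get(server, 0) - gb)

pool = StoragePool(total_gb=2000)
pool.allocate("FileSrv1", 500)
pool.allocate("SQLSrv1", 750)
pool.reclaim("FileSrv1", 200)                 # hand capacity back as needs change
print(pool.free_gb(), "GB still unallocated") # 950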

To summarize, the primary purpose of a storage network is to isolate your storage infrastructure from the public network. This provides the following advantages:

  • Data storage and backup are consolidated on their own network.
  • SANs can offer up to 4Gbps bandwidth, allowing for speedy backup and recovery (see the rough calculation following this list).
  • Storage networks can have built-in redundancy (storage and access path), allowing for automated fault-tolerant data access.
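
As a rough illustration of that bandwidth point, the quick Python calculation below compares how long it takes to move 1TB over a 1Gbps LAN versus a 4Gbps SAN link. It assumes a perfectly utilized link with zero protocol overhead, so treat the numbers as a best case:

# Back-of-the-envelope backup-window math; assumes a fully utilized, ideal link.
def transfer_hours(data_tb, link_gbps):
    data_bits = data_tb * 1000**4 * 8          # decimal terabytes to bits
    return data_bits / (link_gbps * 10**9) / 3600

for label, gbps in [("1Gbps LAN", 1), ("4Gbps SAN", 4)]:
    print(f"{label}: {transfer_hours(1, gbps):.1f} hours to move 1TB")
# Roughly 2.2 hours at 1Gbps vs. 0.6 hours at 4Gbps, before any protocol overhead.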

Better Backups
Aside from being able to pool and share your storage resources, you'll also have access to several new approaches to backing up and managing data: LAN-free, server-free, and server-less backups.

LAN-free backups have been around well before the rise of storage networking. A simple LAN-free backup is shown in Figure 1.

Figure 1. LAN-free backup

A backup or restore job is considered to be LAN-free if its data doesn't traverse the LAN. The easiest way to achieve this is to locally attach a storage drive or library to each server. However, as the number of servers in your organization increases, so would the need for backup storage devices. With a SAN, you can still perform LAN-free backups, but also have the ability for several systems to share storage resources.

Server-free backups emerged as a direct result of SAN technology. With server-free backups, a backup job of a server's data will run without consuming any CPU cycles on the server. This is accomplished by using another host as the "data mover." As long as both systems have access to the target server's storage in the SAN, the system configured as the data mover can move the backup and restore data. This approach is shown in Figure 2.

Figure 2. Server-free backup

Server-less backups go one step beyond server-free backups. As with a server-free backup, no CPU cycles are consumed on the actual server hosting the data. With this approach, though, no additional server is needed to move the data. Instead, a device on the SAN such as a router or switch can make the backup copy of the data for you upon receiving a SCSI-3 Extended Copy (also known as X-Copy) command. This method is shown in Figure 3.

Figure 3. Server-less backup

While these backup types may sound appealing at this point, don't think that Microsoft is going to give you a supercharged version of Windows Backup to make it all happen. When storage resources are shared, care has to be taken to ensure that multiple servers on the SAN don't try to reuse the exact same media, such as tapes in a library. Otherwise, Server2's backup could mistakenly overwrite a tape used to back up Server1, for example. With this in mind, you would need to use a robust backup platform such as the offerings from CommVault or Veritas.

Whether or not you plan to share your backup media resources on a SAN, sharing disk resources still has merit, as you can simply allocate disks to the servers that need them, when they need them. This level of sharing can be easily accomplished with the features already present in SAN switches and host bus adapters (HBAs). With all this talk about storage networks, let's see what they're all about.

SAN Topologies
Like LANs, SANs are networks that can be configured in specific topologies. There are three basic SAN topologies:

  • Point-to-point
  • Arbitrated loop
  • Switched fabric

Point-to-Point Topology
Point-to-point is the most basic of all SAN topologies and requires little hardware to implement. As shown in Figure 4, a point-to-point SAN simply incorporates two devices connected by a single fibre channel cable. This is the equivalent of using a crossover cable to connect two computers. As with a crossover cable, the two devices have the entire bandwidth of the SAN for communications. To set up a point-to-point SAN, you connect the first component's receive connector to the second component's transmit connector, and vice versa. Due to its simplicity, this topology is rarely employed unless you only have the budget for a new library and some cabling and plan to purchase a switch the following year.

Figure 4. Point-to-point is the most basic SAN topology.

Arbitrated Loop Topology
Arbitrated loop topology, also referred to as fibre channel-arbitrated loop (FC-AL), is the SAN equivalent of a token ring network topology. In this configuration, all devices connected to the loop share the bandwidth of the single-loop connection. As with a token ring network, only one device can transmit data at a time, thus limiting the overall performance of the storage network, especially as the network starts to scale.

Figure 5. In arbitrated loop topology, all devices in the loop share bandwidth.

In this topology, all SAN components are arranged in a loop to which up to 127 devices can be connected (Figure 5). The physical connections for this topology type are typically created using a hub that supports FC-AL.

Switched Fabric Topology
Because only one device can transmit over the loop at a time, the FC-AL topology is limited in performance and scalability. The most common topology that offers optimal network performance and scalability is the switched fabric topology. Switched fabric is just like switched Ethernet and offers switched point-to-point connections between SAN network components. This topology allows each component to utilize the entire network bandwidth. Also, with switched fabric, devices can be easily added and removed from the SAN without interruption. With an FC-AL SAN, each time a device is added or removed, the entire network needs to reinitialize. A switched fabric SAN is shown in Figure 6.

Figure 6. Switched fabric topology allows each component to use the entire network's bandwidth.
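
The performance gap between the two topologies is easy to see with a toy calculation. It assumes every device wants to transmit at the same time (an admittedly worst-case simplification), over hypothetical 2Gbps links, and ignores all protocol overhead:

# Toy comparison of per-device throughput on a loop vs. a switched fabric.
# Assumes every device transmits at once and ignores protocol overhead.
LINK_GBPS = 2                        # hypothetical 2Gbps fibre channel links

def per_device_gbps(devices, topology):
    if topology == "loop":
        return LINK_GBPS / devices   # one talker at a time shares the loop
    if topology == "fabric":
        return LINK_GBPS             # every port gets a dedicated path
    raise ValueError(topology)

for n in (2, 8, 32):
    print(f"{n} devices: {per_device_gbps(n, 'loop'):.3f} Gbps each on a loop, "
          f"{per_device_gbps(n, 'fabric')} Gbps each on a fabric")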

That's the logical side of SANs; now let's look at the physical aspects.

Key SAN Ingredients
SAN components are connected with fibre channel cable, which can be purchased in two forms: copper and fiber-optic. (Fibre channel uses the spelling "fibre" so as not to be fully associated with fiber optics.) Your choice of cable should primarily be dictated by the needs of your SAN. Fiber-optic cable costs much more than copper and is more fragile, but has its advantages.

If you need to maintain high bandwidth over a great distance or if electromagnetic interference (EMI) is a concern, your likely choice is fiber optic. If you're looking for durable cable to be used for connecting local storage devices, copper may be your best bet.

Each server connecting to the SAN will need a fibre channel host bus adapter (HBA). The HBA requirement is pretty logical: If you want to connect a server to an external SCSI device, you need a SCSI adapter; the same can be said for needing an HBA to connect to a SAN. Your choice of HBA will primarily be driven by the cable and connectors used to connect to the SAN. Most older 1 to 2Gbps SAN devices interconnect cables using a Gigabit Interface Converter (GBIC). A newer connector, the small form-factor pluggable (SFP), has emerged and currently supports 4Gbps SAN transports. One major advantage of these two devices is that you can upgrade the cable used to connect to a switch without having to replace the switch itself. With an Ethernet switch, on the other hand, each connector is soldered to the switch, so you don't have that flexibility. This isn't the case with most fibre channel switches on the market today.

Because GBICs adapt switches to different fibre channel media, there are a few distinct types of GBICs. So unfortunately, you can't mosey on down to your local computer shop and shout, "I'll take three GBICs, to go!" GBICs are identified primarily by their connector types (think back to narrow vs. wide SCSI). To interface with copper media, there are GBICs with DB-9 connectors, which are old and rarely used anymore. There are also GBICs with High Speed Serial Data Connectors (HSSDCs). HSSDCs look similar to USB connectors and are the most common GBIC for copper media. For fiber-optic connections, there are multimode and single-mode GBICs, with the choice of GBIC based on the type of fiber employed in the network. A multimode GBIC is shown in Figure 7.

Figure 7. Multimode GBICs are one choice for fiber-optic connections.

Now let's look at some of the devices that may exist on a SAN.

Which Switch?
SAN switches and hubs work the same as their Ethernet counterparts, allowing you to network devices on the SAN, such as storage arrays, libraries, other switches and servers. With a hub, all devices share network bandwidth, while a switch provides dedicated point-to-point connections, just like Ethernet. Hubs are used to connect FC-AL topologies, while many switches can support either switched fabric or arbitrated loop. The type of topology a switch supports is determined by the type of ports it contains. F_Ports are used in switched fabric topologies and FL_Ports with arbitrated loop. Some switches have ports that can act as either F or FL ports. These ports are known as universal, or U_Ports. To support cascading of switches and the growth of your SAN, many switches also contain E_Ports, which are used to interconnect switches.
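
If you want a cheat sheet for those port types, here's a small Python lookup table summarizing the roles just described:

# Quick reference for the fibre channel switch port types mentioned above.
PORT_ROLES = {
    "F_Port":  "fabric port; connects a single fabric-attached device",
    "FL_Port": "fabric loop port; connects an arbitrated loop to the fabric",
    "E_Port":  "expansion port; cascades (interconnects) two switches",
    "U_Port":  "universal port; configures itself as an F_Port or FL_Port",
}

for port, role in PORT_ROLES.items():
    print(f"{port:<8} {role}")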

There's one other major SAN component to be aware of: the router or bridge. Some vendors call this piece a router, while others call it a bridge; but no one wants to go out on a limb and call it a "brouter." The majority of vendors refer to this piece as a router, so that's what I'll call it.

With fibre channel, the job of a router is much different than with IP networks. Fibre channel routers translate fibre channel frames to frames of another transport, such as SCSI. Aside from a switch, the router is perhaps the most important piece of your SAN infrastructure. Because this device provides fibre-channel-to-SCSI translation, you can connect legacy SCSI devices, such as storage arrays or libraries, to your fibre channel SAN. Many newer libraries include built-in fibre channel HBAs, so a router isn't necessary; but for moving your existing libraries to a SAN, purchasing a router is money well spent.

Zoning Laws
Another hot topic accompanying the rising popularity of SANs is the use of zoning. It's easiest to think of zoning as the SAN equivalent to virtual LANs (vLANs). With LANs, you can set up vLANs on a switch to segment the single physical switch into multiple logical switches. This makes some switch port connections unable to see connections to other switch ports. With zoning, you can apply the same concept to SAN switches.

With networked storage, security may be a primary concern. For example, you may not want servers in the Development Organizational Unit (OU) to be able to access storage in the Finance OU. By setting up zoning on your SAN switches, your storage infrastructure can be configured so that Finance servers connected to the SAN can only see disk arrays allocated to Finance.

There are two primary ways to configure zoning on SAN switches. One is by port. For example, you can allow devices on Switch A Port 5 to communicate with devices connected to Switch A Port 9. The other is by World Wide Name (WWN). WWNs are unique, 64-bit identifiers for devices or ports. Some devices with multiple ports have a WWN for each port, allowing more granular management. Because of their length, WWNs are expressed in hexadecimal, so a typical WWN would look like this:

3A:08:7C:98:56:D9:02:44

With WWN zoning, the configuration is tied to a device or a device's port, allowing you to move the device on the SAN and change its associated switch port without affecting the zoning configuration. However, if the device fails and has to be replaced, you'll have to reconfigure the zoning so that the WWN of the replacement device is associated with the correct zone.
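
To see how the two zoning styles differ in practice, here's a small Python sketch that answers the question zoning exists to answer: can this server see that storage device? Other than the sample WWN above, the switch ports and WWNs below are made up:

# Toy model of port zoning vs. WWN zoning; identifiers below are hypothetical.

# Port zoning: zone membership is a set of (switch, port) pairs.
finance_port_zone = {("SwitchA", 5), ("SwitchA", 9)}

def in_same_zone(a, b, zone):
    """Can endpoints a and b see each other?"""
    return a in zone and b in zone

print(in_same_zone(("SwitchA", 5), ("SwitchA", 9), finance_port_zone))   # True
print(in_same_zone(("SwitchA", 5), ("SwitchB", 2), finance_port_zone))   # False

# WWN zoning: zone membership follows the device (or port) WWN, not the cable.
finance_wwn_zone = {"3A:08:7C:98:56:D9:02:44",    # finance server HBA
                    "21:00:00:E0:8B:12:34:56"}    # finance disk array port

# Re-cabling the server to another switch port changes nothing below;
# replacing a failed HBA (which means a new WWN) requires updating the zone.
print(in_same_zone("3A:08:7C:98:56:D9:02:44",
                   "21:00:00:E0:8B:12:34:56", finance_wwn_zone))          # True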

Remember: Without zoning, all servers connected to the SAN have the potential to see all storage devices on the SAN. Configuring zoning allows you to limit what storage devices particular servers can see. If you have plans to expand your SAN to be shared by multiple departments, consider zoning a necessity.

Bridging the Gaps
For many organizations that require high data availability, dispersing storage across two or more remote sites offers the security of being able to maintain data availability even after a disaster. Availability can be achieved through technologies such as clustering, replication, snapshots and traditional backup and recovery. To configure a storage infrastructure that traverses geographical boundaries, organizations need a protocol that takes advantage of a WAN's economics.

The cheapest transmission medium is the Internet, which requires IP. Wouldn't it be cool to be able to bridge SANs in two sites through the Internet? That's what Fibre Channel over IP (FCIP) is all about. In order for this to happen, a device able to do the fibre-channel-to-FCIP translation is needed. Some fibre channel switches have integrated FCIP ports allowing this. Remember, however, that FCIP doesn't provide any means to directly interface with a fibre channel device; it's a method of bridging two fibre-channel SANs over an IP network.

iSCSI on the Rise
While the vast majority of today's SANs are driven by the fibre channel protocol (FCP), Internet SCSI (iSCSI) has emerged as a serious alternative. With iSCSI, you can build out a storage network using Ethernet technologies instead of proprietary fibre channel switches. For example, if you wanted to set up a 1Gb iSCSI SAN, you would start by purchasing a Gigabit Ethernet switch. As costs for 10Gb Ethernet switches drop, you could upgrade your iSCSI SAN to 10Gb and simply rotate the 1Gb switch into your LAN. With fibre channel, you would not have this flexibility, since fibre channel switches cannot switch Ethernet traffic. Instead, many older fibre channel switches wind up as bargains on eBay.

Also, if you're looking for an inexpensive start to iSCSI networking, you can evaluate the offerings of Rocket Division Software, which provides inexpensive software iSCSI target and initiator software. This software allows you to share SCSI disk resources over a TCP/IP network and is a great way to get started with iSCSI storage networking. For example, if you wanted to share SCSI drives on one box, you would install the iSCSI target software on it. Rocket Division offers a few free versions of their StarWind iSCSI target software. For iSCSI clients to share the disk storage on the target server, they need the StarPort iSCSI client software. While there are plenty of other commercial solutions available for entry-level iSCSI learning, I was turned onto the Rocket Division products by their initial price (free!), albeit for a limited number of connections.
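
If you just want a feel for the core idea (block-level reads served over plain TCP/IP), the toy Python script below mimics it with a local disk-image file. To be clear, this is not the iSCSI protocol and is no substitute for a real target such as StarWind; it's only meant to show that putting storage behind an IP socket requires no special hardware:

# Toy illustration of block access over TCP/IP; this is NOT the iSCSI protocol.
import socket, threading, time

IMAGE = "disk.img"               # hypothetical disk-image file acting as the "LUN"
HOST, PORT = "127.0.0.1", 3260   # 3260 happens to be the real iSCSI port number

def target():
    """Serve a single read request of the form b"<offset> <length>"."""
    srv = socket.socket()
    srv.bind((HOST, PORT))
    srv.listen(1)
    conn, _ = srv.accept()
    with conn:
        offset, length = map(int, conn.recv(64).decode().split())
        with open(IMAGE, "rb") as img:
            img.seek(offset)
            conn.sendall(img.read(length))

# "Initiator" side: ask the target for 512 bytes starting at offset 0.
open(IMAGE, "wb").write(b"A" * 4096)       # create a tiny demo disk image
threading.Thread(target=target, daemon=True).start()
time.sleep(0.5)                            # crude wait for the listener to come up
cli = socket.socket()
cli.connect((HOST, PORT))
cli.sendall(b"0 512")
print(len(cli.recv(512)), "bytes read over the network")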

Now if you're looking at more advanced iSCSI implementations, you'll find that several iSCSI storage devices, such as tape libraries, are still pretty costly in comparison to equivalent fibre channel devices. I believe this is purely an issue of supply and demand. As iSCSI continues to gain in popularity, more vendors will produce iSCSI products and prices will start to drop.

Like many other aspects of IT, the fact that multiple storage networking solutions exist can only be good for the consumer. Ultimately, we'll see better prices and feature sets as a result.

With their high degree of flexibility and scalability, SANs are here to stay. With prices continually becoming more reasonable for SMBs, SANs are finally a true option for the masses.
