In-Depth
SANs for the Masses
If you had previously dismissed storage networks as being too costly and complex for your organization, it may be time for another look.
When faced with a growing Christmas dilemma, Frank Costanza created a
"Festivus for the rest of us." When faced with a growing storage
dilemma, some network administrators have almost had to resort to the
same level of creativity. Regardless of the size of your company, odds
are that managing data has become increasingly difficult. Data volumes
continue to grow, yet the methods you use to protect that data may be
the same as they were five years ago.
At this point, I'm sure someone out there is thinking, "Well if
it ain't broke, don't fix it!" In data recovery circles, the word
"broke" is usually associated with missing backup data or backups that
take longer than expected to restore.
Microsoft once said that an administrator is only two failed restores
away from a pink slip. Ultimately, this means that if you wait for recovery
failures before making adjustments to your strategy, your adjustments
may be helping someone else while you're off looking for a new job.
If your small to mid-sized organization didn't seem to have the budget
or the number of servers assumed necessary to justify an investment in a
storage area network (SAN), you may want to take another look at this technology.
SAN Benefits
Storage networking literally involves creating a network for your
storage resources. To understand this, consider the pre-SAN storage infrastructure
of nearly all organizations. Before SANs, each server represented its
own data island. For any person, service, or application to access the
data, they would have to traverse the LAN. Because of this type of access,
backup and restore jobs always had to run at night so as not to bog down
the LAN during the day. Sure, other factors also dictate that backups
run during off-peak hours, such as the CPU, memory, and disk overhead
associated with backup jobs. Even if a backup was local, the resource
consumption on the server would be noticeable to the end user.
Aside from removing backup and restore traffic from the LAN, SANs will
also give you the ability to pool all of your storage resources together
so that they can be used by multiple servers on your network. For example,
if you're backing up to tape, odds are that only one or two servers are
able to directly connect to a tape library via a SCSI connection. With
a SAN, you could connect all of your servers to the same library, if desired.
This would give you a much greater level of flexibility in managing your
storage.
The same approach can be applied to disk storage arrays. If you connect
disk arrays to a SAN, you can allocate individual disks in the array to
a specific server. In many organizations, IT managers and administrators
find themselves "throwing more storage" at a scalability
problem. The usual result is that some servers wind up
with over-allocated storage that's never used. With centrally managed
pooled storage on a SAN, you would have the ability to allocate and reallocate
storage resources as your servers' storage requirements change.
To summarize, the primary purpose of a storage network is to isolate
your storage infrastructure from the public network. This provides the
following advantages:
- Data storage and backup are consolidated on their own network.
- SANs can offer up to 4Gbps bandwidth, allowing for speedy backup and
recovery.
- Storage networks can have built-in redundancy (storage and access
path), allowing for automated fault-tolerant data access.
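To put that bandwidth advantage in perspective, here's a back-of-the-envelope sketch. The 500GB dataset size is an assumed example, and the numbers ignore protocol overhead and device limits; the point is simply how a dedicated 4Gbps link shrinks a backup window compared to a 1Gbps LAN.

```python
def backup_hours(dataset_gb: float, link_gbps: float) -> float:
    """Idealized transfer time in hours: bits to move divided by link speed."""
    gigabits = dataset_gb * 8          # convert gigabytes to gigabits
    seconds = gigabits / link_gbps     # idealized wire time
    return seconds / 3600

# Assumed example: a 500GB full backup.
lan_hours = backup_hours(500, 1)   # shared 1Gbps Ethernet LAN
san_hours = backup_hours(500, 4)   # dedicated 4Gbps fibre channel link

print(f"1Gbps LAN: {lan_hours:.2f} h, 4Gbps SAN: {san_hours:.2f} h")
```

In practice the source disks or tape drives are often the bottleneck before the link is, but the ratio still illustrates why a dedicated storage network widens your backup window options.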
Better Backups
Aside from being able to pool and share your storage resources,
you'll also gain access to several new approaches to backing
up and managing data: LAN-free, server-free,
and server-less.
LAN-free backups have been around since well before the rise of storage networking.
A simple LAN-free backup is shown in Figure 1.
Figure 1. LAN-free backup
A backup or restore job is considered to be LAN-free if its data doesn't
traverse the LAN. The easiest way to achieve this is to locally attach
a storage drive or library to each server. However, as the number of servers
in your organization increases, so would the need for backup storage devices.
With a SAN, you can still perform LAN-free backups, but also have the
ability for several systems to share storage resources.
Server-free backups emerged as a direct result of SAN technology. With
server-free backups, a backup job of a server's data will run without
consuming any CPU cycles on the server. This is accomplished by using
another host as the "data mover." As long as both systems have
access to the target server's storage in the SAN, the system configured
as the data mover can move the backup and restore data. This approach
is shown in Figure 2.
Figure 2. Server-free backup
Server-less backups go one step beyond server-free backups. As with
a server-free backup, no CPU cycles are consumed on the actual server
hosting the data. Also, with this approach, no additional server is needed
to move the data. Instead, a device on the SAN, such as a router or switch,
can make the backup copy of the data for you upon receiving a SCSI-3 Extended
Copy (also known as X-Copy) command. This method is shown in Figure 3.
Figure 3. Server-less backup
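The three approaches differ in where the data path runs and who spends the CPU cycles. This small sketch summarizes the distinctions in a lookup table; the field names are mine, not from any backup product.

```python
# Toy summary of the three SAN backup approaches; not a vendor API.
BACKUP_TYPES = {
    "LAN-free": {
        "crosses_lan": False,
        "host_cpu_used": True,
        "data_mover": "the backup host itself",
    },
    "server-free": {
        "crosses_lan": False,
        "host_cpu_used": False,
        "data_mover": "a second SAN-attached host",
    },
    "server-less": {
        "crosses_lan": False,
        "host_cpu_used": False,
        "data_mover": "a SAN device via SCSI-3 Extended Copy",
    },
}

for name, traits in BACKUP_TYPES.items():
    print(f"{name}: data mover is {traits['data_mover']}")
```

Note that all three keep backup traffic off the LAN; what changes is whether the protected server, another server, or the SAN infrastructure itself does the moving.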
While these backup types may sound appealing at this point, don't think
that Microsoft is going to give you a supercharged version of Windows
Backup to make it all happen. When storage resources are shared, care
has to be taken to ensure that multiple servers on the SAN don't try to
reuse the exact same media, such as tapes in a library. Otherwise,
Server2's backup could mistakenly overwrite a tape used
to back up Server1, for example. With this in mind, you would need to use
a robust backup platform such as the offerings from CommVault
or Veritas.
Whether or not you plan to share your backup media resources on a SAN,
sharing disk resources still has merit, as you can simply allocate
disks to the servers that need them, when they need them. This level of sharing
can be easily accomplished with the features already present in SAN switches
and host bus adapters (HBAs). With all this talk about storage networks,
let's see what they're all about.
SAN Topologies
Like LANs, SANs are networks that can be configured in specific
topologies. There are three basic SAN topologies:
- Point-to-point
- Arbitrated loop
- Switched fabric
Point-to-Point Topology
Point-to-point is the most basic of all SAN topologies and requires
little hardware to implement. As shown in Figure 4, a point-to-point SAN
simply incorporates two devices connected by a single fibre channel cable.
This is the equivalent of using a crossover cable to connect two computers.
As with a crossover cable, the two devices have the entire bandwidth of
the SAN for communications. To set up a point-to-point SAN, you connect
the first component's receive connector to the second component's transmit
connector, and vice versa. Due to its simplicity, this topology is rarely
employed unless you only have the budget for a new library and some cabling
and plan to purchase a switch the following year.
Figure 4. Point-to-point is the most basic SAN topology.
Arbitrated Loop Topology
Arbitrated loop topology, also referred to as fibre channel-arbitrated
loop (FC-AL), is the SAN equivalent of a token ring network topology.
In this configuration, all devices connected to the loop share the bandwidth
of the single-loop connection. As on a token ring network, only one
device can transmit data at a time, limiting overall performance
of the storage network, especially as the network starts to scale.
Figure 5. In arbitrated loop topology, all devices in the loop share bandwidth.
In this topology, all SAN components are arranged in a loop to which
up to 127 devices can be connected (Figure 5). The physical connections
for this topology are typically created using a hub that supports
FC-AL.
Switched Fabric Topology
Because only one device can transmit over the loop at a time, the
FC-AL topology is limited in performance and scalability. The most common
topology that offers optimal network performance and scalability is the
switched fabric topology. Switched fabric is just like switched Ethernet
and offers switched point-to-point connections between SAN network components.
This topology allows each component to utilize the entire network bandwidth.
Also, with switched fabric, devices can be easily added and removed from
the SAN without interruption. With an FC-AL SAN, each time a device is
added or removed, the entire network needs to reinitialize. A switched
fabric SAN is shown in Figure 6.
Figure 6. Switched fabric topology allows each component to use the entire network's bandwidth.
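The scaling difference between the two topologies comes down to shared versus dedicated bandwidth. A minimal sketch, assuming an idealized 2Gbps link with no arbitration or switching overhead:

```python
def per_device_bandwidth_gbps(link_gbps: float, devices: int,
                              topology: str) -> float:
    """Idealized worst-case bandwidth per device while all devices are active."""
    if topology == "arbitrated-loop":
        # Only one device transmits at a time, so the loop's
        # bandwidth is effectively divided among its members.
        return link_gbps / devices
    if topology == "switched-fabric":
        # Each port gets a dedicated point-to-point link.
        return link_gbps
    raise ValueError(f"unknown topology: {topology}")

print(per_device_bandwidth_gbps(2, 8, "arbitrated-loop"))  # → 0.25
print(per_device_bandwidth_gbps(2, 8, "switched-fabric"))  # → 2
```

An eight-device loop leaves each member a fraction of the link, while the same eight devices on a fabric each keep the full rate, which is why fabrics scale and loops don't.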
That's the logical side of SANs; now let's look at the physical aspects.
Key SAN Ingredients
SAN components are connected with fibre channel cable, which can
be purchased in two forms: copper and fiber-optic. (Fibre channel uses
the spelling "fibre" to avoid being associated exclusively with fiber
optics.) Your choice of cable should primarily be dictated by the needs
of your SAN. Fiber-optic cable costs much more than copper and is more
fragile, but has its advantages.
If you need to maintain high bandwidth over a great distance or if electromagnetic
interference (EMI) is a concern, your likely choice is fiber optic. If
you're looking for durable cable to be used for connecting local storage
devices, copper may be your best bet.
Each server connecting to the SAN will need a fibre channel host bus
adapter (HBA). The HBA requirement is pretty logical. If you want to connect
a server to an external SCSI device, you need a SCSI adapter. The same
can be said for needing an HBA to connect to a SAN. Your choice of HBA
will primarily be driven by the cable and connectors used to connect to
the SAN. Most older 1 to 2Gbps SAN devices interconnect cables using
a Gigabit Interface Converter (GBIC). A newer connector, the small form
factor pluggable (SFP), has emerged and currently supports 4Gbps SAN
transports. One major advantage of these two devices is that you can upgrade
the cable used to connect to a switch without having to replace the switch
itself. With an Ethernet switch, on the other hand, each connector is
soldered to the switch, so you don't have that flexibility. This isn't
the case with most fibre channel switches on the market today.
Because GBICs adapt switches to different fibre channel
media, there are a few unique types of GBICs. So unfortunately, you
can't mosey on down to your local computer shop and shout, "I'll
take three GBICs, to go!" GBICs are identified primarily by their
connector types (think back to narrow vs. wide SCSI). To interface with
copper media, there are GBICs with DB-9 connectors, which are old and
rarely used anymore. There are also GBICs with High Speed Serial Data
Connectors (HSSDCs). HSSDCs look similar to USB connectors and are the most common
GBIC for copper media. For fiber-optic connections, there are multimode
and singlemode GBICs, with the choice of GBIC based on the type of fiber
employed in the network. A multimode GBIC is shown in Figure 7.
Figure 7. Multimode GBICs are one choice for fiber-optic connections.
Now let's look at some of the devices that may exist on a SAN.
Which Switch?
SAN switches and hubs work the same as their Ethernet counterparts,
allowing you to network devices on the SAN, such as storage arrays, libraries,
other switches and servers. With a hub, all devices share network
bandwidth, while a switch provides dedicated point-to-point connections,
just like Ethernet. Hubs are used to connect FC-AL topologies, while many
switches can support either switched fabric or arbitrated loop. The type
of topology a switch supports is determined by the type of ports it contains.
F_Ports are used in switched fabric topologies and FL_Ports with arbitrated
loop. Some switches have ports that can act as either F or FL ports. These
ports are known as universal, or U_Ports. To support cascading of switches
and the growth of your SAN, many switches also contain E_Ports, which
are used to interconnect switches.
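Those port roles lend themselves to a simple lookup. Here's a toy sketch summarizing the text above; the table is mine, not any switch vendor's API.

```python
# Toy mapping of fibre channel switch port types to their roles.
PORT_ROLES = {
    "F_Port": "connects a node in a switched fabric topology",
    "FL_Port": "connects an arbitrated loop to the switch",
    "U_Port": "universal port that can act as either an F_Port or an FL_Port",
    "E_Port": "expansion port used to interconnect (cascade) switches",
}

def describe(port_type: str) -> str:
    """Return the role of a port type, or a fallback for unknown ones."""
    return PORT_ROLES.get(port_type, "unknown port type")

print(describe("E_Port"))
print(describe("U_Port"))
```

When shopping for a switch, counting how many of each port type it offers tells you how it can participate in your topology and how far you can cascade your fabric.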
There's one other major SAN component to be aware of: the router
or bridge. Some vendors call this piece a router, while others call it
a bridge; but no one wants to go out on a limb and call it a "brouter."
The majority of vendors refer to it as a router, so that's what
I'll call it.
With fibre channel, the job of a router is much different than with IP
networks. Fibre channel routers translate fibre channel frames to frames
of another transport, such as SCSI. Aside from a switch, the router is
perhaps the most important piece of your SAN infrastructure. Because this
device provides fibre-channel-to-SCSI translation, you can connect legacy
SCSI devices, such as storage arrays or libraries, to your fibre channel
SAN. Many newer libraries include built-in fibre channel HBAs, so a router
isn't necessary; but for moving your existing libraries to a SAN, purchasing
a router is money well spent.
Zoning Laws
Another hot topic accompanying the rising popularity of SANs is
the use of zoning. It's easiest to think of zoning as the SAN equivalent
to virtual LANs (vLANs). With LANs, you can set up vLANs on a switch to
segment the single physical switch into multiple logical switches. This
makes some switch port connections unable to see connections to other
switch ports. With zoning, you can apply the same concept to SAN switches.
With networked storage, security may be a primary concern. For example,
you may not want servers in the Development Organizational Unit (OU) to
be able to access storage in the Finance OU. By setting up zoning on your
SAN switches, your storage infrastructure can be configured so that Finance
servers connected to the SAN can only see disk arrays allocated to Finance.
There are two primary ways to configure zoning on SAN switches. One is
by port. For example, you can allow devices on Switch A Port 5 to communicate
with devices connected to Switch A Port 9. The other is by World Wide
Name (WWN). WWNs are unique, 64-bit identifiers for devices or ports.
Some devices with multiple ports have a WWN for each port, allowing more
granular management. Because of their length, WWNs are expressed in hexadecimal,
so a typical WWN would look like this:
3A:08:7C:98:56:D9:02:44
With WWN zoning, configuration is on a device or device-port basis, allowing
you to move the device on the SAN and change its associated switch port
without affecting zoning configuration. However, if the device fails and
has to be replaced, you'll have to reconfigure the zoning so that the
WWN of the replacement device is associated with the correct zone.
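Since zone membership by WWN hinges on matching those 64-bit identifiers exactly, it helps to normalize them before comparing. A minimal sketch; the zone name and WWNs are made up for illustration:

```python
import re

def normalize_wwn(wwn: str) -> str:
    """Canonicalize a WWN to lowercase colon-separated hex pairs."""
    hex_digits = re.sub(r"[^0-9a-fA-F]", "", wwn)
    if len(hex_digits) != 16:  # 64 bits = 16 hex digits
        raise ValueError(f"WWN must be 64 bits: {wwn!r}")
    return ":".join(hex_digits[i:i + 2] for i in range(0, 16, 2)).lower()

# Hypothetical zone database: zone name -> set of member WWNs.
zones = {
    "finance": {normalize_wwn("3A:08:7C:98:56:D9:02:44")},
}

def in_zone(zone: str, wwn: str) -> bool:
    """Check membership regardless of how the WWN was formatted."""
    return normalize_wwn(wwn) in zones.get(zone, set())

print(in_zone("finance", "3a087c9856d90244"))  # → True
```

Real switches handle this normalization for you in their zoning interfaces; the sketch just shows why WWN zoning survives a port move but not a device swap, since the identifier travels with the hardware.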
Remember: Without zoning, all servers connected to the SAN have
the potential to see all storage devices on the SAN. Configuring zoning
allows you to limit what storage devices particular servers can see. If
you have plans to expand your SAN to be shared by multiple departments,
consider zoning a necessity.
Bridging the Gaps
For many organizations that require high data availability, dispersing
storage across two or more remote sites offers the security of being able
to maintain data availability even after a disaster. Availability can
be achieved through technologies such as clustering, replication, snapshots
and traditional backup and recovery. To configure a storage infrastructure
that traverses geographical boundaries, organizations need a protocol
that can take advantage of a WAN's economics.
The cheapest transmission medium is the Internet, which requires IP.
Wouldn't it be cool to be able to bridge SANs in two sites through the
Internet? That's what Fibre Channel over IP (FCIP) is all about. In order
for this to happen, a device able to do the fibre-channel-to-FCIP translation
is needed. Some fibre channel switches have integrated FCIP ports allowing
this. Remember, however, that FCIP doesn't provide any means to directly
interface with a fibre channel device; it's a method of bridging two fibre-channel
SANs over an IP network.
iSCSI on the Rise
While the vast majority of today's SANs are driven by the fibre channel
protocol (FCP), Internet SCSI (iSCSI) has emerged as a serious alternative.
With iSCSI, you can build out a storage network using Ethernet technologies,
instead of proprietary fibre channel switches. For example, if you wanted
to set up a 1Gb iSCSI SAN, you would start by purchasing a Gb Ethernet
switch. As costs for 10Gb Ethernet switches drop, you could upgrade your
iSCSI SAN to 10Gb and simply rotate the 1Gb switch into your LAN. With
fibre channel, you would not have this flexibility since fibre channel
switches cannot switch Ethernet traffic. Instead, many older fibre channel
switches wind up as bargains on eBay.
Also, if you're looking for an inexpensive start to iSCSI networking,
you can evaluate the offerings of Rocket
Division Software, which provides inexpensive software iSCSI target
and initiator software. This software allows you to share SCSI disk resources
over a TCP/IP network and is a great way to get started with iSCSI storage
networking. For example, if you wanted to share SCSI drives on one box,
you would install the iSCSI target software on it. Rocket Division offers
a few free versions of their StarWind
iSCSI target software. For iSCSI clients to share the disk storage
on the target server, they need the StarPort
iSCSI client software. While there are plenty of other commercial
solutions available for entry-level iSCSI learning, I was drawn to
the Rocket Division products by their initial price (free!), albeit for
a limited number of connections.
Now if you're looking at more advanced iSCSI implementations, you'll
find that several iSCSI storage devices, such as tape libraries, are still
pretty costly compared to equivalent fibre channel devices.
I believe that this is purely an issue of supply and demand. As iSCSI
continues to gain in popularity, more vendors will be producing iSCSI
products and thus the prices will start to drop.
Like many other aspects of IT, the fact that multiple storage networking
solutions exist can only be good for the consumer. Ultimately, we'll see
better prices and feature sets as a result.
With their high degree of flexibility and scalability, SANs are here
to stay. With prices continually becoming more reasonable for SMBs,
SANs are finally truly an option for the masses.