Sliced Too Thin
Fixing last week's slow file copy problem required taking bigger bites.
- By Bill Boswell
Last week, we published a question from an administrator who had a complex
problem involving poor file transfer performance between Windows 2000
and NT machines both locally and across a WAN link. Here's a quick synopsis:
Tad's the admin for a Windows 2000 machine in Chicago and an NT machine
in Seattle. The two offices were connected by a DS3 that had been verified
by his network engineers to be correctly configured.
Tad copied files between the machines using Robocopy and measured the
transfer times. He reported some truly pathetic performance results: 0.5
Mbps when copying a file to an NT server across the WAN, 15 Mbps when
copying to an NT server locally, and, surprisingly, 39 Mbps when
copying from the NT server. Finally, he reported getting good file transfer
rates locally between Windows 2000 machines (51 Mbps) but only 5 Mbps
across the WAN.
Tad included this packet trace:
# Time Source Destination
37 0.468750 10.10.10.10 10.50.50.50 SMB Write AndX Request,
FID: 0x080f, 4292 bytes at offset 5906824
38 0.531250 10.50.50.50 10.10.10.10 TCP netbios-ssn > 2958 [ACK]
Seq=79369844 Ack=1230722740 Win=8760 Len=0
39 0.531250 10.50.50.50 10.10.10.10 SMB Write AndX Response, FID:
0x080f, 4292 bytes
40 0.531250 10.10.10.10 10.50.50.50 SMB Write AndX Request, FID:
0x080f, 4292 bytes at offset 5911116
41 0.593750 10.50.50.50 10.10.10.10 TCP netbios-ssn > 2958 [ACK]
Seq=79369895 Ack=1230727100 Win=8760 Len=0
42 0.593750 10.50.50.50 10.10.10.10 SMB Write AndX Response, FID:
0x080f, 4292 bytes
Tad was also puzzled because when he used a Netmon capture file as a
test file, his transfer rates improved dramatically.
I invited you to try your hand at solving the problem, and I got lots
of great replies. Eric from AT&T, Karen from Amerigroup, and Brian
from Arkwright came the closest, but nobody got the root cause and a full
explanation on the first try. Bradley, Christopher, Steven, and Jeffrey
also came close and deserve honorable mention.
Several of you guessed that the problem was related to a mismatch between
full and half duplex. I'm sure we all have been caught by this problem
at one time or another, and that was my original thought when Tad sent
me the question. I should have mentioned in the scenario that we eliminated
duplex mismatch by checking the switches and routers for collisions at
the specific ports where the servers were connected.
A few of you pointed out that a congested frame relay connection could
cause the poor WAN performance Tad experienced. I hoped that the scenario
made clear that Tad had contracted for the full DS3 bandwidth, but I should
have made it clearer.
I also got quite a few guesses pinpointing frame fragmentation as the
problem. That pesky 4292 frame size in the trace was responsible for those
guesses, and it cost Tad and me some troubleshooting time, as well. It
turns out that Network Monitor (and Ethereal http://www.ethereal.com,
a more sophisticated open-source packet sniffer) combines consecutive
Ethernet frames if their payload is part of the same TCP datagram. I'm
not able to confirm whether this is a feature of the sniffers or a result
of the way the NDIS (Network Device Interface Specification) drivers report
results. If someone knows for sure, please let me know.
A couple of you guessed that Robocopy might be responsible, either by
compressing or otherwise manipulating the data stream. Robocopy was not
the culprit, and Tad got the same transfer times using XCOPY.
What's the root cause, then? It's a feature (if you want to call it that)
of the NT TCP/IP stack documented in Microsoft KB article 223140,
"SMB Block Size Negotiation When Copying Files with Windows NT Explorer."
Here is a synopsis:
NT uses two modes to copy files: Core and Raw. At a client computer,
when copying a file to an NT computer, data is transferred in
Core mode using 4 KB blocks. When copying a file from an NT computer,
data is typically transferred in Raw mode in 60 KB blocks.
This is why Tad got 39 Mbps when copying from the NT server locally
in Chicago but only got 15 Mbps when copying to the NT server.
The Core mode file transfer is inefficient.
Before we discuss the fix, let's see why this feature caused Tad's poor
WAN performance.
Although the maximum theoretical speed of a DS3 connection is 45 Mbps,
several factors can reduce the practical throughput you can expect
to achieve. Sure, you're going to lose a little performance due to framing
latency, and if you don't contract with the network service provider to
get the full Committed Information Rate, you may get dropped frames. But
the big difference in Tad's case was the distance between the offices.
The speed of light in fiber optic connections is approximately 200,000
kilometers per second. Using a nifty distance calculator at http://www.indo.com/distance,
you can determine that a geodesic from Chicago to Seattle is 2,795 kilometers.
Doing some quick math, you'll see that it takes about 14 milliseconds
for a frame to travel that distance, meaning that the shortest possible
round trip time between the two offices is 28 ms. The real-world time
will be much longer because fiber doesn't follow a geodesic and there
are other sources of latency along the way.
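
The arithmetic above can be checked with a short sketch. It assumes light travels through fiber at roughly 200,000 kilometers per second and that the path follows the geodesic exactly, so the result is a lower bound rather than a prediction. The function and variable names are illustrative only.

```python
# Approximate speed of light in optical fiber, in kilometers per second.
FIBER_SPEED_KM_PER_S = 200_000

def one_way_delay_ms(distance_km, speed_km_per_s=FIBER_SPEED_KM_PER_S):
    """Return the one-way propagation delay in milliseconds."""
    return distance_km / speed_km_per_s * 1000

chicago_seattle_km = 2795  # geodesic distance quoted in the column
one_way = one_way_delay_ms(chicago_seattle_km)
print(f"one way: {one_way:.1f} ms, round trip: {2 * one_way:.1f} ms")
```

Any real circuit adds routing detours, equipment delay, and queuing on top of this floor, which is why the measured figure is more than double the theoretical one.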
In Tad's case, the total round trip time was something like 62 ms. You
can calculate this from the information supplied in the packet trace.
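
As a rough sketch of that calculation, the snippet below pairs each SMB write request in the trace with the write response that follows it and reports the gap. The timestamps are copied from frames 37 through 42; the tuple layout and the function name are assumptions made for illustration.

```python
# (frame number, capture time in seconds, frame type) for frames 37-42 of the trace.
frames = [
    (37, 0.468750, "write request"),
    (38, 0.531250, "ack"),
    (39, 0.531250, "write response"),
    (40, 0.531250, "write request"),
    (41, 0.593750, "ack"),
    (42, 0.593750, "write response"),
]

def request_to_response_ms(frames):
    """Pair each write request with the next write response; return the gaps in ms."""
    gaps = []
    pending = None
    for _, timestamp, kind in frames:
        if kind == "write request":
            pending = timestamp
        elif kind == "write response" and pending is not None:
            gaps.append((timestamp - pending) * 1000)
            pending = None
    return gaps

print(request_to_response_ms(frames))  # [62.5, 62.5]
```

Both request/response pairs in the excerpt are 62.5 ms apart, which is where the "something like 62 ms" figure comes from.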
Now, because a Core file transfer in NT has such a small block size (4 KB
compared to 60 KB), the number of request/response round trips required to
complete the transfer is very high. In Tad's case, with each 4 KB block
waiting roughly 62 ms for its acknowledgment, the WAN performance was very bad.
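
To put a number on it, here is a simplified model that assumes only one block is in flight per round trip, as the trace suggests, and ignores transmission time and protocol headers. The function name is hypothetical, and the 60 KB figure is used only for comparison.

```python
def stop_and_wait_mbps(block_bytes, rtt_ms):
    """Throughput in Mbps when each block must be answered before the next is sent."""
    bits_per_block = block_bytes * 8
    return bits_per_block / (rtt_ms / 1000) / 1_000_000

rtt_ms = 62.5                                  # request-to-response gap in the trace
core = stop_and_wait_mbps(4292, rtt_ms)        # block size seen in the trace
raw = stop_and_wait_mbps(60 * 1024, rtt_ms)    # 60 KB block, for comparison
print(f"4292-byte blocks: {core:.2f} Mbps; 60 KB blocks: {raw:.2f} Mbps")
# prints "4292-byte blocks: 0.55 Mbps; 60 KB blocks: 7.86 Mbps"
```

The 0.55 Mbps result lines up closely with the 0.5 Mbps Tad measured when copying to the NT server across the WAN.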
The fix involves adding two Registry entries at the NT machine, only
one of which is documented in KB 223140.
The first entry increases the SMB Request Buffer size to its maximum of
65535 bytes. The second entry increases the TCP Window size to the largest
value that is a whole multiple of the 1460-byte maximum TCP payload of a
standard Ethernet frame (1500 bytes less 40 bytes of IP and TCP headers).
The 16-bit window field tops out at 65535 bytes, and 44 full-size segments
yield 64240 bytes.
Value: SizReqBuf (REG_DWORD)
Data: 65535 (decimal)
Value: TcpWindowSize (REG_DWORD)
Data: 64240 (decimal)
After making these two entries and restarting the NT server, the file
transfer times to and from the NT machine nearly matched the performance
between two Windows 2000 servers, both locally and across the WAN.
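
The 64240-byte window value comes from rounding the largest 16-bit window down to a whole number of full-size segments. A minimal sketch of that arithmetic follows, assuming standard Ethernet with a 1500-byte payload and no IP or TCP options; the constant and function names are illustrative.

```python
MAX_WINDOW = 65535   # largest value a 16-bit TCP window field can advertise
MSS = 1460           # 1500-byte Ethernet payload minus 20-byte IP and 20-byte TCP headers

def largest_window_multiple(max_window=MAX_WINDOW, mss=MSS):
    """Return the largest whole multiple of the segment size that fits in the window."""
    return (max_window // mss) * mss

window = largest_window_multiple()
print(window, window // MSS)  # 64240 44
```

On a link with a different maximum segment size, the same rounding would give a different TcpWindowSize value.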
As for the performance improvement Tad measured when using the Network
Monitor (Netmon) capture file, that had us confused for quite a while,
I'm embarrassed to say. Apparently, Netmon places each captured packet
in its own disk cluster within the capture file, rather than packing all
the data into one cluster before getting a new cluster. This makes the
size of the file on disk quite large compared to the actual data stored
in the file. We verified this behavior by creating a new volume with a
32 KB cluster size and then copying a Netmon file to the volume. On a volume
with a 512-byte cluster size, the file was 102 MB. On a volume with a
32 KB cluster size, the file showed a size of over 1 GB.
Thanks to those of you who volunteered a response. See you next week!
Contributing Editor Bill Boswell, MCSE, is the principal of Bill Boswell Consulting, Inc. He's the author of Inside Windows Server 2003 and Learning Exchange Server 2003, both from Addison-Wesley. Bill is also Redmond magazine's "Windows Insider" columnist and a speaker at MCP Magazine's TechMentor Conferences.