Boswell's Q&A

Sliced Too Thin

Fixing last week's slow file copy problem required taking bigger bites.

Readers: Last week we published a question from an administrator who had a complex problem involving poor file transfer performance between Windows 2000 and NT machines both locally and across a WAN link. Here's a quick synopsis:

Tad's the admin for a Windows 2000 machine in Chicago and an NT machine in Seattle. The two offices were connected by a DS3 that had been verified by his network engineers to be correctly configured.

Tad copied files between the machines using Robocopy and measured the transfer times. He reported some truly pathetic performance results: 0.5 Mbps when copying a file to an NT server across the WAN, 15 Mbps when copying to an NT server locally, and, surprisingly, 39 Mbps when copying from the NT server. Finally, he reported getting good file transfer times locally between Windows 2000 machines (51 Mbps) but only 5 Mbps across the WAN.

Tad included this file trace:

#  Time     Source      Destination Prot Summary
37 0.468750 SMB  Write AndX Request, FID: 0x080f, 4292 bytes at offset 5906824
38 0.531250 TCP  netbios-ssn > 2958 [ACK] Seq=79369844 Ack=1230722740 Win=8760 Len=0
39 0.531250 SMB  Write AndX Response, FID: 0x080f, 4292 bytes
40 0.531250 SMB  Write AndX Request, FID: 0x080f, 4292 bytes at offset 5911116
41 0.593750 TCP  netbios-ssn > 2958 [ACK] Seq=79369895 Ack=1230727100 Win=8760 Len=0
42 0.593750 SMB  Write AndX Response, FID: 0x080f, 4292 bytes

Tad was also puzzled because when he used a Netmon capture file as a test file, his transfer rates improved dramatically.

Get Help from Bill

Got a Windows or Exchange question or need troubleshooting help? Or maybe you want a better explanation than provided in the manuals? Describe your dilemma in an e-mail to Bill at mailto:[email protected]; the best questions get answered in this column.

When you send your questions, please include your full first and last name, location, and certifications (if any) with your message. (If you prefer to remain anonymous, specify this in your message but submit the requested information for verification purposes.)

I invited you to try your hand at solving the problem, and I got lots of great replies. Eric from AT&T, Karen from Amerigroup, and Brian from Arkwright came the closest, but nobody got the root cause and a full explanation on the first try. Bradley, Christopher, Steven, and Jeffrey also came close and deserve honorable mention.

Several of you guessed that the problem was related to a mismatch between full and half duplex. I'm sure we all have been caught by this problem at one time or another, and that was my original thought when Tad sent me the question. I should have mentioned in the scenario that we eliminated duplex mismatch by checking the switches and routers for collisions at the specific ports where the servers were connected.

A few of you pointed out that a congested frame relay connection could cause the poor WAN performance Tad experienced. I hoped that the scenario made clear that Tad had contracted for the full DS3 bandwidth, but I should have made it clearer.

I also got quite a few guesses pinpointing frame fragmentation as the problem. That pesky 4292 frame size in the trace was responsible for those guesses, and it cost Tad and me some troubleshooting time as well. It turns out that Network Monitor (and Ethereal, a more sophisticated open-source packet sniffer) combines consecutive Ethernet frames if their payload is part of the same TCP datagram. I'm not able to confirm whether this is a feature of the sniffers or a result of the way the NDIS (Network Device Interface Specification) drivers report results. If someone knows for sure, please let me know.

A couple of you guessed that Robocopy might be responsible, either by compressing or otherwise manipulating the data stream. Robocopy was not the culprit, and Tad got the same transfer times using XCOPY.

What's the root cause, then? It's a feature (if you want to call it that) of the NT TCP/IP stack documented in Microsoft KB article 223140, "SMB Block Size Negotiation When Copying Files with Windows NT Explorer." Here is a synopsis:

NT uses two modes to copy files: Core and Raw. At a client computer, when copying a file to an NT computer, data is transferred in Core mode using 4 KB blocks. When copying a file from an NT computer, data is typically transferred in Raw mode in 60 KB blocks.

This is why Tad got 39 Mbps when copying from the NT server locally in Chicago but only got 15 Mbps when copying to the NT server. The Core mode file transfer is inefficient.

Before we discuss the fix, let's see why this feature caused Tad's poor WAN performance.

Although the maximum theoretical speed of a DS3 connection is 45 Mbps, several factors reduce the practical throughput you can expect to achieve. Sure, you're going to lose a little performance due to framing latency, and if you don't contract with the network service provider to get the full Committed Information Rate, you may get dropped frames. But the big difference in Tad's case was the distance between the offices.

The speed of light in fiber optic connections is approximately 200,000 kilometers per second. Using an online distance calculator, you can determine that a geodesic from Chicago to Seattle is 2,795 kilometers. Doing some quick math, you'll see that it takes about 14 milliseconds for a frame to travel that distance, meaning that the shortest possible round trip time between the two offices is 28 ms. The real-world time will be much longer because fiber doesn't follow a geodesic and there are other sources of latency along the way.
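The quick math is easy to check. Here is a minimal Python sketch, assuming light covers roughly 200,000 km per second in fiber and using the 2,795 km geodesic distance; the function and constant names are made up for illustration.

```python
# Rough propagation-delay estimate for the Chicago-Seattle link.
# Assumption: light travels about 200,000 km/s in fiber.
FIBER_SPEED_KM_PER_S = 200_000
DISTANCE_KM = 2_795  # geodesic distance quoted in the column


def one_way_delay_ms(distance_km, speed_km_per_s=FIBER_SPEED_KM_PER_S):
    """Return the one-way propagation delay in milliseconds."""
    return distance_km / speed_km_per_s * 1000


one_way = one_way_delay_ms(DISTANCE_KM)
round_trip = 2 * one_way
print(f"one way: about {one_way:.0f} ms, round trip: about {round_trip:.0f} ms")
```

This is a floor, not a prediction: it ignores routing detours, equipment delays, and queuing, all of which add to the real figure.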

In Tad's case, the total round trip time was something like 62 ms. You can calculate this from the information supplied in the packet trace.
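To see where that number comes from, you can pair each Write AndX Request in the trace with the Response that follows it. The sketch below is only an illustration: the timestamps are copied from frames 37, 39, 40, and 42, and the helper name is invented.

```python
# (frame number, timestamp in seconds, kind) copied from the trace excerpt
trace = [
    (37, 0.468750, "request"),
    (39, 0.531250, "response"),
    (40, 0.531250, "request"),
    (42, 0.593750, "response"),
]


def round_trips_ms(frames):
    """Pair each request with the next response; return the delays in ms."""
    delays = []
    pending = None
    for _, timestamp, kind in frames:
        if kind == "request":
            pending = timestamp
        elif kind == "response" and pending is not None:
            delays.append((timestamp - pending) * 1000)
            pending = None
    return delays


rtts = round_trips_ms(trace)
print(rtts)  # [62.5, 62.5]
```

The timestamps in this excerpt all fall on multiples of 1/64 of a second, so treat 62.5 ms as an approximation rather than a precise measurement.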

Now, because a Core file transfer in NT has such a small block size (4K compared to 60K), the number of ACKs required to complete the transaction is very high. In Tad's case, with each ACK imposing a 62 ms delay, the WAN performance was very bad.
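A back-of-the-envelope model makes the point. The Python sketch below assumes the sender waits one full round trip for each block before sending the next, which is what the trace appears to show, and ignores the time needed to put the bytes on the wire. The 4292-byte block comes from the trace; the 62.5 ms round trip is derived from its timestamps; the function name is hypothetical.

```python
def one_block_per_round_trip_mbps(block_bytes, rtt_seconds):
    """Throughput in Mbps if exactly one block is sent per round trip."""
    return block_bytes * 8 / rtt_seconds / 1_000_000


RTT_SECONDS = 0.0625  # roughly the round trip seen in Tad's trace

core = one_block_per_round_trip_mbps(4292, RTT_SECONDS)      # block size in the trace
raw = one_block_per_round_trip_mbps(60 * 1024, RTT_SECONDS)  # 60 KB Raw-mode block

print(f"Core-sized blocks: {core:.2f} Mbps")  # 0.55 Mbps
print(f"Raw-sized blocks:  {raw:.2f} Mbps")   # 7.86 Mbps
```

The first figure lines up well with the 0.5 Mbps Tad measured when copying to the NT server across the WAN. The second is only an estimate of what larger blocks could buy under the same simplifying assumption.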

The fix involves adding two Registry entries at the NT machine, only one of which is documented in KB 223140. The first entry increases the SMB Request Buffer size to its maximum of 65535 bytes. The second entry increases the TCP window size to the largest whole multiple of the maximum segment payload that fits under the 65535-byte limit. On Ethernet, a TCP segment carries at most 1460 bytes of payload, so the result is 44 segments, or 64240 bytes.

Key: HKLM\SYSTEM\CurrentControlSet\Services\
Value: SizReqBuf (Reg_Dword)
Data: 65535 (decimal)

Key: HKLM\SYSTEM\CurrentControlSet\Services\
Value: TcpWindowSize (Reg_Dword)
Data: 64240 (decimal)

After making these two entries and restarting the NT server, the file transfer times to and from the NT machine nearly matched the performance between two Windows 2000 servers, both locally and across the WAN.
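If you want to check the 64240 figure, or work out the equivalent for a different segment size, the arithmetic can be sketched in a few lines of Python. The 1460-byte payload and the 65535-byte ceiling (the largest window TCP can advertise without window scaling) are the only inputs; the function name is invented for illustration.

```python
MAX_SEGMENT_PAYLOAD = 1460  # TCP payload bytes per segment on standard Ethernet
MAX_WINDOW = 65535          # largest TCP window without window scaling


def largest_aligned_window(segment_payload=MAX_SEGMENT_PAYLOAD, limit=MAX_WINDOW):
    """Return (segments, window) for the largest whole multiple under the limit."""
    segments = limit // segment_payload
    return segments, segments * segment_payload


segments, window = largest_aligned_window()
print(segments, window)  # 44 64240
```

Rounding down to a whole number of segments means the receiver's window always holds complete segments rather than leaving room for a fraction of one.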

As for the performance improvement Tad measured when using the Network Monitor (Netmon) capture file, that had us confused for quite a while, I'm embarrassed to say. Apparently, Netmon places each captured packet in its own disk cluster within the capture file, rather than packing all the data into one cluster before getting a new cluster. This makes the size of the file on disk quite large compared to the actual data stored in the file. We verified this behavior by creating a new volume with a 32 KB cluster size, then copying a Netmon file to the volume. On a volume with a 512-byte cluster size, the file was 102 MB. On the volume with the 32 KB cluster size, the file showed a size of over 1 GB.

Thanks to those of you who volunteered a response. See you next week!

About the Author

Contributing Editor Bill Boswell, MCSE, is the principal of Bill Boswell Consulting, Inc. He's the author of Inside Windows Server 2003 and Learning Exchange Server 2003, both from Addison Wesley. Bill is also Redmond magazine's "Windows Insider" columnist and a speaker at MCP Magazine's TechMentor Conferences.
