Special Reports

Speed Up Your VB.NET Code

Optimization rules have changed under VB.NET. Here are eight great new ways to build wicked-fast code.

by Francesco Balena

Technology Toolbox: VB.NET

VB.NET changes the way you write Visual Basic code. You learn quickly that most of the optimization tricks you've learned for VB6 won't work under VB.NET. For example, .NET memory allocation forces you to rethink how you process strings and other data types. In other cases, the problem isn't in the language. For instance, ADO.NET requires optimization techniques completely different from those for ADO.

I'll show you the eight most useful tips for making your VB.NET programs run like the wind. You can apply most of these techniques to C# as well because .NET languages are actually just a thin layer over the .NET Framework. Note that I took all timings by compiling the code without debug support and after ticking the Enable Optimizations option in the Configuration Properties | Optimizations page of the Project Properties dialog box.

1 Concatenate Strings With StringBuilder
Your first optimization trick involves sidestepping string concatenation. Consider this slow code:

' Create a comma-delimited list of all
' numbers in the range [1-10000]
Dim i As Integer, s As String
For i = 1 To 10000
   s &= CStr(i) & ", "
Next

The code allocates a new block of memory each time the string is modified. VB6 programmers solve this familiar problem by allocating a large string, then stuffing characters into it using the Mid command. However, .NET strings are immutable. You can't change them after you create them; even the Mid command creates a new string each time you invoke it. You solve this problem in .NET by using the System.Text.StringBuilder object's Append method:

Dim sb As New System.Text.StringBuilder()
Dim i As Integer
For i = 1 To 10000
   ' Note that we use two Append methods to
   ' avoid creating a temporary string
   sb.Append(CStr(i))
   sb.Append(", ")
Next
' Get the result as a string
Dim s As String = sb.ToString

This code snippet runs about 200 times faster than the one based on the & operator. You get slightly better results if you create a StringBuilder with an initial capacity as large as or larger than the number of characters you're going to create:

Dim sb As New System.Text.StringBuilder(60000)

For maximum performance, never reuse a StringBuilder object after you extract its contents with the ToString method. ToString returns the string held in the internal buffer, so any later change to the StringBuilder would cause an additional memory allocation and copy operation to preserve the immutability of the returned string.

You can concatenate your strings easily and quickly with the Join method if they're stored already in an array:

Dim tmp(10000) As String
Dim i As Integer
For i = 1 To 10000
   tmp(i) = CStr(i)
Next
Dim s As String = String.Join(", ", tmp, 1, 10000)
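
To check the StringBuilder speedup on your own machine, a small timing harness along these lines may help. This is only a sketch: the module and function names are made up for illustration, and it relies on the System.Diagnostics.Stopwatch class, which arrived in .NET 2.0, so it assumes a newer Framework than the one the timings in this article were taken on. Expect the ratio to differ from the figure quoted above.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.Text

Module ConcatBenchmark
    ' Builds the list with the & operator (a new string per iteration).
    Function BuildWithOperator(ByVal count As Integer) As String
        Dim s As String = ""
        For i As Integer = 1 To count
            s &= CStr(i) & ", "
        Next
        Return s
    End Function

    ' Builds the same list with a preallocated StringBuilder.
    Function BuildWithBuilder(ByVal count As Integer) As String
        Dim sb As New StringBuilder(count * 6)
        For i As Integer = 1 To count
            sb.Append(CStr(i))
            sb.Append(", ")
        Next
        Return sb.ToString()
    End Function

    Sub Main()
        Dim sw As Stopwatch = Stopwatch.StartNew()
        Dim a As String = BuildWithOperator(10000)
        Console.WriteLine("& operator:    {0} ms", sw.ElapsedMilliseconds)

        sw = Stopwatch.StartNew()
        Dim b As String = BuildWithBuilder(10000)
        Console.WriteLine("StringBuilder: {0} ms", sw.ElapsedMilliseconds)

        ' Both approaches must produce identical text.
        Console.WriteLine(a = b)
    End Sub
End Module
```

Run it from a release build, as with the other timings in this article; a debug build can distort the comparison.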

2 Don't Throw Unnecessary Exceptions
It's hard to get an idea of how costly throwing exceptions can be until you do some accurate benchmarks. This function throws an exception with a probability whose value is passed as an argument:

Dim rand As New Random()
Sub CanThrowException(ByVal prob As Double)
   If rand.NextDouble <= prob Then
      Throw New System.Exception()
   End If
End Sub

This routine can throw an exception, so you must wrap all procedure calls within a Try...Catch block. For example, this code throws and catches about 10,000 exceptions:

Dim i As Integer 
Dim prob As Double = 0.01
For i = 1 To 1000000
   Try
      CanThrowException(prob)
   Catch
      ' do nothing in this case
   End Try
Next

This code runs in 0.45 seconds on my 900-MHz Pentium III system; it runs in only 0.11 seconds if you set the probability to zero. So throwing an exception once every hundred iterations makes the entire routine four times slower. Each exception takes about 30 microseconds, or 27,000 CPU cycles. Don't call procedures that might throw an exception from inside a time-critical loop. Use another mechanism to return a failure notification to the caller. For example, you might return a special value, such as False, Nothing, or a negative value:

Function DoesntThrowException( _
   ByVal prob As Double) As Boolean
   If rand.NextDouble <= prob Then
      Return False   ' notify failure
   Else
      Return True    ' notify success
   End If
End Function

Most classes in the .NET Framework follow these guidelines and don't throw exceptions unless strictly required. For example, the Close method of an OleDbConnection or SqlConnection object doesn't throw an exception if the connection is closed already, unlike its ADO counterpart. Also, floating-point division doesn't throw a DivideByZeroException if the denominator is zero; it sets the result to the IEEE infinite value instead. Remember that the Try...Catch block doesn't add noticeable overhead to your code; it's throwing exceptions that bogs down execution.
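
As a rough illustration of reporting failure without throwing, the sketch below wraps Integer.TryParse (a Framework method added in .NET 2.0, so later than the version this article targets) in a helper that returns False on bad input. The helper name TryParsePositive is hypothetical. The last line of Main shows the floating-point behavior mentioned above.

```vbnet
Imports System

Module ExceptionFreeDemo
    ' Hypothetical helper: signals failure through its return value
    ' instead of throwing an exception.
    Function TryParsePositive(ByVal text As String, _
        ByRef value As Integer) As Boolean
        If Integer.TryParse(text, value) AndAlso value > 0 Then
            Return True
        End If
        value = 0
        Return False
    End Function

    Sub Main()
        Dim n As Integer
        Console.WriteLine(TryParsePositive("42", n))   ' success, n = 42
        Console.WriteLine(TryParsePositive("abc", n))  ' failure, no exception

        ' Floating-point division by zero yields infinity, not an exception.
        Dim zero As Double = 0.0
        Console.WriteLine(Double.IsPositiveInfinity(1.0 / zero))
    End Sub
End Module
```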

3 Seal Methods
You should seal methods if at all possible. When designing a class that you can use as a base class (one not marked with the NotInheritable keyword), you must decide which methods you can override in derived classes. You might be tempted to flag all methods and properties with the Overridable keyword so that inherited classes can completely redefine the behavior of the base class, but you'll pay a performance penalty.

I designed a class whose DoubleIt and DoubleIt2 methods perform the same operation (see Listing 1). However, only the former method is overridable. A simple benchmark shows that a call to the overridable version runs about 35 percent slower than a call to the nonoverridable (or sealed) one.

Another reason for sealing a method is that the JIT compiler can often inline short nonoverridable methods. When this happens, the compiler moves the code from the method to the caller procedure and discards the call instruction, making your code even faster.
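
Listing 1 isn't reproduced here, so the following is only a guess at the kind of class the benchmark might use. The Doubler class name and the loop count are assumptions; only the DoubleIt and DoubleIt2 method names come from the text. Stopwatch again assumes .NET 2.0 or later.

```vbnet
Imports System
Imports System.Diagnostics

' Hypothetical stand-in for the class described in the text.
Class Doubler
    ' Overridable: the call is dispatched as a virtual method.
    Overridable Function DoubleIt(ByVal n As Integer) As Integer
        Return n * 2
    End Function

    ' Not overridable: a direct call that is a candidate for inlining.
    Function DoubleIt2(ByVal n As Integer) As Integer
        Return n * 2
    End Function
End Class

Module SealBenchmark
    Sub Main()
        Dim d As New Doubler()
        Dim sum As Long = 0

        Dim sw As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To 10000000
            sum += d.DoubleIt(i)
        Next
        Console.WriteLine("Overridable:    {0} ms", sw.ElapsedMilliseconds)

        sw = Stopwatch.StartNew()
        For i As Integer = 1 To 10000000
            sum += d.DoubleIt2(i)
        Next
        Console.WriteLine("Nonoverridable: {0} ms", sw.ElapsedMilliseconds)

        ' Print the sum so the loops can't be optimized away.
        Console.WriteLine(sum)
    End Sub
End Module
```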

If you have an overridable method that you can also call from inside the class, you can organize the code in the class so the overridable method delegates to a private, nonoverridable (and therefore more efficient) method:

Class TestClass
   ' this method can be called from clients
   Overridable Sub MyProc()
      ' delegate to the private method
      Call InternalProc()
   End Sub
   ' this method is more efficient, but can
   ' be called only from inside the class
   Private Sub InternalProc()
      ' ...put the real code here...
   End Sub
End Class

4 Avoid Calls to Interface Methods
Here's another simple optimization tip: Calling a method through a secondary interface is slower than calling the same method in the main class interface. Consider my TestMethod procedure (see Listing 1). You can call it through either a TestObject variable (because the method is public) or through an ITestInterface variable:

Dim o As New TestObject()
' a call through the main interface
o.TestMethod()
' a call through the secondary interface
Dim itest As ITestInterface = _
   DirectCast(o, ITestInterface)
itest.TestMethod()

The call through the main class interface can be four to five times faster than the call through the ITestInterface interface (or any secondary interface). If possible, call methods using a variable of the same type as the object you're dealing with. In absolute terms, the difference between method calls is often negligible, but it becomes apparent if you perform millions of method invocations in a tight loop.
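
For readers without Listing 1 at hand, here is a minimal, assumed version of the two types involved. The Calls counter is added purely for illustration. It shows that both call paths reach the same method body; only the dispatch mechanism differs.

```vbnet
Imports System

' Hypothetical versions of the types the text refers to.
Interface ITestInterface
    Sub TestMethod()
End Interface

Class TestObject
    Implements ITestInterface

    ' Counts invocations so the demo has something to report.
    Public Calls As Integer

    Public Sub TestMethod() Implements ITestInterface.TestMethod
        Calls += 1
    End Sub
End Class

Module InterfaceCallDemo
    Sub Main()
        Dim o As New TestObject()
        o.TestMethod()                     ' call through the class

        Dim itest As ITestInterface = o    ' widening conversion
        itest.TestMethod()                 ' call through the interface

        Console.WriteLine(o.Calls)         ' both calls hit the same method
    End Sub
End Module
```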

5 Use Value Types, But Beware of Boxing
You can group all .NET data types into two categories: value types and reference types. Value types include numbers, dates, and types defined in Structure or Enum blocks. Typically, they're allocated on the current thread's stack. All other .NET types are reference types, allocated to a memory area known as the managed heap. Examples include strings, arrays, and types defined by means of Class blocks.

In general, value types are more efficient than reference types. Value types stored in local variables don't take memory from the managed heap and never undergo garbage collection, a relatively slow process that fires when the system runs out of available memory in the heap. Also, a value type variable contains the data, whereas a reference type variable points to the area of the managed heap where the data is held. Consequently, reference type variables require an additional pointer dereference to read or write the actual values. For example, a Char variable directly contains a Unicode character, while a String variable contains a pointer to the managed heap block where the string's characters are held. You should always use a Char variable if you know you're working with single-character strings.

Just to keep things interesting, though, value types can slow down your code. This happens when you assign a value type to an Object variable. Object is a reference type, so this assignment works by boxing. The runtime allocates a block of memory in the heap, copies the data there, and makes the Object variable point to that area.

Unfortunately, the memory allocated in the heap will be reclaimed eventually in a time-consuming, overhead-adding garbage collection operation. And if you assign the Object value back to a value type variable, you trigger an unboxing operation, resulting in a memory copy from the memory heap to the target variable. Run this code to see how boxing and unboxing can affect performance:

Dim i As Integer, sum As Long
Dim o As New TestObject()
For i = 1 To 1000000
   ' the call to GetObject boxes the integer;
   ' the CInt function unboxes it
   sum += CInt(o.GetObject(i))
Next

You can get rid of the boxing/unboxing overhead and improve the code speed by a factor of four if you invoke a method that returns an integer:

sum += o.GetInteger(i)
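
The GetObject and GetInteger methods live in Listing 1, which isn't shown here. A plausible minimal version, offered only as an assumption, might look like the sketch below. The Main procedure also demonstrates that every boxing operation allocates its own heap object.

```vbnet
Imports System

' Hypothetical stand-in for the TestObject class in Listing 1.
Class TestObject
    ' Returning Object forces the Integer to be boxed on the heap.
    Function GetObject(ByVal n As Integer) As Object
        Return n
    End Function

    ' Returning Integer keeps the value unboxed.
    Function GetInteger(ByVal n As Integer) As Integer
        Return n
    End Function
End Class

Module BoxingDemo
    Sub Main()
        Dim o As New TestObject()
        Dim a As Object = o.GetObject(5)
        Dim b As Object = o.GetObject(5)

        ' Each boxing operation allocates a separate heap object,
        ' so the two references differ even though the values match.
        Console.WriteLine(a Is b)              ' False
        Console.WriteLine(CInt(a) = CInt(b))   ' True
    End Sub
End Module
```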

Remember that you also incur a hidden boxing operation when you call a method and pass a value type to an Object parameter. If you're authoring a class library, always offer overloaded versions of the most critical methods. You'll reduce the number of boxing operations:

Sub DisplayData(ByVal n As Long)
   ' ...
End Sub

Sub DisplayData(ByVal n As Double)
   ' ...
End Sub

Sub DisplayData(ByVal n As Object)
   ' ...
End Sub

These three methods let the compiler pick the most appropriate version, depending on the type of the value passed to the argument. Byte, Short, Integer, and Long arguments use the first version, Single and Double arguments use the second, and all remaining data types use the last one. The Object version causes a boxing operation only if the argument is a value type other than the types just mentioned (such as a DateTime value, a TimeSpan value, or a Structure). Another hidden boxing operation occurs when you access a value type (a Structure, for example) by means of a secondary interface.
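
To see which overload the compiler chooses, you can have each version report its own name. The Describe functions below are hypothetical counterparts of the DisplayData overloads, written as functions so the result is easy to inspect.

```vbnet
Imports System

Module OverloadDemo
    Function Describe(ByVal n As Long) As String
        Return "Long"
    End Function

    Function Describe(ByVal n As Double) As String
        Return "Double"
    End Function

    Function Describe(ByVal n As Object) As String
        Return "Object"
    End Function

    Sub Main()
        Dim i As Integer = 1
        Dim f As Single = 1.5F
        Dim d As Date = Date.Now

        Console.WriteLine(Describe(i))   ' Long: Integer widens, no boxing
        Console.WriteLine(Describe(f))   ' Double: Single widens, no boxing
        Console.WriteLine(Describe(d))   ' Object: the Date value is boxed
    End Sub
End Module
```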

6 Avoid String Duplicates
The VB.NET compiler deals with string constants in a rather smart way: It stores all string constants with the same value only once, using a memory area known as the string intern pool:

Dim s1 As String = "ABCDE"
Dim s2 As String = "ABCDE"
' Prove that s1 and s2 point to the
' same element in the intern pool
Console.Write(s1 Is s2)  ' displays True

This optimization technique isn't likely to affect the memory footprint of most client apps significantly, but it can be quite effective in objects that get instantiated thousands of times. However, this technique works only for constant strings. You can't apply it to strings built at run time:

' continuing previous example...
Dim s3 As String = "ABC"
s3 &= "DE" 
' s1 and s3 contain the same value but
' point to a different string
Console.Write(s1 = s3)      ' displays True
Console.Write(s1 Is s3)      ' displays False

Say you have a component in the data tier with a field containing the database connection string. This connection string is read at initialization time—from an XML configuration file, for example—so the compiler can't store it in the intern pool. If the component is instantiated n times, there will be n copies of the connection string in the managed heap, which can be quite a waste if n is high and the string is long.

You can avoid this memory consumption in one of two ways, depending on whether the connection string can vary. If the connection string is exactly the same for all the instances, save it in a shared member so that only one string is shared among all the component instances. You've got more work if the connection string can vary, as when you use the same component to connect to one of three databases. In this case, you can't use a single shared member. You must resort to a technique based on the String.Intern method.

The String.Intern method takes a string and searches for it in the string intern pool. The method returns a reference to the existing element if it finds the string in the pool. If the search fails, the method inserts the string in the pool, then returns a reference to the element just added. You might implement the ConnectionString property of your hypothetical data component to leverage the string intern pool like this:

' The private member
Dim m_ConnString As String
Property ConnectionString() As String
   Get
      Return m_ConnString
   End Get
   Set(ByVal Value As String)
      m_ConnString = String.Intern(Value)
   End Set
End Property

The first time a given value is assigned to the ConnectionString property, the search in the intern pool fails and String.Intern adds the string to the pool, then returns a reference to it. If the same connection string is assigned to a different instance of the component, String.Intern returns a reference to the element already in the pool and creates no duplicates. This technique shrinks your app's memory footprint and speeds up your code because it indirectly reduces the number of garbage collections.
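
The effect is easy to verify in isolation. In this sketch, the BuildAtRunTime function and the connection string text are invented for illustration; they stand in for a value read from a configuration file.

```vbnet
Imports System

Module InternDemo
    ' Simulates reading a connection string at run time, so the
    ' compiler cannot place the result in the intern pool itself.
    Function BuildAtRunTime(ByVal server As String) As String
        Return "Server=" & server & ";Database=Sales"
    End Function

    Sub Main()
        Dim a As String = BuildAtRunTime("db1")
        Dim b As String = BuildAtRunTime("db1")
        ' Same text, but two separate copies in the managed heap.
        Console.WriteLine(a Is b)      ' False

        Dim ia As String = String.Intern(a)
        Dim ib As String = String.Intern(b)
        ' After interning, both references point to one shared copy.
        Console.WriteLine(ia Is ib)    ' True
    End Sub
End Module
```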

7 Implement the Dispose/Finalize Pattern
.NET objects don't have a Class_Terminate event firing when you set the last reference to them to Nothing. Instead, they can implement a (protected) Finalize method, which the .NET Framework invokes when the object is about to be garbage-collected. You must use this method when the class wraps or uses an unmanaged resource. For example, this class uses the OpenClipboard API to open the clipboard in its constructor, and therefore must call the CloseClipboard API in its Finalize method to release the clipboard so other processes can use it:

Class TestObject
   ' put the garbage collector under pressure
   Dim dummyArr(1000) As Byte

   Sub New()
      OpenClipboard(0)
   End Sub

   Protected Overrides Sub Finalize()
      ' close the clipboard when finalized
      CloseClipboard()
      MyBase.Finalize()
   End Sub
End Class

You can omit the call to MyBase.Finalize if the object derives from System.Object, as in this case, because the Finalize method in System.Object does nothing.

There are two problems with the previous implementation, though. First, the garbage collector calls the Finalize method sometime after the object has become unreachable from the main app. This keeps the clipboard open until the next garbage collection, and no other process is able to open it until then. Second, all objects that expose a Finalize method require at least two garbage collections (often more) before they vacate the managed heap completely—not an optimal usage of the available memory.

You can kill both birds with one stone by implementing the IDisposable interface and its Dispose method. This method should contain the same clean-up code found in the Finalize method. It's meant to be called directly from clients just before they set the object to Nothing:

Class TestObject
   Implements IDisposable
   Public Sub Dispose() _
      Implements IDisposable.Dispose
      ' close the clipboard
      CloseClipboard()
      ' no need to finalize this object
      GC.SuppressFinalize(Me)
   End Sub
   ' remainder of code as before...
End Class

The crucial call, GC.SuppressFinalize(Me), tells the .NET Framework that the object has run its clean-up code and doesn't need to be finalized. As a result, .NET removes the object completely from memory at the first garbage collection. A client should use the disposable object like this:

' create the object
Dim o As New TestObject()
' use it
' ...
' run its clean-up code and destroy it
o.Dispose()
o = Nothing

Use a Try...Finally block if there's any chance of an exception being thrown while using the object:

Dim o As TestObject = Nothing
Try
   ' create and use the object
   o = New TestObject()
   ' ...
Finally 
   ' run its clean-up code and destroy it
   ' (o is still Nothing if the constructor threw)
   If Not o Is Nothing Then o.Dispose()
   o = Nothing
End Try

The Finalize method runs even if the object throws an exception while executing the constructor method, which might be an issue because the finalization code might access member variables that haven't been initialized correctly. You can sidestep this problem by placing the creation step inside the protected Try region.
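
Putting the pieces together, a class that supports both Dispose and Finalize might be organized as in the sketch below. Everything here is an assumption made for illustration: the ResourceHolder class replaces the clipboard calls with a simple counter so the code runs anywhere, and the Using statement requires Visual Basic 2005 or later (on earlier versions, use the Try...Finally form shown above).

```vbnet
Imports System

' Hypothetical resource wrapper; ReleaseCount stands in for a real
' unmanaged handle so the example runs anywhere.
Class ResourceHolder
    Implements IDisposable

    Public Shared ReleaseCount As Integer
    Private disposed As Boolean

    Public Sub Dispose() Implements IDisposable.Dispose
        Release()
        ' Clean-up already ran, so skip finalization.
        GC.SuppressFinalize(Me)
    End Sub

    Protected Overrides Sub Finalize()
        Release()
        MyBase.Finalize()
    End Sub

    ' Shared clean-up code; the flag makes a second call harmless.
    Private Sub Release()
        If Not disposed Then
            disposed = True
            ReleaseCount += 1
        End If
    End Sub
End Class

Module DisposeDemo
    Sub Main()
        Using holder As New ResourceHolder()
            ' ...use the object...
        End Using   ' Dispose runs here, even if an exception is thrown

        Console.WriteLine(ResourceHolder.ReleaseCount)
    End Sub
End Module
```

Keeping the clean-up code in one private method means Dispose and Finalize can't drift apart, and the disposed flag protects against clients that call Dispose twice.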

Use my TestObject class to get an idea of the extra overhead the Finalize method requires for its extra garbage collection:

Dim i As Integer 
For i = 1 To 100000
   Dim o As New TestObject()
   o.Dispose()
   o = Nothing
Next

This code runs in 0.5 seconds on my system; if you comment out the call to the Dispose method, however, total execution time climbs to 1.3 seconds (about two and a half times slower). The ratio depends largely on how much memory the object consumes. You can see different results if you change the size of the dummyArr array.

8 Beware of WithEvents Variables
VB.NET gives you two ways to trap events coming from an object. First, you can trap events by assigning a reference to a WithEvents variable (as in VB6) and flagging the event handler with the Handles keyword:

' you can use New with WithEvents variables
Dim WithEvents obj As New TestObject
Sub obj_TestEvent() Handles obj.TestEvent
   ' ...handle the event here...
End Sub

Second, you can link an event to an existing procedure at run time with the AddHandler keyword (and later remove the association with the RemoveHandler keyword):

Dim o As New TestObject
Sub UseDynamicEvents()
   ' associate the event with the local proc
   AddHandler o.TestEvent, _
      AddressOf MyHandler
   ' use object (now MyHandler traps events)

   ' ...
   ' remove event handler
   RemoveHandler o.TestEvent, _
      AddressOf MyHandler
End Sub
Sub MyHandler() 
   ' ...handle the event here...
End Sub

Figure 1. Use a Hidden Path to Faster Code.

The AddHandler approach is more flexible. For example, you can trap events from objects declared inside a procedure. However, it does require more code, so you might decide to continue using the VB6-style approach. WithEvents variables run appreciably slower than regular variables, though, because these variables are actually implemented as properties. You're actually calling a method instead of accessing a memory location each time you read or assign them. This mechanism becomes clear if you explore the output of the VB compiler with the ILDASM utility (see Figure 1).

Reading a WithEvents variable goes about two and a half times slower than reading a regular variable; and assigning a WithEvents variable runs more than six times slower than performing the same operation on a regular variable. You also incur hidden overhead whenever you pass a WithEvents variable to a procedure if the argument is passed by reference, because the VB compiler must create a temporary variable to preserve the ByRef semantics.

You can avoid such overhead by steering clear of WithEvents variables; instead, use the AddHandler mechanism exclusively. If you really want to stick with VB6 techniques, you can read a WithEvents variable as efficiently as a regular variable. The get_obj and set_obj methods are just wrappers on the hidden _obj variable (see Figure 1), so you can use this hidden variable instead of the regular WithEvents variable to invoke methods of the corresponding object:

' call a method
_obj.TestMethod()

Just remember never to assign a new value to the hidden variable, or you won't trap events through it correctly.
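
If you'd rather follow the AddHandler-only route, the sketch below shows the whole round trip on an object declared inside a procedure, something a WithEvents variable can't do. The Ticker class, its Tick event, and the CountTicks function are hypothetical names chosen for this example.

```vbnet
Imports System

' Hypothetical event source for illustration.
Class Ticker
    Public Event Tick()

    Public Sub Fire()
        RaiseEvent Tick()
    End Sub
End Class

Module AddHandlerDemo
    Private count As Integer

    Private Sub OnTick()
        count += 1
    End Sub

    Function CountTicks() As Integer
        count = 0
        ' A local object: WithEvents isn't allowed on local variables.
        Dim t As New Ticker()
        AddHandler t.Tick, AddressOf OnTick
        t.Fire()
        t.Fire()
        RemoveHandler t.Tick, AddressOf OnTick
        t.Fire()    ' no longer handled
        Return count
    End Function

    Sub Main()
        Console.WriteLine(CountTicks())   ' two events were handled
    End Sub
End Module
```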

As you can see, VB6 developers certainly face a learning curve with VB.NET. Of course, your current expertise will help you maintain the huge amount of VB6 code still extant and quite useful. But to optimize .NET code, you need to unlearn a lot of hard-won knowledge and learn many new rules. The good news is that with the new code you write in VB.NET, the results can be astounding.

The best piece of advice I can give you is to never take anything for granted and always benchmark all the alternative techniques you can use to solve the problem at hand. In many cases, you'll be surprised by your findings.
