PowerShell How-To
How To Create Arrays for Performance in PowerShell
Is your script taking too long to execute? Speed it up with this handy tip.
- By Adam Bertram
- 09/28/2017
Creating arrays in PowerShell is a common occurrence for any scripter. Arrays are an important data structure in any language and PowerShell is no different. However, not all arrays are the same. In fact, in PowerShell, the word "array" is usually used as a generalized term, as if there were only one type of data structure for holding a list of values. But there is more than one type of array in PowerShell. Scripters who don't understand this may be writing code that could be much more efficient.
When someone new to PowerShell first learns about arrays, they're taught to define an array by listing elements separated by commas, optionally wrapped in @().
PS C:\> $browsers = 'Internet Explorer','Chrome','Safari'
PS C:\> $browsers
Internet Explorer
Chrome
Safari
PS C:\> $browsers = @('Internet Explorer','Chrome','Safari')
PS C:\> $browsers
Internet Explorer
Chrome
Safari
To add items to this array, it's then typical to use the += operator.
PS C:\> $browsers += 'Opera'
PS C:\> $browsers
Internet Explorer
Chrome
Safari
Opera
This method works great, and if you're working with a small number of elements (fewer than 1,000), adding items this way doesn't take much time. But under the covers, it isn't efficient. Standard PowerShell arrays are fixed in size, so when the += operator is used, PowerShell actually creates a new array containing the old elements plus the new one and discards the original. Since computers are so fast, you'll hardly notice at first, but as you start working with arrays of larger and larger item counts, the lag becomes obvious.
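The replace-the-array behavior can be observed directly. The following is a minimal sketch, not taken from the original article: the variable names $original, $isFixedSize and $sameObject are illustrative, and it relies only on the standard .NET IsFixedSize property and [object]::ReferenceEquals().

```powershell
# Arrays created with the comma operator or @() are fixed-size .NET arrays.
$browsers = 'Internet Explorer','Chrome','Safari'
$original = $browsers                 # second reference to the same array object

$browsers += 'Opera'                  # builds a new array and reassigns $browsers

$isFixedSize = $browsers.IsFixedSize  # the array itself cannot grow
$sameObject  = [object]::ReferenceEquals($original, $browsers)

"Fixed size: $isFixedSize"
"Same object after +=: $sameObject"
"Original count: $($original.Count), new count: $($browsers.Count)"
```

The old three-element array is still reachable through $original, which shows that += did not modify it in place. Each += copies every existing element, so the cost of each addition grows with the size of the array.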
To demonstrate, let's build a simple array and add 99,999 items to it using the += operator and measure how long it takes.
PS C:\> $bigarray = @()
PS C:\> Measure-Command -Expression { @(0..99998).foreach({ $bigArray += $_ }) }
Days : 0
Hours : 0
Minutes : 7
Seconds : 25
Milliseconds : 271
Ticks : 4452713048
TotalDays : 0.00515360306481481
TotalHours : 0.123686473555556
TotalMinutes : 7.42118841333333
TotalSeconds : 445.2713048
TotalMilliseconds : 445271.3048
Notice that this process took over seven minutes. That's a long time to wait for a script to finish. You may think this is unavoidable when loading that many items into an array, but it's not. In fact, we can get much better performance by adding items not to the basic array data structure but to a .NET System.Collections.ArrayList object. This is another kind of array that works similarly to a standard array but is much more efficient in how it operates under the covers.
Let's perform the same task again, this time adding 99,999 items to an ArrayList object. Notice that we won't be using the += operator. The ArrayList object has an Add() method for adding items. The Add() method also returns the index number of the item that was just added. We typically don't want that output, so it's common practice to assign it to $null.
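To illustrate how closely an ArrayList resembles a standard array in everyday use, here is a small sketch. It is not part of the original article, and the variable names $first, $count, $isFixedSize and $asArray are illustrative only.

```powershell
# Cast an ordinary array to an ArrayList (the same cast the article uses with @()).
$browsers = [System.Collections.ArrayList]@('Internet Explorer','Chrome','Safari')

$first       = $browsers[0]            # indexing works as it does on an array
$count       = $browsers.Count         # so does the Count property
$isFixedSize = $browsers.IsFixedSize   # unlike an array, an ArrayList can grow in place
$asArray     = $browsers.ToArray()     # convert back to a plain array if one is required

"First item: $first"
"Count: $count"
"Fixed size: $isFixedSize"
```

Because the ArrayList can grow in place, adding an item does not require rebuilding the whole collection each time.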
PS C:\> $bigarray = [System.Collections.ArrayList]@()
PS C:\> Measure-Command -Expression { @(0..99998).foreach({ $null = $bigArray.Add($_) }) }
Days : 0
Hours : 0
Minutes : 0
Seconds : 0
Milliseconds : 319
Ticks : 3190912
TotalDays : 3.69318518518519E-06
TotalHours : 8.86364444444444E-05
TotalMinutes : 0.00531818666666667
TotalSeconds : 0.3190912
TotalMilliseconds : 319.0912
We just made the operation roughly 1,400 times faster, going from about 445 seconds to about 0.3 seconds!
In PowerShell, there is always more than one way to skin the proverbial cat. If you're running into a spot in your code that's taking a long time, chances are there's a way to improve the speed. Investigate new ways of performing the same task, and you'll probably speed up your script considerably and learn a thing or two in the process.
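The return value of Add() mentioned earlier is easy to see on a small list. This is a minimal sketch rather than code from the original article. The variable names are illustrative, and the [void] cast is shown only as a commonly used alternative to assigning to $null.

```powershell
$list = [System.Collections.ArrayList]@()

# Add() returns the index of the item it just added.
$firstIndex  = $list.Add('Internet Explorer')
$secondIndex = $list.Add('Chrome')

# Two ways to discard that return value so it doesn't clutter the output.
$null = $list.Add('Safari')
[void]$list.Add('Opera')

"Indexes returned: $firstIndex, $secondIndex"
"Items in list: $($list.Count)"
```

Without one of these suppression techniques, every call to Add() inside a function or loop would emit a number to the pipeline.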
About the Author
Adam Bertram is a 20-year veteran of IT. He's an automation engineer, blogger, consultant, freelance writer, Pluralsight course author and content marketing advisor to multiple technology companies. Adam also founded the popular TechSnips e-learning platform. He mainly focuses on DevOps, system management and automation technologies, as well as various cloud platforms mostly in the Microsoft space. He is a Microsoft Cloud and Datacenter Management MVP who absorbs knowledge from the IT field and explains it in an easy-to-understand fashion. Catch up on Adam's articles at adamtheautomator.com, connect on LinkedIn or follow him on Twitter at @adbertram or the TechSnips Twitter account @techsnips_io.