Prof. Powershell

A Faster Get-Content Cmdlet

Some tricks for using the Get-Content cmdlet that will have you retrieving content in no time.

One very typical administrative task in PowerShell is to go through a list of computers and do something with each one:

PS C:\> Get-Content c:\work\computers.txt | foreach {MyFunction -computer $_}

Each line from computers.txt is piped to ForEach-Object, which runs the MyFunction command, passing the current computer name ($_) as the value for the -Computer parameter. For a small list this is fine, but for really large lists you might want to take other steps. These techniques also apply to any large text file whose content you want to work with.
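
MyFunction is a stand-in for whatever per-computer work you need to do. A minimal sketch of such a function might look like this (the ping test is just an illustration, not part of the original example):

```powershell
Function MyFunction {
    param(
        [Parameter(Mandatory)]
        [string]$Computer
    )
    # Hypothetical per-computer work; here, a quick connectivity check
    [pscustomobject]@{
        Computer = $Computer
        Online   = Test-Connection -ComputerName $Computer -Count 1 -Quiet
    }
}
```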

First off, let's say your text file is 5,000 lines, but you want to test your pipelined expression with a small sample. You could create a separate file, but why bother? Instead, use the -TotalCount parameter with Get-Content. The default value is -1, which reads all lines, but you can limit the total number of lines:

PS C:\> Get-Content c:\work\data.txt -totalcount 10

An expression like this sends only the first 10 lines of data.txt to the pipeline. If you specify a value that exceeds the total number of lines, don't worry: the command simply ends when the content is exhausted.
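
Assuming the same hypothetical MyFunction from the first example, you could dry-run the original expression against just the first 10 names before committing to the full file:

```powershell
# Test the pipeline against a 10-line sample instead of all 5,000 lines
Get-Content c:\work\computers.txt -TotalCount 10 |
    ForEach-Object { MyFunction -computer $_ }
```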

When piping large amounts of data from a text file to another command with Get-Content, you might see an overall performance gain by using the -ReadCount parameter. The default value is 1, which means lines are piped to the next command one at a time. But you can specify any number, which sends lines to the next command in your pipelined expression in batches of that size:

PS C:\> Get-Content c:\work\computers.txt -readcount 25 | foreach { foreach ($computer in $_) {MyFunction -computer $computer} }

If computers.txt has 100 lines, Get-Content will send them in batches of 25 to the next command in the pipeline. Be aware that each batch arrives as an array, which is why this version uses a nested loop instead of passing $_ directly to the -computer parameter. Use a value of 0 to send all lines at once. You can verify the performance gain with Measure-Command. Here I'm using gc, the alias for Get-Content, to read a 5,000-line text file:

PS S:\> (measure-command {gc s:\5000names.txt -readcount 1}).TotalMilliseconds
62.448
PS S:\> (measure-command {gc s:\5000names.txt -readcount 10}).TotalMilliseconds
15.5504
PS S:\> (measure-command {gc s:\5000names.txt -readcount 100}).TotalMilliseconds
9.0583
PS S:\> (measure-command {gc s:\5000names.txt -readcount 1000}).TotalMilliseconds
8.0273
PS S:\> (measure-command {gc s:\5000names.txt -readcount 0}).TotalMilliseconds
8.0196
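
Rather than running each test by hand, the timings above can be gathered in one pass. Here is a sketch that loops over candidate -ReadCount values (the file path is carried over from the example and is an assumption):

```powershell
# Compare -ReadCount values against the same file; actual times will vary
1, 10, 100, 1000, 0 | ForEach-Object {
    $rc = $_
    $ms = (Measure-Command {
        Get-Content s:\5000names.txt -ReadCount $rc
    }).TotalMilliseconds
    "{0,5} : {1:N4} ms" -f $rc, $ms
}
```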

Extensive testing suggests that reading a large number of lines at a time gives better performance, but that a value of 0 is slightly counterproductive, at least on large files. So the next time you're working with text file contents and wish performance were a little better, look at the help for Get-Content and remember these parameters.

About the Author

Jeffery Hicks is an IT veteran with over 25 years of experience, much of it spent as an IT infrastructure consultant specializing in Microsoft server technologies with an emphasis in automation and efficiency. He is a multi-year recipient of the Microsoft MVP Award in Windows PowerShell. He works today as an independent author, trainer and consultant. Jeff has written for numerous online sites and print publications, is a contributing editor at Petri.com, and a frequent speaker at technology conferences and user groups.
