PowerShell Pipeline

Performing Joins and Splits in PowerShell

Here's how to make your data more useful -- or even just more presentable -- using the Join and Split operators and the Split method.

With PowerShell, there are many ways to work with data. In this case, I am going to show you some different approaches to splitting text, as well as joining data together.

You might think that it is as simple as just using the -Join or -Split operators to accomplish this goal, and to a certain extent you would be right. However, there are some useful tricks and gotchas to watch out for when using the operators. There's also the Split() method, which is available on any System.String object.

With that, let's start off by splitting some text to show off both the -Split operator and the Split() method. While both perform the same fundamental task of splitting the text, how they perform this operation differs greatly.

Using the Split method, we pass the separator to split on as a parameter. The separator is matched literally rather than as a pattern, and the overloads below show that it can be one or more characters or one or more strings.

PS C:\> ("This.is.some.weird.text.that.uses.a.period.instead.of.spaces").Split

OverloadDefinitions                                                                                                                      
-------------------                                                                                                                      
string[] Split(Params char[] separator)                                                                                                  
string[] Split(char[] separator, int count)                                                                                              
string[] Split(char[] separator, System.StringSplitOptions options)                                                                      
string[] Split(char[] separator, int count, System.StringSplitOptions options)                                                           
string[] Split(string[] separator, System.StringSplitOptions options)                                                                    
string[] Split(string[] separator, int count, System.StringSplitOptions options)

Looking at the various method overloads for Split, we can see near the bottom that it also accepts string separators, along with a System.StringSplitOptions value. Let's take a quick look at what our options are with this.

PS C:\> [System.StringSplitOptions] | Get-Member -Static -MemberType Properties


   TypeName: System.StringSplitOptions

Name               MemberType Definition                                                
----               ---------- ----------                                                
None               Property   static System.StringSplitOptions None {get;}              
RemoveEmptyEntries Property   static System.StringSplitOptions RemoveEmptyEntries {get;}

We have either None or RemoveEmptyEntries as our possible options for splitting our text up. First, I will split the provided text by a period (.) without specifying an option to see how the data is presented. Note: Run the following two examples in the PowerShell console rather than the ISE to get a better visual of the difference between using the split option and not using it.

PS C:\> ("This.is.some.weird.text.that.uses.a.period.instead.of.spaces.").Split('.')
This
is
some
weird
text
that
uses
a
period
instead
of
spaces

PS C:\>

We can see that it does indeed split the data by the period, which is what one would expect. But notice the extra blank line at the end, before the next prompt. The trailing period in the text produces one final, empty string when it is split. Let's see whether the RemoveEmptyEntries option keeps that empty entry from appearing.

PS C:\> ("This.is.some.weird.text.that.uses.a.period.instead.of.spaces.").Split('.',[System.StringSplitOptions]::RemoveEmptyEntries)
This
is
some
weird
text
that
uses
a
period
instead
of
spaces
PS C:\>
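
The overload list shown earlier also includes a count parameter and string separators, which the examples so far have not used. Here is a minimal sketch of how those overloads might be called. The sample text and variable names are made up for illustration, and the separator is cast to [string[]] so that the string-array overload is selected explicitly.

```powershell
# Sample text is hypothetical; "::" is used as a multi-character separator.
$text = 'one::two::::three'

# Cast the separator to [string[]] so the string-array overload is chosen.
$separator = [string[]]@('::')

# None keeps the empty entry produced by the back-to-back separators.
$withEmpty = $text.Split($separator, [System.StringSplitOptions]::None)

# RemoveEmptyEntries drops that empty entry.
$withoutEmpty = $text.Split($separator, [System.StringSplitOptions]::RemoveEmptyEntries)

# The count argument caps the number of substrings; the remainder stays in the last one.
$firstTwo = $text.Split($separator, 2, [System.StringSplitOptions]::None)

"With None:               $($withEmpty.Count) entries"
"With RemoveEmptyEntries: $($withoutEmpty.Count) entries"
"With count of 2:         $($firstTwo -join ' | ')"
```

Because the separator is literal, nothing needs to be escaped here.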

Moving on to the -Split operator, if we attempt the same approach of just specifying a period as the delimiter, we get drastically different results.

PS C:\> "This.is.some.weird.text.that.uses.a.period.instead.of.spaces." -split '.'

In fact, I won't bother displaying the results, as you will just see a lot of blank lines between the command and the next prompt. Why does this happen? The -Split operator uses regular expressions to perform the split, whereas the Split() method treated the period as a literal character. In a regular expression, a period is a wildcard that matches any single character, so every character in the string is treated as a delimiter and only empty strings are left. If we want to truly split the data by the period, we need to escape it with a backslash (\).

PS C:\> "This.is.some.weird.text.that.uses.a.period.instead.of.spaces." -split '\.'
This
is
some
weird
text
that
uses
a
period
instead
of
spaces
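
If you would rather not escape the delimiter by hand, there are a couple of other ways to get a literal match. The sketch below, which uses a shortened sample string of my own, compares the unescaped split with [regex]::Escape() and with the SimpleMatch option of the -Split operator.

```powershell
# Shortened sample text for illustration.
$text = 'This.is.some.text'

# Unescaped, '.' matches every character, so every element comes back empty.
$unescaped = $text -split '.'

# [regex]::Escape() builds the escaped pattern ('\.') for you.
$escaped = $text -split [regex]::Escape('.')

# SimpleMatch tells -split to treat the delimiter literally.
# The 0 in the middle is the max-substrings value and means "no limit".
$simple = $text -split '.', 0, 'SimpleMatch'

"Unescaped:   $($unescaped.Count) elements, all empty"
"Escaped:     $($escaped -join ' | ')"
"SimpleMatch: $($simple -join ' | ')"
```

Escaping is handy when the delimiter comes from a variable and you cannot be sure which special characters it contains.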

Now that looks more appropriate, right? Since we can use RegEx with the split operator, this really opens up more options as to how we can perform splits on various points of data, such as removing all numerical values from a string regardless of what the values are.

PS C:\> "This32is43some54text34" -split '\d+'
This
is
some
text

Pretty cool stuff to use to split our data when needed.
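
A few more things the -Split operator can do are worth a quick look. This is only a sketch using the same sample string: it shows the empty element left behind by the trailing digits, one way to filter it out, how a capture group keeps the delimiters, and how a max-substrings value limits the split. The variable names are my own.

```powershell
$text = 'This32is43some54text34'

# The trailing "34" leaves an empty string as the last element.
$parts = $text -split '\d+'

# Filter out empty entries; an empty string evaluates to false in PowerShell.
$words = $text -split '\d+' | Where-Object { $_ }

# Parentheses in the pattern form a capture group, so the delimiters are kept.
$withNumbers = 'This32is43some' -split '(\d+)'

# A max-substrings value stops splitting early; the rest stays in the last element.
$firstTwo = $text -split '\d+', 2

"Raw split:      $($parts.Count) elements"
"Filtered:       $($words.Count) elements"
"Capture group:  $($withNumbers -join ' | ')"
"Max-substrings: $($firstTwo -join ' | ')"
```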

The last stop in this article is the -Join operator, which brings data together, such as the items in an array or text that we recently split up. Take our last example, where we removed some numerical values from a string. We can now join that data back up and have a coherent sentence.

PS C:\> "This32is43some54text34" -split '\d+' -join ' '
This is some text

The -Join operator doesn't have to come after the data. If you don't need a delimiter between the values, you can place it in front of the collection as a unary operator, and the values are simply concatenated.

PS C:\> -join ("This32is43some54text34" -split '\d+')
Thisissometext
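
One gotcha with the unary form is operator precedence: unary -join binds more tightly than the comma, so a bare comma-separated list is not joined as a whole. The sketch below uses a made-up array to show the difference the parentheses make.

```powershell
$parts = 'This', 'is', 'some', 'text'

# In parentheses, the whole array is joined into one string.
$joined = -join ('This', 'is', 'some', 'text')

# Without parentheses, -join applies to 'This' alone, leaving a four-element array.
$notJoined = -join 'This', 'is', 'some', 'text'

# A variable that holds the array needs no extra parentheses.
$fromVariable = -join $parts

"Parenthesized:   $joined"
"Unparenthesized: $($notJoined.Count) separate elements"
"From a variable: $fromVariable"
```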

If you wanted to join by specific characters, that is as simple as specifying the character (or characters) to join with.

PS C:\> $list = [System.Collections.ArrayList]::new()
1..10 | ForEach {[void]$list.Add($_)}
$list -join ':'
1:2:3:4:5:6:7:8:9:10
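
The delimiter does not have to be a single character, and a join can be undone with a matching split. Here is a small sketch of that round trip. The variable names are my own, and note that the values come back as strings, so I cast them if numbers are needed.

```powershell
$numbers = 1..10

# The delimiter can be more than one character.
$csv = $numbers -join ', '

# Splitting on the same delimiter reverses the join, but the elements are now strings.
$roundTrip = $csv -split ', '

# Cast back to integers when numeric values are needed.
$asInts = [int[]]$roundTrip

"Joined:       $csv"
"Element type: $($roundTrip[0].GetType().Name)"
"After cast:   $($asInts[0].GetType().Name)"
```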

When you need to manipulate text to make it more presentable, or simply more useful to other commands, keep the Join and Split operators and the Split method close by. They make the process much easier.

About the Author

Boe Prox is a Microsoft MVP in Windows PowerShell and a Senior Windows System Administrator. He has worked in the IT field since 2003, and he supports a variety of different platforms. He is a contributing author in PowerShell Deep Dives with chapters about WSUS and TCP communication. He is a moderator on the Hey, Scripting Guy! forum, and he has been a judge for the Scripting Games. He has presented talks on the topics of WSUS and PowerShell as well as runspaces to PowerShell user groups. He is an Honorary Scripting Guy, and he has submitted a number of posts to Microsoft's Hey, Scripting Guy! He also has a number of open source projects available on Codeplex and GitHub. His personal blog is at http://learn-powershell.net.
