PowerShell How-To
How To Manage File Hashes in PowerShell
A technical introduction to file hashes: how to generate them with PowerShell and how to use them to confirm that file copies were successful.
- By Adam Bertram
- 07/09/2015
Have you ever dropped a package at the nearest package delivery store destined for some remote place? If so, you probably expect it to get there exactly as you shipped it, right? It is the job of the package delivery service to accept your package, send it off, track it while it's in transit and successfully deliver the package to the recipient exactly how it left your hands. If this didn't happen and the package arrived beat up, you'd want to know about it. You'd want to get angry at the package delivery service for damaging it. Shouldn't the same be said for your important digital files?
We may not be delivering packages like UPS in the digital world, but we still deliver products. Our products are just files, folders, e-mails and registry keys. We transfer all kinds of digital data every day, and we don't think twice about whether it's going to get there. That's because we rely on various underlying transport protocols to break each file into packets and reassemble it into a file at the recipient. What if you wanted to see with your own eyes whether your "package" was delivered in one piece once it arrived at its "recipient"? You can, with hashing. More specifically, by comparing two hashes.
Hashing, in its simplest form, is just taking an object -- we'll use a file in this case -- and running it through a fancy algorithm to get back an obscure but predictable string of characters representing what the file is made of. Notice that I used the adjective predictable. This is the key. Given the same file and the same hash algorithm, the output will always be the exact same set of characters.
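To see that predictability for yourself, here is a minimal sketch that hashes the same file twice. It assumes PowerShell 4.0 or later, where the built-in Get-FileHash cmdlet is available; the file name hash-demo.txt is just a hypothetical placeholder.

```powershell
# Create a small throwaway file in the temp directory (hypothetical file name).
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'hash-demo.txt'
Set-Content -Path $path -Value 'Hello, hashes!'

# Hash the same file twice with the same algorithm.
$first  = Get-FileHash -Path $path -Algorithm SHA256
$second = Get-FileHash -Path $path -Algorithm SHA256

# The Hash property is a hexadecimal string; both runs should match exactly.
$first.Hash
$first.Hash -eq $second.Hash
```

SHA256 is also the cmdlet's default algorithm, so the -Algorithm parameter is spelled out here only to make the choice visible.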
Apply a hash algorithm to a text file on your computer now and you might get back something like dk8xkw94j>0$%nlxn (a made-up string for illustration; real hashes are usually displayed as hexadecimal digits). Copy the file over to another computer, hash it there and you'll get that same dk8xkw94j>0$%nlxn string. If that text file is modified by even a single bit, that string will be different. What's the big deal? What if you could calculate the file hash before you copied the file and then again right after you copied it? What benefit would that provide? It would tell you down to the tiniest detail if any hair on your little file's head was changed.
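The single-bit claim is easy to try out. The sketch below, again assuming the built-in Get-FileHash cmdlet (PowerShell 4.0 or later) and a hypothetical temp file name, writes three bytes to a file, flips one bit in the first byte and compares the hashes taken before and after.

```powershell
# Write three known bytes ("ABC") to a throwaway file (hypothetical file name).
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'hash-change-demo.bin'
[System.IO.File]::WriteAllBytes($path, [byte[]](0x41, 0x42, 0x43))
$before = (Get-FileHash -Path $path -Algorithm SHA256).Hash

# Flip the lowest bit of the first byte (0x41 becomes 0x40) and write it back.
$bytes = [System.IO.File]::ReadAllBytes($path)
$bytes[0] = $bytes[0] -bxor 1
[System.IO.File]::WriteAllBytes($path, $bytes)
$after = (Get-FileHash -Path $path -Algorithm SHA256).Hash

# The two hashes should no longer match.
$before -ne $after
```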
Think about how handy it would be if you had some very important files you needed copied to a computer halfway across the world. Perhaps you decided to zip them up and copy them over a VPN connection. The files would first have to be put in a zip file, read from your computer, split apart over SMB/SSL/IPsec and a whole slew of other protocols, sent through dozens of routers, potentially traverse the sea floor to finally get written back to that zip file again on the recipient's end, opened by the recipient and finally written back to her computer's hard disk. It's a wild ride for a file! There's a lot of potential there for that file to become "damaged" or even maliciously modified. Wouldn't it be nice if you could confirm it got there exactly as it left? You're in luck! You can!
I've created a script called Copy-FileWithHashCheck.ps1 that allows you to copy a file from point A to point B while verifying that the hash is the same when it arrives. The real work in its functions was done by Boe Prox; that is where the code for the script's Get-MyFileHash function came from. I simply bolted on the ability to run it before and after a file is copied and compare the results. Feel free to download and use the script. It's a great way to copy files that you need to be 100 percent sure get to where they're going in one piece.
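If you just want to see the general idea without downloading anything, the hash-before, copy, hash-after pattern can be sketched roughly as follows. To be clear, this is not the Copy-FileWithHashCheck.ps1 script itself: the function name Copy-FileVerified and its parameters are hypothetical, and it swaps in the built-in Get-FileHash cmdlet (PowerShell 4.0 or later) in place of Get-MyFileHash. It also assumes the destination is a full file path rather than a folder.

```powershell
# Hypothetical helper: copy one file and confirm the copy hashes identically.
function Copy-FileVerified {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)][string]$Path,         # source file
        [Parameter(Mandatory)][string]$Destination,  # full destination file path (assumption)
        [string]$Algorithm = 'SHA256'
    )

    # Hash the source before the copy.
    $sourceHash = (Get-FileHash -LiteralPath $Path -Algorithm $Algorithm -ErrorAction Stop).Hash

    # Copy the file; stop immediately if the copy itself fails.
    Copy-Item -LiteralPath $Path -Destination $Destination -ErrorAction Stop

    # Hash the copy and compare it with the source hash.
    $destinationHash = (Get-FileHash -LiteralPath $Destination -Algorithm $Algorithm -ErrorAction Stop).Hash
    if ($sourceHash -ne $destinationHash) {
        throw "Hash mismatch: '$Path' ($sourceHash) vs. '$Destination' ($destinationHash)"
    }

    # Return a small summary object for the caller.
    [pscustomobject]@{
        Source      = $Path
        Destination = $Destination
        Hash        = $sourceHash
        Verified    = $true
    }
}

# Usage with throwaway temp files (hypothetical names).
$source = Join-Path ([System.IO.Path]::GetTempPath()) 'verified-source.txt'
$target = Join-Path ([System.IO.Path]::GetTempPath()) 'verified-copy.txt'
Set-Content -Path $source -Value 'Important data'
$result = Copy-FileVerified -Path $source -Destination $target
$result.Verified
```

Throwing on a mismatch is one design choice among several; a script that copies many files might prefer to log the failure and retry instead.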
About the Author
Adam Bertram is a 20-year veteran of IT. He's an automation engineer, blogger, consultant, freelance writer, Pluralsight course author and content marketing advisor to multiple technology companies. Adam also founded the popular TechSnips e-learning platform. He mainly focuses on DevOps, system management and automation technologies, as well as various cloud platforms mostly in the Microsoft space. He is a Microsoft Cloud and Datacenter Management MVP who absorbs knowledge from the IT field and explains it in an easy-to-understand fashion. Catch up on Adam's articles at adamtheautomator.com, connect on LinkedIn or follow him on Twitter at @adbertram or the TechSnips Twitter account @techsnips_io.