PowerShell Multithreading with PoshRSJob

Intro

PowerShell has had a native method for spawning multiple “threads” ever since I can remember, in the form of the *-Job cmdlets. They work OK, but there are a couple of downsides:

  • Each job is its own PowerShell process, so it takes a non-trivial amount of time and memory to spin each up
  • There’s no built-in method for throttling the number of concurrent jobs

This combination becomes an ugly mess if something spins out of control and you spawn dozens or hundreds of jobs. PowerShell jobs are better suited to small-scale asynchronous background processing. You can wrap the cmdlets to limit the number of concurrent jobs but, again, creating and tearing down jobs carries a lot of overhead. Warren Frame created Invoke-Parallel, which uses runspaces (lighter-weight than jobs) and allows for throttling, but it isn’t quite as full-featured as jobs are.
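
To make the "wrap the cmdlets" idea concrete, here is a minimal sketch of hand-rolled throttling around the native job cmdlets. The limit of 3, the six work items, and the doubling workload are all made up for illustration; only the built-in Start-Job, Wait-Job, Receive-Job, and Remove-Job cmdlets are used.

```powershell
# Hypothetical throttle limit and workload, chosen only for illustration.
$maxConcurrent = 3
$jobs = [System.Collections.Generic.List[object]]::new()

foreach ($i in 1..6) {
    # Poll until fewer than $maxConcurrent of our own jobs are still active.
    while (@($jobs | Where-Object { $_.State -in 'NotStarted', 'Running' }).Count -ge $maxConcurrent) {
        Start-Sleep -Milliseconds 200
    }

    # Each Start-Job call launches a separate PowerShell process.
    $job = Start-Job -ArgumentList $i -ScriptBlock {
        param($n)
        Start-Sleep -Seconds 1
        $n * 2
    }
    $jobs.Add($job)
}

# Wait for everything, collect the output, then clean up.
$results = $jobs | Wait-Job | Receive-Job
$jobs | Remove-Job
$results
```

The polling loop works, but every job still pays the process start-up cost, which is the overhead the runspace-based approaches avoid.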

Enter PoshRSJob

Boe Prox comes to the rescue with his PoshRSJob module. This module provides runspace-based functions that mirror PowerShell’s native job cmdlets (Get, Receive, Remove, Start, Stop, and Wait), plus some additional functionality, such as importing modules into each job (without needing Import-Module inside the scriptblock) and throttling the number of jobs running concurrently.
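
As a rough illustration of how closely the functions track the native ones, here is a small sketch of the full lifecycle. It assumes PoshRSJob is already installed (for example from the PowerShell Gallery); the squaring workload and the throttle of 2 are arbitrary.

```powershell
# Assumes the module is installed, e.g. Install-Module PoshRSJob -Scope CurrentUser
Import-Module PoshRSJob

# Pipe six numbers in; -Throttle caps how many runspaces work at once.
# (-ModulesToImport could be added here to load a module into each runspace.)
$jobs = 1..6 | Start-RSJob -Throttle 2 -ScriptBlock {
    Start-Sleep -Seconds 1
    $_ * $_
}

$jobs | Wait-RSJob | Out-Null      # block until every job has finished
$squares = $jobs | Receive-RSJob   # collect the output, like Receive-Job
$jobs | Remove-RSJob               # dispose of the finished jobs
$squares
```

If you already know Start-Job, Wait-Job, Receive-Job, and Remove-Job, the only new thing here is the -Throttle parameter.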

Code
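
The listing is not reproduced here, so what follows is a minimal sketch reconstructed from the explanation, not the exact script. The download URL, output directory, batch name, job names, and five-second delay are all assumptions chosen for illustration.

```powershell
Import-Module PoshRSJob

# Assumed values: adjust the directory, batch name, and URL to taste.
$outputDir = Join-Path $HOME 'rsjob-demo'
$batchName = 'PoshRSJobDownload'
$moduleUrl = 'https://github.com/proxb/PoshRSJob/archive/master.zip'
New-Item -ItemType Directory -Path $outputDir -Force | Out-Null

$jobs = foreach ($i in 1..24) {
    $jobParams = @{
        Name         = "Download-$i"
        Batch        = $batchName   # same batch name on every call
        Throttle     = 3            # at most three jobs run at a time
        ArgumentList = $i, $outputDir, $moduleUrl
        ScriptBlock  = {
            param($Counter, $Directory, $Url)

            # Assumed delay: keeps job states observable and is gentle on GitHub.
            Start-Sleep -Seconds 5

            # The loop counter makes each file name unique.
            $target = Join-Path $Directory "PoshRSJob-$Counter.zip"
            Invoke-WebRequest -Uri $Url -OutFile $target -UseBasicParsing
        }
    }
    Start-RSJob @jobParams
}
```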

Explanation

The above is a fairly trivial script but it serves the purpose of demonstrating multi-threading with this module.

  1. In a loop, I’m creating two dozen RSJobs to fetch the PoshRSJob module from GitHub and write it out to a directory on my computer.

  2. I’ve set a throttle limit of 3; while all 24 jobs will be created, only three will run at any given time and the remainder will be in a NotStarted state, patiently waiting their turn.

  3. I’m passing the value of the loop counter and my output directory into the job’s scriptblock so that I can write out 24 individually-named files.

  4. So that the RSJob system can keep track of the jobs properly, I’m lumping them into a named Batch.

  5. Because the module’s zip file is so small, I introduced a bit of a delay into each job. Why?

    1. To give me time to capture screenshots while the jobs are in various states.
    2. To prevent tripping GitHub’s anti-flooding protections.

So let’s fire this up!

Jobs created, but only three running

I’ve started up the script and it’s created all the RSJobs, but only the first three are running because that’s where I set my throttle limit. Let’s check in on the progress with Get-RSJob - all the same concepts you’re used to with regular PowerShell jobs (receiving, removing, checking for errors, etc.) work with RSJobs too!
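
For a quick text summary instead of a screenshot, something like the following sketch works. The 'Demo' batch and its sleeping workload are stand-ins so the snippet runs on its own; substitute whatever batch name your jobs actually use.

```powershell
Import-Module PoshRSJob

# Stand-in workload; 'Demo' is an arbitrary batch name.
1..6 | Start-RSJob -Batch 'Demo' -Throttle 2 -ScriptBlock {
    Start-Sleep -Seconds 3
} | Out-Null

# Count how many jobs are in each state (NotStarted, Running, Completed, ...).
$summary = Get-RSJob -Batch 'Demo' | Group-Object -Property State -NoElement
$summary | Format-Table -Property Name, Count

# List only the jobs still waiting for a free runspace.
Get-RSJob -Batch 'Demo' | Where-Object { $_.State -eq 'NotStarted' }
```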

And we’re done!

You still need to clean up after yourself! Don’t forget to Receive-RSJob and then Remove-RSJob. I have something special for that.

Bonus Round!

Let’s say you’ve got a lot of RSJobs. Or maybe long-running RSJobs. Wouldn’t a nice, user-friendly view of the progress be handy? You bet it would!

This is both dense and verbose, but I wrote it quickly and it works (and not just for this post; I’ve used it in production), so I’m not tinkering too much with it right now. Once the jobs are created (and a few have started), this script runs a loop that watches the output of Get-RSJob and updates a progress bar to reflect where we are. It also takes care of receiving and removing the completed jobs.
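
A stripped-down sketch of that kind of loop is below. It is a simplified illustration rather than the production script described above: the batch name, the twelve sleeping jobs, and the half-second polling interval are assumptions.

```powershell
Import-Module PoshRSJob

$batch = 'ProgressDemo'   # arbitrary batch name
$total = 12               # number of stand-in jobs

1..$total | Start-RSJob -Batch $batch -Throttle 3 -ScriptBlock {
    Start-Sleep -Seconds 2
    "item $_ processed"
} | Out-Null

$results = [System.Collections.Generic.List[object]]::new()
$done = 0

while ($done -lt $total) {
    # Anything in a terminal state gets received and removed on this pass.
    $finished = @(Get-RSJob -Batch $batch |
        Where-Object { $_.State -in 'Completed', 'Failed', 'Stopped' })

    foreach ($job in $finished) {
        foreach ($item in ($job | Receive-RSJob)) { $results.Add($item) }
        $job | Remove-RSJob
        $done++
    }

    $running = @(Get-RSJob -Batch $batch | Where-Object { $_.State -eq 'Running' }).Count
    $percent = [int](100 * $done / $total)
    Write-Progress -Activity 'Running RSJobs' -Status "$done of $total complete, $running running" -PercentComplete $percent

    Start-Sleep -Milliseconds 500
}

# Clear the progress bar once everything has been collected.
Write-Progress -Activity 'Running RSJobs' -Completed
$results
```

Because finished jobs are removed as the loop goes, there is nothing left to clean up afterward.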

Video: RSJobs progress bar demo (https://flxsql.com/wp-content/uploads/2018/12/RSJobsProgress.mp4)

It’s true that I’m not checking for any errors in my job output. In the production process that I multithreaded, the Scriptblock inside each thread does its own error logging to a file.
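
For a sense of what per-thread logging can look like, here is a hypothetical sketch. The log directory, file naming, and simulated failure are invented for the example and are not taken from the production process.

```powershell
Import-Module PoshRSJob

# Assumed log location; one file per job avoids threads sharing a file.
$logDir = Join-Path ([System.IO.Path]::GetTempPath()) 'rsjob-logs'
New-Item -ItemType Directory -Path $logDir -Force | Out-Null

$jobs = foreach ($i in 1..4) {
    Start-RSJob -Batch 'LoggingDemo' -Throttle 2 -ArgumentList $i, $logDir -ScriptBlock {
        param($Item, $LogDirectory)
        $logFile = Join-Path $LogDirectory "job-$Item.log"
        try {
            # Simulated failure on even-numbered items.
            if ($Item % 2 -eq 0) { throw "simulated failure for item $Item" }
            "item $Item ok"
        }
        catch {
            # Inside catch, $_ is the error record.
            "$(Get-Date -Format o) $($_.Exception.Message)" | Set-Content -Path $logFile
        }
    }
}

$jobs | Wait-RSJob | Out-Null
$output = $jobs | Receive-RSJob
$jobs | Remove-RSJob

# Review what the failing jobs wrote.
$logs = Get-ChildItem -Path $logDir -Filter 'job-*.log'
$logs | Get-Content
```

Since the scriptblock catches its own errors, the jobs themselves finish cleanly and the log files become the record of what went wrong.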

Double Bonus

Did you notice the paths I’m writing to and the terminal theme? Yeah, this works on PowerShell Core too. Everything you see here ran on macOS as I was writing this post.

Conclusion

For light to moderate use, PoshRSJob is a really handy way to multi-thread processes that aren’t natively multi-threaded in PowerShell. It doesn’t seem to scale well to hundreds of threads, though. There is an open issue on GitHub that seems to be related, but I’ll discuss my own experience with it in a near-future post.