PowerShell's strange comma operator
I have reached an enlightening moment in the land of PowerShell.
As it turns out, the comma in PowerShell is an operator—both binary and unary.
Unary use is a bit strange and out of the ordinary, but it’s not like the comma hasn’t been used as a binary operator before. For example, C/C++ uses the comma both as an operator and as a separator (in argument lists and declarations, for instance), with JavaScript and Perl implementing that functionality with C as the inspiration. So what exactly is so enlightening about it being an operator in PowerShell?
Foremost is the use-case. If we read the PowerShell documentation for the comma operator, we see that it “creates an array or appends to the array being created”. This is in stark contrast to how most other languages use the comma operator, if they even have one. In cases where arrays need to be created, explicit language constructs are usually created for it, rather than offloading it into an operator.
Even those languages that do use the comma as an operator typically place it very low in the order of operations. In all three of our example languages (C/C++, JavaScript, and Perl), the binary comma operator sits at or near the bottom of the precedence table, with only Perl placing a scant few operators below it. But in PowerShell, the comma actually has higher precedence than assignment, logical, and arithmetic operators!
Thus, knowing that addition is lower precedence in the order of operations than the comma, we get this interesting language “feature”:
> $x = 'hello', 'world' + 'dog'
> $x
hello
world
dog
> $x.GetType().Name
Object[]
> $x.Count
3
If you are confused, don’t worry, it might take some time to wrap your head around it. You may also be confused as to why I showed the type of the resulting object. You’ll find out in a moment. But before that, let’s go over what’s happening here, in order:
- The comma operator , is invoked for the string operands 'hello' and 'world', creating a new array containing those values.
- The addition operator +, which has a lower precedence than the comma operator, is invoked after building this array. PowerShell picks the behavior of + from the type of its left operand. Since the left operand is the array we just built, the string operand 'dog' on the right is appended to it, which results in a new array containing the three values.
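One way to convince yourself of the grouping described above is to spell it out with parentheses. The following is a minimal sketch; the variable names are my own and purely illustrative.

```powershell
# Implicit grouping: the comma binds tighter than +.
$implicit = 'hello', 'world' + 'dog'

# The same expression with the grouping written out by hand.
$explicit = ('hello', 'world') + 'dog'

# Forcing the addition to run first gives a different result:
# a two-element array whose second element is 'worlddog'.
$addFirst = 'hello', ('world' + 'dog')

$implicit.Count   # 3
$explicit.Count   # 3
$addFirst.Count   # 2
$addFirst[1]      # worlddog
```

If the first two counts match and the third differs, the comma really was evaluated before the addition.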
Here’s another example, this time using the multiplication operator:
> 'hello', 'world' * 3
hello
world
hello
world
hello
world
In this case, the array is replicated three times, since the comma still has higher precedence than the multiplication and is performed first, following the same general logic as the addition example above.
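For contrast, here is a small sketch of what happens when parentheses force the multiplication to run first, so that it applies to a single string rather than to the array. The variable names are illustrative only.

```powershell
# Comma first (the default): the two-element array is replicated three times.
$repeatedArray = 'hello', 'world' * 3

# Multiplication first: only the string 'world' is replicated.
$repeatedString = 'hello', ('world' * 3)

$repeatedArray.Count    # 6
$repeatedString.Count   # 2
$repeatedString[1]      # worldworldworld
```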
By the way, the process by which these arrays are being modified by the arithmetic operators is an unimportant implementation detail for our current experiment; I provided links to the relevant documentation for the sake of clarity to the uninitiated. Rather, what is important to note is PowerShell’s broader concept of implicit type conversion, a concept that I want to drive home: PowerShell will attempt to convert types implicitly when it is required to do so in order to satisfy the requirements of an operation or expression.
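A tiny illustration of that implicit conversion, using scalars instead of arrays, might look like the sketch below. Note how the type of the left operand decides which way the conversion goes; the variable names are mine.

```powershell
# Left operand is a number, so the string '2' is converted to a number.
$sum = 1 + '2'

# Left operand is a string, so the number 2 is converted to a string.
$text = '1' + 2

$sum                   # 3
$sum.GetType().Name    # Int32
$text                  # 12
$text.GetType().Name   # String
```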
Let’s try reversing the order of our operators and see what happens:
> $x = 'hello' + 'world', 'dog'
> $x
helloworld dog
Wait, isn’t the comma operator supposed to have higher precedence than addition? If so, why does it look like 'hello' and 'world' were added together before the array was created? Take a closer look:
> $x.GetType().Name
String
> $x.Count
1
So what’s happening in this order of operations?
- The comma operator , is invoked for the string operands 'world' and 'dog'.
- A new array is created, containing the values 'world' and 'dog'.
- The addition operator +, which has a lower precedence than the comma operator, is invoked after building this array.
- PowerShell picks the behavior of + from the type of its left operand. Here the left operand is the string 'hello', so + performs string concatenation rather than array concatenation.
- Since the array we built is now the operand to the right of the addition operator, the array is converted into a string before being appended to the left operand.
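To check this reading of the steps above, it helps to compare the original expression with two variations: one where parentheses force the addition first, and one where the left operand is already an array. This is a sketch under the assumption that the output field separator has been left at its default; the variable names are illustrative.

```powershell
# The original expression: string on the left, array on the right.
$joined = 'hello' + 'world', 'dog'

# Parentheses force the addition first, then the comma builds an array.
$grouped = ('hello' + 'world'), 'dog'

# An array on the left makes + concatenate arrays instead of strings.
$arrayFirst = @('hello') + 'world', 'dog'

$joined.GetType().Name   # String
$joined                  # helloworld dog
$grouped.Count           # 2
$arrayFirst.Count        # 3
```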
There is a significant nuance in the fact that 'world' and 'dog' are separated by a space here. Rather than format the array with newlines separating the values before it’s written to the host, as we saw in our previous examples, we instead have a single space separating our array values. That space is actually the currently set output field separator ($OFS), which is a single space by default. This field separator is used to join the array values when converting the array to a string.
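The output field separator is exposed as the $OFS variable, so its effect is easy to see by changing it. The sketch below uses string interpolation as the conversion and sets $OFS inside a child scope so the session's setting is left alone; it assumes $OFS has not already been changed in the calling session.

```powershell
$pair = 'world', 'dog'

# With $OFS left alone, the elements are joined with a single space.
$default = "$pair"

# Inside a child scope, a different separator changes the joined string.
$custom = & { $OFS = '-'; "$pair" }

$default   # world dog
$custom    # world-dog
```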
Thus, our tale harkens back to the escapades of a wandering $null. Speak to any seasoned PowerShell programmer, and you will hear copious moaning about objects being incorrectly equated to $null, simply because it appeared on one side of a comparison operator when it should have been on the other. In PowerShell, there are reasons to do both, as each placement offers different functionality that may prove beneficial to the programmer. But in general, if you want to check whether a value exists or not, $null should appear on the left-hand side of the comparison, to avoid type conversion and unwanted array enumeration. Multilingual readers may recognize the situation as eerily similar to JavaScript’s strict and abstract equality operators, with similar problems cropping up for the unwary developer.
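Here is a short sketch of the difference, using an array that happens to contain a couple of $null entries. The variable names are mine; the point is only that the side $null sits on changes what the comparison does.

```powershell
$list = 1, $null, 2, $null

# $null on the left: a single scalar comparison against the whole array.
$nullOnLeft = $null -eq $list

# $null on the right: the array is enumerated and -eq acts as a filter,
# returning every element that equals $null.
$nullOnRight = @($list -eq $null)

$nullOnLeft          # False
$nullOnRight.Count   # 2
```

With $null on the right, the result is not even a Boolean, which is exactly the kind of surprise that placing $null on the left avoids.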
I argue that, when checking equality or performing operations, it is simply too difficult to rely on loose typing, implicit type conversion/casting, and automatic array enumeration. These tempting pieces of syntactic sugar have caused considerable headaches for thousands of programmers, no matter how useful they are: the typical anti-pattern. But with PowerShell’s decade-plus of history, and the tens of thousands of scripts that make up that history… Like JavaScript, it’s far too late to go back.
Maybe PowerShell needs its own strict equality operator? We could add it to the fleet of funny operators, like -ieq for case-insensitive equality. How about -seq? Too close to -ceq? Nah?