# Array Language Comparisons

Trying to summarize some array languages based on their features and how they work and feel.

## Laundry list

In the table below, the `+` should be taken to mean any binary operator, like `+`, `-`, or `>`.
In the case where both arguments are collections, their shapes must agree in some way which is language-dependent.
Hopefully I got all these characteristics right, but I'm less sure about MATLAB since I haven't used it for several years, nor Julia since I have not used it much at all.

These languages all support operations on (s)calars, (v)ectors, and higher-dimensional (m)atrices:

language | s+s | s+v | v^{k}+v^{k} | v^{k}+m^{l×k} | v^{k}+m^{k×l} | m^{k×l}+m^{k×l} | m^{k×l}+m^{l×k} |
---|---|---|---|---|---|---|---|
APL | y | y | e | n | y | e | leading |
J | y | y | e | n | y | e | leading |
K | y | y | e | n | y | e | leading |
NumPy | y | y | e | y | n | e | trailing |
Julia | y | y | e | y | n | e | trailing |
MATLAB | y | y | e | n | y | e | leading |

(NOTE: "e" in the table above means "element-wise")

They're also usually interpreted:

language | aot-compiled | jit-compiled | interpreted | static types | dynamic types |
---|---|---|---|---|---|
APL | co-dfns only | n | y | n | y |
J | n | n | y | n | y |
K | n | kx | y | n | y |
NumPy | bytecode | pypy, numba | y | annotations | y |
Julia | optional | y | y | annotations | y |
MATLAB | optional | y | y | n | y |

And differ slightly in their approaches to programming:

language | named functions | lambdas | scope | closures |
---|---|---|---|---|
APL | y (tradfns) | y (dfns) | dynamic (dfns: lexical) | n |
J | n | y | global, local | n |
K | n | y | global, local | n |
NumPy | y | y | dynamic (functions: lexical) | y (prefer classes) |
Julia | y | y | lexical | y |
MATLAB | y | y | dynamic | y (prefer classdef) |

## Scalar+Vector

Isn't it nice to not have to write out a for-loop? The following is valid in APL, J, K, and some others:

```
      12 + 2 3 4
14 15 16
```

MATLAB and Octave do the same thing with a tiny bit more syntax in the form of square brackets:

```
octave> 12 + [2 3 4]
ans = 14 15 16
```

BQN is similar, but has a special character for "stranding" a collection of values together (`‿`), as well as general list notation (`⟨⟩`):

```
      12 + 2‿3‿4
⟨ 14 15 16 ⟩
```

NumPy is a little farther away due to being a library rather than built-in syntax, but if you squint you may still see it:

```
>>> a = numpy.array([2,3,4])
>>> 12 + a
array([14, 15, 16])
```

## Terminology

These languages have different terminology for this behavior.
Some K documentation says the `+` "is pervasive" or "penetrates" down to the elements of the arguments.
J describes this in terms of rank: "the `+` verb has rank 0".
NumPy, Julia, and MATLAB call this "broadcasting".

In the example (`12 + 2 3 4`) you could imagine it executing in one of two ways:

- the operation `12+x` is mapped over each "x" in `2 3 4`
- `12` is replicated and reshaped into `12 12 12`, which adds elementwise with `2 3 4`
Broadcasting is usually explained with the second formulation, but internally most array-language implementations optimize away the unnecessary allocation of memory for temporary values.
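NumPy exposes this optimization directly. A short sketch using `numpy.broadcast_to`, which returns a zero-stride view instead of allocating a replicated array:

```python
import numpy as np

# broadcast_to returns a read-only *view* of the scalar with the
# requested shape; the stride along the new axis is 0, so the
# single value 12 is never physically replicated in memory.
replicated = np.broadcast_to(12, (3,))
print(replicated)          # [12 12 12]
print(replicated.strides)  # (0,)

# Elementwise addition then behaves exactly like the second formulation.
print(np.array([2, 3, 4]) + replicated)  # [14 15 16]
```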

## Vector+Vector

These languages also support adding collections to other collections when the shapes are identical:

```
      2 3 4 + 10 5 2
12 8 6
```

This is simply an elementwise operation. If the shapes differ, all of these languages consider it an error.
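As a concrete check, here is the same elementwise addition in NumPy, plus a deliberate shape mismatch, which raises a `ValueError`:

```python
import numpy as np

a = np.array([2, 3, 4])
b = np.array([10, 5, 2])
print(a + b)  # [12  8  6]

# A (3,)-shaped vector and a (2,)-shaped vector cannot be added.
try:
    a + np.array([1, 2])
except ValueError as e:
    print("error:", e)
```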

## Higher dimensions

It gets a little more interesting when the shapes are not identical.
Here's an example in J showing which shapes are compatible (`i.y` returns a range of numbers up to the product of y, with shape y):

```
   i.2 3    NB. iota with y argument (2 3)
0 1 2
3 4 5
   12 12 + i.2 3
12 13 14
15 16 17
   12 12 + i.3 2
|length error
|   12 12    +i.3 2
   12 12 12 + i.3 2
12 13
14 15
16 17
```

In the first addition, the left argument `12 12` can add with the `2x3` right argument because the shapes of the leading axes match.
Similarly, the last example adds `12 12 12` to a `3x2` argument.
But when adding `12 12` to a `3x2` array, J raises a "length error" because the shapes are not compatible according to its rules.

Broadcasting in NumPy works a little differently:

```
>>> a23 = np.arange(6).reshape((2,3))  # 2x3 array
>>> a32 = np.arange(6).reshape((3,2))  # 3x2 array
>>> [12,12] + a32
array([[12, 13],
       [14, 15],
       [16, 17]])
>>> [12,12,12] + a23
array([[12, 13, 14],
       [15, 16, 17]])
>>> [12,12] + a23
ValueError: operands could not be broadcast together with shapes (2,) (2,3)
```

In NumPy-style broadcasting, the *trailing* axes must match.
Actually it's a little more complicated than that: there are broadcasting rules which determine whether the operation is valid.
Broadcasting conceptually starts with trailing-axis matching, but if there is a mismatch and one of the dimensions is a singleton (i.e. the `1` dimension in a `2x1x3` tensor), then the singleton data is replicated to match the other dimension.

By contrast, J-style rank applies an operation to what J refers to as *cells* of the arguments.
When the shapes are equal, the operation applies elementwise.
Otherwise, the leading axes of the arguments' cells must match.
This means you can add a tensor with shape `2x3x4` to another with shape `2x3`, but not to one with shape `3x4` or even `2x3x3`.
You can also override this *rank* and apply the `+` operation at a different level.
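For comparison, J's leading-axis agreement can be emulated in NumPy (a sketch of the behavior, not how J works internally) by appending a singleton trailing axis so that broadcasting stretches along it:

```python
import numpy as np

t234 = np.arange(24).reshape((2, 3, 4))
t23 = np.arange(6).reshape((2, 3))

# NumPy matches trailing axes, so 2x3x4 + 2x3 is an error here...
try:
    t234 + t23
except ValueError as e:
    print("error:", e)

# ...but a trailing singleton axis gives J-style leading-axis behavior:
# each length-4 cell of t234 gets the corresponding scalar of t23 added.
result = t234 + t23[:, :, np.newaxis]
print(result.shape)  # (2, 3, 4)
```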

Finally, K emulates multi-dimensional data using vectors-of-vectors. By default it applies operations at the finest granularity, to the leaf nodes of the structure, but this can be overridden with adverbs that apply an operation to "each", "each-left", "each-right", or "each-prior". These do not cover all cases in the same general way as either broadcasting or rank, but in practice obtaining equivalent results is only slightly more awkward.
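To illustrate, here is a rough Python sketch (the function names are mine, not K's) of what "each-left" and "each-right" mean: each fixes one argument and maps the operation over the other.

```python
from operator import add

def each_left(f, xs, y):
    # "each-left": map f over the left argument, fixing the right.
    return [f(x, y) for x in xs]

def each_right(f, x, ys):
    # "each-right": map f over the right argument, fixing the left.
    return [f(x, y) for y in ys]

print(each_left(add, [2, 3, 4], 12))   # [14, 15, 16]
print(each_right(add, 12, [2, 3, 4]))  # [14, 15, 16]
```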