Futures in Jx

Jx is a J interpreter with built-in support for task-level parallelism. For more information about Jx, check out this page.

What are Futures?

In programming, futures represent the results of asynchronous computations. Basically, they let you write code in terms of variables even when some of those variables don't have values yet.

Hopefully an example will make this more clear.

ex1 = web_get("www.example.com/1")  # takes 2.4s
ex2 = web_get("www.example.com/2")  # takes 2.5s
ex3 = web_get("www.example.com/3")  # takes 1.9s
print(ex1, ex2, ex3)

In many programming languages (Python, JavaScript, J) a program like this would run sequentially: the first line must finish before the second line can start, and so on. Run that way, the program takes 6.8 seconds (the sum of the three web_get times).

However, if you could launch each web_get concurrently, the whole program could take as little as 2.5s, which is the time for the slowest of the three calls.

For a fully asynchronous program, the theoretical minimum total time is the time taken by the slowest individual part, rather than the sum of the times for all the parts.

For practical programs, futures make it somewhat easier to combine sync+async code. In the above example, the print function needs the results of all 3 web_get functions, so it has to wait until they're all finished.

In Python, you could express this goal with the asyncio library:

import asyncio

async def main():
    # web_get must now be defined as an async coroutine
    ex1 = asyncio.create_task(web_get("www.example.com/1"))
    ex2 = asyncio.create_task(web_get("www.example.com/2"))
    ex3 = asyncio.create_task(web_get("www.example.com/3"))
    print(await ex1, await ex2, await ex3)

asyncio.run(main())

This example looks pretty clean, although I've omitted the definition of web_get. Based on the first example, web_get must originally have been a normal (synchronous) function, so the asyncio version requires changing its definition to make it asynchronous. While this is a minor change, it is also a limitation: web_get can be synchronous or asynchronous, but it can't be both within the same program.

background (if you don't know J)

In J you can define a function like this:

   sum =: {{+/ y}}
   sum 2 3 4
9

The integers function, i., creates a matrix of shape 10 10 (ten rows and ten columns):

   i. 10 10
 0  1  2  3  4  5  6  7  8  9
10 11 12 13 14 15 16 17 18 19
20 21 22 23 24 25 26 27 28 29
30 31 32 33 34 35 36 37 38 39
40 41 42 43 44 45 46 47 48 49
50 51 52 53 54 55 56 57 58 59
60 61 62 63 64 65 66 67 68 69
70 71 72 73 74 75 76 77 78 79
80 81 82 83 84 85 86 87 88 89
90 91 92 93 94 95 96 97 98 99
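
As a quick aside (plain J, nothing Jx-specific), the shape verb $ confirms the dimensions of the array that i. built:

   $ i. 10 10    NB. $ returns the shape of its argument
10 10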

Since sum doesn't say what shape its argument should be, it must work on arrays of arbitrary dimension. The question "on which dimensions does this function operate?" is answered in J with a concept called rank. All J functions have a rank, and a function can be applied at other ranks, which means it can be applied across different dimensions of its arguments. For example, a matrix has 2 dimensions (rows and columns), so a function can be applied to either of these dimensions:

   sum i. 10 10
450 460 470 480 490 500 510 520 530 540

If you don't specify a rank, the function applies at the highest rank possible. For a matrix, that means sum sees the whole table at once (rank 2), and +/ sums down each column, so the result above is the ten column sums.

You can specify other ranks for any function (built-in or user-defined):

   sum"1 i. 10 10
45 145 245 345 445 545 645 745 845 945

That extra "1 means "apply sum at rank 1 (rows)", so we get an array of row-sums instead of column-sums.
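
Two quick sanity checks, in plain J rather than anything Jx-specific: applying sum explicitly at rank 2 reproduces the default column sums, and b. 0 (standard J's verb-information query) reports a verb's ranks, showing that sum"1 really is a new verb of rank 1 while sum itself has unbounded rank (displayed as _):

   sum"2 i. 10 10   NB. rank 2 = the whole matrix, same result as the default
450 460 470 480 490 500 510 520 530 540
   sum b. 0         NB. the three ranks (monadic, left, right) of sum
_ _ _
   (sum"1) b. 0     NB. the ranks of the derived verb sum"1
1 1 1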

J with parallel rank

Finally, we can combine our knowledge of rank with Jx's parallel rank to run each of those separate sum calculations on its own thread:

   128!:30(1 10)  NB. use 1 thread for primitives, and up to 10 threads for futures
   (sum .. 0"1) i. 10 10
+--+---+---+---+---+---+---+---+---+---+
|45|145|245|345|445|545|645|745|845|945|
+--+---+---+---+---+---+---+---+---+---+

This time the results are boxed. To access the results, use unbox (>). Unboxing will block execution of the main thread until the results become available (although in this case the example is tiny so the results are available immediately anyway).

   >(sum .. 0"1) i. 10 10
45 145 245 345 445 545 645 745 845 945
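
If boxing itself is unfamiliar, here is the same mechanism in plain J, with no futures involved: < puts values into boxes and > opens them again (this snippet is just ordinary J, for illustration):

   <"1 i. 3 3     NB. box each row of a 3-by-3 matrix
+-----+-----+-----+
|0 1 2|3 4 5|6 7 8|
+-----+-----+-----+
   > <"1 i. 3 3   NB. open (unbox) the boxes to recover a plain array
0 1 2
3 4 5
6 7 8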

There you have it - task parallelism without much extra work.

full source code

To compare, here is the original program:

sum =: {{+/ y}}
sum"1 i.10 10

…and here is the parallel version:

128!:30(1 10)
sum =: {{+/ y}}
(sum .. 0"1) i.10 10

summary

Personally, I find this implementation of parallelism very friendly to use. I've used Python's futures, and I've used promises in JavaScript, but they always felt awkward. They made thinking about the overall program much harder than thinking about a blocking program.

These are my points for and against the Jx implementation:

  • (pro) no change to original function
  • (pro) can use sync or async versions of function in same program
  • (con) no "await" to explicitly show blocking semantics

More importantly, this type of parallelism works well with J's inherent data parallelism, and I look forward to faster-running programs.