hoosier ee

wholesome midwestern tech/design weblog

J is Readable

2018-06-01 (revised 2018-06-07)

What makes some code readable and other code unreadable? I'll argue that readability, at least when it comes to source code, is not so much about how the code is arranged (in folders, on the page) or how it looks (whitespace, indentation, naming conventions). Instead, readability hinges on how easy it is to discover and modify the behavior of the running software.

Code is not poetry, prose, or a recipe in a cookbook. Source code is more like sheet music. For musicians, reading a page of sheet music pales in comparison to playing the piece, or listening to it. For programmers, merely reading source code is far less worthwhile than interacting with the running program while poking and prodding the source to see what happens.

Based on this definition, what makes code readable? My answer: minimal barriers to interacting with the running program, modifying its source, and inspecting what makes it tick. Notice that this answer contains nothing about naming conventions, or the correct length of a function. Nothing about indentation or whitespace around punctuation. No comment about comments. To me, reading code is synonymous with discovering its behavior, and style considerations only come into play when they distract or mislead away from that goal.

What makes J readable? Several design decisions stand out.

Interactive Session

J is an interpreted language, and the way I tend to write code in J is by incrementally building a transformation of some data until it works, then testing with more varied inputs until it's robust. Sometimes I'll benchmark the code and try to make it faster. Here's an example session (user-entered lines are indented, interpreter responses are unindented):

   par =: 'this is (a string (with some (nested) parentheses))'
   '()'i.par
2 2 2 2 2 2 2 2 0 2 2 2 2 2 2 2 2 2 0 2 2 2 2 2 2 2 2 2 2 0 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 1 1
   1 _1 0{~'()'i.par
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 _1 0 0 0 0 0 0 0 0 0 0 0 0 _1 _1
   +/\1 _1 0{~'()'i.par
0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2 2 2 2 2 1 0
   ,":"0+/\1 _1 0{~'()'i.par
000000001111111111222222222223333333222222222222210
   par,:,":"0+/\1 _1 0{~'()'i.par
this is (a string (with some (nested) parentheses))
000000001111111111222222222223333333222222222222210

What's happening here? First, I define some data that I want to work on. I assign a string to a variable: par =: 'this is (a string (with some (nested) parentheses))'. My goal is to determine the nesting depth of its parentheses.

Next, I use dyadic i. (index-of) to get the locations of each type of paren:

   '()'i.par
2 2 2 2 2 2 2 2 0 2 2 2 2 2 2 2 2 2 0 2 2 2 2 2 2 2 2 2 2 0 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 1 1
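
The same lookup is easier to check by eye on a shorter input. The strings and numbers below are made up for illustration, and # is included only to show the length of the left argument for comparison:

```j
   '()' i. 'a(b)'
2 0 2 1
   # '()'
2
   4 5 6 i. 5 6 7
1 2 3
```

In the last line, 7 does not occur in 4 5 6, so it gets 3, the length of that list.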

For each element in the right argument, i. returns the index of its first match in the left argument, or the length of the left argument (here, 2) if there is no match. As an aside, that might seem odd, especially if you're used to something like C or JS, where similar functions usually return -1 to signify the element was not found. But in J, negative numbers index into arrays starting at the end, so _1{4 5 6 is 6, and -1 would be a valid index. Other languages might throw an exception in this case or return a non-numeric sentinel value like nil or NaN. In the next step we'll see that in J, returning the length of the array permits a cute behavior when composing a longer expression:

   1 _1 0{~'()'i.par
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 _1 0 0 0 0 0 0 0 0 0 0 0 0 _1 _1

Here, since I know the result of the previous step will be an array containing only 0, 1, or 2, I can remap these values to something else using { (from). This is like bracket-indexing in other languages, except that it accepts arrays as well as scalars. For example 1{'hello' is 'e', and 3 4 3{'hello' is 'lol'. Normally { takes the indices on the left and the array on the right, but here the indices are on the right and the new mapping is on the left, so I use ~ (passive) to flip the arguments. a -~ b is the same as b - a. This is where the "not found" index pays off: 2 simply selects the third element of the mapping, which is 0. Altogether, this results in a new array containing 1 for each open paren, negative 1 (written _1 in J) for each close paren, and 0 otherwise. In this format, we can do a sum-scan (+/\) to determine the nesting depth of each character:

   +/\1 _1 0{~'()'i.par
0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2 2 2 2 2 1 0
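
Here is a rough sketch of the same pieces on smaller inputs, to show passive, the remapping, and the running sum one at a time. The inputs are invented for illustration:

```j
   3 -~ 10
7
   +/\ 1 2 3 4
1 3 6 10
   1 _1 0 {~ '()' i. '(a(b))'
1 0 1 0 _1 _1
   +/\ 1 _1 0 {~ '()' i. '(a(b))'
1 1 2 2 1 0
```

The last result reads as the depth at each character of '(a(b))': it rises at every open paren and falls back to 0 at the final close paren.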

Finally, I want to display this nesting depth alongside the s-expression string. But since J arrays have to contain elements of the same type, I first have to convert the numeric array to a string using ": (default format). However, I don't want the default format of the whole array, with all the spaces, but a compact string of digits only, so I use the rank conjunction to tell default format to operate on each number individually ("0), and then ravel (,) all of these characters together to make a single string:

   ,":"0+/\1 _1 0{~'()'i.par
000000001111111111222222222223333333222222222222210
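
A small sketch of what each piece contributes, using a made-up list. $ (shape-of) and # (tally) are used here only to inspect the intermediate results:

```j
   ": 1 2 3
1 2 3
   # ": 1 2 3
5
   $ ":"0 (1 2 3)
3 1
   , ":"0 (1 2 3)
123
```

The parentheses around 1 2 3 keep the list from being read as part of the rank operand. Note that this trick assumes every depth is a single digit. A depth of 10 or more would take two characters, and the digits would no longer line up under the string.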

The last step is to put it together with the s-expression string using ,: (laminate), which stacks the two strings as rows of a table.

   par,:,":"0+/\1 _1 0{~'()'i.par
this is (a string (with some (nested) parentheses))
000000001111111111222222222223333333222222222222210
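
For comparison, here is laminate next to plain append (,) on two short made-up strings of equal length:

```j
   'abc' ,: '123'
abc
123
   $ 'abc' ,: '123'
2 3
   'abc' , '123'
abc123
```

Laminate produces a table with two rows, which is why the depths print directly under the characters they describe.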

Being able to build up an expression like this interactively allows for easy experimentation and keeps me in tune with the data. The fact that J is strictly interpreted isn't a benefit in itself (in fact, I'd prefer if J had the option of producing a compiled binary executable), but what's really beneficial is having the option of running code in a REPL session.

Minimal Indirection

The second major aid to readability is the fact that you can enter the name of a verb in the interpreter and see its definition:

   fread
3 : 0
if. 1 = #y=. boxopen y do.
  1!:1 :: _1: fboxname y
else.
  1!:11 :: _1: (fboxname {.y),{:y
end.
:
x freads y
)

Here, I typed the word fread and the interpreter printed its definition. Inside the definition are more words like boxopen and freads, and I can repeat the process for those to find their definitions as well. On occasion, you might find a word which is defined in some other namespace, so seeing its definition takes another step, but it's still trivial compared to e.g. Python, where the standard library and your own user-supplied definitions are bytecode-compiled and essentially hidden from you as you use the interpreter. It's almost like having ctags built into the interpreter, but the mechanism behind it is much simpler: what you see is what's actually there. The definition isn't bytecode-compiled; it's always interpreted.
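
The same works for your own definitions. As a minimal sketch, the expression from the previous section can be wrapped in a verb. The name depth is hypothetical, not something from the standard library:

```j
   NB. depth is a made-up name for illustration
   depth =: 3 : '+/\ 1 _1 0 {~ ''()'' i. y'
   depth '(a(b))'
1 1 2 2 1 0
   4!:0 <'depth'      NB. name class: 3 means verb
3
```

Typing depth on its own line prints the definition back, just as with fread. The exact layout of that display depends on the session's display settings, so it isn't reproduced here.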

Check out the standard library verb nl (short for "name list"), which displays all of the names in a particular locale:

   'rx' nl_z_ verb
┌───┬─────┬──────┬─────┬────┬───────┬───────┬──────┬──────┬─────────┬────┬───────┬──────┬───────┬─────────┬──────┬──────┐
│rxE│rxall│rxcomp│rxcut│rxeq│rxerror│rxfirst│rxfree│rxfrom│rxhandles│rxin│rxindex│rxinfo│rxmatch│rxmatches│rxrplc│rxutf8│
└───┴─────┴──────┴─────┴────┴───────┴───────┴──────┴──────┴─────────┴────┴───────┴──────┴───────┴─────────┴──────┴──────┘

There's kind of a lot going on here. Since J evaluates from right to left, let's break it down in that order. verb is a cover word in the standard library which evaluates to 3. noun evaluates to 0, adverb evaluates to 1, and so on. These are codes for the parts of speech used by the interpreter. Next is nl_z_, which calls the verb nl in locale z. This is the "lookup in namespace" step mentioned earlier. Finally we see the left argument to nl_z_ is the string 'rx'. This tells nl to return only those names which start with "rx", and we can see that in the result.
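
The part-of-speech codes can be checked directly in a session. This sketch assumes the standard profile is loaded, so that the cover words and fread are defined. 4!:0 is the foreign that reports a name's class, and ;: splits a string into boxed words:

```j
   noun, adverb, conjunction, verb
0 1 2 3
   4!:0 ;: 'fread nl verb'
3 3 0
```

The second result shows that fread and nl are verbs (class 3), while verb itself is just a noun (class 0) holding the number 3.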

Other words like load and hfd (hex from decimal) are in the standard library, rather than built into the language as keywords. This makes it easy to peek into the definitions of library verbs you may already be using, and perhaps extract parts of them for some similar-but-slightly-different purpose of your own.

For me, these aspects of J's design make for a very pleasant development experience, without the need for powerful IDEs or editor plugins. A plain text editor and a terminal are usually all I need.