Gradual Types

Cramming static types into an already dense dynamic language.

Typing Disciplines

Any value in a program has at least one type. That type may be unimportant from the programmer's perspective, or unspecified by the programming language syntax, but it's always there. Some languages leverage types to help you write correct or performance-optimized programs, but languages differ in how much type information the programmer must specify, and they differ in terms of when that type information becomes relevant (e.g. at compile time or at execution time).

Dynamically typed languages (e.g. Python, Lua, Emacs Lisp) attach type information to values at execution time. Statically typed languages (e.g. C, Java, Haskell) either permit or require the use of type annotations in the source code, and are usually compiled before running1 so the type checking system can reject invalid programs that don't play by the rules. This is not to say that dynamically typed languages never reject invalid types, but rather they reject them later, when the program runs.

Within the static/dynamic spectrum, the K family of languages are firmly on the dynamic side. For the rest of this post, I'll explore syntax for adding type annotations to a K-like language, transforming it from an annotation-free dynamically typed language into something a little closer to the static end of the spectrum. Adding syntax to a dynamic language in this manner is a form of "gradual typing", defined as:

[A type system] that allows parts of a program to be dynamically typed and other parts to be statically typed. The programmer controls which parts are which by either leaving out type annotations or by adding them in.

Type Annotations

How do languages that already have type annotations spell them out in their syntax? Here are some examples, sorted from "near" to "far" in terms of the distance between the type annotation and the value it describes.

Near

In C (and C++, and Java, and…), a type annotation usually goes before the name:

int x;

Array values in C are spelled as: <type of each array element> <name> <square brackets>, for example int x[] denotes an array called x where each element of the array is an int type, and whose length is unspecified. Conversely, a declaration like int x[3] would specify the same array but with 3 elements. Function declarations put the return type first:

int foo(int x[]);

Go uses "near" type annotations like C, but with different ordering:

x int

Go puts the return type after the parameter list:

func foo(x []int) int {

Note that the array type in Go is still to the right of the name, and the order is <name> <brackets> <type of each element>.

In Rust, the type goes after the name (with an intervening :):

let x: i32;

Rust uses a different type for an array of known size versus when the size is not statically known:

fn foo(x: [i32; 3]) -> i32;  // known size
fn foo(x: &[i32]) -> i32;    // unknown size, uses a "slice" which is a dynamic view of an array

Medium

Haskell and other ML-family languages allow you to write the type of a function above its definition:

f : Int -> Int
f x = 3

Here the colon separates the name of the value from its type, and X -> Y indicates a function that takes a parameter of type X and returns one of type Y. This syntax makes it easier to use the type information as documentation, allowing you in some cases to ignore the details of the function definition and focus only on its types. These kinds of languages typically also have full type inference which means the primary purpose of type annotations is conveying meaning to people reading the code, rather than to the compiler.

Far

Still other systems put type information in a separate file, as is the case with JSON Schemas for API endpoints:

[
  {
    "id": 0,
    "name": "",
    "description": "",
    "created_at": "datetime",
    "updated_at": "datetime",
    ...
  }
]

Not only does this arrangement place the types lexically far from the code they're supposed to be annotating, in many implementations it also defers type checking to run-time. With "far" type annotations, checking the types may even occur on a separate machine altogether.

Adding Type Annotations

What about adding type annotations to a language that does not already have them? One approach is to change the formal grammar of the language. This is a breaking and backwards-incompatible change, which may be inconvenient or fracture the language's users into two communities, but on the upside it means you can't accidentally mix old and new versions of the language.

Another approach is to add typing metadata outside of the language itself, either in separate files, or as special "magic" comments, or by adding pragmas, directives, or other kinds of (previously) side-effect-free statements in the code. This approach is backwards-compatible if the old compiler ignores the new metadata. Here are some examples:

Magic comment:

f:{x+y}  / type(x):`i; type(y):`I

Side-effect-free statement

f:{"x`i"; "y`I"; x+y}

Metadata in a separate schema file (can even be a different filetype, such as JSON):

{
  "files": [
    "source.k": {
      "functions": ["f": {"args": ["x":"`i", "y":"`I"]}]
    }
  ]
}

For the metadata approach, the source file is unchanged:

/ filename: source.k
f:{x+y}

Type Annotations in K-like Languages

Does K need static types or type annotations? Clearly the answer is no, as proven by plenty of folks who use K without these features. But having them would make certain types of programming styles more natural, and could broaden the appeal to a segment of the programming world who would otherwise avoid using K for anything.

Q

A recent release of the Q language added support for pattern matching and dynamic type checking. Q version 4.1 already had "pattern matching" which other languages sometimes call "destructuring assignment":

(a;b): 1 2  /assigns a to 1 and b to 2

The new release also permits specifying types as symbols. A pattern with incorrect types generates an error:

q)(a:`s;b:`h):(`foo;38h)  /types match; a gets `foo, b gets 38h
q)(a:`s;b:`h):(`bar;36.5) /type mismatch: 36.5 is not type `h
'type

Halfkey

The halfkey language has optional type annotations for function parameters:

f:{[xi;yI]x+y}

The suffix i indicates an integer atom, whereas the suffix I indicates an integer list. This uses a restricted set of type annotations that are built in to the language, and as I understand also only works for variables named x, y, or z within functions.

Element

Now it's time for some creative brainstorming. What syntax should Element use for optional type annotations, and how much of the language should allow type annotations in the first place? One way to add new semantics without adding new syntax is to find an obscure corner of the existing language's syntax which currently results in an error, and give it a new (non-error) meaning. For the purpose of this post, I'll focus on using symbols for the type annotation, and combine them with the code in various ways.

Near

To start, let's examine the different ways to combine names with symbols to create a "near" type annotation either with or without a colon:

name `symbol `symbol name
name:`symbol `symbol:name

We can rule out name:`symbol because this is how you would assign a symbol literal to a variable. Next, name `symbol won't work because it could be function or dictionary application (equivalent to name[`symbol]), leaving the variants where the symbol occurs before the name. Of these, `symbol name conflicts with the combination of strands and indexing-by-juxtaposition: `a`b 1 indexes into the strand `a`b with index 1, resulting in the expression `b.

The only combination not already defined by the language is `symbol:name. Currently, symbol literals in both K and Element are immutable so assigning to one does not make sense.

`sym: 42  /currently an error

Instead, we can add a new meaning to the assignment operator so <symbol> ":" <name> now means "<name> has the type given by <symbol>". According to my near/far taxonomy, this annotation syntax is "near" the value, and the placement of the type (left of the name) is similar to C.

When defining a variable, instead of a single syntax we now have three:

f:1     /old style no annotation, type of "f" inferred from the value 1
`i:f:1  /declare f with type `i and value 1
`i:f    /declare f to have type `i without a value

Moving on, we can annotate a function's parameters:

f:{[`i:a;`I:b]a+b}

Or by omitting parameter names, we can still use the default x, y, or z arguments to functions while giving them type annotations:

f:{[`i;`I]x+y}

Finally, we can also annotate the function's return type by adding a <type> ":" before the function's definition.

f:`I:{[`i:a;`I:b] a+b}

Medium

Instead of writing each type adjacent to its value, we can write types on a separate line, perhaps as a pair of imimutable assignments (one for the type and another for the value):

f: `i`I`I  /argument types (`i`I) followed by return type `I
f: {[a;b]a+b}

This method is flexible and compact for functions but has lower "syntactic density" for non-function values because it repeats the <name> ":" part twice:

a: `i
a: 42

In addition, it is asymmetric for defining a typed mutable variable:

a: `i
a:: 42  /a is both typed and mutable

This asymmetry isn't necessarily a bad thing; if its appearance offends the programmer's sensibilities, we can imagine it as a subtle nudge toward using immutable variables, which is generally better.

While these examples showed the type annotation directly above the variable name definition, they could be written farther apart (e.g. in separate files) and perhaps also defined in arbitrary order, but I prefer the look of writing the type annotation directly before the expression.

A limitation of this method is it does not permit composing the mutable property wiht a type annotation, because in Element the :: operator applied twice would be two normal mutable assignment operations:

a::`i  /bind name a to symbol `i
a::42  /bind name a to number 42

Functions Only

We could also rely more heavily on either type inference or dynamic types and restrict our type annotations to only functions, or perhaps only function input parameters:

w:3
{[a:`i;b:`I]a+b}[w;2 3 4]

Summary

Even in a language like K where nearly every part of the syntax already has a defined meaning, it is possible to add type annotations. Whether such annotations are worthwhile to programmers remains to be seen.

Footnotes:

1

While static languages are often compiled, the compiled/interpreted aspect is actually independent from the static/dynamic aspect of a language, and there exist examples of (somewhat rare) compiled/dynamic languages as well as (very rare) interpreted/static languages.