Syntactic Holes

The set of possible strings is infinite, so the design of a programming language’s syntax is mostly about what not to allow. I find this topic interesting and fun, because constraints encourage creativity.

Numbers and Names

Let’s imagine a conventional programming language that lets you write numeric constants like these:

  • 42
  • 0.30000000000000004
  • -3.14

This language also lets the user define names, with some limitations: a name must start with a letter, and after the first character it can use any mix of letters, digits, and underscores. According to these rules, some examples of valid names are:

  • a
  • some_word
  • a2

Meanwhile, these are not valid names:

  • 2abc
  • 42_abc
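To make the rules concrete, here is a minimal sketch of them as regular expressions. It assumes ASCII letters only, and the names `NAME`, `NUMBER`, and `classify` are made up for illustration; the imaginary language above doesn't specify any of this.

```python
import re

# Assumption: "letter" means an ASCII letter. A name is one letter
# followed by any mix of letters, digits, and underscores.
NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

# Assumption: a number is an optional minus sign, some digits, and an
# optional fractional part. This covers the three examples above.
NUMBER = re.compile(r"-?[0-9]+(\.[0-9]+)?")


def classify(token: str) -> str:
    """Return "name", "number", or "invalid" for a single token."""
    if NAME.fullmatch(token):
        return "name"
    if NUMBER.fullmatch(token):
        return "number"
    return "invalid"


for token in ["a", "some_word", "a2", "42", "-3.14", "2abc", "42_abc"]:
    print(token, classify(token))
```

Under these assumptions, `2abc` and `42_abc` match neither pattern, so they fall into the "invalid" bucket.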

Choices

Since something like `2abc` is not a valid name, when parsing the source code, most programming languages treat that as a syntax error. But that’s not the only option.

Instead, the language could treat a number immediately followed by a letter much like languages such as Bash treat sigils. In Bash, $foo means “get the value of the variable foo”. The dollar sign is the “sigil” here: a prefix on a name that would otherwise be treated as a string.

So 2foo could hypothetically mean “get some property of the variable foo”.
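As a rough sketch of what a tokenizer might do with this, the snippet below splits such a token into its digit and its name. The function name `split_digit_sigil` is hypothetical, it assumes ASCII letters, and it says nothing about which property of foo the digit would select.

```python
import re

# Assumption: a single leading digit acts as the sigil, and the rest
# must be a valid name (a letter, then letters, digits, or underscores).
DIGIT_SIGIL = re.compile(r"([0-9])([A-Za-z][A-Za-z0-9_]*)")


def split_digit_sigil(token: str):
    """Return (digit, name) if the token is a digit-prefixed name, else None."""
    match = DIGIT_SIGIL.fullmatch(token)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


print(split_digit_sigil("2foo"))  # (2, 'foo')
print(split_digit_sigil("foo"))   # None: a plain name has no sigil
print(split_digit_sigil("42"))    # None: a plain number is still a number
```

Plain names and plain numbers are left alone, so the new form only occupies the hole in the syntax that used to be an error.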

Lots of Choices

If you take this one step further, you have not only 0 through 9 but numbers of any length, so 999foo could mean yet another variation of poor foo.

Probably that would be taking things too far for most people’s sensibilities, so I won’t even discuss 3.14foo.