Lazy K

Executive summary

Lazy K is a garbage-collected, referentially transparent functional programming language, with a simple stream-based I/O system.

What distinguishes Lazy K from other such languages is its almost total lack of other features. It does not, for example, offer an integrated Hindley-Milner polymorphic type system. It is not shipped with an extensive standard library with support for platform-independent GUI programming and bindings to other languages. Nor could any such library be written since, among other things, Lazy K does not provide any way to define or refer to any functions other than built-ins. This inability is complemented by a matching lack of support for numbers, strings, or any other data type. Nevertheless, Lazy K is Turing-complete.

Sample code

Here's a Lazy K program which prints out the prime numbers (all of them, if you wait long enough). It's based on an elegant implementation of the sieve of Eratosthenes using lazy lists, and it's surprisingly fast: about 80 primes in the first second of execution on my machine (of course it gets progressively slower).


Here's a Lazy K program which simply reverses its input on a character-by-character basis. For example, if you type in "stressed", it'll spit out "desserts". It's written in the Jot dialect of Lazy K, which is substantially less compact than the CL dialect I used above.


Here's a complete implementation of the Unlambda language, written in Lazy K's Unlambda notation. I might point out that this is over 5 times shorter than the shortest existing Unlambda-in-Unlambda interpreter (which is missing support for several language features) and over 10 times shorter than the only fully-featured Unlambda-in-Unlambda interpreter.


If this looks like your idea of fun, read on.

Why the world needs Lazy K

... or, in other words, what's wrong with Unlambda?

Just as Brainfuck captures the distilled essence of (structured) imperative programming, so should there be a language which captures the essence of functional programming. Unlambda would seem to be the natural candidate for that position. But although it is a very pretty language, it just isn't functional enough for my taste.

The Unlambda page states that a functional programming language is one in which functions have first-class citizenship. But I think that this is incorrect: for one thing, by that definition almost any high-level language would qualify as functional, even Perl (and perhaps even C, which has qsort). More usually, "functional programming" is programming in which functions behave like real mathematical functions — that is, without side effects — and a pure functional language, then, is just one which requires this of all functions. In this sense, Unlambda isn't even close to being purely functional, because its I/O system depends crucially on side effects (or, equivalently, on the order of evaluation). The idea of side effects (or of meaning depending on evaluation order) is antithetical to the design of the combinator calculus, and there are strange interactions between them which make writing Unlambda programs a frustrating experience. This is, of course, deliberate: Unlambda is billed as "your functional programming language nightmares come true," and the author credits Intercal as an inspiration.

In designing Lazy K, I was interested only in elegance. Lazy K does not compromise the purity of the SKI combinator calculus in any way. There are no special combinators for I/O. There are no exceptions to the rules of abstraction elimination; the shortcuts are always valid. There are none of the evaluation-order games which permeate almost every other programming language, even most so-called functional languages: if there is any order in which a Lazy K program can be evaluated to work properly, the Lazy K interpreter will find it. There is no d operator, nor even any meaningful way to define one (except as equivalent to the identity function), and there's no need to add a dummy argument to argumentless functions, because the function and its value are interchangeable.

Lazy K programs live in the same timeless Platonic realm as mathematical functions, what the Unlambda page calls "the blessed realm of the pure untyped lambda calculus." Just as garbage collection hides the process of memory management from the programmer, so referential transparency hides the process of evaluation. The fact that some calculation is necessary in order to view a picture of the Mandelbrot set, or in order to "run" a Lazy K program, is an implementation detail. That's the essence of functional programming.

I/O in Lazy K

How to handle input and output in a language without side effects? In a certain sense, input and output aren't side effects; they are, so to speak, front- and back-effects. So it is in Lazy K, where a program is simply treated as a function from the space of possible inputs to the space of possible outputs.

Input and output streams are represented as lists of natural numbers from 0 to 255, each corresponding to one byte. End-of-file is represented by the value 256, not by end of list. (This is because it is often easier to deal with EOF as a character than as a special case. Nevertheless, I wonder if it wouldn't be better to use end-of-list.)

For the purpose of making these lists, pairs are represented in the usual way (that is, the pair (X . Y) is the function (lambda (f) (f X Y))) and natural numbers are represented as Church numerals (that is, the number n is represented by the operation of n-fold composition).

The input list is infinite in length, and all elements after the first occurrence of 256 are also 256. The output list is never examined beyond the first occurrence of 256; it doesn't matter what is in the cdr of that cell. In fact, since the car of each cell is examined by applying it to (lambda (x y) x), you can represent the end of your output simply by (k 256): for example, (lambda (input) (k 256)) is a valid program which ignores its input and prints nothing.

(Incidentally, the Church numeral 256 has a very compact representation in lambdas or combinators: it is ((lambda (n) (n n)) ((lambda (n) (n n)) (lambda (f x) (f (f x))))), or SII(SII(S(S(KS)K)I)).)

The present interpreter actually accepts any value of 256 or greater as the end of the output, and its return code to the operating system is that value minus 256.

Note that under these rules, the identity function is a program which copies its input to its output. Since Lazy K's syntax is such that the empty program is the identity function, the behavior of the UNIX cat utility (without arguments) can be expressed at least as compactly in Lazy K as in any other language.

Lazy K syntax and semantics

Lazy K's motto is "there's more than one way to do it." In fact, there are exactly four ways to do it: combinator-calculus style, Unlambda style, Iota style, and Jot style.

The grammar of Lazy K is approximately the union of the grammars of the four languages just described:

          Syntax                            Semantics

  Program  ::= CCExpr                     CCExpr

  CCExpr   ::= CCExpr Expr                (CCExpr Expr)
             | epsilon                    (lambda (x) x)

  Expr     ::= "i"                        (lambda (x) x)
             | Expr'                      Expr'

  IotaExpr ::= "i"                        (lambda (x) (x S K)) ; with S and K defined as below
             | Expr'                      Expr'

  Expr'    ::= "I"                        (lambda (x) x)
             | "K" | "k"                  (lambda (x y) x)
             | "S" | "s"                  (lambda (x y z) ((x z) (y z)))
             | NonemptyJotExpr            NonemptyJotExpr
             | "`" Expr1 Expr2            (Expr1 Expr2)
             | "*" IotaExpr1 IotaExpr2    (IotaExpr1 IotaExpr2)
             | "(" CCExpr ")"             CCExpr

           ::= JotExpr "0"                (JotExpr S K)
             | JotExpr "1"                (lambda (x y) (JotExpr (x y)))

  JotExpr  ::= NonemptyJotExpr            NonemptyJotExpr
             | epsilon                    (lambda (x) x)

An additional stipulation is necessary to disambiguate the grammar: a NonemptyJotExpr always matches the longest possible uninterrupted string of zeroes and ones.

Whitespace is ignored wherever it appears (including in the middle of a Jot expression). Comments are introduced by # and continue to the end of the line.

This grammar recognizes almost every valid CC, Unlambda, Iota and Jot expression. Annoyingly, there is one (and only one) collision between the languages: i is both a valid Unlambda expression and a valid Iota expression, and it has different meanings in the two. In Lazy K, the Unlambda meaning takes precedence. (This is implicit in the grammar above.)

Because this grammar allows you to mix your metaphors, the Lazy K language is much larger than the union of its four constituent languages. You can, for example, write SII``sii or ****i*i*i*ii*ii*ii11111110001111111110000011111111100000 instead of SII(SII). I haven't pursued this possibility because I think Lazy K is obfuscated enough already.

There's no reason that Lazy K couldn't also support a lambda-calculus syntax, but, at least for now, it doesn't.

The fall from grace

It's not difficult to write interactive programs in Lazy K. However, you should be aware that doing so is, technically speaking, a sin.

The trouble is that Lazy K provides no explicit mechanism for synchronizing input with output. In a referentially transparent language, anything not explicitly synchronized is fair game for evaluation in any order whatsoever, at the run-time system's discretion.

Consider a program which asks for the user's name and then prints a greeting, like this: (bold italics represent text typed by the user)

What is your name?
Hello, Ben!

In principle, you might get this instead:

What is your name?
Hello, Ben

... or this:

What is your name?
Hello, Ben!

... or even something in between.

In practice, however, interactive Lazy K programs tend not to exhibit these problems. The first case cannot arise if the "H" of "Hello" depends in some way on the end of the user's input. The most obvious way of writing this particular program is to cons together the "Hello, [name]!" string in an expression which is conditioned on receipt of a newline. If you do this you are safe, because there's no way for any evaluator to prove in advance that the user will ever type a newline.

The second case does not occur as long as the prompt ("What is your name?") is consed together irrespective of the input — again the obvious thing to do. The reason this works is that the Lazy K interpreter uses lazy evaluation, which by definition tries to produce output as early as possible and do everything else (including input) as late as possible.

So there's no practical problem with interactive software. Nevertheless, there's something unpleasant about the way the second case is prevented. A referentially transparent program should not have to rely on lazy evaluation in order to work properly.

How to escape this moral dilemma? The hard way is to switch to a more sophisticated I/O system, perhaps based on Haskell's, in which input and output are explicitly synchronized. I'm rather disinclined to do this, as I much prefer the simplicity of the current system. The easy way out is to write batch programs which happen to work well interactively. This is mainly just a matter of not prompting the user. The "interactive" programs included with the Lazy K distribution follow this model.

The interpreter

The Lazy K interpreter, cleverly named lazy, has the following command-line syntax:

    lazy [-b] { -e program-code | program-file.lazy } *

Arguments preceded by -e are interpreted as in-line Lazy K code (just as in Perl), and arguments not preceded by -e are interpreted as names of files containing Lazy K code. You can specify more than one chunk of code, in which case they are combined by backwards functional composition. Because of the way Lazy K's I/O model works, this is equivalent to piping the output of each program to the input of the next: i.e. lazy prog1 prog2 prog3 is equivalent to lazy prog1 | lazy prog2 | lazy prog3, except that it's more efficient (and the intermediate results needn't be byte streams). The first program's input always comes from standard input, and the last program's output always goes to standard output.

If you don't specify any programs at all, you'll get an empty composition, which results in the identity function, which causes lazy to behave like cat (except much slower). If you want lazy to read source code from standard input, use lazy -.

The -b option puts standard input and standard output into binary mode on systems which care (i.e. Windows). If you have problems with any of the sample programs (particularly the Unlambda interpreter), try using this switch.

The compiler

There is a Lazy K compiler, but it does not compile from Lazy K into machine code or C; rather, it compiles to Lazy K from a meta-language in which one actually has some hope of writing programs. Like Unlambda, Lazy K itself is better considered as bytecode for an unorthodox kind of virtual machine. The compiler is written in Scheme and is located in the file lazier.scm.

The language Lazier compiles is just the lambda calculus written in Scheme notation. The laze function performs optimized abstraction elimination. (Note that you must quote all arguments because Lazier functions are not special forms.)

  > (laze '(lambda (pair) (pair (lambda (a d) d))))
  ((s i) (k (k i)))

You can use the lazy-def function to define macros (not functions!) which are expanded in-place in other code. Macros live at the top level lexically, and can be hidden by lambda bindings. Macros can refer to other macros (as long as there are no dependency loops). Macro names can be anything that can be compared reliably with eqv?: this includes not only identifiers but also numbers, characters, #t and #f, and the empty list.

  > (lazy-def 'cdr '(lambda (pair) (pair (lambda (a d) d))))
  > (laze 'cdr)
  ((s i) (k (k i)))
  > (laze '(cdr x))
  (x (k i))
  > (lazy-def '(cdr pair) '(pair (lambda (a d) d)))
  > (laze 'cdr)
  ((s i) (k (k i)))
  > (laze '(cdr x))
  (x (k i))

There are four print-as functions which format laze's output as Lazy K source code.

  > (lazy-def '(myprog input) '(cdr (cdr input)))
  > (laze 'myprog)
  ((s ((s i) (k (k i)))) (k (k i)))
  > (print-as-cc (laze 'myprog))
  > (print-as-unlambda (laze 'myprog))
  > (print-as-iota (laze 'myprog))
  > (print-as-jot (laze 'myprog))

If you feel like programming closer to the bare metal, Lazier can help you by preserving free variables through the process of abstraction elimination, thus giving you a Lazy K "template" into which you can plug other expressions by hand.

  > (lazy-def '(cons a b) '(lambda (f) (f a b)))
  > (print-as-unlambda (laze 'cons))
  > (print-as-unlambda (laze '(cons p)))
  > (print-as-unlambda (laze '(cons p q)))

The file prelude.scm contains some useful predefined macros such as cons, if, +, and so on. Notably absent are any higher-level combinators (except for Y and functional composition); I've been writing everything out by hand each time. Sorry about that. The file prelude-numbers.scm contains space-optimized definitions of the Church numerals up to 128.

The hardest part of writing Lazier programs, other than debugging them, is avoiding code bloat from multiple inclusions of a macro. If you write a complicated "function" as a macro and want to use it more than once, you either need to bind it to a variable and then pass it around to every location in the code which calls it, or else live with the multiple inclusion. The former solution may be even less efficient than the latter, since introducing a new variable binding itself adds a lot of bloat at the combinator level.

Sample code

Here's a list of the Lazy K examples included in the distribution, with brief descriptions. More detailed descriptions can usually be found in the files themselves.

Prints ABABABAB... forever.
A Befunge-93 interpreter. Supports all opcodes except for ?.
A Brainfuck-to-Lazy K compiler which translates every Brainfuck instruction to a fixed block of Lazy K code. Matching [ and ] code blocks find each other at run time.
An infix expression evaluator which supports addition, multiplication, and exponentiation of arbitrary-precision integers.
Lazy K version of the "print the Fibonacci sequence as rows of asterisks" example from the Unlambda distribution.
Interpreters for the Iota and Jot language subsets. The Iota version weighs in at 1651 symbols, and the Jot version at 1541.
A nice mergesort implementation using lazy lists. (Just about every algorithm seems to work better with lazy lists.) Two applications are included: a UNIX sort clone and a Burrows-Wheeler block sort utility.
Like fib, except prints powers of 2.
Prints the prime numbers in decimal.
quine.lazy A Lazy K quine, 2792 symbols long. The Lazier code for this one is not included, although you'll probably be able to guess how it works by looking at the Lazy K code. (Or its output...)
Reverses the input character-by-character.
Implements the rot13 encryption algorithm. For double rot13, the standard in most discussion forums, specify rot13.lazy twice on the command line.
An Unlambda interpreter. Supports all Unlambda 2.0.0 features, including input and # comments.

Some obvious projects:

A False interpreter might also be fun (and not too hard) to write.

Bugs and warts