% Filename: lazylist.tex
% Author: Alan Jeffrey
% Last modified: 11 May 1990
% Modified: 24 July 2003 by Robin Fairbairns, for file name changes
% and use with LaTeX 2e
%
% Copyright (c) 1990 Alan Jeffrey. Permission is granted to Tugboat
% to publish any or all of this.
%
% A keyboard check:
%
% @ # $ % ^ & * ~ at hash dollar percent caret ampersand asterisk tilde
% : ; , . colon semicolon comma period
% ? ! question-mark exclamation-mark
% " ' ` double-quote apostrophe back-quote
% ( ) { } [ ] parentheses braces square-brackets
% - + = / \ minus plus equals forward-slash backslash
% _ | < > underscore vertical-bar less-than greater-than
%
\documentclass{ltugboat}
\usepackage{lazylist}
% This document defines a whole load of extra commands, some of which
% over-ride how LaTeX normally lays things out. For example, ~ is
% redefined to give a hairspace in math mode. This whole document
% should probably be put in a group to stop it getting in the way
% of other articles' macros.
\title{Lists in \TeX's Mouth}
\author{Alan Jeffrey}
\address{Programming Research Group\\
Oxford University\\
11 Keble Road\\
Oxford OX1 3QD}
\netaddress{Alan.Jeffrey@uk.ac.oxford.prg}
\makeatletter
% The mathcodes for the letters A, ..., Z, a, ..., z are changed to
% generate text italic rather than math italic by default. This makes
% multi-letter names look neater. The mathcode for character 'c'
% is set to "7000 (variable family) + "400 (text italic) + c.
%
% This neat bit of code is due to Mike Spivey.
%
% \def\@setmcodes#1#2#3{{\count0=#1 \count1=#3
% \loop \global\mathcode\count0=\count1 \ifnum \count0<#2
% \advance\count0 by1 \advance\count1 by1 \repeat}}
%
% \@setmcodes{`A}{`Z}{"7441}
% \@setmcodes{`a}{`z}{"7461}
% however, for all its neatness, the code doesn't actually work in 2e;
% the following looks even neater (if you don't consider what goes on
% behind the scenes), and does the specified job
\DeclareSymbolFont{letters} {OT1}{cmr} {m}{it}
\def\Number#1{\csname Number-#1\endcsname}
\def\Label#1{\csname Label-#1\endcsname}
\newcount\Lastnum
\def\Forward#1%
{\global\advance\Lastnum by 1
\csnameafter\xdef{Number-#1}%
{\the\Lastnum}%
\csnameafter\xdef{Label-\the\Lastnum}%
{\@currentlabel}}
\def\csnameafter#1#2%
{\expandafter#1\csname#2\endcsname}
\def\Bylist#1%
{\Map\Label
{\Insertsort\Lessthan
{\Map\Number{#1}}}}
\def\By{\Show\Bylist}
\let\bindspace=~
\def~{\ifmmode \, \else \bindspace \fi}
\def\start#1{\lefteqn{#1}\quad\\}
\def\nil{[\,\,]}
\newtheorem{fact}{Fact}
\def\thefact{\@roman\c@fact}
\def\cstok#1{\leavevmode\thinspace\hbox{\vrule\vtop{\vbox{\hrule\kern1pt
\hbox{\vphantom{\tt/}\thinspace{\tt#1}\thinspace}}%
\kern1pt\hrule}\vrule}\thinspace}
\begingroup \catcode `|=0 \catcode `[= 1
\catcode`]=2 \catcode `\{=12 \catcode `\}=12
\catcode`\\=12 |gdef|@xTeXcode#1\end{TeXcode}[#1|end[TeXcode]]
|endgroup
\def\TeXcode
{\@verbatim \smallskip\hrule\medskip \frenchspacing\@vobeyspaces \@xTeXcode}
\def\endTeXcode
{\medskip\hrule\smallskip\endtrivlist}
\makeatother
\begin{document}
\maketitle
\section{Why lists?}
Originally, I wanted lists in \TeX\ for
a paper I was writing which contained a lot of facts.
\begin{fact}
\Forward{Fac-cows}
Cows have four legs.
\end{fact}
\begin{fact}
\Forward{Fac-people}
People have two legs.
\end{fact}
\begin{fact}
\Forward{Fac-yawn}
Lots of facts in a row can be dull.
\end{fact}
These are generated with commands like
\begin{verbatim}
\begin{fact}
\Forward{Fac-yawn}
Lots of facts in a row can be dull.
\end{fact}
\end{verbatim}
I can then refer to these facts by saying
\begin{verbatim}
\By[Fac-yawn,Fac-cows,Fac-people]
\end{verbatim}
to get
\By[Fac-yawn,Fac-cows,Fac-people].
And as if by magic, the facts come out sorted, rather than in
the jumbled order I typed them. This is very useful, as I can
reorganize my document to my heart's content, and not have to worry
about getting my facts straight.
Originally I tried programming this sorting routine in \TeX's
list macros, from Appendix~D of \TB, but I soon ran into trouble.
The problem is that all the Appendix~D macros work by assigning
values to macros. For example:
\begin{verbatim}
\concatenate\foo=\bar\baz
\end{verbatim}
expands out to
\begin{verbatim}
\ta=\expandafter{\bar}
\tb=\expandafter{\baz}
\edef\foo{\the\ta\the\tb}
\end{verbatim}
which assigns the macro \verb|\foo| the contents of \verb|\bar|
followed by the contents of \verb|\baz|. Programming sorting routines
(which are usually recursive) in terms of these lists became rather
painful, as I was constantly having to watch out for local variables,
worrying about what happened if a local variable had the same name
as a global one, and generally having a hard time.
Then I had one of those ``flash of light'' experiences ---
``You can do lambda-calculus in \TeX,'' I thought,
and since you can do lists directly in lambda calculus,
you should be able to do lists straightforwardly in \TeX. And so you
can. Well, fairly straightforwardly anyway.
So I went and did a bit of mathematics, and derived the \TeX\ macros
you see here. They were formally verified, and worked first time
(modulo typing errors, of which there were two).
\section{\TeX's mouth and \TeX's stomach}
\TeX's programming facilities come in two forms --- there are \TeX's
{\em macros\/} which are expanded in its mouth, and some additional
{\em assignment\/} operations like \verb|\def| which take place in the
stomach. \TeX\ can often spring surprises on you as exactly what
gets evaluated where.
For example, in \LaTeX\ I can put down a
label by saying \verb|\label{Here}|.
\label{Here}
Then I can refer back to that label by saying
\verb|Section~\ref{Here}|, which
produces Section~\ref{Here}. Unfortunately, \verb|\ref{Here}| does
{\em not\/} expand out to {\tt\ref{Here}}! Instead, it expands out to:
\begin{verbatim}
\edef\@tempa{\@nameuse{r@Here}}
\expandafter\@car\@tempa\@nil\null
\end{verbatim}
This means that I can't say
\begin{verbatim}
\ifnum\ref{Here}<4 Hello\fi
\end{verbatim}
and hope that this will expand out to Hello. Instead I
get an error message. Which is rather a pity, as \TeX's mouth is
quite a powerful programming language (as powerful as a Turing Machine in
fact).
\section{Functions}
A {\em function\/} is a mathematical object that takes in an argument
(which could well be another function) and returns some other mathematical
object. For example the function $Not$ takes in a boolean and returns
its complement. I'll write function application without brackets,
so $Not~b$ is the boolean complement of $b$.
Function application
binds to the left, so $f~a~b$ is $(f~a)~b$ rather than $f~(a~b)$.
For example, $Or~a~b$ is the boolean or of $a$ and $b$, and
$Or~True$ is a perfectly good function that takes in a boolean
and returns $True$.
The obvious equivalents of functions in \TeX\ are macros ---
if I define a function $Foo$ to be:
\begin{eqnarray*}
Foo~x & = & True
\end{eqnarray*}
then it can be translated into \TeX\ as:
\begin{verbatim}
\def\Foo#1{\True}
\end{verbatim}
So where $Foo$ is a function that takes in one argument, \verb|\Foo|
is a macro that takes in one parameter. Nothing has changed except
the jargon and the font. \TeX\ macros can even be partially applied,
for example if we defined:
\begin{eqnarray*}
Baz & = & Or~True
\end{eqnarray*}
then the \TeX\ equivalent would be
\begin{verbatim}
\def\Baz{\Or\True}
\end{verbatim}
Once \verb|\Baz| is expanded, it will expect to be given a parameter,
but when we are defining things, we can go around partially applying
them all we like.
Here, I'm using $=$ without formally defining it, which is rather
naughty. If I say $x = y$, this means
``given enough parameters, $x$ and $y$ will eventually
expand out to the same thing.'' For example $Foo = Baz$, because
for any $x$,
\begin{eqnarray*}
\start{Foo~x}
& = & True \\
& = & Or~True~x \\
& = & Baz~x
\end{eqnarray*}
Normally, functions have to {respect equality\/} which means that:
\begin{itemize}
\item if $x = y$ then $f~x = f~y$, and
\item if $x$ respects equality, then $f~x$ respects equality.
\end{itemize}
However, some \TeX\ control sequences don't obey this. For example,
\verb|\string\Foo| and \verb|\string\Baz| are different, even though
$Foo = Baz$. Hence $string$ doesn't respect equality.
Unless otherwise stated, we won't assume functions respect equality,
although all the functions defined here do.
All of our functions have capital letters, so that their \TeX\ equivalents
(\verb|\Not|, \verb|\Or| and so on) don't clash with standard \TeX\ or
\LaTeX\ macros.
\subsection{Identity}
The simplest function is the {\em identity\/} function, called
$Identity$ funnily enough, which is defined:
\begin{eqnarray*}
Identity~x & = & \Identity{x}
\end{eqnarray*}
This, it must be admitted, is a pretty dull function, but
it's a useful basic combinator. It can be implemented
in \TeX\ quite simply.
\begin{TeXcode}
\def\Identity#1{#1}
\end{TeXcode}
The rules around this definition mean that it is actually part of
\verb|Lambda.sty| and not just another example.
\subsection{Error}
Whereas $Identity$ does nothing in a fairly pleasant sort of way,
$Error$ does nothing in a particularly brutal and harsh fashion.
Mathematically, $Error$ is the function that destroys everything
else in front of it. It is often written as $\perp$.
\begin{eqnarray*}
Error~x & = & Error
\end{eqnarray*}
In practice, destroying the entire document when we hit one error
is a bit much, so we'll just print out an error message.
The user can carry on past an error at their own risk, as the code
will no longer be formally verified.
\begin{TeXcode}
\def\Error
{\errmessage{Abandon verification all
ye who enter here}}
\end{TeXcode}
Maybe this function ought to return a more useful error message \ldots
\subsection{First and Second}
Two other basic functions are $First$ and $Second$, both of which
take in two arguments, and do the obvious thing. They are defined:
\begin{eqnarray*}
First~x~y & = & x \\
Second~x~y & = & y
\end{eqnarray*}
We could, in
fact, define $Second$ in terms of $Identity$ and $First$.
For any $x$ and $y$,
\begin{eqnarray*}
\start{First~Identity~x~y}
& = & Identity~y \\
& = & y \\
& = & Second~x~y
\end{eqnarray*}
So $First~Identity = Second$. This means that anywhere in our \TeX\ code
we have \verb|\First\Identity| we could replace it by \verb|\Second|.
This is perhaps not the most astonishing \TeX\ fact known to humanity,
but this sort of proof did enable more complex bits of \TeX\ to be
verified before they were run.
The \TeX\ definitions of \verb|\First| and \verb|\Second| are pretty
obvious.
\begin{TeXcode}
\def\First#1#2{#1}
\def\Second#1#2{#2}
\end{TeXcode}
Note that in \TeX\, \verb|\First\foo\bar| expands out to
\verb|\foo| {\em without\/} expanding out \verb|\bar|.
This is very useful, as we can write macros that would take
forever and a day to run if they expanded all their arguments,
but which actually terminate quite quickly. This is called
{\em lazy evaluation\/} by the functional programming community.
\subsection{Compose}
Given two functions $f$ and $g$ we would like to be able to {\em compose\/}
them to produce a function that first applies $g$ then applies $f$.
Normally, this is written as $f \circ g$, but unfortunately \TeX\ doesn't
have infix functions, so we'll have to write it $Compose~f~g$.
\begin{eqnarray*}
Compose~f~g~x & = & f~(g~x)
\end{eqnarray*}
>From this definition, we can deduce that $Compose$ is associative:
\begin{eqnarray*}
\start{Compose~(Compose~f~g)~h}
& = & Compose~f~(Compose~g~h)
\end{eqnarray*}
and $Identity$ is the left unit of $Compose$:
\begin{eqnarray*}
Compose~Identity~f & = & f
\end{eqnarray*}
The reader may wonder why $Identity$ is called a {\em left\/} unit
even though it occurs on the right of the $Compose$ --- this is a side-effect
of using prefix notations where infix is more normal. The infix version
of this equation is:
\begin{eqnarray*}
Identity \circ f & = & f
\end{eqnarray*}
so $Identity$ is indeed on the left of the composition.
$Compose$ can be implemented in \TeX\ as
\begin{TeXcode}
\def\Compose#1#2#3{#1{#2{#3}}}
\end{TeXcode}
\subsection{Twiddle}
Yet another useful little function is $Twiddle$, which takes in
a function and reverses the order that function takes its (first two)
arguments.
\begin{eqnarray*}
Twiddle~f~x~y & = & f~y~x
\end{eqnarray*}
Again, there aren't many immediate uses for such a function, but it'll
come in handy later on. It satisfies the properties
\begin{eqnarray*}
Twiddle~First & = & Second \\
Twiddle~Second & = & First \\
Compose~Twiddle~Twiddle & = & Identity
\end{eqnarray*}
Its \TeX\ equivalent is
\begin{TeXcode}
\def\Twiddle#1#2#3{#1{#3}{#2}}
\end{TeXcode}
This function is called ``twiddle'' because it is sometimes written
$\widetilde f$ (and $\sim$ is pronounced ``twiddle'').
It also twiddles its arguments around,
which is quite nice if your sense of humour runs to appalling puns.
\section{Booleans}
As we're trying to program a sorting routine, it would be nice to
be able to define orderings on things, and to do this we need some
representation of boolean variables. Unfortunately \TeX\ doesn't have a type
for booleans, so we'll have to invent our own. We'll
implement a boolean as a function $b$ of the form
\begin{eqnarray*}
b~x~y &
= &
\left\{
\begin{array}{ll}
x & \mbox{if $b$ is true} \\
y & \mbox{otherwise}
\end{array}
\right.
\end{eqnarray*}
More formally, a
boolean $b$ is a function which respects equality,
such that for all $f$, $g$ and $z$:
\begin{eqnarray*}
b~f~g~z & = & b~(f~z)~(g~z)
\end{eqnarray*}
and for all $f$ and $g$ which respect equality,
\begin{eqnarray*}
b~(f~b)~(g~b) & = & b~(f~First)~(g~Second)
\end{eqnarray*}
All the functions in this section satisfy these properties. Surprisingly
enough, so does $Error$, which is quite useful, as it allows us to
reason about booleans which ``go wrong''.
\subsection{True, False and Not}
Since we are implementing booleans as functions, we already have the
definitions of $True$, $False$ and $Not$.
\begin{eqnarray*}
True & = & First \\
False & = & Second \\
Not & = & Twiddle
\end{eqnarray*}
So for free we get the following results:
\begin{eqnarray*}
Not~True & = & False \\
Not~False & = & True \\
Compose~Not~Not & = & Identity
\end{eqnarray*}
The \TeX\ implementation is not exactly difficult:
\begin{TeXcode}
\let\True=\First
\let\False=\Second
\let\Not=\Twiddle
\end{TeXcode}
\subsection{And and Or}
The definitions of $And$ and $Or$ are:
\begin{eqnarray*}
And~a~b &
= &
\left\{
\begin{array}{ll}
b & \mbox{if $a$ is true} \\
False & \mbox{otherwise}
\end{array}
\right.
\\
Or~a~b &
= &
\left\{
\begin{array}{ll}
True & \mbox{if $a$ is true} \\
b & \mbox{otherwise}
\end{array}
\right.
\end{eqnarray*}
With our definition of what a boolean is, this is just the same as
\begin{eqnarray*}
And~a~b & = & a~b~False \\
Or~a~b & = & a~True~b
\end{eqnarray*}
>From these conditions, we can show that $And$ is associative, and
has left unit $True$ and left zeros $False$ and $Error$:
\begin{eqnarray*}
And~(And~a~b)~c & = & And~a~(And~b~c) \\
And~True~b & = & b \\
And~False~b & = & False \\
And~Error~b & = & Error
\end{eqnarray*}
$Or$ is associative, has left unit $False$ and left zeros $True$ and $Error$:
\begin{eqnarray*}
Or~(Or~a~b)~c & = & Or~a~(Or~b~c) \\
Or~False~b & = & b \\
Or~True~b & = & True \\
Or~Error~b & = & Error
\end{eqnarray*}
De~Morgan's laws hold:
\begin{eqnarray*}
Not~(And~a~b) & = & Or~(Not~a)~(Not~b) \\
Not~(Or~a~b) & = & And~(Not~a)~(Not~b)
\end{eqnarray*}
and $And$ and $Or$ left-distribute through one another:
\begin{eqnarray*}
Or~a~(And~b~c) & = & And~(Or~a~b)~(Or~a~c) \\
And~a~(Or~b~c) & = & Or~(And~a~b)~(And~a~c)
\end{eqnarray*}
$And$ and $Or$ are {\em not\/} commutative, though. For example,
\begin{eqnarray*}
\start{Or~True~Error}
& = & True~True~Error \\
& = & True
\end{eqnarray*}
but
\begin{eqnarray*}
\start{Or~Error~True}
& = & Error~True~True \\
& = & Error
\end{eqnarray*}
This is actually quite useful since there are some booleans that
need to return an error occasionally. If $a$ is $True$ when $b$
is safe (i.e.\ doesn't become $Error$) and is $False$ otherwise, we can
say $Or~a~b$ and know we're not going to get an error. This is handy
for things like checking for division by zero, or trying to get the
first element of an empty list.
Similarly, because of the possibility of $Error$,
$And$ and $Or$ don't right-distribute through each other,
as
\begin{eqnarray*}
\start{Or~(And~False~Error)~True}
& \ne & And~(Or~False~True)~(Or~Error~True)
\end{eqnarray*}
As errors shouldn't crop up, this needn't worry us too much.
\begin{TeXcode}
\def\And#1#2{#1{#2}\False}
\def\Or#1#2{#1\True{#2}}
\end{TeXcode}
\subsection{Lift}
Quite a lot of the time we won't be dealing with booleans, but with
{\em predicates}, which are just functions that return a boolean.
For example, the predicate $Lessthan$ is defined below so that
$Lessthan~i~j$ is true whenever $i