Literate Korn Shell

This is the first draft of Literate Korn Shell, the unix shell 'ksh'
written with all of its innards exposed and explained.

The goal of Literate Programming is to compile source code into two
objects: the executable program with which we are familiar and, in
addition, a document presenting the source code in a format suitable
for reading in order to, hopefully, understand it.

One advantage in particular offered by literate programming is that
the code can be broken up and re-ordered so that its parts are
introduced in an order and manner focussed on the needs of a human
reader who may be unfamiliar with the code, without the need to bow
to the esoteric demands of a compiler.

This draft of literate ksh has concentrated mostly on this feature,
to order the code so that a narrative can be threaded from start
to finish which, piece by piece, introduces all of the components
which go towards making ksh work.

This version of ksh was taken from the recent OpenBSD 7.0 release.
This early draft is an almost identical copy of the C code present
in that release, so that this and the OpenBSD release can be
compared (and demonstrated to be identical) mechanically.

CWEB, the pair of tools which enable literate programming, does not
ignore comments but formats them along with the source code it's
presenting, so comments have been changed and/or moved drastically.
Frequently they have been removed to be absorbed into the description
associated with each section.

Apart from comments the other change which will be unnoticed by the
C compiler is that whitespace has changed. After porting a few
hundred lines I made an effort to preserve TAB characters present
in the original source but this has not been consistent. However I
have not reformatted lines except in a few places---the main
whitespace change has been to remove indentation that was no longer
needed.

Other changes do alter the ksh binary compiled from these sources
compared with one built from the source in OpenBSD's tree. These
changes were made necessary by CWEB:

* CWEB makes no distinction between C's various namespaces and sees,
for example, the function 'include' and the preprocessor directive
'include' as the same. The same thing occurs with 'struct' objects
with variables given the same name, such as 'struct shf shf'. To
work around this some objects have been renamed by capitalising the
first letter (except the other 'token' which was changed to 'Mtoken').

There are certainly better ways to deal with this problem---this
solution was chosen to minimise the difference between ports.

* With hindsight this was a bad idea, but all type definitions and
preprocessor definitions (#define) have been gathered together in
sh.h (apart from a few which simply couldn't be moved there).

* The functions, variables, etc. in each file are now in a totally
different order.
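
The namespace clash described in the first point can be sketched in
plain C. This is a minimal illustration rather than code from the
port: the one-member struct is a hypothetical stand-in for ksh's real
'struct shf', the function names are invented for the example, and
which of the two clashing names was capitalised in the port is an
assumption.

```c
#include <assert.h>

/* Hypothetical, cut-down stand-in for ksh's struct shf; the real
 * structure has different members. */
struct shf {
	int fd;
};

/* Legal C: the struct tag "shf" and the variable "shf" live in
 * separate namespaces, so the compiler never confuses them. CWEB
 * sees a single identifier "shf" and treats both uses alike. */
int
original_style(void)
{
	struct shf shf;

	shf.fd = 2;
	return shf.fd;
}

/* The workaround: capitalise the first letter of one of the two
 * clashing names so every identifier is unique. Renaming the
 * variable (rather than the tag) is an assumption made here. */
int
renamed_style(void)
{
	struct shf Shf;

	Shf.fd = 2;
	return Shf.fd;
}

/* The rename changes only the spelling of an identifier, so both
 * versions behave the same. Returns 1 if the check passes. */
int
rename_is_invisible(void)
{
	assert(original_style() == renamed_style());
	return 1;
}
```

The compiler accepts both functions; only the second gives CWEB two
distinct names to index and typeset.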

Making this took much longer than I expected (I am a programmer,
so I should have expected that, but I neglected to take Hofstadter's
law into account), so my immediate plans are to take a break and
leave this codebase well alone for a while. After that I do plan
to work on it some more.

The next improvements to make are two-fold: most obviously I need
to add a *lot* more description, especially to the parts ported
after I'd understood the core memory structures, lexer/parser &
exec loop, as these are all a lot easier to understand on their own.

Secondly, in order to keep code changes to a minimum I have faithfully
followed (I hope) the file structure of the original source. That
is, apart from the order, comments, type definitions and preprocessor
definitions, the files produced by CWEB should be "the same" as the
files in OpenBSD's source tree. This is at odds with the way a
literate program's build process would be arranged and, in addition,
my method of bridging the two changed as I became more familiar
with the code.


I moved house, I had no internet access and I needed something to
do. I could have worked on my own project (a programming language
also written using literate programming) but this was an opportunity
to become more familiar with CWEB and TeX, and also to learn more
about how ksh works. I had attempted in the past to convert it into
a library and link it with libperl, and eventually moved on after
failing to figure out how ksh strings worked (very simply, it turns
out; see pages 83 & 84).

Who is this for?

Whoever wants it. The OpenBSD source was marked Mostly Public Domain
and I see no reason to change that for what little I've added. At
some point I plan to figure out what that "mostly" refers to.

However this is NOT intended to replace the contents of /usr/src/bin/ksh.
OpenBSD are of course welcome to use this as they see fit, perhaps
as contributed documentation, but it is certainly not part of some
secret long term plan to eventually convert all code everywhere to
this paradigm and reveal a computer's inner mysteries. Certainly
not. There is no TeXabal.

In any case I had a look into how CWEB might fit into a build process
and let's just say it's written in Pascal and leave it at that.

How to build it?

  Install OpenBSD or port the source and Makefile to your platform.

  Install texlive_base, which is huge.

  Run "make ksh" or "make ksh.pdf".