%=============================================================================
% ----------------------------------------------------------------------------
% Noxref, the cross referencing program for Noweb

\documentstyle[noweb]{article}
%\RCSdef $Id: noxref.nw,v 1.2 1993/05/06 18:15:40 jgk Exp $

% ----------------------------------------------------------------------------
%\title{{\tt Noxref\thanks{\RCSId},}
\title{{\tt Noxref\thanks{Id: noxref.nw,v 1.2 1993/05/06 18:15:40 jgk Exp},}
		   the cross referencing program for {\tt Noweb}}
\author{\setcounter{footnote}{6}%
    JG Krom\thanks{JET Joint Undertaking, e-mail address: jgk@jet.uk}}

\date{Printed: \today}

% ----------------------------------------------------------------------------
\makeindex
\begin{document}
\maketitle

% ----------------------------------------------------------------------------
% A bit of jargon:
\newcommand{\noweb}{{\tt noweb}}
\newcommand{\noweave}{{\tt noweave}}
\newcommand{\notangle}{{\tt notangle}}
\newcommand{\noxref}{{\tt noxref}}	\newcommand{\Noxref}{{\tt Noxref}}
\newcommand{\markup}{{\tt markup}}
\newcommand{\awk}{{\rm awk}}
\newcommand{\unix}{{\sc unix}}
\newcommand{\dos}{{\sc msdos}}
\newcommand{\chklist}{{$\setminus$\tt nowebchunks}}
%----------------------------------------------------------------------------
\section{Introduction}

N Ramsey presents in [2] a clean, language independent system for
literate programming, called \noweb.  One of the components of 
\noweave, the ``weave'' program for this system, is the \noxref\ program.

This system has been been ported to \dos\ by L Wittenberg.

The author of this paper customised this \noxref\ program.  The purpose
of this paper is to describe these customisations.  In order to
implement these cutomisations in a ``literate Programming'' style, the
codes written by the above mentioned authors are included in this
document.

% ----------------------------------------------------------------------------
\section{Problem Definition}

The \noxref\ program consists of an \awk\ [1] program, driven by an
\unix\ shell script or, as appropriate, by an \dos\ batch file.  This
\noxref\ program adds page number references to the usage and
definitions of the code chunks in a ``woven'' printing of a literate
program.

A feature that is available in other implementations of the \noxref\
program, the alphabetical chunk list, was missing from the \awk\
implementation of this program.  As this feature seemed useful, it is
implemented as an addition to the existing \noxref\ \awk\ program.

This noxref program is the proper place to create this chunk list,
since all information required for this list is already collected by
this program.

The chunk list should take a form, similar to a table of contents:
chunk names, in the ``\LA{chunk name}\RA{}'' format, on the left hand
side of the page, a list of page numbers on the right hand side of the
page and leaders between the two.  The list of page numbers should first
contain the pages on which the chunk is defined and then the pages on
which the chunk is used.  Root chunks are to be indicated with the word
``Root''.  Chunks that are used, but not defined, are marked with the
word ``Undefined''.

This whole chunk list, formatted using \LaTeX\ commands, replaces any
line, in the original source, containing only the word ``\chklist''.

\Noxref\ was, and still is, intended as a stage in the \noweave\
pipeline.  This means that it will receive input in the ``marked-up''
format generated by the \markup\ program.  The output of \noxref\
should also be in this format.

% ----------------------------------------------------------------------------
\pagebreak
\section{Web Structure}
This document describes three different files for two different environments:
\begin{enumerate}
\item [[noxref]] The executable shell script for use under \unix.
		  This file includes the awk script.
\item [[noxref.bat]] The executable shell script for use under \dos.
		  This file calls upon the following file.
\item [[noxref.awk]] The awk source code used or included by the
		  files above.  
\end{enumerate}
Each of these can be generated by specifying the required filename as
the root chunk name when executing the \notangle\ program on this web.
To obtain the \dos\ batch file the following command should be executed:
\begin{center}
[[notangle -Rnoxref.bat noxref.nw > noxref.bat]]
\end{center}
Users of these shell scripts might have to adapt the [[awk]] program
name in these scripts to match their local system configuration.
<<noxref>>=
#!/bin/sh
# $Id: noxref.nw,v 1.2 1993/05/06 18:15:40 jgk Exp $
nawk '<<noxref.awk>>'
<<noxref.bat>>=
@echo off
REM # $Id: noxref.nw,v 1.2 1993/05/06 18:15:40 jgk Exp $
REM The NOWEB environment variable must be set to the directory
REM where NOXREF.AWK is.  It must end in '/' or '\' as required
REM by the AWK interpreter in use.
awk -f %NOWEB%noxref.awk
<<noxref.awk>>=
# $Id: noxref.nw,v 1.2 1993/05/06 18:15:40 jgk Exp $
<<Noxref awk source>>
<<Noxref awk chunk list additions>>
@
% ----------------------------------------------------------------------------
\section{The AWK Source Code}
This is mostly Ramsey's original code. The fragment that has been
changed is included as the chunk: \LA{Find and process the \chklist\
request}\RA.  Module label generation has been upgraded to the
algorithm used in N~Ramsey's last release of the \noweb\ system.
<<Noxref awk source>>=
BEGIN { defns[0] = 0 ; uses[0] = 0 ; dcounts[0] = 0 ; firstdef[0] = 0; 
              ucounts[0] = 0 ; idtable[0] = 0 ; keycounts[0] = 0 ;
              firstdefnout[0] = 0; filetable[0] = 0 }
{ lines[nextline++] = $0 }
/^@defn / { logname("DEFN", defns, dcounts, substr($0, 7)) }
/^@use /  { logname("USE", uses, ucounts, substr($0, 6)) }
/^@file / { curfile = modid(substr($0, 7) substr($0, 10, 3)) }
<<Noxref awk source>>=
function logname(which, tbl, counts, name, id) {
  counts[name] = counts[name] + 1
  id = which curfile "-" modid(name) "-" counts[name]
  tbl[name] = tbl[name] id " "
  lines[nextline++] = "@literal \\label{" id "}"
  if (which == "DEFN" && firstdef[name] == "") firstdef[name] = id
}
<<Noxref awk source>>=
function modid(name, key) {
  if (idtable[name] == "") {
    key = name
    gsub(/[\[\]\\{} -]/, "*", key)
    if (length(key) > 6) key = substr(key,1,3) substr(key, length(key)-2, 3)
    keycounts[key] = keycounts[key] + 1
    idtable[name] = key "-" keycounts[key]
  }
  return idtable[name]
}
<<Noxref awk source>>=
END {
  for (i=0; i < nextline; i++) {
    name = substr(lines[i], 2)
    name = substr(name, 1, index(name, " ")-1)
    arg = substr(lines[i], length(name)+3)
    if (name == "defn") {
      thischunk = arg
      printf "@defn %s~{\\footnotesize\\rm\\pageref{%s}}\n", arg, firstdef[arg]
    } else if (name == "use") {
      if (firstdef[arg] != "") 
        printf "@use %s~{\\footnotesize\\rm\\pageref{%s}}\n", arg, firstdef[arg]
      else
        printf "@use %s~{\\footnotesize\\rm (never defined)}\n", arg
    } else if (name == "end") {
        if (substr(arg, 1, 4) == "code" && firstdefnout[thischunk] == 0) {
          firstdefnout[thischunk] = 1
          n = split(defns[thischunk], a)
          if (n > 1) {
            printf "@literal \\nwalsodefined{"
            for (j = 2; j <= n; j++) 
              printf "\\\\{%s}", a[j]
            printf "}\n@nl\n"
          }
 	  if (uses[thischunk] != "") {
  	    printf "@literal \\nwused{"
            n = split(uses[thischunk], a)
            for (j = 1; j <= n; j++) 
              printf "\\\\{%s}", a[j]
            printf "}\n@nl\n"
          } else 
              printf "@literal \\nwnotused\n@nl\n"
        }
        print lines[i]
    }
    <<Find and process the \chklist\ request>>
    else
        print lines[i]
    delete lines[i]
  }
}
@ Finding the \chklist\ command is straight forward, it must be on a
\verb+@text+ line.  The unclean way of using a chunk to insert an
[[else]]~[[if]] clause in a larger [[if -- else if -- else]]
structure should be noted.
<<Find and process the \chklist\ request>>=
else if (name == "text") {
	if (arg == "\\nowebchunks") 
		printChunkList()
	else
		print lines[i]
}
@ If the keyword has been found the function [[printChunkList()]] is
called to do the actual printing.
% ----------------------------------------------------------------------------
\section{The Chunk List Additions} 
These additions consist, except for the chunk \LA{Find and process the
\chklist\ request}\RA\ mentioned above, of two functions: one to sort
the list of chunks, and one to print this list.

<<Noxref awk chunk list additions>>=
<<Sort chunk list>>
<<Print chunk list>>
@
% ----------------------------------------------------------------------------
\subsection{The Sorting Routine}
This function implements essentially a simple insertion sort.  If
performance becomes a problem, some effort could be invested to use a
better algorithm, but that seems unnecessary at the moment.

\subsubsection{The Sorting Function Framework}
Two global variables, [[nextFreeIdx]] and [[sortedNames]], carry the
results of this function.  

The [[sortedNames]] array is empty to start with, except for the
first element, which contains the null string as a sentinel; no
string compares to less than the null string.  The invariant on this
array is that it will always contain chunk names in sorted order,
with the lowest (according to the awk comparison rules) coming first.

The invariant on [[nextFreeIdx]] is that it always contains the index
number of the next free element in the array.
<<Sort chunk list>>=
function sortChunkNames(     <<Sort --- Local Variables>>)
{
    sortedNames[0] = ""
    nextFreeIdx=1;        #  The next index to use (range 1--N)
    <<Run through the [[chunkname]]s as stored in [[defns]] array>>
    <<Run through the [[chunkname]]s as stored in [[uses]] array>>
}
@
\subsubsection{Scan the Arrays}
All chunk names have been stored in the [[defns]] array when they were
defined.  Using the ``{\tt for \em xyz \tt in \em arrayname}'' feature of
awk, it is possible to step through all elements of the array.  The
zero element in the arrays would confuse the sorting algorithm, so these
elements have to be discarded. 
<<Run through the [[chunkname]]s as stored in [[defns]] array>>=
for (chunkname in defns) {
	if (chunkname != 0) {
		<<Insert in proper place in sorted array>>
        } 
}
<<Sort --- Local Variables>>=
chunkname,
@ All names that have been used are stored in the array [[uses]].  This
array has to be scanned for chunk names that might have been used, but
that were not defined.  Such chunks should also be included in the chunk
list, so they are inserted in the [[sortedNames]] array.
<<Run through the [[chunkname]]s as stored in [[uses]] array>>=
for (chunkname in uses) {
	if ((chunkname != 0) && !(chunkname in defns)) {
		<<Insert in proper place in sorted array>>
        } 
}
@
\subsubsection{Insert into the Sorted Array}
The proper place for the insertion is found by scanning the sorted
array from the end to the beginning.  The local variable [[idx]] is
used for this scan, it will always point at a possible insertion
location.  [[nextFreeIdx]] is incremented, since it is now known that
there is another element which will be inserted.

If the element before the current scan location is greater than the
chunkname to be inserted, then the chunkname will be inserted before
that element.  The scanned element should be moved one position up (to
the current insertion location) at this point.

Otherwise, the chunk name should come after the element before the
scan location, ie. it should be inserted at the current position.
[[idx]] is pushed to the end condition, to stop the scan over the
sorted array and to get a new chunk name.
<<Insert in proper place in sorted array>>=
for ( idx = nextFreeIdx++ ; idx>0 ; idx-- ) {
	if ( sortedNames[idx-1] > chunkname )
	    sortedNames[idx] = sortedNames[idx-1] ;
	else {
		sortedNames[idx] = chunkname ;
		idx = -1 ;
	}
}
<<Sort --- Local Variables>>=
idx
@ 
% ----------------------------------------------------------------------------
\subsection{Print the Chunk List}
The function to print the chunk list, first calls upon the sorting 
function to get the names in order.  It then inserts, if required,
some heading material and lastly prints the names.
<<Print chunk list>>=
function printChunkList(    <<Print --- Local Variables>> )
{
    sortChunkNames() ;
    <<Optional Header material>>
    <<Header material>>
	<<Print Loop>>
    <<Closing Material>>
}
@
% ----------------------------------------------------------------------------
\subsubsection{The Printing Loop}
This loops steps through the indices of the sorted names array, up to
the next free index number.  It prints each name, using the \markup\
\verb+@use+ directive, followed by a row of dots.  The printing of the
page numbers, root markers etc. is delegated to the chunk \LA{print page
numbers etc.}\RA.
<<Print Loop>>=
for (idx=1; idx<nextFreeIdx; idx++) {
        print "@use " sortedNames[idx] ;
        print "@literal \\dotfill" ;
	<<Print page numbers etc.>>
        print "@nl" ;
}
<<Print --- Local Variables>>=
idx,
@ When printing the page references, the following cases should be considered.
\begin{itemize}
\item If a name does not appear in the [[uses]] array, it must have been
	in the [[defns]] arrray. It is therefore a root chunk.
\item If a name does not appear in the [[defns]] array, it is undefined in
	the source file currently processed.
\item If a name is defined, print the page references of the definitions.
\item If a name is used, print the usage page references.
\end{itemize}
<<Print page numbers etc.>>=
if (uses[sortedNames[idx]] == "") {
	print "@literal { \\rm Root}," ;
}
if (defns[sortedNames[idx]] == "") {
	print "@literal { \\rm Undefined}" ;
}
else {
	<<Print definition page numbers>>
}
if (uses[sortedNames[idx]] != "") {
	<<Print usage page numbers>>
}
@
\paragraph{Definition References.}
Definition page references are derived from the [[defns]] arrays, by
splitting them into fields with the [[split]] function.  The first one
should not be preceded by a ``,'' (comma character), but all subsequent
numbers (if any) should have a comma in front of them.  Page references
for the definitions are printed underlined. 
<<Print definition page numbers>>=
n = split(defns[sortedNames[idx]], a)
print "@literal { \\underline{\\pageref{" a[1] "}}}";
if (2 <= n) 
	for (j = 2; j <= n; j++) 
		print "@literal {, \\underline{\\pageref{" a[j] "}}}";
<<Print --- Local Variables>>=
n, a, j
@
\paragraph{Usage References.}
The page references for the places where the chunks are used are
obtained from the [[uses]] array.  These always have
preceding text (definition page references or the word ``Undefined''),
so these  should always have a ``,'' in front of them.
<<Print usage page numbers>>=
n = split(uses[sortedNames[idx]], a)
for (j = 1; j <= n; j++) 
	print "@literal {\\rm, \\pageref{" a[j] "}}";
@
\paragraph{A Small Test.}
Both chunk names should appear in the chunk list: one
marked as ``Root'', the other as ``Undefined''.
<<An unused (therefore root) chunk>>=
<<An undefined chunk>>
@
\pagebreak
\subsubsection{List Opening and Closing Definitions}
If required, some commands could be included to generate a chapter or
section heading above the chunk list.  However, the author of this
code prefers to have such sectioning commands under the control of the
final document source file.

Users who prefer to have these section commands automatically
generated (like the Icon implementation of the \noxref\ program does)
can redefine \LA{Optional Header material}\RA\ to be equal to the
current definition of \LA{Not used header material}\RA.
<<Optional Header material>>=
<<Not used header material>>=
print "@literal \\ifx\\chapter\\undefined\\section*"
print "@literal {Alphabetical List of Chunk Names}" ;
print "@literal \\else\\chapter"
print "@literal {Alphabetical List of Chunk Names}" ;
print "@literal \\fi"
print "@nl" ;
@ The following header material is required, it sets up the
environment for the list.
<<Header material>>=
print "@literal {\\obeylines" ;
print "@literal \\setlength{\\parindent}{0mm}" ;
print "@literal \\setlength{\\parskip}{1.4ex}"  ;
print "@nl" ;
<<Closing Material>>=
print "@literal }" ;
@
% ----------------------------------------------------------------------------
\newpage
\section{References}
% This is faked (ie, not a real LaTeX bibliography), since this file 
% is likely to get included in other files, with other bibliographies. 
{
\begin{description}
\sfcode`\.=1000\relax

\item[{\rm [1]~~~}]
Aho AV.,  Kernighan BW., Weinberger PJ. 1988,
{\sl The AWK Programming Language,}
Addison-Wesley.

\item[{\rm [2]~~~}]
Ramsey N. 1992--1993,
``Literate Programming Tools Need Not Be Complex,''
To be published in {\sl IEEE Software.} 1993.

\end{description}
}


% ----------------------------------------------------------------------------
% This should go in RCS.sty!
\newenvironment{RCSlog}%
{\begin{trivlist} \item[]%
\setlength{\parindent}{0mm}%
\setlength{\parskip}{3ex}%
\catcode`\$=12%
\hbadness=10000\ignorespaces\obeycr}%
{\end{trivlist}}

\section{RCS Maintained Log}
\begin{RCSlog}
$Log: noxref.nw,v $
Revision 1.2  1993/05/06  18:15:40  jgk
Moved from using bold to underlining for the page references
of definitions. (On advice of Lee Wittenberg.)
A few linguistic improvements.  RCS ID strings included in progs.

Revision 1.1  1993/05/01  21:08:21  JG~Krom
Initial revision

A version with the same code, but some errors and typos 
in the documentation text was known as:
Revision 1.1  1993/04/28  17:03:23  jgk
Initial revision

This file was derived from:
``NOXREF.BAT'' by L~Wittenberg and ``noxref'' By N~Ramsey.
(No change log was available for these files.)

And from:
Log: noxref.awk
Revision 1.5  1993/04/23  12:52:16  JG~Krom
On advice of Lee Wittenberg, used the new way of label generation.

Revision 1.4  1993/04/20  22:41:44  JG~Krom
Improved layout of chunklist.

Revision 1.3  1993/04/11  17:47:38  JG~Krom
Indicate root chunks in the chunklist.

Revision 1.2  1993/04/11  15:52:53  JG~Krom
First stab at the chunklist command.

Revision 1.1 1992/10/21 17:00:00 LEEW
checked in with -k by JG~Krom at 1993/04/10 16:53:28

Which in turn was also derived from: ``noxref'' By N~Ramsey.
\end{RCSlog}
% ----------------------------------------------------------------------------
\newpage
\section{Alphabetical List of Chunk Names}
\nowebchunks
% ----------------------------------------------------------------------------
\input{noxref.ind}
% ----------------------------------------------------------------------------
\end{document}
% End of noweb code
%=============================================================================