\expandafter\ifx\csname DocumentPart\endcsname\relax
\documentclass{book}
\usepackage{tw}
\title{Tango/Weevil --- A WEB Tangler and Weaver}
\author{Corey Minyard}
\date{\today}
\makeindex
\begin{document}
\maketitle
\fi

\section{Introduction}
Tango/Weevil was born out of my frustration with the state of the literate programming software available. All the stuff I found was inadequate for large projects, made you do too much work, or didn't do everything a nice literate programming tool should do. The \texttt{noweb} package came very close, but it didn't support inter-document cross references and it used somewhat esoteric programming languages.

Although I don't claim Tango/Weevil is the end-all for literate programming, it is a start toward something more broadly useful. It is written completely in C. It generates cross-references for variables in C programs, and those cross-references may be shared between files.

I came up with the name Tango/Weevil during a late-night programming session. I don't think my mind was working clearly, and it sounded good at the time. It sounds a little hokey now.

\section{Requirements}
Tango/Weevil requires a C compiler and \LaTeX2e. It will not work with the old \LaTeX. \LaTeX2e\ is a whole lot better, anyway, so you should get it if you haven't already.

\section{Theory of Operation}
Tango/Weevil consists of two programs named \texttt{tango} and \texttt{weevil}. The \texttt{tango} program takes a literate program and writes the extracted code to standard output. It also generates the cross reference file (if invoked with the proper parameters). The \texttt{weevil} program takes a literate program and generates a \LaTeX\ document from it. It uses the cross reference file generated by \texttt{tango} to list the variables. Figure~\ref{fig:tango-weevil-operation} shows this operation.
\begin{figure}[ht]
.PS
right
If: box "Input File" "(\textit{file}.tw)" invis
arrow
Tb: ellipse "tango"
arrow
Of: box width 1.5 * boxwid "Program Source" "(\textit{file}.c)" invis
move to Tb.b
down
arrow
Cb: box "Cross Reference" "(\textit{file}.xfr)" invis
arrow
Wb: ellipse "weevil"
move to If.b
line to Wb.l + (-linewid,0)
right
arrow to Wb.l
move to Wb.r
arrow
box width 1.1 * boxwid "\LaTeX Source" "(\textit{file}.tex)" invis
.PE
\centerline{\box\graph}
\caption{Tango/Weevil operation}
\label{fig:tango-weevil-operation}
\end{figure}

A Tango/Weevil file has two main parts: code and documentation. These may be interspersed in the document. Both programs start in documentation mode; a \verb|@code| directive switches to code mode and a \verb|@endcode| directive switches back to documentation mode. The \verb|@code| directive takes a macro name. Everything between that \verb|@code| and the next \verb|@endcode| is called a macro and may be referred to by the macro's name.

\section{Procedure Headers}
The tango/weevil styles provide a nice format for documenting procedures and functions in \LaTeX. Other document processors could be designed to take this information and use it. The format is:
\begin{verbatim}
\begin{twproc}{Procedure\_Name}
\Description
Describes the function. The format of the input is not really
important, the output will indent it properly.
\ReturnValues
\verb|int| - A function return value
\SideEffects
Things that the procedure changes that will have effects on other
things.
\Inputs
\begin{twparmlist}
\item[inparm1] This is an input parameter
\item[inparm2] This is another input parameter
\end{twparmlist}
\Outputs
\begin{twparmlist}
\item[outparm] This is for non-return value output parameters.
\end{twparmlist}
\StartCode % @code should come after this statement.
@code
int
Procedure_Name(int inparm1,
               char *inparm2,
               int *outparm)
{
}
@endcode
\end{twproc}
\end{verbatim}
The output from this would be:

\begin{twproc}{Procedure\_Name}
\Description
Describes the function. The format of the input is not really important, the output will indent it properly.
\ReturnValues
\verb|int| - A function return value
\SideEffects
Things that the procedure changes that will have effects on other things.
\Inputs
\begin{twparmlist}
\item[inparm1] This is an input parameter
\item[inparm2] This is another input parameter
\end{twparmlist}
\Outputs
\begin{twparmlist}
\item[outparm] This is for non-return value output parameters.
\end{twparmlist}
\StartCode
\begingroup\medskip
\twmacrodecl{Procedure\_Name}{userguid.tpc:macro:ProcedureName}
\begin{verbatimcmd}\twstartcode
 1 int
 2 Procedure_Name(int inparm1,
 3                char *inparm2,
 4                int *outparm)
 5 {
 6 }
\twendcode\end{verbatimcmd}
\endgroup
\end{twproc}

\noindent All the sections (except \verb|\StartCode|) are optional.

\section{Conditional Documentation}
Special macros are provided for including or excluding sections in certain situations. If the command
\begin{verbatim}
\def\twnocode{}
\end{verbatim}
appears before the ``tw'' package is loaded (at the very top of the file), everything between \verb|@code| and \verb|@endcode| will not be output. If the command
\begin{verbatim}
\def\twnoproc{}
\end{verbatim}
appears before the ``tw'' package is loaded (at the very top of the file), everything between \verb|\begin{twproc}| and \verb|\end{twproc}| will not be output.

The commands
\begin{verbatim}
\twiscode
text
\fi
\end{verbatim}
will not generate any output for anything between \verb|\twiscode| and \verb|\fi| if \verb|\twnocode| was defined. Similarly,
\begin{verbatim}
\twisproc
text
\fi
\end{verbatim}
will not generate any output for anything between \verb|\twisproc| and \verb|\fi| if \verb|\twnoproc| was defined.
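
As a minimal sketch of how these pieces might fit together, the following file suppresses code chunks while keeping procedure headers. The document class and the sentences in the body are illustrative assumptions, not taken from the distribution; only \verb|\twnocode|, \verb|\twiscode|, and the ``tw'' package come from the description above.
\begin{verbatim}
% Assumed layout: the definition comes first, before tw is loaded.
\def\twnocode{}
\documentclass{article}
\usepackage{tw}
\begin{document}
This sentence always appears in the output.
\twiscode
This sentence discusses the code listings, so it is dropped
along with them because \twnocode was defined above.
\fi
\end{document}
\end{verbatim}
Removing the \verb|\def\twnocode{}| line should bring both the code chunks and the text between \verb|\twiscode| and \verb|\fi| back, which makes it possible to build a full listing and a documentation-only version from the same source.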

\section{File Processing}
When \texttt{tango} processes a file, it does not emit the macro text as it receives it. Instead, it collects all the macro chunks until the end of the file. It then looks for a special macro named ``*'', which is the start macro, and begins outputting it. Whenever it sees a \verb|@|, it will go to the macro with the given name and output it. Assuming no recursive macro references exist, the process will eventually end and a source file will be produced. For example, consider this small program:
\begin{verbatim}
\documentclass{article}
\usepackage{tw,makeidx}
\title{Hello World --- A Famous Program}
\author{Corey Minyard}
\date{\today}
\makeindex
\begin{document}
\maketitle

\section{Description}
This program is the famous ``hello world'' program. Certainly we
have all written one of these programs.

\section{Code}

\subsection{Start Macro}
This is the main part of the program.
@code <*>
@
@
@endcode

\subsection{Includes}
This program needs \texttt{stdio.h} because it uses printf.
@code
#include <stdio.h>
@endcode

\subsection{Routines}
This program only has one routine, the main routine.
@code
@
@endcode

\begin{twproc}{main}
\Description
The main routine.
\ReturnValues
\verb|int| - Program return code
\SideEffects
Well, it prints something!
\Inputs
\begin{twparmlist}
\item[argc] Argument count
\item[argv] The arguments
\end{twparmlist}
\StartCode
@code
int main(int argc, char *argv[])
{
  @
  @
}
@endcode

\subsection{Output Hello}
This code chunk actually does the output.
@code
printf("hello world\n");
@endcode

\subsection{Leave Program}
This code chunk causes the program to exit.
@code
exit(0);
@endcode
\end{twproc}

\printindex
\end{document}
\end{verbatim}
You can notice a few things about the program. Standard \LaTeX\ stuff is used in the documentation. The code is not in the file in the same order that it will be output to the C file to be compiled. A file named \verb|hello.tw|, identical to this listing, should be included in the distribution. Try the following commands on that file:
\begin{verbatim}
tango -autoxref -lang c hello.tw >hello.c
cc -o hello hello.c
./hello
weevil -lang c hello.tw >hello.tex
latex hello
latex hello
makeindex hello.idx
latex hello
\end{verbatim}
These commands will create an executable \verb|hello| program and a \verb|hello.dvi| file. Look at the dvi file by using a dvi previewer or by printing it out. The output will be nicely formatted, with the code lines numbered and cross references after every macro chunk.

\section{Use}
Most of the major features and concepts of Tango/Weevil have been presented. This section presents the rest.

\subsection{Cross Referencing to Other Files}
As mentioned before, Tango/Weevil lets you reference other files for cross references. This is done with the following directives, each followed by a filename. This must be the entire filename, including the \verb|.xfr| suffix.

\begin{tabular}{lp{3in}}
\verb|@includefile| & The given file is included (using a \#include) in the current file.\\
\verb|@reffile| & This file uses variables and functions in the given file.\\
\end{tabular}

\subsection{Creating Your Own Cross References}
If you don't trust the cross reference generator, or you are using a language that \texttt{tango} will not generate cross references for, you can put the cross references in yourself.
The following directives define different types of cross references. They are all followed by a string that is the cross reference name.

\begin{tabular}{lp{3in}}
\verb|@uses| & The given variable is used in the code chunk.\\
\verb|@defines| & The given variable is a \#define in the code chunk.\\
\verb|@externdecls| & The given variable is declared in the code chunk and may be used outside the file.\\
\verb|@staticdecls| & The given variable is declared in the code chunk but may only be used in the file.\\
\end{tabular}

These directives must appear in the code chunk they are referencing. They may not occur outside a code chunk. When these are used, auto cross reference generation should be turned off but cross referencing should be turned on.

\subsection{Controlling Input Files and Line Numbers}
Since Tango/Weevil files may be produced by another program, some way must exist for the original file name and the proper line numbers to be passed along to the compiler so debuggers work correctly. This is done using the following directives:

\begin{tabular}{lp{3in}}
\verb|@line| & The line number is set to the number following the directive.\\
\verb|@file| & The file name is set to the name following the directive.\\
\end{tabular}

\subsection{Other Stuff}
If an \verb|@| is needed in the code (outside of a quote or comment), then \verb|@@| will produce one at-sign.

\subsection{Running the programs}
\subsubsection{tango}
The \verb|tango| program has the following format:
\begin{verbatim}
tango [options] inputfile
\end{verbatim}
The output is sent to standard output. Valid options are:

\begin{tabular}{lp{3in}}
-xref & Generate cross references\\
-autoxref & Automatically generate cross references. This will only work for ``C'' programs.\\
-xreffile \emph{fname} & Send the output for the cross references to \emph{fname} instead of the input file prefix with \verb|.xfr| on the end.\\
-lang \emph{lang} & Sets the programming language mode. Currently only ``c'' is supported.
This option is required, so it's not really an option.\\
-nolinenum & Don't generate \#line references in the output file.\\
\end{tabular}

\subsubsection{weevil}
The \verb|weevil| program takes several options for specifying input language modes, output formatter modes, and include directories for cross reference files.

\begin{tabular}{lp{3in}}
-I \emph{inc dir} & Adds a directory to the list of places to look for cross reference (.xfr) files.\\
-lang \emph{lang} & Sets the programming language mode. Currently only ``c'' is supported. This option is required, so it's not really an option.\\
-mode \emph{formatter} & Sets the output formatter mode. Currently only ``latex'' is supported. This option is required, so it's not really an option.\\
\end{tabular}

\section{Internals}
\subsection{The Cross Reference Document Format}
The cross reference files have a very simple line-oriented format. The first character on a line identifies the type of cross reference, or identifies the file or macro name. The second character must be a space. The rest of the line, from the third character on, gives a name. The hello world program presented earlier has the following cross reference file:
\begin{verbatim}
f hello.tw
m Leave Program
u exit
m Output Hello
u printf
m Routines
e main
m Includes
m *
\end{verbatim}
The following first characters are valid:

\begin{tabular}{lp{4in}}
f & The file name. This must be the first entry in the file, but may occur more than once in the file. This allows multiple files to be concatenated into one big file with the filenames the entries come from preserved.\\
m & A macro name. This sets the current macro; all cross references between this line and the next macro name will belong to this macro.\\
u & A variable use. A variable (static, external, or \#defined) was used.\\
e & An external define. A variable visible to other modules was defined.\\
s & A static define.
A variable was defined that will not be visible to modules that cross reference this module. The variable will be visible to modules that include this module.\\
d & A \#define. A constant or macro was defined. This will only be visible to modules that include this module.\\
\end{tabular}

\subsection{Adding New Languages}
Adding new languages to \verb|tango| and \verb|weevil| is relatively easy. The language must provide two routines and must add these to the array of valid languages in \verb|tango| and \verb|weevil|. For \verb|tango| the routines are:

\begin{tabular}{lp{3in}}
init\_\emph{x}\_mode & Initialize the language \emph{x} mode. A variable named \verb|code_info| is available in the \verb|t_lptangodat| structure that the language mode may use to store a pointer.\\
\emph{x}\_scan\_input & Takes an input block of data as a character pointer, a length, and a line number. At a minimum, it must set the \verb|inquote| and \verb|in_comment| variables in the \verb|t_lptangodat| structure if the code was in a quote or a comment at the end of each code block. Cross references may also be created if the \verb|auto_xref| variable in the structure is set TRUE.\\
\end{tabular}

For \verb|weevil| the routines are:

\begin{tabular}{lp{3in}}
init\_\emph{x}\_mode & Initialize the language \emph{x} mode. A variable named \verb|code_info| is available in the \verb|t_lptangodat| structure that the language mode may use to store a pointer.\\
\emph{x}\_handle\_char & Takes a single character at a time from the input file. It must set the \verb|inquote| and \verb|in_comment| variables in the \verb|t_lptangodat| structure if the code was in a quote or a comment after each character.\\
\end{tabular}

The reason for quote and comment sensing is so that the user does not have to enter \verb|@@| in comments and quotes.

If a language mode is emitting cross references, it must put those into lists of names. Four types of lists are kept for each macro, one for each type of cross reference.
These lists are kept as singly linked lists of \verb|t_namelist| items. Each item has a character pointer with the name and a pointer to the next item. The routine \verb|create_namelist_item| should be used to allocate this structure. The routine \verb|free_namelist_item| should be used to free the structure. To add a structure to a list, use the \verb|list_insert_unique| routine:
\begin{verbatim}
/* Add the item to the macro's list of variable uses. */
list_insert_unique(lptd, &(lptd->curr_macro->uses), item);

/* Add the item to the list of external declarations. */
list_insert_unique(lptd, &(lptd->curr_macro->globaldefs), item);

/* Add the item to the list of static declarations. */
list_insert_unique(lptd, &(lptd->curr_macro->staticdefs), item);

/* Add the item to the list of #defines. */
list_insert_unique(lptd, &(lptd->curr_macro->pounddefs), item);
\end{verbatim}
The following routines are also available for use:
\begin{verbatim}
find_name_in_list - Find a name in a namelist.
r_strtok          - A reentrant strtok.
stralloc          - Allocates and copies a string.
\end{verbatim}

\expandafter\ifx\csname DocumentPart\endcsname\relax
\end{document}
\fi