26-Aug-88 06:58:29-PDT,25088;000000000000
Return-Path:
From: Peter King
Date: Fri, 26 Aug 88 14:46:57 BST
Subject: refer to BiBTeX conversion

I have written a shell script (using both sed and awk) which will
convert refer (or bib) databases to BiBTeX format. The program r2bib
on the UNIX tape does a similar job, but I believe my script is more
useful because it

  a) does a better job of generating keys
  b) converts accents and (some) troff special characters
  c) is readily customisable (if you know awk)
  d) does a slightly better job of classifying the type of reference.

It's probably too long (c. 850 lines) to include in the mailshot, but
here it is for archiving. I have sent Pierre MacKay a copy. The manual
page was adapted from the r2bib manual page.

Peter King

Peter King, Computer Science Department    JANET: pjbk@uk.ac.hw.cs
Heriot-Watt University                     ARPA:  pjbk@cs.hw.ac.uk
79 Grassmarket, Edinburgh EH1 2HJ          or     pjbk%cs.hw.ac.uk@ucl-cs
Phone: (+44) 31 225 6465 Ext. 555          UUCP:  ..!ukc!cs.hw.ac.uk!pjbk

------------------------------Cut here-----------------------------------
export PATH || exec /bin/sh $0 $*
: "This is a shar archive; use /bin/sh to extract"
: "Extracted files will be owned by you, and will have"
: "default permissions"
PATH=/bin:/usr/bin
echo If this archive is complete, \"End of archive\" will appear at the end
echo Extracting ref2bib
sed 's/^X//' <<'End-of-file' >ref2bib
X#!/bin/sh
X#
X# shell script to convert refer (or bib) databases to BiBTeX format
X#
X# reads its arguments (or standard input)
X# and writes the BibTeX to standard output
X#
X# the in-line files ref2b*.{sed,awk} do not change, and could be
X# stored in a library somewhere. The sed script can actually be
X# given as an argument in ' ' quotes provided the ' in the
X# file are replaced with '\'' !!
X# the awk script is too large for this treatment.
X#
X# The generation of keys can be altered by changing the
X# values of some awk variables
X#
X# errors etc. in ref2bib.errs
X#
Xcat << 'ZZ' >ref2b$$.sed
X#
X# sed script to do some of the ref to bib database conversion
X#
X# written by Peter King, Heriot-Watt University
X# You may do anything you like with this code
X# EXCEPT claim that you wrote it
X#
X# First alter the TeX special characters
Xs/\(.\)%/\1\\%/g
Xs/&/\\&/g
Xs/\$/\\$/g
Xs/#/\\#/g
Xs/_/\\_/g
Xs/{/\\{/g
Xs/}/\\}/g
X# convert the special characters and accents from troff to BibTeX
X# assumes the accents are those of the Berkeley -ms with .AM
X#
X/\\/s/\(.\)\\\\*\*'/{\\'\1}/g
X/\\/s/\(.\)\\\\*\*`/{\\`\1}/g
X/\\/s/\(.\)\\\\*\*^/{\\^\1}/g
X/\\/s/\(.\)\\\\*\*:/{\\"\1}/g
X/\\/s/\(.\)\\\\*\*~/{\\~\1}/g
X/\\/s/\(.\)\\\\*\*_/{\\=\1}/g
X/\\/s/\([oO]\)\\\\*\*\//{\\\1}/g
X/\\/s/\([aA]\)\\\\*\*o/{\\\1\1}/g
X/\\/s/\(.\)\\\\*\*,/{\\c{\1}}/g
X/\\/s/\(.\)\\\\*\*v/{\\v{\1}}/g
X/\\/s/\(.\)\\\\*\*"/{\\H{\1}}/g
X/\\/s/\(.\)\\\\*\*\./{\\d{\1}}/g
X/\\/s/\\\\*\*8/{\\ss}/g
X/\\/s/\\\\*\*(P\([lL]\)/{\\\1}/g
X/\\/s/\\\\*\*(\([oO]\)\//{\\\1}/g
X# quotes
X/\\/s/\\\\*\*Q/``/g
X/\\/s/\\\\*\*U/''/g
X/\\/s/\\\\*\*-/---/g
X# \0 as space between surname de\0Souza etc.
X/\\/s/\\\\*0\([a-z]*\)\\\\*0/\\0\1 /g
X/\\/s/ \([a-z]*\)\\\\*0/ \1 /g
X# but trap the ones that start with a capital letter and convert them to
X# ties
X/\\/s/\\\\*0/~/g
X#
X# now deal with special characters and Greek
X/\\/s/\\\\*(em/---/g
X/\\/s/\\\\*(if/$\\infty$/g
X/\\/s/\\\\*(\*a/$\\alpha$/g
X/\\/s/\\\\*(\*b/$\\beta$/g
X/\\/s/\\\\*(\*g/$\\gamma$/g
X/\\/s/\\\\*(\*d/$\\delta$/g
X/\\/s/\\\\*(\*e/$\\epsilon$/g
X/\\/s/\\\\*(\*z/$\\zeta$/g
X/\\/s/\\\\*(\*y/$\\eta$/g
X/\\/s/\\\\*(\*h/$\\theta$/g
X/\\/s/\\\\*(\*i/$\\iota$/g
X/\\/s/\\\\*(\*k/$\\kappa$/g
X/\\/s/\\\\*(\*l/$\\lambda$/g
X/\\/s/\\\\*(\*m/$\\mu$/g
X/\\/s/\\\\*(\*n/$\\nu$/g
X/\\/s/\\\\*(\*c/$\\xi$/g
X/\\/s/\\\\*(\*o/$o$/g
X/\\/s/\\\\*(\*p/$\\pi$/g
X/\\/s/\\\\*(\*r/$\\rho$/g
X/\\/s/\\\\*(\*s/$\\sigma$/g
X/\\/s/\\\\*(\*t/$\\tau$/g
X/\\/s/\\\\*(\*u/$\\upsilon$/g
X/\\/s/\\\\*(\*f/$\\phi$/g
X/\\/s/\\\\*(\*x/$\\chi$/g
X/\\/s/\\\\*(\*q/$\\psi$/g
X/\\/s/\\\\*(\*w/$\\omega$/g
X/\\/s/\\\\*(\*A/A/g
X/\\/s/\\\\*(\*B/B/g
X/\\/s/\\\\*(\*G/$\\Gamma$/g
X/\\/s/\\\\*(\*D/$\\Delta$/g
X/\\/s/\\\\*(\*E/E/g
X/\\/s/\\\\*(\*Z/Z/g
X/\\/s/\\\\*(\*Y/H/g
X/\\/s/\\\\*(\*H/$\\Theta$/g
X/\\/s/\\\\*(\*I/I/g
X/\\/s/\\\\*(\*K/K/g
X/\\/s/\\\\*(\*L/$\\Lambda$/g
X/\\/s/\\\\*(\*M/M/g
X/\\/s/\\\\*(\*N/N/g
X/\\/s/\\\\*(\*C/$\\Xi$/g
X/\\/s/\\\\*(\*O/$O$/g
X/\\/s/\\\\*(\*P/$\\Pi$/g
X/\\/s/\\\\*(\*R/P/g
X/\\/s/\\\\*(\*S/$\\Sigma$/g
X/\\/s/\\\\*(\*T/T/g
X/\\/s/\\\\*(\*U/$\\Upsilon$/g
X/\\/s/\\\\*(\*F/$\\Phi$/g
X/\\/s/\\\\*(\*X/X/g
X/\\/s/\\\\*(\*Q/$\\Psi$/g
X/\\/s/\\\\*(\*W/$\\Omega$/g
X# Now trap title words that must be capitalised
X/^%[^T]/b
X#
X# first all words that are all capitals (at least two consecutive)
X# we need the slashes to allow for M/M/1 queues
Xs;[A-Z][A-Z/][A-Z/0-9]*;{&};g
X#
X# then some proper names
X# first some mathematicians
X# (for some I've added the pattern [^ -]* to the end to get Markov,
X# Markovian, etc.)
Xs/Abel/{&}/g
Xs/Bernoulli/{&}/g
Xs/Bessel/{&}/g
Xs/Beta/{&}/g
Xs/Borel/{&}/g
Xs/Cauchy/{&}/g
Xs/Church/{&}/g
Xs/Rosser/{&}/g
Xs/Dedekind/{&}/g
Xs/Descartes/{&}/g
Xs/Dirichlet/{&}/g
Xs/Euclid[^ -,;]*/{&}/g
Xs/Euler/{&}/g
Xs/Fibonacci/{&}/g
Xs/Fermat/{&}/g
Xs/Fourier/{&}/g
Xs/Fresnel/{&}/g
Xs/Frobenius/{&}/g
Xs/Perron/{&}/g
Xs/Gamma/{&}/g
Xs/Gauss[^ -,;]*/{&}/g
Xs/Hilbert/{&}/g
Xs/Horner/{&}/g
Xs/Holder/{&}/g
Xs/Jacobi[^ -,;]*/{&}/g
Xs/Jensen/{&}/g
Xs/Markov[^ -,;]*/{&}/g
Xs/Arnoldi/{&}/g
Xs/Laplace/{&}/g
Xs/Laguerre/{&}/g
Xs/Lagrange/{&}/g
Xs/Legendre/{&}/g
Xs/Leibnitz/{&}/g
Xs/Rayleigh/{&}/g
Xs/Ritz/{&}/g
Xs/Riemann/{&}/g
X# this is really Rouche (acute accent), but the accent processing will disrupt it
Xs/Rouch/{&}/g
Xs/Stieltjes/{&}/g
Xs/Steiner/{&}/g
Xs/Schwarz/{&}/g
Xs/Weibull/{&}/g
Xs/Wald/{&}/g
Xs/Kronecker/{&}/g
Xs/Diophantine/{&}/g
Xs/Delbrouck/{&}/g
Xs/Bayes[^ -,;]*/{&}/g
Xs/Jackson/{&}/g
Xs/Newhall/{&}/g
Xs/Turing/{&}/g
Xs/Norton/{&}/g
Xs/Petri/{&}/g
Xs/Wilkinson/{&}/g
Xs/Skinner/{&}/g
Xs/Schafer/{&}/g
Xs/Dempster/{&}/g
Xs/Runge/{&}/g
Xs/Kutta/{&}/g
Xs/Pollaczek/{&}/g
Xs/Khinchin/{&}/g
Xs/Palm/{&}/g
Xs/Erlang/{&}/g
Xs/Engset/{&}/g
Xs/Little's/{&}/g
Xs/Kosten/{&}/g
Xs/Gittins/{&}/g
Xs/Feller/{&}/g
Xs/Cox/{&}/g
Xs/Poisson/{&}/g
Xs/Chapman/{&}/g
Xs/Kolmogorov/{&}/g
Xs/Smirnov/{&}/g
Xs/Wiener/{&}/g
Xs/Hopf/{&}/g
Xs/Stirling/{&}/g
X
X# computing
Xs/Buzen/{&}/g
Xs/Gordon/{&}/g
Xs/Newell/{&}/g
Xs/Lemoine/{&}/g
Xs/Pierce/{&}/g
Xs/Harrison/{&}/g
Xs/Cambridge/{&}/g
Xs/Ethernet/{&}/g
Xs/Aloha/{&}/g
X
X# coding theory
Xs/Hamming/{&}/g
Xs/Huffman/{&}/g
Xs/Reed/{&}/g
Xs/Shannon/{&}/g
Xs/Solomon/{&}/g
Xs/Viterbi/{&}/g
XZZ
Xcat << 'ZZ' > ref2b$$.awk
X#
X# awk script to convert refer (or bib) format databases
X# to BiBTeX format.
X#
X# written by Peter King, Heriot-Watt University
X# use freely, but don't claim that you wrote it
X#
X# Generates keys using authors names and year (see %A entry )
X#
X# You may wish to alter treatment of key fields that are ignored
X# such as %U %W %Y %K etc.
X#
X# regular expressions should be sorted according to frequency
X# so that minimal tests are made
X# From tests in a local data base the order given appears quite good
X# 2883 %A
X# 1813 blank lines
X# 1774 %T
X# 1764 %D
X# 1505 %P
X# 1347 %J
X# 1331 %V
X# 1201 %N
X#  773 .. continuation lines
X#  501 %C
X#  424 %I
X#  192 %B
X#  187 %E
X#   92 %S
X#   89 %R
X#   33 %X
X#   30 %K
X#   16 %O
X#   12 %any other % lines
X#
XBEGIN {
X    # suffixes for duplicate keys: the first repeat gets "a", the next "b", ...
X    for(i=1;i<=26;i++)
X        addkey[i] = substr("abcdefghijklmnopqrstuvwxyz",i,1);
X    lkey = 3;       # number of characters used from authors to make key
X    maxauthor = 3;  # maximum number of authors to use in
X                    # constructing key
X    rx = 1
X    }
X
X/\\[*u0]/ || /\\d[^{]/|| /\\s[^s]/ {
X    err = 1
X    print "Non translated \\ symbol : Reference " rx > "ref2bib.errs"
X    print $0 > "ref2bib.errs"
X    }
X
X/^%A/ {
X    if (A==0) keys="";
X    A ++; lastx = "A";
X    authors[A] = substr($0,4)
X    if(A> maxauthor) next
X    ic = 0
X    lc = 1
X    while(ic < lkey && lc <= length($NF) ){
X        kc = substr( $NF, lc, 1)
X        if ( kc ~ /[a-zA-Z]/ ){
X            keys = keys kc
X            ic++
X            if(ic==lkey) next
X        }
X        else if ( kc == "\\" ) lc ++;
X        lc ++;
X    }
X    next
X    }
X
X/^$/ {
X    if(NR==pr+1){
X    }
X    else {
X        refs ++
X        # if FILENAME != prevname then new file
X        acnt[A]++;if(A>MaxA)MaxA=A;
X        if(T==0)print "No title : Reference "refs" "keys > "ref2bib.errs"
X        if(A==0)print "No author : Reference "refs" "keys > "ref2bib.errs"
X        if(D==0)print "No date : Reference "refs" "keys > "ref2bib.errs"
X        if( (!T)||(!A)||(!D))err=1;
X        # classify the reference
X        if(J){
X            #journal or conference
X            if(B||E||R)print "Journal & book?: Reference "refs" "keys > "ref2bib.errs"
X            if(C||I) {conf++
X                type = "Inproceedings"
X            }
X            else{
X                jour ++;
X                type = "Article"
X            }
X            if(!P) print "No page nos.? : Reference "refs" "keys > "ref2bib.errs"
X            if( B||E||R||(!P))err=1
X            if(err){
X                print "Journal reference in error" > "ref2bib.errs"
X            }
X        }
X        else
X        if(B){
X            # article in book
X            type = "Incollection"
X            bookart++
X            if(N||R||(!E)||(!I)||(!C)||(!P)||(V&&(!S)))err=1
X            if(!E) print "No editor? Reference "refs" " keys > "ref2bib.errs"
X            if(!I) print "No publisher? Reference "refs" " keys > "ref2bib.errs"
X            if(!C) print "No city? Reference "refs" " keys > "ref2bib.errs"
X            if(!P) print "No page nos.? Reference "refs" " keys > "ref2bib.errs"
X            if(V&&(!S))print "Volume but no Series Reference "refs" " keys > "ref2bib.errs"
X            if(N)print "Issue no.? Reference "refs" " keys > "ref2bib.errs"
X            if(R)print "Report? Reference "refs" " keys > "ref2bib.errs"
X            if(err){
X                print "Book reference in error" > "ref2bib.errs"
X            }
X        }
X        else if(R){
X            #report
X            type = "Techreport"
X            reps++
X            if(E||N)err=1
X            if(N)print "Issue no.? Reference "refs" " keys > "ref2bib.errs"
X            if(E) print "Editor? Reference "refs" " keys > "ref2bib.errs"
X            if(err){
X                print "Report reference in error" > "ref2bib.errs"
X            }
X        }
X        else if(I){
X            wholebook ++
X            type = "Book"
X            # book
X            if(N||R||E||(!C)||(V&&(!S)))err=1
X            if(!C) print "No city? Reference "refs" " keys > "ref2bib.errs"
X            if(N)print "Issue no.? Reference "refs" " keys > "ref2bib.errs"
X            if(E)print "Editor? Reference "refs" " keys > "ref2bib.errs"
X            if(V&&(!S))print "Volume but no Series Reference "refs" " keys > "ref2bib.errs"
X            if(err){
X                print "Book reference in error" > "ref2bib.errs"
X            }
X        }
X        else {
X            unclass ++
X            type = "Misc"
X            err=1
X            print "Unclassified reference in error" > "ref2bib.errs"
X        }
X
X        # generate date
X        ndate = split(date,df)
X        if ( ndate > 2) print " Funny date " date > "ref2bib.errs"
X        if (ndate == 1 ) { df[2] = df[1]; df[1] = ""; }
X
X
X        # generate key
X        if(keys == "") keys = "ANON"
X        keys = keys substr(df[2],3,2)
X        if(keyused[keys] >=1) {
X            key_suffix = keyused[keys]++;
X            keys = keys addkey[key_suffix];
X        }
X        else keyused[keys] = 1
X        if (err) {
X            print "Key: " keys > "ref2bib.errs"
X            if(A) for (i=1;i<=A;i++)
X                print "%A " authors[i] > "ref2bib.errs"
X            if(T) print "%T " title > "ref2bib.errs"
X            if(J) print "%J "journal > "ref2bib.errs"
X            if(B) print "%B "book > "ref2bib.errs"
X            if(V) print "%V "volume > "ref2bib.errs"
X            if(N) print "%N "number > "ref2bib.errs"
X            if(I) print "%I "publisher > "ref2bib.errs"
X            if(C) print "%C "city > "ref2bib.errs"
X            if(E) for (i=1;i<=E;i++)print "%E "editor[i] > "ref2bib.errs"
X            if(S) print "%S "series > "ref2bib.errs"
X            if(P) print "%P "pages > "ref2bib.errs"
X            if(R) print "%R "report > "ref2bib.errs"
X            if(D) print "%D "date > "ref2bib.errs"
X            if(O) print "%O "other > "ref2bib.errs"
X            print "" > "ref2bib.errs"
X        }
X
X        # fold long title, note and abstract fields at about 55 columns
X        if(T){
X            twc = split(title,z)
X            title = z[1]; lt = length(z[1]);
X            for(i=2;i<=twc;i++) {
X                if(lt +length(z[i]) >= 55) {sc = "\n\t\t";lt = 0;}
X                else sc = " ";
X                title = title sc z[i]
X                lt += length(z[i]) + 1
X            }
X        }
X        if(O){
X            twc = split(other,z)
X            other = z[1]; lt = length(z[1]);
X            for(i=2;i<=twc;i++) {
X                if(lt + length(z[i]) >= 55) {sc = "\n\t\t";lt = 0;}
X                else sc = " ";
X                other = other sc z[i]
X                lt += length(z[i]) + 1
X            }
X        }
X        if(X){
X            twc = split(abstr,z)
X            abstr = z[1]; lt = length(z[1]);
X            for(i=2;i<=twc;i++) {
X                if(lt + length(z[i]) >= 55) {sc = "\n\t\t";lt = 0;}
X                else sc = " ";
X                abstr = abstr sc z[i]
X                lt += length(z[i]) + 1
X            }
X        }
X
X        printf "@%s{\t%s",type,keys
X        if(A) {
X            printf ",\n\tAuthor = { %s",authors[1]
X            for(i=2;i<=A;i++) printf " and\n\t\t%s",authors[i]
X            printf " }"
X        }
X        if(T) printf ",\n\tTitle = { %s }",title
X        if(B) printf ",\n\tBooktitle = { %s }",book
X        if(E) {
X            printf ",\n\tEditor = { %s",editor[1]
X            for(i=2;i<=E;i++) printf " and\n\t\t%s",editor[i]
X            printf " }"
X        }
X        if(I) printf ",\n\tPublisher = { %s }",publisher
X        if(C) printf ",\n\tAddress = { %s }",city
X        if(J) { # substitute the journal abbreviations from the standard styles
X            journal = "{ " journal " }"
X            # {acmcs} {"ACM Computing Surveys"}
X            if ( journal ~ /Comp.* Sur/ ) journal = "acmcs"
X            # {acta} {"Acta Informatica"}
X            if ( journal ~ /Acta Inf/ ) journal = "acta"
X            # {cacm} {"Communications of the ACM"}
X            if ( journal ~ /Com.* ACM/ ) journal = "cacm"
X            if ( journal ~ /CACM/ ) journal = "cacm"
X            # {ibmjrd} {"IBM Journal of Research and Development"}
X            if ( journal ~ /IBM J.*R.*D/ ) journal = "ibmjrd"
X            # {ibmsj} {"IBM Systems Journal"}
X            if ( journal ~ /IBM Sy.*J/ ) journal = "ibmsj"
X            # {ieeese} {"IEEE Transactions on Software Engineering"}
X            if ( journal ~ /IEEE Tran.*Soft.*Eng/ ) journal = "ieeese"
X            # {ieeetc} {"IEEE Transactions on Computers"}
X            if ( journal ~ /IEEE Tran.*Computers/ ) journal = "ieeetc"
X            # {ieeetcad}
X            if ( journal ~ /IEEE Tran.*Comp.*Desig/ ) journal = "ieeetcad"
X            # {ipl} {"Information Processing Letters"}
X            if ( journal ~ /Inf.*Proc.*Lett/ ) journal = "ipl"
X            # {jacm} {"Journal of the ACM"}
X            if ( journal ~ /Jou.* ACM/ ) journal = "jacm"
X            if ( journal ~ /JACM/ ) journal = "jacm"
X            # {jcss} {"Journal of Computer and System Sciences"}
X            if ( journal ~ /J.*Comp.*Sys.*Sc/ ) journal = "jcss"
X            # {scp} {"Science of Computer Programming"}
X            if ( journal ~ /Sc.*Comp.*Prog/ ) journal = "scp"
X            # {sicomp} {"SIAM Journal on Computing"}
X            if ( journal ~ /SIAM .*Comp/ ) journal = "sicomp"
X            # {tocs} {"ACM Transactions on Computer Systems"}
X            if ( journal ~ /ACM Tran.*Comp.*Sys/ ) journal = "tocs"
X            # {tods} {"ACM Transactions on Database Systems"}
X            if ( journal ~ /ACM Tran.*Data.*Sys/ ) journal = "tods"
X            # {tog} {"ACM Transactions on Graphics"}
X            if ( journal ~ /ACM Tran.*Grap/ ) journal = "tog"
X            # {toms} {"ACM Transactions on Mathematical Software"}
X            if ( journal ~ /ACM Tran.*Math.*Soft/ ) journal = "toms"
X            # {toois} {"ACM Transactions on Office Information Systems"}
X            if ( journal ~ /ACM Tran.*Off.*Inf.*Sys/ ) journal = "toois"
X            # {toplas} {"ACM Transactions on Programming Languages and Systems"}
X            if ( journal ~ /ACM Tran.*Prog.*Lan.*Sys/ ) journal = "toplas"
X            # {tcs} {"Theoretical Computer Science"}
X            if ( journal ~ /Th.*Comp.*Sci/ ) journal = "tcs"
X
X            printf ",\n\tJournal = %s",journal
X        }
X        if(V) printf ",\n\tVolume = { %s }",volume
X        if(N) printf ",\n\tNumber = { %s }",number
X        if(P) printf ",\n\tPages = { %s }",pages
X        if(O) printf ",\n\tNote = { %s }",other
X        if(R) printf ",\n\tNumber = { %s }",report
X        if(S) printf ",\n\tSeries = { %s }",series
X        if(df[1] != "")
X            printf ",\n\tMonth = { %s }",df[1]
X        if(D) printf ",\n\tYear = { %s }",df[2]
X        if(X) printf ",\n\tAnnote = { %s }",abstr
X        if(L) printf ",\n\tKey = { %s }",label
X        printf "\t}\n\n"
X
X        # reset the per-reference counters and fields
X        A=0;B=0;C=0;D=0;E=0;F=0;G=0;H=0;I=0;J=0;
X        K=0;L=0;M=0;N=0;O=0;P=0;Q=0;R=0;S=0;T=0;
X        U=0;V=0;W=0;X=0;Y=0;Z=0;
X        type = ""
X        book=""
X        title = ""
X        volume = ""
X        city = ""
X        date = ""
X        publisher = ""
X        journal = ""
X        number = ""
X        other = ""
X        pages = ""
X        report = ""
X        series = ""
X        toterr +=err
X        rx++
X    }
X    err = 0
X    pr = NR
X    next
X    }
X
X/^%T/ {
X    T ++; lastx = "T"
X    if(T>1){err=1
X        print "Two titles: Reference " rx > "ref2bib.errs"
X        print title > "ref2bib.errs"
X    }
X    title = substr($0,4)
X    next
X    }
X
X/^%D/ {
X    D ++; lastx = "D"
X    if(D>1){err=1
X        print "Two dates: Reference " rx > "ref2bib.errs"
X        print date > "ref2bib.errs"
X    }
X    if(($NF<1900)||($NF>=2000)){err=1
X        print "Date error? : Reference " rx > "ref2bib.errs"
X    }
X    date = substr($0,4);
X    next
X    }
X
X/^%P/ {
X    P ++; lastx = "P"
X    if(P>1){err=1
X        print "Two page nos? : Reference " rx > "ref2bib.errs"
X        print pages > "ref2bib.errs"
X    }
X    pages = substr($0,4)
X    next
X    }
X
X/^%J/ {
X    J ++; lastx = "J"
X    if(J>1){err=1
X        print "Two journals: Reference " rx > "ref2bib.errs"
X        print journal > "ref2bib.errs"
X    }
X    journal = substr($0,4)
X    next
X    }
X
X/^%V/ {
X    V ++; lastx = "V"
X    if(V>1){err=1
X        print "Two volumes: Reference " rx > "ref2bib.errs"
X        print volume > "ref2bib.errs"
X    }
X    volume = substr($0,4)
X    next
X    }
X
X/^%N/ {
X    N ++; lastx = "N"
X    if(N>1){err=1
X        print "Two issue numbers: Reference " rx > "ref2bib.errs"
X        print number > "ref2bib.errs"
X    }
X    number = substr($0,4)
X    next
X    }
X
X/^[^%]/ {
X    # continuation line: append it to the field seen most recently
X    if( lastx == "A") authors[A] = authors[A] " " $0
X    if( lastx == "B") book = book " " $0
X    if( lastx == "C") city = city " " $0
X    if( lastx == "D") date = date " " $0
X    if( lastx == "E") editor[E] = editor[E] " " $0
X    if( lastx == "I") publisher = publisher " " $0
X    if( lastx == "J") journal = journal " " $0
X    if( lastx == "L") label = label " " $0
X    if( lastx == "N") number = number " " $0
X    if( lastx == "O") other = other " " $0
X    if( lastx == "P") pages = pages " " $0
X    if( lastx == "R") report = report " " $0
X    if( lastx == "S") series = series " " $0
X    if( lastx == "T") title = title " " $0
X    if( lastx == "V") volume = volume " " $0
X    if( lastx == "X") abstr = abstr " " $0
X    next
X    }
X
X/^%C/ {
X    C ++; lastx = "C"
X    if(C>1){err=1
X        print "Two cities: Reference " rx > "ref2bib.errs"
X        print city > "ref2bib.errs"
X        print " 2 cities " FILENAME, pr+1, NR > "ref2bib.errs"
X    }
X    city = substr($0,4)
X    next
X    }
X
X/^%I/ {
X    I ++; lastx = "I"
X    if(I>1){err=1
X        print "Two publishers: Reference " rx > "ref2bib.errs"
X        print publisher > "ref2bib.errs"
X    }
X    publisher = substr($0,4)
X    next
X    }
X
X/^%B/ {
X    B ++; lastx = "B"
X    if(B>1){err=1
X        print "Two books: Reference " rx > "ref2bib.errs"
X        print book > "ref2bib.errs"
X    }
X    book = substr($0,4)
X    next
X    }
X
X/^%E/ { # this really deals with 'bib' format
X    # refer only allows one %E field, so we ought to
X    # split it somehow
X    E ++; lastx = "E"
X    editor[E] = substr($0,4)
X    next
X    }
X
X/^%[^ABCDEIJKLNOPRSTVX]/ {
X    F ++; lastx = "F"; # should not get these
X    print "Unexpected flag: Reference " rx > "ref2bib.errs"
X    print $0 > "ref2bib.errs"
X    err = 1
X    next
X    }
X
X/^%O/ {
X    O ++; lastx = "O"
X    if(O>1){err=1
X        print "Two others: Reference " rx > "ref2bib.errs"
X        print other > "ref2bib.errs"
X    }
X    other = substr($0,4)
X    next
X    }
X
X/^%S/ {
X    S ++; lastx = "S"
X    if(S>1){err=1
X        print "Two series: Reference " rx > "ref2bib.errs"
X        print series > "ref2bib.errs"
X    }
X    series = substr($0,4)
X    next
X    }
X
X/^%R/ {
X    R ++; lastx = "R"
X    if(R>1){err=1
X        print "Two reports: Reference " rx > "ref2bib.errs"
X        print report > "ref2bib.errs"
X    }
X    report = substr($0,4)
X    next
X    }
X
X/^%X/ {
X    X ++; lastx = "X"
X    abstr = substr($0,4)
X    if(X>1){err=1
X        print "Two abstracts: Reference " rx > "ref2bib.errs"
X    }
X    next
X    }
X
X/^%K/ {
X    lastx = "K"
X    next
X    }
XEND {
X    print refs " references" > "ref2bib.errs"
X    if(toterr) print toterr " erroneous" > "ref2bib.errs"
X    if(conf) print conf " conference papers" > "ref2bib.errs"
X    if(jour) print jour " journal articles" > "ref2bib.errs"
X    if(wholebook) print wholebook " books" > "ref2bib.errs"
X    # bookart is the counter incremented for articles in books
X    if(bookart) print bookart " book articles" > "ref2bib.errs"
X    if(reps) print reps " reports" > "ref2bib.errs"
X    if(unclass) print unclass " Unclassified" > "ref2bib.errs"
X    if(totO) print totO " have additional information." > "ref2bib.errs"
X    if(totK) print totK " have additional keywords." > "ref2bib.errs"
X    if(totX) print totX " have abstracts/commentaries." > "ref2bib.errs"
X    print totA " authors" > "ref2bib.errs"
X    for(i=0;i<=MaxA;i++)if(acnt[i]){
X        print i, " authors ", acnt[i] > "ref2bib.errs"
X        av += i*acnt[i]
X    }
X    print "Average ", av/refs > "ref2bib.errs"
X    print totT " titles" > "ref2bib.errs"
X    print "Key frequencies" > "ref2bib.errs"
X    for(k in keyused) print k, keyused[k] > "ref2bib.errs"
X
X    }
XZZ
Xsed -f ref2b$$.sed $* | awk -f ref2b$$.awk
Xrm -f ref2b$$.sed ref2b$$.awk
End-of-file
echo Extracting ref2bib.1
sed 's/^X//' <<'End-of-file' >ref2bib.1
X.TH REF2BIB 1-local
X.SH NAME
Xref2bib \- convert refer input files to bibtex .bib files
X.SH SYNOPSIS
X.B ref2bib
Xfile ...
X.br
X.SH DESCRIPTION
X.B ref2bib
Xreads the
X.I files
Xand produces a
X.B bibtex
Xreference list (a .bib file) on the standard output.
XIf no files are given, ref2bib reads
Xstandard input.
X.PP
XA rudimentary attempt is made to convert
X.I troff
Xspecial characters and accents to the equivalent
X.I TeX
Xones.
XThe file ``ref2bib.errs'' contains complaints about references that were
Xnot recognised, and other problems, as well as a summary of the
Xnumber of conversions completed.
X.PP
XSince
X.B refer
Xfiles are inherently unstructured (compared to
X.B bibtex )
X.B ref2bib
Xonly does a passable job. In particular
X.B refer
Xdoesn't require a keyword, while
X.B bibtex
Xdoes.
X.B ref2bib
Xgenerates one using the following procedure:
Xthe first 3 characters of the last names of the first three authors
Xare concatenated, (preserving the capital letters), and the last two
Xdigits of the date are appended. If this key has already been used,
Xthen 'a', 'b', 'c', are appended as needed.
X.PP
XJournal entries that appear to be in the standard bibliography style
Xfiles list of @strings, are converted.
XThe %D field is converted to month and year entries if there are two
Xfields, otherwise it is assumed to contain only the year.
XA large number of proper names, such as Hilbert, Turing, etc.,
Xwhich are often found in the titles of articles are enclosed in braces
X{} to protect them. This treatment is also applied to any strings of
Xtwo or more consecutive capital letters.
X.PP
XTo determine the type of reference that the
X.B refer
Xentry is,
X.B ref2bib
Xhas to do some ``calculated guessing''. The heuristic used
Xhere (again, in order of precedence) is:
X.PP
X1. If it has a journal entry (%J) then it's considered to
Xbe an @article, unless there is a city entry (%C) or a publisher entry
X(%I) as well, in which case it's
Xtreated as an @inproceedings.
X.PP
X2. If it has a book entry (%B) then it's considered to
Xbe an @incollection.
X.PP
X3. If it has a report entry (%R) then it's considered to
Xbe a @techreport.
X.PP
X4. If it has an issuer entry (%I) then it's considered to
Xbe a @book.
X.PP
X5. Otherwise it's considered to be a @misc.
XAll @misc entries are listed in the ``ref2bib.errs'' file.
X.PP
XQuite often
X.B ref2bib
Xwill misguess and you will need to edit (by hand) the resulting .bib
Xfile.
X.PP
XAny fields that
X.B ref2bib
Xdoesn't know about it will ignore (and complain about in ``ref2bib.errs'').
X.SH ACKNOWLEDGMENT
XThis manual page is based on the manual page for
X.I r2bib ,
Xa program which performs a simpler version of the same conversion,
Xwritten by
XRusty Wright, Center For Music Experiment, University of California San
XDiego.
X.SH AUTHOR
XPeter King, Computer Science Department, Heriot-Watt University,
XEdinburgh.
X.SH BUGS
XImplemented as a
X.I sh(1)
Xscript, using
X.I sed(1)
Xand
X.I awk(1) .
XThis makes the conversion very slow, but also means that it is easily
Xmodified to alter the heuristics. In particular, the key generation
Xalgorithm is easily changed.
End-of-file
echo End of archive
exit 0
-------
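
For readers who want to try the key procedure described in the manual page without unpacking the archive, here is a minimal sketch in plain sh. It is not part of ref2bib: the function name make_key and its argument order are hypothetical, and it is a simplification of the awk code. It takes surnames as separate arguments instead of reading %A lines, it drops every non-letter rather than skipping the character after a backslash, and it leaves out the 'a', 'b', 'c' suffix for repeated keys.

```shell
#!/bin/sh
# Hypothetical helper, not part of ref2bib: build a citation key from a
# year and up to three author surnames, as the manual page describes.
# Usage: make_key YEAR [SURNAME ...]
make_key() {
    year=$1
    shift
    key=
    count=0
    for surname in "$@"; do
        count=$((count + 1))
        # only the first three authors contribute to the key
        if [ "$count" -gt 3 ]; then
            break
        fi
        # keep letters only, then take the first three, case preserved
        letters=$(printf '%s\n' "$surname" | tr -cd 'A-Za-z\n' | cut -c1-3)
        key=$key$letters
    done
    # same fallback as the awk script when there is no author
    if [ -z "$key" ]; then
        key=ANON
    fi
    # third and fourth characters of the year, like substr(df[2],3,2)
    yy=$(printf '%s\n' "$year" | cut -c3-4)
    printf '%s%s\n' "$key" "$yy"
}

make_key 1988 King Smith
```

Changing the 1-3 in the cut call, or the 3 in the author test, corresponds to changing lkey and maxauthor in the BEGIN block of the awk script.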