refer(1) — Linux manual page
refer(1) General Commands Manual refer(1)
Name
refer - process bibliographic references for groff
Synopsis
refer [-bCenPRS] [-a n] [-B field.macro] [-c fields] [-f n]
[-i fields] [-k field] [-l range-expression]
[-p database-file] [-s fields] [-t n] [file ...]
refer --help
refer -v
refer --version
Description
The GNU implementation of refer is part of the groff(1) document
formatting system. refer is a troff(1) preprocessor that
prepares bibliographic citations by looking up keywords specified
in a roff(7) input document, obviating the need to type such
annotations, and permitting the citation style in formatted
output to be altered independently and systematically. It copies
the contents of each file to the standard output stream, except
that it interprets lines between .[ and .] as citations to be
translated into groff input, and lines between .R1 and .R2 as
instructions regarding how citations are to be processed. refer
interprets and generates roff lf requests so that file names and
line numbers in messages produced by commands that read its
output correctly describe the source document. Normally, refer
is not executed directly by the user, but invoked by specifying
the -R option to groff(1). If no file operands are present, or
if file is “-”, refer reads the standard input stream.
A citation identifies a work by reference to a bibliographic
record detailing it. Select a work from a database of records by
listing keywords that uniquely identify its entry.
Alternatively, a document can specify a record for the work at
the point its citation occurs. A document can use either or both
strategies as desired.
For each citation, refer produces a mark in the text, like a
superscripted footnote number or “[Lesk1978a]”. A mark consists
of a label between brackets. The mark can be separated from
surrounding text and from other labels in various ways. refer
produces roff language requests usable by a document or a macro
package such as me, mm, mom, or ms to produce a formatted
reference for each citation. A citation's reference can be
output immediately after it occurs (as with footnotes), or
references may accumulate, with corresponding output appearing
later in the document (as with endnotes). When references
accumulate, multiple citations of the same reference produce a
single formatted entry.
Interpretation of lines between .R1 and .R2 tokens as
preprocessor commands is a GNU refer extension. Documents
employing this feature can still be processed by AT&T refer by
adding the lines
.de R1
.ig R2
..
to the beginning of the document. The foregoing input causes
troff to ignore everything between .R1 and .R2. The effects of
some refer commands can be achieved by command-line options;
these are supported for compatibility with AT&T refer. It is
usually more convenient to use commands.
Bibliographic records
A bibliographic record describes a referenced work in sufficient
detail that it may be cited to accepted standards of scholarly
and professional clarity. The record format permits annotation
and extension that a document may use or ignore. A record is a
plain text sequence of fields, one per line, each consisting of a
percent sign %, an alphanumeric character classifying it, one
space, and its contents. If a field's contents are empty, the
field is ignored.
Frequently, such records are organized into a bibliographic
database, with each entry separated by blank lines or file
boundaries. This practice relieves documents of the need to
maintain bibliographic data themselves. The programs lookbib(1)
and lkbib(1) consult a bibliographic database, and indxbib(1)
indexes one to speed retrieval from it, reducing document
processing time. Use of these tools is optional.
The conventional uses of the bibliographic field entries are as
follows. Within a record, fields other than %A and %E replace
previous occurrences thereof. The ordering of multiple %A and %E
fields is significant.
%A names an author. If the name contains a suffix such as
“Jr.” or “III”, it should be separated from the surname by
a comma. We recommend always supplying an %A field or a
%Q field.
%B records the title of the book within which a cited article
is collected. See %J and %T.
%C names the city or other place of publication.
%D indicates the date of publication. Specify the year in
full. If the month is specified, use its name rather than
its number; only the first three letters are required. We
recommend always supplying a %D field; if the date is
unknown, use “in press” or “unknown” as its contents.
%E names an editor of the book within which a cited article
is collected. Where a work has editors but no authors,
name the editors in %A fields and append “, (ed.)” or
“, (eds.)” to the last of these.
%G records the U.S. government ordering number, ISBN, DOI, or
other unique identifier.
%I names the publisher (issuer).
%J records the title of the journal within which a cited
article is collected. See %B and %T.
%K lists keywords intended to aid searches.
%L is a label; typically unused in database entries, it can
override the label format otherwise determined.
%N records the issue number of the journal within which a
cited article is collected.
%O presents additional (“other”) information, typically
placed at the end of the reference.
%P lists the page numbers of a cited work that is part of a
larger collection. Specify a range with m-n.
%Q names an institutional author when no %A fields are
present. Only one %Q field is permitted.
%R is an identifier for a report, thesis, memorandum, or
other unpublished work.
%S records the title of a series to which the cited work
belongs.
%T is the work's title. See %B and %J.
%V is the volume number of the journal or book containing the
cited work.
%X is an annotation. By convention, it is not formatted in
the citing document.
If the obsolescent “accent strings” feature of the ms or me macro
packages is used, such strings should follow the character to be
accented; an ms document must call the AM macro before using
them. Do not quote accent strings: use one \ rather than two.
See groff_char(7) for a modern approach to the problem of
diacritics.
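The record format described in this section can be modeled in a few lines of Python. This is a minimal sketch, not part of refer. The names parse_record and parse_database are hypothetical. The sketch assumes every field fits on one line, and it silently skips any line that does not start with a percent sign, a classifying character, and a space.

```python
import re

# A field line: percent sign, one alphanumeric character, one space, contents.
FIELD_RE = re.compile(r"^%([A-Za-z0-9]) (.*)$")

# Only %A and %E accumulate; any other field replaces its previous occurrence.
REPEATABLE = {"A", "E"}


def parse_record(text):
    """Return a dict mapping each field letter to a list of its contents."""
    record = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match is None:
            continue  # assumption: lines of any other shape are skipped
        name, contents = match.group(1), match.group(2).strip()
        if not contents:
            continue  # a field with empty contents is ignored
        if name in REPEATABLE:
            record.setdefault(name, []).append(contents)  # order is kept
        else:
            record[name] = [contents]
    return record


def parse_database(text):
    """Split a database into records at blank lines and parse each one."""
    chunks = re.split(r"\n\s*\n", text.strip())
    return [parse_record(chunk) for chunk in chunks if chunk.strip()]
```

Representing every field as a list keeps the two repeatable fields and the single-valued ones uniform for callers.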
Citations
Citations have a characteristic format.
.[opening-text
flags keyword ...
field
...
.]closing-text
opening-text, closing-text, and flags are optional, and only one
keyword or field need be specified. If keywords are present,
refer searches the bibliographic database(s) for a unique
reference matching them. Multiple matches are an error; add more
keywords to disambiguate the reference. In the absence of
keywords, fields constitute the bibliographic record. Otherwise,
fields specify additional data to replace or supplement those in
the reference. When references are accumulating and keywords are
present, specify additional fields at most on the first citation
of a particular reference; they apply to all further citations
thereof.
opening-text and closing-text are roff input used to bracket the
label, overriding the bracket-label command. Leading and
trailing spaces are significant. If either of these is non-
empty, the corresponding arguments to the bracket-label command
are not used; alter this behavior with the [ and ] flags.
flags is a list of non-alphanumeric characters each of which
modifies the treatment of the particular citation. AT&T refer
treats these flags as keywords, but ignores them since they are
non-alphanumeric. The following flags direct GNU refer.
# Use the label specified by the short-label command, if
any. refer otherwise uses the normal label. Typically, a
short label implements author-date citation styles
consisting of a name, a year, and a disambiguating letter
if necessary. “#” is meant to suggest such a
(quasi-)numeric label.
[ Precede opening-text with the first argument given to the
bracket-label command.
] Follow closing-text with the second argument given to the
bracket-label command.
An advantage of the [ and ] flags over use of opening-text and
closing-text is that you can update the document's bracketing
style in one place using the bracket-label command. Another is
that sorting and merging of citations is not necessarily
inhibited if the flags are used.
refer appends any label resulting from a citation to the roff
input line preceding the .[ token. If there is no such line,
refer issues a warning diagnostic.
There is no special notation for citing multiple references in
series. Use a sequence of citations, one for each reference,
with nothing between them. refer attaches all of their labels to
the line preceding the first. These labels may be sorted or
merged. See the description of the <> label expression, and of
the sort-adjacent-labels and abbreviate-label-ranges commands. A
label is not merged if its citation has a non-empty opening-text
or closing-text. However, the labels for two adjacent citations,
the former using the ] flag and without any closing-text, and the
latter using the [ flag and without any opening-text, may be
sorted and merged even if the former's opening-text or the
latter's closing-text is non-empty. (To prevent these
operations, use the dummy character escape sequence \& as the
former's closing-text.)
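The overall shape of a citation can be illustrated with a short Python sketch. It is a model of the layout shown at the start of this section, not refer's own parser. The function name extract_citations is hypothetical, and the sketch does not interpret flags: it returns each keyword line unmodified.

```python
def extract_citations(document):
    """Collect the citations found between .[ and .] lines of a document."""
    citations = []
    current = None  # the citation being read, or None outside a citation
    for line in document.splitlines():
        if current is None:
            if line.startswith(".["):
                # Text after the .[ token is the opening-text.
                current = {"opening": line[2:], "keywords": [],
                           "fields": [], "closing": ""}
        elif line.startswith(".]"):
            # Text after the .] token is the closing-text.
            current["closing"] = line[2:]
            citations.append(current)
            current = None
        elif line.startswith("%"):
            current["fields"].append(line)    # a bibliographic field
        else:
            current["keywords"].append(line)  # flags and keywords, unparsed
    return citations
```

Leading and trailing spaces in the opening-text and closing-text are kept because they are significant.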
Commands
Commands are contained between lines starting with .R1 and .R2.
The -R option prevents recognition of these lines. When refer
encounters a .R1 line, it flushes any accumulated references.
Neither .R1 nor .R2 lines, nor anything between them, is output.
Commands are separated by newlines or semicolons. A number sign
(#) introduces a comment that extends to the end of the line, but
does not conceal the newline. Each command is broken up into
words. Words are separated by spaces or tabs. A word that
begins with a (neutral) double quote (") extends to the next
double quote that is not followed by another double quote. If
there is no such double quote, the word extends to the end of the
line. Pairs of double quotes in a word beginning with a double
quote collapse to one double quote. Neither a number sign nor a
semicolon is recognized inside double quotes. A line can be
continued by ending it with a backslash “\”; this works
everywhere except after a number sign.
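The splitting rules above can be made concrete with a small Python model. It is only a sketch of the rules as this page states them, not refer's implementation, and the function name split_commands is hypothetical. Two details are assumptions: a number sign begins a comment anywhere outside double quotes, and a backslash-newline inside a word is simply dropped.

```python
def split_commands(text):
    """Split the text between .R1 and .R2 into commands (lists of words)."""
    commands, words = [], []
    i, n = 0, len(text)

    def continued(pos):
        # True if a backslash at pos is immediately followed by a newline.
        return text[pos] == "\\" and pos + 1 < n and text[pos + 1] == "\n"

    while i < n:
        c = text[i]
        if continued(i):
            i += 2                      # line continuation
        elif c in " \t":
            i += 1                      # word separator
        elif c in ";\n":
            if words:                   # command separator
                commands.append(words)
                words = []
            i += 1
        elif c == "#":
            while i < n and text[i] != "\n":
                i += 1                  # comment; the newline is kept
        elif c == '"':
            i += 1
            word = []
            while i < n and text[i] != "\n":
                if continued(i):
                    i += 2
                elif text[i] == '"':
                    if i + 1 < n and text[i + 1] == '"':
                        word.append('"')    # a pair collapses to one quote
                        i += 2
                    else:
                        i += 1              # closing quote
                        break
                else:
                    word.append(text[i])
                    i += 1
            words.append("".join(word))
        else:
            word = []
            while i < n and text[i] not in " \t;\n#":
                if continued(i):
                    i += 2
                else:
                    word.append(text[i])
                    i += 1
            words.append("".join(word))
    if words:
        commands.append(words)
    return commands
```

Applied to the text label %n; accumulate, the function returns two commands, the first with the words label and %n.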
Each command name that is marked with * has an associated
negative command no-name that undoes the effect of name. For
example, the no-sort command specifies that references should not
be sorted. The negative commands take no arguments.
In the following description each argument must be a single word;
field is used for a single upper or lower case letter naming a
field; fields is used for a sequence of such letters; m and n are
used for non-negative numbers; string is used for an arbitrary
string; file is used for the name of a file.
abbreviate* fields string1 string2 string3 string4
Abbreviate the first names of fields. An initial letter
will be separated from another initial letter by string1,
from the surname by string2, and from anything else (such
as “von” or “de”) by string3. These default to a period
followed by a space. In a hyphenated first name, the
initial of the first part of the name will be separated
from the hyphen by string4; this defaults to a period. No
attempt is made to handle any ambiguities that might
result from abbreviation. Names are abbreviated before
sorting and before label construction.
abbreviate-label-ranges* string
Three or more adjacent labels that refer to consecutive
references will be abbreviated to a label consisting of
the first label, followed by string, followed by the last
label. This is mainly useful with numeric labels. If
string is omitted, it defaults to “-”.
accumulate*
Accumulate references instead of writing out each
reference as it is encountered. Accumulated references
will be written out whenever a reference of the form
.[
$LIST$
.]
is encountered, after all input files have been processed,
and whenever a .R1 line is recognized.
annotate* field string
field is an annotation; print it at the end of the
reference as a paragraph preceded by the line
.string
If string is omitted, it will default to AP; if field is
also omitted it will default to X. Only one field can be
an annotation.
articles string ...
Each string is a definite or indefinite article, and
should be ignored at the beginning of T fields when
sorting. Initially, “a”, “an”, and “the” are recognized
as articles.
bibliography file ...
Write out all the references contained in each
bibliographic database file. This command should come
last in an .R1/.R2 block.
bracket-label string1 string2 string3
In the text, bracket each label with string1 and string2.
An occurrence of string2 immediately followed by string1
will be turned into string3. The default behavior is as
follows.
bracket-label \*([. \*(.] ", "
capitalize fields
Convert fields to caps and small caps.
compatible*
Recognize .R1 and .R2 even when followed by a character
other than space or newline.
database file ...
Search each bibliographic database file. For each file,
if an index file.i created by indxbib(1) exists, then it
will be searched instead; each index can cover multiple
databases.
date-as-label* string
string is a label expression that specifies a string with
which to replace the D field after constructing the label.
See subsection “Label expressions” below for a description
of label expressions. This command is useful if you do
not want explicit labels in the reference list, but
instead want to handle any necessary disambiguation by
qualifying the date in some way. The label used in the
text would typically be some combination of the author and
date. In most cases you should also use the
no-label-in-reference command. For example,
date-as-label D.+yD.y%a*D.-y
would attach a disambiguating letter to the year part of
the D field in the reference.
default-database*
The default database should be searched. This is the
default behavior, so the negative version of this command
is more useful. refer determines whether the default
database should be searched on the first occasion that it
needs to do a search. Thus a no-default-database command
must be given before then, in order to be effective.
discard* fields
When the reference is read, fields should be discarded; no
string definitions for fields will be output. Initially,
fields are XYZ.
et-al* string m n
Configure use of “et al” in the evaluation of @
expressions in label expressions. If u is the number of
authors needed to make the author sequence unambiguous and
the total number of authors is t, then the last t-u
authors will be replaced by string provided that t-u is
not less than m and t is not less than n. The default
behavior is as follows.
et-al " et al" 2 3
Note the absence of a dot from the end of the
abbreviation, which is arguably not correct. (Et al[.]
is short for et alii, as etc. is short for et cetera.)
include file
Include file and interpret the contents as commands.
join-authors string1 string2 string3
Join multiple authors together with strings. When there
are exactly two authors, they will be joined with string1.
When there are more than two authors, all but the last two
will be joined with string2, and the last two authors will
be joined with string3. If string3 is omitted, it will
default to string1; if string2 is also omitted it will
also default to string1. For example,
join-authors " and " ", " ", and "
will restore the default method for joining authors.
label-in-reference*
When outputting the reference, define the string [F to be
the reference's label. This is the default behavior, so
the negative version of this command is more useful.
label-in-text*
For each reference output a label in the text. The label
will be separated from the surrounding text as described
in the bracket-label command. This is the default
behavior, so the negative version of this command is more
useful.
label string
string is a label expression describing how to label each
reference.
separate-label-second-parts string
When merging two-part labels, separate the second part of
the second label from the first label with string. See
the description of the <> label expression.
move-punctuation*
In the text, move any punctuation at the end of a line past
the label. We recommend employing this command unless you
are using superscripted numbers as labels.
reverse* string
Reverse the fields whose names are in string. An optional
integer after a field name limits the number of such
fields to the given count; no integer means no limit.
search-ignore* fields
While searching for keys in databases for which no index
exists, ignore the contents of fields. Initially, fields
XYZ are ignored.
search-truncate* n
Only require the first n characters of keys to be given.
In effect, when searching for a given key, words in the
database are truncated to the maximum of n and the length
of the key. Initially, n is 6.
short-label* string
string is a label expression that specifies an alternative
(usually shorter) style of label. This is used when the #
flag is given in the citation. When using author-date
style labels, the identity of the author or authors is
sometimes clear from the context, and so it may be
desirable to omit the author or authors from the label.
The short-label command will typically be used to specify
a label containing just a date and possibly a
disambiguating letter.
sort* string
Sort references according to string. References will
automatically be accumulated. string should be a list of
field names, each followed by a number, indicating how
many fields with the name should be used for sorting. “+”
can be used to indicate that all the fields with the name
should be used. Also, “.” can be used to indicate that the
references should be sorted using the (tentative) label.
(Subsection “Label expressions” below describes the
concept of a tentative label.)
sort-adjacent-labels*
Sort labels that are adjacent in the text according to
their position in the reference list. This command should
usually be given if the abbreviate-label-ranges command
has been given, or if the label expression contains a <>
expression. This has no effect unless references are
being accumulated.
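Several of the commands above state small, precise rules. The Python sketch below models four of them: join-authors, et-al, abbreviate-label-ranges, and search-truncate. It is an illustration of the prose, not refer's implementation, and all function names are hypothetical. Further assumptions: labels are plain reference numbers already in order, the number u of authors needed for an unambiguous sequence is supplied by the caller, and key comparison ignores case.

```python
def join_authors(authors, string1, string2=None, string3=None):
    """Join author names as the join-authors command describes."""
    if string3 is None:
        string3 = string1          # string3 defaults to string1
    if string2 is None:
        string2 = string1          # string2 defaults to string1 as well
    if len(authors) <= 1:
        return "".join(authors)
    if len(authors) == 2:
        return string1.join(authors)
    # All but the last two are joined with string2, the last two with string3.
    return string2.join(authors[:-1]) + string3 + authors[-1]


def apply_et_al(authors, u, string=" et al", m=2, n=3):
    """Return (authors kept, suffix) under the et-al rule; u is given."""
    t = len(authors)
    if t - u >= m and t >= n:
        return authors[:u], string
    return authors, ""


def abbreviate_label_ranges(numbers, string="-"):
    """Collapse runs of three or more consecutive numeric labels."""
    labels, i = [], 0
    while i < len(numbers):
        j = i
        while j + 1 < len(numbers) and numbers[j + 1] == numbers[j] + 1:
            j += 1
        if j - i + 1 >= 3:
            labels.append(f"{numbers[i]}{string}{numbers[j]}")
        else:
            labels.extend(str(k) for k in numbers[i:j + 1])
        i = j + 1
    return labels


def key_matches(key, word, n=6):
    """search-truncate: compare key with word cut to max(n, len(key))."""
    limit = max(n, len(key))
    return word[:limit].lower() == key.lower()
```

With the default join-authors strings, two authors are joined by " and ", which agrees with the [A string shown in the Examples section.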
Label expressions
Label expressions can be evaluated both normally and tentatively.
The result of normal evaluation is used for output. The result
of tentative evaluation, called the tentative label, is used to
gather the information that normal evaluation needs to
disambiguate the label. Label expressions specified by the
date-as-label and short-label commands are not evaluated
tentatively. Normal and tentative evaluation are the same for
all types of expression other than @, *, and % expressions. The
description below applies to normal evaluation, except where
otherwise specified.
field [n]
is the nth part of field. If n is omitted, it defaults
to 1.
'string'
The characters in string literally.
@ All authors joined as specified by the join-authors
command. The whole of each author's name is used.
However, if the references are sorted by author (that is,
the sort specification starts with “A+”), then authors'
surnames will be used instead, provided that this does not
introduce ambiguity, and also an initial subsequence of
the authors may be used instead of all the authors, again
provided that this does not introduce ambiguity. Given
any two referenced works with n authors, the use of only
the surname for the nth author of a reference is regarded
as ambiguous if the other reference shares the first n-1
authors, the nth authors of each reference are not
identical, but the nth authors' surnames are the same. A
proper initial subsequence of the sequence of authors for
some reference is considered to be ambiguous if there is a
reference with some other sequence of authors which also
has that subsequence as a proper initial subsequence.
When an initial subsequence of authors is used, the
remaining authors are replaced by the string specified by
the et-al command; this command may also specify
additional requirements that must be met before an initial
subsequence can be used. @ tentatively evaluates to a
canonical representation of the authors, such that authors
that compare equally for sorting purposes have the same
representation.
%n
%a
%A
%i
%I The serial number of the reference formatted according to
the character following the %. The serial number of a
reference is 1 plus the number of earlier references with the
same tentative label as this reference. These expressions
tentatively evaluate to an empty string.
expr* If there is another reference with the same tentative
label as this reference, then expr, otherwise an empty
string. It tentatively evaluates to an empty string.
expr+n
expr-n The first (+) or last (-) n upper or lower case letters or
digits of expr. roff special characters (such as \('a)
count as a single letter. Accent strings are retained but
do not count towards the total.
expr.l expr converted to lowercase.
expr.u expr converted to uppercase.
expr.c expr converted to caps and small caps.
expr.r expr reversed so that the surname is first.
expr.a expr with first names abbreviated. Fields specified in
the abbreviate command are abbreviated before any labels
are evaluated. Thus .a is useful only when you want a
field to be abbreviated in a label but not in a reference.
expr.y The year part of expr.
expr.+y
The part of expr before the year, or the whole of expr if
it does not contain a year.
expr.-y
The part of expr after the year, or an empty string if
expr does not contain a year.
expr.n The surname part of expr.
expr1~expr2
expr1 except that if the last character of expr1 is - then
it will be replaced by expr2.
expr1 expr2
The concatenation of expr1 and expr2.
expr1|expr2
If expr1 is non-empty then expr1 otherwise expr2.
expr1&expr2
If expr1 is non-empty then expr2 otherwise an empty
string.
expr1?expr2:expr3
If expr1 is non-empty then expr2 otherwise expr3.
<expr> The label is in two parts, which are separated by expr.
Two adjacent two-part labels which have the same first
part will be merged by appending the second part of the
second label onto the first label separated by the string
specified in the separate-label-second-parts command
(initially, a comma followed by a space); the resulting
label will also be a two-part label with the same first
part as before merging, and so additional labels can be
merged into it. It is permissible for the first part to
be empty; this may be desirable for expressions used in
the short-label command.
(expr) The same as expr. Used for grouping.
The above expressions are listed in order of precedence (highest
first); & and | have the same precedence.
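The interplay of tentative labels, serial numbers, and the * operator can be sketched in Python. The model below corresponds roughly to appending %a* to a label expression: a letter is added only when another reference shares the same tentative label. It is an illustration under stated assumptions, not refer's algorithm. The function names are hypothetical, tentative labels are supplied as ready-made strings, and serial numbers above 26 are rejected because this page does not say how %a formats them.

```python
from collections import Counter


def serial_numbers(tentative_labels):
    """1 plus the number of earlier references with the same tentative label."""
    seen = Counter()
    serials = []
    for label in tentative_labels:
        seen[label] += 1
        serials.append(seen[label])
    return serials


def letter(serial):
    """Format a serial number in the manner of %a (a, b, c, ...)."""
    if not 1 <= serial <= 26:
        raise ValueError("this sketch only handles serial numbers 1 to 26")
    return chr(ord("a") + serial - 1)


def disambiguate(tentative_labels):
    """Append a letter only where a tentative label is shared."""
    totals = Counter(tentative_labels)
    serials = serial_numbers(tentative_labels)
    return [
        label + (letter(serial) if totals[label] > 1 else "")
        for label, serial in zip(tentative_labels, serials)
    ]
```

For the tentative labels Lesk1978, Tuthill1983, Lesk1978 the result is Lesk1978a, Tuthill1983, Lesk1978b.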
Macro interface
Each reference starts with a call to the macro ]-. The string [F
will be defined to be the label for this reference, unless the
no-label-in-reference command has been given. There then follows
a series of string definitions, one for each field: string [X
corresponds to field X. The register [P is set to 1 if the P
field contains a range of pages. The [T, [A and [O registers are
set to 1 according as the T, A and O fields end with any of .?!
(an end-of-sentence character). The [E register will be set to 1
if the [E string contains more than one name. The reference is
followed by a call to the ][ macro. The first argument to this
macro gives a number representing the type of the reference. If
a reference contains a J field, it will be classified as type 1,
otherwise if it contains a B field, it will be type 3, otherwise
if it contains a G or R field it will be type 4, otherwise if it
contains an I field it will be type 2, otherwise it will be
type 0. The second argument is a symbolic name for the type:
other, journal-article, book, article-in-book, or tech-report.
Groups of references that have been accumulated or are produced
by the bibliography command are preceded by a call to the ]<
macro and followed by a call to the ]> macro.
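The classification passed to the ][ macro follows a fixed order of tests, restated in the Python sketch below. The sketch also models the end-of-sentence test used for the [T, [A and [O registers. It is a reading of the prose above rather than refer's source, and the function names are hypothetical.

```python
# Symbolic names indexed by type number, in the order listed above.
TYPE_NAMES = ["other", "journal-article", "book", "article-in-book",
              "tech-report"]


def reference_type(fields):
    """Return (number, name) for a collection of field letters."""
    if "J" in fields:
        number = 1
    elif "B" in fields:
        number = 3
    elif "G" in fields or "R" in fields:
        number = 4
    elif "I" in fields:
        number = 2
    else:
        number = 0
    return number, TYPE_NAMES[number]


def ends_sentence(contents):
    """Return 1 if contents end with an end-of-sentence character, else 0."""
    return 1 if contents.endswith((".", "?", "!")) else 0
```

A record with only A, C, D, I, and T fields, like the one in the Examples section, is classified as type 2, book.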
Options
--help displays a usage message, while -v and --version show
version information; all exit afterward.
-R Don't recognize lines beginning with .R1/.R2.
Other options are equivalent to refer commands.
-a n          reverse An
-b            no-label-in-text; no-label-in-reference
-B            See below.
-c fields     capitalize fields
-C            compatible
-e            accumulate
-f n          label %n
-i fields     search-ignore fields
-k            label L~%a
-k field      label field~%a
-l            label A.nD.y%a
-l m          label A.n+mD.y%a
-l ,n         label A.nD.y-n%a
-l m,n        label A.n+mD.y-n%a
-n            no-default-database
-p db-file    database db-file
-P            move-punctuation
-s spec       sort spec
-S            label "(A.n|Q) ', ' (D.y|D)"; bracket-label " (" ) "; "
-t n          search-truncate n
The -B option has command equivalents with the addition that the
file names specified on the command line are processed as if they
were arguments to the bibliography command instead of in the
normal way.
-B              annotate X AP; no-label-in-reference
-B field.macro  annotate field macro; no-label-in-reference
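The four forms of the -l option differ only in which optional counts appear in the equivalent label command. A short Python sketch makes the pattern explicit. The function name label_command_for_l is hypothetical, and the sketch merely rebuilds the equivalences listed above without validating that m and n are numbers.

```python
def label_command_for_l(range_expression=""):
    """Return the label command equivalent to -l with the given argument."""
    # "m,n" splits into m and n; "m" alone leaves n empty; ",n" leaves m empty.
    m, _, n = range_expression.partition(",")
    first = f"+{m}" if m else ""   # first m characters of the surname
    last = f"-{n}" if n else ""    # last n characters of the year
    return f"label A.n{first}D.y{last}%a"
```

For instance, an argument of 3,2 gives label A.n+3D.y-2%a.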
Environment
REFER Assign this variable a file name to override the default
database.
Files
/usr/dict/papers/Ind
Default database.
file.i Index files.
/usr/local/share/groff/1.23.0/tmac/refer.tmac
defines macros and strings facilitating integration with
macro packages that wish to support refer.
refer uses temporary files. See the groff(1) man page for
details of where such files are created.
Bugs
In label expressions, <> expressions are ignored inside .char
expressions.
Examples
We can illustrate the operation of refer with a sample
bibliographic database containing one entry and a simple roff
document to cite that entry.
$ cat > my-db-file
%A Daniel P.\& Friedman
%A Matthias Felleisen
%C Cambridge, Massachusetts
%D 1996
%I The MIT Press
%T The Little Schemer, Fourth Edition
$ refer -p my-db-file
Read the book
.[
friedman
.]
on your summer vacation.
<Control+D>
.lf 1 -
Read the book\*([.1\*(.]
.ds [F 1
.]-
.ds [A Daniel P. Friedman and Matthias Felleisen
.ds [C Cambridge, Massachusetts
.ds [D 1996
.ds [I The MIT Press
.ds [T The Little Schemer, Fourth Edition
.nr [T 0
.nr [A 0
.][ 2 book
.lf 5 -
on your summer vacation.
The foregoing shows us that refer (a) produces a label “1”; (b)
brackets that label with interpolations of the “[.” and “.]”
strings; (c) calls a macro “]-”; (d) defines strings and
registers containing the label and bibliographic data for the
reference; (e) calls a macro “][”; and (f) uses the lf request to
restore the line numbers of the original input. As discussed in
subsection “Macro interface” above, it is up to the document or a
macro package to employ and format this information usefully.
Let us see how we might turn groff_ms(7) to this task.
$ REFER=my-db-file groff -R -ms
.LP
Read the book
.[
friedman
.]
on your summer vacation.
Commentary is available.\*{*\*}
.FS \*{*\*}
Space reserved for penetrating insight.
.FE
ms's automatic footnote numbering mechanism is not aware of
refer's label numbering, so we have manually specified a
(superscripted) symbolic footnote for our non-bibliographic
aside.
See also
“Refer — A Bibliography System”, by Bill Tuthill, 1983, Computing
Services, University of California, Berkeley.
“Some Applications of Inverted Indexes on the Unix System”, by M.
E. Lesk, 1978, AT&T Bell Laboratories Computing Science Technical
Report No. 69.
indxbib(1), lookbib(1), lkbib(1)
COLOPHON
This page is part of the groff (GNU troff) project. Information
about the project can be found at
⟨http://www.gnu.org/software/groff/⟩. If you have a bug report
for this manual page, see ⟨http://www.gnu.org/software/groff/⟩.
This page was obtained from the project's upstream Git repository
⟨https://git.savannah.gnu.org/git/groff.git⟩ on 2024-06-14. (At
that time, the date of the most recent commit that was found in
the repository was 2024-06-10.) If you discover any rendering
problems in this HTML version of the page, or you believe there
is a better or more up-to-date source for the page, or you have
corrections or improvements to the information in this COLOPHON
(which is not part of the original manual page), send a mail to
man-pages@man7.org