A Programmer's Guide to Gnus

It is my hope that other people will figure out smart stuff that Gnus can do, and that other people will write those smart things as well. To facilitate that I thought it would be a good idea to describe the inner workings of Gnus. And some of the not-so-inner workings, while I'm at it.

You can never expect the internals of a program not to change, but I will be defining (in some detail) the interface between Gnus and its backends (this is written in stone), the format of the score files (ditto), data structures (some are less likely to change than others) and general methods of operation.

Backend Interface

Gnus doesn't know anything about nntp, spools, mail or virtual groups. It only knows how to talk to virtual servers. A virtual server is a backend and some backend variables. As examples of the first, we have nntp, nnspool and nnmbox. As examples of the latter we have nntp-port-number and nnmbox-directory.

When Gnus asks for information from a backend -- say nntp -- on something, it will normally include a virtual server name in the function parameters. (If not, the backend should use the "current" virtual server.) For instance, nntp-request-list takes a virtual server as its only (optional) parameter. If this virtual server hasn't been opened, the function should fail.

Note that a virtual server name has no relation to some physical server name. Take this example:

(nntp "odd-one" 
      (nntp-address "ifi.uio.no") 
      (nntp-port-number 4324))

Here the virtual server name is `"odd-one"' while the name of the physical server is `"ifi.uio.no"'.

The backends should be able to switch between several virtual servers. The standard backends implement this by keeping an alist of virtual server environments that they pull down/push up when needed.
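The alist approach can be sketched roughly as follows. This is only an illustration, not how any standard backend actually does it: every name beginning with my-nnchoke- is hypothetical, and the sketch only installs saved values. It does not save the outgoing environment, so it assumes each definition list sets every variable the backend cares about.

```elisp
;; Hypothetical backend variables; none of these names come from Gnus.
(defvar my-nnchoke-address nil "Physical host of the current server.")
(defvar my-nnchoke-port 119 "Port number of the current server.")

(defvar my-nnchoke-server-alist nil
  "Alist mapping virtual server names to lists of (VARIABLE VALUE).")
(defvar my-nnchoke-current-server nil
  "Name of the virtual server whose environment is installed.")

(defun my-nnchoke-define-server (server definitions)
  "Remember DEFINITIONS, a list of (VARIABLE VALUE), under the name SERVER."
  (setq my-nnchoke-server-alist
        (cons (cons server definitions)
              ;; Drop any older entry with the same name.
              (delq (assoc server my-nnchoke-server-alist)
                    my-nnchoke-server-alist))))

(defun my-nnchoke-select-server (server)
  "Install the saved environment of SERVER and make it current.
Return SERVER if it is known, nil otherwise."
  (let ((entry (assoc server my-nnchoke-server-alist)))
    (when entry
      (dolist (definition (cdr entry))
        (set (car definition) (cadr definition)))
      (setq my-nnchoke-current-server server))))
```

After defining a server named "odd-one" with an address and a port, selecting it sets both variables, while selecting an unknown name returns nil and changes nothing.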

There are two groups of interface functions: required functions, which must be present, and optional functions, whose presence Gnus will always check for before attempting to call them.

All these functions are expected to return data in the buffer nntp-server-buffer (`" *nntpd*"'), which is somewhat unfortunately named, but we'll have to live with it. When I talk about "resulting data", I always refer to the data in that buffer. When I talk about "return value", I talk about the function value returned by the function call.

Some backends could be said to be server-forming backends, and some might be said to not be. The latter are backends that generally only operate on one group at a time, and have no concept of "server" -- they have a group, and they deliver info on that group and nothing more.

In the examples and definitions I will refer to the imaginary backend nnchoke.
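To make the difference between "resulting data" and "return value" concrete, here is a minimal sketch of a backend function that produces both. It is not taken from any real backend: the my-nnchoke- names and the group table are hypothetical, a stand-in variable is used instead of nntp-server-buffer so that the code runs without Gnus loaded, and returning non-nil on success is an assumption of the sketch.

```elisp
(defvar my-nnchoke-data-buffer " *nntpd*"
  "Stand-in for `nntp-server-buffer' so that this sketch runs without Gnus.")

(defvar my-nnchoke-groups '(("ifi.discussion" 56 1000 1059))
  "Hypothetical group table; each entry is (NAME TOTAL LOWEST HIGHEST).")

(defun my-nnchoke-request-group (group &optional _server)
  "Put a status line for GROUP in the data buffer.
The line in the buffer is the result data.  The return value is t
if GROUP is known and nil otherwise (an assumption of this sketch)."
  (let ((entry (assoc group my-nnchoke-groups)))
    (with-current-buffer (get-buffer-create my-nnchoke-data-buffer)
      (erase-buffer)
      (if entry
          (insert (format "211 %d %d %d %s\n"
                          (nth 1 entry) (nth 2 entry) (nth 3 entry) group))
        (insert "411 No such group\n"))
      (and entry t))))
```

The caller looks at the function value to learn whether the request worked, and then reads the details out of the data buffer.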

Required Backend Functions

(nnchoke-retrieve-headers ARTICLES &optional GROUP SERVER)
articles is either a range of article numbers or a list of Message-IDs. Current backends do not fully support either: they only accept sequences (lists) of article numbers, and most do not support retrieval by Message-ID. But they should try for both. The result data should be either HEADs or NOV lines, and the return value should be either headers or nov to reflect this. This might later be expanded to various, which will be a mixture of HEADs and NOV lines, but this is currently not supported by Gnus. Here's an example HEAD:
221 1056 Article retrieved.
Path: ifi.uio.no!sturles
From: sturles@ifi.uio.no (Sturle Sunde)
Newsgroups: ifi.discussion
Subject: Re: Something very droll
Date: 27 Oct 1994 14:02:57 +0100
Organization: Dept. of Informatics, University of Oslo, Norway
Lines: 26
Message-ID: <38o8e1$a0o@holmenkollen.ifi.uio.no>
References: <38jdmq$4qu@visbur.ifi.uio.no>
NNTP-Posting-Host: holmenkollen.ifi.uio.no
.
So a headers return value would imply that there's a number of these in the data buffer. Here's a BNF definition of such a buffer:
headers        = *head
head           = error-message / valid-head
error-message  = [ "4" / "5" ] 2number " " <error message> eol
valid-head     = valid-message *header "." eol
valid-message  = "221 " <number> " Article retrieved." eol
header         = <text> eol
If the return value is nov, the data buffer should contain network overview database lines. These are basically fields separated by tabs.
nov-buffer = *nov-line
nov-line   = 8*9 [ field <TAB> ] eol
field      = <text except TAB>
For a closer explanation of what should be in those fields, see section Headers.
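A backend that returns nov has to write such lines itself. The following is a small sketch of one way to do that; my-nov-format-line is a hypothetical helper, not a Gnus function, and replacing stray TABs and newlines with spaces is an assumption about what a careful backend would want to do.

```elisp
(defun my-nov-format-line (fields)
  "Return FIELDS, a list of strings, as one NOV line.
Each field is followed by a TAB, as in the BNF above.  TABs and
newlines inside a field are replaced by spaces so that the line
stays parsable."
  (concat (mapconcat (lambda (field)
                       (concat (replace-regexp-in-string "[\t\n]" " " field)
                               "\t"))
                     fields "")
          "\n"))
```

Calling it with a list of eight or nine strings gives one line ready to be inserted into the data buffer.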
(nnchoke-open-server SERVER &optional DEFINITIONS)
server is here the virtual server name. definitions is a list of (VARIABLE VALUE) pairs that defines this virtual server. If the server can't be opened, no error should be signaled. The backend may then choose to refuse further attempts at connecting to this server. In fact, it should do so. If the server is opened already, this function should return a non-nil value. There should be no data returned.
(nnchoke-close-server &optional SERVER)
Close connection to server and free all resources connected to it. There should be no data returned.
(nnchoke-request-close)
Close connection to all servers and free all resources that the backend has reserved. All buffers that have been created by that backend should be killed. (Not the nntp-server-buffer, though.) There should be no data returned.
(nnchoke-server-opened &optional SERVER)
This function should return whether server is opened and whether the connection to it is still alive. This function should under no circumstances attempt to reconnect to a server that it has lost its connection to. There should be no data returned.
(nnchoke-status-message &optional SERVER)
This function should return the last error message from server. There should be no data returned.
(nnchoke-request-article ARTICLE &optional GROUP SERVER TO-BUFFER)
The result data from this function should be the article specified by article. This might either be a Message-ID or a number. It is optional whether to implement retrieval by Message-ID, but it would be nice if that were possible. If to-buffer is non-nil, the result data should be returned in this buffer instead of the normal data buffer. This makes it possible to avoid copying large amounts of data from one buffer to another, since Gnus mainly requests articles to be inserted directly into its article buffer.
(nnchoke-open-group GROUP &optional SERVER)
Make group the current group. There should be no data returned by this function.
(nnchoke-request-group GROUP &optional SERVER)
Get data on group. This function also has the side effect of making group the current group. Here's an example of some result data and a definition of the same:
211 56 1000 1059 ifi.discussion
The first number is the status, which should be `211'. Next is the total number of articles in the group, the lowest article number, the highest article number, and finally the group name. Note that the total number of articles may be less than one might think while just considering the highest and lowest article numbers, since some articles may have been cancelled. Gnus just discards the total-number, so whether one should take the bother to generate it properly (if that is a problem) is left as an exercise to the reader.
group-status = [ error / info ] eol
error        = [ "4" / "5" ] 2<number> " " <Error message>
info         = "211 " 3* [ <number> " " ] <string>
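As an illustration of the format, here is a sketch of how a status line could be taken apart. The function name my-parse-group-status is hypothetical, and treating every reply other than 211 as a failure is a simplification.

```elisp
(defun my-parse-group-status (line)
  "Parse LINE, a group status line without its end-of-line.
Return a list (TOTAL LOWEST HIGHEST GROUP) for a 211 reply, or nil
for anything else, including the 4xx and 5xx error replies."
  (when (string-match
         "\\`211 \\([0-9]+\\) \\([0-9]+\\) \\([0-9]+\\) \\(.+\\)\\'" line)
    (list (string-to-number (match-string 1 line))
          (string-to-number (match-string 2 line))
          (string-to-number (match-string 3 line))
          (match-string 4 line))))
```

For the example line above, this gives the three numbers 56, 1000 and 1059 followed by the string "ifi.discussion".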
(nnchoke-close-group GROUP &optional SERVER)
Close group and free any resources connected to it. This will be a no-op on most backends. There should be no data returned.
(nnchoke-request-list &optional SERVER)
Return a list of all groups available on server. And that means all. Here's an example from a server that only carries two groups:
ifi.test 0000002200 0000002000 y
ifi.discussion 3324 3300 n
On each line we have a group name, then the highest article number in that group, the lowest article number, and finally a flag.
active-file = *active-line
active-line = name " " <number> " " <number> " " flags eol
name        = <string>
flags       = "n" / "y" / "m" / "x" / "j" / "=" name
The flag says whether the group is read-only (`n'), is moderated (`m'), is dead (`x'), is aliased to some other group (`=other-group'), or none of the above (`y').
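A line in this format can be split up with a single regular expression. The sketch below uses a hypothetical helper, my-parse-active-line, and simply returns the flag as a string without interpreting it.

```elisp
(defun my-parse-active-line (line)
  "Parse LINE, one active-file line without its end-of-line.
Return (NAME HIGHEST LOWEST FLAGS), or nil if LINE is malformed."
  (when (string-match
         "\\`\\([^ ]+\\) \\([0-9]+\\) \\([0-9]+\\) \\([^ ]+\\)\\'" line)
    (list (match-string 1 line)
          (string-to-number (match-string 2 line))
          (string-to-number (match-string 3 line))
          (match-string 4 line))))
```

Leading zeros in the article numbers, as in the first example line, do not matter once the numbers have been converted.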
(nnchoke-request-post &optional SERVER)
This function should post the current buffer. It might return whether the posting was successful or not, but that's not required. If, for instance, the posting is done asynchronously, it has generally not been completed by the time this function concludes. In that case, this function should set up some kind of sentinel to beep the user loud and clear if the posting could not be completed. There should be no result data from this function.
(nnchoke-request-post-buffer POST GROUP SUBJECT HEADER ARTICLE-BUFFER INFO FOLLOW-TO RESPECT-POSTER)
This function should return a buffer suitable for composing an article to be posted by nnchoke-request-post. If post is non-nil, this is not a followup, but a totally new article. group is the name of the group to be posted to. subject is the subject of the message. article-buffer is the buffer being followed up, if that is the case. info is the group info. follow-to is the group that one is supposed to re-direct the article to. If respect-poster is non-nil, the special `"poster"' value of a Followup-To header is to be respected. There should be no result data returned.

Optional Backend Functions

(nnchoke-retrieve-groups GROUPS &optional SERVER)
groups is a list of groups, and this function should request data on all those groups. How it does it is of no concern to Gnus, but it should attempt to do this in a speedy fashion. The return value of this function can be either active or group, which says what the format of the result data is. The former is in the same format as the data from nnchoke-request-list, while the latter is a buffer full of lines in the same format as nnchoke-request-group gives.
group-buffer = *active-line / *group-status
(nnchoke-request-update-info GROUP INFO &optional SERVER)
A Gnus group info (see section Group Info) is handed to the backend for alterations. This comes in handy if the backend really carries all the information (as is the case with virtual and imap groups). This function may alter the info in any manner it sees fit, and should return the (altered) group info. This function may alter the group info destructively, so no copying is needed before boogying. There should be no result data from this function.
(nnchoke-request-scan &optional GROUP SERVER)
This function may be called at any time (by Gnus or anything else) to request that the backend check for incoming articles, in one way or another. A mail backend will typically read the spool file or query the POP server when this function is invoked. The group doesn't have to be heeded -- if the backend decides that it is too much work just scanning for a single group, it may do a total scan of all groups. It would be nice, however, to keep things local if that's practical. There should be no result data from this function.
(nnchoke-request-asynchronous GROUP &optional SERVER ARTICLES)
This is a request to fetch articles asynchronously later. articles is an alist of (article-number line-number). One would generally expect that if one later fetches article number 4, for instance, the articles after 4 (which might be 5, 6, 7 or 11, 3, 909 depending on the order in that alist) would be fetched asynchronously, but that is left up to the backend. Gnus doesn't care. There should be no result data from this function.
(nnchoke-request-group-description GROUP &optional SERVER)
The result data from this function should be a description of group.
description-line = name <TAB> description eol
name             = <string>
description      = <text>
(nnchoke-request-list-newsgroups &optional SERVER)
The result data from this function should be the description of all groups available on the server.
description-buffer = *description-line
(nnchoke-request-newgroups DATE &optional SERVER)
The result data from this function should be all groups that were created after `date', which is in normal human-readable date format. The data should be in the active buffer format.
(nnchoke-request-create-groups GROUP &optional SERVER)
This function should create an empty group with name group. There should be no return data.
(nnchoke-request-expire-articles ARTICLES &optional GROUP SERVER FORCE)
This function should run the expiry process on all articles in the articles range (which is currently a simple list of article numbers.) It is left up to the backend to decide how old articles should be before they are removed by this function. If force is non-nil, all articles should be deleted, no matter how new they are. This function should return a list of articles that it did not/was not able to delete. There should be no result data returned.
(nnchoke-request-move-article ARTICLE GROUP SERVER ACCEPT-FORM &optional LAST)
This function should move article (which is a number) from group by calling accept-form. This function should ready the article in question for moving by removing any header lines it has added to the article, and generally should "tidy up" the article. Then it should eval accept-form in the buffer where the "tidy" article is. This will do the actual copying. If this eval returns a non-nil value, the article should be removed. If last is nil, that means that there is a high likelihood that there will be more requests issued shortly, so that allows some optimizations. There should be no data returned.
(nnchoke-request-accept-article GROUP &optional LAST)
This function takes the current buffer and inserts it into group. If last is nil, that means that there will be more calls to this function in short order. There should be no data returned.
(nnchoke-request-replace-article ARTICLE GROUP BUFFER)
This function should remove article (which is a number) from group and insert buffer there instead. There should be no data returned.

Score File Syntax

Score files are meant to be easily parsable, yet extremely malleable. It was decided that something that had the same read syntax as an Emacs Lisp list would fit that spec.

Here's a typical score file:

(("summary"
  ("win95" -10000 nil s)
  ("Gnus"))
 ("from"
  ("Lars" -1000))
 (mark -100))

BNF definition of a score file:

score-file       = "" / "(" *element ")"
element          = rule / atom
rule             = string-rule / number-rule / date-rule
string-rule      = "(" quote string-header quote space *string-match ")"
number-rule      = "(" quote number-header quote space *number-match ")"
date-rule        = "(" quote date-header quote space *date-match ")"
quote            = <ascii 34>
string-header    = "subject" / "from" / "references" / "message-id" / 
                   "xref" / "body" / "head" / "all" / "followup"
number-header    = "lines" / "chars"
date-header      = "date"
string-match     = "(" quote <string> quote [ "" / [ space score [ "" / 
                   space date [ "" / [ space string-match-t ] ] ] ] ] ")"
score            = "nil" / <integer>
date             = "nil" / <natural number>
string-match-t   = "nil" / "s" / "substring" / "S" / "Substring" / 
                   "r" / "regex" / "R" / "Regex" /
                   "e" / "exact" / "E" / "Exact" /
                   "f" / "fuzzy" / "F" / "Fuzzy"
number-match     = "(" <integer> [ "" / [ space score [ "" / 
                   space date [ "" / [ space number-match-t ] ] ] ] ] ")"
number-match-t   = "nil" / "=" / "<" / ">" / ">=" / "<="
date-match       = "(" quote <string> quote [ "" / [ space score [ "" / 
                   space date [ "" / [ space date-match-t ] ] ] ] ] ")"
date-match-t     = "nil" / "at" / "before" / "after"
atom             = "(" [ required-atom / optional-atom ] ")"
required-atom    = mark / expunge / mark-and-expunge / files /
                   exclude-files / read-only / touched
optional-atom    = adapt / local / eval 
mark             = "mark" space nil-or-number
nil-or-number    = "nil" / <integer>
expunge          = "expunge" space nil-or-number
mark-and-expunge = "mark-and-expunge" space nil-or-number
files            = "files" *[ space <string> ]
exclude-files    = "exclude-files" *[ space <string> ]
read-only        = "read-only" [ space "nil" / space "t" ]
adapt            = "adapt" [ space "nil" / space "t" / space adapt-rule ]
adapt-rule       = "(" *[ <string> *[ "(" <string> <integer> ")" ] ] ")"
local            = "local" *[ space "(" <string> space <form> ")" ]
eval             = "eval" space <form>
space            = *[ " " / <TAB> / <NEWLINE> ]

Any unrecognized elements in a score file should be ignored, but not discarded.

As you can see, white space is needed, but the type and amount of white space is irrelevant. This means that formatting of the score file is left up to the programmer -- if it's simpler to just spew it all out on one looong line, then that's ok.

The meanings of the various atoms are explained elsewhere in this manual.
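Since a score file has the read syntax of a Lisp list, the Lisp reader can do all the parsing. The sketch below shows the idea; my-score-read and my-score-matches are hypothetical helpers, not the functions Gnus itself uses, and the sketch takes the file contents as a string rather than reading a file.

```elisp
(defun my-score-read (text)
  "Read TEXT, the contents of a score file, and return it as a Lisp list.
An empty or all-whitespace file yields nil."
  (if (string-match-p "\\`[ \t\n]*\\'" text)
      nil
    (car (read-from-string text))))

(defun my-score-matches (score header)
  "Return the match entries stored under the string HEADER in SCORE."
  (cdr (assoc header score)))
```

With the typical score file shown earlier, asking for "from" returns the single entry for "Lars", and the mark atom can be found with assq on the symbol mark. Elements the program does not recognize stay in the list untouched, so they survive when the list is written back out.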

Headers

Internally, Gnus uses a format for storing article headers that corresponds to the NOV format in a mysterious fashion. One could almost suspect that the author looked at the NOV specification and just shamelessly stole the entire thing, and one would be right.

Header is a severely overloaded term. "Header" is used in RFC1036 to talk about lines in the head of an article (e.g., From). It is used by many people as a synonym for "head" -- "the header and the body". (That should be avoided, in my opinion.) And Gnus uses a format internally that it calls "header", which is what I'm talking about here. This is a 9-element vector, basically, with each header (ouch) having one slot.

These slots are, in order: number, subject, from, date, id, references, chars, lines, xref. There are macros for accessing and setting these slots -- they all have predictable names beginning with mail-header- and mail-header-set-, respectively.

The xref slot is really a misc slot. Any extra info will be put in there.
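To show how a NOV line maps onto the nine slots, here is a rough sketch that builds such a vector by hand. It deliberately does not use the mail-header- macros, so that it runs without Gnus; my-header-from-nov and my-header-subject are hypothetical, and storing number, chars and lines as integers is an assumption of the sketch.

```elisp
(defun my-header-from-nov (line)
  "Turn LINE, one NOV line without its end-of-line, into a 9-element vector.
The slots are number, subject, from, date, id, references, chars,
lines and xref, in that order.  Missing fields become empty strings."
  (let ((fields (split-string line "\t"))
        (header (make-vector 9 nil)))
    (dotimes (i 9)
      (aset header i (or (nth i fields) "")))
    ;; Assumption: number, chars and lines are kept as integers.
    (dolist (i '(0 6 7))
      (aset header i (string-to-number (aref header i))))
    header))

(defun my-header-subject (header)
  "Return the subject slot of HEADER."
  (aref header 1))
```

An empty chars field, which is what you get when the byte count is not known, simply ends up as 0.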

Ranges

GNUS introduced a concept that I found so useful that I've started using it a lot and have elaborated on it greatly.

The question is simple: If you have a large number of objects that are identified by numbers (say, articles, to take a wild example) that you want to classify as being "included", a normal sequence isn't very useful. (A 200,000 length sequence is a bit long-winded.)

The solution is as simple as the question: You just collapse the sequence.

(1 2 3 4 5 6 10 11 12)

is transformed into

((1 . 6) (10 . 12))

To avoid having those nasty `(13 . 13)' elements to denote a lonesome object, a `13' is a valid element:

((1 . 6) 7 (10 . 12))

This means that comparing two ranges to find out whether they are equal is slightly tricky:

((1 . 5) 7 8 (10 . 12))

and

((1 . 5) (7 . 8) (10 . 12))

are equal. In fact, any non-descending list is a range:

(1 2 3 4 5)

is a perfectly valid range, although a pretty longwinded one. This is also legal:

(1 . 5)

and is equal to the previous range.

Here's a BNF definition of ranges. Of course, one must remember the semantic requirement that the numbers are non-descending. (Any number of repetitions of the same number is allowed, but they are apt to disappear in range handling.)

range           = simple-range / normal-range
simple-range    = "(" number " . " number ")"
normal-range    = "(" contents ")"
contents        = "" / simple-range *[ " " contents ] / 
                  number *[ " " contents ]
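Collapsing, expanding and comparing ranges can be sketched in a few lines of Lisp. This is not the code Gnus uses; the my-range- functions are hypothetical, and comparing two ranges by expanding both is only sensible for small ranges, since it throws away the very compactness that ranges exist to provide.

```elisp
(defun my-range-compress (numbers)
  "Collapse NUMBERS, a sorted list of distinct integers, into a range."
  (let ((result nil))
    (while numbers
      (let* ((low (car numbers))
             (high low))
        (setq numbers (cdr numbers))
        ;; Swallow every number that directly continues the current run.
        (while (and numbers (= (car numbers) (1+ high)))
          (setq high (car numbers)
                numbers (cdr numbers)))
        (push (if (= low high) low (cons low high)) result)))
    (nreverse result)))

(defun my-range-uncompress (range)
  "Expand RANGE into a sorted list of numbers without duplicates."
  (let ((elements (if (and (consp range)
                           (numberp (car range))
                           (numberp (cdr range)))
                      (list range)      ; a bare simple-range like (1 . 5)
                    range))
        (result nil))
    (dolist (element elements)
      (if (consp element)
          (let ((n (car element)))
            (while (<= n (cdr element))
              (push n result)
              (setq n (1+ n))))
        (push element result)))
    (delete-dups (sort result #'<))))

(defun my-range-equal-p (a b)
  "Return non-nil if the ranges A and B contain the same numbers."
  (equal (my-range-uncompress a) (my-range-uncompress b)))
```

With these definitions, (1 2 3 4 5) and (1 . 5) compare as equal, and so do two ranges that differ only in whether neighbouring numbers are written singly or as a pair.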

Gnus currently uses ranges to keep track of read articles and article marks. I plan on implementing a number of range operators in C if The Powers That Be are willing to let me. (I haven't asked yet, because I need to do some more thinking on what operators I need to make life totally range-based without ever having to convert back to normal sequences.)

Group Info

Gnus stores all permanent info on groups in a group info list. This list is from three to six elements (or more) long and exhaustively describes the group.

Here are two example group infos; one is a very simple group while the second is a more complex one:

("no.group" 5 (1 . 54324))

("nnml:my.mail" 3 ((1 . 5) 9 (20 . 55))
                ((tick (15 . 19)) (replied 3 6 (19 . 23)))
                (nnml "")
                (auto-expire (to-address "ding@ifi.uio.no")))

The first element is the group name as Gnus knows the group; the second is the group level; the third is the read articles in range format; the fourth is a list of article marks lists; the fifth is the select method; and the sixth contains the group parameters.

Here's a BNF definition of the group info format:

info          = "(" group space level space read 
                [ "" / [ space marks-list [ "" / [ space method [ "" /
                space parameters ] ] ] ] ] ")" 
group         = quote <string> quote
level         = <integer in the range of 1 to inf>
read          = range
marks-list    = nil / "(" *marks ")"
marks         = "(" <string> range ")"
method        = "(" <string> *elisp-forms ")"
parameters    = "(" *elisp-forms ")"

Actually that `marks' rule is a fib. A `marks' is a `<string>' consed on to a `range', but that's a bitch to say in pseudo-BNF.
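Because a group info is an ordinary list, its parts can be picked out by position. The accessors below are a hypothetical sketch with made-up my-info- names, not the ones Gnus defines. The mark lookup assumes that mark names are symbols, as they are in the example infos above.

```elisp
(defun my-info-group (info) "Return the group name of INFO." (nth 0 info))
(defun my-info-level (info) "Return the group level of INFO." (nth 1 info))
(defun my-info-read (info) "Return the read range of INFO." (nth 2 info))
(defun my-info-marks (info) "Return the marks lists of INFO." (nth 3 info))
(defun my-info-method (info) "Return the select method of INFO." (nth 4 info))
(defun my-info-params (info) "Return the parameters of INFO." (nth 5 info))

(defun my-info-marked (info mark)
  "Return the range of articles in INFO that carry MARK, a symbol.
Each marks entry is MARK consed onto a range, so the range is the
cdr of the entry."
  (cdr (assq mark (my-info-marks info))))
```

For a short info with only three elements, the last three accessors simply return nil, which matches the optional parts of the BNF.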
