Hassle-free delivery of a multi-file Lisp app

Assume the end-user has an ANSI-compliant Common Lisp, and downloads your application, which you distribute as a tarball named, say, coffee.tar.bz2, that unpacks to a directory called coffee. What is the easiest way for them to use your app?

If your distribution contains the single Lisp file coffee.lisp, the user can load it into their CL, thus,

(load "coffee")

and use whatever functions, macros, and variables you provided as the external interface. They can even do

(compile-file "coffee")

and then

(load "coffee")

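is smart enough, in most implementations, to pick up the faster compiled file, which has the same basename as the source file, but with the file extension changed from .lisp to something like .fasl, .fas, .dx64fsl, or .sse2f (depending on the user’s CL implementation). Strictly speaking, the standard leaves the choice between source and compiled file to the implementation when no extension is given.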
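If you are curious which extension a given implementation uses, you can ask it directly. This is only a small sketch; the helper name compiled-extension is made up for the illustration, while compile-file-pathname and pathname-type are standard.

```lisp
;; Hypothetical helper: report the extension COMPILE-FILE would use
;; for its output in the running implementation.
(defun compiled-extension (source)
  (pathname-type (compile-file-pathname source)))

;; No file needs to exist for this query.
(compiled-extension "coffee.lisp")   ; e.g. "fasl" on SBCL
```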

Typically, though, your distribution contains several Lisp files spread across several directories, to ease your development process, i.e. to keep things neat and maintainable. coffee.lisp may load caf.lisp and decaf.lisp; caf.lisp may then load espresso.lisp, mocha.lisp and cappuccino.lisp; and decaf.lisp may load a completely different mocha.lisp and cappuccino.lisp (in some other directory of course). The various calls to load use relative pathnames, viz.,

(load (merge-pathnames ... *load-pathname*))

So the user can still load the “main” file coffee.lisp, as before:

(load "coffee")

and expect things to work. Just to consolidate our example situation, let’s say the directory structure of the files in the distribution is as follows:

coffee/coffee.lisp
coffee/type/caf.lisp
coffee/type/decaf.lisp
coffee/caf/espresso.lisp
coffee/caf/mocha.lisp
coffee/caf/cappuccino.lisp
coffee/decaf/mocha.lisp
coffee/decaf/cappuccino.lisp
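To see how a relative pathname resolves against the main file in such a layout, here is a small sketch. The install location /home/user/coffee/ is an assumption made up for the example, and the pathnames are built with make-pathname so that nothing depends on how a particular implementation parses namestrings.

```lisp
;; Hypothetical install location; any absolute directory would do.
(defparameter *main-file*
  (make-pathname :directory '(:absolute "home" "user" "coffee")
                 :name "coffee" :type "lisp"))

;; The relative part: subdirectory "type", file "caf", no extension.
(defparameter *relative-caf*
  (make-pathname :directory '(:relative "type") :name "caf"))

;; MERGE-PATHNAMES appends the relative directory to the default's
;; directory and fills in the missing type from the default.
(defparameter *resolved-caf*
  (merge-pathnames *relative-caf* *main-file*))

(pathname-directory *resolved-caf*) ; => (:ABSOLUTE "home" "user" "coffee" "type")
(pathname-name *resolved-caf*)      ; => "caf"
(pathname-type *resolved-caf*)      ; => "lisp", inherited from *main-file*
```

Note that the extension is inherited from whatever pathname you merge against, so the same merge performed against a compiled main file would yield that file's extension instead.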

Since all these several coffee-related files must be introducing symbols galore, you of course want to define a package coffee, so that the coffee symbols don’t clash with any symbols the user has, either of their own, or from other packages. So you put

(defpackage :coffee
  (:use :cl)
  (:export :WHATEVER ...))

(in-package :coffee)

at the head of coffee.lisp. This should be enough, because when coffee.lisp loads other files, and these other files load still other files, each call to load binds the special variable *package* to its current value for the duration of that load, so all the files will inherit coffee as their *package*. However, for the purpose of documentation, or maybe even to play nice with any slimy text-editor setup you may have, you may still want to put

(in-package :coffee)

at the head of each constituent file. It doesn’t hurt.
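You can watch load bind *package* without touching the file system, since load also accepts a stream. The following is just a sketch: the package coffee-demo and the variable names are invented for the demonstration, and a string stands in for a small source file.

```lisp
(defpackage :coffee-demo (:use :cl))   ; hypothetical package for this demo

(defvar *package-before-load* *package*)

;; LOAD accepts a character stream, so a string can play the role of a file
;; that starts with IN-PACKAGE.
(with-input-from-string
    (source "(in-package :coffee-demo)
             (defparameter *package-during-load* *package*)")
  (load source))

;; Inside the load, IN-PACKAGE took effect ...
(package-name
 (symbol-value (find-symbol "*PACKAGE-DURING-LOAD*" :coffee-demo)))
;; => "COFFEE-DEMO"

;; ... but the caller's *package* was restored when LOAD returned.
(eq *package* *package-before-load*)   ; => T
```

A nested load issued from inside the loaded text would start out with coffee-demo as its *package*, which is the inheritance described above.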

Let’s say the user likes your app, but finds it somewhat on the slow side, and wants to speed things up with

(compile-file "coffee")

This is when most app deliverers reach for ASDF. ASDF, however, requires some non-trivial effort on the part of the user (and of course, much more on the part of the developer too). The user has to download ASDF, they have to create a registry directory, and for every app they download, they have to put the .asd file in the registry directory, and they have to follow a new syntax to load the app. I wondered if it was possible to have something really simple instead, where the user approach remains the same as for a program consisting of a single Lisp file. I.e., the user calls load on the main file, and they can optionally compile-file that main file.

Turns out there is a protocol — and quite an easy one — for securing this. Again, as with adding defpackage and in-package, the little bit of extra code goes just in the main coffee.lisp file. We place one requirement: all the names of the Lisp files and subdirectories they are in can be identified with Lisp symbols without any escaping mechanism. Thus, the file and directory names are lowercase and are allowed to use the same characters that a Lisp symbol can, provided they don’t interfere with the OS’s own convention for unadorned file naming. (Thus, the name can’t be a number even though the OS doesn’t mind, and it can’t contain a slash even though Lisp doesn’t mind.)
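To make the naming requirement concrete, here is a short sketch of the round trip between a file name and a symbol. It assumes the standard readtable (which upcases symbol names) and the default read base of 10; the helper name symbol->relative-path is made up for the example.

```lisp
;; Hypothetical helper: turn a symbol such as TYPE/CAF back into "type/caf".
(defun symbol->relative-path (symbol)
  (string-downcase (symbol-name symbol)))

;; A lowercase path with a slash reads as an ordinary symbol, no escaping needed.
(symbolp (read-from-string "type/caf"))                  ; => T
(symbol->relative-path (read-from-string "type/caf"))    ; => "type/caf"

;; Mixed case does not survive the trip, hence the lowercase requirement.
(symbol->relative-path (read-from-string "Type/Caf"))    ; => "type/caf"

;; A purely numeric name reads as a number, not as a symbol.
(symbolp (read-from-string "42"))                        ; => NIL
```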

Recall that coffee.lisp loads caf.lisp and decaf.lisp. First, make sure that you don’t specify the extension .lisp in the calls to load, as we want the compiled versions of the files to be picked up whenever possible.

Now add an expression to coffee.lisp that compiles these subfiles whenever coffee.lisp itself is compiled. We could try

(eval-when (:compile-toplevel)
  (compile-file (merge-pathnames "type/caf" *compile-file-pathname*))
  (compile-file (merge-pathnames "type/decaf" *compile-file-pathname*)))

But we don’t just want to compile type/caf.lisp and type/decaf.lisp; we also want to compile the other files that these files load, viz., caf/espresso.lisp, caf/mocha.lisp, and caf/cappuccino.lisp; and decaf/mocha.lisp and decaf/cappuccino.lisp. And we want them compiled in the right order, because these files depend on each other in a certain order. Instead of explicitly writing the various compile-files in the right order, we can simply specify the dependencies in a succinct makefile-like way, and have a Lisp macro take care of calling compile-file on the files in the correct topological order. For this we use the macro do-in-topological-order, defined in the file dotopo.lisp.
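dotopo.lisp itself is not reproduced here. To give a rough idea of what such a helper might look like, here is a minimal depth-first sketch. It is not the actual contents of dotopo.lisp: the names topological-order and do-in-topological-order-sketch are invented for this illustration, and the real macro may well differ. Each clause is assumed to list an item followed by the items it depends on.

```lisp
;; Hypothetical stand-in, NOT the contents of dotopo.lisp.
(defun topological-order (clauses)
  "CLAUSES is a list of (item dependency*) lists.
Return every item mentioned, each one after all of its dependencies."
  (let ((done '())        ; items already emitted, most recent first
        (visiting '()))   ; items on the current path, for cycle detection
    (labels ((visit (item)
               (cond ((member item done))          ; already handled
                     ((member item visiting)
                      (error "Circular dependency involving ~S" item))
                     (t
                      (push item visiting)
                      ;; Items without a clause simply have no dependencies.
                      (dolist (dep (rest (assoc item clauses)))
                        (visit dep))
                      (pop visiting)
                      (push item done)))))
      (dolist (clause clauses)
        (visit (first clause))))
    (nreverse done)))

(defmacro do-in-topological-order-sketch (function &rest clauses)
  "Call FUNCTION on each symbol named in CLAUSES, dependencies first."
  `(mapc ,function (topological-order ',clauses)))

;; Usage: collect the order instead of compiling anything.
(defparameter *coffee-order*
  (let ((order '()))
    (do-in-topological-order-sketch
        (lambda (f) (push f order))
      (coffee type/caf type/decaf)
      (type/caf caf/espresso caf/mocha caf/cappuccino)
      (type/decaf decaf/mocha decaf/cappuccino))
    (reverse order)))
;; => (CAF/ESPRESSO CAF/MOCHA CAF/CAPPUCCINO TYPE/CAF
;;     DECAF/MOCHA DECAF/CAPPUCCINO TYPE/DECAF COFFEE)
```

The main file comes out last, after everything it depends on, which is the property the compilation step relies on.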

do-in-topological-order’s first argument is a function, and its remaining arguments express the dependencies among the symbols representing the various files: each clause names a file followed by the files it depends on. It topologically arranges the symbols, and then calls the function on them, ensuring that a file is treated before any file depending on it. Here’s a more maintainable way to add the compile-files to coffee.lisp, assuming dotopo.lisp is in the same directory as coffee.lisp:

(eval-when (:compile-toplevel)
  (load (merge-pathnames "dotopo" *compile-file-pathname*)))

(eval-when (:compile-toplevel)
  (do-in-topological-order
      (lambda (f)
        (compile-file (merge-pathnames (string-downcase (symbol-name f))
                                       *compile-file-pathname*)))
    (coffee
     type/caf type/decaf)
    (type/caf
     caf/espresso caf/mocha caf/cappuccino)
    (type/decaf
     decaf/mocha decaf/cappuccino)))

This is almost right, except for two issues: First, one of the files that do-in-topological-order’s first argument will attempt to compile is coffee.lisp, the file that we are already in the process of compiling when this expression is encountered! So we add a conditional to do-in-topological-order’s first argument that skips coffee itself and so prevents this recursive compilation:

(eval-when (:compile-toplevel)
  (do-in-topological-order
      (lambda (f)
        (UNLESS (EQ F 'COFFEE)
          (compile-file (merge-pathnames (string-downcase (symbol-name f))
                                         *compile-file-pathname*))))
    ...))

The second issue has to do with file B using a special variable introduced in another file A. In some implementations, compile-file may issue a warning that file B has an “undeclared free variable”. To avoid this annoyance, make the introduction of the special variable in file A visible to compile-file, e.g.,

(EVAL-WHEN (:COMPILE-TOPLEVEL :LOAD-TOPLEVEL :EXECUTE)
  (defvar *bean-type*))
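As a small sketch of the two sides of this arrangement: the eval-when form plays the role of file A, and the function bean-label (a name invented for this example) plays the role of a file B that refers to the variable freely. Because the defvar is also evaluated at compile time, a later compile-file in the same Lisp session already knows the variable is special.

```lisp
;; "File A": introduce the variable at compile time as well as load time.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defvar *bean-type*))          ; proclaimed special, left unbound

;; "File B" (hypothetical): a free reference to the special variable.
(defun bean-label ()
  (format nil "~(~A~) beans" *bean-type*))

;; Since *BEAN-TYPE* is special, this LET is a dynamic binding
;; and is therefore visible inside BEAN-LABEL.
(let ((*bean-type* :arabica))
  (bean-label))                  ; => "arabica beans"
```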

That’s all there is to it.

To summarize:

1. You include dotopo.lisp in your distribution alongside the main file;

2. include the above changes to the content of that main file; and

3. make sure that all special variables used outside their files are made visible to the compiler.

The user who unpacks your tarball then simply loads your main file, either as source, or after compile-file-ing it. They do not have to worry about dealing with the other files at all, so long as the latter’s relative path to the main file is not altered.


2009-05-03