The PreTeXt Guide

Section 46.10 File Management

PreTeXt, at its core, is the formal specification of the XML vocabulary, as expressed in the DTD (Section 5.4). We have provided converters to process source files into useful output. However, we have not yet built a point-and-click application for the production of a book. So in a large project you need to take some responsibility for managing your files, both input and output. We have tried to provide flexible tools to make an author’s job easier. What follows is advice, and a set of practices, that we have successfully employed in several book projects.

Source.

I am fond of describing my own books with an initialism formed from the title. So A First Course in Linear Algebra becomes FCLA, and in file and directory names becomes fcla. I have a top-level directory books and then books/fcla, but this directory is not the book itself. It holds all the extra stuff that goes along with writing a book, much of it in books/fcla/local. The actual book, the part everybody sees with an open license, lives in books/fcla/fcla. This subdirectory has files like COPYING, which is a free software convention for license information, and README.md, a file in the simple Markdown format that GitHub picks up automatically and displays nicely on the main page of the book’s repository. Subdirectories include src for the actual XML files, xsl for any customizing XSL (Section 28.2), and script for shell scripts used to process the book (see below).
I do not use any additional directory structure below src to manage modular files for a book, since the XML and the --xinclude mechanism manage that just fine. I see little benefit to extra subdirectories for organization and some resulting inconvenience. I do typically have a single subdirectory src/images for raster images and other graphics files.
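As an illustration only, the layout described above can be sketched as a short Python function that creates an empty skeleton. The directory and file names are the ones from the text; the function name make_skeleton and the use of a scratch directory as the root are assumptions made for this sketch, and nothing here is part of PreTeXt.

```python
import tempfile
from pathlib import Path


def make_skeleton(root: Path, short_name: str = "fcla") -> Path:
    """Create the layout described in the text under `root`.

    Returns the inner directory that would become the git repository.
    """
    project = root / "books" / short_name   # extras that go along with writing the book
    repo = project / short_name             # the book itself, tracked by git
    (project / "local").mkdir(parents=True, exist_ok=True)
    for sub in ("src/images", "xsl", "script"):
        (repo / sub).mkdir(parents=True, exist_ok=True)
    for name in ("COPYING", "README.md"):
        (repo / name).touch()               # empty placeholders for the real files
    return repo


# Demonstration in a throwaway directory, so nothing permanent is created.
with tempfile.TemporaryDirectory() as scratch:
    repo = make_skeleton(Path(scratch))
    for path in sorted(repo.rglob("*")):
        print(path.relative_to(scratch))
```

Only the inner directory returned by the function would be placed under revision control; the local directory beside it stays untracked.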
I believe it is critically important to put your project under revision control and, if it is licensed openly, in a public GitHub repository. So the books/fcla/fcla directory, with all of its contents and subdirectories, is tracked as a git repository and hosted on GitHub. Because this directory is source, I try very hard never to have temporary files in it, since I do not want to accidentally incorporate them into the git repository. As a general rule of thumb, only original material goes in this directory, and anything that can be re-created belongs outside.
A tutorial on git would be way outside the scope of this guide, but Beezer and Farmer have written Git For Authors, so perhaps look for that.

Image Files.

Some images are raster images (e.g. photographs) that are not easily changed, and perhaps unlikely to be changed. Other images will come from source-level languages via the pretext script. For your convenience, this script has a command-line option that allows you to direct output (graphics files) to a directory of your choice.
In the early stages of writing a book, I put image files produced from source code in a directory outside of what is tracked by git. It is only when a project is very mature that I begin to add completed graphics files to the src/images directory for tracking by git.

Build Scripts.

When you have a mature book project, the various files, processing options, and a desire for multiple outputs can all get a bit confusing. Writing simple scripts is a good idea and the investment of time doing this early in a project will pay off through the course of further writing and editing. The particular setup you employ is less important.
I have fallen into the habit of using the make program. It allows me to define common variables up front (such as paths to the PreTeXt distribution and the main directory of the project). Then I can easily make “targets” for different outputs. So, for example, I typically run make pdf or make html to produce output, and have simple companion targets so that I can run make viewpdf or make viewhtml. Other targets do things like checking my source against the DTD (Section 5.4). I have split out the variable definitions so that a collaborator can join the project, edit the file of definitions just once to reflect their setup, and still pick up future upgrades to the script by pulling from GitHub without overwriting their local information.
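The idea of splitting out per-person definitions is not specific to make. Here is a minimal sketch of the same pattern in Python, assuming shared defaults live in the tracked script and each collaborator keeps an untracked local file of overrides. The setting names (pretext_dir, project_dir), their placeholder values, and the file name local_settings.json are all hypothetical.

```python
import json
from pathlib import Path

# Shared defaults, tracked in the repository. Placeholder values only.
DEFAULTS = {
    "pretext_dir": "/path/to/pretext",
    "project_dir": "/path/to/books/fcla",
}


def load_settings(local_file: Path) -> dict:
    """Return the defaults, overridden by a collaborator's local file if present.

    The local file is meant to stay out of revision control, so pulling
    updates to the shared script never overwrites anyone's paths.
    """
    settings = dict(DEFAULTS)
    if local_file.exists():
        overrides = json.loads(local_file.read_text(encoding="utf-8"))
        settings.update(overrides)
    return settings
```

A collaborator would create the local file once, for example containing {"pretext_dir": "/home/me/pretext"}, and the build script would call load_settings(Path("local_settings.json")) at startup.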
My use of make is a bit of an abuse, since it is really designed for large software projects, with the aim of reducing duplicative compilations, and that is not at all my purpose here. You could likely have exactly the same effect with a shell script and a case (or switch) statement.
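To make the alternative concrete, here is a sketch of the same dispatch-on-target idea, written in Python rather than shell, with a dictionary playing the role of the case statement. The target names mirror the ones mentioned above; the function bodies are placeholders that only report what they would do, since the real commands depend on your setup.

```python
def build_pdf() -> str:
    # Placeholder: a real script would run the PDF conversion here.
    return "would build PDF"


def build_html() -> str:
    # Placeholder: a real script would run the HTML conversion here.
    return "would build HTML"


def view_pdf() -> str:
    # Placeholder: a real script would open the PDF in a viewer.
    return "would open PDF"


def view_html() -> str:
    # Placeholder: a real script would open the HTML in a browser.
    return "would open HTML"


# One entry per target, like the branches of a case statement.
TARGETS = {
    "pdf": build_pdf,
    "html": build_html,
    "viewpdf": view_pdf,
    "viewhtml": view_html,
}


def run(target: str) -> str:
    """Look up a target by name and run it, rejecting unknown names."""
    if target not in TARGETS:
        known = ", ".join(sorted(TARGETS))
        raise ValueError(f"unknown target {target!r}; expected one of: {known}")
    return TARGETS[target]()


print(run("pdf"))
```

Unlike make, this does no dependency tracking at all, which matches the observation that avoiding duplicate compilation is not the goal.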
My general strategy is to assemble all the necessary files into a temporary directory (under /tmp in Linux) by copying them out of their permanent home, copy customizing XSL into the right place (typically mathbook/user), run the pretext script as necessary and direct the results to the right place, and finally copy results out of the temporary directory if they are meant to be permanent. Interestingly, an exception to staging all these files is the source of the book itself, which is only read during each conversion and is not needed alongside the output. So you can just point directly to a top-level file and the xinclude mechanism locates any other necessary source files.
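The stage, build, and copy-out strategy can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's actual script: the function name staged_build and its parameters are hypothetical, and the build command is passed in by the caller because the exact invocation of the pretext script is not given here.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import Sequence


def staged_build(
    inputs: Sequence[Path],
    command: Sequence[str],
    results: Sequence[str],
    dest: Path,
) -> list[Path]:
    """Stage inputs in a temporary directory, build there, keep named results.

    `inputs` are files or directories copied into the staging area,
    `command` is whatever conversion you run (supplied by the caller),
    `results` are file names, relative to the staging area, to keep,
    and `dest` is the permanent home for those results.
    """
    dest.mkdir(parents=True, exist_ok=True)
    kept = []
    with tempfile.TemporaryDirectory() as scratch:
        work = Path(scratch)
        # 1. Copy the necessary files out of their permanent home.
        for item in inputs:
            if item.is_dir():
                shutil.copytree(item, work / item.name)
            else:
                shutil.copy2(item, work / item.name)
        # 2. Run the build inside the staging directory.
        subprocess.run(list(command), cwd=work, check=True)
        # 3. Copy out only what is meant to be permanent.
        for name in results:
            kept.append(Path(shutil.copy2(work / name, dest / name)))
    # The staging directory and everything else in it is now gone.
    return kept
```

The book's source is deliberately not among the staged inputs in this design: the command can point straight at the top-level source file in its permanent location.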
A good example of this general strategy is the use and placement of image files for HTML output. It is your responsibility to place images into the location your resulting HTML files expect to locate them. By default, this is a subdirectory of the directory holding the HTML files, named images. You will want to copy images, such as photographs, out of your main source directory (src/images?). But you may be actively modifying source code for diagrams, and you want to re-run the pretext script for each run, and make sure the output of the script is directed to the correct subdirectory for the HTML output. Running the pretext script frequently can get tiresome, so maybe you have a makefile target make diagrams that updates a permanent directory, outside of your tracked files in the repository, and you copy those files into the correct subdirectory for the output. That way, you can update images only when you are actively editing them, or when you are producing a draft that you want to be as up-to-date as possible. As a project matures, you can add images into the directory tracked by git so they are available to others without getting involved with the pretext script.
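As a sketch of the image-placement step just described, the following Python function copies image files from one or more source directories into the images subdirectory of an HTML output directory. The function name place_images is hypothetical; the directories you pass would be your own, such as a tracked src/images and an untracked directory of generated diagrams.

```python
import shutil
from pathlib import Path


def place_images(html_dir: Path, *image_dirs: Path) -> int:
    """Copy every file from each image directory into html_dir/images.

    Directories are processed in order, so a file in a later directory
    replaces a file of the same name from an earlier one.
    Returns the number of files copied.
    """
    target = html_dir / "images"
    target.mkdir(parents=True, exist_ok=True)
    copied = 0
    for source in image_dirs:
        for item in sorted(source.iterdir()):
            if item.is_file():          # subdirectories are skipped in this sketch
                shutil.copy2(item, target / item.name)
                copied += 1
    return copied
```

Listing the directory of generated diagrams last means a freshly regenerated diagram wins over an older copy of the same name that was already added to the tracked directory.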
We did not say it would be easy, but we feel much of this sort of project management is outside the scope of the PreTeXt project itself while it is in its initial stages, and existing tools to manage the complexity are available and documented. (We have been encouraged to create sample scripts, which we may do.) Just remember the strategy: stage necessary components in a temporary directory, build output in that directory, copy out desired semi-permanent results, and limit additions to the source directory to that which is original, or mature and time-consuming to reproduce.