Section 5.11 File Management
PreTeXt, at its core, is the formal specification of the XML vocabulary, as expressed in the DTD (Section 5.3). We have provided converters to process source files into useful output. However, we have not yet built a point-and-click application for the production of a book. So in a large project you need to take some responsibility for managing your files, both input and output. We have tried to provide flexible tools to make an author's job easier. What follows is advice and practices we have employed successfully in several book projects.
I am fond of describing my own books with an initialism formed from the title. So A First Course in Linear Algebra becomes FCLA, and in file and directory names becomes
fcla. So I have a top-level directory
books and then
books/fcla, but this directory is not the book itself. It holds all the extra material that goes along with writing a book, much of it in
books/fcla/local. The actual book, the part everybody sees with an open license, lives in
books/fcla/fcla. This subdirectory has files like
COPYING, which is a free software standard for license information, and
README.md, which is a file in the simple Markdown format that GitHub picks up automatically and displays nicely on the main page of the book's repository. Subdirectories include
src for the actual XML files,
xsl for any customizing XSL (Section 5.6), and
script for shell scripts used to process the book (see below).
I do not use any additional directory structure below
src to manage modular files for a book, since the XML and the
--xinclude mechanism manage that just fine. I see little benefit to extra subdirectories for organization and some resulting inconvenience. I do typically have a single subdirectory
src/images for raster images and other graphics files.
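As a rough illustration, the layout just described can be created with a few shell commands. This is only a sketch: the ROOT variable and the scratch location from mktemp are assumptions made so the example is safe to run, and only the directory and file names come from the description above.

```shell
#!/bin/sh
# Sketch of the directory layout described above. ROOT is a hypothetical
# variable; by default everything is created in a scratch directory.
ROOT="${ROOT:-$(mktemp -d)}"

# Extras that accompany the writing of the book (not the book itself).
mkdir -p "$ROOT/books/fcla/local"

# The book proper: the openly licensed part that everybody sees.
BOOK="$ROOT/books/fcla/fcla"
mkdir -p "$BOOK/src/images" "$BOOK/xsl" "$BOOK/script"

# License information, and the README that GitHub displays.
touch "$BOOK/COPYING" "$BOOK/README.md"

# List what was created.
find "$ROOT/books" | sort
```

Setting ROOT to your home directory before running would create the real thing rather than a scratch copy.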
I believe it is critically important to put your project under revision control, and if licensed openly, in a public GitHub repository. So the
books/fcla/fcla directory, along with all of its contents and subdirectories, is tracked as a
git repository and hosted on GitHub. Because this directory holds source, I try very hard never to leave temporary files anywhere in it, since I do not want to accidentally incorporate them into the
git repository. As a general rule of thumb, only original material goes in this directory and anything that can be re-created belongs outside.
A tutorial on
git is well outside the scope of this guide, but Beezer and Farmer have written Git For Authors, so you might look for that.
Some images are raster images (e.g. photographs) that are not easily changed, and perhaps unlikely to be changed. Other images will come from source-level languages via the
pretext script. For your convenience, this script has a command-line option that allows you to direct output (graphics files) to a directory of your choice.
In the early stages of writing a book, I put image files produced from source code in a directory outside of what is tracked by
git. It is only when a project is very mature that I begin to include completed graphics files into the
src/images directory for tracking by
git.
When you have a mature book project, the various files, processing options, and a desire for multiple outputs can all get a bit confusing. Writing simple scripts is a good idea and the investment of time doing this early in a project will pay off through the course of further writing and editing. The particular setup you employ is less important.
I have fallen into the habit of using the
make program. It allows me to define common variables upfront (such as paths to the PreTeXt distribution and the main directory for the project it applies to). Then I can easily make “targets” for different outputs. So, for example I typically go
make pdf or
make html to produce output, and have simple companion targets so that I can go
make viewpdf or
make viewhtml. Other targets do things like checking my source against the DTD (Section 5.3). I have split out the variable definitions so that a collaborator can join the project, edit the file of definitions just once to reflect their setup, and still receive future upgrades to the script by pulling from GitHub without overwriting their local information.
My use of
make is a bit of an abuse, since it is really designed for large software projects, with the aim of avoiding redundant compilation, and that is not at all my purpose here. You could likely have exactly the same effect with a shell script and a case (or switch) statement.
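A minimal sketch of such a shell script follows, assuming conversions are run with xsltproc. Every path, the master file name fcla.xml, the stylesheet names mathbook-latex.xsl and mathbook-html.xsl, the output file names, and the use of xdg-open as a viewer are assumptions for illustration, not a description of any actual project script. By default the sketch only prints the commands it would run, so it is safe to try.

```shell
#!/bin/sh
# Hypothetical alternative to a makefile: definitions up front, then one
# case statement. All paths and file names are assumptions.

# Definitions a collaborator edits once to match their own setup.
MBX="${MBX:-$HOME/mathbook}"            # PreTeXt distribution
PRJ="${PRJ:-$HOME/books/fcla/fcla}"     # tracked project directory
OUT="${OUT:-/tmp/fcla}"                 # scratch area for output
MAIN="$PRJ/src/fcla.xml"                # master source file (assumed name)

# Dry run by default: commands are printed rather than executed.
# Export RUN as an empty string to execute them for real.
RUN="${RUN-echo}"

build() {
    case "$1" in
        pdf)
            $RUN mkdir -p "$OUT" &&
            $RUN xsltproc --xinclude -o "$OUT/fcla.tex" \
                "$MBX/xsl/mathbook-latex.xsl" "$MAIN" &&
            $RUN pdflatex -output-directory="$OUT" "$OUT/fcla.tex"
            ;;
        html)
            # A trailing slash tells xsltproc to write into a directory,
            # which must already exist.
            $RUN mkdir -p "$OUT/html" &&
            $RUN xsltproc --xinclude -o "$OUT/html/" \
                "$MBX/xsl/mathbook-html.xsl" "$MAIN"
            ;;
        viewpdf)
            $RUN xdg-open "$OUT/fcla.pdf"
            ;;
        viewhtml)
            $RUN xdg-open "$OUT/html/index.html"
            ;;
        *)
            echo "usage: build {pdf|html|viewpdf|viewhtml}" >&2
            return 1
            ;;
    esac
}

# Dispatch on the first command-line argument, defaulting to HTML.
build "${1:-html}"
```

The definitions at the top could equally live in a separate, untracked file that the script sources, so that pulling an updated script never touches a collaborator's local paths.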
My general strategy is to assemble all the necessary files into a temporary directory (under
/tmp in Linux) by copying them out of their permanent home, copy customizing XSL into the right place (typically
mathbook/user), run the
pretext script as necessary and direct the results to the right place, and finally copy results out of the temporary directory if they are meant to be permanent. Interestingly, one exception to staging all these files is the source of the book itself, which is only read during each conversion and is not needed afterward for the output. So you can just point directly to a master file and the
xinclude mechanism locates any other necessary source files.
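The strategy above might be sketched in shell as follows. This is an illustration under stated assumptions, not a tested production script: the function names, the argument order, and the file names are all hypothetical, and the convert function is a stub standing in for whatever real conversion command your project uses.

```shell
#!/bin/sh
# Sketch of the staging strategy. All names here are hypothetical.

# Stand-in converter: arguments are MAIN-FILE and OUTPUT-DIR. Replace the
# body with the real conversion command for your project.
convert() {
    printf 'built from %s\n' "$1" > "$2/index.html"
}

# stage_and_build MAIN XSL_DIR USER_DIR DEST
#   MAIN      master source file, read in place (never copied)
#   XSL_DIR   customizing XSL kept with the project
#   USER_DIR  where customizations are expected (e.g. mathbook/user)
#   DEST      semi-permanent home for finished output
stage_and_build() {
    main=$1 xsl_dir=$2 user_dir=$3 dest=$4

    # 1. Fresh scratch directory for everything that can be re-created.
    stage=$(mktemp -d "${TMPDIR:-/tmp}/stage.XXXXXX") || return 1

    # 2. Copy customizing XSL into the right place.
    mkdir -p "$user_dir" && cp "$xsl_dir"/*.xsl "$user_dir"/ || return 1

    # 3. Build in the scratch directory, pointing straight at the source.
    convert "$main" "$stage" || return 1

    # 4. Copy out what is meant to last, then remove the scratch directory.
    mkdir -p "$dest" && cp -R "$stage"/. "$dest"/
    status=$?
    rm -rf "$stage"
    return $status
}

# Demonstration in a throwaway sandbox.
sandbox=$(mktemp -d "${TMPDIR:-/tmp}/demo.XXXXXX")
mkdir -p "$sandbox/fcla/src" "$sandbox/fcla/xsl"
echo '<book/>' > "$sandbox/fcla/src/fcla.xml"
echo '<!-- customizations -->' > "$sandbox/fcla/xsl/custom.xsl"
stage_and_build "$sandbox/fcla/src/fcla.xml" "$sandbox/fcla/xsl" \
    "$sandbox/mathbook/user" "$sandbox/output/html"
ls "$sandbox/output/html"
```

Note that the source file is passed by path and never copied into the scratch directory, matching the exception described above.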
A good example of this general strategy is the use and placement of image files for HTML output. It is your responsibility to place images into the location your resulting HTML files expect to locate them. By default, this is a subdirectory of the directory holding the HTML files, named
images. You will want to copy images, such as photographs, out of your main source directory (
src/images?). But you may be actively modifying source code for diagrams, and you want to re-run the
pretext script for each run, and make sure the output of the script is directed to the correct subdirectory for the HTML output. Running the
pretext script frequently can get tiresome, so maybe you have a makefile target
make diagrams that updates a permanent directory, outside of your tracked files in the repository, and you copy those files into the correct subdirectory for the output. That way, you can update images only when you are actively editing them, or when you are producing a draft that you want to be as up-to-date as possible. As a project matures, you can add images into the directory tracked by
git so they are available to others without getting involved with the
pretext script.
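Placing images for HTML output could look something like the sketch below. The function name place_images, the sample file names, and the separate diagrams directory are assumptions for illustration. The only detail taken from the text is that the HTML output looks for images in a subdirectory named images.

```shell
#!/bin/sh
# place_images HTML_DIR SOURCE_DIR...
# Copies the contents of each SOURCE_DIR into HTML_DIR/images, the default
# location where the HTML files expect to find images.
place_images() {
    html_dir=$1
    shift
    mkdir -p "$html_dir/images" || return 1
    for dir in "$@"; do
        # Skip sources that do not exist yet (e.g. no diagrams built).
        [ -d "$dir" ] || continue
        cp -R "$dir"/. "$html_dir/images"/ || return 1
    done
}

# Demonstration in a throwaway sandbox.
demo=$(mktemp -d "${TMPDIR:-/tmp}/images.XXXXXX")
mkdir -p "$demo/fcla/src/images" "$demo/diagrams" "$demo/html"
: > "$demo/fcla/src/images/photo.jpg"   # stable image, tracked with source
: > "$demo/diagrams/figure-1.svg"       # re-creatable, kept outside the repo
place_images "$demo/html" "$demo/fcla/src/images" "$demo/diagrams"
ls "$demo/html/images"
```

Listing the tracked directory first and the generated diagrams second means a freshly built diagram replaces an older tracked copy of the same name, which may or may not be what you want as a project matures.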
We did not say it would be easy, but we feel much of this sort of project management is outside the scope of the PreTeXt project itself, at least while it is in its initial stages, and existing tools to manage the complexity are available and documented. (We have been encouraged to create sample scripts, which we may do.) Just remember the strategy: stage necessary components in a temporary directory, build output in that directory, copy out desired semi-permanent results, and limit additions to the source directory to material that is original, or mature and time-consuming to reproduce.