Section 46.10 File Management
PreTeXt, at its core, is the formal specification of the XML vocabulary, as expressed in the DTD (Section 5.4). We have provided converters to process source files into useful output. However, we have not yet built a point-and-click application for the production of a book. So in a large project you need to take some responsibility for managing your files, both input and output. We have tried to provide flexible tools to make an author’s job easier. What follows is advice, along with practices we have successfully employed in several book projects.
I am fond of describing my own books with an initialism formed from the title. So A First Course in Linear Algebra becomes FCLA, and in file and directory names becomes fcla. So I have a top-level directory books/fcla, but this directory is not the book itself; it holds all the extra stuff that goes along with writing a book, much of it in books/fcla/local. The actual book, the part everybody sees with an open license, lives in books/fcla/fcla. This subdirectory has files like COPYING, which is a free software standard for license information, and README.md, which is a file in the simple Markdown format that is picked up automatically by GitHub and displayed nicely on the main page of the book’s repository. Subdirectories include src for the actual XML files, xsl for any customizing XSL (Section 28.2), and script for shell scripts used to process the book (see below).
I do not use any additional directory structure below src to manage modular files for a book, since the XML and the --xinclude mechanism manage that just fine. I see little benefit to extra subdirectories for organization and some resulting inconvenience. I do typically have a single subdirectory src/images for raster images and other graphics files.
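As a concrete picture of the layout described above, here is a minimal shell sketch that creates the same skeleton in a scratch location. The ROOT variable and the use of mktemp are assumptions made for illustration; only the directory and file names come from the text.

```shell
#!/bin/sh
# Sketch only: builds an empty copy of the layout in a scratch directory.
set -eu

ROOT="$(mktemp -d)"                  # hypothetical scratch location
BOOK="$ROOT/books/fcla"

mkdir -p "$BOOK/local"               # private extras, not part of the open book
mkdir -p "$BOOK/fcla/src/images"     # XML source, plus raster images
mkdir -p "$BOOK/fcla/xsl"            # customizing XSL
mkdir -p "$BOOK/fcla/script"         # processing scripts
touch "$BOOK/fcla/COPYING" "$BOOK/fcla/README.md"

# List the result, relative to the scratch root.
(cd "$ROOT" && find books | sort)
```

Note that src has no subdirectories other than images, which matches the flat arrangement preferred here.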
I believe it is critically important to put your project under revision control and, if licensed openly, in a public GitHub repository. So the books/fcla/fcla directory, with all of its contents and subdirectories, is tracked as a git repository and hosted on GitHub. Because this directory is source, I try very hard to never have any temporary files in it, since I do not want to accidentally incorporate them into the git repository. As a general rule of thumb, only original material goes in this directory and anything that can be re-created belongs outside.
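One way to notice stray files before they are committed is to ask git for the paths it is not tracking. The following is a small sketch, not a description of the author's own workflow; the repository is a throwaway created with mktemp, and the file names fcla.xml and fcla.aux are hypothetical.

```shell
#!/bin/sh
# Sketch only: shows how untracked (possibly temporary) files surface in git.
set -eu

REPO="$(mktemp -d)"                    # throwaway repository for the demonstration
cd "$REPO"
git init --quiet

mkdir src
printf '<book/>\n' > src/fcla.xml      # original material, meant to be tracked
printf 'scratch\n' > src/fcla.aux      # stand-in for a stray temporary file
git add src/fcla.xml

# In porcelain output, untracked paths are the lines that begin with "??".
STRAYS="$(git status --porcelain --untracked-files=all | grep '^??' || true)"
echo "$STRAYS"
```

Anything this reports is either original material that still needs to be added, or something re-creatable that belongs outside the tracked directory.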
A tutorial on git would be way outside the scope of this guide, but Beezer and Farmer have written Git For Authors, so perhaps look for that.
Some images are raster images (e.g. photographs) that are not easily changed, and perhaps unlikely to be changed. Other images will come from source-level languages via the pretext script. For your convenience, this script has a command-line option that allows you to direct output (graphics files) to a directory of your choice.
In the early stages of writing a book, I put image files produced from source code in a directory outside of what is tracked by git. It is only when a project is very mature that I begin to include completed graphics files in the src/images directory for tracking by git.
When you have a mature book project, the various files, processing options, and a desire for multiple outputs can all get a bit confusing. Writing simple scripts is a good idea and the investment of time doing this early in a project will pay off through the course of further writing and editing. The particular setup you employ is less important.
I have fallen into the habit of using the make program. It allows me to define common variables up front (such as paths to the PreTeXt distribution and the main directory for the project it applies to). Then I can easily make “targets” for different outputs. So, for example, I typically run make html to produce output, and have simple companion targets so that I can run make viewhtml. Other targets do things like checking my source against the DTD (Section 5.4). I have split out the variable definitions in a way that a collaborator can join the project and simply edit the file of definitions just once to reflect their setup, and still participate in future upgrades to the script by pulling from GitHub without overwriting their local information.
My use of make is a bit of an abuse, since it is really designed for large software projects, with the aim of reducing duplicative compilations, and that is not at all my purpose here. You could likely have exactly the same effect with a shell script and a case (or switch) statement.
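To illustrate the shell-script alternative just mentioned, here is a minimal sketch that uses a case statement in place of make targets. It is not the author's actual makefile. The paths, the top-level file name fcla.xml, the stylesheet name mathbook-html.xsl, and the DTD location are all assumptions; xdg-open is just one way to open a browser on Linux. By default the sketch only prints the commands it would run, so it can be tried without any of those tools installed.

```shell
#!/bin/sh
# Sketch only: a case statement standing in for make targets.
set -e

# Definitions a collaborator would edit once. Every value is a placeholder.
PTX="${PTX:-$HOME/mathbook}"            # PreTeXt distribution (assumed location)
PRJ="${PRJ:-$HOME/books/fcla/fcla}"     # tracked project directory
OUT="${OUT:-/tmp/fcla}"                 # scratch output directory
MAIN="$PRJ/src/fcla.xml"                # top-level source file (assumed name)

# With DRY_RUN=1 (the default here) commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi
}

build() {
    case "$1" in
        html)
            run mkdir -p "$OUT/html"
            # Stylesheet file name is an assumption about the distribution.
            run xsltproc --xinclude --output "$OUT/html/" \
                "$PTX/xsl/mathbook-html.xsl" "$MAIN"
            ;;
        viewhtml)
            run xdg-open "$OUT/html/index.html"
            ;;
        check)
            # The DTD location is likewise an assumption.
            run xmllint --xinclude --postvalid --noout \
                --dtdvalid "$PTX/schema/dtd/mathbook.dtd" "$MAIN"
            ;;
        *)
            echo "usage: build {html|viewhtml|check}" >&2
            return 2
            ;;
    esac
}

# Example invocation; a real script would end with: build "$1"
build html
```

Because the three definitions at the top read from the environment first, a collaborator could keep their own values in a small untracked file and source it before running the script, which serves the same purpose as the separate file of definitions described above.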
My general strategy is to assemble all the necessary files into a temporary directory (under /tmp in Linux) by copying them out of their permanent home, copy customizing XSL into the right place (typically mathbook/user), run the pretext script as necessary and direct the results to the right place, and finally copy results out of the temporary directory if they are meant to be permanent. Interestingly, an exception to staging all these files is the source of the book itself, which is only read for each conversion and then not needed for the output. So you can just point directly to a top-level file and the xinclude mechanism locates any other necessary source files.
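The staging strategy above can be sketched in a few lines of shell. This is an illustration under stated assumptions rather than a working build: the project and distribution are toy stand-ins created under mktemp, the file names are hypothetical, and the convert function is only a placeholder for the real conversion (xsltproc or the pretext script).

```shell
#!/bin/sh
# Sketch only: stage, build, copy out. All paths and names are stand-ins.
set -eu

# Toy project and distribution so the sketch runs anywhere.
WORK="$(mktemp -d)"
PRJ="$WORK/books/fcla/fcla"      # tracked source
PTX="$WORK/mathbook"             # stand-in for the distribution
DEST="$WORK/published"           # where permanent results go
mkdir -p "$PRJ/src/images" "$PRJ/xsl" "$PTX/user" "$DEST"
printf '<book/>\n' > "$PRJ/src/fcla.xml"
printf 'photo\n' > "$PRJ/src/images/portrait.jpg"
printf '<!-- custom -->\n' > "$PRJ/xsl/fcla-html.xsl"

# Placeholder for the real conversion step.
convert() { printf '<html>built from %s</html>\n' "$1" > "$2/index.html"; }

# 1. Stage: copy what the output needs into a temporary directory,
#    and put customizing XSL where the distribution expects it.
STAGE="$(mktemp -d)"
mkdir -p "$STAGE/html/images"
cp "$PRJ"/src/images/* "$STAGE/html/images/"
cp "$PRJ"/xsl/*.xsl "$PTX/user/"

# 2. Build: the book's source is read in place, never copied.
convert "$PRJ/src/fcla.xml" "$STAGE/html"

# 3. Copy out anything meant to be permanent, then discard the staging area.
cp -R "$STAGE/html" "$DEST/"
rm -rf "$STAGE"
```

The source directory is only ever read, so nothing temporary can leak into the tracked repository.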
A good example of this general strategy is the use and placement of image files for HTML output. It is your responsibility to place images into the location where your resulting HTML files expect to find them. By default, this is a subdirectory of the directory holding the HTML files, named images. You will want to copy images, such as photographs, out of your main source directory (src/images?). But you may be actively modifying source code for diagrams, and you want to re-run the pretext script for each run, and make sure the output of the script is directed to the correct subdirectory for the HTML output. Running the pretext script frequently can get tiresome, so maybe you have a makefile target make diagrams that updates a permanent directory, outside of your tracked files in the repository, and you copy those files into the correct subdirectory for the output. That way, you can update images only when you are actively editing them, or when you are producing a draft that you want to be as up-to-date as possible. As a project matures, you can add images into the directory tracked by git so they are available to others without getting involved with the pretext script.
We did not say it would be easy, but we feel much of this sort of project management is outside the scope of the PreTeXt project itself, while in its initial stages, and existing tools to manage the complexity are available and documented. (We have been encouraged to create sample scripts, which we may do.) Just remember the strategy: stage necessary components in a temporary directory, build output in that directory, copy out desired semi-permanent results, and limit additions to the source directory to that which is original, or mature and time-consuming to reproduce.