Section 6.2 Validation Plus
The RELAX-NG schema is very good at specifying “parent-child” relationships, in other words, which elements can nest directly within other elements. But we have situations where the possible elements depend on grandparents, great-grandparents, or older ancestors. An example is the <var> element, which is only useful if it is contained somewhere within a <webwork> element. You can describe these situations with RELAX-NG, but it becomes cumbersome and redundant. So our strategy is to allow some prohibited situations in the RELAX-NG schema, and use an additional stylesheet to identify them. Continuing our WeBWorK example, the RELAX-NG schema makes it appear that <var> can be used in many places, but the “validation-plus” stylesheet provides a helpful message indicating you have used it outside the context of a WeBWorK problem.
You have put a lot of time and effort into your source, and we want to help you make the best possible output. A little more effort from you will allow us to make the fine distinctions that produce really high-quality output. So this stylesheet is our best attempt to help you make the very best possible source. It is full of (automated) advice and warnings.
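Returning to the <var> example, the idea of an ancestor-dependent check can be illustrated with a single XPath expression. The sketch below is not how the validation-plus stylesheet is implemented; it only assumes that the xmllint utility is installed, and the function name count_stray_var is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of PreTeXt.
# Counts <var> elements that have no <webwork> ancestor anywhere above them,
# a condition that depends on ancestors rather than on the direct parent.
count_stray_var() {
    # --xinclude resolves included files before the XPath query is evaluated
    xmllint --xinclude --xpath 'count(//var[not(ancestor::webwork)])' "$1"
}

# Example use (path taken from the sample command in this section):
#   count_stray_var aota/animals.xml
```

A nonzero count means some <var> sits outside any WeBWorK problem. The stylesheet goes further by reporting where each one is.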
To use this stylesheet, simply apply it at the command line with xsltproc like any other stylesheet.
xsltproc -xinclude -o report.txt schema/pretext-validation-plus.xsl aota/animals.xml
The output will be a text file that indicates the suspect element by its location in the document tree.
You may get lots of output on first use, especially if your source was born “somewhere else,” not meant for use by PreTeXt. We could make improvements in managing all this output, but for now we have one suggestion. Sorting on the actual messages relayed, rather than reading a hodgepodge of messages in document order, can help you identify consistent situations that you might be able to fix in bulk. First, apply the stylesheet again, but now use the stringparam single.line.output set to the value yes (Section 28.1). As you might suspect, this puts each message on a single line, with the message text in the third “field”, which can be used by the command-line utility sort:
cat report.txt | sort -k 3 > report-sorted.txt
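Once the report is in single-line form, a frequency count of the messages can show which problems are worth fixing in bulk. The following is a minimal sketch. It assumes, as the sort -k 3 command implies, that fields are separated by whitespace and that the message begins at the third field. The function name summarize_report is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of PreTeXt.
# Prints each distinct message once, preceded by the number of times it
# occurs, most frequent first.
summarize_report() {
    # Blank out the first two fields, trim the leading spaces that remain,
    # then count identical messages.
    awk '{ $1 = ""; $2 = ""; sub(/^ +/, ""); print }' "$1" \
        | sort | uniq -c | sort -rn
}

# Example use, after producing the single-line report:
#   xsltproc -xinclude -stringparam single.line.output yes -o report.txt \
#       schema/pretext-validation-plus.xsl aota/animals.xml
#   summarize_report report.txt
```

Messages near the top of this list are the consistent situations most likely to reward a single bulk fix.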
We once used Schematron for this purpose. Its author, Rick Jelliffe, says “Schematron is a feather duster to reach the corners that other schema languages cannot reach.” Our additional stylesheet is similar.
Why do we have two tools for validation? We have explained the necessity of an extra stylesheet. Why not describe the entire grammar in this stylesheet? The reason is that RELAX-NG is a recognized standard, so it can be converted to other formats, and it may also be used by XML editors or integrated development environments (IDEs) to provide features like code completion. Besides, it would be very tedious to provide all the code for checking everything that is possible and everything that is not.