Section 5.28 URLs and External References
Subsection 5.28.1 URLs to External Web Pages
The
<url>
element is used to point to
external web pages or other online resources (as distinct from pointing to other portions of your current document, which is accomplished with the
<xref/>
element,
Section 4.4). The
@href
attribute is always necessary, as it contains the full and complete address of the external page or resource. Include everything the
URL needs, such as the protocol, since this is most reliable and, as you will see, the full address never needs to be visible. The element always accepts, and then uses, a
@visual
attribute that supplies a friendlier version of the address. Finally, the content of the element, which becomes the clickable text in electronic formats, can be authored with the full range of PreTeXt markup generally available in a title or sentence. A typical use might look like
<url href="https://example.com/" visual="example.com">Demo Site</url>
This will render as
Demo Site 1 . Note the automatic footnote providing the visual version in a monospace font. If a
<url>
has content, and no
@visual
attribute is given, then the
@href
will be placed in a footnote, though there will be an attempt to remove standard protocols. Compare
<url href="https://example.com/">Demo Site</url>
<url href="mailto:nobody@example.com">Bouncing Email</url>
If you do not provide any content for a
<url/>
element, then the clickable text will be the actual
URL with a preference for the (optional)
@visual
attribute, rather than the mandatory
@href
attribute. This can disrupt the flow of your text, and so is a poor alternative to the content version just discussed (see
Best Practice 5.28.1). But it might be a good choice in something like a list of interesting web sites. Whether or not a simplified version of the address, via the
@visual
attribute, is desirable will depend on the application. As an example, using the optional
@visual
attribute we have
<url href="https://example.com/" visual="example.com"/>
This will render as
example.com
. Note that there is no footnote since the visual version is already apparent.
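As a sketch of the list-of-web-sites case mentioned above, content-free elements might be authored as follows. The addresses and the introductory sentence are placeholders chosen for illustration, not taken from any real project.

```xml
<p>
  Some sites worth a visit:
  <ul>
    <!-- no content, so the @visual text becomes the clickable text -->
    <li><url href="https://example.com/" visual="example.com"/></li>
    <li><url href="https://example.org/" visual="example.org"/></li>
  </ul>
</p>
```

Here each address is itself the point of the list item, so showing it directly does not interrupt a sentence.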
If you want to squelch the automatic footnote on a
<url>
element with content, you can explicitly set the
@visual
attribute to an empty string as
visual=""
. This should be a
very rare occurrence, since it prevents readers of some formats from seeing even a hint of the actual
URL. One example is a regular footnote that contains a
URL. The automatic footnote, inside another footnote, becomes problematic in some conversions. Better to squelch the footnote-within-a-footnote, and just list the
URL nearby, likely using the
<c>
element to get a monospace font.
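A minimal sketch of the footnote case just described might look like the following. The sentence and the address are hypothetical, and the wording of the footnote is only one way to place the address nearby.

```xml
<p>
  The claim is documented elsewhere.<fn>
    <!-- visual="" suppresses the automatic footnote-within-a-footnote -->
    See the <url href="https://example.com/" visual="">Demo Site</url>
    at <c>example.com</c>.
  </fn>
</p>
```

The address in the <c> element keeps a readable version of the URL available in print, which the suppressed footnote would otherwise have provided.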
A <url>
inside a <title>
has been accounted for, but should be used with caution.
As with the rest of PreTeXt we have taken care to handle all of the exceptional characters that might arise in a
<url>
. So author normally, using the necessary keyboard characters, only taking care with the two
XML characters,
<
and
&
, which need escaping (see
Section 4.14). Use
percent-encoding (aka
URL encoding) for the
@href
attribute, if necessary, to include special characters, such as spaces. See
Subsection 5.28.3 below for a common need for the ampersand character, and a further caution about percent-encoding of
URLs.
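As a small illustration of percent-encoding, assume a hypothetical file named "annual report.pdf" hosted at a placeholder address. The space in the file name is written as %20 in the attribute value.

```xml
<!-- the space in "annual report.pdf" is percent-encoded as %20 -->
<url href="https://example.com/files/annual%20report.pdf">Annual Report</url>
```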
Finally, for conversion to
LaTeX/PDF output it gets extremely tricky to handle all the various meanings of certain escape characters in
URLs in more complicated contexts (such as tables, footnotes, and titles), so there may be some special cases where the formatting is off or you get an error when compiling your
LaTeX. We have anticipated most of these situations, but we always appreciate reports of missed cases.
Subsection 5.28.2 Visual URLs
By a
visual URL we mean a version of a
URL that is simpler than the “real”
URL, but that provides enough information that a reader can type the
URL into some other device with a minimum of effort, and with success. Consider that your project may someday be a print (hardcopy) book, or that your project will be converted to braille for a blind reader. These are some ideas about making a
URL simpler. We welcome more ideas.
Remove standard/default protocols like http://
and https://
which most browsers will furnish in their absence.
Sites like
StackExchange list posts with a long identifying number, followed by something that looks like the title. In practice, the number is enough.
Experiment with dropping a trailing slash—they are frequently unnecessary.
Often a leading www.
in a domain name is not necessary.
Try providing just a domain name in place of a top-level landing page; it will often redirect to a longer
URL.
You could use a
URL shortener, though some thought should be given to
its longevity. Will you remember where your short
URLs point once they are no longer functional? Safer to have your long
URLs in an
@href
in your source, and use PreTeXt to make them friendlier.
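Several of the ideas above can be combined in one element. The following sketch uses an invented address on a placeholder domain: the visual version drops the protocol, the leading www., the title-like tail, and the trailing slash. Whether a shortened address still resolves depends on the site, so test it before relying on it.

```xml
<!-- full address stays in @href; the simplified form goes in @visual -->
<url href="https://www.example.com/questions/12345/a-long-descriptive-title/"
     visual="example.com/questions/12345">a relevant discussion</url>
```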
Subsection 5.28.3 Characters in URLs
A
URL can have a
query string, which has a list of parameters following a question-mark. The parameters are separated by ampersands (&), which will need to be escaped, so as to not confuse the
XML processor. So use
&amp;
anywhere the ampersand
character is necessary, such as a
@source
attribute, or a monospace version of a
URL achieved with a
<c>
element. Also, the question-mark character should
not be
URL-encoded (
%3F
), despite the percent-encoding advice given earlier, so if necessary edit it to be the actual character. General advice about exceptional characters in
XML source can be found in
Section 4.14.
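A sketch of a query string, using a made-up address and parameter names, might look like this. The question-mark is typed literally, while the ampersand between the parameters is escaped for the XML processor.

```xml
<!-- "?" is literal; "&" between parameters is written as &amp; -->
<url href="https://example.com/search?q=pretext&amp;lang=en"
     visual="example.com/search">Search Results</url>
```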
Subsection 5.28.4 URLs to External Data Files
The <url>
element can be used to make data files available to your reader. Consider the example of a spreadsheet containing a large data set that a reader needs to analyze as part of an exercise. Here are our recommendations on how to accomplish this:
If the file is hosted on some server unassociated with your project, and does not have a license compatible with your project, then just set the
@href
to the complete address. Be sure to include enough of the address for the reader of a print version to be able to type in the
URL, either as the content of the
<url>
or in close vicinity.
If you authored the spreadsheet, or you are allowed to legally copy and distribute it, then place it on your server where you host your book project. Then do as above and use the full
URL for the
@href
attribute, with a visible version available for
PDF and print versions.
If you have control over the placement of the file, you can host it on your server, and use a
URL relative to the location of your
HTML,
PDF, or other files that comprise your document. This might be a good choice if your book will be posted many places and you can give it to others as an archive, like a
*.zip
file. It is a bad idea if a reader downloads a PDF without the data file following along and remaining in the same relative location. It is an unworkable idea if your document is printed on paper, where a relative
URL has no meaning and there is not even a link to click on.
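The difference between the hosted and relative options can be sketched as follows. The file name, directory, and server address are all hypothetical, and the exact location a relative address resolves against depends on how your output is deployed.

```xml
<!-- absolute address: works from any copy of the document -->
<url href="https://example.com/mybook/data/rainfall.csv"
     visual="example.com/mybook/data/rainfall.csv">rainfall data</url>

<!-- relative address: only works while the file travels with the output -->
<url href="data/rainfall.csv">rainfall data</url>
```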
Consider your audience and think about how much guidance they need about using context menus or helper/viewer applications to make use of the file formats you are providing. This advice may be different depending on the type of files and the types of output for your document.