The PreTeXt Guide

Section 4.31 URLs and External References

Subsection 4.31.1 URLs to External Web Pages

The <url> element points to external web pages or other online resources (as distinct from other portions of your current document, which are referenced with the <xref/> element, Section 3.4). The @href attribute is always necessary, as it contains the full and complete address of the external page or resource. Include everything the URL needs, such as the protocol, since this will be most reliable and, as you will see, it never needs to be visible. The element also accepts an optional @visual attribute, which supplies a friendlier version of the address and is used whenever it is provided. Finally, the content of the element, which becomes the clickable text in electronic formats, can be authored with the full range of PreTeXt markup generally available in a title or sentence. A typical use might look like
<url href="https://example.com/" visual="example.com">Demo Site</url>
This will render as Demo Site (footnote 1: example.com). Note the automatic footnote providing the visual version in a monospace font. If a <url> has content, and no @visual attribute is given, then the @href will be placed in a footnote, though there will be an attempt to remove standard protocols. Compare
<url href="https://example.com/">Demo Site</url>
which will render as Demo Site (footnote 2: example.com/) versus
<url href="mailto:nobody@example.com">Bouncing Email</url>
which will render as Bouncing Email (footnote 3: mailto:nobody@example.com).
If you do not provide any content for a <url/> element, then the clickable text will be the actual URL with a preference for the (optional) @visual attribute, rather than the mandatory @href attribute. This should be considered as disruptive to the flow of your text, and so a poor alternative to the content version just discussed (see Best Practice 4.31.1). But it might be a good choice in something like a list of interesting web sites. Whether or not a simplified version of the address, via the @visual attribute, is desirable will depend on the application. As an example, using the optional @visual attribute we have
<url href="https://example.com/" visual="example.com"/>
This will render as example.com. Note that there is no footnote since the visual version is already apparent.
If you want to squelch the automatic footnote on a <url> element with content, you can explicitly set the @visual attribute to an empty string as visual="". This signal will inhibit the automatic footnote. This should be a very rare occurrence, since you are preventing readers of some formats from seeing even a hint of the actual URL.
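As a minimal sketch of the markup involved (the address and link text are illustrative only), the empty @visual attribute below suppresses the automatic footnote on a <url> that has content:

```xml
<!-- visual="" signals that no footnote should be generated -->
<p>Visit the <url href="https://example.com/" visual="">Demo Site</url> for details.</p>
```

Readers of print or other static formats would then see only the words “Demo Site”, with no address at all.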
An extreme example of this behavior is a regular footnote which contains a URL. Because an automatic footnote inside another footnote becomes problematic in some conversions, we squelch the footnote-within-a-footnote. A best practice here is to just list the URL nearby, likely using the <c> element to get a monospace font.
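One way this might be authored, assuming a hypothetical address, is to keep the linked text inside the footnote and give the address itself in a <c> element beside it:

```xml
<!-- hypothetical address; the <c> element shows it in a monospace font -->
<p>The proof is standard.<fn>A full account is in the
<url href="https://example.com/notes/proof.html">course notes</url>,
at <c>example.com/notes/proof.html</c>.</fn></p>
```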
A <url> inside a <title> has been accounted for, but should be used with caution.
As with the rest of PreTeXt, we have taken care to handle all of the exceptional characters that might arise in a <url>. So author normally, using the necessary keyboard characters, only taking care with the two XML characters, < and &, which need escaping (see Section 3.14). Use percent-encoding (aka URL encoding) for the @href attribute, if necessary, to include special characters, such as spaces. See Subsection 4.31.4 below for a common need for the ampersand character, and a further caution about percent-encoding of URLs.
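For instance, a file name containing a space might be authored as in the sketch below. The address is hypothetical; the point is only that the space in the file name is written as %20.

```xml
<!-- hypothetical address; the space in "lecture notes.pdf" is percent-encoded as %20 -->
<url href="https://example.com/files/lecture%20notes.pdf"
     visual="example.com/files/lecture%20notes.pdf">Lecture Notes</url>
```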
Finally, for conversion to LaTeX/PDF output it gets extremely tricky to handle all the various meanings of certain escape characters in URLs in more complicated contexts (such as tables, footnotes, and titles), so there may be some special cases where the formatting is off or you get an error when compiling your LaTeX. We have anticipated most of these situations, but we always appreciate reports of missed cases.

Subsection 4.31.2 Data URLs

A <dataurl> element is very similar to the <url> element just described. The purpose is to point to an actual file that will be of use to your readers. What actually happens when a reader clicks on it depends on the format of the PreTeXt output and that reader’s environment. Maybe the file will be downloaded, or maybe a particular application will open the file. That part is out of our hands. Use an @href attribute in the same way as for <url>, and the content and the @visual attribute also behave similarly.
The one key difference is that you can also use a @source attribute in place of @href and point to a file that you provide as part of your project (not unlike providing a photographic image via the <image> element). Place the file in your collection of external files (see Section 5.6) and provide the path to your file from below the directory of external files in the @source attribute. For HTML output, PreTeXt will do the rest. For more static formats, you can set a base URL (see Subsection 44.4.2) and you will get a complete URL that points to the instance of your file hosted with the rest of your HTML output.
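A sketch of both forms follows. The address and the file path data/rainfall.csv are hypothetical, and the path is assumed to be relative to the directory of external files.

```xml
<!-- a file hosted elsewhere, addressed just as with <url> -->
<p>Download the <dataurl href="https://example.com/data/rainfall.csv"
   visual="example.com/data/rainfall.csv">rainfall data</dataurl>.</p>

<!-- a file supplied with the project, located below the external files directory -->
<p>Download the <dataurl source="data/rainfall.csv">rainfall data</dataurl>.</p>
```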
Notice that this element provides limited functionality, at best just a hyperlink to a file. For data files that you want a reader’s in-browser computer program to process, read about the <datafile> element at Section 4.17.

Subsection 4.31.3 Visual URLs

By a visual URL we mean a version of a URL that is simpler than the “real” URL, but that provides enough information that a reader can type the URL into some other device with a minimum of effort, and with success. Consider that your project may someday be a print (hardcopy) book, or that your project will be converted to braille for a blind reader. These are some ideas about making a URL simpler. We welcome more ideas.
  • Remove standard/default protocols like http:// and https:// which most browsers will furnish in their absence.
  • Sites like StackExchange (footnote 4: stackexchange.com) list posts with a long identifying number, followed by something that looks like the title. In practice, the number is enough.
  • Experiment with dropping a trailing slash—they are frequently unnecessary.
  • Often a leading www. in a domain name is not necessary.
  • Try providing just a domain name in place of a top-level landing page; it will often redirect to a longer URL.
  • You could use a URL shortener (footnote 5: https://w.wiki/4QA), though some thought should be given to its longevity (footnote 6: https://w.wiki/4eEM). Will you remember where your short URLs point once they are no longer functional? It is safer to have your long URLs in an @href in your source, and use PreTeXt to make them friendlier.
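
Applying several of these ideas at once might look like the following sketch. The site and post number are invented for illustration, and whether a shortened address still resolves depends on the site, so test it in a browser before relying on it.

```xml
<!-- @href keeps the complete address; @visual drops the protocol,
     the leading www., and the title text after the post number -->
<url href="https://www.example.com/questions/123456/how-to-typeset-a-long-url"
     visual="example.com/questions/123456">a discussion of long URLs</url>
```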

Best Practice 4.31.1. Craft URLs Carefully.

Your writing will be smoother, and easier on your readers, if you do not interrupt a sentence with a long URL, unless somehow it is really of interest and relevant right there. So provide content (the “clickable” text) when you use the <url> element (rather than an empty <url/>). This obligates you to provide a @visual attribute, which feels a little like a tedious exercise. But this will be very welcome to some of your readers, those who are unable or prefer not to use electronic formats. Just above (Subsection 4.31.3), we provide suggestions for crafting these to be more pleasing, but still useful, versions of URLs.

Subsection 4.31.4 Characters in URLs

A URL can have a query string, which has a list of parameters following a question-mark. The parameters are separated by ampersands (&), which will need to be escaped, so as to not confuse the XML processor. So use &amp; anywhere the ampersand character is necessary, such as a @source attribute, or a monospace version of a URL achieved with a <c> element. Also, the question-mark character should not be URL-encoded (%3F) (despite advice just given above), so if necessary edit it to be the actual character. General advice about exceptional characters in XML source can be found in Section 3.14.
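As a sketch with a hypothetical search address, the query string below keeps a literal question mark and escapes the ampersand between its two parameters, both in the attributes and in the <c> element:

```xml
<!-- hypothetical address; ? stays literal, & is written as &amp; -->
<p>See the <url href="https://example.com/search?q=pretext&amp;lang=en"
   visual="example.com/search?q=pretext&amp;lang=en">search results</url>,
or type <c>example.com/search?q=pretext&amp;lang=en</c> into a browser.</p>
```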