The PreTeXt Guide

Section 5.28 URLs and External References

Subsection 5.28.1 URLs to External Web Pages

The <url> element is used to point to external web pages, or other online resources (as distinct from other internal portions of your current document, which is accomplished with the <xref/> element, Section 4.4). The @href attribute is always necessary, as it contains the full and complete address of the external page or resource. Include everything the URL needs, such as the protocol: this is the most reliable choice and, as you will see, the full address never needs to be visible. The element also accepts an optional @visual attribute, which supplies a friendlier version of the address and is used whenever it is provided. Finally, the content of the element, which becomes the clickable text in electronic formats, can be authored with the full range of PreTeXt markup generally available in a title or sentence. A typical use might look like
<url href="https://example.com/" visual="example.com">Demo Site</url>
This will render as Demo Site 1 . Note the automatic footnote providing the visual version in a monospace font. If a <url> has content, and no @visual attribute is given, then the @href will be placed in a footnote, though there will be an attempt to remove standard protocols. Compare
<url href="https://example.com/">Demo Site</url>
which will render as Demo Site 2  versus
<url href="mailto:nobody@example.com">Bouncing Email</url>
which will render as Bouncing Email 3 .
If you do not provide any content for a <url/> element, then the clickable text will be the actual URL with a preference for the (optional) @visual attribute, rather than the mandatory @href attribute. This should be considered as disruptive to the flow of your text, and so a poor alternative to the content version just discussed (see Best Practice 5.28.1). But it might be a good choice in something like a list of interesting web sites. Whether or not a simplified version of the address, via the @visual attribute, is desirable will depend on the application. As an example, using the optional @visual attribute we have
<url href="https://example.com/" visual="example.com"/>
This will render as example.com. Note that there is no footnote since the visual version is already apparent.
If you want to squelch the automatic footnote on a <url> element with content, you can explicitly set the @visual attribute to an empty string, as in visual="". This signal will inhibit the automatic footnote. This should be a very rare occurrence, since you are preventing readers of some formats from seeing even a hint of the actual URL. One example is a regular footnote that contains a URL. The automatic footnote, inside another footnote, becomes problematic in some conversions. It is better to squelch the footnote-within-a-footnote, and just list the URL nearby, likely using the <c> element to get a monospace font.
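As a minimal sketch of the footnote case just described, the fragment below places a <url> with an empty @visual inside a regular footnote and gives the typeable address nearby in a <c> element. The sentence, the address, and the link text are placeholders chosen for illustration.

```xml
<!-- Hypothetical sentence; visual="" suppresses the automatic footnote -->
<p>The data are public.<fn>See the <url href="https://example.com/" visual="">Demo Site</url>, at <c>example.com</c>.</fn></p>
```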
A <url> inside a <title> has been accounted for, but should be used with caution.
As with the rest of PreTeXt we have taken care to handle all of the exceptional characters that might arise in a <url>. So author normally, using the necessary keyboard characters, only taking care with the two XML characters, < and &, which need escaping (see Section 4.14). Use percent-encoding (aka URL encoding) for the @href attribute, if necessary, to include special characters, such as spaces. See Subsection 5.28.3 below for a common need for the ampersand character, and a further caution about percent-encoding of URLs.
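For instance, an address containing a space might be authored as in the following sketch. The file name is hypothetical, and the space is percent-encoded as %20 in both attributes so that the visual version remains something a reader can type.

```xml
<!-- Hypothetical address; the space in "lecture notes" becomes %20 -->
<url href="https://example.com/lecture%20notes.html"
     visual="example.com/lecture%20notes.html">Lecture Notes</url>
```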
Finally, for conversion to LaTeX/PDF output it gets extremely tricky to handle all the various meanings of certain escape characters in URLs in more complicated contexts (such as tables, footnotes, and titles), so there may be some special cases where the formatting is off or you get an error when compiling your LaTeX. We have anticipated most of these situations, but we always appreciate reports of missed cases.

Subsection 5.28.2 Visual URLs

By a visual URL we mean a version of a URL that is simpler than the “real” URL, but that provides enough information that a reader can type the URL into some other device with a minimum of effort, and with success. Consider that your project may someday be a print (hardcopy) book, or that your project will be converted to braille for a blind reader. These are some ideas about making a URL simpler. We welcome more ideas.
  • Remove standard/default protocols like http:// and https:// which most browsers will furnish in their absence.
  • Sites like StackExchange 4  list posts with a long identifying number, followed by something that looks like the title. In practice, the number is enough.
  • Experiment with dropping a trailing slash—they are frequently unnecessary.
  • Often a leading www. in a domain name is not necessary.
  • Try providing just a domain name in place of a top-level landing page, since it will often redirect to a longer URL.
  • You could use a URL shortener 5 , though some thought should be given to its longevity 6 . Will you remember where your short URLs point once they are no longer functional? It is safer to have your long URLs in an @href in your source, and use PreTeXt to make them friendlier.
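To illustrate several of these suggestions together, here is a sketch using a made-up address in the style of a question-and-answer site. Whether a shortened form actually resolves depends on the site, so test any visual URL in a browser before relying on it.

```xml
<!-- Hypothetical address: the protocol, the leading www., the title slug,
     and the trailing slash are all dropped in the visual version -->
<url href="https://www.example.com/questions/12345/how-to-escape-an-ampersand/"
     visual="example.com/questions/12345">a relevant question</url>
```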

Best Practice 5.28.1. Craft URLs Carefully.

Your writing will be smoother, and easier on your readers, if you do not interrupt a sentence with a long URL, unless somehow it is really of interest and relevant right there. So provide content (the “clickable” text) when you use the <url> element (rather than an empty <url/>). This obligates you to provide a @visual attribute, which feels a little like a tedious exercise. But this will be very welcome to some of your readers, those who are unable or prefer not to use electronic formats. Just above (Subsection 5.28.2), we provide suggestions for crafting these more pleasing, but still useful, versions of URLs.

Subsection 5.28.3 Characters in URLs

A URL can have a query string, which has a list of parameters following a question-mark. The parameters are separated by ampersands (&), which will need to be escaped, so as to not confuse the XML processor. So use &amp; anywhere the ampersand character is necessary, such as a @source attribute, or a monospace version of a URL achieved with a <c> element. Also, the question-mark character should not be URL-encoded (%3F) (despite advice just given above), so if necessary edit it to be the actual character. General advice about exceptional characters in XML source can be found in Section 4.14.
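A short sketch of the escaping described above follows. The address and its two parameters, q and lang, are hypothetical. Note the literal question mark and the escaped ampersand, both in the attributes and in the monospace version.

```xml
<!-- Hypothetical query string with two parameters, q and lang -->
<url href="https://example.com/search?q=pretext&amp;lang=en"
     visual="example.com/search?q=pretext&amp;lang=en">Search Results</url>

<!-- The same escaping applies in a monospace rendering of the address -->
<c>example.com/search?q=pretext&amp;lang=en</c>
```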

Subsection 5.28.4 URLs to External Data Files

The <url> element can be used to make data files available to your reader. Consider the example of a spreadsheet containing a large data set that a reader needs to analyze as part of an exercise. Here are our recommendations on how to accomplish this:
  • If the file is hosted on some server unassociated with your project, and does not have a license compatible with your project, then just set the @href to the complete address. Be sure to include enough of the address for the reader of a print version to be able to type in the URL, either as the content of the <url> or in close vicinity.
  • If you authored the spreadsheet, or you are allowed to legally copy and distribute it, then place it on your server where you host your book project. Then do as above and use the full URL for the @href attribute, with a visible version available for PDF and print versions.
  • If you have control over the placement of the file, you can host it on your server, and use a URL relative to the location of your HTML, PDF, or other files that comprise your document. This might be a good choice if your book will be posted many places and you can give it to others as an archive, like a *.zip file. It is a bad idea if a reader downloads a PDF without the data file following along and remaining in the same relative location. It is an impossible idea if your document gets printed on paper, where a relative URL has no meaning and there is not even a link to click on.
Consider your audience and think about how much guidance they need about using context menus or helper/viewer applications to make use of the file formats you are providing. This advice may be different depending on the type of files and the types of output for your document.
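The recommendations above might be sketched as follows. The server, directory, and file names are hypothetical, and the second form assumes, per the third recommendation, that a relative address in @href is carried through to the output unchanged.

```xml
<!-- Full address, with a typeable version for PDF and print readers -->
<url href="https://example.com/mybook/data/rainfall.xlsx"
     visual="example.com/mybook/data/rainfall.xlsx">rainfall spreadsheet</url>

<!-- Relative address: only works while the file stays beside the output -->
<url href="data/rainfall.xlsx">rainfall spreadsheet</url>
```

The first form survives being copied, downloaded, or printed. The second is convenient for a self-contained archive but breaks as soon as the document and the data file are separated.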