
Section 10 Font Tests

We place various blocks of Unicode characters here to determine the minimum configuration necessary to make them render. Alan Wood’s Unicode Resources site (www.alanwood.net/unicode/unicode_samples.html) has been helpful in formulating these tests.

Basic Latin, U+0000–U+007F.

These 95 characters are the most basic, and will all render using xelatex with no special setup. U+0000 to U+001F are control codes and are not used here; U+007F is also a control code and so is excluded. In the source we have authored each character by its escaped version using its Unicode number (in hexadecimal). So, for example, a capital B is authored as &#x0042;.
Table 10.1. Basic Latin, Regular
0 1 2 3 4 5 6 7 8 9 A B C D E F
002_   ! " # $ % & ' ( ) * + , - . /
003_ 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
004_ @ A B C D E F G H I J K L M N O
005_ P Q R S T U V W X Y Z [ \ ] ^ _
006_ ` a b c d e f g h i j k l m n o
007_ p q r s t u v w x y z { | } ~
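The escaped form described above is mechanical enough to generate. The following is a minimal sketch, not the tooling actually used for this section; the helper names `char_ref` and `row_refs` are hypothetical and exist only for illustration.

```python
def char_ref(code_point: int) -> str:
    """Format a code point as an XML numeric character reference."""
    # Four uppercase hexadecimal digits, matching the &#x00NN; style.
    return f"&#x{code_point:04X};"


def row_refs(row_prefix: int) -> list[str]:
    """References for the 16 code points in one table row.

    row_prefix is the row label without its trailing placeholder,
    e.g. 0x004 for the row labelled 004_ (U+0040 to U+004F).
    """
    first = row_prefix * 16
    return [char_ref(cp) for cp in range(first, first + 16)]


if __name__ == "__main__":
    print(char_ref(ord("B")))         # reference for a capital B
    print(" ".join(row_refs(0x004)))  # all sixteen cells of the 004_ row
```

Generating the references this way keeps the source pure ASCII, so the test does not depend on the encoding of the editor used to write it.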

Monospace, Basic Latin, U+0000–U+007F.

These are exactly the same characters as above, but now we wrap them in the <c> element intended for inline use. This does not test all verbatim situations but is a good simple first test.
Table 10.2. Basic Latin, Monospace
0 1 2 3 4 5 6 7 8 9 A B C D E F
002_ ! " # $ % & ' ( ) * + , - . /
003_ 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
004_ @ A B C D E F G H I J K L M N O
005_ P Q R S T U V W X Y Z [ \ ] ^ _
006_ ` a b c d e f g h i j k l m n o
007_ p q r s t u v w x y z { | } ~
Note that the single and double quotes are upright and dumb, not curly and smart: " ' " ' " '. The zero is distinguished from the capital “oh”: 0 O 0 O 0 O. And the numeral one is slightly different from the lower-case “ell”: 1 l 1 l 1 l. The hyphen should be short and not expanded into some other kind of dash: - - -. These characters should all cut/paste out of a PDF into a text editor with no conversion to other characters.
Note also that we have again entered all these characters into the source with the &#x00NN; XML notation.
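The cut/paste requirement can be checked by machine once text has been copied out of the PDF. The sketch below assumes the pasted text is already available as a Python string; the function name `unexpected_characters` is hypothetical. It reports every character outside the printable Basic Latin range, which is where a curly quote or a substituted dash would show up.

```python
import unicodedata


def unexpected_characters(pasted: str):
    """List (index, character, Unicode name) for anything outside U+0020..U+007E.

    Line breaks are skipped, since a paste of several table rows
    will normally contain them.
    """
    found = []
    for index, ch in enumerate(pasted):
        if ch in "\r\n":
            continue
        if not 0x20 <= ord(ch) <= 0x7E:
            found.append((index, ch, unicodedata.name(ch, "UNNAMED")))
    return found


if __name__ == "__main__":
    # A paste where the quotes and the hyphen were converted on the way out.
    for index, ch, name in unexpected_characters("\u201cx\u201d \u2013"):
        print(index, ch, name)
```

An empty result means every pasted character is still one of the 95 characters in the table.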

Latin-1 Supplement, U+0080–U+00FF.

These 94 characters will all render using either pdflatex or xelatex with no special setup. U+0080 to U+009F are control codes and not used here. U+00A0 (non-breaking space) and U+00AD (soft hyphen) are also excluded. In the source we have authored each character by its escaped version using its Unicode number (in hexadecimal). So, for example, a copyright symbol is authored as &#x00A9;.
Table 10.3. Latin-1 Supplement, Regular
0 1 2 3 4 5 6 7 8 9 A B C D E F
00A_   ¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ ® ¯
00B_ ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿
00C_ À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï
00D_ Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý Þ ß
00E_ à á â ã ä å æ ç è é ê ë ì í î ï
00F_ ð ñ ò ó ô õ ö ÷ ø ù ú û ü ý þ ÿ
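The counts quoted for the first two blocks (95 and 94) follow from the exclusions listed in the text. As a sanity check, here is a small sketch using Python's standard `unicodedata` module; the function name `rendered_count` is hypothetical. It drops every code point whose general category is `Cc` (control) plus any code points excluded by hand.

```python
import unicodedata


def rendered_count(first: int, last: int, excluded=()) -> int:
    """Count code points in first..last that are neither control codes
    (general category Cc) nor listed in excluded."""
    return sum(
        1
        for cp in range(first, last + 1)
        if unicodedata.category(chr(cp)) != "Cc" and cp not in excluded
    )


# Basic Latin: only the control codes are dropped.
basic_latin = rendered_count(0x0000, 0x007F)

# Latin-1 Supplement: control codes, plus the non-breaking space
# and the soft hyphen, which the text excludes explicitly.
latin1_supplement = rendered_count(0x0080, 0x00FF, excluded=(0x00A0, 0x00AD))

if __name__ == "__main__":
    print(basic_latin, latin1_supplement)
```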

Monospace, Latin-1 Supplement, U+0080–U+00FF.

The same 94 characters as above, wrapped in a <c> element as if being used inside a sentence. These will all render with xelatex and none will render with pdflatex (so there is just blank space below). If we improve the latter, then these will get duplicated into the sample article.
Table 10.4. Latin-1 Supplement, Monospace
0 1 2 3 4 5 6 7 8 9 A B C D E F
00A_   ¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ ® ¯
00B_ ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿
00C_ À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï
00D_ Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý Þ ß
00E_ à á â ã ä å æ ç è é ê ë ì í î ï
00F_ ð ñ ò ó ô õ ö ÷ ø ù ú û ü ý þ ÿ

Latin Extended-A, U+0100–U+017F.

Rendering with xelatex and no extra setup is largely successful, subject to glyphs actually being available in whatever font you use. Our default font is Latin Modern. About 25% of these characters are missing when rendered with pdflatex.
Table 10.5. Latin Extended-A
0 1 2 3 4 5 6 7 8 9 A B C D E F
010_ Ā ā Ă ă Ą ą Ć ć Ĉ ĉ Ċ ċ Č č Ď ď
011_ Đ đ Ē ē Ĕ ĕ Ė ė Ę ę Ě ě Ĝ ĝ Ğ ğ
012_ Ġ ġ Ģ ģ Ĥ ĥ Ħ ħ Ĩ ĩ Ī ī Ĭ ĭ Į į
013_ İ ı Ĳ ĳ Ĵ ĵ Ķ ķ ĸ Ĺ ĺ Ļ ļ Ľ ľ Ŀ
014_ ŀ Ł ł Ń ń Ņ ņ Ň ň ŉ Ŋ ŋ Ō ō Ŏ ŏ
015_ Ő ő Œ œ Ŕ ŕ Ŗ ŗ Ř ř Ś ś Ŝ ŝ Ş ş
016_ Š š Ţ ţ Ť ť Ŧ ŧ Ũ ũ Ū ū Ŭ ŭ Ů ů
017_ Ű ű Ų ų Ŵ ŵ Ŷ ŷ Ÿ Ź ź Ż ż Ž ž ſ
Rendered with xelatex and no special setup, using the default Latin Modern fonts, only four characters appear to be missing:
  • U+0138 (LATIN SMALL LETTER KRA, Greenlandic, removed 1973)
  • U+0149 (LATIN SMALL LETTER N PRECEDED BY APOSTROPHE, Afrikaans, deprecated as of Unicode version 5.2.0)
  • U+0166 (LATIN CAPITAL LETTER T WITH STROKE, Northern Sámi alphabet, used in northern parts of Norway, Sweden and Finland)
  • U+0167 (LATIN SMALL LETTER T WITH STROKE, Northern Sámi alphabet, used in northern parts of Norway, Sweden and Finland)
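The official names in the list above can be looked up rather than typed by hand. This sketch uses the standard `unicodedata` module; the list `MISSING` and the function `describe` are hypothetical names. It only reports names and says nothing about whether a given font supplies a glyph.

```python
import unicodedata

# The four code points reported as missing in the list above.
MISSING = [0x0138, 0x0149, 0x0166, 0x0167]


def describe(code_point: int) -> str:
    """Return 'U+XXXX (OFFICIAL UNICODE NAME)' for one code point."""
    return f"U+{code_point:04X} ({unicodedata.name(chr(code_point))})"


if __name__ == "__main__":
    for cp in MISSING:
        print(describe(cp))
```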

Latin Extended-B, U+0180–U+024F.

When rendered with xelatex and no extra setup, using the default Latin Modern fonts, perhaps 50% of these characters are missing, and some accent constructions are clearly wrong. Almost none of these appear when rendered with pdflatex. (When processed with lualatex the incorrectly accented characters are not even visible, but we have not learned much about using fonts in LuaTeX.) Latin Modern does not claim to support any of this range.
Table 10.6. Latin Extended-B
0 1 2 3 4 5 6 7 8 9 A B C D E F
018_ ƀ Ɓ Ƃ ƃ Ƅ ƅ Ɔ Ƈ ƈ Ɖ Ɗ Ƌ ƌ ƍ Ǝ Ə
019_ Ɛ Ƒ ƒ Ɠ Ɣ ƕ Ɩ Ɨ Ƙ ƙ ƚ ƛ Ɯ Ɲ ƞ Ɵ
01A_ Ơ ơ Ƣ ƣ Ƥ ƥ Ʀ Ƨ ƨ Ʃ ƪ ƫ Ƭ ƭ Ʈ Ư
01B_ ư Ʊ Ʋ Ƴ ƴ Ƶ ƶ Ʒ Ƹ ƹ ƺ ƻ Ƽ ƽ ƾ ƿ
01C_ ǀ ǁ ǂ ǃ Ǆ ǅ ǆ Ǉ ǈ ǉ Ǌ ǋ ǌ Ǎ ǎ Ǐ
01D_ ǐ Ǒ ǒ Ǔ ǔ Ǖ ǖ Ǘ ǘ Ǚ ǚ Ǜ ǜ ǝ Ǟ ǟ
01E_ Ǡ ǡ Ǣ ǣ Ǥ ǥ Ǧ ǧ Ǩ ǩ Ǫ ǫ Ǭ ǭ Ǯ ǯ
01F_ ǰ Ǳ ǲ ǳ Ǵ ǵ Ƕ Ƿ Ǹ ǹ Ǻ ǻ Ǽ ǽ Ǿ ǿ
020_ Ȁ ȁ Ȃ ȃ Ȅ ȅ Ȇ ȇ Ȉ ȉ Ȋ ȋ Ȍ ȍ Ȏ ȏ
021_ Ȑ ȑ Ȓ ȓ Ȕ ȕ Ȗ ȗ Ș ș Ț ț Ȝ ȝ Ȟ ȟ
022_ Ƞ ȡ Ȣ ȣ Ȥ ȥ Ȧ ȧ Ȩ ȩ Ȫ ȫ Ȭ ȭ Ȯ ȯ
023_ Ȱ ȱ Ȳ ȳ ȴ ȵ ȶ ȷ ȸ ȹ Ⱥ Ȼ ȼ Ƚ Ⱦ ȿ

Latin Extended Additional, U+1E00–U+1EFF.

Latin Modern, our default font, supports this range of 256 characters, which includes 90 Vietnamese characters. The Latin Modern documentation shows about 140 characters rendered correctly, which seems to agree with the examples here that render properly.
Table 10.7. Latin Extended Additional
0 1 2 3 4 5 6 7 8 9 A B C D E F
1E0_
1E1_
1E2_
1E3_ ḿ
1E4_
1E5_
1E6_
1E7_ ṿ
1E8_
1E9_
1EA_
1EB_ ế
1EC_
1ED_
1EE_
1EF_ ỿ