  • Babel

    Version 3.60

    2021/06/02

    Johannes L. Braams, original author

    Javier Bezos, current maintainer

    Localization and internationalization

    Unicode

    TeX

    pdfTeX

    LuaTeX

    XeTeX

  • Contents

    I User guide

    1 The user interface
    1.1 Monolingual documents
    1.2 Multilingual documents
    1.3 Mostly monolingual documents
    1.4 Modifiers
    1.5 Troubleshooting
    1.6 Plain
    1.7 Basic language selectors
    1.8 Auxiliary language selectors
    1.9 More on selection
    1.10 Shorthands
    1.11 Package options
    1.12 The base option
    1.13 ini files
    1.14 Selecting fonts
    1.15 Modifying a language
    1.16 Creating a language
    1.17 Digits and counters
    1.18 Dates
    1.19 Accessing language info
    1.20 Hyphenation and line breaking
    1.21 Transforms
    1.22 Selection based on BCP 47 tags
    1.23 Selecting scripts
    1.24 Selecting directions
    1.25 Language attributes
    1.26 Hooks
    1.27 Languages supported by babel with ldf files
    1.28 Unicode character properties in luatex
    1.29 Tweaking some features
    1.30 Tips, workarounds, known issues and notes
    1.31 Current and future work
    1.32 Tentative and experimental code

    2 Loading languages with language.dat
    2.1 Format

    3 The interface between the core of babel and the language definition files
    3.1 Guidelines for contributed languages
    3.2 Basic macros
    3.3 Skeleton
    3.4 Support for active characters
    3.5 Support for saving macro definitions
    3.6 Support for extending macros
    3.7 Macros common to a number of languages
    3.8 Encoding-dependent strings

    4 Changes
    4.1 Changes in babel version 3.9

    II Source code

    5 Identification and loading of required files

    6 locale directory

    7 Tools
    7.1 Multiple languages
    7.2 The Package File (LaTeX, babel.sty)
    7.3 base
    7.4 Conditional loading of shorthands
    7.5 Cross referencing macros
    7.6 Marks
    7.7 Preventing clashes with other packages
    7.7.1 ifthen
    7.7.2 varioref
    7.7.3 hhline
    7.7.4 hyperref
    7.7.5 fancyhdr
    7.8 Encoding and fonts
    7.9 Basic bidi support
    7.10 Local Language Configuration

    8 The kernel of Babel (babel.def, common)
    8.1 Tools

    9 Multiple languages
    9.1 Selecting the language
    9.2 Errors
    9.3 Hooks
    9.4 Setting up language files
    9.5 Shorthands
    9.6 Language attributes
    9.7 Support for saving macro definitions
    9.8 Short tags
    9.9 Hyphens
    9.10 Multiencoding strings
    9.11 Macros common to a number of languages
    9.12 Making glyphs available
    9.12.1 Quotation marks
    9.12.2 Letters
    9.12.3 Shorthands for quotation marks
    9.12.4 Umlauts and tremas
    9.13 Layout
    9.14 Load engine specific macros
    9.15 Creating and modifying languages

    10 Adjusting the Babel behavior

    11 Loading hyphenation patterns

    12 Font handling with fontspec

    13 Hooks for XeTeX and LuaTeX
    13.1 XeTeX
    13.2 Layout
    13.3 LuaTeX
    13.4 Southeast Asian scripts
    13.5 CJK line breaking
    13.6 Arabic justification
    13.7 Common stuff
    13.8 Automatic fonts and ids switching
    13.9 Layout
    13.10 Auto bidi with basic and basic-r

    14 Data for CJK

    15 The ‘nil’ language

    16 Support for Plain TeX (plain.def)
    16.1 Not renaming hyphen.tex
    16.2 Emulating some LaTeX features
    16.3 General tools
    16.4 Encoding related macros

    17 Acknowledgements

    Troubleshooting

    Paragraph ended before \UTFviii@three@octets was complete
    No hyphenation patterns were preloaded for (babel) the language ‘LANG’ into the format
    You are loading directly a language style
    Unknown language ‘LANG’
    Argument of \language@active@arg" has an extra }
    Package fontspec Warning: ’Language ’LANG’ not available for font ’FONT’ with script ’SCRIPT’ ’Default’ language used instead’
    Package babel Info: The following fonts are not babel standard families

  • Part I

    User guide

    What is this document about? This user guide focuses on internationalization and localization with LaTeX and pdftex, xetex and luatex with the babel package. There are also some notes on its use with e-Plain and pdf-Plain TeX. Part II describes the code, and usually it can be ignored.

    What if I’m interested only in the latest changes? Changes and new features with relation to version 3.8 are highlighted with New X.XX, and there are some notes for the latest versions in the babel repository (https://github.com/latex3/babel/tree/master/news-guides). The most recent features may still be unstable.

    Can I help? Sure! If you are interested in TeX multilingual support, please join the kadingira mailing list (http://tug.org/mailman/listinfo/kadingira). You can follow the development of babel on GitHub (https://github.com/latex3/babel) and make suggestions; feel free to fork it and make pull requests. If you are the author of a package, send me a few test files, which I’ll add to mine, so that possible issues can be caught in the development phase.

    It doesn’t work for me! You can ask for help in forums like tex.stackexchange, but if you have found a bug, I strongly beg you to report it on GitHub (https://github.com/latex3/babel/issues), which is much better than just complaining on an e-mail list or a web forum. Remember that warnings are not errors by themselves; they just warn about possible problems or incompatibilities.

    How can I contribute a new language? See section 3.1 for contributing a language.

    I only need to learn the most basic features. The first subsections (1.1-1.3) describe the traditional way of loading a language (with ldf files), which is usually all you need. The alternative way based on ini files, which complements the previous one (it does not replace it, although it is still necessary in some languages), is described below; go to 1.13.

    I don’t like manuals. I prefer sample files. This manual contains lots of examples and tips, but on GitHub there are many sample files (https://github.com/latex3/babel/tree/master/samples).

    1 The user interface

    1.1 Monolingual documents

    In most cases, a single language is required, and then all you need in LaTeX is to load the package using its standard mechanism for this purpose, namely, passing that language as an optional argument. In addition, you may want to set the font and input encodings.

    Another approach is making the language a global option in order to let other packages detect and use it. This is the standard way in LaTeX for an option – in this case a language – to be recognized by several packages.

    Many languages are compatible with xetex and luatex. With them you can use babel to localize the documents. When these engines are used, the Latin script is covered by default in current LaTeX (provided the document encoding is UTF-8), because the font loader is preloaded and the font is switched to lmroman. Other scripts require loading fontspec. You may want to set the font attributes with fontspec, too.

    EXAMPLE Here is a simple full example for “traditional” TeX engines (see below for xetex and luatex). The packages fontenc and inputenc do not belong to babel; fontenc is included in the example because typically you will need it, while inputenc is omitted because the example assumes UTF-8, the default encoding:


    % Engine: pdftex

    \documentclass{article}

    \usepackage[T1]{fontenc}

    \usepackage[french]{babel}

    \begin{document}

    Plus ça change, plus c'est la même chose!

    \end{document}

    Now consider something like:

    \documentclass[french]{article}

    \usepackage{babel}

    \usepackage{varioref}

    With this setting, the package varioref will also see the option french and will be able to use it.

    EXAMPLE And now a simple monolingual document in Russian (text from Wikipedia) with xetex or luatex. Note that neither fontenc nor inputenc is necessary, but the document should be encoded in UTF-8 and a so-called Unicode font must be loaded (in this example \babelfont is used, described below).

    % Engine: luatex or xetex

    \documentclass[russian]{article}

    \usepackage{babel}

    \babelfont{rm}{DejaVu Serif}

    \begin{document}

    Россия, находящаяся на пересечении множества культур, а также

    с учётом многонационального характера её населения, — отличается

    высокой степенью этнокультурного многообразия и способностью к

    межкультурному диалогу.

    \end{document}

    TROUBLESHOOTING A common source of trouble is a wrong setting of the input encoding. Depending on the LaTeX version you can get the following somewhat cryptic error:

    ! Paragraph ended before \UTFviii@three@octets was complete.

    Or the more explanatory:

    ! Package inputenc Error: Invalid UTF-8 byte ...

    Make sure you set the encoding actually used by your editor.
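
    EXAMPLE As a minimal sketch of the advice above, assume a file that the editor saves in Latin-1 (ISO 8859-1) and that is compiled with pdftex. The encoding is an assumption made only for illustration; declare whatever your editor really uses, or re-save the file as UTF-8 and drop the inputenc line:

    \documentclass{article}
    \usepackage[latin1]{inputenc} % assumed: the source file is stored as Latin-1
    \usepackage[T1]{fontenc}
    \usepackage[french]{babel}
    \begin{document}
    Plus ça change, plus c'est la même chose!
    \end{document}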

    NOTE Because of the way babel has evolved, “language” can refer to (1) a set of hyphenation patterns as preloaded into the format, (2) a package option, (3) an ldf file, and (4) a name used in the document to select a language or dialect. So, a package option refers to a language in a generic way – sometimes it is the actual language name used to select it, sometimes it is a file name loading a language with a different name, sometimes it is a file name loading several languages. Please read the documentation for specific languages for further info.

    TROUBLESHOOTING The following warning is about hyphenation patterns, which are not under the

    direct control of babel:

    Package babel Warning: No hyphenation patterns were preloaded for

    (babel) the language `LANG' into the format.

    (babel) Please, configure your TeX system to add them and

    (babel) rebuild the format. Now I will use the patterns

    (babel) preloaded for \language=0 instead on input line 57.

    The document will be typeset, but very likely the text will not be correctly hyphenated. Some languages may be raising this warning wrongly (because they are not hyphenated); it is a bug to be fixed – just ignore it. See the manual of your distribution (MacTeX, MiKTeX, TeX Live, etc.) for further info about how to configure it.

    NOTE With hyperref you may want to set the document language with something like:

    \usepackage[pdflang=es-MX]{hyperref}

    This is not currently done by babel and you must set it by hand.

    NOTE Although it has been customary to recommend placing \title, \author and other elements

    printed by \maketitle after \begin{document}, mainly because of shorthands, it is advisable to

    keep them in the preamble. Currently there is no real need to use shorthands in those macros.
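
    EXAMPLE A sketch of the layout recommended in this note, with the title data kept in the preamble and no shorthands inside them. The title and author strings are placeholders chosen for illustration:

    \documentclass{article}
    \usepackage[T1]{fontenc}
    \usepackage[spanish]{babel}
    \title{Un título de ejemplo}   % placeholder title, set in the preamble
    \author{Nombre Apellido}       % placeholder author
    \begin{document}
    \maketitle
    Texto.
    \end{document}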

    1.2 Multilingual documents

    In multilingual documents, just use a list of the required languages as package or class

    options. The last language is considered the main one, activated by default. Sometimes, the

    main language changes the document layout (eg, spanish and french).

    EXAMPLE In LaTeX, the preamble of the document:

    \documentclass{article}

    \usepackage[dutch,english]{babel}

    would tell LaTeX that the document would be written in two languages, Dutch and English, and that English would be the first language in use, and the main one.

    You can also set the main language explicitly, but it is discouraged unless there is a real reason to do so:

    \documentclass{article}

    \usepackage[main=english,dutch]{babel}

    Examples of cases where main is useful are the following.

    NOTE Some classes load babel with a hardcoded language option. Sometimes, the main language can be overridden with something like this before \documentclass:

    \PassOptionsToPackage{main=english}{babel}

    WARNING Languages may be set as global and as package options at the same time, but in such a case you should explicitly set the main language with the package option main:

    \documentclass[italian]{book}

    \usepackage[ngerman,main=italian]{babel}

    WARNING In the preamble the main language has not been selected, except hyphenation patterns

    and the name assigned to \languagename (in particular, shorthands, captions and date are not

    activated). If you need to define boxes and the like in the preamble, you might want to use some

    of the language selectors described below.
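
    EXAMPLE The following sketch fills a box in the preamble. Because the main language is not fully active there, the text is wrapped in \foreignlanguage so that the French extras apply when the box is built. The box name \notice and its contents are hypothetical, chosen only for this sketch:

    \documentclass{article}
    \usepackage[T1]{fontenc}
    \usepackage[english,french]{babel}
    \newsavebox{\notice} % hypothetical box name
    \sbox{\notice}{\foreignlanguage{french}{Avis important}}
    \begin{document}
    \usebox{\notice}
    \end{document}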

    To switch the language there are two basic macros, described below in detail:

    \selectlanguage is used for blocks of text, while \foreignlanguage is for chunks of text

    inside paragraphs.

    EXAMPLE A full bilingual document with pdftex follows. The main language is french, which is

    activated when the document begins. It assumes UTF-8:

    % Engine: pdftex

    \documentclass{article}

    \usepackage[T1]{fontenc}

    \usepackage[english,french]{babel}

    \begin{document}

    Plus ça change, plus c'est la même chose!

    \selectlanguage{english}

    And an English paragraph, with a short text in

    \foreignlanguage{french}{français}.

    \end{document}

    EXAMPLE With xetex and luatex, the following bilingual, single script document in UTF-8 encoding

    just prints a couple of ‘captions’ and \today in Danish and Vietnamese. No additional packages

    are required.

    % Engine: luatex or xetex

    \documentclass{article}

    \usepackage[vietnamese,danish]{babel}

    \begin{document}

    \prefacename{} -- \alsoname{} -- \today

    \selectlanguage{vietnamese}

    \prefacename{} -- \alsoname{} -- \today

    \end{document}

    NOTE Once a language is loaded, you can select it with the corresponding BCP 47 tag. See section 1.22 for further details.

    1.3 Mostly monolingual documents

    New 3.39 Very often, multilingual documents consist of a main language with small pieces of text in other languages (words, idioms, short sentences). Typically, all you need is to set the line breaking rules and, perhaps, the font. In such a case, babel now does not require declaring these secondary languages explicitly, because the basic settings are loaded on the fly when the language is selected (and also when provided in the optional argument of \babelfont, if used).

    This is particularly useful, too, when there are short texts of this kind coming from an external source whose contents are not known beforehand (for example, titles in a bibliography). In this regard, it is worth remembering that \babelfont does not load any font until required, so that it can be used just in case.

    EXAMPLE A trivial document with the default font in English and Spanish, and FreeSerif in Russian

    is:

    % Engine: luatex or xetex

    \documentclass[english]{article}

    \usepackage{babel}

    \babelfont[russian]{rm}{FreeSerif}

    \begin{document}

    English. \foreignlanguage{russian}{Русский}.

    \foreignlanguage{spanish}{Español}.

    \end{document}

    NOTE Instead of its name, you may prefer to select the language with the corresponding BCP 47 tag. This alternative, however, must be activated explicitly, because a two- or three-letter word is a valid name for a language (eg, yi). See section 1.22 for further details.
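
    EXAMPLE A sketch of how the tag-based selection might be switched on. It assumes the key autoload.bcp47 of \babeladjust, which is documented in section 1.22; check that section for the exact keys available in your version. The tag de-AT and the sample text are arbitrary choices:

    \documentclass{article}
    \usepackage[danish]{babel}
    \babeladjust{autoload.bcp47 = on} % assumed key, see section 1.22
    \begin{document}
    Dansk tekst.
    \selectlanguage{de-AT}
    Deutscher Text.
    \end{document}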

    1.4 Modifiers

    New 3.9c The basic behavior of some languages can be modified when loading babel by means of modifiers. They are set after the language name, and are prefixed with a dot (only when the language is set as package option – neither global options nor the main key accepts them). An example is (spaces are not significant and they can be added or removed):¹

    \usepackage[latin.medieval, spanish.notilde.lcroman, danish]{babel}

    Attributes (described below) are considered modifiers, ie, you can set an attribute by

    including it in the list of modifiers. However, modifiers are a more general mechanism.

    1.5 Troubleshooting

    • Loading sty files directly in LaTeX (ie, \usepackage{〈language〉}) is deprecated and you will get the error:²

    ! Package babel Error: You are loading directly a language style.

    (babel) This syntax is deprecated and you must use

    (babel) \usepackage[language]{babel}.

    ¹ No predefined “axis” for modifiers is provided because languages and their scripts have quite different needs.

    ² In old versions the error read “You have used an old interface to call babel”, not very helpful.

    • Another typical error when using babel is the following:³

    ! Package babel Error: Unknown language `#1'. Either you have

    (babel) misspelled its name, it has not been installed,

    (babel) or you requested it in a previous run. Fix its name,

    (babel) install it or just rerun the file, respectively. In

    (babel) some cases, you may need to remove the aux file

    The most frequent reason is, by far, the last one (for example, you included spanish, but you realized this language is not used after all, and therefore you removed it from the option list). In most cases, the error vanishes when the document is typeset again, but in more severe ones you will need to remove the aux file.

    1.6 Plain

    In e-Plain and pdf-Plain, load language styles with \input and then use \begindocument (the latter is defined by babel):

    \input estonian.sty

    \begindocument

    WARNING Not all languages provide a sty file and some of them are not compatible with those formats. Please refer to Using babel with Plain (https://github.com/latex3/babel/blob/master/news-guides/guides/using-babel-with-plain.md) for further details.

    1.7 Basic language selectors

    This section describes the commands to be used in the document to switch the language in

    multilingual documents. In most cases, only the two basic macros \selectlanguage and

    \foreignlanguage are necessary. The environments otherlanguage, otherlanguage*

    and hyphenrules are auxiliary, and described in the next section.

    The main language is selected automatically when the document environment begins.

    \selectlanguage{〈language〉}

    To switch from one language to another, use the macro \selectlanguage. This macro takes the language, defined previously by a language definition file, as its argument. It calls several macros that should be defined in the language definition files to activate the special definitions for the language chosen:

    \selectlanguage{german}

    This command can be used as an environment, too.

    NOTE For “historical reasons”, a macro name is converted to a language name without the leading \;

    in other words, \selectlanguage{\german} is equivalent to \selectlanguage{german}. Using a

    macro instead of a “real” name is deprecated. New 3.43 However, if the macro name does not

    match any language, it will get expanded as expected.

    ³ In old versions the error read “You haven’t loaded the language LANG yet”.


    WARNING If used inside braces there might be some non-local changes, as this would be roughly equivalent to:

    {\selectlanguage{〈inner-language〉} ...}\selectlanguage{〈outer-language〉}

    If you want a change which is really local, you must enclose this code with an additional

    grouping level.
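
    EXAMPLE A sketch of the additional grouping level described in this warning; german stands for any loaded language. The inner braces hold the switch, and the outer braces confine the restoration of the previous language that takes place when the inner group ends:

    \documentclass{article}
    \usepackage[T1]{fontenc}
    \usepackage[german,english]{babel}
    \begin{document}
    English text.
    {{\selectlanguage{german} Deutscher Text.}} % outer braces: the extra grouping level
    English again.
    \end{document}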

    \foreignlanguage[〈option-list〉]{〈language〉}{〈text〉}

    The command \foreignlanguage takes two mandatory arguments; the second argument is a phrase to be typeset according to the rules of the language named in its first one.

    This command (1) only switches the extra definitions and the hyphenation rules for the language, not the names and dates, (2) does not send information about the language to auxiliary files (i.e., the surrounding language is still in force), and (3) it works even if the language has not been set as package option (but in such a case it only sets the hyphenation patterns and a warning is shown). With the bidi option, it also enters horizontal mode (this is not always done, for backwards compatibility).

    New 3.44 As already said, captions and dates are not switched. However, with the

    optional argument you can switch them, too. So, you can write:

    \foreignlanguage[date]{polish}{\today}

    In addition, captions can be switched with captions (or both, of course, with date,

    captions). Until 3.43 you had to write something like {\selectlanguage{..} ..}, which

    was not always the most convenient way.
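
    EXAMPLE A sketch of the optional argument in a complete document, assuming pdftex with the T1 font encoding. The first line switches only the date; the second switches both the date and the captions:

    \documentclass{article}
    \usepackage[T1]{fontenc}
    \usepackage[polish,english]{babel}
    \begin{document}
    Date only: \foreignlanguage[date]{polish}{\today}.

    Date and captions: \foreignlanguage[date, captions]{polish}{\prefacename, \today}.
    \end{document}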

    1.8 Auxiliary language selectors

    \begin{otherlanguage}{〈language〉} … \end{otherlanguage}

    The environment otherlanguage does basically the same as \selectlanguage, except that the language change is (mostly) local to the environment.

    Actually, there might be some non-local changes, as this environment is roughly equivalent

    to:

    \begingroup

    \selectlanguage{〈inner-language〉}

    ...

    \endgroup

    \selectlanguage{〈outer-language〉}

    If you want a change which is really local, you must enclose this environment with an

    additional grouping, like braces {}.

    Spaces after the environment are ignored.
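
    EXAMPLE A sketch of the environment form, here enclosed in braces to make the change really local as suggested above. The languages and the sample text are arbitrary:

    \documentclass{article}
    \usepackage[T1]{fontenc}
    \usepackage[german,english]{babel}
    \begin{document}
    English text.
    {\begin{otherlanguage}{german}
    Deutscher Text, \today.
    \end{otherlanguage}}
    English again.
    \end{document}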

    \begin{otherlanguage*}[〈option-list〉]{〈language〉} … \end{otherlanguage*}

    Same as \foreignlanguage but as environment. Spaces after the environment are not ignored.

    This environment was originally intended for intermixing left-to-right typesetting with right-to-left typesetting in engines not supporting a change in the writing direction inside a line. However, by default it never complied with the documented behavior and it is just a version as environment of \foreignlanguage, except when the option bidi is set – in this case, \foreignlanguage emits a \leavevmode, while otherlanguage* does not.
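
    EXAMPLE A sketch of otherlanguage* with the optional argument, which mirrors \foreignlanguage[date]{polish}{...}. It assumes pdftex with the T1 font encoding; the sample text is arbitrary:

    \documentclass{article}
    \usepackage[T1]{fontenc}
    \usepackage[polish,english]{babel}
    \begin{document}
    English text.
    \begin{otherlanguage*}[date]{polish}
    Dzisiaj jest \today.
    \end{otherlanguage*}
    English again.
    \end{document}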

    1.9 More on selection

    \babeltags{〈tag1〉 = 〈language1〉, 〈tag2〉 = 〈language2〉, …}

    New 3.9i In multilingual documents with many language-switches the commands above

    can be cumbersome. With this tool shorter names can be defined. It adds nothing really

    new – it is just syntactical sugar.

    It defines \text〈tag1〉{〈text〉} to be \foreignlanguage{〈language1〉}{〈text〉}, and \begin{〈tag1〉} to be \begin{otherlanguage*}{〈language1〉}, and so on. Note \〈tag1〉 is also allowed, but remember to set it locally inside a group.

    WARNING There is a clear drawback to this feature, namely, the ‘prefix’ \text... is heavily overloaded in LaTeX and conflicts with existing macros may arise (\textlatin, \textbar, \textit, \textcolor and many others). The same applies to environments, because arabic conflicts with \arabic. Except if there is a reason for this ‘syntactical sugar’, the best option is to stick to the default selectors or to define your own alternatives.

    EXAMPLE With

    \babeltags{de = german}

    you can write

    text \textde{German text} text

    and

    text

    \begin{de}

    German text

    \end{de}

    text

    NOTE Something like \babeltags{finnish = finnish} is legitimate – it defines \textfinnish and

    \finnish (and, of course, \begin{finnish}).

    NOTE Actually, there may be another advantage in the ‘short’ syntax \text〈tag〉, namely, it is not affected by \MakeUppercase (while \foreignlanguage is).

    \babelensure[include=〈commands〉,exclude=〈commands〉,fontenc=〈encoding〉]{〈language〉}

    New 3.9i Except in a few languages, like russian, captions and dates are just strings, and

    do not switch the language. That means you should set it explicitly if you want to use them,

    or hyphenation (and in some cases the text itself) will be wrong. For example:

    \foreignlanguage{russian}{text \foreignlanguage{polish}{\seename} text}

    Of course, TeX can do it for you. To avoid switching the language all the while, \babelensure redefines the captions for a given language to wrap them with a selector:

    \babelensure{polish}

    By default only the basic captions and \today are redefined, but you can add further macros with the key include in the optional argument (without commas). Macros not to be modified are listed in exclude. You can also enforce a font encoding with the option fontenc.⁴ A couple of examples:

    \babelensure[include=\Today]{spanish}

    \babelensure[fontenc=T5]{vietnamese}

    These redefinitions are activated when the language is selected (at the afterextras event), and they rely on some assumptions that might not hold in some languages. Note also you should include only macros defined by the language, not global macros (eg, \TeX or \dag).

    With ini files (see below), captions are ensured by default.
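
    EXAMPLE A sketch of \babelensure in a complete document, assuming pdftex with the T1 font encoding and languages loaded from ldf files. Polish is the main language, so \seename and \today keep their Polish text inside the English fragment; \babelensure{polish} is meant to wrap them so that they are still typeset with the Polish settings:

    \documentclass{article}
    \usepackage[T1]{fontenc}
    \usepackage[english,polish]{babel}
    \babelensure{polish} % wrap the Polish captions and \today with a selector
    \begin{document}
    \foreignlanguage{english}{English text, \seename{} and \today.}
    \end{document}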

    1.10 Shorthands

    A shorthand is a sequence of one or two characters that expands to arbitrary TeX code.

    Shorthands can be used for different kinds of things; for example: (1) in some languages

    shorthands such as "a are defined to be able to hyphenate the word if the encoding is OT1;

    (2) in some languages shorthands such as ! are used to insert the right amount of white

    space; (3) several kinds of discretionaries and breaks can be inserted easily with "-, "=, etc.

    The package inputenc as well as xetex and luatex have alleviated entering non-ASCII

    characters, but minority languages and some kinds of text can still require characters not

    directly available on the keyboards (and sometimes not even as separated or precomposed

    Unicode characters). As for point 2, pdfTeX now provides \knbccode, and luatex can

    manipulate the glyph list. Tools for point 3 can still be very useful in general.

    There are four levels of shorthands: user, language, system, and language user (by order of

    precedence). In most cases, you will use only shorthands provided by languages.

    NOTE Keep in mind the following:

    1. Activated chars used for two-char shorthands cannot be followed by a closing brace } and the

    spaces following are gobbled. With one-char shorthands (eg, :), they are preserved.

    2. If on a certain level (system, language, user, language user) there is a one-char shorthand,

    two-char ones starting with that char and on the same level are ignored.

    3. Since they are active, a shorthand cannot contain the same character in its definition (except

    if deactivated with, eg, \string).

    TROUBLESHOOTING A typical error when using shorthands is the following:

    ! Argument of \language@active@arg" has an extra }.

    It means there is a closing brace just after a shorthand, which is not allowed (eg, "}). Just add {}

    after (eg, "{}}).

    \shorthandon{〈shorthands-list〉}

    \shorthandoff * {〈shorthands-list〉}

    It is sometimes necessary to switch a shorthand character off temporarily, because it must

    be used in an entirely different way. For this purpose, the user commands \shorthandoff

    and \shorthandon are provided. They each take a list of characters as their arguments.

    The command \shorthandoff sets the \catcode for each of the characters in its argument

    to other (12); the command \shorthandon sets the \catcode to active (13). Both commands

    only work on ‘known’ shorthand characters.

    New 3.9a However, \shorthandoff does not behave as you would expect with

    characters like ~ or ^, because they usually are not “other”. For them \shorthandoff* is

    provided, so that with

    \shorthandoff*{~^}

    ~ is still active, very likely with the meaning of a non-breaking space, and ^ is the

    superscript character. The catcodes used are those when the shorthands are defined,

    usually when language files are loaded.
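
The behavior of \shorthandoff and \shorthandon can be sketched as follows. The example assumes the ngerman language, whose shorthand character is ", and the T1 font encoding; the comments describe the expected behavior, not verified output.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[ngerman]{babel}

\begin{document}
"a                 % " is active here, so this is the shorthand "a

\shorthandoff{"}   % " now has catcode 12 (other)
"a                 % a plain " character followed by the letter a

\shorthandon{"}    % " is active (catcode 13) again
"a                 % the shorthand works again

% For characters that are usually not "other", use the starred form,
% which restores the catcodes in force when the shorthands were defined.
\shorthandoff*{~}
A~B                % ~ is expected to behave as a non-breaking space
\end{document}
```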

    If you do not need shorthands, or prefer an alternative approach of your own, you may

    want to switch them off with the package option shorthands=off, as described below.

    \useshorthands * {〈char〉}

    The command \useshorthands initiates the definition of user-defined shorthand

    sequences. It has one argument, the character that starts these personal shorthands.

    New 3.9a User shorthands are not always alive, as they may be deactivated by languages

    (for example, if you use " for your user shorthands and switch from german to french, they

    stop working). Therefore, a starred version \useshorthands*{〈char〉} is provided, which

    makes sure shorthands are always activated.

    Currently, if the package option shorthands is used, you must include any character to be

    activated with \useshorthands. This restriction will be lifted in a future release.

    \defineshorthand[〈language〉,〈language〉,...]{〈shorthand〉}{〈code〉}

    The command \defineshorthand takes two arguments: the first is a one- or two-character

    shorthand sequence, and the second is the code the shorthand should expand to.

    New 3.9a An optional argument allows you to (re)define language and system shorthands

    (some languages do not activate shorthands, so you may want to add

    \languageshorthands{〈lang〉} to the corresponding \extras〈lang〉, as explained below).

    By default, user shorthands are (re)defined.

    User shorthands override language ones, which in turn override system shorthands.

    Language-dependent user shorthands (new in 3.9) take precedence over “normal” user

    shorthands.

    EXAMPLE Let’s assume you want a unified set of shorthands for discretionaries (languages do not

    define shorthands consistently, and "-, \-, "= have different meanings). You can start with, say:

    \useshorthands*{"}

    \defineshorthand{"*}{\babelhyphen{soft}}

    \defineshorthand{"-}{\babelhyphen{hard}}

    However, the behavior of hyphens is language-dependent. For example, in languages like Polish

    and Portuguese, a hard hyphen inside compound words is repeated at the beginning of the next

    line. You can then set:

    \defineshorthand[*polish,*portuguese]{"-}{\babelhyphen{repeat}}

    4 With it, encoded strings may not work as expected.

    Here, options with * set a language-dependent user shorthand, which means the generic one

    above only applies for the rest of languages; without * they would (re)define the language

    shorthands instead, which are overridden by user ones.

    Now, you have a single unified shorthand ("-), with a content-based meaning (‘compound word

    hyphen’) whose visual behavior is that expected in each context.

    \languageshorthands{〈language〉}

    The command \languageshorthands can be used to switch the shorthands on the

    language level. It takes one argument, the name of a language or none (the latter does what

    its name suggests).5 Note that for this to work the language should have been specified as

    an option when loading the babel package. For example, you can use in english the

    shorthands defined by ngerman with

    \addto\extrasenglish{\languageshorthands{ngerman}}

    (You may also need to activate them as user shorthands in the preamble with, for example,

    \useshorthands or \useshorthands*.)

    EXAMPLE Very often, this is a more convenient way to deactivate shorthands than \shorthandoff,

    for example if you want to define a macro to ease typing phonetic characters with tipa:

    \newcommand{\myipa}[1]{{\languageshorthands{none}\tipaencoding#1}}

    \babelshorthand{〈shorthand〉}

    With this command you can use a shorthand even if (1) not activated in shorthands (in

    this case only shorthands for the current language are taken into account, ie, not user

    shorthands), (2) turned off with \shorthandoff or (3) deactivated with the internal

    \bbl@deactivate; for example, \babelshorthand{"u} or \babelshorthand{:}. (You can

    conveniently define your own macros, or even your own user shorthands provided they

    do not overlap.)

    EXAMPLE Since by default shorthands are not activated until \begin{document}, you may use this

    macro when defining the \title in the preamble:

    \title{Documento científico\babelshorthand{"-}técnico}

    For your records, here is a list of shorthands, but you must double check them, as they may

    change:6

    Languages with no shorthands Croatian, English (any variety), Indonesian, Hebrew,

    Interlingua, Irish, Lower Sorbian, Malaysian, North Sami, Romanian, Scottish, Welsh

    5 Actually, any name not corresponding to a language group does the same as none. However, follow this

    convention because it might be enforced in future releases of babel to catch possible errors.

    6 Thanks to Enrico Gregorio

    Languages with only " as defined shorthand character Albanian, Bulgarian, Danish,

    Dutch, Finnish, German (old and new orthography, also Austrian), Icelandic, Italian,

    Norwegian, Polish, Portuguese (also Brazilian), Russian, Serbian (with Latin script),

    Slovene, Swedish, Ukrainian, Upper Sorbian

    Basque " ' ~

    Breton : ; ? !

    Catalan " ' `

    Czech " -

    Esperanto ^

    Estonian " ~

    French (all varieties) : ; ? !

    Galician " . ' ~ < >

    Greek ~

    Hungarian `

    Kurmanji ^

    Latin " ^ =

    Slovak " ^ ' -

    Spanish " . < > ' ~

    Turkish : ! =

    In addition, the babel core declares ~ as a one-char shorthand which is let, like the

    standard ~, to a non breaking space.7

    \ifbabelshorthand{〈character〉}{〈true〉}{〈false〉}

    New 3.23 Tests if a character has been made a shorthand.
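
A short sketch of how \ifbabelshorthand might be used follows. It assumes the ngerman language, for which " is listed above as the shorthand character; the messages written with \typeout are purely illustrative.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[ngerman]{babel}

\begin{document}
% " is expected to be a shorthand character with ngerman.
\ifbabelshorthand{"}%
  {\typeout{The double quote is a shorthand character.}}%
  {\typeout{The double quote is not a shorthand character.}}

% / is not expected to have been declared as a shorthand here.
\ifbabelshorthand{/}%
  {\typeout{The slash is a shorthand character.}}%
  {\typeout{The slash is not a shorthand character.}}

Text.
\end{document}
```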

    \aliasshorthand{〈original〉}{〈alias〉}

    The command \aliasshorthand can be used to let another character perform the same

    functions as the default shorthand character. If one prefers for example to use the

    character / over " in typing Polish texts, this can be achieved by entering

    \aliasshorthand{"}{/}. For the reasons in the warning below, usage of this macro is not

    recommended.

    NOTE The substitute character must not have been declared before as shorthand (in such a case,

    \aliasshorthand is ignored).

    EXAMPLE The following example shows how to replace a shorthand by another

    \aliasshorthand{~}{^}

    \AtBeginDocument{\shorthandoff*{~}}

    WARNING Shorthands remember somehow the original character, and the fallback value is that of

    the latter. So, in this example, if no shorthand is found, ^ expands to a non-breaking space,

    because this is the value of ~ (internally, ^ still calls \active@char~ or \normal@char~).

    Furthermore, if you change the system value of ^ with \defineshorthand nothing happens.

    1.11 Package options

    New 3.9a These package options are processed before language options, so that they are

    taken into account irrespective of their order. The first three options have been available in

    previous versions.

    KeepShorthandsActive

    Tells babel not to deactivate shorthands after loading a language file, so that they are also

    available in the preamble.

    activeacute

    For some languages babel supports this option to set ' as a shorthand in case it is not done

    by default.

    activegrave

    Same for `.

    shorthands= 〈char〉〈char〉... | off

    The only language shorthands activated are those given, like, eg:

    \usepackage[esperanto,french,shorthands=:;!?]{babel}

    If ' is included, activeacute is set; if ` is included, activegrave is set. Active characters

    (like ~) should be preceded by \string (otherwise they will be expanded by LATEX before

    they are passed to the package and therefore they will not be recognized); however, t is

    provided for the common case of ~ (as well as c for the not so common case of the comma).

    With shorthands=off no language shorthands are defined. As some languages use this

    mechanism for tools not available otherwise, a macro \babelshorthand is defined, which

    allows using them; see above.
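
A minimal sketch of shorthands=off combined with \babelshorthand follows. It assumes that the ngerman language file provides the shorthand "- (a hyphenation aid); the word used is only an illustration.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[ngerman,shorthands=off]{babel}

\begin{document}
% With shorthands=off the " character is not active:
% this is a plain quote mark followed by the letter a.
"a

% The language shorthand "- is still reachable through \babelshorthand.
Druck\babelshorthand{"-}erzeugnis
\end{document}
```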

    safe= none | ref | bib

    Some LATEX macros are redefined so that using shorthands is safe. With safe=bib only

    \nocite, \bibcite and \bibitem are redefined. With safe=ref only \newlabel, \ref and

    \pageref are redefined (as well as a few macros from varioref and ifthen).

    With safe=none no macro is redefined. This option is strongly recommended, because a

    good deal of incompatibilities and errors are related to these redefinitions. As of

    New 3.34, in ε-TeX based engines (ie, almost every engine except the oldest ones) shorthands

    can be used in these macros (formerly you could not).

    math= active | normal

    Shorthands are mainly intended for text, not for math. By setting this option with the

    value normal they are deactivated in math mode (default is active) and things like ${a'}$

    (a closing brace after a shorthand) are not a source of trouble anymore.
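
The following sketch illustrates math=normal. It assumes that, with the activeacute option, Spanish makes ' a shorthand (as the list of shorthands above suggests); whether this holds for a given release should be checked.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[spanish,activeacute,math=normal]{babel}

\begin{document}
% ' may be an active shorthand in text, but with math=normal it is
% deactivated in math mode, so a closing brace right after it is harmless.
La derivada es ${a'}$.
\end{document}
```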

    config= 〈file〉

    Load 〈file〉.cfg instead of the default config file bblopts.cfg (the file is loaded even with

    noconfigs).

    main= 〈language〉

    Sets the main language, as explained above, ie, this language is always loaded last. If it is

    not given as package or global option, it is added to the list of requested languages.

    headfoot= 〈language〉

    By default, headlines and footlines are not touched (only marks), and if they contain

    language-dependent macros (which is not usual) there may be unexpected results. With

    this option you may set the language in heads and foots.
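
As an illustration of main and headfoot together, the sketch below loads two languages, makes ngerman the main one regardless of the order of the options, and sets heads and foots in the same language. The choice of languages and class is arbitrary.

```latex
\documentclass{book}
\usepackage[T1]{fontenc}
% main= selects the main language; headfoot= sets the language of heads and foots.
\usepackage[ngerman,french,main=ngerman,headfoot=ngerman]{babel}

\begin{document}
\chapter{Einleitung}
Deutscher Text.

\selectlanguage{french}
Texte en français.
\end{document}
```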

    7 This declaration serves no purpose, but it is preserved for backward compatibility.

    noconfigs

    Global and language default config files are not loaded, so you can make sure your

    document is not spoilt by an unexpected .cfg file. However, if the key config is set, this

    file is loaded.

    showlanguages

    Prints to the log the list of languages loaded when the format was created: number

    (remember dialects can share it), name, hyphenation file and exceptions file.

    nocase

    New 3.9l Language settings for uppercase and lowercase mapping (as set by \SetCase)

    are ignored. Use only if there are incompatibilities with other packages.

    silent

    New 3.9l No warnings and no infos are written to the log file.8

    strings= generic | unicode | encoded | 〈label〉 | 〈font encoding〉

    Selects the encoding of strings in languages supporting this feature. Predefined labels are

    generic (for traditional TEX, LICR and ASCII strings), unicode (for engines like xetex and

    luatex) and encoded (for special cases requiring mixed encodings). Other allowed values

    are font encoding codes (T1, T2A, LGR, L7X...), but only in languages supporting them. Be

    aware that with encoded, captions are protected, but they work in \MakeUppercase and the like

    (this feature misuses some internal LATEX tools, so use it only as a last resort).

    hyphenmap= off | first | select | other | other*

    New 3.9g Sets the behavior of case mapping for hyphenation, provided the language

    defines it.9 It can take the following values:

    off deactivates this feature and no case mapping is applied;

    first sets it at the first switching commands in the current or parent scope (typically,

    when the aux file is first read and at \begin{document}, but also the first

    \selectlanguage in the preamble), and it’s the default if a single language option has

    been stated;10

    select sets it only at \selectlanguage;

    other also sets it at otherlanguage;

    other* also sets it at otherlanguage* as well as in heads and foots (if the option headfoot

    is used) and in auxiliary files (ie, at \select@language), and it’s the default if several

    language options have been stated. The option first can be regarded as an optimized

    version of other* for monolingual documents.11

    bidi= default | basic | basic-r | bidi-l | bidi-r

    New 3.14 Selects the bidi algorithm to be used in luatex and xetex. See sec. 1.24.

    layout=

    New 3.16 Selects which layout elements are adapted in bidi documents. See sec. 1.24.

    1.12 The base option

    With this package option babel just loads some basic macros (those in switch.def),

    defines \AfterBabelLanguage and exits. It also selects the hyphenation patterns for the

    last language passed as option (by its name in language.dat). There are two main uses:

    classes and packages, and as a last resort in case there are, for some reason, incompatible

    languages. It can be used if you just want to select the hyphenation patterns of a single

    language, too.

    8 You can use alternatively the package silence.

    9 Turned off in plain.

    10 Duplicated options count as several ones.

    11 Providing foreign is pointless, because the case mapping applied is that at the end of the paragraph, but if

    either xetex or luatex change this behavior it might be added. On the other hand, other is provided even if I [JBL]

    think it isn’t really useful, but who knows.

    \AfterBabelLanguage{〈option-name〉}{〈code〉}

    This command is currently the only one provided by base. It executes 〈code〉 when the file loaded

    by the corresponding package option is finished (at \ldf@finish). The setting is global. So

    \AfterBabelLanguage{french}{...}

    does ... at the end of french.ldf. It can be used in ldf files, too, but in such a case the code

    is executed only if 〈option-name〉 is the same as \CurrentOption (which may not be the

    same as the option name as set in \usepackage!).

    EXAMPLE Consider two languages foo and bar defining the same \macro with \newcommand. An

    error is raised if you attempt to load both. Here is a way to overcome this problem:

    \usepackage[base]{babel}

    \AfterBabelLanguage{foo}{%

    \let\macroFoo\macro

    \let\macro\relax}

    \usepackage[foo,bar]{babel}

    WARNING Currently this option is not compatible with languages loaded on the fly.

    1.13 ini files

    An alternative approach to define a language (or, more precisely, a locale) is by means of

    an ini file. Currently babel provides about 200 of these files containing the basic data

    required for a locale.

    ini files are not meant only for babel, and they have been devised as a resource for other

    packages. To ease interoperability between TEX and other systems, they are identified with

    the BCP 47 codes as preferred by the Unicode Common Locale Data Repository, which was

    used as source for most of the data provided by these files, too (the main exception being

    the \...name strings).

    Most of them set the date, and many also the captions (Unicode and LICR). They will be

    evolving with the time to add more features (something to keep in mind if backward

    compatibility is important). The following section shows how to make use of them by

    means of \babelprovide. In other words, \babelprovide is mainly meant for auxiliary

    tasks, and as an alternative when the ldf, for some reason, does not work as expected.

    EXAMPLE Although Georgian has its own ldf file, here is how to declare this language with an ini

    file in Unicode engines.

    luatex/xetex

    \documentclass{book}

    \usepackage{babel}

    \babelprovide[import, main]{georgian}

    \babelfont{rm}[Renderer=Harfbuzz]{DejaVu Sans}

    \begin{document}

    \tableofcontents

    \chapter{სამზარეულო და სუფრის ტრადიციები}

    ქართული ტრადიციული სამზარეულო ერთ-ერთი უმდიდრესია მთელ მსოფლიოში.

    \end{document}

    New 3.49 Alternatively, you can tell babel to load all or some languages passed as options

    with \babelprovide and not from the ldf file in a few typical cases. Thus, provide=*

    means ‘load the main language with the \babelprovide mechanism instead of the ldf file’

    applying the basic features, which in this case means import, main. There are (currently)

    three options:

    • provide=* is the option just explained, for the main language;

    • provide+=* is the same for additional languages (the main language is still the ldf file);

    • provide*=* is the same for all languages, ie, main and additional.

    EXAMPLE The preamble in the previous example can be more compactly written as:

    \documentclass{book}

    \usepackage[georgian, provide=*]{babel}

    \babelfont{rm}[Renderer=Harfbuzz]{DejaVu Sans}

    Or also:

    \documentclass[georgian]{book}

    \usepackage[provide=*]{babel}

    \babelfont{rm}[Renderer=Harfbuzz]{DejaVu Sans}
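
The provide+=* variant can be sketched in the same way. In this hedged example, to be compiled with luatex or xetex, English is the main language and is loaded from its ldf file, while Georgian is an additional language loaded through the \babelprovide mechanism; the font choice is only an assumption carried over from the example above.

```latex
\documentclass{article}
% english is listed last, so it is the main language (loaded from the ldf file);
% provide+=* loads the additional language, georgian, from its ini file.
\usepackage[georgian, english, provide+=*]{babel}

% Set a font with Georgian glyphs for the Georgian text only.
\babelfont[georgian]{rm}{DejaVu Sans}

\begin{document}
English text. \foreignlanguage{georgian}{ქართული} More English text.
\end{document}
```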

    NOTE The ini files just define and set some parameters, but the corresponding behavior is not

    always implemented. Also, there are some limitations in the engines. A few remarks follow

    (which may no longer be valid when you read this manual, if the packages involved have been

    updated). The Harfbuzz renderer still has some issues, so as a rule of thumb prefer the default

    renderer, and resort to Harfbuzz only if the former does not work for you. Fortunately, fonts can

    be loaded twice with different renderers; for example:

    \babelfont[spanish]{rm}{FreeSerif}

    \babelfont[hindi]{rm}[Renderer=Harfbuzz]{FreeSerif}

    Arabic Monolingual documents mostly work in luatex, but it must be fine tuned, particularly

    graphical elements like picture. In xetex babel resorts to the bidi package, which seems to

    work.

    Hebrew Niqqud marks seem to work in both engines, but depending on the font cantillation

    marks might be misplaced (xetex or luatex with Harfbuzz seems better, but still problematic).

    Devanagari In luatex and the default renderer many fonts work, but some others do not, the

    main issue being the ‘ra’. You may need to set explicitly the script to either deva or dev2, eg:

    \newfontscript{Devanagari}{deva}

    Other Indic scripts are still under development in the default luatex renderer, but should

    work with Renderer=Harfbuzz. They also work with xetex, although unlike with luatex fine

    tuning the font behavior is not always possible.

    Southeast scripts Thai works in both luatex and xetex, but line breaking differs (rules can be

    modified in luatex; they are hard-coded in xetex). Lao seems to work, too, but there are no

    patterns for the latter in luatex. Khmer clusters are rendered wrongly with the default

    renderer. The comment about Indic scripts and lualatex also applies here. Some quick

    patterns can help, with something similar to:

    \babelprovide[import, hyphenrules=+]{lao}

    \babelpatterns[lao]{1ດ 1ມ 1ອ 1ງ 1ກ 1າ} % Random

    East Asia scripts Settings for either Simplified or Traditional should work out of the box, with

    basic line breaking with any renderer. Although for a few words and short texts the ini files

    should be fine, CJK texts are best set with a dedicated framework (CJK, luatexja, kotex, CTeX,

    etc.). This is what the class ltjbook does with luatex, which can be used in conjunction with

    the ldf for japanese, because the following piece of code loads luatexja:

    \documentclass[japanese]{ltjbook}

    \usepackage{babel}

    Latin, Greek, Cyrillic Combining chars with the default luatex font renderer might be wrong;

    on the other hand, with the Harfbuzz renderer diacritics are stacked correctly, but many

    hyphenations points are discarded (this bug seems related to kerning, so it depends on the

    font). With xetex both combining characters and hyphenation work as expected (not quite,

    but in most cases it works; the problem here are font clusters).

    NOTE Wikipedia defines a locale as follows: “In computing, a locale is a set of parameters that defines

    the user’s language, region and any special variant preferences that the user wants to see in their

    user interface. Usually a locale identifier consists of at least a language code and a country/region

    code.” Babel is moving gradually from the old and fuzzy concept of language to the more modern

    one of locale. Note that each locale is by itself a separate “language”, which explains why there are so

    many files. This is on purpose, so that possible variants can be created and/or redefined easily.

    Here is the list (u means Unicode captions, and l means LICR captions; the markers are given in brackets after the name):

    af Afrikaans [ul]
    agq Aghem
    ak Akan
    am Amharic [ul]
    ar Arabic [ul]
    ar-DZ Arabic [ul]
    ar-MA Arabic [ul]
    ar-SY Arabic [ul]
    as Assamese
    asa Asu
    ast Asturian [ul]
    az-Cyrl Azerbaijani
    az-Latn Azerbaijani
    az Azerbaijani [ul]
    bas Basaa
    be Belarusian [ul]
    bem Bemba
    bez Bena
    bg Bulgarian [ul]
    bm Bambara
    bn Bangla [ul]
    bo Tibetan [u]
    brx Bodo
    bs-Cyrl Bosnian
    bs-Latn Bosnian [ul]
    bs Bosnian [ul]
    ca Catalan [ul]
    ce Chechen
    cgg Chiga
    chr Cherokee
    ckb Central Kurdish
    cop Coptic
    cs Czech [ul]
    cu Church Slavic
    cu-Cyrs Church Slavic
    cu-Glag Church Slavic
    cy Welsh [ul]
    da Danish [ul]
    dav Taita
    de-AT German [ul]
    de-CH German [ul]
    de German [ul]
    dje Zarma
    dsb Lower Sorbian [ul]
    dua Duala
    dyo Jola-Fonyi
    dz Dzongkha
    ebu Embu
    ee Ewe
    el Greek [ul]
    el-polyton Polytonic Greek [ul]
    en-AU English [ul]
    en-CA English [ul]
    en-GB English [ul]
    en-NZ English [ul]
    en-US English [ul]
    en English [ul]
    eo Esperanto [ul]
    es-MX Spanish [ul]
    es Spanish [ul]
    et Estonian [ul]
    eu Basque [ul]
    ewo Ewondo
    fa Persian [ul]
    ff Fulah
    fi Finnish [ul]
    fil Filipino
    fo Faroese
    fr French [ul]
    fr-BE French [ul]
    fr-CA French [ul]
    fr-CH French [ul]
    fr-LU French [ul]
    fur Friulian [ul]
    fy Western Frisian
    ga Irish [ul]
    gd Scottish Gaelic [ul]
    gl Galician [ul]
    grc Ancient Greek [ul]
    gsw Swiss German
    gu Gujarati
    guz Gusii
    gv Manx
    ha-GH Hausa
    ha-NE Hausa [l]
    ha Hausa
    haw Hawaiian
    he Hebrew [ul]
    hi Hindi [u]
    hr Croatian [ul]
    hsb Upper Sorbian [ul]
    hu Hungarian [ul]
    hy Armenian [u]
    ia Interlingua [ul]
    id Indonesian [ul]
    ig Igbo
    ii Sichuan Yi
    is Icelandic [ul]
    it Italian [ul]
    ja Japanese
    jgo Ngomba
    jmc Machame
    ka Georgian [ul]
    kab Kabyle
    kam Kamba
    kde Makonde
    kea Kabuverdianu
    khq Koyra Chiini
    ki Kikuyu
    kk Kazakh
    kkj Kako
    kl Kalaallisut
    kln Kalenjin
    km Khmer
    kn Kannada [ul]
    ko Korean
    kok Konkani
    ks Kashmiri
    ksb Shambala
    ksf Bafia
    ksh Colognian
    kw Cornish
    ky Kyrgyz
    lag Langi
    lb Luxembourgish
    lg Ganda
    lkt Lakota
    ln Lingala
    lo Lao [ul]
    lrc Northern Luri
    lt Lithuanian [ul]
    lu Luba-Katanga
    luo Luo
    luy Luyia
    lv Latvian [ul]
    mas Masai
    mer Meru
    mfe Morisyen
    mg Malagasy
    mgh Makhuwa-Meetto
    mgo Metaʼ
    mk Macedonian [ul]
    ml Malayalam [ul]
    mn Mongolian
    mr Marathi [ul]
    ms-BN Malay [l]
    ms-SG Malay [l]
    ms Malay [ul]
    mt Maltese
    mua Mundang
    my Burmese
    mzn Mazanderani
    naq Nama
    nb Norwegian Bokmål [ul]
    nd North Ndebele
    ne Nepali
    nl Dutch [ul]
    nmg Kwasio
    nn Norwegian Nynorsk [ul]
    nnh Ngiemboon
    nus Nuer
    nyn Nyankole
    om Oromo
    or Odia
    os Ossetic
    pa-Arab Punjabi
    pa-Guru Punjabi
    pa Punjabi
    pl Polish [ul]
    pms Piedmontese [ul]
    ps Pashto
    pt-BR Portuguese [ul]
    pt-PT Portuguese [ul]
    pt Portuguese [ul]
    qu Quechua
    rm Romansh [ul]
    rn Rundi
    ro Romanian [ul]
    rof Rombo
    ru Russian [ul]
    rw Kinyarwanda
    rwk Rwa
    sa-Beng Sanskrit
    sa-Deva Sanskrit
    sa-Gujr Sanskrit
    sa-Knda Sanskrit
    sa-Mlym Sanskrit
    sa-Telu Sanskrit
    sa Sanskrit
    sah Sakha
    saq Samburu
    sbp Sangu
    se Northern Sami [ul]
    seh Sena
    ses Koyraboro Senni
    sg Sango
    shi-Latn Tachelhit
    shi-Tfng Tachelhit
    shi Tachelhit
    si Sinhala
    sk Slovak [ul]
    sl Slovenian [ul]
    smn Inari Sami
    sn Shona
    so Somali
    sq Albanian [ul]
    sr-Cyrl-BA Serbian [ul]
    sr-Cyrl-ME Serbian [ul]
    sr-Cyrl-XK Serbian [ul]
    sr-Cyrl Serbian [ul]
    sr-Latn-BA Serbian [ul]
    sr-Latn-ME Serbian [ul]
    sr-Latn-XK Serbian [ul]
    sr-Latn Serbian [ul]
    sr Serbian [ul]
    sv Swedish [ul]
    sw Swahili
    ta Tamil [u]
    te Telugu [ul]
    teo Teso
    th Thai [ul]
    ti Tigrinya
    tk Turkmen [ul]
    to Tongan
    tr Turkish [ul]
    twq Tasawaq
    tzm Central Atlas Tamazight
    ug Uyghur
    uk Ukrainian [ul]
    ur Urdu [ul]
    uz-Arab Uzbek
    uz-Cyrl Uzbek
    uz-Latn Uzbek
    uz Uzbek
    vai-Latn Vai
    vai-Vaii Vai
    vai Vai
    vi Vietnamese [ul]
    vun Vunjo
    wae Walser
    xog Soga
    yav Yangben
    yi Yiddish
    yo Yoruba
    yue Cantonese
    zgh Standard Moroccan Tamazight
    zh-Hans-HK Chinese
    zh-Hans-MO Chinese
    zh-Hans-SG Chinese
    zh-Hans Chinese
    zh-Hant-HK Chinese
    zh-Hant-MO Chinese
    zh-Hant Chinese
    zh Chinese
    zu Zulu

    In some contexts (currently \babelfont) an ini file may be loaded by its name. Here is the

    list of the names currently supported. With these languages, \babelfont loads (if not done

    before) the language and script names (even if the language is defined as a package option

    with an ldf file). These are also the names recognized by \babelprovide with a valueless

    import.

    aghem

    akan

    albanian

    american

    amharic

    ancientgreek

    arabic

    arabic-algeria

    arabic-DZ

    arabic-morocco

    arabic-MA

    arabic-syria

    arabic-SY

    armenian

    assamese

    asturian

    asu

    australian

    austrian

    azerbaijani-cyrillic

    azerbaijani-cyrl

    azerbaijani-latin

    azerbaijani-latn

    azerbaijani

    bafia

    bambara

    basaa

    basque

    belarusian

    bemba

    bena

    bengali

    bodo

    bosnian-cyrillic

    bosnian-cyrl

    bosnian-latin

    bosnian-latn

    bosnian

    brazilian

    breton

    british

    bulgarian

    burmese

    canadian

    cantonese

    catalan

    centralatlastamazight

    centralkurdish

    chechen

    cherokee

    chiga

    chinese-hans-hk

    chinese-hans-mo

    chinese-hans-sg

    chinese-hans

    chinese-hant-hk

    chinese-hant-mo

    chinese-hant

    chinese-simplified-hongkongsarchina

    chinese-simplified-macausarchina

    chinese-simplified-singapore

    chinese-simplified

    chinese-traditional-hongkongsarchina

    chinese-traditional-macausarchina

    chinese-traditional

    chinese

    churchslavic

    churchslavic-cyrs

    churchslavic-oldcyrillic (see footnote 12)

    churchsslavic-glag

    churchsslavic-glagolitic

    colognian

    cornish

    croatian

    czech

    danish

    duala

    dutch

    dzongkha

    embu

    english-au

    english-australia

    english-ca

    english-canada

    english-gb

    english-newzealand

    english-nz

    english-unitedkingdom

    12 The name in the CLDR is Old Church Slavonic Cyrillic, but it has been shortened for practical reasons.

    english-unitedstates

    english-us

    english

    esperanto

    estonian

    ewe

    ewondo

    faroese

    filipino

    finnish

    french-be

    french-belgium

    french-ca

    french-canada

    french-ch

    french-lu

    french-luxembourg

    french-switzerland

    french

    friulian

    fulah

    galician

    ganda

    georgian

    german-at

    german-austria

    german-ch

    german-switzerland

    german

    greek

    gujarati

    gusii

    hausa-gh

    hausa-ghana

    hausa-ne

    hausa-niger

    hausa

    hawaiian

    hebrew

    hindi

    hungarian

    icelandic

    igbo

    inarisami

    indonesian

    interlingua

    irish

    italian

    japanese

    jolafonyi

    kabuverdianu

    kabyle

    kako

    kalaallisut

    kalenjin

    kamba

    kannada

    kashmiri

    kazakh

    khmer

    kikuyu

    kinyarwanda

    konkani

    korean

    koyraborosenni

    koyrachiini

    kwasio

    kyrgyz

    lakota

    langi

    lao

    latvian

    lingala

    lithuanian

    lowersorbian

    lsorbian

    lubakatanga

    luo

    luxembourgish

    luyia

    macedonian

    machame

    makhuwameetto

    makonde

    malagasy

    malay-bn

    malay-brunei

    malay-sg

    malay-singapore

    malay

    malayalam

    maltese

    manx

    marathi

    masai

    mazanderani

    meru

    meta

    mexican

    mongolian

    morisyen

    mundang

    nama

    nepali

    newzealand

    ngiemboon

    ngomba

    norsk

    northernluri

    northernsami

    northndebele

    norwegianbokmal

    norwegiannynorsk

    nswissgerman

    nuer

    nyankole

    nynorsk

    occitan

    oriya

    oromo

    ossetic

    pashto

    persian

    piedmontese

    polish

    polytonicgreek

    portuguese-br

    portuguese-brazil

    portuguese-portugal

    portuguese-pt

    portuguese

    punjabi-arab

    punjabi-arabic

    punjabi-gurmukhi

    punjabi-guru

    punjabi

    quechua

    romanian

    romansh

    rombo

    rundi

    russian

    rwa

    sakha

    samburu

    samin

    sango

    sangu

    sanskrit-beng

    sanskrit-bengali

    sanskrit-deva

    sanskrit-devanagari

    sanskrit-gujarati

    sanskrit-gujr

    sanskrit-kannada

    sanskrit-knda

    sanskrit-malayalam

    sanskrit-mlym

    sanskrit-telu

    sanskrit-telugu

    sanskrit

    scottishgaelic

    sena

    serbian-cyrillic-bosniaherzegovina

    serbian-cyrillic-kosovo

    serbian-cyrillic-montenegro

    serbian-cyrillic

    serbian-cyrl-ba

    serbian-cyrl-me

    serbian-cyrl-xk

    serbian-cyrl

    serbian-latin-bosniaherzegovina

    serbian-latin-kosovo

    serbian-latin-montenegro

    serbian-latin

    serbian-latn-ba

    serbian-latn-me

    serbian-latn-xk

    serbian-latn

    serbian

    shambala

    shona

    sichuanyi

    sinhala

    slovak

    slovene

    slovenian

    soga

    somali

    spanish-mexico

    spanish-mx

    spanish

    standardmoroccantamazight

    swahili

    swedish

    swissgerman

    tachelhit-latin

    tachelhit-latn

    tachelhit-tfng

    tachelhit-tifinagh

    tachelhit

    taita

    tamil

    tasawaq

    telugu

    teso

    thai

    tibetan

    tigrinya

    tongan

    turkish

    turkmen

    ukenglish

    ukrainian

    uppersorbian

    urdu

    usenglish

    usorbian

    uyghur

    uzbek-arab

    uzbek-arabic

    uzbek-cyrillic

    uzbek-cyrl

    uzbek-latin

    uzbek-latn

    uzbek

    vai-latin

    vai-latn

    vai-vai

    vai-vaii

    vai

    vietnam

    vietnamese

    vunjo

    walser

    welsh

    westernfrisian

    yangben

    yiddish

    yoruba

    zarma

    zulu

    afrikaans

    Modifying and adding values to ini files

    New 3.39 There is a way to modify the values of ini files when they get loaded with

    \babelprovide and import. To set, say, digits.native in the numbers section, use

    something like numbers/digits.native=abcdefghij. Keys may be added, too. Without

    import you may modify the identification keys.

    This can be used to create private variants easily. All you need is to import the same ini

    file with a different locale name and different parameters.
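    As a sketch of the private-variant idea, the following imports the Hungarian ini data under a different locale name and overrides one caption. The locale name myhungarian and the caption text are made up for illustration, and luatex or xetex is assumed (with 8-bit engines the captions.licr group may be the one to modify).

    ```latex
    % Sketch only: "myhungarian" is a hypothetical private locale name.
    \documentclass{article}
    \usepackage[english]{babel}
    % Import babel's Hungarian ini data, but replace the "list of tables" caption.
    \babelprovide[import=hu, captions/listtable = My list of tables]{myhungarian}
    \begin{document}
    \selectlanguage{myhungarian}
    \listtablename % should now print the overridden caption
    \end{document}
    ```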

    1.14 Selecting fonts

    New 3.15 Babel provides a high level interface on top of fontspec to select fonts. There

    is no need to load fontspec explicitly – babel does it for you with the first \babelfont.13

    \babelfont[〈language-list〉]{〈font-family〉}[〈font-options〉]{〈font-name〉}

    NOTE See the note in the previous section about some issues in specific languages.

    The main purpose of \babelfont is to define at once in a multilingual document the fonts

    required by the different languages, with their corresponding language systems (script and

    language). So, if you load, say, 4 languages, \babelfont{rm}{FreeSerif} defines 4 fonts

    (with their variants, of course), which are switched with the language by babel. It is a tool

    to make things easier and transparent to the user.

    Here font-family is rm, sf or tt (or newly defined ones, as explained below), and font-name

    is the same as in fontspec and the like.

    If no language is given, then it is considered the default font for the family, activated when

    a language is selected.

    On the other hand, if there is one or more languages in the optional argument, the font will

    be assigned to them, overriding the default one. Alternatively, you may set a font for a

    script – just precede its name (lowercase) with a star (eg, *devanagari). With this optional

    argument, the font is not yet defined, but just predeclared. This means you may define as

    many fonts as you want ‘just in case’, because if the language is never selected, the

    corresponding \babelfont declaration is just ignored.

    Babel takes care of the font language and the font script when languages are selected (as

    well as the writing direction); see the recognized languages above. In most cases, you will

    not need font-options, which is the same as in fontspec, but you may add further key/value

    pairs if necessary.
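    The three forms of the optional argument can be put side by side. This is a preamble sketch for luatex or xetex. The font names are only examples (any installed fonts covering the scripts would do), and the choice of Hebrew and Hindi is an assumption made for illustration.

    ```latex
    \usepackage[english, bidi=default]{babel}
    \babelprovide[import]{hebrew}
    \babelprovide[import]{hindi}
    % No optional argument: default roman font, used by every language
    % that has no more specific declaration.
    \babelfont{rm}{Iwona}
    % One language in the optional argument: overrides the default for Hebrew.
    \babelfont[hebrew]{rm}{FreeSerif}
    % Starred lowercase script name: applies to languages written in Devanagari.
    \babelfont[*devanagari]{rm}{Shobhika}
    ```

    The last two declarations are only predeclared, so they cost nothing if the corresponding language is never selected.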

    EXAMPLE Usage in most cases is very simple. Let us assume you are setting up a document in

    Swedish, with some words in Hebrew, with a font suited for both languages.

    13 See also the package combofont for a complementary approach.

    luatex/xetex

    \documentclass{article}

    \usepackage[swedish, bidi=default]{babel}

    \babelprovide[import]{hebrew}

    \babelfont{rm}{FreeSerif}

    \begin{document}

    Svenska \foreignlanguage{hebrew}{עִבְרִית} svenska.

    \end{document}

    If, on the other hand, you have to resort to different fonts, you can replace the \babelfont{rm}{FreeSerif} line above with,

    say:

    luatex/xetex

    \babelfont{rm}{Iwona}

    \babelfont[hebrew]{rm}{FreeSerif}

    \babelfont can be used to implicitly define a new font family. Just write its name instead

    of rm, sf or tt. This is the preferred way to select fonts in addition to the three basic

    families.

    EXAMPLE Here is how to do it:

    luatex/xetex

    \babelfont{kai}{FandolKai}

    Now, \kaifamily and \kaidefault, as well as \textkai are at your disposal.
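    A short usage sketch of the new family follows. It assumes the FandolKai font is installed and that the document is compiled with luatex or xetex; the sample text is arbitrary.

    ```latex
    \documentclass{article}
    \usepackage[english]{babel}
    \babelfont{kai}{FandolKai}   % implicitly defines the kai family
    \begin{document}
    Normal text, \textkai{a few words in the kai family}, normal again.

    {\kaifamily A whole group set in the kai family.}
    \end{document}
    ```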

    NOTE You may load fontspec explicitly. For example:

    luatex/xetex

    \usepackage{fontspec}

    \newfontscript{Devanagari}{deva}

    \babelfont[hindi]{rm}{Shobhika}

    This makes sure the OpenType script for Devanagari is deva and not dev2, in case it is not

    detected correctly. You may also pass some options to fontspec: with silent, the warnings about

    unavailable scripts or languages are not shown (they are only really useful when the document

    format is being set up).

    NOTE Directionality is a property affecting margins, indentation, column order, etc., not just text.

    Therefore, it is under the direct control of the language, which applies both the script and the

    direction to the text. As a consequence, there is no need to set Script when declaring a font with

    \babelfont (nor Language). In fact, it is even discouraged.

    NOTE \fontspec is not touched at all, only the preset font families (rm, sf, tt, and the like). If a

    language is switched when an ad hoc font is active, or you select the font with this command,

    neither the script nor the language is passed. You must add them by hand. This is by design, for

    several reasons —for example, each font has its own set of features and a generic setting for

    several of them can be problematic, and also preserving a “lower-level” font selection is useful.

    NOTE The keys Language and Script just pass these values to the font, and do not set the script for

    the language (and therefore the writing direction). In other words, the ini file or \babelprovide

    provides default values for \babelfont if omitted, but the opposite is not true. See the note above

    for the reasons of this behavior.

    WARNING Using \setxxxxfont and \babelfont at the same time is discouraged, but very often

    works as expected. However, be aware that with \setxxxxfont the language system will not be set by

    babel and should be set with fontspec if necessary.

    TROUBLESHOOTING Package fontspec Warning: ’Language ’LANG’ not available for font ’FONT’ with

    script ’SCRIPT’ ’Default’ language used instead’.

    This is not an error. This warning is shown by fontspec, not by babel. It can be irrelevant for

    English, but not for many other languages, including Urdu and Turkish. This is a useful and

    harmless warning, and if everything is fine with your document the best thing you can do is just

    to ignore it altogether.

    TROUBLESHOOTING Package babel Info: The following fonts are not babel standard families.

    This is not an error. babel assumes that if you are using \babelfont for a family, very likely

    you want to define the rest of them. If you don’t, you can find some inconsistencies between

    families. This checking is done at the beginning of the document, at a point where we cannot

    know which families will be used.

    Actually, there is no real need to use \babelfont in a monolingual document, if you set the

    language system in \setmainfont (or not, depending on what you want).

    As the message explains, there is nothing intrinsically wrong with not defining all the families. In

    fact, there is nothing intrinsically wrong with not using \babelfont at all. But you must be aware

    that this may lead to some problems.
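    For the monolingual case, a minimal sketch without \babelfont could look as follows. It assumes luatex or xetex and uses fontspec's Script and Language features. The font is only an example, and if it lacks the requested language system fontspec will issue the warning discussed above.

    ```latex
    \documentclass{article}
    \usepackage[turkish]{babel}
    \usepackage{fontspec}
    % babel does not set the language system for \setmainfont,
    % so it is given by hand here.
    \setmainfont{FreeSerif}[Script=Latin, Language=Turkish]
    \begin{document}
    Türkçe bir metin.
    \end{document}
    ```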

    1.15 Modifying a language

    Modifying the behavior of a language (say, the chapter “caption”), is sometimes necessary,

    but not always trivial. In the case of caption names a specific macro is provided, because

    this is perhaps the most frequent change:

    \setlocalecaption{〈language-name〉}{〈caption-name〉}{〈string〉}

    New 3.51 Here caption-name is the name as a string, without the trailing name. An example,

    which also shows caption names are often a stylistic choice, is:

    \setlocalecaption{english}{contents}{Table of Contents}

    This works not only with existing caption names: it also serves to define new ones,

    by setting caption-name to the name of your choice (name will be appended). Captions

    so defined or redefined behave in the ‘new way’ described in the following note.
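    As an illustration of defining a new caption, the sketch below uses a caption-name that is not among the standard ones. The name glossary and its text are hypothetical, chosen only for the example.

    ```latex
    \documentclass{article}
    \usepackage[english]{babel}
    % "glossary" is not a predefined caption here; "name" is appended,
    % so this should make \glossaryname available for english.
    \setlocalecaption{english}{glossary}{Glossary of Terms}
    \begin{document}
    \section*{\glossaryname}
    \end{document}
    ```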

    NOTE There are a few alternative methods:

    • With data import’ed from ini files, you can modify the values of specific keys, like:

    \babelprovide[import, captions/listtable = Lista de tablas]{spanish}

    (In this particular case, instead of the captions group you may need to modify the

    captions.licr one.)

    • The ‘old way’, still valid for many languages, to redefine a caption is the following:

    \addto\captionsenglish{%

    \renewcommand\contentsname{Foo}%

    }

    As of 3.15, there is no need to hide spaces with % (babel removes them), but it is advisable to

    do so. This redefinition is not activated until the language is selected.

    • The ‘new way’, which is found in bulgarian, azerbaijani, spanish, french, turkish,

    icelandic, vietnamese and a few more, as well as in languages created with \babelprovide

    and its key import, is:

    \renewcommand\spanishchaptername{Foo}

    This redefinition is immediate.

    NOTE Do not redefine a caption in the following way:

    \AtBeginDocument{\renewcommand\contentsname{Foo}}

    The changes may be discarded with a language selector, and the original value restored.

    Macros to be run when a language is selected can be added to \extras〈lang〉:

    \addto\extrasrussian{\mymacro}

    There is a counterpart for code to be run when a language is unselected: \noextras〈lang〉.

    NOTE These macros (\captions〈lang〉, \extras〈lang〉) may be redefined, but must not be used as such – they just pass information to babel, which executes them in the proper context.
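    A paired use of \extras〈lang〉 and \noextras〈lang〉 might look like the following preamble sketch. The helper \langmark is hypothetical and exists only to make the switching visible.

    ```latex
    \usepackage[danish, english]{babel}
    \newcommand\langmark{}                             % hypothetical helper macro
    % Code run each time danish is selected:
    \addto\extrasdanish{\renewcommand\langmark{[da]}}
    % Counterpart, run when danish is unselected:
    \addto\noextrasdanish{\renewcommand\langmark{}}
    ```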

    Another way to modify a language loaded as a package or class option is by means of

    \babelprovide, described below in depth. So, something like:

    \usepackage[danish]{babel}

    \babelprovide[captions=da, hyphenrules=nohyphenation]{danish}

    first loads danish.ldf, and then redefines the captions for danish (as provided by the ini

    file) and prevents hyphenation. The rest of the language definitions are not touched.

    Without the optional argument it just loads some additional tools if provided by the ini file,

    like extra counters.

    1.16 Creating a language

    New 3.10 And what if there is no style for your language or none fits your needs? You

    may then define quickly a language with the help of the following macro in the preamble

    (which may be used to modify an existing language, too, as explained in the previous

    subsection).

    \babelprovide[〈options〉]{〈language-name〉}

    If the language 〈language-name〉 has not been loaded as class or package option and there are no 〈options〉, it creates an “empty” one with some defaults in its internal structure: the hyphen rules, if not available, are set to the current ones, left and right hyphen mins are

    set to 2 and 3. In either case, caption, date and language system are not defined.

    If no ini file is imported with import, 〈language-name〉 is still relevant because in such a case the hyphenation and line breaking rules (including those for South East Asian and

    CJK) are based on it as provided in the ini file corresponding to that name; the same

    applies to OpenType language and script.

    Conveniently, some options allow you to fill in the language, and babel warns you about what to

    do if there is a missing string. Very likely you will find alerts like this in the log file:

    Package babel Warning: \chaptername not set for 'mylang'. Please,

    (babel) define it after the language has been loaded

    (babel) (typically in the preamble) with:

    (babel) \setlocalecaption{mylang}{chapter}{..}

    (babel) Reported on input line 26.

    In most cases, you will only need to define a few macros. Note languages loaded on the fly

    are not yet available in the preamble.

    EXAMPLE If you need a language named arhinish:

    \usepackage[danish]{babel}

    \babelprovide{arhinish}

    \setlocalecaption{arhinish}{chapter}{Chapitula}

    \setlocalecaption{arhinish}{refname}{Refirenke}

    \renewcommand\arhinishhyphenmins{22}

    EXAMPLE Locales with names based on BCP 47 codes can be created with something like:

    \babelprovide[import=en-US]{enUS}

    Note, however, mixing ways to identify locales can lead to problems. For example, is yi the name

    of the language spoken by the Yi people or is it the code for Yiddish?

    The main language is not changed (danish in this example). So, you must add

    \selectlanguage{arhinish} or other selectors where necessary.
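    Putting the pieces together, a complete document using the arhinish example might be sketched as below. The report class is an assumption (it provides \chapter), and babel is expected to warn in the log about the captions that are left undefined.

    ```latex
    \documentclass{report}
    \usepackage[danish]{babel}
    \babelprovide{arhinish}
    \setlocalecaption{arhinish}{chapter}{Chapitula}
    \begin{document}
    \chapter{Dansk}      % main language is still danish
    \selectlanguage{arhinish}
    \chapter{Arhinish}   % heading should now use the caption set above
    \end{document}
    ```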

    If the language has been loaded as an argument in \documentclass or \usepackage, then

    \babelprovide redefines the requested data.

    import= 〈language-tag〉

    New 3.13 Imports data from an ini file, including captions and date (also line breaking

    rules in newly defined languages). For example:

    \babelprovide[import=hu]{hungarian}

    Unicode engines load the UTF-8 variants, while 8-bit engines load the LICR (ie, with macros

    like \' or \ss) ones.

    New 3.23 It may be used without a value. In such a case, the ini file set in the

    corresponding babel-〈language〉.tex (where 〈language〉 is the last argument in

    \babelprovide) is imported. See the list of recognized languages above. So, the previous

    example can be written:

    \babelprovide[import]{hungarian}

    There are about 250 ini files, with data taken from the ldf files and the CLDR provided by

    Unicode. Not all languages in the latter are complete, and therefore neither are the ini

    files. A few languages may show a warning about the current lack of suitability of some

    features.

    Besides \today, this option defines an additional command for dates: \〈language〉date,

    which takes three arguments, namely, year, month and day numbers. In fact, \today calls

    \〈language〉today, which in turn calls

    \〈language〉date{\the\year}{\the\month}{\the\day}. New 3.44 More convenient is

    usually \localedate, which prints the date for the current locale.
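    The date commands can be tried with a sketch like the one below. It assumes that the language-specific macro for hungarian is named \hungariandate and that \localedate takes the same three arguments (year, month, day); the date itself is arbitrary.

    ```latex
    \documentclass{article}
    \usepackage[english]{babel}
    \babelprovide[import]{hungarian}
    \begin{document}
    \selectlanguage{hungarian}
    \today\par                       % current date, Hungarian format
    \hungariandate{2021}{6}{2}\par   % explicit year, month, day
    \localedate{2021}{6}{2}          % same date, for whatever locale is current
    \end{document}
    ```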

    captions= 〈language-tag〉

    Loads only the strings. For example:

    \babelprovide[captions=hu]{hungarian}

    hyphenrules= 〈language-list〉

    With this option, with a space-separated list of hyphenation rules, babel assigns to the

    language the first valid hyphenation rules in the list. For example:

    \babelprovide[hyphenrules=chavacano spanish italian]{chavacano}

    If none of the listed hyphenrules exist, the default behavior applies. Note in this example

    we set chavacano as first option – without it, it would select spanish even if chavacano

    exists.

    A special value is +, which allocates a new language (in the TEX sense). It only makes sense

    as the last value (or the only one; the subsequent ones are silently ignored). It is mostly

    useful with luatex, because you can add some patterns with \babelpatterns, as for

    example:

    \babelprovide[hyphenrules=+]{neo}

    \babelpatterns[neo]{a1 e1 i1 o1 u1}

    In other engines it just suppresses hyphenation (because the pattern list is empty).

    New 3.58 Another special value is unhyphenated, which activates a line breaking mode

    that allows spaces to be stretched to arbitrary amounts.

    main

    This valueless option makes the language the main one (thus overriding that set when

    babel is loaded). Only in newly defined languages.

    EXAMPLE Let’s assume your document is mainly in Polytonic Greek, but with some sections in

    Italian. Then, the first attempt should be:

    \usepackage[italian, greek.polutonic]{babel}

    But if, say, accents in Greek are not shown correctly, you can try:

    \usepackage[italian]{babel}

    \babelprovide[import, main]{polytonicgreek}

    Remember there is an alternative syntax for the latter:

    \usepackage[italian, polytonicgreek, provide=*]{babel}

    script= 〈script-name〉

    New 3.15 Sets the script name to be used by fontspec (eg, Devanagari). Overrides the

    value in the ini file. If fontspec does not define it, then babel sets its tag to that provided

    by the ini file. This value is particularly important because it sets the writing direction, so

    you must use it if for some reason the default value is wrong.

    language= 〈language-name〉

    New 3.15 Sets the language name to be used by fontspec (eg, Hindi). Overrides the value

    in the ini file. If fontspec does not define it, then babel sets its tag to that provided by the

    ini file. Not so important, but sometimes still relevant.

    alph= 〈counter-name〉

    Assigns to \alph that counter. See the next section.

    Alph= 〈counter-name〉

    Same for \Alph.

    A few options (only luatex) set some properties of the writing system used by the language.

    These properties are always applied to the script, no matter which language is active.

    Although somewhat inconsistent, this makes setting a language up easier in most typical

    cases.

    onchar= ids | fonts

    New 3.38 This option is much like an ‘event’ called when a character belonging to the

    script of this locale is found (as its name implies, it acts on characters, not on spaces). There

    are currently two ‘actions’, which can be used at the same time (separated by a space):

    with ids the \language and the \localeid are set to the values of this locale; with fonts,

    the fonts are changed to those of this locale (as set with \babelfont). This option is not

    compatible with mapfont. Characters can be added or modified with \babelcharproperty.
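    A possible use of onchar with both actions is sketched below, for luatex only. The fonts are examples, and bidi=basic is an assumption made because the second script here is written right to left.

    ```latex
    \documentclass{article}
    \usepackage[english, bidi=basic]{babel}
    % Switch \language, \localeid and fonts whenever a Hebrew-script
    % character is found, without explicit language selectors.
    \babelprovide[import, onchar=ids fonts]{hebrew}
    \babelfont{rm}{Iwona}
    \babelfont[hebrew]{rm}{FreeSerif}
    \begin{document}
    English text with עברית inline and no selector around it.
    \end{document}
    ```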

    NOTE An alternative approach with luatex and Harfbuzz is the font option

    RawFeature={multiscript=auto}. It does not switch the babel language and therefore the line

    breaking rules, but in many cases it can be enough.

    intraspace= 〈base〉 〈shrink〉 〈stretch〉

    Sets the interword space for the writing system of the language, in em units (so, 0 .1 0 is

    0em plus .1em). Like \spaceskip, the em unit applied is that of the current text (more

    precisely, the previous glyph). Currently used only in Southeast Asian scripts, like Thai, and

    CJK.

    intrapenalty= 〈penalty〉

    Sets the interword penalty for the writing system of this language. Currently used only in

    Southeast Asian scripts, like Thai. Ignored if 0 (which is the default value).
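    Both keys can be passed together in \babelprovide, as in the following preamble sketch for luatex. The values simply repeat those mentioned in the text and are not a recommendation; the imported ini file may already provide suitable defaults.

    ```latex
    \usepackage[english]{babel}
    % 0 .1 0 is the value the text reads as "0em plus .1em";
    % an intrapenalty of 0 (the default) is ignored.
    \babelprovide[import, intraspace=0 .1 0, intrapenalty=0]{thai}
    ```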

    justification= kashida | elongated | unhyphenated

    New 3.59 There are currently three options, mainly for the Arabic script. It sets the

    linebreaking and justification method, which can be based on the Arabic tatweel

    character or on the ‘justification alternatives’ OpenType table (jalt). For an explanation

    see the babel site (https://github.com/latex3/babel/blob/master/news-guides/news/whats-new-in-babel-3.59.md).

    linebreaking=

    New 3.59 Just a synonym for justification.

    mapfont= direction

    Assigns the font for the writing direction of this language (only with bidi=basic).

    Whenever possible, instead of this option use onchar, based on the script, which usually

    makes more sense. More precisely, what mapfont=direction means is, ‘when a character

    has the same direction as the script for the “provided” language, then change its font to

    that set for this language’. There are 3 directions, following the bidi Unicode algorithm,

    namely, Arabic-like, Hebrew-like and left to right. So, there should be at most 3 directives

    of this kind.

    NOTE (1) If you need shorthands, you can define them with \useshorthands and \defineshorthand

    as described above. (2) Captions and \today are “ensured” with \babelensure (this is the default

    in ini-based languages).

    1.17 Digits and counters

    New 3.20 About thirty ini files define a field named digits.native. When it is present,

    two macros are created: \〈language〉digits and \〈language〉counter (only xetex and

    luatex). With the first, a string of ‘Latin’ digits is converted to the native digits of that

    language; the second takes a counter name as argument. With the option maparabic in

    \babelprovide, \arabic is redefined to produce the native digits (this is done globally, to

    avoid inconsistencies in, for example, page numbering; note as well that dates do not rely

    on \arabic).

    For example:

    \babelprovide[import]{telugu} % Telugu better with XeTeX

    % Or also, if you want:

    % \babelprovide[import, maparabic]{telugu}

    \babelfont{rm}{Gautami}

    \begin{document}

    \telugudigits{1234}

    \telugucounter{section}

    \end{document}

    Languages providing native digits in all or some variants are:

    Arabic

    Assamese

    Bangla

    Tibetan

    Bodo

    Central Kurdish

    Dzongkha

    Persian

    Gujarati

    Hindi

    Khmer

    Kannada

    Konkani

    Kashmiri

    Lao

    Northern Luri

    Malayalam

    Marathi

    Burmese

    Mazanderani

    Nepali

    Odia

    Punjabi

    Pashto

    Tamil

    Telugu

    Thai

    Uyghur

    Urdu

    Uzbek

    Vai

    Cantonese

    Chinese

    New 3.30 With luatex there is an alternative approach for mapping digits, namely,

    mapdigits. Conversion is based on the language and it is applied to the typeset text (not

    math, PDF bookmarks, etc.) before bidi and fonts are processed (ie, to the node list as

    generated by the TEX code). This means the local digits have the correct bidirectional

    behavior (unlike Numbers=Arabic in fontspec, which is not recommended).
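    A sketch of mapdigits, for luatex only, is given below. The font is just an example and bidi=basic is assumed for the right-to-left script.

    ```latex
    \documentclass{article}
    \usepackage[english, bidi=basic]{babel}
    \babelprovide[import, mapdigits]{arabic}
    \babelfont[arabic]{rm}{FreeSerif}
    \begin{document}
    Latin digits stay as typed: 2021.
    % Inside arabic, typeset digits should be mapped to the native ones.
    \foreignlanguage{arabic}{2021}
    \end{document}
    ```

    Because the conversion acts on the typeset text only, digits in math or in PDF bookmarks are left as they were entered.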

    NOTE With xetex you can use the option Mapping when defining a font.

    New 3.41 Many ini locale files have been extended with information about

    non-positional numerical systems, based on those predefined in CSS. They only work with

    xetex and luatex and are fully expandable (even inside an unprotected \edef). Currently,

    they are limited to numbers below 10000.

    There are several ways to use them (for the available styles in each language, see the list

    below):

    • \localenumeral{〈style〉}{〈number〉}, like \localenumeral{abjad}{15}

    • \localecounter{〈style〉}{〈counter〉}, like \localecounter{lower}{section}

    • In \babelprovide, as an argument to the keys alph and Alph, which redefine what

    \alph and \Alph print. For example:

    \babelprovide[alph