Internationalization Status and Directions: IETF, JET, and ICANN
John C Klensin
October 2002
© 2002 John C Klensin
Four topics today
• IETF Work and Status
• Opportunities, risks, and registry restrictions
• Registry restrictions for CJK strings: the JET work
• Another look at multilingual TLDs
Disclaimer
Unless specified as a committee recommendation, policy recommendations have not been discussed enough in the ICANN IDN committee to know whether there is consensus
IETF Work and Status
• Several separate components
• Unicode handling and encoding
– Stringprep
– Punycode
• DNS-specific
– Nameprep
– IDNA
• Approved for publication as Proposed Standards
Unicode Handling Protocols
• Tables for matching and filtering
– “Stringprep”
• Encoding to ASCII-compatible (ACE) form
– “Punycode”
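As a rough illustration of these two building blocks, the sketch below uses Python's standard `stringprep` module and its built-in `punycode` codec. The sample label and the two characters looked up in the tables are assumptions chosen for illustration; they do not come from the slides.

```python
import stringprep

# Illustrative label (an assumption, not from the slides).
label = "bücher"

# Stringprep supplies lookup tables. Two examples:
# table B.1 lists characters that are mapped to nothing (e.g. soft hyphen),
# table B.2 gives the case-folded form of a character.
soft_hyphen_dropped = stringprep.in_table_b1("\u00ad")
folded = stringprep.map_table_b2("A")

# Punycode turns a Unicode label into an ASCII-only form and back.
ace = label.encode("punycode")
restored = ace.decode("punycode")

print(soft_hyphen_dropped, folded)
print(ace, restored == label)
```

Note that Punycode on its own adds no prefix and does no matching or filtering; those belong to the DNS-specific layers.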
DNS-specific Internationalization
• Nameprep
– A profile of “stringprep” for DNS internationalization
• IDNA
– “Internationalizing Domain Names in Applications”
– Base protocol, containing “ToUnicode” and “ToASCII” operations
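The operations named above can be tried out with Python's `encodings.idna` module, which exposes `nameprep`, `ToASCII`, and `ToUnicode` functions. This is only a sketch of how an application might call them; the sample label is an assumption for illustration.

```python
from encodings import idna

# Illustrative label (an assumption, not from the slides).
label = "Bücher"

# Nameprep: case folding, normalization, and prohibited-character checks.
prepared = idna.nameprep(label)

# ToASCII: Nameprep, then Punycode, then the ACE prefix. Returns bytes.
ascii_form = idna.ToASCII(label)

# ToUnicode: the reverse direction, for display to the user.
unicode_form = idna.ToUnicode(ascii_form)

print(prepared)
print(ascii_form)
print(unicode_form)
```

Only the ASCII form is placed in the DNS or sent in a query; the Unicode form is what the application shows to the user.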
Opportunities, risks, and registry restrictions
• IETF work is part of the solution
– The problem it solves may still not be clearly understood
– How to put a name into the DNS, and query it, given that the name is appropriate
• Still leaves many risks, problems, and issues
– Not IETF’s job to solve policy problems
– ICANN recommendation to prohibit non-language characters was not accepted by IETF
– Solution to these problems lies with ICANN and Zone administrators
– “No solution” may equal chaos or effective Internet fragmentation
Risks, problems, and issues
• Character-related issues
– Confusion of names
– Alternative characters
– Reserved name issues
– Non-language characters
– Mixed scripts
… (not new … most discussed in Melbourne)
Extending existing remedies
• UDRP not prepared for this
– Confusion of character appearances is not grounds for revocation
• WHOIS committee was asked to look at internationalization
– Not reflected in report
Since the protocols won’t provide protection, some alternatives
• Registry restrictions: Per-zone or global restrictions on what can be registered
• Script homogeneity restrictions?
• Letting the market sort it out
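A script homogeneity restriction could be approximated at registration time along the following lines. This is a crude sketch under stated assumptions: Python's standard library has no script property, so the first word of each character's Unicode name stands in for its script, and digits and the hyphen are treated as script-neutral. The function names and the sample labels are hypothetical.

```python
import unicodedata

# Characters treated as script-neutral in this sketch (an assumption).
NEUTRAL = set("0123456789-")

def scripts_in(label):
    """Approximate the set of scripts used in a label."""
    found = set()
    for ch in label:
        if ch in NEUTRAL:
            continue
        # First word of the character name, e.g. "LATIN" or "CYRILLIC".
        found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found

def is_single_script(label):
    """True if the label draws on at most one (approximate) script."""
    return len(scripts_in(label)) <= 1

print(is_single_script("example"))        # all Latin letters
print(is_single_script("ex\u0430mple"))   # third letter is Cyrillic "а"
```

A rule this simple would reject legitimate mixed-script labels, such as Japanese names combining kana and Chinese characters, so a real zone would need per-language tables rather than a blanket test.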
Registry restrictions for CJK strings: the JET work
• So far, most advanced work on registry restrictions and specific character handling is for CJK
• Recognized problems earlier and got started
• Good cooperative effort, focusing on special needs of Chinese characters
Other Languages and Scripts
• CJK has special problems
– Language overlaying
– Recent character reforms
– Japanese and Korean are mixed-script
• But every language and script has traps and potential ambiguities
– Even English and ASCII
What are the JET Guidelines about?
• Problems
– Some Chinese characters are different in different areas (same words, different characters) but need to match
– Matching rules cannot be applied simply on a per-character basis
– Can’t “fix” Chinese and wreck Korean or Japanese
– And Korean and Japanese have their own issues
JET Guideline Approach
• Registry restrictions on what can be registered: invalid forms not permitted
• Careful handling of “variant” characters:
– If a string is registered, preferred form must be used.
– Reservation “package” of preferred name and variants
– Variants of string can be registered only by the same registrant (or not at all)
• Definitions of permitted characters, preferences, and variant tables are per-zone (typically per-country)
• Need not restrict to SLD registrations
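The reservation “package” idea can be sketched as a small data structure. This is a hypothetical illustration, not the mechanism the JET draft specifies: the class name, the `variants_of` callback, and the toy variant table are all assumptions, and the rule that the preferred form must be used is not modelled.

```python
class ZoneRegistry:
    """Toy registry that reserves a label together with its variants."""

    def __init__(self, variants_of):
        # variants_of: callable mapping a label to a set of variant labels,
        # built from the zone's own variant tables (hypothetical here).
        self.variants_of = variants_of
        self.owner = {}  # label -> registrant holding the package

    def register(self, label, registrant):
        package = {label} | set(self.variants_of(label))
        # Refuse if any name in the package belongs to someone else.
        for name in package:
            holder = self.owner.get(name)
            if holder is not None and holder != registrant:
                return False
        # Reserve the whole package for this registrant.
        for name in package:
            self.owner[name] = registrant
        return True

# Toy variant table: "abcd" and "axcd" are variants of each other.
toy_table = {"abcd": {"axcd"}, "axcd": {"abcd"}}
registry = ZoneRegistry(lambda label: toy_table.get(label, set()))

first = registry.register("abcd", "alice")    # reserves abcd and axcd
blocked = registry.register("axcd", "bob")    # a variant held by alice
same = registry.register("axcd", "alice")     # same registrant is allowed
print(first, blocked, same)
```

A zone choosing the “or not at all” option would instead refuse every registration of a reserved variant, including by the holder of the package.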
Variants
• About characters:
– Tables for each national use of language
– E.g., not required to agree on one universal table for Chinese (important because some areas have not adopted Simplified forms)
• Variant labels
– Generated by combining variants of all characters present
– Given ABCD, with two variants for B (X and Y) and one for C (Z), six potential labels: ABCD, AXCD, AYCD, ABZD, AXZD, AYZD
– Some may then be excluded
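The ABCD example can be reproduced with a Cartesian product over the per-character choices. This is a minimal sketch; the function name and the table layout (a dict from a character to its list of variants) are assumptions for illustration.

```python
from itertools import product

def variant_labels(label, variants):
    """Enumerate every label formed by swapping characters for variants."""
    # Each position offers the original character plus its variants.
    choices = [[ch] + variants.get(ch, []) for ch in label]
    return ["".join(combo) for combo in product(*choices)]

# Variant table from the example: B has variants X and Y, C has variant Z.
table = {"B": ["X", "Y"], "C": ["Z"]}
labels = variant_labels("ABCD", table)

print(len(labels))
print(labels)
```

The count is the product of the number of choices at each position (here 1 × 3 × 2 × 1 = 6), so it grows quickly for longer labels, which is one reason some generated variants may then be excluded.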
JET Guidelines and other languages/scripts
• Details will differ, principles of what to look for may be useful
• Principle of registration restrictions is the important one: ultimately may be the only tool we have
• Zones bear some responsibility for overall stability of Internet, integrity of references, etc.
Restrictions by TLD Type
• Language and script restrictions are plausible for ccTLDs
– Any such restrictions start with “this language (or script) is more important than that one” decision.
– Harder with each additional supported script
• A generic TLD cannot prefer one language or script
– So may not be able to adopt and use effective registration restriction rules.
– Which makes IDNs much more dangerous.
Another look at multilingual TLDs
• TLDs with names other than Roman-derived ISO 3166-1 codes
• Motivation is not clear
– Use of national language in country?
– A “free” extra domain (or more than one) for commercial exploitation?
– ???
• Important to understand problem
Administrative hierarchy structure of DNS
• Very hard to accurately administer parallel structures.
• No “see also” construction
• TLDs are special – must be administratively heterogeneous
• These are not issues if the reason for a “multilingual TLD” is “free TLD with different administration”
Options and tradeoffs
• New TLDs anyway
– But IDN Committee recommended normal approval process, not a free ride
– The administrative problems happen
– Allocation is a nasty problem
– So are countries with multiple official languages
• Translation
The Translation Issue
• Presentation
– Ultimately, users don’t care what is in the DNS
– They care, greatly, about what they see and type
• Localization
– For a limited namespace, users can see whatever the application-writer likes
• Two-letter code in, user-preference out
• (or national preference, or local language preference, or…)
– Problem: users need to understand that there is an internal/global form
• But IDNA is already going to require this
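The “two-letter code in, user-preference out” idea amounts to a lookup done in the application, with nothing new placed in the DNS. The sketch below is a hypothetical illustration: the table contents, the function names, and the fallback behaviour are assumptions, not proposed display names.

```python
# Hypothetical display table: ccTLD code -> {user language -> shown form}.
DISPLAY_NAMES = {
    "de": {"en": "Germany", "de": "Deutschland"},
    "jp": {"en": "Japan", "ja": "日本"},
}

def display_tld(code, user_language):
    """Return the localized form of a TLD code, or the code itself."""
    return DISPLAY_NAMES.get(code, {}).get(user_language, code)

def display_name(domain, user_language):
    """Localize only the final label; the rest is shown unchanged."""
    head, _, tld = domain.rpartition(".")
    shown = display_tld(tld.lower(), user_language)
    return f"{head}.{shown}" if head else shown

print(display_name("example.de", "de"))
print(display_name("example.jp", "fr"))  # no entry: falls back to the code
```

The global form (`example.de`) is still what gets queried, which is the internal/global distinction users would need to understand.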
The Role of the DNS
• Is the DNS the right place to solve these problems?
– Many restrictions and requirements for central administrative hierarchy
– Poor search support capability when exact name is not known, but “exact” gets harder with IDN
• Seeing evolution from product-name.TLD to product.company.TLD or http://company.TLD/product
• There are alternatives and “search engines” are only one group of them.
The Role of a Domain Administration
• Responsibility to the overall Internet community and to users
• For ccTLDs, ICANN probably cannot compel and should not try, but can recommend
• Registries who cause (or permit) messes that damage others will ultimately be held responsible.
IETF Specification of Name Validity
• Something of a myth
– DNS Protocol does not require LDH – recommends as good/safe practice
– Hostname rules were NIC document, not technical standards-track
– DNS rules of late 80s and early 90s (including RFC 1591) were IANA documents, not IETF
• IETF provides “how to” register and look up, and systems/technical constraints
• Specific syntax and character constraints are a zone administration and IANA/ICANN issue.
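Since LDH (letters, digits, hyphen) is a recommended practice rather than a protocol requirement, a zone that wants it has to enforce it as a registration policy. The sketch below shows one way such a check might look; the function name is hypothetical, and the rule that a label may not begin or end with a hyphen follows the usual hostname convention.

```python
import re

# Letters, digits, and hyphen; 1 to 63 characters; no leading or
# trailing hyphen. The 63-character cap mirrors the DNS label limit.
LDH_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")

def is_ldh_label(label):
    """True if the label fits the conventional LDH hostname form."""
    return LDH_LABEL.fullmatch(label) is not None

print(is_ldh_label("example"))
print(is_ldh_label("xn--bcher-kva"))  # an ACE label is plain LDH
print(is_ldh_label("-bad"))
print(is_ldh_label("bücher"))         # non-ASCII, so not LDH
```

Because ACE labels are themselves LDH, a check like this says nothing about which internationalized names are acceptable; that decision still rests with the zone.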
Independent of ICANN
• Domain administrations who
• Care about the Internet
• Exist to serve users, registrants, and the Internet community
– will develop and use registration restrictions that minimize the risk of confusion and mismatches (accidental or deliberate)
• No one said this would be easy but…
– Internationalization is very important
– So is stability and name integrity
– This appears to be the price of having both
Some closing thoughts
• Are there localization solutions that are effective and that meet user needs?
• Localization does not require ICANN approval or involvement
• In looking at the DNS to solve a range of i18n issues, are we sure we are asking the right questions?
• The primary role of ICANN is to preserve DNS stability. I hope it can examine this area, and move decisively, before it is too late.
• “Too late” could be only a month or two from now.
For further reading
• IETF Proposed Standards for IDN encoding
– Final drafts:
• draft-hoffman-stringprep-03.txt
• draft-ietf-idn-nameprep-11.txt
• draft-ietf-idn-punycode-03.txt
• draft-ietf-idn-idna-14.txt
• JET Guidelines
• Current draft
– draft-jseng-idn-admin-01.txt
• Role of the DNS
• draft-klensin-dns-role-04.txt (and others)
• Local translation
• draft-klensin-idn-tld-00.txt
• Searching, not exact matching
• draft-klensin-dns-search-04.txt (and others)
Internet Drafts available from
• http://www.ietf.org/internet-drafts/xxx
• (and elsewhere)