International Domain Name TWNIC Nai-Wen Hsu snw@twnic.net.tw

Post on 18-Dec-2015

Page 1: International Domain Name TWNIC Nai-Wen Hsu snw@twnic.net.tw

International Domain Name

TWNIC
Nai-Wen Hsu
snw@twnic.net.tw

Page 2:

Domain name (RFC 1035)

• A label cannot be longer than 63 characters
• A domain name cannot be longer than 255 characters
• Maximum labels: 127
• Only a-z, 0-9, and '-' are accepted in a domain name
• Limited ASCII character code points: 37 LDH (Letter-Digit-Hyphen)
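
The limits above can be sketched as a small check. This is a minimal sketch, not a full RFC 1035 validator: the function name is hypothetical, lengths are counted in text characters rather than wire-format octets, and uppercase A-Z is accepted on the assumption that DNS matching is case-insensitive.

```python
import re
import string

MAX_LABEL_LEN = 63    # per-label limit from the slide
MAX_NAME_LEN = 255    # whole-name limit from the slide

# The 37 LDH code points: 26 letters + 10 digits + hyphen.
LDH_CODE_POINTS = set(string.ascii_lowercase + string.digits + "-")

# Uppercase letters are accepted too (assumption: case-insensitive matching).
LDH_LABEL = re.compile(r"[A-Za-z0-9-]+")


def is_valid_ldh_name(name: str) -> bool:
    """Return True if every label is LDH-only and within the length limits."""
    if not name or len(name) > MAX_NAME_LEN:
        return False
    for label in name.split("."):
        if not 1 <= len(label) <= MAX_LABEL_LEN:
            return False
        if LDH_LABEL.fullmatch(label) is None:
            return False
    return True
```

A name such as 慎昌鐘錶.tw fails this check, which is the gap that IDN is meant to close.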

Page 3:

International Domain Name

• The IETF IDN WG adopted Unicode 3.2
• Greek, Cyrillic, Armenian, Hebrew, Arabic, Syriac, Thaana, Devanagari, Bengali, Gurmukhi, Gujarati, Oriya, Tamil, Telugu, Kannada, Malayalam, Sinhala, Thai, …
• 95,156 characters

Page 4:

International Domain Name sample

• レコード会社.jp
• gwmöbler.com
• 慎昌鐘錶.tw
• 阿克苏诺贝尔油漆公司.cn
• 소프트웨어.kr
• םוק. לארשי

Page 5:

IETF IDN Standard

• IDNA (RFC 3490): Internationalizing Domain Names in Applications
• NAMEPREP (RFC 3491): A Stringprep Profile for Internationalized Domain Names
• PUNYCODE (RFC 3492): A Bootstring Encoding of Unicode for Internationalized Domain Names in Applications
• STRINGPREP (RFC 3454): Preparation of Internationalized Strings

Page 6:

IDNA components and interfaces

[Figure: IDNA components and interfaces]
• End system: User, IDNA-aware Application, Resolver
• User <-> Application: input and display through local interface methods (pen, keyboard, ...)
• IDNA-aware Application: ToASCII and ToUnicode operations may be called here
• Application -> Resolver: call to resolver, in ACE
• Resolver <-> DNS servers: DNS protocol, in ACE
• Application <-> Application servers: application-specific protocol, in ACE unless the protocol is updated to handle other encodings

"Application" is where the application splits a hostname into labels, sets the appropriate flags, and performs the ToASCII and ToUnicode operations.

ACE example: xn--de-jg4avhby1noc0d
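
The application-side step (split the hostname into labels, then run ToASCII or ToUnicode on each label) can be sketched with Python's standard `encodings.idna` module, which implements RFC 3490. The two wrapper functions are hypothetical names for illustration, flag handling is left out, and the `bücher.example` hostname is an illustrative value that does not come from the slides.

```python
from encodings import idna

# RFC 3490 also treats these characters as label separators:
# ideographic full stop, fullwidth full stop, halfwidth ideographic full stop.
EXTRA_DOTS = "\u3002\uff0e\uff61"


def hostname_to_ace(hostname: str) -> str:
    """Split a hostname into labels and apply ToASCII to each label."""
    for dot in EXTRA_DOTS:
        hostname = hostname.replace(dot, ".")
    # idna.ToASCII returns bytes that contain only ASCII characters.
    labels = [idna.ToASCII(label).decode("ascii") for label in hostname.split(".")]
    return ".".join(labels)


def hostname_to_unicode(hostname: str) -> str:
    """Apply ToUnicode to each label, for display to the user."""
    return ".".join(idna.ToUnicode(label) for label in hostname.split("."))


if __name__ == "__main__":
    ace = hostname_to_ace("b\u00fccher.example")
    print(ace)                       # ACE form handed to the resolver
    print(hostname_to_unicode(ace))  # Unicode form shown to the user
```

Only the ACE form crosses the resolver and DNS protocol boundaries; the Unicode form stays on the input and display side.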

Page 7:

IDNA Structure

[Figure: IDNA structure]
User input (Unicode) -> NAMEPREP (mapping, normalization, prohibit) -> ACE (Punycode) -> to resolver (ACE)

• Nameprep: a Stringprep profile for Internationalized Domain Names
• IDNA operations: ToASCII, ToUnicode

Page 8:

NAMEPREP: A Stringprep Profile for Internationalized Domain Names

• Mapping: Stringprep tables B.1, B.2
• Normalization: Form KC
• Prohibited output: Stringprep tables C.1.2, C.2.2, C.3, C.4, C.5, C.6, C.7, C.8, C.9
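
The three steps can be sketched with Python's standard `stringprep` module, which exposes the RFC 3454 tables. This is a minimal sketch under stated assumptions: `nameprep_sketch` is a hypothetical name, the bidirectional-text checks and unassigned-code-point checks of the full profile are omitted, and normalization uses the Unicode 3.2 data shipped as `unicodedata.ucd_3_2_0`.

```python
import stringprep
from unicodedata import ucd_3_2_0  # Unicode 3.2 database, as used by IDNA

# Prohibited-output tables named on the slide.
PROHIBITED_CHECKS = (
    stringprep.in_table_c12, stringprep.in_table_c22,
    stringprep.in_table_c3, stringprep.in_table_c4,
    stringprep.in_table_c5, stringprep.in_table_c6,
    stringprep.in_table_c7, stringprep.in_table_c8,
    stringprep.in_table_c9,
)


def nameprep_sketch(label: str) -> str:
    """Map, normalize, and check a label (bidi checks omitted)."""
    # Step 1, mapping: drop B.1 characters, case-fold with B.2.
    mapped = "".join(
        stringprep.map_table_b2(ch)
        for ch in label
        if not stringprep.in_table_b1(ch)
    )
    # Step 2, normalization: Unicode form KC.
    normalized = ucd_3_2_0.normalize("NFKC", mapped)
    # Step 3, prohibited output: reject any character in the C tables.
    for ch in normalized:
        if any(check(ch) for check in PROHIBITED_CHECKS):
            raise ValueError(f"prohibited code point U+{ord(ch):04X}")
    return normalized
```

Calling `nameprep_sketch("TWNIC")` yields the case-folded label, and a label containing a prohibited character such as U+0090 raises `ValueError`.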

Page 9:

NAMEPREP -- Mapping

• Commonly mapped to nothing: 27
• Mapping for case-folding used with NFKC: 1371
  Ex:
  A → a (U+0041 → U+0061)
  Ϋ → ϋ (U+03AB → U+03CB)
  ㍱ → hpa (U+3371 → U+0068 U+0070 U+0061)
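
The mapping examples can be reproduced with the table functions in Python's standard `stringprep` module. A minimal sketch: `apply_mapping` is a hypothetical helper name, and the soft hyphen (U+00AD) is used here as an assumed illustration of a character that is mapped to nothing.

```python
import stringprep


def apply_mapping(text: str) -> str:
    """Apply table B.1 (map to nothing) and table B.2 (case-folding with NFKC)."""
    return "".join(
        stringprep.map_table_b2(ch)
        for ch in text
        if not stringprep.in_table_b1(ch)
    )


if __name__ == "__main__":
    for sample in ("A", "\u03ab", "\u3371", "a\u00adb"):
        mapped = apply_mapping(sample)
        print([f"U+{ord(c):04X}" for c in sample], "->",
              [f"U+{ord(c):04X}" for c in mapped])
```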

Page 10:

NAMEPREP -- Normalization

• Unicode normalization with form KC

Page 11:

NAMEPREP -- Normalization

‘u’ + ‘¨’ → ‘ü’
‘ a’‘ a’
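
Form KC normalization is available in Python's standard `unicodedata` module. The sketch below shows the composition of 'u' with a combining diaeresis (U+0308); the fullwidth-letter case is an extra illustration of compatibility mapping that is assumed here and does not come from the slide.

```python
import unicodedata


def normalize_kc(text: str) -> str:
    """Apply Unicode normalization form KC."""
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    # 'u' followed by COMBINING DIAERESIS composes into a single 'ü' (U+00FC).
    composed = normalize_kc("u\u0308")
    print(len("u\u0308"), "->", len(composed))

    # Assumed extra example: FULLWIDTH LATIN SMALL LETTER A becomes ASCII 'a'.
    print(f"U+{ord(normalize_kc(chr(0xFF41))):04X}")
```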

Page 12:

NAMEPREP -- Prohibited output

• Non-ASCII space characters: 17
  Ex: U+00A0 (NO-BREAK SPACE)
• Non-ASCII control characters: 54
  Ex: U+0090 (DEVICE CONTROL STRING)
• Private use: 133371
• Non-character code points: 49
• Surrogate codes: 2048

Page 13:

NAMEPREP -- Prohibited output

• Inappropriate for plain text: 4
• Inappropriate for canonical representation: 12
• Change display properties or are deprecated: 13
• Tagging characters: 97
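
A quick way to see which prohibited-output category a character falls into is to probe the RFC 3454 C tables through Python's standard `stringprep` module. A minimal sketch: `prohibited_table` is a hypothetical helper, and the category labels are paraphrased from the two slides above.

```python
import stringprep

# (table, category, membership test) in the order listed on the slides.
PROHIBITED_TABLES = (
    ("C.1.2", "non-ASCII space", stringprep.in_table_c12),
    ("C.2.2", "non-ASCII control", stringprep.in_table_c22),
    ("C.3", "private use", stringprep.in_table_c3),
    ("C.4", "non-character code point", stringprep.in_table_c4),
    ("C.5", "surrogate code", stringprep.in_table_c5),
    ("C.6", "inappropriate for plain text", stringprep.in_table_c6),
    ("C.7", "inappropriate for canonical representation", stringprep.in_table_c7),
    ("C.8", "change display properties or deprecated", stringprep.in_table_c8),
    ("C.9", "tagging character", stringprep.in_table_c9),
)


def prohibited_table(ch: str):
    """Return the first table that prohibits ch, or None if ch is allowed."""
    for table, _category, check in PROHIBITED_TABLES:
        if check(ch):
            return table
    return None


if __name__ == "__main__":
    for ch in ("\u00a0", "\u0090", "\ue000", "a"):
        print(f"U+{ord(ch):04X}", prohibited_table(ch))
```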

Page 14:

PUNYCODE: A Bootstring Encoding of Unicode for IDNA

• One of the ACEs (ASCII Compatible Encodings)
• Translates non-ASCII characters to ASCII characters
• Prefix: xn--
• Ex: 慎昌鐘錶.tw → xn--ciun9hb52c2za.tw
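
The label-level conversion can be sketched with Python's built-in `punycode` codec. This is a minimal sketch: the two function names are hypothetical, Nameprep is skipped (the input is assumed to be already prepared), and only a single label is handled, not a whole hostname.

```python
ACE_PREFIX = "xn--"


def to_ace_label(label: str) -> str:
    """Encode one prepared Unicode label as an ACE label."""
    if label.isascii():
        return label  # pure-ASCII labels are left unchanged
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def from_ace_label(ace: str) -> str:
    """Decode an ACE label back to Unicode; other labels pass through."""
    if not ace.lower().startswith(ACE_PREFIX):
        return ace
    return ace[len(ACE_PREFIX):].encode("ascii").decode("punycode")


if __name__ == "__main__":
    ace = to_ace_label("\u614e\u660c\u9418\u9336")  # 慎昌鐘錶
    print(ace)
    print(from_ace_label(ace))
```

If the transcript is accurate, the first printed line should match the ACE label in the example above.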

Page 15:

Insufficiencies in the IDN standard

• The current IDN standards (IDNA, NAMEPREP, PUNYCODE) cannot meet the requirements of Chinese domain names
• Traditional/Simplified Chinese mapping
  Ex: 台 臺
• Writing variant mapping
  Ex: 峰 峯

Page 16:

Page 17:

Insufficiencies in the IDN standard

• These characters have the same meaning, but a different character is used in each country
• In China: 劝 (529D)
• In Japan: 勧 (52E7)
• In Taiwan: 勸 (52F8)
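
The gap can be checked directly: form KC normalization leaves each of the three characters unchanged, so each one ends up as a different ACE label. A minimal sketch using Python's standard `unicodedata` module and `punycode` codec; the function name is hypothetical.

```python
import unicodedata

# The three forms from the slide, keyed by where each is used.
VARIANTS = {
    "China": "\u529d",   # 劝
    "Japan": "\u52e7",   # 勧
    "Taiwan": "\u52f8",  # 勸
}


def ace_forms(variants):
    """Normalize each character with NFKC, then encode it as an ACE label."""
    forms = {}
    for region, ch in variants.items():
        normalized = unicodedata.normalize("NFKC", ch)
        forms[region] = "xn--" + normalized.encode("punycode").decode("ascii")
    return forms


if __name__ == "__main__":
    for region, ace in ace_forms(VARIANTS).items():
        print(region, ace)
```

The three printed labels differ, so the DNS treats the three spellings as unrelated names unless a registration policy ties them together.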

Page 18:

IDN administration guideline

• A registration policy to solve the problems listed above
• Every language has a variant table with 3 fields:
  • valid code point
  • recommended variant
  • character variant

Page 19:

Variant Table sample

Valid code point (VCP) | Recommended variants by .tw (twRV) | Recommended variants by .cn (cnRV) | Character Variant(s) (CV) | Remarks
丁 (4E01) | 丁 (4E01) | 丁 (4E01) | 丁 (4E01) | Singular-relation character (1)
丄 (4E04) | 上 (4E0A) | 上 (4E0A) | 丄 (4E04) 上 (4E0A) | Pair-relation characters (2.1)
上 (4E0A) | 上 (4E0A) | 上 (4E0A) | 丄 (4E04) 上 (4E0A) |
万 (4E07) | 万 (4E07) | 万 (4E07) | 万 (4E07) 萬 (842C) | Pair-relation characters (2.2)
萬 (842C) | 萬 (842C) | 万 (4E07) | 万 (4E07) 萬 (842C) |

Page 20:

Variant Table sample

Valid code point (VCP) | Recommended variants by .tw (twRV) | Recommended variants by .cn (cnRV) | Character Variant(s) (CV) | Remarks
叶 (53F6) | 葉 (8449) | 叶 (53F6) | 叶 (53F6) 葉 (8449) | Pair-relation characters (2.3)
葉 (8449) | 葉 (8449) | 叶 (53F6) | 叶 (53F6) 葉 (8449) |
个 (4E2A) | 個 (500B) | 个 (4E2A) | 个 (4E2A) 個 (500B) 箇 (7B87) | Multiple-relation characters
個 (500B) | 個 (500B) | 个 (4E2A) | 个 (4E2A) 個 (500B) 箇 (7B87) |
箇 (7B87) | 個 (500B) | 个 (4E2A) | 个 (4E2A) 個 (500B) 箇 (7B87) |
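
One way to hold such a table in code is a record per valid code point. This is a minimal sketch: the class, field, and function names are hypothetical, only the rows from the two sample slides are included, and classifying the relation by the number of character variants is an assumption read off the Remarks column.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantEntry:
    vcp: str     # valid code point
    tw_rv: str   # recommended variant for .tw
    cn_rv: str   # recommended variant for .cn
    cv: tuple    # character variant(s)


ROWS = [
    VariantEntry("丁", "丁", "丁", ("丁",)),
    VariantEntry("丄", "上", "上", ("丄", "上")),
    VariantEntry("上", "上", "上", ("丄", "上")),
    VariantEntry("万", "万", "万", ("万", "萬")),
    VariantEntry("萬", "萬", "万", ("万", "萬")),
    VariantEntry("叶", "葉", "叶", ("叶", "葉")),
    VariantEntry("葉", "葉", "叶", ("叶", "葉")),
    VariantEntry("个", "個", "个", ("个", "個", "箇")),
    VariantEntry("個", "個", "个", ("个", "個", "箇")),
    VariantEntry("箇", "個", "个", ("个", "個", "箇")),
]
TABLE = {row.vcp: row for row in ROWS}


def relation_kind(entry: VariantEntry) -> str:
    """Classify an entry by how many character variants it has (assumption)."""
    if len(entry.cv) == 1:
        return "singular"
    if len(entry.cv) == 2:
        return "pair"
    return "multiple"


if __name__ == "__main__":
    for row in ROWS:
        print(f"{row.vcp} (U+{ord(row.vcp):04X})", relation_kind(row))
```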

Page 21:

Variant Table

• Singular-relation character (VCP=twRV=cnRV=CV): 13888 (66.4%)
• VCP=twRV≠cnRV: 2783 (13.3%)
• VCP=cnRV≠twRV: 2453 (11.7%)
• VCP≠(twRV=cnRV): 333 (1.6%)
• VCP≠twRV≠SCR: 387 (1.9%)

Page 22:

Variant Table

Number of character variant(s) | Number of characters | Share
1 | 13888 | 66.4%
2 | 5156 | 24.7%
3 | 1158 | 5.5%
4 | 424 | 2.0%
5 | 165 | 0.79%
6 | 60 | 0.29%
7 | 35 | 0.17%
8 | 16 | 0.08%
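
The percentages can be cross-checked from the counts. The sketch assumes that the eight buckets cover the whole table, which gives a total that the slides do not state explicitly.

```python
# Number of character variants -> number of characters, from the slide.
COUNTS = {1: 13888, 2: 5156, 3: 1158, 4: 424, 5: 165, 6: 60, 7: 35, 8: 16}

# Assumption: the buckets are exhaustive, so their sum is the table size.
TOTAL = sum(COUNTS.values())

SHARES = {n: 100 * count / TOTAL for n, count in COUNTS.items()}

if __name__ == "__main__":
    print("total characters:", TOTAL)
    for n, share in SHARES.items():
        print(f"{n} variant(s): {COUNTS[n]:>6}  {share:.2f}%")
```

Under that assumption the computed shares round to the figures on the slide.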

Page 23:

Variant Table

• The table draft has been prepared since January 2002 by the CCMT Task Force organized by TWNIC.
• The task force has 9 members: linguists, computer experts, and DNS experts.
• The table draft has been submitted to the Bureau of Standards, Ministry of Economic Affairs, for final review.

Page 24:

Registration procedure

• A registrant should select the language(s).
• The Registry should activate the requested domain name(s) and reserve the equivalent name(s), within the language-based character set.
• The registrant can request activation of the reserved equivalent domain name(s) at any time.

Page 25:

Registration sample

A user selects the zh-tw and zh-cn languages with the domain name 丁上萬.com:

• 丁上萬.com (Recommended variant for zh-tw)
• 丁上万.com (Recommended variant for zh-cn)
• 丁丄万.com (Character Variant)
• 丁丄萬.com (Character Variant)
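
The sample can be reproduced from the variant table rows for 丁, 上, and 萬. This is a minimal sketch: the function name and the dictionary layout are hypothetical, only the rows needed for this label are included, and the sketch does not decide which names the Registry activates or reserves.

```python
from itertools import product

# vcp -> (twRV, cnRV, character variants), taken from the sample variant table.
VARIANT_TABLE = {
    "丁": ("丁", "丁", ("丁",)),
    "丄": ("上", "上", ("丄", "上")),
    "上": ("上", "上", ("丄", "上")),
    "万": ("万", "万", ("万", "萬")),
    "萬": ("萬", "万", ("万", "萬")),
}


def registration_bundle(label: str) -> dict:
    """Derive the recommended variants and remaining character variants."""
    rows = [VARIANT_TABLE[ch] for ch in label]
    tw_label = "".join(row[0] for row in rows)
    cn_label = "".join(row[1] for row in rows)
    # Every combination of per-character variants is an equivalent label.
    combos = {"".join(chars) for chars in product(*(row[2] for row in rows))}
    others = sorted(combos - {tw_label, cn_label})
    return {"zh-tw": tw_label, "zh-cn": cn_label, "character_variants": others}


if __name__ == "__main__":
    bundle = registration_bundle("丁上萬")
    print(bundle["zh-tw"] + ".com", "(recommended variant for zh-tw)")
    print(bundle["zh-cn"] + ".com", "(recommended variant for zh-cn)")
    for name in bundle["character_variants"]:
        print(name + ".com", "(character variant)")
```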

Page 26:

Q & A