This talk lasts 三十分钟 (thirty minutes)

Post on 16-Aug-2015

Localisation is easy

Administrative Notes

• @pilif on twitter

• pilif on github

• working at Sensational AG

• warming up to shirts

Thanks Richard for the Recording

About that 💩

Maybe ES6…?

My host name is a horrible spoiler if you're into JRPGs. Disregard

however…

close enough.

Back to the topic at hand

Let’s talk terms

• Language is a language as it is spoken or written

• Locale is the name given to a set of parameters that define how things should be done for users speaking a certain language in a certain place

• There are many more locales than countries

Locale

• Locales consist of a language…

• … and a country

• … and sometimes specific variants

Specifying locales

• IETF BCP-47 document

• See RFC 5646 and RFC 4647

• Use language-script-territory-variant (hyphens only; the @modifier suffix is POSIX syntax)

• POSIX uses language_territory.encoding@modifier

fr-Latn-CH

fr-CH

fr_CH.utf-8
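A minimal sketch of how the three spellings above relate, assuming a runtime that ships `Intl.Locale` (an ECMA-402 addition that is newer than this talk). `posixToBcp47` is a hypothetical helper written for this example, not a library function, and it simply drops the encoding and any modifier.

```javascript
// Parse a BCP 47 tag into its subtags.
const tag = new Intl.Locale("fr-Latn-CH");
console.log(tag.language, tag.script, tag.region);

// Hypothetical helper: turn a POSIX locale name such as "fr_CH.utf-8"
// into a BCP 47 tag by dropping the encoding and any @modifier.
function posixToBcp47(posixName) {
  const withoutModifier = posixName.split("@")[0];
  const withoutEncoding = withoutModifier.split(".")[0];
  return withoutEncoding.replace(/_/g, "-");
}

const converted = posixToBcp47("fr_CH.utf-8");
console.log(converted);

// Intl.getCanonicalLocales validates and normalises the result.
const canonical = Intl.getCanonicalLocales(converted)[0];
console.log(canonical);
```

A real conversion would have to decide what to do with POSIX modifiers; this sketch throws them away.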

The Locale affects many things

Number formatting

• Probably the most obvious of the bunch.

• Decimal separator

• Thousands separator

• Sign

• Also: Currency information

Some Samples

                      de-CH   de-DE   en-US
decimal separator     .       ,       .
thousands separator   '       .       ,

12,435

en-US twelve thousand four hundred and thirty five

de-DE twelve comma four three five

de-CH error
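The samples above can be reproduced with the standard `Intl.NumberFormat` API. This is a sketch, assuming a runtime with full locale data; `separatorsFor` is a name made up for this example. The exact character reported for de-CH depends on the locale data version, so no output is shown for it.

```javascript
// Report the separators a locale uses by inspecting formatted parts.
function separatorsFor(locale) {
  const parts = new Intl.NumberFormat(locale).formatToParts(12345.6);
  const find = (type) => (parts.find((p) => p.type === type) || {}).value;
  return { decimal: find("decimal"), group: find("group") };
}

for (const locale of ["de-CH", "de-DE", "en-US"]) {
  console.log(locale, separatorsFor(locale));
}

// The same characters mean different numbers depending on the locale.
const usFormatted = new Intl.NumberFormat("en-US").format(12435);
const deFormatted = new Intl.NumberFormat("de-DE").format(12.435);
console.log(usFormatted, deFormatted); // both print "12,435"
```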

Date Formatting

• Obviously names of months and weekdays

• Order of distinct parts

• Separator character

• Commonly used formats in different contexts

Date Formatting

• Libraries usually provide generic short, medium and long formats

• Libraries also provide templates

• If your library’s template language has any characters that are not for replacement, it is doing it wrong

• Apple does it right since OS X 10.11 and iOS 9

2015-07-18 17:47

        Long                                Medium                     Short
en-US   July 18, 2015 at 4:58:00 PM CEST    Jul 18, 2015, 4:58:00 PM   7/18/15, 4:58 PM
fr-CA   18 juillet 2015 16:58:00 UTC+2      18 juil. 2015 16:58:00     15-07-18 16:58
fr-CH   18 juillet 2015 16:58:00 UTC+2      18 juil. 2015 16:58:00     18.07.15 16:58
fr-FR   18 juillet 2015 16:58:00 UTC+2      18 juil. 2015 16:58:00     18/07/2015 16:58
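A comparable table can be generated with `Intl.DateTimeFormat`, assuming a runtime that supports the `dateStyle` and `timeStyle` options (they were added to ECMA-402 after this talk). `formatAll` is a helper made up for this sketch. The exact strings depend on the locale data the runtime ships, so they may not match the table above.

```javascript
// A fixed instant (16:58 in Zurich) so the output does not depend on when this runs.
const instant = new Date(Date.UTC(2015, 6, 18, 14, 58, 0));

function formatAll(locale) {
  const result = {};
  for (const style of ["long", "medium", "short"]) {
    result[style] = new Intl.DateTimeFormat(locale, {
      dateStyle: style,
      timeStyle: style,
      timeZone: "Europe/Zurich",
    }).format(instant);
  }
  return result;
}

for (const locale of ["en-US", "fr-CA", "fr-CH", "fr-FR"]) {
  console.log(locale, formatAll(locale));
}
```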

Choice of calendar

• Most of the world is using the Gregorian calendar

• The Julian calendar uses the same month names but is off by 13 days (they have July 5th right now)

• Other calendars use different month names

• Might affect holiday calculations
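A sketch of formatting one day in several calendars, assuming the runtime's locale data includes them. The `-u-ca-` part of the tag is the Unicode extension for choosing a calendar; the Hebrew and Islamic calendars are picked here only as illustrations.

```javascript
const day = new Date(Date.UTC(2015, 6, 18, 12, 0, 0));
const options = { year: "numeric", month: "long", day: "numeric", timeZone: "UTC" };

// "-u-ca-<name>" is the Unicode extension that selects a calendar.
const gregorian = new Intl.DateTimeFormat("en-US-u-ca-gregory", options);
const hebrew = new Intl.DateTimeFormat("en-US-u-ca-hebrew", options);
const islamic = new Intl.DateTimeFormat("en-US-u-ca-islamic", options);

console.log(gregorian.format(day));
console.log(hebrew.format(day));
console.log(islamic.format(day));

// resolvedOptions() reports which calendar was actually used.
console.log(hebrew.resolvedOptions().calendar);
```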

Collation order

• How to compare two strings. Which one is first?

• Where to put the characters with pesky accents?

• How to deal with case differences?

• What about non-Latin scripts?

Collation fun*

• Phonebook German vs. ordinary German vs. Austrian German (dealing with umlauts)

• Contractions (Spanish ch counts as one letter, ch in Czech sorts after h, but c after b, etc.)

• Handling of accents is language-dependent

• Case insensitive is a mess
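Some of these differences are visible through `Intl.Collator`. This is a sketch assuming a runtime with full collation data; the word lists are made up for the example. `de-u-co-phonebk` requests the phonebook variant of German collation.

```javascript
const letters = ["Z", "a", "z", "ä"];

// German treats ä as a variant of a; Swedish sorts it after z.
const german = [...letters].sort(new Intl.Collator("de").compare);
const swedish = [...letters].sort(new Intl.Collator("sv").compare);
console.log(german, swedish);

// The phonebook variant of German sorts ä as if it were "ae".
const names = ["Afrika", "Ärger"];
const ordinary = [...names].sort(new Intl.Collator("de").compare);
const phonebook = [...names].sort(new Intl.Collator("de-u-co-phonebk").compare);
console.log(ordinary, phonebook);
```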

Case folding

• Some languages don’t differentiate between upper- and lowercase

• Inconsistent mapping between upper- and lowercase (ß => SS, the reverse is not always true)

• Uppercasing accented characters is language (and sometimes locale) dependent. French characters often lose accents when uppercasing

• Inconsistent uppercasing for some languages (uppercase Turkish i is İ. Lowercase Turkish I is ı)
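The ß and Turkish i cases can be tried directly with the built-in string methods. A small sketch, assuming the runtime has locale-aware case mapping for Turkish:

```javascript
// ß uppercases to two letters, and lowercasing does not bring it back.
const upper = "straße".toUpperCase();
const roundTrip = upper.toLowerCase();
console.log(upper, roundTrip); // STRASSE strasse

// Turkish has dotted and dotless i, so case mapping needs the locale.
const plainUpper = "i".toUpperCase();
const turkishUpper = "i".toLocaleUpperCase("tr");
const turkishLower = "I".toLocaleLowerCase("tr");
console.log(plainUpper, turkishUpper, turkishLower);
```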

Double the fun

• Collation and case folding make an interesting team

• Depending on locale, upper- and lowercase should be sorted together or apart

• In some locales, case doesn’t matter at all when sorting

• In some locales, case always matters when sorting

• Depends on the use-case
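How case interacts with sorting can be made explicit through the `caseFirst` option of `Intl.Collator`. A sketch with a made-up word list:

```javascript
const mixed = ["b", "A", "a", "B"];

// Letters stay together; caseFirst only decides which case leads within a letter.
const upperFirst = [...mixed].sort(new Intl.Collator("en", { caseFirst: "upper" }).compare);
const lowerFirst = [...mixed].sort(new Intl.Collator("en", { caseFirst: "lower" }).compare);
console.log(upperFirst, lowerFirst);

// A plain code unit sort keeps all uppercase letters apart from lowercase ones.
const byCodeUnit = [...mixed].sort();
console.log(byCodeUnit);
```

Which of the three is right depends on the use case, as the slide says: an index for human readers and a lookup key have different needs.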

Collation strength

• ICU created the concept of “collation strength”

• strength 1 is the most lenient

• strength 5 is the most exact

• Example: Strength 1 ignores accents unless the language is Danish
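ECMA-402 does not expose ICU's numeric strength directly. The closest standard knob is the `sensitivity` option of `Intl.Collator`, where `base` is the most lenient setting and `variant` the strictest. The sketch below assumes a runtime with Danish collation data.

```javascript
const base = new Intl.Collator("en", { sensitivity: "base" });
const accent = new Intl.Collator("en", { sensitivity: "accent" });
const variant = new Intl.Collator("en", { sensitivity: "variant" });

// 0 means "equal at this sensitivity"; anything else means "different".
console.log(base.compare("a", "á"), base.compare("a", "A"));
console.log(accent.compare("a", "á"), accent.compare("a", "A"));
console.log(variant.compare("a", "á"), variant.compare("a", "A"));

// In Danish, å is a letter of its own, so even "base" keeps it apart from a.
const danish = new Intl.Collator("da", { sensitivity: "base" });
console.log(danish.compare("a", "å"));
```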

‘nough said

RTL

Perspectives matter

Context matters

• “This slide lasts one minute”

• “This talk lasts 30 minutes”

• “Lunch lasted 1:30 hours”

• “Tomorrow I’ll sleep in”

• “August, 1th is a national holiday”
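A sketch of how a runtime can help with some of these sentences, assuming `Intl.PluralRules` and `Intl.RelativeTimeFormat` are available (both are newer than this talk). The word tables and the helpers `lasts` and `dayOfMonth` are made up for the example and cover English only.

```javascript
const cardinal = new Intl.PluralRules("en-US");
const ordinal = new Intl.PluralRules("en-US", { type: "ordinal" });

// English-only lookup tables keyed by plural category.
const unitForms = { one: "minute", other: "minutes" };
const suffixes = { one: "st", two: "nd", few: "rd", other: "th" };

function lasts(thing, minutes) {
  return `${thing} lasts ${minutes} ${unitForms[cardinal.select(minutes)]}`;
}

function dayOfMonth(month, day) {
  return `${month} ${day}${suffixes[ordinal.select(day)]}`;
}

console.log(lasts("This slide", 1));  // This slide lasts 1 minute
console.log(lasts("This talk", 30));  // This talk lasts 30 minutes
console.log(dayOfMonth("August", 1)); // August 1st

// Relative dates have their own wording too.
const relative = new Intl.RelativeTimeFormat("en", { numeric: "auto" });
console.log(relative.format(1, "day")); // tomorrow
```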

Let’s get practical

Locale handling is like escaping

• Always store raw unformatted data

• Format near the end of the chain

• Just before you escape

• Parse user input as early as possible

• Use native data types
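The rules above could look like this in practice. A rough sketch: the record shape, `parseAmount` and `render` are all hypothetical names for this example, and the parser is deliberately naive.

```javascript
// Stored record: raw values in native types, no formatting baked in.
const record = { amount: 1234.5, currency: "CHF", paidAt: new Date(Date.UTC(2015, 6, 18)) };

// Hypothetical parser for user input; real input handling needs more care.
function parseAmount(input, decimalSeparator) {
  const digits = input
    .split(decimalSeparator)
    .map((part) => part.replace(/[^0-9-]/g, ""));
  return Number(digits.join("."));
}

// Formatting happens only when rendering for a specific locale.
function render(rec, locale) {
  const money = new Intl.NumberFormat(locale, { style: "currency", currency: rec.currency });
  const date = new Intl.DateTimeFormat(locale, { dateStyle: "medium", timeZone: "UTC" });
  return `${money.format(rec.amount)} / ${date.format(rec.paidAt)}`;
}

console.log(parseAmount("1.234,50", ","));
console.log(render(record, "en-US"));
console.log(render(record, "de-DE"));
```

The same record renders differently per locale without ever being stored as a formatted string.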

UI Language is not locale

• Users might prefer to use the OS in a different language than what’s inferred by their locale

• Just because I’m in de_CH it doesn’t mean I want your software to speak German to me

• UI language is completely different from the user’s locale

Avoid this mess

Mixing Locales

• Forming sentences in UI language with locale-formatted data is… challenging

• Be mindful that language might influence some locale formatting.

• “This talk lasts 三十分钟”

• or rather “This talk lasts 30 minutes”

• It depends. Does the locale also use hours and minutes?
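One possible way to split the work, sketched under the assumption that the runtime supports the `unit` style of `Intl.NumberFormat` (newer than this talk). The two preference variables are hypothetical; where they come from is up to the application.

```javascript
// Hypothetical user preferences: English UI, Swiss German formatting locale.
const uiLanguage = "en";
const formatLocale = "de-CH";

// The unit name is part of the sentence, so here it follows the UI language.
const duration = new Intl.NumberFormat(uiLanguage, {
  style: "unit",
  unit: "minute",
  unitDisplay: "long",
}).format(30);
const sentence = `This talk lasts ${duration}`;
console.log(sentence); // This talk lasts 30 minutes

// Standalone data, such as a date in a table cell, can follow the locale.
const startsAt = new Intl.DateTimeFormat(formatLocale, {
  dateStyle: "short",
  timeZone: "UTC",
}).format(new Date(Date.UTC(2015, 6, 18)));
console.log(startsAt);
```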

Never be helpful* and translate units

1kg in de_CH is not 1lbs in en_US

Btw: Apple’s APIs are really good at this
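The point about units can be checked in JavaScript as well, assuming the `unit` style of `Intl.NumberFormat` is available. Formatting only changes the presentation; the conversion factor below is an explicit step added for this sketch.

```javascript
// Formatting a quantity for another locale does not convert it.
const swiss = new Intl.NumberFormat("de-CH", { style: "unit", unit: "kilogram" }).format(1);
const american = new Intl.NumberFormat("en-US", { style: "unit", unit: "kilogram" }).format(1);
console.log(swiss, american);

// A conversion has to be an explicit, separate step.
const POUNDS_PER_KILOGRAM = 2.20462262;
const pounds = new Intl.NumberFormat("en-US", {
  style: "unit",
  unit: "pound",
  maximumFractionDigits: 2,
}).format(1 * POUNDS_PER_KILOGRAM);
console.log(pounds);
```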

What about web sites?

• Never, ever infer UI language by IP Geolocation.

• Ever. Ever. EVER.

• Promise!

• You may infer Locale from IP Geolocation though

People from Google: This slide is for you!

Rely on HTTP

• Trust Accept-Language: by now browsers set it correctly

• Use the header to determine UI language

• Use the header to determine default locale

• But ask the user

• Same goes for time zones
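A minimal sketch of reading the header on the server. `parseAcceptLanguage` and `pickLanguage` are names made up for this example, and the matching is deliberately simple (primary language subtag only); the result is meant as a default that the user can still override.

```javascript
// Parse an Accept-Language header into tags ordered by preference.
function parseAcceptLanguage(header) {
  return header
    .split(",")
    .map((entry) => {
      const [tag, ...params] = entry.trim().split(";");
      const q = params.map((p) => p.trim()).find((p) => p.startsWith("q="));
      const quality = q === undefined ? 1 : Number(q.slice(2));
      return { tag: tag.trim(), quality };
    })
    .filter((item) => item.tag !== "" && item.quality > 0)
    .sort((a, b) => b.quality - a.quality)
    .map((item) => item.tag);
}

// Pick the first preferred language the application supports.
function pickLanguage(header, supported, fallback) {
  for (const tag of parseAcceptLanguage(header)) {
    const language = tag.split("-")[0].toLowerCase();
    if (supported.includes(language)) return language;
  }
  return fallback;
}

const header = "en-US,en;q=0.9,de-CH;q=0.8,de;q=0.7";
console.log(parseAcceptLanguage(header));
console.log(pickLanguage(header, ["de", "fr"], "fr"));
```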

SHOW ME SOME CODE ALREADY!!!

The past

• There has always been date formatting (Date.toLocaleString). Mostly useless

• People were self-nebling (search YouTube for “ich neble selber”), for example in date pickers and libraries

• Hint: applying substr() to Date.toDateString() is not a correct solution.

• Same goes for using replace('.', ',') on a number
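To see why the two shortcuts fall short, compare them with what a locale-aware formatter produces. A sketch with made-up sample values:

```javascript
const value = 1234567.891;

// Swapping the decimal point by hand ignores grouping entirely.
const handRolled = String(value).replace(".", ",");
const formatted = new Intl.NumberFormat("de-DE").format(value);
console.log(handRolled, formatted);

// Slicing toDateString() only ever yields English names.
const date = new Date(2015, 6, 18);
const sliced = date.toDateString().substr(4, 3);
const monthName = new Intl.DateTimeFormat("de-DE", { month: "short" }).format(date);
console.log(sliced, monthName);
```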

The present

• Microsoft has donated a huge chunk of localisation code to the jQuery project.

• It’s not integrated into jQuery, but maintained by the jQuery project

• Check out https://github.com/jquery/globalize

• Doesn’t support collation

• The library is big

• But most of it is data and this problem can only be solved with a huge database of special cases

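// Assumes the CLDR data for fr-CH has already been loaded into Globalize.
Globalize.locale("fr-CH");

// Dates: a preset format and a skeleton (year plus full month name)
console.log(Globalize.formatDate(new Date(), { datetime: "medium" }));
console.log(Globalize.formatDate(new Date(), { skeleton: "yMMMM" }));

// Numbers, currencies and relative times
console.log(Globalize.formatNumber(12345.6789));
console.log(Globalize.formatCurrency(1956.3334, "EUR"));
console.log(Globalize.formatRelativeTime(-35, "second"));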

The future

• ECMA-402 from 2012

• Yes. Specs from 2012 are “the future” in JS land

• Provides the global Intl object

• Date, Number formatting and Collation

• see: http://www.ecma-international.org/ecma-402/1.0/

Could be worse

Node.js is still bikeshedding because of ICU

// Date formatting
var f = new Intl.DateTimeFormat('de-CH', {
  weekday: 'long', year: 'numeric', month: 'long', day: 'numeric'
});
console.log(f.format(new Date()));

// Number and currency formatting
var n = new Intl.NumberFormat('de-CH', { style: "decimal", minimumFractionDigits: 2 });
console.log(n.format(1234.5));
var currency = new Intl.NumberFormat('de-CH', { style: "currency", currency: 'EUR' });
console.log(currency.format(1234.5));

// Collation: sort() needs the collator's compare function, not the collator itself
var comp = new Intl.Collator('de-CH');
var words = [ "Swissjs", "swissjs", "is", "loads", "of", "fun" ];
console.log(words.sort(comp.compare));

Conclusion

• Proper localisation is part of our job to make the web useful for everybody

• Use the libraries provided

• Whenever you think you know better than the library: No. You don’t.

• Remember that UI language and Locale are not always connected

• Don’t do IP geolocation for language choice

• When in doubt: Ask the user. She’ll know for sure.

Before I leave

"👨‍❤️‍💋‍👨".length

[..."👨‍❤️‍💋‍👨"].length

In case you answered 11 and 8, I salute you

Thanks everyone and enjoy your evening

• U+1F468 (MAN) 👨

• U+200D (ZERO WIDTH JOINER)

• U+2764 (HEAVY BLACK HEART) ❤

• U+FE0F (VARIATION SELECTOR-16)

• U+200D (ZERO WIDTH JOINER)

• U+1F48B (KISS MARK) 💋

• U+200D (ZERO WIDTH JOINER)

• U+1F468 (MAN) 👨
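The two numbers can be checked by building the string from the code points listed above. The escapes are used so that no joiner gets lost in copy and paste; the `Intl.Segmenter` part is an addition that assumes a runtime where that API exists.

```javascript
// The sequence from the list above, written as escapes.
const kiss = "\u{1F468}\u200D\u2764\uFE0F\u200D\u{1F48B}\u200D\u{1F468}";

const codeUnits = kiss.length;        // UTF-16 code units
const codePoints = [...kiss].length;  // Unicode code points
console.log(codeUnits, codePoints);   // 11 8

// Intl.Segmenter, where available, counts user-perceived characters.
if (typeof Intl.Segmenter === "function") {
  const segmenter = new Intl.Segmenter("en", { granularity: "grapheme" });
  const graphemes = [...segmenter.segment(kiss)].length;
  console.log(graphemes);
}
```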
