i18n with php 5.3

PHP Internationalization with ICU By Stas Malyshev, Zend Technologies

Upload: zendcon

Post on 13-May-2015

DESCRIPTION

Talk by Stas Malyshev of Zend at ZendCon 2009

TRANSCRIPT

Page 1: I18n with PHP 5.3

PHP Internationalization with ICU

By Stas Malyshev, Zend Technologies

Page 2: I18n with PHP 5.3

What and why?

•ICU - http://icu-project.org/ (IBM)

•Unicode

•CLDR - http://cldr.unicode.org/

Page 3: I18n with PHP 5.3

Intl extension

•Locale

•Collator

•Number & Currency formatter

•Date & Time formatter

•Message & Choice formatter

•Normalizer

•Graphemes

•IDN

•Calendars

•Resources

Page 4: I18n with PHP 5.3

Intl extension

•Dual API: OO and procedural

•Same implementation underneath

collator_create() == new Collator()

numfmt_format() == NumberFormatter::format()

locale_get_default() == Locale::getDefault()
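The equivalence above can be sketched as follows. This is a minimal illustration that assumes the intl extension is loaded; the sample number is arbitrary. The same value is formatted once through the class and once through the procedural functions, and the two results should be identical.

```php
<?php
// OO style: construct the formatter object and call its method.
$oo = new NumberFormatter('en_US', NumberFormatter::DECIMAL);
$ooResult = $oo->format(1234567.891);

// Procedural style: numfmt_create() returns the same kind of object,
// and numfmt_format() takes it as its first argument.
$proc = numfmt_create('en_US', NumberFormatter::DECIMAL);
$procResult = numfmt_format($proc, 1234567.891);

var_dump($ooResult === $procResult); // both styles share one implementation
echo $ooResult, "\n";
```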

Page 5: I18n with PHP 5.3

Locale

•Relies on ICU locales

<language>[_<script>]_<country>[_<variant>][@<keywords>]

•Default locale

new Collator(Locale::DEFAULT_LOCALE)

Locale::setDefault, Locale::getDefault

You can also pass null to mean the default locale

Page 6: I18n with PHP 5.3

Locale

Locale pieces

getPrimaryLanguage($locale)

getScript($locale)

getRegion($locale)

getAllVariants($locale)

getKeywords($locale)
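A short sketch of pulling a locale ID apart with these static methods. The locale string is an arbitrary sample chosen for illustration, not one taken from the talk.

```php
<?php
// Sample ICU locale ID with a script, a region and one keyword.
$locale = 'sr_Latn_RS@currency=RSD';

$language = Locale::getPrimaryLanguage($locale); // sr
$script   = Locale::getScript($locale);          // Latn
$region   = Locale::getRegion($locale);          // RS
$keywords = Locale::getKeywords($locale);        // array('currency' => 'RSD')

echo "$language / $script / $region\n";
print_r($keywords);
```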

Page 7: I18n with PHP 5.3

Locale

Locale display pieces

getDisplayName($locale, $in_locale = null)

getDisplayLanguage($locale, $in_locale = null)

getDisplayScript($locale, $in_locale = null)

getDisplayRegion($locale, $in_locale = null)

Example:

getDisplayScript("zh-Hant-TW", "en-US") returns “Traditional Chinese” (the exact wording depends on the ICU data version)
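As a rough illustration of the `$in_locale` argument, the sketch below asks for the same locale's language and region names in English and in German. The locale `fr_CA` is just a sample; display names come from ICU's locale data.

```php
<?php
$locale = 'fr_CA';

// The second argument is the locale the name should be displayed in.
$langEn   = Locale::getDisplayLanguage($locale, 'en'); // French
$langDe   = Locale::getDisplayLanguage($locale, 'de'); // Französisch
$regionEn = Locale::getDisplayRegion($locale, 'en');   // Canada
$regionDe = Locale::getDisplayRegion($locale, 'de');   // Kanada

echo "$langEn ($regionEn)\n";
echo "$langDe ($regionDe)\n";
```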

Page 8: I18n with PHP 5.3

Locale building blocks

•parseLocale() - returns array composed of locale subtags

•composeLocale() - creates locale ID out of subtags

parseLocale('sr-Latn-RS') returns

array('language'=>'sr', 'script'=>'Latn', 'region'=>'RS')

composeLocale(array('language'=>'sr', 'script'=>'Latn', 'region'=>'RS')) returns 'sr_Latn_RS'
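The two building blocks combine naturally: parse a locale, change one subtag, and compose a new ID. A minimal sketch, assuming only the subtag keys shown above; the switch to the Cyrillic script is an illustrative choice.

```php
<?php
$parts = Locale::parseLocale('sr-Latn-RS');
// $parts now holds the 'language', 'script' and 'region' subtags.

// Change one subtag and build a new locale ID from the pieces.
$parts['script'] = 'Cyrl';
$cyrillic = Locale::composeLocale($parts);

// composeLocale() may join subtags with underscores (ICU style),
// so normalise the separator before comparing with a hyphenated tag.
$tag = str_replace('_', '-', $cyrillic);
echo $tag, "\n"; // sr-Cyrl-RS
```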

Page 9: I18n with PHP 5.3

Locale guessing

•acceptFromHttp - Accept-Language to locale

•lookup – find the best match in a list of language tags

•filterMatches – does a language tag fall under a locale range?
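The three methods above could be chained to pick a locale for a request. This is only a sketch: the list of supported locales and the header value are made up for illustration, and a real application would read the header from `$_SERVER['HTTP_ACCEPT_LANGUAGE']`.

```php
<?php
// Hypothetical list of locales the application has translations for.
$supported = array('en_US', 'de_DE', 'fr_FR');

// Sample Accept-Language header value.
$header = 'de-DE,de;q=0.8,en-US;q=0.5';

// 1. Turn the header into the best-fitting ICU locale.
$preferred = Locale::acceptFromHttp($header);

// 2. Pick the closest entry from our own list, falling back to en_US.
$chosen = Locale::lookup($supported, $preferred, false, 'en_US');

// 3. Check whether a specific tag falls under a broader range.
$matches = Locale::filterMatches('de-DE', 'de');

echo "preferred: $preferred\nchosen: $chosen\n";
var_dump($matches);
```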

Page 10: I18n with PHP 5.3

Collator

•Comparing, sorting strings

•Collation level (strength)

•All ICU collator attributes

Numeric collation

Ignoring punctuation

•Not yet: custom “tailoring” rules
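A small sketch of strength and of one collator attribute. The sample strings and file names are invented for illustration; `en_US` is an arbitrary locale choice.

```php
<?php
$coll = new Collator('en_US');

// Strength: at PRIMARY level only base letters matter,
// so accent and case differences are ignored.
$coll->setStrength(Collator::PRIMARY);
$samePrimary = $coll->compare('cote', 'Côté'); // 0 means "equal"

// Attribute: NUMERIC_COLLATION compares runs of digits by numeric value.
$coll->setStrength(Collator::TERTIARY);
$coll->setAttribute(Collator::NUMERIC_COLLATION, Collator::ON);
$files = array('file10', 'file9', 'file100', 'file1');
$coll->sort($files);

var_dump($samePrimary);
print_r($files); // file1, file9, file10, file100
```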

Page 11: I18n with PHP 5.3

Collator

$coll = new Collator("fr_CA");

if ($coll->compare("côte", "coté") < 0) {

     echo "less\n"; 

} else {

     echo "greater\n"; 

}

// côte < coté

Page 12: I18n with PHP 5.3

Collator

$strings = array("cote", "côte", "Côte", "coté","Coté", "côté", "Côté", "coter");

$coll = new Collator("fr_CA"); 

$coll->sort($strings);

Result: cote, côte, Côte, coté, Coté, côté, Côté, coter

sort($array, $flags)

asort($array, $flags)

sortWithSortKeys($array)
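The difference between the three sorting methods can be sketched as below. The city list is made-up sample data. `sort()` reindexes the array, `asort()` keeps each key attached to its value, and `sortWithSortKeys()` should give the same order as `sort()` while building ICU sort keys internally.

```php
<?php
// Sample data: country code => capital.
$cities = array('se' => 'Stockholm', 'at' => 'Wien', 'ch' => 'Bern', 'dk' => 'København');
$coll = new Collator('en_US');

// asort(): order by value, keys preserved.
$byValue = $cities;
$coll->asort($byValue);

// sort(): order by value, keys replaced with 0..n-1.
$plain = array_values($cities);
$coll->sort($plain);

// sortWithSortKeys(): same ordering, computed via sort keys.
$withKeys = array_values($cities);
$coll->sortWithSortKeys($withKeys);

print_r($byValue);
print_r($plain);
```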

Page 13: I18n with PHP 5.3

NumberFormatter

•Formatting and parsing

•Numbers and currency

numfmt_create($locale, $style, $pattern = null)

NumberFormatter::PATTERN_DECIMAL

NumberFormatter::DECIMAL

NumberFormatter::CURRENCY

NumberFormatter::PERCENT

NumberFormatter::ORDINAL

NumberFormatter::DURATION

NumberFormatter::SCIENTIFIC

NumberFormatter::SPELLOUT
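To get a feel for the styles, the sketch below formats one sample value with several of them. The value, the locale and the `#,##0.00` pattern are illustrative choices. Exact strings for most styles depend on the ICU data version, so they are printed rather than asserted here.

```php
<?php
$value = 1234;
$styles = array(
    'DECIMAL'    => NumberFormatter::DECIMAL,
    'CURRENCY'   => NumberFormatter::CURRENCY,
    'PERCENT'    => NumberFormatter::PERCENT,
    'SCIENTIFIC' => NumberFormatter::SCIENTIFIC,
    'SPELLOUT'   => NumberFormatter::SPELLOUT,
    'ORDINAL'    => NumberFormatter::ORDINAL,
);

$out = array();
foreach ($styles as $name => $style) {
    $fmt = new NumberFormatter('en_US', $style);
    $out[$name] = $fmt->format($value);
    printf("%-10s %s\n", $name, $out[$name]);
}

// PATTERN_DECIMAL takes an explicit ICU pattern as the third argument.
$custom = new NumberFormatter('en_US', NumberFormatter::PATTERN_DECIMAL, '#,##0.00');
$patterned = $custom->format($value);
echo $patterned, "\n"; // 1,234.00
```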

Page 14: I18n with PHP 5.3

NumberFormatter

Formatting

$fmt = new NumberFormatter('en_US', NumberFormatter::DECIMAL);

echo $fmt->format(1234);

// result is 1,234

$fmt = new NumberFormatter('de_CH', NumberFormatter::DECIMAL);

echo $fmt->format(1234);

// result is 1'234

Page 15: I18n with PHP 5.3

NumberFormatter

Parsing

$fmt = new NumberFormatter('de_DE', NumberFormatter::DECIMAL);

$num = '1.234,567 min';

$fmt->parse($num, NumberFormatter::TYPE_DOUBLE, $pos);

// result is 1234.567 , $pos = 9

$fmt->parse($num, NumberFormatter::TYPE_INT32);

// result is 1234

Page 16: I18n with PHP 5.3

MessageFormatter

•Formatting and parsing whole messages, including data inside

•Also allows choice between things printed:

0≤are no files|1≤is one file|1<are many files
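A minimal sketch of a choice pattern inside a full message. The wording of the message is invented for illustration, and it uses `#`, the ASCII alternative to `≤` in ICU choice syntax.

```php
<?php
// {0,choice,...} selects a sub-message by numeric range;
// the nested {0,number,integer} prints the count itself.
$pattern = 'There {0,choice,0#are no files|1#is one file|1<are {0,number,integer} files}.';
$fmt = new MessageFormatter('en_US', $pattern);

$lines = array();
foreach (array(0, 1, 42) as $count) {
    $lines[] = $fmt->format(array($count));
}
echo implode("\n", $lines), "\n";
// There are no files.
// There is one file.
// There are 42 files.
```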

Page 17: I18n with PHP 5.3

17

MessageFormatter

$fmt = new MessageFormatter("en_US", "{0,number,integer} monkeys on {1,number,integer} trees make {2,number} monkeys per tree");

echo $fmt->format(array(4560, 123, 4560/123));

$fmt = new MessageFormatter("de", "{0,number,integer} Affen über {1,number,integer} Bäume um {2,number} Affen pro Baum");

echo $fmt->format(array(4560, 123, 4560/123));

Page 18: I18n with PHP 5.3

IntlDateFormatter

•Allows using locale-dependent canned patterns

•Short, medium, long, full date & time

Full: Tuesday, April 12, 1952 AD or 3:30:42pm PST

Long: January 12, 1952 or 3:30:32pm

Medium: Jan 12, 1952

Short: 12/13/52 or 3:30pm

•Also allows free-form patterns

"yyyy.MM.dd G 'at' HH:mm:ss vvvv"

1996.07.10 AD at 15:08:56 Pacific Time
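A free-form pattern can be passed as the last constructor argument. The sketch below reuses the pattern above without the `vvvv` zone field, since the zone display name varies with the ICU data version. The timestamp is an assumption worked out by hand to correspond to 1996-07-10 15:08:56 in Los Angeles.

```php
<?php
// Date and time styles are NONE because the explicit pattern replaces them.
$fmt = new IntlDateFormatter(
    'en_US',
    IntlDateFormatter::NONE,
    IntlDateFormatter::NONE,
    'America/Los_Angeles',
    IntlDateFormatter::GREGORIAN,
    "yyyy.MM.dd G 'at' HH:mm:ss"
);

// 837036536 = 1996-07-10 22:08:56 UTC (15:08:56 Pacific Daylight Time).
$formatted = $fmt->format(837036536);
echo $formatted, "\n"; // 1996.07.10 AD at 15:08:56
```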

Page 19: I18n with PHP 5.3

IntlDateFormatter

$fmt = new IntlDateFormatter("en_US", IntlDateFormatter::FULL, IntlDateFormatter::FULL, 'America/Los_Angeles', IntlDateFormatter::GREGORIAN);

echo $fmt->format(0);

// Wednesday, December 31, 1969 4:00:00 PM PT

$fmt = new IntlDateFormatter("de-DE", IntlDateFormatter::FULL, IntlDateFormatter::FULL, 'America/Los_Angeles', IntlDateFormatter::GREGORIAN);

echo $fmt->format(0);

// Mittwoch, 31. Dezember 1969 16:00 Uhr GMT-08:00

Page 20: I18n with PHP 5.3

Normalizer

•Brings Unicode text to one of the normal forms: NFC, NFD, NFKC, NFKD

•normalize(), isNormalized()

$combining_ring_above = "\xCC\x8A";  // 'COMBINING RING ABOVE' (U+030A)

$chars = Normalizer::normalize('A' . $combining_ring_above, Normalizer::FORM_C);

echo urlencode($chars);

// %C3%85, i.e. 'LATIN CAPITAL LETTER A WITH RING ABOVE' (U+00C5)
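Normalization matters most when comparing strings. The sketch below, using the same character as above, shows that the composed and decomposed forms differ byte for byte until both are brought to one normal form.

```php
<?php
$composed   = "\xC3\x85";  // U+00C5, single code point
$decomposed = "A\xCC\x8A"; // U+0041 followed by U+030A

// The raw byte strings are different...
$rawEqual = ($composed === $decomposed);

// ...and the decomposed one is not in form C.
$isNfc = Normalizer::isNormalized($decomposed, Normalizer::FORM_C);

// After normalizing both to NFD they compare equal.
$nfdEqual = Normalizer::normalize($composed, Normalizer::FORM_D)
        === Normalizer::normalize($decomposed, Normalizer::FORM_D);

var_dump($rawEqual, $isNfc, $nfdEqual); // false, false, true
```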

Page 21: I18n with PHP 5.3

Grapheme functions

•Graphemes are multi-char entities, like letter + accent mark(s)

•Same as string functions, but operate on grapheme units

•grapheme_strlen(), grapheme_substr(), grapheme_strpos(), grapheme_strstr()

•Extraction function, grapheme_extract() – extract to fill a limited buffer, but always keep graphemes whole
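A brief sketch comparing byte-based and grapheme-based functions. The sample word is written with combining marks on purpose so that the two counts differ; the 4-byte limit is an arbitrary illustrative buffer size.

```php
<?php
// "Ångström" spelled with combining marks: A + U+030A, o + U+0308.
$text = "A\xCC\x8Angstro\xCC\x88m";

$bytes     = strlen($text);             // 12 bytes
$graphemes = grapheme_strlen($text);    // 8 user-visible characters

// First grapheme: the letter together with its combining ring.
$first = grapheme_substr($text, 0, 1);

// Take as many whole graphemes as fit into 4 bytes.
$fit = grapheme_extract($text, 4, GRAPHEME_EXTR_MAXBYTES);

echo "$bytes bytes, $graphemes graphemes\n";
echo strlen($first), " bytes in first grapheme\n";
echo strlen($fit), " bytes extracted\n";
```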

Page 22: I18n with PHP 5.3

IDN

עברית.idn.icann.org ↔ xn--5dbqzzl.idn.icann.org

русский.idn.icann.org ↔ xn--h1acbxfam.idn.icann.org

•idn_to_ascii

•idn_to_utf8
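A round trip through both functions, using the Russian test domain shown above. This sketch assumes the source file is saved as UTF-8 and that default flags are acceptable.

```php
<?php
$unicode = 'русский.idn.icann.org';

// Unicode domain name to its ASCII-compatible (Punycode) form.
$ascii = idn_to_ascii($unicode);
echo $ascii, "\n"; // xn--h1acbxfam.idn.icann.org

// And back again.
$back = idn_to_utf8($ascii);
var_dump($back === $unicode);
```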

Page 23: I18n with PHP 5.3

TODO

•ResourceHandler

•Transliteration

•StringSearch

•Tighter integration with other modules in PHP 6.0

Page 24: I18n with PHP 5.3

Thanks!

See http://php.net/intl for further information.