Internationalised Domain Names & Internet Investigations

Post on 01-Jun-2015


DESCRIPTION

English is not the only language that the Internet “speaks.” Internationalised Domain Names (IDNs) now allow for domain names in Arabic, Cyrillic, Chinese, and other non-Latin characters. This session will show how to trace IDNs and will examine some of the IDN info security issues. There will be a quick introduction to working with foreign language Websites and useful tips for using online search and translation tools.

TRANSCRIPT

Internationalised Domain Names, Foreign Language Websites, & Investigations

Jonathan D. Abolins

Thu, 28 July 2011, 11:00 AM - 12:00 PM PDT (GMT-08:00)

Post-Webinar Version with additional notes.

Introduction

About me

Why this topic

Some notes about this presentation’s approach.

Note About Translation Tools

Machine translation tools help a lot.

But they can also leave out much or mislead.

Helps to know the languages involved or work with a competent translator.

But the translators might not know about some recent Internet developments.

Quick Overview of Terms

Labels – example: www.veresoftware.com

TLD – Top Level Domain (e.g., .com or .uk)

ccTLD – Country Code TLD (e.g., .uk, .ru)

IDN – Internationalised Domain Name

Unicode

ACE – ASCII Compatible Encoding

Punycode (RFC 3492), a form of ACE

Label 1 Label 2 Label 3

OSINT in an Alphabet Soup of the Networked World

But see http://www.cartoonistgroup.com/store/add.php?iid=8381

Sometimes, alphabet soup is soup, not a coded message.

A couple of Examples of non-English Windows 7 Desktops

First is Russian.

Second is Arabic. Note the shift to the right.

They were done by switching the languages on one of my Windows 7 Ultimate PCs.

The GUI labels for My Documents, My Music, etc. are localised. But the underlying directory names, as seen via the dir command in a CMD window, did not change.

The Net No Longer “Speaks” Primarily English

Old days

Had to use code pages (character encodings) for non-Latin text. Can be confusing.

Difficult to mix languages.

Now

Unicode covers most of the world’s writing systems. 90+ scripts.

Still encounter code pages.

But Underlying Code is Universal

Bits & Bytes

Programming languages

HTML codes

IP Addresses

Etc.

This can work to your advantage!

If a foreign site offers English, why read the foreign language version?

http://krebsonsecurity.com/2010/12/russian-police-only-translate-the-good-news/

What if you can’t read Russian?

File/Pathnames May Have Clues…

http://www.mvd.ru/news/

File/Pathnames May Have Clues…

http://www.mvd.ru/presscenter/

Note for the Previous Slides…

Sometimes the foreign site might be using a site structure developed in the English-speaking world. This is particularly the case with some Web forums.

Other times, the Web designers are trying to avoid problems with mixing texts for directory and file names.

In any case, the file path info can often help.

Tip: Google Chrome Has Built-in Translation Function

http://habrahabr.ru/blogs/DIY/

Search Tip: A Picture is Worth 1K Words

An image search might help to zero in on the entries of interest.

Especially useful if you want to save time wading through foreign language hits.

Example: a search for the RASKAT (Раскат) data destruction device from Russia. Look for images that look “computerish”.

Google Translate Annoyance: URL Conversion

Uncheck the Phonetic Typing box before entering URLs for site translation.

Tried to type in “http://www.xakep.ru” but Google “Russified” it.

Internationalised Domain Names (IDN)

Intro – The Phonebook Analogy

Imagine a phonebook where people could have entries in their preferred scripts. Mr. Wong could have his in Chinese. Ms. Romanov could have hers in Russian. And so on. Many people will choose to have both Latin text and foreign text entries for the same phone number. This makes it easier for their family and friends to find them. But others fret about the different texts.

Underneath it all, however, the phone system hardware, networks, and the phone numbers remain the same.

Something like this is happening with the Internet.

The First Four IDN ccTLDs

In May 2010

United Arab Emirates: .امارات

Saudi Arabia: .السعودیة

Russian Federation: .рф

Egypt: .مصر

More IDN ccTLDs have been launched.

Remember, IDNs can also exist under non-IDN TLDs. Examples: גינדי.com or bücher.com

http://blog.icann.org/2010/05/idn-cctlds-%E2%80%93-the-first-four/

Examples of IDNs & Punycode

com.גינדי

스타벅스코리아.com

газпром.рф

مصر.سجل

汕头大学.中国

xn--pssza05mm53a.xn--fiqs8s/

Gindi Realty (Israel)

גינדי.com

Punycode: http://xn--6dbcrb7a.com/

Offline IDN Example

Starbucks Korea

스타벅스코리아.com

Punycode: http://xn--oy2b35ckwhba574atvuzkc.com/

Shantou University (PRC)

汕头大学.中国/

Same as http://stu.edu.cn

Punycode: http://xn--pssza05mm53a.xn--fiqs8s/

Sajela.MiSr (Egypt)

مصر.سجل

Punycode: http://xn--rgbn6c.xn--wgbh1c/

Fun with Arabic & Other RTL (Right to Left) IDN URLs

Reading direction can switch.

Example URL: http:// مصر.سجل /Files/GeneralPolicy.pdf

The direction changes can cause problems in various tools and procedures.

This is where Punycode really helps: http://xn--rgbn6c.xn--wgbh1c/Files/GeneralPolicy.pdf

1 ----> <----------2 3 --------------------------------------------->
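As a rough illustration of the point above, the hostname of a URL can be converted to its Punycode form before the URL is stored or passed to other tools. This is a minimal sketch that assumes Python's built-in `idna` codec (which follows the older IDNA 2003 rules) is acceptable; the function name `to_ascii_url` is made up for this example, and any user name or password in the URL is dropped.

```python
from urllib.parse import urlsplit, urlunsplit


def to_ascii_url(url: str) -> str:
    """Return the URL with its hostname converted to Punycode (ACE) form.

    Hypothetical helper for illustration only. Path, query and fragment
    are passed through unchanged.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    # The built-in "idna" codec converts each label and adds the xn-- prefix.
    ascii_host = host.encode("idna").decode("ascii")
    # Keep an explicit port if the original URL had one.
    netloc = ascii_host if parts.port is None else f"{ascii_host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(to_ascii_url("http://bücher.com/Files/GeneralPolicy.pdf"))
```

The all-ASCII result has a single reading direction, which is why it tends to survive copy and paste, logs, and command-line tools better than the mixed-direction original.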

Punycode

DNS works with Punycode for IDN labels.

Example: مصر.سجل

Punycode: xn--rgbn6c.xn--wgbh1c

.xn--wgbh1c is Punycode for the مصر IDN ccTLD. Note the distinctive xn-- prefix.

Much safer way to store & use IDNs.

Various online and offline tools for conversion.

Conversion works in both directions: Unicode IDN <-> Punycode.
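The two-way conversion described above can be tried offline with nothing but the Python standard library. The sketch below is one possible approach, not the method used by the converters shown in the slides; the helper names are invented here. Note that the built-in codec implements IDNA 2003, so newer IDNA 2008 rules (available in the third-party `idna` package) may give different results for a few characters.

```python
def to_punycode(name: str) -> str:
    """Unicode domain name -> ASCII Compatible Encoding (xn-- labels)."""
    return name.encode("idna").decode("ascii")


def to_unicode(name: str) -> str:
    """ASCII Compatible Encoding (xn-- labels) -> Unicode domain name."""
    return name.encode("ascii").decode("idna")


# The Russian Federation IDN ccTLD mentioned earlier in the slides.
print(to_punycode("рф"))               # prints xn--p1ai
print(to_punycode("bücher.com"))       # prints xn--bcher-kva.com
```

Plain ASCII names pass through both functions unchanged, so the helpers can be applied to every domain in a list without first checking which ones are IDNs.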

An Online Converter

http://idnaconv.phlymail.de/

idn: An Offline IDN Converter (Linux)

Challenges with IDNs

Recognising what it is (domain name, URL, e-mail address).

Which end is the ccTLD?

What language is it?

What country of registry?

Sad 'cause I can't find the ص (Saad) key. (How do I enter the IDN?)

Some characters have multiple codes.

Many tools don't work correctly with IDNs.

Homograph (Look-alike) Attacks
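For the "which end is the ccTLD?" question in the list above, one hedged approach is to ignore how the name is displayed and look at the order in which the characters are stored: the TLD is the last label in stored (logical) order, even when a right-to-left script makes it appear on the left. The sketch below assumes the text was captured in logical order (as copy and paste normally gives) and uses a made-up helper name, `tld_info`.

```python
import unicodedata


def tld_info(domain: str):
    """Return (tld, is_rtl) for a domain name given in logical order.

    is_rtl is True when the TLD contains right-to-left characters
    (Unicode bidirectional classes "R" or "AL").
    """
    tld = domain.rstrip(".").split(".")[-1]
    is_rtl = any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in tld)
    return tld, is_rtl


# The Egyptian example from the slides, written with escapes so the
# storage order is unambiguous: label "sajjil", then the TLD "misr".
domain = "\u0633\u062c\u0644.\u0645\u0635\u0631"
tld, is_rtl = tld_info(domain)
print(tld.encode("idna").decode("ascii"), is_rtl)
```

Converting the extracted TLD to Punycode, as the last line does, gives a form that can be checked against a list of known TLDs.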

Recognising IDNs. Not just URLs. How About IDN E-mail Addresses?

What if you found a note with this: ваше_имя@письмо.рф ?

Would you know it’s an e-mail address?

Would your translator recognise it as an e-mail address?
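A simple programmatic check can help flag strings like the one above as possible e-mail addresses, whatever script they are written in. This is only a sketch with an invented helper name, `describe_email`: it splits at the "@" sign and tries to convert the domain part to Punycode. It does not validate the address or prove that the mailbox exists.

```python
def describe_email(text: str):
    """Return (local_part, ascii_domain) if text looks like an e-mail
    address, otherwise None. Hypothetical helper for illustration."""
    local, sep, domain = text.strip().rpartition("@")
    if not sep or not local or "." not in domain:
        return None
    try:
        ascii_domain = domain.encode("idna").decode("ascii")
    except UnicodeError:
        # The domain part could not be converted, so treat it as not an address.
        return None
    return local, ascii_domain


result = describe_email("ваше_имя@письмо.рф")
if result is not None:
    # Only the ASCII domain is printed here to keep the output console-safe.
    print(result[1])
```

The Punycode domain returned by the helper can then be fed to ordinary lookup tools.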

By the Way, What About Vocalisation of URLs & e-Mail Addresses in Foreign Languages?

The way a URL or an email address – IDN or not – is said can differ across languages.

How is the “at” symbol or the “dot” said? Example with Russian and “Ivan@pochta.ru”:

“Ivan sobachka pochta tochka ru” or “Ivan sobachka pochta dot ru”

Sobachka (собачка – “little dog”) is a popular Russian way of vocalising the “@” sign.

Tochka (точка – “point”) or Dot (дот) is used for the “.” mark.

How to say an e-mail address in Russian: http://www.themoscowtimes.com/opinion/article/the-really-cool-people-say-dot/439857.html

What Does the IDN URL Mean?

How Do I Type the IDN?

Copy & Paste Directly from page

Google Translate

Wikipedia

Keyboard input

Need the right keyboard or keytops.

System setup for allowing the foreign language input.

Character map tools

One Character, Multiple Codes

http://singapore41.icann.org/meetings/singapore2011/presentation-idn-variant-tlds-update-20jun11-en.pdf
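To see the "one character, multiple codes" problem concretely, the code point and official Unicode name of each character can be listed. The pair used below, Arabic yeh and Farsi yeh, is an example chosen for this sketch rather than one taken from the linked ICANN presentation; the two letters can look alike in some positions and fonts but are different code points, so domain names typed with one will not match names typed with the other.

```python
import unicodedata


def describe(text: str):
    """Return a list of (code point, Unicode name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in text]


arabic_yeh = "\u064A"
farsi_yeh = "\u06CC"

for ch in (arabic_yeh, farsi_yeh):
    # Prints the ASCII-only description, not the letter itself.
    print(describe(ch)[0])
```

Running the same function over a whole domain label is a quick way to check which variant of a letter a keyboard or a copy-and-paste source actually produced.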

Common Net Commands & IDN

Windows cmd CLI a problem w/o modification

Tools have to be able to handle Unicode.

ping

nslookup

dig

Whois (can be tricky at times)

Punycode is more reliable.

Not All Our Tools Are Unicode or IDN-Ready

Whois & IDN ccTLD Domains

Whois on the domain name might not always work well with some IDN ccTLD domains.

But there are options, including:

Get and lookup IP address

Use IANA db & Delegation Record

IANA Root Zone db: http://www.iana.org/domains/root/db/#

IANA Delegation Records

http://www.iana.org/domains/root/db/xn--p1ai.html
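The delegation record address above follows a visible pattern: the Punycode form of the TLD followed by ".html". Assuming that pattern still holds (the IANA site may have changed since these slides were made), the address for any TLD could be built as in this sketch. The function name is invented for the example and no network request is made.

```python
def iana_delegation_url(tld: str) -> str:
    """Build the IANA root zone database address for a TLD.

    Assumes the URL pattern shown in the slides; accepts the TLD with or
    without a leading dot, in Unicode or Punycode form.
    """
    ace = tld.strip(".").encode("idna").decode("ascii").lower()
    return f"http://www.iana.org/domains/root/db/{ace}.html"


print(iana_delegation_url("рф"))
```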

Security Concern: Homograph Attacks

Are These Sets The Same?

АаВьСсЕеНКкМРрОоТуХхЗ

AaBbCcEeHKkMPpOoTyXx3

Looking at the Underlying Code

АаВьСсЕеНКкМРрОоТуХхЗ <- Cyrillic

AaBbCcEeHKkMPpOoTyXx3 <- ASCII

Cyrillic: 0410 0430 0412 044C 0421 0441 0415 0435 041D 041A 043A 041C 0420 0440 041E 043E 0422 0443 0425 0445 0417

ASCII: 0041 0061 0042 0062 0043 0063 0045 0065 0048 004B 006B 004D 0050 0070 004F 006F 0054 0079 0058 0078 0033
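The code point listings above can be reproduced for any pair of look-alike strings. The following is a small sketch (the helper names are made up here) that prints the hexadecimal code points and reports the positions where two strings differ, even when they appear identical on screen.

```python
def code_points(text: str):
    """Return the four-digit hexadecimal code point of each character."""
    return [f"{ord(ch):04X}" for ch in text]


def differing_positions(a: str, b: str):
    """Return the indexes where two strings hold different characters."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


cyrillic = "\u0410\u0430"   # Cyrillic capital and small A
latin = "Aa"                # Latin capital and small A

print(code_points(cyrillic))                  # prints ['0410', '0430']
print(code_points(latin))                     # prints ['0041', '0061']
print(differing_positions(cyrillic, latin))   # prints [0, 1]
```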

Homographs for Fraud & Punycode for Detection

http://www.facebook.com/
Really is http://www.facebook.com/

http://www.facebοok.com/
Punycode: http://www.xn--facebok-dpf.com/

http://www.faceboοk.com/
Punycode: http://www.xn--facebok-epf.com/

http://www.facebοοk.com/
Punycode: http://www.xn--facebk-m0ea.com/

http://idnaconv.phlymail.de/
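The detection idea above can be sketched in a few lines: convert the hostname to Punycode and list every non-ASCII character with its Unicode name. A name that looks like plain English but yields an xn-- label deserves a closer look. The helper name `inspect_host` is invented for this example, and the test hostname uses an escape for Greek small omicron, which matches the first look-alike shown above.

```python
import unicodedata


def inspect_host(host: str):
    """Return (punycode_host, suspects), where suspects lists each
    non-ASCII character as (code point, Unicode name)."""
    ace = host.encode("idna").decode("ascii")
    suspects = [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in host
        if ord(ch) > 127
    ]
    return ace, suspects


# "facebook" with the first "o" replaced by Greek omicron (U+03BF).
ace, suspects = inspect_host("www.faceb\u03bfok.com")
print(ace)        # prints www.xn--facebok-dpf.com
print(suspects)   # prints [('U+03BF', 'GREEK SMALL LETTER OMICRON')]
```

A genuine all-ASCII hostname comes back unchanged with an empty suspect list.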

Homograph Attack Concerns

Raised by various people, including 3ric Johanson at ShmooCon in 2005.

He registered www.xn--pypal-4ve.com to spoof PayPal.

Anti-Phishing Working Group Global Phishing Survey 1H2010: last true homograph attack was in 2009. A “hotmail.net” look-alike: xn--hotmal-t9a.net

Global Phishing Survey 1H2010: http://tinyurl.com/2ch5o87

Not All Homographs Are Bad. Clever Homograph: xakep.ru

Special Topic: Character Encodings

Code Pages / Character Encodings

Examples:

Arabic: Windows 1256, IBM 864

Cyrillic: IBM 855, KOI8-R, Windows 1251

Hebrew: IBM 862, Windows 1255

See also http://en.wikipedia.org/wiki/Code_pages

Character Encoding in Internet Documents

If page doesn’t render properly:

Check HTML source for clues like <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=KOI8-R">

Server’s country location might be a clue.

Try browser’s character encoding tools. (Firefox example)

For Cyrillic, check out these tools:

Universal Cyrillic Decoder page http://2cyr.com/decode/

Russian Anywhere (re) package for many Linux distros.
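The charset clue in the HTML source mentioned above can also be pulled out automatically. This is a rough sketch using a regular expression, with an invented function name; it only reports what the page declares, which is sometimes wrong or missing, so it complements rather than replaces the other checks in the list.

```python
import re

# Matches charset=VALUE with optional quotes, in either META tag style.
_CHARSET_RE = re.compile(r"charset\s*=\s*[\"']?\s*([A-Za-z0-9._:-]+)", re.IGNORECASE)


def declared_charset(html: str):
    """Return the first charset declared in the HTML text, or None."""
    match = _CHARSET_RE.search(html)
    return match.group(1) if match else None


sample = '<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=KOI8-R">'
print(declared_charset(sample))   # prints KOI8-R
```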

Example

http://www.lena.ru/songs.html

Firefox – Character Encoding Set to Auto Detect

http://www.lena.ru/songs.html

In recent versions of Firefox: Firefox button -> Web Developer -> Character Encoding -> Auto Detect

In some cases, trial & error is needed.

This method can also work for local files.
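The trial and error approach can be applied to a saved file as well. The sketch below decodes the same bytes with several candidate Cyrillic encodings so a reader can pick the version that looks right. The candidate list and the function name are assumptions made for this example; the code does not try to guess the correct encoding by itself.

```python
# Candidate encodings, using Python's codec names for the code pages
# listed earlier (KOI8-R, Windows 1251, IBM 855) plus a few common ones.
CANDIDATES = ["utf-8", "koi8_r", "cp1251", "cp855", "cp866", "iso8859_5"]


def trial_decodings(data: bytes, encodings=CANDIDATES):
    """Return {encoding: decoded text} for every encoding that can
    decode the bytes without error."""
    results = {}
    for name in encodings:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            # These bytes are not valid in this encoding, so skip it.
            continue
    return results


# Example bytes: the Russian word for "hello" saved as KOI8-R.
raw = "Привет".encode("koi8_r")
print(sorted(trial_decodings(raw)))
```

For a local file, the bytes would come from `open(path, "rb").read()`. Several encodings will usually decode without error, so the final choice still needs a human eye or a reader of the language.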

Resources

ICANN IDN Info: http://www.icann.org/en/topics/idn/

Blog: http://blog.icann.org/

IDN Wiki: http://idn.icann.org/

IDN TLD Map: http://www.icann.org/en/maps/idntld.htm

IDN Blog: http://idnblog.com/

Verisign IDN FAQ: http://www.verisigninc.com/en_US/products-and-services/domain-name-services/domain-information-center/idn-resources/idn-faq/index.xhtml

This Domain Name is Greek to Me: An Introduction to Internationalized Domain Names for Investigators (DFI News): http://www.dfinews.com/article/domain-name-greek-me-introduction-internationalized-domain-names-investigators?page=0,1

Internationalized Domain Names & Investigations in the Networked World (one of the DojoCon 2010 videos): http://www.irongeek.com/i.php?page=videos/dojocon-2010-videos

Resources (cont)

XN--ICANN: http://www.hackerfactor.com/blog/index.php?/archives/321-xn-ICANN.html

IDNForums.Com (emphasis upon buying & selling IDN domains): http://www.idnforums.com/

IANA ccTLDs Database: http://www.iana.org/domains/root/db/#

Strathclyde Forensics – IDN Homograph Attacks: http://www.computerforensicsglasgow.info/IDN_Homograph_Attacks.htm

New Arrival in Russian Spam – .РФ: http://www.thesecurityblog.com/2011/02/new-arrival-in-russian-spam-%D1%80%D1%84/

An IDN – Punycode Converter: http://idnaconv.phlymail.de/

How to say an e-mail address in Russian: http://www.themoscowtimes.com/opinion/article/the-really-cool-people-say-dot/439857.html

Resources (cont)

Keyboard Setup

How to Change Keyboard Language:

http://www.lib.uchicago.edu/e/using/catalog/inputoptions.html

http://tlt.its.psu.edu/suggestions/international/keyboards/winkey.html

http://www.al-bab.com/arab/comp.htm

Translation and Language Issues

American Translators Association: Getting It Right (insights into translation issues): http://www.atanet.org/publications/getting_it_right.php

Basis Technology – Excellent papers & presentations on language issues: http://www.basistech.com/resources/ (The links on the left have more papers on topics such as Middle Eastern Languages, Digital Forensics, etc.)

Resources: Google Searches for Some IDN ccTLDs

Republic of Korea: 한국

http://www.google.com/search?q=site%3A.한국

Serbia: СРБ

http://www.google.com/search?q=site%3A%D0%A1%D0%A0%D0%91

People's Republic of China: 中国

http://www.google.com/search?q=site%3A.%E4%B8%AD%E5%9B%BD

http://www.google.com/search?q=site%3A.%E4%B8%AD%E5%9C%8B

Hong Kong SAR: 香港

http://www.google.com/search?q=site%3A.%E9%A6%99%E6%B8%AF

Taiwan: 台湾

http://www.google.com/search?q=site%3A.%E5%8F%B0%E6%B9%BE

http://www.google.com/search?q=site%3A.%E5%8F%B0%E7%81%A3

Egypt: مصر

http://www.google.com/search?q=site%3A.مصر

Jordan: الاردن

http://www.google.com/search?q=site%3A.%D8%A7%D9%84%D8%A7%D8%B1%D8%AF%D9%86

Saudi Arabia: السعودیة

http://www.google.com/search?q=site%3A.%D8%A7%D9%84%D8%B3%D8%B9%D9%88%D8%AF%D9%8A%D8%A9

Russian Federation: РФ

http://www.google.com/search?q=site%3A.%D0%A0%D0%A4
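Search addresses like the ones above can be generated for any IDN TLD by percent-encoding a "site:" query. The sketch below assumes the same URL pattern as the listed examples and uses an invented function name. It only builds the address; it does not contact the search engine.

```python
from urllib.parse import quote


def site_search_url(tld: str) -> str:
    """Build a Google "site:" search address for a TLD given in Unicode."""
    query = f"site:.{tld.lstrip('.')}"
    # safe="" makes quote() escape the colon as %3A, matching the list above.
    return "http://www.google.com/search?q=" + quote(query, safe="")


print(site_search_url("РФ"))
```

The percent-encoded form is the UTF-8 bytes of each character written in hexadecimal, which is why it is safe to paste into tools that cannot handle the original script.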

Thank you.

• Jon.Abolins@gmail.com

• Twitter: @jabolins

• Web: idn.MeydaOnline.com
