Exploiting Unicode-enabled Software
CanSecWest, March 2009
Chris Weber, www.lookout.net
chris@casabasecurity.com, Casaba Security
March 2009 © 2009 Chris Weber
www.casabasecurity.com

PETA Certified Presentation
• People for the Ethical Treatment of ASCII
  - "No ASCII characters were harmed in the making of this presentation"

Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
  - Find Unicode issues in Web-testing
  - Visual Spoofing Detection API

Unicode Crash Course
1991  Unicode
1990  ISO 10646 (UCS)
1985  ISO-8859-1; more code pages galore
1981  MBCS, GB2312
1981  CP437
1964  EBCDIC
1963  ASCII 7-bit; 8th bit free-for-all to follow

Unicode Crash Course: Code pages and charsets
Shift_JIS
GB2312
ISCII
Windows-1252
ISO-8859-1
EBCDIC 037

Unicode Crash Course: Ad Infinitum
• Unicode can represent them all
• ASCII range is preserved
  - U+0000 to U+007F are mapped to ASCII

Source: Wikipedia

Unicode Crash Course: The Unicode Attack Surface
• End users
• Applications
• Databases
• Programming languages
• Operating Systems

Unicode Crash Course: Unthink it

Unicode Crash Course
• A large and complex standard: code points, encodings, categorization, normalization, binary properties, case mapping, conversion tables, bi-directional properties, canonical mappings, decomposition types, case folding, best-fit mapping, 17 planes, private use ranges, script blocks, escapings

Unicode Crash Course
Glyph:       A
Encoding:    UTF-8, UTF-16, UTF-32
Properties:  Hex, Uppercase, etc.
Code point:  U+0041
Block:       Basic Latin
Script:      Latin
Plane:       Basic Multilingual Plane (BMP)

Unicode Crash Course: Code points
• Unicode 5.1 uses a 21-bit scalar value with space for over 1,100,000 code points
U+0000 to U+10FFFF

Unicode Crash Course: Code Points
A = U+0041
Every character has a unique number, represented by a hex value

Unicode Crash Course
A  U+0041

Unicode Crash Course
ſ  U+017F

Unicode Crash Course: Code points
• The full 21-bit range is not actually available
U+0000 to U+D7FF and
U+E000 to U+10FFFF
What's up with U+D800..U+DFFF?

Unicode Crash Course: UTF-16 Surrogate Pairs
U+101D1
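
The reserved range U+D800..U+DFFF exists so that UTF-16 can represent code points above U+FFFF as a pair of 16-bit units. The following is a minimal sketch of that arithmetic for the slide's sample, U+101D1; the function name `to_surrogates` is only an illustrative choice.

```python
def to_surrogates(cp: int) -> tuple:
    """Split a supplementary code point (U+10000..U+10FFFF) into a UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only supplementary code points need surrogates")
    offset = cp - 0x10000            # 20-bit value
    high = 0xD800 + (offset >> 10)   # top 10 bits select the high surrogate
    low = 0xDC00 + (offset & 0x3FF)  # bottom 10 bits select the low surrogate
    return high, low


high, low = to_surrogates(0x101D1)
print(f"U+101D1 -> {high:04X} {low:04X}")   # U+101D1 -> D800 DDD1
```

The result should agree with what a UTF-16 encoder emits for the same character.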

Unicode Crash Course: Encodings
UTF-8
  - Variable width, 1 to 4 bytes (used to be 6)
UTF-16
  - Endianness
  - Variable width, 2 or 4 bytes
  - Surrogate pairs
UTF-32
  - Endianness
  - Fixed width, 4 bytes
  - Fixed mapping, no algorithms needed
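
To make the width differences concrete, here is a small sketch that measures how many bytes one character takes in each encoding form. The samples U+0041, U+017F and U+101D1 appear on earlier slides; U+20AC is an extra sample added here only to show a 3-byte UTF-8 case.

```python
def encoded_widths(ch: str) -> dict:
    """Number of bytes one character occupies in each Unicode encoding form."""
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-be", "utf-32-be")}


for ch in ("A", "\u017F", "\u20AC", "\U000101D1"):
    print(f"U+{ord(ch):04X}", encoded_widths(ch))
```

The explicit big-endian codec names are used so that no byte order mark is counted in the lengths.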

Exploiting Unicode-enabled Software: Overview
• Unicode crash course
• Root Causes
  - Visual Spoofing and IDNs
  - Best-fit mappings
  - Normalization
  - Overlong UTF-8
  - Over-consumption
  - Character substitution
  - Character deletion
  - Casing
  - Buffer overflows
  - Controlling Syntax
  - Charset transformations
  - Charset mismatches
• Tools

Root Causes: Visual Spoofing
• Over 100,000 assigned characters
• Many lookalikes within and across scripts
A Α А ᐱ ᗅ ᗋ ᗩ ᴀ ᴬ Ꜳ A 𐀁 𐌀

Root Causes: IDN - Internationalized Domain Names
http://παράδειγμα.δοκιμή
(example.test)

Root Causes: IDN - what do the browsers do?
• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
  - http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
  - .ORG, .DE, .CN to name a few
• Language settings and TLD
• Character blacklisting
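
The Nameprep and Punycode steps can be tried out with Python's built-in `idna` codec, which implements IDNA 2003. This is only a sketch of the conversion a browser performs before a DNS lookup, using the example.test name from the earlier slide; it says nothing about any browser's display policy.

```python
import unicodedata

name = "παράδειγμα.δοκιμή"   # the example.test name shown on the IDN slide

ace = name.encode("idna")     # Nameprep each label, then Punycode it
print(ace)                    # the ASCII-compatible form that DNS sees
print(ace.decode("idna"))     # back to the Unicode form a browser may display
```

Each label is processed separately, which is why the dot survives and every converted label gets its own `xn--` prefix.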

Root Causes: IDN - so what's the problem?
• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
  - We're up to Unicode 5.1 (larger repertoire)

Attack Vectors: IDN homograph attacks
Some browsers allow .COM IDNs based on script family
  - (Latin has a big family)

Attack Vectors: IDN homograph attacks
Safari

Attack Vectors: IDN homograph attacks
Opera

Attack Vectors: IDN homograph attacks
www.google.com is not www.gooɡle.com
g: Latin U+0067
ɡ: Latin U+0261
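
A short sketch of why this pair is a problem: the two strings differ by a single Latin letter, and NFKC normalization does not fold U+0261 into U+0067, so normalization alone cannot catch it.

```python
import unicodedata

real = "www.google.com"
fake = "www.goo\u0261le.com"   # U+0261 in place of the second 'g'

for ch in ("g", "\u0261"):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

print(real == fake)                                  # False
print(unicodedata.normalize("NFKC", fake) == real)   # False: NFKC leaves U+0261 alone
```

Catching this case needs confusable detection on top of normalization.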

Root Causes: Guidance for Visual Spoofing
• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
  - IDNA, Stringprep
• Guidance
  - Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for International Domain Names
ICANN guidelines v2.0
  - Inclusion-based
  - Script limitations
  - Character limitations
Registries apply the guidance
  - Define the allowed characters per TLD
  - Collaboration with IANA
Registrars sell the domain names

Root Causes: The state of International Domain Names
ICANN guidelines v2.0
  - Inclusion-based: a deny-all default seems to be the right concept
  - Script limitations: a script can cross many blocks, so even with limited script choices there's plenty to choose from
  - Character limitations: great for domain labels, but subdomain labels are still open to punctuation and syntax spoofing

Root Causes: The state of International Domain Names
• Registrars still allow
  - Confusables
  - Combining marks
  - Single, Whole and Mixed-script
• Registrars can't control
  - Syntax spoofing in subdomain labels

Attack Vectors: Visual spoofing Vectors
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks
rn can look like m in certain fonts
www.mullets.com is not www.rnullets.com
m: Latin U+006D
rn: Latin U+0072 U+006E

Attack Vectors: Non-Unicode homograph attacks
Are you using mono-width fonts?
0 and O
1 and l
5 and S

Attack Vectors: Non-Unicode homograph attacks
Classic long URLs
http://loginfacebookintvitationvideomessageid-h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs
The Confusables
  - Single script
  - Mixed script
  - Whole script

Attack Vectors: Single-script and The Confusables
www.ɑpple.com
User thinks 'a'
Really it's Latin small letter alpha 'ɑ'
www.looĸout.net
User thinks 'k'
Really it's Latin small letter kra 'ĸ'

Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com
User thinks 'o'
Really it's Thai digit zero '๐'
www.faϲebook.com
User thinks 'c'
Really it's Greek lunate sigma symbol 'ϲ'
www.Ꮐoogle.com
Really it's Cherokee letter nah 'Ꮐ'

Attack Vectors: Whole-script and The Confusables
www.аЬс.com
User thinks 'abc'
Really it's Cyrillic script
www.ігѕ.gov
User thinks 'irs'
Really it's Cyrillic script
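
The difference between mixed-script and whole-script confusables can be sketched with a crude script check. Python's standard library does not expose the Unicode Script property, so this sketch assumes the first word of each character name is a usable stand-in; the helper `scripts` is hypothetical and not a substitute for a real implementation (for example one built on ICU).

```python
import unicodedata


def scripts(label: str) -> set:
    """Crude script guess: first word of each character's Unicode name.
    Illustrative assumption only; real code should use the Script property."""
    found = set()
    for ch in label:
        if ch.isascii():
            if ch.isalpha():
                found.add("LATIN")
            continue  # ASCII digits, dots and hyphens are treated as common
        found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found


for label in ("facebook", "fa\u03F2ebook", "g\u0E50\u0E50gle", "\u0430\u042C\u0441"):
    print(ascii(label), sorted(scripts(label)))
```

The two mixed-script labels report more than one script and are easy to flag. The whole-script label reports only Cyrillic, so a mixed-script check alone lets it through.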

Attack Vectors: IDN homograph attacks
Browsers whitelist .ORG

Attack Vectors: IDN homograph attacks
Others don't necessarily, but...

Attack Vectors: IDN homograph attacks
• .ORG is whitelisted
  - Limited characters available
• To unscrutinizing eyes, í looks like i

Attack Vectors: IDN homograph attacks
www.mozilla.org is not www.mozílla.org
i: Latin U+0069
í: Latin U+00ED

Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F
(This case doesn't work anymore)

Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F
(Normalized to a U+002F)

Attack Vectors: IDN Syntax Spoofing with lookalikes
╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
http://www･google･com
Katakana Dot U+FF65
(However, punctuation is not required...)

Attack Vectors: IDN Syntax Spoofing with lookalikes
http://www.google.comノpathノfile.nottrusted.org
Katakana No U+FF89
(However, punctuation is not required...)

Attack Vectors: IDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode

IDN Visual Spoofing
DEMO

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
  - Detects Confusables
  - Detects Invisibles
  - Detects syntax and punctuation lookalikes
  - Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
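
Flagging the three invisibles listed above is straightforward once they are named explicitly. This is a minimal sketch; the table and the function `find_invisibles` are illustrative, cover only the characters from the slide, and are not the Detection API mentioned earlier.

```python
INVISIBLES = {
    "\u200B": "ZERO WIDTH SPACE",
    "\u180E": "MONGOLIAN VOWEL SEPARATOR",
    "\uFEFF": "ZERO WIDTH NO-BREAK SPACE",
}


def find_invisibles(text: str) -> list:
    """Return (index, code point, name) for each listed invisible character."""
    return [(i, f"U+{ord(ch):04X}", INVISIBLES[ch])
            for i, ch in enumerate(text) if ch in INVISIBLES]


print(find_invisibles("goo\u200Bgle.com"))   # [(3, 'U+200B', 'ZERO WIDTH SPACE')]
```

An explicit list is used rather than a general-category test, because the category assigned to some of these characters has changed between Unicode versions.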

Attack Vectors: Visual Spoofing - Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org

Attack Vectors: Visual Spoofing - Problematic Font-rendering
··· is ··· (Arial, Gothic)
A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
ō̄ looks like <U+006F U+0304>
ō̄ is <U+006F U+0304 U+0304>

Attack Vectors: Visual Spoofing with Combining Marks
• Order of combining marks
  - ȏ and ö under NFKC
<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
<U+006F U+0311 U+0308> becomes <U+020F U+0308>
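
The ordering effect can be reproduced with the standard `unicodedata` module. Both marks share the same combining class, so normalization does not reorder them, and the two sequences stay distinct after NFKC.

```python
import unicodedata

a = "o\u0308\u0311"   # o + diaeresis + inverted breve
b = "o\u0311\u0308"   # same marks, opposite order

for s in (a, b):
    out = unicodedata.normalize("NFKC", s)
    print(" ".join(f"U+{ord(c):04X}" for c in out))
# U+00F6 U+0311
# U+020F U+0308
```

A comparison done after normalization therefore still treats the two strings as different.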

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9
  - "These characters are to be avoided wherever possible because of security concerns"
  - Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)

Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
U+2032 PRIME

Windows best-fit P/Invoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 maps to U+003C: Left-Pointing Angle Bracket
• U+3008 maps to U+003C: Left Angle Bracket
How can we best-fit the s character?
• U+015B maps to U+0073: Latin Small Letter S With Acute
• U+015D maps to U+0073: Latin Small Letter S With Circumflex
To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
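
The same "fail instead of best-fit" idea can be sketched in Python, whose codecs do not apply Windows best-fit tables. This is an analogy to an exception-style fallback, not the .NET or Win32 API itself; `to_legacy` is a hypothetical helper and windows-1252 is just an example target charset.

```python
def to_legacy(text: str, charset: str = "windows-1252") -> bytes:
    """Convert to a legacy charset, refusing anything that has no exact mapping."""
    return text.encode(charset, errors="strict")


for sample in ("moz-binding", "\uFF4Doz-binding", "\u03C3"):
    try:
        print(ascii(sample), "->", to_legacy(sample))
    except UnicodeEncodeError as exc:
        print(ascii(sample), "rejected:", exc.reason)
```

Rejecting the input keeps a fullwidth m or a Greek sigma from quietly turning into ASCII after the filter has already run.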

Case Study, Social Networking: Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  - Root Cause: best-fit mappings

Case Study, Social Networking: Best-fit mappings
-moz-binding()
was not allowed, but...
-[U+FF4D]oz-binding()
would best-fit map

Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution

Root Causes: Normalization
NFD - Decompose (canonical)
NFC - Decompose (canonical), then recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), then recompose

Root Causes: Normalization
İ becomes I + ◌̇
U+0130 becomes U+0049 U+0307

Root Causes: Normalization
But are there dangerous characters?
You bet... with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
U+FE64 becomes U+003C

Root Causes: Normalization
﹤ becomes <
toNFKC("﹤script>") = "<script>"
U+FE64 becomes U+003C

Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC: first defense against visual spoofing
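
The ordering problem can be shown in a few lines with the standard `unicodedata` module. The validator `is_safe` is a toy assumption made up for this sketch; a real filter would be more involved, but the order of operations is the point.

```python
import unicodedata


def is_safe(text: str) -> bool:
    """Toy validator (an assumption for this sketch): reject any '<'."""
    return "<" not in text


payload = "\uFE64script>"   # SMALL LESS-THAN SIGN, not U+003C

# Dangerous order: validate first, normalize afterwards.
print(is_safe(payload))                          # True, the filter passes it
late = unicodedata.normalize("NFKC", payload)
print(late)                                      # <script>

# Safer order: normalize first, then validate what will actually be used.
early = unicodedata.normalize("NFKC", payload)
print(is_safe(early))                            # False

# The canonical example from the earlier slide:
print(unicodedata.normalize("NFD", "\u0130") == "I\u0307")   # True
```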

Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  - Generation of non-shortest form
  - Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
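
To see why C0 A7 turns into 27, the sketch below decodes a 2-byte sequence without the shortest-form check and compares that with Python's strict decoder. The function `naive_decode_2byte` is deliberately unsafe and exists only for this illustration.

```python
def naive_decode_2byte(b: bytes) -> str:
    """Decode a 2-byte UTF-8 sequence WITHOUT the shortest-form check (unsafe on purpose)."""
    lead, trail = b
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


overlong = bytes([0xC0, 0xA7])
print(repr(naive_decode_2byte(overlong)))   # "'"

try:
    overlong.decode("utf-8")                # strict by default: throws on error
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```

A decoder that throws here gives the application, the framework and the database the same view of the input.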

Attack Vectors: Directory traversal
How many ways can you say

Attack Vectors
• Directory traversal test cases
  - httpsiterootsystem
  - Overlong UTF-8 of U+002E: httpsiterootC0AEsystem
  - Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  - Division Slash U+2215, best-fit: httpsiteroot E28895system
  - Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
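
The byte sequences and normalization behaviour behind these test cases can be checked with the standard library. This sketch covers only the three characters named above; note that NFKC leaves DIVISION SLASH unchanged, which fits the slide listing it under best-fit rather than normalization.

```python
import unicodedata

candidates = {
    "FULLWIDTH SOLIDUS": "\uFF0F",
    "DIVISION SLASH": "\u2215",
    "TWO DOT LEADER": "\u2025",
}

for name, ch in candidates.items():
    utf8 = ch.encode("utf-8").hex().upper()
    nfkc = unicodedata.normalize("NFKC", ch)
    print(f"{name}: UTF-8 {utf8}, NFKC -> {nfkc!r}")
```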

Root Causes: Handling the Unexpected
• Unassigned code points
  - U+2073
• Illegal code points
  - Half a surrogate pair
• Code points with special meaning
  - U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected - Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected - Over-consumption
<img src="[0xC2]"> onerror=alert(1)<br />
becomes
<img src="> onerror=alert(1)<br />
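
Over-consumption is easiest to see next to a decoder that behaves well. The function below is a deliberately flawed toy decoder written for this sketch: it handles only ASCII and 2-byte sequences and is not modelled on any particular product.

```python
def overconsuming_decode(data: bytes) -> str:
    """Flawed on purpose: trusts every 2-byte lead byte and swallows the byte
    after it, dropping the pair when it turns out to be ill-formed."""
    out, i = [], 0
    while i < len(data):
        b = data[i]
        if 0xC2 <= b <= 0xDF and i + 1 < len(data):
            trail = data[i + 1]
            if 0x80 <= trail <= 0xBF:
                out.append(chr(((b & 0x1F) << 6) | (trail & 0x3F)))
            i += 2   # bug: the next byte is skipped even if it was not a trail byte
        else:
            out.append(chr(b) if b < 0x80 else "\uFFFD")
            i += 1
    return "".join(out)


data = bytes([0x41, 0xC2, 0x3E, 0x41])
print(repr(overconsuming_decode(data)))              # 'AA': the '>' was eaten
print(repr(data.decode("utf-8", errors="replace")))  # 'A\ufffd>A': only C2 is replaced
```

Python's own decoder replaces just the ill-formed lead byte and keeps the 3E that follows it.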

Root Causes: Handling the Unexpected - Character-substitution
Correcting insecurely rather than failing
  - Substituting a character that carries syntactic meaning would be bad

Root Causes: Handling the Unexpected - Character-deletion
"deletion of noncharacters" (UTR-36)

Root Causes: Handling the Unexpected - Character-deletion
<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  - A common alternative is '?', which can be safe
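
The difference between deleting an unexpected code point and replacing it with U+FFFD can be sketched as follows. The filter `blocked` is a toy assumption standing in for whatever validation runs before the deleting layer.

```python
def blocked(text: str) -> bool:
    """Toy filter (an assumption for this sketch): look for a literal <script."""
    return "<script" in text.lower()


payload = "<scr\uFEFFipt>"

deleted = payload.replace("\uFEFF", "")          # what a deleting layer produces
replaced = payload.replace("\uFEFF", "\uFFFD")   # substitute U+FFFD instead

print(blocked(payload), blocked(deleted), blocked(replaced))   # False True False
```

The filter misses the payload as sent. Deletion then rebuilds the tag after the filter has run, while replacement leaves the string broken rather than dangerous.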

Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  - E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  - Root Cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier:
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

Safari BOM injection for XSS
DEMO

A Closer Look: The BOM
BOM: U+FEFF

Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing
toLower("İ") == "i"
toLower("scrİpt") == "script"

Root Causes: Casing
len(x) != len(toLower(x))
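
Casing results depend on the implementation and locale. As a sketch, Python's locale-independent `str.lower()` maps U+0130 to i followed by U+0307 rather than to a bare i as on the slide, which already shows the length change. The helper `expansion` is illustrative; the two samples are the ones Unicode Technical Report 36 uses for casing expansion.

```python
dotted_i = "\u0130"   # LATIN CAPITAL LETTER I WITH DOT ABOVE
lowered = dotted_i.lower()
print([f"U+{ord(c):04X}" for c in lowered], len(dotted_i), len(lowered))


def expansion(before: str, after: str, encoding: str = "utf-8") -> float:
    """Ratio of encoded sizes after and before a casing operation."""
    return len(after.encode(encoding)) / len(before.encode(encoding))


print(expansion("\u023A", "\u023A".lower()))   # 1.5: two UTF-8 bytes become three
print(expansion("\u0390", "\u0390".upper()))   # 3.0: one code point becomes three
```

A buffer sized from the input length is therefore too small for the cased output in both examples.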

Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET

Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows
Casing - maximum expansion factors
Operation  UTF        Factor  Sample
Lower      8          1.5     Ⱥ U+023A
Lower      16, 32     1       A U+0041
Upper      8, 16, 32  3       ΐ U+0390
Source: Unicode Technical Report 36

Root Causes: Buffer Overflows
Normalization - maximum expansion factors
Operation  UTF     Factor  Sample
NFC        8       3X      𝅘𝅥𝅮 U+1D160
NFC        16, 32  3X      שּׁ U+FB2C
NFD        8       3X      ΐ U+0390
NFD        16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
NFKC/NFKD  16, 32  18X     ﷺ U+FDFA
Source: Unicode Technical Report 36
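
Several of the normalization factors can be reproduced with the standard `unicodedata` module. The helper `expansion` is an illustrative name; it divides the encoded size of the normalized form by the encoded size of the original character.

```python
import unicodedata


def expansion(ch: str, form: str, encoding: str) -> float:
    """Size of normalize(form, ch) divided by the size of ch, both in `encoding`."""
    out = unicodedata.normalize(form, ch)
    return len(out.encode(encoding)) / len(ch.encode(encoding))


rows = [
    ("NFC", "\U0001D160", "utf-8"),
    ("NFD", "\u0390", "utf-8"),
    ("NFD", "\u1F82", "utf-32-be"),
    ("NFKC", "\uFDFA", "utf-8"),
    ("NFKC", "\uFDFA", "utf-32-be"),
]
for form, ch, enc in rows:
    print(form, f"U+{ord(ch):04X}", enc, expansion(ch, form, enc))
```

Sizing an output buffer from these worst-case factors, or letting a library allocate it, avoids the chars-versus-bytes mistakes listed above.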

Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET

Root Causes: Controlling Syntax
• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera
• Unicode formatter characters exploited for XSS
  - Damage: Filter evasion, controlling syntax
  - Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root Cause: Interpreting "white space"
  - A problem with the HTML 4.0 spec

Case Study: Opera
<a href=[U+180E]onclick=alert()>

Opera White Space Characters
DEMO

Case Study: Opera
MVS: U+180E

Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful...

Root Causes: Specifications
1) Character stability
  - IDNA/Nameprep based on Unicode 3.2
2) Designs
  - Specs are carefully designed but not always perfect
• This could have been a problem
  - "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  - HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
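
A mismatch matters because the same bytes mean different things under each label. The sketch below decodes one two-byte input under both charsets from the example above; the byte values are chosen for illustration and are not taken from the slides.

```python
raw = bytes([0x83, 0x5C])   # two bytes of attacker-controlled input

as_latin1 = raw.decode("iso-8859-1")   # what a filter trusting the header sees
as_sjis = raw.decode("shift_jis")      # what a consumer trusting the meta tag sees

print([f"U+{ord(c):04X}" for c in as_latin1])   # ['U+0083', 'U+005C']
print([f"U+{ord(c):04X}" for c in as_sjis])     # ['U+30BD']
```

Under ISO-8859-1 the second byte is a backslash, but under Shift_JIS it is the trail byte of a single katakana letter. A filter and a renderer that disagree on the charset therefore disagree on where escape characters are.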

Tools
• Watcher
  - Web-app security testing and auditing
• Visual Spoofing Detection API
  - Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher - Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API
• Problem
  - Unicode enables visual-spoofing-maximus
• Solution
  - Confusable detection
  - Invisibles detection
  - Syntax spoof detection
  - more

Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  - Browsers
  - Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net, Casaba Security
CanSecWestMarch 2009
Chris Weberwwwlookoutnet
chriscasabasecuritycomCasaba Security
Exploiting Unicode-enabled Software
March 2009 copy 2009 Chris Weber
bull People for the Ethical Treatment of ASCII
ndash ldquoNo ASCII characters were harmed in the making of this presentationrdquo
wwwcasabasecuritycom
PETA Certified Presentation
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
ndash Find Unicode issues in Web-testing
ndash Visual Spoofing Detection API
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
Unicode Crash Course
199119901985
1981
198119641963
bull Unicode
bull ISO 10646 (UCS)
bull ISO-8859-1
bull More code pages galore
bull MBCSbull GB2312
bull CP437
bull EBCDIC
bull ASCII 7-bitbull 8th bit free-for-all to follow
wwwcasabasecuritycom
March 2009 copy 2009 Chris Weber
Shift_jis
Gb2312
ISCII
Windows-1252
ISO-8859-1
EBCDIC 037
wwwcasabasecuritycom
Unicode Crash CourseCode pages and charsets
March 2009 copy 2009 Chris Weber
bull Unicode can represent them all
bull ASCII range is preserved
ndash U+0000 to U+007F are mapped to ASCII
wwwcasabasecuritycom
Unicode Crash CourseAd Infinitum
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Source Wikipedia
March 2009 copy 2009 Chris Weber
bull End users
bull Applications
bull Databases
bull Programming languages
bull Operating Systems
wwwcasabasecuritycom
Unicode Crash CourseThe Unicode Attack Surface
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash CourseUnthink it
March 2009 copy 2009 Chris Weber
bull A large and complex standard
Unicode Crash Course
code pointsencodingscategorizationnormalizationbinary propertiescase mappingconversion tablesbi-directional properties
canonical mappingsdecomposition typescase foldingbest-fit mapping17 planesprivate use rangesscript blocks
escapings
Unicode Crash Course
Glyph
Encoding
Properties
Code point
Block Script
Plane
A
UTF-8 UTF-16 UTF-32
Hex Uppercase etc
U+0041
Basic Latin Latin
Basic Multilingual Plane(BMP)
wwwcasabasecuritycom
March 2009 copy 2009 Chris Weber
bull Unicode 51 uses a 21-bit scalar value with space for over 1100000 code points
U+0000 to U+10FFFF
wwwcasabasecuritycom
Unicode Crash CourseCode points
March 2009 copy 2009 Chris Weber
A = U+0041
Every character has a unique number represented by a hex value
wwwcasabasecuritycom
Unicode Crash CourseCode Points
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash Course
AU+0041
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash Course
ſU+017F
March 2009 copy 2009 Chris Weber
bull The full 21-bit range is not actually available
U+0000 to U+D7FF and
U+E000 to U+10FFF
whatrsquos up with U+D800U+DFFF
wwwcasabasecuritycom
Unicode Crash CourseCode points
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash CourseUTF-16 Surrogate Pairs
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash CourseUTF-16 Surrogate Pairs
U+101D1
March 2009 copy 2009 Chris Weber
UTF-8 ndash variable width 1 to 4 bytes (used to be 6)
UTF-16ndash Endianessndash Variable width 2 or 4 bytesndash Surrogate pairs
UTF-32ndash Endianessndash Fixed width 4 bytesndash Fixed mapping no algorithms needed
wwwcasabasecuritycom
Unicode Crash CourseEncodings
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash coursebull Root Causes
ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareOverview
March 2009 copy 2009 Chris Weber
bull Over 100000 assigned characters
bull Many lookalikes within and across scripts
AΑАᐱᗅᗋᗩᴀᴬꜲA6553766304
wwwcasabasecuritycom
Root CausesVisual Spoofing
March 2009 copy 2009 Chris Weber
httpπαράδειγμαδοκιμή
(exampletest)
wwwcasabasecuritycom
Root CausesIDN ndash Internationalized Domain Names
March 2009 copy 2009 Chris Weber
bull IDNA 2003
bull Nameprep (NFKC and prohibit)
bull Punycodendash httpxn--hxajbheg2az3alxn--jxalpdlp
bull Whitelist TLDrsquosndash ORG DE CN to name a few
bull Language settings and TLD
bull Character blacklisting
wwwcasabasecuritycom
Root CausesIDN ndash what do the browsers do
March 2009 copy 2009 Chris Weber
bull Divergent browser implementations
bull Confusables exist
bull IDNA and Nameprep based on Unicode 32
ndash Wersquore up to Unicode 51 (larger repertoire)
wwwcasabasecuritycom
Root CausesIDN ndash so whatrsquos the problem
March 2009 copy 2009 Chris Weber
Some browsers allow COM IDNrsquos
based on script family
ndash (Latin has a big family)
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Safari
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Opera
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwgooglecom is not wwwgooɡlecom
Latin U+0069
LatinU+0261
gɡ
March 2009 copy 2009 Chris Weber
bull Normalize with NFKC
bull Homograph and Confusables detection
bull Specifications
ndash IDNA Stringprep
bull Guidance
ndash Unicode Consortium ICANN IETF IANA
wwwcasabasecuritycom
Root CausesGuidance for Visual Spoofing
ICANN guidelines v2.0
  - Inclusion-based
  - Script limitations
  - Character limitations
Registries apply the guidance
  - Define the allowed characters per TLD
  - Collaboration with IANA
Registrars sell the domain names
Root Causes: Guidance for International Domain Names

ICANN guidelines v2.0
  - Inclusion-based: a deny-all default seems to be the right concept
  - Script limitations: a script can cross many blocks, so even with limited script choices there's plenty to choose from
  - Character limitations: great for domain labels, but subdomain labels are still open to punctuation and syntax spoofing
Root Causes: The state of International Domain Names

• Registrars still allow
  - Confusables
  - Combining marks
  - Single-, whole-, and mixed-script spoofing
• Registrars can't control
  - Syntax spoofing in subdomain labels
Root Causes: The state of International Domain Names

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font rendering
• Manipulating combining marks
• Bidi and syntax spoofing
Attack Vectors: Visual spoofing vectors

"rn" can look like "m" in certain fonts
Attack Vectors: Non-Unicode homograph attacks
www.mullets.com is not www.rnullets.com
m: Latin small letter m, U+006D
rn: Latin small letters r and n, U+0072 U+006E

Are you using mono-width fonts?
0 and O
1 and l
5 and S
Attack Vectors: Non-Unicode homograph attacks

Classic long URLs
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements
Attack Vectors: Non-Unicode homograph attacks

The Confusables
  - Single script
  - Mixed script
  - Whole script
Attack Vectors: Defining Homographs

www.ɑpple.com
User thinks 'a'
Really it's Latin small letter alpha 'ɑ' (U+0251)
www.looĸout.net
User thinks 'k'
Really it's Latin small letter kra 'ĸ' (U+0138)
Attack Vectors: Single-script and The Confusables

www.g๐๐gle.com
User thinks 'o'
Really it's Thai digit zero '๐' (U+0E50)
www.faϲebook.com
User thinks 'c'
Really it's Greek lunate sigma symbol 'ϲ' (U+03F2)
www.Ꮐoogle.com
Really it's Cherokee letter nah 'Ꮐ' (U+13C0)
Attack Vectors: Mixed-script and The Confusables

www.аЬс.com
User thinks 'abc'
Really it's Cyrillic script
www.ігѕ.gov
User thinks 'irs'
Really it's Cyrillic script
Attack Vectors: Whole-script and The Confusables

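A rough way to tell the three confusable classes apart is to look at which scripts a label draws from. The sketch below is only a heuristic: it assumes the first word of a character's Unicode name ("LATIN", "GREEK", "CYRILLIC") stands in for the real Script property, and scripts_in and classify are hypothetical helper names.

```python
import unicodedata

def scripts_in(label: str) -> set:
    """Approximate the scripts used by the letters of a label.

    Assumption: the first word of the character name is treated as the
    script. This is not the Script property from the Unicode database.
    """
    found = set()
    for ch in label:
        if ch.isalpha():
            found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found

def classify(label: str) -> str:
    scripts = scripts_in(label)
    if len(scripts) > 1:
        return "mixed-script"
    if scripts and scripts != {"LATIN"}:
        return "whole-script (non-Latin)"
    return "single-script Latin"

print(classify("fa\u03f2ebook"))        # Greek lunate sigma among Latin letters
print(classify("\u0430\u042c\u0441"))   # all Cyrillic
print(classify("\u0251pple"))           # Latin alpha: still one script
```

Note the last case: a single-script confusable passes a script check untouched, so script mixing alone is not a sufficient test. Digits such as Thai digit zero are also skipped here because only letters are inspected.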
Browsers whitelist .ORG
Attack Vectors: IDN homograph attacks

Others don't necessarily, but…
Attack Vectors: IDN homograph attacks

• .ORG is whitelisted
  - Limited characters available
• To unscrutinizing eyes
  í looks like i
Attack Vectors: IDN homograph attacks

Attack Vectors: IDN homograph attacks
www.mozilla.org is not www.mozílla.org
i: Latin small letter i, U+0069
í: Latin small letter i with acute, U+00ED

(This case doesn't work anymore)
Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F

(Normalized to a U+002F)
Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F

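The two slides above differ only in whether the slash lookalike has been normalized. A small sketch of why that matters for the host name: naive_host is a hypothetical hand-rolled parser (not a library API) that takes everything between "//" and the first ASCII "/".

```python
import unicodedata

def naive_host(url: str) -> str:
    """Hypothetical helper: text between '//' and the first ASCII '/'."""
    rest = url.split("//", 1)[1]
    return rest.split("/", 1)[0]

# U+FF0F FULLWIDTH SOLIDUS in place of the path separators.
spoofed = "http://www.google.com\uff0fpath\uff0ffile.nottrusted.org"

host_as_written = naive_host(spoofed)
host_after_nfkc = naive_host(unicodedata.normalize("NFKC", spoofed))

print(host_as_written)   # one long host ending in .nottrusted.org
print(host_after_nfkc)   # www.google.com
```

Without normalization the whole string is a single host under nottrusted.org that merely looks like a google.com path. With NFKC applied first, U+FF0F becomes U+002F and the lookalike no longer survives.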
╱ U+2571 BOX DRAWINGS LIGHT DIAGONAL UPPER RIGHT TO LOWER LEFT
〳 U+3033 VERTICAL KANA REPEAT MARK UPPER HALF
Ꜹ U+A738 LATIN CAPITAL LETTER AV
ꜹ U+A739 LATIN SMALL LETTER AV
･ U+FF65 HALFWIDTH KATAKANA MIDDLE DOT
Attack Vectors: IDN Syntax Spoofing with and lookalikes

(However, punctuation is not required…)
Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
http://www･google･com
HALFWIDTH KATAKANA MIDDLE DOT U+FF65

(However, punctuation is not required…)
Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.comﾉpathﾉfile.nottrusted.org
HALFWIDTH KATAKANA LETTER NO U+FF89

Attack Vectors: IDN Syntax Spoofing with / lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode

DEMO
IDN Visual Spoofing

• Visual Spoofing Detection API
  - Detects confusables
  - Detects invisibles
  - Detects syntax and punctuation lookalikes
  - Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009
IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)

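Flagging invisibles is mostly a lookup. The sketch below checks for the three code points named on the slide and, as an added assumption, anything else in general category Cf (format characters); find_invisibles is an illustrative name, not the API announced in this talk.

```python
import unicodedata

# The three code points listed on the slide.
SLIDE_INVISIBLES = {"\u200b", "\u180e", "\ufeff"}

def find_invisibles(text: str) -> list:
    """Return (index, 'U+XXXX') pairs for characters unlikely to render."""
    hits = []
    for i, ch in enumerate(text):
        # Category Cf is a wider net than the slide's list (assumption).
        if ch in SLIDE_INVISIBLES or unicodedata.category(ch) == "Cf":
            hits.append((i, f"U+{ord(ch):04X}"))
    return hits

print(find_invisibles("goo\u200bgle.com"))
```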
Attack Vectors: Visual Spoofing - Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org

Attack Vectors: Visual Spoofing - Problematic Font-rendering
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
ō̄ looks like U+006F U+0304
ō̄ is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks
• Order of combining marks
  - ȏ and ö under NFKC
<U+006F U+0308 U+0311> → <U+00F6 U+0311>
<U+006F U+0311 U+0308> → <U+020F U+0308>

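The ordering example can be reproduced with Python's unicodedata module. A minimal sketch: both marks sit above the base letter, so normalization keeps their order and the two sequences compose to different precomposed letters.

```python
import unicodedata

def nfkc(s: str) -> str:
    return unicodedata.normalize("NFKC", s)

def codepoints(s: str) -> str:
    return " ".join(f"U+{ord(c):04X}" for c in s)

# o + COMBINING DIAERESIS + COMBINING INVERTED BREVE, and the reverse order
diaeresis_first = "o\u0308\u0311"
breve_first = "o\u0311\u0308"

print(codepoints(nfkc(diaeresis_first)))
print(codepoints(nfkc(breve_first)))
```

Two strings that can render almost identically therefore stay distinct after NFKC, which is why normalization alone does not settle combining-mark spoofing.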
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  - "These characters are to be avoided wherever possible because of security concerns"
  - Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)

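Detecting the two override characters is straightforward because each has its own bidirectional class. A minimal sketch using unicodedata.bidirectional; the function name and the sample file name are made up for illustration.

```python
import unicodedata

# Bidi classes of LEFT-TO-RIGHT OVERRIDE and RIGHT-TO-LEFT OVERRIDE.
OVERRIDE_CLASSES = {"LRO", "RLO"}

def has_bidi_override(text: str) -> bool:
    """True if the text contains U+202D or U+202E."""
    return any(unicodedata.bidirectional(ch) in OVERRIDE_CLASSES for ch in text)

# With U+202E, the characters after it are displayed right to left,
# so this name can appear on screen to end in "txt".
suspicious = "report\u202etxt.exe"
print(has_bidi_override(suspicious))
```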
Commonly occur in charset transformations and even innocuous APIs
Impact: filter evasion, enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
U+2032 PRIME
Root Causes: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 (Left-Pointing Angle Bracket) maps to U+003C
• U+3008 (Left Angle Bracket) maps to U+003C
How can we best-fit the s character?
• U+015B (Latin Small Letter S With Acute) maps to U+0073
• U+015D (Latin Small Letter S With Circumflex) maps to U+0073
To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]
Windows best-fit P/Invoke: Best-fit mappings

• Scrutinize character and charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
Root Causes: Guidance for Best-Fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  - Attack: filter evasion, code execution
  - Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
  - Root cause: best-fit mappings
Case Study: Social Networking - Best-fit mappings

-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map
Case Study: Social Networking - Best-fit mappings

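The case study boils down to validating first and converting afterwards. The sketch below imitates that with a one-entry mapping table. BEST_FIT is a hypothetical stand-in for a platform best-fit table, and filter_allows is an invented filter, not the site's actual logic. Python's own codecs do not best-fit, so the strict variant shows the "fail instead" behavior.

```python
# Hypothetical stand-in for a platform best-fit table (one entry only).
BEST_FIT = {"\uff4d": "m"}  # FULLWIDTH LATIN SMALL LETTER M -> m

def filter_allows(css: str) -> bool:
    """Validation step: reject the literal property name."""
    return "-moz-binding" not in css.lower()

def lossy_to_ascii(text: str) -> str:
    """Mimic a best-fit conversion that runs after validation."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)

def strict_to_cp1252(text: str) -> bytes:
    """Fail instead of guessing; raises UnicodeEncodeError for U+FF4D."""
    return text.encode("cp1252", errors="strict")

payload = "-\uff4doz-binding()"
print(filter_allows(payload))    # the filter sees nothing wrong
print(lossy_to_ascii(payload))   # yet the converted output is the blocked name
```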
Normalizing strings after validation is dangerous
Impact: filter evasion, enable code execution
Root Causes: Normalization

NFD - Decompose (canonical)
NFC - Decompose (canonical), then recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), then recompose
Root Causes: Normalization

İ becomes I + combining dot above
Root Causes: Normalization
U+0130 → U+0049 U+0307

But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
Root Causes: Normalization
U+FE64 → U+003C

﹤ becomes <
toNFKC("﹤script>") = "<script>"
Root Causes: Normalization
U+FE64 → U+003C

Normalize strings before validation
NFKC is a first defense against visual spoofing
Root Causes: Guidance for Normalization

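The ordering rule is easy to demonstrate with the U+FE64 example. In this sketch is_safe is a deliberately simple, hypothetical validator that rejects any "<"; the two wrappers differ only in whether NFKC runs before or after it.

```python
import unicodedata

def is_safe(text: str) -> bool:
    """Hypothetical validator: reject anything containing '<'."""
    return "<" not in text

def validate_then_normalize(text: str) -> str:
    """Dangerous order: the check runs on the un-normalized form."""
    if not is_safe(text):
        raise ValueError("rejected")
    return unicodedata.normalize("NFKC", text)

def normalize_then_validate(text: str) -> str:
    """Safer order: the check sees what later stages will see."""
    normalized = unicodedata.normalize("NFKC", text)
    if not is_safe(normalized):
        raise ValueError("rejected")
    return normalized

payload = "\ufe64script>"   # U+FE64 SMALL LESS-THAN SIGN
print(validate_then_normalize(payload))   # the tag slips through
```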
Non-shortest or overlong UTF-8
Impact: filter evasion, enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '
Root Causes: Non-shortest form UTF-8

• Unicode specification forbids
  - Generation of non-shortest form
  - Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
Root Causes: Guidance for Non-shortest form UTF-8

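The C0 A7 case can be worked through by hand. The lenient decoder below is a sketch of the flawed behavior: it applies the two-byte bit layout without checking that the value actually needed two bytes. Python's built-in decoder is used as the validating counterpart.

```python
def lenient_decode_2byte(seq: bytes) -> str:
    """Flawed sketch: combine the payload bits of a 2-byte sequence
    without rejecting values that fit in a single byte."""
    lead, trail = seq
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))

def strict_decode(seq: bytes) -> str:
    """Validating decode: raises UnicodeDecodeError on overlong forms."""
    return seq.decode("utf-8", errors="strict")

overlong_apostrophe = bytes([0xC0, 0xA7])
print(repr(lenient_decode_2byte(overlong_apostrophe)))   # an apostrophe, U+0027
```

A filter that looks for the byte 27 never sees it in C0 A7, yet a lenient component further down produces the quote anyway.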
How many ways can you say
Attack Vectors: Directory traversal

• Directory traversal test cases
  - httpsiterootsystem
  - Overlong UTF-8, U+002F: httpsiterootC0AEsystem
  - Full-width solidus U+FF0F, normalized with KC or KD: httpsiteroot EFBC8Fsystem
  - Division slash U+2215, best-fit: httpsiteroot E28895system
  - Two dot leader U+2025, normalized with KC or KD: httpsiterootE280A5 E280A5 system
Attack Vectors

• Unassigned code points
  - U+2073
• Illegal code points
  - Half a surrogate pair
• Code points with special meaning
  - U+FEFF is the BOM
• Impact: filter evasion, enable code execution
Root Causes: Handling the Unexpected

Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes
<41 41>
Root Causes: Handling the Unexpected - Over-consumption

<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
Root Causes: Handling the Unexpected - Over-consumption

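The <41 C2 3E 41> example can be replayed with a deliberately buggy decoder. overconsuming_decode below is an illustrative sketch that only understands one- and two-byte sequences and, on an ill-formed lead byte, also swallows the byte that follows. Python's decoder with errors="replace" is shown for contrast.

```python
def overconsuming_decode(data: bytes) -> str:
    """Flawed sketch (1- and 2-byte sequences only): an ill-formed lead
    byte is dropped together with the byte after it."""
    out, i = [], 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif 0xC2 <= b <= 0xDF and i + 1 < len(data) and 0x80 <= data[i + 1] <= 0xBF:
            out.append(chr(((b & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        else:
            i += 2  # bug: the following byte is consumed as well
    return "".join(out)

def safe_decode(data: bytes) -> str:
    """Replace only the bad byte with U+FFFD and keep what follows."""
    return data.decode("utf-8", errors="replace")

sample = bytes([0x41, 0xC2, 0x3E, 0x41])
print(overconsuming_decode(sample))   # the '>' (0x3E) has vanished
print(safe_decode(sample))            # the '>' survives
```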
Correcting insecurely rather than failing
  - Substituting a '' or a '' would be bad
Root Causes: Handling the Unexpected - Character-substitution

"deletion of noncharacters" (UTR-36)
Root Causes: Handling the Unexpected - Character-deletion

<scr[U+FEFF]ipt> becomes <script>
Root Causes: Handling the Unexpected - Character-deletion

• Fail or error
• Use U+FFFD instead
  - A common alternative is '' which can be safe
Root Causes: Solutions for Handling the Unexpected

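Deletion versus replacement can be compared directly on the <scr[U+FEFF]ipt> example. This is a sketch under simple assumptions: is_blocked is a hypothetical filter that looks for "<script", and the two cleanup functions only handle U+FEFF.

```python
BOM = "\ufeff"
REPLACEMENT = "\ufffd"   # U+FFFD REPLACEMENT CHARACTER

def is_blocked(text: str) -> bool:
    """Hypothetical filter: block anything containing '<script'."""
    return "<script" in text.lower()

def delete_unexpected(text: str) -> str:
    """Risky: silently removing the character joins the pieces."""
    return text.replace(BOM, "")

def replace_unexpected(text: str) -> str:
    """Safer: keep a visible marker where the character was."""
    return text.replace(BOM, REPLACEMENT)

payload = "<scr\ufeffipt>"
print(is_blocked(payload))                      # the filter misses it
print(is_blocked(delete_unexpected(payload)))   # deletion recreates the tag
print(is_blocked(replace_unexpected(payload)))  # replacement keeps it broken
```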
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  - E.g. cross-site scripting (the buffer overflow of the Web)
Attack Vectors: Filter evasion

Safari and Firefox BOM consumption
  - Attack: filter evasion, code execution
  - Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
  - Root cause: character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>
Case Study: Apple and Mozilla

DEMO
Safari BOM injection for XSS

A Closer Look: The BOM
BOM U+FEFF

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enable code execution
Root Causes: Casing

toLower("İ") == "i"
toLower("scrİpt") == "script"
Root Causes: Casing

len(x) != len(toLower(x))
Root Causes: Casing

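That case mapping can change string length is easy to confirm. The sketch below uses Python's str.lower() and str.upper(), which apply the default (locale-independent) Unicode mappings. Under those defaults İ lowercases to i followed by U+0307 rather than the bare "i" shown on the slide, so the exact result depends on the framework and locale in use.

```python
def length_change(s: str) -> tuple:
    """Return code point counts: (original, after lower(), after upper())."""
    return len(s), len(s.lower()), len(s.upper())

dotted_capital_i = "\u0130"       # LATIN CAPITAL LETTER I WITH DOT ABOVE
sharp_s = "\u00df"                # LATIN SMALL LETTER SHARP S
iota_dialytika_tonos = "\u0390"   # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS

for sample in (dotted_capital_i, sharp_s, iota_dialytika_tonos):
    print(f"U+{ord(sample):04X}", length_change(sample))
```

Any buffer sized from the length of the input before casing is therefore a candidate for overflow or truncation.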
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET
Root Causes: Guidance for Casing

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enable code execution
Root Causes: Buffer Overflows

Casing - maximum expansion factors
Root Causes: Buffer Overflows
Operation   UTF         Factor   Sample
Lower       8           1.5X     Ⱥ U+023A
            16, 32      1X       A U+0041
Upper       8, 16, 32   3X       ΐ U+0390
Source: Unicode Technical Report #36

Normalization - maximum expansion factors
Root Causes: Buffer Overflows
Operation   UTF         Factor   Sample
NFC         8           3X       𝅘𝅥 U+1D160
            16, 32      3X       שּׁ U+FB2C
NFD         8           3X       ΐ U+0390
            16, 32      4X       ᾂ U+1F82
NFKC/NFKD   8           11X      ﷺ U+FDFA
            16, 32      18X      ﷺ U+FDFA
Source: Unicode Technical Report #36

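The factors in the normalization table can be recomputed for individual samples. A minimal sketch: expansion is an illustrative helper that compares encoded sizes before and after normalizing one character, using the little-endian codecs so that no byte order mark is counted.

```python
import unicodedata

def expansion(form: str, ch: str, encoding: str) -> float:
    """Encoded size after normalization divided by encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before

print(expansion("NFKC", "\ufdfa", "utf-8"))       # U+FDFA in UTF-8
print(expansion("NFKC", "\ufdfa", "utf-32-le"))   # U+FDFA in UTF-32
print(expansion("NFD", "\u1f82", "utf-16-le"))    # U+1F82 in UTF-16
print(expansion("NFC", "\U0001d160", "utf-8"))    # U+1D160 in UTF-8
```

Sizing an output buffer from the input length times the worst-case factor for the chosen form and encoding is one conservative way to use these numbers.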
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET
Root Causes: Guidance for Buffer Overflows

• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enable code execution
Root Causes: Controlling Syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Attacks and Exploits: Controlling syntax

• Unicode formatter characters exploited for XSS
  - Damage: filter evasion, controlling syntax
  - Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root cause: interpreting "white space"
  - A problem with the HTML 4.0 spec
Case Study: Opera

<a href=[U+180E]onclick=alert()>
Case Study: Opera

DEMO
Opera White Space Characters

Case Study: Opera
MVS U+180E

• Question specifications
• Be careful…
Root Causes: Guidance for Controlling Syntax

1) Character stability
  - IDNA/Nameprep based on Unicode 3.2
2) Designs
  - Specs are carefully designed but not always perfect
• This could have been a problem
  - "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  - HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling of other characters up to the implementer
Root Causes: Specifications

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enable code execution, data loss
Root Causes: Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case-map, and normalize prior to validation and redisplay
Root Causes: Guidance for Charset Transformations

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enable code execution
Root Causes: Charset Mismatches

Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

• Force UTF-8
• Error if uncertain
Root Causes: Guidance for Charset Mismatches

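Why a mismatch matters: the same bytes mean different characters under different labels. The sketch below decodes one two-byte sample both ways and then applies the "force UTF-8, error if uncertain" advice. The function names and the sample bytes are illustrative assumptions.

```python
def decode_or_fail(body: bytes) -> str:
    """Force UTF-8 and raise on ill-formed input (no sniffing, no fallback)."""
    return body.decode("utf-8", errors="strict")

def content_type_header() -> str:
    """Declare the charset explicitly so the user-agent has nothing to guess."""
    return "Content-Type: text/html; charset=utf-8"

ambiguous = bytes([0x82, 0xA0])

as_latin1 = ambiguous.decode("iso-8859-1")   # two separate characters
as_sjis = ambiguous.decode("shift_jis")      # one double-byte character

print(len(as_latin1), len(as_sjis))
print(content_type_header())
```

A filter that validates the ISO-8859-1 reading while the browser renders the Shift_JIS reading is checking a different string than the one the user receives.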
• Watcher
  - Web-app security testing and auditing
• Visual Spoofing Detection API
  - Providing guarantees against visual spoofing and homograph attacks
Tools

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher - Some of the Passive Checks Included

http://websecuritytool.codeplex.com
Tools: Watcher - Web-app Security Testing and Auditing

• Problem
  - Unicode enables visual-spoofing-maximus
• Solution
  - Confusable detection
  - Invisibles detection
  - Syntax spoof detection
  - more
Tools: Visual Spoofing Detection API

• Cross-platform component library written in C
• Can be applied in user-agents or any software
  - Browsers
  - Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Detection API

Tools: Visual Spoofing Protection Demo

Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net, Casaba Security
bull IDNA 2003
bull Nameprep (NFKC and prohibit)
bull Punycodendash httpxn--hxajbheg2az3alxn--jxalpdlp
bull Whitelist TLDrsquosndash ORG DE CN to name a few
bull Language settings and TLD
bull Character blacklisting
wwwcasabasecuritycom
Root CausesIDN ndash what do the browsers do
March 2009 copy 2009 Chris Weber
bull Divergent browser implementations
bull Confusables exist
bull IDNA and Nameprep based on Unicode 32
ndash Wersquore up to Unicode 51 (larger repertoire)
wwwcasabasecuritycom
Root CausesIDN ndash so whatrsquos the problem
March 2009 copy 2009 Chris Weber
Some browsers allow COM IDNrsquos
based on script family
ndash (Latin has a big family)
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Safari
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Opera
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwgooglecom is not wwwgooɡlecom
Latin U+0069
LatinU+0261
gɡ
March 2009 copy 2009 Chris Weber
bull Normalize with NFKC
bull Homograph and Confusables detection
bull Specifications
ndash IDNA Stringprep
bull Guidance
ndash Unicode Consortium ICANN IETF IANA
wwwcasabasecuritycom
Root CausesGuidance for Visual Spoofing
March 2009 copy 2009 Chris Weber
ICANN guidelines v20
ndash Inclusion-based
ndash Script limitations
ndash Character limitations
Registries apply the guidance
ndash define the allowed characters per TLD
ndash Collaboration with IANA
Registrars sell the domain names
wwwcasabasecuritycom
Root CausesGuidance for International Domain Names
March 2009 copy 2009 Chris Weber
ICANN guidelines v20
ndash Inclusion-based
ndash Script limitations
ndash Character limitations
wwwcasabasecuritycom
Root CausesThe state of International Domain Names
Deny-all default seems to be the right concept
A script can cross many blocks Even with limited script choices therersquos plenty to choose from
Great for domain labels but sub domain labels still open to punctuation and syntax spoofing
March 2009 copy 2009 Chris Weber
bull Registrars still allow
ndash Confusables
ndash Combining marks
ndash Single Whole and Mixed-script
bull Registrars canrsquot control
ndash Syntax spoofing in sub domain labels
wwwcasabasecuritycom
Root CausesThe state of International Domain Names
March 2009 copy 2009 Chris Weber
bull Non-Unicode attacks
bull Confusables
bull Invisibles
bull Problematic font-rendering
bull Manipulating Combining Marks
bull Bidi and syntax spoofing
wwwcasabasecuritycom
Attack VectorsVisual spoofing Vectors
March 2009 copy 2009 Chris Weber
rn can look like m in certain fonts
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
wwwmulletscom is not wwwrnulletscom
Latin U+006D
LatinU+0073 U+006E
March 2009 copy 2009 Chris Weber
Are you using mono-width fonts
0 and O
1 and l
5 and S
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
Classic long URLrsquos
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
The Confusables
ndash Single script
ndash Mixed script
ndash Whole script
wwwcasabasecuritycom
Attack VectorsDefining Homographs
March 2009 copy 2009 Chris Weber
wwwɑpplecom User thinks lsquoarsquo
Really itrsquos Latin small letter Alpha lsquoɑrsquo
wwwlooĸoutnet
User thinks lsquokrsquo
Really itrsquos Latin letter kra lsquoĸrsquo
wwwcasabasecuritycom
Attack VectorsSingle-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwg๐๐glecom User thinks lsquoorsquo
Really itrsquos Thai digit zero lsquo๐rsquo
wwwfaϲebookcom
User thinks lsquocrsquo
Really itrsquos Greek lunate sigma symbol lsquocrsquo
wwwᏀooglecom
Really itrsquos Cherokee letter Nah lsquoᏀrsquo
wwwcasabasecuritycom
Attack VectorsMixed-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwаЬсcom
User thinks lsquoabcrsquo
Really itrsquos Cyrillic script
wwwігѕgov
User thinks lsquoirsrsquo
Really itrsquos Greek script
wwwcasabasecuritycom
Attack VectorsWhole-script and The Confusables
March 2009 copy 2009 Chris Weber
Browsers whitelist ORG
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Others donrsquot necessarily buthellip
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
bull ORG is whitelisted
ndash Limited characters available
bull To unscrutinizing eyes
iacute looks like i
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwmozillaorg is not wwwmoziacutellaorg
Latin U+0069
LatinU+00ED
March 2009 copy 2009 Chris Weber
(This case doesnrsquot work anymore)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
FULLWIDTH SOLIDUSU+FF0F
March 2009 copy 2009 Chris Weber
(Normalized to a U+002F)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
SOLIDUSU+002F
March 2009 copy 2009 Chris Weber
U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
U+FF65 KATAKANA MIDDLE DOT
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with and lookalikes
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes
httpwwwgooglecom
Katakana DotU+FF65
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecomノpathノfilenottrustedorg
Katakana NoU+FF89
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
IDN Visual Spoofing
March 2009 copy 2009 Chris Weber
bull Visual Spoofing Detection API
ndash Detects Confusables
ndash Detects Invisibles
ndash Detections syntax and punctuation lookalikes
ndash Detects combining mark tricks
bull Currently in testing
bull Release planned for Fall 2009
wwwcasabasecuritycom
IDN Visual SpoofingSolutions and Defenses (yes there is one)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
bull Fonts render glyphs confusingly
bull Fonts render glyphs as empty white space
httpwwwgooglecom phreedomorg
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)
A is A (Lucida Sans Unicode Courier New)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Multiple combining marks
o looks like U+006F U+0304
o is U+006F U+0304 U+0304
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Order of combining marksndash ȏ and ouml under NFKC
ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt
ltU+006F U+0311U+0308gt ltU+020F U+0308gt
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
bull httpunicodeorgreportstr9
ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo
ndash forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
Root Causes: Best-fit mappings
• Commonly occur in charset transformations and even innocuous APIs
• Impact: filter evasion, enabling code execution
• When σ becomes s
  - U+03C3 GREEK SMALL LETTER SIGMA
• When ′ becomes '
  - U+2032 PRIME
Windows best-fit P/Invoke (Best-fit mappings)
• The .NET runtime will marshal a string as LPStr to a P/Invoke function
• How can we best-fit the < character?
  - U+2329 LEFT-POINTING ANGLE BRACKET maps to U+003C
  - U+3008 LEFT ANGLE BRACKET maps to U+003C
• How can we best-fit the s character?
  - U+015B LATIN SMALL LETTER S WITH ACUTE maps to U+0073
  - U+015D LATIN SMALL LETTER S WITH CIRCUMFLEX maps to U+0073
• To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]
Root Causes: Guidance for Best-Fit mappings
• Scrutinize character and charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
Case Study: Social Networking (Best-fit mappings)
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  - Attack: filter evasion, code execution
  - Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
  - Root cause: best-fit mappings
Case Study: Social Networking (Best-fit mappings)
• -moz-binding() was not allowed, but...
• -[U+FF4D]oz-binding() would best-fit map
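The case study can be imitated in a few lines. This is a sketch under stated assumptions: `BEST_FIT` is a tiny illustrative subset of a best-fit table (real tables are platform-specific), and `is_blocked`, `best_fit_to_ascii`, and `strict_to_ascii` are hypothetical names, not the site's actual filter.

```python
# Illustrative subset of a best-fit table; real tables are platform-specific.
BEST_FIT = {
    "\uFF4D": "m",  # FULLWIDTH LATIN SMALL LETTER M
    "\u2329": "<",  # LEFT-POINTING ANGLE BRACKET
    "\u3008": "<",  # LEFT ANGLE BRACKET
    "\u2032": "'",  # PRIME
}


def is_blocked(css):
    """Hypothetical filter: reject the -moz-binding property."""
    return "-moz-binding" in css.lower()


def best_fit_to_ascii(text):
    """Simulate a lossy conversion that silently maps characters to lookalikes."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)


def strict_to_ascii(text):
    """Safer alternative: fail when a character has no exact mapping."""
    return text.encode("ascii", errors="strict").decode("ascii")


payload = "-\uFF4Doz-binding()"
print(is_blocked(payload))                     # False: the filter sees no match
print(is_blocked(best_fit_to_ascii(payload)))  # True: the blocked string appears afterwards

try:
    strict_to_ascii(payload)
except UnicodeEncodeError:
    print("rejected: no exact mapping")
```

Failing on unmappable characters plays the same role here as the fallback and flag settings recommended in the guidance.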
Root Causes: Normalization
• Normalizing strings after validation is dangerous
• Impact: filter evasion, enabling code execution
Root Causes: Normalization
• NFD - Decompose (canonical)
• NFC - Decompose (canonical), Recompose
• NFKD - Decompose (compatibility)
• NFKC - Decompose (compatibility), Recompose
Root Causes: Normalization
• İ (U+0130) becomes I + combining dot above (U+0049 U+0307)
Root Causes: Normalization
• But are there dangerous characters?
• You bet... with NFKC and NFKD you could control HTML or other parsing
• ﹤ (U+FE64) becomes < (U+003C)
Root Causes: Normalization
• ﹤ (U+FE64) becomes < (U+003C)
• toNFKC("﹤script>") = "<script>"
Root Causes: Guidance for Normalization
• Normalize strings before validation
• NFKC: first defense against visual spoofing
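The "normalize before validation" rule can be sketched with Python's `unicodedata`. The validator `contains_markup` and the wrapper `validate` are hypothetical stand-ins for whatever checks an application performs.

```python
import unicodedata


def contains_markup(text):
    """Hypothetical validator: reject anything containing an angle bracket."""
    return "<" in text or ">" in text


def validate(text):
    """Normalize first, then validate the form that will actually be used."""
    normalized = unicodedata.normalize("NFKC", text)
    if contains_markup(normalized):
        raise ValueError("markup not allowed")
    return normalized


# SMALL LESS-THAN SIGN and SMALL GREATER-THAN SIGN around "script".
payload = "\uFE64script\uFE65"

print(contains_markup(payload))                # False: raw input passes the check
print(unicodedata.normalize("NFKC", payload))  # <script>
print(unicodedata.normalize("NFD", "\u0130") == "I\u0307")  # True
```

Calling `validate(payload)` raises `ValueError`, whereas validating first and normalizing later would let the brackets through.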
Root Causes: Non-shortest form UTF-8
• Non-shortest or overlong UTF-8
• Impact: filter evasion, enabling code execution
• Application gets C0 A7
• OS/Framework sees 27
• Database gets '
Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  - Generation of non-shortest form
  - Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
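To make the C0 A7 example concrete, the sketch below contrasts a deliberately naive two-byte decoder with Python's built-in strict decoder. `naive_decode_2byte` is a made-up helper that shows the flaw; it is not how any particular product decodes.

```python
def naive_decode_2byte(data):
    """Decode a 2-byte UTF-8 sequence WITHOUT checking for shortest form."""
    lead, trail = data
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


def strict_decode(data):
    """Python's built-in decoder rejects non-shortest forms."""
    return data.decode("utf-8", errors="strict")


overlong_apostrophe = b"\xC0\xA7"

print(repr(naive_decode_2byte(overlong_apostrophe)))  # "'"
print(repr(naive_decode_2byte(b"\xC0\xAE")))          # '.'
print(repr(naive_decode_2byte(b"\xC0\xAF")))          # '/'

try:
    strict_decode(overlong_apostrophe)
except UnicodeDecodeError:
    print("rejected: ill-formed UTF-8")
```

A filter that inspects the raw bytes never sees 0x27, while a lenient component further down the chain does.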
Attack Vectors: Directory traversal
• How many ways can you say
Attack Vectors
• Directory traversal test cases
  - httpsiterootsystem
  - Overlong UTF-8, U+002F: httpsiterootC0AEsystem
  - Full-width Solidus U+FF0F, normalized (KC or KD): httpsiteroot EFBC8Fsystem
  - Division Slash U+2215, best-fit: httpsiteroot E28895system
  - Two Dot Leader U+2025, normalized (KC or KD): httpsiterootE280A5 E280A5 system
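A rough sketch of why the normalization-based test cases matter for path checks. The paths are hypothetical, and `has_traversal` is an illustrative check rather than a complete canonicalization routine.

```python
import unicodedata


def has_traversal(path, normalize=True):
    """Look for '..' path segments, optionally after NFKC normalization."""
    if normalize:
        path = unicodedata.normalize("NFKC", path)
    return ".." in path.split("/")


fullwidth = "root\uFF0F..\uFF0Fsystem"  # FULLWIDTH SOLIDUS as the separator
two_dot = "root/\u2025/system"          # TWO DOT LEADER standing in for '..'
division = "root\u2215..\u2215system"   # DIVISION SLASH, unchanged by NFKC

print(has_traversal(fullwidth, normalize=False))  # False: check on raw input misses it
print(has_traversal(fullwidth))                   # True
print(has_traversal(two_dot))                     # True
print(has_traversal(division))                    # False: only a best-fit mapping produces '/'
```

The last case shows that normalization alone does not cover best-fit conversions, which have to be prevented or checked separately.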
Root Causes: Handling the Unexpected
• Unassigned code points
  - U+2073
• Illegal code points
  - Half a surrogate pair
• Code points with special meaning
  - U+FEFF is the BOM
• Impact: filter evasion, enabling code execution
Root Causes: Handling the Unexpected: Over-consumption
• Over-consuming ill-formed byte sequences
• Big problem with MBCS lead bytes
• <41 C2 3E 41> becomes <41 41>
Root Causes: Handling the Unexpected: Over-consumption
• <img src=[0xC2]> onerror=alert(1)<br />
  becomes
  <img src=> onerror=alert(1)<br />
• The stray lead byte 0xC2 consumes the byte that follows it
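The <41 C2 3E 41> example can be reproduced with a deliberately flawed decoder. `lenient_decode` is a made-up illustration that handles only ASCII and two-byte sequences; the comparison with Python's built-in decoder shows one safe behavior.

```python
def lenient_decode(data):
    """Flawed decoder: a 2-byte lead byte always swallows the next byte."""
    out = []
    i = 0
    while i < len(data):
        byte = data[i]
        if 0xC2 <= byte <= 0xDF:
            trail = data[i + 1] if i + 1 < len(data) else None
            if trail is not None and 0x80 <= trail <= 0xBF:
                out.append(chr(((byte & 0x1F) << 6) | (trail & 0x3F)))
            # Bug being illustrated: the next byte is skipped even when it
            # is not a valid continuation byte.
            i += 2
        else:
            out.append(chr(byte))
            i += 1
    return "".join(out)


ill_formed = bytes([0x41, 0xC2, 0x3E, 0x41])

print(lenient_decode(ill_formed))                          # AA
print(repr(ill_formed.decode("utf-8", errors="replace")))  # 'A\ufffd>A'
```

The built-in decoder replaces only the offending lead byte and keeps the 0x3E that follows it.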
Root Causes: Handling the Unexpected: Character-substitution
• Correcting insecurely rather than failing
  - Substituting a '' or a '' would be bad
Root Causes: Handling the Unexpected: Character-deletion
• "deletion of noncharacters" (UTR-36)
Root Causes: Handling the Unexpected: Character-deletion
• <scr[U+FEFF]ipt> becomes <script>
Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  - A common alternative is '' which can be safe
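A short sketch of the difference between deleting an unexpected character and replacing it with U+FFFD. The filter `is_blocked` and the two helper functions are hypothetical.

```python
BOM = "\uFEFF"


def is_blocked(html):
    """Hypothetical filter: reject input containing a script tag."""
    return "<script" in html.lower()


def delete_bom(text):
    """Unsafe handling: silently drop U+FEFF."""
    return text.replace(BOM, "")


def replace_bom(text):
    """Safer handling: substitute U+FFFD REPLACEMENT CHARACTER."""
    return text.replace(BOM, "\uFFFD")


payload = "<scr\uFEFFipt>"

print(is_blocked(payload))               # False: passes validation
print(is_blocked(delete_bom(payload)))   # True: deletion rebuilt the tag
print(is_blocked(replace_bom(payload)))  # False: the tag stays broken
```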
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  - E.g. cross-site scripting (buffer overflow of the Web)
Case Study: Apple and Mozilla
• Safari and Firefox BOM consumption
  - Attack: filter evasion, code execution
  - Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
  - Root cause: character deletion
• <a href="java[U+FEFF]script:alert('XSS')>
• Can be nastier:
  <a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>
Demo: Safari BOM injection for XSS
A Closer Look: The BOM
• BOM: U+FEFF
Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enabling code execution
Root Causes: Casing
• toLower("İ") == "i"
• toLower("scrİpt") == "script"
Root Causes: Casing
• len(x) != len(toLower(x))
Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET
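Both casing hazards can be shown briefly. `simple_lower` is a hypothetical lowering routine that maps U+0130 straight to "i", mirroring the slide's toLower example; Python's own `str.lower()` behaves differently and is used here only to show the change in length.

```python
def simple_lower(text):
    """Hypothetical lowering that maps U+0130 directly to 'i', as on the slide."""
    return text.replace("\u0130", "i").lower()


def is_blocked(text):
    """Hypothetical filter looking for a forbidden keyword."""
    return "script" in text


payload = "scr\u0130pt"
print(is_blocked(payload))                # False: validated before casing
print(is_blocked(simple_lower(payload)))  # True: casing ran after validation

# Python's full case mapping changes the number of code points.
dotted_i = "\u0130"
print(len(dotted_i), len(dotted_i.lower()))  # 1 2
print(len("\u0390"), len("\u0390".upper()))  # 1 3
```

Running the casing step first and validating its output closes the first gap; sizing buffers from the converted string, not the input, addresses the second.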
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: enabling code execution
Root Causes: Buffer Overflows
Casing - maximum expansion factors (Source: Unicode Technical Report 36)
Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390
Root Causes: Buffer Overflows
Normalization - maximum expansion factors (Source: Unicode Technical Report 36)
Operation   UTF         Factor   Sample
NFC         8           3X       𝅘𝅥𝅮 U+1D160
NFC         16, 32      3X       שּׁ U+FB2C
NFD         8           3X       ΐ U+0390
NFD         16, 32      4X       ᾂ U+1F82
NFKC/NFKD   8           11X      ﷺ U+FDFA
NFKC/NFKD   16, 32      18X      ﷺ U+FDFA
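The factors in the normalization table can be recomputed with `unicodedata`. The helper `expansion` below is an illustrative function that measures growth in code units of a given encoding.

```python
import unicodedata

# Bytes per code unit for each encoding form used in the table.
UNIT_SIZE = {"utf-8": 1, "utf-16-le": 2, "utf-32-le": 4}


def expansion(ch, form, encoding):
    """Return the size ratio (after / before) in code units of `encoding`."""
    unit = UNIT_SIZE[encoding]
    before = len(ch.encode(encoding)) // unit
    after = len(unicodedata.normalize(form, ch).encode(encoding)) // unit
    return after / before


print(expansion("\U0001D160", "NFC", "utf-8"))  # 3.0
print(expansion("\uFB2C", "NFC", "utf-16-le"))  # 3.0
print(expansion("\u0390", "NFD", "utf-8"))      # 3.0
print(expansion("\u1F82", "NFD", "utf-16-le"))  # 4.0
print(expansion("\uFDFA", "NFKC", "utf-8"))     # 11.0
print(expansion("\uFDFA", "NFKC", "utf-32-le")) # 18.0
```

A buffer sized from the input length can therefore be too small by up to these factors once the string is normalized.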
Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET
Root Causes: Controlling Syntax
• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enabling code execution
Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Case Study: Opera
• Unicode formatter characters exploited for XSS
  - Damage: filter evasion, controlling syntax
  - Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root cause: interpreting "white space"
  - A problem with the HTML 4.0 spec
Case Study: Opera
• <a href=[U+180E]onclick=alert()>
Demo: Opera White Space Characters
Case Study: Opera
• MVS: U+180E
Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful...
Root Causes: Specifications
1) Character stability
  - IDNA/Nameprep based on Unicode 3.2
2) Designs
  - Specs are carefully designed but not always perfect
• This could have been a problem
  - "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  - HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enabling code execution, data loss
Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay
Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enabling code execution
Root Causes: Charset Mismatches
• Content-Type: charset=ISO-8859-1
• <meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
• Attacker-controlled input
Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
Tools
• Watcher
  - Web-app security testing and auditing
• Visual Spoofing Detection API
  - Providing guarantees against visual spoofing and homograph attacks
Tools: Watcher - Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher - Web-app Security Testing and Auditing
• http://websecuritytool.codeplex.com
Tools: Visual Spoofing Detection API
• Problem
  - Unicode enables visual-spoofing-maximus
• Solution
  - Confusable detection
  - Invisibles detection
  - Syntax spoof detection
  - more
Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  - Browsers
  - Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Protection Demo
Thank you
• Contact me with questions, new test cases, or ideas to share
• Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
• Chris Weber, www.lookout.net
• Casaba Security, www.casabasecurity.com
Unicode Crash Course
• A large and complex standard
  - code points, encodings, categorization, normalization, binary properties, case mapping, conversion tables, bi-directional properties, canonical mappings, decomposition types, case folding, best-fit mapping, 17 planes, private use ranges, script blocks, escapings
Unicode Crash Course
• Glyph: A
• Encoding: UTF-8, UTF-16, UTF-32
• Properties: Hex, Uppercase, etc.
• Code point: U+0041
• Block / Script: Basic Latin / Latin
• Plane: Basic Multilingual Plane (BMP)
Unicode Crash Course: Code points
• Unicode 5.1 uses a 21-bit scalar value with space for over 1,100,000 code points
  - U+0000 to U+10FFFF
Unicode Crash Course: Code Points
• A = U+0041
• Every character has a unique number, represented by a hex value
Unicode Crash Course
• A: U+0041
Unicode Crash Course
• ſ: U+017F
Unicode Crash Course: Code points
• The full 21-bit range is not actually available
  - U+0000 to U+D7FF and
  - U+E000 to U+10FFFF
• What's up with U+D800 to U+DFFF?
Unicode Crash Course: UTF-16 Surrogate Pairs
• U+101D1
Unicode Crash Course: Encodings
• UTF-8
  - Variable width, 1 to 4 bytes (used to be 6)
• UTF-16
  - Endianness
  - Variable width, 2 or 4 bytes
  - Surrogate pairs
• UTF-32
  - Endianness
  - Fixed width, 4 bytes
  - Fixed mapping, no algorithms needed
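The widths and the surrogate-pair split described in these slides can be verified directly. `surrogate_pair` is an illustrative helper that applies the standard UTF-16 arithmetic to the slide's U+101D1 example.

```python
def surrogate_pair(code_point):
    """Split a supplementary code point into UTF-16 high and low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


high, low = surrogate_pair(0x101D1)
print(f"{high:04X} {low:04X}")  # D800 DDD1

# Encoded size in bytes for UTF-8, UTF-16, and UTF-32.
for ch in ("A", "\u017F", "\U000101D1"):
    sizes = [len(ch.encode(enc)) for enc in ("utf-8", "utf-16-be", "utf-32-be")]
    print(f"U+{ord(ch):04X}", sizes)
# U+0041 [1, 2, 4]
# U+017F [2, 2, 4]
# U+101D1 [4, 4, 4]
```

The range U+D800 to U+DFFF is reserved for these surrogate code units, which is why it is not available for characters.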
Exploiting Unicode-enabled Software: Overview
• Unicode crash course
• Root Causes
  - Visual Spoofing and IDNs
  - Best-fit mappings
  - Normalization
  - Overlong UTF-8
  - Over-consumption
  - Character substitution
  - Character deletion
  - Casing
  - Buffer overflows
  - Controlling Syntax
  - Charset transformations
  - Charset mismatches
• Tools
Root Causes: Visual Spoofing
• Over 100,000 assigned characters
• Many lookalikes within and across scripts
  - A Α А ᐱ ᗅ ᗋ ᗩ ᴀ ᴬ Ꜳ A 𐀁 𐌀
Root Causes: IDN - Internationalized Domain Names
• http://παράδειγμα.δοκιμή
  - (example.test)
Root Causes: IDN - what do the browsers do?
• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
  - http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
  - .ORG, .DE, .CN to name a few
• Language settings and TLD
• Character blacklisting
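The Punycode step can be tried with Python's built-in `punycode` codec. This sketch uses NFKC as a rough stand-in for Nameprep, which is a simplification, and the helper names are made up; the printed result can be compared with the Punycode form on the slide.

```python
import unicodedata


def to_ascii_label(label):
    """Encode one domain label as 'xn--' plus Punycode (simplified, no full Nameprep)."""
    label = unicodedata.normalize("NFKC", label).lower()
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


def to_unicode_label(label):
    """Decode an 'xn--' label back to Unicode; other labels pass through."""
    if label.startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label


host = "παράδειγμα.δοκιμή"
ascii_host = ".".join(to_ascii_label(part) for part in host.split("."))
round_trip = ".".join(to_unicode_label(part) for part in ascii_host.split("."))

print(ascii_host)
print(round_trip == unicodedata.normalize("NFKC", host))
```

The browser displays the Unicode form while DNS only ever sees the ASCII form, which is why display policy matters so much.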
Root Causes: IDN - so what's the problem?
• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
  - We're up to Unicode 5.1 (larger repertoire)
Attack Vectors: IDN homograph attacks
• Some browsers allow .COM IDNs based on script family
  - (Latin has a big family)
Attack Vectors: IDN homograph attacks
• Safari
Attack Vectors: IDN homograph attacks
• Opera
Attack Vectors: IDN homograph attacks
• www.google.com is not www.gooɡle.com
  - g: Latin, U+0067
  - ɡ: Latin, U+0261
Root Causes: Guidance for Visual Spoofing
• Normalize with NFKC
• Homograph and confusables detection
• Specifications
  - IDNA, Stringprep
• Guidance
  - Unicode Consortium, ICANN, IETF, IANA
Root Causes: Guidance for International Domain Names
• ICANN guidelines v2.0
  - Inclusion-based
  - Script limitations
  - Character limitations
• Registries apply the guidance
  - Define the allowed characters per TLD
  - Collaboration with IANA
• Registrars sell the domain names
Root Causes: The state of International Domain Names
• ICANN guidelines v2.0
  - Inclusion-based: deny-all default seems to be the right concept
  - Script limitations: a script can cross many blocks. Even with limited script choices there's plenty to choose from
  - Character limitations: great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing
Root Causes: The state of International Domain Names
• Registrars still allow
  - Confusables
  - Combining marks
  - Single, Whole, and Mixed-script
• Registrars can't control
  - Syntax spoofing in sub domain labels
Attack Vectors: Visual spoofing Vectors
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing
Attack Vectors: Non-Unicode homograph attacks
• rn can look like m in certain fonts
• www.mullets.com is not www.rnullets.com
  - m: Latin, U+006D
  - rn: Latin, U+0072 U+006E
Attack Vectors: Non-Unicode homograph attacks
• Are you using mono-width fonts?
  - 0 and O
  - 1 and l
  - 5 and S
Attack Vectors: Non-Unicode homograph attacks
• Classic long URLs
  - httploginfacebookintvitationvideomessageid-h048892r39sessionnfbidcomhomehtmdisbursements
Attack Vectors: Defining Homographs
• The Confusables
  - Single script
  - Mixed script
  - Whole script
Attack Vectors: Single-script and The Confusables
• www.ɑpple.com
  - User thinks 'a'
  - Really it's Latin small letter alpha 'ɑ'
• www.looĸout.net
  - User thinks 'k'
  - Really it's Latin small letter kra 'ĸ'
Attack Vectors: Mixed-script and The Confusables
• www.g๐๐gle.com
  - User thinks 'o'
  - Really it's Thai digit zero '๐'
• www.faϲebook.com
  - User thinks 'c'
  - Really it's Greek lunate sigma symbol 'ϲ'
• www.Ꮐoogle.com
  - Really it's Cherokee letter Nah 'Ꮐ'
Attack Vectors: Whole-script and The Confusables
• www.аЬс.com
  - User thinks 'abc'
  - Really it's Cyrillic script
• www.ігѕ.gov
  - User thinks 'irs'
  - Really it's Cyrillic script
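A rough heuristic for the mixed-script case, using the first word of each character's Unicode name as a script hint. This is an assumption-laden sketch: character names only approximate the Script property, and `rough_scripts` and `is_mixed_script` are made-up helpers, not the detection API mentioned in the talk.

```python
import unicodedata


def rough_scripts(label):
    """Collect a rough script hint for each letter or digit in the label."""
    scripts = set()
    for ch in label:
        if not ch.isalnum():
            continue  # skip dots, hyphens, and other punctuation
        if ch.isascii():
            if ch.isalpha():
                scripts.add("LATIN")
            continue  # ASCII digits are shared across scripts
        scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def is_mixed_script(label):
    """Flag labels whose characters come from more than one rough script."""
    return len(rough_scripts(label)) > 1


print(is_mixed_script("fa\u03F2ebook"))       # True: LATIN + GREEK
print(is_mixed_script("g\u0E50\u0E50gle"))    # True: LATIN + THAI
print(is_mixed_script("\u13C0oogle"))         # True: CHEROKEE + LATIN
print(is_mixed_script("\u0430\u042C\u0441"))  # False: whole-script Cyrillic
print(is_mixed_script("goo\u0261le"))         # False: single-script Latin confusable
```

The last two results show the limit of a script check: whole-script and single-script confusables need a confusables table to detect.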
Attack Vectors: IDN homograph attacks
• Browsers whitelist .ORG
Attack Vectors: IDN homograph attacks
• Others don't necessarily, but...
Attack Vectors: IDN homograph attacks
• .ORG is whitelisted
  - Limited characters available
• To unscrutinizing eyes, í looks like i
Attack Vectors: IDN homograph attacks
• www.mozilla.org is not www.mozílla.org
  - i: Latin, U+0069
  - í: Latin, U+00ED
Attack Vectors: IDN Syntax Spoofing with / lookalikes
• (This case doesn't work anymore)
• http://www.google.com／path／file.nottrusted.org
  - FULLWIDTH SOLIDUS: U+FF0F
Attack Vectors: IDN Syntax Spoofing with / lookalikes
• (Normalized to a U+002F)
• http://www.google.com/path/file.nottrusted.org
  - SOLIDUS: U+002F
Attack Vectors: IDN Syntax Spoofing with / and . lookalikes
• ╱ U+2571 Box Drawings
• 〳 U+3033 Kana Repeat Mark
• Ꜹ U+A738 LATIN CAPITAL AV
• ꜹ U+A739 LATIN SMALL AV
• ･ U+FF65 KATAKANA MIDDLE DOT
Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
• (However, punctuation not required...)
• http://www.google.com, with U+FF65 in place of a full stop
  - Katakana Dot: U+FF65
Attack Vectors: IDN Syntax Spoofing with / lookalikes
• (However, punctuation not required...)
• http://www.google.comノpathノfile.nottrusted.org
  - Katakana No: U+FF89
Attack Vectors: IDN Syntax Spoofing with lookalikes
• Browser sees and displays a valid IDN
• DNS sees Punycode
Demo: IDN Visual Spoofing
IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
  - Detects confusables
  - Detects invisibles
  - Detects syntax and punctuation lookalikes
  - Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009
Attack Vectors: The Invisibles
• U+200B (ZERO WIDTH SPACE)
• U+180E (MONGOLIAN VOWEL SEPARATOR)
• U+FEFF (ZERO WIDTH NO-BREAK SPACE)
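Checking a string for the three invisibles listed above takes only a lookup table. The set is limited to the slide's examples and is not exhaustive; `find_invisibles` and the sample host name are hypothetical.

```python
# Only the three characters named on the slide; a real list would be longer.
INVISIBLES = {
    "\u200B": "ZERO WIDTH SPACE",
    "\u180E": "MONGOLIAN VOWEL SEPARATOR",
    "\uFEFF": "ZERO WIDTH NO-BREAK SPACE",
}


def find_invisibles(text):
    """Return (index, name) for each listed invisible character in text."""
    return [(i, INVISIBLES[ch]) for i, ch in enumerate(text) if ch in INVISIBLES]


sample = "pay\u200Bpal.example"  # hypothetical host name with a hidden character
print(find_invisibles(sample))   # [(3, 'ZERO WIDTH SPACE')]
print(len(sample), len("paypal.example"))  # 15 14
```

The two strings render alike in many fonts yet compare as different, which is what makes these characters useful for spoofing.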
Attack Vectors: Visual Spoofing: Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
  - http://www.google.com phreedom.org
March 2009 copy 2009 Chris Weber
bull Avoid if possible
bull Use Unicode as the broker
bull Beware the PUA mappings
bull Transform case and normalize prior to validation and redisplay
wwwcasabasecuritycom
Root CausesGuidance for Charset Transformations
March 2009 copy 2009 Chris Weber
bull Some charset identifiers are ill-defined
bull Vendor implementations vary
bull User-agents may sniff if confused
bull Attackers manipulate behavior
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCharset Mismatches
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Root CausesCharset Mismatches
Content-Type charset=ISO-8859-1
ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt
Attacker-controlled input
March 2009 copy 2009 Chris Weber
bull Force UTF-8
bull Error if uncertain
wwwcasabasecuritycom
Root CausesGuidance for Charset Mismatches
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Watcher
ndash Web-app security testing and auditing
bull Visual Spoofing Detection API
ndash Providing guarantees against Visual Spoofing and Homograph attacks
wwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
bull Unicode transformation hot-spotsbull User-controlled HTMLbull Cross-domain issuesbull Insecure cookiesbull Insecure HTTPHTTPS transitionsbull SSL protocol and certificate issuesbull XSS hot-spotsbull Flash issuesbull Silverlight issuesbull Information disclosure
wwwcasabasecuritycom
ToolsWatcher ndash Some of the Passive Checks Included
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
httpwebsecuritytoolcodeplexcom
wwwcasabasecuritycom
ToolsWatcher - Web-app Security Testing and Auditing
March 2009 copy 2009 Chris Weber
bull Problemndash Unicode enables visual-spoofing-maximus
bull Solutionndash Confusable detection
ndash Invisibles detection
ndash Syntax spoof detection
ndash more
wwwcasabasecuritycom
ToolsVisual Spoofing Detection API
March 2009 copy 2009 Chris Weber
bull Cross-platform component library written in C
bull Can be applied in user-agents or any softwarendash Browsers
ndash Email clients
bull Planned for release Fall 2009
bull Email me with questions
wwwcasabasecuritycom
ToolsVisual Spoofing Detection API
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
ToolsVisual Spoofing Protection Demo
Thank you
Contact me with questions new test cases or ideas to share
Visit my website for test cases Unicode and security tools and the Anti-Visual-Spoofing API
Chris WeberwwwlookoutnetCasaba Security
wwwcasabasecuritycom
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
Unicode Crash Course
[Timeline figure; years shown: 1963, 1964, 1981, 1981, 1985, 1990, 1991]
• Unicode
• ISO 10646 (UCS)
• ISO-8859-1
• More code pages galore
• MBCS
• GB2312
• CP437
• EBCDIC
• ASCII 7-bit
• 8th bit free-for-all to follow

Unicode Crash Course: Code pages and charsets
• Shift_JIS
• GB2312
• ISCII
• Windows-1252
• ISO-8859-1
• EBCDIC 037

Unicode Crash Course: Ad Infinitum
• Unicode can represent them all
• ASCII range is preserved
  – U+0000 to U+007F are mapped to ASCII
[Image. Source: Wikipedia]

Unicode Crash Course: The Unicode Attack Surface
• End users
• Applications
• Databases
• Programming languages
• Operating Systems

Unicode Crash Course: Unthink it

Unicode Crash Course
• A large and complex standard
  code points, encodings, categorization, normalization, binary properties, case mapping, conversion tables, bi-directional properties, canonical mappings, decomposition types, case folding, best-fit mapping, 17 planes, private use ranges, script blocks, escapings

Unicode Crash Course
  Glyph:       A
  Encoding:    UTF-8, UTF-16, UTF-32
  Properties:  Hex, Uppercase, etc.
  Code point:  U+0041
  Block:       Basic Latin
  Script:      Latin
  Plane:       Basic Multilingual Plane (BMP)

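The glyph, code point, property, and encoding layers of a character can be inspected programmatically. The following is a minimal sketch using Python's standard `unicodedata` module; the `describe` helper is a name chosen here for illustration, and block and script are left out because the standard library does not expose those properties.

```python
import unicodedata

def describe(ch: str) -> dict:
    """Collect the code point, name, category, and encoded forms of one character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),  # "Lu" means uppercase letter
        "utf8": ch.encode("utf-8").hex(),
        "utf16": ch.encode("utf-16-be").hex(),
        "utf32": ch.encode("utf-32-be").hex(),
    }

info = describe("A")
print(info)
```

The same helper can be pointed at lookalikes such as ſ (U+017F) to see that an identical-looking glyph can carry a different code point and name.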
Unicode Crash Course: Code points
• Unicode 5.1 uses a 21-bit scalar value, with space for over 1,100,000 code points
  U+0000 to U+10FFFF

Unicode Crash Course: Code Points
A = U+0041
Every character has a unique number, represented by a hex value

Unicode Crash Course
A: U+0041

Unicode Crash Course
ſ: U+017F

Unicode Crash Course: Code points
• The full 21-bit range is not actually available
  U+0000 to U+D7FF and
  U+E000 to U+10FFFF
  What's up with U+D800 to U+DFFF?

Unicode Crash Course: UTF-16 Surrogate Pairs
U+101D1

Unicode Crash Course: Encodings
UTF-8
  – Variable width, 1 to 4 bytes (used to be 6)
UTF-16
  – Endianness
  – Variable width, 2 or 4 bytes
  – Surrogate pairs
UTF-32
  – Endianness
  – Fixed width, 4 bytes
  – Fixed mapping, no algorithms needed

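The reserved range U+D800 to U+DFFF exists so that UTF-16 can address code points above U+FFFF with two 16-bit units. As a rough sketch (the function name `to_surrogates` is an illustration, not something from the talk), the standard surrogate arithmetic for the slide's U+101D1 example looks like this:

```python
def to_surrogates(cp: int) -> tuple:
    """Split a supplementary code point into a UTF-16 (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only U+10000..U+10FFFF need surrogates")
    offset = cp - 0x10000                  # 20 significant bits remain
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low

high, low = to_surrogates(0x101D1)
ch = chr(0x101D1)
# Encoded size in bytes for each encoding form
sizes = {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-be", "utf-32-be")}
print(f"{high:04X} {low:04X}", sizes)
```

Half of such a pair on its own is not a valid scalar value, which is why lone surrogates show up later as "illegal code points".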
Exploiting Unicode-enabled Software: Overview
• Unicode crash course
• Root Causes
  – Visual Spoofing and IDNs
  – Best-fit mappings
  – Normalization
  – Overlong UTF-8
  – Over-consumption
  – Character substitution
  – Character deletion
  – Casing
  – Buffer overflows
  – Controlling Syntax
  – Charset transformations
  – Charset mismatches
• Tools

Root Causes: Visual Spoofing
• Over 100,000 assigned characters
• Many lookalikes within and across scripts
  A Α А ᐱ ᗅ ᗋ ᗩ ᴀ ᴬ Ꜳ A 𐀁 𐌀

Root Causes: IDN – Internationalized Domain Names
http://παράδειγμα.δοκιμή
(example.test)

Root Causes: IDN – what do the browsers do?
• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
  – http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
  – .ORG, .DE, .CN to name a few
• Language settings and TLD
• Character blacklisting

Root Causes: IDN – so what's the problem?
• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep are based on Unicode 3.2
  – We're up to Unicode 5.1 (larger repertoire)

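The Nameprep-then-Punycode pipeline can be tried with Python's built-in `idna` codec, which implements IDNA 2003. This is only a sketch of what a resolver-side conversion looks like; `to_ace` is a hypothetical helper name, and real browsers layer their own display policies on top.

```python
def to_ace(hostname: str) -> str:
    """Apply IDNA 2003 ToASCII (Nameprep, then Punycode) to every label."""
    return hostname.encode("idna").decode("ascii")

# The test domain shown on the slide
print(to_ace("παράδειγμα.δοκιμή"))

# Nameprep applies NFKC and case folding, so fullwidth capitals collapse to ASCII
print(to_ace("ＥＸＡＭＰＬＥ.com"))
```

Because Nameprep folds many distinct inputs into the same label, what the user sees and what DNS receives can differ considerably.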
Attack Vectors: IDN homograph attacks
Some browsers allow .COM IDNs based on script family
  – (Latin has a big family)

Attack Vectors: IDN homograph attacks
[Figure: Safari]

Attack Vectors: IDN homograph attacks
[Figure: Opera]

Attack Vectors: IDN homograph attacks
www.google.com is not www.gooɡle.com
  g: Latin, U+0067
  ɡ: Latin, U+0261

Root Causes: Guidance for Visual Spoofing
• Normalize with NFKC
• Homograph and confusables detection
• Specifications
  – IDNA, Stringprep
• Guidance
  – Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for International Domain Names
ICANN guidelines v2.0
  – Inclusion-based
  – Script limitations
  – Character limitations
Registries apply the guidance
  – Define the allowed characters per TLD
  – Collaboration with IANA
Registrars sell the domain names

Root Causes: The state of International Domain Names
ICANN guidelines v2.0
  – Inclusion-based: a deny-all default seems to be the right concept
  – Script limitations: a script can cross many blocks; even with limited script choices there's plenty to choose from
  – Character limitations: great for domain labels, but sub-domain labels are still open to punctuation and syntax spoofing

Root Causes: The state of International Domain Names
• Registrars still allow
  – Confusables
  – Combining marks
  – Single-, whole-, and mixed-script confusables
• Registrars can't control
  – Syntax spoofing in sub-domain labels

Attack Vectors: Visual spoofing Vectors
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks
"rn" can look like "m" in certain fonts
www.mullets.com is not www.rnullets.com
  m: Latin, U+006D
  rn: Latin, U+0072 U+006E

Attack Vectors: Non-Unicode homograph attacks
Are you using mono-width fonts?
  0 and O
  1 and l
  5 and S

Attack Vectors: Non-Unicode homograph attacks
Classic long URLs
http://login.facebook.intvitation.video.messageid-h048892r39.sessionnfbid.com/home.htm?disbursements

Attack Vectors: Defining Homographs
The Confusables
  – Single script
  – Mixed script
  – Whole script

Attack Vectors: Single-script and The Confusables
www.ɑpple.com
  User thinks 'a'
  Really it's Latin small letter alpha 'ɑ'
www.looĸout.net
  User thinks 'k'
  Really it's Latin small letter kra 'ĸ'

Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com
  User thinks 'o'
  Really it's Thai digit zero '๐'
www.faϲebook.com
  User thinks 'c'
  Really it's Greek lunate sigma symbol 'ϲ'
www.Ꮐoogle.com
  Really it's Cherokee letter Nah 'Ꮐ'

Attack Vectors: Whole-script and The Confusables
www.аЬс.com
  User thinks 'abc'
  Really it's Cyrillic script
www.ігѕ.gov
  User thinks 'irs'
  Really it's Cyrillic script

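Mixed-script labels are the easiest of the three confusable classes to flag mechanically. The sketch below is a crude heuristic, not the detection logic from the talk: Python's standard library has no Script property, so it assumes the first word of each character's Unicode name is a usable script hint. The helper names are hypothetical.

```python
import unicodedata

def script_hints(label: str) -> set:
    """Return the first word of each letter's Unicode name as a rough script hint."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}

def is_mixed_script(label: str) -> bool:
    """True when letters from more than one script hint appear in the label."""
    return len(script_hints(label)) > 1

print(script_hints("fa\u03f2ebook"))      # Latin letters plus GREEK LUNATE SIGMA SYMBOL
print(is_mixed_script("\u0430\u042c\u0441"))  # whole-script Cyrillic passes this check
```

Whole-script and single-script confusables slip straight through a check like this, which is why a confusables table is needed in addition to script mixing rules. Digits such as the Thai zero are also skipped here because `isalpha()` is false for them.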
Attack Vectors: IDN homograph attacks
Browsers whitelist .ORG
Others don't necessarily, but…

Attack Vectors: IDN homograph attacks
• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes, í looks like i

Attack Vectors: IDN homograph attacks
www.mozilla.org is not www.mozílla.org
  i: Latin, U+0069
  í: Latin, U+00ED

Attack Vectors: IDN Syntax Spoofing with / lookalikes
(This case doesn't work anymore)
http://www.google.com／path／file.nottrusted.org
  FULLWIDTH SOLIDUS: U+FF0F

Attack Vectors: IDN Syntax Spoofing with / lookalikes
(Normalized to a U+002F)
http://www.google.com/path/file.nottrusted.org
  SOLIDUS: U+002F

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes
  ╱ U+2571 Box Drawings
  〳 U+3033 Kana Repeat Mark
  Ꜹ U+A738 LATIN CAPITAL AV
  ꜹ U+A739 LATIN SMALL AV
  ･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
(However, punctuation not required…)
http://www.google.com, with the full stop replaced by Katakana Dot U+FF65

Attack Vectors: IDN Syntax Spoofing with / lookalikes
(However, punctuation not required…)
http://www.google.comﾉpathﾉfile.nottrusted.org
  Katakana No: U+FF89

Attack Vectors: IDN Syntax Spoofing with / lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode

DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
  – Detects confusables
  – Detects invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

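A host label can be scanned for the syntax lookalikes listed on these slides before it is displayed. The following is a minimal sketch, assuming a hand-built set containing only the code points named in the talk; it is not the Visual Spoofing Detection API, and `find_syntax_lookalikes` is a hypothetical name.

```python
import unicodedata

# Code points named on the slides as "/" or "." lookalikes (not an exhaustive list).
SYNTAX_LOOKALIKES = {0x2571, 0x3033, 0xA738, 0xA739, 0xFF0F, 0xFF65, 0xFF89}

def find_syntax_lookalikes(host: str) -> list:
    """List every character in the host that imitates URL punctuation."""
    return [f"U+{ord(ch):04X}" for ch in host if ord(ch) in SYNTAX_LOOKALIKES]

spoof = "www.google.com\uff89path\uff89file.nottrusted.org"
print(find_syntax_lookalikes(spoof))

# NFKC turns the fullwidth solidus into a real "/", which is why that case no longer works
print(unicodedata.normalize("NFKC", "\uff0f"))
```

A deny list like this only covers known cases; an inclusion-based policy per label is the more robust design.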
Attack Vectors: The Invisibles
  U+200B (ZERO WIDTH SPACE)
  U+180E (MONGOLIAN VOWEL SEPARATOR)
  U+FEFF (ZERO WIDTH NO-BREAK SPACE)

Attack Vectors: Visual Spoofing – Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
  http://www.google.com phreedom.org

Attack Vectors: Visual Spoofing – Problematic Font-rendering
  ··· is ··· (Arial, Gothic)
  A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
  ō̄ looks like U+006F U+0304
  ō̄ is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks
• Order of combining marks
  – ȏ and ö under NFKC
    <U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
    <U+006F U+0311 U+0308> becomes <U+020F U+0308>

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  – "These characters are to be avoided wherever possible, because of security concerns"
  – Forbidden in IDNA
  U+202D (LEFT-TO-RIGHT OVERRIDE)
  U+202E (RIGHT-TO-LEFT OVERRIDE)

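The combining-mark ordering and the Bidi overrides can both be checked with a few lines of Python. This is a sketch under simple assumptions: the helper names are hypothetical, and "repeated marks" is defined here as two identical combining characters in a row, which is only one of several possible tricks.

```python
import unicodedata

BIDI_OVERRIDES = {"\u202d", "\u202e"}  # LRO and RLO, as listed on the slide

def nfkc_points(s: str) -> list:
    """Normalize to NFKC and list the resulting code points."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFKC", s)]

def has_bidi_override(s: str) -> bool:
    """True when an explicit directional override is present."""
    return any(c in BIDI_OVERRIDES for c in s)

def has_repeated_marks(s: str) -> bool:
    """True when the same combining mark appears twice in a row."""
    return any(a == b and unicodedata.combining(a) for a, b in zip(s, s[1:]))

print(nfkc_points("o\u0308\u0311"))  # diaeresis first
print(nfkc_points("o\u0311\u0308"))  # inverted breve first
```

Both marks share the same combining class, so normalization does not reorder them and the two sequences remain distinct even though they can render alike.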
Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, enable code execution
  When σ becomes s (U+03C3 GREEK SMALL LETTER SIGMA)
  When ′ becomes ' (U+2032 PRIME)

Windows best-fit P/Invoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 (Left-Pointing Angle Bracket) maps to U+003C
• U+3008 (Left Angle Bracket) maps to U+003C
How can we best-fit the s character?
• U+015B (Latin Small Letter S With Acute) maps to U+0073
• U+015D (Latin Small Letter S With Circumflex) maps to U+0073
To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

Case Study: Social Networking – Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root cause: best-fit mappings

Case Study: Social Networking – Best-fit mappings
-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map

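The failure in this case study comes from validating before a lossy conversion. Python's codecs never best-fit, so the sketch below simulates the behavior with a hypothetical table limited to the mappings quoted on the slides; `BEST_FIT`, `best_fit`, `naive_filter`, and `strict_ascii` are all illustrative names, not part of any real API.

```python
# Hypothetical best-fit table, restricted to mappings quoted in the talk.
BEST_FIT = {"\u2329": "<", "\u3008": "<", "\u015b": "s", "\u015d": "s", "\uff4d": "m"}

def best_fit(text: str) -> str:
    """Simulate a lossy best-fit conversion to ASCII."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)

def naive_filter(text: str) -> str:
    """Validate first, convert afterwards (the vulnerable order)."""
    if "-moz-binding" in text:
        raise ValueError("blocked")
    return best_fit(text)

def strict_ascii(text: str) -> bytes:
    """Refuse to convert anything that has no exact ASCII equivalent."""
    return text.encode("ascii", errors="strict")

payload = "-\uff4doz-binding()"
print(naive_filter(payload))  # the blocked string reappears after conversion
```

Failing on unmappable characters, as `strict_ascii` does, plays the same role as the `EncoderFallback` and `WC_NO_BEST_FIT_CHARS` guidance for .NET and Win32.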
Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: Filter evasion, enable code execution

Root Causes: Normalization
  NFD  – Decompose (canonical)
  NFC  – Decompose (canonical), recompose
  NFKD – Decompose (compatibility)
  NFKC – Decompose (compatibility), recompose

Root Causes: Normalization
İ becomes I + ◌̇
  U+0130 becomes U+0049 U+0307

Root Causes: Normalization
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
  U+FE64 becomes U+003C
toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC: first defense against visual spoofing

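The order of normalization and validation is easy to demonstrate with Python's `unicodedata.normalize`. The two functions below are a sketch with hypothetical names and a deliberately simplistic "no angle bracket" rule; they only illustrate why the order matters.

```python
import unicodedata

def validate_then_normalize(s: str) -> str:
    """Vulnerable order: the check runs on the un-normalized string."""
    if "<" in s:
        raise ValueError("markup rejected")
    return unicodedata.normalize("NFKC", s)

def normalize_then_validate(s: str) -> str:
    """Safer order: the check sees exactly what later code will see."""
    normalized = unicodedata.normalize("NFKC", s)
    if "<" in normalized:
        raise ValueError("markup rejected")
    return normalized

payload = "\ufe64script>"  # U+FE64 SMALL LESS-THAN SIGN
print(validate_then_normalize(payload))
```

Canonical forms matter too: NFD turns U+0130 into U+0049 U+0307, so length checks made before normalization may no longer hold afterwards.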
Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, enable code execution
  Application gets C0 A7
  OS/Framework sees 27
  Database gets '

Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)

Attack Vectors: Directory traversal
How many ways can you say ../ ?

Attack Vectors
• Directory traversal test cases, each a variant of http://site/root/../system
  Variant                                        Bytes sent in the path
  Overlong UTF-8 of U+002E                       C0 AE
  Full-width solidus U+FF0F (NFKC or NFKD)       EF BC 8F
  Division slash U+2215 (best-fit)               E2 88 95
  Two dot leader U+2025 (NFKC or NFKD)           E2 80 A5 E2 80 A5

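The difference between a lenient and a strict UTF-8 decoder can be shown in a few lines. The lenient decoder below is a deliberately unsafe sketch written for illustration (the name `lenient_decode_2byte` is hypothetical); Python's own decoder already rejects non-shortest forms.

```python
import unicodedata

def lenient_decode_2byte(b0: int, b1: int) -> str:
    """Decode a 2-byte UTF-8 sequence WITHOUT the shortest-form check (unsafe)."""
    return chr(((b0 & 0x1F) << 6) | (b1 & 0x3F))

def strict_decode(raw: bytes) -> str:
    """Decode UTF-8 and raise on any ill-formed sequence, including overlong forms."""
    return raw.decode("utf-8", errors="strict")

print(repr(lenient_decode_2byte(0xC0, 0xA7)))  # overlong apostrophe
print(repr(lenient_decode_2byte(0xC0, 0xAE)))  # overlong full stop

# Well-formed bytes can still turn into path syntax after compatibility normalization
print(unicodedata.normalize("NFKC", strict_decode(b"\xef\xbc\x8f")))
print(unicodedata.normalize("NFKC", "\u2025"))
```

Strict decoding removes the overlong cases, but the normalization and best-fit variants are valid UTF-8 and need the normalize-before-validate rule instead.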
Root Causes: Handling the Unexpected
• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, enable code execution

Root Causes: Handling the Unexpected – Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected – Over-consumption
<img src="[0xC2]"> " onerror="alert(1)"<br />
becomes
<img src="> " onerror="alert(1)"<br />

Root Causes: Handling the Unexpected – Character-substitution
Correcting insecurely rather than failing
  – Substituting a character that has syntactic meaning would be bad

Root Causes: Handling the Unexpected – Character-deletion
"deletion of noncharacters" (UTR-36)
<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  – A common alternative is '?', which can be safe

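Python's UTF-8 decoder can illustrate the three options for ill-formed input: fail, substitute U+FFFD, or delete. The sketch below uses the byte sequence from the over-consumption slide; the helper names are hypothetical.

```python
RAW = bytes([0x41, 0xC2, 0x3E, 0x41])  # "A", a dangling lead byte, ">", "A"

def decode_or_fail(raw: bytes) -> str:
    """Preferred: raise on ill-formed input."""
    return raw.decode("utf-8", errors="strict")

def decode_replacing(raw: bytes) -> str:
    """Fallback: substitute U+FFFD without swallowing the byte that follows."""
    return raw.decode("utf-8", errors="replace")

def delete_bom(text: str) -> str:
    """Risky when done after validation: deletion can join a blocked token."""
    return text.replace("\ufeff", "")

print(repr(decode_replacing(RAW)))      # the ">" survives
print(delete_bom("<scr\ufeffipt>"))     # deletion reassembles the tag
```

An over-consuming decoder would drop both C2 and 3E, which is how a quote or angle bracket disappears and the markup changes meaning.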
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (the buffer overflow of the Web)

Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root cause: character deletion
<a href="java[U+FEFF]script:alert('XSS')">
Can be nastier:
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM
BOM: U+FEFF

Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, enable code execution

Root Causes: Casing
toLower("İ") == "i"
toLower("scrİpt") == "script"

Root Causes: Casing
len(x) != len(toLower(x))

Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET

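The casing guidance can be sketched in Python. One assumption to flag: Python's `str.lower()` uses full case mapping, so İ (U+0130) becomes "i" plus U+0307 rather than a bare "i" as on the slide. The example therefore substitutes KELVIN SIGN (U+212A), which lowercases to a plain "k". The blocklist and function names are hypothetical.

```python
BLOCKED = ("script", "onclick")

def validate_then_lower(s: str) -> tuple:
    """Vulnerable order: the blocklist check runs before case mapping."""
    allowed = not any(word in s for word in BLOCKED)
    return allowed, s.lower()

def lower_then_validate(s: str) -> tuple:
    """Safer order: case-map first, then check the result."""
    lowered = s.lower()
    return not any(word in lowered for word in BLOCKED), lowered

payload = "onclic\u212a"  # ends with KELVIN SIGN, not the letter k
print(validate_then_lower(payload))
print(lower_then_validate(payload))

# Case mapping can also change the length of a string
print(len("\u0130"), len("\u0130".lower()))
```

The length change in the last line is the same effect the slide expresses as len(x) != len(toLower(x)).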
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows
Casing – maximum expansion factors
  Operation  UTF         Factor  Sample
  Lower      8           1.5     Ⱥ U+023A
  Lower      16, 32      1       A U+0041
  Upper      8, 16, 32   3       ΐ U+0390
Source: Unicode Technical Report 36

Root Causes: Buffer Overflows
Normalization – maximum expansion factors
  Operation   UTF      Factor  Sample
  NFC         8        3X      𝅘𝅥 U+1D160
  NFC         16, 32   3X      שּׁ U+FB2C
  NFD         8        3X      ΐ U+0390
  NFD         16, 32   4X      ᾂ U+1F82
  NFKC/NFKD   8        11X     ﷺ U+FDFA
  NFKC/NFKD   16, 32   18X     ﷺ U+FDFA
Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

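The expansion factors in these tables can be reproduced by comparing encoded sizes before and after the operation. A minimal sketch with Python's `unicodedata`, where `expansion` is a hypothetical helper name:

```python
import unicodedata

def expansion(s: str, form: str, encoding: str) -> float:
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(s.encode(encoding))
    after = len(unicodedata.normalize(form, s).encode(encoding))
    return after / before

print(expansion("\ufdfa", "NFKC", "utf-8"))       # worst case for UTF-8
print(expansion("\ufdfa", "NFKC", "utf-16-le"))   # worst case for UTF-16
print(expansion("\u1f82", "NFD", "utf-16-le"))
print(len("\u0390".upper()))                        # one code point becomes three
```

A buffer sized from the input length, in either characters or bytes, can therefore be too small once casing or normalization has run.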
Root Causes: Controlling Syntax
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, enable code execution

Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera
• Unicode formatter characters exploited for XSS
  – Damage: filter evasion, controlling syntax
  – Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root cause: interpreting "white space"
  – A problem with the HTML 4.0 spec

Case Study: Opera
<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

Case Study: Opera
MVS: U+180E

Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…

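A filter that only knows ASCII white space will not see U+180E as a separator, even though a parser might. The sketch below flags any separator, format, or control character outside ASCII white space; the category set and function name are assumptions made for illustration. U+180E is classified as a space separator in older Unicode versions and as a format character in newer ones, so both categories are included.

```python
import unicodedata

ASCII_WHITESPACE = {" ", "\t", "\n", "\r", "\f"}
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}  # separators, format, control

def suspicious_separators(markup: str) -> list:
    """List characters a lenient parser might treat as white space or ignore."""
    return [
        f"U+{ord(ch):04X}"
        for ch in markup
        if ch not in ASCII_WHITESPACE
        and unicodedata.category(ch) in SUSPECT_CATEGORIES
    ]

print(suspicious_separators("<a href=\u180eonclick=alert()>"))
```

Rejecting input that contains such characters is stricter than most specifications require, which fits the slide's advice to question specifications.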
Root Causes: Specifications
1) Character stability
  – IDNA/Nameprep are based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
• HTML 4.01
  – Defines four whitespace characters and explicitly leaves handling of other characters up to the implementer

Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, enable code execution, data loss

Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, enable code execution

Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain

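A charset mismatch matters because the same bytes produce different characters under each label. The sketch below uses a two-byte sequence chosen for illustration (it does not come from the talk) whose second byte is an ASCII backslash; `decode_utf8_or_fail` is a hypothetical helper for the "force UTF-8, error if uncertain" guidance.

```python
RAW = b"\x83\x5c"  # illustrative bytes; 0x5C is "\" in ASCII

def decode_utf8_or_fail(raw: bytes) -> str:
    """Force UTF-8 and raise instead of guessing another charset."""
    return raw.decode("utf-8", errors="strict")

as_latin1 = RAW.decode("iso-8859-1")  # what a filter might see
as_sjis = RAW.decode("shift_jis")     # what the browser might render

print(repr(as_latin1))  # contains a backslash
print(repr(as_sjis))    # a single katakana character, no backslash
```

If the server filters under one interpretation and the user agent renders under the other, escaping characters can appear or vanish between the two.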
Tools
• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against visual spoofing and homograph attacks

Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more

Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net
Casaba Security, www.casabasecurity.com
Unicode Crash Course
199119901985
1981
198119641963
bull Unicode
bull ISO 10646 (UCS)
bull ISO-8859-1
bull More code pages galore
bull MBCSbull GB2312
bull CP437
bull EBCDIC
bull ASCII 7-bitbull 8th bit free-for-all to follow
wwwcasabasecuritycom
March 2009 copy 2009 Chris Weber
Shift_jis
Gb2312
ISCII
Windows-1252
ISO-8859-1
EBCDIC 037
wwwcasabasecuritycom
Unicode Crash CourseCode pages and charsets
March 2009 copy 2009 Chris Weber
bull Unicode can represent them all
bull ASCII range is preserved
ndash U+0000 to U+007F are mapped to ASCII
wwwcasabasecuritycom
Unicode Crash CourseAd Infinitum
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Source Wikipedia
March 2009 copy 2009 Chris Weber
bull End users
bull Applications
bull Databases
bull Programming languages
bull Operating Systems
wwwcasabasecuritycom
Unicode Crash CourseThe Unicode Attack Surface
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash CourseUnthink it
March 2009 copy 2009 Chris Weber
bull A large and complex standard
Unicode Crash Course
code pointsencodingscategorizationnormalizationbinary propertiescase mappingconversion tablesbi-directional properties
canonical mappingsdecomposition typescase foldingbest-fit mapping17 planesprivate use rangesscript blocks
escapings
Unicode Crash Course
Glyph
Encoding
Properties
Code point
Block Script
Plane
A
UTF-8 UTF-16 UTF-32
Hex Uppercase etc
U+0041
Basic Latin Latin
Basic Multilingual Plane(BMP)
wwwcasabasecuritycom
March 2009 copy 2009 Chris Weber
bull Unicode 51 uses a 21-bit scalar value with space for over 1100000 code points
U+0000 to U+10FFFF
wwwcasabasecuritycom
Unicode Crash CourseCode points
March 2009 copy 2009 Chris Weber
A = U+0041
Every character has a unique number represented by a hex value
wwwcasabasecuritycom
Unicode Crash CourseCode Points
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash Course
AU+0041
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash Course
ſU+017F
March 2009 copy 2009 Chris Weber
bull The full 21-bit range is not actually available
U+0000 to U+D7FF and
U+E000 to U+10FFF
whatrsquos up with U+D800U+DFFF
wwwcasabasecuritycom
Unicode Crash CourseCode points
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash CourseUTF-16 Surrogate Pairs
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash CourseUTF-16 Surrogate Pairs
U+101D1
Unicode Crash Course: Encodings
• UTF-8
  - Variable width, 1 to 4 bytes (used to be 6)
• UTF-16
  - Endianness
  - Variable width, 2 or 4 bytes
  - Surrogate pairs
• UTF-32
  - Endianness
  - Fixed width, 4 bytes
  - Fixed mapping, no algorithms needed

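The width differences are easy to observe directly. A small sketch that encodes three code points already used in this deck (U+0041, U+017F, U+101D1) and counts the bytes; the helper name `encoded_widths` is an assumption for illustration.

```python
def encoded_widths(ch: str) -> dict[str, int]:
    """Return the number of bytes one character occupies in each encoding form."""
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-be", "utf-32-be")}


widths = {
    f"U+{ord(ch):04X}": encoded_widths(ch)
    for ch in ("\u0041", "\u017f", "\U000101d1")
}

for code_point, sizes in widths.items():
    print(code_point, sizes)
```

The explicit "-be" codecs are used so that no byte order mark is counted in the lengths.
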
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Exploiting Unicode-enabled Software: Overview
• Unicode crash course
• Root Causes
  - Visual Spoofing and IDNs
  - Best-fit mappings
  - Normalization
  - Overlong UTF-8
  - Over-consumption
  - Character substitution
  - Character deletion
  - Casing
  - Buffer overflows
  - Controlling Syntax
  - Charset transformations
  - Charset mismatches
• Tools

Root Causes: Visual Spoofing
• Over 100,000 assigned characters
• Many lookalikes within and across scripts
  A Α А ᐱ ᗅ ᗋ ᗩ ᴀ ᴬ Ꜳ A 𐀁 𐌀

Root Causes: IDN - Internationalized Domain Names
http://παράδειγμα.δοκιμή
(example.test)

Root Causes: IDN - what do the browsers do?
• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
  - http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
  - .ORG, .DE, .CN to name a few
• Language settings and TLD
• Character blacklisting

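The Nameprep plus Punycode step can be reproduced with Python's built-in "idna" codec, which implements IDNA 2003. This is only a sketch of the conversion for the example domain from the slides, not a picture of what any particular browser does.

```python
import unicodedata

# The Greek example domain used on the earlier slide.
unicode_name = unicodedata.normalize("NFC", "παράδειγμα.δοκιμή")

# The "idna" codec applies Nameprep to each label and then Punycode-encodes it.
ascii_name = unicode_name.encode("idna")
print(ascii_name)

# Decoding goes the other way: this is what a browser does before display.
round_trip = ascii_name.decode("idna")
print(round_trip)
```

DNS only ever sees the `xn--` form; everything a user sees is the decoded form, which is where spoofing becomes possible.
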
Root Causes: IDN - so what's the problem?
• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep are based on Unicode 3.2
  - We're up to Unicode 5.1 (larger repertoire)

Attack Vectors: IDN homograph attacks
Some browsers allow .COM IDNs based on script family
  - (Latin has a big family)

Attack Vectors: IDN homograph attacks
Safari

Attack Vectors: IDN homograph attacks
Opera

Attack Vectors: IDN homograph attacks
www.google.com is not www.gooɡle.com
g   Latin, U+0067
ɡ   Latin, U+0261

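Both characters are Latin, so a script check alone does not separate them; the difference only shows up at the code point level. A minimal sketch that compares the two labels position by position using the standard `unicodedata` module (the variable names are illustrative):

```python
import unicodedata

real = "google"
fake = "goo\u0261le"   # U+0261 in place of the second 'g'

# Collect every position where the two labels differ.
differences = [
    (index, a, b)
    for index, (a, b) in enumerate(zip(real, fake))
    if a != b
]

for index, a, b in differences:
    print(index, f"U+{ord(a):04X}", unicodedata.name(a))
    print(index, f"U+{ord(b):04X}", unicodedata.name(b))
```
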
Root Causes: Guidance for Visual Spoofing
• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
  - IDNA, Stringprep
• Guidance
  - Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for International Domain Names
• ICANN guidelines v2.0
  - Inclusion-based
  - Script limitations
  - Character limitations
• Registries apply the guidance
  - Define the allowed characters per TLD
  - Collaboration with IANA
• Registrars sell the domain names

Root Causes: The state of International Domain Names
• ICANN guidelines v2.0
  - Inclusion-based
  - Script limitations
  - Character limitations
Deny-all default seems to be the right concept
A script can cross many blocks. Even with limited script choices, there's plenty to choose from
Great for domain labels, but sub-domain labels are still open to punctuation and syntax spoofing

Root Causes: The state of International Domain Names
• Registrars still allow
  - Confusables
  - Combining marks
  - Single, Whole and Mixed-script
• Registrars can't control
  - Syntax spoofing in sub-domain labels

Attack Vectors: Visual spoofing vectors
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks
rn can look like m in certain fonts
www.mullets.com is not www.rnullets.com
m    Latin, U+006D
rn   Latin, U+0072 U+006E

Attack Vectors: Non-Unicode homograph attacks
Are you using mono-width fonts?
0 and O
1 and l
5 and S

Attack Vectors: Non-Unicode homograph attacks
Classic long URLs
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs
The Confusables
  - Single script
  - Mixed script
  - Whole script

Attack Vectors: Single-script and The Confusables
www.ɑpple.com
  User thinks 'a'
  Really it's Latin small letter alpha 'ɑ'
www.looĸout.net
  User thinks 'k'
  Really it's Latin letter kra 'ĸ'

Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com
  User thinks 'o'
  Really it's Thai digit zero '๐'
www.faϲebook.com
  User thinks 'c'
  Really it's Greek lunate sigma symbol 'ϲ'
www.Ꮐoogle.com
  Really it's Cherokee letter Nah 'Ꮐ'

Attack Vectors: Whole-script and The Confusables
www.аЬс.com
  User thinks 'abc'
  Really it's Cyrillic script
www.ігѕ.gov
  User thinks 'irs'
  Really it's Cyrillic script

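Mixed-script labels can be flagged mechanically; whole-script labels cannot, because every character agrees. The sketch below is a rough heuristic only: Python's standard library does not expose the Unicode Script property, so it assumes the first word of each character name (LATIN, GREEK, CYRILLIC, ...) is a usable stand-in. The function name `rough_scripts` is hypothetical.

```python
import unicodedata


def rough_scripts(label: str) -> set[str]:
    """Approximate the set of scripts in a label from Unicode character names."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            # Assumption: the first word of the name approximates the script.
            scripts.add(unicodedata.name(ch).split()[0])
    return scripts


mixed = rough_scripts("fa\u03f2ebook")        # Latin plus Greek lunate sigma
cherokee = rough_scripts("\u13c0oogle")       # Cherokee letter plus Latin
whole_abc = rough_scripts("\u0430\u042c\u0441")   # all-Cyrillic "abc" lookalike
whole_irs = rough_scripts("\u0456\u0433\u0455")   # all-Cyrillic "irs" lookalike

print(mixed, cherokee, whole_abc, whole_irs)
```

A label with more than one script is suspicious by itself. A single-script result such as `{"CYRILLIC"}` still needs a confusables comparison against the name the user expects.
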
Attack Vectors: IDN homograph attacks
Browsers whitelist .ORG

Attack Vectors: IDN homograph attacks
Others don't necessarily, but…

Attack Vectors: IDN homograph attacks
• .ORG is whitelisted
  - Limited characters available
• To unscrutinizing eyes
  - í looks like i

Attack Vectors: IDN homograph attacks
www.mozilla.org is not www.mozílla.org
i   Latin, U+0069
í   Latin, U+00ED

Attack Vectors: IDN Syntax Spoofing with / lookalikes
(This case doesn't work anymore)
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS, U+FF0F

Attack Vectors: IDN Syntax Spoofing with / lookalikes
(Normalized to a U+002F)
http://www.google.com/path/file.nottrusted.org
SOLIDUS, U+002F

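The "normalized to a U+002F" behaviour is plain NFKC. A minimal sketch with the standard `unicodedata` module, using the host string from the slide: before normalization the whole string is a single label as far as "/" splitting is concerned, and afterwards the real path separator appears.

```python
import unicodedata

spoofed = "www.google.com\uff0fpath\uff0ffile.nottrusted.org"

# Before normalization there is no ASCII slash, so the "path" is part of the host.
host_before = spoofed.split("/")[0]

# NFKC maps FULLWIDTH SOLIDUS (U+FF0F) to SOLIDUS (U+002F).
normalized = unicodedata.normalize("NFKC", spoofed)
host_after = normalized.split("/")[0]

print(host_before)
print(host_after)
```
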
Attack Vectors: IDN Syntax Spoofing with / and . lookalikes
╱  U+2571 Box Drawings
〳  U+3033 Kana Repeat Mark
Ꜹ  U+A738 LATIN CAPITAL AV
ꜹ  U+A739 LATIN SMALL AV
･  U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
(However, punctuation not required…)
httpwwwgooglecom
Katakana Dot, U+FF65

Attack Vectors: IDN Syntax Spoofing with / lookalikes
(However, punctuation not required…)
http://www.google.comノpathノfile.nottrusted.org
Katakana No, U+FF89

Attack Vectors: IDN Syntax Spoofing with / lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode

DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
  - Detects Confusables
  - Detects Invisibles
  - Detects syntax and punctuation lookalikes
  - Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)

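Invisible characters are easy to find once you look for them explicitly. The sketch below checks only the three code points listed on the slide; the `INVISIBLES` table and `find_invisibles` helper are illustrative names, and a real detector would need a much longer list.

```python
# The three invisible characters named on the slide.
INVISIBLES = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u180e": "MONGOLIAN VOWEL SEPARATOR",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE",
}


def find_invisibles(text: str) -> list[tuple[int, str]]:
    """Return (index, name) for every listed invisible character in the text."""
    return [(i, INVISIBLES[ch]) for i, ch in enumerate(text) if ch in INVISIBLES]


hits = find_invisibles("pay\u200bpal.example")
clean = find_invisibles("paypal.example")
print(hits, clean)
```
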
Attack Vectors: Visual Spoofing - Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
httpwwwgooglecom phreedomorg

Attack Vectors: Visual Spoofing - Problematic Font-rendering
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
  ō looks like U+006F U+0304
  ō̄ is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks
• Order of combining marks
  - ȏ and ö under NFKC
  <U+006F, U+0308, U+0311> becomes <U+00F6, U+0311>
  <U+006F, U+0311, U+0308> becomes <U+020F, U+0308>

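The ordering effect can be checked with the standard `unicodedata` module. In this sketch the two sequences carry the same marks in a different order, compose to different strings under NFKC, and therefore never compare equal even if a font draws them alike. The helper `longest_mark_run` is a hypothetical name for a simple "too many stacked marks" check.

```python
import unicodedata

first = unicodedata.normalize("NFKC", "o\u0308\u0311")    # diaeresis, then inverted breve
second = unicodedata.normalize("NFKC", "o\u0311\u0308")   # inverted breve, then diaeresis


def longest_mark_run(text: str) -> int:
    """Length of the longest run of consecutive combining marks."""
    longest = current = 0
    for ch in text:
        current = current + 1 if unicodedata.combining(ch) else 0
        longest = max(longest, current)
    return longest


print([f"U+{ord(c):04X}" for c in first])
print([f"U+{ord(c):04X}" for c in second])
print(longest_mark_run("o\u0304\u0304"))
```
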
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  - "These characters are to be avoided wherever possible because of security concerns"
  - Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)

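A filter for the two override characters is a few lines. This sketch rejects any string that contains them; the sample file name is invented for illustration, and whether to reject or strip is a policy choice the slides do not prescribe.

```python
import unicodedata

BIDI_OVERRIDES = {"\u202d", "\u202e"}   # LRO and RLO, as listed on the slide


def has_bidi_override(text: str) -> bool:
    """True if the text contains an explicit directional override."""
    return any(ch in BIDI_OVERRIDES for ch in text)


# Hypothetical file name: the RLO makes the tail display reversed,
# so the visible extension can differ from the real one.
suspicious = "report\u202etxt.exe"

flagged = has_bidi_override(suspicious)
classes = [unicodedata.bidirectional(ch) for ch in ("\u202d", "\u202e")]
print(flagged, classes)
```
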
Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: filter evasion, enabling code execution
When σ becomes s
  U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
  U+2032 PRIME

Windows best-fit, P/Invoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 maps to U+003C: Left-Pointing Angle Bracket
• U+3008 maps to U+003C: Left Angle Bracket
How can we best-fit the s character?
• U+015B maps to U+0073: Latin Small Letter S With Acute
• U+015D maps to U+0073: Latin Small Letter S With Circumflex
To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

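The guidance boils down to "fail instead of guessing". Python's codecs do not implement Windows best-fit tables, so the sketch below is only an analogue of the recommended behaviour: with strict error handling, a character that has no exact mapping in the legacy code page raises an error rather than silently turning into a lookalike. The helper name `to_legacy` is an assumption.

```python
def to_legacy(text: str, codec: str = "cp1252") -> bytes:
    """Encode to a legacy code page, refusing anything without an exact mapping."""
    return text.encode(codec, errors="strict")


results = {}
for sample in ("s", "\u03c3", "\u2032"):   # ASCII s, Greek sigma, PRIME
    try:
        results[sample] = to_legacy(sample)
    except UnicodeEncodeError:
        results[sample] = None             # rejected: no exact mapping exists

print(results)
```

A best-fit conversion would have produced `s` and `'` for the last two inputs, which is exactly what lets them slip past a filter that ran earlier.
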
Case Study: Social Networking - Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  - Root Cause: Best-fit mappings

Case Study: Social Networking - Best-fit mappings
-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map

Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: filter evasion, enabling code execution

Root Causes: Normalization
NFD  - Decompose (canonical)
NFC  - Decompose (canonical), then recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), then recompose

Root Causes: Normalization
İ becomes I + ◌̇
U+0130 becomes U+0049 U+0307

Root Causes: Normalization
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
U+FE64 becomes U+003C

Root Causes: Normalization
﹤ becomes <
U+FE64 becomes U+003C
toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC: first defense against visual spoofing

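The ordering problem is easy to reproduce. In this sketch `naive_filter` stands in for whatever validation an application performs (it is a made-up blocklist, not a recommended XSS defence); the point is only that the same filter gives different answers depending on whether NFKC runs before or after it.

```python
import unicodedata


def naive_filter(text: str) -> bool:
    """Toy validation: accept the input unless it contains '<script'."""
    return "<script" not in text.lower()


payload = "\ufe64script>"                      # SMALL LESS-THAN SIGN, not '<'

# Wrong order: validate first, normalize afterwards.
accepted_raw = naive_filter(payload)
after_normalization = unicodedata.normalize("NFKC", payload)

# Right order: normalize first, then validate the result.
accepted_normalized = naive_filter(unicodedata.normalize("NFKC", payload))

print(accepted_raw, after_normalization, accepted_normalized)
```
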
Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: filter evasion, enabling code execution
Application gets 0xC0 0xA7
OS/Framework sees 0x27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  - Generation of non-shortest form
  - Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)

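The C0 A7 example can be worked through by hand. The sketch below contrasts a deliberately naive two-byte decoder (`naive_decode_2byte`, written here only to show the flaw) with Python's strict UTF-8 decoder, which follows the "throw on error" guidance.

```python
def naive_decode_2byte(data: bytes) -> str:
    """Decode a 2-byte UTF-8 sequence WITHOUT checking for shortest form."""
    return chr(((data[0] & 0x1F) << 6) | (data[1] & 0x3F))


overlong = b"\xc0\xa7"

# The naive decoder happily produces an apostrophe (0x27).
naive_result = naive_decode_2byte(overlong)

# A conforming decoder rejects the sequence outright.
try:
    overlong.decode("utf-8")
    rejected = False
except UnicodeDecodeError:
    rejected = True

print(repr(naive_result), rejected)
```

A filter that looks for the byte 0x27 never sees it in the raw input, yet a lenient component further down the stack reconstructs it.
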
Attack Vectors: Directory traversal
How many ways can you say

Attack Vectors
• Directory traversal test cases
  - httpsiterootsystem
  - Overlong UTF-8, U+002F: httpsiterootC0AEsystem
  - Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  - Division Slash U+2215, best-fit: httpsiteroot E28895system
  - Two Dot Leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system

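The byte sequences in these test cases can be regenerated rather than memorized. A short sketch, using only the standard library, that prints the UTF-8 bytes of the three lookalike characters and what NFKC turns them into; the `CANDIDATES` table is an illustrative name.

```python
import unicodedata

CANDIDATES = {
    "FULLWIDTH SOLIDUS": "\uff0f",
    "DIVISION SLASH": "\u2215",
    "TWO DOT LEADER": "\u2025",
}

report = {}
for name, ch in CANDIDATES.items():
    utf8_hex = ch.encode("utf-8").hex().upper()
    nfkc = unicodedata.normalize("NFKC", ch)
    report[name] = (utf8_hex, nfkc)
    print(f"U+{ord(ch):04X} {name}: UTF-8 {utf8_hex}, NFKC {nfkc!r}")
```

The division slash is unchanged by normalization, which is consistent with the slide listing it under best-fit rather than under NFKC or NFKD.
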
Root Causes: Handling the Unexpected
• Unassigned code points
  - U+2073
• Illegal code points
  - Half a surrogate pair
• Code points with special meaning
  - U+FEFF is the BOM
• Impact: filter evasion, enabling code execution

Root Causes: Handling the Unexpected - Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes
<41 41>

Root Causes: Handling the Unexpected - Over-consumption
<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />

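The `<41 C2 3E 41>` case can be reproduced with a deliberately buggy decoder. In this sketch `over_consuming_decode` is written to show the flaw (it is not taken from any real library): it treats C2 as a lead byte and swallows the next byte even when that byte is not a continuation. Python's own decoder, shown for contrast, replaces only the bad byte.

```python
def over_consuming_decode(data: bytes) -> str:
    """Buggy decoder: a 2-byte lead byte always consumes the following byte."""
    out = []
    i = 0
    while i < len(data):
        byte = data[i]
        if 0xC2 <= byte <= 0xDF:
            nxt = data[i + 1] if i + 1 < len(data) else None
            if nxt is not None and 0x80 <= nxt <= 0xBF:
                out.append(chr(((byte & 0x1F) << 6) | (nxt & 0x3F)))
            # BUG: both bytes are skipped even if the second was ordinary ASCII.
            i += 2
        else:
            out.append(chr(byte))   # simplification: other bytes passed through
            i += 1
    return "".join(out)


data = bytes([0x41, 0xC2, 0x3E, 0x41])

buggy = over_consuming_decode(data)
safe = data.decode("utf-8", errors="replace")

print(repr(buggy), repr(safe))
```

In the safe result the `>` survives and the stray lead byte becomes U+FFFD, so markup boundaries stay where the filter saw them.
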
Root Causes: Handling the Unexpected - Character-substitution
Correcting insecurely rather than failing
  - Substituting a '' or a '' would be bad

Root Causes: Handling the Unexpected - Character-deletion
"deletion of noncharacters" (UTR #36)

Root Causes: Handling the Unexpected - Character-deletion
<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  - A common alternative is '' which can be safe

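Deletion versus replacement can be compared side by side. In this sketch `blocklist_ok` is a toy filter invented for the example; the two handlers show why silently deleting U+FEFF rebuilds the forbidden token while substituting U+FFFD does not.

```python
def blocklist_ok(text: str) -> bool:
    """Toy validation: accept the input unless it contains '<script'."""
    return "<script" not in text.lower()


def delete_bom(text: str) -> str:
    """Unsafe handling: silently drop U+FEFF."""
    return text.replace("\ufeff", "")


def replace_bom(text: str) -> str:
    """Safer handling: substitute U+FFFD so the token stays broken."""
    return text.replace("\ufeff", "\ufffd")


payload = "<scr\ufeffipt>"

passed_filter = blocklist_ok(payload)
after_delete = delete_bom(payload)
after_replace = replace_bom(payload)

print(passed_filter, after_delete, after_replace)
```
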
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  - E.g. cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  - Root Cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier:
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM
BOM, U+FEFF

Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enabling code execution

Root Causes: Casing
toLower("İ") == "i"
toLower("scrİpt") == "script"

Root Causes: Casing
len(x) != len(toLower(x))

Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET

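Casing results differ between frameworks, so it is worth checking what your own runtime does. The sketch below shows Python's behaviour only: it applies the full Unicode case mapping, so U+0130 lowercases to two code points rather than to a bare "i" as in the slide's example, and the string length changes either way.

```python
dotted_capital_i = "\u0130"        # LATIN CAPITAL LETTER I WITH DOT ABOVE
lowered = dotted_capital_i.lower()

word = "scr\u0130pt"
lowered_word = word.lower()

# Uppercasing can expand too: U+0390 becomes three code points.
expanded = "\u0390".upper()

print([f"U+{ord(c):04X}" for c in lowered])
print(len(word), len(lowered_word))
print(len("\u0390"), len(expanded))
```

Whatever the mapping, the takeaway matches the guidance above: run the casing operation first, then validate the string you will actually use, and size buffers from the result rather than the input.
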
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enabling code execution

Root Causes: Buffer Overflows
Casing - maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ  U+023A
            16, 32      1        A  U+0041
Upper       8, 16, 32   3        ΐ  U+0390
Source: Unicode Technical Report #36

Root Causes: Buffer Overflows
Normalization - maximum expansion factors

Operation   UTF         Factor   Sample
NFC         8           3X       𝅘𝅥  U+1D160
            16, 32      3X       שּׁ  U+FB2C
NFD         8           3X       ΐ  U+0390
            16, 32      4X       ᾂ  U+1F82
NFKC/NFKD   8           11X      ﷺ  U+FDFA
            16, 32      18X      ﷺ  U+FDFA
Source: Unicode Technical Report #36

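Several of these factors can be re-measured with the standard library. The sketch below is an illustration rather than a replacement for the table: `expansion` is a made-up helper that divides the encoded size after an operation by the size before it.

```python
import unicodedata


def expansion(before: str, after: str, encoding: str) -> float:
    """Ratio of encoded size after an operation to encoded size before it."""
    return len(after.encode(encoding)) / len(before.encode(encoding))


ligature = "\ufdfa"
ligature_nfkc = unicodedata.normalize("NFKC", ligature)

nfkc_utf8 = expansion(ligature, ligature_nfkc, "utf-8")
nfkc_utf16 = expansion(ligature, ligature_nfkc, "utf-16-le")

lower_utf8 = expansion("\u023a", "\u023a".lower(), "utf-8")
nfd_utf16 = expansion("\u1f82", unicodedata.normalize("NFD", "\u1f82"), "utf-16-le")

print(nfkc_utf8, nfkc_utf16, lower_utf8, nfd_utf16)
```

A buffer sized from the input length is therefore too small by up to these factors once casing or normalization has run.
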
Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET

Root Causes: Controlling Syntax
• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enabling code execution

Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera
• Unicode formatter characters exploited for XSS
  - Damage: Filter evasion, controlling syntax
  - Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root Cause: Interpreting "white space"
  - A problem with the HTML 4.0 spec

Case Study: Opera
<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

Case Study: Opera
MVS, U+180E

Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…

Root Causes: Specifications
1) Character stability
   - IDNA/Nameprep based on Unicode 3.2
2) Designs
   - Specs are carefully designed but not always perfect
   • This could have been a problem
     - "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
   - HTML 4.01
     • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enabling code execution, data loss

Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enabling code execution

Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain

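A mismatch matters because the same bytes mean different things under different labels. The sketch below decodes one two-byte sequence under both charsets from the example above, then shows a strict UTF-8 decode as one way to "error if uncertain"; the helper name `decode_utf8_or_fail` is hypothetical.

```python
raw = b"\x82\xa0"

# Same bytes, two labels, two different strings.
as_shift_jis = raw.decode("shift_jis")     # one Hiragana character
as_latin1 = raw.decode("iso-8859-1")       # two separate characters


def decode_utf8_or_fail(data: bytes) -> str:
    """Force UTF-8 and raise instead of guessing when the bytes do not fit."""
    return data.decode("utf-8", errors="strict")


try:
    decode_utf8_or_fail(raw)
    refused = False
except UnicodeDecodeError:
    refused = True

print(repr(as_shift_jis), repr(as_latin1), refused)
```

If a filter reads the bytes under one label and the browser renders them under the other, the two never agree on where characters, and therefore quotes and tags, begin and end.
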
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools
• Watcher
  - Web-app security testing and auditing
• Visual Spoofing Detection API
  - Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher - Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API
• Problem
  - Unicode enables visual-spoofing-maximus
• Solution
  - Confusable detection
  - Invisibles detection
  - Syntax spoof detection
  - more

Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  - Browsers
  - Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net, Casaba Security

March 2009 copy 2009 Chris Weber
Shift_jis
Gb2312
ISCII
Windows-1252
ISO-8859-1
EBCDIC 037
wwwcasabasecuritycom
Unicode Crash CourseCode pages and charsets
March 2009 copy 2009 Chris Weber
bull Unicode can represent them all
bull ASCII range is preserved
ndash U+0000 to U+007F are mapped to ASCII
wwwcasabasecuritycom
Unicode Crash CourseAd Infinitum
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Source Wikipedia
March 2009 copy 2009 Chris Weber
bull End users
bull Applications
bull Databases
bull Programming languages
bull Operating Systems
wwwcasabasecuritycom
Unicode Crash CourseThe Unicode Attack Surface
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash CourseUnthink it
March 2009 copy 2009 Chris Weber
bull A large and complex standard
Unicode Crash Course
code pointsencodingscategorizationnormalizationbinary propertiescase mappingconversion tablesbi-directional properties
canonical mappingsdecomposition typescase foldingbest-fit mapping17 planesprivate use rangesscript blocks
escapings
Unicode Crash Course
Glyph
Encoding
Properties
Code point
Block Script
Plane
A
UTF-8 UTF-16 UTF-32
Hex Uppercase etc
U+0041
Basic Latin Latin
Basic Multilingual Plane(BMP)
wwwcasabasecuritycom
March 2009 copy 2009 Chris Weber
bull Unicode 51 uses a 21-bit scalar value with space for over 1100000 code points
U+0000 to U+10FFFF
wwwcasabasecuritycom
Unicode Crash CourseCode points
March 2009 copy 2009 Chris Weber
A = U+0041
Every character has a unique number represented by a hex value
wwwcasabasecuritycom
Unicode Crash CourseCode Points
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash Course
AU+0041
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash Course
ſU+017F
March 2009 copy 2009 Chris Weber
bull The full 21-bit range is not actually available
U+0000 to U+D7FF and
U+E000 to U+10FFF
whatrsquos up with U+D800U+DFFF
wwwcasabasecuritycom
Unicode Crash CourseCode points
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash CourseUTF-16 Surrogate Pairs
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash CourseUTF-16 Surrogate Pairs
U+101D1
March 2009 copy 2009 Chris Weber
UTF-8 ndash variable width 1 to 4 bytes (used to be 6)
UTF-16ndash Endianessndash Variable width 2 or 4 bytesndash Surrogate pairs
UTF-32ndash Endianessndash Fixed width 4 bytesndash Fixed mapping no algorithms needed
wwwcasabasecuritycom
Unicode Crash CourseEncodings
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash coursebull Root Causes
ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareOverview
March 2009 copy 2009 Chris Weber
bull Over 100000 assigned characters
bull Many lookalikes within and across scripts
AΑАᐱᗅᗋᗩᴀᴬꜲA6553766304
wwwcasabasecuritycom
Root CausesVisual Spoofing
March 2009 copy 2009 Chris Weber
httpπαράδειγμαδοκιμή
(exampletest)
wwwcasabasecuritycom
Root CausesIDN ndash Internationalized Domain Names
March 2009 copy 2009 Chris Weber
bull IDNA 2003
bull Nameprep (NFKC and prohibit)
bull Punycodendash httpxn--hxajbheg2az3alxn--jxalpdlp
bull Whitelist TLDrsquosndash ORG DE CN to name a few
bull Language settings and TLD
bull Character blacklisting
wwwcasabasecuritycom
Root CausesIDN ndash what do the browsers do
March 2009 copy 2009 Chris Weber
bull Divergent browser implementations
bull Confusables exist
bull IDNA and Nameprep based on Unicode 32
ndash Wersquore up to Unicode 51 (larger repertoire)
wwwcasabasecuritycom
Root CausesIDN ndash so whatrsquos the problem
March 2009 copy 2009 Chris Weber
Some browsers allow COM IDNrsquos
based on script family
ndash (Latin has a big family)
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Safari
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Opera
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwgooglecom is not wwwgooɡlecom
Latin U+0069
LatinU+0261
gɡ
March 2009 copy 2009 Chris Weber
bull Normalize with NFKC
bull Homograph and Confusables detection
bull Specifications
ndash IDNA Stringprep
bull Guidance
ndash Unicode Consortium ICANN IETF IANA
wwwcasabasecuritycom
Root CausesGuidance for Visual Spoofing
March 2009 copy 2009 Chris Weber
ICANN guidelines v20
ndash Inclusion-based
ndash Script limitations
ndash Character limitations
Registries apply the guidance
ndash define the allowed characters per TLD
ndash Collaboration with IANA
Registrars sell the domain names
wwwcasabasecuritycom
Root CausesGuidance for International Domain Names
March 2009 copy 2009 Chris Weber
ICANN guidelines v20
ndash Inclusion-based
ndash Script limitations
ndash Character limitations
wwwcasabasecuritycom
Root CausesThe state of International Domain Names
Deny-all default seems to be the right concept
A script can cross many blocks Even with limited script choices therersquos plenty to choose from
Great for domain labels but sub domain labels still open to punctuation and syntax spoofing
March 2009 copy 2009 Chris Weber
bull Registrars still allow
ndash Confusables
ndash Combining marks
ndash Single Whole and Mixed-script
bull Registrars canrsquot control
ndash Syntax spoofing in sub domain labels
wwwcasabasecuritycom
Root CausesThe state of International Domain Names
March 2009 copy 2009 Chris Weber
bull Non-Unicode attacks
bull Confusables
bull Invisibles
bull Problematic font-rendering
bull Manipulating Combining Marks
bull Bidi and syntax spoofing
wwwcasabasecuritycom
Attack VectorsVisual spoofing Vectors
March 2009 copy 2009 Chris Weber
rn can look like m in certain fonts
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
wwwmulletscom is not wwwrnulletscom
Latin U+006D
LatinU+0073 U+006E
March 2009 copy 2009 Chris Weber
Are you using mono-width fonts
0 and O
1 and l
5 and S
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
Classic long URLrsquos
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
The Confusables
ndash Single script
ndash Mixed script
ndash Whole script
wwwcasabasecuritycom
Attack VectorsDefining Homographs
March 2009 copy 2009 Chris Weber
wwwɑpplecom User thinks lsquoarsquo
Really itrsquos Latin small letter Alpha lsquoɑrsquo
wwwlooĸoutnet
User thinks lsquokrsquo
Really itrsquos Latin letter kra lsquoĸrsquo
wwwcasabasecuritycom
Attack VectorsSingle-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwg๐๐glecom User thinks lsquoorsquo
Really itrsquos Thai digit zero lsquo๐rsquo
wwwfaϲebookcom
User thinks lsquocrsquo
Really itrsquos Greek lunate sigma symbol lsquocrsquo
wwwᏀooglecom
Really itrsquos Cherokee letter Nah lsquoᏀrsquo
wwwcasabasecuritycom
Attack VectorsMixed-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwаЬсcom
User thinks lsquoabcrsquo
Really itrsquos Cyrillic script
wwwігѕgov
User thinks lsquoirsrsquo
Really itrsquos Greek script
wwwcasabasecuritycom
Attack VectorsWhole-script and The Confusables
March 2009 copy 2009 Chris Weber
Browsers whitelist ORG
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Others donrsquot necessarily buthellip
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
bull ORG is whitelisted
ndash Limited characters available
bull To unscrutinizing eyes
iacute looks like i
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwmozillaorg is not wwwmoziacutellaorg
Latin U+0069
LatinU+00ED
March 2009 copy 2009 Chris Weber
(This case doesnrsquot work anymore)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
FULLWIDTH SOLIDUSU+FF0F
March 2009 copy 2009 Chris Weber
(Normalized to a U+002F)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
SOLIDUSU+002F
March 2009 copy 2009 Chris Weber
U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
U+FF65 KATAKANA MIDDLE DOT
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with and lookalikes
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes
httpwwwgooglecom
Katakana DotU+FF65
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecomノpathノfilenottrustedorg
Katakana NoU+FF89
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
IDN Visual Spoofing
March 2009 copy 2009 Chris Weber
IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
– Detects Confusables
– Detects Invisibles
– Detects syntax and punctuation lookalikes
– Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009
Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
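A minimal sketch of an invisibles check, assuming Python's standard `unicodedata` module. The function name `find_invisibles` is hypothetical. The three code points from the slide are listed explicitly because the general category of U+180E has changed between Unicode versions.

```python
import unicodedata

# Code points named on the slide, flagged regardless of the Unicode
# version that ships with the interpreter.
SLIDE_INVISIBLES = {0x200B, 0x180E, 0xFEFF}


def find_invisibles(text):
    """Return (index, code point) pairs for characters that render as nothing."""
    hits = []
    for index, ch in enumerate(text):
        # "Cf" is the general category for format characters.
        if ord(ch) in SLIDE_INVISIBLES or unicodedata.category(ch) == "Cf":
            hits.append((index, "U+%04X" % ord(ch)))
    return hits


print(find_invisibles("www.goo\u200bgle.com"))
```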
Attack Vectors: Visual Spoofing, Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)
Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
ō looks like U+006F U+0304
ō̄ is U+006F U+0304 U+0304
• Order of combining marks
– ȏ and ö under NFKC
<U+006F U+0308 U+0311> → <U+00F6 U+0311>
<U+006F U+0311 U+0308> → <U+020F U+0308>
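The two combining-mark tricks can be reproduced with Python's `unicodedata` module. A small sketch follows. The helper names are hypothetical, and the repeated-mark test is only one possible heuristic.

```python
import unicodedata


def nfkc_codepoints(text):
    """List the code points of text after NFKC normalization."""
    return ["U+%04X" % ord(ch) for ch in unicodedata.normalize("NFKC", text)]


def has_repeated_mark(text):
    """True if the same combining mark occurs twice in a row."""
    decomposed = unicodedata.normalize("NFD", text)
    return any(
        a == b and unicodedata.combining(a) != 0
        for a, b in zip(decomposed, decomposed[1:])
    )


# Same marks, different order: the results differ after normalization.
print(nfkc_codepoints("o\u0308\u0311"))
print(nfkc_codepoints("o\u0311\u0308"))
# A doubled macron that many fonts draw as a single one.
print(has_repeated_mark("o\u0304\u0304"))
```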
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)
Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
– “These characters are to be avoided wherever possible because of security concerns”
– Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
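A sketch of how the explicit directional controls might be flagged, assuming Python's `unicodedata.bidirectional`. The function name and the sample file name are illustrative only.

```python
import unicodedata

# Bidi classes of the explicit embedding and override controls
# (U+202A to U+202E).
EXPLICIT_BIDI_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF"}


def find_bidi_controls(text):
    """Return (index, code point) pairs for explicit bidi controls."""
    return [
        (index, "U+%04X" % ord(ch))
        for index, ch in enumerate(text)
        if unicodedata.bidirectional(ch) in EXPLICIT_BIDI_CLASSES
    ]


# With U+202E the tail of the name is displayed reversed.
print(find_bidi_controls("photo\u202egpj.exe"))
```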
Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
U+2032 PRIME
Best-fit mappings: Windows best-fit P/Invoke
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 Left-Pointing Angle Bracket maps to U+003C
• U+3008 Left Angle Bracket maps to U+003C
How can we best-fit the s character?
• U+015B Latin Small Letter S With Acute maps to U+0073
• U+015D Latin Small Letter S With Circumflex maps to U+0073
To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]
Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
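The guidance above is stated for .NET and Win32. As an analogous sketch in Python (an assumption, not the APIs named on the slide), the idea is to fail on characters with no exact mapping instead of substituting a lookalike. The helper name `encode_strict` is hypothetical.

```python
def encode_strict(text, charset="cp1252"):
    """Encode text, rejecting any character without an exact mapping."""
    try:
        # errors="strict" never substitutes a "close enough" character.
        return text.encode(charset, errors="strict")
    except UnicodeEncodeError as exc:
        bad = text[exc.start]
        raise ValueError(
            "U+%04X has no exact mapping in %s" % (ord(bad), charset)
        ) from exc


print(encode_strict("abc"))
try:
    encode_strict("\u03c3")  # GREEK SMALL LETTER SIGMA
except ValueError as err:
    print("rejected:", err)
```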
Case Study: Social Networking, Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
– Root Cause: Best-fit mappings
-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map
Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution
NFD: Decompose (canonical)
NFC: Decompose (canonical), Recompose
NFKD: Decompose (compatibility)
NFKC: Decompose (compatibility), Recompose
İ becomes I + ◌̇ (combining dot above)
U+0130 → U+0049 U+0307
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
U+FE64 → U+003C
toNFKC(“﹤script>”) = “<script>”
Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC is a first defense against visual spoofing
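The ordering problem can be shown in a few lines of Python. This is a sketch only: the substring filter `contains_script_tag` is a deliberately naive stand-in for real validation logic, and both helper names are hypothetical.

```python
import unicodedata


def contains_script_tag(text):
    """Naive stand-in for a real XSS filter."""
    return "<script" in text.lower()


def is_safe(text):
    """Normalize first, then validate the string that will actually be used."""
    normalized = unicodedata.normalize("NFKC", text)
    return not contains_script_tag(normalized)


payload = "\ufe64script\ufe65alert(1)"       # SMALL LESS-THAN / GREATER-THAN
print(contains_script_tag(payload))           # filter alone sees nothing
print(unicodedata.normalize("NFKC", payload)) # what a later NFKC step yields
print(is_safe(payload))
```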
Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '
Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
– Generation of non-shortest form
– Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
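Why C0 A7 turns into 27 can be seen by decoding it without the shortest-form check. In the sketch below, `lax_decode_two_byte` is a hypothetical, intentionally unsafe decoder. Python's built-in UTF-8 decoder is used as the validating counterpart.

```python
def lax_decode_two_byte(lead, trail):
    """Decode a 2-byte UTF-8 sequence WITHOUT rejecting overlong forms."""
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


def strict_decode(data):
    """Validating decode: raises UnicodeDecodeError on ill-formed input."""
    return data.decode("utf-8")


print(repr(lax_decode_two_byte(0xC0, 0xA7)))  # the apostrophe, U+0027
try:
    strict_decode(b"\xc0\xa7")
except UnicodeDecodeError as err:
    print("rejected:", err.reason)
```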
Attack Vectors: Directory traversal
How many ways can you say ../
• Directory traversal test cases
– http://site/root/../system
– Overlong UTF-8, U+002F: http://site/root[C0 AE]system
– Full-width Solidus U+FF0F, normalized KC or KD: http://site/root[EF BC 8F]system
– Division Slash U+2215, best-fit: http://site/root[E2 88 95]system
– Two dot leader U+2025, normalized KC or KD: http://site/root[E2 80 A5][E2 80 A5]system
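The byte sequences and normalization results for these traversal candidates can be checked with a short Python sketch. The `describe` helper is hypothetical. Best-fit behaviour is platform-specific and is not modelled here.

```python
import unicodedata

CANDIDATES = {
    "FULLWIDTH SOLIDUS": "\uff0f",
    "DIVISION SLASH": "\u2215",
    "TWO DOT LEADER": "\u2025",
}


def describe(ch):
    """Return (UTF-8 bytes as hex, NFKC result) for one character."""
    return ch.encode("utf-8").hex().upper(), unicodedata.normalize("NFKC", ch)


for name, ch in CANDIDATES.items():
    utf8_hex, nfkc = describe(ch)
    print("%-18s %s -> %r" % (name, utf8_hex, nfkc))
```

U+2215 is unchanged by NFKC, which is consistent with the slide listing it under best-fit rather than normalization.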
Root Causes: Handling the Unexpected
• Unassigned code points
– U+2073
• Illegal code points
– Half a surrogate pair
• Code points with special meaning
– U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution
Root Causes: Handling the Unexpected, Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes
<41 41>
<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
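The <41 C2 3E 41> case can be reproduced with a deliberately flawed decoder. In this sketch `overconsuming_decode` is hypothetical and exists only to show the bug. Python's own decoder is shown for comparison, replacing just the bad lead byte.

```python
def overconsuming_decode(data):
    """Flawed on purpose: a lead byte swallows the next byte unconditionally."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif 0xC2 <= b <= 0xDF:
            nxt = data[i + 1] if i + 1 < len(data) else None
            if nxt is not None and 0x80 <= nxt <= 0xBF:
                out.append(chr(((b & 0x1F) << 6) | (nxt & 0x3F)))
            # BUG: skips two bytes even when the second is not a trail byte.
            i += 2
        else:
            i += 1
    return "".join(out)


raw = bytes.fromhex("41C23E41")
print(repr(overconsuming_decode(raw)))               # the ">" is lost
print(repr(raw.decode("utf-8", errors="replace")))   # only C2 is replaced
```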
Root Causes: Handling the Unexpected, Character-substitution
Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad
Root Causes: Handling the Unexpected, Character-deletion
“deletion of noncharacters” (UTR-36)
<scr[U+FEFF]ipt> becomes <script>
Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
– A common alternative is ‘’ which can be safe
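Deletion and replacement can be compared directly. The sketch below uses hypothetical helpers and a naive substring check standing in for a real filter. It shows that deleting U+FEFF after validation rebuilds the tag, while substituting U+FFFD does not.

```python
BOM = "\ufeff"
REPLACEMENT = "\ufffd"


def passes_filter(text):
    """Naive stand-in for a real validation step."""
    return "<script" not in text.lower()


def sanitize_by_deleting(text):
    return text.replace(BOM, "")


def sanitize_by_replacing(text):
    return text.replace(BOM, REPLACEMENT)


payload = "<scr\ufeffipt>"
print(passes_filter(payload))                          # filter is fooled
print(sanitize_by_deleting(payload))                   # tag is reassembled
print(passes_filter(sanitize_by_replacing(payload)))   # still inert
```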
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
– E.g. Cross-site scripting (buffer overflow of the Web)
Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root Cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')">
Can be nastier
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">
DEMO: Safari BOM injection for XSS
A Closer Look: The BOM
BOM U+FEFF
Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution
toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”
len(x) != len(toLower(x))
Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
– ICU, .NET
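The length change under casing is easy to observe in Python. Python applies the full Unicode case mappings, so the result for İ is i followed by U+0307 rather than the bare i shown on the slide (which would come from a different, for example locale-specific, mapping). The helper name `codepoints` is hypothetical.

```python
def codepoints(text):
    """List the code points in text as U+XXXX strings."""
    return ["U+%04X" % ord(ch) for ch in text]


dotted_i = "\u0130"   # LATIN CAPITAL LETTER I WITH DOT ABOVE
iota = "\u0390"       # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS

print(codepoints(dotted_i.lower()), len(dotted_i), len(dotted_i.lower()))
print(codepoints(iota.upper()), len(iota), len(iota.upper()))
```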
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution
Casing: maximum expansion factors
Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
            16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390
Source: Unicode Technical Report 36
Normalization: maximum expansion factors
Operation   UTF         Factor   Sample
NFC         8           3X       𝅘𝅥𝅮 U+1D160
            16, 32      3X       שּׁ U+FB2C
NFD         8           3X       ΐ U+0390
            16, 32      4X       ᾂ U+1F82
NFKC/NFKD   8           11X      ﷺ U+FDFA
            16, 32      18X      ﷺ U+FDFA
Source: Unicode Technical Report 36
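Some of the expansion factors in these tables can be recomputed with a short Python sketch. The `expansion` helper is hypothetical. It measures growth in encoded bytes, which is what matters when sizing a buffer.

```python
import unicodedata


def expansion(ch, transform, encoding):
    """Ratio of encoded size after the transform to the size before it."""
    before = len(ch.encode(encoding))
    after = len(transform(ch).encode(encoding))
    return after / before


def nfkc(text):
    return unicodedata.normalize("NFKC", text)


print(expansion("\u023a", str.lower, "utf-8"))    # Lower, UTF-8
print(expansion("\u0390", str.upper, "utf-8"))    # Upper, UTF-8
print(expansion("\ufdfa", nfkc, "utf-8"))         # NFKC, UTF-8
print(expansion("\ufdfa", nfkc, "utf-16-le"))     # NFKC, UTF-16
```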
Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
– ICU, .NET
Root Causes: Controlling Syntax
• White space and line breaks
– E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution
Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Case Study: Opera
• Unicode formatter characters exploited for XSS
– Damage: Filter evasion, controlling syntax
– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
– Root Cause: Interpreting “white space”
– A problem with HTML 4.0 spec
<a href=[U+180E]onclick=alert()>
DEMO: Opera White Space Characters
MVS U+180E
Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…
Root Causes: Specifications
1) Character stability
– IDNA/Nameprep based on Unicode 3.2
2) Designs
– Specs are carefully designed but not always perfect
• This could have been a problem
– “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
– HTML 4.01
• Defines four whitespace characters and explicitly leaves handling other characters up to implementer
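One way to avoid depending on a parser's private idea of white space is to flag any space-like or format character outside the HTML 4.01 set. The sketch below takes that set to be space, tab, form feed, and zero-width space, with CR and LF handled as line breaks. The function name is hypothetical.

```python
import unicodedata

# White space characters assumed from HTML 4.01, plus line breaks.
HTML401_WHITESPACE = {"\u0020", "\u0009", "\u000c", "\u200b"}
LINE_BREAKS = {"\r", "\n"}


def suspicious_separators(markup):
    """Flag separator or format characters a lenient parser might treat as space."""
    hits = []
    for index, ch in enumerate(markup):
        if ch in HTML401_WHITESPACE or ch in LINE_BREAKS:
            continue
        category = unicodedata.category(ch)
        # Z* = separators, Cf = format characters (U+180E has been both).
        if category.startswith("Z") or category == "Cf":
            hits.append((index, "U+%04X" % ord(ch)))
    return hits


print(suspicious_separators("<a href=\u180eonclick=alert()>"))
```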
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss
Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay
Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input
Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
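A sketch of an "error if uncertain" check that compares the charset declared in the HTTP header with the one in the meta element. The helper names and the regular expression are assumptions for illustration, not a full HTML or MIME parser.

```python
import codecs
import re

_CHARSET_RE = re.compile(r"charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)", re.IGNORECASE)


def declared_charset(value):
    """Return the canonical codec name declared in value, or None."""
    match = _CHARSET_RE.search(value)
    if not match:
        return None
    try:
        # codecs.lookup folds aliases such as "UTF-8" and "utf8" together.
        return codecs.lookup(match.group(1)).name
    except LookupError:
        return None


def charsets_agree(header_value, meta_tag):
    """True only when both declarations exist and name the same codec."""
    header = declared_charset(header_value)
    meta = declared_charset(meta_tag)
    return header is not None and header == meta


header = "text/html; charset=ISO-8859-1"
meta = '<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">'
print(charsets_agree(header, meta))
```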
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
Tools
• Watcher
– Web-app security testing and auditing
• Visual Spoofing Detection API
– Providing guarantees against Visual Spoofing and Homograph attacks
Tools: Watcher, Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher, Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com
Tools: Visual Spoofing Detection API
• Problem
– Unicode enables visual-spoofing-maximus
• Solution
– Confusable detection
– Invisibles detection
– Syntax spoof detection
– more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
– Browsers
– Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net, Casaba Security
Unicode Crash Course: Ad Infinitum
• Unicode can represent them all
• ASCII range is preserved
– U+0000 to U+007F are mapped to ASCII
Source: Wikipedia
Unicode Crash Course: The Unicode Attack Surface
• End users
• Applications
• Databases
• Programming languages
• Operating Systems
Unicode Crash Course: Unthink it
• A large and complex standard
code points, encodings, categorization, normalization, binary properties, case mapping, conversion tables, bi-directional properties, canonical mappings, decomposition types, case folding, best-fit mapping, 17 planes, private use ranges, script blocks, escapings
Unicode Crash Course
Glyph: A
Encoding: UTF-8, UTF-16, UTF-32
Properties: Hex, Uppercase, etc.
Code point: U+0041
Block, Script: Basic Latin, Latin
Plane: Basic Multilingual Plane (BMP)
Unicode Crash Course: Code points
• Unicode 5.1 uses a 21-bit scalar value with space for over 1,100,000 code points
U+0000 to U+10FFFF
A = U+0041
Every character has a unique number, represented by a hex value
A U+0041
ſ U+017F
• The full 21-bit range is not actually available
U+0000 to U+D7FF and
U+E000 to U+10FFFF
What’s up with U+D800 to U+DFFF?
Unicode Crash Course: UTF-16 Surrogate Pairs
U+101D1
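The reserved range U+D800 to U+DFFF exists so that UTF-16 can address code points above U+FFFF. A small sketch of the arithmetic follows, with a hypothetical helper name, checked against Python's own UTF-16 encoder for the U+101D1 example.

```python
def to_surrogates(code_point):
    """Split a supplementary code point into its UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only U+10000..U+10FFFF need surrogates")
    offset = code_point - 0x10000          # 20 bits remain
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


high, low = to_surrogates(0x101D1)
print("U+101D1 -> %04X %04X" % (high, low))
print("\U000101D1".encode("utf-16-be").hex().upper())
```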
Unicode Crash Course: Encodings
UTF-8
– Variable width, 1 to 4 bytes (used to be 6)
UTF-16
– Endianness
– Variable width, 2 or 4 bytes
– Surrogate pairs
UTF-32
– Endianness
– Fixed width, 4 bytes
– Fixed mapping, no algorithms needed
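The widths of the three encodings can be compared for the sample characters used in this crash course. A minimal sketch follows, with a hypothetical helper name. Big-endian variants are used so that no byte order mark is added to the counts.

```python
ENCODINGS = ("utf-8", "utf-16-be", "utf-32-be")


def encoded_lengths(ch):
    """Return the number of bytes ch occupies in each encoding."""
    return {enc: len(ch.encode(enc)) for enc in ENCODINGS}


for sample in ("A", "\u017f", "\U000101D1"):
    print("U+%04X" % ord(sample), encoded_lengths(sample))
```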
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
Exploiting Unicode-enabled Software: Overview
• Unicode crash course
• Root Causes
– Visual Spoofing and IDNs
– Best-fit mappings
– Normalization
– Overlong UTF-8
– Over-consumption
– Character substitution
– Character deletion
– Casing
– Buffer overflows
– Controlling Syntax
– Charset transformations
– Charset mismatches
• Tools
Root Causes: Visual Spoofing
• Over 100,000 assigned characters
• Many lookalikes within and across scripts
AΑАᐱᗅᗋᗩᴀᴬꜲA𐀁𐌀
Root Causes: IDN, Internationalized Domain Names
http://παράδειγμα.δοκιμή
(example.test)
Root Causes: IDN, what do the browsers do?
• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
– http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
– .ORG, .DE, .CN to name a few
• Language settings and TLD
• Character blacklisting
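The Punycode form of the Greek example.test name can be reproduced with Python's built-in `idna` codec, which implements IDNA 2003 (Nameprep followed by Punycode). A minimal sketch follows. The helper name `to_ascii` is hypothetical.

```python
def to_ascii(hostname):
    """Convert an internationalized host name to its ASCII (xn--) form."""
    return hostname.encode("idna").decode("ascii")


unicode_name = "παράδειγμα.δοκιμή"
print(unicode_name, "->", to_ascii(unicode_name))
```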
Root Causes: IDN, so what’s the problem?
• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
– We’re up to Unicode 5.1 (larger repertoire)
Attack Vectors: IDN homograph attacks
Some browsers allow .COM IDNs based on script family
– (Latin has a big family)
Safari
Opera
www.google.com is not www.gooɡle.com
g = Latin U+0067
ɡ = Latin U+0261
Root Causes: Guidance for Visual Spoofing
• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
– IDNA, Stringprep
• Guidance
– Unicode Consortium, ICANN, IETF, IANA
Root Causes: Guidance for International Domain Names
ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations
Registries apply the guidance
– Define the allowed characters per TLD
– Collaboration with IANA
Registrars sell the domain names
Root Causes: The state of International Domain Names
ICANN guidelines v2.0
– Inclusion-based: Deny-all default seems to be the right concept
– Script limitations: A script can cross many blocks. Even with limited script choices, there’s plenty to choose from
– Character limitations: Great for domain labels, but sub domain labels still open to punctuation and syntax spoofing
• Registrars still allow
– Confusables
– Combining marks
– Single, Whole, and Mixed-script
• Registrars can’t control
– Syntax spoofing in sub domain labels
Attack Vectors: Visual spoofing Vectors
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing
Attack Vectors: Non-Unicode homograph attacks
rn can look like m in certain fonts
www.mullets.com is not www.rnullets.com
m = Latin U+006D
rn = Latin U+0072 U+006E
Are you using mono-width fonts?
0 and O
1 and l
5 and S
Classic long URLs
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements
Attack Vectors: Defining Homographs
The Confusables
– Single script
– Mixed script
– Whole script
Attack Vectors: Single-script and The Confusables
www.ɑpple.com
User thinks ‘a’
Really it’s Latin small letter Alpha ‘ɑ’
www.looĸout.net
User thinks ‘k’
Really it’s Latin letter kra ‘ĸ’
Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com
User thinks ‘o’
Really it’s Thai digit zero ‘๐’
www.faϲebook.com
User thinks ‘c’
Really it’s Greek lunate sigma symbol ‘ϲ’
www.Ꮐoogle.com
Really it’s Cherokee letter Nah ‘Ꮐ’
Attack Vectors: Whole-script and The Confusables
www.аЬс.com
User thinks ‘abc’
Really it’s Cyrillic script
www.ігѕ.gov
User thinks ‘irs’
Really it’s Cyrillic script
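Mixed-script labels can be approximated with Python's standard library alone. The sketch below is a heuristic: the stdlib has no Script property, so the first word of each character name (LATIN, GREEK, CYRILLIC, and so on) stands in for it, and `scripts_in` is a hypothetical helper.

```python
import unicodedata


def scripts_in(label):
    """Guess the set of scripts used in a label from character names."""
    scripts = set()
    for ch in label:
        if ord(ch) < 0x80 and not ch.isalpha():
            continue  # ignore ASCII digits, dots, and hyphens
        name = unicodedata.name(ch, "")
        if name:
            scripts.add(name.split()[0])
    return scripts


for label in ("google", "fa\u03f2ebook", "g\u0e50\u0e50gle", "\u0456\u0433\u0455"):
    found = scripts_in(label)
    print(label, sorted(found), "MIXED" if len(found) > 1 else "single")
```

The last sample is entirely Cyrillic, so a mixed-script test alone does not catch whole-script confusables. Those need a confusables table.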
Attack Vectors: IDN homograph attacks
Browsers whitelist .ORG
Others don’t necessarily, but…
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
bull ORG is whitelisted
ndash Limited characters available
bull To unscrutinizing eyes
iacute looks like i
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwmozillaorg is not wwwmoziacutellaorg
Latin U+0069
LatinU+00ED
March 2009 copy 2009 Chris Weber
(This case doesnrsquot work anymore)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
FULLWIDTH SOLIDUSU+FF0F
March 2009 copy 2009 Chris Weber
(Normalized to a U+002F)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
SOLIDUSU+002F
March 2009 copy 2009 Chris Weber
U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
U+FF65 KATAKANA MIDDLE DOT
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with and lookalikes
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes
httpwwwgooglecom
Katakana DotU+FF65
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecomノpathノfilenottrustedorg
Katakana NoU+FF89
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
IDN Visual Spoofing
March 2009 copy 2009 Chris Weber
bull Visual Spoofing Detection API
ndash Detects Confusables
ndash Detects Invisibles
ndash Detections syntax and punctuation lookalikes
ndash Detects combining mark tricks
bull Currently in testing
bull Release planned for Fall 2009
wwwcasabasecuritycom
IDN Visual SpoofingSolutions and Defenses (yes there is one)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
bull Fonts render glyphs confusingly
bull Fonts render glyphs as empty white space
httpwwwgooglecom phreedomorg
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)
A is A (Lucida Sans Unicode Courier New)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Multiple combining marks
o looks like U+006F U+0304
o is U+006F U+0304 U+0304
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Order of combining marksndash ȏ and ouml under NFKC
ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt
ltU+006F U+0311U+0308gt ltU+020F U+0308gt
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
bull httpunicodeorgreportstr9
ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo
ndash forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
March 2009 copy 2009 Chris Weber
Commonly occur in charset transformations and even innocuous APIrsquos
Impact Filter evasion Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When prime becomes
U+2032 PRIME
wwwcasabasecuritycom
Root CausesBest-fit mappings
March 2009 copy 2009 Chris Weber
Net runtime will marshall a string as LPStr to a pinvoke function
How can we best-fit the lt character
bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket
How can we best-fit the s character
bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex
To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]
wwwcasabasecuritycom
Windows best-fit pInvokeBest-fit mappings
March 2009 copy 2009 Chris Weber
bull Scrutinize charactercharset manipulation APIrsquos
bull Use EncoderFallback with SystemTextEncoding
bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
bull Use Unicode end-to-end
wwwcasabasecuritycom
Root CausesGuidance for Best-Fit mappings
March 2009 copy 2009 Chris Weber
bull A popular social networking site in 2008
bull Implemented complex filtering logic to prevent XSS
ndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting
ndash Root Cause best-fit mappings
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
-moz-binding()
was not allowed buthellip
-[U+ff4d]oz-binding()
would best-fit map
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
Normalizing strings after validation is dangerous
Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
NFD - Decompose (canonical)
NFC - Decompose (canonical) Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility) Recompose
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
İ becomes I +
wwwcasabasecuritycom
Root CausesNormalization
U+0130 U+0049 U+0307
March 2009 copy 2009 Chris Weber
But are there dangerous characters
You bethellip with NFKC and NFKD you could control HTML or other parsing
﹤ becomes lt
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
﹤ becomes lt
toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
Normalize strings before validation
NFKC first defense against Visual spoofing
wwwcasabasecuritycom
Root CausesGuidance for Normalization
March 2009 copy 2009 Chris Weber
Non-shortest or overlong UTF-8
Impact Filter evasion Enable code execution
Application gets C0A7
OSFramework sees 27
Database gets
wwwcasabasecuritycom
Root CausesNon-shortest form UTF-8
March 2009 copy 2009 Chris Weber
bull Unicode specification forbids
ndash Generation of non-shortest form
ndash Interpretation of non-shortest form for BMP
bull Validate UTF-8 encoding (throw on error)
wwwcasabasecuritycom
Root CausesGuidance for Non-shortest form UTF-8
March 2009 copy 2009 Chris Weber
How many ways can you say
wwwcasabasecuritycom
Attack VectorsDirectory traversal
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Directory traversal test casesndash httpsiterootsystem
ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem
ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem
ndash Division Slash U+2215 best-fithttpsiteroot E28895system
ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system
wwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Unassigned code points
ndash U+2073
bull Illegal code points
ndash Half a surrogate pair
bull Code points with special meaning
ndash U+FEFF is the BOM
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesHandling the Unexpected
March 2009 copy 2009 Chris Weber
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
lt41 C2 3E 41gt becomes
lt41 41gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
ltimg src=[0xC2]gt onerror=alert(1)ltbr gt
becomes
ltimg src=gt onerror=alert(1)ltbr gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
Correcting insecurely rather than failing
ndash Substituting a lsquorsquo or a lsquorsquo would be bad
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-substitution
March 2009 copy 2009 Chris Weber
ldquodeletion of noncharactersrdquo (UTR-36)
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
ltscr[U+FEFF]iptgt becomes ltscriptgt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
bull Fail or error
bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe
wwwcasabasecuritycom
Root CausesSolutions for Handling the Unexpected
March 2009 copy 2009 Chris Weber
bull Bypass filters WAFrsquos NIDS and validation
bull Exploit delivery techniques
ndash Eg Cross-site scripting (buffer overflow of the Web)
wwwcasabasecuritycom
Attack VectorsFilter evasion
March 2009 copy 2009 Chris Weber
Safari and Firefox BOM consumptionndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting
ndash Root Cause Character deletion
lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt
Can be nastier
lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt
wwwcasabasecuritycom
Case Study Apple and Mozilla
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Safari BOM injection for XSS
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
A Closer Look The BOM
BOMU+FEFF
March 2009 copy 2009 Chris Weber
Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, enable code execution

toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”

len(x) != len(toLower(x))

Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
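The length change under casing is easy to observe with Python's built-in full case mappings. This is an illustrative sketch, not the API the talk refers to (ICU or .NET may give locale-dependent results); `case_lengths` is a hypothetical helper.

```python
def case_lengths(text: str) -> dict:
    """Code point and UTF-8 byte lengths of text, its lowercase and its uppercase."""
    lowered, uppered = text.lower(), text.upper()
    return {
        "code_points": (len(text), len(lowered), len(uppered)),
        "utf8_bytes": tuple(len(s.encode("utf-8")) for s in (text, lowered, uppered)),
    }


dotted_i = "\u0130"    # İ LATIN CAPITAL LETTER I WITH DOT ABOVE
iota = "\u0390"        # ΐ GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
a_stroke = "\u023a"    # Ⱥ LATIN CAPITAL LETTER A WITH STROKE

for ch in (dotted_i, iota, a_stroke):
    print(f"U+{ord(ch):04X}", case_lengths(ch))
```

Any buffer sized from `len(x)` before the casing call is therefore too small for some inputs, which is why the size should be computed after the operation.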
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Casing: maximum expansion factors (Source: Unicode Technical Report 36)
Operation   UTF         Factor   Sample
Lower       8           1.5X     Ⱥ U+023A
Lower       16, 32      1X       A U+0041
Upper       8, 16, 32   3X       ΐ U+0390

Normalization: maximum expansion factors (Source: Unicode Technical Report 36)
Operation   UTF         Factor   Sample
NFC         8           3X       𝅘𝅥𝅮 U+1D160
NFC         16, 32      3X       שּׁ U+FB2C
NFD         8           3X       ΐ U+0390
NFD         16, 32      4X       ᾂ U+1F82
NFKC/NFKD   8           11X      ﷺ U+FDFA
NFKC/NFKD   16, 32      18X      ﷺ U+FDFA
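The worst-case factors in the normalization table can be checked with Python's `unicodedata` module. A small sketch follows; `expansion` is a hypothetical helper, and the code point factor stands in for the UTF-16/UTF-32 column, which holds for these samples because their normalized forms stay within the BMP.

```python
import unicodedata


def expansion(ch: str, form: str) -> tuple:
    """Return (UTF-8 byte factor, code point factor) for normalizing ch to form."""
    out = unicodedata.normalize(form, ch)
    byte_factor = len(out.encode("utf-8")) / len(ch.encode("utf-8"))
    code_point_factor = len(out) / len(ch)
    return byte_factor, code_point_factor


samples = [("\u0390", "NFD"), ("\u1f82", "NFD"), ("\ufdfa", "NFKC")]
for ch, form in samples:
    print(f"U+{ord(ch):04X}", form, expansion(ch, form))
```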
Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
Root Causes: Controlling Syntax
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, enable code execution

Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera
• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with the HTML 4.0 spec
<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

Case Study: Opera
MVS: U+180E

Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…
Root Causes: Specifications
1) Character stability
   – IDNA/Nameprep based on Unicode 3.2
2) Designs
   – Specs are carefully designed but not always perfect
   • This could have been a problem
     – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
   – HTML 4.01
     • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, enable code execution, data loss

Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, enable code execution

Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
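To make the mismatch concrete, the sketch below decodes the same two bytes under the two charsets named on the slide, then shows one possible reading of "force UTF-8, error if uncertain". The byte pair and the `decode_utf8_only` helper are illustrative assumptions, not taken from the talk.

```python
# 0x81 is a Shift_JIS lead byte; 0x5C is "\" in ISO-8859-1 and ASCII.
raw = bytes([0x81, 0x5C])

as_latin1 = raw.decode("iso-8859-1")   # two characters, the second is a backslash
as_sjis = raw.decode("shift_jis")      # one double-byte character, no backslash


def decode_utf8_only(body: bytes, declared_charset: str) -> str:
    """Hypothetical policy: accept only UTF-8 and fail on anything else."""
    if declared_charset.strip().lower() not in ("utf-8", "utf8"):
        raise ValueError(f"unsupported charset: {declared_charset!r}")
    return body.decode("utf-8")  # strict: raises on ill-formed input


print(len(as_latin1), len(as_sjis))
```

A filter that reasons about the backslash under one charset while the consumer parses under the other will disagree about where escapes begin and end.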
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools
• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net
Casaba Security, www.casabasecurity.com
Source: Wikipedia

Unicode Crash Course: The Unicode Attack Surface
• End users
• Applications
• Databases
• Programming languages
• Operating Systems

Unicode Crash Course: Unthink it

Unicode Crash Course
• A large and complex standard
code points, encodings, categorization, normalization, binary properties, case mapping, conversion tables, bi-directional properties, canonical mappings, decomposition types, case folding, best-fit mapping, 17 planes, private use ranges, script blocks, escapings

Unicode Crash Course
Glyph: A
Encoding: UTF-8, UTF-16, UTF-32
Properties: Hex, Uppercase, etc.
Code point: U+0041
Block / Script: Basic Latin / Latin
Plane: Basic Multilingual Plane (BMP)
Unicode Crash Course: Code points
• Unicode 5.1 uses a 21-bit scalar value with space for over 1,100,000 code points
U+0000 to U+10FFFF

A = U+0041
Every character has a unique number, represented by a hex value

A: U+0041
ſ: U+017F

• The full 21-bit range is not actually available
U+0000 to U+D7FF and
U+E000 to U+10FFFF
What’s up with U+D800–U+DFFF?

Unicode Crash Course: UTF-16 Surrogate Pairs
U+101D1

Unicode Crash Course: Encodings
UTF-8
– Variable width, 1 to 4 bytes (used to be 6)
UTF-16
– Endianness
– Variable width, 2 or 4 bytes
– Surrogate pairs
UTF-32
– Endianness
– Fixed width, 4 bytes
– Fixed mapping, no algorithms needed
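The encoding widths and the surrogate-pair arithmetic can be worked through for the U+101D1 example. This is a short sketch; `surrogate_pair` and `widths` are hypothetical helper names, and the big-endian codecs are chosen only to avoid a BOM in the output.

```python
def surrogate_pair(code_point: int) -> tuple:
    """Split a supplementary code point into its UTF-16 high and low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only U+10000..U+10FFFF are encoded as surrogate pairs")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


def widths(ch: str) -> dict:
    """Encoded size in bytes of a single character in each encoding form."""
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-be", "utf-32-be")}


high, low = surrogate_pair(0x101D1)
print(f"U+101D1 -> {high:04X} {low:04X}")
print(widths("A"), widths("\U000101D1"))
```

This also shows why U+D800 through U+DFFF are unavailable as characters: those values are reserved for the two halves of a pair.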
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Exploiting Unicode-enabled Software: Overview
• Unicode crash course
• Root Causes
  – Visual Spoofing and IDNs
  – Best-fit mappings
  – Normalization
  – Overlong UTF-8
  – Over-consumption
  – Character substitution
  – Character deletion
  – Casing
  – Buffer overflows
  – Controlling Syntax
  – Charset transformations
  – Charset mismatches
• Tools
Root Causes: Visual Spoofing
• Over 100,000 assigned characters
• Many lookalikes within and across scripts
A Α А ᐱ ᗅ ᗋ ᗩ ᴀ ᴬ Ꜳ A 𐀁 𐌀

Root Causes: IDN – Internationalized Domain Names
http://παράδειγμα.δοκιμή
(example.test)

Root Causes: IDN – what do the browsers do?
• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
  – http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
  – .ORG, .DE, .CN, to name a few
• Language settings and TLD
• Character blacklisting
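The Nameprep and Punycode steps listed above can be reproduced with Python's built-in `idna` codec, which implements IDNA 2003. A minimal sketch, assuming that codec is an acceptable stand-in for what a browser does; `to_ascii` is a hypothetical wrapper name.

```python
def to_ascii(domain: str) -> str:
    """Apply IDNA 2003 ToASCII (Nameprep, then Punycode) to each label."""
    return domain.encode("idna").decode("ascii")


unicode_name = "παράδειγμα.δοκιμή"
print(to_ascii(unicode_name))
```

The browser shows the Unicode form while DNS only ever sees the `xn--` form, so any spoofing check has to run on the Unicode labels before this conversion.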
Root Causes: IDN – so what’s the problem?
• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
  – We’re up to Unicode 5.1 (larger repertoire)

Attack Vectors: IDN homograph attacks
Some browsers allow .COM IDNs based on script family
– (Latin has a big family)
Safari
Opera
www.google.com is not www.gooɡle.com
g: Latin U+0067
ɡ: Latin U+0261
Root Causes: Guidance for Visual Spoofing
• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
  – IDNA, Stringprep
• Guidance
  – Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for International Domain Names
ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations
Registries apply the guidance
– Define the allowed characters per TLD
– Collaboration with IANA
Registrars sell the domain names

Root Causes: The state of International Domain Names
ICANN guidelines v2.0
– Inclusion-based: deny-all default seems to be the right concept
– Script limitations: a script can cross many blocks. Even with limited script choices there’s plenty to choose from
– Character limitations: great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing
• Registrars still allow
  – Confusables
  – Combining marks
  – Single, Whole and Mixed-script
• Registrars can’t control
  – Syntax spoofing in sub domain labels
Attack Vectors: Visual spoofing Vectors
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks
rn can look like m in certain fonts
www.mullets.com is not www.rnullets.com
m: Latin U+006D
rn: Latin U+0072 U+006E

Are you using mono-width fonts?
0 and O
1 and l
5 and S

Classic long URLs
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs
The Confusables
– Single script
– Mixed script
– Whole script
Attack Vectors: Single-script and The Confusables
www.ɑpple.com
User thinks ‘a’
Really it’s Latin small letter Alpha ‘ɑ’
www.looĸout.net
User thinks ‘k’
Really it’s Latin letter kra ‘ĸ’

Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com
User thinks ‘o’
Really it’s Thai digit zero ‘๐’
www.faϲebook.com
User thinks ‘c’
Really it’s Greek lunate sigma symbol ‘ϲ’
www.Ꮐoogle.com
Really it’s Cherokee letter Nah ‘Ꮐ’

Attack Vectors: Whole-script and The Confusables
www.аЬс.com
User thinks ‘abc’
Really it’s Cyrillic script
www.ігѕ.gov
User thinks ‘irs’
Really it’s Cyrillic script
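A rough sketch of mixed-script detection for the three confusable classes above. Python's standard library does not expose the Unicode Script property, so this approximates a letter's script by the first word of its character name; that shortcut is an assumption made for illustration only, and a real implementation would use the Script property data (for example through ICU).

```python
import unicodedata


def letter_scripts(label: str) -> set:
    """Approximate the set of scripts used by the letters in a label."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            # e.g. "GREEK LUNATE SIGMA SYMBOL" -> "GREEK"
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def is_mixed_script(label: str) -> bool:
    """True when the letters of a label come from more than one script."""
    return len(letter_scripts(label)) > 1


for label in ("facebook", "fa\u03f2ebook", "\u0251pple", "\u0430\u042c\u0441"):
    print(label, sorted(letter_scripts(label)), is_mixed_script(label))
```

Only the mixed-script case is flagged. The single-script label (Latin alpha) and the whole-script label (all Cyrillic) pass this check, which is why confusable detection needs more than a script test.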
Attack Vectors: IDN homograph attacks
Browsers whitelist .ORG
Others don’t necessarily, but…
• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes
  í looks like i
www.mozilla.org is not www.mozílla.org
i: Latin U+0069
í: Latin U+00ED
Attack Vectors: IDN Syntax Spoofing with / lookalikes
(This case doesn’t work anymore)
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS: U+FF0F

(Normalized to a U+002F)
http://www.google.com/path/file.nottrusted.org
SOLIDUS: U+002F

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes
╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
(However punctuation not required…)
httpwwwgooglecom
Katakana Dot: U+FF65

Attack Vectors: IDN Syntax Spoofing with / lookalikes
(However punctuation not required…)
http://www.google.comﾉpathﾉfile.nottrusted.org
Katakana No: U+FF89

Browser sees and displays a valid IDN
DNS sees Punycode

DEMO: IDN Visual Spoofing
IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
  – Detects Confusables
  – Detects Invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
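A minimal sketch of an invisibles check, limited to the three code points named on the slide. The `find_invisibles` helper and the fixed table are illustrative assumptions; a fuller detector would cover many more format and zero-width characters.

```python
# The three invisible characters listed on the slide.
INVISIBLES = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u180e": "MONGOLIAN VOWEL SEPARATOR",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE",
}


def find_invisibles(text: str) -> list:
    """Return (index, code point, name) for each listed invisible character."""
    return [
        (i, f"U+{ord(ch):04X}", INVISIBLES[ch])
        for i, ch in enumerate(text)
        if ch in INVISIBLES
    ]


print(find_invisibles("pay\u200bpal"))
```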
Attack Vectors: Visual Spoofing, Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
ō̄ looks like U+006F U+0304
ō̄ is U+006F U+0304 U+0304
• Order of combining marks
  – ȏ and ö under NFKC
<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
<U+006F U+0311 U+0308> becomes <U+020F U+0308>
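Both combining-mark tricks can be reproduced with `unicodedata`. The sketch below normalizes the two orderings from the slide and flags a repeated mark; `has_repeated_mark` is a hypothetical helper and catches only immediately repeated identical marks.

```python
import unicodedata


def nfkc(text: str) -> str:
    return unicodedata.normalize("NFKC", text)


def has_repeated_mark(text: str) -> bool:
    """True if the same combining mark appears twice in a row."""
    return any(
        a == b and unicodedata.combining(a) != 0
        for a, b in zip(text, text[1:])
    )


diaeresis_first = "o\u0308\u0311"   # o + diaeresis + inverted breve
breve_first = "o\u0311\u0308"       # o + inverted breve + diaeresis

for s in (diaeresis_first, breve_first):
    print([f"U+{ord(c):04X}" for c in nfkc(s)])
print(has_repeated_mark("o\u0304\u0304"))
```

Because both marks share the same combining class, normalization does not reorder them, so the two sequences stay distinct after NFKC even though they can render alike.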
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  – “These characters are to be avoided wherever possible, because of security concerns”
  – Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
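Detecting the two override characters is straightforward with the bidirectional class that `unicodedata` reports. A small sketch; the `bidi_overrides` helper name and the sample string are illustrative assumptions.

```python
import unicodedata

# Bidi classes of U+202D (LRO) and U+202E (RLO).
OVERRIDE_CLASSES = {"LRO", "RLO"}


def bidi_overrides(text: str) -> list:
    """Return the code points of any explicit directional overrides in text."""
    return [
        f"U+{ord(ch):04X}"
        for ch in text
        if unicodedata.bidirectional(ch) in OVERRIDE_CLASSES
    ]


print(bidi_overrides("invoice\u202etxt.exe"))
```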
Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
U+2032 PRIME

Best-fit mappings: Windows best-fit P/Invoke
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)
How can we best-fit the s character?
• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)
To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

Case Study: Social Networking (Best-fit mappings)
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: Best-fit mappings
-moz-binding()
was not allowed, but…
-[U+ff4d]oz-binding()
would best-fit map
Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: Filter evasion, enable code execution

NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

İ (U+0130) becomes I + combining dot above (U+0049 U+0307)

But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
U+FE64 becomes U+003C
toNFKC(“﹤script>”) = “<script>”

Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC is a first defense against visual spoofing
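The ordering problem can be shown in a few lines. This is a sketch under stated assumptions: `is_allowed` is a hypothetical validator that simply rejects angle brackets, standing in for whatever filtering an application actually does.

```python
import unicodedata


def is_allowed(text: str) -> bool:
    """Hypothetical validator: reject anything containing angle brackets."""
    return "<" not in text and ">" not in text


def validate_then_normalize(text: str) -> str:
    """Unsafe order: the check runs before NFKC introduces new characters."""
    if not is_allowed(text):
        raise ValueError("rejected")
    return unicodedata.normalize("NFKC", text)


def normalize_then_validate(text: str) -> str:
    """Safer order: validate the string in the form that will actually be used."""
    normalized = unicodedata.normalize("NFKC", text)
    if not is_allowed(normalized):
        raise ValueError("rejected")
    return normalized


payload = "\ufe64script\ufe65"   # SMALL LESS-THAN SIGN ... SMALL GREATER-THAN SIGN
print(validate_then_normalize(payload))
```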
Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
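The C0 A7 example can be worked through by hand. In the sketch below, `lenient_two_byte_decode` is a deliberately naive decoder written only to show how an implementation that skips the shortest-form check recovers 0x27; Python's own UTF-8 codec is used as the strict validator.

```python
def lenient_two_byte_decode(b1: int, b2: int) -> str:
    """Naive two-byte decode that skips the shortest-form check (do not use)."""
    return chr(((b1 & 0x1F) << 6) | (b2 & 0x3F))


def decode_utf8_strict(data: bytes) -> str:
    """Strict decode: raises UnicodeDecodeError on overlong sequences."""
    return data.decode("utf-8")


overlong_apostrophe = bytes([0xC0, 0xA7])

print(repr(lenient_two_byte_decode(0xC0, 0xA7)))
try:
    decode_utf8_strict(overlong_apostrophe)
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```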
Attack Vectors: Directory traversal
How many ways can you say

Attack Vectors
• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8, U+002F: httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
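The normalization-based test cases above can be checked with `unicodedata`. A minimal sketch follows; `contains_traversal` is a hypothetical check, and the sample paths are made up for illustration rather than reconstructed from the slide's URLs.

```python
import unicodedata


def contains_traversal(path: str) -> bool:
    """Check for a '..' segment after NFKC normalization."""
    normalized = unicodedata.normalize("NFKC", path)
    return ".." in normalized.split("/")


fullwidth_solidus = "\uff0f"   # UTF-8: EF BC 8F
division_slash = "\u2215"      # UTF-8: E2 88 95 (a best-fit case; NFKC leaves it alone)
two_dot_leader = "\u2025"      # UTF-8: E2 80 A5

for name, ch in (("U+FF0F", fullwidth_solidus), ("U+2215", division_slash), ("U+2025", two_dot_leader)):
    print(name, ch.encode("utf-8").hex(), repr(unicodedata.normalize("NFKC", ch)))

print(contains_traversal("root/\u2025/system"))
```

A check that looks for ".." before normalization sees none of these, which is the filter-evasion point of the test cases.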
Root Causes: Handling the Unexpected
• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, enable code execution
Over-consuming ill-formed byte sequences
March 2009 copy 2009 Chris Weber
bull End users
bull Applications
bull Databases
bull Programming languages
bull Operating Systems
wwwcasabasecuritycom
Unicode Crash CourseThe Unicode Attack Surface
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash CourseUnthink it
March 2009 copy 2009 Chris Weber
bull A large and complex standard
Unicode Crash Course
code pointsencodingscategorizationnormalizationbinary propertiescase mappingconversion tablesbi-directional properties
canonical mappingsdecomposition typescase foldingbest-fit mapping17 planesprivate use rangesscript blocks
escapings
Unicode Crash Course
Glyph
Encoding
Properties
Code point
Block Script
Plane
A
UTF-8 UTF-16 UTF-32
Hex Uppercase etc
U+0041
Basic Latin Latin
Basic Multilingual Plane(BMP)
wwwcasabasecuritycom
March 2009 copy 2009 Chris Weber
bull Unicode 51 uses a 21-bit scalar value with space for over 1100000 code points
U+0000 to U+10FFFF
wwwcasabasecuritycom
Unicode Crash CourseCode points
March 2009 copy 2009 Chris Weber
A = U+0041
Every character has a unique number represented by a hex value
wwwcasabasecuritycom
Unicode Crash CourseCode Points
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash Course
AU+0041
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash Course
ſU+017F
Unicode Crash Course: Code points
• The full 21-bit range is not actually available
U+0000 to U+D7FF and
U+E000 to U+10FFFF
What's up with U+D800 to U+DFFF?

Unicode Crash Course: UTF-16 Surrogate Pairs
U+101D1
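The surrogate-pair arithmetic for a supplementary code point such as U+101D1 can be sketched by hand. This is a minimal illustration, not any library's API: the function name `to_surrogate_pair` is hypothetical, and the result is cross-checked against Python's own UTF-16 encoder.

```python
def to_surrogate_pair(code_point: int) -> tuple:
    """Split a supplementary code point (U+10000..U+10FFFF) into UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above the BMP need a surrogate pair")
    offset = code_point - 0x10000        # 20 bits remain after removing the base
    high = 0xD800 + (offset >> 10)       # top 10 bits select the high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)      # low 10 bits select the low (trail) surrogate
    return high, low


high, low = to_surrogate_pair(0x101D1)
print(f"U+101D1 -> {high:04X} {low:04X}")

# Cross-check against the built-in encoder (big-endian, no BOM).
encoded = chr(0x101D1).encode("utf-16-be")
assert encoded == high.to_bytes(2, "big") + low.to_bytes(2, "big")
```

The reserved range U+D800 to U+DFFF exists only to carry these halves, which is why a lone value from that range is not a valid character.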
Unicode Crash Course: Encodings
UTF-8
– Variable width, 1 to 4 bytes (used to be 6)
UTF-16
– Endianness
– Variable width, 2 or 4 bytes
– Surrogate pairs
UTF-32
– Endianness
– Fixed width, 4 bytes
– Fixed mapping, no algorithms needed
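To make the width differences concrete, the sketch below measures how many bytes the deck's sample characters occupy in each encoding form. The helper name `encoded_widths` is an assumption for illustration; the big-endian codec names are used so that no BOM is counted.

```python
def encoded_widths(ch: str) -> dict:
    """Bytes needed for one character in each encoding form (no BOM)."""
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-be", "utf-32-be")}


samples = {
    "U+0041": "\u0041",       # A
    "U+017F": "\u017F",       # LATIN SMALL LETTER LONG S
    "U+101D1": "\U000101D1",  # supplementary code point, needs a surrogate pair in UTF-16
}

for label, ch in samples.items():
    print(label, encoded_widths(ch))
```

Any code that sizes a buffer by character count rather than by encoded byte count is relying on the first row being typical, which it is not.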
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Exploiting Unicode-enabled Software: Overview
• Unicode crash course
• Root Causes
– Visual Spoofing and IDNs
– Best-fit mappings
– Normalization
– Overlong UTF-8
– Over-consumption
– Character substitution
– Character deletion
– Casing
– Buffer overflows
– Controlling Syntax
– Charset transformations
– Charset mismatches
• Tools
Root Causes: Visual Spoofing
• Over 100,000 assigned characters
• Many lookalikes within and across scripts
A Α А ᐱ ᗅ ᗋ ᗩ ᴀ ᴬ Ꜳ A U+10001 U+10300

Root Causes: IDN – Internationalized Domain Names
http://παράδειγμα.δοκιμή
(example.test)
Root Causes: IDN – what do the browsers do?
• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
– http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
– ORG, DE, CN to name a few
• Language settings and TLD
• Character blacklisting
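The Nameprep-then-Punycode pipeline can be tried with Python's built-in `idna` codec, which implements IDNA 2003. This is a sketch of the conversion a browser performs before a DNS lookup, not a description of any particular browser's code.

```python
name = "παράδειγμα.δοκιμή"   # the example.test name in Greek

# ToASCII: each label is run through Nameprep (NFKC plus prohibited-character
# checks) and then Punycode-encoded with the "xn--" prefix.
ascii_form = name.encode("idna")
print(ascii_form)
# The deck lists xn--hxajbheg2az3al.xn--jxalpdlp for this name.

# ToUnicode: what a browser may choose to display in the address bar.
round_trip = ascii_form.decode("idna")
print(round_trip)
```

The security question is which of the two forms the user is shown: the Punycode form is unambiguous but unreadable, while the Unicode form is readable but open to lookalikes.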
Root Causes: IDN – so what's the problem?
• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
– We're up to Unicode 5.1 (larger repertoire)

Attack Vectors: IDN homograph attacks
Some browsers allow COM IDNs based on script family
– (Latin has a big family)

Attack Vectors: IDN homograph attacks
Safari

Attack Vectors: IDN homograph attacks
Opera

Attack Vectors: IDN homograph attacks
www.google.com is not www.gooɡle.com
g: Latin U+0067
ɡ: Latin U+0261
Root Causes: Guidance for Visual Spoofing
• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
– IDNA, Stringprep
• Guidance
– Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for International Domain Names
ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations
Registries apply the guidance
– Define the allowed characters per TLD
– Collaboration with IANA
Registrars sell the domain names

Root Causes: The state of International Domain Names
ICANN guidelines v2.0
– Inclusion-based: deny-all default seems to be the right concept
– Script limitations: a script can cross many blocks. Even with limited script choices there's plenty to choose from
– Character limitations: great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing

Root Causes: The state of International Domain Names
• Registrars still allow
– Confusables
– Combining marks
– Single, Whole and Mixed-script
• Registrars can't control
– Syntax spoofing in sub domain labels
Attack Vectors: Visual spoofing Vectors
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks
rn can look like m in certain fonts
www.mullets.com is not www.rnullets.com
m: Latin U+006D
rn: Latin U+0072 U+006E

Attack Vectors: Non-Unicode homograph attacks
Are you using mono-width fonts?
0 and O
1 and l
5 and S

Attack Vectors: Non-Unicode homograph attacks
Classic long URLs
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs
The Confusables
– Single script
– Mixed script
– Whole script
Attack Vectors: Single-script and The Confusables
www.ɑpple.com
User thinks 'a'
Really it's Latin small letter Alpha 'ɑ'
www.looĸout.net
User thinks 'k'
Really it's Latin letter kra 'ĸ'

Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com
User thinks 'o'
Really it's Thai digit zero '๐'
www.faϲebook.com
User thinks 'c'
Really it's Greek lunate sigma symbol 'ϲ'
www.Ꮐoogle.com
Really it's Cherokee letter Nah 'Ꮐ'

Attack Vectors: Whole-script and The Confusables
www.аЬс.com
User thinks 'abc'
Really it's Cyrillic script
www.ігѕ.gov
User thinks 'irs'
Really it's Cyrillic script
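A rough mixed-script check for the examples above can be sketched with the standard library alone. This is a heuristic under a loud assumption: the script is approximated by the first word of each character's Unicode name, whereas a production check would use the Script property and the confusables data published by the Unicode Consortium. The function names are hypothetical.

```python
import unicodedata


def scripts_in(label: str) -> set:
    """Approximate the set of scripts in a label from Unicode character names."""
    found = set()
    for ch in label:
        if ch.isascii() and not ch.isalpha():
            continue  # ASCII digits, hyphens and dots are treated as script-neutral
        if ch.isalnum():
            # Heuristic: "GREEK LUNATE SIGMA SYMBOL" -> "GREEK"
            found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found


def is_mixed_script(label: str) -> bool:
    return len(scripts_in(label)) > 1


for label in ("google", "fa\u03F2ebook", "g\u0E50\u0E50gle", "\u0430\u042C\u0441", "goo\u0261le"):
    print(ascii(label), sorted(scripts_in(label)), is_mixed_script(label))
```

Only the mixed-script cases are flagged. The whole-script Cyrillic label and the single-script Latin lookalike pass this test, which is why confusable detection needs more than a script count.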
Attack Vectors: IDN homograph attacks
Browsers whitelist ORG

Attack Vectors: IDN homograph attacks
Others don't necessarily, but…

Attack Vectors: IDN homograph attacks
• ORG is whitelisted
– Limited characters available
• To unscrutinizing eyes
í looks like i

Attack Vectors: IDN homograph attacks
www.mozilla.org is not www.mozílla.org
i: Latin U+0069
í: Latin U+00ED
Attack Vectors: IDN Syntax Spoofing with / lookalikes
(This case doesn't work anymore)
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F

Attack Vectors: IDN Syntax Spoofing with / lookalikes
(Normalized to a U+002F)
http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes
╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
(However punctuation not required…)
httpwwwgooglecom
Katakana Dot U+FF65

Attack Vectors: IDN Syntax Spoofing with / lookalikes
(However punctuation not required…)
http://www.google.comノpathノfile.nottrusted.org
Katakana No U+FF89

Attack Vectors: IDN Syntax Spoofing with / lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
IDN Visual Spoofing
DEMO

IDN Visual Spoofing: Solutions and Defenses (yes there is one)
• Visual Spoofing Detection API
– Detects Confusables
– Detects Invisibles
– Detects syntax and punctuation lookalikes
– Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009
Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
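A detector for the three invisibles listed on this slide can be as simple as a lookup table. The sketch below covers only those three code points; the name `find_invisibles` is hypothetical, and a fuller list would come from the Unicode data files.

```python
INVISIBLES = {
    "\u200B": "ZERO WIDTH SPACE",
    "\u180E": "MONGOLIAN VOWEL SEPARATOR",
    "\uFEFF": "ZERO WIDTH NO-BREAK SPACE",
}


def find_invisibles(text: str) -> list:
    """Return (index, name) for every listed invisible character in the text."""
    return [(i, INVISIBLES[ch]) for i, ch in enumerate(text) if ch in INVISIBLES]


print(find_invisibles("pay\u200Bpal"))   # one hit in the middle of the word
print(find_invisibles("paypal"))         # no hits
```

Two strings that render identically but differ by one of these characters compare unequal, so the check belongs wherever a human will later be asked to trust what is displayed.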
Attack Vectors: Visual Spoofing – Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org

Attack Vectors: Visual Spoofing – Problematic Font-rendering
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)
Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
ō looks like U+006F U+0304
ō̄ is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks
• Order of combining marks
– ȏ and ö under NFKC
<U+006F U+0308 U+0311> → <U+00F6 U+0311>
<U+006F U+0311 U+0308> → <U+020F U+0308>
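The ordering effect can be reproduced with `unicodedata.normalize`. Both marks have the same combining class, so normalization does not reorder them, and the base letter composes with whichever mark comes first. The helper `code_points` is an illustrative name.

```python
import unicodedata


def code_points(text: str) -> str:
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


diaeresis_first = "o\u0308\u0311"   # o, COMBINING DIAERESIS, COMBINING INVERTED BREVE
breve_first = "o\u0311\u0308"       # same marks in the opposite order

nfkc_a = unicodedata.normalize("NFKC", diaeresis_first)
nfkc_b = unicodedata.normalize("NFKC", breve_first)

print(code_points(nfkc_a))
print(code_points(nfkc_b))
print("equal after NFKC:", nfkc_a == nfkc_b)
```

Because the two results stay distinct, normalization alone will not collapse strings that differ only in the order of such marks, even if a font draws them almost identically.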
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
– "These characters are to be avoided wherever possible because of security concerns"
– Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
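The two override characters can be detected by their bidirectional class rather than by hard-coded code points. This is a small sketch using `unicodedata.bidirectional`; the function name and the sample file name are made up for illustration.

```python
import unicodedata

# Bidi classes of U+202D (LRO) and U+202E (RLO).
OVERRIDE_CLASSES = {"LRO", "RLO"}


def has_bidi_override(text: str) -> bool:
    """True if the text contains an explicit directional override."""
    return any(unicodedata.bidirectional(ch) in OVERRIDE_CLASSES for ch in text)


# Hypothetical file name: the override makes the tail display reversed.
suspicious = "photo\u202Egpj.exe"
print(has_bidi_override(suspicious))
print(has_bidi_override("photo.jpg"))
```

Rejecting input that contains these classes is the simplest policy where right-to-left text is not expected at all; applications that do handle such text need a narrower rule.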
Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
U+2032 PRIME

Windows best-fit P/Invoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)
How can we best-fit the s character?
• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)
To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
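The guidance amounts to failing closed when a character has no exact counterpart in the target charset. The sketch below shows that behaviour in Python, whose built-in codecs do not perform best-fit substitution; it illustrates the policy, not the Windows or .NET APIs named above. The function name `to_legacy` and the choice of cp1252 are assumptions.

```python
def to_legacy(text: str, charset: str = "cp1252") -> bytes:
    """Convert to a legacy charset, raising instead of substituting a lookalike."""
    return text.encode(charset, errors="strict")


print(to_legacy("-moz-binding"))

for candidate in ("\u03C3", "-\uFF4Doz-binding"):   # sigma, FULLWIDTH LATIN SMALL LETTER M
    try:
        to_legacy(candidate)
    except UnicodeEncodeError as exc:
        print("rejected:", ascii(candidate), "-", exc.reason)
```

A converter that silently turned the fullwidth letter into a plain m would hand the downstream parser a string the filter never saw.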
Case Study, Social Networking: Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
– Root Cause: best-fit mappings

Case Study, Social Networking: Best-fit mappings
-moz-binding()
was not allowed, but…
-[U+ff4d]oz-binding()
would best-fit map
Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution

Root Causes: Normalization
NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization
İ becomes I + combining dot above
U+0130 → U+0049 U+0307

Root Causes: Normalization
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
U+FE64 → U+003C

Root Causes: Normalization
﹤ becomes <
toNFKC("﹤script>") = "<script>"
U+FE64 → U+003C

Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC: first defense against Visual spoofing
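The difference between the two orderings can be shown with a toy filter. The validator below is deliberately simplistic and hypothetical; the point is only where the `unicodedata.normalize` call sits relative to it.

```python
import unicodedata


def is_blocked(fragment: str) -> bool:
    """Toy validator: block anything that contains an opening script tag."""
    return "<script" in fragment.lower()


payload = "\uFE64script>"   # starts with U+FE64 SMALL LESS-THAN SIGN, not U+003C

# Wrong order: validate first, normalize afterwards.
passed_filter = not is_blocked(payload)
after_normalization = unicodedata.normalize("NFKC", payload)
print(passed_filter, after_normalization)

# Right order: normalize first, then validate what will actually be used.
normalized = unicodedata.normalize("NFKC", payload)
print(is_blocked(normalized))
```

In the wrong order the filter approves a string that only becomes markup later; in the right order the filter sees the same text the parser will see.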
Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
– Generation of non-shortest form
– Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
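Why C0 A7 turns into 27 is easiest to see by decoding the two bytes without the shortest-form check. The first function below is deliberately unsafe and exists only to show the arithmetic; the second relies on Python's strict UTF-8 decoder, which rejects the sequence. Both function names are illustrative.

```python
def naive_two_byte_decode(lead: int, trail: int) -> int:
    """Decode a 2-byte sequence WITHOUT the shortest-form check (deliberately unsafe)."""
    return ((lead & 0x1F) << 6) | (trail & 0x3F)


def decode_strict(raw: bytes) -> str:
    """Decode UTF-8, raising UnicodeDecodeError on ill-formed input."""
    return raw.decode("utf-8", errors="strict")


print(hex(naive_two_byte_decode(0xC0, 0xA7)))   # the apostrophe's code point

try:
    decode_strict(b"\xC0\xA7")
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```

A filter that looks for the byte 27 never sees it in C0 A7, so every layer must reject the overlong form rather than quietly interpret it.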
Attack Vectors: Directory traversal
How many ways can you say ../ ?

Attack Vectors
• Directory traversal test cases
– httpsiterootsystem
– Overlong UTF-8 of U+002E: httpsiterootC0AEsystem
– Full-width Solidus U+FF0F (normalized under KC or KD): httpsiteroot EFBC8Fsystem
– Division Slash U+2215 (best-fit): httpsiteroot E28895system
– Two dot leader U+2025 (normalized under KC or KD): httpsiterootE280A5 E280A5 system
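The byte sequences in these test cases, and which of them normalization would turn into traversal syntax, can be checked with a few lines. This is a sketch for building test inputs; the helper name `describe` is hypothetical.

```python
import unicodedata

CANDIDATES = {
    "FULLWIDTH SOLIDUS": "\uFF0F",
    "DIVISION SLASH": "\u2215",
    "TWO DOT LEADER": "\u2025",
}


def describe(ch: str) -> tuple:
    """Return (UTF-8 bytes as hex, NFKC result) for one character."""
    utf8_hex = ch.encode("utf-8").hex().upper()
    nfkc = unicodedata.normalize("NFKC", ch)
    return utf8_hex, nfkc


for label, ch in CANDIDATES.items():
    utf8_hex, nfkc = describe(ch)
    print(f"{label}: UTF-8 {utf8_hex}, NFKC -> {ascii(nfkc)}")
```

The division slash is left unchanged by NFKC, which matches the slide: it becomes a solidus only through a best-fit charset conversion, so each test case targets a different transformation layer.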
Root Causes: Handling the Unexpected
• Unassigned code points
– U+2073
• Illegal code points
– Half a surrogate pair
• Code points with special meaning
– U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected – Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes
<41 41>

Root Causes: Handling the Unexpected – Over-consumption
<img src=[0xC2]> onerror=alert(1)<br >
becomes
<img src=> onerror=alert(1)<br >
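The byte sequence 41 C2 3E 41 can be run through two decoders to show the difference. The first is a deliberately flawed toy that reproduces over-consumption for 2-byte lead bytes only; the second is Python's built-in decoder with replacement. The toy decoder is an illustration, not a model of any specific product.

```python
def over_consuming_decode(raw: bytes) -> str:
    """Flawed toy decoder: a 2-byte lead swallows the next byte even if it is invalid."""
    out = []
    i = 0
    while i < len(raw):
        byte = raw[i]
        if byte < 0x80:
            out.append(chr(byte))
            i += 1
        elif 0xC2 <= byte <= 0xDF:
            trail = raw[i + 1] if i + 1 < len(raw) else None
            if trail is not None and 0x80 <= trail <= 0xBF:
                out.append(chr(((byte & 0x1F) << 6) | (trail & 0x3F)))
            # The flaw: two bytes are skipped even when the trail byte was invalid.
            i += 2
        else:
            i += 1  # other lead bytes are out of scope for this sketch
    return "".join(out)


raw = bytes([0x41, 0xC2, 0x3E, 0x41])
flawed = over_consuming_decode(raw)
safe = raw.decode("utf-8", errors="replace")
print(ascii(flawed))
print(ascii(safe))
```

The flawed decoder loses the 3E byte, which is the closing angle bracket in the markup example; the built-in decoder replaces only the bad lead byte and keeps the bracket.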
Root Causes: Handling the Unexpected – Character-substitution
Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected – Character-deletion
"deletion of noncharacters" (UTR-36)

Root Causes: Handling the Unexpected – Character-deletion
<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
– A common alternative is '?', which can be safe
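Deleting an unexpected character and replacing it with U+FFFD lead to very different results for the script-tag example. The sketch below uses a toy filter and hypothetical helper names to compare the two.

```python
BOM = "\uFEFF"
REPLACEMENT = "\uFFFD"


def is_blocked(text: str) -> bool:
    """Toy filter: block an opening script tag."""
    return "<script" in text.lower()


def delete_bom(text: str) -> str:
    """Unsafe clean-up: silently removes the character."""
    return text.replace(BOM, "")


def replace_bom(text: str) -> str:
    """Safer clean-up: keeps a visible marker where the character was."""
    return text.replace(BOM, REPLACEMENT)


payload = "<scr\uFEFFipt>"
print(is_blocked(payload))               # the filter does not match the split tag
print(is_blocked(delete_bom(payload)))   # deletion reassembles the tag after filtering
print(is_blocked(replace_bom(payload)))  # replacement keeps the tag broken
```

Deletion is dangerous precisely because it happens after the check; replacement or outright rejection leaves nothing for a later stage to reassemble.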
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
– E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root Cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier:
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

Safari BOM injection for XSS
DEMO

A Closer Look: The BOM
BOM U+FEFF
Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing
toLower("İ") == "i"
toLower("scrİpt") == "script"

Root Causes: Casing
len(x) != len(toLower(x))

Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
– ICU, .NET
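Casing behaviour differs between frameworks, so it is worth checking what a given runtime actually does. The sketch below uses Python's `str.lower()`, which applies the full Unicode case mapping and therefore lengthens the string rather than producing a plain i as in the slide's toLower example; the variable names are illustrative.

```python
raw = "scr\u0130pt"          # contains U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE
lowered = raw.lower()

print(len(raw), len(lowered))                     # the lengths differ after lowercasing
print([f"U+{ord(ch):04X}" for ch in lowered])     # i is followed by U+0307


def is_blocked(text: str) -> bool:
    """Toy filter, applied AFTER the casing operation as the guidance recommends."""
    return "script" in text


print(is_blocked(lowered))
```

Whatever the runtime produces, the filter must run on the post-casing string, and any buffer must be sized for the post-casing length.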
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows
Casing: maximum expansion factors
Operation    UTF          Factor   Sample
Lower        8            1.5      Ⱥ U+023A
Lower        16, 32       1        A U+0041
Upper        8, 16, 32    3        ΐ U+0390
Source: Unicode Technical Report 36

Root Causes: Buffer Overflows
Normalization: maximum expansion factors
Operation    UTF          Factor   Sample
NFC          8            3X       U+1D160
NFC          16, 32       3X       ש U+FB2C
NFD          8            3X       ΐ U+0390
NFD          16, 32       4X       ᾂ U+1F82
NFKC/NFKD    8            11X      ﷺ U+FDFA
NFKC/NFKD    16, 32       18X      ﷺ U+FDFA
Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
– ICU, .NET
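Several of the expansion factors in these tables can be reproduced directly, which is a useful sanity check before sizing any buffer. The helper `utf8_factor` is a hypothetical name; the samples are the ones listed in the tables.

```python
import unicodedata


def utf8_factor(before: str, after: str) -> float:
    """Ratio of UTF-8 byte lengths after and before an operation."""
    return len(after.encode("utf-8")) / len(before.encode("utf-8"))


lower_sample = "\u023A"                                    # Ⱥ
upper_sample = "\u0390"                                    # ΐ
nfkc_sample = "\uFDFA"                                     # ﷺ

lowered = lower_sample.lower()
uppered = upper_sample.upper()
normalized = unicodedata.normalize("NFKC", nfkc_sample)

print("lower U+023A:", utf8_factor(lower_sample, lowered))
print("upper U+0390:", len(uppered), "code points,", utf8_factor(upper_sample, uppered))
print("NFKC  U+FDFA:", len(normalized), "code points,", utf8_factor(nfkc_sample, normalized))
```

One input code point becoming eighteen is the worst case to plan for: a destination buffer sized from the input length will overflow unless the operation reports the required size first.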
Root Causes: Controlling Syntax
• White space and line breaks
– E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera
• Unicode formatter characters exploited for XSS
– Damage: Filter evasion, controlling syntax
– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
– Root Cause: Interpreting "white space"
– A problem with the HTML 4.0 spec

Case Study: Opera
<a href=[U+180E]onclick=alert()>

Opera White Space Characters
DEMO

Case Study: Opera
MVS U+180E

Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…
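One way to be careful is to flag any separator, format, or control character in markup that is not plain ASCII white space, since a parser may treat it as a delimiter. The sketch below uses Unicode general categories; the allow-list and the function name are assumptions. U+180E is flagged whether the runtime's Unicode database classifies it as a space separator or as a format character.

```python
import unicodedata

ASCII_WHITESPACE = {" ", "\t", "\n", "\r", "\f"}
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}   # separators, format, control


def unexpected_separators(markup: str) -> list:
    """Return (index, code point) for characters a parser might treat as white space."""
    hits = []
    for i, ch in enumerate(markup):
        if ch in ASCII_WHITESPACE:
            continue
        if unicodedata.category(ch) in SUSPECT_CATEGORIES:
            hits.append((i, f"U+{ord(ch):04X}"))
    return hits


print(unexpected_separators("<a href=\u180Eonclick=alert()>"))
print(unexpected_separators("<a href=x>"))
```

Checking by category rather than by a fixed list means newly assigned format characters are caught without updating the filter.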
Root Causes: Specifications
1) Character stability
– IDNA/Nameprep based on Unicode 3.2
2) Designs
– Specs are carefully designed but not always perfect
• This could have been a problem
– "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
– HTML 4.01
• Defines four whitespace characters and explicitly leaves handling other characters up to the implementer
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
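"Force UTF-8, error if uncertain" can be sketched as a small gatekeeper that refuses anything other than a consistent UTF-8 declaration and then decodes strictly. The function name `decode_body` and its parameters are hypothetical; how the two declared charsets are extracted from the header and the meta tag is left out.

```python
from typing import Optional


def decode_body(raw: bytes, header_charset: Optional[str], meta_charset: Optional[str]) -> str:
    """Decode a response body only if every declared charset is UTF-8."""
    declared = {c.strip().lower() for c in (header_charset, meta_charset) if c}
    if declared - {"utf-8"}:
        # Conflicting or non-UTF-8 declarations: fail instead of sniffing.
        raise ValueError(f"refusing charset declarations: {sorted(declared)}")
    return raw.decode("utf-8", errors="strict")


print(decode_body("caf\u00E9".encode("utf-8"), "UTF-8", None))

try:
    decode_body(b"...", "ISO-8859-1", "shift_jis")
except ValueError as exc:
    print("rejected:", exc)
```

Refusing to guess removes the attacker's lever: there is no sniffing step to steer toward a charset in which the filtered bytes mean something else.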
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools
• Watcher
– Web-app security testing and auditing
• Visual Spoofing Detection API
– Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API
• Problem
– Unicode enables visual-spoofing-maximus
• Solution
– Confusable detection
– Invisibles detection
– Syntax spoof detection
– more

Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
– Browsers
– Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net, Casaba Security
www.casabasecurity.com
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash CourseUnthink it
March 2009 copy 2009 Chris Weber
bull A large and complex standard
Unicode Crash Course
code pointsencodingscategorizationnormalizationbinary propertiescase mappingconversion tablesbi-directional properties
canonical mappingsdecomposition typescase foldingbest-fit mapping17 planesprivate use rangesscript blocks
escapings
Unicode Crash Course
Glyph
Encoding
Properties
Code point
Block Script
Plane
A
UTF-8 UTF-16 UTF-32
Hex Uppercase etc
U+0041
Basic Latin Latin
Basic Multilingual Plane(BMP)
wwwcasabasecuritycom
March 2009 copy 2009 Chris Weber
bull Unicode 51 uses a 21-bit scalar value with space for over 1100000 code points
U+0000 to U+10FFFF
wwwcasabasecuritycom
Unicode Crash CourseCode points
March 2009 copy 2009 Chris Weber
A = U+0041
Every character has a unique number represented by a hex value
wwwcasabasecuritycom
Unicode Crash CourseCode Points
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash Course
AU+0041
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash Course
ſU+017F
March 2009 copy 2009 Chris Weber
bull The full 21-bit range is not actually available
U+0000 to U+D7FF and
U+E000 to U+10FFF
whatrsquos up with U+D800U+DFFF
wwwcasabasecuritycom
Unicode Crash CourseCode points
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash CourseUTF-16 Surrogate Pairs
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash CourseUTF-16 Surrogate Pairs
U+101D1
March 2009 copy 2009 Chris Weber
UTF-8 ndash variable width 1 to 4 bytes (used to be 6)
UTF-16ndash Endianessndash Variable width 2 or 4 bytesndash Surrogate pairs
UTF-32ndash Endianessndash Fixed width 4 bytesndash Fixed mapping no algorithms needed
wwwcasabasecuritycom
Unicode Crash CourseEncodings
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash coursebull Root Causes
ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareOverview
March 2009 copy 2009 Chris Weber
bull Over 100000 assigned characters
bull Many lookalikes within and across scripts
AΑАᐱᗅᗋᗩᴀᴬꜲA6553766304
wwwcasabasecuritycom
Root CausesVisual Spoofing
March 2009 copy 2009 Chris Weber
httpπαράδειγμαδοκιμή
(exampletest)
wwwcasabasecuritycom
Root CausesIDN ndash Internationalized Domain Names
March 2009 copy 2009 Chris Weber
bull IDNA 2003
bull Nameprep (NFKC and prohibit)
bull Punycodendash httpxn--hxajbheg2az3alxn--jxalpdlp
bull Whitelist TLDrsquosndash ORG DE CN to name a few
bull Language settings and TLD
bull Character blacklisting
wwwcasabasecuritycom
Root CausesIDN ndash what do the browsers do
March 2009 copy 2009 Chris Weber
bull Divergent browser implementations
bull Confusables exist
bull IDNA and Nameprep based on Unicode 32
ndash Wersquore up to Unicode 51 (larger repertoire)
wwwcasabasecuritycom
Root CausesIDN ndash so whatrsquos the problem
March 2009 copy 2009 Chris Weber
Some browsers allow COM IDNrsquos
based on script family
ndash (Latin has a big family)
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Safari
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Opera
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwgooglecom is not wwwgooɡlecom
Latin U+0069
LatinU+0261
gɡ
March 2009 copy 2009 Chris Weber
bull Normalize with NFKC
bull Homograph and Confusables detection
bull Specifications
ndash IDNA Stringprep
bull Guidance
ndash Unicode Consortium ICANN IETF IANA
wwwcasabasecuritycom
Root CausesGuidance for Visual Spoofing
March 2009 copy 2009 Chris Weber
ICANN guidelines v20
ndash Inclusion-based
ndash Script limitations
ndash Character limitations
Registries apply the guidance
ndash define the allowed characters per TLD
ndash Collaboration with IANA
Registrars sell the domain names
wwwcasabasecuritycom
Root CausesGuidance for International Domain Names
March 2009 copy 2009 Chris Weber
ICANN guidelines v20
ndash Inclusion-based
ndash Script limitations
ndash Character limitations
wwwcasabasecuritycom
Root CausesThe state of International Domain Names
Deny-all default seems to be the right concept
A script can cross many blocks Even with limited script choices therersquos plenty to choose from
Great for domain labels but sub domain labels still open to punctuation and syntax spoofing
March 2009 copy 2009 Chris Weber
bull Registrars still allow
ndash Confusables
ndash Combining marks
ndash Single Whole and Mixed-script
bull Registrars canrsquot control
ndash Syntax spoofing in sub domain labels
wwwcasabasecuritycom
Root CausesThe state of International Domain Names
March 2009 copy 2009 Chris Weber
bull Non-Unicode attacks
bull Confusables
bull Invisibles
bull Problematic font-rendering
bull Manipulating Combining Marks
bull Bidi and syntax spoofing
wwwcasabasecuritycom
Attack VectorsVisual spoofing Vectors
March 2009 copy 2009 Chris Weber
rn can look like m in certain fonts
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
wwwmulletscom is not wwwrnulletscom
Latin U+006D
LatinU+0073 U+006E
March 2009 copy 2009 Chris Weber
Are you using mono-width fonts
0 and O
1 and l
5 and S
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
Classic long URLrsquos
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
The Confusables
ndash Single script
ndash Mixed script
ndash Whole script
wwwcasabasecuritycom
Attack VectorsDefining Homographs
March 2009 copy 2009 Chris Weber
wwwɑpplecom User thinks lsquoarsquo
Really itrsquos Latin small letter Alpha lsquoɑrsquo
wwwlooĸoutnet
User thinks lsquokrsquo
Really itrsquos Latin letter kra lsquoĸrsquo
wwwcasabasecuritycom
Attack VectorsSingle-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwg๐๐glecom User thinks lsquoorsquo
Really itrsquos Thai digit zero lsquo๐rsquo
wwwfaϲebookcom
User thinks lsquocrsquo
Really itrsquos Greek lunate sigma symbol lsquocrsquo
wwwᏀooglecom
Really itrsquos Cherokee letter Nah lsquoᏀrsquo
wwwcasabasecuritycom
Attack VectorsMixed-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwаЬсcom
User thinks lsquoabcrsquo
Really itrsquos Cyrillic script
wwwігѕgov
User thinks lsquoirsrsquo
Really itrsquos Greek script
wwwcasabasecuritycom
Attack VectorsWhole-script and The Confusables
March 2009 copy 2009 Chris Weber
Browsers whitelist ORG
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Others donrsquot necessarily buthellip
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
bull ORG is whitelisted
ndash Limited characters available
bull To unscrutinizing eyes
iacute looks like i
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwmozillaorg is not wwwmoziacutellaorg
Latin U+0069
LatinU+00ED
March 2009 copy 2009 Chris Weber
(This case doesnrsquot work anymore)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
FULLWIDTH SOLIDUSU+FF0F
March 2009 copy 2009 Chris Weber
(Normalized to a U+002F)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
SOLIDUSU+002F
March 2009 copy 2009 Chris Weber
U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
U+FF65 KATAKANA MIDDLE DOT
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with and lookalikes
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes
httpwwwgooglecom
Katakana DotU+FF65
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecomノpathノfilenottrustedorg
Katakana NoU+FF89
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
IDN Visual Spoofing
March 2009 copy 2009 Chris Weber
bull Visual Spoofing Detection API
ndash Detects Confusables
ndash Detects Invisibles
ndash Detections syntax and punctuation lookalikes
ndash Detects combining mark tricks
bull Currently in testing
bull Release planned for Fall 2009
wwwcasabasecuritycom
IDN Visual SpoofingSolutions and Defenses (yes there is one)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles

Attack Vectors: Visual Spoofing - Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode Courier New)

Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
ō looks like U+006F U+0304
ō̄ is U+006F U+0304 U+0304
• Order of combining marks
– ȏ and ö under NFKC
<U+006F U+0308 U+0311> → <U+00F6 U+0311>
<U+006F U+0311 U+0308> → <U+020F U+0308>
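
The ordering effect can be reproduced with Python's standard `unicodedata` module. This is a small sketch; `codepoints` is a hypothetical helper used only to print the results.

```python
import unicodedata


def codepoints(s):
    """Render a string as space-separated U+XXXX values."""
    return " ".join("U+%04X" % ord(c) for c in s)


# Same base letter and same two marks, in different order.
diaeresis_first = "o\u0308\u0311"
breve_first = "o\u0311\u0308"

nfkc_a = unicodedata.normalize("NFKC", diaeresis_first)
nfkc_b = unicodedata.normalize("NFKC", breve_first)

print(codepoints(nfkc_a))  # U+00F6 U+0311
print(codepoints(nfkc_b))  # U+020F U+0308
print(nfkc_a == nfkc_b)    # False
```

Both marks have the same combining class, so normalization does not reorder them, and the two strings stay distinct even though they may render alike.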

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
– "These characters are to be avoided wherever possible because of security concerns"
– Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
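
A minimal sketch of detecting the two explicit overrides, using the bidirectional class reported by Python's `unicodedata` module. `find_bidi_overrides` is a hypothetical helper name.

```python
import unicodedata


def find_bidi_overrides(text):
    """Return (index, character name) for each LRO or RLO in text."""
    return [
        (i, unicodedata.name(ch))
        for i, ch in enumerate(text)
        if unicodedata.bidirectional(ch) in ("LRO", "RLO")
    ]


print(find_bidi_overrides("abc\u202edef"))
```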

Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: filter evasion, enabling code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
U+2032 PRIME

Windows best-fit P/Invoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 LEFT-POINTING ANGLE BRACKET maps to U+003C
• U+3008 LEFT ANGLE BRACKET maps to U+003C
How can we best-fit the s character?
• U+015B LATIN SMALL LETTER S WITH ACUTE maps to U+0073
• U+015D LATIN SMALL LETTER S WITH CIRCUMFLEX maps to U+0073
To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
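
The guidance names .NET and Win32 APIs. As a rough analogue only, here is a Python sketch of the same idea: when a legacy charset cannot be avoided, encode strictly and fail rather than substitute a "close enough" character. `to_legacy_strict` is a hypothetical helper, and the choice of cp1252 is an assumption for illustration.

```python
def to_legacy_strict(text, charset="cp1252"):
    """Encode to a legacy charset; raise if any character has no exact mapping."""
    return text.encode(charset, errors="strict")


for sample in ("s", "\u03c3", "\u2032"):
    try:
        print(repr(sample), "->", to_legacy_strict(sample))
    except UnicodeEncodeError:
        # No silent substitution: the caller must decide what to do.
        print(repr(sample), "-> rejected, no exact mapping")
```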

Case Study: Social Networking - Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
– Attack: filter evasion, code execution
– Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
– Root Cause: best-fit mappings

-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map

Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: filter evasion, enabling code execution

NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization
İ becomes I + ◌̇
U+0130 → U+0049 U+0307

But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
U+FE64 → U+003C

﹤ becomes <
toNFKC("﹤script>") = "<script>"
U+FE64 → U+003C

Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC: first defense against Visual spoofing
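
A short sketch of why the order matters, using Python's `unicodedata`. The filter `is_safe` is a deliberately naive, hypothetical stand-in for real validation logic, and `validate` is likewise a hypothetical name.

```python
import unicodedata


def is_safe(text):
    """Naive, hypothetical filter: reject ASCII angle brackets."""
    return "<" not in text and ">" not in text


# U+FE64 SMALL LESS-THAN SIGN and U+FE65 SMALL GREATER-THAN SIGN
payload = "\ufe64script\ufe65"

# Validate first, normalize later: the filter passes the raw payload,
# and normalization then produces the markup the filter was meant to block.
print(is_safe(payload))                         # True
print(unicodedata.normalize("NFKC", payload))   # <script>


def validate(text):
    """Normalize first, then validate the normalized form."""
    normalized = unicodedata.normalize("NFKC", text)
    if not is_safe(normalized):
        raise ValueError("rejected after NFKC normalization")
    return normalized
```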

Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: filter evasion, enabling code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
– Generation of non-shortest form
– Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
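
To make the C0 A7 example concrete, the sketch below contrasts a deliberately naive two-byte decoder (hypothetical, written only to show the flaw) with Python's built-in strict UTF-8 decoder, which rejects the overlong sequence.

```python
def naive_decode_two_byte(b0, b1):
    """Flawed on purpose: applies the 2-byte bit layout without
    checking that the result needed two bytes."""
    return ((b0 & 0x1F) << 6) | (b1 & 0x3F)


overlong = b"\xc0\xa7"

# The naive decoder yields 0x27, an apostrophe.
print(hex(naive_decode_two_byte(overlong[0], overlong[1])))

# A validating decoder throws instead.
try:
    overlong.decode("utf-8")
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```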

Attack Vectors: Directory traversal
How many ways can you say

Attack Vectors
• Directory traversal test cases
– httpsiterootsystem
– Overlong UTF-8 (as written, the bytes C0 AE are the overlong form of U+002E FULL STOP; the overlong form of U+002F is C0 AF): httpsiterootC0AEsystem
– Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
– Division Slash U+2215, best-fit: httpsiteroot E28895system
– Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system

Root Causes: Handling the Unexpected
• Unassigned code points
– U+2073
• Illegal code points
– Half a surrogate pair
• Code points with special meaning
– U+FEFF is the BOM
• Impact: filter evasion, enabling code execution

Root Causes: Handling the Unexpected - Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes
<41 41>

<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
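
The byte-level example can be reproduced with a toy decoder. `naive_decode` below is a hypothetical, intentionally buggy sketch that handles only ASCII and two-byte sequences; it is compared with Python's built-in decoder using the `replace` error handler.

```python
def naive_decode(data):
    """Toy decoder with an over-consumption bug (ASCII and 2-byte forms only)."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif 0xC2 <= b <= 0xDF:
            trail = data[i + 1] if i + 1 < len(data) else None
            if trail is not None and 0x80 <= trail <= 0xBF:
                out.append(chr(((b & 0x1F) << 6) | (trail & 0x3F)))
            # BUG: skips two bytes even when the second one is not a
            # continuation byte, so a following '>' (0x3E) is swallowed.
            i += 2
        else:
            i += 1  # other lead bytes are ignored in this sketch
    return "".join(out)


data = b"\x41\xc2\x3e\x41"
print(repr(naive_decode(data)))                       # 'AA'
print(repr(data.decode("utf-8", errors="replace")))   # 'A\ufffd>A'
```

The built-in decoder replaces only the ill-formed lead byte and leaves the `>` in place, which is the behavior the guidance asks for.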

Root Causes: Handling the Unexpected - Character-substitution
Correcting insecurely rather than failing
– Substituting a '' or a '' would be bad

Root Causes: Handling the Unexpected - Character-deletion
"deletion of noncharacters" (UTR-36)
<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
– A common alternative is '' which can be safe
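
A minimal sketch of the deletion problem and the U+FFFD alternative. The filter `passes_filter` is a hypothetical, deliberately simple stand-in for real filtering logic.

```python
BOM = "\ufeff"
REPLACEMENT = "\ufffd"


def passes_filter(text):
    """Naive, hypothetical filter: allow anything without '<script'."""
    return "<script" not in text.lower()


payload = "<scr" + BOM + "ipt>"

# The filter sees no script tag in the raw payload...
print(passes_filter(payload))                  # True
# ...but deleting the BOM afterwards creates one.
print(payload.replace(BOM, ""))                # <script>

# Substituting U+FFFD keeps the token broken instead of repairing it.
print(passes_filter(payload.replace(BOM, REPLACEMENT)))  # True
```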

Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
– E.g. cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
– Attack: filter evasion, code execution
– Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root Cause: character deletion
<a href="java[U+FEFF]script:alert('XSS')">
Can be nastier
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM
BOM, U+FEFF

Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enabling code execution

toLower("İ") == "i"
toLower("scrİpt") == "script"

len(x) != len(toLower(x))
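
The length change is easy to observe. The sketch below uses Python's `str.lower()` and `str.upper()`; note that Python lowercases U+0130 to "i" followed by U+0307, which differs from the plain "i" shown on the slide, so the exact output of a casing call should be treated as implementation-dependent.

```python
dotted_capital_i = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
lowered = dotted_capital_i.lower()

print(len(dotted_capital_i), len(lowered))   # 1 2
print([hex(ord(c)) for c in lowered])        # ['0x69', '0x307']

# Uppercasing can grow a string too.
print("\u00df".upper())                      # SS
```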

Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
– ICU, .NET

Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: enabling code execution

Root Causes: Buffer Overflows
Casing - maximum expansion factors
Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390
Source: Unicode Technical Report 36
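
Two rows of the casing table can be checked with Python's built-in case mappings. This is a sketch; `utf8_expansion` is a hypothetical helper, and the results depend on the Unicode data shipped with the interpreter.

```python
def utf8_expansion(before, after):
    """Ratio of UTF-8 byte lengths after and before an operation."""
    return len(after.encode("utf-8")) / len(before.encode("utf-8"))


a_stroke = "\u023a"        # Ⱥ, 2 bytes in UTF-8
iota_dialytika = "\u0390"  # ΐ, 2 bytes in UTF-8

print(utf8_expansion(a_stroke, a_stroke.lower()))              # 1.5
print(utf8_expansion(iota_dialytika, iota_dialytika.upper()))  # 3.0
print(len(iota_dialytika.upper()))                             # 3 code points
```

A buffer sized from the input length is therefore too small for the output in both cases.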

Root Causes: Buffer Overflows
Normalization - maximum expansion factors
Operation    UTF      Factor   Sample
NFC          8        3X       𝅘𝅥𝅮 U+1D160
NFC          16, 32   3X       שּׁ U+FB2C
NFD          8        3X       ΐ U+0390
NFD          16, 32   4X       ᾂ U+1F82
NFKC/NFKD    8        11X      ﷺ U+FDFA
NFKC/NFKD    16, 32   18X      ﷺ U+FDFA
Source: Unicode Technical Report 36
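
Several rows of the normalization table can be reproduced with `unicodedata.normalize`. This is a sketch; `expansion` is a hypothetical helper that returns the number of code points after normalization and the UTF-8 growth factor.

```python
import unicodedata


def expansion(ch, form):
    """Return (code points after normalization, UTF-8 byte growth factor)."""
    out = unicodedata.normalize(form, ch)
    factor = len(out.encode("utf-8")) / len(ch.encode("utf-8"))
    return len(out), factor


print(expansion("\ufdfa", "NFKC"))  # (18, 11.0)
print(expansion("\u1f82", "NFD"))   # 4 code points
print(expansion("\u0390", "NFD"))   # UTF-8 factor 3.0
```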

Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
– ICU, .NET

Root Causes: Controlling Syntax
• White space and line breaks
– E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enabling code execution

Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera
• Unicode formatter characters exploited for XSS
– Damage: filter evasion, controlling syntax
– Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
– Root Cause: interpreting "white space"
– A problem with HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

MVS, U+180E

Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…

Root Causes: Specifications
1) Character stability
– IDNA/Nameprep based on Unicode 3.2
2) Designs
– Specs are carefully designed but not always perfect
• This could have been a problem
– "When designing a markup language or data protocol the use of U+FEFF can be restricted to that of Byte Order Mark. In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error."
– HTML 4.01
• Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enabling code execution, data loss

Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enabling code execution

Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
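
A small sketch of what a mismatch between the two declared charsets can mean for the same bytes. The byte pair is an illustrative choice, not taken from the talk: in Shift_JIS it is one katakana letter, while in ISO-8859-1 its second byte is a backslash.

```python
raw = b"\x83\x5c"

as_shift_jis = raw.decode("shift_jis")   # one character: KATAKANA LETTER SO
as_latin1 = raw.decode("iso-8859-1")     # two characters, the second is '\'

print(repr(as_shift_jis), len(as_shift_jis))
print(repr(as_latin1), len(as_latin1))
```

A filter that reads the bytes under one charset and a browser that renders them under the other will disagree about whether a syntax-relevant character is present.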

Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools
• Watcher
– Web-app security testing and auditing
• Visual Spoofing Detection API
– Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API
• Problem
– Unicode enables visual-spoofing-maximus
• Solution
– Confusable detection
– Invisibles detection
– Syntax spoof detection
– more

Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
– Browsers
– Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you
Contact me with questions, new test cases, or ideas to share.
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.
Chris Weber, www.lookout.net
Casaba Security, www.casabasecurity.com

Unicode Crash Course
• A large and complex standard
code points, encodings, categorization, normalization, binary properties, case mapping, conversion tables, bi-directional properties, canonical mappings, decomposition types, case folding, best-fit mapping, 17 planes, private use ranges, script blocks, escapings

Unicode Crash Course
Glyph: A
Encoding: UTF-8, UTF-16, UTF-32
Properties: Hex, Uppercase, etc.
Code point: U+0041
Block, Script: Basic Latin, Latin
Plane: Basic Multilingual Plane (BMP)

Unicode Crash Course: Code points
• Unicode 5.1 uses a 21-bit scalar value with space for over 1,100,000 code points
U+0000 to U+10FFFF

Unicode Crash Course: Code Points
A = U+0041
Every character has a unique number, represented by a hex value

Unicode Crash Course
A, U+0041

Unicode Crash Course
ſ, U+017F

Unicode Crash Course: Code points
• The full 21-bit range is not actually available
U+0000 to U+D7FF and
U+E000 to U+10FFFF
What's up with U+D800 to U+DFFF?

Unicode Crash Course: UTF-16 Surrogate Pairs
U+101D1

Unicode Crash Course: Encodings
UTF-8
– Variable width, 1 to 4 bytes (used to be 6)
UTF-16
– Endianness
– Variable width, 2 or 4 bytes
– Surrogate pairs
UTF-32
– Endianness
– Fixed width, 4 bytes
– Fixed mapping, no algorithms needed
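
The encoding widths and the surrogate-pair arithmetic can be checked directly. This is a sketch; `surrogate_pair` is a hypothetical helper implementing the standard UTF-16 calculation for the U+101D1 example used in the slides.

```python
def surrogate_pair(code_point):
    """Split a supplementary code point into UTF-16 high and low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


high, low = surrogate_pair(0x101D1)
print(hex(high), hex(low))  # 0xd800 0xddd1

for ch in ("A", "\U000101D1"):
    widths = [len(ch.encode(enc)) for enc in ("utf-8", "utf-16-be", "utf-32-be")]
    print("U+%04X" % ord(ch), widths)
```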

Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Exploiting Unicode-enabled Software: Overview
• Unicode crash course
• Root Causes
– Visual Spoofing and IDNs
– Best-fit mappings
– Normalization
– Overlong UTF-8
– Over-consumption
– Character substitution
– Character deletion
– Casing
– Buffer overflows
– Controlling Syntax
– Charset transformations
– Charset mismatches
• Tools

Root Causes: Visual Spoofing
• Over 100,000 assigned characters
• Many lookalikes within and across scripts
AΑАᐱᗅᗋᗩᴀᴬꜲA6553766304

Root Causes: IDN – Internationalized Domain Names
http://παράδειγμα.δοκιμή
(example.test)

Root Causes: IDN – what do the browsers do?
• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
– http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
– .ORG, .DE, .CN to name a few
• Language settings and TLD
• Character blacklisting
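
The Punycode form on the slide can be reproduced with Python's built-in `idna` codec, which implements IDNA 2003 (Nameprep plus Punycode). This is a sketch of the conversion only; browsers add their own display policies on top.

```python
unicode_host = "παράδειγμα.δοκιμή"

ascii_host = unicode_host.encode("idna")
print(ascii_host)                 # the xn-- form that DNS sees
print(ascii_host.decode("idna"))  # back to the form a browser may display
```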

Root Causes: IDN – so what's the problem?
• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
– We're up to Unicode 5.1 (larger repertoire)

Attack Vectors: IDN homograph attacks
Some browsers allow .COM IDNs based on script family
– (Latin has a big family)

Attack Vectors: IDN homograph attacks
Safari

Attack Vectors: IDN homograph attacks
Opera

Attack Vectors: IDN homograph attacks
www.google.com is not www.gooɡle.com
Latin small letter g, U+0067
Latin small letter script g, U+0261
g ɡ

Root Causes: Guidance for Visual Spoofing
• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
– IDNA, Stringprep
• Guidance
– Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for International Domain Names
ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations
Registries apply the guidance
– Define the allowed characters per TLD
– Collaboration with IANA
Registrars sell the domain names

Root Causes: The state of International Domain Names
ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations
Deny-all default seems to be the right concept.
A script can cross many blocks. Even with limited script choices, there's plenty to choose from.
Great for domain labels, but sub-domain labels are still open to punctuation and syntax spoofing.

Root Causes: The state of International Domain Names
• Registrars still allow
– Confusables
– Combining marks
– Single, Whole, and Mixed-script
• Registrars can't control
– Syntax spoofing in sub-domain labels

Attack Vectors: Visual spoofing Vectors
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks
rn can look like m in certain fonts
www.mullets.com is not www.rnullets.com
Latin m, U+006D
Latin r n, U+0072 U+006E

Attack Vectors: Non-Unicode homograph attacks
Are you using mono-width fonts?
0 and O
1 and l
5 and S

Attack Vectors: Non-Unicode homograph attacks
Classic long URLs
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs
The Confusables
– Single script
– Mixed script
– Whole script

Attack Vectors: Single-script and The Confusables
www.ɑpple.com
User thinks 'a'
Really it's Latin small letter alpha 'ɑ'
www.looĸout.net
User thinks 'k'
Really it's Latin letter kra 'ĸ'

Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com
User thinks 'o'
Really it's Thai digit zero '๐'
www.faϲebook.com
User thinks 'c'
Really it's Greek lunate sigma symbol 'ϲ'
www.Ꮐoogle.com
Really it's Cherokee letter Nah 'Ꮐ'
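
A rough sketch of mixed-script detection for a single domain label. Python's standard library has no script property, so the first word of each character name is used as a stand-in; this heuristic and the `scripts_guess` helper are assumptions for illustration, not the method used by the tools in this talk.

```python
import unicodedata


def scripts_guess(label):
    """Guess the scripts in a label from the first word of each character name."""
    scripts = set()
    for ch in label:
        if ch in "-.0123456789":
            continue  # ASCII digits and separators carry no script
        scripts.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return scripts


for label in ("facebook", "fa\u03f2ebook", "g\u0e50\u0e50gle", "\u13c0oogle"):
    found = scripts_guess(label)
    print(label, sorted(found), "MIXED" if len(found) > 1 else "single")
```

Whole-script and single-script confusables pass this check by design, which is why the slides treat them as separate categories.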

Attack Vectors: Whole-script and The Confusables
www.аЬс.com
User thinks 'abc'
Really it's Cyrillic script
www.ігѕ.gov
User thinks 'irs'
Really it's Cyrillic script

Attack Vectors: IDN homograph attacks
Browsers whitelist .ORG
Others don't necessarily, but…

Attack Vectors: IDN homograph attacks
• .ORG is whitelisted
– Limited characters available
• To unscrutinizing eyes, í looks like i

Attack Vectors: IDN homograph attacks
www.mozilla.org is not www.mozílla.org
Latin small letter i, U+0069
Latin small letter i with acute, U+00ED
March 2009 copy 2009 Chris Weber
(This case doesnrsquot work anymore)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
FULLWIDTH SOLIDUSU+FF0F
March 2009 copy 2009 Chris Weber
(Normalized to a U+002F)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
SOLIDUSU+002F
March 2009 copy 2009 Chris Weber
U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
U+FF65 KATAKANA MIDDLE DOT
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with and lookalikes
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes
httpwwwgooglecom
Katakana DotU+FF65
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecomノpathノfilenottrustedorg
Katakana NoU+FF89
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
IDN Visual Spoofing
March 2009 copy 2009 Chris Weber
bull Visual Spoofing Detection API
ndash Detects Confusables
ndash Detects Invisibles
ndash Detections syntax and punctuation lookalikes
ndash Detects combining mark tricks
bull Currently in testing
bull Release planned for Fall 2009
wwwcasabasecuritycom
IDN Visual SpoofingSolutions and Defenses (yes there is one)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
bull Fonts render glyphs confusingly
bull Fonts render glyphs as empty white space
httpwwwgooglecom phreedomorg
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)
A is A (Lucida Sans Unicode Courier New)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Multiple combining marks
o looks like U+006F U+0304
o is U+006F U+0304 U+0304
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Order of combining marksndash ȏ and ouml under NFKC
ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt
ltU+006F U+0311U+0308gt ltU+020F U+0308gt
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
bull httpunicodeorgreportstr9
ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo
ndash forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
March 2009 copy 2009 Chris Weber
Commonly occur in charset transformations and even innocuous APIrsquos
Impact Filter evasion Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When prime becomes
U+2032 PRIME
wwwcasabasecuritycom
Root CausesBest-fit mappings
March 2009 copy 2009 Chris Weber
Net runtime will marshall a string as LPStr to a pinvoke function
How can we best-fit the lt character
bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket
How can we best-fit the s character
bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex
To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]
wwwcasabasecuritycom
Windows best-fit pInvokeBest-fit mappings
March 2009 copy 2009 Chris Weber
bull Scrutinize charactercharset manipulation APIrsquos
bull Use EncoderFallback with SystemTextEncoding
bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
bull Use Unicode end-to-end
wwwcasabasecuritycom
Root CausesGuidance for Best-Fit mappings
March 2009 copy 2009 Chris Weber
bull A popular social networking site in 2008
bull Implemented complex filtering logic to prevent XSS
ndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting
ndash Root Cause best-fit mappings
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
-moz-binding()
was not allowed buthellip
-[U+ff4d]oz-binding()
would best-fit map
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
Normalizing strings after validation is dangerous
Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
NFD - Decompose (canonical)
NFC - Decompose (canonical) Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility) Recompose
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
İ becomes I +
wwwcasabasecuritycom
Root CausesNormalization
U+0130 U+0049 U+0307
March 2009 copy 2009 Chris Weber
But are there dangerous characters
You bethellip with NFKC and NFKD you could control HTML or other parsing
﹤ becomes lt
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
﹤ becomes lt
toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
Normalize strings before validation
NFKC first defense against Visual spoofing
wwwcasabasecuritycom
Root CausesGuidance for Normalization
March 2009 copy 2009 Chris Weber
Non-shortest or overlong UTF-8
Impact Filter evasion Enable code execution
Application gets C0A7
OSFramework sees 27
Database gets
wwwcasabasecuritycom
Root CausesNon-shortest form UTF-8
March 2009 copy 2009 Chris Weber
bull Unicode specification forbids
ndash Generation of non-shortest form
ndash Interpretation of non-shortest form for BMP
bull Validate UTF-8 encoding (throw on error)
wwwcasabasecuritycom
Root CausesGuidance for Non-shortest form UTF-8
March 2009 copy 2009 Chris Weber
How many ways can you say
wwwcasabasecuritycom
Attack VectorsDirectory traversal
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Directory traversal test casesndash httpsiterootsystem
ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem
ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem
ndash Division Slash U+2215 best-fithttpsiteroot E28895system
ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system
wwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Unassigned code points
ndash U+2073
bull Illegal code points
ndash Half a surrogate pair
bull Code points with special meaning
ndash U+FEFF is the BOM
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesHandling the Unexpected
March 2009 copy 2009 Chris Weber
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
lt41 C2 3E 41gt becomes
lt41 41gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
ltimg src=[0xC2]gt onerror=alert(1)ltbr gt
becomes
ltimg src=gt onerror=alert(1)ltbr gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
Correcting insecurely rather than failing
ndash Substituting a lsquorsquo or a lsquorsquo would be bad
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-substitution
March 2009 copy 2009 Chris Weber
ldquodeletion of noncharactersrdquo (UTR-36)
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
ltscr[U+FEFF]iptgt becomes ltscriptgt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
bull Fail or error
bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe
wwwcasabasecuritycom
Root CausesSolutions for Handling the Unexpected
March 2009 copy 2009 Chris Weber
bull Bypass filters WAFrsquos NIDS and validation
bull Exploit delivery techniques
ndash Eg Cross-site scripting (buffer overflow of the Web)
wwwcasabasecuritycom
Attack VectorsFilter evasion
March 2009 copy 2009 Chris Weber
Safari and Firefox BOM consumptionndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting
ndash Root Cause Character deletion
lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt
Can be nastier
lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt
wwwcasabasecuritycom
Case Study Apple and Mozilla
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Safari BOM injection for XSS
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
A Closer Look The BOM
BOMU+FEFF
March 2009 copy 2009 Chris Weber
bull Attackers manipulate casing operations to inject otherwise prohibited characters
bull Casing can multiply the buffer sizes needed
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
toLower(ldquoİrdquo) == ldquoirdquo
toLower(ldquoscrİptrdquo) == ldquoscriptrdquo
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
len(x) = len(toLower(x))
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
bull Perform casing operations before validation
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Casing
March 2009 copy 2009 Chris Weber

Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enabling code execution

Root Causes: Buffer Overflows
Casing: maximum expansion factors (source: Unicode Technical Report 36)
Operation    UTF          Factor   Sample
Lower        8            1.5X     Ⱥ U+023A
             16, 32       1X       A U+0041
Upper        8, 16, 32    3X       ΐ U+0390

Root Causes: Buffer Overflows
Normalization: maximum expansion factors (source: Unicode Technical Report 36)
Operation    UTF          Factor   Sample
NFC          8            3X       U+1D160
             16, 32       3X       U+FB2C
NFD          8            3X       ΐ U+0390
             16, 32       4X       ᾂ U+1F82
NFKC/NFKD    8            11X      U+FDFA
             16, 32       18X      U+FDFA

Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
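
Several of the expansion factors in the tables above can be reproduced with Python's standard library. This is a sketch for checking the arithmetic, not a sizing tool. The helper names `expansion` and `normalized_factor` are made up for the example, and the results depend on the Unicode data bundled with the interpreter.

```python
import unicodedata


def expansion(before: str, after: str, encoding: str) -> float:
    """Ratio of encoded sizes (in bytes) after versus before an operation."""
    return len(after.encode(encoding)) / len(before.encode(encoding))


def normalized_factor(char: str, form: str, encoding: str) -> float:
    """Expansion factor caused by normalizing a single character."""
    return expansion(char, unicodedata.normalize(form, char), encoding)


# "utf-16-le" is used so that Python does not prepend a BOM to the output.
lower_utf8 = expansion("\u023a", "\u023a".lower(), "utf-8")
upper_utf8 = expansion("\u0390", "\u0390".upper(), "utf-8")
nfd_utf16 = normalized_factor("\u1f82", "NFD", "utf-16-le")
nfkc_utf8 = normalized_factor("\ufdfa", "NFKC", "utf-8")
nfkc_utf16 = normalized_factor("\ufdfa", "NFKC", "utf-16-le")
```

A buffer sized from the input length alone is too small in every one of these cases.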

Root Causes: Controlling Syntax
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enabling code execution

Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera
• Unicode formatter characters exploited for XSS
  – Damage: filter evasion, controlling syntax
  – Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root cause: interpreting "white space"
  – A problem with the HTML 4.0 spec
<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

Case Study: Opera
MVS: U+180E

Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…

Root Causes: Specifications
1) Character stability
  – IDNA/Nameprep is based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling of other characters up to the implementer

Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enabling code execution, data loss

Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enabling code execution

Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
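
To illustrate why a header that says ISO-8859-1 and a meta tag that says shift_jis are a problem, the sketch below decodes the same two bytes both ways. The byte pair is an illustrative choice, not taken from the talk. Under ISO-8859-1 the second byte is a backslash, while under Shift_JIS it is absorbed into a double-byte character. The helper `decode_utf8_or_fail` is a hypothetical name for the "force UTF-8, error if uncertain" guidance.

```python
raw = b"\x83\x5c"  # attacker-influenced bytes; 0x5C is an ASCII backslash

as_latin1 = raw.decode("iso-8859-1")    # what a filter trusting the header sees
as_shift_jis = raw.decode("shift_jis")  # what a parser trusting the meta tag sees


def decode_utf8_or_fail(data: bytes) -> str:
    """Force UTF-8 and raise on anything ill-formed instead of guessing."""
    return data.decode("utf-8", errors="strict")
```

A filter and a parser that disagree about the charset disagree about whether the backslash exists at all.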

Tools
• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against visual spoofing and homograph attacks

Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more

Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you
Contact me with questions, new test cases, or ideas to share.
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.
Chris Weber, www.lookout.net, Casaba Security

Unicode Crash Course
Glyph: A
Encoding: UTF-8, UTF-16, UTF-32
Properties: Hex, Uppercase, etc.
Code point: U+0041
Block, Script: Basic Latin, Latin
Plane: Basic Multilingual Plane (BMP)
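
The attributes listed for "A" can be inspected from Python's standard library, as a small sketch. The `unicodedata` module exposes the name and general category, but not the block or script, so those two are left out here.

```python
import unicodedata

ch = "A"
code_point = f"U+{ord(ch):04X}"
name = unicodedata.name(ch)
category = unicodedata.category(ch)  # "Lu" means Letter, uppercase
encodings = {
    "UTF-8": ch.encode("utf-8").hex(),
    "UTF-16BE": ch.encode("utf-16-be").hex(),
    "UTF-32BE": ch.encode("utf-32-be").hex(),
}
plane = ord(ch) >> 16  # plane 0 is the Basic Multilingual Plane
```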

Unicode Crash Course: Code points
• Unicode 5.1 uses a 21-bit scalar value with space for over 1,100,000 code points
U+0000 to U+10FFFF

Unicode Crash Course: Code Points
A = U+0041
Every character has a unique number, represented by a hex value
A: U+0041
ſ: U+017F

Unicode Crash Course: Code points
• The full 21-bit range is not actually available
U+0000 to U+D7FF and
U+E000 to U+10FFFF
What's up with U+D800 to U+DFFF?

Unicode Crash Course: UTF-16 Surrogate Pairs
U+101D1

Unicode Crash Course: Encodings
UTF-8
  – Variable width, 1 to 4 bytes (used to be 6)
UTF-16
  – Endianness
  – Variable width, 2 or 4 bytes
  – Surrogate pairs
UTF-32
  – Endianness
  – Fixed width, 4 bytes
  – Fixed mapping, no algorithms needed
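
The reserved range U+D800 to U+DFFF exists so that UTF-16 can represent code points above U+FFFF as surrogate pairs. The sketch below works the arithmetic for U+101D1 and checks it against Python's own encoder. `to_surrogates` is a hypothetical helper name.

```python
def to_surrogates(cp: int) -> tuple:
    """Split a supplementary code point into a UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = cp - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


high, low = to_surrogates(0x101D1)
utf16_be = chr(0x101D1).encode("utf-16-be")

# UTF-8 width grows with the code point: 1, 2, 3 and 4 bytes respectively.
utf8_widths = [len(chr(cp).encode("utf-8")) for cp in (0x41, 0x17F, 0x20AC, 0x101D1)]
```

Python refuses to encode a lone surrogate such as `chr(0xD800)` as UTF-8, which matches the point that half a surrogate pair is not a valid character on its own.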

Exploiting Unicode-enabled Software: Overview
• Unicode crash course
• Root Causes
  – Visual Spoofing and IDNs
  – Best-fit mappings
  – Normalization
  – Overlong UTF-8
  – Over-consumption
  – Character substitution
  – Character deletion
  – Casing
  – Buffer overflows
  – Controlling Syntax
  – Charset transformations
  – Charset mismatches
• Tools

Root Causes: Visual Spoofing
• Over 100,000 assigned characters
• Many lookalikes within and across scripts
A Α А ᐱ ᗅ ᗋ ᗩ ᴀ ᴬ Ꜳ A 𐀁 𐌀

Root Causes: IDN – Internationalized Domain Names
http://παράδειγμα.δοκιμή
(example.test)

Root Causes: IDN – what do the browsers do?
• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
  – http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
  – .ORG, .DE, .CN to name a few
• Language settings and TLD
• Character blacklisting
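
The Nameprep and Punycode steps can be tried with Python's built-in "idna" codec, which implements IDNA 2003. The sketch converts the Greek example name shown above to its ASCII-compatible form and back. The name is written with escapes so the exact code points are unambiguous.

```python
# "παράδειγμα.δοκιμή" written as explicit code points
unicode_name = (
    "\u03c0\u03b1\u03c1\u03ac\u03b4\u03b5\u03b9\u03b3\u03bc\u03b1"
    ".\u03b4\u03bf\u03ba\u03b9\u03bc\u03ae"
)

# The codec applies Nameprep to each label, then Punycode with the "xn--" prefix.
ace_name = unicode_name.encode("idna").decode("ascii")
round_trip = ace_name.encode("ascii").decode("idna")
```

What DNS resolves is `ace_name`. What a browser chooses to display (Unicode or Punycode) is the policy decision the list above describes.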

Root Causes: IDN – so what's the problem?
• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep are based on Unicode 3.2
  – We're up to Unicode 5.1 (larger repertoire)

Attack Vectors: IDN homograph attacks
Some browsers allow .COM IDNs based on script family
  – (Latin has a big family)

Attack Vectors: IDN homograph attacks
Safari

Attack Vectors: IDN homograph attacks
Opera

Attack Vectors: IDN homograph attacks
www.google.com is not www.gooɡle.com
g: Latin U+0067
ɡ: Latin U+0261

Root Causes: Guidance for Visual Spoofing
• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
  – IDNA, Stringprep
• Guidance
  – Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for International Domain Names
ICANN guidelines v2.0
  – Inclusion-based
  – Script limitations
  – Character limitations
Registries apply the guidance
  – Define the allowed characters per TLD
  – Collaboration with IANA
Registrars sell the domain names

Root Causes: The state of International Domain Names
ICANN guidelines v2.0
  – Inclusion-based: deny-all default seems to be the right concept
  – Script limitations: a script can cross many blocks. Even with limited script choices, there's plenty to choose from
  – Character limitations: great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing

Root Causes: The state of International Domain Names
• Registrars still allow
  – Confusables
  – Combining marks
  – Single, Whole, and Mixed-script
• Registrars can't control
  – Syntax spoofing in sub domain labels

Attack Vectors: Visual spoofing vectors
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks
rn can look like m in certain fonts
www.mullets.com is not www.rnullets.com
m: Latin U+006D
rn: Latin U+0072 U+006E

Attack Vectors: Non-Unicode homograph attacks
Are you using mono-width fonts?
0 and O
1 and l
5 and S

Attack Vectors: Non-Unicode homograph attacks
Classic long URLs
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs
The Confusables
  – Single script
  – Mixed script
  – Whole script

Attack Vectors: Single-script and The Confusables
www.ɑpple.com
User thinks 'a'
Really it's Latin small letter alpha 'ɑ'
www.looĸout.net
User thinks 'k'
Really it's Latin small letter kra 'ĸ'

Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com
User thinks 'o'
Really it's Thai digit zero '๐'
www.faϲebook.com
User thinks 'c'
Really it's Greek lunate sigma symbol 'ϲ'
www.Ꮐoogle.com
Really it's Cherokee letter nah 'Ꮐ'

Attack Vectors: Whole-script and The Confusables
www.аЬс.com
User thinks 'abc'
Really it's Cyrillic script
www.ігѕ.gov
User thinks 'irs'
Really it's Cyrillic script
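
A rough sketch of mixed-script detection in Python follows. It rests on a loud assumption: the standard library has no Script property, so the first word of each character's Unicode name stands in for the script. Real script data, for example from ICU, would be needed in practice. The function names are hypothetical.

```python
import unicodedata


def scripts(label: str) -> set:
    """Approximate the scripts used by the letters in a label.

    Assumption: the first word of the character name ("LATIN", "GREEK",
    "CYRILLIC", ...) is used as a stand-in for the real Script property.
    """
    found = set()
    for ch in label:
        if ch.isalpha():
            found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found


def is_mixed_script(label: str) -> bool:
    """True when letters from more than one (approximate) script appear."""
    return len(scripts(label)) > 1


mixed = scripts("fa\u03f2ebook")        # Latin plus Greek lunate sigma
whole = scripts("\u0430\u042c\u0441")   # Cyrillic lookalike of "abc"
single = scripts("\u0251pple")          # Latin small letter alpha
```

Only the mixed-script case is flagged. The whole-script and single-script confusables each use one script, so they pass this check and need confusable detection instead.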

Attack Vectors: IDN homograph attacks
Browsers whitelist .ORG
Others don't necessarily, but…

Attack Vectors: IDN homograph attacks
• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes
  í looks like i

Attack Vectors: IDN homograph attacks
www.mozilla.org is not www.mozílla.org
i: Latin U+0069
í: Latin U+00ED

Attack Vectors: IDN Syntax Spoofing with lookalikes
(This case doesn't work anymore)
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS: U+FF0F

Attack Vectors: IDN Syntax Spoofing with lookalikes
(Normalized to a U+002F)
http://www.google.com/path/file.nottrusted.org
SOLIDUS: U+002F

Attack Vectors: IDN Syntax Spoofing with and lookalikes
╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
(However, punctuation is not required…)
httpwwwgooglecom
Katakana dot: U+FF65

Attack Vectors: IDN Syntax Spoofing with lookalikes
(However, punctuation is not required…)
http://www.google.comノpathノfile.nottrusted.org
Katakana no: U+FF89

Attack Vectors: IDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode

DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
  – Detects confusables
  – Detects invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
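
A minimal sketch of invisibles detection in Python follows. It flags the three code points listed above plus anything in the general category Cf (format characters). This is an illustration only, not the Visual Spoofing Detection API described in the talk, and `find_invisibles` is a hypothetical name.

```python
import unicodedata

# The three code points listed above; a real deployment would need a longer list.
INVISIBLES = {"\u200b", "\u180e", "\ufeff"}


def find_invisibles(text: str) -> list:
    """Return (index, code point) pairs for listed or format (Cf) characters."""
    hits = []
    for index, ch in enumerate(text):
        if ch in INVISIBLES or unicodedata.category(ch) == "Cf":
            hits.append((index, f"U+{ord(ch):04X}"))
    return hits


spoofed = find_invisibles("goo\u200bgle")
clean = find_invisibles("google")
```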

Attack Vectors: Visual Spoofing, Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org

Attack Vectors: Visual Spoofing, Problematic Font-rendering
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
ō looks like U+006F U+0304
ō̄ is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks
• Order of combining marks
  – ȏ and ö under NFKC
<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
<U+006F U+0311 U+0308> becomes <U+020F U+0308>
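
The combining-mark ordering example can be checked with Python's `unicodedata.normalize`. The sketch shows that the same two marks in a different order yield two different NFKC results. `code_points` is a hypothetical helper used only for display.

```python
import unicodedata


def code_points(text: str) -> list:
    """Render a string as a list of U+XXXX labels."""
    return [f"U+{ord(ch):04X}" for ch in text]


# o + diaeresis + inverted breve
first = unicodedata.normalize("NFKC", "o\u0308\u0311")
# o + inverted breve + diaeresis
second = unicodedata.normalize("NFKC", "o\u0311\u0308")
```

Both marks share the same combining class, so normalization keeps their order and composes only the first one with the base letter.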

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  – "These characters are to be avoided wherever possible, because of security concerns"
  – Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)

Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: filter evasion, enabling code execution
When σ becomes s
  U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
  U+2032 PRIME

Windows best-fit P/Invoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)
How can we best-fit the s character?
• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)
To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

Case Study: Social Networking, Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root cause: best-fit mappings

Case Study: Social Networking, Best-fit mappings
-moz-binding()
was not allowed, but…
-[U+ff4d]oz-binding()
would best-fit map

Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: filter evasion, enabling code execution

Root Causes: Normalization
NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization
İ (U+0130) becomes I (U+0049) + combining dot above (U+0307)

Root Causes: Normalization
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ (U+FE64) becomes < (U+003C)
toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC is a first defense against visual spoofing
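
The ordering advice can be sketched in Python with `unicodedata.normalize`. The two functions are hypothetical filters that differ only in whether the "<" check runs before or after NFKC. They are not taken from any real product.

```python
import unicodedata

SMALL_LESS_THAN = "\ufe64"  # SMALL LESS-THAN SIGN, compatibility-maps to "<"


def unsafe(value: str) -> str:
    """Validate, then normalize: the check runs on the wrong string."""
    if "<" in value:
        raise ValueError("blocked")
    return unicodedata.normalize("NFKC", value)


def safe(value: str) -> str:
    """Normalize, then validate the string that will actually be used."""
    normalized = unicodedata.normalize("NFKC", value)
    if "<" in normalized:
        raise ValueError("blocked")
    return normalized


smuggled = unsafe(SMALL_LESS_THAN + "script>")
decomposed = unicodedata.normalize("NFD", "\u0130")  # I + combining dot above
```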

Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: filter evasion, enabling code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
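
The C0 A7 example can be worked through in Python. `lenient_decode_2byte` is a deliberately unsafe, hypothetical decoder that applies the two-byte UTF-8 bit arithmetic without rejecting non-shortest forms. `strict_decode` shows that Python's own decoder throws instead.

```python
def lenient_decode_2byte(lead: int, trail: int) -> int:
    """Deliberately unsafe (hypothetical) decoder for a 2-byte UTF-8 sequence.

    It applies the bit arithmetic without rejecting non-shortest forms.
    """
    return ((lead & 0x1F) << 6) | (trail & 0x3F)


def strict_decode(data: bytes) -> str:
    """Python's decoder rejects overlong sequences with UnicodeDecodeError."""
    return data.decode("utf-8", errors="strict")


overlong = bytes([0xC0, 0xA7])
lenient_result = chr(lenient_decode_2byte(*overlong))  # an apostrophe, 0x27
```

A filter that scans the raw bytes for 0x27 never sees it, while the lenient decoder further down the stack produces it.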

Attack Vectors: Directory traversal
How many ways can you say

Attack Vectors
• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8, U+002F: httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system

Root Causes: Handling the Unexpected
• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: filter evasion, enabling code execution

Root Causes: Handling the Unexpected, Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected, Over-consumption
<img src="[0xC2]"> " onerror="alert(1)"<br />
becomes
<img src="> " onerror="alert(1)"<br />

Root Causes: Handling the Unexpected, Character-substitution
Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected, Character-deletion
"deletion of noncharacters" (UTR-36)
<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’, which can be safe
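
The over-consumption example and the two recommended behaviours can be compared in Python. `over_consuming_decode` is a hypothetical flawed decoder written for this sketch. The other two paths use the standard UTF-8 codec with `errors="replace"` (substitute U+FFFD) and `errors="strict"` (fail).

```python
ill_formed = bytes([0x41, 0xC2, 0x3E, 0x41])  # <41 C2 3E 41>


def over_consuming_decode(data: bytes) -> str:
    """Hypothetical flawed decoder: a lead byte swallows the byte after it."""
    out, i = [], 0
    while i < len(data):
        if data[i] >= 0x80:
            i += 2  # drops the lead byte and whatever follows it
        else:
            out.append(chr(data[i]))
            i += 1
    return "".join(out)


def decode_or_fail(data: bytes) -> str:
    """Fail or error: raise on ill-formed input."""
    return data.decode("utf-8", errors="strict")


flawed = over_consuming_decode(ill_formed)               # the ">" disappears
replaced = ill_formed.decode("utf-8", errors="replace")  # U+FFFD, ">" is kept
```

With replacement, only the offending byte is substituted and the following ">" survives, so the markup structure seen by a filter and by the parser stays the same.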
March 2009 copy 2009 Chris Weber
bull Bypass filters WAFrsquos NIDS and validation
bull Exploit delivery techniques
ndash Eg Cross-site scripting (buffer overflow of the Web)
wwwcasabasecuritycom
Attack VectorsFilter evasion
March 2009 copy 2009 Chris Weber
Safari and Firefox BOM consumptionndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting
ndash Root Cause Character deletion
lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt
Can be nastier
lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt
wwwcasabasecuritycom
Case Study Apple and Mozilla
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Safari BOM injection for XSS
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
A Closer Look The BOM
BOMU+FEFF
March 2009 copy 2009 Chris Weber
bull Attackers manipulate casing operations to inject otherwise prohibited characters
bull Casing can multiply the buffer sizes needed
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
toLower(ldquoİrdquo) == ldquoirdquo
toLower(ldquoscrİptrdquo) == ldquoscriptrdquo
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
len(x) = len(toLower(x))
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
bull Perform casing operations before validation
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Casing
March 2009 copy 2009 Chris Weber
bull Incorrect assumptions about string sizes (chars vs bytes)
bull Improper width calculations
bull Impact Enable code execution
wwwcasabasecuritycom
Root CausesBuffer Overflows
March 2009 copy 2009 Chris Weber
Casing - maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
Lower 8 15 Ⱥ U+023A
16 32 1 A U+0041
Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
Normalization- maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
NFC8 3X 119136 U+1D160
16 32 3X ש U+FB2C
NFD8 3X ΐ U+0390
16 32 4X ᾂ U+1F82
NFKCNFKD8 11X
ملسو هيلع هللا ىلص U+FDFA16 32 18X
Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
bull Know the difference between bytes and chars
bull Secure coding
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Buffer Overflows
March 2009 copy 2009 Chris Weber
bull White space and line breaks
ndash Eg when U+180E acts like U+0020
bull Quotation marks
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesControlling Syntax
March 2009 copy 2009 Chris Weber
bull Manipulate HTML parsers and javascriptinterpreters
bull Control protocols
wwwcasabasecuritycom
Attacks and ExploitsControlling syntax
March 2009 copy 2009 Chris Weber
bull Unicode formatter characters exploited for XSS
ndash Damage Filter evasion controlling syntax
ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting
ndash Root Cause Interpreting ldquowhite spacerdquo
ndash A problem with HTML 40 spec
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
lta href=[U+180E]onclick=alert()gt
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Opera White Space Characters
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Case Study Opera
MVSU+180E
March 2009 copy 2009 Chris Weber
bull Question specifications
bull Be carefulhellip
wwwcasabasecuritycom
Root CausesGuidance for Controlling Syntax
March 2009 copy 2009 Chris Weber
1) Character stabilityndash IDNANameprep based on Unicode 32
2) Designsndash Specs are carefully designed but not always perfect
bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of
U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo
ndash HTML 401 bull Defines four whitespace characters and explicitly leaves
handling other characters up to implementer
wwwcasabasecuritycom
Root CausesSpecifications
March 2009 copy 2009 Chris Weber
bull Converting between charsets is dangerous
bull Mapping tables and algorithms vary across platforms
bull Impact Filter evasion Enable code execution Data-loss
wwwcasabasecuritycom
Root CausesCharset Transformations
March 2009 copy 2009 Chris Weber
bull Avoid if possible
bull Use Unicode as the broker
bull Beware the PUA mappings
bull Transform case and normalize prior to validation and redisplay
wwwcasabasecuritycom
Root CausesGuidance for Charset Transformations
March 2009 copy 2009 Chris Weber
bull Some charset identifiers are ill-defined
bull Vendor implementations vary
bull User-agents may sniff if confused
bull Attackers manipulate behavior
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCharset Mismatches
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Root CausesCharset Mismatches
Content-Type charset=ISO-8859-1
ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt
Attacker-controlled input
March 2009 copy 2009 Chris Weber
bull Force UTF-8
bull Error if uncertain
wwwcasabasecuritycom
Root CausesGuidance for Charset Mismatches
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Watcher
ndash Web-app security testing and auditing
bull Visual Spoofing Detection API
ndash Providing guarantees against Visual Spoofing and Homograph attacks
wwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
bull Unicode transformation hot-spotsbull User-controlled HTMLbull Cross-domain issuesbull Insecure cookiesbull Insecure HTTPHTTPS transitionsbull SSL protocol and certificate issuesbull XSS hot-spotsbull Flash issuesbull Silverlight issuesbull Information disclosure
wwwcasabasecuritycom
ToolsWatcher ndash Some of the Passive Checks Included
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
httpwebsecuritytoolcodeplexcom
wwwcasabasecuritycom
ToolsWatcher - Web-app Security Testing and Auditing
March 2009 copy 2009 Chris Weber
bull Problemndash Unicode enables visual-spoofing-maximus
bull Solutionndash Confusable detection
ndash Invisibles detection
ndash Syntax spoof detection
ndash more
wwwcasabasecuritycom
ToolsVisual Spoofing Detection API
March 2009 copy 2009 Chris Weber
bull Cross-platform component library written in C
bull Can be applied in user-agents or any softwarendash Browsers
ndash Email clients
bull Planned for release Fall 2009
bull Email me with questions
wwwcasabasecuritycom
ToolsVisual Spoofing Detection API
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
ToolsVisual Spoofing Protection Demo
Thank you
Contact me with questions new test cases or ideas to share
Visit my website for test cases Unicode and security tools and the Anti-Visual-Spoofing API
Chris WeberwwwlookoutnetCasaba Security
wwwcasabasecuritycom
March 2009 copy 2009 Chris Weber
bull Unicode 51 uses a 21-bit scalar value with space for over 1100000 code points
U+0000 to U+10FFFF
wwwcasabasecuritycom
Unicode Crash CourseCode points
March 2009 copy 2009 Chris Weber
A = U+0041
Every character has a unique number represented by a hex value
wwwcasabasecuritycom
Unicode Crash CourseCode Points
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash Course
AU+0041
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash Course
ſU+017F
March 2009 copy 2009 Chris Weber
bull The full 21-bit range is not actually available
U+0000 to U+D7FF and
U+E000 to U+10FFF
whatrsquos up with U+D800U+DFFF
wwwcasabasecuritycom
Unicode Crash CourseCode points
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash CourseUTF-16 Surrogate Pairs
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash CourseUTF-16 Surrogate Pairs
U+101D1
March 2009 copy 2009 Chris Weber
UTF-8 ndash variable width 1 to 4 bytes (used to be 6)
UTF-16ndash Endianessndash Variable width 2 or 4 bytesndash Surrogate pairs
UTF-32ndash Endianessndash Fixed width 4 bytesndash Fixed mapping no algorithms needed
wwwcasabasecuritycom
Unicode Crash CourseEncodings
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash coursebull Root Causes
ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareOverview
March 2009 copy 2009 Chris Weber
bull Over 100000 assigned characters
bull Many lookalikes within and across scripts
AΑАᐱᗅᗋᗩᴀᴬꜲA6553766304
wwwcasabasecuritycom
Root CausesVisual Spoofing
March 2009 copy 2009 Chris Weber
httpπαράδειγμαδοκιμή
(exampletest)
wwwcasabasecuritycom
Root CausesIDN ndash Internationalized Domain Names
March 2009 copy 2009 Chris Weber
bull IDNA 2003
bull Nameprep (NFKC and prohibit)
bull Punycodendash httpxn--hxajbheg2az3alxn--jxalpdlp
bull Whitelist TLDrsquosndash ORG DE CN to name a few
bull Language settings and TLD
bull Character blacklisting
wwwcasabasecuritycom
Root CausesIDN ndash what do the browsers do
March 2009 copy 2009 Chris Weber
bull Divergent browser implementations
bull Confusables exist
bull IDNA and Nameprep based on Unicode 32
ndash Wersquore up to Unicode 51 (larger repertoire)
wwwcasabasecuritycom
Root CausesIDN ndash so whatrsquos the problem
March 2009 copy 2009 Chris Weber
Some browsers allow COM IDNrsquos
based on script family
ndash (Latin has a big family)
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
Safari
Attack Vectors: IDN homograph attacks
Opera
Attack Vectors: IDN homograph attacks
Attack Vectors: IDN homograph attacks
www.google.com is not www.gooɡle.com
g is Latin U+0067
ɡ is Latin U+0261
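The two host names differ only at the code point level, which a short script can make visible. The following is a minimal Python sketch using the standard unicodedata module; the helper name describe is hypothetical and not part of the talk.

```python
import unicodedata

def describe(text):
    """Return (character, code point, Unicode name) for each character."""
    return [(ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
            for ch in text]

genuine = "google"
spoofed = "goo\u0261le"  # fourth letter is U+0261 rather than U+0067

# Print both strings side by side and flag the positions that differ.
for a, b in zip(describe(genuine), describe(spoofed)):
    marker = "" if a == b else "   <-- differs"
    print(a[1], a[2], "|", b[1], b[2], marker)
```

Comparing code points rather than rendered glyphs is the only reliable way to tell such strings apart.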
• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
  – IDNA, Stringprep
• Guidance
  – Unicode Consortium, ICANN, IETF, IANA
Root Causes: Guidance for Visual Spoofing
ICANN guidelines v2.0
  – Inclusion-based
  – Script limitations
  – Character limitations
Registries apply the guidance
  – Define the allowed characters per TLD
  – Collaboration with IANA
Registrars sell the domain names
Root Causes: Guidance for International Domain Names
ICANN guidelines v2.0
  – Inclusion-based
  – Script limitations
  – Character limitations
Root Causes: The state of International Domain Names
Deny-all default seems to be the right concept.
A script can cross many blocks. Even with limited script choices, there’s plenty to choose from.
Great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing.
• Registrars still allow
  – Confusables
  – Combining marks
  – Single, Whole and Mixed-script
• Registrars can’t control
  – Syntax spoofing in sub domain labels
Root Causes: The state of International Domain Names
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing
Attack Vectors: Visual spoofing Vectors
rn can look like m in certain fonts
Attack Vectors: Non-Unicode homograph attacks
www.mullets.com is not www.rnullets.com
m is Latin U+006D
rn is Latin U+0072 U+006E
Are you using mono-width fonts?
0 and O
1 and l
5 and S
Attack Vectors: Non-Unicode homograph attacks
Classic long URLs
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements
Attack Vectors: Non-Unicode homograph attacks
The Confusables
  – Single script
  – Mixed script
  – Whole script
Attack Vectors: Defining Homographs
www.ɑpple.com
User thinks ‘a’
Really it’s Latin small letter alpha ‘ɑ’
www.looĸout.net
User thinks ‘k’
Really it’s Latin letter kra ‘ĸ’
Attack Vectors: Single-script and The Confusables
www.g๐๐gle.com
User thinks ‘o’
Really it’s Thai digit zero ‘๐’
www.faϲebook.com
User thinks ‘c’
Really it’s Greek lunate sigma symbol ‘ϲ’
www.Ꮐoogle.com
Really it’s Cherokee letter Nah ‘Ꮐ’
Attack Vectors: Mixed-script and The Confusables
www.аЬс.com
User thinks ‘abc’
Really it’s Cyrillic script
www.ігѕ.gov
User thinks ‘irs’
Really it’s Cyrillic script
Attack Vectors: Whole-script and The Confusables
Browsers whitelist .ORG
Attack Vectors: IDN homograph attacks
Others don’t necessarily, but…
Attack Vectors: IDN homograph attacks
• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes
  í looks like i
Attack Vectors: IDN homograph attacks
Attack Vectors: IDN homograph attacks
www.mozilla.org is not www.mozílla.org
i is Latin U+0069
í is Latin U+00ED
(This case doesn’t work anymore)
Attack Vectors: IDN Syntax Spoofing with lookalikes
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F
(Normalized to a U+002F)
Attack Vectors: IDN Syntax Spoofing with lookalikes
http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F
╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT
Attack Vectors: IDN Syntax Spoofing with and lookalikes
(However, punctuation not required…)
Attack Vectors: IDN Syntax Spoofing with (full stop) lookalikes
httpwwwgooglecom
Katakana Dot U+FF65
(However, punctuation not required…)
Attack Vectors: IDN Syntax Spoofing with lookalikes
http://www.google.comノpathノfile.nottrusted.org
Katakana No U+FF89
Attack Vectors: IDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
DEMO
IDN Visual Spoofing
• Visual Spoofing Detection API
  – Detects Confusables
  – Detects Invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009
IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
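Characters like these can be located before a string is displayed or compared. The sketch below is a minimal Python illustration; the table and the helper find_invisibles are hypothetical names, and the table covers only the three code points listed on the slide, not every invisible character in Unicode.

```python
# Only the three code points named on the slide; a real check would need more.
INVISIBLES = {
    "\u200B": "ZERO WIDTH SPACE",
    "\u180E": "MONGOLIAN VOWEL SEPARATOR",
    "\uFEFF": "ZERO WIDTH NO-BREAK SPACE",
}

def find_invisibles(text):
    """Return (index, code point, name) for each listed invisible character."""
    return [(i, f"U+{ord(ch):04X}", INVISIBLES[ch])
            for i, ch in enumerate(text) if ch in INVISIBLES]

# "google" with a zero width space hidden after the third letter.
print(find_invisibles("goo\u200Bgle"))
```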
Attack Vectors: Visual Spoofing, Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
httpwwwgooglecom phreedomorg
Attack Vectors: Visual Spoofing, Problematic Font-rendering
··· is ··· (Arial, Gothic)
A is A (Lucida Sans Unicode, Courier New)
Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
ō looks like U+006F U+0304
but may really be U+006F U+0304 U+0304 (the repeated mark overprints the first)
Attack Vectors: Visual Spoofing with Combining Marks
• Order of combining marks
  – ȏ and ö under NFKC
<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
<U+006F U+0311 U+0308> becomes <U+020F U+0308>
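The effect of mark order can be checked directly. Below is a minimal Python sketch using the standard unicodedata module; the helper code_points is a hypothetical name. It applies NFKC to both orderings of the two marks and shows that the results remain distinct strings.

```python
import unicodedata

def code_points(text):
    """Format a string as a space-separated list of code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)

first = "o\u0308\u0311"    # o + diaeresis + inverted breve
second = "o\u0311\u0308"   # o + inverted breve + diaeresis

nfkc_first = unicodedata.normalize("NFKC", first)
nfkc_second = unicodedata.normalize("NFKC", second)

print(code_points(nfkc_first))
print(code_points(nfkc_second))
print(nfkc_first == nfkc_second)
```

Both marks share the same combining class, so normalization does not reorder them and cannot be relied on to make the two sequences compare equal.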
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)
Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  – “These characters are to be avoided wherever possible, because of security concerns”
  – Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
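A simple way to act on that advice is to reject or flag any input containing the two override characters. The following is a minimal Python sketch; the names BIDI_OVERRIDES and find_bidi_overrides, and the sample string, are hypothetical illustrations.

```python
BIDI_OVERRIDES = {
    "\u202D": "LEFT-TO-RIGHT OVERRIDE",
    "\u202E": "RIGHT-TO-LEFT OVERRIDE",
}

def find_bidi_overrides(text):
    """Return (index, name) for each explicit directional override found."""
    return [(i, BIDI_OVERRIDES[ch])
            for i, ch in enumerate(text) if ch in BIDI_OVERRIDES]

# The override makes the following characters display in reverse order.
sample = "report\u202Etxt.exe"
print(find_bidi_overrides(sample))
```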
Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
U+2032 PRIME
Root Causes: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 (Left-Pointing Angle Bracket) maps to U+003C
• U+3008 (Left Angle Bracket) maps to U+003C
How can we best-fit the s character?
• U+015B (Latin Small Letter S With Acute) maps to U+0073
• U+015D (Latin Small Letter S With Circumflex) maps to U+0073
To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]
Windows best-fit P/Invoke: Best-fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
Root Causes: Guidance for Best-Fit mappings
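The same "fail rather than best-fit" idea can be shown outside of Windows and .NET. The sketch below is a Python stand-in, not the APIs named above: Python's codecs do not best-fit, so a strict encode to a legacy code page raises an error for characters without an exact mapping. The helper name to_legacy_strict and the choice of cp1252 are assumptions for illustration.

```python
def to_legacy_strict(text, charset="cp1252"):
    """Encode to a legacy charset, failing loudly instead of best-fitting."""
    try:
        return text.encode(charset, errors="strict")
    except UnicodeEncodeError as exc:
        bad = text[exc.start]
        raise ValueError(
            f"U+{ord(bad):04X} has no exact mapping in {charset}") from exc

print(to_legacy_strict("plain ascii"))

# Both characters from the best-fit slide are refused rather than mapped.
for ch in ("\u03C3", "\u2032"):
    try:
        to_legacy_strict(ch)
    except ValueError as err:
        print(err)
```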
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: best-fit mappings
Case Study: Social Networking, Best-fit mappings
-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map
Case Study: Social Networking, Best-fit mappings
Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution
Root Causes: Normalization
NFD – Decompose (canonical)
NFC – Decompose (canonical), then recompose
NFKD – Decompose (compatibility)
NFKC – Decompose (compatibility), then recompose
Root Causes: Normalization
İ becomes I + ◌̇
Root Causes: Normalization
U+0130 becomes U+0049 U+0307
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
Root Causes: Normalization
U+FE64 becomes U+003C
﹤ becomes <
toNFKC(“﹤script>”) = “<script>”
Root Causes: Normalization
U+FE64 becomes U+003C
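The ordering problem is easy to reproduce. The sketch below uses Python's standard unicodedata module; the validator is_safe is a deliberately simplistic, hypothetical stand-in for a real filter. It shows the payload passing validation and then turning into markup when normalized afterwards.

```python
import unicodedata

def is_safe(text):
    """Toy validator (hypothetical): reject any string containing '<'."""
    return "<" not in text

payload = "\uFE64script>"   # starts with SMALL LESS-THAN SIGN, not U+003C

# Wrong order: validate first, normalize afterwards.
passed_filter = is_safe(payload)
after = unicodedata.normalize("NFKC", payload)
print(passed_filter, after)

# Right order: normalize first, then validate the normalized form.
normalized_first = unicodedata.normalize("NFKC", payload)
print(is_safe(normalized_first))
```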
Normalize strings before validation
NFKC: first defense against Visual spoofing
Root Causes: Guidance for Normalization
Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '
Root Causes: Non-shortest form UTF-8
• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
Root Causes: Guidance for Non-shortest form UTF-8
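To see why the bytes C0 A7 are dangerous, compare a decoder that skips the shortest-form check with one that validates. This is a minimal Python sketch; naive_decode_2byte is a hypothetical, intentionally unsafe helper, while bytes.decode is the standard strict decoder.

```python
def naive_decode_2byte(b1, b2):
    """Decode a two-byte UTF-8 sequence WITHOUT a shortest-form check (unsafe)."""
    return chr(((b1 & 0x1F) << 6) | (b2 & 0x3F))

overlong = bytes([0xC0, 0xA7])

# The unsafe decoder happily produces an apostrophe, U+0027.
print(repr(naive_decode_2byte(*overlong)))

# A validating decoder throws on the ill-formed sequence instead.
try:
    overlong.decode("utf-8")
    rejected = False
except UnicodeDecodeError:
    rejected = True
print("rejected:", rejected)
```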
How many ways can you say
Attack Vectors: Directory traversal
• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF8 U+002F: httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, Normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader U+2025, Normalized KC or KD: httpsiterootE280A5 E280A5 system
Attack Vectors
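The test cases above rely on several different transformations producing path syntax. The Python sketch below checks which of the listed characters NFKC folds into "/" or "..", and what an unchecked two-byte decoder makes of C0 AE and C0 AF. The dictionary and helper names are hypothetical. By this arithmetic, C0 AE corresponds to U+002E (full stop), while the overlong form of U+002F would be C0 AF.

```python
import unicodedata

candidates = {
    "FULLWIDTH SOLIDUS": "\uFF0F",
    "TWO DOT LEADER": "\u2025",
    "DIVISION SLASH": "\u2215",
}

# Compatibility normalization turns the first two into path syntax.
folded = {name: unicodedata.normalize("NFKC", ch)
          for name, ch in candidates.items()}
for name, result in folded.items():
    print(name, "->", repr(result))

def naive_decode_2byte(b1, b2):
    """Two-byte UTF-8 decode with no shortest-form check (unsafe)."""
    return chr(((b1 & 0x1F) << 6) | (b2 & 0x3F))

print(repr(naive_decode_2byte(0xC0, 0xAE)), repr(naive_decode_2byte(0xC0, 0xAF)))
```

DIVISION SLASH has no compatibility decomposition, which is consistent with the slide listing it under best-fit rather than normalization.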
• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution
Root Causes: Handling the Unexpected
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes
<41 41>
Root Causes: Handling the Unexpected, Over-consumption
<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
Root Causes: Handling the Unexpected, Over-consumption
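Over-consumption can be illustrated with the byte sequence 41 C2 3E 41 from the earlier slide. The Python sketch below contrasts a hypothetical, deliberately broken toy decoder (over_consuming_decode, ASCII and two-byte sequences only) with the standard decoder, which replaces only the stray lead byte and keeps the byte that follows it.

```python
def over_consuming_decode(data):
    """Unsafe toy decoder: a two-byte lead swallows the next byte
    even when that byte is not a valid continuation byte."""
    out, i = [], 0
    while i < len(data):
        b = data[i]
        if 0xC2 <= b <= 0xDF:
            nxt = data[i + 1] if i + 1 < len(data) else None
            if nxt is not None and 0x80 <= nxt <= 0xBF:
                out.append(chr(((b & 0x1F) << 6) | (nxt & 0x3F)))
            # Bug: on an ill-formed pair, both bytes are silently dropped.
            i += 2
        else:
            out.append(chr(b))
            i += 1
    return "".join(out)

data = bytes([0x41, 0xC2, 0x3E, 0x41])
print(repr(over_consuming_decode(data)))                 # the '>' is gone
print(repr(data.decode("utf-8", errors="replace")))      # the '>' survives
```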
Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad
Root Causes: Handling the Unexpected, Character-substitution
“deletion of noncharacters” (UTR-36)
Root Causes: Handling the Unexpected, Character-deletion
<scr[U+FEFF]ipt> becomes <script>
Root Causes: Handling the Unexpected, Character-deletion
• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
Root Causes: Solutions for Handling the Unexpected
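The difference between deleting an unexpected character and substituting U+FFFD can be sketched in a few lines of Python. All three helpers below are hypothetical illustrations; the filter is a toy stand-in for real validation logic.

```python
def contains_script_tag(text):
    """Toy filter (hypothetical): look for the literal string '<script'."""
    return "<script" in text.lower()

def delete_bom(text):
    """Unsafe handling: silently delete U+FEFF."""
    return text.replace("\uFEFF", "")

def replace_bom(text):
    """Safer handling: substitute U+FFFD so the pieces never rejoin."""
    return text.replace("\uFEFF", "\uFFFD")

payload = "<scr\uFEFFipt>"

print(contains_script_tag(payload))               # filter sees nothing wrong
print(contains_script_tag(delete_bom(payload)))   # deletion re-creates the tag
print(contains_script_tag(replace_bom(payload)))  # replacement keeps it broken
```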
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. Cross-site scripting (buffer overflow of the Web)
Attack Vectors: Filter evasion
Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>
Case Study: Apple and Mozilla
DEMO
Safari BOM injection for XSS
A Closer Look: The BOM
BOM U+FEFF
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution
Root Causes: Casing
toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”
Root Causes: Casing
len(x) != len(toLower(x))
Root Causes: Casing
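Length changes under case mapping are easy to observe. The sketch below uses Python's built-in str.lower() and str.upper(); note that results are implementation-dependent, and Python's locale-independent lower() maps İ to i followed by U+0307, whereas other runtimes or locales may return a plain i as on the earlier slide.

```python
samples = {
    "\u0130": str.lower,   # LATIN CAPITAL LETTER I WITH DOT ABOVE
    "\u0390": str.upper,   # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
    "\u00DF": str.upper,   # LATIN SMALL LETTER SHARP S
}

# Show each character, its case-mapped code points, and the new length.
for ch, op in samples.items():
    mapped = op(ch)
    points = " ".join(f"U+{ord(c):04X}" for c in mapped)
    print(f"U+{ord(ch):04X} -> {points} (length {len(ch)} -> {len(mapped)})")
```

Buffers sized from the input length are therefore not guaranteed to hold the case-mapped output.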
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
Root Causes: Guidance for Casing
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution
Root Causes: Buffer Overflows
Casing: maximum expansion factors
Root Causes: Buffer Overflows
Operation   UTF         Factor   Sample
Lower       8           1.5X     Ⱥ U+023A
            16, 32      1X       A U+0041
Upper       8, 16, 32   3X       ΐ U+0390
Source: Unicode Technical Report 36
Normalization: maximum expansion factors
Root Causes: Buffer Overflows
Operation   UTF      Factor   Sample
NFC         8        3X       U+1D160
            16, 32   3X       U+FB2C
NFD         8        3X       ΐ U+0390
            16, 32   4X       ᾂ U+1F82
NFKC/NFKD   8        11X      ﷺ U+FDFA
            16, 32   18X      ﷺ U+FDFA
Source: Unicode Technical Report 36
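Several of these expansion factors can be reproduced with Python's standard unicodedata module. The helper name expansion is hypothetical; it reports the ratio of encoded sizes after and before normalization for a single character.

```python
import unicodedata

def expansion(ch, form, encoding):
    """Ratio of encoded byte length after normalization to the length before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before

# (character, normalization form, encoding) triples taken from the table.
cases = [
    ("\u0390", "NFD", "utf-8"),
    ("\u1F82", "NFD", "utf-16-le"),
    ("\uFDFA", "NFKC", "utf-8"),
    ("\uFDFA", "NFKC", "utf-16-le"),
]
for ch, form, enc in cases:
    print(f"U+{ord(ch):04X} {form} {enc}: {expansion(ch, form, enc):g}X")
```

A buffer sized for the input can therefore be too small by an order of magnitude once compatibility normalization is applied.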
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
Root Causes: Guidance for Buffer Overflows
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution
Root Causes: Controlling Syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Attacks and Exploits: Controlling syntax
• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with HTML 4.0 spec
Case Study: Opera
<a href=[U+180E]onclick=alert()>
Case Study: Opera
DEMO
Opera White Space Characters
Case Study: Opera
MVS U+180E
• Question specifications
• Be careful…
Root Causes: Guidance for Controlling Syntax
1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer
Root Causes: Specifications
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss
Root Causes: Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay
Root Causes: Guidance for Charset Transformations
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution
Root Causes: Charset Mismatches
Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input
• Force UTF-8
• Error if uncertain
Root Causes: Guidance for Charset Mismatches
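A mismatch matters because the same bytes mean different things under different charsets. The Python sketch below decodes one two-byte sequence both ways using the standard codecs; the byte values are an illustrative choice, not taken from the slides. Under Shift_JIS the pair is a single katakana letter, while under ISO-8859-1 the second byte is a backslash, a character with syntactic meaning in many contexts.

```python
raw = bytes([0x83, 0x5C])

as_sjis = raw.decode("shift_jis")      # one character: KATAKANA LETTER SO
as_latin1 = raw.decode("iso-8859-1")   # two characters, the second is '\'

print([f"U+{ord(c):04X}" for c in as_sjis])
print([f"U+{ord(c):04X}" for c in as_latin1])
```

A filter that assumes one charset while the consumer assumes the other will disagree about whether a backslash is present at all.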
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
Exploiting Unicode-enabled Software: Agenda
• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks
Tools
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher – Some of the Passive Checks Included
http://websecuritytool.codeplex.com
Tools: Watcher – Web-app Security Testing and Auditing
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Detection API
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share.
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.
Chris Weber, www.lookout.net, Casaba Security
www.casabasecurity.com
A = U+0041
Every character has a unique number, represented by a hex value
Unicode Crash Course: Code Points
Unicode Crash Course
A U+0041
Unicode Crash Course
ſ U+017F
• The full 21-bit range is not actually available
U+0000 to U+D7FF and
U+E000 to U+10FFFF
What’s up with U+D800–U+DFFF?
Unicode Crash Course: Code points
Unicode Crash Course: UTF-16 Surrogate Pairs
U+101D1
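The range U+D800 to U+DFFF is reserved so that UTF-16 can represent code points above U+FFFF as a pair of 16-bit units. The Python sketch below works the arithmetic for U+101D1; the function name to_surrogates is hypothetical, and the result is cross-checked against Python's own UTF-16 encoder.

```python
def to_surrogates(cp):
    """Split a supplementary code point into its UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need surrogates")
    offset = cp - 0x10000                # 20 bits remain
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low

high, low = to_surrogates(0x101D1)
print(f"U+101D1 -> {high:04X} {low:04X}")

# Cross-check against the built-in big-endian UTF-16 encoder.
print("\U000101D1".encode("utf-16-be").hex())
```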
UTF-8
  – Variable width, 1 to 4 bytes (used to be 6)
UTF-16
  – Endianness
  – Variable width, 2 or 4 bytes
  – Surrogate pairs
UTF-32
  – Endianness
  – Fixed width, 4 bytes
  – Fixed mapping, no algorithms needed
Unicode Crash Course: Encodings
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash coursebull Root Causes
ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareOverview
March 2009 copy 2009 Chris Weber
bull Over 100000 assigned characters
bull Many lookalikes within and across scripts
AΑАᐱᗅᗋᗩᴀᴬꜲA6553766304
wwwcasabasecuritycom
Root CausesVisual Spoofing
March 2009 copy 2009 Chris Weber
httpπαράδειγμαδοκιμή
(exampletest)
wwwcasabasecuritycom
Root CausesIDN ndash Internationalized Domain Names
March 2009 copy 2009 Chris Weber
bull IDNA 2003
bull Nameprep (NFKC and prohibit)
bull Punycodendash httpxn--hxajbheg2az3alxn--jxalpdlp
bull Whitelist TLDrsquosndash ORG DE CN to name a few
bull Language settings and TLD
bull Character blacklisting
wwwcasabasecuritycom
Root CausesIDN ndash what do the browsers do
March 2009 copy 2009 Chris Weber
bull Divergent browser implementations
bull Confusables exist
bull IDNA and Nameprep based on Unicode 32
ndash Wersquore up to Unicode 51 (larger repertoire)
wwwcasabasecuritycom
Root CausesIDN ndash so whatrsquos the problem
March 2009 copy 2009 Chris Weber
Some browsers allow COM IDNrsquos
based on script family
ndash (Latin has a big family)
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Safari
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Opera
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwgooglecom is not wwwgooɡlecom
Latin U+0069
LatinU+0261
gɡ
March 2009 copy 2009 Chris Weber
bull Normalize with NFKC
bull Homograph and Confusables detection
bull Specifications
ndash IDNA Stringprep
bull Guidance
ndash Unicode Consortium ICANN IETF IANA
wwwcasabasecuritycom
Root CausesGuidance for Visual Spoofing
March 2009 copy 2009 Chris Weber
ICANN guidelines v20
ndash Inclusion-based
ndash Script limitations
ndash Character limitations
Registries apply the guidance
ndash define the allowed characters per TLD
ndash Collaboration with IANA
Registrars sell the domain names
wwwcasabasecuritycom
Root CausesGuidance for International Domain Names
March 2009 copy 2009 Chris Weber
ICANN guidelines v20
ndash Inclusion-based
ndash Script limitations
ndash Character limitations
wwwcasabasecuritycom
Root CausesThe state of International Domain Names
Deny-all default seems to be the right concept
A script can cross many blocks Even with limited script choices therersquos plenty to choose from
Great for domain labels but sub domain labels still open to punctuation and syntax spoofing
March 2009 copy 2009 Chris Weber
bull Registrars still allow
ndash Confusables
ndash Combining marks
ndash Single Whole and Mixed-script
bull Registrars canrsquot control
ndash Syntax spoofing in sub domain labels
wwwcasabasecuritycom
Root CausesThe state of International Domain Names
March 2009 copy 2009 Chris Weber
bull Non-Unicode attacks
bull Confusables
bull Invisibles
bull Problematic font-rendering
bull Manipulating Combining Marks
bull Bidi and syntax spoofing
wwwcasabasecuritycom
Attack VectorsVisual spoofing Vectors
March 2009 copy 2009 Chris Weber
rn can look like m in certain fonts
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
wwwmulletscom is not wwwrnulletscom
Latin U+006D
LatinU+0073 U+006E
March 2009 copy 2009 Chris Weber
Are you using mono-width fonts
0 and O
1 and l
5 and S
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
Classic long URLrsquos
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
The Confusables
ndash Single script
ndash Mixed script
ndash Whole script
wwwcasabasecuritycom
Attack VectorsDefining Homographs
March 2009 copy 2009 Chris Weber
wwwɑpplecom User thinks lsquoarsquo
Really itrsquos Latin small letter Alpha lsquoɑrsquo
wwwlooĸoutnet
User thinks lsquokrsquo
Really itrsquos Latin letter kra lsquoĸrsquo
wwwcasabasecuritycom
Attack VectorsSingle-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwg๐๐glecom User thinks lsquoorsquo
Really itrsquos Thai digit zero lsquo๐rsquo
wwwfaϲebookcom
User thinks lsquocrsquo
Really itrsquos Greek lunate sigma symbol lsquocrsquo
wwwᏀooglecom
Really itrsquos Cherokee letter Nah lsquoᏀrsquo
wwwcasabasecuritycom
Attack VectorsMixed-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwаЬсcom
User thinks lsquoabcrsquo
Really itrsquos Cyrillic script
wwwігѕgov
User thinks lsquoirsrsquo
Really itrsquos Greek script
wwwcasabasecuritycom
Attack VectorsWhole-script and The Confusables
March 2009 copy 2009 Chris Weber
Browsers whitelist ORG
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Others donrsquot necessarily buthellip
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
bull ORG is whitelisted
ndash Limited characters available
bull To unscrutinizing eyes
iacute looks like i
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwmozillaorg is not wwwmoziacutellaorg
Latin U+0069
LatinU+00ED
March 2009 copy 2009 Chris Weber
(This case doesnrsquot work anymore)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
FULLWIDTH SOLIDUSU+FF0F
March 2009 copy 2009 Chris Weber
(Normalized to a U+002F)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
SOLIDUSU+002F
March 2009 copy 2009 Chris Weber
U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
U+FF65 KATAKANA MIDDLE DOT
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with and lookalikes
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes
httpwwwgooglecom
Katakana DotU+FF65
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecomノpathノfilenottrustedorg
Katakana NoU+FF89
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
IDN Visual Spoofing
March 2009 copy 2009 Chris Weber
bull Visual Spoofing Detection API
ndash Detects Confusables
ndash Detects Invisibles
ndash Detections syntax and punctuation lookalikes
ndash Detects combining mark tricks
bull Currently in testing
bull Release planned for Fall 2009
wwwcasabasecuritycom
IDN Visual SpoofingSolutions and Defenses (yes there is one)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
bull Fonts render glyphs confusingly
bull Fonts render glyphs as empty white space
httpwwwgooglecom phreedomorg
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)
A is A (Lucida Sans Unicode Courier New)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Multiple combining marks
o looks like U+006F U+0304
o is U+006F U+0304 U+0304
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Order of combining marksndash ȏ and ouml under NFKC
ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt
ltU+006F U+0311U+0308gt ltU+020F U+0308gt
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
bull httpunicodeorgreportstr9
ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo
ndash forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
March 2009 copy 2009 Chris Weber
Commonly occur in charset transformations and even innocuous APIrsquos
Impact Filter evasion Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When prime becomes
U+2032 PRIME
wwwcasabasecuritycom
Root CausesBest-fit mappings
March 2009 copy 2009 Chris Weber
Net runtime will marshall a string as LPStr to a pinvoke function
How can we best-fit the lt character
bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket
How can we best-fit the s character
bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex
To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]
wwwcasabasecuritycom
Windows best-fit pInvokeBest-fit mappings
March 2009 copy 2009 Chris Weber
bull Scrutinize charactercharset manipulation APIrsquos
bull Use EncoderFallback with SystemTextEncoding
bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
bull Use Unicode end-to-end
wwwcasabasecuritycom
Root CausesGuidance for Best-Fit mappings
March 2009 copy 2009 Chris Weber
bull A popular social networking site in 2008
bull Implemented complex filtering logic to prevent XSS
ndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting
ndash Root Cause best-fit mappings
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
-moz-binding()
was not allowed buthellip
-[U+ff4d]oz-binding()
would best-fit map
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
Normalizing strings after validation is dangerous
Impact: Filter evasion, enable code execution
Root Causes: Normalization
NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose
Root Causes: Normalization
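The four forms can be compared with Python's standard `unicodedata` module. The two sample characters (é and the ﬁ ligature) are chosen for this illustration and do not come from the slides.

```python
import unicodedata

FORMS = ("NFD", "NFC", "NFKD", "NFKC")


def all_forms(text: str) -> dict:
    """Return the text under each of the four normalization forms."""
    return {form: unicodedata.normalize(form, text) for form in FORMS}


def code_points(text: str) -> str:
    """Render a string as a list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    for sample in ("\u00e9", "\ufb01"):  # e with acute, fi ligature
        for form, result in all_forms(sample).items():
            print(form, code_points(sample), "->", code_points(result))
```

Canonical forms (NFD, NFC) leave the ligature alone; only the compatibility forms (NFKD, NFKC) rewrite it to two ASCII letters, which is why the K forms matter for filter evasion.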
İ (U+0130) becomes I (U+0049) + combining dot above (U+0307)
Root Causes: Normalization
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ (U+FE64) becomes < (U+003C)
Root Causes: Normalization
﹤ (U+FE64) becomes < (U+003C)
toNFKC("﹤script>") = "<script>"
Root Causes: Normalization
Normalize strings before validation
NFKC: first defense against visual spoofing
Root Causes: Guidance for Normalization
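The ordering problem can be sketched as follows. Both function names are hypothetical, and the "validation" is a deliberately simple check for angle brackets; the point is only the order of the two steps.

```python
import unicodedata


def contains_markup(text: str) -> bool:
    """Toy validation rule: reject ASCII angle brackets."""
    return "<" in text or ">" in text


def validate_then_normalize(text: str) -> str:
    """Unsafe order: the check runs before NFKC creates the dangerous characters."""
    if contains_markup(text):
        raise ValueError("markup rejected")
    return unicodedata.normalize("NFKC", text)


def normalize_then_validate(text: str) -> str:
    """Safe order: validate the exact string that will be used downstream."""
    normalized = unicodedata.normalize("NFKC", text)
    if contains_markup(normalized):
        raise ValueError("markup rejected")
    return normalized


# SMALL LESS-THAN SIGN and SMALL GREATER-THAN SIGN around "script"
payload = "\ufe64script\ufe65"

if __name__ == "__main__":
    print(validate_then_normalize(payload))  # the filter is bypassed
    try:
        normalize_then_validate(payload)
    except ValueError as err:
        print("blocked:", err)
```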
Non-shortest or overlong UTF-8
Impact: Filter evasion, enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '
Root Causes: Non-shortest form UTF-8
• The Unicode specification forbids
  - Generation of non-shortest form
  - Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
Root Causes: Guidance for Non-shortest form UTF-8
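To make the C0 A7 example concrete, the sketch below contrasts a hypothetical lenient two-byte decoder (one that skips the shortest-form check) with Python's built-in strict UTF-8 decoder, which throws on the same bytes.

```python
def lenient_decode_2byte(lead: int, trail: int) -> int:
    """Decode a 2-byte UTF-8 sequence WITHOUT the shortest-form check (unsafe)."""
    return ((lead & 0x1F) << 6) | (trail & 0x3F)


def strict_decode(data: bytes) -> str:
    """Decode UTF-8 and raise UnicodeDecodeError on any ill-formed sequence."""
    return data.decode("utf-8", errors="strict")


if __name__ == "__main__":
    # The lenient decoder turns the overlong pair into an apostrophe.
    print(hex(lenient_decode_2byte(0xC0, 0xA7)))
    try:
        strict_decode(b"\xc0\xa7")
    except UnicodeDecodeError as err:
        print("rejected:", err.reason)
```

A filter that scans the raw bytes for 0x27 never sees one, yet any component decoding leniently produces it.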
How many ways can you say
Attack Vectors: Directory traversal
Attack Vectors
• Directory traversal test cases
  - httpsiterootsystem
  - Overlong UTF-8, U+002F: httpsiterootC0AEsystem
  - Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  - Division Slash U+2215, best-fit: httpsiteroot E28895system
  - Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
Attack Vectors
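The byte sequences and transformations named in these test cases can be checked with a short script. It only inspects the three non-ASCII characters from the list; the dictionary name `VARIANTS` is introduced here for illustration.

```python
import unicodedata

VARIANTS = {
    "FULLWIDTH SOLIDUS": "\uff0f",
    "DIVISION SLASH": "\u2215",
    "TWO DOT LEADER": "\u2025",
}


def describe(ch: str) -> tuple:
    """Return (UTF-8 bytes as hex, result of NFKC normalization)."""
    return ch.encode("utf-8").hex().upper(), unicodedata.normalize("NFKC", ch)


if __name__ == "__main__":
    for name, ch in VARIANTS.items():
        utf8_hex, nfkc = describe(ch)
        print(f"{name}: UTF-8 {utf8_hex}, NFKC {nfkc!r}")
```

Normalization rewrites the fullwidth solidus and the two dot leader into ASCII path syntax, while the division slash is untouched by NFKC and only becomes a slash where a best-fit table maps it.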
• Unassigned code points
  - U+2073
• Illegal code points
  - Half a surrogate pair
• Code points with special meaning
  - U+FEFF is the BOM
• Impact: Filter evasion, enable code execution
Root Causes: Handling the Unexpected
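A rough way to flag these three kinds of input before they reach a parser. The function name and its return labels are hypothetical, and the "unassigned" result depends on the Unicode version bundled with the Python build in use.

```python
import unicodedata


def classify(ch: str) -> str:
    """Label a single code point as one of the unexpected kinds, or 'ok'."""
    cp = ord(ch)
    if 0xD800 <= cp <= 0xDFFF:
        return "surrogate"            # half of a UTF-16 surrogate pair
    if cp == 0xFEFF:
        return "bom"                  # byte order mark / zero width no-break space
    if unicodedata.category(ch) == "Cn":
        return "unassigned"           # not assigned in this Unicode version
    return "ok"


if __name__ == "__main__":
    for sample in ("A", "\u2073", "\ud800", "\ufeff"):
        print(f"U+{ord(sample):04X}", classify(sample))
    try:
        "\ud800".encode("utf-8")      # a lone surrogate cannot be encoded
    except UnicodeEncodeError as err:
        print("encode failed:", err.reason)
```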
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>
Root Causes: Handling the Unexpected - Over-consumption
<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
Root Causes: Handling the Unexpected - Over-consumption
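The <41 C2 3E 41> case can be reproduced with a deliberately flawed decoder. `over_consuming_decode` is a hypothetical sketch written for this note, not code from any real library; Python's own decoder is shown next to it for contrast.

```python
def over_consuming_decode(data: bytes) -> str:
    """Flawed decoder: after a 2-byte lead, it swallows the next byte unchecked."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif 0xC2 <= b <= 0xDF and i + 1 < len(data):
            trail = data[i + 1]
            if 0x80 <= trail <= 0xBF:
                out.append(chr(((b & 0x1F) << 6) | (trail & 0x3F)))
            # BUG: both bytes are consumed even when trail is not a continuation byte.
            i += 2
        else:
            i += 1  # other bytes are dropped; enough for this sketch
    return "".join(out)


if __name__ == "__main__":
    raw = b"\x41\xc2\x3e\x41"
    print(repr(over_consuming_decode(raw)))               # the '>' disappears
    print(repr(raw.decode("utf-8", errors="replace")))    # the '>' survives
```

Python replaces only the stray lead byte with U+FFFD and keeps the 3E, so the closing bracket stays where the filter saw it.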
Correcting insecurely rather than failing
  - Substituting a '' or a '' would be bad
Root Causes: Handling the Unexpected - Character-substitution
"deletion of noncharacters" (UTR-36)
Root Causes: Handling the Unexpected - Character-deletion
<scr[U+FEFF]ipt> becomes <script>
Root Causes: Handling the Unexpected - Character-deletion
• Fail or error
• Use U+FFFD instead
  - A common alternative is '' which can be safe
Root Causes: Solutions for Handling the Unexpected
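A small sketch of the difference between deleting an unexpected character and replacing it with U+FFFD, using the <scr[U+FEFF]ipt> case. The filter and both helper names are hypothetical.

```python
BOM = "\ufeff"
REPLACEMENT = "\ufffd"


def is_blocked(html: str) -> bool:
    """Toy filter that looks for a literal script tag."""
    return "<script" in html.lower()


def delete_bom(text: str) -> str:
    """Unsafe clean-up: silently removes U+FEFF, joining the text around it."""
    return text.replace(BOM, "")


def replace_bom(text: str) -> str:
    """Safer clean-up: substitutes U+FFFD so the surrounding text stays apart."""
    return text.replace(BOM, REPLACEMENT)


payload = "<scr\ufeffipt>"

if __name__ == "__main__":
    print(is_blocked(payload))               # the filter misses it
    print(repr(delete_bom(payload)))         # deletion rebuilds the tag
    print(repr(replace_bom(payload)))        # replacement does not
```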
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  - E.g. cross-site scripting (buffer overflow of the Web)
Attack Vectors: Filter evasion
Safari and Firefox BOM consumption
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  - Root cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier:
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>
Case Study: Apple and Mozilla
DEMO
Safari BOM injection for XSS
A Closer Look: The BOM
BOM: U+FEFF
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, enable code execution
Root Causes: Casing
toLower("İ") == "i"
toLower("scrİpt") == "script"
Root Causes: Casing
len(x) != len(toLower(x))
Root Causes: Casing
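Length changes under case mapping are easy to observe. Note that Python applies the full Unicode case mappings, so İ lowercases to i followed by U+0307; frameworks that use the simple one-to-one mapping produce a plain i, as on the earlier slide. The helper name is hypothetical.

```python
def case_lengths(text: str) -> dict:
    """Report code point counts before and after each casing operation."""
    return {
        "original": len(text),
        "lower": len(text.lower()),
        "upper": len(text.upper()),
    }


if __name__ == "__main__":
    for sample in ("\u0130", "\u00df", "\u0390"):  # I with dot above, sharp s, iota with dialytika and tonos
        print(f"U+{ord(sample):04X}", case_lengths(sample))
```

Any buffer sized from the input length, or any validation done before the casing call, is working with a different string than the one that comes out.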
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET
Root Causes: Guidance for Casing
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution
Root Causes: Buffer Overflows
Casing - maximum expansion factors
Root Causes: Buffer Overflows
Operation   UTF          Factor   Sample
Lower       8            1.5      Ⱥ U+023A
            16, 32       1        A U+0041
Upper       8, 16, 32    3        ΐ U+0390
Source: Unicode Technical Report 36
Normalization - maximum expansion factors
Root Causes: Buffer Overflows
Operation    UTF       Factor   Sample
NFC          8         3X       𝅘𝅥𝅮 U+1D160
             16, 32    3X       שּׁ U+FB2C
NFD          8         3X       ΐ U+0390
             16, 32    4X       ᾂ U+1F82
NFKC/NFKD    8         11X      ﷺ U+FDFA
             16, 32    18X      ﷺ U+FDFA
Source: Unicode Technical Report 36
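The normalization factors in the table can be reproduced with the standard library. The `expansion` helper is a hypothetical name; it divides the encoded size after normalization by the encoded size before.

```python
import unicodedata


def expansion(ch: str, form: str, encoding: str) -> float:
    """Ratio of encoded length after normalization to encoded length before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before


if __name__ == "__main__":
    cases = [
        ("\U0001D160", "NFC", "utf-8"),
        ("\ufb2c", "NFC", "utf-16-le"),
        ("\u0390", "NFD", "utf-8"),
        ("\u1f82", "NFD", "utf-16-le"),
        ("\ufdfa", "NFKC", "utf-8"),
        ("\ufdfa", "NFKC", "utf-16-le"),
    ]
    for ch, form, enc in cases:
        print(f"U+{ord(ch):04X} {form} {enc}: {expansion(ch, form, enc)}x")
```

The explicit little-endian codec names are used so that no byte order mark is counted in the lengths.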
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET
Root Causes: Guidance for Buffer Overflows
• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, enable code execution
Root Causes: Controlling Syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Attacks and Exploits: Controlling syntax
• Unicode formatter characters exploited for XSS
  - Damage: Filter evasion, controlling syntax
  - Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root cause: Interpreting "white space"
  - A problem with the HTML 4.0 spec
Case Study: Opera
<a href=[U+180E]onclick=alert()>
Case Study: Opera
DEMO
Opera White Space Characters
Case Study: Opera
MVS: U+180E
• Question specifications
• Be careful…
Root Causes: Guidance for Controlling Syntax
1) Character stability
  - IDNA/Nameprep based on Unicode 3.2
2) Designs
  - Specs are carefully designed but not always perfect
• This could have been a problem
  - "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  - HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer
Root Causes: Specifications
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, enable code execution, data loss
Root Causes: Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
Root Causes: Guidance for Charset Transformations
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, enable code execution
Root Causes: Charset Mismatches
Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input
• Force UTF-8
• Error if uncertain
Root Causes: Guidance for Charset Mismatches
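To see why a disagreement between ISO-8859-1 and Shift_JIS matters, the sketch below decodes the same two bytes under both charsets, then applies the "force UTF-8, error if uncertain" rule. The byte pair is an example chosen for this note, not taken from the slides.

```python
def decode_as(data: bytes, charset: str) -> str:
    """Decode bytes under a given charset label, raising on invalid input."""
    return data.decode(charset, errors="strict")


def force_utf8(data: bytes) -> str:
    """Accept input only if it is well-formed UTF-8; otherwise raise."""
    return data.decode("utf-8", errors="strict")


ambiguous = b"\x83\x5c"

if __name__ == "__main__":
    # Under Shift_JIS the pair is one katakana letter; under ISO-8859-1 the
    # second byte is a backslash, which is syntax in many contexts.
    print(repr(decode_as(ambiguous, "shift_jis")))
    print(repr(decode_as(ambiguous, "iso-8859-1")))
    try:
        force_utf8(ambiguous)
    except UnicodeDecodeError as err:
        print("rejected:", err.reason)
```

A filter and a browser that pick different labels for the same response will see different characters, so refusing anything that is not valid UTF-8 removes the ambiguity.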
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
Exploiting Unicode-enabled Software: Agenda
• Watcher
  - Web-app security testing and auditing
• Visual Spoofing Detection API
  - Providing guarantees against visual spoofing and homograph attacks
Tools
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher - Some of the Passive Checks Included
Tools
http://websecuritytool.codeplex.com
Tools: Watcher - Web-app Security Testing and Auditing
• Problem
  - Unicode enables visual-spoofing-maximus
• Solution
  - Confusable detection
  - Invisibles detection
  - Syntax spoof detection
  - more
Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  - Browsers
  - Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Detection API
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share.
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.
Chris Weber, www.lookout.net, Casaba Security
www.casabasecurity.com
Unicode Crash Course
A U+0041
Unicode Crash Course
ſ U+017F
• The full 21-bit range is not actually available
  U+0000 to U+D7FF and
  U+E000 to U+10FFFF
What's up with U+D800 to U+DFFF?
Unicode Crash Course: Code points
Unicode Crash Course: UTF-16 Surrogate Pairs
U+101D1
UTF-8
  - Variable width, 1 to 4 bytes (used to be 6)
UTF-16
  - Endianness
  - Variable width, 2 or 4 bytes
  - Surrogate pairs
UTF-32
  - Endianness
  - Fixed width, 4 bytes
  - Fixed mapping, no algorithms needed
Unicode Crash Course: Encodings
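The widths above, and the surrogate-pair arithmetic behind UTF-16, can be worked through for U+101D1, the supplementary code point used on the surrogate-pair slide. Function names are hypothetical.

```python
def to_surrogates(cp: int) -> tuple:
    """Split a supplementary code point into its UTF-16 high and low surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points above the BMP use surrogate pairs")
    offset = cp - 0x10000                 # 20 bits remain
    high = 0xD800 + (offset >> 10)        # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)       # bottom 10 bits
    return high, low


def widths(ch: str) -> dict:
    """Encoded size in bytes under each encoding form (no BOM)."""
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


if __name__ == "__main__":
    high, low = to_surrogates(0x101D1)
    print(f"U+101D1 -> {high:04X} {low:04X}")
    print(widths("A"))
    print(widths("\U000101D1"))
```

The range U+D800 to U+DFFF is reserved for exactly these halves, which is why it is carved out of the usable code point space.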
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
  - Visual Spoofing and IDNs
  - Best-fit mappings
  - Normalization
  - Overlong UTF-8
  - Over-consumption
  - Character substitution
  - Character deletion
  - Casing
  - Buffer overflows
  - Controlling syntax
  - Charset transformations
  - Charset mismatches
• Tools
Exploiting Unicode-enabled Software: Overview
• Over 100,000 assigned characters
• Many lookalikes within and across scripts
A Α А ᐱ ᗅ ᗋ ᗩ ᴀ ᴬ Ꜳ A U+10001 U+10300
Root Causes: Visual Spoofing
http://παράδειγμα.δοκιμή
(example.test)
Root Causes: IDN - Internationalized Domain Names
• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
  - http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
  - .ORG, .DE, .CN to name a few
• Language settings and TLD
• Character blacklisting
Root Causes: IDN - what do the browsers do?
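The Nameprep and Punycode steps can be observed with Python's built-in `idna` codec, which implements IDNA 2003. The domain is the Greek example.test name from the earlier IDN slide, written with escapes so the code points are unambiguous.

```python
# "παράδειγμα.δοκιμή" spelled out code point by code point
GREEK_EXAMPLE = (
    "\u03c0\u03b1\u03c1\u03ac\u03b4\u03b5\u03b9\u03b3\u03bc\u03b1"
    "."
    "\u03b4\u03bf\u03ba\u03b9\u03bc\u03ae"
)


def to_ace(domain: str) -> str:
    """Convert a Unicode domain to its ASCII-compatible (xn--) form."""
    return domain.encode("idna").decode("ascii")


def from_ace(domain: str) -> str:
    """Convert an ASCII-compatible domain back to Unicode."""
    return domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    ace = to_ace(GREEK_EXAMPLE)
    print(ace)
    print(from_ace(ace) == GREEK_EXAMPLE)
```

DNS only ever sees the xn-- form; what the user sees depends on whether the browser decides to display the Unicode form.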
• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
  - We're up to Unicode 5.1 (larger repertoire)
Root Causes: IDN - so what's the problem?
Some browsers allow .COM IDNs
based on script family
  - (Latin has a big family)
Attack Vectors: IDN homograph attacks
Safari
Attack Vectors: IDN homograph attacks
Opera
Attack Vectors: IDN homograph attacks
Attack Vectors: IDN homograph attacks
www.google.com is not www.gooɡle.com
g: Latin U+0067
ɡ: Latin U+0261
• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
  - IDNA, Stringprep
• Guidance
  - Unicode Consortium, ICANN, IETF, IANA
Root Causes: Guidance for Visual Spoofing
ICANN guidelines v2.0
  - Inclusion-based
  - Script limitations
  - Character limitations
Registries apply the guidance
  - Define the allowed characters per TLD
  - Collaboration with IANA
Registrars sell the domain names
Root Causes: Guidance for International Domain Names
ICANN guidelines v2.0
  - Inclusion-based
  - Script limitations
  - Character limitations
Root Causes: The state of International Domain Names
Deny-all default seems to be the right concept.
A script can cross many blocks. Even with limited script choices there's plenty to choose from.
Great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing.
• Registrars still allow
  - Confusables
  - Combining marks
  - Single, Whole, and Mixed-script
• Registrars can't control
  - Syntax spoofing in sub domain labels
Root Causes: The state of International Domain Names
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing
Attack Vectors: Visual spoofing Vectors
rn can look like m in certain fonts
Attack Vectors: Non-Unicode homograph attacks
www.mullets.com is not www.rnullets.com
m: Latin U+006D
rn: Latin U+0072 U+006E
Are you using mono-width fonts?
0 and O
1 and l
5 and S
Attack Vectors: Non-Unicode homograph attacks
Classic long URLs
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements
Attack Vectors: Non-Unicode homograph attacks
The Confusables
  - Single script
  - Mixed script
  - Whole script
Attack Vectors: Defining Homographs
www.ɑpple.com
User thinks 'a'
Really it's Latin small letter alpha 'ɑ'
www.looĸout.net
User thinks 'k'
Really it's Latin letter kra 'ĸ'
Attack Vectors: Single-script and The Confusables
www.g๐๐gle.com
User thinks 'o'
Really it's Thai digit zero '๐'
www.faϲebook.com
User thinks 'c'
Really it's Greek lunate sigma symbol 'ϲ'
www.Ꮐoogle.com
Really it's Cherokee letter Nah 'Ꮐ'
Attack Vectors: Mixed-script and The Confusables
www.аЬс.com
User thinks 'abc'
Really it's Cyrillic script
www.ігѕ.gov
User thinks 'irs'
Really it's Greek script
Attack Vectors: Whole-script and The Confusables
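A crude mixed-script check can be built from character names alone. This is only a heuristic sketch (real detection would use the Unicode script property and confusables data), and both function names are hypothetical: it takes the first word of each character's Unicode name as its script.

```python
import unicodedata


def scripts(label: str) -> set:
    """Approximate the set of scripts in a label from Unicode character names."""
    found = set()
    for ch in label:
        if ch.isascii() and not ch.isalpha():
            continue  # ASCII digits, hyphen and dot are shared by all scripts
        found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found


def is_mixed_script(label: str) -> bool:
    """True when a label draws characters from more than one script."""
    return len(scripts(label)) > 1


if __name__ == "__main__":
    for label in ("facebook", "fa\u03f2ebook", "g\u0e50\u0e50gle",
                  "\u0251pple", "\u0430\u042c\u0441"):
        print(ascii(label), sorted(scripts(label)), is_mixed_script(label))
```

The check flags the mixed-script cases but passes the single-script and whole-script ones, which is why confusable detection is needed in addition to script mixing rules.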
Browsers whitelist .ORG
Attack Vectors: IDN homograph attacks
Others don't necessarily, but…
Attack Vectors: IDN homograph attacks
• .ORG is whitelisted
  - Limited characters available
• To unscrutinizing eyes
  í looks like i
Attack Vectors: IDN homograph attacks
Attack Vectors: IDN homograph attacks
www.mozilla.org is not www.mozílla.org
i: Latin U+0069
í: Latin U+00ED
(This case doesn't work anymore)
Attack Vectors: IDN Syntax Spoofing with lookalikes
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS: U+FF0F
(Normalized to a U+002F)
Attack Vectors: IDN Syntax Spoofing with lookalikes
http://www.google.com/path/file.nottrusted.org
SOLIDUS: U+002F
╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT
Attack Vectors: IDN Syntax Spoofing with and lookalikes
(However punctuation not required…)
Attack Vectors: IDN Syntax Spoofing with (full stop) lookalikes
http://www･google･com
Katakana Dot: U+FF65
(However punctuation not required…)
Attack Vectors: IDN Syntax Spoofing with lookalikes
http://www.google.comノpathノfile.nottrusted.org
Katakana No: U+FF89
Attack Vectors: IDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
DEMO
IDN Visual Spoofing
• Visual Spoofing Detection API
  - Detects Confusables
  - Detects Invisibles
  - Detects syntax and punctuation lookalikes
  - Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009
IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
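Detecting these three characters is straightforward once they are known. The sketch below covers only the code points listed above; a real "invisibles" list would be longer, and the function name is hypothetical.

```python
INVISIBLES = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u180e": "MONGOLIAN VOWEL SEPARATOR",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE",
}


def find_invisibles(text: str) -> list:
    """Return (index, name) for every listed invisible character in the text."""
    return [(i, INVISIBLES[ch]) for i, ch in enumerate(text) if ch in INVISIBLES]


if __name__ == "__main__":
    print(find_invisibles("google"))
    print(find_invisibles("goo\u200bgle"))
```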
Attack Vectors: The Invisibles
Attack Vectors: Visual Spoofing - Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org
Attack Vectors: Visual Spoofing - Problematic Font-rendering
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)
Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
ō looks like U+006F U+0304
ō̄ is U+006F U+0304 U+0304
Attack Vectors: Visual Spoofing with Combining Marks
• Order of combining marks
  - ȏ and ö under NFKC
<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
<U+006F U+0311 U+0308> becomes <U+020F U+0308>
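Both combining-mark effects can be reproduced with `unicodedata`. The helper that counts marks is a hypothetical starting point for detection, not a complete defense.

```python
import unicodedata


def nfkc_points(text: str) -> list:
    """NFKC-normalize and return the resulting code points as hex strings."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFKC", text)]


def max_marks_per_base(text: str) -> int:
    """Largest run of combining marks on any base character, after NFD."""
    longest = run = 0
    for ch in unicodedata.normalize("NFD", text):
        run = run + 1 if unicodedata.combining(ch) else 0
        longest = max(longest, run)
    return longest


if __name__ == "__main__":
    print(nfkc_points("o\u0308\u0311"))   # diaeresis first
    print(nfkc_points("o\u0311\u0308"))   # inverted breve first
    print(max_marks_per_base("o\u0304"), max_marks_per_base("o\u0304\u0304"))
```

Because both marks share a combining class, normalization keeps their order, so two visually similar sequences stay distinct strings; a stacked duplicate mark also survives normalization and can be counted.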
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)
Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  - "These characters are to be avoided wherever possible, because of security concerns"
  - Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
March 2009 copy 2009 Chris Weber
Commonly occur in charset transformations and even innocuous APIrsquos
Impact Filter evasion Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When prime becomes
U+2032 PRIME
wwwcasabasecuritycom
Root CausesBest-fit mappings
March 2009 copy 2009 Chris Weber
Net runtime will marshall a string as LPStr to a pinvoke function
How can we best-fit the lt character
bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket
How can we best-fit the s character
bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex
To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]
wwwcasabasecuritycom
Windows best-fit pInvokeBest-fit mappings
March 2009 copy 2009 Chris Weber
bull Scrutinize charactercharset manipulation APIrsquos
bull Use EncoderFallback with SystemTextEncoding
bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
bull Use Unicode end-to-end
wwwcasabasecuritycom
Root CausesGuidance for Best-Fit mappings
March 2009 copy 2009 Chris Weber
bull A popular social networking site in 2008
bull Implemented complex filtering logic to prevent XSS
ndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting
ndash Root Cause best-fit mappings
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
-moz-binding()
was not allowed buthellip
-[U+ff4d]oz-binding()
would best-fit map
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
Normalizing strings after validation is dangerous
Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
NFD - Decompose (canonical)
NFC - Decompose (canonical) Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility) Recompose
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
İ becomes I +
wwwcasabasecuritycom
Root CausesNormalization
U+0130 U+0049 U+0307
March 2009 copy 2009 Chris Weber
But are there dangerous characters
You bethellip with NFKC and NFKD you could control HTML or other parsing
﹤ becomes lt
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
﹤ becomes lt
toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
Normalize strings before validation
NFKC first defense against Visual spoofing
wwwcasabasecuritycom
Root CausesGuidance for Normalization
March 2009 copy 2009 Chris Weber
Non-shortest or overlong UTF-8
Impact Filter evasion Enable code execution
Application gets C0A7
OSFramework sees 27
Database gets
wwwcasabasecuritycom
Root CausesNon-shortest form UTF-8
March 2009 copy 2009 Chris Weber
bull Unicode specification forbids
ndash Generation of non-shortest form
ndash Interpretation of non-shortest form for BMP
bull Validate UTF-8 encoding (throw on error)
wwwcasabasecuritycom
Root CausesGuidance for Non-shortest form UTF-8
March 2009 copy 2009 Chris Weber
How many ways can you say
wwwcasabasecuritycom
Attack VectorsDirectory traversal
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Directory traversal test casesndash httpsiterootsystem
ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem
ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem
ndash Division Slash U+2215 best-fithttpsiteroot E28895system
ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system
wwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Unassigned code points
ndash U+2073
bull Illegal code points
ndash Half a surrogate pair
bull Code points with special meaning
ndash U+FEFF is the BOM
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesHandling the Unexpected
March 2009 copy 2009 Chris Weber
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
lt41 C2 3E 41gt becomes
lt41 41gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
ltimg src=[0xC2]gt onerror=alert(1)ltbr gt
becomes
ltimg src=gt onerror=alert(1)ltbr gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
Correcting insecurely rather than failing
ndash Substituting a lsquorsquo or a lsquorsquo would be bad
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-substitution
March 2009 copy 2009 Chris Weber
ldquodeletion of noncharactersrdquo (UTR-36)
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
ltscr[U+FEFF]iptgt becomes ltscriptgt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
bull Fail or error
bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe
wwwcasabasecuritycom
Root CausesSolutions for Handling the Unexpected
March 2009 copy 2009 Chris Weber
bull Bypass filters WAFrsquos NIDS and validation
bull Exploit delivery techniques
ndash Eg Cross-site scripting (buffer overflow of the Web)
wwwcasabasecuritycom
Attack VectorsFilter evasion
March 2009 copy 2009 Chris Weber
Safari and Firefox BOM consumptionndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting
ndash Root Cause Character deletion
lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt
Can be nastier
lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt
wwwcasabasecuritycom
Case Study Apple and Mozilla
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Safari BOM injection for XSS
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
A Closer Look The BOM
BOMU+FEFF
March 2009 copy 2009 Chris Weber
bull Attackers manipulate casing operations to inject otherwise prohibited characters
bull Casing can multiply the buffer sizes needed
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
toLower(ldquoİrdquo) == ldquoirdquo
toLower(ldquoscrİptrdquo) == ldquoscriptrdquo
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
len(x) = len(toLower(x))
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
bull Perform casing operations before validation
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Casing
March 2009 copy 2009 Chris Weber
bull Incorrect assumptions about string sizes (chars vs bytes)
bull Improper width calculations
bull Impact Enable code execution
wwwcasabasecuritycom
Root CausesBuffer Overflows
March 2009 copy 2009 Chris Weber
Casing - maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
Lower 8 15 Ⱥ U+023A
16 32 1 A U+0041
Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
Normalization- maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
NFC8 3X 119136 U+1D160
16 32 3X ש U+FB2C
NFD8 3X ΐ U+0390
16 32 4X ᾂ U+1F82
NFKCNFKD8 11X
ملسو هيلع هللا ىلص U+FDFA16 32 18X
Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
bull Know the difference between bytes and chars
bull Secure coding
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Buffer Overflows
March 2009 copy 2009 Chris Weber
bull White space and line breaks
ndash Eg when U+180E acts like U+0020
bull Quotation marks
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesControlling Syntax
March 2009 copy 2009 Chris Weber
bull Manipulate HTML parsers and javascriptinterpreters
bull Control protocols
wwwcasabasecuritycom
Attacks and ExploitsControlling syntax
March 2009 copy 2009 Chris Weber
bull Unicode formatter characters exploited for XSS
ndash Damage Filter evasion controlling syntax
ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting
ndash Root Cause Interpreting ldquowhite spacerdquo
ndash A problem with HTML 40 spec
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
lta href=[U+180E]onclick=alert()gt
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Opera White Space Characters
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Case Study Opera
MVSU+180E
March 2009 copy 2009 Chris Weber
bull Question specifications
bull Be carefulhellip
wwwcasabasecuritycom
Root CausesGuidance for Controlling Syntax
March 2009 copy 2009 Chris Weber
1) Character stabilityndash IDNANameprep based on Unicode 32
Unicode Crash Course
ſ U+017F (LATIN SMALL LETTER LONG S)

Unicode Crash Course: Code points
• The full 21-bit range is not actually available
  – U+0000 to U+D7FF and
  – U+E000 to U+10FFFF
• What’s up with U+D800–U+DFFF?

Unicode Crash Course: UTF-16 Surrogate Pairs
U+101D1

Unicode Crash Course: Encodings
• UTF-8
  – Variable width, 1 to 4 bytes (used to be 6)
• UTF-16
  – Endianness
  – Variable width, 2 or 4 bytes
  – Surrogate pairs
• UTF-32
  – Endianness
  – Fixed width, 4 bytes
  – Fixed mapping, no algorithms needed
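The surrogate-pair arithmetic behind the code point and encoding slides can be sketched in a few lines of Python. This is an illustration only, not material from the deck; the helper name `surrogate_pair` is hypothetical.

```python
def surrogate_pair(cp: int) -> tuple:
    """Split a supplementary code point (U+10000..U+10FFFF) into UTF-16 surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only supplementary code points need a surrogate pair")
    offset = cp - 0x10000               # 20-bit value
    high = 0xD800 + (offset >> 10)      # top 10 bits -> high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits -> low (trail) surrogate
    return high, low

ch = "\U000101D1"                       # the code point shown on the surrogate-pair slide
high, low = surrogate_pair(ord(ch))
print(f"U+{ord(ch):04X} -> {high:04X} {low:04X}")

# Encoded size of the same character in each encoding form.
for name in ("utf-8", "utf-16-be", "utf-32-be"):
    print(name, len(ch.encode(name)), "bytes")
```

Because U+D800 to U+DFFF are reserved for this mechanism, they never stand for characters on their own, which is why the usable range has a hole in it.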
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Exploiting Unicode-enabled Software: Overview
• Unicode crash course
• Root Causes
  – Visual Spoofing and IDNs
  – Best-fit mappings
  – Normalization
  – Overlong UTF-8
  – Over-consumption
  – Character substitution
  – Character deletion
  – Casing
  – Buffer overflows
  – Controlling Syntax
  – Charset transformations
  – Charset mismatches
• Tools
Root Causes: Visual Spoofing
• Over 100,000 assigned characters
• Many lookalikes within and across scripts
A Α А ᐱ ᗅ ᗋ ᗩ ᴀ ᴬ Ꜳ A 𐀁 𐌀

Root Causes: IDN – Internationalized Domain Names
http://παράδειγμα.δοκιμή
(example.test)
Root Causes: IDN – what do the browsers do?
• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
  – http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
  – .ORG, .DE, .CN to name a few
• Language settings and TLD
• Character blacklisting
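The step from the Unicode form of a name to the Punycode form that DNS sees can be tried with Python's built-in `idna` codec, which follows IDNA 2003 (Nameprep, then Punycode per label). This is a sketch for experimentation, not a description of what any particular browser does.

```python
# The Greek test name from the IDN slide, written with escapes so the
# code points are unambiguous.
host = (
    "\u03c0\u03b1\u03c1\u03ac\u03b4\u03b5\u03b9\u03b3\u03bc\u03b1"
    "."
    "\u03b4\u03bf\u03ba\u03b9\u03bc\u03ae"
)

ascii_form = host.encode("idna")          # ToASCII: Nameprep, then Punycode for each label
print(ascii_form.decode("ascii"))         # the "xn--" form that is sent to DNS

round_trip = ascii_form.decode("idna")    # ToUnicode: what a user-agent may choose to display
print(round_trip == host)
```

Whether the Unicode or the `xn--` form is shown to the user is a policy decision of the user-agent, which is where the whitelists and blacklists listed above come in.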
Root Causes: IDN – so what’s the problem?
• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
  – We’re up to Unicode 5.1 (larger repertoire)
Attack Vectors: IDN homograph attacks
• Some browsers allow .COM IDNs based on script family
  – (Latin has a big family)
• Examples: Safari, Opera

www.google.com is not www.gooɡle.com
  g – Latin, U+0067
  ɡ – Latin, U+0261
Root Causes: Guidance for Visual Spoofing
• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
  – IDNA, Stringprep
• Guidance
  – Unicode Consortium, ICANN, IETF, IANA
Root Causes: Guidance for International Domain Names
• ICANN guidelines v2.0
  – Inclusion-based
  – Script limitations
  – Character limitations
• Registries apply the guidance
  – Define the allowed characters per TLD
  – Collaboration with IANA
• Registrars sell the domain names

Root Causes: The state of International Domain Names
• ICANN guidelines v2.0
  – Inclusion-based: deny-all default seems to be the right concept
  – Script limitations: a script can cross many blocks; even with limited script choices there’s plenty to choose from
  – Character limitations: great for domain labels, but subdomain labels are still open to punctuation and syntax spoofing
• Registrars still allow
  – Confusables
  – Combining marks
  – Single, Whole and Mixed-script
• Registrars can’t control
  – Syntax spoofing in subdomain labels
Attack Vectors: Visual spoofing vectors
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks
rn can look like m in certain fonts
www.mullets.com is not www.rnullets.com
  m – Latin, U+006D
  rn – Latin, U+0072 U+006E
Attack Vectors: Non-Unicode homograph attacks
• Are you using mono-width fonts?
  – 0 and O
  – 1 and l
  – 5 and S
• Classic long URLs
  – httploginfacebookintvitationvideomessageid-h048892r39sessionnfbidcomhomehtmdisbursements
Attack Vectors: Defining Homographs
The Confusables
  – Single script
  – Mixed script
  – Whole script

Attack Vectors: Single-script and The Confusables
www.ɑpple.com
  – User thinks ‘a’
  – Really it’s Latin small letter alpha ‘ɑ’
www.looĸout.net
  – User thinks ‘k’
  – Really it’s Latin small letter kra ‘ĸ’
Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com
  – User thinks ‘o’
  – Really it’s Thai digit zero ‘๐’
www.faϲebook.com
  – User thinks ‘c’
  – Really it’s Greek lunate sigma symbol ‘ϲ’
www.Ꮐoogle.com
  – Really it’s Cherokee letter nah ‘Ꮐ’

Attack Vectors: Whole-script and The Confusables
www.аЬс.com
  – User thinks ‘abc’
  – Really it’s Cyrillic script
www.ігѕ.gov
  – User thinks ‘irs’
  – Really it’s Cyrillic script
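A rough way to tell the single-, mixed- and whole-script cases apart is to look at which scripts a label draws on. The sketch below is a heuristic under a stated assumption: Python's standard library does not expose the Unicode Script property, so the first word of each character name (LATIN, GREEK, CYRILLIC, ...) stands in for it. The function names are hypothetical.

```python
import unicodedata

def scripts_in(label: str) -> set:
    """Approximate the set of scripts used by the letters and digits of a label."""
    found = set()
    for ch in label:
        if not ch.isalnum():
            continue                      # skip dots, hyphens and other punctuation
        if ord(ch) < 0x80:
            found.add("LATIN")            # treat ASCII letters and digits as Latin here
        else:
            # Assumption: first word of the character name approximates the script.
            found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found

def is_mixed_script(label: str) -> bool:
    return len(scripts_in(label)) > 1

for label in ("fa\u03f2ebook", "g\u0e50\u0e50gle", "\u0430\u042c\u0441", "\u0251pple"):
    print(ascii(label), sorted(scripts_in(label)), is_mixed_script(label))
```

The last two labels show the limits of script mixing as a signal: a whole-script spoof and a single-script spoof both pass, so they need confusable data rather than a script check.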
Attack Vectors: IDN homograph attacks
• Browsers whitelist .ORG
• Others don’t necessarily, but…
• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes
  – í looks like i

www.mozilla.org is not www.mozílla.org
  i – Latin, U+0069
  í – Latin, U+00ED
Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F
(This case doesn’t work anymore)

http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F
(Normalized to a U+002F)

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes
╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT
Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
httpwwwgooglecom
Katakana Dot U+FF65
(However, punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.comノpathノfile.nottrusted.org
Katakana No U+FF89
(However, punctuation not required…)
• Browser sees and displays a valid IDN
• DNS sees Punycode
IDN Visual Spoofing: DEMO

IDN Visual Spoofing: Solutions and Defenses (yes there is one)
• Visual Spoofing Detection API
  – Detects Confusables
  – Detects Invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009
Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
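Characters like these can be located mechanically even though they cannot be seen. The following sketch flags the three code points named on the slide plus anything in general category Cf (format characters); the set and the function name are illustrative assumptions, not a complete list of invisible characters.

```python
import unicodedata

# The three code points from the slide; a starting point, not an exhaustive list.
SLIDE_INVISIBLES = {"\u200b", "\u180e", "\ufeff"}

def find_invisibles(text: str) -> list:
    """Return (index, 'U+XXXX') for characters that are likely to render as nothing."""
    hits = []
    for i, ch in enumerate(text):
        # The explicit set covers code points whose general category has changed
        # between Unicode versions (U+180E is one), so the result does not depend
        # on the Unicode data shipped with the interpreter.
        if ch in SLIDE_INVISIBLES or unicodedata.category(ch) == "Cf":
            hits.append((i, f"U+{ord(ch):04X}"))
    return hits

print(find_invisibles("pay\u200bpal"))
print(find_invisibles("paypal"))
```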
Attack Vectors: Visual Spoofing, Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org
··· is ··· (Arial, Gothic)
A is A (Lucida Sans Unicode, Courier New)
Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
  – ō looks like <U+006F U+0304>
  – ō is <U+006F U+0304 U+0304>
• Order of combining marks – ȏ and ö under NFKC
  – <U+006F U+0308 U+0311> → <U+00F6 U+0311>
  – <U+006F U+0311 U+0308> → <U+020F U+0308>
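Both combining-mark tricks can be reproduced with the standard `unicodedata` module. The sketch below normalizes the two orderings from the slide and adds a hypothetical helper, `has_repeated_mark`, that flags the same mark applied twice in a row.

```python
import unicodedata

def nfkc_codepoints(s: str) -> list:
    """Code points of the NFKC form, as 'U+XXXX' strings."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFKC", s)]

# Same base letter and the same two marks, in a different order.
first = "o\u0308\u0311"    # o + diaeresis + inverted breve
second = "o\u0311\u0308"   # o + inverted breve + diaeresis
print(nfkc_codepoints(first))
print(nfkc_codepoints(second))

def has_repeated_mark(s: str) -> bool:
    """Flag an identical combining mark stacked twice (e.g. U+0304 U+0304)."""
    decomposed = unicodedata.normalize("NFD", s)
    return any(
        a == b and unicodedata.combining(a) != 0
        for a, b in zip(decomposed, decomposed[1:])
    )

print(has_repeated_mark("o\u0304\u0304"), has_repeated_mark("o\u0304"))
```

Both marks sit in the same combining class, so normalization keeps their order and the two strings stay distinct even though they may render alike.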
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  – “These characters are to be avoided wherever possible, because of security concerns”
  – Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
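Explicit directional controls are easy to detect because `unicodedata.bidirectional` reports their Bidi class. The sketch below is a minimal check; the file name is a made-up example and `bidi_controls` is a hypothetical helper.

```python
import unicodedata

# Bidi classes of the explicit embedding and override controls and their terminator.
EXPLICIT_BIDI = {"LRE", "RLE", "LRO", "RLO", "PDF"}

def bidi_controls(text: str) -> list:
    """List the explicit Bidi control characters found in text."""
    return [
        f"U+{ord(c):04X}"
        for c in text
        if unicodedata.bidirectional(c) in EXPLICIT_BIDI
    ]

# Hypothetical file name: the text after U+202E is laid out right to left,
# so a renderer may show the tail as "exe.pdf".
name = "invoice_\u202efdp.exe"
print(bidi_controls(name))
print(bidi_controls("plain.txt"))
```

Rejecting or escaping such input is usually simpler than trying to predict how a given renderer will reorder it.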
Root Causes: Best-fit mappings
• Commonly occur in charset transformations and even innocuous APIs
• Impact: filter evasion, enable code execution
• When σ becomes s
  – U+03C3 GREEK SMALL LETTER SIGMA
• When ′ becomes '
  – U+2032 PRIME
Windows best-fit P/Invoke: Best-fit mappings
• The .NET runtime will marshal a string as LPStr to a P/Invoke function
• How can we best-fit the < character?
  – U+2329 (Left-Pointing Angle Bracket) maps to U+003C
  – U+3008 (Left Angle Bracket) maps to U+003C
• How can we best-fit the s character?
  – U+015B (Latin Small Letter S With Acute) maps to U+0073
  – U+015D (Latin Small Letter S With Circumflex) maps to U+0073
• To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
Case Study: Social Networking – Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root cause: best-fit mappings
• -moz-binding() was not allowed, but…
• -[U+FF4D]oz-binding() would best-fit map
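The shape of this bypass can be shown without Windows. Python's codecs do not perform best-fit mapping, so the sketch below uses NFKC as a stand-in for the lossy conversion (an assumption made purely for illustration; NFKC also folds U+FF4D to `m`). The filter is hypothetical and far simpler than the one in the case study.

```python
import unicodedata

BLOCKED = "-moz-binding"

def naive_filter(value: str) -> bool:
    """Hypothetical filter: accept the value if the blocked token is absent."""
    return BLOCKED not in value.lower()

def lossy_fold(value: str) -> str:
    # Stand-in for a best-fit conversion that happens after filtering.
    return unicodedata.normalize("NFKC", value)

payload = "-\uff4doz-binding()"
print(naive_filter(payload))              # the filter finds nothing to block
print(BLOCKED in lossy_fold(payload))     # the later conversion recreates the token

# A strict conversion fails instead of substituting a lookalike.
try:
    payload.encode("cp1252")              # errors="strict" is the default
except UnicodeEncodeError as exc:
    print("rejected:", exc.reason)
```

The design point is the ordering: any lossy conversion has to run before the filter, or be replaced by one that raises on unmappable input.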
Root Causes: Normalization
• Normalizing strings after validation is dangerous
• Impact: filter evasion, enable code execution
• Normalization forms
  – NFD: Decompose (canonical)
  – NFC: Decompose (canonical), Recompose
  – NFKD: Decompose (compatibility)
  – NFKC: Decompose (compatibility), Recompose
• İ becomes I + ◌̇
  – U+0130 → U+0049 U+0307
• But are there dangerous characters?
  – You bet… with NFKC and NFKD you could control HTML or other parsing
  – ﹤ becomes <
  – U+FE64 → U+003C
  – toNFKC(“﹤script>”) = “<script>”

Root Causes: Guidance for Normalization
• Normalize strings before validation
• NFKC: first defense against visual spoofing
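The ordering advice can be made concrete with `unicodedata.normalize`. In this sketch the validator is a deliberately simple, hypothetical rule (no angle brackets); the point is only where normalization sits relative to it.

```python
import unicodedata

def is_safe(fragment: str) -> bool:
    """Hypothetical validator: reject anything containing an angle bracket."""
    return "<" not in fragment and ">" not in fragment

def validate(fragment: str) -> str:
    normalized = unicodedata.normalize("NFKC", fragment)   # normalize first...
    if not is_safe(normalized):                             # ...then validate the result
        raise ValueError("markup not allowed")
    return normalized

tricky = "\ufe64script\ufe65"   # SMALL LESS-THAN SIGN ... SMALL GREATER-THAN SIGN
print(is_safe(tricky))          # validating the raw input misses the brackets
print(unicodedata.normalize("NFKC", tricky))

try:
    validate(tricky)
except ValueError as exc:
    print("rejected:", exc)
```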
Root Causes: Non-shortest form UTF-8
• Non-shortest or overlong UTF-8
• Impact: filter evasion, enable code execution
  – Application gets C0 A7
  – OS/Framework sees 27
  – Database gets '

Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
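The C0 A7 case can be checked by hand. The function below is a deliberately lenient two-byte decoder written for this illustration (it is not how any specific product decodes); it omits the shortest-form check, which is exactly the flaw being described. Python's own decoder is shown as the strict counterpart.

```python
def lenient_utf8_2byte(data: bytes) -> str:
    """Decode one 2-byte sequence without checking for the shortest form."""
    lead, trail = data
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))

overlong = bytes([0xC0, 0xA7])
print(repr(lenient_utf8_2byte(overlong)))   # the payload bits are 0x27

try:
    overlong.decode("utf-8")                # strict by default: throws on error
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```

A filter that looks for byte 0x27 never sees it in C0 A7, while a lenient component further down the stack still recovers the apostrophe.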
Attack Vectors: Directory traversal
How many ways can you say

Attack Vectors
• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF8 U+002F: httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
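Which of these lookalikes turn into real path syntax under compatibility normalization can be checked directly. The sketch below covers only the normalization cases; the best-fit and overlong cases depend on the converter in use and are not modelled. `has_traversal` is a hypothetical helper.

```python
import unicodedata

candidates = {
    "FULLWIDTH SOLIDUS": "\uff0f",
    "DIVISION SLASH": "\u2215",
    "TWO DOT LEADER": "\u2025",
}
for label, ch in candidates.items():
    folded = unicodedata.normalize("NFKC", ch)
    print(f"{label}: U+{ord(ch):04X} -> {folded!r}")

def has_traversal(path: str) -> bool:
    """Look for '..' segments only after compatibility normalization."""
    normalized = unicodedata.normalize("NFKC", path)
    return ".." in normalized.split("/")

print(has_traversal("root\uff0f\u2025\uff0fsystem"))
```

The division slash has no compatibility decomposition, so normalization leaves it alone; it becomes dangerous only where a best-fit table maps it to U+002F.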
Root Causes: Handling the Unexpected
• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: filter evasion, enable code execution

Root Causes: Handling the Unexpected – Over-consumption
• Over-consuming ill-formed byte sequences
• Big problem with MBCS lead bytes
  – <41 C2 3E 41> becomes <41 41>
  – <img src=[0xC2]> onerror=alert(1)<br /> becomes <img src=> onerror=alert(1)<br />
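The byte example can be replayed with a decoder that has the over-consumption bug built in. The function below is intentionally flawed and was written only for this illustration; it handles ASCII and two-byte sequences and drops everything else.

```python
def over_consuming_decode(data: bytes) -> str:
    """Flawed on purpose: a bad 2-byte sequence is dropped whole,
    including the byte that followed the lead byte."""
    out, i = [], 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif 0xC2 <= b <= 0xDF and i + 1 < len(data):
            trail = data[i + 1]
            if 0x80 <= trail <= 0xBF:
                out.append(chr(((b & 0x1F) << 6) | (trail & 0x3F)))
            i += 2      # bug: skips the next byte even if it was not a continuation byte
        else:
            i += 1      # drop anything this sketch does not handle
    return "".join(out)

sample = bytes([0x41, 0xC2, 0x3E, 0x41])
print(over_consuming_decode(sample))                  # the '>' (0x3E) disappears
print(ascii(sample.decode("utf-8", errors="replace")))  # only the bad byte is replaced
```

Python's decoder replaces just the ill-formed lead byte and resumes at 0x3E, so the closing delimiter survives.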
Root Causes: Handling the Unexpected – Character-substitution
• Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected – Character-deletion
• “deletion of noncharacters” (UTR-36)
• <scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘?’, which can be safe
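The difference between deleting an unexpected code point and replacing it with U+FFFD is easy to see on the `<scr[U+FEFF]ipt>` example. The filter here is a hypothetical substring check, kept minimal on purpose.

```python
BOM = "\ufeff"

def filter_blocks(markup: str) -> bool:
    """Hypothetical filter: block input that contains a script tag."""
    return "<script" in markup.lower()

payload = "<scr\ufeffipt>"
deleted = payload.replace(BOM, "")           # what a deleting consumer ends up with
replaced = payload.replace(BOM, "\ufffd")    # substitution keeps the token broken

print(filter_blocks(payload))    # the filter lets the raw payload through
print(filter_blocks(deleted))    # after deletion the tag is whole again
print(filter_blocks(replaced))   # after replacement it is still not a tag
```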
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla
• Safari and Firefox BOM consumption
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root cause: character deletion
• <a href="java[U+FEFF]script:alert('XSS')">
• Can be nastier
  – <a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

Safari BOM injection for XSS: DEMO

A Closer Look: The BOM
BOM U+FEFF
Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enable code execution
• toLower(“İ”) == “i”
• toLower(“scrİpt”) == “script”
• len(x) != len(toLower(x))

Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
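Length changes under case mapping are straightforward to observe in Python, which applies the full Unicode case mappings. Note one difference from the slide: Python lowercases U+0130 to `i` followed by U+0307 rather than to a bare `i`, so the exact result of `toLower` depends on the implementation. `fits_after_casing` is a hypothetical helper.

```python
samples = ["\u0130", "\u00df", "\u0390"]   # İ, ß, ΐ
for ch in samples:
    print(
        f"U+{ord(ch):04X}",
        "lower:", [f"U+{ord(c):04X}" for c in ch.lower()],
        "upper:", [f"U+{ord(c):04X}" for c in ch.upper()],
    )

def fits_after_casing(text: str, capacity: int) -> bool:
    """Size checks belong after the case mapping, not before it."""
    return len(text.upper()) <= capacity

print(fits_after_casing("\u0390", 1))      # one code point in, three out
```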
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: enable code execution

Casing – maximum expansion factors
Operation  UTF        Factor  Sample
Lower      8          1.5X    Ⱥ U+023A
Lower      16, 32     1X      A U+0041
Upper      8, 16, 32  3X      ΐ U+0390
Source: Unicode Technical Report 36

Normalization – maximum expansion factors
Operation  UTF     Factor  Sample
NFC        8       3X      U+1D160
NFC        16, 32  3X      U+FB2C
NFD        8       3X      ΐ U+0390
NFD        16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
NFKC/NFKD  16, 32  18X     ﷺ U+FDFA
Source: Unicode Technical Report 36
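The factors in the normalization table can be recomputed from the sample characters. The sketch below measures encoded size before and after normalization; `expansion` is a hypothetical helper, and `utf-16-le` is used so that no byte order mark is counted.

```python
import unicodedata

def expansion(ch: str, form: str, encoding: str) -> float:
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before

rows = [
    ("NFC", "\U0001D160"),
    ("NFC", "\uFB2C"),
    ("NFD", "\u0390"),
    ("NFD", "\u1F82"),
    ("NFKC", "\uFDFA"),
]
for form, ch in rows:
    print(
        form, f"U+{ord(ch):04X}",
        "utf-8:", expansion(ch, form, "utf-8"),
        "utf-16:", expansion(ch, form, "utf-16-le"),
    )
```

Sizing an output buffer from the input length alone ignores these factors, which is how a normalization or casing call turns into an overflow.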
Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Controlling Syntax
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enable code execution

Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Case Study: Opera
• Unicode formatter characters exploited for XSS
  – Damage: filter evasion, controlling syntax
  – Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root cause: interpreting “white space”
  – A problem with HTML 4.0 spec
• <a href=[U+180E]onclick=alert()>

Opera White Space Characters: DEMO

Case Study: Opera
MVS U+180E

Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…
Root Causes: Specifications
1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
  • This could have been a problem
    – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  • HTML 4.01
    – Defines four whitespace characters and explicitly leaves handling other characters up to implementer
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enable code execution, data loss

Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enable code execution

Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
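One way to read "Force UTF-8" and "Error if uncertain" on the input side is sketched below: accept only bytes that are declared as UTF-8 and that decode cleanly, and never guess. The function name and the accepted labels are assumptions made for this example.

```python
def decode_body(raw: bytes, declared_charset) -> str:
    """Accept only input declared as UTF-8 that also decodes cleanly as UTF-8."""
    label = (declared_charset or "").strip().lower()
    if label not in ("utf-8", "utf8"):
        raise ValueError("refusing to guess the charset")
    return raw.decode("utf-8")    # strict: raises UnicodeDecodeError on ill-formed input

# On output, send one unambiguous label in the HTTP header rather than
# leaving the user-agent to sniff.
CONTENT_TYPE = "text/html; charset=utf-8"

print(decode_body("caf\u00e9".encode("utf-8"), "UTF-8"))
for raw, label in ((b"\x82\xa0", "utf-8"), (b"abc", "shift_jis"), (b"abc", None)):
    try:
        decode_body(raw, label)
    except ValueError as exc:     # UnicodeDecodeError is a subclass of ValueError
        print("rejected:", type(exc).__name__)
```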
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools
• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com
Tools: Visual Spoofing Detection API
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net
Casaba Security, www.casabasecurity.com
March 2009 copy 2009 Chris Weber
bull The full 21-bit range is not actually available
U+0000 to U+D7FF and
U+E000 to U+10FFF
whatrsquos up with U+D800U+DFFF
wwwcasabasecuritycom
Unicode Crash CourseCode points
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash CourseUTF-16 Surrogate Pairs
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Unicode Crash CourseUTF-16 Surrogate Pairs
U+101D1
March 2009 copy 2009 Chris Weber
UTF-8 ndash variable width 1 to 4 bytes (used to be 6)
UTF-16ndash Endianessndash Variable width 2 or 4 bytesndash Surrogate pairs
UTF-32ndash Endianessndash Fixed width 4 bytesndash Fixed mapping no algorithms needed
wwwcasabasecuritycom
Unicode Crash CourseEncodings
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash coursebull Root Causes
ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareOverview
March 2009 copy 2009 Chris Weber
bull Over 100000 assigned characters
bull Many lookalikes within and across scripts
AΑАᐱᗅᗋᗩᴀᴬꜲA6553766304
wwwcasabasecuritycom
Root CausesVisual Spoofing
March 2009 copy 2009 Chris Weber
httpπαράδειγμαδοκιμή
(exampletest)
wwwcasabasecuritycom
Root CausesIDN ndash Internationalized Domain Names
March 2009 copy 2009 Chris Weber
bull IDNA 2003
bull Nameprep (NFKC and prohibit)
bull Punycodendash httpxn--hxajbheg2az3alxn--jxalpdlp
bull Whitelist TLDrsquosndash ORG DE CN to name a few
bull Language settings and TLD
bull Character blacklisting
wwwcasabasecuritycom
Root CausesIDN ndash what do the browsers do
March 2009 copy 2009 Chris Weber
bull Divergent browser implementations
bull Confusables exist
bull IDNA and Nameprep based on Unicode 32
ndash Wersquore up to Unicode 51 (larger repertoire)
wwwcasabasecuritycom
Root CausesIDN ndash so whatrsquos the problem
March 2009 copy 2009 Chris Weber
Some browsers allow COM IDNrsquos
based on script family
ndash (Latin has a big family)
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Safari
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Opera
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwgooglecom is not wwwgooɡlecom
Latin U+0069
LatinU+0261
gɡ
March 2009 copy 2009 Chris Weber
bull Normalize with NFKC
bull Homograph and Confusables detection
bull Specifications
ndash IDNA Stringprep
bull Guidance
ndash Unicode Consortium ICANN IETF IANA
wwwcasabasecuritycom
Root CausesGuidance for Visual Spoofing
March 2009 copy 2009 Chris Weber
ICANN guidelines v20
ndash Inclusion-based
ndash Script limitations
ndash Character limitations
Registries apply the guidance
ndash define the allowed characters per TLD
ndash Collaboration with IANA
Registrars sell the domain names
wwwcasabasecuritycom
Root CausesGuidance for International Domain Names
March 2009 copy 2009 Chris Weber
ICANN guidelines v20
ndash Inclusion-based
ndash Script limitations
ndash Character limitations
wwwcasabasecuritycom
Root CausesThe state of International Domain Names
Deny-all default seems to be the right concept
A script can cross many blocks Even with limited script choices therersquos plenty to choose from
Great for domain labels but sub domain labels still open to punctuation and syntax spoofing
March 2009 copy 2009 Chris Weber
bull Registrars still allow
ndash Confusables
ndash Combining marks
ndash Single Whole and Mixed-script
bull Registrars canrsquot control
ndash Syntax spoofing in sub domain labels
wwwcasabasecuritycom
Root CausesThe state of International Domain Names
March 2009 copy 2009 Chris Weber
bull Non-Unicode attacks
bull Confusables
bull Invisibles
bull Problematic font-rendering
bull Manipulating Combining Marks
bull Bidi and syntax spoofing
wwwcasabasecuritycom
Attack VectorsVisual spoofing Vectors
March 2009 copy 2009 Chris Weber
rn can look like m in certain fonts
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
wwwmulletscom is not wwwrnulletscom
Latin U+006D
LatinU+0073 U+006E
March 2009 copy 2009 Chris Weber
Are you using mono-width fonts
0 and O
1 and l
5 and S
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
Classic long URLrsquos
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
The Confusables
ndash Single script
ndash Mixed script
ndash Whole script
wwwcasabasecuritycom
Attack VectorsDefining Homographs
March 2009 copy 2009 Chris Weber
wwwɑpplecom User thinks lsquoarsquo
Really itrsquos Latin small letter Alpha lsquoɑrsquo
wwwlooĸoutnet
User thinks lsquokrsquo
Really itrsquos Latin letter kra lsquoĸrsquo
wwwcasabasecuritycom
Attack VectorsSingle-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwg๐๐glecom User thinks lsquoorsquo
Really itrsquos Thai digit zero lsquo๐rsquo
wwwfaϲebookcom
User thinks lsquocrsquo
Really itrsquos Greek lunate sigma symbol lsquocrsquo
wwwᏀooglecom
Really itrsquos Cherokee letter Nah lsquoᏀrsquo
wwwcasabasecuritycom
Attack VectorsMixed-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwаЬсcom
User thinks lsquoabcrsquo
Really itrsquos Cyrillic script
wwwігѕgov
User thinks lsquoirsrsquo
Really itrsquos Greek script
wwwcasabasecuritycom
Attack VectorsWhole-script and The Confusables
March 2009 copy 2009 Chris Weber
Browsers whitelist ORG
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Others donrsquot necessarily buthellip
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
bull ORG is whitelisted
ndash Limited characters available
bull To unscrutinizing eyes
iacute looks like i
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwmozillaorg is not wwwmoziacutellaorg
Latin U+0069
LatinU+00ED
March 2009 copy 2009 Chris Weber
(This case doesnrsquot work anymore)

Attack Vectors: IDN Syntax Spoofing with lookalikes
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F
(Normalized to a U+002F)

Attack Vectors: IDN Syntax Spoofing with lookalikes
http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F
╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with and lookalikes
(However punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with (full stop) lookalikes
http://www･google･com
Katakana Dot U+FF65
(However punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with lookalikes
http://www.google.comノpathノfile.nottrusted.org
Katakana No U+FF89

Attack Vectors: IDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode

DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes there is one)
• Visual Spoofing Detection API
– Detects Confusables
– Detects Invisibles
– Detects syntax and punctuation lookalikes
– Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009
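
A rough sketch of what a syntax-lookalike check on a single host label could look like. This is not the Visual Spoofing Detection API from the talk; the function name and the lookalike set are assumptions, seeded only with the characters named on the slides.

```python
import unicodedata

URL_SYNTAX = set("/.?#:@")
# Lookalikes named on the slides (U+30CE is the NFKC form of U+FF89).
# This is an illustrative set, not an exhaustive table.
KNOWN_LOOKALIKES = {"\u2571", "\u3033", "\ua738", "\ua739",
                    "\uff65", "\uff89", "\u30ce"}

def syntax_spoof_chars(label):
    """Return characters in one host label that imitate URL syntax."""
    hits = []
    for ch in label:
        # NFKC folds compatibility forms, e.g. U+FF0F becomes "/".
        folded = unicodedata.normalize("NFKC", ch)
        if ch in KNOWN_LOOKALIKES or any(c in URL_SYNTAX for c in folded):
            hits.append(ch)
    return hits

print(syntax_spoof_chars("google\uff0fpath"))
print(syntax_spoof_chars("com\uff89path"))
print(syntax_spoof_chars("google"))
```

The input is assumed to be one label already split on real dots, so any character that is, or folds to, URL syntax is treated as suspicious.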

Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
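
A small sketch of an invisibles scan, assuming the three code points above plus anything in the Unicode format category (Cf). The function name is hypothetical.

```python
import unicodedata

# The three code points listed on the slide.
INVISIBLES = {"\u200b", "\u180e", "\ufeff"}

def find_invisibles(text):
    """Return (index, code point) pairs for invisible or format characters."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if ch in INVISIBLES or unicodedata.category(ch) == "Cf"
    ]

print(find_invisibles("pay\u200bpal"))
print(find_invisibles("paypal"))
```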

Attack Vectors: Visual Spoofing: Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org

Attack Vectors: Visual Spoofing: Problematic Font-rendering
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
ō looks like U+006F U+0304
ō̄ is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks
• Order of combining marks
– ȏ and ö under NFKC
<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
<U+006F U+0311 U+0308> becomes <U+020F U+0308>
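
A short sketch of both combining-mark tricks, using Python's standard unicodedata module. The helper name is hypothetical; the code points are the ones from the slides.

```python
import unicodedata

def max_mark_run(text):
    """Length of the longest run of consecutive combining marks."""
    longest = run = 0
    for ch in text:
        run = run + 1 if unicodedata.combining(ch) else 0
        longest = max(longest, run)
    return longest

single = "o\u0304"        # o + one macron
double = "o\u0304\u0304"  # o + two macrons, may render much like the first
print(max_mark_run(single), max_mark_run(double))

# Same marks, different order: normalization keeps them distinct.
a = unicodedata.normalize("NFKC", "o\u0308\u0311")
b = unicodedata.normalize("NFKC", "o\u0311\u0308")
print([f"U+{ord(c):04X}" for c in a])
print([f"U+{ord(c):04X}" for c in b])
```

Flagging any run longer than one mark is a blunt policy and would reject some legitimate text, so treat the threshold as an assumption.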

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
– “These characters are to be avoided wherever possible, because of security concerns”
– forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
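
A minimal sketch of a check for the two explicit override characters, keyed on their bidirectional class rather than hard-coded code points. The function name and the sample file name are illustrative assumptions.

```python
import unicodedata

# Bidi classes of U+202D (LRO) and U+202E (RLO).
OVERRIDE_CLASSES = {"LRO", "RLO"}

def has_bidi_override(text):
    """True when the text contains an explicit directional override."""
    return any(unicodedata.bidirectional(ch) in OVERRIDE_CLASSES for ch in text)

# A display-reversing name: the part after U+202E is shown right to left.
print(has_bidi_override("photo\u202egpj.exe"))
print(has_bidi_override("photo.jpg"))
```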

Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
U+2032 PRIME

Windows best-fit P/Invoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 maps to U+003C: Left-Pointing Angle Bracket
• U+3008 maps to U+003C: Left Angle Bracket
How can we best-fit the s character?
• U+015B maps to U+0073: Latin Small Letter S With Acute
• U+015D maps to U+0073: Latin Small Letter S With Circumflex
To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
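
The "fail instead of best-fit" idea can be sketched in Python, used here only as a stand-in for the .NET and Win32 settings above. Python's built-in codecs do not best-fit, so strict encoding simply raises on anything the target charset cannot represent. The helper name is hypothetical.

```python
def is_representable(text, charset="cp1252"):
    """True only if every character maps exactly into the legacy charset."""
    try:
        text.encode(charset, errors="strict")  # raise rather than substitute
        return True
    except UnicodeEncodeError:
        return False

# U+FF4D FULLWIDTH LATIN SMALL LETTER M has no exact cp1252 mapping.
print(is_representable("-\uff4doz-binding()"))
print(is_representable("-moz-binding()"))
```

Rejecting unrepresentable input up front means a later conversion has nothing left to best-fit.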

Case Study Social Networking: Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
– Root Cause: best-fit mappings

Case Study Social Networking: Best-fit mappings
-moz-binding()
was not allowed but…
-[U+ff4d]oz-binding()
would best-fit map

Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution

Root Causes: Normalization
NFD – Decompose (canonical)
NFC – Decompose (canonical), Recompose
NFKD – Decompose (compatibility)
NFKC – Decompose (compatibility), Recompose

Root Causes: Normalization
İ becomes I + ◌̇
U+0130 becomes U+0049 U+0307

Root Causes: Normalization
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
U+FE64 becomes U+003C

Root Causes: Normalization
﹤ becomes <
toNFKC(“﹤script>”) = “<script>”
U+FE64 becomes U+003C

Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC: first defense against Visual spoofing
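
A minimal sketch of normalizing before validating, using Python's unicodedata module. The validator is deliberately simplistic and its names are hypothetical; it only illustrates the ordering.

```python
import unicodedata

def contains_markup(text):
    """Toy check standing in for a real HTML filter."""
    return "<" in text or ">" in text

def validate(user_input):
    """Normalize first, then validate the normalized form."""
    normalized = unicodedata.normalize("NFKC", user_input)
    if contains_markup(normalized):
        raise ValueError("markup characters present after normalization")
    return normalized

# U+FE64 and U+FE65 are the small less-than and greater-than signs.
payload = "\ufe64script\ufe65"
print(contains_markup(payload))               # raw check sees no ASCII brackets
print(unicodedata.normalize("NFKC", payload))
```

Validating the raw string and normalizing afterwards would let the payload through, which is the failure the slides describe.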

Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
– Generation of non-shortest form
– Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
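
"Validate and throw on error" can be sketched with Python's strict UTF-8 decoder, which rejects the overlong C0 A7 sequence from the slide instead of reading it as 0x27. The function name is an assumption.

```python
def decode_utf8_strict(raw):
    """Decode UTF-8, raising UnicodeDecodeError on any ill-formed sequence."""
    return raw.decode("utf-8", errors="strict")

overlong_apostrophe = bytes([0xC0, 0xA7])  # non-shortest form of U+0027

try:
    decode_utf8_strict(overlong_apostrophe)
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)

print(decode_utf8_strict(b"'"))  # the shortest form decodes normally
```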

Attack Vectors: Directory traversal
How many ways can you say

Attack Vectors
• Directory traversal test cases
– httpsiterootsystem
– Overlong UTF8 U+002F: httpsiterootC0AEsystem
– Full-width Solidus U+FF0F, Normalized KC or KD: httpsiteroot EFBC8Fsystem
– Division Slash U+2215, best-fit: httpsiteroot E28895system
– Two dot leader U+2025, Normalized KC or KD: httpsiterootE280A5 E280A5 system
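
A sketch of how the normalization-based test cases above behave under NFKC, and why a traversal check should run on the folded string. The function name is hypothetical; note that the division slash is a best-fit case and is not changed by NFKC.

```python
import unicodedata

def has_traversal(segment):
    """Check one path segment for separators or '..' after NFKC folding."""
    folded = unicodedata.normalize("NFKC", segment)
    return ".." in folded or "/" in folded or "\\" in folded

print(has_traversal("\u2025"))    # TWO DOT LEADER folds to ".."
print(has_traversal("a\uff0fb"))  # FULLWIDTH SOLIDUS folds to "/"
print(has_traversal("\u2215"))    # DIVISION SLASH is untouched by NFKC
print(has_traversal("system"))
```

Because the division slash only becomes "/" through a legacy charset conversion, it needs the best-fit guidance rather than normalization.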

Root Causes: Handling the Unexpected
• Unassigned code points
– U+2073
• Illegal code points
– Half a surrogate pair
• Code points with special meaning
– U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected: Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected: Over-consumption
<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />

Root Causes: Handling the Unexpected: Character-substitution
Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected: Character-deletion
“deletion of noncharacters” (UTR-36)

Root Causes: Handling the Unexpected: Character-deletion
<scr[U+FEFF]ipt> becomes <script>
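
A toy illustration of the deletion problem, assuming a naive blocklist filter followed by a later component that silently strips U+FEFF. Both filters are hypothetical and far simpler than a real XSS filter.

```python
BOM = "\ufeff"

def naive_filter(html):
    """Accept input unless it literally contains '<script'."""
    return "<script" not in html.lower()

def safer_filter(html):
    """Reject input carrying U+FEFF instead of trusting later deletion."""
    if BOM in html:
        return False
    return naive_filter(html)

payload = "<scr" + BOM + "ipt>"
print(naive_filter(payload))      # passes the naive check
print(payload.replace(BOM, ""))   # what a deleting consumer ends up with
print(safer_filter(payload))
```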

Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
– A common alternative is ‘’ which can be safe
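
Both options can be sketched with Python's UTF-8 decoder on the <41 C2 3E 41> sequence from the over-consumption slide. Strict mode fails; replacement mode substitutes U+FFFD for the stray lead byte without swallowing the 3E that follows. Function names are assumptions.

```python
data = bytes([0x41, 0xC2, 0x3E, 0x41])  # "A", stray lead byte, ">", "A"

def decode_or_fail(raw):
    """Option 1: raise on ill-formed input."""
    return raw.decode("utf-8")

def decode_with_replacement(raw):
    """Option 2: substitute U+FFFD for each ill-formed sequence."""
    return raw.decode("utf-8", errors="replace")

try:
    decode_or_fail(data)
except UnicodeDecodeError:
    print("strict decoding refused the input")

print([f"U+{ord(c):04X}" for c in decode_with_replacement(data)])
```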

Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
– E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root Cause: Character deletion
<a href=“java[U+FEFF]script:alert(„XSS‟)>
Can be nastier
<a h[U+FEFF]ref=“java[U+FEFF]script:al[U+FEFF]ert(„XSS‟)>

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM
BOM U+FEFF

Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing
toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”

Root Causes: Casing
len(x) != len(toLower(x))

Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
– ICU, .NET
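
A short sketch of how casing changes string length, using Python's built-in full case mappings. Python's result for İ differs from the toLower() behavior on the slide (it keeps a combining dot), which is itself a reminder that casing varies by framework. The helper name is hypothetical.

```python
def changes_length_under_casing(text):
    """True when lowercasing or uppercasing alters the code point count."""
    return len(text.lower()) != len(text) or len(text.upper()) != len(text)

dotted_i = "\u0130"    # LATIN CAPITAL LETTER I WITH DOT ABOVE
iota = "\u0390"        # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
sharp_s = "\u00df"     # LATIN SMALL LETTER SHARP S

print([f"U+{ord(c):04X}" for c in dotted_i.lower()])
print(len(iota), len(iota.upper()))
print(sharp_s.upper())
print(changes_length_under_casing("script"))
```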

Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows
Casing: maximum expansion factors
Operation   UTF         Factor   Sample
Lower       8           1.5X     Ⱥ U+023A
            16, 32      1X       A U+0041
Upper       8, 16, 32   3X       ΐ U+0390
Source: Unicode Technical Report 36

Root Causes: Buffer Overflows
Normalization: maximum expansion factors
Operation   UTF         Factor   Sample
NFC         8           3X       𝅘𝅥𝅮 U+1D160
            16, 32      3X       שּׁ U+FB2C
NFD         8           3X       ΐ U+0390
            16, 32      4X       ᾂ U+1F82
NFKC/NFKD   8           11X      ﷺ U+FDFA
            16, 32      18X      ﷺ U+FDFA
Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
– ICU, .NET
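
The expansion factors in the tables can be checked with a few lines of Python. This is a sketch for sizing buffers, not a replacement for the UTR 36 figures; the function name is an assumption.

```python
import unicodedata

def expansion(text, form, encoding):
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(text.encode(encoding))
    after = len(unicodedata.normalize(form, text).encode(encoding))
    return after / before

print(expansion("\ufdfa", "NFKC", "utf-8"))      # sample from the NFKC row
print(expansion("\ufdfa", "NFKC", "utf-16-le"))
print(expansion("\u0390", "NFD", "utf-8"))
print(expansion("\u1f82", "NFD", "utf-16-le"))
```

Measuring in encoded bytes rather than characters is the point: the same string grows by different factors depending on the encoding.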

Root Causes: Controlling Syntax
• White space and line breaks
– E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera
• Unicode formatter characters exploited for XSS
– Damage: Filter evasion, controlling syntax
– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
– Root Cause: Interpreting “white space”
– A problem with HTML 4.0 spec

Case Study: Opera
<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

Case Study: Opera
MVS U+180E

Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…
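
One way to "be careful" is to assume a parser may treat more characters as separators than the spec lists. The sketch below flags any non-ASCII-space character in the separator, control, or format categories; the category set and function names are assumptions, not a rule from any specification.

```python
import unicodedata

# Categories a lenient parser might treat as white space (an assumption).
SEPARATOR_CATEGORIES = {"Zs", "Zl", "Zp", "Cc", "Cf"}

def may_act_as_separator(ch):
    """True for characters that could plausibly be parsed as white space."""
    return ch.isspace() or unicodedata.category(ch) in SEPARATOR_CATEGORIES

def suspicious_separators(text):
    """List code points, other than U+0020, that might split attributes."""
    return [f"U+{ord(c):04X}" for c in text
            if c != " " and may_act_as_separator(c)]

print(suspicious_separators("<a href=\u180eonclick=alert()>"))
print(suspicious_separators("<a href=x>"))
```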

Root Causes: Specifications
1) Character stability
– IDNA/Nameprep based on Unicode 3.2
2) Designs
– Specs are carefully designed but not always perfect
• This could have been a problem
– “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
– HTML 4.01
• Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
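
A small sketch of why a mismatch matters: the same two bytes are two characters (one of them a backslash) under ISO-8859-1 but a single character under Shift_JIS. The byte pair is an illustrative choice, not taken from the talk.

```python
raw = b"\x81\x5c"  # 0x5C is a backslash on its own in ASCII-compatible charsets

as_latin1 = raw.decode("iso-8859-1")  # what a filter might see
as_sjis = raw.decode("shift_jis")     # what a sniffing user-agent might see

print(len(as_latin1), "\\" in as_latin1)
print(len(as_sjis), "\\" in as_sjis)
```

If the filter and the consumer disagree on the charset, a syntax-significant byte can appear or vanish between them, which is the reason for forcing one encoding end to end.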

Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools
• Watcher
– Web-app security testing and auditing
• Visual Spoofing Detection API
– Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API
• Problem
– Unicode enables visual-spoofing-maximus
• Solution
– Confusable detection
– Invisibles detection
– Syntax spoof detection
– more

Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
– Browsers
– Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net, Casaba Security

Unicode Crash Course: UTF-16 Surrogate Pairs
U+101D1

Unicode Crash Course: Encodings
UTF-8
– Variable width, 1 to 4 bytes (used to be 6)
UTF-16
– Endianness
– Variable width, 2 or 4 bytes
– Surrogate pairs
UTF-32
– Endianness
– Fixed width, 4 bytes
– Fixed mapping, no algorithms needed
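
The widths above can be seen by encoding the U+101D1 sample from the surrogate-pair slide with Python's standard codecs. Big-endian variants are chosen here only to avoid a byte order mark in the output.

```python
ch = chr(0x101D1)  # a supplementary-plane character, outside the BMP

utf8 = ch.encode("utf-8")
utf16 = ch.encode("utf-16-be")
utf32 = ch.encode("utf-32-be")

print(len(utf8), utf8.hex())    # 4 bytes
print(len(utf16), utf16.hex())  # 4 bytes: a high and a low surrogate
print(len(utf32), utf32.hex())  # 4 bytes: the code point itself

# An ASCII letter for comparison: 1, 2, and 4 bytes respectively.
print([len("A".encode(e)) for e in ("utf-8", "utf-16-be", "utf-32-be")])
```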

Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Exploiting Unicode-enabled Software: Overview
• Unicode crash course
• Root Causes
– Visual Spoofing and IDNs
– Best-fit mappings
– Normalization
– Overlong UTF-8
– Over-consumption
– Character substitution
– Character deletion
– Casing
– Buffer overflows
– Controlling Syntax
– Charset transformations
– Charset mismatches
• Tools

Root Causes: Visual Spoofing
• Over 100,000 assigned characters
• Many lookalikes within and across scripts
AΑАᐱᗅᗋᗩᴀᴬꜲA𐀁𐌀

Root Causes: IDN – Internationalized Domain Names
http://παράδειγμα.δοκιμή
(example.test)

Root Causes: IDN – what do the browsers do?
• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
– http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
– ORG, DE, CN to name a few
• Language settings and TLD
• Character blacklisting
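
The Nameprep and Punycode steps can be tried with Python's built-in "idna" codec, which implements IDNA 2003. This is a sketch of the conversion only; the whitelist and blacklist policies listed above are browser-specific and not modeled here.

```python
import unicodedata

host = "παράδειγμα.δοκιμή"   # the example.test name from the slide

ace = host.encode("idna")     # Nameprep, then Punycode, label by label
back = ace.decode("idna")     # what a browser could display again

print(ace)
print(back)
```

The ASCII form is what DNS resolves, while the Unicode form is what the user sees, so any spoofing check has to look at the displayed form.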

Root Causes: IDN – so what’s the problem?
• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
– We’re up to Unicode 5.1 (larger repertoire)

Attack Vectors: IDN homograph attacks
Some browsers allow COM IDNs based on script family
– (Latin has a big family)

Attack Vectors: IDN homograph attacks
Safari

Attack Vectors: IDN homograph attacks
Opera

Attack Vectors: IDN homograph attacks
www.google.com is not www.gooɡle.com
Latin g U+0067
Latin ɡ U+0261

Root Causes: Guidance for Visual Spoofing
• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
– IDNA, Stringprep
• Guidance
– Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for International Domain Names
ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations
Registries apply the guidance
– define the allowed characters per TLD
– Collaboration with IANA
Registrars sell the domain names

Root Causes: The state of International Domain Names
ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations
Deny-all default seems to be the right concept
A script can cross many blocks. Even with limited script choices there’s plenty to choose from
Great for domain labels, but sub domain labels still open to punctuation and syntax spoofing

Root Causes: The state of International Domain Names
• Registrars still allow
– Confusables
– Combining marks
– Single, Whole and Mixed-script
• Registrars can’t control
– Syntax spoofing in sub domain labels

Attack Vectors: Visual spoofing Vectors
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks
rn can look like m in certain fonts
www.mullets.com is not www.rnullets.com
Latin m U+006D
Latin r n U+0072 U+006E

Attack Vectors: Non-Unicode homograph attacks
Are you using mono-width fonts?
0 and O
1 and l
5 and S

Attack Vectors: Non-Unicode homograph attacks
Classic long URLs
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs
The Confusables
– Single script
– Mixed script
– Whole script

Attack Vectors: Single-script and The Confusables
www.ɑpple.com
User thinks ‘a’
Really it’s Latin small letter Alpha ‘ɑ’
www.looĸout.net
User thinks ‘k’
Really it’s Latin letter kra ‘ĸ’

Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com
User thinks ‘o’
Really it’s Thai digit zero ‘๐’
www.faϲebook.com
User thinks ‘c’
Really it’s Greek lunate sigma symbol ‘ϲ’
www.Ꮐoogle.com
Really it’s Cherokee letter Nah ‘Ꮐ’
March 2009 copy 2009 Chris Weber
wwwаЬсcom
User thinks lsquoabcrsquo
Really itrsquos Cyrillic script
wwwігѕgov
User thinks lsquoirsrsquo
Really itrsquos Greek script
wwwcasabasecuritycom
Attack VectorsWhole-script and The Confusables
March 2009 copy 2009 Chris Weber
Browsers whitelist ORG
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Others donrsquot necessarily buthellip
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
bull ORG is whitelisted
ndash Limited characters available
bull To unscrutinizing eyes
iacute looks like i
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwmozillaorg is not wwwmoziacutellaorg
Latin U+0069
LatinU+00ED
March 2009 copy 2009 Chris Weber
(This case doesnrsquot work anymore)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
FULLWIDTH SOLIDUSU+FF0F
March 2009 copy 2009 Chris Weber
(Normalized to a U+002F)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
SOLIDUSU+002F
March 2009 copy 2009 Chris Weber
U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
U+FF65 KATAKANA MIDDLE DOT
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with and lookalikes
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes
httpwwwgooglecom
Katakana DotU+FF65
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecomノpathノfilenottrustedorg
Katakana NoU+FF89
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
IDN Visual Spoofing
March 2009 copy 2009 Chris Weber
bull Visual Spoofing Detection API
ndash Detects Confusables
ndash Detects Invisibles
ndash Detections syntax and punctuation lookalikes
ndash Detects combining mark tricks
bull Currently in testing
bull Release planned for Fall 2009
wwwcasabasecuritycom
IDN Visual SpoofingSolutions and Defenses (yes there is one)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
bull Fonts render glyphs confusingly
bull Fonts render glyphs as empty white space
httpwwwgooglecom phreedomorg
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)
A is A (Lucida Sans Unicode Courier New)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Multiple combining marks
o looks like U+006F U+0304
o is U+006F U+0304 U+0304
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Order of combining marksndash ȏ and ouml under NFKC
ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt
ltU+006F U+0311U+0308gt ltU+020F U+0308gt
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
bull httpunicodeorgreportstr9
ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo
ndash forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
March 2009 copy 2009 Chris Weber
Commonly occur in charset transformations and even innocuous APIrsquos
Impact Filter evasion Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When prime becomes
U+2032 PRIME
wwwcasabasecuritycom
Root CausesBest-fit mappings
March 2009 copy 2009 Chris Weber
Net runtime will marshall a string as LPStr to a pinvoke function
How can we best-fit the lt character
bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket
How can we best-fit the s character
bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex
To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]
wwwcasabasecuritycom
Windows best-fit pInvokeBest-fit mappings
March 2009 copy 2009 Chris Weber
bull Scrutinize charactercharset manipulation APIrsquos
bull Use EncoderFallback with SystemTextEncoding
bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
bull Use Unicode end-to-end
wwwcasabasecuritycom
Root CausesGuidance for Best-Fit mappings
March 2009 copy 2009 Chris Weber
bull A popular social networking site in 2008
bull Implemented complex filtering logic to prevent XSS
ndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting
ndash Root Cause best-fit mappings
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
-moz-binding()
was not allowed buthellip
-[U+ff4d]oz-binding()
would best-fit map
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
Normalizing strings after validation is dangerous
Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
NFD - Decompose (canonical)
NFC - Decompose (canonical) Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility) Recompose
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
İ becomes I +
wwwcasabasecuritycom
Root CausesNormalization
U+0130 U+0049 U+0307
March 2009 copy 2009 Chris Weber
But are there dangerous characters
You bethellip with NFKC and NFKD you could control HTML or other parsing
﹤ becomes lt
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
﹤ becomes lt
toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
Normalize strings before validation
NFKC first defense against Visual spoofing
wwwcasabasecuritycom
Root CausesGuidance for Normalization
March 2009 copy 2009 Chris Weber
Non-shortest or overlong UTF-8
Impact Filter evasion Enable code execution
Application gets C0A7
OSFramework sees 27
Database gets
wwwcasabasecuritycom
Root CausesNon-shortest form UTF-8
March 2009 copy 2009 Chris Weber
bull Unicode specification forbids
ndash Generation of non-shortest form
ndash Interpretation of non-shortest form for BMP
bull Validate UTF-8 encoding (throw on error)
wwwcasabasecuritycom
Root CausesGuidance for Non-shortest form UTF-8
March 2009 copy 2009 Chris Weber
How many ways can you say
wwwcasabasecuritycom
Attack VectorsDirectory traversal
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Directory traversal test casesndash httpsiterootsystem
ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem
ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem
ndash Division Slash U+2215 best-fithttpsiteroot E28895system
ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system
wwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Unassigned code points
ndash U+2073
bull Illegal code points
ndash Half a surrogate pair
bull Code points with special meaning
ndash U+FEFF is the BOM
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesHandling the Unexpected
March 2009 copy 2009 Chris Weber
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
lt41 C2 3E 41gt becomes
lt41 41gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
ltimg src=[0xC2]gt onerror=alert(1)ltbr gt
becomes
ltimg src=gt onerror=alert(1)ltbr gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
Correcting insecurely rather than failing
ndash Substituting a lsquorsquo or a lsquorsquo would be bad
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-substitution
March 2009 copy 2009 Chris Weber
ldquodeletion of noncharactersrdquo (UTR-36)
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
ltscr[U+FEFF]iptgt becomes ltscriptgt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
bull Fail or error
bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe
wwwcasabasecuritycom
Root CausesSolutions for Handling the Unexpected
March 2009 copy 2009 Chris Weber
bull Bypass filters WAFrsquos NIDS and validation
bull Exploit delivery techniques
ndash Eg Cross-site scripting (buffer overflow of the Web)
wwwcasabasecuritycom
Attack VectorsFilter evasion
March 2009 copy 2009 Chris Weber
Safari and Firefox BOM consumptionndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting
ndash Root Cause Character deletion
lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt
Can be nastier
lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt
wwwcasabasecuritycom
Case Study Apple and Mozilla
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Safari BOM injection for XSS
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
A Closer Look The BOM
BOMU+FEFF
March 2009 copy 2009 Chris Weber
bull Attackers manipulate casing operations to inject otherwise prohibited characters
bull Casing can multiply the buffer sizes needed
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
toLower(ldquoİrdquo) == ldquoirdquo
toLower(ldquoscrİptrdquo) == ldquoscriptrdquo
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
len(x) = len(toLower(x))
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
bull Perform casing operations before validation
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Casing
March 2009 copy 2009 Chris Weber
bull Incorrect assumptions about string sizes (chars vs bytes)
bull Improper width calculations
bull Impact Enable code execution
wwwcasabasecuritycom
Root CausesBuffer Overflows
March 2009 copy 2009 Chris Weber
Casing - maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
Lower 8 15 Ⱥ U+023A
16 32 1 A U+0041
Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
Root Causes: Buffer Overflows
Normalization: maximum expansion factors
Operation   UTF      Factor   Sample
NFC         8        3X       U+1D160
            16, 32   3X       U+FB2C
NFD         8        3X       ΐ U+0390
            16, 32   4X       ᾂ U+1F82
NFKC/NFKD   8        11X      ﷺ U+FDFA
            16, 32   18X      ﷺ U+FDFA
Source: Unicode Technical Report #36
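The normalization factors can be checked with a short Python sketch. It assumes the standard `unicodedata` module; the helper name `expansion_factor` is illustrative only.

```python
import unicodedata


def expansion_factor(ch: str, form: str, encoding: str) -> float:
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before


# U+FDFA under NFKC is the worst case quoted from UTR #36.
utf8_factor = expansion_factor("\ufdfa", "NFKC", "utf-8")
utf16_factor = expansion_factor("\ufdfa", "NFKC", "utf-16-le")  # "-le" avoids a BOM
nfd_utf16_factor = expansion_factor("\u1f82", "NFD", "utf-16-le")
```

A buffer sized from the input length alone is too small once a normalization step runs.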
Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
– ICU, .NET
Root Causes: Controlling Syntax
• White space and line breaks
– E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion; enable code execution
Attacks and Exploits: Controlling Syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Case Study: Opera
• Unicode formatter characters exploited for XSS
– Damage: Filter evasion, controlling syntax
– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
– Root Cause: Interpreting “white space”
– A problem with the HTML 4.0 spec
<a href=[U+180E]onclick=alert()>
DEMO: Opera White Space Characters
Case Study: Opera
• MVS (Mongolian Vowel Separator): U+180E
Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…
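A sketch of the mismatch, in Python. Both the filter and the lenient attribute splitter are hypothetical; the splitter only models a user-agent that treats U+180E as white space and is not Opera's actual parser.

```python
import re

# The four white space characters named by HTML 4.01, plus CR and LF.
HTML4_SPACE = " \t\f\u200b\r\n"

# Hypothetical filter: block event-handler attributes that follow white space.
EVENT_ATTR = re.compile("[" + HTML4_SPACE + r"]on\w+\s*=", re.IGNORECASE)


def filter_allows(markup: str) -> bool:
    return EVENT_ATTR.search(markup) is None


def lenient_attributes(tag: str) -> list:
    """Hypothetical parser that also splits on U+180E."""
    return re.split("[" + HTML4_SPACE + "\u180e]+", tag.strip("<>"))


payload = "<a href=\u180eonclick=alert()>"
allowed = filter_allows(payload)
tokens = lenient_attributes(payload)
```

The filter sees no white space before `onclick`, while the lenient parser splits it out as a separate attribute.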
Root Causes: Specifications
1) Character stability
– IDNA/Nameprep based on Unicode 3.2
2) Designs
– Specs are carefully designed but not always perfect
• This could have been a problem
– “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
– HTML 4.01
• Defines four whitespace characters and explicitly leaves handling other characters up to the implementer
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion; enable code execution; data loss
Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion; enable code execution
Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input
Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
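A small Python illustration of why the two declarations matter. The byte pair is an assumed example, not taken from the slides: the same bytes contain a backslash under ISO-8859-1 and none under Shift_JIS. `decode_or_fail` is a hypothetical helper showing the "force UTF-8, error if uncertain" guidance.

```python
raw = b"\x83\x5c"  # attacker-controlled bytes

as_latin1 = raw.decode("iso-8859-1")  # what a filter trusting the header sees
as_sjis = raw.decode("shift_jis")     # what a client trusting the meta tag sees


def decode_or_fail(data: bytes) -> str:
    """Force UTF-8 and raise UnicodeDecodeError instead of guessing."""
    return data.decode("utf-8", errors="strict")


try:
    decode_or_fail(raw)
    rejected = False
except UnicodeDecodeError:
    rejected = True
```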
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
Tools
• Watcher
– Web-app security testing and auditing
• Visual Spoofing Detection API
– Providing guarantees against Visual Spoofing and Homograph attacks
Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher – Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com
Tools: Visual Spoofing Detection API
• Problem
– Unicode enables visual-spoofing-maximus
• Solution
– Confusable detection
– Invisibles detection
– Syntax spoof detection
– more
Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
– Browsers
– Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net, Casaba Security
Unicode Crash Course: UTF-16 Surrogate Pairs
• U+101D1
Unicode Crash Course: Encodings
• UTF-8
– Variable width: 1 to 4 bytes (used to be 6)
• UTF-16
– Endianness
– Variable width: 2 or 4 bytes
– Surrogate pairs
• UTF-32
– Endianness
– Fixed width: 4 bytes
– Fixed mapping, no algorithms needed
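For the sample code point U+101D1, the three encodings can be compared with a short Python sketch. The helper `to_surrogates` is illustrative; it applies the standard UTF-16 surrogate arithmetic.

```python
def to_surrogates(code_point: int) -> tuple:
    """Split a supplementary code point into UTF-16 high and low surrogates."""
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


ch = "\U000101D1"
high, low = to_surrogates(ord(ch))
utf8_bytes = ch.encode("utf-8")       # 4 bytes
utf16_bytes = ch.encode("utf-16-be")  # one surrogate pair, 4 bytes
utf32_bytes = ch.encode("utf-32-be")  # the code point itself, 4 bytes
```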
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
Exploiting Unicode-enabled Software: Overview
• Unicode crash course
• Root Causes
– Visual Spoofing and IDNs
– Best-fit mappings
– Normalization
– Overlong UTF-8
– Over-consumption
– Character substitution
– Character deletion
– Casing
– Buffer overflows
– Controlling Syntax
– Charset transformations
– Charset mismatches
• Tools
Root Causes: Visual Spoofing
• Over 100,000 assigned characters
• Many lookalikes within and across scripts
A Α А ᐱ ᗅ ᗋ ᗩ ᴀ ᴬ Ꜳ A U+10001 U+10300
Root Causes: IDN – Internationalized Domain Names
http://παράδειγμα.δοκιμή
(example.test)
Root Causes: IDN – what do the browsers do?
• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
– http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
– .ORG, .DE, .CN to name a few
• Language settings and TLD
• Character blacklisting
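The Unicode-to-Punycode step can be reproduced with Python's built-in `idna` codec, which implements IDNA 2003 (Nameprep followed by Punycode). This is a sketch of the conversion only, not of any browser's display policy.

```python
# The example domain from the slide, written with escapes to avoid ambiguity.
unicode_name = (
    "\u03c0\u03b1\u03c1\u03ac\u03b4\u03b5\u03b9\u03b3\u03bc\u03b1"  # first label
    "."
    "\u03b4\u03bf\u03ba\u03b9\u03bc\u03ae"                          # second label
)

ascii_name = unicode_name.encode("idna").decode("ascii")  # what DNS sees
round_trip = ascii_name.encode("ascii").decode("idna")    # what the user sees
```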
Root Causes: IDN – so what’s the problem?
• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
– We’re up to Unicode 5.1 (larger repertoire)
Attack Vectors: IDN homograph attacks
• Some browsers allow .COM IDNs based on script family
– (Latin has a big family)
• Safari
• Opera
Attack Vectors: IDN homograph attacks
www.google.com is not www.gooɡle.com
g: Latin U+0067
ɡ: Latin U+0261
Root Causes: Guidance for Visual Spoofing
• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
– IDNA, Stringprep
• Guidance
– Unicode Consortium, ICANN, IETF, IANA
Root Causes: Guidance for International Domain Names
• ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations
• Registries apply the guidance
– Define the allowed characters per TLD
– Collaboration with IANA
• Registrars sell the domain names
Root Causes: The state of International Domain Names
• ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations
• Deny-all default seems to be the right concept
• A script can cross many blocks. Even with limited script choices there’s plenty to choose from
• Great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing
Root Causes: The state of International Domain Names
• Registrars still allow
– Confusables
– Combining marks
– Single, Whole, and Mixed-script
• Registrars can’t control
– Syntax spoofing in sub domain labels
Attack Vectors: Visual spoofing vectors
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing
Attack Vectors: Non-Unicode homograph attacks
rn can look like m in certain fonts
www.mullets.com is not www.rnullets.com
m: Latin U+006D
rn: Latin U+0072 U+006E
Attack Vectors: Non-Unicode homograph attacks
Are you using mono-width fonts?
0 and O
1 and l
5 and S
Attack Vectors: Non-Unicode homograph attacks
Classic long URLs
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements
Attack Vectors: Defining Homographs
• The Confusables
– Single script
– Mixed script
– Whole script
Attack Vectors: Single-script and The Confusables
www.ɑpple.com
– User thinks ‘a’
– Really it’s Latin small letter alpha ‘ɑ’
www.looĸout.net
– User thinks ‘k’
– Really it’s Latin letter kra ‘ĸ’
Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com
– User thinks ‘o’
– Really it’s Thai digit zero ‘๐’
www.faϲebook.com
– User thinks ‘c’
– Really it’s Greek lunate sigma symbol ‘ϲ’
www.Ꮐoogle.com
– Really it’s Cherokee letter Nah ‘Ꮐ’
Attack Vectors: Whole-script and The Confusables
www.аЬс.com
– User thinks ‘abc’
– Really it’s Cyrillic script
www.ігѕ.gov
– User thinks ‘irs’
– Really it’s Cyrillic script
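A crude Python sketch of mixed-script detection. Python's standard library does not expose the Unicode Script property, so this approximates it from the first word of each character name. That approximation and the helper name `rough_scripts` are assumptions made for illustration.

```python
import unicodedata


def rough_scripts(label: str) -> set:
    """Approximate each letter's script by the first word of its Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}


mixed = rough_scripts("fa\u03f2ebook")       # Latin letters plus U+03F2
whole = rough_scripts("\u0430\u042c\u0441")  # every letter is Cyrillic
```

A mixed-script check flags the first label but not the second. Whole-script and single-script confusables need confusable-character data rather than a script check.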
Attack Vectors: IDN homograph attacks
• Browsers whitelist .ORG
• Others don’t necessarily, but…
Attack Vectors: IDN homograph attacks
• .ORG is whitelisted
– Limited characters available
• To unscrutinizing eyes
– í looks like i
Attack Vectors: IDN homograph attacks
www.mozilla.org is not www.mozílla.org
i: Latin U+0069
í: Latin U+00ED
Attack Vectors: IDN Syntax Spoofing with / lookalikes
(This case doesn’t work anymore)
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS: U+FF0F
Attack Vectors: IDN Syntax Spoofing with / lookalikes
(Normalized to a U+002F)
http://www.google.com/path/file.nottrusted.org
SOLIDUS: U+002F
Attack Vectors: IDN Syntax Spoofing with and lookalikes
╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT
Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
(However, punctuation not required…)
http://www･google･com
Katakana Dot: U+FF65
Attack Vectors: IDN Syntax Spoofing with / lookalikes
(However, punctuation not required…)
http://www.google.comノpathノfile.nottrusted.org
Katakana No: U+FF89
Attack Vectors: IDN Syntax Spoofing with lookalikes
• Browser sees and displays a valid IDN
• DNS sees Punycode
DEMO: IDN Visual Spoofing
IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
– Detects Confusables
– Detects Invisibles
– Detects syntax and punctuation lookalikes
– Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009
Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
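One way to surface invisibles, sketched in Python. The set holds the three code points listed above. Also flagging every character in general category Cf (format) is an added assumption, and `find_invisibles` is a hypothetical helper, not the API described in this talk.

```python
import unicodedata

# The three code points listed above.
KNOWN_INVISIBLES = {"\u200b", "\u180e", "\ufeff"}


def find_invisibles(text: str) -> list:
    """Return (index, 'U+XXXX') for listed invisibles and other format characters."""
    hits = []
    for index, ch in enumerate(text):
        if ch in KNOWN_INVISIBLES or unicodedata.category(ch) == "Cf":
            hits.append((index, "U+%04X" % ord(ch)))
    return hits


hits = find_invisibles("goo\u200bgle\u202e")
```

Because U+202E (RIGHT-TO-LEFT OVERRIDE) is also a format character, the same check catches the Bidi override at the end of the sample.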
Attack Vectors: Visual Spoofing – Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org
Attack Vectors: Visual Spoofing – Problematic Font-rendering
··· is ··· (Arial, Gothic)
A is A (Lucida Sans Unicode, Courier New)
Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
– ō looks like U+006F U+0304
– ō is U+006F U+0304 U+0304
Attack Vectors: Visual Spoofing with Combining Marks
• Order of combining marks
– ȏ and ö under NFKC
<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
<U+006F U+0311 U+0308> becomes <U+020F U+0308>
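The ordering case can be checked with Python's `unicodedata` module. This is only a sketch of the normalization result and does not model font rendering.

```python
import unicodedata

first = unicodedata.normalize("NFKC", "o\u0308\u0311")   # diaeresis, then inverted breve
second = unicodedata.normalize("NFKC", "o\u0311\u0308")  # inverted breve, then diaeresis

# Both marks share one combining class, so normalization never reorders them.
same_class = unicodedata.combining("\u0308") == unicodedata.combining("\u0311")
```

The two strings stay different after normalization even though they may render alike.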
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)
Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
– “These characters are to be avoided wherever possible because of security concerns”
– Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
Root Causes: Best-fit mappings
• Commonly occur in charset transformations and even innocuous APIs
• Impact: Filter evasion; enable code execution
• When σ becomes s
– U+03C3 GREEK SMALL LETTER SIGMA
• When ′ becomes '
– U+2032 PRIME
Best-fit mappings: Windows best-fit and P/Invoke
• The .NET runtime will marshal a string as LPStr to a P/Invoke function
• How can we best-fit the < character?
– U+2329 (Left-Pointing Angle Bracket) maps to U+003C
– U+3008 (Left Angle Bracket) maps to U+003C
• How can we best-fit the s character?
– U+015B (Latin Small Letter S With Acute) maps to U+0073
– U+015D (Latin Small Letter S With Circumflex) maps to U+0073
• To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]
Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
Case Study: Social Networking – Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
– Root Cause: Best-fit mappings
Case Study: Social Networking – Best-fit mappings
-moz-binding() was not allowed, but…
-[U+ff4d]oz-binding() would best-fit map
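The case study can be modeled in a few lines of Python. The table below is illustrative, built only from the mappings named on the slides, and every function here is a hypothetical stand-in. Real best-fit tables are platform-specific, and Python's own codecs do not best-fit by default.

```python
# Illustrative best-fit table taken from the mappings named on the slides.
BEST_FIT = {"\uff4d": "m", "\u2329": "<", "\u3008": "<", "\u015b": "s", "\u015d": "s"}


def best_fit_encode(text: str) -> bytes:
    """Hypothetical lossy conversion that silently substitutes lookalikes."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text).encode("ascii", "replace")


def strict_encode(text: str) -> bytes:
    """Raise UnicodeEncodeError rather than substituting."""
    return text.encode("ascii", "strict")


def filter_allows(css: str) -> bool:
    """Hypothetical filter that blocks the literal property name."""
    return "-moz-binding" not in css.lower()


payload = "-\uff4doz-binding()"
allowed = filter_allows(payload)
converted = best_fit_encode(payload)
```

The strict variant plays the role that WC_NO_BEST_FIT_CHARS or an exception-throwing EncoderFallback plays on Windows and .NET: fail instead of guessing.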
Root Causes: Normalization
• Normalizing strings after validation is dangerous
• Impact: Filter evasion; enable code execution
Root Causes: Normalization
NFD – Decompose (canonical)
NFC – Decompose (canonical), Recompose
NFKD – Decompose (compatibility)
NFKC – Decompose (compatibility), Recompose
Root Causes: Normalization
İ becomes I + combining dot above
U+0130 becomes U+0049 U+0307
Root Causes: Normalization
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
U+FE64 becomes U+003C
toNFKC(“﹤script>”) = “<script>”
Root Causes: Guidance for Normalization
• Normalize strings before validation
• NFKC: first defense against Visual spoofing
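A Python sketch of why the order matters, using the standard `unicodedata` module. The `validate` function is a hypothetical, deliberately simple check.

```python
import unicodedata


def validate(text: str) -> bool:
    """Hypothetical validator: accept only input with no ASCII angle brackets."""
    return "<" not in text and ">" not in text


payload = "\ufe64script\ufe65"  # SMALL LESS-THAN SIGN and SMALL GREATER-THAN SIGN
unsafe_order = validate(payload)                     # validated before normalizing
normalized = unicodedata.normalize("NFKC", payload)
safe_order = validate(normalized)                    # normalized first, then validated
decomposed = unicodedata.normalize("NFD", "\u0130")  # canonical decomposition
```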
Root Causes: Non-shortest form UTF-8
• Non-shortest or overlong UTF-8
• Impact: Filter evasion; enable code execution
• Application gets C0 A7
• OS/Framework sees 27
• Database gets '
Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
– Generation of non-shortest form
– Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
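The C0 A7 case, sketched in Python. The naive decoder is a hypothetical illustration of a layer that skips the shortest-form check; Python's own UTF-8 decoder is used as the validating counterpart.

```python
def naive_decode_two_byte(lead: int, trail: int) -> int:
    """Decode a 2-byte UTF-8 sequence WITHOUT checking for the shortest form."""
    return ((lead & 0x1F) << 6) | (trail & 0x3F)


overlong = bytes([0xC0, 0xA7])
naive_result = chr(naive_decode_two_byte(*overlong))

try:
    overlong.decode("utf-8")  # a validating decoder refuses the sequence
    strict_rejects = False
except UnicodeDecodeError:
    strict_rejects = True
```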
Attack Vectors: Directory traversal
How many ways can you say
Attack Vectors: Directory traversal test cases
– httpsiterootsystem
– Overlong UTF-8 (U+002F): httpsiterootC0AEsystem
– Full-width Solidus U+FF0F (normalized under NFKC or NFKD): httpsiteroot EFBC8Fsystem
– Division Slash U+2215 (best-fit): httpsiteroot E28895system
– Two dot leader U+2025 (normalized under NFKC or NFKD): httpsiterootE280A5 E280A5 system
Root Causes: Handling the Unexpected
• Unassigned code points
– U+2073
• Illegal code points
– Half a surrogate pair
• Code points with special meaning
– U+FEFF is the BOM
• Impact: Filter evasion; enable code execution
Root Causes: Handling the Unexpected – Over-consumption
• Over-consuming ill-formed byte sequences
• Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>
Root Causes: Handling the Unexpected – Over-consumption
<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
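The byte sequence above, run through two decoders in Python. `over_consuming_decode` is a hypothetical lenient decoder written only to show the failure mode. The comparison uses Python's built-in decoder with `errors="replace"`.

```python
def over_consuming_decode(data: bytes) -> str:
    """Hypothetical lenient decoder: a lead byte swallows whatever follows it."""
    out, i = [], 0
    while i < len(data):
        if data[i] >= 0xC2:  # treated as a two-byte lead, trail byte unchecked
            i += 2           # drops the lead byte AND the next byte
        else:
            out.append(chr(data[i]))
            i += 1
    return "".join(out)


raw = bytes([0x41, 0xC2, 0x3E, 0x41])
swallowed = over_consuming_decode(raw)
replaced = raw.decode("utf-8", errors="replace")
```

The lenient decoder eats the `>` (0x3E), while the built-in decoder substitutes U+FFFD for the bad lead byte and keeps the `>` in place.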
Root Causes: Handling the Unexpected – Character-substitution
• Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad
Root Causes: Handling the Unexpected – Character-deletion
• “deletion of noncharacters” (UTR #36)
<scr[U+FEFF]ipt> becomes <script>
Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
– A common alternative is ‘’ which can be safe
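Deletion versus replacement, sketched in Python. All three functions are hypothetical: they model a filter followed by a later layer that either deletes U+FEFF or substitutes U+FFFD.

```python
def delete_bom(text: str) -> str:
    """Risky: silently delete U+FEFF from the middle of the text."""
    return text.replace("\ufeff", "")


def replace_bom(text: str) -> str:
    """Safer: substitute U+FFFD so the original token cannot re-form."""
    return text.replace("\ufeff", "\ufffd")


def filter_allows(markup: str) -> bool:
    """Hypothetical filter looking for a literal script tag."""
    return "<script" not in markup.lower()


payload = "<scr\ufeffipt>"
allowed = filter_allows(payload)
after_delete = delete_bom(payload)
after_replace = replace_bom(payload)
```

After deletion the prohibited tag appears downstream of the filter. After replacement the string still fails to form it.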
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
– E.g. Cross-site scripting (buffer overflow of the Web)
Safari and Firefox BOM consumptionndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting
ndash Root Cause Character deletion
lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt
Can be nastier
lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt
wwwcasabasecuritycom
Case Study Apple and Mozilla
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Safari BOM injection for XSS
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
A Closer Look The BOM
BOMU+FEFF
March 2009 copy 2009 Chris Weber
bull Attackers manipulate casing operations to inject otherwise prohibited characters
bull Casing can multiply the buffer sizes needed
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
toLower(ldquoİrdquo) == ldquoirdquo
toLower(ldquoscrİptrdquo) == ldquoscriptrdquo
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
len(x) = len(toLower(x))
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
bull Perform casing operations before validation
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Casing
March 2009 copy 2009 Chris Weber
bull Incorrect assumptions about string sizes (chars vs bytes)
bull Improper width calculations
bull Impact Enable code execution
wwwcasabasecuritycom
Root CausesBuffer Overflows
March 2009 copy 2009 Chris Weber
Casing - maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
Lower 8 15 Ⱥ U+023A
16 32 1 A U+0041
Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
Normalization- maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
NFC8 3X 119136 U+1D160
16 32 3X ש U+FB2C
NFD8 3X ΐ U+0390
16 32 4X ᾂ U+1F82
NFKCNFKD8 11X
ملسو هيلع هللا ىلص U+FDFA16 32 18X
Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
bull Know the difference between bytes and chars
bull Secure coding
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Buffer Overflows
March 2009 copy 2009 Chris Weber
bull White space and line breaks
ndash Eg when U+180E acts like U+0020
bull Quotation marks
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesControlling Syntax
March 2009 copy 2009 Chris Weber
bull Manipulate HTML parsers and javascriptinterpreters
bull Control protocols
wwwcasabasecuritycom
Attacks and ExploitsControlling syntax
March 2009 copy 2009 Chris Weber
bull Unicode formatter characters exploited for XSS
ndash Damage Filter evasion controlling syntax
ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting
ndash Root Cause Interpreting ldquowhite spacerdquo
ndash A problem with HTML 40 spec
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
lta href=[U+180E]onclick=alert()gt
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Opera White Space Characters
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Case Study Opera
MVSU+180E
March 2009 copy 2009 Chris Weber
bull Question specifications
bull Be carefulhellip
wwwcasabasecuritycom
Root CausesGuidance for Controlling Syntax
March 2009 copy 2009 Chris Weber
1) Character stabilityndash IDNANameprep based on Unicode 32
2) Designsndash Specs are carefully designed but not always perfect
bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of
U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo
ndash HTML 401 bull Defines four whitespace characters and explicitly leaves
handling other characters up to implementer
wwwcasabasecuritycom
Root CausesSpecifications
March 2009 copy 2009 Chris Weber
bull Converting between charsets is dangerous
bull Mapping tables and algorithms vary across platforms
bull Impact Filter evasion Enable code execution Data-loss
wwwcasabasecuritycom
Root CausesCharset Transformations
March 2009 copy 2009 Chris Weber
bull Avoid if possible
bull Use Unicode as the broker
bull Beware the PUA mappings
bull Transform case and normalize prior to validation and redisplay
wwwcasabasecuritycom
Root CausesGuidance for Charset Transformations
March 2009 copy 2009 Chris Weber
bull Some charset identifiers are ill-defined
bull Vendor implementations vary
bull User-agents may sniff if confused
bull Attackers manipulate behavior
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCharset Mismatches
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Root CausesCharset Mismatches
Content-Type charset=ISO-8859-1
ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt
Attacker-controlled input
March 2009 copy 2009 Chris Weber
bull Force UTF-8
bull Error if uncertain
wwwcasabasecuritycom
Root CausesGuidance for Charset Mismatches
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Watcher
ndash Web-app security testing and auditing
bull Visual Spoofing Detection API
ndash Providing guarantees against Visual Spoofing and Homograph attacks
wwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
bull Unicode transformation hot-spotsbull User-controlled HTMLbull Cross-domain issuesbull Insecure cookiesbull Insecure HTTPHTTPS transitionsbull SSL protocol and certificate issuesbull XSS hot-spotsbull Flash issuesbull Silverlight issuesbull Information disclosure
wwwcasabasecuritycom
ToolsWatcher ndash Some of the Passive Checks Included
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
httpwebsecuritytoolcodeplexcom
wwwcasabasecuritycom
ToolsWatcher - Web-app Security Testing and Auditing
March 2009 copy 2009 Chris Weber
bull Problemndash Unicode enables visual-spoofing-maximus
bull Solutionndash Confusable detection
ndash Invisibles detection
ndash Syntax spoof detection
ndash more
wwwcasabasecuritycom
ToolsVisual Spoofing Detection API
March 2009 copy 2009 Chris Weber
bull Cross-platform component library written in C
bull Can be applied in user-agents or any softwarendash Browsers
ndash Email clients
bull Planned for release Fall 2009
bull Email me with questions
wwwcasabasecuritycom
ToolsVisual Spoofing Detection API
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
ToolsVisual Spoofing Protection Demo
Thank you
Contact me with questions new test cases or ideas to share
Visit my website for test cases Unicode and security tools and the Anti-Visual-Spoofing API
Chris WeberwwwlookoutnetCasaba Security
wwwcasabasecuritycom
March 2009 copy 2009 Chris Weber
UTF-8 ndash variable width 1 to 4 bytes (used to be 6)
UTF-16ndash Endianessndash Variable width 2 or 4 bytesndash Surrogate pairs
UTF-32ndash Endianessndash Fixed width 4 bytesndash Fixed mapping no algorithms needed
wwwcasabasecuritycom
Unicode Crash CourseEncodings
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash coursebull Root Causes
ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareOverview
March 2009 copy 2009 Chris Weber
bull Over 100000 assigned characters
bull Many lookalikes within and across scripts
AΑАᐱᗅᗋᗩᴀᴬꜲA6553766304
wwwcasabasecuritycom
Root CausesVisual Spoofing
March 2009 copy 2009 Chris Weber
httpπαράδειγμαδοκιμή
(exampletest)
wwwcasabasecuritycom
Root CausesIDN ndash Internationalized Domain Names
March 2009 copy 2009 Chris Weber
bull IDNA 2003
bull Nameprep (NFKC and prohibit)
bull Punycodendash httpxn--hxajbheg2az3alxn--jxalpdlp
bull Whitelist TLDrsquosndash ORG DE CN to name a few
bull Language settings and TLD
bull Character blacklisting
wwwcasabasecuritycom
Root CausesIDN ndash what do the browsers do
March 2009 copy 2009 Chris Weber
bull Divergent browser implementations
bull Confusables exist
bull IDNA and Nameprep based on Unicode 32
ndash Wersquore up to Unicode 51 (larger repertoire)
wwwcasabasecuritycom
Root CausesIDN ndash so whatrsquos the problem
March 2009 copy 2009 Chris Weber
Some browsers allow COM IDNrsquos
based on script family
ndash (Latin has a big family)
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Safari
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Opera
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwgooglecom is not wwwgooɡlecom
Latin U+0069
LatinU+0261
gɡ
March 2009 copy 2009 Chris Weber
bull Normalize with NFKC
bull Homograph and Confusables detection
bull Specifications
ndash IDNA Stringprep
bull Guidance
ndash Unicode Consortium ICANN IETF IANA
wwwcasabasecuritycom
Root CausesGuidance for Visual Spoofing
March 2009 copy 2009 Chris Weber
ICANN guidelines v20
ndash Inclusion-based
ndash Script limitations
ndash Character limitations
Registries apply the guidance
ndash define the allowed characters per TLD
ndash Collaboration with IANA
Registrars sell the domain names
wwwcasabasecuritycom
Root CausesGuidance for International Domain Names
March 2009 copy 2009 Chris Weber
ICANN guidelines v20
ndash Inclusion-based
ndash Script limitations
ndash Character limitations
wwwcasabasecuritycom
Root CausesThe state of International Domain Names
Deny-all default seems to be the right concept
A script can cross many blocks Even with limited script choices therersquos plenty to choose from
Great for domain labels but sub domain labels still open to punctuation and syntax spoofing
March 2009 copy 2009 Chris Weber
bull Registrars still allow
ndash Confusables
ndash Combining marks
ndash Single Whole and Mixed-script
bull Registrars canrsquot control
ndash Syntax spoofing in sub domain labels
wwwcasabasecuritycom
Root CausesThe state of International Domain Names
March 2009 copy 2009 Chris Weber
bull Non-Unicode attacks
bull Confusables
bull Invisibles
bull Problematic font-rendering
bull Manipulating Combining Marks
bull Bidi and syntax spoofing
wwwcasabasecuritycom
Attack VectorsVisual spoofing Vectors
March 2009 copy 2009 Chris Weber
rn can look like m in certain fonts
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
wwwmulletscom is not wwwrnulletscom
Latin U+006D
LatinU+0073 U+006E
March 2009 copy 2009 Chris Weber
Are you using mono-width fonts
0 and O
1 and l
5 and S
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
Classic long URLrsquos
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks

Attack Vectors: Defining Homographs
The Confusables
– Single script
– Mixed script
– Whole script

Attack Vectors: Single-script and The Confusables
www.ɑpple.com
User thinks 'a'
Really it's Latin small letter alpha 'ɑ'
www.looĸout.net
User thinks 'k'
Really it's Latin small letter kra 'ĸ'

Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com
User thinks 'o'
Really it's Thai digit zero '๐'
www.faϲebook.com
User thinks 'c'
Really it's Greek lunate sigma symbol 'ϲ'
www.Ꮐoogle.com
User thinks 'G'
Really it's Cherokee letter nah 'Ꮐ'

Attack Vectors: Whole-script and The Confusables
www.аЬс.com
User thinks 'abc'
Really it's Cyrillic script
www.ігѕ.gov
User thinks 'irs'
Really it's Cyrillic script
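
The single-, mixed- and whole-script cases above can be told apart mechanically. The following is a minimal sketch, assuming the first word of each character's Unicode name is a good enough stand-in for its script (Python's standard library does not expose the real Script property). The helper names `scripts_in` and `is_mixed_script` are hypothetical.

```python
import unicodedata

def scripts_in(label):
    """Collect script-like name prefixes (e.g. 'LATIN', 'GREEK') for a label.

    Assumption: the first word of the Unicode character name approximates
    the script. ASCII digits are named 'DIGIT ...' and are treated as common.
    """
    found = set()
    for ch in label:
        if ch.isalnum():
            prefix = unicodedata.name(ch, "UNKNOWN").split()[0]
            if prefix != "DIGIT":
                found.add(prefix)
    return found

def is_mixed_script(label):
    """True when a label draws on more than one script-like prefix."""
    return len(scripts_in(label)) > 1
```

A mixed-script check flags www.faϲebook.com but not the all-Cyrillic аЬс, and not the all-Latin ɑpple either, which is why whole-script and single-script confusables need a separate confusables table.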

Attack Vectors: IDN homograph attacks
Browsers whitelist .ORG
Others don't necessarily, but…

Attack Vectors: IDN homograph attacks
• .ORG is whitelisted
– Limited characters available
• To unscrutinizing eyes í looks like i

Attack Vectors: IDN homograph attacks
www.mozilla.org is not www.mozílla.org
i: Latin U+0069
í: Latin U+00ED

Attack Vectors: IDN Syntax Spoofing with / lookalikes
(This case doesn't work anymore)
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F

Attack Vectors: IDN Syntax Spoofing with / lookalikes
(Normalized to a U+002F)
http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes
╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
(However, punctuation is not required…)
httpwwwgooglecom
Katakana Dot U+FF65

Attack Vectors: IDN Syntax Spoofing with / lookalikes
(However, punctuation is not required…)
http://www.google.comノpathノfile.nottrusted.org
Katakana No U+FF89

Attack Vectors: IDN Syntax Spoofing with / lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode

IDN Visual Spoofing
DEMO

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
– Detects Confusables
– Detects Invisibles
– Detects syntax and punctuation lookalikes
– Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
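
A rough way to surface the invisibles listed above is to scan for them directly. This is only a sketch: the `INVISIBLES` set and the `find_invisibles` helper are hypothetical, the set holds just the three code points named on the slide, and treating every format (Cf) character as suspicious is an assumption that will also flag legitimate uses.

```python
import unicodedata

# Code points named on the slide; illustrative, not exhaustive.
INVISIBLES = {"\u200b", "\u180e", "\ufeff"}

def find_invisibles(text):
    """Return (index, 'U+XXXX') for listed invisibles and other Cf characters."""
    hits = []
    for i, ch in enumerate(text):
        if ch in INVISIBLES or unicodedata.category(ch) == "Cf":
            hits.append((i, "U+%04X" % ord(ch)))
    return hits
```

Because the bidi overrides U+202D and U+202E are also format characters, the same scan reports them too.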

Attack Vectors: Visual Spoofing – Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org

Attack Vectors: Visual Spoofing – Problematic Font-rendering
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
ō looks like ō̄
ō is U+006F U+0304
ō̄ is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks
• Order of combining marks
– ȏ and ö under NFKC
<U+006F U+0308 U+0311> → <U+00F6 U+0311>
<U+006F U+0311 U+0308> → <U+020F U+0308>
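
The ordering effect can be reproduced with Python's standard `unicodedata` module. This sketch only restates the two sequences from the slide; the helper names `nfkc` and `code_points` are hypothetical.

```python
import unicodedata

def nfkc(s):
    """Apply NFKC normalization."""
    return unicodedata.normalize("NFKC", s)

def code_points(s):
    """Render a string as space-separated U+XXXX values."""
    return " ".join("U+%04X" % ord(c) for c in s)

# Same base letter and same two marks, in a different order.
diaeresis_first = nfkc("o\u0308\u0311")
breve_first = nfkc("o\u0311\u0308")
```

Both marks share the same combining class, so normalization does not reorder them, and the two results stay distinct strings even though they may render alike.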

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
– "These characters are to be avoided wherever possible, because of security concerns"
– Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)

Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
U+2032 PRIME

Windows best-fit P/Invoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)
How can we best-fit the s character?
• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)
To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
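
The guidance above names .NET and Win32 options. As a language-neutral illustration of the same idea (fail instead of substituting a lookalike), here is a minimal sketch that assumes Python's strict codec error handling as a stand-in. The helpers `to_legacy` and `is_representable` and the choice of cp1252 are hypothetical, and this is not the Windows best-fit API itself.

```python
def to_legacy(text, codec="cp1252"):
    """Encode strictly: raise UnicodeEncodeError rather than pick a lookalike."""
    return text.encode(codec, errors="strict")

def is_representable(text, codec="cp1252"):
    """True only when every character has an exact mapping in the codec."""
    try:
        to_legacy(text, codec)
        return True
    except UnicodeEncodeError:
        return False
```

With strict handling, a string containing U+FF4D or U+03C3 is rejected at the conversion boundary instead of quietly turning into ASCII m or s after the filter has already run.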

Case Study: Social Networking – Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
– Root Cause: Best-fit mappings

Case Study: Social Networking – Best-fit mappings
-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map

Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution

Root Causes: Normalization
NFD – Decompose (canonical)
NFC – Decompose (canonical), Recompose
NFKD – Decompose (compatibility)
NFKC – Decompose (compatibility), Recompose

Root Causes: Normalization
İ becomes I + ◌̇
U+0130 → U+0049 U+0307

Root Causes: Normalization
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
U+FE64 → U+003C

Root Causes: Normalization
﹤ becomes <
toNFKC("﹤script>") = "<script>"
U+FE64 → U+003C

Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC: first defense against visual spoofing
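
A minimal sketch of "normalize before validation", using Python's standard `unicodedata` module. The function `normalize_then_validate` and its reject-angle-brackets rule are hypothetical stand-ins for a real filter.

```python
import unicodedata

def normalize_then_validate(value):
    """NFKC-normalize first, then run a (hypothetical) markup filter."""
    normalized = unicodedata.normalize("NFKC", value)
    if "<" in normalized or ">" in normalized:
        raise ValueError("markup characters are not allowed")
    return normalized

# U+FE64 and U+FE65 carry no ASCII angle brackets until NFKC is applied.
raw = "\ufe64script\ufe65"
```

A filter that inspected `raw` before normalization would see no `<` at all, which is exactly the ordering bug the slides describe.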

Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
– Generation of non-shortest form
– Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
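
"Validate UTF-8 encoding (throw on error)" can be sketched with Python's built-in decoder, which rejects overlong forms in strict mode. The helper names `decode_utf8_strict` and `is_well_formed_utf8` are hypothetical.

```python
def decode_utf8_strict(data):
    """Decode bytes as UTF-8, raising UnicodeDecodeError on ill-formed input."""
    return data.decode("utf-8", errors="strict")

def is_well_formed_utf8(data):
    """True when the byte string is well-formed UTF-8."""
    try:
        decode_utf8_strict(data)
        return True
    except UnicodeDecodeError:
        return False
```

The overlong pair C0 A7 from the earlier slide fails this check, while the single byte 27 passes, so the apostrophe can only arrive in its shortest form.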

Attack Vectors: Directory traversal
How many ways can you say "../"?

Attack Vectors: Directory traversal
• Directory traversal test cases
– httpsiterootsystem
– Overlong UTF-8, U+002F: httpsiterootC0AEsystem (C0 AE is the overlong form of U+002E; the overlong form of U+002F is C0 AF)
– Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
– Division Slash U+2215, best-fit: httpsiteroot E28895system
– Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system

Root Causes: Handling the Unexpected
• Unassigned code points
– U+2073
• Illegal code points
– Half a surrogate pair
• Code points with special meaning
– U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected – Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected – Over-consumption
<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />

Root Causes: Handling the Unexpected – Character-substitution
Correcting insecurely rather than failing
– Substituting a '' or a '' would be bad

Root Causes: Handling the Unexpected – Character-deletion
"deletion of noncharacters" (UTR-36)

Root Causes: Handling the Unexpected – Character-deletion
<scr[U+FEFF]ipt> becomes <script>
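
The deletion problem can be shown in a few lines. This is a sketch under stated assumptions: `naive_filter_allows` is a hypothetical filter that only looks for a literal `<script`, and `delete_bom` and `replace_bom` are hypothetical helpers standing in for what a downstream component might do with U+FEFF.

```python
def naive_filter_allows(value):
    """Hypothetical filter: allow anything without a literal '<script'."""
    return "<script" not in value.lower()

def delete_bom(value):
    """The risky step: silently drop U+FEFF."""
    return value.replace("\ufeff", "")

def replace_bom(value):
    """Safer: substitute U+FFFD so the token stays broken."""
    return value.replace("\ufeff", "\ufffd")

payload = "<scr\ufeffipt>"
```

The payload passes the filter, and deletion afterwards reassembles the tag; replacing with U+FFFD keeps it inert.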

Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
– A common alternative is '?', which can be safe
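
Both options above ("fail or error" and "use U+FFFD") map onto the error modes of Python's UTF-8 decoder. The sketch below reuses the <41 C2 3E 41> bytes from the over-consumption slide; the helper names are hypothetical.

```python
def decode_or_fail(data):
    """Option 1: raise UnicodeDecodeError on any ill-formed sequence."""
    return data.decode("utf-8", errors="strict")

def decode_with_replacement(data):
    """Option 2: replace each ill-formed sequence with U+FFFD."""
    return data.decode("utf-8", errors="replace")

# Stray lead byte C2 followed by '>' (3E), as on the over-consumption slide.
ill_formed = b"\x41\xc2\x3e\x41"
```

In replacement mode the stray lead byte becomes U+FFFD and the following `>` survives, so the decoder does not over-consume the byte that closes the tag.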

Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS and validation
• Exploit delivery techniques
– E.g. cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root Cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

Safari BOM injection for XSS
DEMO

A Closer Look: The BOM
BOM U+FEFF

Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing
toLower("İ") == "i"
toLower("scrİpt") == "script"

Root Causes: Casing
len(x) != len(toLower(x))

Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
– ICU, .NET
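
A minimal sketch of "perform casing operations before validation". The KELVIN SIGN example (U+212A lowercases to ASCII k) is an illustration that is not taken from the slides, and `is_allowed` with its banned-token list is hypothetical. Python applies full Unicode case mapping, so here İ (U+0130) lowercases to i followed by U+0307 rather than to a bare i as on the slide; results differ by platform and locale.

```python
BANNED_TOKENS = ("onclick",)  # hypothetical deny-list for illustration

def is_allowed(value):
    """Lowercase first, then check the result against the deny-list."""
    lowered = value.lower()
    return not any(token in lowered for token in BANNED_TOKENS)

# U+212A KELVIN SIGN is not ASCII 'K', but it lowercases to ASCII 'k'.
tricky = "onclic\u212a"

# Lowercasing can also change the length of a string.
dotted_i = "\u0130"
```

Checking `tricky` before lowercasing would miss the token; checking after catches it. The length change for `dotted_i` is the `len(x) != len(toLower(x))` case.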

Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows
Casing – maximum expansion factors
Operation  UTF        Factor  Sample
Lower      8          1.5     Ⱥ U+023A
Lower      16, 32     1       A U+0041
Upper      8, 16, 32  3       ΐ U+0390
Source: Unicode Technical Report 36

Root Causes: Buffer Overflows
Normalization – maximum expansion factors
Operation  UTF     Factor  Sample
NFC        8       3X      𝅘𝅥𝅮 U+1D160
NFC        16, 32  3X      שּׁ U+FB2C
NFD        8       3X      ΐ U+0390
NFD        16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
NFKC/NFKD  16, 32  18X     ﷺ U+FDFA
Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
– ICU, .NET
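
The expansion factors in the tables can be checked for individual samples. This sketch measures growth in code points and in UTF-8 bytes; `utf8_len` and `growth` are hypothetical helper names, and the three samples are the ones cited from UTR 36.

```python
import unicodedata

def utf8_len(s):
    """Number of bytes in the UTF-8 encoding of s."""
    return len(s.encode("utf-8"))

def growth(before, after):
    """Return (code point factor, UTF-8 byte factor) for a transformation."""
    return len(after) / len(before), utf8_len(after) / utf8_len(before)

nfkc_growth = growth("\ufdfa", unicodedata.normalize("NFKC", "\ufdfa"))
nfd_growth = growth("\u0390", unicodedata.normalize("NFD", "\u0390"))
lower_growth = growth("\u023a", "\u023a".lower())
```

Sizing an output buffer from the input length, in either characters or bytes, is the mistake these factors expose; measuring the transformed string is the safer habit.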

Root Causes: Controlling Syntax
• White space and line breaks
– E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera
• Unicode formatter characters exploited for XSS
– Damage: Filter evasion, controlling syntax
– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
– Root Cause: Interpreting "white space"
– A problem with the HTML 4.0 spec
<a href=[U+180E]onclick=alert()>

Opera White Space Characters
DEMO

Case Study: Opera
MVS U+180E

Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…
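
One cautious reading of "question specifications" is to flag any separator-like character a parser might treat as white space but the spec does not list. The sketch below assumes the four HTML 4.01 white space characters (space, tab, form feed, zero-width space) plus CR and LF as the expected set; `unexpected_separators` is a hypothetical helper, and the category list is a heuristic.

```python
import unicodedata

# White space characters listed by HTML 4.01, plus CR/LF line breaks.
EXPECTED = {"\u0020", "\u0009", "\u000c", "\u200b", "\r", "\n"}

def unexpected_separators(markup):
    """Return (index, 'U+XXXX') for separator or format characters
    outside the expected set (categories Zs, Zl, Zp, Cf)."""
    hits = []
    for i, ch in enumerate(markup):
        if ch in EXPECTED:
            continue
        if unicodedata.category(ch) in ("Zs", "Zl", "Zp", "Cf"):
            hits.append((i, "U+%04X" % ord(ch)))
    return hits
```

U+180E is caught whether the Unicode data in use classifies it as a space separator or as a format character.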

Root Causes: Specifications
1) Character stability
– IDNA/Nameprep based on Unicode 3.2
2) Designs
– Specs are carefully designed but not always perfect
• This could have been a problem
– "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
– HTML 4.01
• Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data loss

Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv=Content-Type content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain

Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools
• Watcher
– Web-app security testing and auditing
• Visual Spoofing Detection API
– Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API
• Problem
– Unicode enables visual-spoofing-maximus
• Solution
– Confusable detection
– Invisibles detection
– Syntax spoof detection
– more

Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
– Browsers
– Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net
Casaba Security, www.casabasecurity.com

Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Exploiting Unicode-enabled Software: Overview
• Unicode crash course
• Root Causes
– Visual Spoofing and IDNs
– Best-fit mappings
– Normalization
– Overlong UTF-8
– Over-consumption
– Character substitution
– Character deletion
– Casing
– Buffer overflows
– Controlling Syntax
– Charset transformations
– Charset mismatches
• Tools

Root Causes: Visual Spoofing
• Over 100,000 assigned characters
• Many lookalikes within and across scripts
AΑАᐱᗅᗋᗩᴀᴬꜲA6553766304

Root Causes: IDN – Internationalized Domain Names
http://παράδειγμα.δοκιμή
(example.test)

Root Causes: IDN – what do the browsers do?
• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
– http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
– .ORG, .DE, .CN to name a few
• Language settings and TLD
• Character blacklisting
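
The browser-side conversion between the displayed IDN and the Punycode form that DNS sees can be sketched with Python's built-in `idna` codec, which implements IDNA 2003 (Nameprep plus Punycode). The helper names `to_ascii` and `to_unicode` are hypothetical, and the hostname is the example.test name from the earlier slide.

```python
def to_ascii(hostname):
    """Convert a Unicode hostname to its ASCII-compatible (xn--) form."""
    return hostname.encode("idna").decode("ascii")

def to_unicode(ace_hostname):
    """Convert an ASCII-compatible hostname back to Unicode for display."""
    return ace_hostname.encode("ascii").decode("idna")

displayed = "παράδειγμα.δοκιμή"
on_the_wire = to_ascii(displayed)
```

Comparing or logging the `on_the_wire` form sidesteps lookalike glyphs, since two visually identical names with different code points produce different xn-- labels.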

Root Causes: IDN – so what's the problem?
• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
– We're up to Unicode 5.1 (larger repertoire)

Attack Vectors: IDN homograph attacks
Some browsers allow .COM IDNs based on script family
– (Latin has a big family)

Attack Vectors: IDN homograph attacks
Safari

Attack Vectors: IDN homograph attacks
Opera

Attack Vectors: IDN homograph attacks
www.google.com is not www.gooɡle.com
g: Latin U+0067
ɡ: Latin U+0261

Root Causes: Guidance for Visual Spoofing
• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
– IDNA, Stringprep
• Guidance
– Unicode Consortium, ICANN, IETF, IANA
March 2009 copy 2009 Chris Weber
ICANN guidelines v20
ndash Inclusion-based
ndash Script limitations
ndash Character limitations
Registries apply the guidance
ndash define the allowed characters per TLD
ndash Collaboration with IANA
Registrars sell the domain names
wwwcasabasecuritycom
Root CausesGuidance for International Domain Names
March 2009 copy 2009 Chris Weber
ICANN guidelines v20
ndash Inclusion-based
ndash Script limitations
ndash Character limitations
wwwcasabasecuritycom
Root CausesThe state of International Domain Names
Deny-all default seems to be the right concept
A script can cross many blocks Even with limited script choices therersquos plenty to choose from
Great for domain labels but sub domain labels still open to punctuation and syntax spoofing
March 2009 copy 2009 Chris Weber
bull Registrars still allow
ndash Confusables
ndash Combining marks
ndash Single Whole and Mixed-script
bull Registrars canrsquot control
ndash Syntax spoofing in sub domain labels
wwwcasabasecuritycom
Root CausesThe state of International Domain Names
March 2009 copy 2009 Chris Weber
bull Non-Unicode attacks
bull Confusables
bull Invisibles
bull Problematic font-rendering
bull Manipulating Combining Marks
bull Bidi and syntax spoofing
wwwcasabasecuritycom
Attack VectorsVisual spoofing Vectors
March 2009 copy 2009 Chris Weber
rn can look like m in certain fonts
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
wwwmulletscom is not wwwrnulletscom
Latin U+006D
LatinU+0073 U+006E
March 2009 copy 2009 Chris Weber
Are you using mono-width fonts
0 and O
1 and l
5 and S
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
Classic long URLrsquos
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
The Confusables
ndash Single script
ndash Mixed script
ndash Whole script
wwwcasabasecuritycom
Attack VectorsDefining Homographs
March 2009 copy 2009 Chris Weber
wwwɑpplecom User thinks lsquoarsquo
Really itrsquos Latin small letter Alpha lsquoɑrsquo
wwwlooĸoutnet
User thinks lsquokrsquo
Really itrsquos Latin letter kra lsquoĸrsquo
wwwcasabasecuritycom
Attack VectorsSingle-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwg๐๐glecom User thinks lsquoorsquo
Really itrsquos Thai digit zero lsquo๐rsquo
wwwfaϲebookcom
User thinks lsquocrsquo
Really itrsquos Greek lunate sigma symbol lsquocrsquo
wwwᏀooglecom
Really itrsquos Cherokee letter Nah lsquoᏀrsquo
wwwcasabasecuritycom
Attack VectorsMixed-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwаЬсcom
User thinks lsquoabcrsquo
Really itrsquos Cyrillic script
wwwігѕgov
User thinks lsquoirsrsquo
Really itrsquos Greek script
wwwcasabasecuritycom
Attack VectorsWhole-script and The Confusables
March 2009 copy 2009 Chris Weber
Browsers whitelist ORG
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Others donrsquot necessarily buthellip
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
bull ORG is whitelisted
ndash Limited characters available
bull To unscrutinizing eyes
iacute looks like i
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwmozillaorg is not wwwmoziacutellaorg
Latin U+0069
LatinU+00ED
March 2009 copy 2009 Chris Weber
(This case doesnrsquot work anymore)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
FULLWIDTH SOLIDUSU+FF0F
March 2009 copy 2009 Chris Weber
(Normalized to a U+002F)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
SOLIDUSU+002F
March 2009 copy 2009 Chris Weber
U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
U+FF65 KATAKANA MIDDLE DOT
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with and lookalikes
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes
httpwwwgooglecom
Katakana DotU+FF65
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecomノpathノfilenottrustedorg
Katakana NoU+FF89
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
IDN Visual Spoofing
March 2009 copy 2009 Chris Weber
bull Visual Spoofing Detection API
ndash Detects Confusables
ndash Detects Invisibles
ndash Detections syntax and punctuation lookalikes
ndash Detects combining mark tricks
bull Currently in testing
bull Release planned for Fall 2009
wwwcasabasecuritycom
IDN Visual SpoofingSolutions and Defenses (yes there is one)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
bull Fonts render glyphs confusingly
bull Fonts render glyphs as empty white space
httpwwwgooglecom phreedomorg
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)
A is A (Lucida Sans Unicode Courier New)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Multiple combining marks
o looks like U+006F U+0304
o is U+006F U+0304 U+0304
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Order of combining marksndash ȏ and ouml under NFKC
ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt
ltU+006F U+0311U+0308gt ltU+020F U+0308gt
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
bull httpunicodeorgreportstr9
ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo
ndash forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
March 2009 copy 2009 Chris Weber
Commonly occur in charset transformations and even innocuous APIrsquos
Impact Filter evasion Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When prime becomes
U+2032 PRIME
wwwcasabasecuritycom
Root CausesBest-fit mappings
March 2009 copy 2009 Chris Weber
Net runtime will marshall a string as LPStr to a pinvoke function
How can we best-fit the lt character
bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket
How can we best-fit the s character
bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex
To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]
wwwcasabasecuritycom
Windows best-fit pInvokeBest-fit mappings
March 2009 copy 2009 Chris Weber
bull Scrutinize charactercharset manipulation APIrsquos
bull Use EncoderFallback with SystemTextEncoding
bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
bull Use Unicode end-to-end
wwwcasabasecuritycom
Root CausesGuidance for Best-Fit mappings
March 2009 copy 2009 Chris Weber
bull A popular social networking site in 2008
bull Implemented complex filtering logic to prevent XSS
ndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting
ndash Root Cause best-fit mappings
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
-moz-binding()
was not allowed buthellip
-[U+ff4d]oz-binding()
would best-fit map
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
Normalizing strings after validation is dangerous
Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
NFD - Decompose (canonical)
NFC - Decompose (canonical) Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility) Recompose
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
İ becomes I +
wwwcasabasecuritycom
Root CausesNormalization
U+0130 U+0049 U+0307
March 2009 copy 2009 Chris Weber
But are there dangerous characters
You bethellip with NFKC and NFKD you could control HTML or other parsing
﹤ becomes lt
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
﹤ becomes lt
toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
Normalize strings before validation
NFKC first defense against Visual spoofing
wwwcasabasecuritycom
Root CausesGuidance for Normalization
March 2009 copy 2009 Chris Weber
Non-shortest or overlong UTF-8
Impact Filter evasion Enable code execution
Application gets C0A7
OSFramework sees 27
Database gets
wwwcasabasecuritycom
Root CausesNon-shortest form UTF-8
March 2009 copy 2009 Chris Weber
bull Unicode specification forbids
ndash Generation of non-shortest form
ndash Interpretation of non-shortest form for BMP
bull Validate UTF-8 encoding (throw on error)
wwwcasabasecuritycom
Root CausesGuidance for Non-shortest form UTF-8
March 2009 copy 2009 Chris Weber
How many ways can you say
wwwcasabasecuritycom
Attack VectorsDirectory traversal
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Directory traversal test casesndash httpsiterootsystem
ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem
ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem
ndash Division Slash U+2215 best-fithttpsiteroot E28895system
ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system
wwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Unassigned code points
ndash U+2073
bull Illegal code points
ndash Half a surrogate pair
bull Code points with special meaning
ndash U+FEFF is the BOM
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesHandling the Unexpected
March 2009 copy 2009 Chris Weber
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
lt41 C2 3E 41gt becomes
lt41 41gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
ltimg src=[0xC2]gt onerror=alert(1)ltbr gt
becomes
ltimg src=gt onerror=alert(1)ltbr gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
Correcting insecurely rather than failing
ndash Substituting a lsquorsquo or a lsquorsquo would be bad
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-substitution
March 2009 copy 2009 Chris Weber
ldquodeletion of noncharactersrdquo (UTR-36)
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
ltscr[U+FEFF]iptgt becomes ltscriptgt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
bull Fail or error
bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe
wwwcasabasecuritycom
Root CausesSolutions for Handling the Unexpected
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (the buffer overflow of the Web)

Case Study: Apple and Mozilla
• Safari and Firefox BOM consumption
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root cause: character deletion
• <a href="java[U+FEFF]script:alert('XSS')>
• Can be nastier:
  <a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>
• Demo: Safari BOM injection for XSS
• A closer look: the BOM (U+FEFF)

Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enabling code execution
• toLower(“İ”) == “i”
• toLower(“scrİpt”) == “script”
• len(x) != len(toLower(x))

Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET

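The length change under casing can be observed directly in Python, whose `str.lower()` and `str.upper()` apply the full Unicode case mappings. This is only a sketch of the effect: Python keeps the combining dot when lowercasing U+0130, whereas the slide's toLower("İ") == "i" describes frameworks or locales that map it straight to a plain "i".

```python
def case_lengths(text):
    """Return (original, lowercased, uppercased) lengths in code points."""
    return len(text), len(text.lower()), len(text.upper())

dotted_i = "\u0130"     # LATIN CAPITAL LETTER I WITH DOT ABOVE
iota_tonos = "\u0390"   # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
a_stroke = "\u023a"     # LATIN CAPITAL LETTER A WITH STROKE

print(case_lengths(dotted_i))    # lowercasing adds U+0307
print(case_lengths(iota_tonos))  # uppercasing yields three code points
# Byte-level growth in UTF-8: 2 bytes become 3 when lowercasing U+023A.
print(len(a_stroke.encode("utf-8")), len(a_stroke.lower().encode("utf-8")))
```

Any buffer sized from the input length therefore has to be re-measured after the casing operation.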
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enabling code execution

Casing: maximum expansion factors (Source: Unicode Technical Report 36)
Operation   UTF          Factor   Sample
Lower       8            1.5X     Ⱥ U+023A
Lower       16, 32       1X       A U+0041
Upper       8, 16, 32    3X       ΐ U+0390

Normalization: maximum expansion factors (Source: Unicode Technical Report 36)
Operation   UTF          Factor   Sample
NFC         8            3X       𝅘𝅥𝅮 U+1D160
NFC         16, 32       3X       שּׁ U+FB2C
NFD         8            3X       ΐ U+0390
NFD         16, 32       4X       ᾂ U+1F82
NFKC/NFKD   8            11X      ﷺ U+FDFA
NFKC/NFKD   16, 32       18X      ﷺ U+FDFA

Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

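The expansion factors in the tables can be checked with Python's standard `unicodedata` module. This is a small sketch; the helper name is illustrative, and the factors are computed per encoding form as output size divided by input size.

```python
import unicodedata

def expansion_factors(ch, form):
    """Size growth of one character under a normalization form."""
    out = unicodedata.normalize(form, ch)
    return {
        "utf8": len(out.encode("utf-8")) / len(ch.encode("utf-8")),
        "utf16": len(out.encode("utf-16-le")) / len(ch.encode("utf-16-le")),
        "utf32": len(out.encode("utf-32-le")) / len(ch.encode("utf-32-le")),
    }

print(expansion_factors("\ufdfa", "NFKC"))  # worst case for NFKC/NFKD
print(expansion_factors("\u0390", "NFD"))
print(expansion_factors("\u1f82", "NFD"))
```

Sizing an output buffer as input length times the worst-case factor for the chosen form and encoding avoids the overflow described above.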
Root Causes: Controlling Syntax
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enabling code execution

Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera
• Unicode formatter characters exploited for XSS
  – Damage: filter evasion, controlling syntax
  – Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root cause: interpreting “white space”
  – A problem with the HTML 4.0 spec
• <a href=[U+180E]onclick=alert()>
• Demo: Opera white space characters
• MVS: U+180E (MONGOLIAN VOWEL SEPARATOR)

Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…

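One way to act on this guidance is to allow only the white space characters a specification names and flag every other separator, format, or control character. The sketch below assumes the HTML 4.01 set (space, tab, form feed, zero width space, plus CR and LF as line breaks); the function name and the category list are choices made for illustration.

```python
import unicodedata

# White space named by HTML 4.01, plus the CR/LF line breaks.
ALLOWED_WHITESPACE = {"\u0020", "\u0009", "\u000c", "\u200b", "\r", "\n"}
# General categories that can behave like separators in a lenient parser.
SEPARATOR_LIKE = {"Zs", "Zl", "Zp", "Cf", "Cc"}

def suspicious_separators(text):
    """List code points that may act as white space but are not allowed."""
    return [
        "U+%04X" % ord(ch)
        for ch in text
        if ch not in ALLOWED_WHITESPACE
        and unicodedata.category(ch) in SEPARATOR_LIKE
    ]

print(suspicious_separators("<a href=\u180eonclick=alert()>"))
```

U+180E is caught whether the runtime's Unicode data classifies it as a space separator (older versions) or as a format character (newer versions).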
Root Causes: Specifications
1) Character stability
  – IDNA/Nameprep are based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem:
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
• HTML 4.01
  – Defines four whitespace characters and explicitly leaves the handling of other characters up to the implementer

Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enabling code execution, data loss

Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enabling code execution
• Example of a mismatch:
  Content-Type: charset=ISO-8859-1
  <meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
  Attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain

Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools
• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against visual spoofing and homograph attacks

Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions
• Demo: Visual Spoofing Protection

Thank you
Contact me with questions, new test cases, or ideas to share.
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.
Chris Weber, www.lookout.net
Casaba Security, www.casabasecurity.com

Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Exploiting Unicode-enabled Software: Overview
• Unicode crash course
• Root Causes
  – Visual spoofing and IDNs
  – Best-fit mappings
  – Normalization
  – Overlong UTF-8
  – Over-consumption
  – Character substitution
  – Character deletion
  – Casing
  – Buffer overflows
  – Controlling syntax
  – Charset transformations
  – Charset mismatches
• Tools

Root Causes: Visual Spoofing
• Over 100,000 assigned characters
• Many lookalikes within and across scripts
  A Α А ᐱ ᗅ ᗋ ᗩ ᴀ ᴬ Ꜳ A 𐀁 𐌀

Root Causes: IDN – Internationalized Domain Names
http://παράδειγμα.δοκιμή
(example.test)

Root Causes: IDN – what do the browsers do?
• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
  – http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
  – .ORG, .DE, .CN, to name a few
• Language settings and TLD
• Character blacklisting

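The Unicode-to-Punycode step can be sketched with Python's built-in `idna` codec, which implements the IDNA 2003 ToASCII operation (Nameprep followed by Punycode). The helper name is illustrative; browsers layer their own display policies on top of this conversion.

```python
def to_ascii(hostname):
    """Convert a Unicode hostname to its ASCII (xn--) form per IDNA 2003."""
    return hostname.encode("idna").decode("ascii")

print(to_ascii("παράδειγμα.δοκιμή"))
# A homograph of google.com using U+0261 LATIN SMALL LETTER SCRIPT G:
print(to_ascii("www.goo\u0261le.com"))
```

The second hostname looks like the familiar one on screen, but DNS receives a different, xn-- prefixed label.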
Root Causes: IDN – so what’s the problem?
• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep are based on Unicode 3.2
  – We’re up to Unicode 5.1 (larger repertoire)

Attack Vectors: IDN homograph attacks
• Some browsers allow .COM IDNs based on script family
  – (Latin has a big family)
• Examples shown: Safari, Opera
• www.google.com is not www.gooɡle.com
  – g: Latin, U+0067
  – ɡ: Latin, U+0261

Root Causes: Guidance for Visual Spoofing
• Normalize with NFKC
• Homograph and confusables detection
• Specifications
  – IDNA, Stringprep
• Guidance
  – Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for International Domain Names
• ICANN guidelines v2.0
  – Inclusion-based
  – Script limitations
  – Character limitations
• Registries apply the guidance
  – Define the allowed characters per TLD
  – Collaboration with IANA
• Registrars sell the domain names

Root Causes: The state of International Domain Names
• ICANN guidelines v2.0
  – Inclusion-based
  – Script limitations
  – Character limitations
• Notes
  – A deny-all default seems to be the right concept
  – A script can cross many blocks. Even with limited script choices, there’s plenty to choose from
  – Great for domain labels, but sub-domain labels are still open to punctuation and syntax spoofing
• Registrars still allow
  – Confusables
  – Combining marks
  – Single-, whole-, and mixed-script confusables
• Registrars can’t control
  – Syntax spoofing in sub-domain labels

Attack Vectors: Visual spoofing vectors
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating combining marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks
• “rn” can look like “m” in certain fonts
  – www.mullets.com is not www.rnullets.com
  – m: Latin, U+006D
  – rn: Latin, U+0072 U+006E
• Are you using mono-width fonts?
  – 0 and O
  – 1 and l
  – 5 and S
• Classic long URLs
  httploginfacebookintvitationvideomessageid-
  h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs
• The Confusables
  – Single script
  – Mixed script
  – Whole script

Attack Vectors: Single-script and The Confusables
• www.ɑpple.com
  – User thinks ‘a’
  – Really it’s Latin small letter alpha ‘ɑ’ (U+0251)
• www.looĸout.net
  – User thinks ‘k’
  – Really it’s Latin small letter kra ‘ĸ’ (U+0138)

Attack Vectors: Mixed-script and The Confusables
• www.g๐๐gle.com
  – User thinks ‘o’
  – Really it’s Thai digit zero ‘๐’
• www.faϲebook.com
  – User thinks ‘c’
  – Really it’s Greek lunate sigma symbol ‘ϲ’
• www.Ꮐoogle.com
  – User thinks ‘G’
  – Really it’s Cherokee letter nah ‘Ꮐ’

Attack Vectors: Whole-script and The Confusables
• www.аЬс.com
  – User thinks ‘abc’
  – Really it’s Cyrillic script
• www.ігѕ.gov
  – User thinks ‘irs’
  – Really it’s Cyrillic script

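A crude way to tell mixed-script from whole-script labels is to look at the script word that begins each character's Unicode name. This is only a heuristic sketch with an illustrative function name; production code would use the Script property and the confusables data published by the Unicode Consortium, neither of which is in Python's standard library.

```python
import unicodedata

def scripts_in_label(label):
    """Approximate the set of scripts in a label from Unicode character names."""
    scripts = set()
    for ch in label:
        if ch == "-" or (ch.isascii() and ch.isdigit()):
            continue  # characters shared by every script in a hostname
        scripts.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return scripts

print(scripts_in_label("g\u0e50\u0e50gle"))    # Latin mixed with Thai digits
print(scripts_in_label("fa\u03f2ebook"))       # Latin mixed with Greek
print(scripts_in_label("\u0430\u042c\u0441"))  # entirely Cyrillic
```

Mixed-script labels show more than one script and are easy to flag. The whole-script case reports a single script, so it needs confusable detection rather than a script count.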
Attack Vectors: IDN homograph attacks
• Browsers whitelist .ORG
• Others don’t necessarily, but…
• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes, í looks like i
  – www.mozilla.org is not www.mozílla.org
  – i: Latin, U+0069
  – í: Latin, U+00ED

Attack Vectors: IDN Syntax Spoofing with ‘/’ lookalikes
• FULLWIDTH SOLIDUS U+FF0F: http://www.google.com／path／file.nottrusted.org
  – (This case doesn’t work anymore)
  – (Normalized to a SOLIDUS U+002F: http://www.google.com/path/file.nottrusted.org)
• Other lookalikes
  – ╱ U+2571 Box Drawings
  – 〳 U+3033 Kana Repeat Mark
  – Ꜹ U+A738 LATIN CAPITAL AV
  – ꜹ U+A739 LATIN SMALL AV
  – ･ U+FF65 KATAKANA MIDDLE DOT
• (However, punctuation is not required…)
  – Katakana dot U+FF65 as a full stop lookalike: http://www･google･com
  – Katakana No U+FF89 as a ‘/’ lookalike: http://www.google.comﾉpathﾉfile.nottrusted.org
• Browser sees and displays a valid IDN
• DNS sees Punycode
• Demo: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
  – Detects confusables
  – Detects invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles
• U+200B (ZERO WIDTH SPACE)
• U+180E (MONGOLIAN VOWEL SEPARATOR)
• U+FEFF (ZERO WIDTH NO-BREAK SPACE)

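Detecting the three invisible code points listed above takes only a lookup. The sketch below is limited to those three; a real detector would cover many more, and the function name is illustrative.

```python
INVISIBLES = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u180e": "MONGOLIAN VOWEL SEPARATOR",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE",
}

def find_invisibles(text):
    """Return (index, code point, name) for each listed invisible character."""
    return [
        (i, "U+%04X" % ord(ch), INVISIBLES[ch])
        for i, ch in enumerate(text)
        if ch in INVISIBLES
    ]

print(find_invisibles("pay\u200bpal.example"))
```

The hostname in the usage line is hypothetical; it renders the same as its clean counterpart but compares as a different string.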
Attack Vectors: Visual Spoofing – Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
  – http://www.google.com phreedom.org
• ··· is ··· (Arial, Gothic)
• A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
  – ō looks like U+006F U+0304
  – ō̄ is U+006F U+0304 U+0304
• Order of combining marks
  – ȏ and ö under NFKC
  – <U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
  – <U+006F U+0311 U+0308> becomes <U+020F U+0308>

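Both combining-mark tricks can be reproduced with Python's `unicodedata` module. The sketch below shows that the two mark orders compose to different strings, and adds a simple check for a mark that is immediately repeated; the function names are illustrative.

```python
import unicodedata

def nfkc_codepoints(text):
    """Code points of the NFKC form, formatted as U+XXXX."""
    return ["U+%04X" % ord(ch) for ch in unicodedata.normalize("NFKC", text)]

def has_repeated_mark(text):
    """True if the same combining mark occurs twice in a row after NFD."""
    decomposed = unicodedata.normalize("NFD", text)
    return any(
        a == b and unicodedata.combining(a) != 0
        for a, b in zip(decomposed, decomposed[1:])
    )

print(nfkc_codepoints("o\u0308\u0311"))  # diaeresis first
print(nfkc_codepoints("o\u0311\u0308"))  # inverted breve first
print(has_repeated_mark("o\u0304\u0304"))
```

Because both marks share the same combining class, normalization keeps their order, so the two visually similar strings never compare equal.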
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)
• Explicit directional overrides
  – U+202D (LEFT-TO-RIGHT OVERRIDE)
  – U+202E (RIGHT-TO-LEFT OVERRIDE)
• http://unicode.org/reports/tr9/
  – “These characters are to be avoided wherever possible, because of security concerns.”
  – Forbidden in IDNA

Root Causes: Best-fit mappings
• Commonly occur in charset transformations and even innocuous APIs
• Impact: filter evasion, enabling code execution
• When σ (U+03C3 GREEK SMALL LETTER SIGMA) becomes s
• When ′ (U+2032 PRIME) becomes '

Best-fit mappings: Windows best-fit and P/Invoke
• The .NET runtime will marshal a string as LPStr to a P/Invoke function
• How can we best-fit the < character?
  – U+2329 LEFT-POINTING ANGLE BRACKET maps to U+003C
  – U+3008 LEFT ANGLE BRACKET maps to U+003C
• How can we best-fit the s character?
  – U+015B LATIN SMALL LETTER S WITH ACUTE maps to U+0073
  – U+015D LATIN SMALL LETTER S WITH CIRCUMFLEX maps to U+0073
• To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

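The same "fail instead of best-fit" policy can be sketched in Python, where the built-in codecs never apply best-fit tables and strict encoding raises on any unmappable character. This illustrates the policy only, not the Windows or .NET APIs named above; the function name and the choice of cp1252 are assumptions.

```python
def to_legacy(text, charset="cp1252"):
    """Encode to a legacy charset, raising if any character has no exact mapping."""
    return text.encode(charset, errors="strict")

samples = ["\u03c3", "\u2032", "-\uff4doz-binding()"]  # sigma, prime, fullwidth m
for sample in samples:
    try:
        to_legacy(sample)
        print(repr(sample), "encoded")
    except UnicodeEncodeError as exc:
        print(repr(sample), "rejected:", exc.reason)
```

All three samples are rejected, so a lookalike cannot turn into an ASCII "s", "'" or "m" after the filter has run.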
Case Study: Social Networking – Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root cause: best-fit mappings
• -moz-binding() was not allowed, but…
• -[U+FF4D]oz-binding() would best-fit map

Root Causes: Normalization
• Normalizing strings after validation is dangerous
• Impact: filter evasion, enabling code execution
• Normalization forms
  – NFD: decompose (canonical)
  – NFC: decompose (canonical), then recompose
  – NFKD: decompose (compatibility)
  – NFKC: decompose (compatibility), then recompose
• İ (U+0130) becomes I + combining dot above (U+0049 U+0307)
• But are there dangerous characters?
  – You bet… with NFKC and NFKD you could control HTML or other parsing
  – ﹤ (U+FE64) becomes < (U+003C)
  – toNFKC(“﹤script>”) = “<script>”

Root Causes: Guidance for Normalization
• Normalize strings before validation
• NFKC is a first defense against visual spoofing

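The ordering rule can be sketched in a few lines of Python. The angle-bracket check is a stand-in for whatever validation an application performs, and the function names are illustrative.

```python
import unicodedata

def validate_then_normalize(text):
    """Unsafe order: the check sees U+FE64, the consumer later sees '<'."""
    if "<" in text:
        raise ValueError("markup not allowed")
    return unicodedata.normalize("NFKC", text)

def normalize_then_validate(text):
    """Safe order: validate exactly the string that will be used."""
    normalized = unicodedata.normalize("NFKC", text)
    if "<" in normalized:
        raise ValueError("markup not allowed")
    return normalized

payload = "\ufe64script>"  # U+FE64 SMALL LESS-THAN SIGN
print(repr(validate_then_normalize(payload)))  # the filter is bypassed
try:
    normalize_then_validate(payload)
except ValueError as exc:
    print("rejected:", exc)
```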
Root Causes: Non-shortest form UTF-8
• Non-shortest or overlong UTF-8
• Impact: filter evasion, enabling code execution
  – Application gets C0 A7
  – OS/Framework sees 27
  – Database gets '

Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)

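To see why C0 A7 is dangerous, compare a lenient two-byte decoder with a conforming one. The lenient function below is a deliberately naive illustration of the bug, not code from any real library; the strict path uses Python's built-in decoder, which rejects non-shortest forms.

```python
def lenient_decode_2byte(b1, b2):
    """Naive two-byte UTF-8 decode that skips the shortest-form check."""
    return chr(((b1 & 0x1F) << 6) | (b2 & 0x3F))

def strict_decode(raw):
    """Conforming decode: raises UnicodeDecodeError on overlong sequences."""
    return raw.decode("utf-8", errors="strict")

overlong = bytes([0xC0, 0xA7])
print(repr(lenient_decode_2byte(*overlong)))  # an apostrophe appears
try:
    strict_decode(overlong)
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```

A filter that looks for byte 27 never sees it in the raw input, yet a lenient component downstream produces the apostrophe anyway.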
March 2009 copy 2009 Chris Weber
How many ways can you say
wwwcasabasecuritycom
Attack VectorsDirectory traversal
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Directory traversal test casesndash httpsiterootsystem
ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem
ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem
ndash Division Slash U+2215 best-fithttpsiteroot E28895system
ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system
wwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Unassigned code points
ndash U+2073
bull Illegal code points
ndash Half a surrogate pair
bull Code points with special meaning
ndash U+FEFF is the BOM
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesHandling the Unexpected
March 2009 copy 2009 Chris Weber
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
lt41 C2 3E 41gt becomes
lt41 41gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
ltimg src=[0xC2]gt onerror=alert(1)ltbr gt
becomes
ltimg src=gt onerror=alert(1)ltbr gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
Correcting insecurely rather than failing
ndash Substituting a lsquorsquo or a lsquorsquo would be bad
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-substitution
March 2009 copy 2009 Chris Weber
ldquodeletion of noncharactersrdquo (UTR-36)
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
ltscr[U+FEFF]iptgt becomes ltscriptgt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
bull Fail or error
bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe
wwwcasabasecuritycom
Root CausesSolutions for Handling the Unexpected
March 2009 copy 2009 Chris Weber
bull Bypass filters WAFrsquos NIDS and validation
bull Exploit delivery techniques
ndash Eg Cross-site scripting (buffer overflow of the Web)
wwwcasabasecuritycom
Attack VectorsFilter evasion
March 2009 copy 2009 Chris Weber
Safari and Firefox BOM consumptionndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting
ndash Root Cause Character deletion
lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt
Can be nastier
lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt
wwwcasabasecuritycom
Case Study Apple and Mozilla
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Safari BOM injection for XSS
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
A Closer Look The BOM
BOMU+FEFF
March 2009 copy 2009 Chris Weber
bull Attackers manipulate casing operations to inject otherwise prohibited characters
bull Casing can multiply the buffer sizes needed
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
toLower(ldquoİrdquo) == ldquoirdquo
toLower(ldquoscrİptrdquo) == ldquoscriptrdquo
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
len(x) = len(toLower(x))
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
bull Perform casing operations before validation
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Casing
March 2009 copy 2009 Chris Weber
bull Incorrect assumptions about string sizes (chars vs bytes)
bull Improper width calculations
bull Impact Enable code execution
wwwcasabasecuritycom
Root CausesBuffer Overflows
March 2009 copy 2009 Chris Weber
Casing - maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
Lower 8 15 Ⱥ U+023A
16 32 1 A U+0041
Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
Normalization- maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
NFC8 3X 119136 U+1D160
16 32 3X ש U+FB2C
NFD8 3X ΐ U+0390
16 32 4X ᾂ U+1F82
NFKCNFKD8 11X
ملسو هيلع هللا ىلص U+FDFA16 32 18X
Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
bull Know the difference between bytes and chars
bull Secure coding
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Buffer Overflows
March 2009 copy 2009 Chris Weber
bull White space and line breaks
ndash Eg when U+180E acts like U+0020
bull Quotation marks
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesControlling Syntax
March 2009 copy 2009 Chris Weber
bull Manipulate HTML parsers and javascriptinterpreters
bull Control protocols
wwwcasabasecuritycom
Attacks and ExploitsControlling syntax
March 2009 copy 2009 Chris Weber
bull Unicode formatter characters exploited for XSS
ndash Damage Filter evasion controlling syntax
ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting
ndash Root Cause Interpreting ldquowhite spacerdquo
ndash A problem with HTML 40 spec
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
lta href=[U+180E]onclick=alert()gt
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Opera White Space Characters
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Case Study Opera
MVSU+180E
March 2009 copy 2009 Chris Weber
bull Question specifications
bull Be carefulhellip
wwwcasabasecuritycom
Root CausesGuidance for Controlling Syntax
March 2009 copy 2009 Chris Weber
1) Character stabilityndash IDNANameprep based on Unicode 32
2) Designsndash Specs are carefully designed but not always perfect
bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of
U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo
ndash HTML 401 bull Defines four whitespace characters and explicitly leaves
handling other characters up to implementer
wwwcasabasecuritycom
Root CausesSpecifications
March 2009 copy 2009 Chris Weber
bull Converting between charsets is dangerous
bull Mapping tables and algorithms vary across platforms
bull Impact Filter evasion Enable code execution Data-loss
wwwcasabasecuritycom
Root CausesCharset Transformations
March 2009 copy 2009 Chris Weber
bull Avoid if possible
bull Use Unicode as the broker
bull Beware the PUA mappings
bull Transform case and normalize prior to validation and redisplay
wwwcasabasecuritycom
Root CausesGuidance for Charset Transformations
March 2009 copy 2009 Chris Weber
bull Some charset identifiers are ill-defined
bull Vendor implementations vary
bull User-agents may sniff if confused
bull Attackers manipulate behavior
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCharset Mismatches
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Root CausesCharset Mismatches
Content-Type charset=ISO-8859-1
ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt
Attacker-controlled input
March 2009 copy 2009 Chris Weber
bull Force UTF-8
bull Error if uncertain
wwwcasabasecuritycom
Root CausesGuidance for Charset Mismatches
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Watcher
ndash Web-app security testing and auditing
bull Visual Spoofing Detection API
ndash Providing guarantees against Visual Spoofing and Homograph attacks
wwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
bull Unicode transformation hot-spotsbull User-controlled HTMLbull Cross-domain issuesbull Insecure cookiesbull Insecure HTTPHTTPS transitionsbull SSL protocol and certificate issuesbull XSS hot-spotsbull Flash issuesbull Silverlight issuesbull Information disclosure
wwwcasabasecuritycom
ToolsWatcher ndash Some of the Passive Checks Included
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
httpwebsecuritytoolcodeplexcom
wwwcasabasecuritycom
ToolsWatcher - Web-app Security Testing and Auditing
March 2009 copy 2009 Chris Weber
bull Problemndash Unicode enables visual-spoofing-maximus
bull Solutionndash Confusable detection
ndash Invisibles detection
ndash Syntax spoof detection
ndash more
wwwcasabasecuritycom
ToolsVisual Spoofing Detection API
March 2009 copy 2009 Chris Weber
bull Cross-platform component library written in C
bull Can be applied in user-agents or any softwarendash Browsers
ndash Email clients
bull Planned for release Fall 2009
bull Email me with questions
wwwcasabasecuritycom
ToolsVisual Spoofing Detection API
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
ToolsVisual Spoofing Protection Demo
Thank you
Contact me with questions new test cases or ideas to share
Visit my website for test cases Unicode and security tools and the Anti-Visual-Spoofing API
Chris WeberwwwlookoutnetCasaba Security
wwwcasabasecuritycom
March 2009 copy 2009 Chris Weber
bull Unicode crash coursebull Root Causes
ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareOverview
March 2009 copy 2009 Chris Weber
bull Over 100000 assigned characters
bull Many lookalikes within and across scripts
AΑАᐱᗅᗋᗩᴀᴬꜲA6553766304
wwwcasabasecuritycom
Root CausesVisual Spoofing
March 2009 copy 2009 Chris Weber
httpπαράδειγμαδοκιμή
(exampletest)
wwwcasabasecuritycom
Root CausesIDN ndash Internationalized Domain Names
March 2009 copy 2009 Chris Weber
bull IDNA 2003
bull Nameprep (NFKC and prohibit)
bull Punycodendash httpxn--hxajbheg2az3alxn--jxalpdlp
bull Whitelist TLDrsquosndash ORG DE CN to name a few
bull Language settings and TLD
bull Character blacklisting
wwwcasabasecuritycom
Root CausesIDN ndash what do the browsers do
March 2009 copy 2009 Chris Weber
bull Divergent browser implementations
bull Confusables exist
bull IDNA and Nameprep based on Unicode 32
ndash Wersquore up to Unicode 51 (larger repertoire)
wwwcasabasecuritycom
Root CausesIDN ndash so whatrsquos the problem
March 2009 copy 2009 Chris Weber
Some browsers allow COM IDNrsquos
based on script family
ndash (Latin has a big family)
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Safari
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Opera
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwgooglecom is not wwwgooɡlecom
Latin U+0069
LatinU+0261
gɡ
March 2009 copy 2009 Chris Weber
bull Normalize with NFKC
bull Homograph and Confusables detection
bull Specifications
ndash IDNA Stringprep
bull Guidance
ndash Unicode Consortium ICANN IETF IANA
wwwcasabasecuritycom
Root CausesGuidance for Visual Spoofing
March 2009 copy 2009 Chris Weber
ICANN guidelines v20
ndash Inclusion-based
ndash Script limitations
ndash Character limitations
Registries apply the guidance
ndash define the allowed characters per TLD
ndash Collaboration with IANA
Registrars sell the domain names
wwwcasabasecuritycom
Root CausesGuidance for International Domain Names

Root Causes: The state of International Domain Names
ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations
Deny-all default seems to be the right concept.
A script can cross many blocks. Even with limited script choices, there’s plenty to choose from.
Great for domain labels, but sub-domain labels are still open to punctuation and syntax spoofing.

Root Causes: The state of International Domain Names
• Registrars still allow
– Confusables
– Combining marks
– Single-, Whole- and Mixed-script
• Registrars can’t control
– Syntax spoofing in sub-domain labels

Attack Vectors: Visual spoofing vectors
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks
rn can look like m in certain fonts
www.mullets.com is not www.rnullets.com
m is Latin U+006D
rn is Latin U+0072 U+006E

Attack Vectors: Non-Unicode homograph attacks
Are you using mono-width fonts?
0 and O
1 and l
5 and S

Attack Vectors: Non-Unicode homograph attacks
Classic long URLs
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs
The Confusables
– Single script
– Mixed script
– Whole script

Attack Vectors: Single-script and The Confusables
www.ɑpple.com
User thinks ‘a’
Really it’s Latin small letter alpha ‘ɑ’
www.looĸout.net
User thinks ‘k’
Really it’s Latin small letter kra ‘ĸ’

Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com
User thinks ‘o’
Really it’s Thai digit zero ‘๐’
www.faϲebook.com
User thinks ‘c’
Really it’s Greek lunate sigma symbol ‘ϲ’
www.Ꮐoogle.com
Really it’s Cherokee letter Nah ‘Ꮐ’

Attack Vectors: Whole-script and The Confusables
www.аЬс.com
User thinks ‘abc’
Really it’s Cyrillic script
www.ігѕ.gov
User thinks ‘irs’
Really it’s Cyrillic script
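
The single-, mixed- and whole-script cases above can be told apart, very roughly, by looking at which scripts a label draws from. The sketch below is a minimal illustration, not the presenter's detection API: Python's standard library does not expose the Unicode Script property, so it assumes the first word of each character name ("LATIN", "THAI", "CYRILLIC") is a good enough proxy. The helper names `scripts_in_label` and `is_mixed_script` are hypothetical.

```python
import unicodedata


def scripts_in_label(label):
    """Return the set of script-like name prefixes used in a domain label.

    Assumption: the first word of the Unicode character name stands in
    for the real Script property, which the standard library lacks.
    """
    scripts = set()
    for ch in label:
        if ch.isascii():
            if ch.isalpha():
                scripts.add("LATIN")
            continue  # ASCII digits and hyphens are shared by all scripts
        name = unicodedata.name(ch, "")
        if name:
            scripts.add(name.split()[0])
    return scripts


def is_mixed_script(label):
    """True when the label draws characters from more than one script."""
    return len(scripts_in_label(label)) > 1


# Mixed-script: Latin letters plus THAI DIGIT ZERO (U+0E50).
print(is_mixed_script("g\u0e50\u0e50gle"))
# Whole-script: every letter is Cyrillic, so this check alone stays silent.
print(is_mixed_script("\u0430\u042c\u0441"))
```

A check like this only flags the mixed-script case. Single-script confusables (Latin alpha for a) and whole-script spoofs (an all-Cyrillic label) pass it, which is why the slides treat confusable detection as a separate problem.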

Attack Vectors: IDN homograph attacks
Browsers whitelist .ORG

Attack Vectors: IDN homograph attacks
Others don’t necessarily, but…

Attack Vectors: IDN homograph attacks
• .ORG is whitelisted
– Limited characters available
• To unscrutinizing eyes
– í looks like i

Attack Vectors: IDN homograph attacks
www.mozilla.org is not www.mozílla.org
i is Latin U+0069
í is Latin U+00ED

Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F
(This case doesn’t work anymore)

Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F
(Normalized to a U+002F)

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes
╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
http://www･google･com
Katakana Dot U+FF65
(However, punctuation is not required…)

Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.comノpathノfile.nottrusted.org
Katakana No U+FF89
(However, punctuation is not required…)

Attack Vectors: IDN Syntax Spoofing with / lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
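
The split between what the browser displays and what DNS receives can be reproduced with Python's built-in `idna` codec, which implements the IDNA 2003 rules (Nameprep plus Punycode). This is only a sketch: the hostname `bücher.example` and the helper name `to_dns_form` are illustrative and do not come from the slides.

```python
def to_dns_form(hostname):
    """Convert a Unicode hostname to the ASCII (Punycode) form sent to DNS.

    Uses the standard-library "idna" codec (IDNA 2003).
    """
    return hostname.encode("idna").decode("ascii")


def to_display_form(ascii_hostname):
    """Convert an ASCII (xn--) hostname back to the Unicode form a browser may show."""
    return ascii_hostname.encode("ascii").decode("idna")


print(to_dns_form("b\u00fccher.example"))        # ASCII form, as DNS sees it
print(to_display_form("xn--bcher-kva.example"))  # Unicode form, as a user sees it
```

Logging or comparing the ASCII form is one way to make a spoofed label visible, since lookalike characters produce a different `xn--` string than the name they imitate.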

IDN Visual Spoofing
DEMO

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
– Detects Confusables
– Detects Invisibles
– Detects syntax and punctuation lookalikes
– Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
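
A simple way to surface the three invisible characters listed above is to scan for them explicitly and report their positions. The sketch below assumes only those three code points matter; a real detector would need a much longer list. The names `INVISIBLES` and `find_invisibles` are hypothetical.

```python
# The three code points named on the slide; not an exhaustive list.
INVISIBLES = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u180e": "MONGOLIAN VOWEL SEPARATOR",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE",
}


def find_invisibles(text):
    """Return (index, code point) pairs for each invisible character found."""
    return [
        (index, "U+%04X" % ord(ch))
        for index, ch in enumerate(text)
        if ch in INVISIBLES
    ]


# "google.com" with a ZERO WIDTH SPACE hidden after the second "o".
print(find_invisibles("goo\u200bgle.com"))
```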

Attack Vectors: Visual Spoofing – Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org

Attack Vectors: Visual Spoofing – Problematic Font-rendering
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
ō looks like U+006F U+0304
ō̄ is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks
• Order of combining marks
– ȏ and ö under NFKC
<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
<U+006F U+0311 U+0308> becomes <U+020F U+0308>
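
The ordering effect shown above can be checked with the standard `unicodedata` module. This is a small sketch; the helper name `nfkc_codepoints` is hypothetical.

```python
import unicodedata


def nfkc_codepoints(text):
    """Normalize to NFKC and list the resulting code points as U+XXXX strings."""
    return ["U+%04X" % ord(ch) for ch in unicodedata.normalize("NFKC", text)]


# o + diaeresis + inverted breve: the diaeresis composes first.
print(nfkc_codepoints("o\u0308\u0311"))
# o + inverted breve + diaeresis: the inverted breve composes first.
print(nfkc_codepoints("o\u0311\u0308"))
```

Both marks sit in the same combining class, so normalization keeps their order. The two inputs stay different strings after NFKC even though they may render almost identically.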

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
– “These characters are to be avoided wherever possible because of security concerns”
– Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)

Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
U+2032 PRIME

Windows best-fit p/Invoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a p/Invoke function
How can we best-fit the < character?
• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)
How can we best-fit the s character?
• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)
To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
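
The guidance above names .NET and Win32 settings. As a language-neutral illustration of the same idea (fail instead of substituting a lookalike), here is a Python sketch. It assumes a conversion to Windows-1252 is unavoidable; the helper name `to_legacy_bytes` is hypothetical. To my knowledge Python's built-in codecs do not apply Windows best-fit tables, so the strict error handler simply refuses unmappable characters.

```python
def to_legacy_bytes(text, charset="cp1252"):
    """Encode to a legacy charset, refusing characters that have no exact mapping.

    Raising here is the analogue of an exception fallback: no silent
    substitution of a similar-looking ASCII character takes place.
    """
    try:
        return text.encode(charset, errors="strict")
    except UnicodeEncodeError as exc:
        raise ValueError(
            "unmappable character U+%04X in input" % ord(text[exc.start])
        ) from exc


print(to_legacy_bytes("plain ascii"))
try:
    to_legacy_bytes("\u03c3")  # GREEK SMALL LETTER SIGMA has no cp1252 mapping
except ValueError as err:
    print("rejected:", err)
```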

Case Study: Social Networking – Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
– Root Cause: best-fit mappings

Case Study: Social Networking – Best-fit mappings
-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map

Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution

Root Causes: Normalization
NFD – Decompose (canonical)
NFC – Decompose (canonical), Recompose
NFKD – Decompose (compatibility)
NFKC – Decompose (compatibility), Recompose

Root Causes: Normalization
İ becomes I + ◌̇
U+0130 becomes U+0049 U+0307

Root Causes: Normalization
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
U+FE64 becomes U+003C

Root Causes: Normalization
﹤ becomes <
toNFKC(“﹤script>”) = “<script>”
U+FE64 becomes U+003C

Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC is a first defense against visual spoofing
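
The ordering problem can be made concrete with a toy filter. In the sketch below, `is_safe` is a deliberately naive, hypothetical validator that only rejects ASCII angle brackets; the point is the order of the two steps, not the filter itself.

```python
import unicodedata


def is_safe(text):
    """Hypothetical validator: reject anything containing ASCII angle brackets."""
    return "<" not in text and ">" not in text


def validate_then_normalize(text):
    """Wrong order: the filter sees U+FE64, normalization then creates '<'."""
    if not is_safe(text):
        raise ValueError("rejected")
    return unicodedata.normalize("NFKC", text)


def normalize_then_validate(text):
    """Right order: the filter inspects the string in its final form."""
    normalized = unicodedata.normalize("NFKC", text)
    if not is_safe(normalized):
        raise ValueError("rejected")
    return normalized


payload = "\ufe64script\ufe65"  # SMALL LESS-THAN SIGN ... SMALL GREATER-THAN SIGN
print(validate_then_normalize(payload))  # the markup slips past the filter
try:
    normalize_then_validate(payload)
except ValueError:
    print("rejected after normalization")
```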

Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
– Generation of non-shortest form
– Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
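
"Validate and throw on error" can be as simple as decoding with a strict error handler. The sketch below relies on Python 3's built-in UTF-8 decoder, which rejects overlong sequences such as C0 A7; the helper name `decode_utf8_strict` is hypothetical.

```python
def decode_utf8_strict(raw):
    """Decode bytes as UTF-8, raising UnicodeDecodeError on any ill-formed input."""
    return raw.decode("utf-8", errors="strict")


print(decode_utf8_strict(b"\x27"))  # the shortest form of the apostrophe is accepted
try:
    decode_utf8_strict(b"\xc0\xa7")  # overlong two-byte form of the same character
except UnicodeDecodeError as err:
    print("rejected:", err.reason)
```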

Attack Vectors: Directory traversal
How many ways can you say

Attack Vectors
• Directory traversal test cases
– httpsiterootsystem
– Overlong UTF-8, U+002F: httpsiterootC0AEsystem
– Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
– Division Slash U+2215, best-fit: httpsiteroot E28895system
– Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
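
The byte sequences in these test cases can be decoded to see which character each one carries and what NFKC turns it into. This is a sketch for inspection only. The helper names `describe` and `overlong_two_byte_value` are hypothetical, and the second one does the bit arithmetic by hand because a strict decoder refuses overlong input.

```python
import unicodedata


def describe(hex_bytes):
    """Decode one UTF-8 encoded character and return (code point, NFKC form)."""
    ch = bytes.fromhex(hex_bytes).decode("utf-8")
    return ("U+%04X" % ord(ch), unicodedata.normalize("NFKC", ch))


def overlong_two_byte_value(lead, trail):
    """Value a lenient decoder would compute for a two-byte sequence."""
    return ((lead & 0x1F) << 6) | (trail & 0x3F)


print(describe("EFBC8F"))  # fullwidth solidus, normalizes to "/"
print(describe("E28895"))  # division slash, unchanged by NFKC
print(describe("E280A5"))  # two dot leader, normalizes to ".."
print(hex(overlong_two_byte_value(0xC0, 0xAE)))  # what C0 AE collapses to
```

By this arithmetic C0 AE collapses to 0x2E, the full stop. The division slash has no compatibility decomposition, which matches the slide listing it as a best-fit case rather than a normalization case.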

Root Causes: Handling the Unexpected
• Unassigned code points
– U+2073
• Illegal code points
– Half a surrogate pair
• Code points with special meaning
– U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected – Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes
<41 41>

Root Causes: Handling the Unexpected – Over-consumption
<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
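
For contrast with the over-consuming behavior above, the sketch below shows a decoder that replaces only the ill-formed lead byte and leaves the following byte alone. It uses Python 3's built-in UTF-8 decoder with the `replace` error handler; the helper name `decode_lenient` is hypothetical.

```python
def decode_lenient(raw):
    """Decode UTF-8, substituting U+FFFD for each ill-formed subsequence."""
    return raw.decode("utf-8", errors="replace")


# 41 C2 3E 41: the lone lead byte C2 is replaced, the '>' (3E) survives.
decoded = decode_lenient(bytes([0x41, 0xC2, 0x3E, 0x41]))
print([hex(ord(ch)) for ch in decoded])
```

Because the `>` is preserved, a stray lead byte cannot swallow the delimiter that closes an attribute or tag.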

Root Causes: Handling the Unexpected – Character-substitution
Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected – Character-deletion
“deletion of noncharacters” (UTR-36)

Root Causes: Handling the Unexpected – Character-deletion
<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
– A common alternative is ‘’ which can be safe
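
The difference between deleting an unexpected character and replacing it with U+FFFD shows up directly in the `<scr[U+FEFF]ipt>` case. A minimal sketch, with hypothetical helper names:

```python
def delete_bom(text):
    """Unsafe: removing U+FEFF joins the text on either side of it."""
    return text.replace("\ufeff", "")


def replace_bom(text):
    """Safer: U+FFFD keeps the two halves apart."""
    return text.replace("\ufeff", "\ufffd")


tag = "<scr\ufeffipt>"
print(delete_bom(tag))               # the forbidden tag reappears
print("script" in replace_bom(tag))  # the keyword never forms
```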

Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
– E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root Cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

Safari BOM injection for XSS
DEMO

A Closer Look: The BOM
BOM U+FEFF

Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing
toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”

Root Causes: Casing
len(x) != len(toLower(x))
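
The length change is easy to observe. The sketch below uses Python's `str.lower()`, which applies Unicode's locale-independent full case mapping, so U+0130 lowercases to "i" followed by U+0307 rather than to a bare "i" as it would under Turkic casing rules. The helper name `lower_with_lengths` is hypothetical.

```python
def lower_with_lengths(text):
    """Lowercase text and report its length before and after."""
    lowered = text.lower()
    return lowered, len(text), len(lowered)


lowered, before, after = lower_with_lengths("\u0130")  # LATIN CAPITAL LETTER I WITH DOT ABOVE
print(["U+%04X" % ord(ch) for ch in lowered], before, after)
```

Whether the dotted capital I ends up as a plain "i" depends on the casing rules and on any later normalization or stripping step, which is the reason to case-fold before validating rather than after.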

Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
– ICU, .NET

Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows
Casing – maximum expansion factors
Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390
Source: Unicode Technical Report 36

Root Causes: Buffer Overflows
Normalization – maximum expansion factors
Operation   UTF      Factor   Sample
NFC         8        3X       U+1D160
NFC         16, 32   3X       U+FB2C
NFD         8        3X       ΐ U+0390
NFD         16, 32   4X       ᾂ U+1F82
NFKC/NFKD   8        11X      ﷺ U+FDFA
NFKC/NFKD   16, 32   18X      ﷺ U+FDFA
Source: Unicode Technical Report 36
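
The factors in the normalization table can be spot-checked by measuring encoded sizes before and after normalizing the sample characters. This sketch assumes the factor is measured in code units of the named encoding; the helper name `expansion` is hypothetical.

```python
import unicodedata


def expansion(form, ch, encoding):
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before


# "utf-16-le" is used so that no byte order mark is counted.
print(expansion("NFKC", "\ufdfa", "utf-8"))
print(expansion("NFKC", "\ufdfa", "utf-16-le"))
print(expansion("NFD", "\u1f82", "utf-16-le"))
print(expansion("NFD", "\u0390", "utf-8"))
```

A buffer sized from the input length alone is too small in each of these cases, which is the overflow risk the table points at.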

Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
– ICU, .NET

Root Causes: Controlling Syntax
• White space and line breaks
– E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera
• Unicode formatter characters exploited for XSS
– Damage: Filter evasion, controlling syntax
– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
– Root Cause: Interpreting “white space”
– A problem with the HTML 4.0 spec

Case Study: Opera
<a href=[U+180E]onclick=alert()>

Opera White Space Characters
DEMO

Case Study: Opera
MVS U+180E

Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…

Root Causes: Specifications
1) Character stability
– IDNA/Nameprep based on Unicode 3.2
2) Designs
– Specs are carefully designed but not always perfect
• This could have been a problem
– “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
– HTML 4.01
• Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain

Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools
• Watcher
– Web-app security testing and auditing
• Visual Spoofing Detection API
– Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API
• Problem
– Unicode enables visual-spoofing-maximus
• Solution
– Confusable detection
– Invisibles detection
– Syntax spoof detection
– more

Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
– Browsers
– Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net, Casaba Security
www.casabasecurity.com
bull Over 100000 assigned characters
bull Many lookalikes within and across scripts
AΑАᐱᗅᗋᗩᴀᴬꜲA6553766304
wwwcasabasecuritycom
Root CausesVisual Spoofing
March 2009 copy 2009 Chris Weber
httpπαράδειγμαδοκιμή
(exampletest)
wwwcasabasecuritycom
Root CausesIDN ndash Internationalized Domain Names
March 2009 copy 2009 Chris Weber
bull IDNA 2003
bull Nameprep (NFKC and prohibit)
bull Punycodendash httpxn--hxajbheg2az3alxn--jxalpdlp
bull Whitelist TLDrsquosndash ORG DE CN to name a few
bull Language settings and TLD
bull Character blacklisting
wwwcasabasecuritycom
Root CausesIDN ndash what do the browsers do
March 2009 copy 2009 Chris Weber
bull Divergent browser implementations
bull Confusables exist
bull IDNA and Nameprep based on Unicode 32
ndash Wersquore up to Unicode 51 (larger repertoire)
wwwcasabasecuritycom
Root CausesIDN ndash so whatrsquos the problem
March 2009 copy 2009 Chris Weber
Some browsers allow COM IDNrsquos
based on script family
ndash (Latin has a big family)
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Safari
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Opera
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwgooglecom is not wwwgooɡlecom
Latin U+0069
LatinU+0261
gɡ
March 2009 copy 2009 Chris Weber
bull Normalize with NFKC
bull Homograph and Confusables detection
bull Specifications
ndash IDNA Stringprep
bull Guidance
ndash Unicode Consortium ICANN IETF IANA
wwwcasabasecuritycom
Root CausesGuidance for Visual Spoofing
March 2009 copy 2009 Chris Weber
ICANN guidelines v20
ndash Inclusion-based
ndash Script limitations
ndash Character limitations
Registries apply the guidance
ndash define the allowed characters per TLD
ndash Collaboration with IANA
Registrars sell the domain names
wwwcasabasecuritycom
Root CausesGuidance for International Domain Names
March 2009 copy 2009 Chris Weber
ICANN guidelines v20
ndash Inclusion-based
ndash Script limitations
ndash Character limitations
wwwcasabasecuritycom
Root CausesThe state of International Domain Names
Deny-all default seems to be the right concept
A script can cross many blocks Even with limited script choices therersquos plenty to choose from
Great for domain labels but sub domain labels still open to punctuation and syntax spoofing
March 2009 copy 2009 Chris Weber
bull Registrars still allow
ndash Confusables
ndash Combining marks
ndash Single Whole and Mixed-script
bull Registrars canrsquot control
ndash Syntax spoofing in sub domain labels
wwwcasabasecuritycom
Root CausesThe state of International Domain Names
March 2009 copy 2009 Chris Weber
bull Non-Unicode attacks
bull Confusables
bull Invisibles
bull Problematic font-rendering
bull Manipulating Combining Marks
bull Bidi and syntax spoofing
wwwcasabasecuritycom
Attack VectorsVisual spoofing Vectors
March 2009 copy 2009 Chris Weber
rn can look like m in certain fonts
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
wwwmulletscom is not wwwrnulletscom
Latin U+006D
LatinU+0073 U+006E
March 2009 copy 2009 Chris Weber
Are you using mono-width fonts
0 and O
1 and l
5 and S
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
Classic long URLrsquos
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
The Confusables
ndash Single script
ndash Mixed script
ndash Whole script
wwwcasabasecuritycom
Attack VectorsDefining Homographs
March 2009 copy 2009 Chris Weber
wwwɑpplecom User thinks lsquoarsquo
Really itrsquos Latin small letter Alpha lsquoɑrsquo
wwwlooĸoutnet
User thinks lsquokrsquo
Really itrsquos Latin letter kra lsquoĸrsquo
wwwcasabasecuritycom
Attack VectorsSingle-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwg๐๐glecom User thinks lsquoorsquo
Really itrsquos Thai digit zero lsquo๐rsquo
wwwfaϲebookcom
User thinks lsquocrsquo
Really itrsquos Greek lunate sigma symbol lsquocrsquo
wwwᏀooglecom
Really itrsquos Cherokee letter Nah lsquoᏀrsquo
wwwcasabasecuritycom
Attack VectorsMixed-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwаЬсcom
User thinks lsquoabcrsquo
Really itrsquos Cyrillic script
wwwігѕgov
User thinks lsquoirsrsquo
Really itrsquos Greek script
wwwcasabasecuritycom
Attack VectorsWhole-script and The Confusables
March 2009 copy 2009 Chris Weber
Browsers whitelist ORG
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Others donrsquot necessarily buthellip
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
bull ORG is whitelisted
ndash Limited characters available
bull To unscrutinizing eyes
iacute looks like i
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwmozillaorg is not wwwmoziacutellaorg
Latin U+0069
LatinU+00ED
March 2009 copy 2009 Chris Weber
(This case doesnrsquot work anymore)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
FULLWIDTH SOLIDUSU+FF0F
March 2009 copy 2009 Chris Weber
(Normalized to a U+002F)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
SOLIDUSU+002F
March 2009 copy 2009 Chris Weber
U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
U+FF65 KATAKANA MIDDLE DOT
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with and lookalikes
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes
httpwwwgooglecom
Katakana DotU+FF65
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecomノpathノfilenottrustedorg
Katakana NoU+FF89
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
IDN Visual Spoofing
March 2009 copy 2009 Chris Weber
bull Visual Spoofing Detection API
ndash Detects Confusables
ndash Detects Invisibles
ndash Detections syntax and punctuation lookalikes
ndash Detects combining mark tricks
bull Currently in testing
bull Release planned for Fall 2009
wwwcasabasecuritycom
IDN Visual SpoofingSolutions and Defenses (yes there is one)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
bull Fonts render glyphs confusingly
bull Fonts render glyphs as empty white space
httpwwwgooglecom phreedomorg
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)
A is A (Lucida Sans Unicode Courier New)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Multiple combining marks
o looks like U+006F U+0304
o is U+006F U+0304 U+0304
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Order of combining marksndash ȏ and ouml under NFKC
ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt
ltU+006F U+0311U+0308gt ltU+020F U+0308gt
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
bull httpunicodeorgreportstr9
ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo
ndash forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
March 2009 copy 2009 Chris Weber
Commonly occur in charset transformations and even innocuous APIrsquos
Impact Filter evasion Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When prime becomes
U+2032 PRIME
wwwcasabasecuritycom
Root CausesBest-fit mappings
March 2009 copy 2009 Chris Weber
Net runtime will marshall a string as LPStr to a pinvoke function
How can we best-fit the lt character
bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket
How can we best-fit the s character
bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex
To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]
wwwcasabasecuritycom
Windows best-fit pInvokeBest-fit mappings
March 2009 copy 2009 Chris Weber
bull Scrutinize charactercharset manipulation APIrsquos
bull Use EncoderFallback with SystemTextEncoding
bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
bull Use Unicode end-to-end
wwwcasabasecuritycom
Root CausesGuidance for Best-Fit mappings
March 2009 copy 2009 Chris Weber
bull A popular social networking site in 2008
bull Implemented complex filtering logic to prevent XSS
ndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting
ndash Root Cause best-fit mappings
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
-moz-binding()
was not allowed buthellip
-[U+ff4d]oz-binding()
would best-fit map
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
Normalizing strings after validation is dangerous
Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
NFD - Decompose (canonical)
NFC - Decompose (canonical) Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility) Recompose
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
İ becomes I +
wwwcasabasecuritycom
Root CausesNormalization
U+0130 U+0049 U+0307
March 2009 copy 2009 Chris Weber
But are there dangerous characters
You bethellip with NFKC and NFKD you could control HTML or other parsing
﹤ becomes lt
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
﹤ becomes lt
toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
Normalize strings before validation
NFKC first defense against Visual spoofing
wwwcasabasecuritycom
Root CausesGuidance for Normalization
March 2009 copy 2009 Chris Weber
Non-shortest or overlong UTF-8
Impact Filter evasion Enable code execution
Application gets C0A7
OSFramework sees 27
Database gets
wwwcasabasecuritycom
Root CausesNon-shortest form UTF-8
March 2009 copy 2009 Chris Weber
bull Unicode specification forbids
ndash Generation of non-shortest form
ndash Interpretation of non-shortest form for BMP
bull Validate UTF-8 encoding (throw on error)
wwwcasabasecuritycom
Root CausesGuidance for Non-shortest form UTF-8
March 2009 copy 2009 Chris Weber
How many ways can you say
wwwcasabasecuritycom
Attack VectorsDirectory traversal
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Directory traversal test casesndash httpsiterootsystem
ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem
ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem
ndash Division Slash U+2215 best-fithttpsiteroot E28895system
ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system
wwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Unassigned code points
ndash U+2073
bull Illegal code points
ndash Half a surrogate pair
bull Code points with special meaning
ndash U+FEFF is the BOM
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesHandling the Unexpected
March 2009 copy 2009 Chris Weber
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
lt41 C2 3E 41gt becomes
lt41 41gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
ltimg src=[0xC2]gt onerror=alert(1)ltbr gt
becomes
ltimg src=gt onerror=alert(1)ltbr gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad
Root Causes: Handling the Unexpected: Character-substitution
“deletion of noncharacters” (UTR-36)
Root Causes: Handling the Unexpected: Character-deletion
<scr[U+FEFF]ipt> becomes <script>
Root Causes: Handling the Unexpected: Character-deletion
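The deletion problem comes down to ordering. In the sketch below, is_blocked stands in for a hypothetical script-tag filter; the two wrappers differ only in whether U+FEFF is deleted after or before the filter runs.

```python
BOM = "\ufeff"


def is_blocked(markup: str) -> bool:
    """Hypothetical filter: reject anything containing a script tag."""
    return "<script" in markup.lower()


def delete_after_filter(markup: str) -> str:
    """Unsafe order: the filter runs before U+FEFF is deleted."""
    if is_blocked(markup):
        raise ValueError("rejected")
    return markup.replace(BOM, "")


def delete_before_filter(markup: str) -> str:
    """Safer order: the filter sees the same string that is emitted."""
    cleaned = markup.replace(BOM, "")
    if is_blocked(cleaned):
        raise ValueError("rejected")
    return cleaned


payload = "<scr\ufeffipt>"

if __name__ == "__main__":
    print(delete_after_filter(payload))  # the filter is bypassed
```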
• Fail or error
• Use U+FFFD instead
– A common alternative is ‘’ which can be safe
Root Causes: Solutions for Handling the Unexpected
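Both options map onto Python's codec error handlers. The function names below are hypothetical; neutralize shows U+FFFD substitution applied to an unwanted code point such as a mid-string U+FEFF, as an alternative to deleting it.

```python
REPLACEMENT = "\ufffd"


def decode_or_fail(data: bytes) -> str:
    """Option 1: fail on ill-formed input."""
    return data.decode("utf-8", errors="strict")


def decode_with_replacement(data: bytes) -> str:
    """Option 2: substitute U+FFFD for each ill-formed sequence."""
    return data.decode("utf-8", errors="replace")


def neutralize(text: str, unwanted: str = "\ufeff") -> str:
    """Replace an unwanted code point with U+FFFD instead of deleting it."""
    return text.replace(unwanted, REPLACEMENT)


if __name__ == "__main__":
    print(repr(decode_with_replacement(b"A\xc2>A")))
    print(repr(neutralize("<scr\ufeffipt>")))
```

Because U+FFFD occupies the position of the removed character, the two halves of "script" never join.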
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
– E.g. Cross-site scripting (buffer overflow of the Web)
Attack Vectors: Filter evasion
Safari and Firefox BOM consumption
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root Cause: Character deletion
<a href=“java[U+FEFF]script:alert(„XSS‟)>
Can be nastier
<a h[U+FEFF]ref=“java[U+FEFF]script:al[U+FEFF]ert(„XSS‟)>
Case Study: Apple and Mozilla
DEMO
Safari BOM injection for XSS
A Closer Look: The BOM
BOM U+FEFF
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution
Root Causes: Casing
toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”
Root Causes: Casing
len(x) != len(toLower(x))
Root Causes: Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
– ICU, .NET
Root Causes: Guidance for Casing
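A sketch of "casing before validation" in Python. The blocked token and both function names are hypothetical. Python applies full case mappings, so İ lowercases to "i" plus U+0307 rather than to a bare "i" as on the slide; for that reason the bypass below uses U+212A KELVIN SIGN, which lowercases to an ordinary "k".

```python
BLOCKED = "onclick"


def validate_then_lower(value: str) -> str:
    """Unsafe order: validation runs on the pre-casing string."""
    if BLOCKED in value:
        raise ValueError("blocked")
    return value.lower()


def lower_then_validate(value: str) -> str:
    """Safer order: validation runs on the string that is actually used."""
    lowered = value.lower()
    if BLOCKED in lowered:
        raise ValueError("blocked")
    return lowered


payload = "onclic\u212a"  # ends with U+212A KELVIN SIGN

if __name__ == "__main__":
    print(validate_then_lower(payload))         # filter bypassed
    print(len("\u0130"), len("\u0130".lower())) # length changes under casing
```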
• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution
Root Causes: Buffer Overflows
Casing - maximum expansion factors
Root Causes: Buffer Overflows
Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
            16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390
Source: Unicode Technical Report 36
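The casing factors can be reproduced with Python's built-in case mappings. The helper names are hypothetical, and the results depend on the Unicode data shipped with the interpreter.

```python
def utf8_factor(before: str, after: str) -> float:
    """Growth in UTF-8 bytes from before to after."""
    return len(after.encode("utf-8")) / len(before.encode("utf-8"))


def code_point_factor(before: str, after: str) -> float:
    """Growth in code points (the UTF-32 factor) from before to after."""
    return len(after) / len(before)


lower_sample = "\u023a"  # LATIN CAPITAL LETTER A WITH STROKE
upper_sample = "\u0390"  # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS

if __name__ == "__main__":
    print(utf8_factor(lower_sample, lower_sample.lower()))
    print(utf8_factor(upper_sample, upper_sample.upper()))
    print(code_point_factor(upper_sample, upper_sample.upper()))
```

A buffer sized from the input length alone is too small for any of these outputs.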
Normalization - maximum expansion factors
Root Causes: Buffer Overflows
Operation    UTF      Factor   Sample
NFC          8        3X       U+1D160
             16, 32   3X       U+FB2C
NFD          8        3X       ΐ U+0390
             16, 32   4X       ᾂ U+1F82
NFKC/NFKD    8        11X      ﷺ U+FDFA
             16, 32   18X      ﷺ U+FDFA
Source: Unicode Technical Report 36
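The normalization factors can be checked the same way with unicodedata. The function expansion is a hypothetical helper that returns the UTF-8 and UTF-16 growth for a single character.

```python
import unicodedata
from typing import Tuple


def expansion(form: str, ch: str) -> Tuple[float, float]:
    """Return (UTF-8 factor, UTF-16 factor) for normalizing ch to form."""
    out = unicodedata.normalize(form, ch)
    utf8 = len(out.encode("utf-8")) / len(ch.encode("utf-8"))
    utf16 = len(out.encode("utf-16-le")) / len(ch.encode("utf-16-le"))
    return utf8, utf16


if __name__ == "__main__":
    for form, ch in (("NFC", "\U0001d160"), ("NFD", "\u0390"),
                     ("NFD", "\u1f82"), ("NFKC", "\ufdfa")):
        print(form, f"U+{ord(ch):04X}", expansion(form, ch))
```

U+FDFA is the worst case: one code point becomes eighteen under NFKC or NFKD.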
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
– ICU, .NET
Root Causes: Guidance for Buffer Overflows
• White space and line breaks
– E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution
Root Causes: Controlling Syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Attacks and Exploits: Controlling syntax
• Unicode formatter characters exploited for XSS
– Damage: Filter evasion, controlling syntax
– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
– Root Cause: Interpreting “white space”
– A problem with HTML 4.0 spec
Case Study: Opera
<a href=[U+180E]onclick=alert()>
Case Study: Opera
DEMO
Opera White Space Characters
Case Study: Opera
MVS U+180E
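The disagreement can be modeled with a toy attribute splitter. Everything here is a hypothetical illustration, including the "x" placeholder used as the href value: one parser treats only the four HTML 4.01 white space characters as separators, while a lenient one also accepts U+180E.

```python
from typing import List

HTML4_WHITESPACE = "\u0020\u0009\u000c\u200b"  # space, tab, form feed, ZWSP
LENIENT_WHITESPACE = HTML4_WHITESPACE + "\u180e"


def split_on(text: str, separators: str) -> List[str]:
    """Split text on any run of the given separator characters."""
    tokens, current = [], []
    for ch in text:
        if ch in separators:
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


def attribute_names(tag_body: str, separators: str) -> List[str]:
    """Return attribute names, skipping the element name in the first token."""
    return [token.split("=", 1)[0] for token in split_on(tag_body, separators)[1:]]


tag_body = "a href=x\u180eonclick=alert()"

if __name__ == "__main__":
    print(attribute_names(tag_body, HTML4_WHITESPACE))    # what a strict filter sees
    print(attribute_names(tag_body, LENIENT_WHITESPACE))  # what a lenient parser sees
```

A filter using the first set sees a single harmless href, while a browser using the second set sees an event handler.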
• Question specifications
• Be careful…
Root Causes: Guidance for Controlling Syntax
1) Character stability
– IDNA/Nameprep based on Unicode 3.2
2) Designs
– Specs are carefully designed but not always perfect
• This could have been a problem
– “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
– HTML 4.01
• Defines four whitespace characters and explicitly leaves handling other characters up to implementer
Root Causes: Specifications
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss
Root Causes: Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay
Root Causes: Guidance for Charset Transformations
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution
Root Causes: Charset Mismatches
Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input
• Force UTF-8
• Error if uncertain
Root Causes: Guidance for Charset Mismatches
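One way to read "force UTF-8, error if uncertain" is a check that refuses any response whose declared charsets are missing, conflicting, or not UTF-8. The regular expression and both function names below are hypothetical simplifications, not a full header or HTML parser.

```python
import re
from typing import Optional

_CHARSET_RE = re.compile(r"charset\s*=\s*[\"']?([A-Za-z0-9_\-]+)", re.IGNORECASE)


def charset_of(value: str) -> Optional[str]:
    """Extract a lower-cased charset label, or None if none is declared."""
    match = _CHARSET_RE.search(value)
    return match.group(1).lower() if match else None


def resolve_charset(header_value: str, meta_tag: str) -> str:
    """Accept only an unambiguous UTF-8 declaration; fail otherwise."""
    declared = {c for c in (charset_of(header_value), charset_of(meta_tag)) if c}
    if declared != {"utf-8"}:
        raise ValueError(f"ambiguous or non-UTF-8 charset: {sorted(declared)}")
    return "utf-8"


if __name__ == "__main__":
    header = "text/html; charset=ISO-8859-1"
    meta = '<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">'
    try:
        resolve_charset(header, meta)
    except ValueError as err:
        print("rejected:", err)
```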
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
Exploiting Unicode-enabled Software: Agenda
• Watcher
– Web-app security testing and auditing
• Visual Spoofing Detection API
– Providing guarantees against Visual Spoofing and Homograph attacks
Tools
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher – Some of the Passive Checks Included
http://websecuritytool.codeplex.com
Tools: Watcher - Web-app Security Testing and Auditing
• Problem
– Unicode enables visual-spoofing-maximus
• Solution
– Confusable detection
– Invisibles detection
– Syntax spoof detection
– more
Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
– Browsers
– Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Detection API
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net, Casaba Security
http://παράδειγμα.δοκιμή
(example.test)
Root Causes: IDN – Internationalized Domain Names
• IDNA 2003
• Nameprep (NFKC and prohibit)
• Punycode
– http://xn--hxajbheg2az3al.xn--jxalpdlp
• Whitelist TLDs
– .ORG, .DE, .CN to name a few
• Language settings and TLD
• Character blacklisting
Root Causes: IDN – what do the browsers do?
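Python's built-in "idna" codec implements IDNA 2003 (Nameprep plus Punycode), so the pipeline above can be tried directly. This is only a sketch of the conversion; it says nothing about how any particular browser applies whitelists or blacklists.

```python
# "παράδειγμα.δοκιμή", written with escapes so the code points are unambiguous.
unicode_name = ("\u03c0\u03b1\u03c1\u03ac\u03b4\u03b5\u03b9\u03b3\u03bc\u03b1"
                ".\u03b4\u03bf\u03ba\u03b9\u03bc\u03ae")

# ToASCII: Nameprep, then Punycode with the "xn--" prefix on each label.
ascii_name = unicode_name.encode("idna").decode("ascii")

# Nameprep applies NFKC and case folding, so fullwidth letters collapse to ASCII.
fullwidth = "\uff27\uff4f\uff4f\uff47\uff4c\uff45.com"
folded = fullwidth.encode("idna").decode("ascii")

if __name__ == "__main__":
    print(ascii_name)
    print(ascii_name.encode("ascii").decode("idna"))
    print(folded)
```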
• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
– We’re up to Unicode 5.1 (larger repertoire)
Root Causes: IDN – so what’s the problem?
Some browsers allow .COM IDNs based on script family
– (Latin has a big family)
Attack Vectors: IDN homograph attacks
Safari
Attack Vectors: IDN homograph attacks
Opera
Attack Vectors: IDN homograph attacks
Attack Vectors: IDN homograph attacks
www.google.com is not www.gooɡle.com
g: Latin U+0067
ɡ: Latin U+0261
• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
– IDNA, Stringprep
• Guidance
– Unicode Consortium, ICANN, IETF, IANA
Root Causes: Guidance for Visual Spoofing
ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations
Registries apply the guidance
– define the allowed characters per TLD
– Collaboration with IANA
Registrars sell the domain names
Root Causes: Guidance for International Domain Names
ICANN guidelines v2.0
– Inclusion-based: deny-all default seems to be the right concept
– Script limitations: a script can cross many blocks. Even with limited script choices there’s plenty to choose from
– Character limitations: great for domain labels, but sub domain labels still open to punctuation and syntax spoofing
Root Causes: The state of International Domain Names
• Registrars still allow
– Confusables
– Combining marks
– Single, Whole and Mixed-script
• Registrars can’t control
– Syntax spoofing in sub domain labels
Root Causes: The state of International Domain Names
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing
Attack Vectors: Visual spoofing Vectors
rn can look like m in certain fonts
Attack Vectors: Non-Unicode homograph attacks
www.mullets.com is not www.rnullets.com
m: Latin U+006D
rn: Latin U+0072 U+006E
Are you using mono-width fonts?
0 and O
1 and l
5 and S
Attack Vectors: Non-Unicode homograph attacks
Classic long URLs
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements
Attack Vectors: Non-Unicode homograph attacks
The Confusables
– Single script
– Mixed script
– Whole script
Attack Vectors: Defining Homographs
www.ɑpple.com
User thinks ‘a’
Really it’s Latin small letter Alpha ‘ɑ’
www.looĸout.net
User thinks ‘k’
Really it’s Latin letter kra ‘ĸ’
Attack Vectors: Single-script and The Confusables
www.g๐๐gle.com
User thinks ‘o’
Really it’s Thai digit zero ‘๐’
www.faϲebook.com
User thinks ‘c’
Really it’s Greek lunate sigma symbol ‘ϲ’
www.Ꮐoogle.com
Really it’s Cherokee letter Nah ‘Ꮐ’
Attack Vectors: Mixed-script and The Confusables
www.аЬс.com
User thinks ‘abc’
Really it’s Cyrillic script
www.ігѕ.gov
User thinks ‘irs’
Really it’s Cyrillic script
Attack Vectors: Whole-script and The Confusables
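A rough mixed-script check can be sketched with the standard library alone. Python's unicodedata does not expose the Script property, so the helper below (a hypothetical name) takes the first word of each character's Unicode name as a stand-in for its script. That is a heuristic, not the real property.

```python
import unicodedata
from typing import Set


def scripts_used(label: str) -> Set[str]:
    """Heuristic: first word of each character name, e.g. LATIN or GREEK."""
    found = set()
    for ch in label:
        if ord(ch) < 128 and not ch.isalpha():
            continue  # ignore ASCII digits, dots and hyphens
        name = unicodedata.name(ch, "")
        if name:
            found.add(name.split()[0])
    return found


def is_mixed_script(label: str) -> bool:
    return len(scripts_used(label)) > 1


if __name__ == "__main__":
    for label in ("fa\u03f2ebook", "g\u0e50\u0e50gle", "\u13c0oogle",
                  "\u0251pple", "\u0430\u042c\u0441"):
        print(ascii(label), sorted(scripts_used(label)))
```

The mixed-script cases are flagged, but the single-script (ɑ) and whole-script (Cyrillic) cases each report one script, so catching them needs confusable tables rather than a script count.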
Browsers whitelist .ORG
Attack Vectors: IDN homograph attacks
Others don’t necessarily but…
Attack Vectors: IDN homograph attacks
• .ORG is whitelisted
– Limited characters available
• To unscrutinizing eyes
– í looks like i
Attack Vectors: IDN homograph attacks
Attack Vectors: IDN homograph attacks
www.mozilla.org is not www.mozílla.org
i: Latin U+0069
í: Latin U+00ED
(This case doesn’t work anymore)
Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F
(Normalized to a U+002F)
Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F
╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT
Attack Vectors: IDN Syntax Spoofing with / and . lookalikes
(However punctuation not required…)
Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
httpwwwgooglecom
Katakana Dot U+FF65
(However punctuation not required…)
Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.comノpathノfile.nottrusted.org
Katakana No U+FF89
Attack Vectors: IDN Syntax Spoofing with / lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
DEMO
IDN Visual Spoofing
• Visual Spoofing Detection API
– Detects Confusables
– Detects Invisibles
– Detects syntax and punctuation lookalikes
– Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009
IDN Visual Spoofing: Solutions and Defenses (yes there is one)
Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
Attack Vectors: The Invisibles
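Detecting these three characters takes only a lookup table. The sketch below is illustrative and unrelated to the API announced in the talk; the table holds just the code points named on the slide, and a real detector would need a much longer list.

```python
from typing import List, Tuple

INVISIBLES = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u180e": "MONGOLIAN VOWEL SEPARATOR",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE",
}


def find_invisibles(text: str) -> List[Tuple[int, str]]:
    """Return (index, code point) for every listed invisible character."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text) if ch in INVISIBLES]


if __name__ == "__main__":
    print(find_invisibles("goo\u200bgle"))
    print(find_invisibles("google"))
```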
Attack Vectors: Visual Spoofing: Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
httpwwwgooglecom phreedomorg
Attack Vectors: Visual Spoofing: Problematic Font-rendering
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode Courier New)
Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
o looks like U+006F U+0304
o is U+006F U+0304 U+0304
Attack Vectors: Visual Spoofing with Combining Marks
• Order of combining marks
– ȏ and ö under NFKC
<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
<U+006F U+0311 U+0308> becomes <U+020F U+0308>
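The ordering effect is easy to confirm with unicodedata. Both marks have the same combining class, so normalization keeps their order, and whichever mark comes first is the one composed into the base letter.

```python
import unicodedata


def nfkc(text: str) -> str:
    return unicodedata.normalize("NFKC", text)


def code_points(text: str) -> str:
    """Format a string as a list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


diaeresis_first = "o\u0308\u0311"
breve_first = "o\u0311\u0308"

if __name__ == "__main__":
    print(code_points(nfkc(diaeresis_first)))
    print(code_points(nfkc(breve_first)))
```

Two strings built from the same three code points therefore normalize to different results, so NFKC alone does not make them compare equal.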
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)
Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
– “These characters are to be avoided wherever possible, because of security concerns”
– forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
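The two override characters can be flagged the same way as other problem code points. The helper below is a hypothetical sketch that covers only the two code points named on the slide.

```python
from typing import List, Tuple

BIDI_OVERRIDES = {
    "\u202d": "LEFT-TO-RIGHT OVERRIDE",
    "\u202e": "RIGHT-TO-LEFT OVERRIDE",
}


def find_bidi_overrides(text: str) -> List[Tuple[int, str]]:
    """Return (index, name) for each explicit directional override."""
    return [(i, BIDI_OVERRIDES[ch]) for i, ch in enumerate(text) if ch in BIDI_OVERRIDES]


if __name__ == "__main__":
    # After U+202E the remaining characters are displayed right to left.
    print(find_bidi_overrides("abc\u202edef"))
```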
Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
U+2032 PRIME
Root Causes: Best-fit mappings
.NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 maps to U+003C: Left-Pointing Angle Bracket
• U+3008 maps to U+003C: Left Angle Bracket
How can we best-fit the s character?
• U+015B maps to U+0073: Latin Small Letter S With Acute
• U+015D maps to U+0073: Latin Small Letter S With Circumflex
To deal with this, specify a LPWStr type instead of LPStr
[MarshalAs(UnmanagedType.LPWStr)]
Windows best-fit pInvoke: Best-fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
Root Causes: Guidance for Best-Fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
– Root Cause: best-fit mappings
Case Study: Social Networking: Best-fit mappings
-moz-binding()
was not allowed but…
-[U+ff4d]oz-binding()
would best-fit map
Case Study: Social Networking: Best-fit mappings
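Python's codecs do not perform best-fit mapping, so this sketch emulates it with a small, hypothetical table built from the characters mentioned on these slides. It contrasts a lossy conversion, which turns the filtered string back into the forbidden one, with a strict conversion that fails instead.

```python
# Hypothetical best-fit table; real tables are platform-specific and far larger.
BEST_FIT = str.maketrans({
    "\uff4d": "m",   # FULLWIDTH LATIN SMALL LETTER M
    "\u03c3": "s",   # GREEK SMALL LETTER SIGMA
    "\u2329": "<",   # LEFT-POINTING ANGLE BRACKET
    "\u3008": "<",   # LEFT ANGLE BRACKET
    "\u2032": "'",   # PRIME
})


def lossy_convert(text: str) -> str:
    """Emulated best-fit conversion: silently substitutes lookalikes."""
    return text.translate(BEST_FIT)


def strict_convert(text: str) -> bytes:
    """Strict conversion: raises UnicodeEncodeError on unmappable input."""
    return text.encode("cp1252", errors="strict")


payload = "-\uff4doz-binding()"

if __name__ == "__main__":
    print("-moz-binding" in payload)   # False: the filter sees nothing wrong
    print(lossy_convert(payload))      # the forbidden property reappears
```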
Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution
Root Causes: Normalization
NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose
Root Causes: Normalization
İ becomes I + combining dot above
U+0130 becomes U+0049 U+0307
Root Causes: Normalization
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
U+FE64 becomes U+003C
Root Causes: Normalization
﹤ becomes <
toNFKC(“﹤script>”) = “<script>”
U+FE64 becomes U+003C
Root Causes: Normalization
Normalize strings before validation
NFKC first defense against Visual spoofing
Root Causes: Guidance for Normalization
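The ordering rule can be shown with unicodedata. The filter contains_tag and both wrappers are hypothetical; only the placement of the normalize call differs between them.

```python
import unicodedata


def contains_tag(text: str) -> bool:
    """Hypothetical filter: reject any string containing '<'."""
    return "<" in text


def validate_then_normalize(text: str) -> str:
    """Unsafe order: normalization can create '<' after the check."""
    if contains_tag(text):
        raise ValueError("rejected")
    return unicodedata.normalize("NFKC", text)


def normalize_then_validate(text: str) -> str:
    """Order recommended on the slide."""
    normalized = unicodedata.normalize("NFKC", text)
    if contains_tag(normalized):
        raise ValueError("rejected")
    return normalized


payload = "\ufe64script>"  # begins with U+FE64 SMALL LESS-THAN SIGN

if __name__ == "__main__":
    print(validate_then_normalize(payload))
    print(ascii(unicodedata.normalize("NFD", "\u0130")))
```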
Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution
Application gets C0A7
OS/Framework sees 27
Database gets '
Root Causes: Non-shortest form UTF-8
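The arithmetic behind C0 A7 is short enough to write out. lenient_two_byte_decode is a hypothetical model of a decoder that skips the shortest-form check; Python's own decoder rejects the same bytes.

```python
def lenient_two_byte_decode(lead: int, trail: int) -> str:
    """Decode a two-byte sequence without the shortest-form check."""
    # Five payload bits come from the lead byte and six from the trail byte.
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


if __name__ == "__main__":
    # (0xC0 & 0x1F) << 6 is 0 and 0xA7 & 0x3F is 0x27, so the result is U+0027.
    print(repr(lenient_two_byte_decode(0xC0, 0xA7)))
    try:
        bytes([0xC0, 0xA7]).decode("utf-8")
    except UnicodeDecodeError as err:
        print("strict decoder:", err.reason)
```

A layer that compares raw bytes never sees 0x27, while a lenient layer downstream produces the apostrophe.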
March 2009 copy 2009 Chris Weber
bull IDNA 2003
bull Nameprep (NFKC and prohibit)
bull Punycodendash httpxn--hxajbheg2az3alxn--jxalpdlp
bull Whitelist TLDrsquosndash ORG DE CN to name a few
bull Language settings and TLD
bull Character blacklisting
wwwcasabasecuritycom
Root CausesIDN ndash what do the browsers do
March 2009 copy 2009 Chris Weber
bull Divergent browser implementations
bull Confusables exist
bull IDNA and Nameprep based on Unicode 32
ndash Wersquore up to Unicode 51 (larger repertoire)
wwwcasabasecuritycom
Root CausesIDN ndash so whatrsquos the problem
March 2009 copy 2009 Chris Weber
Some browsers allow COM IDNrsquos
based on script family
ndash (Latin has a big family)
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Safari
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Opera
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwgooglecom is not wwwgooɡlecom
Latin U+0069
LatinU+0261
gɡ
March 2009 copy 2009 Chris Weber
bull Normalize with NFKC
bull Homograph and Confusables detection
bull Specifications
ndash IDNA Stringprep
bull Guidance
ndash Unicode Consortium ICANN IETF IANA
wwwcasabasecuritycom
Root CausesGuidance for Visual Spoofing
March 2009 copy 2009 Chris Weber
ICANN guidelines v20
ndash Inclusion-based
ndash Script limitations
ndash Character limitations
Registries apply the guidance
ndash define the allowed characters per TLD
ndash Collaboration with IANA
Registrars sell the domain names
wwwcasabasecuritycom
Root CausesGuidance for International Domain Names
March 2009 copy 2009 Chris Weber
ICANN guidelines v20
ndash Inclusion-based
ndash Script limitations
ndash Character limitations
wwwcasabasecuritycom
Root CausesThe state of International Domain Names
Deny-all default seems to be the right concept
A script can cross many blocks Even with limited script choices therersquos plenty to choose from
Great for domain labels but sub domain labels still open to punctuation and syntax spoofing
March 2009 copy 2009 Chris Weber
bull Registrars still allow
ndash Confusables
ndash Combining marks
ndash Single Whole and Mixed-script
bull Registrars canrsquot control
ndash Syntax spoofing in sub domain labels
wwwcasabasecuritycom
Root CausesThe state of International Domain Names
March 2009 copy 2009 Chris Weber
bull Non-Unicode attacks
bull Confusables
bull Invisibles
bull Problematic font-rendering
bull Manipulating Combining Marks
bull Bidi and syntax spoofing
wwwcasabasecuritycom
Attack VectorsVisual spoofing Vectors
March 2009 copy 2009 Chris Weber
rn can look like m in certain fonts
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
wwwmulletscom is not wwwrnulletscom
Latin U+006D
LatinU+0073 U+006E
March 2009 copy 2009 Chris Weber
Are you using mono-width fonts
0 and O
1 and l
5 and S
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
Classic long URLrsquos
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
The Confusables
ndash Single script
ndash Mixed script
ndash Whole script
wwwcasabasecuritycom
Attack VectorsDefining Homographs
Attack Vectors: Single-script and The Confusables
www.ɑpple.com
  User thinks 'a'
  Really it's Latin small letter alpha 'ɑ'
www.looĸout.net
  User thinks 'k'
  Really it's Latin letter kra 'ĸ'
Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com
  User thinks 'o'
  Really it's Thai digit zero '๐'
www.faϲebook.com
  User thinks 'c'
  Really it's Greek lunate sigma symbol 'ϲ'
www.Ꮐoogle.com
  Really it's Cherokee letter Nah 'Ꮐ'
Attack Vectors: Whole-script and The Confusables
www.аЬс.com
  User thinks 'abc'
  Really it's Cyrillic script
www.ігѕ.gov
  User thinks 'irs'
  Really it's Cyrillic script
Attack Vectors: IDN homograph attacks
Browsers whitelist .ORG
Others don't necessarily, but…
• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes
  – í looks like i
www.mozilla.org is not www.mozílla.org
  Latin i: U+0069
  Latin í: U+00ED
Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.com／path／file.nottrusted.org
  FULLWIDTH SOLIDUS U+FF0F
  (This case doesn't work anymore)
http://www.google.com/path/file.nottrusted.org
  SOLIDUS U+002F
  (Normalized to a U+002F)
Attack Vectors: IDN Syntax Spoofing with / and . lookalikes
  ╱ U+2571 Box Drawings
  〳 U+3033 Kana Repeat Mark
  Ꜹ U+A738 LATIN CAPITAL AV
  ꜹ U+A739 LATIN SMALL AV
  ･ U+FF65 KATAKANA MIDDLE DOT
Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
http://www･google･com
  Katakana Dot U+FF65
  (However, punctuation is not required…)
Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.comノpathノfile.nottrusted.org
  Katakana No U+FF89
  (However, punctuation is not required…)
Attack Vectors: IDN Syntax Spoofing with / lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
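The split between what the browser displays and what DNS sees can be sketched in Python. This is a minimal illustration, assuming Python's built-in `idna` codec (which follows the older IDNA 2003 rules); `to_dns_form` and `to_display_form` are hypothetical helper names, and the hostname is the spoofed example used earlier in this deck.

```python
def to_dns_form(hostname: str) -> bytes:
    """Return the ASCII (Punycode) form that would be sent to DNS."""
    # The built-in "idna" codec applies Nameprep and Punycode per label.
    return hostname.encode("idna")


def to_display_form(ascii_hostname: bytes) -> str:
    """Return the Unicode form that a browser might choose to display."""
    return ascii_hostname.decode("idna")


spoofed = "www.moz\u00edlla.org"   # 'í' (U+00ED) instead of 'i' (U+0069)
dns_name = to_dns_form(spoofed)

# The label holding the non-ASCII character is carried with an "xn--" prefix.
print(dns_name)
print(to_display_form(dns_name) == spoofed)
```

Showing the `xn--` form to the user is one possible defense, at the cost of readability for legitimate IDNs.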
DEMO: IDN Visual Spoofing
IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
  – Detects confusables
  – Detects invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009
Attack Vectors: The Invisibles
  U+200B (ZERO WIDTH SPACE)
  U+180E (MONGOLIAN VOWEL SEPARATOR)
  U+FEFF (ZERO WIDTH NO-BREAK SPACE)
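A minimal sketch of how invisibles like these might be flagged, assuming Python's standard `unicodedata` module. The function name `find_invisibles` is hypothetical, and treating every character of general category `Cf` as suspicious is an assumption that a real detector would need to tune.

```python
import unicodedata

# Code points named on the slide; category Cf (format) is checked as well.
KNOWN_INVISIBLES = {0x200B, 0x180E, 0xFEFF}


def find_invisibles(text: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') for each invisible or format character."""
    hits = []
    for index, ch in enumerate(text):
        if ord(ch) in KNOWN_INVISIBLES or unicodedata.category(ch) == "Cf":
            hits.append((index, f"U+{ord(ch):04X}"))
    return hits


print(find_invisibles("goo\u200bgle.com"))   # [(3, 'U+200B')]
```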
Attack Vectors: Visual Spoofing – Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
  http://www.google.com phreedom.org
  ··· is ··· (Arial, Gothic)
  A is A (Lucida Sans Unicode, Courier New)
Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
  – ō̄ looks like U+006F U+0304
  – ō̄ is U+006F U+0304 U+0304
• Order of combining marks
  – ȏ and ö under NFKC
  – <U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
  – <U+006F U+0311 U+0308> becomes <U+020F U+0308>
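The two combining-mark tricks above can be reproduced with Python's `unicodedata` module. This is only a sketch: `has_repeated_marks` and `code_points` are hypothetical helpers, and checking for an immediately repeated mark is just one of several heuristics a detector could use.

```python
import unicodedata


def has_repeated_marks(text: str) -> bool:
    """True if the same combining mark appears twice in a row."""
    previous = ""
    for ch in text:
        if unicodedata.combining(ch) and ch == previous:
            return True
        previous = ch
    return False


def code_points(text: str) -> str:
    """Render a string as a space-separated list of U+XXXX values."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# The order of the marks decides which precomposed letter NFKC produces.
first = unicodedata.normalize("NFKC", "o\u0308\u0311")
second = unicodedata.normalize("NFKC", "o\u0311\u0308")
print(code_points(first))    # U+00F6 U+0311
print(code_points(second))   # U+020F U+0308

print(has_repeated_marks("o\u0304\u0304"))   # True
```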
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)
Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  – “These characters are to be avoided wherever possible because of security concerns”
  – Forbidden in IDNA
  U+202D (LEFT-TO-RIGHT OVERRIDE)
  U+202E (RIGHT-TO-LEFT OVERRIDE)
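One way to spot explicit directional controls is to look at each character's bidi class. The sketch below assumes Python's `unicodedata.bidirectional`; the helper name `find_bidi_controls` and the sample file name are hypothetical.

```python
import unicodedata

# Bidi classes of the explicit embeddings, overrides, and their terminator.
EXPLICIT_BIDI_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF"}


def find_bidi_controls(text: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') for each explicit bidi control character."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if unicodedata.bidirectional(ch) in EXPLICIT_BIDI_CLASSES
    ]


# Hypothetical example: an RLO makes the tail of the name display reversed.
suspicious = "invoice\u202efdp.exe"
print(find_bidi_controls(suspicious))   # [(7, 'U+202E')]
```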
Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion; enable code execution
When σ becomes s
  U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
  U+2032 PRIME
Windows best-fit P/Invoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 (Left-Pointing Angle Bracket) maps to U+003C
• U+3008 (Left Angle Bracket) maps to U+003C
How can we best-fit the s character?
• U+015B (Latin Small Letter S With Acute) maps to U+0073
• U+015D (Latin Small Letter S With Circumflex) maps to U+0073
To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]
Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
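The guidance above names .NET and Win32 APIs. As an analogous sketch in Python (not the author's code), a strict encode fails closed rather than substituting a lookalike; `is_representable` is a hypothetical helper and `cp1252` is only an example target charset.

```python
def is_representable(text: str, charset: str = "cp1252") -> bool:
    """True only if every character has an exact mapping in the charset."""
    try:
        # errors="strict" raises instead of picking a best-fit substitute,
        # so U+FF4D can never silently turn into an ASCII 'm'.
        text.encode(charset, errors="strict")
    except UnicodeEncodeError:
        return False
    return True


print(is_representable("-moz-binding()"))        # True
print(is_representable("-\uff4doz-binding()"))   # False
```

Rejecting the input outright keeps the validated string and the converted string identical, which is the property best-fit mapping breaks.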
Case Study: Social Networking – Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root cause: Best-fit mappings
-moz-binding() was not allowed, but…
-[U+FF4D]oz-binding() would best-fit map
Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: Filter evasion; enable code execution
  NFD - Decompose (canonical)
  NFC - Decompose (canonical), Recompose
  NFKD - Decompose (compatibility)
  NFKC - Decompose (compatibility), Recompose
İ becomes I + ◌̇
  U+0130 becomes U+0049 U+0307
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
  U+FE64 becomes U+003C
toNFKC(“﹤script>”) = “<script>”
Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC is a first defense against visual spoofing
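A small sketch of "normalize, then validate", assuming Python's `unicodedata` module. The deny-list is a toy stand-in for a real filter, and `normalize_then_check` is a hypothetical name.

```python
import unicodedata

FORBIDDEN = ("<", ">")   # toy deny-list standing in for a real filter


def normalize_then_check(text: str) -> tuple[str, bool]:
    """Return the NFKC form and whether that form passes the toy filter."""
    normalized = unicodedata.normalize("NFKC", text)
    ok = not any(ch in normalized for ch in FORBIDDEN)
    return normalized, ok


# U+FE64 and U+FE65 are the small less-than and greater-than signs.
payload = "\ufe64script\ufe65"

# A filter run on the raw input sees no angle brackets at all.
print(any(ch in payload for ch in FORBIDDEN))   # False

# Checking the normalized form catches them.
print(normalize_then_check(payload))            # ('<script>', False)
```

The caller should keep using the returned normalized string, so that what was validated is exactly what gets stored or displayed.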
Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion; enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '
Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
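"Validate UTF-8 encoding (throw on error)" might look like the following in Python, whose built-in decoder rejects non-shortest forms in strict mode. `decode_utf8_strict` is a hypothetical wrapper that turns the exception into `None` for convenience.

```python
from typing import Optional


def decode_utf8_strict(raw: bytes) -> Optional[str]:
    """Decode UTF-8, returning None for any ill-formed sequence."""
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return None


print(decode_utf8_strict(b"\x27"))       # the apostrophe, shortest form
print(decode_utf8_strict(b"\xc0\xa7"))   # None: overlong form is rejected
```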
Attack Vectors: Directory traversal
How many ways can you say
• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8 of U+002E: httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
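A rough sketch of why the normalization-based test cases matter, using Python's `unicodedata`. The helper `normalized_path_has_traversal` and the sample paths are hypothetical; the point is that the check must run on the normalized form, and that NFKC alone does not cover best-fit cases such as U+2215.

```python
import unicodedata


def normalized_path_has_traversal(path: str) -> bool:
    """Check for a '..' segment after NFKC normalization."""
    normalized = unicodedata.normalize("NFKC", path)
    return ".." in normalized.split("/")


# U+2025 TWO DOT LEADER normalizes to "..".
print(normalized_path_has_traversal("root/\u2025/system"))     # True
# U+FF0F FULLWIDTH SOLIDUS normalizes to "/".
print(normalized_path_has_traversal("root/..\uff0fsystem"))    # True
# U+2215 DIVISION SLASH is unchanged by NFKC; only best-fit maps it.
print(normalized_path_has_traversal("root/..\u2215system"))    # False
```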
Root Causes: Handling the Unexpected
• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion; enable code execution
Root Causes: Handling the Unexpected – Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>
<img src="[0xC2]"> onerror=alert(1)<br /> becomes <img src="> onerror=alert(1)<br />
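For contrast, a decoder that does not over-consume replaces only the stray lead byte and keeps the byte that follows it. The sketch below shows that behavior with Python's built-in UTF-8 decoder, using the byte sequence from the slide.

```python
# 'A', a stray UTF-8 lead byte, '>', 'A'
raw = bytes([0x41, 0xC2, 0x3E, 0x41])

# errors="replace" substitutes U+FFFD for the ill-formed byte only;
# the 0x3E that follows it is not swallowed.
decoded = raw.decode("utf-8", errors="replace")

print([f"U+{ord(ch):04X}" for ch in decoded])
# ['U+0041', 'U+FFFD', 'U+003E', 'U+0041']
```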
Root Causes: Handling the Unexpected – Character-substitution
Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad
Root Causes: Handling the Unexpected – Character-deletion
“deletion of noncharacters” (UTR-36)
<scr[U+FEFF]ipt> becomes <script>
Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
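The difference between deleting and replacing can be shown in a few lines of Python. Both helper names are hypothetical, and the functions only illustrate the two behaviors rather than any particular product's implementation.

```python
REPLACEMENT = "\ufffd"   # U+FFFD REPLACEMENT CHARACTER


def strip_bom_unsafely(text: str) -> str:
    """Delete U+FEFF: the behavior the slides warn against."""
    return text.replace("\ufeff", "")


def replace_bom(text: str) -> str:
    """Substitute U+FFFD so the pieces on either side cannot join up."""
    return text.replace("\ufeff", REPLACEMENT)


payload = "<scr\ufeffipt>"
print(strip_bom_unsafely(payload))          # <script>
print("<script>" in replace_bom(payload))   # False
```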
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)
Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>
DEMO: Safari BOM injection for XSS
A Closer Look: The BOM
  BOM U+FEFF
Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion; enable code execution
toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”
len(x) != len(toLower(x))
Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
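The length change under casing is easy to observe in Python, whose `str.lower()` and `str.upper()` apply full Unicode case mappings. Python's result for U+0130 differs from the `toLower` result shown on the slides, which is itself a reminder that casing behavior varies across frameworks. `lower_then_validate` is a hypothetical helper with a toy banned-word list.

```python
def lower_then_validate(text: str, banned=("script",)) -> tuple[str, bool]:
    """Lowercase first, then run the toy check on the lowercased form."""
    lowered = text.lower()
    return lowered, not any(word in lowered for word in banned)


dotted_i = "\u0130"   # LATIN CAPITAL LETTER I WITH DOT ABOVE
print(len(dotted_i), len(dotted_i.lower()))   # 1 2
print("\u00df".upper())                       # SS

print(lower_then_validate("SCRIPT"))          # ('script', False)
```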
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution
Root Causes: Buffer Overflows
Casing - maximum expansion factors

Operation   UTF         Factor  Sample
Lower       8           1.5X    Ⱥ U+023A
Lower       16, 32      1X      A U+0041
Upper       8, 16, 32   3X      ΐ U+0390

Source: Unicode Technical Report 36
Root Causes: Buffer Overflows
Normalization - maximum expansion factors

Operation   UTF      Factor  Sample
NFC         8        3X      𝅘𝅥𝅮 U+1D160
NFC         16, 32   3X      שּׁ U+FB2C
NFD         8        3X      ΐ U+0390
NFD         16, 32   4X      ᾂ U+1F82
NFKC/NFKD   8        11X     ﷺ U+FDFA
NFKC/NFKD   16, 32   18X     ﷺ U+FDFA

Source: Unicode Technical Report 36
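Several of the factors in these tables can be reproduced directly. The sketch below assumes Python's `unicodedata` module and its built-in case mappings; `expansion` is a hypothetical helper that compares encoded byte lengths before and after a transformation.

```python
import unicodedata


def expansion(original: str, transformed: str, encoding: str) -> float:
    """Ratio of encoded byte length after a transformation to before it."""
    return len(transformed.encode(encoding)) / len(original.encode(encoding))


sample = "\ufdfa"
nfkc = unicodedata.normalize("NFKC", sample)
print(expansion(sample, nfkc, "utf-8"))       # 11.0
print(expansion(sample, nfkc, "utf-16-le"))   # 18.0

upper = "\u0390".upper()
print(expansion("\u0390", upper, "utf-8"))    # 3.0

lower = "\u023a".lower()
print(expansion("\u023a", lower, "utf-8"))    # 1.5
```

Sizing an output buffer from the input length alone is therefore unsafe; the worst-case factor for the chosen operation and encoding has to be applied.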
Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
Root Causes: Controlling Syntax
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion; enable code execution
Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Case Study: Opera
• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root cause: Interpreting “white space”
  – A problem with the HTML 4.0 spec
<a href=[U+180E]onclick=alert()>
DEMO: Opera White Space Characters
  MVS U+180E
Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…
Root Causes: Specifications
1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – “When designing a markup language or data protocol the use of U+FEFF can be restricted to that of Byte Order Mark. In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion; enable code execution; data loss
Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion; enable code execution
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input
Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
Tools
• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against visual spoofing and homograph attacks
Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher - Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com
Tools: Visual Spoofing Detection API
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share.
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.
Chris Weber, www.lookout.net, Casaba Security
www.casabasecurity.com
Root Causes: IDN – so what's the problem?
• Divergent browser implementations
• Confusables exist
• IDNA and Nameprep based on Unicode 3.2
  – We're up to Unicode 5.1 (larger repertoire)
Attack Vectors: IDN homograph attacks
Some browsers allow .COM IDNs based on script family
  – (Latin has a big family)
Safari
Opera
www.google.com is not www.gooɡle.com
  Latin g: U+0067
  Latin ɡ: U+0261
Root Causes: Guidance for Visual Spoofing
• Normalize with NFKC
• Homograph and confusables detection
• Specifications
  – IDNA, Stringprep
• Guidance
  – Unicode Consortium, ICANN, IETF, IANA
March 2009 copy 2009 Chris Weber
ICANN guidelines v20
ndash Inclusion-based
ndash Script limitations
ndash Character limitations
Registries apply the guidance
ndash define the allowed characters per TLD
ndash Collaboration with IANA
Registrars sell the domain names
wwwcasabasecuritycom
Root CausesGuidance for International Domain Names
March 2009 copy 2009 Chris Weber
ICANN guidelines v20
ndash Inclusion-based
ndash Script limitations
ndash Character limitations
wwwcasabasecuritycom
Root CausesThe state of International Domain Names
Deny-all default seems to be the right concept
A script can cross many blocks Even with limited script choices therersquos plenty to choose from
Great for domain labels but sub domain labels still open to punctuation and syntax spoofing
March 2009 copy 2009 Chris Weber
bull Registrars still allow
ndash Confusables
ndash Combining marks
ndash Single Whole and Mixed-script
bull Registrars canrsquot control
ndash Syntax spoofing in sub domain labels
wwwcasabasecuritycom
Root CausesThe state of International Domain Names
March 2009 copy 2009 Chris Weber
bull Non-Unicode attacks
bull Confusables
bull Invisibles
bull Problematic font-rendering
bull Manipulating Combining Marks
bull Bidi and syntax spoofing
wwwcasabasecuritycom
Attack VectorsVisual spoofing Vectors
March 2009 copy 2009 Chris Weber
rn can look like m in certain fonts
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
wwwmulletscom is not wwwrnulletscom
Latin U+006D
LatinU+0073 U+006E
March 2009 copy 2009 Chris Weber
Are you using mono-width fonts
0 and O
1 and l
5 and S
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
Classic long URLrsquos
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
The Confusables
ndash Single script
ndash Mixed script
ndash Whole script
wwwcasabasecuritycom
Attack VectorsDefining Homographs
March 2009 copy 2009 Chris Weber
wwwɑpplecom User thinks lsquoarsquo
Really itrsquos Latin small letter Alpha lsquoɑrsquo
wwwlooĸoutnet
User thinks lsquokrsquo
Really itrsquos Latin letter kra lsquoĸrsquo
wwwcasabasecuritycom
Attack VectorsSingle-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwg๐๐glecom User thinks lsquoorsquo
Really itrsquos Thai digit zero lsquo๐rsquo
wwwfaϲebookcom
User thinks lsquocrsquo
Really itrsquos Greek lunate sigma symbol lsquocrsquo
wwwᏀooglecom
Really itrsquos Cherokee letter Nah lsquoᏀrsquo
wwwcasabasecuritycom
Attack VectorsMixed-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwаЬсcom
User thinks lsquoabcrsquo
Really itrsquos Cyrillic script
wwwігѕgov
User thinks lsquoirsrsquo
Really itrsquos Greek script
wwwcasabasecuritycom
Attack VectorsWhole-script and The Confusables
March 2009 copy 2009 Chris Weber
Browsers whitelist ORG
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Others donrsquot necessarily buthellip
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
bull ORG is whitelisted
ndash Limited characters available
bull To unscrutinizing eyes
iacute looks like i
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwmozillaorg is not wwwmoziacutellaorg
Latin U+0069
LatinU+00ED
March 2009 copy 2009 Chris Weber
(This case doesnrsquot work anymore)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
FULLWIDTH SOLIDUSU+FF0F
March 2009 copy 2009 Chris Weber
(Normalized to a U+002F)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
SOLIDUSU+002F
March 2009 copy 2009 Chris Weber
U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
U+FF65 KATAKANA MIDDLE DOT
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with and lookalikes
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes
httpwwwgooglecom
Katakana DotU+FF65
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecomノpathノfilenottrustedorg
Katakana NoU+FF89
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
IDN Visual Spoofing
March 2009 copy 2009 Chris Weber
bull Visual Spoofing Detection API
ndash Detects Confusables
ndash Detects Invisibles
ndash Detections syntax and punctuation lookalikes
ndash Detects combining mark tricks
bull Currently in testing
bull Release planned for Fall 2009
wwwcasabasecuritycom
IDN Visual SpoofingSolutions and Defenses (yes there is one)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
bull Fonts render glyphs confusingly
bull Fonts render glyphs as empty white space
httpwwwgooglecom phreedomorg
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)
A is A (Lucida Sans Unicode Courier New)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Multiple combining marks
o looks like U+006F U+0304
o is U+006F U+0304 U+0304
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Order of combining marksndash ȏ and ouml under NFKC
ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt
ltU+006F U+0311U+0308gt ltU+020F U+0308gt
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
bull httpunicodeorgreportstr9
ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo
ndash forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
March 2009 copy 2009 Chris Weber
Commonly occur in charset transformations and even innocuous APIrsquos
Impact Filter evasion Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When prime becomes
U+2032 PRIME
wwwcasabasecuritycom
Root CausesBest-fit mappings
March 2009 copy 2009 Chris Weber
Net runtime will marshall a string as LPStr to a pinvoke function
How can we best-fit the lt character
bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket
How can we best-fit the s character
bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex
To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]
wwwcasabasecuritycom
Windows best-fit pInvokeBest-fit mappings
March 2009 copy 2009 Chris Weber
bull Scrutinize charactercharset manipulation APIrsquos
bull Use EncoderFallback with SystemTextEncoding
bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
bull Use Unicode end-to-end
wwwcasabasecuritycom
Root CausesGuidance for Best-Fit mappings
March 2009 copy 2009 Chris Weber
bull A popular social networking site in 2008
bull Implemented complex filtering logic to prevent XSS
ndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting
ndash Root Cause best-fit mappings
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
-moz-binding()
was not allowed buthellip
-[U+ff4d]oz-binding()
would best-fit map
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
Normalizing strings after validation is dangerous
Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
NFD - Decompose (canonical)
NFC - Decompose (canonical) Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility) Recompose
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
İ becomes I +
wwwcasabasecuritycom
Root CausesNormalization
U+0130 U+0049 U+0307
March 2009 copy 2009 Chris Weber
But are there dangerous characters
You bethellip with NFKC and NFKD you could control HTML or other parsing
﹤ becomes lt
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
﹤ becomes lt
toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
Normalize strings before validation
NFKC first defense against Visual spoofing
wwwcasabasecuritycom
Root CausesGuidance for Normalization
March 2009 copy 2009 Chris Weber
Non-shortest or overlong UTF-8
Impact Filter evasion Enable code execution
Application gets C0A7
OSFramework sees 27
Database gets
wwwcasabasecuritycom
Root CausesNon-shortest form UTF-8
March 2009 copy 2009 Chris Weber
bull Unicode specification forbids
ndash Generation of non-shortest form
ndash Interpretation of non-shortest form for BMP
bull Validate UTF-8 encoding (throw on error)
wwwcasabasecuritycom
Root CausesGuidance for Non-shortest form UTF-8
March 2009 copy 2009 Chris Weber
How many ways can you say
wwwcasabasecuritycom
Attack VectorsDirectory traversal
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Directory traversal test casesndash httpsiterootsystem
ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem
ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem
ndash Division Slash U+2215 best-fithttpsiteroot E28895system
ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system
wwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Unassigned code points
ndash U+2073
bull Illegal code points
ndash Half a surrogate pair
bull Code points with special meaning
ndash U+FEFF is the BOM
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesHandling the Unexpected
March 2009 copy 2009 Chris Weber
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
lt41 C2 3E 41gt becomes
lt41 41gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
ltimg src=[0xC2]gt onerror=alert(1)ltbr gt
becomes
ltimg src=gt onerror=alert(1)ltbr gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
Correcting insecurely rather than failing
ndash Substituting a lsquorsquo or a lsquorsquo would be bad
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-substitution
March 2009 copy 2009 Chris Weber
ldquodeletion of noncharactersrdquo (UTR-36)
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
ltscr[U+FEFF]iptgt becomes ltscriptgt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
bull Fail or error
bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe
wwwcasabasecuritycom
Root CausesSolutions for Handling the Unexpected
March 2009 copy 2009 Chris Weber
bull Bypass filters WAFrsquos NIDS and validation
bull Exploit delivery techniques
ndash Eg Cross-site scripting (buffer overflow of the Web)
wwwcasabasecuritycom
Attack VectorsFilter evasion
March 2009 copy 2009 Chris Weber
Safari and Firefox BOM consumptionndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting
ndash Root Cause Character deletion
lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt
Can be nastier
lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt
wwwcasabasecuritycom
Case Study Apple and Mozilla
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Safari BOM injection for XSS
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
A Closer Look The BOM
BOMU+FEFF
March 2009 copy 2009 Chris Weber
bull Attackers manipulate casing operations to inject otherwise prohibited characters
bull Casing can multiply the buffer sizes needed
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
toLower(ldquoİrdquo) == ldquoirdquo
toLower(ldquoscrİptrdquo) == ldquoscriptrdquo
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
len(x) = len(toLower(x))
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
bull Perform casing operations before validation
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Casing
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution
Casing - maximum expansion factors
Operation   UTF         Factor   Sample
Lower       8           1.5X     Ⱥ U+023A
Lower       16, 32      1X       A U+0041
Upper       8, 16, 32   3X       ΐ U+0390
Source: Unicode Technical Report 36
Normalization - maximum expansion factors
Operation   UTF         Factor   Sample
NFC         8           3X       𝅘𝅥𝅮 U+1D160
NFC         16, 32      3X       שּׁ U+FB2C
NFD         8           3X       ΐ U+0390
NFD         16, 32      4X       ᾂ U+1F82
NFKC/NFKD   8           11X      ﷺ U+FDFA
NFKC/NFKD   16, 32      18X      ﷺ U+FDFA
Source: Unicode Technical Report 36
Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
– ICU, .NET
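The expansion factors from UTR 36 can be checked with a short Python sketch. The helper name `expansion` is an assumption for illustration; it measures encoded size in bytes, so UTF-16 is encoded without a BOM.

```python
import unicodedata

def expansion(ch: str, form: str, encoding: str) -> float:
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before

# Worst cases cited on the slides.
nfkc_utf8 = expansion("\ufdfa", "NFKC", "utf-8")       # U+FDFA in UTF-8
nfkc_utf16 = expansion("\ufdfa", "NFKC", "utf-16-le")  # U+FDFA in UTF-16
nfd_utf8 = expansion("\u0390", "NFD", "utf-8")         # U+0390 in UTF-8
nfd_utf16 = expansion("\u1f82", "NFD", "utf-16-le")    # U+1F82 in UTF-16

# Uppercasing U+0390 yields three code points.
upper_len = len("\u0390".upper())
```

A buffer sized from the input length alone is too small whenever one of these operations runs on the data afterwards.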
Root Causes: Controlling Syntax
• White space and line breaks
– E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution
Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Case Study: Opera
• Unicode formatter characters exploited for XSS
– Damage: Filter evasion, controlling syntax
– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
– Root Cause: Interpreting “white space”
– A problem with the HTML 4.0 spec
<a href=[U+180E]onclick=alert()>
DEMO: Opera White Space Characters
MVS U+180E
Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…
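One way to be careful is to flag any character that could act as a separator but is not in the set your own filter treats as white space. This is a Python sketch under assumptions: the allowlist `EXPECTED_WHITESPACE` and the function name are hypothetical, and the general categories checked (Zs, Zl, Zp, Cf, Cc) are a deliberately broad choice.

```python
import unicodedata

# Assumption: the white space a hypothetical filter already understands.
EXPECTED_WHITESPACE = {" ", "\t", "\n", "\r", "\f"}
SEPARATOR_LIKE = {"Zs", "Zl", "Zp", "Cf", "Cc"}

def separator_like(text: str) -> list:
    """Return (index, code point, category) for unexpected separator-like characters."""
    hits = []
    for index, ch in enumerate(text):
        category = unicodedata.category(ch)
        if ch not in EXPECTED_WHITESPACE and category in SEPARATOR_LIKE:
            hits.append((index, f"U+{ord(ch):04X}", category))
    return hits

sample = "<a href=\u180eonclick=alert()>"
hits = separator_like(sample)
```

U+180E is reported whether the Unicode data in use classifies it as a space separator or as a format character, since both categories are in the flagged set.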
Root Causes: Specifications
1) Character stability
– IDNA/Nameprep based on Unicode 3.2
2) Designs
– Specs are carefully designed but not always perfect
• This could have been a problem
– “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
– HTML 4.01
• Defines four whitespace characters and explicitly leaves handling other characters up to the implementer
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data loss
Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay
Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input
Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
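A minimal Python sketch of "force UTF-8, error if uncertain". The function names and the policy of rejecting any non-UTF-8 declaration are assumptions for illustration; a real application would take the two declared values from its own HTTP and HTML parsing.

```python
from typing import Optional

def resolve_charset(header_charset: Optional[str], meta_charset: Optional[str]) -> str:
    """Accept UTF-8 only; refuse conflicting or non-UTF-8 declarations."""
    declared = {c.strip().lower() for c in (header_charset, meta_charset) if c}
    if declared <= {"utf-8"}:  # nothing declared, or only UTF-8
        return "utf-8"
    raise ValueError(f"unexpected charset declaration(s): {sorted(declared)}")

def decode_body(body: bytes,
                header_charset: Optional[str] = None,
                meta_charset: Optional[str] = None) -> str:
    charset = resolve_charset(header_charset, meta_charset)
    # errors="strict" raises UnicodeDecodeError on ill-formed input.
    return body.decode(charset, errors="strict")
```

Under this policy the mismatch shown on the slide (ISO-8859-1 in the header, shift_jis in the meta tag) is rejected instead of leaving the decision to sniffing.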
Tools
• Watcher
– Web-app security testing and auditing
• Visual Spoofing Detection API
– Providing guarantees against Visual Spoofing and Homograph attacks
Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher - Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com
Tools: Visual Spoofing Detection API
• Problem
– Unicode enables visual-spoofing-maximus
• Solution
– Confusable detection
– Invisibles detection
– Syntax spoof detection
– more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
– Browsers
– Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net, Casaba Security
Attack Vectors: IDN homograph attacks
Some browsers allow .COM IDNs based on script family
– (Latin has a big family)
Browser examples: Safari, Opera
www.google.com is not www.gooɡle.com
Latin U+0067 ‘g’ vs. Latin U+0261 ‘ɡ’
Root Causes: Guidance for Visual Spoofing
• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
– IDNA, Stringprep
• Guidance
– Unicode Consortium, ICANN, IETF, IANA
Root Causes: Guidance for International Domain Names
ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations
Registries apply the guidance
– Define the allowed characters per TLD
– Collaboration with IANA
Registrars sell the domain names
Root Causes: The state of International Domain Names
ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations
Deny-all default seems to be the right concept
A script can cross many blocks. Even with limited script choices there’s plenty to choose from
Great for domain labels, but subdomain labels are still open to punctuation and syntax spoofing
• Registrars still allow
– Confusables
– Combining marks
– Single, Whole, and Mixed-script
• Registrars can’t control
– Syntax spoofing in subdomain labels
Attack Vectors: Visual spoofing vectors
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing
Attack Vectors: Non-Unicode homograph attacks
rn can look like m in certain fonts
www.mullets.com is not www.rnullets.com
Latin U+006D ‘m’ vs. Latin U+0072 U+006E ‘rn’
Are you using mono-width fonts?
0 and O
1 and l
5 and S
Classic long URLs
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements
Attack Vectors: Defining Homographs
The Confusables
– Single script
– Mixed script
– Whole script
Attack Vectors: Single-script and The Confusables
www.ɑpple.com
User thinks ‘a’
Really it’s Latin small letter alpha ‘ɑ’
www.looĸout.net
User thinks ‘k’
Really it’s Latin small letter kra ‘ĸ’
Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com
User thinks ‘o’
Really it’s Thai digit zero ‘๐’
www.faϲebook.com
User thinks ‘c’
Really it’s Greek lunate sigma symbol ‘ϲ’
www.Ꮐoogle.com
Really it’s Cherokee letter nah ‘Ꮐ’
Attack Vectors: Whole-script and The Confusables
www.аЬс.com
User thinks ‘abc’
Really it’s Cyrillic script
www.ігѕ.gov
User thinks ‘irs’
Really it’s Cyrillic script
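A rough Python sketch of mixed-script detection for the examples above. It approximates a character's script from the first word of its Unicode character name, which is a crude stand-in for the real Script property and an assumption of this sketch; the function names are hypothetical.

```python
import unicodedata

def scripts_in(label: str) -> set:
    """Approximate the scripts used in a label from Unicode character names."""
    found = set()
    for ch in label:
        if ch.isascii() and not ch.isalpha():
            continue  # ignore ASCII digits and hyphens
        found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found

def is_mixed_script(label: str) -> bool:
    return len(scripts_in(label)) > 1

mixed = scripts_in("fa\u03f2ebook")         # Greek lunate sigma among Latin letters
whole = scripts_in("\u0430\u042c\u0441")    # the Cyrillic 'abc' lookalike
single = scripts_in("\u0251pple")           # Latin alpha, still Latin
```

Only the mixed-script case is caught this way. Single-script and whole-script confusables pass the check, which is why a confusables table is needed as well.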
Attack Vectors: IDN homograph attacks
Browsers whitelist .ORG
Others don’t necessarily, but…
• .ORG is whitelisted
– Limited characters available
• To unscrutinizing eyes
– í looks like i
www.mozilla.org is not www.mozílla.org
Latin U+0069 ‘i’ vs. Latin U+00ED ‘í’
Attack Vectors: IDN Syntax Spoofing with / lookalikes
(This case doesn’t work anymore)
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F
(Normalized to a U+002F)
http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F
Attack Vectors: IDN Syntax Spoofing with / and . lookalikes
╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT
Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
(However, punctuation not required…)
http://www･google･com
Katakana Dot U+FF65
Attack Vectors: IDN Syntax Spoofing with / lookalikes
(However, punctuation not required…)
http://www.google.comノpathノfile.nottrusted.org
Katakana No U+FF89
Browser sees and displays a valid IDN
DNS sees Punycode
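A hedged Python sketch of what the browser and DNS each see. The lookalike set is taken from the slides plus U+30CE (the full-width form of the Katakana No glyph, added here as an assumption) and is not complete; the function names are hypothetical. `str.encode("idna")` is Python's built-in IDNA 2003 codec.

```python
import unicodedata

# Lookalike code points listed on the slides (not a complete set).
SYNTAX_LOOKALIKES = {
    "\u2571", "\u3033", "\ua738", "\ua739",
    "\uff65", "\uff0f", "\uff89", "\u30ce",
}

def syntax_lookalikes(host: str) -> list:
    """List the code points in a host name that imitate URL punctuation."""
    return [f"U+{ord(ch):04X}" for ch in host if ch in SYNTAX_LOOKALIKES]

def to_ace(host: str) -> bytes:
    """ASCII-compatible (Punycode) form, roughly what DNS sees."""
    return host.encode("idna")

spoofed = "www.google.com\uff0fpath\uff0ffile.nottrusted.org"
found = syntax_lookalikes(spoofed)

# NFKC maps FULLWIDTH SOLIDUS to a real '/', as noted on the slide.
normalized = unicodedata.normalize("NFKC", spoofed)

# The homograph from the earlier slide becomes an xn-- label.
ace = to_ace("www.goo\u0261le.com")
```

Showing the `xn--` form to the user whenever a label contains flagged characters is one possible mitigation.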
DEMO: IDN Visual Spoofing
IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
– Detects Confusables
– Detects Invisibles
– Detects syntax and punctuation lookalikes
– Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009
Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
Attack Vectors: Visual Spoofing - Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org
··· is ··· (Arial, Gothic)
A is A (Lucida Sans Unicode, Courier New)
Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
ō looks like U+006F U+0304
ō̄ is U+006F U+0304 U+0304
• Order of combining marks
– ȏ and ö under NFKC
<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
<U+006F U+0311 U+0308> becomes <U+020F U+0308>
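Both combining-mark tricks can be reproduced with Python's `unicodedata` module. The helper names are hypothetical, and `repeated_marks` is only a naive check for an immediately repeated mark.

```python
import unicodedata

def nfkc_codepoints(text: str) -> list:
    """Code points of the NFKC form, in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFKC", text)]

def repeated_marks(text: str) -> list:
    """Indexes at which a combining mark immediately repeats."""
    return [
        i for i in range(1, len(text))
        if text[i] == text[i - 1] and unicodedata.combining(text[i])
    ]

# Same base letter, same two marks, different order: different NFKC results.
diaeresis_first = nfkc_codepoints("o\u0308\u0311")
breve_first = nfkc_codepoints("o\u0311\u0308")

# A doubled macron survives normalization and may render like a single one.
doubled = repeated_marks("o\u0304\u0304")
```

Normalization alone does not remove either trick, so a comparison of normalized strings still treats these visually similar sequences as distinct.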
Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
– “These characters are to be avoided wherever possible, because of security concerns”
– Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
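A short Python sketch for locating the two override characters, using the bidirectional class reported by `unicodedata.bidirectional`. The function name is an assumption.

```python
import unicodedata

OVERRIDE_CLASSES = {"LRO", "RLO"}  # bidi classes of U+202D and U+202E

def find_bidi_overrides(text: str) -> list:
    """Return (index, code point) for each explicit directional override."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if unicodedata.bidirectional(ch) in OVERRIDE_CLASSES
    ]

rlo_hit = find_bidi_overrides("abc\u202edef")
lro_hit = find_bidi_overrides("\u202dx")
```

Whether to reject or strip such input is an application decision; the sketch only reports positions.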
Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
U+2032 PRIME
Windows best-fit P/Invoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)
How can we best-fit the s character?
• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)
To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]
Root Causes: Guidance for Best-fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
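The .NET and Win32 options above make a conversion fail instead of substituting a lookalike. As a Python stand-in for the same idea (not the APIs named on the slide), a strict encode raises when the target charset lacks the character. The function name and default charset are assumptions.

```python
def to_legacy(text: str, charset: str = "cp1252") -> bytes:
    """Convert to a legacy charset, raising instead of guessing a replacement."""
    return text.encode(charset, errors="strict")

def converts_cleanly(text: str, charset: str = "cp1252") -> bool:
    try:
        to_legacy(text, charset)
        return True
    except UnicodeEncodeError:
        return False

plain_ok = converts_cleanly("caf\u00e9")                # representable in cp1252
fullwidth_ok = converts_cleanly("-\uff4doz-binding()")  # U+FF4D has no cp1252 form
sigma_ok = converts_cleanly("\u03c3")                   # neither does Greek sigma
```

Python's codecs do not apply best-fit tables, so the full-width m is rejected here rather than mapped to an ASCII m.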
Case Study: Social Networking - Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
– Root Cause: Best-fit mappings
-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map
Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution
NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose
İ becomes I + ◌̇
U+0130 becomes U+0049 U+0307
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
U+FE64 becomes U+003C
toNFKC("﹤script>") = "<script>"
Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC: first defense against visual spoofing
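The ordering rule can be illustrated with Python's `unicodedata.normalize`. The markup check and pipeline names are hypothetical; the payload uses U+FE64 and U+FE65, the small less-than and greater-than signs.

```python
import unicodedata

def has_markup(text: str) -> bool:
    """Hypothetical validator: rejects ASCII angle brackets."""
    return "<" in text or ">" in text

def validate_then_normalize(text: str) -> str:
    """Unsafe order: normalization runs after the check."""
    if has_markup(text):
        raise ValueError("markup not allowed")
    return unicodedata.normalize("NFKC", text)

def normalize_then_validate(text: str) -> str:
    """Order recommended on the slide."""
    normalized = unicodedata.normalize("NFKC", text)
    if has_markup(normalized):
        raise ValueError("markup not allowed")
    return normalized

payload = "\ufe64script\ufe65"

# Canonical decomposition of U+0130, as on the earlier slide.
decomposed = unicodedata.normalize("NFD", "\u0130")
```

The unsafe pipeline returns a real `<script>` tag for this payload, while the recommended order rejects it.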
Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '
Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
– Generation of non-shortest form
– Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
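A small Python sketch of the difference between a lenient decoder and a validating one. `naive_decode_2byte` is a deliberately flawed, hypothetical decoder that skips the shortest-form check; Python's own UTF-8 codec is used as the validating side.

```python
def naive_decode_2byte(lead: int, trail: int) -> str:
    """Flawed on purpose: combines the payload bits without checking
    that the result actually needs two bytes."""
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))

def strict_decode(data: bytes) -> str:
    """Validating decode: raises UnicodeDecodeError on ill-formed UTF-8."""
    return data.decode("utf-8", errors="strict")

overlong = bytes([0xC0, 0xA7])

# The lenient path turns C0 A7 into an apostrophe (0x27).
lenient_result = naive_decode_2byte(overlong[0], overlong[1])

try:
    strict_decode(overlong)
    strict_rejected = False
except UnicodeDecodeError:
    strict_rejected = True
```

A filter that only looks for the byte 0x27 never sees it in the raw input, yet a lenient component further down the chain produces it.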
Attack Vectors: Directory traversal
How many ways can you say it?
• Directory traversal test cases (variations on the path between http://site/root and system)
– Overlong UTF-8, U+002F: bytes C0 AE
– Full-width Solidus U+FF0F, normalized KC or KD: bytes EF BC 8F
– Division Slash U+2215, best-fit: bytes E2 88 95
– Two dot leader U+2025, normalized KC or KD: bytes E2 80 A5 E2 80 A5
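The normalization-based test cases can be checked in Python. The dictionary of candidates and the `looks_like_traversal` helper are assumptions for illustration, and the helper only covers NFKC, not best-fit or overlong forms.

```python
import unicodedata

CANDIDATES = {
    "FULLWIDTH SOLIDUS": "\uff0f",
    "DIVISION SLASH": "\u2215",
    "TWO DOT LEADER": "\u2025",
}

# What each candidate becomes under NFKC.
nfkc_forms = {
    name: unicodedata.normalize("NFKC", ch) for name, ch in CANDIDATES.items()
}

def looks_like_traversal(path: str) -> bool:
    """Normalize first, then look for a '..' path segment."""
    normalized = unicodedata.normalize("NFKC", path)
    return ".." in normalized.split("/")

disguised = "root\uff0f\u2025\uff0fsystem"
```

DIVISION SLASH is unchanged by NFKC, which is consistent with the slide listing it as a best-fit case rather than a normalization case.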
Root Causes: Handling the Unexpected
• Unassigned code points
– U+2073
• Illegal code points
– Half a surrogate pair
• Code points with special meaning
– U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution
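A minimal Python sketch that sorts the three cases above using general categories from `unicodedata`. The function name and the returned labels are hypothetical, and the result for unassigned code points depends on the Unicode version bundled with the interpreter.

```python
import unicodedata

def classify(ch: str) -> str:
    """Label a single code point as one of the 'unexpected' kinds, or 'ok'."""
    category = unicodedata.category(ch)
    if category == "Cn":
        return "unassigned"
    if category == "Cs":
        return "surrogate"
    if ch == "\ufeff":
        return "bom"
    return "ok"

labels = [classify(ch) for ch in ("\u2073", "\ud800", "\ufeff", "A")]
```

An input handler could fail or substitute U+FFFD for anything not labeled "ok", rather than passing it along unchanged.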
Root Causes: Handling the Unexpected - Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>
<img src="[0xC2]"> onerror="alert(1)"<br />
becomes
<img src="> onerror="alert(1)"<br />
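The byte example above can be reproduced in Python. `over_consuming_decode` is a deliberately flawed, hypothetical decoder written only to show the bug; the comparison uses Python's UTF-8 codec with `errors="replace"`.

```python
def over_consuming_decode(data: bytes) -> str:
    """Flawed on purpose: after a two-byte lead byte, the next byte is
    always swallowed, even when it is not a continuation byte."""
    out = []
    i = 0
    while i < len(data):
        byte = data[i]
        if byte < 0x80:
            out.append(chr(byte))
            i += 1
        elif 0xC2 <= byte <= 0xDF and i + 1 < len(data):
            trail = data[i + 1]
            if 0x80 <= trail <= 0xBF:
                out.append(chr(((byte & 0x1F) << 6) | (trail & 0x3F)))
            i += 2  # the bug: both bytes are consumed regardless
        else:
            i += 1
    return "".join(out)

data = bytes([0x41, 0xC2, 0x3E, 0x41])

flawed = over_consuming_decode(data)
safer = data.decode("utf-8", errors="replace")
```

The flawed decoder drops the `>` along with the stray lead byte; the codec replaces only the lead byte with U+FFFD and keeps the `>`.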
Root Causes: Handling the Unexpected - Character-substitution
Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad
Root Causes: Handling the Unexpected - Character-deletion
“deletion of noncharacters” (UTR-36)
March 2009 copy 2009 Chris Weber
ltscr[U+FEFF]iptgt becomes ltscriptgt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
bull Fail or error
bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe
wwwcasabasecuritycom
Root CausesSolutions for Handling the Unexpected
March 2009 copy 2009 Chris Weber
bull Bypass filters WAFrsquos NIDS and validation
bull Exploit delivery techniques
ndash Eg Cross-site scripting (buffer overflow of the Web)
wwwcasabasecuritycom
Attack VectorsFilter evasion
March 2009 copy 2009 Chris Weber
Safari and Firefox BOM consumptionndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting
ndash Root Cause Character deletion
lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt
Can be nastier
lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt
wwwcasabasecuritycom
Case Study Apple and Mozilla
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Safari BOM injection for XSS
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
A Closer Look The BOM
BOMU+FEFF
March 2009 copy 2009 Chris Weber
bull Attackers manipulate casing operations to inject otherwise prohibited characters
bull Casing can multiply the buffer sizes needed
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
toLower(ldquoİrdquo) == ldquoirdquo
toLower(ldquoscrİptrdquo) == ldquoscriptrdquo
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
len(x) = len(toLower(x))
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
bull Perform casing operations before validation
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Casing
March 2009 copy 2009 Chris Weber
bull Incorrect assumptions about string sizes (chars vs bytes)
bull Improper width calculations
bull Impact Enable code execution
wwwcasabasecuritycom
Root CausesBuffer Overflows
March 2009 copy 2009 Chris Weber
Casing - maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
Lower 8 15 Ⱥ U+023A
16 32 1 A U+0041
Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
Normalization- maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
NFC8 3X 119136 U+1D160
16 32 3X ש U+FB2C
NFD8 3X ΐ U+0390
16 32 4X ᾂ U+1F82
NFKCNFKD8 11X
ملسو هيلع هللا ىلص U+FDFA16 32 18X
Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
bull Know the difference between bytes and chars
bull Secure coding
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Buffer Overflows
March 2009 copy 2009 Chris Weber
bull White space and line breaks
ndash Eg when U+180E acts like U+0020
bull Quotation marks
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesControlling Syntax
March 2009 copy 2009 Chris Weber
bull Manipulate HTML parsers and javascriptinterpreters
bull Control protocols
wwwcasabasecuritycom
Attacks and ExploitsControlling syntax
March 2009 copy 2009 Chris Weber
bull Unicode formatter characters exploited for XSS
ndash Damage Filter evasion controlling syntax
ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting
ndash Root Cause Interpreting ldquowhite spacerdquo
ndash A problem with HTML 40 spec
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
lta href=[U+180E]onclick=alert()gt
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Opera White Space Characters
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Case Study Opera
MVSU+180E
March 2009 copy 2009 Chris Weber
bull Question specifications
bull Be carefulhellip
wwwcasabasecuritycom
Root CausesGuidance for Controlling Syntax
March 2009 copy 2009 Chris Weber
1) Character stabilityndash IDNANameprep based on Unicode 32
2) Designsndash Specs are carefully designed but not always perfect
bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of
U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo
ndash HTML 401 bull Defines four whitespace characters and explicitly leaves
handling other characters up to implementer
wwwcasabasecuritycom
Root CausesSpecifications
March 2009 copy 2009 Chris Weber
bull Converting between charsets is dangerous
bull Mapping tables and algorithms vary across platforms
bull Impact Filter evasion Enable code execution Data-loss
wwwcasabasecuritycom
Root CausesCharset Transformations
March 2009 copy 2009 Chris Weber
bull Avoid if possible
bull Use Unicode as the broker
bull Beware the PUA mappings
bull Transform case and normalize prior to validation and redisplay
wwwcasabasecuritycom
Root CausesGuidance for Charset Transformations
March 2009 copy 2009 Chris Weber
bull Some charset identifiers are ill-defined
bull Vendor implementations vary
bull User-agents may sniff if confused
bull Attackers manipulate behavior
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCharset Mismatches
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Root CausesCharset Mismatches
Content-Type charset=ISO-8859-1
ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt
Attacker-controlled input
March 2009 copy 2009 Chris Weber
bull Force UTF-8
bull Error if uncertain
wwwcasabasecuritycom
Root CausesGuidance for Charset Mismatches
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Watcher
ndash Web-app security testing and auditing
bull Visual Spoofing Detection API
ndash Providing guarantees against Visual Spoofing and Homograph attacks
wwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
bull Unicode transformation hot-spotsbull User-controlled HTMLbull Cross-domain issuesbull Insecure cookiesbull Insecure HTTPHTTPS transitionsbull SSL protocol and certificate issuesbull XSS hot-spotsbull Flash issuesbull Silverlight issuesbull Information disclosure
wwwcasabasecuritycom
ToolsWatcher ndash Some of the Passive Checks Included
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
httpwebsecuritytoolcodeplexcom
wwwcasabasecuritycom
ToolsWatcher - Web-app Security Testing and Auditing
March 2009 copy 2009 Chris Weber
bull Problemndash Unicode enables visual-spoofing-maximus
bull Solutionndash Confusable detection
ndash Invisibles detection
ndash Syntax spoof detection
ndash more
wwwcasabasecuritycom
ToolsVisual Spoofing Detection API
March 2009 copy 2009 Chris Weber
bull Cross-platform component library written in C
bull Can be applied in user-agents or any softwarendash Browsers
ndash Email clients
bull Planned for release Fall 2009
bull Email me with questions
wwwcasabasecuritycom
ToolsVisual Spoofing Detection API
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
ToolsVisual Spoofing Protection Demo
Thank you
Contact me with questions new test cases or ideas to share
Visit my website for test cases Unicode and security tools and the Anti-Visual-Spoofing API
Chris WeberwwwlookoutnetCasaba Security
wwwcasabasecuritycom
March 2009 copy 2009 Chris Weber
Safari
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Opera
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwgooglecom is not wwwgooɡlecom
Latin U+0069
LatinU+0261
gɡ
March 2009 copy 2009 Chris Weber
bull Normalize with NFKC
bull Homograph and Confusables detection
bull Specifications
ndash IDNA Stringprep
bull Guidance
ndash Unicode Consortium ICANN IETF IANA
wwwcasabasecuritycom
Root CausesGuidance for Visual Spoofing
March 2009 copy 2009 Chris Weber
ICANN guidelines v20
ndash Inclusion-based
ndash Script limitations
ndash Character limitations
Registries apply the guidance
ndash define the allowed characters per TLD
ndash Collaboration with IANA
Registrars sell the domain names
wwwcasabasecuritycom
Root CausesGuidance for International Domain Names
March 2009 copy 2009 Chris Weber
ICANN guidelines v20
ndash Inclusion-based
ndash Script limitations
ndash Character limitations
wwwcasabasecuritycom
Root CausesThe state of International Domain Names
Deny-all default seems to be the right concept
A script can cross many blocks Even with limited script choices therersquos plenty to choose from
Great for domain labels but sub domain labels still open to punctuation and syntax spoofing
March 2009 copy 2009 Chris Weber
bull Registrars still allow
ndash Confusables
ndash Combining marks
ndash Single Whole and Mixed-script
bull Registrars canrsquot control
ndash Syntax spoofing in sub domain labels
wwwcasabasecuritycom
Root CausesThe state of International Domain Names
March 2009 copy 2009 Chris Weber
bull Non-Unicode attacks
bull Confusables
bull Invisibles
bull Problematic font-rendering
bull Manipulating Combining Marks
bull Bidi and syntax spoofing
wwwcasabasecuritycom
Attack VectorsVisual spoofing Vectors
March 2009 copy 2009 Chris Weber
rn can look like m in certain fonts
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
wwwmulletscom is not wwwrnulletscom
Latin U+006D
LatinU+0073 U+006E
March 2009 copy 2009 Chris Weber
Are you using mono-width fonts
0 and O
1 and l
5 and S
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
Classic long URLrsquos
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
The Confusables
ndash Single script
ndash Mixed script
ndash Whole script
wwwcasabasecuritycom
Attack VectorsDefining Homographs
March 2009 copy 2009 Chris Weber
wwwɑpplecom User thinks lsquoarsquo
Really itrsquos Latin small letter Alpha lsquoɑrsquo
wwwlooĸoutnet
User thinks lsquokrsquo
Really itrsquos Latin letter kra lsquoĸrsquo
wwwcasabasecuritycom
Attack VectorsSingle-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwg๐๐glecom User thinks lsquoorsquo
Really itrsquos Thai digit zero lsquo๐rsquo
wwwfaϲebookcom
User thinks lsquocrsquo
Really itrsquos Greek lunate sigma symbol lsquocrsquo
wwwᏀooglecom
Really itrsquos Cherokee letter Nah lsquoᏀrsquo
wwwcasabasecuritycom
Attack VectorsMixed-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwаЬсcom
User thinks lsquoabcrsquo
Really itrsquos Cyrillic script
wwwігѕgov
User thinks lsquoirsrsquo
Really itrsquos Greek script
wwwcasabasecuritycom
Attack VectorsWhole-script and The Confusables
March 2009 copy 2009 Chris Weber
Browsers whitelist ORG
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Others donrsquot necessarily buthellip
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
bull ORG is whitelisted
ndash Limited characters available
bull To unscrutinizing eyes
iacute looks like i
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwmozillaorg is not wwwmoziacutellaorg
Latin U+0069
LatinU+00ED
March 2009 copy 2009 Chris Weber
(This case doesnrsquot work anymore)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
FULLWIDTH SOLIDUSU+FF0F
March 2009 copy 2009 Chris Weber
(Normalized to a U+002F)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
SOLIDUSU+002F
March 2009 copy 2009 Chris Weber
U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
U+FF65 KATAKANA MIDDLE DOT
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with and lookalikes
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes
httpwwwgooglecom
Katakana DotU+FF65
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecomノpathノfilenottrustedorg
Katakana NoU+FF89
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
IDN Visual Spoofing
March 2009 copy 2009 Chris Weber
bull Visual Spoofing Detection API
ndash Detects Confusables
ndash Detects Invisibles
ndash Detections syntax and punctuation lookalikes
ndash Detects combining mark tricks
bull Currently in testing
bull Release planned for Fall 2009
wwwcasabasecuritycom
IDN Visual SpoofingSolutions and Defenses (yes there is one)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles

Attack Vectors: Visual Spoofing (Problematic Font-rendering)
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org

Attack Vectors: Visual Spoofing (Problematic Font-rendering)
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode Courier New)

Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
ō looks like U+006F U+0304
ō is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks
• Order of combining marks
– ȏ and ö under NFKC
<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
<U+006F U+0311 U+0308> becomes <U+020F U+0308>
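
Both combining-mark tricks can be reproduced with Python's `unicodedata` module. The sketch below checks the two NFKC results listed above and adds a hypothetical `has_repeated_mark` helper for the doubled-macron case; the helper is an illustration, not part of any real detection library.

```python
import unicodedata


def code_points(text: str) -> str:
    """Format a string as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# Same base letter and the same two marks, in a different order.
diaeresis_first = unicodedata.normalize("NFKC", "o\u0308\u0311")
inverted_breve_first = unicodedata.normalize("NFKC", "o\u0311\u0308")


def has_repeated_mark(text: str) -> bool:
    """True if a combining mark is immediately followed by itself."""
    return any(
        a == b and unicodedata.combining(a) != 0
        for a, b in zip(text, text[1:])
    )


print(code_points(diaeresis_first))        # U+00F6 U+0311
print(code_points(inverted_breve_first))   # U+020F U+0308
print(has_repeated_mark("o\u0304\u0304"))  # True
```

Both marks have the same combining class, so normalization does not reorder them, and the two sequences stay distinct while rendering almost identically.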

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
– “These characters are to be avoided wherever possible because of security concerns”
– Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
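
Explicit directional controls are easy to flag because `unicodedata.bidirectional` reports their bidi class. A minimal sketch, assuming a hypothetical `find_bidi_controls` helper and covering only the embedding and override controls (the newer isolate controls are left out):

```python
import unicodedata

# Bidi classes of the explicit embedding and override controls.
EXPLICIT_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF"}


def find_bidi_controls(text: str) -> list[tuple[int, str]]:
    """Return (index, character name) for each explicit bidi control."""
    return [
        (index, unicodedata.name(ch))
        for index, ch in enumerate(text)
        if unicodedata.bidirectional(ch) in EXPLICIT_CLASSES
    ]


print(find_bidi_controls("abc\u202edef"))
```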

Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
U+2032 PRIME

Windows best-fit P/Invoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)
How can we best-fit the s character?
• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)
To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

Case Study Social Networking: Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
– Root Cause: best-fit mappings

Case Study Social Networking: Best-fit mappings
-moz-binding()
was not allowed but…
-[U+ff4d]oz-binding()
would best-fit map
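
The case study can be simulated with a toy model. Everything below is an assumption for illustration: `BEST_FIT` is a tiny hypothetical table built from the mappings named in these slides (real tables are far larger and differ between platforms), and `is_blocked` stands in for the site's filter. Python's own codecs do not best-fit, so the strict encoder at the end shows the "fail instead" behaviour that the guidance recommends.

```python
# Hypothetical best-fit table using mappings mentioned in the slides.
BEST_FIT = {
    "\uff4d": "m",   # FULLWIDTH LATIN SMALL LETTER M
    "\u2329": "<",   # LEFT-POINTING ANGLE BRACKET
    "\u3008": "<",   # LEFT ANGLE BRACKET
    "\u015b": "s",   # LATIN SMALL LETTER S WITH ACUTE
    "\u015d": "s",   # LATIN SMALL LETTER S WITH CIRCUMFLEX
}


def best_fit(text: str) -> str:
    """Simulate a lossy legacy conversion that silently substitutes lookalikes."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)


def is_blocked(text: str) -> bool:
    """Stand-in for the filter: reject the dangerous CSS property."""
    return "-moz-binding" in text


def to_legacy_strict(text: str, charset: str = "cp1252") -> bytes:
    """Fail loudly instead of substituting a lookalike."""
    return text.encode(charset, errors="strict")


payload = "-\uff4doz-binding()"
passed_filter = not is_blocked(payload)   # the filter sees no match
after_conversion = best_fit(payload)      # the later conversion creates one

print(passed_filter, after_conversion)
```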

Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution

Root Causes: Normalization
NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose

Root Causes: Normalization
İ becomes I + ◌̇
U+0130 becomes U+0049 U+0307

Root Causes: Normalization
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
U+FE64 becomes U+003C

Root Causes: Normalization
﹤ becomes <
toNFKC(“﹤script>”) = “<script>”
U+FE64 becomes U+003C

Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC is a first defense against visual spoofing
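
The ordering problem is easy to demonstrate with `unicodedata.normalize`. In this sketch `is_safe` is a deliberately simple, hypothetical validator that only looks for U+003C; the point is the order of the two steps, not the quality of the check.

```python
import unicodedata


def is_safe(text: str) -> bool:
    """Hypothetical validator: reject anything containing '<'."""
    return "<" not in text


raw = "\ufe64script>"   # SMALL LESS-THAN SIGN followed by "script>"

# Wrong order: validate, then normalize. The check passes, and the
# normalization step afterwards produces the dangerous string.
validated_first = is_safe(raw)
rendered = unicodedata.normalize("NFKC", raw)

# Right order: normalize, then validate what will actually be used.
normalized_first = is_safe(unicodedata.normalize("NFKC", raw))

# Canonical decomposition from the earlier slide: U+0130 -> U+0049 U+0307.
dotted_i = unicodedata.normalize("NFD", "\u0130")

print(validated_first, rendered, normalized_first)
```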

Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
– Generation of non-shortest form
– Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
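
The C0 A7 example can be worked through by hand. The sketch below assumes a hypothetical lenient decoder (`lenient_decode_2byte`) that skips the shortest-form check, and contrasts it with Python's built-in UTF-8 decoder, which validates and throws.

```python
def lenient_decode_2byte(sequence: bytes) -> str:
    """Hypothetical buggy decoder for one 2-byte sequence: no overlong check."""
    lead, trail = sequence
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


overlong = b"\xc0\xa7"

# The lenient decoder turns the overlong pair into U+0027, an apostrophe.
lenient_result = lenient_decode_2byte(overlong)

# A validating decoder rejects the sequence instead of interpreting it.
try:
    overlong.decode("utf-8")
    strict_accepted = True
except UnicodeDecodeError:
    strict_accepted = False

print(repr(lenient_result), strict_accepted)
```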

Attack Vectors: Directory traversal
How many ways can you say

Attack Vectors
• Directory traversal test cases
– httpsiterootsystem
– Overlong UTF-8 (bytes C0 AE are the overlong form of U+002E; C0 AF would be U+002F): httpsiterootC0AEsystem
– Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
– Division Slash U+2215, best-fit: httpsiteroot E28895system
– Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
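
The byte sequences in these test cases can be checked directly. The sketch below prints the UTF-8 bytes and the NFKC result for each character; the `report` dictionary is just a convenience for the illustration.

```python
import unicodedata

characters = {
    "FULLWIDTH SOLIDUS": "\uff0f",
    "DIVISION SLASH": "\u2215",
    "TWO DOT LEADER": "\u2025",
}

# name -> (UTF-8 bytes as hex, NFKC result)
report = {
    name: (ch.encode("utf-8").hex(" ").upper(), unicodedata.normalize("NFKC", ch))
    for name, ch in characters.items()
}

for name, (utf8_hex, nfkc) in report.items():
    print(f"{name}: {utf8_hex} -> NFKC {nfkc!r}")
```

U+FF0F and U+2025 turn into "/" and ".." under compatibility normalization. U+2215 has no compatibility decomposition, which is consistent with the slide listing it under best-fit instead.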

Root Causes: Handling the Unexpected
• Unassigned code points
– U+2073
• Illegal code points
– Half a surrogate pair
• Code points with special meaning
– U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected (Over-consumption)
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected (Over-consumption)
<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
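
Over-consumption can be modelled with a deliberately buggy decoder. `overconsuming_decode` below is a hypothetical sketch of the flaw (it assumes otherwise-ASCII input) and is not taken from any real product; Python's own decoder is shown next to it as the safe behaviour.

```python
def overconsuming_decode(data: bytes) -> str:
    """Hypothetical buggy decoder: a 2-byte lead swallows the next byte
    even when that byte is not a valid trail byte."""
    out = []
    i = 0
    while i < len(data):
        byte = data[i]
        if 0xC2 <= byte <= 0xDF and i + 1 < len(data):
            trail = data[i + 1]
            if 0x80 <= trail <= 0xBF:
                out.append(bytes([byte, trail]).decode("utf-8"))
            # Bug: an ill-formed pair is dropped whole, trail byte included.
            i += 2
        else:
            out.append(chr(byte))
            i += 1
    return "".join(out)


data = bytes([0x41, 0xC2, 0x3E, 0x41])

buggy = overconsuming_decode(data)             # the '>' (0x3E) disappears
safe = data.decode("utf-8", errors="replace")  # only the bad byte is replaced

print(repr(buggy), repr(safe))
```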

Root Causes: Handling the Unexpected (Character-substitution)
Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected (Character-deletion)
“deletion of noncharacters” (UTR-36)

Root Causes: Handling the Unexpected (Character-deletion)
<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
– A common alternative is ‘?’ which can be safe
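
The difference between deleting and replacing an unexpected character can be sketched as follows. `is_blocked`, `delete_bom` and `replace_bom` are hypothetical helpers written for this illustration.

```python
def is_blocked(text: str) -> bool:
    """Stand-in filter: reject markup containing a script tag."""
    return "<script>" in text.lower()


def delete_bom(text: str) -> str:
    """Risky clean-up: silently delete U+FEFF."""
    return text.replace("\ufeff", "")


def replace_bom(text: str) -> str:
    """Safer clean-up: substitute U+FFFD so the two halves never rejoin."""
    return text.replace("\ufeff", "\ufffd")


payload = "<scr\ufeffipt>"

passed_filter = not is_blocked(payload)   # the filter finds nothing
after_delete = delete_bom(payload)        # later deletion builds the tag
after_replace = replace_bom(payload)      # replacement keeps it broken

print(passed_filter, after_delete, is_blocked(after_replace))
```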

Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
– E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root Cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM
BOM U+FEFF

Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing
toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”

Root Causes: Casing
len(x) != len(toLower(x))

Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
– ICU, .NET
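
Casing results depend on the implementation, so it helps to check what a given runtime actually does. The sketch below uses Python's `str.lower`, which applies the full Unicode mapping and turns U+0130 into two code points instead of a plain "i"; other frameworks may behave as shown on the slide. `is_blocked` is a hypothetical validator.

```python
def is_blocked(text: str) -> bool:
    """Hypothetical validator: reject the word 'script'."""
    return "script" in text


original = "scr\u0130pt"       # contains LATIN CAPITAL LETTER I WITH DOT ABOVE
lowered = original.lower()     # Python yields "i" followed by U+0307

# The length changes, so buffers sized from the input can be too small.
length_changed = len(lowered) != len(original)

# Perform the casing operation first, then validate the result.
blocked_after_casing = is_blocked(lowered)

print(len(original), len(lowered), blocked_after_casing)
```

In Python the combining dot survives lowercasing, so this particular string does not become "script"; a runtime that maps U+0130 straight to "i" would produce it, which is why validation has to run on the cased result.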

Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows
Casing - maximum expansion factors
Operation    UTF          Factor   Sample
Lower        8            1.5X     Ⱥ U+023A
Lower        16, 32       1X       A U+0041
Upper        8, 16, 32    3X       ΐ U+0390
Source: Unicode Technical Report 36

Root Causes: Buffer Overflows
Normalization - maximum expansion factors
Operation    UTF          Factor   Sample
NFC          8            3X       𝅘𝅥𝅮 U+1D160
NFC          16, 32       3X       שּׁ U+FB2C
NFD          8            3X       ΐ U+0390
NFD          16, 32       4X       ᾂ U+1F82
NFKC/NFKD    8            11X      ﷺ U+FDFA
NFKC/NFKD    16, 32       18X      ﷺ U+FDFA
Source: Unicode Technical Report 36
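
Several of these factors can be recomputed with the standard library. The helper names below are assumptions made for this sketch; the code-point factor corresponds to the UTF-32 column, and for the BMP samples used here it also matches UTF-16.

```python
import unicodedata


def utf8_factor(before: str, after: str) -> float:
    """Growth in UTF-8 bytes."""
    return len(after.encode("utf-8")) / len(before.encode("utf-8"))


def code_point_factor(before: str, after: str) -> float:
    """Growth in code points (UTF-32 code units)."""
    return len(after) / len(before)


ligature = "\ufdfa"
nfkc = unicodedata.normalize("NFKC", ligature)
print(utf8_factor(ligature, nfkc), code_point_factor(ligature, nfkc))  # 11.0 18.0

upper_sample = "\u0390"
print(code_point_factor(upper_sample, upper_sample.upper()))           # 3.0

lower_sample = "\u023a"
print(utf8_factor(lower_sample, lower_sample.lower()))                 # 1.5

nfd_sample = "\u1f82"
print(code_point_factor(nfd_sample, unicodedata.normalize("NFD", nfd_sample)))  # 4.0
```

Sizing an output buffer from the input length alone ignores these factors.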

Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
– ICU, .NET

Root Causes: Controlling Syntax
• White space and line breaks
– E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera
• Unicode formatter characters exploited for XSS
– Damage: Filter evasion, controlling syntax
– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
– Root Cause: Interpreting “white space”
– A problem with HTML 4.0 spec

Case Study: Opera
<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

Case Study: Opera
MVS U+180E

Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…
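
The disagreement between a filter and a parser over what counts as white space can be sketched with two tokenizers. Both regular expressions are assumptions for illustration: `FILTER_WS` uses the four HTML 4.01 white space characters plus CR and LF, while `PARSER_WS` models a parser that also accepts U+180E.

```python
import re

# What a filter might assume separates attributes.
FILTER_WS = re.compile("[\u0020\u0009\u000c\u200b\r\n]+")
# A more permissive parser that also treats U+180E as white space.
PARSER_WS = re.compile("[\u0020\u0009\u000c\u200b\r\n\u180e]+")

markup = "href=\u180eonclick=alert()"

filter_tokens = FILTER_WS.split(markup)   # one token: no event handler seen
parser_tokens = PARSER_WS.split(markup)   # two tokens: onclick is now an attribute

print(len(filter_tokens), parser_tokens)
```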

Root Causes: Specifications
1) Character stability
– IDNA/Nameprep based on Unicode 3.2
2) Designs
– Specs are carefully designed but not always perfect
• This could have been a problem
– “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
– HTML 4.01
• Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
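
"Use Unicode as the broker" can be sketched as a two-step conversion that fails on anything it cannot represent. The `transcode` helper is hypothetical; it relies only on Python's strict error handling for codecs.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert between legacy charsets via Unicode, failing on any loss."""
    text = data.decode(source, errors="strict")    # legacy bytes -> Unicode
    return text.encode(target, errors="strict")    # Unicode -> target bytes


converted = transcode(b"caf\xe9", "iso-8859-1", "utf-8")
print(converted)

# A character with no mapping in the target raises instead of being guessed.
try:
    transcode("\u20ac".encode("utf-8"), "utf-8", "iso-8859-1")
    lossy_conversion_allowed = True
except UnicodeEncodeError:
    lossy_conversion_allowed = False
print(lossy_conversion_allowed)
```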

Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
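
"Error if uncertain" can be sketched as a check that the HTTP header and the meta element agree before any decoding happens. The helpers and their regular expressions are simplified assumptions; a production parser would need to follow the relevant HTTP and HTML rules more closely.

```python
import re


def header_charset(content_type: str):
    """Extract the charset parameter from a Content-Type header value."""
    match = re.search(r"charset=([\w.:-]+)", content_type, re.I)
    return match.group(1).lower() if match else None


def meta_charset(html: str):
    """Extract the charset declared in a meta element, if any."""
    match = re.search(r"<meta[^>]*charset=[\"']?([\w.:-]+)", html, re.I)
    return match.group(1).lower() if match else None


def effective_charset(content_type: str, html: str) -> str:
    """Return the single declared charset, or raise if missing or conflicting."""
    declared = {c for c in (header_charset(content_type), meta_charset(html)) if c}
    if len(declared) != 1:
        raise ValueError(f"uncertain charset: {sorted(declared)}")
    return declared.pop()


header = "text/html; charset=ISO-8859-1"
page = '<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">'

try:
    effective_charset(header, page)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
print(mismatch_detected)
```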

Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools
• Watcher
– Web-app security testing and auditing
• Visual Spoofing Detection API
– Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API
• Problem
– Unicode enables visual-spoofing-maximus
• Solution
– Confusable detection
– Invisibles detection
– Syntax spoof detection
– more

Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
– Browsers
– Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net, Casaba Security
www.casabasecurity.com

Attack Vectors: IDN homograph attacks
Opera

Attack Vectors: IDN homograph attacks
www.google.com is not www.gooɡle.com
Latin g: U+0067
Latin ɡ: U+0261
g ɡ

Root Causes: Guidance for Visual Spoofing
• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
– IDNA, Stringprep
• Guidance
– Unicode Consortium, ICANN, IETF, IANA

Root Causes: Guidance for International Domain Names
ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations
Registries apply the guidance
– Define the allowed characters per TLD
– Collaboration with IANA
Registrars sell the domain names

Root Causes: The state of International Domain Names
ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations
Deny-all default seems to be the right concept
A script can cross many blocks. Even with limited script choices there’s plenty to choose from
Great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing

Root Causes: The state of International Domain Names
• Registrars still allow
– Confusables
– Combining marks
– Single, Whole and Mixed-script
• Registrars can’t control
– Syntax spoofing in sub domain labels

Attack Vectors: Visual spoofing Vectors
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks
rn can look like m in certain fonts
www.mullets.com is not www.rnullets.com
Latin m: U+006D
Latin rn: U+0072 U+006E

Attack Vectors: Non-Unicode homograph attacks
Are you using mono-width fonts?
0 and O
1 and l
5 and S

Attack Vectors: Non-Unicode homograph attacks
Classic long URLs
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs
The Confusables
– Single script
– Mixed script
– Whole script

Attack Vectors: Single-script and The Confusables
www.ɑpple.com
User thinks ‘a’
Really it’s Latin small letter Alpha ‘ɑ’
www.looĸout.net
User thinks ‘k’
Really it’s Latin letter kra ‘ĸ’

Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com
User thinks ‘o’
Really it’s Thai digit zero ‘๐’
www.faϲebook.com
User thinks ‘c’
Really it’s Greek lunate sigma symbol ‘ϲ’
www.Ꮐoogle.com
Really it’s Cherokee letter Nah ‘Ꮐ’

Attack Vectors: Whole-script and The Confusables
www.аЬс.com
User thinks ‘abc’
Really it’s Cyrillic script
www.ігѕ.gov
User thinks ‘irs’
Really it’s Cyrillic script
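
The three confusable classes behave differently under a simple mixed-script check. Python's standard library does not expose the Unicode Script property, so the sketch below uses a rough stand-in: the first word of each character's Unicode name. That heuristic, and the `script_hints` helper, are assumptions for illustration only (ASCII digits, for example, would be reported as "DIGIT").

```python
import unicodedata


def script_hints(label: str) -> set[str]:
    """Rough heuristic: first word of each alphanumeric character's name."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalnum()}


mixed = script_hints("fa\u03f2ebook")                # Greek lunate sigma among Latin
thai_mixed = script_hints("g\u0e50\u0e50gle")        # Thai digit zero among Latin
whole = script_hints("\u0430\u042c\u0441")           # all Cyrillic
whole_irs = script_hints("\u0456\u0433\u0455")       # all Cyrillic
single = script_hints("\u0251pple")                  # Latin small letter alpha

print(mixed, thai_mixed, whole, whole_irs, single)
```

Only the mixed-script labels produce more than one hint. Whole-script and single-script confusables pass a mixing check untouched, so they need a confusables table instead.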

Attack Vectors: IDN homograph attacks
Browsers whitelist .ORG

Attack Vectors: IDN homograph attacks
Others don’t necessarily but…

Attack Vectors: IDN homograph attacks
• .ORG is whitelisted
– Limited characters available
• To unscrutinizing eyes
í looks like i
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwmozillaorg is not wwwmoziacutellaorg
Latin U+0069
LatinU+00ED
March 2009 copy 2009 Chris Weber
(This case doesnrsquot work anymore)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
FULLWIDTH SOLIDUSU+FF0F
March 2009 copy 2009 Chris Weber
(Normalized to a U+002F)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
SOLIDUSU+002F
March 2009 copy 2009 Chris Weber
U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
U+FF65 KATAKANA MIDDLE DOT
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with and lookalikes
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes
httpwwwgooglecom
Katakana DotU+FF65
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecomノpathノfilenottrustedorg
Katakana NoU+FF89
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
IDN Visual Spoofing
March 2009 copy 2009 Chris Weber
bull Visual Spoofing Detection API
ndash Detects Confusables
ndash Detects Invisibles
ndash Detections syntax and punctuation lookalikes
ndash Detects combining mark tricks
bull Currently in testing
bull Release planned for Fall 2009
wwwcasabasecuritycom
IDN Visual SpoofingSolutions and Defenses (yes there is one)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
bull Fonts render glyphs confusingly
bull Fonts render glyphs as empty white space
httpwwwgooglecom phreedomorg
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)
A is A (Lucida Sans Unicode Courier New)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Multiple combining marks
o looks like U+006F U+0304
o is U+006F U+0304 U+0304
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Order of combining marksndash ȏ and ouml under NFKC
ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt
ltU+006F U+0311U+0308gt ltU+020F U+0308gt
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
bull httpunicodeorgreportstr9
ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo
ndash forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
March 2009 copy 2009 Chris Weber
Commonly occur in charset transformations and even innocuous APIrsquos
Impact Filter evasion Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When prime becomes
U+2032 PRIME
wwwcasabasecuritycom
Root CausesBest-fit mappings
March 2009 copy 2009 Chris Weber
Net runtime will marshall a string as LPStr to a pinvoke function
How can we best-fit the lt character
bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket
How can we best-fit the s character
bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex
To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]
wwwcasabasecuritycom
Windows best-fit pInvokeBest-fit mappings
March 2009 copy 2009 Chris Weber
bull Scrutinize charactercharset manipulation APIrsquos
bull Use EncoderFallback with SystemTextEncoding
bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
bull Use Unicode end-to-end
wwwcasabasecuritycom
Root CausesGuidance for Best-Fit mappings
March 2009 copy 2009 Chris Weber
bull A popular social networking site in 2008
bull Implemented complex filtering logic to prevent XSS
ndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting
ndash Root Cause best-fit mappings
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
-moz-binding()
was not allowed buthellip
-[U+ff4d]oz-binding()
would best-fit map
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
Normalizing strings after validation is dangerous
Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
NFD - Decompose (canonical)
NFC - Decompose (canonical) Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility) Recompose
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
İ becomes I +
wwwcasabasecuritycom
Root CausesNormalization
U+0130 U+0049 U+0307
March 2009 copy 2009 Chris Weber
But are there dangerous characters
You bethellip with NFKC and NFKD you could control HTML or other parsing
﹤ becomes lt
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
﹤ becomes lt
toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
Normalize strings before validation
NFKC first defense against Visual spoofing
wwwcasabasecuritycom
Root CausesGuidance for Normalization
March 2009 copy 2009 Chris Weber
Non-shortest or overlong UTF-8
Impact Filter evasion Enable code execution
Application gets C0A7
OSFramework sees 27
Database gets
wwwcasabasecuritycom
Root CausesNon-shortest form UTF-8
March 2009 copy 2009 Chris Weber
bull Unicode specification forbids
ndash Generation of non-shortest form
ndash Interpretation of non-shortest form for BMP
bull Validate UTF-8 encoding (throw on error)
wwwcasabasecuritycom
Root CausesGuidance for Non-shortest form UTF-8
March 2009 copy 2009 Chris Weber
How many ways can you say
wwwcasabasecuritycom
Attack VectorsDirectory traversal
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Directory traversal test casesndash httpsiterootsystem
ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem
ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem
ndash Division Slash U+2215 best-fithttpsiteroot E28895system
ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system
wwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Unassigned code points
ndash U+2073
bull Illegal code points
ndash Half a surrogate pair
bull Code points with special meaning
ndash U+FEFF is the BOM
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesHandling the Unexpected
March 2009 copy 2009 Chris Weber
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
lt41 C2 3E 41gt becomes
lt41 41gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
ltimg src=[0xC2]gt onerror=alert(1)ltbr gt
becomes
ltimg src=gt onerror=alert(1)ltbr gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
Correcting insecurely rather than failing
ndash Substituting a lsquorsquo or a lsquorsquo would be bad
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-substitution
March 2009 copy 2009 Chris Weber
ldquodeletion of noncharactersrdquo (UTR-36)
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
ltscr[U+FEFF]iptgt becomes ltscriptgt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
bull Fail or error
bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe
wwwcasabasecuritycom
Root CausesSolutions for Handling the Unexpected
March 2009 copy 2009 Chris Weber
bull Bypass filters WAFrsquos NIDS and validation
bull Exploit delivery techniques
ndash Eg Cross-site scripting (buffer overflow of the Web)
wwwcasabasecuritycom
Attack VectorsFilter evasion
March 2009 copy 2009 Chris Weber
Safari and Firefox BOM consumptionndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting
ndash Root Cause Character deletion
lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt
Can be nastier
lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt
wwwcasabasecuritycom
Case Study Apple and Mozilla
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Safari BOM injection for XSS
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
A Closer Look The BOM
BOMU+FEFF
March 2009 copy 2009 Chris Weber
bull Attackers manipulate casing operations to inject otherwise prohibited characters
bull Casing can multiply the buffer sizes needed
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
toLower(ldquoİrdquo) == ldquoirdquo
toLower(ldquoscrİptrdquo) == ldquoscriptrdquo
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
len(x) = len(toLower(x))
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
bull Perform casing operations before validation
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Casing
March 2009 copy 2009 Chris Weber
bull Incorrect assumptions about string sizes (chars vs bytes)
bull Improper width calculations
bull Impact Enable code execution
wwwcasabasecuritycom
Root CausesBuffer Overflows
March 2009 copy 2009 Chris Weber
Casing - maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
Lower 8 15 Ⱥ U+023A
16 32 1 A U+0041
Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
Normalization- maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
NFC8 3X 119136 U+1D160
16 32 3X ש U+FB2C
NFD8 3X ΐ U+0390
16 32 4X ᾂ U+1F82
NFKCNFKD8 11X
ملسو هيلع هللا ىلص U+FDFA16 32 18X
Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
bull Know the difference between bytes and chars
bull Secure coding
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Buffer Overflows
March 2009 copy 2009 Chris Weber
bull White space and line breaks
ndash Eg when U+180E acts like U+0020
bull Quotation marks
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesControlling Syntax
March 2009 copy 2009 Chris Weber
bull Manipulate HTML parsers and javascriptinterpreters
bull Control protocols
wwwcasabasecuritycom
Attacks and ExploitsControlling syntax
March 2009 copy 2009 Chris Weber
bull Unicode formatter characters exploited for XSS
ndash Damage Filter evasion controlling syntax
ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting
ndash Root Cause Interpreting ldquowhite spacerdquo
ndash A problem with HTML 40 spec
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
lta href=[U+180E]onclick=alert()gt
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Opera White Space Characters
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Case Study Opera
MVSU+180E
March 2009 copy 2009 Chris Weber
bull Question specifications
bull Be carefulhellip
wwwcasabasecuritycom
Root CausesGuidance for Controlling Syntax
March 2009 copy 2009 Chris Weber
1) Character stabilityndash IDNANameprep based on Unicode 32
2) Designsndash Specs are carefully designed but not always perfect
bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of
U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo
ndash HTML 401 bull Defines four whitespace characters and explicitly leaves
handling other characters up to implementer
wwwcasabasecuritycom
Root CausesSpecifications
March 2009 copy 2009 Chris Weber
bull Converting between charsets is dangerous
bull Mapping tables and algorithms vary across platforms
bull Impact Filter evasion Enable code execution Data-loss
wwwcasabasecuritycom
Root CausesCharset Transformations
March 2009 copy 2009 Chris Weber
bull Avoid if possible
bull Use Unicode as the broker
bull Beware the PUA mappings
bull Transform case and normalize prior to validation and redisplay
wwwcasabasecuritycom
Root CausesGuidance for Charset Transformations
March 2009 copy 2009 Chris Weber
bull Some charset identifiers are ill-defined
bull Vendor implementations vary
bull User-agents may sniff if confused
bull Attackers manipulate behavior
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCharset Mismatches
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Root CausesCharset Mismatches
Content-Type charset=ISO-8859-1
ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt
Attacker-controlled input
March 2009 copy 2009 Chris Weber
bull Force UTF-8
bull Error if uncertain
wwwcasabasecuritycom
Root CausesGuidance for Charset Mismatches
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
Tools
• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks
Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher - Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com
Tools: Visual Spoofing Detection API
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share.
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.
Chris Weber, www.lookout.net, Casaba Security
www.casabasecurity.com
Attack Vectors: IDN homograph attacks
www.google.com is not www.gooɡle.com
g is Latin U+0067
ɡ is Latin U+0261
Root Causes: Guidance for Visual Spoofing
• Normalize with NFKC
• Homograph and Confusables detection
• Specifications
  – IDNA, Stringprep
• Guidance
  – Unicode Consortium, ICANN, IETF, IANA
Root Causes: Guidance for International Domain Names
ICANN guidelines v2.0
  – Inclusion-based
  – Script limitations
  – Character limitations
Registries apply the guidance
  – Define the allowed characters per TLD
  – Collaboration with IANA
Registrars sell the domain names
Root Causes: The state of International Domain Names
ICANN guidelines v2.0
  – Inclusion-based
  – Script limitations
  – Character limitations
Deny-all default seems to be the right concept.
A script can cross many blocks. Even with limited script choices, there's plenty to choose from.
Great for domain labels, but subdomain labels are still open to punctuation and syntax spoofing.
Root Causes: The state of International Domain Names
• Registrars still allow
  – Confusables
  – Combining marks
  – Single, Whole, and Mixed-script
• Registrars can't control
  – Syntax spoofing in subdomain labels
Attack Vectors: Visual spoofing vectors
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing
Attack Vectors: Non-Unicode homograph attacks
rn can look like m in certain fonts
www.mullets.com is not www.rnullets.com
m is Latin U+006D
rn is Latin U+0072 U+006E
Attack Vectors: Non-Unicode homograph attacks
Are you using mono-width fonts?
0 and O
1 and l
5 and S
Attack Vectors: Non-Unicode homograph attacks
Classic long URLs
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements
Attack Vectors: Defining Homographs
The Confusables
  – Single script
  – Mixed script
  – Whole script
Attack Vectors: Single-script and The Confusables
www.ɑpple.com
User thinks 'a'
Really it's Latin small letter Alpha 'ɑ'
www.looĸout.net
User thinks 'k'
Really it's Latin letter kra 'ĸ'
Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com
User thinks 'o'
Really it's Thai digit zero '๐'
www.faϲebook.com
User thinks 'c'
Really it's Greek lunate sigma symbol 'ϲ'
www.Ꮐoogle.com
Really it's Cherokee letter Nah 'Ꮐ'
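A rough way to flag the mixed-script cases above is to collect a script hint for every letter or digit in a label. The sketch below uses the first word of each character's Unicode name as that hint, which is a crude stand-in for the real Script property (Python's standard library does not expose it), so treat it as an assumption-laden illustration. The helper names are hypothetical.

```python
import unicodedata


def script_hints(label):
    """Approximate the scripts in a label from the first word of each character name."""
    hints = set()
    for ch in label:
        if ch.isalnum():
            name = unicodedata.name(ch, "")
            if name:
                hints.add(name.split(" ")[0])
    return hints


def looks_mixed_script(label):
    """True when letters or digits from more than one script hint are present."""
    # ASCII digits are named "DIGIT ...", so they carry no script hint of their own.
    return len(script_hints(label) - {"DIGIT"}) > 1
```

Note what this does not catch: a single-script confusable such as U+0261 in "gooɡle" is still Latin, and a whole-script spoof written entirely in Cyrillic has only one script. Those need a confusables table, not a script check.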
Attack Vectors: Whole-script and The Confusables
www.аЬс.com
User thinks 'abc'
Really it's Cyrillic script
www.ігѕ.gov
User thinks 'irs'
Really it's Cyrillic script
Attack Vectors: IDN homograph attacks
Browsers whitelist .ORG
Others don't necessarily, but…
• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes, í looks like i
www.mozilla.org is not www.mozílla.org
i is Latin U+0069
í is Latin U+00ED
Attack Vectors: IDN Syntax Spoofing with / lookalikes
(This case doesn't work anymore)
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F
Attack Vectors: IDN Syntax Spoofing with / lookalikes
(Normalized to a U+002F)
http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F
Attack Vectors: IDN Syntax Spoofing with / and . lookalikes
╱ U+2571 BOX DRAWINGS LIGHT DIAGONAL UPPER RIGHT TO LOWER LEFT
〳 U+3033 VERTICAL KANA REPEAT MARK UPPER HALF
Ꜹ U+A738 LATIN CAPITAL LETTER AV
ꜹ U+A739 LATIN SMALL LETTER AV
･ U+FF65 HALFWIDTH KATAKANA MIDDLE DOT
Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
(However, punctuation not required…)
http://www･google･com
Katakana Dot U+FF65
Attack Vectors: IDN Syntax Spoofing with / lookalikes
(However, punctuation not required…)
http://www.google.comノpathノfile.nottrusted.org
Katakana No U+FF89
Attack Vectors: IDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
DEMO: IDN Visual Spoofing
IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
  – Detects Confusables
  – Detects Invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009
Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
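Detecting the invisibles listed above can be sketched in a few lines. The explicit set holds the three code points from the slide, and the general category Cf (format characters) is added as a broader net. That second rule is an assumption about what a filter might want to flag, not part of the talk, and `find_invisibles` is a hypothetical name.

```python
import unicodedata

# The three code points named on the slide, listed explicitly so the check
# does not depend on how a given Unicode version categorizes them.
EXPLICIT_INVISIBLES = {"\u200b", "\u180e", "\ufeff"}


def find_invisibles(text):
    """Return (index, code point label) for each invisible or format character."""
    hits = []
    for index, ch in enumerate(text):
        if ch in EXPLICIT_INVISIBLES or unicodedata.category(ch) == "Cf":
            hits.append((index, f"U+{ord(ch):04X}"))
    return hits
```

A caller would typically reject the input when the list is non-empty rather than silently deleting the characters, since deletion is itself one of the root causes covered in this talk.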
Attack Vectors: Visual Spoofing, Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
httpwwwgooglecom phreedomorg
Attack Vectors: Visual Spoofing, Problematic Font-rendering
··· is ··· (Arial, Gothic)
A is A (Lucida Sans Unicode, Courier New)
Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
ō looks like U+006F U+0304
ō̄ is U+006F U+0304 U+0304
Attack Vectors: Visual Spoofing with Combining Marks
• Order of combining marks
  – ȏ and ö under NFKC
<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
<U+006F U+0311 U+0308> becomes <U+020F U+0308>
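Both combining-mark tricks can be reproduced with Python's `unicodedata` module. The sketch below shows the order-dependent composition from the slide and a simple check for a mark that is applied twice in a row; `nfc_code_points` and `has_repeated_mark` are hypothetical helper names, and a real detector would need more rules than this.

```python
import unicodedata


def nfc_code_points(text):
    """Return the NFC form of text as a list of U+XXXX labels."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", text)]


def has_repeated_mark(text):
    """True if the same combining mark occurs twice in a row after decomposition."""
    previous = None
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch) and ch == previous:
            return True
        previous = ch
    return False
```

Because U+0308 and U+0311 share a combining class, normalization keeps them in the order given, so the two sequences never compare equal even though they can render alike.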
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)
Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  – "These characters are to be avoided wherever possible, because of security concerns"
  – Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
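A check for the two override characters can lean on the Bidi class that Python's `unicodedata` reports for each code point. This is a minimal sketch with a hypothetical function name; it only looks for the explicit overrides named above, not the other directional formatting characters.

```python
import unicodedata

# Bidi classes of U+202D and U+202E as reported by unicodedata.bidirectional().
BIDI_OVERRIDE_CLASSES = {"LRO", "RLO"}


def contains_bidi_override(text):
    """True if text contains an explicit left-to-right or right-to-left override."""
    return any(unicodedata.bidirectional(ch) in BIDI_OVERRIDE_CLASSES for ch in text)
```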
Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: filter evasion, enabling code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
U+2032 PRIME
Windows best-fit P/Invoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 (Left-Pointing Angle Bracket) maps to U+003C
• U+3008 (Left Angle Bracket) maps to U+003C
How can we best-fit the s character?
• U+015B (Latin Small Letter S With Acute) maps to U+0073
• U+015D (Latin Small Letter S With Circumflex) maps to U+0073
To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]
Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
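The guidance above is Windows and .NET specific. As a language-neutral illustration of the same principle (fail instead of substituting a look-alike), here is a Python sketch. It assumes the standard `cp1252` codec with strict error handling, which raises on unmappable characters rather than best-fitting them; `to_legacy_bytes` and `unmappable` are hypothetical names.

```python
def to_legacy_bytes(text, charset="cp1252"):
    """Encode to a legacy charset, raising instead of substituting look-alikes."""
    return text.encode(charset, errors="strict")


def unmappable(text, charset="cp1252"):
    """List the code points that have no exact mapping in the target charset."""
    missing = []
    for ch in text:
        try:
            ch.encode(charset, errors="strict")
        except UnicodeEncodeError:
            missing.append(f"U+{ord(ch):04X}")
    return missing
```

With this approach a string such as `-[U+FF4D]oz-binding` is rejected at the conversion boundary instead of quietly turning into `-moz-binding` after a filter has already approved it.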
Case Study: Social Networking (Best-fit mappings)
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root cause: best-fit mappings
Case Study: Social Networking (Best-fit mappings)
-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map
Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: filter evasion, enabling code execution
Root Causes: Normalization
NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose
Root Causes: Normalization
İ becomes I + ̇
U+0130 becomes U+0049 U+0307
Root Causes: Normalization
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
U+FE64 becomes U+003C
toNFKC("﹤script>") = "<script>"
Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC: first defense against visual spoofing
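The ordering matters more than the filter itself. The sketch below normalizes with NFKC first and only then checks for forbidden characters; the forbidden set and the name `validate` are assumptions made for the example, not a recommended filter.

```python
import unicodedata

# Illustrative deny-list; a real application would validate against its own grammar.
FORBIDDEN = ("<", ">")


def validate(user_input):
    """Normalize to NFKC, then reject input containing forbidden characters."""
    normalized = unicodedata.normalize("NFKC", user_input)
    if any(ch in normalized for ch in FORBIDDEN):
        raise ValueError("forbidden character after normalization")
    return normalized
```

Checking the raw input instead would let U+FE64 through, because it is not `<` until a later normalization step turns it into one.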
Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: filter evasion, enabling code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '
Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
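To see why C0 A7 is dangerous, compare a decoder that skips the shortest-form check with one that validates. The naive function below is deliberately unsafe and exists only to show the arithmetic; the strict path relies on Python's built-in UTF-8 codec, which rejects overlong sequences. Both function names are hypothetical.

```python
def naive_two_byte_decode(b0, b1):
    """Decode a 2-byte UTF-8 sequence WITHOUT an overlong check (unsafe on purpose)."""
    return chr(((b0 & 0x1F) << 6) | (b1 & 0x3F))


def strict_decode(raw):
    """Decode UTF-8 and raise UnicodeDecodeError on any ill-formed sequence."""
    return raw.decode("utf-8", errors="strict")
```

The naive decoder turns C0 A7 into an apostrophe, which is how a quote can slip past a filter that only looked at the raw bytes.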
Attack Vectors: Directory traversal
How many ways can you say
Attack Vectors
• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8, U+002F: httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
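Two of the test cases above depend on compatibility normalization, which is easy to confirm with `unicodedata`. The segment check that follows is a hypothetical sketch of validating a single path segment after NFKC, not a complete path sanitizer.

```python
import unicodedata


def nfkc(text):
    """Shorthand for NFKC normalization."""
    return unicodedata.normalize("NFKC", text)


def is_safe_segment(segment):
    """Validate one path segment after normalization has been applied."""
    normalized = nfkc(segment)
    if normalized in ("", ".", ".."):
        return False
    return "/" not in normalized and "\\" not in normalized
```

U+2215 DIVISION SLASH is left alone by NFKC, which matches the slide: it only becomes a solidus through a best-fit charset conversion, so that case has to be stopped at the conversion step instead.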
Root Causes: Handling the Unexpected
• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: filter evasion, enabling code execution
Root Causes: Handling the Unexpected (Over-consumption)
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>
<img src=[0xC2]> onerror=alert(1)<br >
becomes
<img src=> onerror=alert(1)<br >
Root Causes: Handling the Unexpected (Character-substitution)
Correcting insecurely rather than failing
  – Substituting a '' or a '' would be bad
Root Causes: Handling the Unexpected (Character-deletion)
"deletion of noncharacters" (UTR #36)
<scr[U+FEFF]ipt> becomes <script>
Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  – A common alternative is '' which can be safe
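Both options (fail, or substitute U+FFFD) are available in Python's UTF-8 codec, which makes a convenient illustration of the byte sequence from the over-consumption slide. The function names are hypothetical; the behaviour shown is that of `bytes.decode` with the `strict` and `replace` error handlers.

```python
def decode_or_fail(raw):
    """Option 1: raise on ill-formed input."""
    return raw.decode("utf-8", errors="strict")


def decode_with_replacement(raw):
    """Option 2: substitute U+FFFD for each ill-formed sequence."""
    return raw.decode("utf-8", errors="replace")
```

For the bytes 41 C2 3E 41, the replacement path yields A, U+FFFD, >, A. The stray lead byte is replaced on its own and the 3E that follows it survives, which is the opposite of the over-consumption bug.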
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)
Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root cause: character deletion
<a href="java[U+FEFF]script:alert('XSS')">
Can be nastier:
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">
DEMO: Safari BOM injection for XSS
A Closer Look: The BOM
BOM, U+FEFF
Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enabling code execution
Root Causes: Casing
toLower("İ") == "i"
toLower("scrİpt") == "script"
len(x) != len(toLower(x))
Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
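Case mapping results differ between frameworks and locales, so the exact output of a `toLower` call like the one on the earlier slide depends on the platform. The sketch below uses Python's locale-independent `str.lower()` and `str.upper()` purely to show that lengths change; `case_lengths` is a hypothetical helper.

```python
def case_lengths(text):
    """Return code point counts for the original, lowercased, and uppercased text."""
    return len(text), len(text.lower()), len(text.upper())


def utf8_growth_on_lower(text):
    """Ratio of UTF-8 byte length after lowercasing to the length before."""
    return len(text.lower().encode("utf-8")) / len(text.encode("utf-8"))
```

In Python, U+0130 lowercases to `i` followed by U+0307 (two code points), and U+0390 uppercases to three code points, so a buffer sized from the input length is not enough.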
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enabling code execution
Root Causes: Buffer Overflows
Casing - maximum expansion factors
Operation  UTF        Factor  Sample
Lower      8          1.5     Ⱥ U+023A
Lower      16, 32     1       A U+0041
Upper      8, 16, 32  3       ΐ U+0390
Source: Unicode Technical Report #36
Root Causes: Buffer Overflows
Normalization - maximum expansion factors
Operation  UTF     Factor  Sample
NFC        8       3X      𝅘𝅥𝅮 U+1D160
NFC        16, 32  3X      שּׁ U+FB2C
NFD        8       3X      ΐ U+0390
NFD        16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
NFKC/NFKD  16, 32  18X     ﷺ U+FDFA
Source: Unicode Technical Report #36
Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
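The expansion factors quoted from UTR #36 can be checked directly, which also makes the bytes-versus-chars distinction concrete. This is a small sketch with a hypothetical `expansion` helper: without an encoding it compares code point counts, and with one it compares encoded byte lengths.

```python
import unicodedata


def expansion(text, form, encoding=None):
    """Growth factor of text under a normalization form, in code points or bytes."""
    normalized = unicodedata.normalize(form, text)
    if encoding is None:
        return len(normalized) / len(text)
    return len(normalized.encode(encoding)) / len(text.encode(encoding))
```

For U+FDFA, `expansion("\ufdfa", "NFKC")` is 18 in code points and `expansion("\ufdfa", "NFKC", "utf-8")` is 11 in bytes, so the worst case depends on which unit the buffer is measured in.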
Root Causes: Controlling Syntax
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enabling code execution
Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Case Study: Opera
• Unicode formatter characters exploited for XSS
  – Damage: filter evasion, controlling syntax
  – Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root cause: interpreting "white space"
  – A problem with the HTML 4.0 spec
Case Study: Opera
<a href=[U+180E]onclick=alert()>
DEMO: Opera White Space Characters
March 2009 copy 2009 Chris Weber
Commonly occur in charset transformations and even innocuous APIrsquos
Impact Filter evasion Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When prime becomes
U+2032 PRIME
wwwcasabasecuritycom
Root CausesBest-fit mappings
March 2009 copy 2009 Chris Weber
Net runtime will marshall a string as LPStr to a pinvoke function
How can we best-fit the lt character
bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket
How can we best-fit the s character
bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex
To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]
wwwcasabasecuritycom
Windows best-fit pInvokeBest-fit mappings
March 2009 copy 2009 Chris Weber
bull Scrutinize charactercharset manipulation APIrsquos
bull Use EncoderFallback with SystemTextEncoding
bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
bull Use Unicode end-to-end
wwwcasabasecuritycom
Root CausesGuidance for Best-Fit mappings
March 2009 copy 2009 Chris Weber
bull A popular social networking site in 2008
bull Implemented complex filtering logic to prevent XSS
ndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting
ndash Root Cause best-fit mappings
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
-moz-binding()
was not allowed buthellip
-[U+ff4d]oz-binding()
would best-fit map
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
Normalizing strings after validation is dangerous
Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
NFD - Decompose (canonical)
NFC - Decompose (canonical) Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility) Recompose
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
İ becomes I +
wwwcasabasecuritycom
Root CausesNormalization
U+0130 U+0049 U+0307
March 2009 copy 2009 Chris Weber
But are there dangerous characters
You bethellip with NFKC and NFKD you could control HTML or other parsing
﹤ becomes lt
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
﹤ becomes lt
toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
Normalize strings before validation
NFKC first defense against Visual spoofing
wwwcasabasecuritycom
Root CausesGuidance for Normalization
March 2009 copy 2009 Chris Weber
Non-shortest or overlong UTF-8
Impact Filter evasion Enable code execution
Application gets C0A7
OSFramework sees 27
Database gets
wwwcasabasecuritycom
Root CausesNon-shortest form UTF-8
March 2009 copy 2009 Chris Weber
bull Unicode specification forbids
ndash Generation of non-shortest form
ndash Interpretation of non-shortest form for BMP
bull Validate UTF-8 encoding (throw on error)
wwwcasabasecuritycom
Root CausesGuidance for Non-shortest form UTF-8
March 2009 copy 2009 Chris Weber

Attack Vectors: Directory traversal
How many ways can you say
• Directory traversal test cases
  - Baseline: httpsiterootsystem
  - Overlong UTF-8, labelled U+002F: httpsiterootC0AEsystem (C0 AE is the overlong form of U+002E; the overlong form of U+002F would be C0 AF)
  - Full-width solidus U+FF0F, normalized by KC or KD: httpsiteroot EFBC8Fsystem
  - Division slash U+2215, best-fit: httpsiteroot E28895system
  - Two dot leader U+2025, normalized by KC or KD: httpsiterootE280A5 E280A5 system
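
As a rough check of the byte sequences and normalization claims in the test cases above, the following Python sketch prints the UTF-8 bytes of each code point and its NFKC form. The describe helper is a hypothetical name introduced here.

```python
import unicodedata


def describe(ch: str) -> tuple:
    """Return (UTF-8 bytes as hex, NFKC-normalized form) for one character."""
    return ch.encode("utf-8").hex(), unicodedata.normalize("NFKC", ch)


CANDIDATES = {
    "FULLWIDTH SOLIDUS U+FF0F": "\uff0f",
    "DIVISION SLASH U+2215": "\u2215",
    "TWO DOT LEADER U+2025": "\u2025",
}

for label, ch in CANDIDATES.items():
    utf8_hex, nfkc = describe(ch)
    print(f"{label}: bytes {utf8_hex}, NFKC {nfkc!r}")
```

Under these assumptions, the full-width solidus and the two dot leader turn into "/" and ".." under NFKC, while the division slash is left unchanged by normalization, which is consistent with the slide listing it as a best-fit case instead.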

Root Causes: Handling the Unexpected
• Unassigned code points
  - U+2073
• Illegal code points
  - Half a surrogate pair
• Code points with special meaning
  - U+FEFF is the BOM
• Impact: filter evasion, enabling code execution

Root Causes: Handling the Unexpected (Over-consumption)
Over-consuming ill-formed byte sequences. A big problem with MBCS lead bytes.
<41 C2 3E 41> becomes <41 41>
<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
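
The <41 C2 3E 41> example can be reproduced with a small Python sketch. The over-consuming decoder below is hypothetical and deliberately flawed; it is compared with Python's built-in codec, which replaces only the ill-formed lead byte and keeps the byte that follows.

```python
def over_consuming_decode(data: bytes) -> str:
    """Deliberately flawed decoder (hypothetical): after a 2-byte lead byte it
    always swallows the next byte, even if that byte is not a trail byte."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if 0xC2 <= b <= 0xDF:
            has_trail = i + 1 < len(data) and 0x80 <= data[i + 1] <= 0xBF
            if has_trail:
                out.append(data[i:i + 2].decode("utf-8"))
            i += 2  # the flaw: two bytes are consumed unconditionally
        elif b < 0x80:
            out.append(chr(b))
            i += 1
        else:
            out.append("\ufffd")
            i += 1
    return "".join(out)


ill_formed = bytes([0x41, 0xC2, 0x3E, 0x41])
print(repr(over_consuming_decode(ill_formed)))             # the '>' (3E) is gone
print(repr(ill_formed.decode("utf-8", errors="replace")))  # the '>' survives
```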

Root Causes: Handling the Unexpected (Character substitution)
Correcting insecurely rather than failing.
  - Substituting a '' or a '' would be bad

Root Causes: Handling the Unexpected (Character deletion)
"deletion of noncharacters" (UTR #36)
<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  - A common alternative is '' which can be safe
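
A short Python sketch of why deleting U+FEFF after filtering is risky, and how replacing it with U+FFFD avoids rebuilding the tag. All function names are hypothetical and the filter is intentionally simplistic.

```python
BOM = "\ufeff"


def filter_detects_script(text: str) -> bool:
    """Hypothetical filter: look for a literal '<script' substring."""
    return "<script" in text.lower()


def delete_bom(text: str) -> str:
    # Risky clean-up: silently deleting U+FEFF joins the surrounding text.
    return text.replace(BOM, "")


def replace_bom(text: str) -> str:
    # Safer clean-up: substitute U+FFFD so the pieces stay separated.
    return text.replace(BOM, "\ufffd")


payload = "<scr\ufeffipt>"
print(filter_detects_script(payload))               # the filter misses it
print(delete_bom(payload))                          # deletion rebuilds the tag
print(filter_detects_script(replace_bom(payload)))  # still no tag after replacement
```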

Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  - E.g. cross-site scripting (the buffer overflow of the Web)

Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
  - Attack: filter evasion, code execution
  - Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
  - Root cause: character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier:
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

DEMO: Safari BOM injection for XSS
A Closer Look: The BOM (U+FEFF)

Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enabling code execution
toLower("İ") == "i"
toLower("scrİpt") == "script"
len(x) != len(toLower(x))

Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET
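
Casing results differ between platforms, so the following is only a sketch of the effect in Python. Here str.lower() turns İ (U+0130) into "i" followed by U+0307, where the slide shows a bare "i", and str.upper() turns ſ (U+017F, LATIN SMALL LETTER LONG S) into "S". The filter function is a hypothetical example.

```python
def filter_detects_script_tag(text: str) -> bool:
    """Hypothetical filter: look for the literal upper-case tag name."""
    return "SCRIPT" in text


dotted_capital_i = "\u0130"
lowered = dotted_capital_i.lower()
print(len(dotted_capital_i), len(lowered))  # casing changed the string length

payload = "<\u017fcript>"                   # long s instead of 's'
print(filter_detects_script_tag(payload))   # validated before casing: missed
print(filter_detects_script_tag(payload.upper()))  # cased before validation: caught
```

Validating only after the casing operation, as the guidance above suggests, means the filter sees the same text that later code will see.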

Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enabling code execution

Casing: maximum expansion factors (source: Unicode Technical Report #36)
Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390

Normalization: maximum expansion factors (source: Unicode Technical Report #36)
Operation   UTF         Factor   Sample
NFC         8           3X       U+1D160 (MUSICAL SYMBOL EIGHTH NOTE)
NFC         16, 32      3X       U+FB2C (HEBREW LETTER SHIN WITH DAGESH AND SHIN DOT)
NFD         8           3X       ΐ U+0390
NFD         16, 32      4X       ᾂ U+1F82
NFKC/NFKD   8           11X      U+FDFA (ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM)
NFKC/NFKD   16, 32      18X      U+FDFA

Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET
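
Some of the factors in the tables above can be checked with a few lines of Python. The two helper functions are hypothetical names; the little-endian codecs are used so that no BOM is counted in the byte lengths.

```python
import unicodedata


def normalization_expansion(text: str, form: str, encoding: str) -> float:
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(text.encode(encoding))
    after = len(unicodedata.normalize(form, text).encode(encoding))
    return after / before


def uppercase_expansion(text: str, encoding: str) -> float:
    """Ratio of encoded size after upper-casing to encoded size before."""
    return len(text.upper().encode(encoding)) / len(text.encode(encoding))


print(normalization_expansion("\ufdfa", "NFKC", "utf-8"))      # UTF-8 row
print(normalization_expansion("\ufdfa", "NFKC", "utf-16-le"))  # UTF-16 row
print(normalization_expansion("\u0390", "NFD", "utf-8"))
print(uppercase_expansion("\u0390", "utf-16-le"))
```

A buffer sized from the input length alone would be too small in each of these cases.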

Root Causes: Controlling Syntax
• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enabling code execution

Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera
• Unicode formatter characters exploited for XSS
  - Damage: filter evasion, controlling syntax
  - Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root cause: interpreting "white space"
  - A problem with the HTML 4.0 spec
<a href=[U+180E]onclick=alert()>
DEMO: Opera White Space Characters
MVS (MONGOLIAN VOWEL SEPARATOR): U+180E

Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful...
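
One way to be careful is to flag every separator, control, and format character other than the plain ASCII space before a value reaches a parser. The Python sketch below does this with general categories from unicodedata. The function name is hypothetical, and the rule is deliberately strict: it also flags tabs and newlines, which suits attribute values more than free text.

```python
import unicodedata


def suspicious_chars(text: str) -> list:
    """List (index, code point) for characters in the Z* (separator) or
    C* (control/format/unassigned) categories, except U+0020."""
    found = []
    for index, ch in enumerate(text):
        category = unicodedata.category(ch)
        if ch != " " and category[0] in ("Z", "C"):
            found.append((index, f"U+{ord(ch):04X}"))
    return found


print(suspicious_chars("href=\u180eonclick=alert()"))
print(suspicious_chars("java\ufeffscript"))
print(suspicious_chars("plain text"))
```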

Root Causes: Specifications
1) Character stability
  - IDNA/Nameprep is based on Unicode 3.2
2) Designs
  - Specs are carefully designed but not always perfect
• This could have been a problem:
  - "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  - HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling of other characters up to the implementer

Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enabling code execution, data loss

Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enabling code execution
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
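
To illustrate why a mismatch between declared charsets matters, this Python sketch decodes the same two bytes as ISO-8859-1 and as Shift_JIS. The byte pair is an example chosen here, not one from the slides: under ISO-8859-1 the second byte is a backslash, under Shift_JIS both bytes form a single katakana letter. The last function is a hypothetical helper for the "force UTF-8, error if uncertain" guidance.

```python
raw = bytes([0x83, 0x5C])

as_latin1 = raw.decode("iso-8859-1")  # two characters, the second is a backslash
as_sjis = raw.decode("shift_jis")     # one character, no backslash at all

print(len(as_latin1), "\\" in as_latin1)
print(len(as_sjis), "\\" in as_sjis)


def decode_forced_utf8(data: bytes) -> str:
    """Accept only well-formed UTF-8 and raise UnicodeDecodeError otherwise."""
    return data.decode("utf-8", errors="strict")
```

A filter that reads the bytes under one charset while the user-agent renders them under the other is validating different text from what gets displayed.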

Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools
• Watcher
  - Web-app security testing and auditing
• Visual Spoofing Detection API
  - Providing guarantees against visual spoofing and homograph attacks

Tools: Watcher (some of the passive checks included)
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher (Web-app Security Testing and Auditing)
http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API
• Problem
  - Unicode enables visual-spoofing-maximus
• Solution
  - Confusable detection
  - Invisibles detection
  - Syntax spoof detection
  - more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  - Browsers
  - Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you
Contact me with questions, new test cases, or ideas to share.
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.
Chris Weber, www.lookout.net
Casaba Security, www.casabasecurity.com

Root Causes: Guidance for International Domain Names
• ICANN guidelines v2.0
  - Inclusion-based
  - Script limitations
  - Character limitations
• Registries apply the guidance
  - Define the allowed characters per TLD
  - Collaboration with IANA
• Registrars sell the domain names

Root Causes: The state of International Domain Names
• ICANN guidelines v2.0
  - Inclusion-based
  - Script limitations
  - Character limitations
Deny-all default seems to be the right concept.
A script can cross many blocks. Even with limited script choices there's plenty to choose from.
Great for domain labels, but subdomain labels are still open to punctuation and syntax spoofing.
• Registrars still allow
  - Confusables
  - Combining marks
  - Single-, whole-, and mixed-script
• Registrars can't control
  - Syntax spoofing in subdomain labels

Attack Vectors: Visual spoofing vectors
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating combining marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks
"rn" can look like "m" in certain fonts.
www.mullets.com is not www.rnullets.com
  - m: Latin U+006D
  - rn: Latin U+0072 U+006E
Are you using mono-width fonts?
  - 0 and O
  - 1 and l
  - 5 and S
Classic long URLs:
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs
The Confusables
  - Single script
  - Mixed script
  - Whole script

Attack Vectors: Single-script and The Confusables
www.ɑpple.com: user thinks 'a'; really it's Latin small letter alpha 'ɑ'
www.looĸout.net: user thinks 'k'; really it's Latin small letter kra 'ĸ'

Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com: user thinks 'o'; really it's Thai digit zero '๐'
www.faϲebook.com: user thinks 'c'; really it's Greek lunate sigma symbol 'ϲ'
www.Ꮐoogle.com: really it's Cherokee letter nah 'Ꮐ'

Attack Vectors: Whole-script and The Confusables
www.аЬс.com: user thinks 'abc'; really it's Cyrillic script
www.ігѕ.gov: user thinks 'irs'; really it's Cyrillic script (the slide text says Greek, but the letters shown are Cyrillic)
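
A rough heuristic for the mixed-script case can be sketched in Python. The standard unicodedata module does not expose the Unicode Script property, so this sketch uses the first word of each character name as a stand-in, which is an assumption and not a real script lookup. The function names are hypothetical.

```python
import unicodedata


def scripts_used(label: str) -> set:
    """Approximate the scripts in a label from the first word of each
    letter's or digit's Unicode character name (a heuristic only)."""
    scripts = set()
    for ch in label:
        if unicodedata.category(ch)[0] not in ("L", "N"):
            continue
        first_word = unicodedata.name(ch, "UNKNOWN").split()[0]
        if first_word != "DIGIT":  # ASCII digits are shared by all scripts
            scripts.add(first_word)
    return scripts


def is_mixed_script(label: str) -> bool:
    return len(scripts_used(label)) > 1


print(scripts_used("fa\u03f2ebook"))            # Latin plus Greek lunate sigma
print(is_mixed_script("g\u0e50\u0e50gle"))      # Latin plus Thai digits
print(is_mixed_script("\u0430\u042c\u0441"))    # whole-script Cyrillic
print(is_mixed_script("\u0251pple"))            # single-script Latin alpha
```

The last two calls show the limits of this check: whole-script and single-script confusables use only one script, so catching them takes confusable data such as the tables published with Unicode Technical Standard #39.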

Attack Vectors: IDN homograph attacks
Browsers whitelist .ORG. Others don't necessarily, but...
• .ORG is whitelisted
  - Limited characters available
• To unscrutinizing eyes, í looks like i
www.mozilla.org is not www.mozílla.org
  - i: Latin U+0069
  - í: Latin U+00ED

Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.com／path／file.nottrusted.org
  - FULLWIDTH SOLIDUS U+FF0F (this case doesn't work anymore)
http://www.google.com/path/file.nottrusted.org
  - SOLIDUS U+002F (the full-width solidus is normalized to a U+002F)

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes
  - U+2571 Box Drawings
  - 〳 U+3033 Kana Repeat Mark
  - Ꜹ U+A738 LATIN CAPITAL AV
  - ꜹ U+A739 LATIN SMALL AV
  - U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
(However, punctuation is not required...)
httpwwwgooglecom
  - Katakana dot U+FF65

Attack Vectors: IDN Syntax Spoofing with / lookalikes
(However, punctuation is not required...)
http://www.google.comノpathノfile.nottrusted.org
  - Katakana No U+FF89
Browser sees and displays a valid IDN.
DNS sees Punycode.
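
The gap between what the browser displays and what DNS receives can be seen with Python's built-in "idna" codec, which implements the older IDNA 2003 rules. This is a small sketch using the mozílla example from the slides above; the variable names are illustrative.

```python
display_name = "www.moz\u00edlla.org"   # contains í (U+00ED), not i (U+0069)
wire_name = display_name.encode("idna") # ASCII-compatible form used on the wire

print(display_name)
print(wire_name)

# The genuine, all-ASCII name passes through unchanged.
print("www.mozilla.org".encode("idna"))
```

The spoofed label is sent as an "xn--" Punycode label, so the two names that look alike on screen are distinct names in DNS.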

DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
  - Detects confusables
  - Detects invisibles
  - Detects syntax and punctuation lookalikes
  - Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles
• U+200B (ZERO WIDTH SPACE)
• U+180E (MONGOLIAN VOWEL SEPARATOR)
• U+FEFF (ZERO WIDTH NO-BREAK SPACE)

Attack Vectors: Visual Spoofing (Problematic Font-rendering)
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
  - ō looks like <U+006F U+0304>
  - but it is <U+006F U+0304 U+0304>
• Order of combining marks
  - ȏ and ö under NFKC
  - <U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
  - <U+006F U+0311 U+0308> becomes <U+020F U+0308>
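
Both combining-mark tricks can be checked in Python with unicodedata. The sketch normalizes the two orderings from the slide and counts the longest run of combining marks. The max_combining_run helper is a hypothetical name.

```python
import unicodedata


def max_combining_run(text: str) -> int:
    """Length of the longest run of consecutive combining marks."""
    longest = current = 0
    for ch in text:
        if unicodedata.combining(ch) != 0:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


first = unicodedata.normalize("NFKC", "o\u0308\u0311")
second = unicodedata.normalize("NFKC", "o\u0311\u0308")
print([f"U+{ord(c):04X}" for c in first])
print([f"U+{ord(c):04X}" for c in second])
print(first == second)  # the two orderings stay different after normalization

print(max_combining_run("o\u0304\u0304"))  # the doubled macron is easy to flag
```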

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  - "These characters are to be avoided wherever possible, because of security concerns"
  - Forbidden in IDNA
• U+202D (LEFT-TO-RIGHT OVERRIDE)
• U+202E (RIGHT-TO-LEFT OVERRIDE)

Root Causes: Best-fit mappings
Best-fit mappings commonly occur in charset transformations and even in innocuous APIs.
Impact: filter evasion, enabling code execution.
  - When σ (U+03C3 GREEK SMALL LETTER SIGMA) becomes s
  - When ′ (U+2032 PRIME) becomes '

Windows best-fit: P/Invoke
The .NET runtime will marshal a string as LPStr to a P/Invoke function.
How can we best-fit the < character?
• U+2329 (Left-Pointing Angle Bracket) maps to U+003C
• U+3008 (Left Angle Bracket) maps to U+003C
How can we best-fit the s character?
• U+015B (Latin Small Letter S With Acute) maps to U+0073
• U+015D (Latin Small Letter S With Circumflex) maps to U+0073
To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
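
Python's codecs do not apply the Windows best-fit tables, so the sketch below cannot reproduce the mapping itself. It only illustrates the "fail instead of best-fit" behaviour that the guidance above recommends, using strict encoding to a legacy code page. The function name and the choice of cp1252 are assumptions made for the example.

```python
def encode_or_fail(text: str, codepage: str = "cp1252") -> bytes:
    """Encode to a legacy code page and raise UnicodeEncodeError if any
    character has no exact mapping, never substituting a lookalike."""
    return text.encode(codepage, errors="strict")


for sample in ("-moz-binding()", "-\uff4doz-binding()", "\u03c3", "\u2032"):
    try:
        print(encode_or_fail(sample))
    except UnicodeEncodeError as exc:
        print(f"rejected U+{ord(exc.object[exc.start]):04X}")
```

With strict errors, the full-width m, the sigma, and the prime are rejected and never turned into "m", "s", or an apostrophe.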
bull A popular social networking site in 2008
bull Implemented complex filtering logic to prevent XSS
ndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting
ndash Root Cause best-fit mappings
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
-moz-binding()
was not allowed buthellip
-[U+ff4d]oz-binding()
would best-fit map
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
Normalizing strings after validation is dangerous
Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
NFD - Decompose (canonical)
NFC - Decompose (canonical) Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility) Recompose
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
İ becomes I +
wwwcasabasecuritycom
Root CausesNormalization
U+0130 U+0049 U+0307
March 2009 copy 2009 Chris Weber
But are there dangerous characters
You bethellip with NFKC and NFKD you could control HTML or other parsing
﹤ becomes lt
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
﹤ becomes lt
toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
Normalize strings before validation
NFKC first defense against Visual spoofing
wwwcasabasecuritycom
Root CausesGuidance for Normalization
March 2009 copy 2009 Chris Weber
Non-shortest or overlong UTF-8
Impact Filter evasion Enable code execution
Application gets C0A7
OSFramework sees 27
Database gets
wwwcasabasecuritycom
Root CausesNon-shortest form UTF-8
March 2009 copy 2009 Chris Weber
bull Unicode specification forbids
ndash Generation of non-shortest form
ndash Interpretation of non-shortest form for BMP
bull Validate UTF-8 encoding (throw on error)
wwwcasabasecuritycom
Root CausesGuidance for Non-shortest form UTF-8
March 2009 copy 2009 Chris Weber
How many ways can you say
wwwcasabasecuritycom
Attack VectorsDirectory traversal
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Directory traversal test casesndash httpsiterootsystem
ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem
ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem
ndash Division Slash U+2215 best-fithttpsiteroot E28895system
ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system
wwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Unassigned code points
ndash U+2073
bull Illegal code points
ndash Half a surrogate pair
bull Code points with special meaning
ndash U+FEFF is the BOM
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesHandling the Unexpected
March 2009 copy 2009 Chris Weber
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
lt41 C2 3E 41gt becomes
lt41 41gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
ltimg src=[0xC2]gt onerror=alert(1)ltbr gt
becomes
ltimg src=gt onerror=alert(1)ltbr gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
Correcting insecurely rather than failing
ndash Substituting a lsquorsquo or a lsquorsquo would be bad
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-substitution
March 2009 copy 2009 Chris Weber
ldquodeletion of noncharactersrdquo (UTR-36)
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
ltscr[U+FEFF]iptgt becomes ltscriptgt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
bull Fail or error
bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe
wwwcasabasecuritycom
Root CausesSolutions for Handling the Unexpected
March 2009 copy 2009 Chris Weber
bull Bypass filters WAFrsquos NIDS and validation
bull Exploit delivery techniques
ndash Eg Cross-site scripting (buffer overflow of the Web)
wwwcasabasecuritycom
Attack VectorsFilter evasion
March 2009 copy 2009 Chris Weber
Safari and Firefox BOM consumptionndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting
ndash Root Cause Character deletion
lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt
Can be nastier
lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt
wwwcasabasecuritycom
Case Study Apple and Mozilla
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Safari BOM injection for XSS
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
A Closer Look The BOM
BOMU+FEFF
March 2009 copy 2009 Chris Weber
bull Attackers manipulate casing operations to inject otherwise prohibited characters
bull Casing can multiply the buffer sizes needed
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
toLower(ldquoİrdquo) == ldquoirdquo
toLower(ldquoscrİptrdquo) == ldquoscriptrdquo
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
len(x) = len(toLower(x))
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
bull Perform casing operations before validation
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Casing
March 2009 copy 2009 Chris Weber
bull Incorrect assumptions about string sizes (chars vs bytes)
bull Improper width calculations
bull Impact Enable code execution
wwwcasabasecuritycom
Root CausesBuffer Overflows
March 2009 copy 2009 Chris Weber
Casing - maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
Lower 8 15 Ⱥ U+023A
16 32 1 A U+0041
Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
Normalization- maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
NFC8 3X 119136 U+1D160
16 32 3X ש U+FB2C
NFD8 3X ΐ U+0390
16 32 4X ᾂ U+1F82
NFKCNFKD8 11X
ملسو هيلع هللا ىلص U+FDFA16 32 18X
Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
bull Know the difference between bytes and chars
bull Secure coding
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Buffer Overflows
March 2009 copy 2009 Chris Weber
bull White space and line breaks
ndash Eg when U+180E acts like U+0020
bull Quotation marks
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesControlling Syntax
March 2009 copy 2009 Chris Weber
bull Manipulate HTML parsers and javascriptinterpreters
bull Control protocols
wwwcasabasecuritycom
Attacks and ExploitsControlling syntax
March 2009 copy 2009 Chris Weber
bull Unicode formatter characters exploited for XSS
ndash Damage Filter evasion controlling syntax
ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting
ndash Root Cause Interpreting ldquowhite spacerdquo
ndash A problem with HTML 40 spec
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
lta href=[U+180E]onclick=alert()gt
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Opera White Space Characters
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Case Study Opera
MVSU+180E
March 2009 copy 2009 Chris Weber
bull Question specifications
bull Be carefulhellip
wwwcasabasecuritycom
Root CausesGuidance for Controlling Syntax
March 2009 copy 2009 Chris Weber
1) Character stabilityndash IDNANameprep based on Unicode 32
2) Designsndash Specs are carefully designed but not always perfect
bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of
U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo
ndash HTML 401 bull Defines four whitespace characters and explicitly leaves
handling other characters up to implementer
wwwcasabasecuritycom
Root CausesSpecifications
March 2009 copy 2009 Chris Weber
bull Converting between charsets is dangerous
bull Mapping tables and algorithms vary across platforms
bull Impact Filter evasion Enable code execution Data-loss
wwwcasabasecuritycom
Root CausesCharset Transformations
March 2009 copy 2009 Chris Weber
bull Avoid if possible
bull Use Unicode as the broker
bull Beware the PUA mappings
bull Transform case and normalize prior to validation and redisplay
wwwcasabasecuritycom
Root CausesGuidance for Charset Transformations
March 2009 copy 2009 Chris Weber
bull Some charset identifiers are ill-defined
bull Vendor implementations vary
bull User-agents may sniff if confused
bull Attackers manipulate behavior
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCharset Mismatches
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Root CausesCharset Mismatches
Content-Type charset=ISO-8859-1
ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt
Attacker-controlled input
March 2009 copy 2009 Chris Weber
bull Force UTF-8
bull Error if uncertain
wwwcasabasecuritycom
Root CausesGuidance for Charset Mismatches
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Watcher
ndash Web-app security testing and auditing
bull Visual Spoofing Detection API
ndash Providing guarantees against Visual Spoofing and Homograph attacks
wwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
bull Unicode transformation hot-spotsbull User-controlled HTMLbull Cross-domain issuesbull Insecure cookiesbull Insecure HTTPHTTPS transitionsbull SSL protocol and certificate issuesbull XSS hot-spotsbull Flash issuesbull Silverlight issuesbull Information disclosure
wwwcasabasecuritycom
ToolsWatcher ndash Some of the Passive Checks Included
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
httpwebsecuritytoolcodeplexcom
wwwcasabasecuritycom
ToolsWatcher - Web-app Security Testing and Auditing
March 2009 copy 2009 Chris Weber
bull Problemndash Unicode enables visual-spoofing-maximus
bull Solutionndash Confusable detection
ndash Invisibles detection
ndash Syntax spoof detection
ndash more
wwwcasabasecuritycom
ToolsVisual Spoofing Detection API
March 2009 copy 2009 Chris Weber
bull Cross-platform component library written in C
bull Can be applied in user-agents or any softwarendash Browsers
ndash Email clients
bull Planned for release Fall 2009
bull Email me with questions
wwwcasabasecuritycom
ToolsVisual Spoofing Detection API
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
ToolsVisual Spoofing Protection Demo
Thank you
Contact me with questions new test cases or ideas to share
Visit my website for test cases Unicode and security tools and the Anti-Visual-Spoofing API
Chris WeberwwwlookoutnetCasaba Security
wwwcasabasecuritycom
March 2009 copy 2009 Chris Weber
ICANN guidelines v20
ndash Inclusion-based
ndash Script limitations
ndash Character limitations
wwwcasabasecuritycom
Root CausesThe state of International Domain Names
Deny-all default seems to be the right concept
A script can cross many blocks Even with limited script choices therersquos plenty to choose from
Great for domain labels but sub domain labels still open to punctuation and syntax spoofing
March 2009 copy 2009 Chris Weber
bull Registrars still allow
ndash Confusables
ndash Combining marks
ndash Single Whole and Mixed-script
bull Registrars canrsquot control
ndash Syntax spoofing in sub domain labels
wwwcasabasecuritycom
Root CausesThe state of International Domain Names
March 2009 copy 2009 Chris Weber
bull Non-Unicode attacks
bull Confusables
bull Invisibles
bull Problematic font-rendering
bull Manipulating Combining Marks
bull Bidi and syntax spoofing
wwwcasabasecuritycom
Attack VectorsVisual spoofing Vectors
March 2009 copy 2009 Chris Weber
rn can look like m in certain fonts
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
wwwmulletscom is not wwwrnulletscom
Latin U+006D
LatinU+0073 U+006E
March 2009 copy 2009 Chris Weber
Are you using mono-width fonts
0 and O
1 and l
5 and S
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
Classic long URLrsquos
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
The Confusables
ndash Single script
ndash Mixed script
ndash Whole script
wwwcasabasecuritycom
Attack VectorsDefining Homographs
March 2009 copy 2009 Chris Weber
wwwɑpplecom User thinks lsquoarsquo
Really itrsquos Latin small letter Alpha lsquoɑrsquo
wwwlooĸoutnet
User thinks lsquokrsquo
Really itrsquos Latin letter kra lsquoĸrsquo
wwwcasabasecuritycom
Attack VectorsSingle-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwg๐๐glecom User thinks lsquoorsquo
Really itrsquos Thai digit zero lsquo๐rsquo
wwwfaϲebookcom
User thinks lsquocrsquo
Really itrsquos Greek lunate sigma symbol lsquocrsquo
wwwᏀooglecom
Really itrsquos Cherokee letter Nah lsquoᏀrsquo
wwwcasabasecuritycom
Attack VectorsMixed-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwаЬсcom
User thinks lsquoabcrsquo
Really itrsquos Cyrillic script
wwwігѕgov
User thinks lsquoirsrsquo
Really itrsquos Greek script
wwwcasabasecuritycom
Attack VectorsWhole-script and The Confusables
March 2009 copy 2009 Chris Weber
Browsers whitelist ORG
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Others donrsquot necessarily buthellip
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
bull ORG is whitelisted
ndash Limited characters available
bull To unscrutinizing eyes
iacute looks like i
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwmozillaorg is not wwwmoziacutellaorg
Latin U+0069
LatinU+00ED
March 2009 copy 2009 Chris Weber
Attack Vectors: IDN Syntax Spoofing with / lookalikes
(This case doesn't work anymore)
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F

Attack Vectors: IDN Syntax Spoofing with / lookalikes
(Normalized to a U+002F)
http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes
╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
(However, punctuation not required…)
http://www.google.com
Katakana Dot U+FF65

Attack Vectors: IDN Syntax Spoofing with / lookalikes
(However, punctuation not required…)
http://www.google.comノpathノfile.nottrusted.org
Katakana No U+FF89

Attack Vectors: IDN Syntax Spoofing with / lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
- Detects Confusables
- Detects Invisibles
- Detects syntax and punctuation lookalikes
- Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
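A minimal sketch of how the invisible characters listed above might be located in a string. The helper name `find_invisibles` is hypothetical. Treating every character of general category Cf (format) as invisible is an assumption that goes beyond the three code points named on the slide.

```python
import unicodedata

# The three code points named on the slide.
INVISIBLES = {"\u200b", "\u180e", "\ufeff"}

def find_invisibles(text):
    """Return (index, 'U+XXXX') for each invisible character found."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        # Assumption: any Cf (format) character is also worth flagging.
        if ch in INVISIBLES or unicodedata.category(ch) == "Cf"
    ]
```

Reporting positions rather than silently deleting the characters leaves the decision to reject or display the string with the caller.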
Attack Vectors: Visual Spoofing, Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
- ō is U+006F U+0304
- it looks like U+006F U+0304 U+0304, which carries the macron twice
• Order of combining marks
- ȏ and ö under NFKC
- <U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
- <U+006F U+0311 U+0308> becomes <U+020F U+0308>
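The two combining-mark tricks above can be reproduced with Python's `unicodedata` module. This is a sketch for experimentation. The helper names `nfkc_points` and `has_repeated_mark` are hypothetical.

```python
import unicodedata

def nfkc_points(text):
    """Normalize to NFKC and list the resulting code points."""
    normalized = unicodedata.normalize("NFKC", text)
    return [f"U+{ord(ch):04X}" for ch in normalized]

def has_repeated_mark(text):
    """True if the same combining mark occurs twice in a row."""
    return any(
        a == b and unicodedata.combining(a) != 0
        for a, b in zip(text, text[1:])
    )
```

Both marks in the ordering example share the same combining class, so normalization does not reorder them and the two sequences stay distinct even though they may render alike.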
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
- "These characters are to be avoided wherever possible because of security concerns"
- forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
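As a rough illustration, the explicit directional controls can be spotted through their bidirectional class. The helper `bidi_controls` is hypothetical. Including the embedding controls and PDF alongside the two overrides is an assumption added for completeness.

```python
import unicodedata

# Bidi classes of the explicit overrides (LRO, RLO), plus the
# embeddings and the pop character, added here as an assumption.
EXPLICIT_BIDI = {"LRO", "RLO", "LRE", "RLE", "PDF"}

def bidi_controls(text):
    """Return (index, character name) for each explicit bidi control."""
    return [
        (i, unicodedata.name(ch))
        for i, ch in enumerate(text)
        if unicodedata.bidirectional(ch) in EXPLICIT_BIDI
    ]
```

For example, a name such as "abc" + U+202E + "txt.exe" would be reported with the override at index 3.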
Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
U+2032 PRIME

Windows best-fit P/Invoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)
How can we best-fit the s character?
• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)
To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
Case Study: Social Networking, Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
- Attack: Filter evasion, code execution
- Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
- Root Cause: best-fit mappings
-moz-binding()
was not allowed, but…
-[U+ff4d]oz-binding()
would best-fit map to -moz-binding()
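The bypass in this case study comes down to filtering before a lossy conversion. The sketch below simulates that ordering problem. It is not the site's actual filter. `BEST_FIT` is a hypothetical one-entry stand-in for a platform's best-fit table, and all function names are assumptions.

```python
# Hypothetical one-entry best-fit table; real tables are platform-specific.
BEST_FIT = {"\uff4d": "m"}  # FULLWIDTH LATIN SMALL LETTER M -> m

def best_fit(text):
    """Simulate a lossy best-fit conversion applied after filtering."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)

def naive_filter_allows(css):
    """Blocklist check that runs on the string before conversion."""
    return "-moz-binding" not in css

def safer_filter_allows(css):
    """Apply the conversion the back end will apply, then check."""
    return "-moz-binding" not in best_fit(css)
```

The naive check lets the fullwidth variant through, and the later conversion then produces the blocked string. Checking the converted form closes that particular gap. Avoiding best-fit conversion altogether, as the guidance suggests, removes it.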
Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution
NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose
İ becomes I + U+0307 (combining dot above)
U+0130 becomes U+0049 U+0307
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
U+FE64 becomes U+003C
toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC: first defense against visual spoofing
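A short sketch of the "normalize before validation" guidance, using Python's `unicodedata`. The function `validate_after_nfkc` and its policy of rejecting angle brackets are illustrative assumptions, not a complete input filter.

```python
import unicodedata

def validate_after_nfkc(text):
    """Normalize to NFKC first, then reject markup characters."""
    normalized = unicodedata.normalize("NFKC", text)
    if "<" in normalized or ">" in normalized:
        raise ValueError("markup characters not allowed")
    return normalized
```

Run on U+FE64 followed by "script" and U+FE65, the normalized form is "<script>" and the check rejects it. The same check applied before normalization would have seen no ASCII angle brackets at all.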
Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
- Generation of non-shortest form
- Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
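"Validate UTF-8 encoding (throw on error)" can be sketched with Python's strict decoder, which rejects overlong forms such as C0 A7. The helper names below are hypothetical.

```python
def decode_utf8_strict(raw: bytes) -> str:
    """Decode UTF-8, raising UnicodeDecodeError on ill-formed input."""
    return raw.decode("utf-8", errors="strict")

def is_well_formed_utf8(raw: bytes) -> bool:
    """True only if the bytes are valid, shortest-form UTF-8."""
    try:
        decode_utf8_strict(raw)
        return True
    except UnicodeDecodeError:
        return False
```

The point of throwing is that no layer downstream ever gets the chance to reinterpret C0 A7 as an apostrophe.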
Attack Vectors: Directory traversal
How many ways can you say

Attack Vectors
• Directory traversal test cases
- httpsiterootsystem
- Overlong UTF8, U+002F: httpsiterootC0AEsystem
- Full-width Solidus U+FF0F, normalized under KC or KD: httpsiteroot EFBC8Fsystem
- Division Slash U+2215, best-fit: httpsiteroot E28895system
- Two dot leader U+2025, normalized under KC or KD: httpsiterootE280A5 E280A5 system
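A sketch of how a server might canonicalize a request path before checking for traversal, covering the overlong and NFKC cases listed above. The function `canonical_path` is hypothetical. It does not model best-fit conversion, so the Division Slash case would need separate handling on platforms that apply best-fit mappings.

```python
import unicodedata
from urllib.parse import unquote

def canonical_path(raw_path):
    """Percent-decode as strict UTF-8, apply NFKC, then reject '..'."""
    # errors="strict" makes overlong sequences such as %C0%AE raise.
    decoded = unquote(raw_path, errors="strict")
    normalized = unicodedata.normalize("NFKC", decoded)
    if ".." in normalized.split("/"):
        raise ValueError("directory traversal")
    return normalized
```

Both failure modes surface as `ValueError`, since `UnicodeDecodeError` is a subclass of it, so a caller can reject the request in one place.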
Root Causes: Handling the Unexpected
• Unassigned code points
- U+2073
• Illegal code points
- Half a surrogate pair
• Code points with special meaning
- U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected, Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes
<41 41>
<img src=[0xC2]> onerror=alert(1)<br >
becomes
<img src=> onerror=alert(1)<br >

Root Causes: Handling the Unexpected, Character-substitution
Correcting insecurely rather than failing
- Substituting a '' or a '' would be bad

Root Causes: Handling the Unexpected, Character-deletion
"deletion of noncharacters" (UTR-36)
<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
- A common alternative is '?' which can be safe
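The "use U+FFFD instead" option can be sketched as follows. Python's decoder replaces only the ill-formed byte and does not consume the byte after it. The helper `flag_interior_bom`, which substitutes rather than deletes a mid-string U+FEFF, is a hypothetical illustration of the same principle.

```python
def decode_with_replacement(raw: bytes) -> str:
    """Decode UTF-8, substituting U+FFFD for each ill-formed sequence."""
    return raw.decode("utf-8", errors="replace")

def flag_interior_bom(text: str) -> str:
    """Replace any U+FEFF after position 0 with U+FFFD (never delete it)."""
    return text[:1] + text[1:].replace("\ufeff", "\ufffd")
```

With the byte sequence 41 C2 3E 41 from the slide, the 3E survives decoding, and a tag name split by U+FEFF stays split instead of collapsing into "script".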
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
- E.g. cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
- Attack: Filter evasion, code execution
- Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
- Root Cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM
BOM U+FEFF
Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution
toLower("İ") == "i"
toLower("scrİpt") == "script"
len(x) != len(toLower(x))

Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
- ICU, .NET
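That case mapping can change a string's length is easy to check. The sketch below uses Python's built-in full case mappings. The helper `case_lengths` is hypothetical. Python lowercases U+0130 to "i" followed by U+0307 rather than to a bare "i", so the exact result of `toLower` on the slide depends on the runtime and locale.

```python
def case_lengths(text):
    """Code-point counts before and after each case mapping."""
    return {
        "original": len(text),
        "lower": len(text.lower()),
        "upper": len(text.upper()),
    }
```

Any buffer sized from the original length, or any validation done before the case operation, is working from the wrong string.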
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Casing - maximum expansion factors
Operation   UTF         Factor  Sample
Lower       8           1.5X    Ⱥ U+023A
Lower       16, 32      1X      A U+0041
Upper       8, 16, 32   3X      ΐ U+0390
Source: Unicode Technical Report 36

Normalization - maximum expansion factors
Operation   UTF      Factor  Sample
NFC         8        3X      U+1D160
NFC         16, 32   3X      U+FB2C
NFD         8        3X      ΐ U+0390
NFD         16, 32   4X      ᾂ U+1F82
NFKC/NFKD   8        11X     ﷺ U+FDFA
NFKC/NFKD   16, 32   18X     ﷺ U+FDFA
Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
- ICU, .NET
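The expansion factors in the tables can be measured directly. This is a sketch with a hypothetical helper, `expansion`, that reports growth in UTF-8 bytes and in code points (the UTF-32 unit count) for one character under a given normalization form.

```python
import unicodedata

def expansion(ch, form):
    """Growth factor of one character under normalization, per encoding."""
    out = unicodedata.normalize(form, ch)
    return {
        "utf-8": len(out.encode("utf-8")) / len(ch.encode("utf-8")),
        "utf-32": len(out) / len(ch),
    }
```

For U+FDFA under NFKC this gives 11 for UTF-8 and 18 for UTF-32, matching the table. That is the headroom a fixed-size output buffer would need.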
Root Causes: Controlling Syntax
• White space and line breaks
- E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera
• Unicode formatter characters exploited for XSS
- Damage: Filter evasion, controlling syntax
- Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
- Root Cause: Interpreting "white space"
- A problem with HTML 4.0 spec
<a href=[U+180E]onclick=alert()>
MVS U+180E

DEMO: Opera White Space Characters

Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…
Root Causes: Specifications
1) Character stability
- IDNA/Nameprep based on Unicode 3.2
2) Designs
- Specs are carefully designed but not always perfect
• This could have been a problem
- "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
- HTML 4.01
• Defines four whitespace characters and explicitly leaves handling other characters up to implementer
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
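Why a mismatch between the declared and the sniffed charset matters can be seen by decoding the same bytes both ways. The helper `decode_as` and the sample bytes are illustrative assumptions, not taken from the talk.

```python
def decode_as(raw: bytes, charsets=("iso-8859-1", "shift_jis")):
    """Decode the same bytes under each candidate charset."""
    return {name: raw.decode(name, errors="replace") for name in charsets}
```

The two bytes 82 A0 are one Hiragana letter under shift_jis but two separate characters under ISO-8859-1. A filter and a browser that disagree on the charset are therefore inspecting different strings.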
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools
• Watcher
- Web-app security testing and auditing
• Visual Spoofing Detection API
- Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher - Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API
• Problem
- Unicode enables visual-spoofing-maximus
• Solution
- Confusable detection
- Invisibles detection
- Syntax spoof detection
- more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
- Browsers
- Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net, Casaba Security
www.casabasecurity.com
Root Causes: The state of International Domain Names
• Registrars still allow
- Confusables
- Combining marks
- Single, Whole, and Mixed-script
• Registrars can't control
- Syntax spoofing in sub domain labels

Attack Vectors: Visual spoofing Vectors
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks
rn can look like m in certain fonts
www.mullets.com is not www.rnullets.com
Latin U+006D
Latin U+0072 U+006E
Are you using mono-width fonts
0 and O
1 and l
5 and S
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
Classic long URLrsquos
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
The Confusables
ndash Single script
ndash Mixed script
ndash Whole script
wwwcasabasecuritycom
Attack VectorsDefining Homographs
March 2009 copy 2009 Chris Weber
wwwɑpplecom User thinks lsquoarsquo
Really itrsquos Latin small letter Alpha lsquoɑrsquo
wwwlooĸoutnet
User thinks lsquokrsquo
Really itrsquos Latin letter kra lsquoĸrsquo
wwwcasabasecuritycom
Attack VectorsSingle-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwg๐๐glecom User thinks lsquoorsquo
Really itrsquos Thai digit zero lsquo๐rsquo
wwwfaϲebookcom
User thinks lsquocrsquo
Really itrsquos Greek lunate sigma symbol lsquocrsquo
wwwᏀooglecom
Really itrsquos Cherokee letter Nah lsquoᏀrsquo
wwwcasabasecuritycom
Attack VectorsMixed-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwаЬсcom
User thinks lsquoabcrsquo
Really itrsquos Cyrillic script
wwwігѕgov
User thinks lsquoirsrsquo
Really itrsquos Greek script
wwwcasabasecuritycom
Attack VectorsWhole-script and The Confusables
March 2009 copy 2009 Chris Weber
Browsers whitelist ORG
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Others donrsquot necessarily buthellip
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
bull ORG is whitelisted
ndash Limited characters available
bull To unscrutinizing eyes
iacute looks like i
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwmozillaorg is not wwwmoziacutellaorg
Latin U+0069
LatinU+00ED
March 2009 copy 2009 Chris Weber
(This case doesnrsquot work anymore)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
FULLWIDTH SOLIDUSU+FF0F
March 2009 copy 2009 Chris Weber
(Normalized to a U+002F)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
SOLIDUSU+002F
March 2009 copy 2009 Chris Weber
U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
U+FF65 KATAKANA MIDDLE DOT
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with and lookalikes
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes
httpwwwgooglecom
Katakana DotU+FF65
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecomノpathノfilenottrustedorg
Katakana NoU+FF89
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
IDN Visual Spoofing
March 2009 copy 2009 Chris Weber
bull Visual Spoofing Detection API
ndash Detects Confusables
ndash Detects Invisibles
ndash Detections syntax and punctuation lookalikes
ndash Detects combining mark tricks
bull Currently in testing
bull Release planned for Fall 2009
wwwcasabasecuritycom
IDN Visual SpoofingSolutions and Defenses (yes there is one)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
bull Fonts render glyphs confusingly
bull Fonts render glyphs as empty white space
httpwwwgooglecom phreedomorg
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)
A is A (Lucida Sans Unicode Courier New)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Multiple combining marks
o looks like U+006F U+0304
o is U+006F U+0304 U+0304
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Order of combining marksndash ȏ and ouml under NFKC
ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt
ltU+006F U+0311U+0308gt ltU+020F U+0308gt
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
bull httpunicodeorgreportstr9
ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo
ndash forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
March 2009 copy 2009 Chris Weber
Commonly occur in charset transformations and even innocuous APIrsquos
Impact Filter evasion Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When prime becomes
U+2032 PRIME
wwwcasabasecuritycom
Root CausesBest-fit mappings
March 2009 copy 2009 Chris Weber
Net runtime will marshall a string as LPStr to a pinvoke function
How can we best-fit the lt character
bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket
How can we best-fit the s character
bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex
To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]
wwwcasabasecuritycom
Windows best-fit pInvokeBest-fit mappings
March 2009 copy 2009 Chris Weber
bull Scrutinize charactercharset manipulation APIrsquos
bull Use EncoderFallback with SystemTextEncoding
bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
bull Use Unicode end-to-end
wwwcasabasecuritycom
Root CausesGuidance for Best-Fit mappings
March 2009 copy 2009 Chris Weber
bull A popular social networking site in 2008
bull Implemented complex filtering logic to prevent XSS
ndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting
ndash Root Cause best-fit mappings
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
-moz-binding()
was not allowed buthellip
-[U+ff4d]oz-binding()
would best-fit map
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
Normalizing strings after validation is dangerous
Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
NFD - Decompose (canonical)
NFC - Decompose (canonical) Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility) Recompose
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
İ becomes I +
wwwcasabasecuritycom
Root CausesNormalization
U+0130 U+0049 U+0307
March 2009 copy 2009 Chris Weber
But are there dangerous characters
You bethellip with NFKC and NFKD you could control HTML or other parsing
﹤ becomes lt
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
﹤ becomes lt
toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
Normalize strings before validation
NFKC first defense against Visual spoofing
wwwcasabasecuritycom
Root CausesGuidance for Normalization
March 2009 copy 2009 Chris Weber
Non-shortest or overlong UTF-8
Impact Filter evasion Enable code execution
Application gets C0A7
OSFramework sees 27
Database gets
wwwcasabasecuritycom
Root CausesNon-shortest form UTF-8
March 2009 copy 2009 Chris Weber
bull Unicode specification forbids
ndash Generation of non-shortest form
ndash Interpretation of non-shortest form for BMP
bull Validate UTF-8 encoding (throw on error)
wwwcasabasecuritycom
Root CausesGuidance for Non-shortest form UTF-8
March 2009 copy 2009 Chris Weber
How many ways can you say
wwwcasabasecuritycom
Attack VectorsDirectory traversal
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Directory traversal test casesndash httpsiterootsystem
ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem
ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem
ndash Division Slash U+2215 best-fithttpsiteroot E28895system
ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system
wwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Unassigned code points
ndash U+2073
bull Illegal code points
ndash Half a surrogate pair
bull Code points with special meaning
ndash U+FEFF is the BOM
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesHandling the Unexpected
March 2009 copy 2009 Chris Weber
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
lt41 C2 3E 41gt becomes
lt41 41gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
ltimg src=[0xC2]gt onerror=alert(1)ltbr gt
becomes
ltimg src=gt onerror=alert(1)ltbr gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
Correcting insecurely rather than failing
ndash Substituting a lsquorsquo or a lsquorsquo would be bad
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-substitution
March 2009 copy 2009 Chris Weber
ldquodeletion of noncharactersrdquo (UTR-36)
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
ltscr[U+FEFF]iptgt becomes ltscriptgt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
bull Fail or error
bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe
wwwcasabasecuritycom
Root CausesSolutions for Handling the Unexpected
March 2009 copy 2009 Chris Weber
bull Bypass filters WAFrsquos NIDS and validation
bull Exploit delivery techniques
ndash Eg Cross-site scripting (buffer overflow of the Web)
wwwcasabasecuritycom
Attack VectorsFilter evasion
March 2009 copy 2009 Chris Weber
Safari and Firefox BOM consumptionndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting
ndash Root Cause Character deletion
lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt
Can be nastier
lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt
wwwcasabasecuritycom
Case Study Apple and Mozilla
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Safari BOM injection for XSS
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
A Closer Look The BOM
BOMU+FEFF
March 2009 copy 2009 Chris Weber
bull Attackers manipulate casing operations to inject otherwise prohibited characters
bull Casing can multiply the buffer sizes needed
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
toLower(ldquoİrdquo) == ldquoirdquo
toLower(ldquoscrİptrdquo) == ldquoscriptrdquo
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
len(x) = len(toLower(x))
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
bull Perform casing operations before validation
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Casing
March 2009 copy 2009 Chris Weber
bull Incorrect assumptions about string sizes (chars vs bytes)
bull Improper width calculations
bull Impact Enable code execution
wwwcasabasecuritycom
Root CausesBuffer Overflows
March 2009 copy 2009 Chris Weber
Casing - maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
Lower 8 15 Ⱥ U+023A
16 32 1 A U+0041
Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
Normalization- maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
NFC8 3X 119136 U+1D160
16 32 3X ש U+FB2C
NFD8 3X ΐ U+0390
16 32 4X ᾂ U+1F82
NFKCNFKD8 11X
ملسو هيلع هللا ىلص U+FDFA16 32 18X
Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
bull Know the difference between bytes and chars
bull Secure coding
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Buffer Overflows
March 2009 copy 2009 Chris Weber
bull White space and line breaks
ndash Eg when U+180E acts like U+0020
bull Quotation marks
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesControlling Syntax
March 2009 copy 2009 Chris Weber
bull Manipulate HTML parsers and javascriptinterpreters
bull Control protocols
wwwcasabasecuritycom
Attacks and ExploitsControlling syntax
March 2009 copy 2009 Chris Weber
bull Unicode formatter characters exploited for XSS
ndash Damage Filter evasion controlling syntax
ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting
ndash Root Cause Interpreting ldquowhite spacerdquo
ndash A problem with HTML 40 spec
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
lta href=[U+180E]onclick=alert()gt
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Opera White Space Characters
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Case Study Opera
MVSU+180E
March 2009 copy 2009 Chris Weber
bull Question specifications
bull Be carefulhellip
wwwcasabasecuritycom
Root CausesGuidance for Controlling Syntax
March 2009 copy 2009 Chris Weber
1) Character stabilityndash IDNANameprep based on Unicode 32
2) Designsndash Specs are carefully designed but not always perfect
bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of
U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo
ndash HTML 401 bull Defines four whitespace characters and explicitly leaves
handling other characters up to implementer
wwwcasabasecuritycom
Root CausesSpecifications
March 2009 copy 2009 Chris Weber
bull Converting between charsets is dangerous
bull Mapping tables and algorithms vary across platforms
bull Impact Filter evasion Enable code execution Data-loss
wwwcasabasecuritycom
Root CausesCharset Transformations
March 2009 copy 2009 Chris Weber
bull Avoid if possible
bull Use Unicode as the broker
bull Beware the PUA mappings
bull Transform case and normalize prior to validation and redisplay
wwwcasabasecuritycom
Root CausesGuidance for Charset Transformations
Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enabling code execution

Example of a mismatch:
HTTP header: Content-Type: charset=ISO-8859-1
Document: <meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Body: attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
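The "force UTF-8, error if uncertain" guidance can be sketched as follows. This is a minimal illustration, not something from the talk: the function name `resolve_charset` and its two parameters are hypothetical, and it only compares the labels that Python's `codecs` module recognizes.

```python
import codecs

def resolve_charset(header_charset=None, meta_charset=None):
    """Pick one charset for a response, or refuse if the declarations disagree."""
    declared = set()
    for label in (header_charset, meta_charset):
        if label:
            # codecs.lookup() canonicalises aliases ("utf8" -> "utf-8") and
            # raises LookupError for labels Python does not know.
            declared.add(codecs.lookup(label.strip()).name)
    if len(declared) > 1:
        # Error if uncertain: never leave the choice to client-side sniffing.
        raise ValueError("conflicting charset declarations: %s" % sorted(declared))
    # Nothing declared at all: force UTF-8.
    return declared.pop() if declared else "utf-8"
```

With the mismatch shown on the slide, `resolve_charset("ISO-8859-1", "shift_jis")` raises instead of guessing.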
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
Tools
• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against visual spoofing and homograph attacks

Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com
Tools: Visual Spoofing Detection API
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share.
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.
Chris Weber, www.lookout.net
Casaba Security, www.casabasecurity.com
Attack Vectors: Visual Spoofing Vectors
• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating combining marks
• Bidi and syntax spoofing
Attack Vectors: Non-Unicode homograph attacks
rn can look like m in certain fonts
www.mullets.com is not www.rnullets.com
Latin m: U+006D
Latin rn: U+0072 U+006E

Are you using mono-width fonts?
0 and O
1 and l
5 and S

Classic long URLs
httploginfacebookintvitationvideomessageid-h048892r39sessionnfbidcomhomehtmdisbursements
Attack Vectors: Defining Homographs
The Confusables
  – Single script
  – Mixed script
  – Whole script

Attack Vectors: Single-script and The Confusables
www.ɑpple.com
User thinks 'a'. Really it's Latin small letter alpha 'ɑ'.
www.looĸout.net
User thinks 'k'. Really it's Latin small letter kra 'ĸ'.

Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com
User thinks 'o'. Really it's Thai digit zero '๐'.
www.faϲebook.com
User thinks 'c'. Really it's Greek lunate sigma symbol 'ϲ'.
www.Ꮐoogle.com
User thinks 'G'. Really it's Cherokee letter nah 'Ꮐ'.

Attack Vectors: Whole-script and The Confusables
www.аЬс.com
User thinks 'abc'. Really it's Cyrillic script.
www.ігѕ.gov
User thinks 'irs'. Really it's Cyrillic script too (і U+0456, г U+0433, ѕ U+0455).
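A rough way to flag the mixed-script cases above is sketched below. It is an assumption-laden illustration: the helper `script_hints` is hypothetical, and it uses the first word of each character's Unicode name as a stand-in for the real Script property, which Python's standard library does not expose.

```python
import unicodedata

def script_hints(label):
    """Return the set of script-like name prefixes found in a hostname label."""
    hints = set()
    for ch in label:
        # Skip ASCII digits, dots and hyphens; they are common to every label.
        if ch.isascii() and not ch.isalpha():
            continue
        # First word of the character name, e.g. "LATIN", "GREEK", "THAI".
        hints.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return hints

def looks_mixed(label):
    """True when a label draws characters from more than one script hint."""
    return len(script_hints(label)) > 1
```

This catches the mixed-script examples only. Single-script lookalikes (ɑ and ĸ are both named LATIN) and whole-script spoofs such as the all-Cyrillic аЬс report a single script and need a confusables table instead.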
Attack Vectors: IDN homograph attacks
Browsers whitelist .ORG
Others don't necessarily, but…
• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes, í looks like i

www.mozilla.org is not www.mozílla.org
Latin i: U+0069
Latin í: U+00ED
Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS: U+FF0F
(This case doesn't work anymore)

http://www.google.com/path/file.nottrusted.org
SOLIDUS: U+002F
(Normalized to a U+002F)

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes
╱ U+2571 BOX DRAWINGS
〳 U+3033 KANA REPEAT MARK
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT
Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
http://www･google･com
Katakana Dot: U+FF65
(However, punctuation is not required…)

Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.comノpathノfile.nottrusted.org
Katakana No: U+FF89
(However, punctuation is not required…)

Browser sees and displays a valid IDN
DNS sees Punycode
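The gap between what the browser displays and what DNS receives can be illustrated with Python's built-in `punycode` codec. This is only a sketch: the helper `to_ace` is hypothetical, and it deliberately skips the Nameprep/IDNA mapping steps that a real resolver applies before encoding.

```python
def to_ace(label):
    """Return the ASCII-compatible form of one hostname label."""
    if label.isascii():
        # Plain ASCII labels go to DNS unchanged.
        return label
    # Non-ASCII labels are Punycode-encoded and given the "xn--" prefix.
    return "xn--" + label.encode("punycode").decode("ascii")

def from_ace(label):
    """Reverse to_ace() so the displayed form can be compared with the wire form."""
    if label.startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label
```

Showing the user `to_ace(label)` next to the rendered label is one way to expose a spoof: "mozilla" stays as it is, while "mozílla" becomes an unfamiliar `xn--` string.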
DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
  – Detects confusables
  – Detects invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009
Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
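A minimal sketch of invisibles detection follows, assuming the three code points listed above plus anything in the Unicode "Cf" (format) general category. The function name `find_invisibles` is hypothetical and is not the API announced in the talk.

```python
import unicodedata

# The three characters named on the slide, listed explicitly because the
# general category of U+180E has changed between Unicode versions.
INVISIBLES = {"\u200b", "\u180e", "\ufeff"}

def find_invisibles(text):
    """Return (index, code point) pairs for characters that render as nothing."""
    hits = []
    for index, ch in enumerate(text):
        if ch in INVISIBLES or unicodedata.category(ch) == "Cf":
            hits.append((index, "U+%04X" % ord(ch)))
    return hits
```

An empty result means none of these characters were found; it does not prove the string is free of every glyph a given font renders as blank.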
Attack Vectors: Visual Spoofing, Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org

··· is ··· (Arial, Gothic)
A is A (Lucida Sans Unicode, Courier New)
Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
  – ō looks like <U+006F U+0304>
  – but it is <U+006F U+0304 U+0304>
• Order of combining marks
  – ȏ and ö under NFKC
  – <U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
  – <U+006F U+0311 U+0308> becomes <U+020F U+0308>
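Both combining-mark tricks can be checked with Python's `unicodedata` module. The sketch below is illustrative only: `max_mark_run` is a hypothetical helper, and it counts characters with a non-zero canonical combining class, so marks whose class is 0 are not counted.

```python
import unicodedata

def max_mark_run(text):
    """Length of the longest run of consecutive combining marks in text."""
    longest = run = 0
    for ch in text:
        # combining() returns the canonical combining class; 0 means "not a mark"
        # for the purposes of this sketch.
        run = run + 1 if unicodedata.combining(ch) else 0
        longest = max(longest, run)
    return longest

def nfkc(text):
    """Normalize to NFKC, the form named on the slide."""
    return unicodedata.normalize("NFKC", text)
```

A policy such as "reject labels where `max_mark_run` exceeds 1" would flag the doubled macron. The two orderings of U+0308 and U+0311 stay distinct after normalization because both marks share the same combining class and are therefore not reordered.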
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  – "These characters are to be avoided wherever possible, because of security concerns"
  – Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
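Detecting the two explicit overrides is straightforward with `unicodedata.bidirectional`, which reports the bidi class of a character. The function name `has_bidi_override` is an assumption made for this sketch.

```python
import unicodedata

def has_bidi_override(text):
    """True if text contains U+202D (class LRO) or U+202E (class RLO)."""
    return any(unicodedata.bidirectional(ch) in ("LRO", "RLO") for ch in text)
```

For example, a file name such as "abc" + U+202E + "txt.exe" is flagged even though it may display with a harmless-looking extension.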
Root Causes: Best-fit mappings
Commonly occur in charset transformations and even in innocuous APIs
Impact: filter evasion, enabling code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
U+2032 PRIME

Best-fit mappings: Windows best-fit and P/Invoke
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 (LEFT-POINTING ANGLE BRACKET) maps to U+003C
• U+3008 (LEFT ANGLE BRACKET) maps to U+003C
How can we best-fit the s character?
• U+015B (LATIN SMALL LETTER S WITH ACUTE) maps to U+0073
• U+015D (LATIN SMALL LETTER S WITH CIRCUMFLEX) maps to U+0073
To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]
Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
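The guidance above targets .NET and Win32. As a rough analogue in Python, whose standard codecs do not apply best-fit mappings, the same "fail instead of approximating" behavior comes from encoding with strict error handling. The helper name `encode_strict` is hypothetical.

```python
def encode_strict(text, charset):
    """Encode text to a legacy charset, raising instead of substituting.

    errors="strict" makes the codec raise UnicodeEncodeError for any
    character the target charset cannot represent exactly.
    """
    return text.encode(charset, errors="strict")
```

With this, a fullwidth "ｍ" (U+FF4D) or a Greek "σ" destined for Windows-1252 produces an error the caller must handle, rather than silently turning into "m" or "s" after a filter has already run.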
Case Study: Social Networking (Best-fit mappings)
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with best-fit mappings to achieve cross-site scripting
  – Root cause: best-fit mappings

-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map
Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: filter evasion, enabling code execution

NFD: canonical decomposition
NFC: canonical decomposition, then recomposition
NFKD: compatibility decomposition
NFKC: compatibility decomposition, then recomposition

İ becomes I + combining dot above
U+0130 becomes U+0049 U+0307
Root Causes: Normalization
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
U+FE64 becomes U+003C
toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC is a first defense against visual spoofing
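"Normalize before validation" can be sketched in Python as below. The forbidden-character list and the function name `validate` are assumptions chosen for illustration; a real filter would have its own rules.

```python
import unicodedata

FORBIDDEN = ("<", ">")  # illustrative markup characters only

def validate(text):
    """Normalize to NFKC first, then validate the normalized string."""
    normalized = unicodedata.normalize("NFKC", text)
    for ch in FORBIDDEN:
        if ch in normalized:
            raise ValueError("forbidden character %r after normalization" % ch)
    # Return the normalized form so later stages use what was validated.
    return normalized
```

Checking the raw input first would miss U+FE64, which contains no "<" until NFKC turns it into one. Returning the normalized string matters as much as the order: validating one form and storing another reopens the gap.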
Root Causes: Non-shortest form UTF-8
Non-shortest (overlong) UTF-8
Impact: filter evasion, enabling code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8
• The Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
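"Validate UTF-8 encoding (throw on error)" maps directly onto Python's strict decoder, which rejects overlong sequences. The wrapper name `decode_utf8_strict` is an assumption for this sketch.

```python
def decode_utf8_strict(raw):
    """Decode bytes as UTF-8, raising UnicodeDecodeError on ill-formed input.

    Python's UTF-8 decoder treats the lead bytes C0 and C1 as invalid, so
    overlong forms such as C0 A7 never decode to an apostrophe.
    """
    return raw.decode("utf-8", errors="strict")
```

The important design point is to decode once, at the boundary, and pass only the resulting string onward, so that no later layer re-interprets the raw bytes more leniently.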
Attack Vectors: Directory traversal
How many ways can you say "../"?

Attack Vectors
• Directory traversal test cases
  – Baseline: httpsiterootsystem
  – Overlong UTF-8, bytes C0 AE (the overlong form of U+002E; C0 AF would be U+002F): httpsiterootC0AEsystem
  – Full-width solidus U+FF0F, bytes EF BC 8F, normalized under KC or KD: httpsiteroot EFBC8Fsystem
  – Division slash U+2215, bytes E2 88 95, best-fit: httpsiteroot E28895system
  – Two dot leader U+2025, bytes E2 80 A5, normalized under KC or KD: httpsiterootE280A5 E280A5 system
Root Causes: Handling the Unexpected
• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: filter evasion, enabling code execution
Root Causes: Handling the Unexpected, Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
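For contrast, here is how a decoder that does not over-consume treats the byte sequence from the slide. This sketch uses Python's built-in UTF-8 decoder with `errors="replace"`; the function name `decode_lenient` is hypothetical.

```python
def decode_lenient(raw):
    """Decode UTF-8, replacing each ill-formed sequence with U+FFFD.

    Only the offending lead byte is replaced; the byte that follows it is
    decoded on its own, so a trailing '>' (0x3E) is not swallowed.
    """
    return raw.decode("utf-8", errors="replace")

sample = bytes([0x41, 0xC2, 0x3E, 0x41])
decoded = decode_lenient(sample)  # 'A', U+FFFD, '>', 'A'
```

The 0x3E survives as ">", so markup that follows a stray lead byte keeps its delimiters instead of merging into the next attribute.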
Root Causes: Handling the Unexpected, Character-substitution
Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected, Character-deletion
"deletion of noncharacters" (UTR-36)
<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  – A common alternative is '?', which can be safe
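The difference between deleting and replacing an unwanted character can be shown in a few lines. This is a sketch under assumptions: the helper `neutralize` and its default list of unwanted characters are hypothetical.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER

def neutralize(text, unwanted=("\ufeff",)):
    """Replace unwanted characters with U+FFFD instead of deleting them."""
    for ch in unwanted:
        text = text.replace(ch, REPLACEMENT)
    return text

def delete_unwanted(text, unwanted=("\ufeff",)):
    """The unsafe variant: deletion joins the surrounding text back together."""
    for ch in unwanted:
        text = text.replace(ch, "")
    return text
```

Deleting U+FEFF from "<scr[U+FEFF]ipt>" reassembles a working tag after the filter has already passed it, while replacement leaves a visibly broken token that no parser treats as "script".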
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (the buffer overflow of the Web)

Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with specially crafted strings to achieve cross-site scripting
  – Root cause: character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier:
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM
BOM = U+FEFF
Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enabling code execution

toLower("İ") == "i"
toLower("scrİpt") == "script"

len(x) != len(toLower(x))

Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
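How casing changes both content and length can be observed in Python, with one caveat stated up front: Python's locale-independent `str.lower()` maps İ (U+0130) to "i" followed by U+0307, not to the bare "i" shown on the slide. Other frameworks and locale settings behave differently, which is exactly why casing should happen before validation. The helper names below are hypothetical.

```python
def lowered_length_change(text):
    """Number of code points gained (or lost) by lowercasing text."""
    return len(text.lower()) - len(text)

def validate_after_casing(text, banned=("script",)):
    """Apply the casing operation first, then validate the result."""
    lowered = text.lower()
    for word in banned:
        if word in lowered:
            raise ValueError("banned token %r after casing" % word)
    return lowered
```

In this sketch "scrİpt" lowercases to "scri" + U+0307 + "pt", which is one code point longer and does not contain "script". A platform that maps İ straight to "i" would produce "script", so the check has to run on the output of the same casing routine the application actually uses.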
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: enabling code execution

Casing: maximum expansion factors
Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390
Source: Unicode Technical Report 36

Normalization: maximum expansion factors
Operation   UTF         Factor   Sample
NFC         8           3X       𝅘𝅥 U+1D160
NFC         16, 32      3X       שּׁ U+FB2C
NFD         8           3X       ΐ U+0390
NFD         16, 32      4X       ᾂ U+1F82
NFKC/NFKD   8           11X      ﷺ U+FDFA
NFKC/NFKD   16, 32      18X      ﷺ U+FDFA
Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
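Several of the expansion factors in the tables can be reproduced with a short measurement. The helper `expansion` is hypothetical; it measures growth in encoded bytes, and "utf-16-le" is used so that Python does not prepend a BOM to the count.

```python
import unicodedata

def expansion(text, transform, encoding="utf-8"):
    """Ratio of encoded size after transform to encoded size before it."""
    before = len(text.encode(encoding))
    after = len(transform(text).encode(encoding))
    return after / before

def nfkc(text):
    """Compatibility decomposition followed by recomposition."""
    return unicodedata.normalize("NFKC", text)

# U+FDFA under NFKC: 3 bytes of UTF-8 become 33, and 1 code unit of UTF-16 becomes 18.
factor_utf8 = expansion("\ufdfa", nfkc)
factor_utf16 = expansion("\ufdfa", nfkc, "utf-16-le")
```

Sizing an output buffer from the input length alone is therefore unsafe; either allocate for the worst-case factor of the operation or let the library allocate the result.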
Root Causes: Controlling Syntax
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enabling code execution

Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
March 2009 copy 2009 Chris Weber
rn can look like m in certain fonts
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
wwwmulletscom is not wwwrnulletscom
Latin U+006D
LatinU+0073 U+006E
March 2009 copy 2009 Chris Weber
Are you using mono-width fonts
0 and O
1 and l
5 and S
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
Classic long URLrsquos
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
The Confusables
ndash Single script
ndash Mixed script
ndash Whole script
wwwcasabasecuritycom
Attack VectorsDefining Homographs
March 2009 copy 2009 Chris Weber
wwwɑpplecom User thinks lsquoarsquo
Really itrsquos Latin small letter Alpha lsquoɑrsquo
wwwlooĸoutnet
User thinks lsquokrsquo
Really itrsquos Latin letter kra lsquoĸrsquo
wwwcasabasecuritycom
Attack VectorsSingle-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwg๐๐glecom User thinks lsquoorsquo
Really itrsquos Thai digit zero lsquo๐rsquo
wwwfaϲebookcom
User thinks lsquocrsquo
Really itrsquos Greek lunate sigma symbol lsquocrsquo
wwwᏀooglecom
Really itrsquos Cherokee letter Nah lsquoᏀrsquo
wwwcasabasecuritycom
Attack VectorsMixed-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwаЬсcom
User thinks lsquoabcrsquo
Really itrsquos Cyrillic script
wwwігѕgov
User thinks lsquoirsrsquo
Really itrsquos Greek script
wwwcasabasecuritycom
Attack VectorsWhole-script and The Confusables
March 2009 copy 2009 Chris Weber
Browsers whitelist ORG
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Others donrsquot necessarily buthellip
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
bull ORG is whitelisted
ndash Limited characters available
bull To unscrutinizing eyes
iacute looks like i
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwmozillaorg is not wwwmoziacutellaorg
Latin U+0069
LatinU+00ED
March 2009 copy 2009 Chris Weber
(This case doesnrsquot work anymore)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
FULLWIDTH SOLIDUSU+FF0F
March 2009 copy 2009 Chris Weber
(Normalized to a U+002F)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
SOLIDUSU+002F
March 2009 copy 2009 Chris Weber
U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
U+FF65 KATAKANA MIDDLE DOT
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with and lookalikes
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes
httpwwwgooglecom
Katakana DotU+FF65
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecomノpathノfilenottrustedorg
Katakana NoU+FF89
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
IDN Visual Spoofing
March 2009 copy 2009 Chris Weber
bull Visual Spoofing Detection API
ndash Detects Confusables
ndash Detects Invisibles
ndash Detections syntax and punctuation lookalikes
ndash Detects combining mark tricks
bull Currently in testing
bull Release planned for Fall 2009
wwwcasabasecuritycom
IDN Visual SpoofingSolutions and Defenses (yes there is one)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
bull Fonts render glyphs confusingly
bull Fonts render glyphs as empty white space
httpwwwgooglecom phreedomorg
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)
A is A (Lucida Sans Unicode Courier New)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Multiple combining marks
o looks like U+006F U+0304
o is U+006F U+0304 U+0304
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Order of combining marksndash ȏ and ouml under NFKC
ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt
ltU+006F U+0311U+0308gt ltU+020F U+0308gt
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
bull httpunicodeorgreportstr9
ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo
ndash forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
March 2009 copy 2009 Chris Weber
Commonly occur in charset transformations and even innocuous APIrsquos
Impact Filter evasion Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When prime becomes
U+2032 PRIME
wwwcasabasecuritycom
Root CausesBest-fit mappings
March 2009 copy 2009 Chris Weber
Net runtime will marshall a string as LPStr to a pinvoke function
How can we best-fit the lt character
bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket
How can we best-fit the s character
bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex
To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]
wwwcasabasecuritycom
Windows best-fit pInvokeBest-fit mappings
March 2009 copy 2009 Chris Weber
bull Scrutinize charactercharset manipulation APIrsquos
bull Use EncoderFallback with SystemTextEncoding
bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
bull Use Unicode end-to-end
wwwcasabasecuritycom
Root CausesGuidance for Best-Fit mappings
March 2009 copy 2009 Chris Weber
bull A popular social networking site in 2008
bull Implemented complex filtering logic to prevent XSS
ndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting
ndash Root Cause best-fit mappings
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
-moz-binding()
was not allowed buthellip
-[U+ff4d]oz-binding()
would best-fit map
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
Normalizing strings after validation is dangerous
Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
NFD - Decompose (canonical)
NFC - Decompose (canonical) Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility) Recompose
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
İ becomes I +
wwwcasabasecuritycom
Root CausesNormalization
U+0130 U+0049 U+0307
March 2009 copy 2009 Chris Weber
But are there dangerous characters
You bethellip with NFKC and NFKD you could control HTML or other parsing
﹤ becomes lt
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
﹤ becomes lt
toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
Normalize strings before validation
NFKC first defense against Visual spoofing
wwwcasabasecuritycom
Root CausesGuidance for Normalization
March 2009 copy 2009 Chris Weber
Non-shortest or overlong UTF-8
Impact Filter evasion Enable code execution
Application gets C0A7
OSFramework sees 27
Database gets
wwwcasabasecuritycom
Root CausesNon-shortest form UTF-8
March 2009 copy 2009 Chris Weber
bull Unicode specification forbids
ndash Generation of non-shortest form
ndash Interpretation of non-shortest form for BMP
bull Validate UTF-8 encoding (throw on error)
wwwcasabasecuritycom
Root CausesGuidance for Non-shortest form UTF-8
March 2009 copy 2009 Chris Weber
How many ways can you say
wwwcasabasecuritycom
Attack VectorsDirectory traversal
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Directory traversal test casesndash httpsiterootsystem
ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem
ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem
ndash Division Slash U+2215 best-fithttpsiteroot E28895system
ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system
wwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Unassigned code points
ndash U+2073
bull Illegal code points
ndash Half a surrogate pair
bull Code points with special meaning
ndash U+FEFF is the BOM
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesHandling the Unexpected
March 2009 copy 2009 Chris Weber
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
lt41 C2 3E 41gt becomes
lt41 41gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
ltimg src=[0xC2]gt onerror=alert(1)ltbr gt
becomes
ltimg src=gt onerror=alert(1)ltbr gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
Correcting insecurely rather than failing
ndash Substituting a lsquorsquo or a lsquorsquo would be bad
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-substitution
March 2009 copy 2009 Chris Weber
ldquodeletion of noncharactersrdquo (UTR-36)
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
ltscr[U+FEFF]iptgt becomes ltscriptgt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
bull Fail or error
bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe
wwwcasabasecuritycom
Root CausesSolutions for Handling the Unexpected
March 2009 copy 2009 Chris Weber

Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root cause: character deletion
<a href="java[U+FEFF]script:alert('XSS')">
Can be nastier:
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM
BOM: U+FEFF

Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enable code execution

Root Causes: Casing
toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”

Root Causes: Casing
len(x) != len(toLower(x))

Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
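A small Python sketch of both casing risks. Python's `str.lower()` maps U+0130 to "i" followed by U+0307 rather than to a bare "i", so it illustrates the length change but not the exact mapping on the slide. To show a prohibited character appearing after a casing operation, the sketch uses U+017F (LATIN SMALL LETTER LONG S), which uppercases to "S". The filter function is hypothetical.

```python
def naive_filter_allows(markup: str) -> bool:
    """Hypothetical filter: lowercases, then looks for '<script'."""
    return "<script" not in markup.lower()


# U+017F is already lowercase, so lowercasing leaves it alone...
payload = "<\u017fcript>"
allowed = naive_filter_allows(payload)

# ...but a later uppercasing step turns it into a plain ASCII 'S'.
uppercased = payload.upper()

# Casing can also change the length of a string.
dotted_capital_i = "\u0130"
length_before = len(dotted_capital_i)
length_after = len(dotted_capital_i.lower())
```

Running the casing operation first and validating its output closes the gap, which is the first item in the guidance above.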

Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enable code execution

Root Causes: Buffer Overflows
Casing: maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5X     Ⱥ U+023A
Lower       16, 32      1X       A U+0041
Upper       8, 16, 32   3X       ΐ U+0390
Source: Unicode Technical Report #36

Root Causes: Buffer Overflows
Normalization: maximum expansion factors

Operation   UTF         Factor   Sample
NFC         8           3X       𝅘𝅥𝅮 U+1D160
NFC         16, 32      3X       שּׁ U+FB2C
NFD         8           3X       ΐ U+0390
NFD         16, 32      4X       ᾂ U+1F82
NFKC/NFKD   8           11X      ﷺ U+FDFA
NFKC/NFKD   16, 32      18X      ﷺ U+FDFA
Source: Unicode Technical Report #36

Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
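Some of the expansion factors quoted from UTR #36 can be checked with Python's standard library. This is only a spot check of three samples, assuming the interpreter ships reasonably recent Unicode data; `utf8_growth` is a helper name invented for this sketch.

```python
import unicodedata


def utf8_growth(before: str, after: str) -> float:
    """Ratio of UTF-8 byte lengths after/before a transformation."""
    return len(after.encode("utf-8")) / len(before.encode("utf-8"))


samples = {
    "lower U+023A": ("\u023a", "\u023a".lower()),
    "upper U+0390": ("\u0390", "\u0390".upper()),
    "NFKC U+FDFA": ("\ufdfa", unicodedata.normalize("NFKC", "\ufdfa")),
}

growth = {name: utf8_growth(before, after)
          for name, (before, after) in samples.items()}

# Code point counts, relevant to UTF-32 buffers.
nfkc_code_points = len(unicodedata.normalize("NFKC", "\ufdfa"))
nfd_code_points = len(unicodedata.normalize("NFD", "\u1f82"))
```

Sizing an output buffer from the input length alone is therefore unsafe; either size for the worst case or let the library allocate the result.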

Root Causes: Controlling Syntax
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enable code execution

Attacks and Exploits: Controlling Syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera
• Unicode formatter characters exploited for XSS
  – Damage: filter evasion, controlling syntax
  – Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root cause: interpreting “white space”
  – A problem with the HTML 4.0 spec

Case Study: Opera
<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

Case Study: Opera
MVS: U+180E

Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…

Root Causes: Specifications
1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer
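One way to act on "question specifications" is to make the white space set explicit and flag anything else that a parser might treat as a separator. The sketch below assumes the four HTML 4.01 white space characters are space, tab, form feed, and zero width space. The choice of Unicode categories to flag (Zs, Zl, Zp, Cf) is an assumption of this example, not something the slides prescribe.

```python
import unicodedata

# The four white space characters defined by HTML 4.01.
HTML401_WHITE_SPACE = {"\u0020", "\u0009", "\u000c", "\u200b"}

# Categories treated as "possible separators" in this sketch (an assumption).
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf"}


def unexpected_separators(markup: str) -> list:
    """Return code points that look like separators but are not in the explicit set."""
    return [
        f"U+{ord(ch):04X}"
        for ch in markup
        if unicodedata.category(ch) in SUSPECT_CATEGORIES
        and ch not in HTML401_WHITE_SPACE
    ]


flagged = unexpected_separators("<a href=\u180eonclick=alert()>")
clean = unexpected_separators("<a href=x onclick=alert()>")
```

U+180E has been classified as a space separator or as a format character depending on the Unicode version, so the check covers both categories.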

Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enable code execution, data loss

Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enable code execution

Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
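Why a charset mismatch matters can be seen with two bytes in Python. The byte pair below is chosen for this sketch, not taken from the slides: it decodes to a single katakana letter under Shift_JIS, but to a control character plus a backslash under ISO-8859-1. Two components that disagree about the charset therefore disagree about whether a syntax character is present. `decode_utf8_or_fail` is a hypothetical helper illustrating "force UTF-8, error if uncertain".

```python
raw = b"\x83\x5c"

as_latin1 = raw.decode("iso-8859-1")  # two characters, the second is a backslash
as_sjis = raw.decode("shift_jis")     # one character, no backslash at all


def decode_utf8_or_fail(data: bytes):
    """Accept input only if it is well-formed UTF-8; otherwise return None."""
    try:
        return data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return None


rejected = decode_utf8_or_fail(raw)
accepted = decode_utf8_or_fail("ソ".encode("utf-8"))
```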

Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools
• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against visual spoofing and homograph attacks

Tools: Watcher (some of the passive checks included)
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher (web-app security testing and auditing)
http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more

Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you
Contact me with questions, new test cases, or ideas to share.
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.
Chris Weber, www.lookout.net, Casaba Security
www.casabasecurity.com

Attack Vectors: Non-Unicode homograph attacks
Are you using mono-width fonts?
0 and O
1 and l
5 and S

Attack Vectors: Non-Unicode homograph attacks
Classic long URLs
httploginfacebookintvitationvideomessageid-h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Defining Homographs
The Confusables
  – Single script
  – Mixed script
  – Whole script

Attack Vectors: Single-script and The Confusables
www.ɑpple.com
User thinks ‘a’
Really it’s Latin small letter alpha ‘ɑ’
www.looĸout.net
User thinks ‘k’
Really it’s Latin small letter kra ‘ĸ’

Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com
User thinks ‘o’
Really it’s Thai digit zero ‘๐’
www.faϲebook.com
User thinks ‘c’
Really it’s Greek lunate sigma symbol ‘ϲ’
www.Ꮐoogle.com
Really it’s Cherokee letter nah ‘Ꮐ’

Attack Vectors: Whole-script and The Confusables
www.аЬс.com
User thinks ‘abc’
Really it’s Cyrillic script
www.ігѕ.gov
User thinks ‘irs’
Really it’s Cyrillic script
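Mixed-script labels are the easiest of the three confusable classes to flag mechanically. The sketch below is a rough heuristic, not the Visual Spoofing Detection API: it approximates a character's script by the first word of its Unicode name, which is an assumption made here because Python's standard library has no script property.

```python
import unicodedata
from typing import Set


def scripts_in_label(label: str) -> Set[str]:
    """Approximate the scripts used in a label from Unicode character names."""
    found = set()
    for ch in label:
        if ch.isalnum():
            found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found


def is_mixed_script(label: str) -> bool:
    return len(scripts_in_label(label)) > 1


mixed_greek = scripts_in_label("fa\u03f2ebook")      # Greek lunate sigma for 'c'
mixed_thai = scripts_in_label("g\u0e50\u0e50gle")    # Thai digit zero for 'o'
whole_cyrillic = scripts_in_label("\u0430\u042c\u0441")
single_latin = scripts_in_label("\u0251pple")        # Latin alpha for 'a'
```

Single-script and whole-script confusables pass this check by construction, which is why they need a confusables table rather than a script test.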

Attack Vectors: IDN homograph attacks
Browsers whitelist .ORG

Attack Vectors: IDN homograph attacks
Others don’t necessarily, but…

Attack Vectors: IDN homograph attacks
• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes
  í looks like i

Attack Vectors: IDN homograph attacks
www.mozilla.org is not www.mozílla.org
i: Latin U+0069
í: Latin U+00ED

Attack Vectors: IDN Syntax Spoofing with / lookalikes
(This case doesn’t work anymore)
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS: U+FF0F

Attack Vectors: IDN Syntax Spoofing with / lookalikes
(Normalized to a U+002F)
http://www.google.com/path/file.nottrusted.org
SOLIDUS: U+002F

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes
╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
(However, punctuation not required…)
httpwwwgooglecom
Katakana Dot: U+FF65

Attack Vectors: IDN Syntax Spoofing with / lookalikes
(However, punctuation not required…)
http://www.google.comﾉpathﾉfile.nottrusted.org
Katakana No: U+FF89

Attack Vectors: IDN Syntax Spoofing with / lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
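The split between what the browser displays and what DNS sees can be reproduced with Python's built-in `idna` codec, which implements the older IDNA 2003 rules. The host name is the lookalike from the earlier homograph slide; the exact Punycode suffix is left unstated here because only the `xn--` form matters for the point being made.

```python
displayed_host = "www.moz\u00edlla.org"   # what the user sees (í, not i)

# What goes on the wire: each non-ASCII label becomes an 'xn--' label.
wire_host = displayed_host.encode("idna")

# The conversion is reversible, so the browser can show the Unicode form again.
round_trip = wire_host.decode("idna")

is_real_mozilla = wire_host == b"www.mozilla.org"
```

Comparing host names in their ASCII (Punycode) form is a cheap way to avoid being fooled by the displayed form.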

DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
  – Detects confusables
  – Detects invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009

Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
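Detecting the three invisibles listed above takes only a lookup table. The sketch covers exactly those three code points; a real detector would need a longer list, and the function name is invented for this example.

```python
INVISIBLES = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u180e": "MONGOLIAN VOWEL SEPARATOR",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE",
}


def find_invisibles(text: str) -> list:
    """Return (index, code point, name) for each listed invisible in text."""
    return [
        (i, f"U+{ord(ch):04X}", INVISIBLES[ch])
        for i, ch in enumerate(text)
        if ch in INVISIBLES
    ]


hits = find_invisibles("pay\u200bpal.com")
no_hits = find_invisibles("paypal.com")
```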

Attack Vectors: Visual Spoofing, Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org

Attack Vectors: Visual Spoofing, Problematic Font-rendering
··· is ··· (Arial, Gothic)
A is A (Lucida Sans Unicode, Courier New)

Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
  ō (U+006F U+0304) looks like ō̄ (U+006F U+0304 U+0304)

Attack Vectors: Visual Spoofing with Combining Marks
• Order of combining marks
  – ȏ and ö under NFKC
<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
<U+006F U+0311 U+0308> becomes <U+020F U+0308>
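Both combining-mark tricks can be checked with `unicodedata`. The first part reproduces the two NFKC results from the slide, where the same three characters in a different order normalize to different strings. The second part is a simple heuristic, assumed for this sketch, that flags the same combining mark appearing twice in a row.

```python
import unicodedata

# Same base letter and marks, different order of the marks.
diaeresis_first = unicodedata.normalize("NFKC", "o\u0308\u0311")
breve_first = unicodedata.normalize("NFKC", "o\u0311\u0308")


def has_repeated_marks(text: str) -> bool:
    """True if an identical combining mark occurs twice in a row."""
    return any(
        prev == cur and unicodedata.combining(cur) != 0
        for prev, cur in zip(text, text[1:])
    )


double_macron = has_repeated_marks("o\u0304\u0304")
single_macron = has_repeated_marks("o\u0304")
```

Because both marks have the same combining class, normalization does not reorder them, so the two strings stay distinct even though they can render alike.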

Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  – “These characters are to be avoided wherever possible, because of security concerns”
  – Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
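A minimal Python check for the two explicit directional overrides named above, using the bidirectional class that `unicodedata` reports for each character. The sample file name is invented for illustration.

```python
import unicodedata

BIDI_OVERRIDE_CLASSES = {"LRO", "RLO"}


def contains_bidi_override(text: str) -> bool:
    """True if text contains U+202D or U+202E (bidi classes LRO / RLO)."""
    return any(
        unicodedata.bidirectional(ch) in BIDI_OVERRIDE_CLASSES
        for ch in text
    )


# With U+202E in place, the tail of this name is displayed reversed.
spoofed_name = "invoice\u202efdp.exe"
spoofed = contains_bidi_override(spoofed_name)
plain = contains_bidi_override("invoice.pdf")
```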

Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: filter evasion, enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
U+2032 PRIME

Windows best-fit with P/Invoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)
How can we best-fit the s character?
• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)
To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
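The guidance above names .NET and Win32 settings. As an analogous sketch in Python, whose codecs do not apply best-fit mappings, the same "fail rather than substitute" behavior is what strict encoding gives. `encode_strict` and the choice of cp1252 as the legacy charset are assumptions of this example.

```python
def encode_strict(text: str, charset: str = "cp1252"):
    """Encode to a legacy charset, returning None instead of substituting."""
    try:
        return text.encode(charset, errors="strict")
    except UnicodeEncodeError:
        return None


sigma = encode_strict("\u03c3")                   # Greek sigma, no cp1252 mapping
fullwidth_m = encode_strict("-\uff4doz-binding")  # fullwidth 'm', no cp1252 mapping
ordinary = encode_strict("caf\u00e9")             # representable text still encodes
```

A caller that receives None can reject the input, which is safer than letting a lookalike silently turn into the ASCII character a filter was trying to block.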

Case Study: Social Networking (Best-fit mappings)
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root cause: best-fit mappings

Case Study: Social Networking (Best-fit mappings)
-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map

Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: filter evasion, enable code execution

Root Causes: Normalization
NFD: Decompose (canonical)
NFC: Decompose (canonical), Recompose
NFKD: Decompose (compatibility)
NFKC: Decompose (compatibility), Recompose

Root Causes: Normalization
İ becomes I + ◌̇
U+0130 becomes U+0049 U+0307

Root Causes: Normalization
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
U+FE64 becomes U+003C

Root Causes: Normalization
﹤ becomes <
toNFKC(“﹤script>”) = “<script>”
U+FE64 becomes U+003C

Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC: first defense against visual spoofing
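The slide's toNFKC example maps directly onto Python's `unicodedata.normalize`. The validator below is a hypothetical one-line check; the point of the sketch is only the ordering, with normalization before validation.

```python
import unicodedata


def validate(text: str) -> bool:
    """Hypothetical validator: reject anything containing '<'."""
    return "<" not in text


def normalize_then_validate(text: str):
    """Normalize first, then validate the normalized form."""
    normalized = unicodedata.normalize("NFKC", text)
    return normalized if validate(normalized) else None


payload = "\ufe64script>"          # starts with U+FE64 SMALL LESS-THAN SIGN

passes_raw_check = validate(payload)                    # validated too early
after_nfkc = unicodedata.normalize("NFKC", payload)     # what later code sees
safe_result = normalize_then_validate(payload)          # rejected

decomposed_i = unicodedata.normalize("NFD", "\u0130")   # U+0049 U+0307
```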

Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: filter evasion, enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
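The C0 A7 example can be worked through by hand. `lenient_two_byte_decode` below is a hypothetical decoder that skips the shortest-form check, written only to show the arithmetic; `is_valid_utf8` relies on Python's strict codec, which rejects the 0xC0 lead byte.

```python
def lenient_two_byte_decode(lead: int, trail: int) -> str:
    """Unsafe: combine the payload bits without checking for shortest form."""
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


def is_valid_utf8(data: bytes) -> bool:
    """Validate by decoding strictly (throw on error, as the guidance says)."""
    try:
        data.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


# C0 A7 carries the bits of 0x27, the apostrophe.
lenient_quote = lenient_two_byte_decode(0xC0, 0xA7)

# The same arithmetic gives '.' for C0 AE and '/' for C0 AF.
lenient_dot = lenient_two_byte_decode(0xC0, 0xAE)
lenient_slash = lenient_two_byte_decode(0xC0, 0xAF)

overlong_accepted = is_valid_utf8(bytes([0xC0, 0xA7]))
shortest_accepted = is_valid_utf8(b"'")
```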

Attack Vectors: Directory traversal
How many ways can you say

Attack Vectors
• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8, U+002F
    httpsiterootC0AEsystem
  – Full-width solidus U+FF0F, normalized KC or KD
    httpsiteroot EFBC8Fsystem
  – Division slash U+2215, best-fit
    httpsiteroot E28895system
  – Two dot leader U+2025, normalized KC or KD
    httpsiterootE280A5 E280A5 system
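The byte sequences in the test cases above can be tied back to their code points in Python. The sketch prints nothing and makes no claim about any particular server. It records the UTF-8 bytes of each character and what NFKC turns it into, and adds a hypothetical check, `is_traversal_after_nfkc`, that inspects a path segment after normalization.

```python
import unicodedata

candidates = {
    "FULLWIDTH SOLIDUS": "\uff0f",
    "DIVISION SLASH": "\u2215",
    "TWO DOT LEADER": "\u2025",
}

# For each character: (UTF-8 bytes as hex, NFKC-normalized form).
report = {
    name: (ch.encode("utf-8").hex().upper(), unicodedata.normalize("NFKC", ch))
    for name, ch in candidates.items()
}


def is_traversal_after_nfkc(path_segment: str) -> bool:
    """Check for '..' or '/' only after NFKC normalization."""
    folded = unicodedata.normalize("NFKC", path_segment)
    return ".." in folded or "/" in folded


leader_flagged = is_traversal_after_nfkc("\u2025")
plain_segment_flagged = is_traversal_after_nfkc("system")
```

The division slash is left unchanged by NFKC, consistent with the slide listing it under best-fit rather than normalization.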
March 2009 copy 2009 Chris Weber
bull Unassigned code points
ndash U+2073
bull Illegal code points
ndash Half a surrogate pair
bull Code points with special meaning
ndash U+FEFF is the BOM
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesHandling the Unexpected
March 2009 copy 2009 Chris Weber
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
lt41 C2 3E 41gt becomes
lt41 41gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
ltimg src=[0xC2]gt onerror=alert(1)ltbr gt
becomes
ltimg src=gt onerror=alert(1)ltbr gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
Correcting insecurely rather than failing
ndash Substituting a lsquorsquo or a lsquorsquo would be bad
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-substitution
March 2009 copy 2009 Chris Weber
ldquodeletion of noncharactersrdquo (UTR-36)
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
ltscr[U+FEFF]iptgt becomes ltscriptgt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
bull Fail or error
bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe
wwwcasabasecuritycom
Root CausesSolutions for Handling the Unexpected
March 2009 copy 2009 Chris Weber
bull Bypass filters WAFrsquos NIDS and validation
bull Exploit delivery techniques
ndash Eg Cross-site scripting (buffer overflow of the Web)
wwwcasabasecuritycom
Attack VectorsFilter evasion
March 2009 copy 2009 Chris Weber
Safari and Firefox BOM consumptionndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting
ndash Root Cause Character deletion
lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt
Can be nastier
lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt
wwwcasabasecuritycom
Case Study Apple and Mozilla
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Safari BOM injection for XSS
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
A Closer Look The BOM
BOMU+FEFF
March 2009 copy 2009 Chris Weber
bull Attackers manipulate casing operations to inject otherwise prohibited characters
bull Casing can multiply the buffer sizes needed
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
toLower(ldquoİrdquo) == ldquoirdquo
toLower(ldquoscrİptrdquo) == ldquoscriptrdquo
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
len(x) = len(toLower(x))
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
bull Perform casing operations before validation
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Casing
March 2009 copy 2009 Chris Weber
bull Incorrect assumptions about string sizes (chars vs bytes)
bull Improper width calculations
bull Impact Enable code execution
wwwcasabasecuritycom
Root CausesBuffer Overflows
March 2009 copy 2009 Chris Weber
Casing - maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
Lower 8 15 Ⱥ U+023A
16 32 1 A U+0041
Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
Normalization- maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
NFC8 3X 119136 U+1D160
16 32 3X ש U+FB2C
NFD8 3X ΐ U+0390
16 32 4X ᾂ U+1F82
NFKCNFKD8 11X
ملسو هيلع هللا ىلص U+FDFA16 32 18X
Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
bull Know the difference between bytes and chars
bull Secure coding
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Buffer Overflows
March 2009 copy 2009 Chris Weber
bull White space and line breaks
ndash Eg when U+180E acts like U+0020
bull Quotation marks
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesControlling Syntax
March 2009 copy 2009 Chris Weber
bull Manipulate HTML parsers and javascriptinterpreters
bull Control protocols
wwwcasabasecuritycom
Attacks and ExploitsControlling syntax
March 2009 copy 2009 Chris Weber
bull Unicode formatter characters exploited for XSS
ndash Damage Filter evasion controlling syntax
ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting
ndash Root Cause Interpreting ldquowhite spacerdquo
ndash A problem with HTML 40 spec
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
lta href=[U+180E]onclick=alert()gt
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Opera White Space Characters
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Case Study Opera
MVSU+180E
March 2009 copy 2009 Chris Weber
bull Question specifications
bull Be carefulhellip
wwwcasabasecuritycom
Root CausesGuidance for Controlling Syntax
March 2009 copy 2009 Chris Weber
1) Character stabilityndash IDNANameprep based on Unicode 32
2) Designsndash Specs are carefully designed but not always perfect
bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of
U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo
ndash HTML 401 bull Defines four whitespace characters and explicitly leaves
handling other characters up to implementer
wwwcasabasecuritycom
Root CausesSpecifications
March 2009 copy 2009 Chris Weber
bull Converting between charsets is dangerous
bull Mapping tables and algorithms vary across platforms
bull Impact Filter evasion Enable code execution Data-loss
wwwcasabasecuritycom
Root CausesCharset Transformations
March 2009 copy 2009 Chris Weber
bull Avoid if possible
bull Use Unicode as the broker
bull Beware the PUA mappings
bull Transform case and normalize prior to validation and redisplay
wwwcasabasecuritycom
Root CausesGuidance for Charset Transformations
March 2009 copy 2009 Chris Weber
bull Some charset identifiers are ill-defined
bull Vendor implementations vary
bull User-agents may sniff if confused
bull Attackers manipulate behavior
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCharset Mismatches
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Root CausesCharset Mismatches
Content-Type charset=ISO-8859-1
ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt
Attacker-controlled input
March 2009 copy 2009 Chris Weber
bull Force UTF-8
bull Error if uncertain
wwwcasabasecuritycom
Root CausesGuidance for Charset Mismatches
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Watcher
ndash Web-app security testing and auditing
bull Visual Spoofing Detection API
ndash Providing guarantees against Visual Spoofing and Homograph attacks
wwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
bull Unicode transformation hot-spotsbull User-controlled HTMLbull Cross-domain issuesbull Insecure cookiesbull Insecure HTTPHTTPS transitionsbull SSL protocol and certificate issuesbull XSS hot-spotsbull Flash issuesbull Silverlight issuesbull Information disclosure
wwwcasabasecuritycom
ToolsWatcher ndash Some of the Passive Checks Included
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
httpwebsecuritytoolcodeplexcom
wwwcasabasecuritycom
ToolsWatcher - Web-app Security Testing and Auditing
March 2009 copy 2009 Chris Weber
bull Problemndash Unicode enables visual-spoofing-maximus
bull Solutionndash Confusable detection
ndash Invisibles detection
ndash Syntax spoof detection
ndash more
wwwcasabasecuritycom
ToolsVisual Spoofing Detection API
March 2009 copy 2009 Chris Weber
bull Cross-platform component library written in C
bull Can be applied in user-agents or any softwarendash Browsers
ndash Email clients
bull Planned for release Fall 2009
bull Email me with questions
wwwcasabasecuritycom
ToolsVisual Spoofing Detection API
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
ToolsVisual Spoofing Protection Demo
Thank you
Contact me with questions new test cases or ideas to share
Visit my website for test cases Unicode and security tools and the Anti-Visual-Spoofing API
Chris WeberwwwlookoutnetCasaba Security
wwwcasabasecuritycom
March 2009 copy 2009 Chris Weber
Classic long URLrsquos
httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements
wwwcasabasecuritycom
Attack VectorsNon-Unicode homograph attacks
March 2009 copy 2009 Chris Weber
The Confusables
ndash Single script
ndash Mixed script
ndash Whole script
wwwcasabasecuritycom
Attack VectorsDefining Homographs
March 2009 copy 2009 Chris Weber
wwwɑpplecom User thinks lsquoarsquo
Really itrsquos Latin small letter Alpha lsquoɑrsquo
wwwlooĸoutnet
User thinks lsquokrsquo
Really itrsquos Latin letter kra lsquoĸrsquo
wwwcasabasecuritycom
Attack VectorsSingle-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwg๐๐glecom User thinks lsquoorsquo
Really itrsquos Thai digit zero lsquo๐rsquo
wwwfaϲebookcom
User thinks lsquocrsquo
Really itrsquos Greek lunate sigma symbol lsquocrsquo
wwwᏀooglecom
Really itrsquos Cherokee letter Nah lsquoᏀrsquo
wwwcasabasecuritycom
Attack VectorsMixed-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwаЬсcom
User thinks lsquoabcrsquo
Really itrsquos Cyrillic script
wwwігѕgov
User thinks lsquoirsrsquo
Really itrsquos Greek script
wwwcasabasecuritycom
Attack VectorsWhole-script and The Confusables
March 2009 copy 2009 Chris Weber
Browsers whitelist ORG
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Others donrsquot necessarily buthellip
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
bull ORG is whitelisted
ndash Limited characters available
bull To unscrutinizing eyes
iacute looks like i
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwmozillaorg is not wwwmoziacutellaorg
Latin U+0069
LatinU+00ED
March 2009 copy 2009 Chris Weber
(This case doesnrsquot work anymore)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
FULLWIDTH SOLIDUSU+FF0F
March 2009 copy 2009 Chris Weber
(Normalized to a U+002F)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
SOLIDUSU+002F
March 2009 copy 2009 Chris Weber
U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
U+FF65 KATAKANA MIDDLE DOT
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with and lookalikes
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes
httpwwwgooglecom
Katakana DotU+FF65
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecomノpathノfilenottrustedorg
Katakana NoU+FF89
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
IDN Visual Spoofing
March 2009 copy 2009 Chris Weber
bull Visual Spoofing Detection API
ndash Detects Confusables
ndash Detects Invisibles
ndash Detections syntax and punctuation lookalikes
ndash Detects combining mark tricks
bull Currently in testing
bull Release planned for Fall 2009
wwwcasabasecuritycom
IDN Visual SpoofingSolutions and Defenses (yes there is one)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
bull Fonts render glyphs confusingly
bull Fonts render glyphs as empty white space
httpwwwgooglecom phreedomorg
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)
A is A (Lucida Sans Unicode Courier New)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Multiple combining marks
o looks like U+006F U+0304
o is U+006F U+0304 U+0304
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Order of combining marksndash ȏ and ouml under NFKC
ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt
ltU+006F U+0311U+0308gt ltU+020F U+0308gt
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
bull httpunicodeorgreportstr9
ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo
ndash forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
March 2009 copy 2009 Chris Weber
Commonly occur in charset transformations and even innocuous APIrsquos
Impact Filter evasion Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When prime becomes
U+2032 PRIME
wwwcasabasecuritycom
Root CausesBest-fit mappings
March 2009 copy 2009 Chris Weber
Net runtime will marshall a string as LPStr to a pinvoke function
How can we best-fit the lt character
bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket
How can we best-fit the s character
bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex
To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]
wwwcasabasecuritycom
Windows best-fit pInvokeBest-fit mappings
March 2009 copy 2009 Chris Weber
bull Scrutinize charactercharset manipulation APIrsquos
bull Use EncoderFallback with SystemTextEncoding
bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
bull Use Unicode end-to-end
wwwcasabasecuritycom
Root CausesGuidance for Best-Fit mappings
March 2009 copy 2009 Chris Weber
bull A popular social networking site in 2008
bull Implemented complex filtering logic to prevent XSS
ndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting
ndash Root Cause best-fit mappings
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
-moz-binding()
was not allowed buthellip
-[U+ff4d]oz-binding()
would best-fit map
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution
NFD - Decompose (canonical)
NFC - Decompose (canonical), then recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), then recompose
İ (U+0130) becomes I (U+0049) + U+0307
Root Causes: Normalization
But are there dangerous characters?
You bet... with NFKC and NFKD you could control HTML or other parsing
﹤ (U+FE64) becomes < (U+003C)
toNFKC("﹤script>") = "<script>"
Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC is a first defense against visual spoofing
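The ordering problem can be sketched in a few lines of Python using the standard unicodedata module. The two function names and the deliberately simplistic "reject any <" check are assumptions for illustration, not a recommended filter.

```python
import unicodedata


def validate_then_normalize(text: str) -> str:
    """Unsafe order: the check runs before NFKC creates the dangerous character."""
    if "<" in text:
        raise ValueError("markup rejected")
    return unicodedata.normalize("NFKC", text)


def normalize_then_validate(text: str) -> str:
    """Safer order: validate the exact string that will be used afterwards."""
    normalized = unicodedata.normalize("NFKC", text)
    if "<" in normalized:
        raise ValueError("markup rejected")
    return normalized
```

Passing "\ufe64script>" to the first function returns "<script>", while the second raises before the string goes any further.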
Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '
Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  - Generation of non-shortest form
  - Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
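To see why C0 A7 is dangerous, compare a deliberately lax decoder with a strict one. This is a sketch: lax_decode_two_byte is a hypothetical stand-in for a decoder that skips the shortest-form check, and strict_decode simply wraps Python's built-in UTF-8 decoder.

```python
def lax_decode_two_byte(seq: bytes) -> str:
    """Hypothetical lax decoder: combines the payload bits of a 2-byte
    sequence without checking that the value actually needed two bytes."""
    lead, trail = seq
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


def strict_decode(seq: bytes) -> str:
    """Strict decoding: ill-formed input raises UnicodeDecodeError."""
    return seq.decode("utf-8", errors="strict")
```

The lax decoder turns C0 A7 into an apostrophe (0x27); the strict decoder raises instead, which is the "throw on error" behavior recommended above.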
Attack Vectors: Directory traversal
How many ways can you say
• Directory traversal test cases
  - httpsiterootsystem
  - Overlong UTF-8, U+002F: httpsiterootC0AEsystem
  - Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  - Division Slash U+2215, best-fit: httpsiteroot E28895system
  - Two Dot Leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
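Two of the test cases above depend on compatibility normalization, which Python's unicodedata module can reproduce. In this sketch the function name and the sample paths are made up for illustration. Note that U+2215 is untouched by NFKC, so the best-fit case needs its own defense (strict charset conversion).

```python
import unicodedata


def contains_traversal(path: str) -> bool:
    """Check for a '..' path segment after NFKC normalization."""
    normalized = unicodedata.normalize("NFKC", path)
    return ".." in normalized.split("/")
```

For example, "/root/\u2025\uff0fsystem" normalizes to "/root/../system" and is flagged, even though the raw string contains neither a full stop nor a solidus in that segment.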
Root Causes: Handling the Unexpected
• Unassigned code points
  - U+2073
• Illegal code points
  - Half a surrogate pair
• Code points with special meaning
  - U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution
Root Causes: Handling the Unexpected - Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>
<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
Root Causes: Handling the Unexpected - Character-substitution
Correcting insecurely rather than failing
  - Substituting a '' or a '' would be bad
Root Causes: Handling the Unexpected - Character-deletion
"deletion of noncharacters" (UTR-36)
<scr[U+FEFF]ipt> becomes <script>
Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  - A common alternative is '' which can be safe
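The difference between failing, substituting U+FFFD, and deleting can be illustrated with Python's built-in UTF-8 decoder, which does not over-consume the byte after an orphaned lead byte. This is a sketch only; the helper names are hypothetical and other decoders may behave differently.

```python
# The ill-formed sequence <41 C2 3E 41> from the over-consumption slide.
ILL_FORMED = bytes([0x41, 0xC2, 0x3E, 0x41])


def decode_or_fail(data: bytes) -> str:
    """Option 1: fail or error on ill-formed input."""
    return data.decode("utf-8", errors="strict")


def decode_with_replacement(data: bytes) -> str:
    """Option 2: replace only the bad byte with U+FFFD; 0x3E ('>') survives."""
    return data.decode("utf-8", errors="replace")


def strip_bom_after_validation(text: str) -> str:
    """Anti-pattern: deleting U+FEFF after a filter ran can join text
    into a token the filter never saw."""
    return text.replace("\ufeff", "")
```

Replacement keeps the '>' in place, so the markup structure seen by the filter and by the consumer stays the same. Deletion, by contrast, turns "<scr\ufeffipt>" into "<script>".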
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  - E.g. cross-site scripting (the buffer overflow of the Web)
Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  - Root Cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>
DEMO: Safari BOM injection for XSS
A Closer Look: The BOM
BOM: U+FEFF
Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution
toLower("İ") == "i"
toLower("scrİpt") == "script"
len(x) != len(toLower(x))
Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET
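Python's full case mappings differ from the simple mapping shown on the slide (there, lowercasing İ yields i followed by U+0307), so this sketch uses U+017F LATIN SMALL LETTER LONG S to show the same ordering flaw. The pipeline names and the blocked word are assumptions for illustration.

```python
BLOCKED = "script"


def unsafe_pipeline(text: str) -> str:
    """Validates first, then changes case: the check never sees the final string."""
    if BLOCKED in text.lower():
        raise ValueError("blocked keyword")
    return text.upper()  # casing applied after validation


def safe_pipeline(text: str) -> str:
    """Case-folds first, then validates the folded string that is actually used."""
    folded = text.casefold()
    if BLOCKED in folded:
        raise ValueError("blocked keyword")
    return folded
```

Here "\u017fcript" passes the first check and comes out as "SCRIPT", while the second pipeline rejects it. The same calls also show the length change: "\u0130".lower() is two code points and "\u0390".upper() is three.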
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution
Root Causes: Buffer Overflows
Casing - maximum expansion factors
Operation   UTF          Factor   Sample
Lower       8            1.5      Ⱥ U+023A
Lower       16, 32       1        A U+0041
Upper       8, 16, 32    3        ΐ U+0390
Source: Unicode Technical Report 36
Root Causes: Buffer Overflows
Normalization - maximum expansion factors
Operation    UTF       Factor   Sample
NFC          8         3X       𝅘𝅥 U+1D160
NFC          16, 32    3X       שּׁ U+FB2C
NFD          8         3X       ΐ U+0390
NFD          16, 32    4X       ᾂ U+1F82
NFKC/NFKD    8         11X      ﷺ U+FDFA
NFKC/NFKD    16, 32    18X      ﷺ U+FDFA
Source: Unicode Technical Report 36
Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET
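The expansion factors quoted from UTR 36 can be reproduced by measuring encoded sizes before and after a transformation. This is a minimal sketch; the expansion helper is a made-up name, and "utf-16-le" is used so that no byte order mark is counted.

```python
import unicodedata


def expansion(original: str, transformed: str, encoding: str) -> float:
    """Ratio of encoded byte lengths after versus before a transformation."""
    return len(transformed.encode(encoding)) / len(original.encode(encoding))


# U+FDFA under NFKC: one code point becomes eighteen.
ligature = "\ufdfa"
ligature_nfkc = unicodedata.normalize("NFKC", ligature)
nfkc_utf8 = expansion(ligature, ligature_nfkc, "utf-8")
nfkc_utf16 = expansion(ligature, ligature_nfkc, "utf-16-le")

# Casing samples from the table: U+0390 uppercased, U+023A lowercased.
upper_utf8 = expansion("\u0390", "\u0390".upper(), "utf-8")
lower_utf8 = expansion("\u023a", "\u023a".lower(), "utf-8")
```

A buffer sized from the input length alone is too small in each of these cases, which is why the output size should be computed (or bounded by the worst-case factor) after the transformation.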
Root Causes: Controlling Syntax
• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution
Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Case Study: Opera
• Unicode formatter characters exploited for XSS
  - Damage: Filter evasion, controlling syntax
  - Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root Cause: Interpreting "white space"
  - A problem with the HTML 4.0 spec
<a href=[U+180E]onclick=alert()>
DEMO: Opera White Space Characters
MVS: U+180E
Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful...
Root Causes: Specifications
1) Character stability
  - IDNA/Nameprep is based on Unicode 3.2
2) Designs
  - Specs are carefully designed but not always perfect
• This could have been a problem
  - "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  - HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling of other characters up to the implementer
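Because specifications leave "other" separators to the implementer, one defensive option is to flag every separator or format character outside an explicit allow-list before the data reaches a parser. This is a sketch: the allow-list and the set of general categories checked are assumptions to tune per application, not taken from any specification.

```python
import unicodedata

# Assumption: the only whitespace this application intends to accept.
EXPLICIT_WHITESPACE = {" ", "\t", "\n", "\r", "\f"}

# Space, line, and paragraph separators, format characters, and controls.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}


def suspicious_separators(text: str) -> list:
    """List code points that may act as white space but are not allow-listed."""
    hits = []
    for ch in text:
        if ch in EXPLICIT_WHITESPACE:
            continue
        if unicodedata.category(ch) in SUSPECT_CATEGORIES:
            hits.append(f"U+{ord(ch):04X}")
    return hits
```

Checking both Zs and Cf matters for U+180E in particular, since its general category has differed between Unicode versions.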
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data loss
Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input
Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
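A short sketch shows why a mismatch between the declared and the sniffed charset matters. The byte string below is an illustrative example that is not from the slides: it reads as an escaped quote under ISO-8859-1, but under Shift_JIS the bytes 83 5C form a single Katakana letter and the quote is no longer escaped.

```python
# Illustrative bytes (an assumption, not from the talk): 0x83, backslash, quote.
RAW = b'\x83\\"'


def decode_as(data: bytes, charset: str) -> str:
    """Decode the same bytes under a caller-chosen charset."""
    return data.decode(charset)


as_latin1 = decode_as(RAW, "iso-8859-1")  # three characters; the quote is escaped
as_shift_jis = decode_as(RAW, "shift_jis")  # two characters; the backslash is gone
```

A filter that reasons about the ISO-8859-1 reading while the user-agent applies Shift_JIS is validating a different string from the one that gets parsed, which is the argument for forcing a single encoding such as UTF-8.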
Tools
• Watcher
  - Web-app security testing and auditing
• Visual Spoofing Detection API
  - Providing guarantees against visual spoofing and homograph attacks
Tools: Watcher - Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher - Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com
Tools: Visual Spoofing Detection API
• Problem
  - Unicode enables visual-spoofing-maximus
• Solution
  - Confusable detection
  - Invisibles detection
  - Syntax spoof detection
  - more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  - Browsers
  - Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net, Casaba Security
Attack Vectors: Defining Homographs
The Confusables
  - Single script
  - Mixed script
  - Whole script
Attack Vectors: Single-script and The Confusables
www.ɑpple.com
User thinks 'a'
Really it's Latin small letter alpha 'ɑ'
www.looĸout.net
User thinks 'k'
Really it's Latin small letter kra 'ĸ'
Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com
User thinks 'o'
Really it's Thai digit zero '๐'
www.faϲebook.com
User thinks 'c'
Really it's Greek lunate sigma symbol 'ϲ'
www.Ꮐoogle.com
Really it's Cherokee letter nah 'Ꮐ'
Attack Vectors: Whole-script and The Confusables
www.аЬс.com
User thinks 'abc'
Really it's Cyrillic script
www.ігѕ.gov
User thinks 'irs'
Really it's Greek script
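A rough way to catch the mixed-script cases is to look at which scripts occur in a label. Python's standard library has no script property, so this sketch guesses the script from the first word of each character's Unicode name. That heuristic and the helper names are assumptions, and a real implementation would use the Unicode Scripts data (for example through ICU).

```python
import unicodedata


def scripts_in(label: str) -> set:
    """Heuristic script guess: first word of each character's Unicode name."""
    found = set()
    for ch in label:
        # ASCII digits, hyphens and dots are shared by all scripts; skip them.
        if ch.isascii() and (ch.isdigit() or ch in "-."):
            continue
        name = unicodedata.name(ch, "")
        if name:
            found.add(name.split()[0])
    return found


def is_mixed_script(label: str) -> bool:
    """True when characters from more than one script appear in the label."""
    return len(scripts_in(label)) > 1
```

This flags the Thai, Greek, and Cherokee examples above. It does not flag single-script or whole-script confusables, which need a confusables table rather than a script check.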
Attack Vectors: IDN homograph attacks
Browsers whitelist .ORG
Others don't necessarily, but...
• .ORG is whitelisted
  - Limited characters available
• To unscrutinizing eyes
  í looks like i
www.mozilla.org is not www.mozílla.org
Latin i: U+0069
Latin í: U+00ED
Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F
(This case doesn't work anymore)
http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F
(Normalized to a U+002F)
Attack Vectors: IDN Syntax Spoofing with lookalikes
╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL LETTER AV
ꜹ U+A739 LATIN SMALL LETTER AV
･ U+FF65 KATAKANA MIDDLE DOT
Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
http://www･google･com
Katakana Dot U+FF65
(However, punctuation is not required...)
Attack Vectors: IDN Syntax Spoofing with lookalikes
http://www.google.comノpathノfile.nottrusted.org
Katakana No U+FF89
Browser sees and displays a valid IDN
DNS sees Punycode
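The gap between what the browser displays and what DNS resolves can be seen with Python's built-in idna codec, which implements the older IDNA 2003 rules. This is a sketch; the to_dns_form helper is a made-up name, and the hostname is the example from the earlier homograph slide.

```python
def to_dns_form(hostname: str) -> bytes:
    """Convert a Unicode hostname to the ASCII (Punycode) form sent to DNS."""
    return hostname.encode("idna")


real = to_dns_form("www.mozilla.org")
spoof = to_dns_form("www.moz\u00edlla.org")
```

The two names look nearly identical on screen, but the spoofed one resolves as an "xn--" label, so comparing or displaying the DNS form is one way to make the difference visible.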
DEMO: IDN Visual Spoofing
IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
  - Detects confusables
  - Detects invisibles
  - Detects syntax and punctuation lookalikes
  - Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009
Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
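Detecting the invisibles listed above is mostly a matter of scanning for them explicitly. The sketch below also reports any other character in the Cf (format) general category; that extra rule and the helper name are assumptions, not a description of the Visual Spoofing Detection API.

```python
import unicodedata

# The three invisibles named on the slide.
INVISIBLES = {"\u200b", "\u180e", "\ufeff"}


def find_invisibles(text: str) -> list:
    """Return (index, code point) pairs for invisible or format characters."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if ch in INVISIBLES or unicodedata.category(ch) == "Cf"
    ]
```

Reporting positions rather than silently deleting the characters keeps the decision (reject, escape, or display a warning) with the caller.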
Attack Vectors: Visual Spoofing - Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode Courier New)
Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
ō looks like U+006F U+0304
ō̄ is U+006F U+0304 U+0304
• Order of combining marks
  - ȏ and ö under NFKC
<U+006F U+0308 U+0311> → <U+00F6 U+0311>
<U+006F U+0311 U+0308> → <U+020F U+0308>
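Both combining-mark tricks can be checked with unicodedata. This is a minimal sketch with a hypothetical helper for immediately repeated marks. NFC is used for the order-dependence check; these two sequences contain no compatibility characters, so NFKC gives the same result.

```python
import unicodedata


def nfc(text: str) -> str:
    """Canonical composition."""
    return unicodedata.normalize("NFC", text)


def has_repeated_mark(text: str) -> bool:
    """Flag a combining mark that is immediately repeated after decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return any(
        a == b and unicodedata.combining(a) != 0
        for a, b in zip(decomposed, decomposed[1:])
    )
```

Because U+0308 and U+0311 share a combining class, normalization does not reorder them, so the two input orders stay distinct strings even though they may render alike.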
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)
Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  - "These characters are to be avoided wherever possible, because of security concerns"
  - Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
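The two override characters can be detected through their bidirectional class, which unicodedata exposes directly. This is a sketch; the helper name and the sample string in the note below are made up for illustration.

```python
import unicodedata

# Bidi classes of U+202D and U+202E respectively.
OVERRIDE_CLASSES = {"LRO", "RLO"}


def has_bidi_override(text: str) -> bool:
    """True if the text contains an explicit directional override."""
    return any(unicodedata.bidirectional(ch) in OVERRIDE_CLASSES for ch in text)
```

For example, has_bidi_override("abc\u202egpj.exe") is True, which lets an application reject or visibly escape such input instead of rendering it reversed.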
wwwɑpplecom User thinks lsquoarsquo
Really itrsquos Latin small letter Alpha lsquoɑrsquo
wwwlooĸoutnet
User thinks lsquokrsquo
Really itrsquos Latin letter kra lsquoĸrsquo
wwwcasabasecuritycom
Attack VectorsSingle-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwg๐๐glecom User thinks lsquoorsquo
Really itrsquos Thai digit zero lsquo๐rsquo
wwwfaϲebookcom
User thinks lsquocrsquo
Really itrsquos Greek lunate sigma symbol lsquocrsquo
wwwᏀooglecom
Really itrsquos Cherokee letter Nah lsquoᏀrsquo
wwwcasabasecuritycom
Attack VectorsMixed-script and The Confusables
March 2009 copy 2009 Chris Weber
wwwаЬсcom
User thinks lsquoabcrsquo
Really itrsquos Cyrillic script
wwwігѕgov
User thinks lsquoirsrsquo
Really itrsquos Greek script
wwwcasabasecuritycom
Attack VectorsWhole-script and The Confusables
March 2009 copy 2009 Chris Weber
Browsers whitelist ORG
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Others donrsquot necessarily buthellip
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
bull ORG is whitelisted
ndash Limited characters available
bull To unscrutinizing eyes
iacute looks like i
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwmozillaorg is not wwwmoziacutellaorg
Latin U+0069
LatinU+00ED
March 2009 copy 2009 Chris Weber
(This case doesnrsquot work anymore)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
FULLWIDTH SOLIDUSU+FF0F
March 2009 copy 2009 Chris Weber
(Normalized to a U+002F)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
SOLIDUSU+002F
March 2009 copy 2009 Chris Weber
U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
U+FF65 KATAKANA MIDDLE DOT
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with and lookalikes
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes
httpwwwgooglecom
Katakana DotU+FF65
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecomノpathノfilenottrustedorg
Katakana NoU+FF89
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
IDN Visual Spoofing
March 2009 copy 2009 Chris Weber
bull Visual Spoofing Detection API
ndash Detects Confusables
ndash Detects Invisibles
ndash Detections syntax and punctuation lookalikes
ndash Detects combining mark tricks
bull Currently in testing
bull Release planned for Fall 2009
wwwcasabasecuritycom
IDN Visual SpoofingSolutions and Defenses (yes there is one)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
bull Fonts render glyphs confusingly
bull Fonts render glyphs as empty white space
httpwwwgooglecom phreedomorg
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)
A is A (Lucida Sans Unicode Courier New)
Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
  ō looks like U+006F U+0304
  ō is U+006F U+0304 U+0304
• Order of combining marks
  – ȏ and ö under NFKC
    <U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
    <U+006F U+0311 U+0308> becomes <U+020F U+0308>
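The ordering effect above can be reproduced with Python's standard `unicodedata` module. This is a small sketch; `code_points` is a hypothetical helper for printing.

```python
import unicodedata

def code_points(s):
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)

# Same base letter and the same two marks, in different order.
first = unicodedata.normalize("NFKC", "o\u0308\u0311")
second = unicodedata.normalize("NFKC", "o\u0311\u0308")

print(code_points(first))
print(code_points(second))
print(first == second)  # the two strings stay distinct after NFKC
```

Both marks have the same combining class, so normalization does not reorder them, and the two visually similar strings compare as different.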
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)
Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  – “These characters are to be avoided wherever possible, because of security concerns”
  – Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
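One way to flag the two override characters is to ask the Unicode database for each character's bidirectional class. This is a sketch only; `has_bidi_override` is a hypothetical name.

```python
import unicodedata

def has_bidi_override(text):
    """True if text contains U+202D (class LRO) or U+202E (class RLO)."""
    return any(unicodedata.bidirectional(ch) in ("LRO", "RLO") for ch in text)

print(has_bidi_override("invoice\u202etxt.exe"))
print(has_bidi_override("invoice.txt"))
```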
Root Causes: Best-fit mappings
Best-fit mappings commonly occur in charset transformations and even in innocuous APIs.
Impact: Filter evasion, enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
U+2032 PRIME
Windows best-fit P/Invoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function.
How can we best-fit the < character?
• U+2329 (Left-Pointing Angle Bracket) maps to U+003C
• U+3008 (Left Angle Bracket) maps to U+003C
How can we best-fit the s character?
• U+015B (Latin Small Letter S With Acute) maps to U+0073
• U+015D (Latin Small Letter S With Circumflex) maps to U+0073
To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]
Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
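The guidance above names .NET and Win32 APIs. As a loose Python analogue (an assumption for illustration, not those APIs), the same "fail instead of best-fit" policy amounts to encoding with strict error handling, so an unmappable character raises rather than being swapped for a lookalike. `encode_strict` is a hypothetical helper.

```python
def encode_strict(text, charset="cp1252"):
    """Encode to a legacy charset, raising instead of substituting."""
    # errors="strict" is Python's default; spelling it out documents intent.
    return text.encode(charset, errors="strict")

print(encode_strict("script"))
try:
    encode_strict("\u2329script")  # U+2329 has no cp1252 equivalent
except UnicodeEncodeError as exc:
    print("rejected:", exc.reason)
```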
Case Study: Social Networking, Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: Best-fit mappings
-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map
Root Causes: Normalization
Normalizing strings after validation is dangerous.
Impact: Filter evasion, enable code execution
NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose
İ (U+0130) becomes I + combining dot above (U+0049 U+0307)
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing.
﹤ (U+FE64) becomes < (U+003C)
toNFKC(“﹤script>”) = “<script>”
Root Causes: Guidance for Normalization
• Normalize strings before validation
• NFKC is a first defense against visual spoofing
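A minimal sketch of "normalize, then validate" using Python's `unicodedata`. The forbidden-character set and the function name `is_safe` are assumptions chosen for illustration, not a complete XSS filter.

```python
import unicodedata

FORBIDDEN = set("<>\"'&")  # illustrative only

def is_safe(text):
    """Normalize to NFKC first, then check for forbidden characters."""
    normalized = unicodedata.normalize("NFKC", text)
    return not any(ch in FORBIDDEN for ch in normalized)

payload = "\ufe64script>"  # starts with U+FE64 SMALL LESS-THAN SIGN

# A check on the raw string misses the bracket that appears after NFKC.
print("<" in payload)
print(unicodedata.normalize("NFKC", payload))
print(is_safe(payload))
```

Checking the raw string and normalizing afterwards would let the payload through; reversing the order closes that gap.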
Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '
Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
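"Validate and throw on error" can be sketched with Python's built-in UTF-8 decoder, which rejects overlong sequences in strict mode. `decode_utf8_strict` is a hypothetical wrapper name.

```python
def decode_utf8_strict(data: bytes) -> str:
    """Decode UTF-8, raising UnicodeDecodeError on any ill-formed sequence."""
    return data.decode("utf-8", errors="strict")

print(repr(decode_utf8_strict(b"\x27")))  # the shortest form of the apostrophe
try:
    decode_utf8_strict(b"\xc0\xa7")  # overlong encoding of the same character
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```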
Attack Vectors: Directory traversal
How many ways can you say
Attack Vectors
• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8 U+002F
    httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, normalized (KC or KD)
    httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit
    httpsiteroot E28895system
  – Two dot leader U+2025, normalized (KC or KD)
    httpsiterootE280A5 E280A5 system
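The byte sequences in the test cases above are the UTF-8 encodings of the named code points, and the "normalized" cases depend on compatibility normalization. A small sketch to confirm both (`describe` is a hypothetical helper):

```python
import unicodedata

def describe(cp):
    """Return (UTF-8 bytes as hex, NFKC form) for a code point."""
    ch = chr(cp)
    return ch.encode("utf-8").hex().upper(), unicodedata.normalize("NFKC", ch)

for cp in (0xFF0F, 0x2215, 0x2025):
    utf8_hex, nfkc = describe(cp)
    print(f"U+{cp:04X} utf-8={utf8_hex} nfkc={nfkc!r}")
```

U+FF0F and U+2025 fold to `/` and `..` under NFKC, while U+2215 is left alone by normalization and only becomes a slash through a best-fit mapping.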
Root Causes: Handling the Unexpected
• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, enable code execution
Root Causes: Handling the Unexpected (Over-consumption)
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>
<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
Root Causes: Handling the Unexpected (Character-substitution)
Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad
Root Causes: Handling the Unexpected (Character-deletion)
“deletion of noncharacters” (UTR-36)
<scr[U+FEFF]ipt> becomes <script>
Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
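The two recommended behaviours, failing or substituting U+FFFD, can be sketched with Python's UTF-8 decoder on the ill-formed sequence from the over-consumption slide. The helper names are hypothetical.

```python
ILL_FORMED = bytes([0x41, 0xC2, 0x3E, 0x41])  # "A", stray lead byte, ">", "A"

def decode_or_fail(data):
    """Option 1: raise on ill-formed input."""
    return data.decode("utf-8")  # strict by default

def decode_with_replacement(data):
    """Option 2: replace each ill-formed subsequence with U+FFFD."""
    return data.decode("utf-8", errors="replace")

print(repr(decode_with_replacement(ILL_FORMED)))
try:
    decode_or_fail(ILL_FORMED)
except UnicodeDecodeError as exc:
    print("rejected at byte", exc.start)
```

Only the stray lead byte is replaced here; the `>` that follows it survives, which is the opposite of the over-consumption behaviour shown above.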
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)
Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')">
Can be nastier:
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">
DEMO: Safari BOM injection for XSS
A Closer Look: The BOM
BOM: U+FEFF
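A sketch of why deleting U+FEFF after filtering is dangerous, and of one possible defense: rejecting any U+FEFF inside the value. Both filter functions are hypothetical illustrations, not a real sanitizer.

```python
BOM = "\ufeff"

def naive_filter_allows(url):
    """Allow anything that does not literally contain 'javascript:'."""
    return "javascript:" not in url.lower()

def strict_filter_allows(url):
    """Reject a BOM anywhere in the value before applying the naive check."""
    if BOM in url:
        return False
    return naive_filter_allows(url)

payload = "java\ufeffscript:alert('XSS')"

print(naive_filter_allows(payload))   # the filter sees no 'javascript:'
print(payload.replace(BOM, ""))       # what a BOM-deleting consumer ends up with
print(strict_filter_allows(payload))
```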
Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, enable code execution
toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”
len(x) != len(toLower(x))
Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
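The length change and the character injection can both be observed with Python's built-in case mappings. Note that results differ between frameworks: Python maps İ to `i` plus U+0307 rather than to a bare `i` as on the slide. `lower_then_check` is a hypothetical helper showing the "case first, validate second" order.

```python
def lower_then_check(text, banned="k"):
    """Lowercase first, then validate the lowercased result."""
    lowered = text.lower()
    return lowered, banned not in lowered

dotted_i = "\u0130"     # LATIN CAPITAL LETTER I WITH DOT ABOVE
kelvin = "\u212a"       # KELVIN SIGN, lowercases to ASCII "k"

print(len(dotted_i), len(dotted_i.lower()))  # length changes under lowercasing
print("k" in kelvin, lower_then_check(kelvin))
print("\u00df".upper())                      # one character becomes two
```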
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution
Root Causes: Buffer Overflows
Casing - maximum expansion factors
Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
            16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390
Source: Unicode Technical Report 36
Root Causes: Buffer Overflows
Normalization - maximum expansion factors
Operation   UTF         Factor   Sample
NFC         8           3X       𝅘𝅥𝅮 U+1D160
            16, 32      3X       שּׁ U+FB2C
NFD         8           3X       ΐ U+0390
            16, 32      4X       ᾂ U+1F82
NFKC/NFKD   8           11X      ﷺ U+FDFA
            16, 32      18X      ﷺ U+FDFA
Source: Unicode Technical Report 36
Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
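Some of the expansion factors in the tables above can be checked directly. This sketch measures growth in code points and in UTF-8 bytes; `expansion` is a hypothetical helper.

```python
import unicodedata

def expansion(ch, form):
    """Return (code points after normalization, UTF-8 byte growth factor)."""
    out = unicodedata.normalize(form, ch)
    factor = len(out.encode("utf-8")) / len(ch.encode("utf-8"))
    return len(out), factor

print(expansion("\u0390", "NFD"))    # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
print(expansion("\u1f82", "NFD"))
print(expansion("\ufdfa", "NFKC"))   # a single code point becomes 18
```

Sizing an output buffer from the input length alone therefore underestimates what normalization can produce.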
Root Causes: Controlling Syntax
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, enable code execution
Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Case Study: Opera
• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with the HTML 4.0 spec
<a href=[U+180E]onclick=alert()>
DEMO: Opera White Space Characters
Case Study: Opera
MVS: U+180E
Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…
Root Causes: Specifications
1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer
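One defensive reading of the HTML 4.01 point is to treat only the four whitespace characters that specification defines (space, tab, form feed, zero width space) as separators and to flag any other space-like or format character. This is a sketch under that assumption; `unexpected_separators` is a hypothetical name.

```python
import unicodedata

# The four white space characters defined by HTML 4.01, section 9.1.
HTML4_WHITESPACE = {"\u0020", "\u0009", "\u000c", "\u200b"}

def unexpected_separators(markup):
    """List space-like or format characters outside the HTML 4.01 set."""
    found = []
    for ch in markup:
        if ch in HTML4_WHITESPACE or ch in "\r\n":
            continue
        # Zs/Zl/Zp are separators; Cf is format. U+180E has been both Zs and Cf.
        if unicodedata.category(ch) in ("Zs", "Zl", "Zp", "Cf"):
            found.append(f"U+{ord(ch):04X}")
    return found

print(unexpected_separators("<a href=\u180eonclick=alert()>"))
print(unexpected_separators("<a href=x onclick=y>"))
```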
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, enable code execution, data loss
Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay
Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, enable code execution
Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input
Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
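A sketch of why a header/meta disagreement matters, and of the "force UTF-8, error if uncertain" policy. The same two bytes are one harmless character under Shift_JIS but include a backslash under ISO-8859-1. `decode_declared_utf8` is a hypothetical helper.

```python
data = b"\x83\x5c"

as_sjis = data.decode("shift_jis")     # one katakana character
as_latin1 = data.decode("iso-8859-1")  # a control character plus a backslash

def decode_declared_utf8(body: bytes, declared: str) -> str:
    """Accept only content declared as UTF-8 that is also well-formed UTF-8."""
    if declared.strip().lower() != "utf-8":
        raise ValueError(f"unsupported charset: {declared!r}")
    return body.decode("utf-8")  # strict: raises on ill-formed input

print(repr(as_sjis), repr(as_latin1))
print(decode_declared_utf8("caf\u00e9".encode("utf-8"), "UTF-8"))
```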
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
Tools
• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against visual spoofing and homograph attacks
Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher - Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com
Tools: Visual Spoofing Detection API
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share.
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.
Chris Weber, www.lookout.net, Casaba Security
www.casabasecurity.com
Attack Vectors: Mixed-script and The Confusables
www.g๐๐gle.com
User thinks ‘o’
Really it’s Thai digit zero ‘๐’
www.faϲebook.com
User thinks ‘c’
Really it’s Greek lunate sigma symbol ‘ϲ’
www.Ꮐoogle.com
Really it’s Cherokee letter Nah ‘Ꮐ’
Attack Vectors: Whole-script and The Confusables
www.аЬс.com
User thinks ‘abc’
Really it’s Cyrillic script
www.ігѕ.gov
User thinks ‘irs’
Really it’s Cyrillic script
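A rough mixed-script check for the examples above. Python's standard library does not expose the Unicode Script property, so this sketch uses the first word of each character's Unicode name as a stand-in; that heuristic and the function names are assumptions, not a real confusables implementation.

```python
import unicodedata

def scripts_in_label(label):
    """Approximate the set of scripts in a label from Unicode character names."""
    scripts = set()
    for ch in label:
        if ch.isascii() and not ch.isalpha():
            continue  # skip ASCII digits and hyphens
        scripts.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return scripts

def is_mixed_script(label):
    return len(scripts_in_label(label)) > 1

print(scripts_in_label("fa\u03f2ebook"))         # Latin plus Greek lunate sigma
print(is_mixed_script("g\u0e50\u0e50gle"))       # Latin plus Thai digits
print(is_mixed_script("\u0430\u042c\u0441"))     # whole-script Cyrillic
```

The last case shows the limit of this approach: a whole-script spoof uses a single script, so it passes a mixed-script check and needs confusable detection instead.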
Attack Vectors: IDN homograph attacks
Browsers whitelist .ORG
Others don’t necessarily, but…
• .ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes
  í looks like i
www.mozilla.org is not www.mozílla.org
Latin i: U+0069
Latin í: U+00ED
Attack Vectors: IDN Syntax Spoofing with lookalikes
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS: U+FF0F
(This case doesn’t work anymore)
Attack Vectors: IDN Syntax Spoofing with lookalikes
http://www.google.com/path/file.nottrusted.org
SOLIDUS: U+002F
(Normalized to a U+002F)
Attack Vectors: IDN Syntax Spoofing with and lookalikes
╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT
wwwаЬсcom
User thinks lsquoabcrsquo
Really itrsquos Cyrillic script
wwwігѕgov
User thinks lsquoirsrsquo
Really itrsquos Greek script
wwwcasabasecuritycom
Attack VectorsWhole-script and The Confusables
March 2009 copy 2009 Chris Weber
Browsers whitelist ORG
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
Others donrsquot necessarily buthellip
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weber
bull ORG is whitelisted
ndash Limited characters available
bull To unscrutinizing eyes
iacute looks like i
wwwcasabasecuritycom
Attack VectorsIDN homograph attacks
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN homograph attacks
wwwmozillaorg is not wwwmoziacutellaorg
Latin U+0069
LatinU+00ED
March 2009 copy 2009 Chris Weber
(This case doesnrsquot work anymore)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
FULLWIDTH SOLIDUSU+FF0F
March 2009 copy 2009 Chris Weber
(Normalized to a U+002F)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
SOLIDUSU+002F
March 2009 copy 2009 Chris Weber
U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
U+FF65 KATAKANA MIDDLE DOT
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with and lookalikes
Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
http://www･google･com
  – Katakana Dot U+FF65
(However, punctuation is not required…)
Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.comノpathノfile.nottrusted.org
  – Katakana No U+FF89
Attack Vectors: IDN Syntax Spoofing with / lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
DEMO: IDN Visual Spoofing
IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
  – Detects Confusables
  – Detects Invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009
Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
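A short sketch of how the three invisible characters listed on this slide could be located in a string. The table and the function name `find_invisibles` are illustrative assumptions; a fuller check would cover every default-ignorable code point.

```python
# Only the three code points named on the slide; not an exhaustive list.
INVISIBLES = {
    0x200B: "ZERO WIDTH SPACE",
    0x180E: "MONGOLIAN VOWEL SEPARATOR",
    0xFEFF: "ZERO WIDTH NO-BREAK SPACE",
}

def find_invisibles(text):
    """Return (index, 'U+XXXX') for each listed invisible character in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if ord(ch) in INVISIBLES
    ]

print(find_invisibles("goo\u200bgle"))
```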
Attack Vectors: Visual Spoofing - Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)
Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
  – ō looks like <U+006F U+0304>
  – ō is <U+006F U+0304 U+0304>
• Order of combining marks
  – ȏ and ö under NFKC
  – <U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
  – <U+006F U+0311 U+0308> becomes <U+020F U+0308>
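The effect of mark order under NFKC can be reproduced with Python's standard `unicodedata` module. This is only a sketch; `nfkc_codepoints` is a hypothetical helper that prints the result as code points so the difference is visible even when the glyphs look alike.

```python
import unicodedata

def nfkc_codepoints(text):
    """Normalize to NFKC and list the resulting code points."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFKC", text)]

# Same base letter and the same two marks, in a different order.
print(nfkc_codepoints("o\u0308\u0311"))
print(nfkc_codepoints("o\u0311\u0308"))
```

Because both marks share the same combining class, normalization does not reorder them, so the two inputs stay distinct strings that may render almost identically.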
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)
Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  – "These characters are to be avoided wherever possible because of security concerns"
  – Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
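One simple defense is to refuse the two override characters outright wherever they are not needed. A minimal sketch; the function name and the sample file name are made up for illustration.

```python
# The two explicit directional overrides named on the slide.
BIDI_OVERRIDES = {"\u202d", "\u202e"}

def contains_bidi_override(text):
    """True if text contains U+202D or U+202E."""
    return any(ch in BIDI_OVERRIDES for ch in text)

# Hypothetical file name: the override makes the tail render reversed,
# so a viewer may show something that appears to end in "exe.jpg".
suspicious = "photo\u202egpj.exe"
print(contains_bidi_override(suspicious))
```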
Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, Enable code execution
When σ becomes s
  – U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
  – U+2032 PRIME
Windows best-fit pInvoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)
How can we best-fit the s character?
• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)
To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]
Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
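The same "fail instead of substituting" idea can be sketched outside of .NET and Win32. The example below uses Python, whose standard codecs do not perform best-fit substitution; with `errors="strict"` an unmappable character raises an error rather than being quietly replaced. The function name `to_legacy_bytes` is an assumption for illustration.

```python
def to_legacy_bytes(text, charset="cp1252"):
    """Encode to a legacy charset, failing loudly on unmappable characters."""
    # errors="strict" raises UnicodeEncodeError instead of substituting.
    return text.encode(charset, errors="strict")

print(to_legacy_bytes("abc"))
try:
    to_legacy_bytes("\u2329script")  # U+2329 has no cp1252 equivalent
except UnicodeEncodeError as exc:
    print("rejected:", exc.reason)
```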
Case Study: Social Networking, Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: best-fit mappings
-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map
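A sketch of why the filter in this case study could be bypassed. The site's real filter and mapping tables are not known; here NFKC normalization stands in for the lossy best-fit mapping that happened downstream, and all function names are hypothetical.

```python
import unicodedata

BLOCKED = "-moz-binding"

def naive_filter_allows(css):
    """Check the raw input only, before any character mapping is applied."""
    return BLOCKED not in css.lower()

def fold(css):
    # Assumption: NFKC is used here as a stand-in for the best-fit mapping.
    return unicodedata.normalize("NFKC", css)

def safer_filter_allows(css):
    """Apply the mapping first, then check the result."""
    return BLOCKED not in fold(css).lower()

payload = "-\uff4doz-binding()"  # U+FF4D FULLWIDTH LATIN SMALL LETTER M
print(naive_filter_allows(payload), safer_filter_allows(payload))
```

The point is the ordering: whatever transformation the data will go through later has to be applied before the check, not after it.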
Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution
• NFD: Decompose (canonical)
• NFC: Decompose (canonical), Recompose
• NFKD: Decompose (compatibility)
• NFKC: Decompose (compatibility), Recompose
İ becomes I + U+0307
  – U+0130 becomes U+0049 U+0307
Root Causes: Normalization
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
  – U+FE64 becomes U+003C
toNFKC("﹤script>") = "<script>"
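This mapping is easy to confirm with Python's standard `unicodedata` module. A small sketch; `to_nfkc` simply mirrors the `toNFKC` notation used on the slide.

```python
import unicodedata

def to_nfkc(text):
    """Apply compatibility decomposition followed by recomposition."""
    return unicodedata.normalize("NFKC", text)

small_lt = "\ufe64"  # SMALL LESS-THAN SIGN
print(to_nfkc(small_lt + "script>"))

# The canonical forms leave the character alone; only NFKC/NFKD map it.
print(unicodedata.normalize("NFC", small_lt) == small_lt)
```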
Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC: first defense against Visual spoofing
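A minimal sketch of the normalize-then-validate ordering. The forbidden set and the function name `validate` are illustrative assumptions, not a complete input filter.

```python
import unicodedata

# Illustrative only: characters a hypothetical HTML context must not receive.
FORBIDDEN = set("<>\"'&")

def validate(text):
    """Normalize first, then validate the normalized form, and return it."""
    normalized = unicodedata.normalize("NFKC", text)
    if any(ch in FORBIDDEN for ch in normalized):
        raise ValueError("forbidden character after normalization")
    return normalized

print(validate("hello"))
try:
    validate("\ufe64script")  # U+FE64 normalizes to '<'
except ValueError as exc:
    print("rejected:", exc)
```

Returning the normalized string matters: the caller should keep using the form that was actually checked.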
Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '
Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
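The C0 A7 case can be sketched in a few lines. `naive_two_byte_decode` is a deliberately lenient toy decoder (an assumption, not taken from any real product) that shows how the overlong pair collapses to 0x27, while a strict decoder throws.

```python
def naive_two_byte_decode(b0, b1):
    """Toy decoder: combines the payload bits without checking for overlong forms."""
    return ((b0 & 0x1F) << 6) | (b1 & 0x3F)

def decode_utf8_strict(raw):
    """Decode UTF-8 and raise UnicodeDecodeError on any ill-formed sequence."""
    return raw.decode("utf-8", errors="strict")

print(hex(naive_two_byte_decode(0xC0, 0xA7)))  # the apostrophe's code point
try:
    decode_utf8_strict(b"\xc0\xa7")
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```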
Attack Vectors: Directory traversal
How many ways can you say
• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8, U+002F: httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
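The byte sequences in these test cases can be decoded and normalized to see which ones turn into path syntax. A sketch using Python's standard library; the `CASES` table and the helper `after_nfkc` are illustrative names.

```python
import unicodedata

# UTF-8 byte sequences quoted in the test cases above.
CASES = {
    "fullwidth solidus": bytes.fromhex("EFBC8F"),  # U+FF0F
    "division slash": bytes.fromhex("E28895"),     # U+2215
    "two dot leader": bytes.fromhex("E280A5"),     # U+2025
}

def after_nfkc(raw):
    """Decode strictly as UTF-8, then apply NFKC normalization."""
    return unicodedata.normalize("NFKC", raw.decode("utf-8"))

for name, raw in CASES.items():
    print(name, repr(after_nfkc(raw)))
```

Normalization alone leaves the division slash untouched, which is consistent with the slide listing it under best-fit rather than under normalization.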
Root Causes: Handling the Unexpected
• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution
Root Causes: Handling the Unexpected - Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>
<img src=[0xC2]> onerror=alert(1)<br >
becomes
<img src=> onerror=alert(1)<br >
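The byte example above can be modelled with a toy decoder. `overconsuming_decode` below is a made-up function that reproduces the bug on purpose (it swallows the byte after a lead byte without checking it), and it is contrasted with Python's own UTF-8 decoder in `replace` mode.

```python
raw = bytes([0x41, 0xC2, 0x3E, 0x41])  # 'A', stray lead byte, '>', 'A'

def overconsuming_decode(data):
    """Toy model of the bug: a lead byte swallows the next byte unchecked."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] >= 0x80:
            i += 2  # BUG (intentional): skips the lead byte and its neighbour
            continue
        out.append(data[i])
        i += 1
    return out.decode("ascii")

def careful_decode(data):
    """Replace only the ill-formed byte with U+FFFD and keep what follows."""
    return data.decode("utf-8", errors="replace")

print(repr(overconsuming_decode(raw)))
print(repr(careful_decode(raw)))
```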
Root Causes: Handling the Unexpected - Character-substitution
Correcting insecurely rather than failing
  – Substituting a '' or a '' would be bad
Root Causes: Handling the Unexpected - Character-deletion
"deletion of noncharacters" (UTR-36)
<scr[U+FEFF]ipt> becomes <script>
Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  – A common alternative is '' which can be safe
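The difference between deleting an unexpected character and replacing it with U+FFFD can be sketched as follows; both helper names are hypothetical.

```python
BOM = "\ufeff"

def delete_bom(text):
    """Risky: deletion can join the pieces back into a forbidden token."""
    return text.replace(BOM, "")

def replace_bom(text):
    """Safer: U+FFFD keeps the pieces apart."""
    return text.replace(BOM, "\ufffd")

payload = "<scr\ufeffipt>"
print(repr(delete_bom(payload)))
print(repr(replace_bom(payload)))
```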
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. Cross-site scripting (buffer overflow of the Web)
Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier:
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>
DEMO: Safari BOM injection for XSS
A Closer Look: The BOM
BOM: U+FEFF
Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution
toLower("İ") == "i"
toLower("scrİpt") == "script"
len(x) != len(toLower(x))
Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
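Case mappings differ between platforms, so the exact result for İ varies; Python, for example, applies the full Unicode mapping and yields two code points instead of a bare "i". The sketch below only illustrates that casing can change string length; `case_lengths` is a made-up helper.

```python
def case_lengths(text):
    """Return (original, lowercased, uppercased) lengths in code points."""
    return len(text), len(text.lower()), len(text.upper())

dotted_capital_i = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
print([f"U+{ord(c):04X}" for c in dotted_capital_i.lower()])
print(case_lengths(dotted_capital_i))
print(case_lengths("\u0390"))  # the Upper sample from UTR 36
```

Because the length can change, buffers sized from the input length and checks done before casing are both unsafe assumptions.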
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution
Casing: maximum expansion factors
Operation   UTF         Factor   Sample
Lower       8           1.5X     Ⱥ U+023A
Lower       16, 32      1X       A U+0041
Upper       8, 16, 32   3X       ΐ U+0390
Source: Unicode Technical Report 36
Normalization: maximum expansion factors
Operation   UTF         Factor   Sample
NFC         8           3X       U+1D160
NFC         16, 32      3X       ש U+FB2C
NFD         8           3X       ΐ U+0390
NFD         16, 32      4X       ᾂ U+1F82
NFKC/NFKD   8           11X      ﷺ U+FDFA
NFKC/NFKD   16, 32      18X      ﷺ U+FDFA
Source: Unicode Technical Report 36
Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
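The expansion factors quoted from UTR 36 can be checked with a few lines of Python. This is a sketch; `expansion` is a hypothetical helper that compares encoded sizes before and after normalization.

```python
import unicodedata

def expansion(ch, form, encoding="utf-8"):
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before

# "utf-16-le" is used so that no byte order mark is counted.
print(expansion("\ufdfa", "NFKC", "utf-8"))
print(expansion("\ufdfa", "NFKC", "utf-16-le"))
print(expansion("\u1f82", "NFD", "utf-16-le"))
print(expansion("\u0390", "NFD", "utf-8"))
```

A buffer sized from the input length therefore needs headroom for the worst case of whichever operation is applied, or the output size has to be computed first.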
Root Causes: Controlling Syntax
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution
Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Case Study: Opera
• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting "white space"
  – A problem with the HTML 4.0 spec
<a href=[U+180E]onclick=alert()>
DEMO: Opera White Space Characters
MVS: U+180E
Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…
Root Causes: Specifications
1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer
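One way to avoid guessing about white space is to treat only an explicit, spec-listed set as white space and to flag every other space-like or format character. The sketch below assumes the four characters are space, tab, form feed, and zero width space, which is how HTML 4.01 section 9.1 is usually read; that list and the `classify` helper are assumptions to verify against the spec.

```python
import unicodedata

# Assumed list of the four white space characters (verify against HTML 4.01).
HTML4_WHITE_SPACE = {"\u0020", "\u0009", "\u000c", "\u200b"}

def classify(ch):
    """Return 'white space', 'unexpected', or 'other' for one character."""
    if ch in HTML4_WHITE_SPACE:
        return "white space"
    # Separator (Z*) and format (Cf) characters outside the list are suspicious.
    category = unicodedata.category(ch)
    if category.startswith("Z") or category == "Cf":
        return "unexpected"
    return "other"

print(classify(" "), classify("\u180e"), classify("a"))
```

A parser or filter built this way can reject or escape the "unexpected" class instead of leaving the decision to each implementation.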
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss
Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay
Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
  – Attacker-controlled input
Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
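A small sketch of why a charset mismatch matters: the same two bytes mean different things under the two charsets named on the slide, and a "force UTF-8, error if uncertain" policy rejects them. The helper names are illustrative.

```python
raw = b"\x82\xa0"  # sample bytes chosen for illustration

def decode_as(data, charset):
    """Decode the same bytes under a given charset label."""
    return data.decode(charset)

def require_utf8(data):
    """Force UTF-8 and raise instead of guessing when the bytes are not valid."""
    return data.decode("utf-8", errors="strict")

print(repr(decode_as(raw, "shift_jis")))   # one Japanese character
print(repr(decode_as(raw, "iso-8859-1")))  # two unrelated characters
try:
    require_utf8(raw)
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```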
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
Tools
• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks
Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher - Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com
Tools: Visual Spoofing Detection API
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release: Fall 2009
• Email me with questions
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net, Casaba Security
Attack Vectors: IDN homograph attacks

Others don’t necessarily, but…

• ORG is whitelisted
  – Limited characters available
• To unscrutinizing eyes, í looks like i

www.mozilla.org is not www.mozílla.org
  i: Latin, U+0069
  í: Latin, U+00ED
Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.com[U+FF0F]path[U+FF0F]file.nottrusted.org
  FULLWIDTH SOLIDUS, U+FF0F
(This case doesn’t work anymore)

http://www.google.com/path/file.nottrusted.org
  SOLIDUS, U+002F
(Normalized to a U+002F)

Attack Vectors: IDN Syntax Spoofing with / and . lookalikes

  ╱ U+2571 Box Drawings
  〳 U+3033 Kana Repeat Mark
  Ꜹ U+A738 LATIN CAPITAL AV
  ꜹ U+A739 LATIN SMALL AV
  ･ U+FF65 KATAKANA MIDDLE DOT

Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes

httpwwwgooglecom
  Katakana Dot, U+FF65
(However, punctuation not required…)

Attack Vectors: IDN Syntax Spoofing with / lookalikes

http://www.google.comノpathノfile.nottrusted.org
  Katakana No, U+FF89
(However, punctuation not required…)

• Browser sees and displays a valid IDN
• DNS sees Punycode
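A minimal sketch of the "browser sees an IDN, DNS sees Punycode" point, using Python's built-in `idna` codec (IDNA 2003) and `unicodedata`. The helper names `to_dns_form` and `nfkc` are illustrative assumptions, not part of any tool named in the talk.

```python
import unicodedata


def to_dns_form(hostname: str) -> bytes:
    """Return the ASCII form of a hostname, as a resolver would receive it."""
    # The stdlib "idna" codec applies Nameprep and Punycode per label.
    return hostname.encode("idna")


def nfkc(text: str) -> str:
    """Apply compatibility normalization (NFKC)."""
    return unicodedata.normalize("NFKC", text)


real = to_dns_form("www.mozilla.org")
fake = to_dns_form("www.moz\u00edlla.org")  # í, U+00ED, in place of i

# The two names look alike on screen but differ on the wire.
print(real)
print(fake)

# FULLWIDTH SOLIDUS normalizes to an ordinary SOLIDUS under NFKC.
print(nfkc("\uff0f") == "/")
```

Comparing the ASCII form against the displayed form is one way a user-agent could flag a label that silently became an `xn--` name.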
DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)

• Visual Spoofing Detection API
  – Detects confusables
  – Detects invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009
Attack Vectors: The Invisibles

• U+200B (ZERO WIDTH SPACE)
• U+180E (MONGOLIAN VOWEL SEPARATOR)
• U+FEFF (ZERO WIDTH NO-BREAK SPACE)
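A small sketch of invisibles detection, assuming only the three code points listed on the slide. The table and the function name `find_invisibles` are illustrative; a real detector would cover many more characters.

```python
# Only the three characters named on the slide; not an exhaustive list.
INVISIBLES = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u180e": "MONGOLIAN VOWEL SEPARATOR",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE",
}


def find_invisibles(text: str):
    """Return (index, code point, name) for each listed invisible character."""
    return [
        (i, f"U+{ord(ch):04X}", INVISIBLES[ch])
        for i, ch in enumerate(text)
        if ch in INVISIBLES
    ]


print(find_invisibles("pay\u200bpal"))
```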
Attack Vectors: Visual Spoofing, Problematic Font-rendering

• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space

http://www.google.com phreedom.org

··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode Courier New)
Attack Vectors: Visual Spoofing with Combining Marks

• Multiple combining marks
  – ō looks like <U+006F U+0304>
  – ō is <U+006F U+0304 U+0304>

• Order of combining marks
  – ȏ and ö under NFKC
  – <U+006F U+0308 U+0311> → <U+00F6 U+0311>
  – <U+006F U+0311 U+0308> → <U+020F U+0308>
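The two combining-mark sequences above can be checked with Python's `unicodedata`. This is a sketch: the helpers `codepoints` and `longest_mark_run` are illustrative names, and counting consecutive combining marks is only one possible heuristic for spotting stacked marks.

```python
import unicodedata


def codepoints(text: str) -> str:
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def longest_mark_run(text: str) -> int:
    """Length of the longest run of consecutive combining characters."""
    longest = run = 0
    for ch in text:
        run = run + 1 if unicodedata.combining(ch) else 0
        longest = max(longest, run)
    return longest


# Same base letter and same two marks, in a different order.
a = unicodedata.normalize("NFKC", "o\u0308\u0311")
b = unicodedata.normalize("NFKC", "o\u0311\u0308")
print(codepoints(a))
print(codepoints(b))

# A doubled macron that renders much like a single one.
print(longest_mark_run("o\u0304\u0304"))
```

Both marks share a combining class, so normalization does not reorder them and the two strings stay distinct even though they may render alike.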
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

• http://unicode.org/reports/tr9/
  – “These characters are to be avoided wherever possible, because of security concerns”
  – Forbidden in IDNA
• U+202D (LEFT-TO-RIGHT OVERRIDE)
• U+202E (RIGHT-TO-LEFT OVERRIDE)
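A short sketch for flagging the two explicit directional overrides, relying on the bidirectional class reported by Python's `unicodedata`. The function name `bidi_overrides` and the sample file name are made up for illustration.

```python
import unicodedata

OVERRIDE_CLASSES = {"LRO", "RLO"}  # U+202D and U+202E


def bidi_overrides(text: str):
    """Return (index, bidi class) for every explicit directional override."""
    return [
        (i, unicodedata.bidirectional(ch))
        for i, ch in enumerate(text)
        if unicodedata.bidirectional(ch) in OVERRIDE_CLASSES
    ]


# Hypothetical file name: the override makes "fdp.exe" display reversed.
print(bidi_overrides("invoice\u202efdp.exe"))
```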
Root Causes: Best-fit mappings

• Commonly occur in charset transformations and even innocuous APIs
• Impact: filter evasion, enable code execution
• When σ becomes s
  – U+03C3 GREEK SMALL LETTER SIGMA
• When ′ becomes '
  – U+2032 PRIME

Windows best-fit, P/Invoke: Best-fit mappings

The .NET runtime will marshal a string as LPStr to a P/Invoke function.

How can we best-fit the < character?
• U+2329 (Left-Pointing Angle Bracket) maps to U+003C
• U+3008 (Left Angle Bracket) maps to U+003C

How can we best-fit the s character?
• U+015B (Latin Small Letter S With Acute) maps to U+0073
• U+015D (Latin Small Letter S With Circumflex) maps to U+0073

To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
Case Study: Social Networking, Best-fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root cause: best-fit mappings

-moz-binding()
was not allowed, but…
-[U+ff4d]oz-binding()
would best-fit map
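A sketch of the filter bypass described in the case study. Python's codecs do not perform best-fit mapping, so the `BEST_FIT` table below is a hand-written stand-in limited to pairs mentioned in the slides; `naive_filter_ok`, `best_fit_to_ascii` and `strict_to_ascii` are hypothetical names. The strict variant plays the role that an exception-throwing encoder fallback or WC_NO_BEST_FIT_CHARS plays on Windows.

```python
# Stand-in for a platform best-fit table (illustrative subset only).
BEST_FIT = {
    "\uff4d": "m",  # FULLWIDTH LATIN SMALL LETTER M
    "\u2329": "<",  # LEFT-POINTING ANGLE BRACKET
    "\u3008": "<",  # LEFT ANGLE BRACKET
    "\u015b": "s",  # LATIN SMALL LETTER S WITH ACUTE
    "\u015d": "s",  # LATIN SMALL LETTER S WITH CIRCUMFLEX
}


def naive_filter_ok(css: str) -> bool:
    """Blocklist check performed BEFORE any charset conversion."""
    return "-moz-binding" not in css.lower()


def best_fit_to_ascii(text: str) -> str:
    """Simulate a lossy conversion that silently substitutes lookalikes."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)


def strict_to_ascii(text: str) -> bytes:
    """Refuse to convert anything that has no exact ASCII equivalent."""
    return text.encode("ascii")  # raises UnicodeEncodeError on U+FF4D


payload = "-\uff4doz-binding()"
print(naive_filter_ok(payload))    # the filter sees nothing forbidden
print(best_fit_to_ascii(payload))  # the downstream conversion recreates it
```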
Root Causes: Normalization

• Normalizing strings after validation is dangerous
• Impact: filter evasion, enable code execution

  NFD  - Decompose (canonical)
  NFC  - Decompose (canonical), Recompose
  NFKD - Decompose (compatibility)
  NFKC - Decompose (compatibility), Recompose

İ becomes I + ◌̇
  U+0130 → U+0049 U+0307

But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing.

﹤ becomes <
  U+FE64 → U+003C
  toNFKC(“﹤script>”) = “<script>”

Root Causes: Guidance for Normalization

• Normalize strings before validation
• NFKC is a first defense against visual spoofing
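The ordering problem can be sketched in a few lines. The one-character check in `is_safe` is a deliberately simplistic stand-in for a real validator, and both pipeline function names are illustrative.

```python
import unicodedata


def is_safe(text: str) -> bool:
    """Toy validator: reject any string containing '<'."""
    return "<" not in text


def validate_then_normalize(text: str) -> str:
    """Dangerous order: the check runs on the un-normalized input."""
    if not is_safe(text):
        raise ValueError("rejected")
    return unicodedata.normalize("NFKC", text)


def normalize_then_validate(text: str) -> str:
    """Safer order: the check runs on what will actually be used."""
    normalized = unicodedata.normalize("NFKC", text)
    if not is_safe(normalized):
        raise ValueError("rejected")
    return normalized


payload = "\ufe64script>"  # begins with SMALL LESS-THAN SIGN, U+FE64
print(validate_then_normalize(payload))  # the '<' appears after the check
```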
Root Causes: Non-shortest form UTF-8

• Non-shortest or overlong UTF-8
• Impact: filter evasion, enable code execution
  – Application gets C0 A7
  – OS/Framework sees 27
  – Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
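To illustrate why C0 A7 is dangerous, here is a deliberately lenient two-byte decoder next to Python's strict one. `lenient_decode_2byte` is a hypothetical example of the flawed behavior, not code from any real framework.

```python
def lenient_decode_2byte(data: bytes) -> str:
    """Flawed decoder: applies the two-byte bit layout without checking
    that the result could have been encoded in a single byte."""
    lead, trail = data
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


def strict_decode(data: bytes) -> str:
    """Python's UTF-8 codec rejects non-shortest forms."""
    return data.decode("utf-8")


overlong_apostrophe = b"\xc0\xa7"
print(repr(lenient_decode_2byte(overlong_apostrophe)))

try:
    strict_decode(overlong_apostrophe)
except UnicodeDecodeError as err:
    print("rejected:", err.reason)
```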
Attack Vectors: Directory traversal

How many ways can you say

• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF8 U+002F: httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
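A sketch of how two of the listed characters turn into traversal syntax under NFKC. The sample paths are invented for illustration and are not the slide's exact URLs; `contains_traversal` is a hypothetical helper. Note that U+2215 has no compatibility mapping, so it only matters where a best-fit conversion happens later.

```python
import unicodedata


def contains_traversal(path: str) -> bool:
    """Check for a '..' path segment AFTER compatibility normalization."""
    normalized = unicodedata.normalize("NFKC", path)
    return ".." in normalized.split("/")


# FULLWIDTH SOLIDUS (U+FF0F) and TWO DOT LEADER (U+2025)
sneaky = "root\uff0f\u2025\uff0fsystem"
print(unicodedata.normalize("NFKC", sneaky))
print(contains_traversal(sneaky))

# DIVISION SLASH (U+2215) is untouched by NFKC.
print(unicodedata.normalize("NFKC", "\u2215") == "\u2215")
```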
Root Causes: Handling the Unexpected

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: filter evasion, enable code execution

Root Causes: Handling the Unexpected, Over-consumption

• Over-consuming ill-formed byte sequences
• Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
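The byte example above, <41 C2 3E 41>, can be reproduced with a deliberately flawed decoder and compared with Python's UTF-8 codec. `over_consuming_decode` is an illustrative sketch of the bug and handles only two-byte sequences.

```python
def over_consuming_decode(data: bytes) -> str:
    """Flawed decoder: a two-byte lead always swallows the next byte."""
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        if 0xC2 <= lead <= 0xDF and i + 1 < len(data):
            trail = data[i + 1]
            if 0x80 <= trail <= 0xBF:
                out.append(chr(((lead & 0x1F) << 6) | (trail & 0x3F)))
            # FLAW: with an invalid trail byte, both bytes are discarded.
            i += 2
        else:
            out.append(chr(lead))  # sketch only: other bytes pass through
            i += 1
    return "".join(out)


raw = bytes.fromhex("41C23E41")
print(over_consuming_decode(raw))             # the '>' (3E) has vanished
print(raw.decode("utf-8", errors="replace"))  # '>' survives; C2 is replaced
```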
Root Causes: Handling the Unexpected, Character-substitution

• Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected, Character-deletion

• “deletion of noncharacters” (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
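A sketch contrasting deletion with replacement by U+FFFD. The blocklist in `blocklist_ok` is a toy stand-in for a real filter, and all three function names are illustrative.

```python
def blocklist_ok(text: str) -> bool:
    """Toy filter: reject input containing '<script'."""
    return "<script" not in text.lower()


def delete_bom(text: str) -> str:
    """Risky: silently dropping U+FEFF joins the text on either side."""
    return text.replace("\ufeff", "")


def replace_bom(text: str) -> str:
    """Safer: substitute U+FFFD so the surrounding text stays split."""
    return text.replace("\ufeff", "\ufffd")


payload = "<scr\ufeffipt>"
print(blocklist_ok(payload))               # passes the filter
print(delete_bom(payload))                 # deletion rebuilds the tag
print(blocklist_ok(replace_bom(payload)))  # replacement keeps it broken
```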
Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

• Safari and Firefox BOM consumption
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root cause: character deletion

<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier:
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM
  BOM, U+FEFF
Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enable code execution

toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
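A quick check of how casing can change string length. Results depend on the implementation: Python applies full case mappings, so İ lowercases to i followed by U+0307 rather than to a bare i as on the slide, and a later step that strips combining marks would then yield plain "script". The helper name `lower_before_validate` is illustrative.

```python
def lower_before_validate(text: str) -> bool:
    """Case-fold first, then run the (toy) check on the folded text."""
    return "script" not in text.lower()


dotted_i = "\u0130"            # LATIN CAPITAL LETTER I WITH DOT ABOVE
lowered = dotted_i.lower()     # full case mapping in Python
print(len(dotted_i), len(lowered))

iota = "\u0390"                # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
print(len(iota), len(iota.upper()))
```

The length change is the reason buffers sized from the input length can be too small after a casing operation.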
Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enable code execution

Casing - maximum expansion factors

Operation  UTF        Factor  Sample
Lower      8          1.5     Ⱥ U+023A
           16, 32     1       A U+0041
Upper      8, 16, 32  3       ΐ U+0390
Source: Unicode Technical Report 36

Normalization - maximum expansion factors

Operation  UTF     Factor  Sample
NFC        8       3X      𝅘𝅥𝅮 U+1D160
           16, 32  3X      שּׁ U+FB2C
NFD        8       3X      ΐ U+0390
           16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
           16, 32  18X     ﷺ U+FDFA
Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
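The expansion factors in the tables can be reproduced by measuring encoded sizes before and after the operation. This is a sketch: `expansion` is an illustrative helper, and the `-le` codec variants are used so that no byte order mark is counted.

```python
import unicodedata


def expansion(ch: str, form: str, encoding: str) -> float:
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before


# U+FDFA: the largest compatibility expansion listed in the table
print(expansion("\ufdfa", "NFKC", "utf-8"))
print(expansion("\ufdfa", "NFKC", "utf-16-le"))

# U+1F82 under NFD, measured in UTF-16 code units
print(expansion("\u1f82", "NFD", "utf-16-le"))

# Lowercasing U+023A grows from two to three UTF-8 bytes
small = "\u023a".lower()
print(len(small.encode("utf-8")) / len("\u023a".encode("utf-8")))
```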
Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: filter evasion, controlling syntax
  – Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root cause: interpreting “white space”
  – A problem with the HTML 4.0 spec

<a href=[U+180E]onclick=alert()>
  MVS, U+180E

DEMO: Opera White Space Characters

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
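One defensive reading of "question specifications" is to flag any separator-like character that a parser might treat as white space but the markup spec does not list. The sketch below assumes the four HTML 4.01 white space characters are space, tab, form feed and zero-width space; the function name and the category set are illustrative choices, not a complete policy.

```python
import unicodedata

# White space characters assumed from HTML 4.01, plus CR and LF line breaks.
HTML401_WHITE_SPACE = {" ", "\t", "\f", "\u200b"}
LINE_BREAKS = {"\r", "\n"}

# Space separators, line/paragraph separators, and format characters.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf"}


def suspicious_separators(text: str):
    """Return (index, code point) for separator-like characters that are
    outside the white space set assumed above."""
    hits = []
    for i, ch in enumerate(text):
        if ch in HTML401_WHITE_SPACE or ch in LINE_BREAKS:
            continue
        if ch.isspace() or unicodedata.category(ch) in SUSPECT_CATEGORIES:
            hits.append((i, f"U+{ord(ch):04X}"))
    return hits


print(suspicious_separators("<a href=\u180eonclick=alert()>"))
```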
Root Causes: Specifications

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
  • This could have been a problem
    – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer
Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enable code execution, data loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enable code execution

Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
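A sketch of why a mismatch between the declared and the assumed charset matters: the same two bytes mean different things to a component that reads ISO-8859-1 and one that reads Shift_JIS. The byte pair is chosen for illustration.

```python
raw = b"\x83\x5c"

# A filter decoding as ISO-8859-1 sees a control character and a backslash.
as_latin1 = raw.decode("iso-8859-1")

# A consumer decoding as Shift_JIS sees a single katakana letter.
as_sjis = raw.decode("shift_jis")

print([f"U+{ord(ch):04X}" for ch in as_latin1])
print([f"U+{ord(ch):04X}" for ch in as_sjis])
```

Any decision made on one interpretation, such as escaping or rejecting the backslash, does not carry over to the other, which is the gap an attacker-controlled charset declaration exploits.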
Attack Vectors: IDN homograph attacks
www.mozilla.org is not www.mozílla.org
Latin i, U+0069
Latin í, U+00ED
Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F
(This case doesn't work anymore)
Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F
(Normalized to a U+002F)
Attack Vectors: IDN Syntax Spoofing with and lookalikes
╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT
Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
http://www･google･com
Katakana Dot U+FF65
(However, punctuation not required...)
Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.comノpathノfile.nottrusted.org
Katakana No U+FF89
(However, punctuation not required...)
Attack Vectors: IDN Syntax Spoofing with / lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
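A minimal Python sketch of the split described above, where the user sees Unicode and DNS sees ASCII. It assumes Python's built-in `idna` codec (which follows the older IDNA 2003 rules) is a fair stand-in for a browser's IDN handling; the helper name `to_dns_form` is hypothetical.

```python
def to_dns_form(hostname: str) -> bytes:
    """Return the ASCII form of a hostname, as it would be sent to DNS."""
    # The built-in "idna" codec turns each non-ASCII label into an
    # "xn--" Punycode label and leaves pure-ASCII labels untouched.
    return hostname.encode("idna")


plain = to_dns_form("www.mozilla.org")
spoof = to_dns_form("www.moz\u00edlla.org")  # U+00ED in place of U+0069

# The two names look alike on screen but resolve as different ASCII names.
print(plain)
print(spoof)
```

Comparing the ASCII forms, rather than the displayed strings, is one way to make the difference between the two names visible.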
IDN Visual Spoofing
DEMO
IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
  – Detects confusables
  – Detects invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009
Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
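One way to flag the invisible characters listed above is sketched below in Python. The set `KNOWN_INVISIBLES` holds only the three code points from the slide and the function name `find_invisibles` is hypothetical; treating every general-category `Cf` (format) character as suspicious is an assumption, not something the talk prescribes.

```python
import unicodedata

# Only the three code points named on the slide; real lists are longer.
KNOWN_INVISIBLES = {0x200B, 0x180E, 0xFEFF}


def find_invisibles(text):
    """Return (index, 'U+XXXX') for each invisible character found."""
    hits = []
    for index, ch in enumerate(text):
        # "Cf" is the Unicode general category for format characters,
        # most of which have no visible glyph.
        if ord(ch) in KNOWN_INVISIBLES or unicodedata.category(ch) == "Cf":
            hits.append((index, "U+%04X" % ord(ch)))
    return hits


print(find_invisibles("goo\u200bgle.com"))
```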
Attack Vectors: Visual Spoofing - Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)
Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
  – ō (U+006F U+0304) looks like ō̄ (U+006F U+0304 U+0304)
• Order of combining marks
  – ȏ and ö under NFKC
  – <U+006F U+0308 U+0311> → <U+00F6 U+0311>
  – <U+006F U+0311 U+0308> → <U+020F U+0308>
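The ordering effect above can be checked with Python's standard `unicodedata` module. This is a small sketch; the variable names and the `code_points` helper are illustrative only.

```python
import unicodedata


def code_points(text):
    """Render a string as a list of U+XXXX labels for inspection."""
    return ["U+%04X" % ord(ch) for ch in text]


# Same base letter and same two marks, in a different order.
first = unicodedata.normalize("NFKC", "o\u0308\u0311")   # diaeresis, then inverted breve
second = unicodedata.normalize("NFKC", "o\u0311\u0308")  # inverted breve, then diaeresis

print(code_points(first))
print(code_points(second))
print(first == second)
```

Because both marks share a combining class, normalization does not reorder them, so the two sequences stay distinct even though they can render alike.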
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)
Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  – "These characters are to be avoided wherever possible, because of security concerns"
  – Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
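A short Python sketch of how input containing the overrides above might be detected. It relies on `unicodedata.bidirectional`, which reports the Bidi class of a character; the function name `has_bidi_controls` and the choice to also flag the embedding controls are assumptions made for illustration.

```python
import unicodedata

# Bidi classes of the explicit embedding and override controls
# (U+202A..U+202E): LRE, RLE, PDF, LRO, RLO.
EXPLICIT_BIDI_CLASSES = {"LRE", "RLE", "PDF", "LRO", "RLO"}


def has_bidi_controls(text):
    """True if the text contains an explicit directional control."""
    return any(
        unicodedata.bidirectional(ch) in EXPLICIT_BIDI_CLASSES for ch in text
    )


# A right-to-left override hidden inside a file name.
print(has_bidi_controls("report\u202etxt.exe"))
print(has_bidi_controls("report.txt"))
```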
Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
U+2032 PRIME
Windows best-fit P/Invoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 (Left-Pointing Angle Bracket) maps to U+003C
• U+3008 (Left Angle Bracket) maps to U+003C
How can we best-fit the s character?
• U+015B (Latin Small Letter S With Acute) maps to U+0073
• U+015D (Latin Small Letter S With Circumflex) maps to U+0073
To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]
Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
Case Study: Social Networking - Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: best-fit mappings
-moz-binding()
was not allowed, but...
-[U+ff4d]oz-binding()
would best-fit map
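The -moz-binding bypass can be illustrated with a toy model in Python. Everything here is a sketch under stated assumptions: `BEST_FIT` is a hand-written subset containing only the mappings named in these slides (it is not a real platform table), and `naive_filter_allows`, `best_fit`, and `strict_ascii` are hypothetical helpers.

```python
# Toy subset of a best-fit table: only the mappings named in the slides.
BEST_FIT = {
    "\u2329": "<",  # LEFT-POINTING ANGLE BRACKET
    "\u3008": "<",  # LEFT ANGLE BRACKET
    "\u015b": "s",  # LATIN SMALL LETTER S WITH ACUTE
    "\u015d": "s",  # LATIN SMALL LETTER S WITH CIRCUMFLEX
    "\uff4d": "m",  # FULLWIDTH LATIN SMALL LETTER M
}


def naive_filter_allows(text):
    """Filter that only looks for the literal ASCII token."""
    return "-moz-binding" not in text


def best_fit(text):
    """Simulate a lossy conversion that silently substitutes lookalikes."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)


def strict_ascii(text):
    """Fail instead of guessing: raises UnicodeEncodeError on non-ASCII."""
    return text.encode("ascii")


payload = "-\uff4doz-binding()"
print(naive_filter_allows(payload))  # the filter sees nothing wrong
print(best_fit(payload))             # the later conversion recreates the token
```

The design point is the same as the guidance above: a conversion that raises on unmappable characters cannot reintroduce a token the filter already rejected.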
Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution
NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose
İ becomes I + ̇
U+0130 → U+0049 U+0307
But are there dangerous characters?
You bet... with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
U+FE64 → U+003C
toNFKC("﹤script>") = "<script>"
Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC: first defense against visual spoofing
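The ordering rule above can be sketched in Python with the standard `unicodedata` module. The check `is_safe` is a deliberately simple stand-in for a real validator, and both pipeline function names are hypothetical.

```python
import unicodedata


def is_safe(text):
    """Toy validator: reject anything containing an ASCII '<'."""
    return "<" not in text


def validate_then_normalize(text):
    """Dangerous order: the check runs on the pre-normalized string."""
    if not is_safe(text):
        raise ValueError("rejected")
    return unicodedata.normalize("NFKC", text)


def normalize_then_validate(text):
    """Safer order: the check runs on the string that will be used."""
    normalized = unicodedata.normalize("NFKC", text)
    if not is_safe(normalized):
        raise ValueError("rejected")
    return normalized


payload = "\ufe64script>"  # U+FE64 SMALL LESS-THAN SIGN, then "script>"
print(validate_then_normalize(payload))  # the filter is bypassed
```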
Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '
Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
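A Python sketch of why the overlong sequence C0 A7 matters. `naive_decode_two_byte` is a hypothetical, deliberately lenient decoder that applies the two-byte bit layout without checking for shortest form; `strict_decode` simply uses Python's built-in UTF-8 decoder, which rejects overlong input.

```python
def naive_decode_two_byte(data):
    """Lenient decoder: applies the 110xxxxx 10xxxxxx layout blindly."""
    lead, trail = data
    # No shortest-form check, so C0 A7 yields code point 0x27.
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


def strict_decode(data):
    """Validate the encoding and throw on error."""
    return data.decode("utf-8")  # raises UnicodeDecodeError if ill-formed


overlong = b"\xc0\xa7"
print(repr(naive_decode_two_byte(overlong)))  # an apostrophe appears

try:
    strict_decode(overlong)
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```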
Attack Vectors: Directory traversal
How many ways can you say
• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8, U+002F: httpsiterootC0AEsystem
  – Full-width Solidus, U+FF0F, normalized (KC or KD): httpsiteroot EFBC8Fsystem
  – Division Slash, U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader, U+2025, normalized (KC or KD): httpsiterootE280A5 E280A5 system
Root Causes: Handling the Unexpected
• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution
Root Causes: Handling the Unexpected - Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>
<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
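The byte sequence <41 C2 3E 41> can be walked through in Python. The function `over_consuming_decode` is a hypothetical, intentionally flawed decoder written only to reproduce the behavior described above; the comparison uses Python's built-in UTF-8 decoder with `errors="replace"`.

```python
def over_consuming_decode(data):
    """Flawed on purpose: a lead byte always swallows the next byte."""
    out = []
    i = 0
    while i < len(data):
        byte = data[i]
        if byte < 0x80:
            out.append(chr(byte))
            i += 1
        elif 0xC2 <= byte <= 0xDF and i + 1 < len(data):
            trail = data[i + 1]
            if 0x80 <= trail <= 0xBF:
                out.append(chr(((byte & 0x1F) << 6) | (trail & 0x3F)))
            # Flaw: an ill-formed pair is dropped together with the byte
            # that followed the lead byte, even though it was valid ASCII.
            i += 2
        else:
            i += 1  # drop anything else this toy does not handle
    return "".join(out)


raw = b"\x41\xc2\x3e\x41"
print(repr(over_consuming_decode(raw)))               # the '>' is gone
print(repr(raw.decode("utf-8", errors="replace")))    # the '>' survives
```

The built-in decoder replaces only the stray lead byte with U+FFFD and resumes at the next byte, so the `>` that closes a tag is not consumed.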
Root Causes: Handling the Unexpected - Character-substitution
Correcting insecurely rather than failing
  – Substituting a '' or a '' would be bad
Root Causes: Handling the Unexpected - Character-deletion
"deletion of noncharacters" (UTR-36)
<scr[U+FEFF]ipt> becomes <script>
Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  – A common alternative is '?' which can be safe
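The three options above (delete, replace with U+FFFD, fail) can be compared side by side in Python. The helper names are hypothetical, and rejecting every U+FEFF, including a legitimate leading BOM, is a simplifying assumption of this sketch.

```python
BOM = "\ufeff"
REPLACEMENT = "\ufffd"


def delete_unexpected(text):
    """Risky: silently removing the character joins its neighbors."""
    return text.replace(BOM, "")


def replace_unexpected(text):
    """Safer: U+FFFD keeps the neighbors apart."""
    return text.replace(BOM, REPLACEMENT)


def reject_unexpected(text):
    """Strictest: fail or error."""
    if BOM in text:
        raise ValueError("unexpected U+FEFF in input")
    return text


payload = "<scr\ufeffipt>"
print(delete_unexpected(payload))               # the blocked token reappears
print("script" in replace_unexpected(payload))  # the token stays broken
```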
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)
Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>
Safari BOM injection for XSS
DEMO
A Closer Look: The BOM
BOM U+FEFF
Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution
toLower("İ") == "i"
toLower("scrİpt") == "script"
len(x) != len(toLower(x))
Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
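A Python sketch of the two casing points above: lengths can change, and the check should run after the case operation. Note that Python's `str.lower()` maps U+0130 to "i" plus U+0307 rather than to a plain "i", so results depend on the casing implementation in use; `utf8_len` and `lower_then_validate` are hypothetical helpers.

```python
def utf8_len(text):
    """Number of bytes the text occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))


def lower_then_validate(text, banned="script"):
    """Apply the case operation first, then check the final form."""
    lowered = text.lower()
    if banned in lowered:
        raise ValueError("banned token present after casing")
    return lowered


dotted_i = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE

# Code points before and after lowercasing.
print(len(dotted_i), len(dotted_i.lower()))
# UTF-8 bytes before and after lowercasing U+023A.
print(utf8_len("\u023a"), utf8_len("\u023a".lower()))
# Code points before and after uppercasing U+0390.
print(len("\u0390"), len("\u0390".upper()))
```

Sizing an output buffer from the input length is therefore unsafe for any of these operations.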
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution
Casing - maximum expansion factors
Operation  UTF        Factor  Sample
Lower      8          1.5X    Ⱥ U+023A
Lower      16, 32     1X      A U+0041
Upper      8, 16, 32  3X      ΐ U+0390
Source: Unicode Technical Report 36
Root Causes: Buffer Overflows
Normalization - maximum expansion factors
Operation  UTF     Factor  Sample
NFC        8       3X      𝅘𝅥𝅮 U+1D160
NFC        16, 32  3X      שּׁ U+FB2C
NFD        8       3X      ΐ U+0390
NFD        16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
NFKC/NFKD  16, 32  18X     ﷺ U+FDFA
Source: Unicode Technical Report 36
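The expansion factors in the normalization table can be reproduced with Python's `unicodedata` module. This is a small measuring sketch; the function name `expansion` and its return layout are illustrative choices.

```python
import unicodedata


def utf16_units(text):
    """Number of 16-bit code units in the UTF-16 encoding."""
    return len(text.encode("utf-16-le")) // 2


def expansion(ch, form):
    """Growth factors of one character under a normalization form."""
    out = unicodedata.normalize(form, ch)
    return {
        "utf8": len(out.encode("utf-8")) / len(ch.encode("utf-8")),
        "utf16": utf16_units(out) / utf16_units(ch),
        "utf32": len(out) / len(ch),
    }


for label, ch, form in [
    ("U+1D160", "\U0001d160", "NFC"),
    ("U+1F82", "\u1f82", "NFD"),
    ("U+FDFA", "\ufdfa", "NFKC"),
]:
    print(label, form, expansion(ch, form))
```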
Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
Root Causes: Controlling Syntax
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution
Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Case Study: Opera
• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting "white space"
  – A problem with HTML 4.0 spec
<a href=[U+180E]onclick=alert()>
Opera White Space Characters
DEMO
MVS U+180E
Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful...
Root Causes: Specifications
1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss
Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input
Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
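The header/meta conflict shown above can be caught with a simple consistency check, sketched here in Python. The regular expression and the function names `declared_charset` and `check_charsets` are assumptions for illustration and not a full HTML or HTTP parser.

```python
import re

# Hypothetical, simplified pattern for a charset declaration.
_CHARSET_RE = re.compile(r"charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)", re.IGNORECASE)


def declared_charset(value):
    """Extract the charset label from a header value or meta tag, if any."""
    match = _CHARSET_RE.search(value)
    return match.group(1).lower() if match else None


def check_charsets(header_value, meta_tag):
    """Error if uncertain: missing or conflicting labels raise."""
    header = declared_charset(header_value)
    meta = declared_charset(meta_tag)
    if header is None or meta is None or header != meta:
        raise ValueError("charset mismatch: %r vs %r" % (header, meta))
    return header


header_value = "text/html; charset=ISO-8859-1"
meta_tag = '<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">'

try:
    check_charsets(header_value, meta_tag)
except ValueError as exc:
    print(exc)
```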
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
Tools
• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks
Tools: Watcher - Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher - Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com
Tools: Visual Spoofing Detection API
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net, Casaba Security
www.casabasecurity.com
March 2009 copy 2009 Chris Weber
(This case doesnrsquot work anymore)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
FULLWIDTH SOLIDUSU+FF0F
March 2009 copy 2009 Chris Weber
(Normalized to a U+002F)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecompathfilenottrustedorg
SOLIDUSU+002F
March 2009 copy 2009 Chris Weber
U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
U+FF65 KATAKANA MIDDLE DOT
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with and lookalikes
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes
httpwwwgooglecom
Katakana DotU+FF65
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
httpwwwgooglecomノpathノfilenottrustedorg
Katakana NoU+FF89
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
IDN Visual Spoofing
March 2009 copy 2009 Chris Weber
bull Visual Spoofing Detection API
ndash Detects Confusables
ndash Detects Invisibles
ndash Detections syntax and punctuation lookalikes
ndash Detects combining mark tricks
bull Currently in testing
bull Release planned for Fall 2009
wwwcasabasecuritycom
IDN Visual SpoofingSolutions and Defenses (yes there is one)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
bull Fonts render glyphs confusingly
bull Fonts render glyphs as empty white space
httpwwwgooglecom phreedomorg
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)
A is A (Lucida Sans Unicode Courier New)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Multiple combining marks
o looks like U+006F U+0304
o is U+006F U+0304 U+0304
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Order of combining marksndash ȏ and ouml under NFKC
ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt
ltU+006F U+0311U+0308gt ltU+020F U+0308gt
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
bull httpunicodeorgreportstr9
ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo
ndash forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
March 2009 copy 2009 Chris Weber
Commonly occur in charset transformations and even innocuous APIrsquos
Impact Filter evasion Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When prime becomes
U+2032 PRIME
wwwcasabasecuritycom
Root CausesBest-fit mappings
March 2009 copy 2009 Chris Weber
Net runtime will marshall a string as LPStr to a pinvoke function
How can we best-fit the lt character
bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket
How can we best-fit the s character
bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex
To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]
wwwcasabasecuritycom
Windows best-fit pInvokeBest-fit mappings
March 2009 copy 2009 Chris Weber
bull Scrutinize charactercharset manipulation APIrsquos
bull Use EncoderFallback with SystemTextEncoding
bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
bull Use Unicode end-to-end
wwwcasabasecuritycom
Root CausesGuidance for Best-Fit mappings
March 2009 copy 2009 Chris Weber
bull A popular social networking site in 2008
bull Implemented complex filtering logic to prevent XSS
ndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting
ndash Root Cause best-fit mappings
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
-moz-binding()
was not allowed buthellip
-[U+ff4d]oz-binding()
would best-fit map
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
Normalizing strings after validation is dangerous
Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
NFD - Decompose (canonical)
NFC - Decompose (canonical) Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility) Recompose
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
İ becomes I +
wwwcasabasecuritycom
Root CausesNormalization
U+0130 U+0049 U+0307
March 2009 copy 2009 Chris Weber
But are there dangerous characters
You bethellip with NFKC and NFKD you could control HTML or other parsing
﹤ becomes lt
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
﹤ becomes lt
toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
Normalize strings before validation
NFKC first defense against Visual spoofing
wwwcasabasecuritycom
Root CausesGuidance for Normalization
March 2009 copy 2009 Chris Weber
Non-shortest or overlong UTF-8
Impact Filter evasion Enable code execution
Application gets C0A7
OSFramework sees 27
Database gets
wwwcasabasecuritycom
Root CausesNon-shortest form UTF-8
March 2009 copy 2009 Chris Weber
bull Unicode specification forbids
ndash Generation of non-shortest form
ndash Interpretation of non-shortest form for BMP
bull Validate UTF-8 encoding (throw on error)
wwwcasabasecuritycom
Root CausesGuidance for Non-shortest form UTF-8
March 2009 copy 2009 Chris Weber
How many ways can you say
wwwcasabasecuritycom
Attack VectorsDirectory traversal
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Directory traversal test casesndash httpsiterootsystem
ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem
ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem
ndash Division Slash U+2215 best-fithttpsiteroot E28895system
ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system
wwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Unassigned code points
ndash U+2073
bull Illegal code points
ndash Half a surrogate pair
bull Code points with special meaning
ndash U+FEFF is the BOM
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesHandling the Unexpected
March 2009 copy 2009 Chris Weber
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
lt41 C2 3E 41gt becomes
lt41 41gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
ltimg src=[0xC2]gt onerror=alert(1)ltbr gt
becomes
ltimg src=gt onerror=alert(1)ltbr gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
Correcting insecurely rather than failing
ndash Substituting a lsquorsquo or a lsquorsquo would be bad
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-substitution
March 2009 copy 2009 Chris Weber
ldquodeletion of noncharactersrdquo (UTR-36)
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
ltscr[U+FEFF]iptgt becomes ltscriptgt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
bull Fail or error
bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe
wwwcasabasecuritycom
Root CausesSolutions for Handling the Unexpected
March 2009 copy 2009 Chris Weber
bull Bypass filters WAFrsquos NIDS and validation
bull Exploit delivery techniques
ndash Eg Cross-site scripting (buffer overflow of the Web)
wwwcasabasecuritycom
Attack VectorsFilter evasion
March 2009 copy 2009 Chris Weber
Safari and Firefox BOM consumptionndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting
ndash Root Cause Character deletion
lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt
Can be nastier
lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt
wwwcasabasecuritycom
Case Study Apple and Mozilla
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Safari BOM injection for XSS
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
A Closer Look The BOM
BOMU+FEFF
March 2009 copy 2009 Chris Weber
bull Attackers manipulate casing operations to inject otherwise prohibited characters
bull Casing can multiply the buffer sizes needed
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
toLower(ldquoİrdquo) == ldquoirdquo
toLower(ldquoscrİptrdquo) == ldquoscriptrdquo
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
len(x) = len(toLower(x))
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
bull Perform casing operations before validation
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Casing
March 2009 copy 2009 Chris Weber
bull Incorrect assumptions about string sizes (chars vs bytes)
bull Improper width calculations
bull Impact Enable code execution
wwwcasabasecuritycom
Root CausesBuffer Overflows
March 2009 copy 2009 Chris Weber
Casing - maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
Lower 8 15 Ⱥ U+023A
16 32 1 A U+0041
Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
Normalization- maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
NFC8 3X 119136 U+1D160
16 32 3X ש U+FB2C
NFD8 3X ΐ U+0390
16 32 4X ᾂ U+1F82
NFKCNFKD8 11X
ملسو هيلع هللا ىلص U+FDFA16 32 18X
Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
bull Know the difference between bytes and chars
bull Secure coding
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Buffer Overflows
March 2009 copy 2009 Chris Weber
bull White space and line breaks
ndash Eg when U+180E acts like U+0020
bull Quotation marks
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesControlling Syntax
March 2009 copy 2009 Chris Weber
bull Manipulate HTML parsers and javascriptinterpreters
bull Control protocols
wwwcasabasecuritycom
Attacks and ExploitsControlling syntax
March 2009 copy 2009 Chris Weber
bull Unicode formatter characters exploited for XSS
ndash Damage Filter evasion controlling syntax
ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting
ndash Root Cause Interpreting ldquowhite spacerdquo
ndash A problem with HTML 40 spec
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
lta href=[U+180E]onclick=alert()gt
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Opera White Space Characters
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Case Study Opera
MVSU+180E
March 2009 copy 2009 Chris Weber
bull Question specifications
bull Be carefulhellip
wwwcasabasecuritycom
Root CausesGuidance for Controlling Syntax
March 2009 copy 2009 Chris Weber
1) Character stabilityndash IDNANameprep based on Unicode 32
2) Designsndash Specs are carefully designed but not always perfect
bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of
U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo
ndash HTML 401 bull Defines four whitespace characters and explicitly leaves
handling other characters up to implementer
wwwcasabasecuritycom
Root CausesSpecifications
March 2009 copy 2009 Chris Weber
bull Converting between charsets is dangerous
bull Mapping tables and algorithms vary across platforms
bull Impact Filter evasion Enable code execution Data-loss
wwwcasabasecuritycom
Root CausesCharset Transformations
March 2009 copy 2009 Chris Weber
bull Avoid if possible
bull Use Unicode as the broker
bull Beware the PUA mappings
bull Transform case and normalize prior to validation and redisplay
wwwcasabasecuritycom
Root CausesGuidance for Charset Transformations
March 2009 copy 2009 Chris Weber
bull Some charset identifiers are ill-defined
bull Vendor implementations vary
bull User-agents may sniff if confused
bull Attackers manipulate behavior
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCharset Mismatches
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Root CausesCharset Mismatches
Content-Type charset=ISO-8859-1
ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt
Attacker-controlled input
March 2009 copy 2009 Chris Weber
bull Force UTF-8
bull Error if uncertain
wwwcasabasecuritycom
Root CausesGuidance for Charset Mismatches
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Watcher
ndash Web-app security testing and auditing
bull Visual Spoofing Detection API
ndash Providing guarantees against Visual Spoofing and Homograph attacks
wwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
bull Unicode transformation hot-spotsbull User-controlled HTMLbull Cross-domain issuesbull Insecure cookiesbull Insecure HTTPHTTPS transitionsbull SSL protocol and certificate issuesbull XSS hot-spotsbull Flash issuesbull Silverlight issuesbull Information disclosure
wwwcasabasecuritycom
ToolsWatcher ndash Some of the Passive Checks Included
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
httpwebsecuritytoolcodeplexcom
wwwcasabasecuritycom
ToolsWatcher - Web-app Security Testing and Auditing
Tools: Visual Spoofing Detection API
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share.
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.
Chris Weber, www.lookout.net
Casaba Security, www.casabasecurity.com
Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F
(Normalized to a U+002F)
Attack Vectors: IDN Syntax Spoofing with lookalikes
╱ U+2571 Box Drawings
〳 U+3033 Kana Repeat Mark
Ꜹ U+A738 LATIN CAPITAL AV
ꜹ U+A739 LATIN SMALL AV
･ U+FF65 KATAKANA MIDDLE DOT
Attack Vectors: IDN Syntax Spoofing with . (full stop) lookalikes
http://www･google･com
Katakana Dot U+FF65
(However, punctuation is not required…)
Attack Vectors: IDN Syntax Spoofing with lookalikes
http://www.google.comノpathノfile.nottrusted.org
Katakana No U+FF89
(However, punctuation is not required…)
Attack Vectors: IDN Syntax Spoofing with lookalikes
• Browser sees and displays a valid IDN
• DNS sees Punycode
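To make the split between the displayed IDN and the Punycode form concrete, here is a small Python sketch using the standard library codecs. The label "bücher" and the ".example" suffix are arbitrary choices for illustration and do not come from the slides.

```python
# A non-ASCII label as the user would see it in the address bar.
label = "b\u00fccher"  # "bücher"

# Raw Punycode encoding of the label alone.
punycode_label = label.encode("punycode")

# Full ASCII-compatible form, as sent to DNS, with the "xn--" prefix.
ascii_name = (label + ".example").encode("idna")

print(punycode_label)
print(ascii_name)
```

The user sees the Unicode form, while resolution happens on the ASCII form, which is why a lookalike label can resolve to a completely different host than the one it appears to name.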
DEMO: IDN Visual Spoofing
IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
  – Detects confusables
  – Detects invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009
Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
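A minimal Python sketch of flagging these characters in a string might look like this. The helper name `find_invisibles` is hypothetical, and the set holds only the three code points listed above; a real detector would need a much longer list.

```python
# The three invisible code points named on the slide.
INVISIBLES = {"\u200b", "\u180e", "\ufeff"}


def find_invisibles(text: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') for each listed invisible character in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if ch in INVISIBLES
    ]
```

For example, `find_invisibles("pay\u200bpal")` reports one hit at index 3, even though the string renders like "paypal".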
Attack Vectors: Visual Spoofing - Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)
Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
  – What looks like ō, <U+006F U+0304>,
  – is actually <U+006F U+0304 U+0304>
• Order of combining marks
  – ȏ and ö under NFKC
  – <U+006F U+0308 U+0311> → <U+00F6 U+0311>
  – <U+006F U+0311 U+0308> → <U+020F U+0308>
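The ordering effect above can be checked with Python's standard `unicodedata` module. This is only an illustrative sketch; the helper `code_points` is a hypothetical name introduced here.

```python
import unicodedata

a = "o\u0308\u0311"  # o + COMBINING DIAERESIS + COMBINING INVERTED BREVE
b = "o\u0311\u0308"  # same marks, opposite order


def code_points(s: str) -> list[str]:
    """Render a string as a list of U+XXXX labels."""
    return [f"U+{ord(c):04X}" for c in s]


nfkc_a = unicodedata.normalize("NFKC", a)
nfkc_b = unicodedata.normalize("NFKC", b)

print(code_points(nfkc_a))
print(code_points(nfkc_b))
```

Both marks have the same combining class, so normalization does not reorder them. The two inputs therefore stay distinct after NFKC even though they may render almost identically.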
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)
Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  – “These characters are to be avoided wherever possible, because of security concerns”
  – Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
Root Causes: Best-fit mappings
• Commonly occur in charset transformations and even innocuous APIs
• Impact: filter evasion, enabling code execution
• When σ (U+03C3 GREEK SMALL LETTER SIGMA) becomes s
• When ′ (U+2032 PRIME) becomes '
Windows best-fit P/Invoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function.
How can we best-fit the < character?
• U+2329 (Left-Pointing Angle Bracket) maps to U+003C
• U+3008 (Left Angle Bracket) maps to U+003C
How can we best-fit the s character?
• U+015B (Latin Small Letter S With Acute) maps to U+0073
• U+015D (Latin Small Letter S With Circumflex) maps to U+0073
To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]
Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
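The slide's guidance is specific to .NET and Win32. As a rough analogue only, the same "fail instead of best-fitting" policy can be sketched in Python, whose built-in codecs do not perform Windows-style best-fit mapping. The function name `to_legacy_or_fail` and the default code page are assumptions made for this illustration.

```python
def to_legacy_or_fail(text: str, codec: str = "cp1252") -> bytes:
    """Encode to a legacy code page, raising on any unmappable character.

    With errors="strict" nothing is silently substituted, so a character
    such as U+03C3 can never quietly turn into an ASCII 's'.
    """
    return text.encode(codec, errors="strict")
```

Calling `to_legacy_or_fail("\u03c3")` raises `UnicodeEncodeError`, which leaves the caller to decide what to do with the input, rather than a lookalike ASCII character slipping past a filter that ran earlier.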
Case Study: Social Networking - Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root cause: best-fit mappings
-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map
Root Causes: Normalization
• Normalizing strings after validation is dangerous
• Impact: filter evasion, enabling code execution
Normalization forms:
  NFD  - Decompose (canonical)
  NFC  - Decompose (canonical), then recompose
  NFKD - Decompose (compatibility)
  NFKC - Decompose (compatibility), then recompose
İ (U+0130) becomes I + combining dot above (U+0049 U+0307)
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing.
﹤ becomes < (U+FE64 → U+003C)
toNFKC(“﹤script>”) = “<script>”
Root Causes: Guidance for Normalization
• Normalize strings before validation
• NFKC: first defense against visual spoofing
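A short Python sketch of why the order matters, using the U+FE64 example from the slides. The filter `is_allowed` is a deliberately naive, hypothetical check; it stands in for whatever validation an application performs.

```python
import unicodedata


def is_allowed(text: str) -> bool:
    """Hypothetical filter: reject anything containing '<'."""
    return "<" not in text


payload = "\ufe64script>"  # starts with U+FE64 SMALL LESS-THAN SIGN

# Wrong order: validate the raw input, normalize afterwards.
passed_raw_check = is_allowed(payload)
after_late_normalization = unicodedata.normalize("NFKC", payload)

# Right order: normalize first, then validate what will actually be used.
passed_normalized_check = is_allowed(unicodedata.normalize("NFKC", payload))
```

In the first path the payload passes the filter and only then turns into "<script>". In the second path the filter sees the normalized form and rejects it.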
Root Causes: Non-shortest form UTF-8
• Non-shortest or overlong UTF-8
• Impact: filter evasion, enabling code execution
• Application gets C0 A7
• OS/Framework sees 27
• Database gets '
Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
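The C0 A7 case can be reproduced with a few lines of Python. The function `naive_decode_2byte` is a hypothetical, intentionally unsafe decoder written for this sketch. It shows what a decoder without a shortest-form check computes, next to the strict behavior of Python's built-in UTF-8 codec.

```python
def naive_decode_2byte(b0: int, b1: int) -> int:
    """Decode a 2-byte UTF-8 sequence WITHOUT a shortest-form check (unsafe)."""
    return ((b0 & 0x1F) << 6) | (b1 & 0x3F)


def strict_decode(raw: bytes) -> str:
    """Decode with Python's UTF-8 codec, which rejects overlong forms."""
    return raw.decode("utf-8", errors="strict")


# The overlong pair C0 A7 collapses to 0x27, the apostrophe.
overlong_value = naive_decode_2byte(0xC0, 0xA7)
```

A filter that looks for the byte 27 never sees it in C0 A7, yet a lenient decoder further down the stack produces an apostrophe. The strict decoder raises `UnicodeDecodeError` on the same bytes.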
Attack Vectors: Directory traversal
How many ways can you say
• Directory traversal test cases
  – http://site/root/system
  – Overlong UTF-8, U+002F: http://site/root[C0 AE]system (C0 AE is the overlong form of U+002E; the overlong form of U+002F is C0 AF)
  – Full-width Solidus U+FF0F, normalized KC or KD: http://site/root[EF BC 8F]system
  – Division Slash U+2215, best-fit: http://site/root[E2 88 95]system
  – Two dot leader U+2025, normalized KC or KD: http://site/root[E2 80 A5][E2 80 A5]system
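The byte sequences in these test cases can be examined with Python's standard library. This sketch percent-decodes each sequence as UTF-8 and applies NFKC; the helper name `nfkc_of` is hypothetical. It models normalization only, not the Windows best-fit behavior mentioned for U+2215.

```python
import unicodedata
from urllib.parse import unquote


def nfkc_of(percent_encoded: str) -> str:
    """Percent-decode as UTF-8, then apply NFKC normalization."""
    return unicodedata.normalize("NFKC", unquote(percent_encoded))


fullwidth_solidus = nfkc_of("%EF%BC%8F")  # U+FF0F
two_dot_leader = nfkc_of("%E2%80%A5")     # U+2025
division_slash = nfkc_of("%E2%88%95")     # U+2215
```

U+FF0F normalizes to "/" and U+2025 to "..", so a path check that runs before normalization can be bypassed. U+2215 is unchanged by NFKC, which fits the slide listing it under best-fit rather than normalization.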
Root Causes: Handling the Unexpected
• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: filter evasion, enabling code execution
Root Causes: Handling the Unexpected - Over-consumption
• Over-consuming ill-formed byte sequences
• Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>
<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
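For contrast with the over-consuming behavior above, here is how Python's built-in UTF-8 decoder treats the same four bytes when asked to replace errors. This is an illustration of one conforming decoder, not a claim about the products discussed in the talk.

```python
# 'A', a stray lead byte C2, '>', 'A'
raw = bytes([0x41, 0xC2, 0x3E, 0x41])

# Only the ill-formed byte is replaced with U+FFFD.
# The '>' that follows it is preserved.
decoded = raw.decode("utf-8", errors="replace")

print([f"U+{ord(c):04X}" for c in decoded])
```

The stray lead byte becomes U+FFFD and the 3E survives as ">". An over-consuming decoder would instead swallow the 3E along with the C2, which is what lets markup delimiters disappear.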
Root Causes: Handling the Unexpected - Character-substitution
• Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad
Root Causes: Handling the Unexpected - Character-deletion
• “deletion of noncharacters” (UTR-36)
• <scr[U+FEFF]ipt> becomes <script>
Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
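The difference between deleting an unexpected character and replacing it with U+FFFD can be sketched as follows. Both function names are hypothetical, and U+FEFF is used as the unexpected character to match the earlier slide.

```python
def delete_bom(text: str) -> str:
    """Unsafe: silently drop U+FEFF, which can join the text around it."""
    return text.replace("\ufeff", "")


def replace_bom(text: str) -> str:
    """Safer: substitute U+FFFD so the surrounding text stays split."""
    return text.replace("\ufeff", "\ufffd")


filtered_input = "<scr\ufeffipt>"
```

Deletion turns the filtered input back into "<script>", while replacement leaves a visible U+FFFD in the middle of the tag name, so it no longer parses as a script tag.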
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)
Case Study: Apple and Mozilla
• Safari and Firefox BOM consumption
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root cause: character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier:
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>
DEMO: Safari BOM injection for XSS
A Closer Look: The BOM
BOM U+FEFF
Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enabling code execution
toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”
len(x) != len(toLower(x))
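The length change is easy to observe in Python, which applies full Unicode case mappings. Under those mappings U+0130 lowercases to "i" followed by U+0307, whereas the slide's result of a bare "i" corresponds to the simple case mapping that some other platforms use. The helper `length_change` is a hypothetical name for this sketch.

```python
def length_change(text: str, op: str) -> tuple[int, int]:
    """Return (code points before, code points after) for 'lower' or 'upper'."""
    converted = text.lower() if op == "lower" else text.upper()
    return len(text), len(converted)


dotted_i = length_change("\u0130", "lower")   # İ
sharp_s = length_change("\u00df", "upper")    # ß
iota = length_change("\u0390", "upper")       # ΐ
```

A buffer sized from the input length is therefore too small for the converted string in all three cases, and the exact result depends on which casing rules the platform implements.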
Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: enabling code execution
Casing - maximum expansion factors
Operation  UTF        Factor  Sample
Lower      8          1.5     U+023A Ⱥ
           16, 32     1       U+0041 A
Upper      8, 16, 32  3       U+0390 ΐ
Normalization - maximum expansion factors
Operation  UTF        Factor  Sample
NFC        8          3X      U+1D160 𝅘𝅥𝅮
           16, 32     3X      U+FB2C שּׁ
NFD        8          3X      U+0390 ΐ
           16, 32     4X      U+1F82 ᾂ
NFKC/NFKD  8          11X     U+FDFA ﷺ
           16, 32     18X     U+FDFA ﷺ
Source: Unicode Technical Report 36
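Several of the normalization factors in the table can be recomputed with Python's `unicodedata` module. The function `expansion` is a hypothetical helper for this sketch. It measures size in bytes of the chosen encoding, using "utf-16-le" so that no byte order mark is counted.

```python
import unicodedata


def expansion(ch: str, form: str, encoding: str) -> float:
    """Encoded size after normalization divided by encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before


nfkc_utf8 = expansion("\ufdfa", "NFKC", "utf-8")       # U+FDFA
nfkc_utf16 = expansion("\ufdfa", "NFKC", "utf-16-le")  # U+FDFA
nfd_utf8 = expansion("\u0390", "NFD", "utf-8")         # U+0390
nfd_utf16 = expansion("\u1f82", "NFD", "utf-16-le")    # U+1F82
```

Sizing an output buffer as the input size times the worst-case factor for the operation and encoding in use is one way to apply the table.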
Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
Root Causes: Controlling Syntax
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enabling code execution
Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Case Study: Opera
• Unicode formatter characters exploited for XSS
  – Damage: filter evasion, controlling syntax
  – Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root cause: interpreting “white space”
  – A problem with the HTML 4.0 spec
<a href=[U+180E]onclick=alert()>
DEMO: Opera White Space Characters
MVS U+180E
Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…
Root Causes: Specifications
1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
• HTML 4.01
  – Defines four whitespace characters and explicitly leaves handling other characters up to the implementer
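One way to avoid implementer drift on white space is to make the accepted set explicit in code. The sketch below assumes the four white space characters listed in section 9.1 of HTML 4.01 (space, tab, form feed, zero-width space). Line breaks are handled separately by that spec and are not modeled here, and the names are hypothetical.

```python
# The four characters HTML 4.01 section 9.1 lists as white space.
HTML401_WHITE_SPACE = frozenset("\u0020\u0009\u000c\u200b")


def is_html401_white_space(ch: str) -> bool:
    """True only for the explicitly listed characters, nothing else."""
    return ch in HTML401_WHITE_SPACE
```

With an explicit set, U+180E is not treated as a separator, regardless of how a general-purpose "is this white space" routine on the platform happens to classify it.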
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enabling code execution, data loss
Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
March 2009 copy 2009 Chris Weber
(However punctuation not requiredhellip)
wwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with (full stop) lookalikes
httpwwwgooglecom
Katakana DotU+FF65
Attack Vectors: IDN Syntax Spoofing with / lookalikes
http://www.google.comノpathノfile.nottrusted.org
Katakana No: U+FF89
(However, punctuation not required…)
Attack Vectors: IDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
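The gap between what the browser displays and what DNS receives can be sketched with Python's built-in `idna` codec, which implements IDNA 2003. The name `bücher.example` is an illustrative choice and does not come from the slides.

```python
# Sketch: the Unicode form is what a user reads; the "xn--" form is what DNS resolves.
display_name = "b\u00fccher.example"       # hypothetical IDN, for illustration only
wire_name = display_name.encode("idna")    # ToASCII, label by label
round_trip = wire_name.decode("idna")      # ToUnicode, back to the displayed form

print(display_name)   # what the address bar may show
print(wire_name)      # what is sent to the resolver
```

A spoofing check that inspects only one of the two forms misses whatever differs in the other.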
DEMO: IDN Visual Spoofing

IDN Visual Spoofing: Solutions and Defenses (yes, there is one)
• Visual Spoofing Detection API
  – Detects confusables
  – Detects invisibles
  – Detects syntax and punctuation lookalikes
  – Detects combining mark tricks
• Currently in testing
• Release planned for Fall 2009
Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
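One way to flag the invisibles listed above is to test each code point's general category. This is a minimal sketch: the category set and the sample string are assumptions chosen for illustration, not the detection logic of the API described in this talk.

```python
import unicodedata

# General categories that render as nothing or as blank space.
# Cf = format controls; Zs/Zl/Zp = space, line and paragraph separators.
SUSPECT_CATEGORIES = {"Cf", "Zs", "Zl", "Zp"}

def find_invisibles(text):
    """Return (index, 'U+XXXX') for each invisible-looking code point except U+0020."""
    hits = []
    for i, ch in enumerate(text):
        if ch != " " and unicodedata.category(ch) in SUSPECT_CATEGORIES:
            hits.append((i, "U+%04X" % ord(ch)))
    return hits

sample = "pay\u200bpal\ufeff.example\u180e"   # hypothetical input
hits = find_invisibles(sample)
print(hits)
```

U+180E has been classified as both Zs and Cf across Unicode versions, which is why the sketch checks both categories.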
Attack Vectors: Visual Spoofing, Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com phreedom.org

Attack Vectors: Visual Spoofing, Problematic Font-rendering
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)
Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
ō looks like U+006F U+0304
ō̄ is U+006F U+0304 U+0304

Attack Vectors: Visual Spoofing with Combining Marks
• Order of combining marks
  – ȏ and ö under NFKC
<U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
<U+006F U+0311 U+0308> becomes <U+020F U+0308>
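Both combining-mark tricks can be reproduced with Python's `unicodedata` module. The helper names below are hypothetical; the repeated-mark check is one possible heuristic, not a complete defense.

```python
import unicodedata

def nfkc_points(s):
    """Normalize to NFKC and list the resulting code points as 'U+XXXX'."""
    return ["U+%04X" % ord(c) for c in unicodedata.normalize("NFKC", s)]

def has_repeated_mark(s):
    """True if the same combining mark appears twice in a row."""
    return any(a == b and unicodedata.combining(a) for a, b in zip(s, s[1:]))

# Same base letter and same two marks; only the order of the marks differs.
diaeresis_first = nfkc_points("o\u0308\u0311")
breve_first = nfkc_points("o\u0311\u0308")

print(diaeresis_first, breve_first)
print(has_repeated_mark("o\u0304\u0304"))   # doubled macron
```

The two inputs never normalize to the same string, so a comparison after NFKC still treats them as different identifiers even where a font draws them alike.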
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  – "These characters are to be avoided wherever possible because of security concerns"
  – Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
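The override characters above can be located by their bidirectional class. In this sketch the set also includes the explicit embedding controls and their terminator, which is an addition beyond the two code points on the slide; the file name is hypothetical.

```python
import unicodedata

# Bidi classes of the explicit overrides, embeddings, and the pop terminator.
EXPLICIT_BIDI = {"LRO", "RLO", "LRE", "RLE", "PDF"}

def find_bidi_controls(text):
    """Return (index, bidi class) for each explicit directional control."""
    return [(i, unicodedata.bidirectional(ch))
            for i, ch in enumerate(text)
            if unicodedata.bidirectional(ch) in EXPLICIT_BIDI]

filename = "report\u202etxt.exe"   # hypothetical name containing U+202E
found = find_bidi_controls(filename)
print(found)
```

Rejecting or visibly escaping any hit is simpler than predicting how a given renderer will reorder the text.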
Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
U+2032 PRIME
Windows best-fit P/Invoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)
How can we best-fit the s character?
• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)
To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]
Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
Case Study: Social Networking – Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: best-fit mappings

Case Study: Social Networking – Best-fit mappings
-moz-binding()
was not allowed, but…
-[U+ff4d]oz-binding()
would best-fit map
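The bypass can be simulated in a few lines. Python has no built-in best-fit conversion, so the `BEST_FIT` table below is a hypothetical stand-in holding only mappings named on the slides, and `naive_filter_allows` is an assumed simplification of the site's filter.

```python
# Hypothetical best-fit table; real tables live inside platform charset converters.
BEST_FIT = {
    "\uff4d": "m",   # FULLWIDTH LATIN SMALL LETTER M
    "\u2329": "<",   # LEFT-POINTING ANGLE BRACKET
    "\u3008": "<",   # LEFT ANGLE BRACKET
    "\u015b": "s",   # LATIN SMALL LETTER S WITH ACUTE
    "\u015d": "s",   # LATIN SMALL LETTER S WITH CIRCUMFLEX
}

def naive_filter_allows(css):
    """Stand-in filter: reject only the literal ASCII token."""
    return "-moz-binding" not in css

def best_fit_to_ascii(text):
    """Simulate a lossy conversion that substitutes lookalikes instead of failing."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)

def strict_to_ascii(text):
    """Safer alternative: raise on anything that has no exact ASCII encoding."""
    return text.encode("ascii", errors="strict")

payload = "-\uff4doz-binding()"
passed_filter = naive_filter_allows(payload)     # the filter sees no ASCII match
after_conversion = best_fit_to_ascii(payload)    # the later conversion restores the token
print(passed_filter, after_conversion)
```

The strict variant raises on the same payload, which matches the guidance to fail rather than substitute.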
Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution

NFD – Decompose (canonical)
NFC – Decompose (canonical), Recompose
NFKD – Decompose (compatibility)
NFKC – Decompose (compatibility), Recompose

İ becomes I + ◌̇
U+0130 becomes U+0049 U+0307

But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
U+FE64 becomes U+003C

toNFKC("﹤script>") = "<script>"
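The ordering problem can be shown with `unicodedata.normalize`. The validator `is_safe` is a hypothetical stand-in, and the sample uses U+FE65 for the closing bracket as well, a small variation on the slide's example so that the raw string contains no ASCII angle bracket at all.

```python
import unicodedata

def is_safe(text):
    """Stand-in validator: reject any ASCII angle bracket."""
    return "<" not in text and ">" not in text

raw = "\ufe64script\ufe65"   # SMALL LESS-THAN SIGN ... SMALL GREATER-THAN SIGN

# Wrong order: validate the raw string, then normalize what gets stored.
validated_first = is_safe(raw)
stored = unicodedata.normalize("NFKC", raw)

# Right order: normalize, then validate the normalized string.
normalized = unicodedata.normalize("NFKC", raw)
validated_after = is_safe(normalized)

print(validated_first, stored, validated_after)
```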
Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC: first defense against visual spoofing
Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
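The C0 A7 case can be worked through by hand. `naive_decode_2byte` is a hypothetical decoder that omits the shortest-form check; Python's own UTF-8 decoder is shown alongside as the strict behavior.

```python
def naive_decode_2byte(b0, b1):
    """Decode a two-byte UTF-8 sequence with no shortest-form check (the bug)."""
    return chr(((b0 & 0x1F) << 6) | (b1 & 0x3F))

overlong = bytes([0xC0, 0xA7])

# The payload bits of C0 A7 add up to 0x27, the ASCII apostrophe.
naive_result = naive_decode_2byte(overlong[0], overlong[1])

# A conforming decoder refuses the sequence instead of interpreting it.
try:
    overlong.decode("utf-8")
    strict_rejected = False
except UnicodeDecodeError:
    strict_rejected = True

print(repr(naive_result), strict_rejected)
```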
Attack Vectors: Directory traversal
How many ways can you say

Attack Vectors
• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8 U+002F: httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
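The two normalization-based test cases can be checked directly. This is a sketch in which `contains_traversal` and the sample path are hypothetical; it covers only the NFKC cases, not the overlong or best-fit ones.

```python
import unicodedata

candidates = {
    "FULLWIDTH SOLIDUS": "\uff0f",
    "TWO DOT LEADER": "\u2025",
}
# What each candidate turns into under compatibility normalization.
normalized = {name: unicodedata.normalize("NFKC", ch)
              for name, ch in candidates.items()}

def contains_traversal(path):
    """Normalize first, then look for a '..' path segment."""
    clean = unicodedata.normalize("NFKC", path)
    return ".." in clean.split("/")

tricky = "root\uff0f\u2025\uff0fsystem"       # hypothetical path
missed_by_raw_check = ".." not in tricky.split("/")

print(normalized, contains_traversal(tricky), missed_by_raw_check)
```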
Root Causes: Handling the Unexpected
• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution
Root Causes: Handling the Unexpected – Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes
<41 41>

Root Causes: Handling the Unexpected – Over-consumption
<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
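The <41 C2 3E 41> case can be reproduced with a deliberately buggy decoder. `overconsuming_decode` is a hypothetical illustration of the flaw, not any specific product's code; Python's decoder with `errors="replace"` shows the safer outcome.

```python
def overconsuming_decode(data):
    """Buggy decoder: treats any non-ASCII byte as a two-byte lead and skips
    the following byte without checking that it is a valid trail byte."""
    out, i = [], 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        else:
            i += 2          # bug: the next byte is swallowed unseen
    return "".join(out)

ill_formed = bytes([0x41, 0xC2, 0x3E, 0x41])

buggy = overconsuming_decode(ill_formed)               # the '>' (0x3E) disappears
safe = ill_formed.decode("utf-8", errors="replace")    # lone lead byte becomes U+FFFD

print(repr(buggy), repr(safe))
```

The safe result keeps the `>` that followed the stray lead byte, so markup boundaries stay where the filter saw them.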
Root Causes: Handling the Unexpected – Character-substitution
Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected – Character-deletion
"deletion of noncharacters" (UTR-36)
<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
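The difference between deleting an unexpected code point and replacing it with U+FFFD can be seen on the `<scr[U+FEFF]ipt>` example. The filter below is an assumed stand-in that blocks only the literal tag name.

```python
BOM = "\ufeff"

def filter_allows(html):
    """Stand-in filter: block the literal tag name only."""
    return "<script" not in html.lower()

payload = "<scr" + BOM + "ipt>"

passed = filter_allows(payload)              # nothing matches yet
deleted = payload.replace(BOM, "")           # what a deleting consumer produces
replaced = payload.replace(BOM, "\ufffd")    # substitution keeps the token broken

print(passed, deleted, filter_allows(replaced))
```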
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)
Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

DEMO: Safari BOM injection for XSS

A Closer Look: The BOM
BOM: U+FEFF
Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

toLower("İ") == "i"
toLower("scrİpt") == "script"

len(x) != len(toLower(x))

Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
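Python applies full Unicode case mappings, so its result for İ differs from the single-character mapping on the slides, but the same two hazards show up: the length changes, and a case operation can produce characters a filter has already ruled out. The long-s payload below is an illustrative assumption, not an example from the talk.

```python
dotted_i = "\u0130"              # LATIN CAPITAL LETTER I WITH DOT ABOVE
lowered = dotted_i.lower()       # full mapping: 'i' followed by U+0307

long_s_tag = "\u017fcript"       # LATIN SMALL LETTER LONG S + "cript" (hypothetical)

# Wrong order: the raw text passes the check, and the later case change
# manufactures the forbidden token.
passes_raw_check = "script" not in long_s_tag and "SCRIPT" not in long_s_tag
after_upper = long_s_tag.upper()

# Right order: apply the case operation first, then validate its result.
caught_after_casing = "SCRIPT" in long_s_tag.upper()

print(len(dotted_i), len(lowered), after_upper)
```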
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows
Casing - maximum expansion factors
Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
            16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390
Source: Unicode Technical Report 36

Root Causes: Buffer Overflows
Normalization - maximum expansion factors
Operation   UTF         Factor   Sample
NFC         8           3X       𝅘𝅥𝅮 U+1D160
            16, 32      3X       שּׁ U+FB2C
NFD         8           3X       ΐ U+0390
            16, 32      4X       ᾂ U+1F82
NFKC/NFKD   8           11X      ﷺ U+FDFA
            16, 32      18X
Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
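Several of the factors in the tables can be recomputed with the standard library. The helper `expansion` is hypothetical; it reports growth in UTF-8 bytes and in code points, the latter equal to UTF-16 and UTF-32 code units for these BMP samples.

```python
import unicodedata

def expansion(ch, transform):
    """Return (UTF-8 byte growth, code point growth) for one character."""
    out = transform(ch)
    byte_factor = len(out.encode("utf-8")) / len(ch.encode("utf-8"))
    point_factor = len(out) / len(ch)
    return (byte_factor, point_factor)

def nfkc(s):
    return unicodedata.normalize("NFKC", s)

fdfa = expansion("\ufdfa", nfkc)              # NFKC/NFKD row
upper_0390 = expansion("\u0390", str.upper)   # Upper row
lower_023a = expansion("\u023a", str.lower)   # Lower row

print(fdfa, upper_0390, lower_023a)
```

Sizing an output buffer from the input length alone is therefore unsafe; measure the transformed string, or allocate for the worst-case factor.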
Root Causes: Controlling Syntax
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera
• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting "white space"
  – A problem with HTML 4.0 spec

Case Study: Opera
<a href=[U+180E]onclick=alert()>

DEMO: Opera White Space Characters

Case Study: Opera
MVS: U+180E

Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…
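The disagreement over what counts as white space can be modeled with two small functions. Both are assumptions made for illustration: `filter_allows` stands in for a filter that knows only ASCII white space, and `lenient_attribute_split` stands in for a parser that also treats U+180E as a separator.

```python
import re

MVS = "\u180e"   # MONGOLIAN VOWEL SEPARATOR

# Stand-in filter: an event handler counts only if ASCII white space precedes it.
HANDLER = re.compile(r"[ \t\n\r\f]on\w+\s*=", re.IGNORECASE)

def filter_allows(tag_body):
    return HANDLER.search(tag_body) is None

def lenient_attribute_split(tag_body):
    """Stand-in parser: splits attributes on ASCII white space and on U+180E."""
    return [part for part in re.split("[ \t\n\r\f\u180e]+", tag_body) if part]

payload = "a href=" + MVS + "onclick=alert()"

print(filter_allows(payload))             # the filter sees one long attribute value
print(lenient_attribute_split(payload))   # the parser sees a separate onclick attribute
```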
Root Causes: Specifications
1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
• HTML 4.01
  – Defines four whitespace characters and explicitly leaves handling other characters up to the implementer
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
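Why a mismatch matters can be seen by decoding the same bytes under both charsets from the example. The byte pair is an illustrative assumption: under Shift_JIS its second byte, 0x5C, is a trail byte, while under ISO-8859-1 it is a backslash.

```python
raw = b"\x83\x5c"   # hypothetical attacker-controlled bytes

as_latin1 = raw.decode("iso-8859-1")   # two characters, the second is a backslash
as_sjis = raw.decode("shift_jis")      # one character; 0x5C was consumed as a trail byte

def decode_forced_utf8(data):
    """Force UTF-8 and fail on ill-formed input rather than guessing a charset."""
    return data.decode("utf-8", errors="strict")

print(repr(as_latin1), repr(as_sjis))
```

A filter that escapes backslashes or quotes under one interpretation can be undone when the user-agent picks the other.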
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsIDN Syntax Spoofing with lookalikes
Browser sees and displays a valid IDN
DNS sees Punycode
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
IDN Visual Spoofing
March 2009 copy 2009 Chris Weber
bull Visual Spoofing Detection API
ndash Detects Confusables
ndash Detects Invisibles
ndash Detections syntax and punctuation lookalikes
ndash Detects combining mark tricks
bull Currently in testing
bull Release planned for Fall 2009
wwwcasabasecuritycom
IDN Visual SpoofingSolutions and Defenses (yes there is one)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
bull Fonts render glyphs confusingly
bull Fonts render glyphs as empty white space
httpwwwgooglecom phreedomorg
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)
A is A (Lucida Sans Unicode Courier New)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Multiple combining marks
o looks like U+006F U+0304
o is U+006F U+0304 U+0304
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Order of combining marksndash ȏ and ouml under NFKC
ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt
ltU+006F U+0311U+0308gt ltU+020F U+0308gt
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)
Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  – "These characters are to be avoided wherever possible, because of security concerns"
  – Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
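
One way to flag the two override characters is to look at each character's bidirectional class, which Python exposes through `unicodedata.bidirectional`. The sketch below is illustrative only. The function name and the sample file name are assumptions, not something shown in the talk.

```python
import unicodedata

# Bidi classes of U+202D and U+202E respectively.
OVERRIDE_CLASSES = {"LRO", "RLO"}

def has_bidi_override(text):
    """True if text contains an explicit directional override character."""
    return any(unicodedata.bidirectional(ch) in OVERRIDE_CLASSES for ch in text)

# Hypothetical file name: the part after U+202E is displayed right to left,
# so the name can appear to end in a harmless-looking extension.
suspicious = "invoice\u202efdp.exe"
print(has_bidi_override(suspicious))     # True
print(has_bidi_override("invoice.pdf"))  # False
```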
Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
U+2032 PRIME
Windows best-fit P/Invoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 (Left-Pointing Angle Bracket) maps to U+003C
• U+3008 (Left Angle Bracket) maps to U+003C
How can we best-fit the s character?
• U+015B (Latin Small Letter S With Acute) maps to U+0073
• U+015D (Latin Small Letter S With Circumflex) maps to U+0073
To deal with this, specify an LPWStr type instead of LPStr: [MarshalAs(UnmanagedType.LPWStr)]
Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
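
The guidance above names .NET and Win32 APIs. As a language-neutral illustration of the same idea (fail instead of substituting a lookalike), here is a sketch in Python, whose codecs raise on unmappable characters when `errors="strict"` is used. The function name `to_legacy_bytes` and the choice of `cp1252` are assumptions for the example.

```python
def to_legacy_bytes(text, charset="cp1252"):
    """Encode text for a legacy charset, refusing characters it cannot represent."""
    try:
        return text.encode(charset, errors="strict")
    except UnicodeEncodeError as exc:
        bad = text[exc.start]
        raise ValueError(f"U+{ord(bad):04X} has no mapping in {charset}") from exc

print(to_legacy_bytes("-moz-binding()"))

try:
    # U+FF4D FULLWIDTH LATIN SMALL LETTER M is rejected, not turned into 'm'.
    to_legacy_bytes("-\uff4doz-binding()")
except ValueError as err:
    print(err)
```

Failing loudly here means a filter that ran earlier on the Unicode string cannot be bypassed by a later conversion step.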
Case Study: Social Networking – Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: Best-fit mappings
-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map
Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: Filter evasion, enable code execution
NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose
İ (U+0130) becomes I + combining dot above (U+0049 U+0307)
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ (U+FE64) becomes < (U+003C)
toNFKC("﹤script>") = "<script>"
Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC is a first defense against visual spoofing
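
The difference between the two orderings can be sketched in a few lines of Python. The validator `is_safe` is a deliberately naive stand-in (an assumption for this example), not a recommended XSS filter.

```python
import unicodedata

def is_safe(text):
    """Toy validator (assumption): reject anything containing '<'."""
    return "<" not in text

def validate_then_normalize(text):
    # Dangerous order: the check runs on the raw string.
    if not is_safe(text):
        raise ValueError("rejected")
    return unicodedata.normalize("NFKC", text)

def normalize_then_validate(text):
    # Safer order: the check sees what downstream code will see.
    normalized = unicodedata.normalize("NFKC", text)
    if not is_safe(normalized):
        raise ValueError("rejected")
    return normalized

payload = "\ufe64script>"  # U+FE64 SMALL LESS-THAN SIGN, then "script>"
print(validate_then_normalize(payload))  # <script>
```

Calling `normalize_then_validate(payload)` raises `ValueError`, because the less-than sign is already visible when the check runs.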
Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '
Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
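
"Validate and throw on error" might look like the following in Python, whose built-in UTF-8 decoder rejects overlong forms in strict mode. The wrapper name `decode_utf8_strict` is an assumption for this sketch.

```python
def decode_utf8_strict(raw):
    """Decode bytes as UTF-8, raising UnicodeDecodeError on ill-formed input."""
    return raw.decode("utf-8", errors="strict")

print(decode_utf8_strict(b"\x27"))  # the shortest form of U+0027

try:
    # C0 A7 is the overlong two-byte form of U+0027 and must be rejected.
    decode_utf8_strict(b"\xc0\xa7")
except UnicodeDecodeError as err:
    print("rejected:", err.reason)
```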
Attack Vectors: Directory traversal
How many ways can you say
• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF8, U+002F: httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, Normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader U+2025, Normalized KC or KD: httpsiterootE280A5 E280A5 system
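
The byte sequences in the test cases above can be reproduced with a short Python sketch that prints each character's UTF-8 encoding and its NFKC form. The `CANDIDATES` table and the `describe` helper are names chosen for this example. Best-fit behavior is platform-specific and is not modeled here.

```python
import unicodedata

CANDIDATES = {
    "FULLWIDTH SOLIDUS": "\uff0f",
    "DIVISION SLASH": "\u2215",
    "TWO DOT LEADER": "\u2025",
}

def describe(ch):
    """Return (UTF-8 bytes as hex, NFKC-normalized form) for one character."""
    utf8_hex = ch.encode("utf-8").hex().upper()
    nfkc = unicodedata.normalize("NFKC", ch)
    return utf8_hex, nfkc

for name, ch in CANDIDATES.items():
    utf8_hex, nfkc = describe(ch)
    print(f"U+{ord(ch):04X} {name}: UTF-8 {utf8_hex}, NFKC {nfkc!r}")
```

U+FF0F normalizes to "/" and U+2025 to "..", while U+2215 is unchanged by NFKC, which is consistent with the slide listing it under best-fit rather than normalization.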
Root Causes: Handling the Unexpected
• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, enable code execution
Root Causes: Handling the Unexpected – Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>
<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
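
For contrast with the over-consuming behavior above, here is how a decoder that does not over-consume treats the same four bytes. Python's UTF-8 decoder is used as the example, and the two helper names are assumptions for this sketch.

```python
raw = bytes([0x41, 0xC2, 0x3E, 0x41])  # 'A', lone lead byte, '>', 'A'

def decode_or_fail(data):
    """Strict decoding: ill-formed input raises UnicodeDecodeError."""
    return data.decode("utf-8", errors="strict")

def decode_with_replacement(data):
    """Replace only the ill-formed byte with U+FFFD; keep the bytes after it."""
    return data.decode("utf-8", errors="replace")

print(repr(decode_with_replacement(raw)))  # 'A\ufffd>A'
```

The '>' (0x3E) survives because only the lone 0xC2 is replaced, so markup that follows an ill-formed byte is not swallowed.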
Root Causes: Handling the Unexpected – Character-substitution
Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad
Root Causes: Handling the Unexpected – Character-deletion
"deletion of noncharacters" (UTR-36)
<scr[U+FEFF]ipt> becomes <script>
Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
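
A short Python sketch of why substitution with U+FFFD is safer than deletion. The blocklist filter is a toy (an assumption for the example), and the helper names are made up here.

```python
BLOCKLIST = ("<script",)  # toy filter, an assumption for this sketch

def passes_filter(text):
    """True if no blocked substring appears in text."""
    return not any(bad in text.lower() for bad in BLOCKLIST)

def delete_bom(text):
    # Unsafe: deleting U+FEFF joins the two halves into a blocked word.
    return text.replace("\ufeff", "")

def replace_bom(text):
    # Safer: U+FFFD REPLACEMENT CHARACTER keeps the halves apart.
    return text.replace("\ufeff", "\ufffd")

payload = "<scr\ufeffipt>"
print(passes_filter(payload))               # True: the filter sees no "<script"
print(delete_bom(payload))                  # <script>
print(passes_filter(replace_bom(payload)))  # True, and still not "<script>"
```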
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)
Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>
DEMO: Safari BOM injection for XSS
A Closer Look: The BOM
BOM U+FEFF
Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, enable code execution
toLower("İ") == "i"
toLower("scrİpt") == "script"
len(x) != len(toLower(x))
Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
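
That casing can change string length is easy to observe in Python, whose `str.lower()` and `str.upper()` apply locale-independent full case mappings. Results differ by platform and locale, so the toLower example from the earlier slide may not reproduce here. The helper `case_lengths` is a name chosen for this sketch, and ß is an extra sample not taken from the slides.

```python
samples = ["\u0130", "\u00df", "\u0390"]  # İ, ß, ΐ

def case_lengths(s):
    """Return (original, lowercased, uppercased) lengths in code points."""
    return len(s), len(s.lower()), len(s.upper())

for s in samples:
    print(f"U+{ord(s):04X}", case_lengths(s))
# U+0130 (1, 2, 1)
# U+00DF (1, 1, 2)
# U+0390 (1, 1, 3)
```

Any buffer sized from the input length before a casing call is therefore potentially too small afterwards.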
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution
Root Causes: Buffer Overflows
Casing - maximum expansion factors
Operation   UTF         Factor  Sample
Lower       8           1.5X    Ⱥ U+023A
Lower       16, 32      1X      A U+0041
Upper       8, 16, 32   3X      ΐ U+0390
Source: Unicode Technical Report #36
Normalization - maximum expansion factors
Operation   UTF         Factor  Sample
NFC         8           3X      𝅘𝅥𝅮 U+1D160
NFC         16, 32      3X      שּׁ U+FB2C
NFD         8           3X      ΐ U+0390
NFD         16, 32      4X      ᾂ U+1F82
NFKC/NFKD   8           11X     ﷺ U+FDFA
NFKC/NFKD   16, 32      18X     ﷺ U+FDFA
Source: Unicode Technical Report #36
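
Some of the factors in these tables can be recomputed with Python's `unicodedata` module. This is a sketch for checking the arithmetic, not part of the original talk, and the helper `expansion` is a name chosen for the example.

```python
import unicodedata

def expansion(original, transformed, encoding=None):
    """Growth factor in code points, or in bytes when an encoding is given."""
    if encoding is None:
        return len(transformed) / len(original)
    return len(transformed.encode(encoding)) / len(original.encode(encoding))

ligature = "\ufdfa"  # the NFKC/NFKD sample from the table
nfkc = unicodedata.normalize("NFKC", ligature)

print(expansion(ligature, nfkc))           # 18.0 (code points, as in UTF-32)
print(expansion(ligature, nfkc, "utf-8"))  # 11.0 (3 bytes grow to 33)

barred_a = "\u023a"  # the lowercase sample from the casing table
print(expansion(barred_a, barred_a.lower(), "utf-8"))  # 1.5 (2 bytes grow to 3)
```

Sizing an output buffer with the worst-case factor for the operation and encoding in use avoids the overflow the slides describe.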
Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
Root Causes: Controlling Syntax
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, enable code execution
Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Case Study: Opera
• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting "white space"
  – A problem with HTML 4.0 spec
<a href=[U+180E]onclick=alert()>
DEMO: Opera White Space Characters
MVS U+180E
Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…
Root Causes: Specifications
1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, enable code execution, data loss
Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay
Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, enable code execution
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input
Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
Tools
• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks
Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher - Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com
Tools: Visual Spoofing Detection API
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Protection Demo
Thank you
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
IDN Visual Spoofing
March 2009 copy 2009 Chris Weber
bull Visual Spoofing Detection API
ndash Detects Confusables
ndash Detects Invisibles
ndash Detections syntax and punctuation lookalikes
ndash Detects combining mark tricks
bull Currently in testing
bull Release planned for Fall 2009
wwwcasabasecuritycom
IDN Visual SpoofingSolutions and Defenses (yes there is one)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
bull Fonts render glyphs confusingly
bull Fonts render glyphs as empty white space
httpwwwgooglecom phreedomorg
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)
A is A (Lucida Sans Unicode Courier New)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Multiple combining marks
o looks like U+006F U+0304
o is U+006F U+0304 U+0304
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Order of combining marksndash ȏ and ouml under NFKC
ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt
ltU+006F U+0311U+0308gt ltU+020F U+0308gt
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
bull httpunicodeorgreportstr9
ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo
ndash forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
March 2009 copy 2009 Chris Weber
Commonly occur in charset transformations and even innocuous APIrsquos
Impact Filter evasion Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When prime becomes
U+2032 PRIME
wwwcasabasecuritycom
Root CausesBest-fit mappings
March 2009 copy 2009 Chris Weber
Net runtime will marshall a string as LPStr to a pinvoke function
How can we best-fit the lt character
bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket
How can we best-fit the s character
bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex
To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]
wwwcasabasecuritycom
Windows best-fit pInvokeBest-fit mappings
March 2009 copy 2009 Chris Weber
bull Scrutinize charactercharset manipulation APIrsquos
bull Use EncoderFallback with SystemTextEncoding
bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
bull Use Unicode end-to-end
wwwcasabasecuritycom
Root CausesGuidance for Best-Fit mappings
March 2009 copy 2009 Chris Weber
bull A popular social networking site in 2008
bull Implemented complex filtering logic to prevent XSS
ndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting
ndash Root Cause best-fit mappings
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
-moz-binding()
was not allowed buthellip
-[U+ff4d]oz-binding()
would best-fit map
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
Normalizing strings after validation is dangerous
Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
NFD - Decompose (canonical)
NFC - Decompose (canonical) Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility) Recompose
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
İ becomes I +
wwwcasabasecuritycom
Root CausesNormalization
U+0130 U+0049 U+0307
March 2009 copy 2009 Chris Weber
But are there dangerous characters
You bethellip with NFKC and NFKD you could control HTML or other parsing
﹤ becomes lt
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
﹤ becomes lt
toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
Normalize strings before validation
NFKC first defense against Visual spoofing
wwwcasabasecuritycom
Root CausesGuidance for Normalization
March 2009 copy 2009 Chris Weber
Non-shortest or overlong UTF-8
Impact Filter evasion Enable code execution
Application gets C0A7
OSFramework sees 27
Database gets
wwwcasabasecuritycom
Root CausesNon-shortest form UTF-8
March 2009 copy 2009 Chris Weber
bull Unicode specification forbids
ndash Generation of non-shortest form
ndash Interpretation of non-shortest form for BMP
bull Validate UTF-8 encoding (throw on error)
wwwcasabasecuritycom
Root CausesGuidance for Non-shortest form UTF-8
March 2009 copy 2009 Chris Weber
How many ways can you say
wwwcasabasecuritycom
Attack VectorsDirectory traversal
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Directory traversal test casesndash httpsiterootsystem
ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem
ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem
ndash Division Slash U+2215 best-fithttpsiteroot E28895system
ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system
wwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Unassigned code points
ndash U+2073
bull Illegal code points
ndash Half a surrogate pair
bull Code points with special meaning
ndash U+FEFF is the BOM
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesHandling the Unexpected
March 2009 copy 2009 Chris Weber
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
lt41 C2 3E 41gt becomes
lt41 41gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
ltimg src=[0xC2]gt onerror=alert(1)ltbr gt
becomes
ltimg src=gt onerror=alert(1)ltbr gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
Correcting insecurely rather than failing
ndash Substituting a lsquorsquo or a lsquorsquo would be bad
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-substitution
March 2009 copy 2009 Chris Weber
ldquodeletion of noncharactersrdquo (UTR-36)
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
ltscr[U+FEFF]iptgt becomes ltscriptgt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
bull Fail or error
bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe
wwwcasabasecuritycom
Root CausesSolutions for Handling the Unexpected
March 2009 copy 2009 Chris Weber
bull Bypass filters WAFrsquos NIDS and validation
bull Exploit delivery techniques
ndash Eg Cross-site scripting (buffer overflow of the Web)
wwwcasabasecuritycom
Attack VectorsFilter evasion
March 2009 copy 2009 Chris Weber
Safari and Firefox BOM consumptionndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting
ndash Root Cause Character deletion
lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt
Can be nastier
lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt
wwwcasabasecuritycom
Case Study Apple and Mozilla
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Safari BOM injection for XSS
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
A Closer Look The BOM
BOMU+FEFF
March 2009 copy 2009 Chris Weber
bull Attackers manipulate casing operations to inject otherwise prohibited characters
bull Casing can multiply the buffer sizes needed
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
toLower(ldquoİrdquo) == ldquoirdquo
toLower(ldquoscrİptrdquo) == ldquoscriptrdquo
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
len(x) = len(toLower(x))
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
bull Perform casing operations before validation
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Casing
March 2009 copy 2009 Chris Weber
bull Incorrect assumptions about string sizes (chars vs bytes)
bull Improper width calculations
bull Impact Enable code execution
wwwcasabasecuritycom
Root CausesBuffer Overflows
March 2009 copy 2009 Chris Weber
Casing - maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
Lower 8 15 Ⱥ U+023A
16 32 1 A U+0041
Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
Normalization- maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
NFC8 3X 119136 U+1D160
16 32 3X ש U+FB2C
NFD8 3X ΐ U+0390
16 32 4X ᾂ U+1F82
NFKCNFKD8 11X
ملسو هيلع هللا ىلص U+FDFA16 32 18X
Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
bull Know the difference between bytes and chars
bull Secure coding
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Buffer Overflows
March 2009 copy 2009 Chris Weber
bull White space and line breaks
ndash Eg when U+180E acts like U+0020
bull Quotation marks
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesControlling Syntax
March 2009 copy 2009 Chris Weber
bull Manipulate HTML parsers and javascriptinterpreters
bull Control protocols
wwwcasabasecuritycom
Attacks and ExploitsControlling syntax
March 2009 copy 2009 Chris Weber
bull Unicode formatter characters exploited for XSS
ndash Damage Filter evasion controlling syntax
ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting
ndash Root Cause Interpreting ldquowhite spacerdquo
ndash A problem with HTML 40 spec
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
lta href=[U+180E]onclick=alert()gt
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Opera White Space Characters
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Case Study Opera
MVSU+180E
March 2009 copy 2009 Chris Weber
bull Question specifications
bull Be carefulhellip
wwwcasabasecuritycom
Root CausesGuidance for Controlling Syntax
March 2009 copy 2009 Chris Weber
1) Character stabilityndash IDNANameprep based on Unicode 32
2) Designsndash Specs are carefully designed but not always perfect
bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of
U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo
ndash HTML 401 bull Defines four whitespace characters and explicitly leaves
handling other characters up to implementer
wwwcasabasecuritycom
Root CausesSpecifications
March 2009 copy 2009 Chris Weber
bull Converting between charsets is dangerous
bull Mapping tables and algorithms vary across platforms
bull Impact Filter evasion Enable code execution Data-loss
wwwcasabasecuritycom
Root CausesCharset Transformations
March 2009 copy 2009 Chris Weber
bull Avoid if possible
bull Use Unicode as the broker
bull Beware the PUA mappings
bull Transform case and normalize prior to validation and redisplay
wwwcasabasecuritycom
Root CausesGuidance for Charset Transformations
March 2009 copy 2009 Chris Weber
bull Some charset identifiers are ill-defined
bull Vendor implementations vary
bull User-agents may sniff if confused
bull Attackers manipulate behavior
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCharset Mismatches
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Root CausesCharset Mismatches
Content-Type charset=ISO-8859-1
ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt
Attacker-controlled input
March 2009 copy 2009 Chris Weber
bull Force UTF-8
bull Error if uncertain
wwwcasabasecuritycom
Root CausesGuidance for Charset Mismatches
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Watcher
ndash Web-app security testing and auditing
bull Visual Spoofing Detection API
ndash Providing guarantees against Visual Spoofing and Homograph attacks
wwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
bull Unicode transformation hot-spotsbull User-controlled HTMLbull Cross-domain issuesbull Insecure cookiesbull Insecure HTTPHTTPS transitionsbull SSL protocol and certificate issuesbull XSS hot-spotsbull Flash issuesbull Silverlight issuesbull Information disclosure
wwwcasabasecuritycom
ToolsWatcher ndash Some of the Passive Checks Included
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
httpwebsecuritytoolcodeplexcom
wwwcasabasecuritycom
ToolsWatcher - Web-app Security Testing and Auditing
March 2009 copy 2009 Chris Weber
bull Problemndash Unicode enables visual-spoofing-maximus
bull Solutionndash Confusable detection
ndash Invisibles detection
ndash Syntax spoof detection
ndash more
wwwcasabasecuritycom
ToolsVisual Spoofing Detection API
March 2009 copy 2009 Chris Weber
bull Cross-platform component library written in C
bull Can be applied in user-agents or any softwarendash Browsers
ndash Email clients
bull Planned for release Fall 2009
bull Email me with questions
wwwcasabasecuritycom
ToolsVisual Spoofing Detection API
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
ToolsVisual Spoofing Protection Demo
Thank you
Contact me with questions new test cases or ideas to share
Visit my website for test cases Unicode and security tools and the Anti-Visual-Spoofing API
Chris WeberwwwlookoutnetCasaba Security
wwwcasabasecuritycom
March 2009 copy 2009 Chris Weber
bull Visual Spoofing Detection API
ndash Detects Confusables
ndash Detects Invisibles
ndash Detections syntax and punctuation lookalikes
ndash Detects combining mark tricks
bull Currently in testing
bull Release planned for Fall 2009
wwwcasabasecuritycom
IDN Visual SpoofingSolutions and Defenses (yes there is one)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsThe Invisibles
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
bull Fonts render glyphs confusingly
bull Fonts render glyphs as empty white space
httpwwwgooglecom phreedomorg
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing Problematic Font-rendering
middotmiddotmiddot is middotmiddotmiddot (Arial Gothic)
A is A (Lucida Sans Unicode Courier New)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Multiple combining marks
o looks like U+006F U+0304
o is U+006F U+0304 U+0304
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Combining Marks
bull Order of combining marksndash ȏ and ouml under NFKC
ltU+006F U+0308U+0311gt ltU+00F6 U+0311gt
ltU+006F U+0311U+0308gt ltU+020F U+0308gt
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
bull httpunicodeorgreportstr9
ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo
ndash forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
March 2009 copy 2009 Chris Weber
Commonly occur in charset transformations and even innocuous APIrsquos
Impact Filter evasion Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When prime becomes
U+2032 PRIME
wwwcasabasecuritycom
Root CausesBest-fit mappings
March 2009 copy 2009 Chris Weber
Net runtime will marshall a string as LPStr to a pinvoke function
How can we best-fit the lt character
bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket
How can we best-fit the s character
bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex
To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]
wwwcasabasecuritycom
Windows best-fit pInvokeBest-fit mappings
March 2009 copy 2009 Chris Weber
bull Scrutinize charactercharset manipulation APIrsquos
bull Use EncoderFallback with SystemTextEncoding
bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
bull Use Unicode end-to-end
wwwcasabasecuritycom
Root CausesGuidance for Best-Fit mappings
March 2009 copy 2009 Chris Weber
bull A popular social networking site in 2008
bull Implemented complex filtering logic to prevent XSS
ndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting
ndash Root Cause best-fit mappings
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
-moz-binding()
was not allowed buthellip
-[U+ff4d]oz-binding()
would best-fit map
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
Normalizing strings after validation is dangerous
Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
NFD - Decompose (canonical)
NFC - Decompose (canonical) Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility) Recompose
wwwcasabasecuritycom
Root CausesNormalization
İ becomes I + ◌̇ (under NFD or NFKD)
U+0130 → U+0049 U+0307
Root Causes: Normalization
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
U+FE64 → U+003C
toNFKC("﹤script>") = "<script>"
Root Causes: Normalization
Normalize strings before validation
NFKC is a first defense against visual spoofing
Root Causes: Guidance for Normalization
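The ordering problem can be shown with Python's standard unicodedata module. A minimal sketch, assuming a toy validator that only rejects ASCII angle brackets; the function names are illustrative. The payload uses U+FE65 for the closing bracket as well, so the raw string contains no ASCII bracket at all.

```python
import unicodedata


def is_safe(text: str) -> bool:
    """Toy validator: reject anything containing an ASCII angle bracket."""
    return "<" not in text and ">" not in text


def validate_then_normalize(text: str) -> str:
    # Dangerous order: the check runs on the raw input only.
    if not is_safe(text):
        raise ValueError("rejected")
    return unicodedata.normalize("NFKC", text)


def normalize_then_validate(text: str) -> str:
    # Safer order: the check sees the same string that is used afterwards.
    normalized = unicodedata.normalize("NFKC", text)
    if not is_safe(normalized):
        raise ValueError("rejected")
    return normalized


payload = "\ufe64script\ufe65"   # SMALL LESS-THAN SIGN, SMALL GREATER-THAN SIGN
print(validate_then_normalize(payload))
```

With the second function the same payload raises ValueError, because the brackets exist by the time the check runs.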
Non-shortest or overlong UTF-8
Impact: filter evasion; enables code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '
Root Causes: Non-shortest form UTF-8
• Unicode specification forbids
– Generation of non-shortest form
– Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
Root Causes: Guidance for Non-shortest form UTF-8
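To make the C0 A7 example concrete, here is a short sketch contrasting a lenient decoder with a validating one. The naive decoder is a deliberately broken illustration written for this example, not any real library's implementation; the strict path uses Python's built-in UTF-8 codec.

```python
def naive_decode_2byte(data: bytes) -> str:
    """Decode a 2-byte UTF-8 sequence WITHOUT checking for overlong forms."""
    lead, trail = data
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


overlong = b"\xc0\xa7"

# A lenient decoder maps the overlong form to U+0027 (apostrophe).
lenient = naive_decode_2byte(overlong)

# Python's built-in decoder validates and raises on non-shortest forms.
try:
    overlong.decode("utf-8")
    strict_result = "accepted"
except UnicodeDecodeError:
    strict_result = "rejected"

print(repr(lenient), strict_result)
```

A filter that inspects raw bytes never sees 0x27, while any lenient component downstream does.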
How many ways can you say
Attack Vectors: Directory traversal
• Directory traversal test cases
– httpsiterootsystem
– Overlong UTF-8, U+002F: httpsiterootC0AEsystem
– Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
– Division Slash U+2215, best-fit: httpsiteroot E28895system
– Two Dot Leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
Attack Vectors
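The normalization-based test cases above can be checked with the standard unicodedata module. A rough sketch, with a hypothetical contains_traversal helper. Note that DIVISION SLASH has no compatibility decomposition, so it only becomes a path separator through a best-fit conversion, which this check does not model.

```python
import unicodedata

candidates = {
    "FULLWIDTH SOLIDUS": "\uff0f",
    "DIVISION SLASH": "\u2215",
    "TWO DOT LEADER": "\u2025",
}

for name, ch in candidates.items():
    folded = unicodedata.normalize("NFKC", ch)
    print(f"{name}: UTF-8 {ch.encode('utf-8').hex().upper()} -> NFKC {folded!r}")


def contains_traversal(path: str) -> bool:
    """Look for '..' path segments after compatibility normalization."""
    normalized = unicodedata.normalize("NFKC", path)
    return ".." in normalized.split("/")
```

For example, contains_traversal("root\uff0f\u2025\uff0fsystem") reports a traversal, while a check on the raw string would not.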
• Unassigned code points
– U+2073
• Illegal code points
– Half a surrogate pair
• Code points with special meaning
– U+FEFF is the BOM
• Impact: filter evasion; enables code execution
Root Causes: Handling the Unexpected
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes
<41 41>
Root Causes: Handling the Unexpected – Over-consumption
<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
Root Causes: Handling the Unexpected – Over-consumption
Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad
Root Causes: Handling the Unexpected – Character-substitution
"deletion of noncharacters" (UTR-36)
Root Causes: Handling the Unexpected – Character-deletion
<scr[U+FEFF]ipt> becomes <script>
Root Causes: Handling the Unexpected – Character-deletion
• Fail or error
• Use U+FFFD instead
– A common alternative is ‘?’, which can be safe
Root Causes: Solutions for Handling the Unexpected
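Both options above are available in Python's built-in codecs. A small sketch using the byte sequence 41 C2 3E 41 from the over-consumption slides; reject_bom_inside is a hypothetical helper that treats a stray U+FEFF as an error rather than deleting it.

```python
ILL_FORMED = bytes([0x41, 0xC2, 0x3E, 0x41])   # 'A', stray lead byte, '>', 'A'


def decode_or_fail(data: bytes) -> str:
    """Option 1: fail on any ill-formed sequence."""
    return data.decode("utf-8", errors="strict")


def decode_with_replacement(data: bytes) -> str:
    """Option 2: substitute U+FFFD without swallowing the bytes that follow."""
    return data.decode("utf-8", errors="replace")


def reject_bom_inside(text: str) -> str:
    """Treat U+FEFF after the first position as an error instead of deleting it."""
    if "\ufeff" in text[1:]:
        raise ValueError("unexpected U+FEFF inside text")
    return text


replaced = decode_with_replacement(ILL_FORMED)
print(replaced.encode("unicode_escape").decode("ascii"))
```

The replacement path keeps the '>' that follows the bad lead byte, so the markup structure seen by a filter and by the consumer stays the same.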
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
– E.g. cross-site scripting (buffer overflow of the Web)
Attack Vectors: Filter evasion
Safari and Firefox BOM consumption
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>
Case Study: Apple and Mozilla
DEMO
Safari BOM injection for XSS
A Closer Look: The BOM
BOM: U+FEFF
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion; enables code execution
Root Causes: Casing
toLower("İ") == "i"
toLower("scrİpt") == "script"
Root Causes: Casing
len(x) != len(toLower(x))
Root Causes: Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
– ICU, .NET
Root Causes: Guidance for Casing
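A short sketch of both casing points, using Python's built-in full case mappings. Results differ between languages and locales, so this shows Python's behaviour only. The KELVIN SIGN example is an assumption added for illustration and is not from the slides, and lower_then_check is a hypothetical helper.

```python
def lower_then_check(text: str, blocked: str) -> bool:
    """Lowercase first, then test the result against a blocked substring."""
    return blocked not in text.lower()


# Case mapping can change the number of code points in a string.
for ch in ("\u0130", "\u00df", "\u0390"):
    print(f"U+{ord(ch):04X}: len={len(ch)} "
          f"lower_len={len(ch.lower())} upper_len={len(ch.upper())}")

# Illustration only: U+212A KELVIN SIGN lowercases to an ASCII 'k', so a check
# on the raw string misses what a check on the lowercased string catches.
raw = "\u212aey"
print("key" in raw, lower_then_check(raw, "key"))
```

Sizing a buffer from the input length before calling a case conversion is unsafe for the same reason the lengths above differ.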
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enables code execution
Root Causes: Buffer Overflows
Casing: maximum expansion factors
Root Causes: Buffer Overflows
Operation  UTF        Factor  Sample
Lower      8          1.5     Ⱥ U+023A
Lower      16, 32     1       A U+0041
Upper      8, 16, 32  3       ΐ U+0390
Source: Unicode Technical Report 36
Normalization: maximum expansion factors
Root Causes: Buffer Overflows
Operation  UTF     Factor  Sample
NFC        8       3X      𝅘𝅥𝅮 U+1D160
NFC        16, 32  3X      שּׁ U+FB2C
NFD        8       3X      ΐ U+0390
NFD        16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
NFKC/NFKD  16, 32  18X     ﷺ U+FDFA
Source: Unicode Technical Report 36
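The normalization factors in the table can be reproduced with the standard library. A minimal sketch; the expansion helper is hypothetical, and the little-endian codec names are used only so that no BOM is counted in the sizes.

```python
import unicodedata


def expansion(ch: str, form: str, encoding: str) -> float:
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before


rows = [
    ("NFC", "utf-8", "\U0001d160"),
    ("NFC", "utf-16-le", "\ufb2c"),
    ("NFD", "utf-8", "\u0390"),
    ("NFD", "utf-32-le", "\u1f82"),
    ("NFKC", "utf-8", "\ufdfa"),
    ("NFKC", "utf-16-le", "\ufdfa"),
]
for form, enc, ch in rows:
    print(f"{form:5} {enc:10} U+{ord(ch):04X} x{expansion(ch, form, enc):g}")
```

An output buffer sized from the input length alone can therefore be too small by more than an order of magnitude in the worst case.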
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
– ICU, .NET
Root Causes: Guidance for Buffer Overflows
• White space and line breaks
– E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion; enables code execution
Root Causes: Controlling Syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Attacks and Exploits: Controlling syntax
• Unicode formatter characters exploited for XSS
– Damage: Filter evasion, controlling syntax
– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
– Root cause: Interpreting "white space"
– A problem with the HTML 4.0 spec
Case Study: Opera
<a href=[U+180E]onclick=alert()>
Case Study: Opera
DEMO
Opera White Space Characters
Case Study: Opera
MVS: U+180E
• Question specifications
• Be careful…
Root Causes: Guidance for Controlling Syntax
1) Character stability
– IDNA/Nameprep based on Unicode 3.2
2) Designs
– Specs are carefully designed but not always perfect
• This could have been a problem
– "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
– HTML 4.01
• Defines four whitespace characters and explicitly leaves handling other characters up to the implementer
Root Causes: Specifications
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion; enables code execution; data loss
Root Causes: Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
Root Causes: Guidance for Charset Transformations
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion; enables code execution
Root Causes: Charset Mismatches
Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input
• Force UTF-8
• Error if uncertain
Root Causes: Guidance for Charset Mismatches
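Why a mismatch matters: the same bytes mean different things under each declared charset. A sketch using the two charsets from the example above; decode_body is a hypothetical helper showing the "error if uncertain" rule, not part of any framework.

```python
def decode_body(body: bytes, header_charset: str, meta_charset: str) -> str:
    """Hypothetical helper: refuse to guess when charset declarations disagree."""
    if header_charset.lower() != meta_charset.lower():
        raise ValueError("conflicting charset declarations")
    return body.decode(header_charset, errors="strict")


ambiguous = b"\x83\x5c"
as_latin1 = ambiguous.decode("iso-8859-1")   # two characters, the second is a backslash
as_sjis = ambiguous.decode("shift_jis")      # a single katakana character
print(len(as_latin1), len(as_sjis))
```

A filter working under one interpretation sees a backslash that the browser, working under the other, never does.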
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
Exploiting Unicode-enabled Software: Agenda
• Watcher
– Web-app security testing and auditing
• Visual Spoofing Detection API
– Providing guarantees against Visual Spoofing and Homograph attacks
Tools
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher – Some of the Passive Checks Included
http://websecuritytool.codeplex.com
Tools: Watcher – Web-app Security Testing and Auditing
• Problem
– Unicode enables visual-spoofing-maximus
• Solution
– Confusable detection
– Invisibles detection
– Syntax spoof detection
– more
Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
– Browsers
– Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Detection API
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net, Casaba Security
Attack Vectors: The Invisibles
U+200B (ZERO WIDTH SPACE)
U+180E (MONGOLIAN VOWEL SEPARATOR)
U+FEFF (ZERO WIDTH NO-BREAK SPACE)
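Detecting these characters is straightforward once they are listed explicitly. A minimal sketch covering only the three code points named above; find_invisibles is a hypothetical helper, and a real detector would need a much larger set.

```python
# The three code points listed on the slide; a deliberately incomplete set.
INVISIBLES = {"\u200b", "\u180e", "\ufeff"}


def find_invisibles(text: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') for each listed invisible character found."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text) if ch in INVISIBLES]


print(find_invisibles("pay\u200bpal.example"))
```

An explicit set is used rather than a general-category test because the category assigned to U+180E has changed between Unicode versions.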
Attack Vectors: Visual Spoofing – Problematic Font-rendering
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
httpwwwgooglecom phreedomorg
Attack Vectors: Visual Spoofing – Problematic Font-rendering
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode, Courier New)
Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
ō looks like <U+006F U+0304>
ō̄ is <U+006F U+0304 U+0304>
Attack Vectors: Visual Spoofing with Combining Marks
• Order of combining marks
– ȏ and ö under NFKC
<U+006F U+0308 U+0311> → <U+00F6 U+0311>
<U+006F U+0311 U+0308> → <U+020F U+0308>
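The two sequences above can be checked directly with unicodedata. A short sketch; has_repeated_mark is a hypothetical helper for the multiple-combining-marks case and only looks for the same mark applied twice in a row.

```python
import unicodedata

a = "o\u0308\u0311"   # o, COMBINING DIAERESIS, COMBINING INVERTED BREVE
b = "o\u0311\u0308"   # same marks in the opposite order

nfkc_a = unicodedata.normalize("NFKC", a)
nfkc_b = unicodedata.normalize("NFKC", b)
print([f"U+{ord(c):04X}" for c in nfkc_a])
print([f"U+{ord(c):04X}" for c in nfkc_b])


def has_repeated_mark(text: str) -> bool:
    """Flag the same combining mark applied twice in a row."""
    return any(
        x == y and unicodedata.combining(x) != 0
        for x, y in zip(text, text[1:])
    )
```

Both marks share a combining class, so normalization keeps their order and the two strings stay distinct even though they can render alike.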
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)
Attack Vectors: Visual Spoofing with Bidi – Explicit Directional Overrides
• http://unicode.org/reports/tr9/
– "These characters are to be avoided wherever possible because of security concerns"
– forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
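The two override characters can be detected through their bidirectional class. A minimal sketch with a hypothetical has_bidi_override helper; it covers only the explicit overrides listed above, not other bidi controls.

```python
import unicodedata

OVERRIDE_CLASSES = {"LRO", "RLO"}   # bidi classes of U+202D and U+202E


def has_bidi_override(text: str) -> bool:
    """Return True when the text contains an explicit directional override."""
    return any(unicodedata.bidirectional(ch) in OVERRIDE_CLASSES for ch in text)


print(has_bidi_override("invoice\u202etxt.exe"))
print(has_bidi_override("invoice.txt"))
```

Rejecting such input outright is usually simpler than trying to render it safely.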
Commonly occur in charset transformations and even innocuous APIrsquos
Impact Filter evasion Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When prime becomes
U+2032 PRIME
wwwcasabasecuritycom
Root CausesBest-fit mappings
March 2009 copy 2009 Chris Weber
Net runtime will marshall a string as LPStr to a pinvoke function
How can we best-fit the lt character
bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket
How can we best-fit the s character
bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex
To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]
wwwcasabasecuritycom
Windows best-fit pInvokeBest-fit mappings
March 2009 copy 2009 Chris Weber
bull Scrutinize charactercharset manipulation APIrsquos
bull Use EncoderFallback with SystemTextEncoding
bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
bull Use Unicode end-to-end
wwwcasabasecuritycom
Root CausesGuidance for Best-Fit mappings
March 2009 copy 2009 Chris Weber
bull A popular social networking site in 2008
bull Implemented complex filtering logic to prevent XSS
ndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting
ndash Root Cause best-fit mappings
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
-moz-binding()
was not allowed buthellip
-[U+ff4d]oz-binding()
would best-fit map
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
Normalizing strings after validation is dangerous
Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
NFD - Decompose (canonical)
NFC - Decompose (canonical) Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility) Recompose
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
İ becomes I +
wwwcasabasecuritycom
Root CausesNormalization
U+0130 U+0049 U+0307
March 2009 copy 2009 Chris Weber
But are there dangerous characters
You bethellip with NFKC and NFKD you could control HTML or other parsing
﹤ becomes lt
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
﹤ becomes lt
toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
Normalize strings before validation
NFKC first defense against Visual spoofing
wwwcasabasecuritycom
Root CausesGuidance for Normalization
March 2009 copy 2009 Chris Weber
Non-shortest or overlong UTF-8
Impact Filter evasion Enable code execution
Application gets C0A7
OSFramework sees 27
Database gets
wwwcasabasecuritycom
Root CausesNon-shortest form UTF-8
March 2009 copy 2009 Chris Weber
bull Unicode specification forbids
ndash Generation of non-shortest form
ndash Interpretation of non-shortest form for BMP
bull Validate UTF-8 encoding (throw on error)
wwwcasabasecuritycom
Root CausesGuidance for Non-shortest form UTF-8
March 2009 copy 2009 Chris Weber
How many ways can you say
wwwcasabasecuritycom
Attack VectorsDirectory traversal
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Directory traversal test casesndash httpsiterootsystem
ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem
ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem
ndash Division Slash U+2215 best-fithttpsiteroot E28895system
ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system
wwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Unassigned code points
ndash U+2073
bull Illegal code points
ndash Half a surrogate pair
bull Code points with special meaning
ndash U+FEFF is the BOM
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesHandling the Unexpected
March 2009 copy 2009 Chris Weber
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
lt41 C2 3E 41gt becomes
lt41 41gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
ltimg src=[0xC2]gt onerror=alert(1)ltbr gt
becomes
ltimg src=gt onerror=alert(1)ltbr gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
Correcting insecurely rather than failing
ndash Substituting a lsquorsquo or a lsquorsquo would be bad
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-substitution
March 2009 copy 2009 Chris Weber
ldquodeletion of noncharactersrdquo (UTR-36)
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
ltscr[U+FEFF]iptgt becomes ltscriptgt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
bull Fail or error
bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe
wwwcasabasecuritycom
Root CausesSolutions for Handling the Unexpected
March 2009 copy 2009 Chris Weber
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (the buffer overflow of the Web)

Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')">
Can be nastier:
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">
DEMO: Safari BOM injection for XSS

A Closer Look: The BOM
BOM: U+FEFF

Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution
toLower("İ") == "i"
toLower("scrİpt") == "script"
len(x) != len(toLower(x))

Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET

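The ordering advice can be illustrated with a small Python sketch. The block-list check and the two wrapper functions are hypothetical and exist only for this example. Note one difference from the slide: Python's `str.lower()` applies the full Unicode case mapping, so U+0130 lowercases to "i" followed by U+0307 rather than to a bare "i"; U+212A KELVIN SIGN is used here because it lowercases to plain ASCII "k".

```python
KELVIN = "\u212a"                # KELVIN SIGN, lowercases to ASCII 'k'
DOTTED_I = "\u0130"              # LATIN CAPITAL LETTER I WITH DOT ABOVE
IOTA_DIALYTIKA_TONOS = "\u0390"  # sample character from the UTR-36 casing table


def is_allowed(text: str) -> bool:
    """Hypothetical block-list check for the substring 'risk'."""
    return "risk" not in text


def validate_then_lower(text: str) -> str:
    """Unsafe order: the check runs before the casing operation."""
    if not is_allowed(text):
        raise ValueError("blocked")
    return text.lower()


def lower_then_validate(text: str) -> str:
    """Order recommended on the slide: case first, validate second."""
    lowered = text.lower()
    if not is_allowed(lowered):
        raise ValueError("blocked")
    return lowered


print(validate_then_lower("ris" + KELVIN))   # the blocked word gets through
print(len(DOTTED_I), len(DOTTED_I.lower()))  # length changes under casing
```

The same input passed to `lower_then_validate` raises `ValueError`, because the check sees the string in the form that later stages will use.
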
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Casing: maximum expansion factors
Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
            16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390
Source: Unicode Technical Report 36

Normalization: maximum expansion factors
Operation   UTF         Factor   Sample
NFC         8           3X       U+1D160
            16, 32      3X       U+FB2C
NFD         8           3X       ΐ U+0390
            16, 32      4X       ᾂ U+1F82
NFKC/NFKD   8           11X      ﷺ U+FDFA
            16, 32      18X      ﷺ U+FDFA
Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

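The expansion factors in the normalization table can be checked with Python's standard `unicodedata` module. The helper name `expansion` is made up for this sketch; it compares encoded sizes before and after normalization, which is the quantity a buffer calculation has to allow for.

```python
import unicodedata


def expansion(text: str, form: str, encoding: str) -> float:
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(text.encode(encoding))
    after = len(unicodedata.normalize(form, text).encode(encoding))
    return after / before


# Sample characters taken from the table above.
print(expansion("\ufdfa", "NFKC", "utf-8"))      # worst case for UTF-8
print(expansion("\ufdfa", "NFKC", "utf-16-le"))  # worst case for UTF-16
print(expansion("\u1f82", "NFD", "utf-16-le"))
print(expansion("\u0390", "NFD", "utf-8"))
```

A buffer sized from the input length alone would be too small by these factors, so output size should be measured after the transformation or bounded by the worst case.
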
Root Causes: Controlling Syntax
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera
• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with the HTML 4.0 spec
<a href=[U+180E]onclick=alert()>
DEMO: Opera White Space Characters
MVS (Mongolian Vowel Separator): U+180E

Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…

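One defensive reading of "be careful" is to flag any character that a parser might treat as white space or that renders invisibly. The sketch below is an assumption-laden example, not a complete defence: the function name and the set of permitted ASCII white space are choices made for this illustration, and the check relies on Unicode general categories from Python's `unicodedata` module.

```python
import unicodedata

# Assumption for this sketch: only these ASCII characters count as
# legitimate white space in the input being checked.
ASCII_SPACES = {" ", "\t", "\n", "\r", "\f"}


def suspicious_separators(text: str) -> list:
    """Return code points that may act as white space or be invisible.

    Any character in a Z* (separator) or C* (control, format, unassigned)
    general category is reported unless it is plain ASCII white space.
    """
    return [
        "U+%04X" % ord(ch)
        for ch in text
        if ch not in ASCII_SPACES and unicodedata.category(ch)[0] in ("Z", "C")
    ]


print(suspicious_separators("<a href=\u180eonclick=alert()>"))
```

Checking the category prefix rather than one exact category keeps the result stable for U+180E, whose general category has differed between Unicode versions.
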
Root Causes: Specifications
1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain

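To see why a mismatch between two charset declarations matters, the following Python sketch decodes the same two bytes under both charsets named on the slide. The `decode_declared` helper is a hypothetical policy function for this example; it shows one way to "force UTF-8" and "error if uncertain", not a prescribed implementation.

```python
# In Shift_JIS this byte pair is a single katakana character; read as
# ISO-8859-1 the second byte is an ordinary backslash.
RAW = b"\x83\x5c"


def decode_declared(raw: bytes, declared: str) -> str:
    """Hypothetical policy: accept only UTF-8, and fail on any doubt."""
    if declared.strip().lower() not in ("utf-8", "utf8"):
        raise ValueError("unsupported charset: %r" % declared)
    return raw.decode("utf-8", errors="strict")


print(repr(RAW.decode("iso-8859-1")))  # a filter may see a backslash here
print(repr(RAW.decode("shift_jis")))   # the browser may see one character
```

A filter that inspects the bytes under one charset while the user-agent renders them under the other is looking at a different string from the one that gets interpreted.
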
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools
• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher (Some of the Passive Checks Included)
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher (Web-app Security Testing and Auditing)
http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo

Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net, Casaba Security

Attack Vectors: Visual Spoofing (Problematic Font-rendering)
• Fonts render glyphs confusingly
• Fonts render glyphs as empty white space
http://www.google.com  phreedom.org
··· is ··· (Arial Gothic)
A is A (Lucida Sans Unicode Courier New)

Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
  ō looks like <U+006F U+0304>
  ō̄ is <U+006F U+0304 U+0304>
• Order of combining marks
  – ȏ and ö under NFKC
  <U+006F U+0308 U+0311> becomes <U+00F6 U+0311>
  <U+006F U+0311 U+0308> becomes <U+020F U+0308>

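Both combining-mark cases can be reproduced with Python's `unicodedata` module. The helper `has_repeated_mark` is a hypothetical check written for this sketch; it flags the stacked-macron case, which normalization alone does not remove.

```python
import unicodedata


def has_repeated_mark(text: str) -> bool:
    """True if the same combining mark appears twice in a row."""
    decomposed = unicodedata.normalize("NFD", text)
    return any(
        a == b and unicodedata.combining(a) != 0
        for a, b in zip(decomposed, decomposed[1:])
    )


# Two marks of the same combining class keep their relative order, so the
# two sequences below normalize to different strings.
first = unicodedata.normalize("NFKC", "o\u0308\u0311")
second = unicodedata.normalize("NFKC", "o\u0311\u0308")
print(first == second)
print(has_repeated_mark("o\u0304\u0304"))
```

Because the two orderings stay distinct after normalization, a comparison of normalized strings treats them as different even where a font draws them alike.
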
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  – “These characters are to be avoided wherever possible, because of security concerns”
  – forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)

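Detecting the two override characters is straightforward. The sketch below is a minimal Python example; the function name and the sample string are invented for illustration, and a fuller check would probably cover the other bidi control characters as well.

```python
BIDI_OVERRIDES = {
    "\u202d": "LEFT-TO-RIGHT OVERRIDE",
    "\u202e": "RIGHT-TO-LEFT OVERRIDE",
}


def find_bidi_overrides(text: str) -> list:
    """Return (index, name) for each explicit directional override found."""
    return [
        (index, BIDI_OVERRIDES[ch])
        for index, ch in enumerate(text)
        if ch in BIDI_OVERRIDES
    ]


# Hypothetical display string with an override hidden in the middle.
print(find_bidi_overrides("abc\u202etxt.exe"))
```

Rejecting input that contains these characters, rather than stripping them, avoids reintroducing the character-deletion problem.
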
Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, Enable code execution
When σ (U+03C3 GREEK SMALL LETTER SIGMA) becomes s
When ′ (U+2032 PRIME) becomes '

Windows best-fit P/Invoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 maps to U+003C (Left-Pointing Angle Bracket)
• U+3008 maps to U+003C (Left Angle Bracket)
How can we best-fit the s character?
• U+015B maps to U+0073 (Latin Small Letter S With Acute)
• U+015D maps to U+0073 (Latin Small Letter S With Circumflex)
To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

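Python's codecs do not perform best-fit mapping, so the sketch below imitates it with a tiny hand-written table. The table is hypothetical and holds only mappings named on the slides; `best_fit_encode` stands in for a lossy platform conversion, and `strict_encode` shows the fail-instead-of-guess alternative.

```python
# Hypothetical best-fit table with only the mappings named on the slides.
BEST_FIT = {
    "\u2329": "<",
    "\u3008": "<",
    "\u015b": "s",
    "\u015d": "s",
    "\u03c3": "s",
}

# Passes a check for '<script' because neither '<' nor 's' is present yet.
PAYLOAD = "\u3008\u015bcript>"


def best_fit_encode(text: str) -> bytes:
    """Lossy conversion that imitates best-fit behaviour."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text).encode("cp1252")


def strict_encode(text: str) -> bytes:
    """Refuse characters that the target charset cannot represent."""
    return text.encode("cp1252", errors="strict")


print("<script" in PAYLOAD)      # what the filter sees
print(best_fit_encode(PAYLOAD))  # what the downstream component receives
```

`strict_encode(PAYLOAD)` raises `UnicodeEncodeError`, which is the behaviour the guidance asks for when a character has no exact counterpart.
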
Case Study: Social Networking (Best-fit mappings)
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: best-fit mappings
-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map

Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution
NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose
İ becomes I + U+0307 (U+0130 becomes U+0049 U+0307)
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes < (U+FE64 becomes U+003C)
toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC: first defense against visual spoofing

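The U+FE64 example runs as-is with Python's `unicodedata` module. The validation rule and the two wrapper functions are hypothetical; they exist only to contrast the two orderings.

```python
import unicodedata

PAYLOAD = "\ufe64script>"  # begins with U+FE64 SMALL LESS-THAN SIGN


def is_safe(text: str) -> bool:
    """Hypothetical rule: reject any input containing '<'."""
    return "<" not in text


def validate_then_normalize(text: str) -> str:
    """Unsafe order: the check never sees the normalized form."""
    if not is_safe(text):
        raise ValueError("blocked")
    return unicodedata.normalize("NFKC", text)


def normalize_then_validate(text: str) -> str:
    """Order recommended on the slide: normalize first, then validate."""
    normalized = unicodedata.normalize("NFKC", text)
    if not is_safe(normalized):
        raise ValueError("blocked")
    return normalized


print(validate_then_normalize(PAYLOAD))  # the tag appears after the check
```

Calling `normalize_then_validate(PAYLOAD)` raises `ValueError`, since the check now runs on the string that later stages will actually see.
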
Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)

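The C0 A7 case can be worked through in a few lines of Python. `lenient_two_byte_decode` is a hypothetical legacy decoder written for this sketch: it applies the two-byte UTF-8 bit layout without the shortest-form check. The strict path uses Python's built-in codec, which rejects the sequence.

```python
OVERLONG_APOSTROPHE = b"\xc0\xa7"


def lenient_two_byte_decode(pair: bytes) -> str:
    """Hypothetical legacy decoder that skips the shortest-form check."""
    lead, trail = pair
    # 110xxxxx 10yyyyyy -> xxxxxyyyyyy, with no test that the value needs 2 bytes
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


def strict_decode(data: bytes) -> str:
    """Validate the encoding and raise on any ill-formed sequence."""
    return data.decode("utf-8", errors="strict")


print(repr(lenient_two_byte_decode(OVERLONG_APOSTROPHE)))
```

The lenient decoder yields an apostrophe that a byte-level filter looking for 0x27 never saw, while `strict_decode` raises `UnicodeDecodeError` on the same input.
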
Attack Vectors: Directory traversal
How many ways can you say

Attack Vectors
• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8 U+002F: httpsiterootC0AEsystem (C0 AE is the overlong form of U+002E; the overlong form of U+002F is C0 AF)
  – Full-width Solidus U+FF0F, normalized under NFKC or NFKD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader U+2025, normalized under NFKC or NFKD: httpsiterootE280A5 E280A5 system

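Two of the test cases above depend on compatibility normalization, which can be checked in Python. The function `has_traversal` and the sample paths are assumptions for this sketch; it normalizes with NFKC before looking for ".." path segments.

```python
import unicodedata


def has_traversal(path: str) -> bool:
    """Check for '..' path segments after NFKC normalization."""
    normalized = unicodedata.normalize("NFKC", path)
    return ".." in normalized.split("/")


print(has_traversal("root/\u2025/system"))        # TWO DOT LEADER
print(has_traversal("root\uff0f..\uff0fsystem"))  # FULLWIDTH SOLIDUS
print(has_traversal("root\u2215..\u2215system"))  # DIVISION SLASH
```

The division slash is not caught, because U+2215 has no compatibility decomposition; it only turns into "/" through a best-fit conversion, so that case needs the best-fit guidance rather than normalization.
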
wwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Unassigned code points
ndash U+2073
bull Illegal code points
ndash Half a surrogate pair
bull Code points with special meaning
ndash U+FEFF is the BOM
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesHandling the Unexpected
March 2009 copy 2009 Chris Weber
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
lt41 C2 3E 41gt becomes
lt41 41gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
ltimg src=[0xC2]gt onerror=alert(1)ltbr gt
becomes
ltimg src=gt onerror=alert(1)ltbr gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
Correcting insecurely rather than failing
ndash Substituting a lsquorsquo or a lsquorsquo would be bad
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-substitution
March 2009 copy 2009 Chris Weber
ldquodeletion of noncharactersrdquo (UTR-36)
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
ltscr[U+FEFF]iptgt becomes ltscriptgt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
bull Fail or error
bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe
wwwcasabasecuritycom
Root CausesSolutions for Handling the Unexpected
March 2009 copy 2009 Chris Weber
bull Bypass filters WAFrsquos NIDS and validation
bull Exploit delivery techniques
ndash Eg Cross-site scripting (buffer overflow of the Web)
wwwcasabasecuritycom
Attack VectorsFilter evasion
March 2009 copy 2009 Chris Weber
Safari and Firefox BOM consumptionndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting
ndash Root Cause Character deletion
lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt
Can be nastier
lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt
wwwcasabasecuritycom
Case Study Apple and Mozilla
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Safari BOM injection for XSS
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
A Closer Look The BOM
BOMU+FEFF
March 2009 copy 2009 Chris Weber
bull Attackers manipulate casing operations to inject otherwise prohibited characters
bull Casing can multiply the buffer sizes needed
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
toLower(ldquoİrdquo) == ldquoirdquo
toLower(ldquoscrİptrdquo) == ldquoscriptrdquo
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
len(x) = len(toLower(x))
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
bull Perform casing operations before validation
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Casing
March 2009 copy 2009 Chris Weber
bull Incorrect assumptions about string sizes (chars vs bytes)
bull Improper width calculations
bull Impact Enable code execution
wwwcasabasecuritycom
Root CausesBuffer Overflows
March 2009 copy 2009 Chris Weber
Casing - maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
Lower 8 15 Ⱥ U+023A
16 32 1 A U+0041
Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
Normalization- maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
NFC8 3X 119136 U+1D160
16 32 3X ש U+FB2C
NFD8 3X ΐ U+0390
16 32 4X ᾂ U+1F82
NFKCNFKD8 11X
ملسو هيلع هللا ىلص U+FDFA16 32 18X
Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
bull Know the difference between bytes and chars
bull Secure coding
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Buffer Overflows
March 2009 copy 2009 Chris Weber
bull White space and line breaks
ndash Eg when U+180E acts like U+0020
bull Quotation marks
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesControlling Syntax
March 2009 copy 2009 Chris Weber
bull Manipulate HTML parsers and javascriptinterpreters
bull Control protocols
wwwcasabasecuritycom
Attacks and ExploitsControlling syntax
March 2009 copy 2009 Chris Weber
bull Unicode formatter characters exploited for XSS
ndash Damage Filter evasion controlling syntax
ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting
ndash Root Cause Interpreting ldquowhite spacerdquo
ndash A problem with HTML 40 spec
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
lta href=[U+180E]onclick=alert()gt
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Opera White Space Characters
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Case Study Opera
MVSU+180E
March 2009 copy 2009 Chris Weber
bull Question specifications
bull Be carefulhellip
wwwcasabasecuritycom
Root CausesGuidance for Controlling Syntax
March 2009 copy 2009 Chris Weber
1) Character stabilityndash IDNANameprep based on Unicode 32
2) Designsndash Specs are carefully designed but not always perfect
bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of
U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo
ndash HTML 401 bull Defines four whitespace characters and explicitly leaves
handling other characters up to implementer
wwwcasabasecuritycom
Root CausesSpecifications
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, enable code execution, data loss

Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, enable code execution
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain

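A small Python sketch of why a charset mismatch matters, and of the "force UTF-8, error if uncertain" guidance. The byte string and the function names `decode_strict` and `decode_utf8_or_fail` are illustrative assumptions; the two charsets are the ones named on the slide.

```python
RAW = b"\x83\x5c"  # two bytes of attacker-controlled input


def decode_strict(raw, charset):
    """Decode with the named charset, raising on ill-formed input."""
    return raw.decode(charset, errors="strict")


# The same bytes, read under the two charsets the page declares.
as_latin1 = decode_strict(RAW, "iso-8859-1")    # header's view
as_shift_jis = decode_strict(RAW, "shift_jis")  # meta tag's view


def decode_utf8_or_fail(raw):
    """Force UTF-8 and error out if the bytes are not well-formed UTF-8."""
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError("input is not valid UTF-8") from exc


print(repr(as_latin1))     # contains a backslash (0x5C) as its own character
print(repr(as_shift_jis))  # 0x83 0x5C is a single katakana letter here
```

Under ISO-8859-1 the second byte is a backslash; under Shift_JIS it is swallowed as the trail byte of one character, so code that reasons about the backslash under one charset is wrong under the other.
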
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools
• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks

Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher - Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com

Tools: Visual Spoofing Detection API
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions

Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net, Casaba Security
www.casabasecurity.com

Attack Vectors: Visual Spoofing with Combining Marks
• Multiple combining marks
  ō̄ looks like U+006F U+0304
  ō̄ is U+006F U+0304 U+0304
• Order of combining marks
  – ȏ and ö under NFKC
  <U+006F U+0308 U+0311> → <U+00F6 U+0311>
  <U+006F U+0311 U+0308> → <U+020F U+0308>

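The two combining-mark cases can be checked with Python's standard `unicodedata` module. This is only a sketch; the helper names `nfkc` and `codepoints` are made up for the example.

```python
import unicodedata


def nfkc(text):
    """Apply Unicode Normalization Form KC."""
    return unicodedata.normalize("NFKC", text)


def codepoints(text):
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# Same base letter and same two marks, in a different order.
diaeresis_first = "o\u0308\u0311"
breve_first = "o\u0311\u0308"

order_a = codepoints(nfkc(diaeresis_first))
order_b = codepoints(nfkc(breve_first))

# One macron versus two stacked macrons: hard to tell apart on screen,
# but still different strings after normalization.
single_macron = nfkc("o\u0304")
double_macron = nfkc("o\u0304\u0304")

print(order_a)
print(order_b)
print(single_macron == double_macron)
```

Normalization does not collapse either pair, so a comparison that relies on NFKC alone still treats look-alike strings as distinct.
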
Attack Vectors: Visual Spoofing with Bidirectional Controls (Bidi)
Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
• http://unicode.org/reports/tr9/
  – “These characters are to be avoided wherever possible because of security concerns”
  – Forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)

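A minimal Python sketch of flagging the two override characters listed above. The function name `find_bidi_overrides` and the sample file name are assumptions for illustration; a real check would likely cover more bidi controls than these two.

```python
import unicodedata

# The two explicit directional overrides named on the slide.
BIDI_OVERRIDES = {"\u202D", "\u202E"}


def find_bidi_overrides(text):
    """Return (index, character name) for every override found in text."""
    return [
        (index, unicodedata.name(ch))
        for index, ch in enumerate(text)
        if ch in BIDI_OVERRIDES
    ]


# Hypothetical file name: the override makes the tail display reversed,
# so it can look like it ends in ".png" when it really ends in ".exe".
SUSPICIOUS_NAME = "photo\u202Egnp.exe"

hits = find_bidi_overrides(SUSPICIOUS_NAME)
print(hits)
```
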
Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, enable code execution
When σ becomes s
  U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
  U+2032 PRIME

Windows best-fit P/Invoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 maps to U+003C: Left-Pointing Angle Bracket
• U+3008 maps to U+003C: Left Angle Bracket
How can we best-fit the s character?
• U+015B maps to U+0073: Latin Small Letter S With Acute
• U+015D maps to U+0073: Latin Small Letter S With Circumflex
To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]

Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

Case Study: Social Networking: Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: Best-fit mappings
-moz-binding()
was not allowed but…
-[U+ff4d]oz-binding()
would best-fit map

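The bypass can be sketched in Python. Everything here is illustrative: the `BEST_FIT` table contains only the mappings mentioned in these slides (real best-fit tables are larger and platform-specific), and `is_blocked`, `best_fit`, and `survives_strict_encoding` are hypothetical names, not the site's actual filter.

```python
# Illustrative best-fit table, limited to mappings named in the slides.
BEST_FIT = {
    "\uFF4D": "m",   # FULLWIDTH LATIN SMALL LETTER M
    "\u2329": "<",   # LEFT-POINTING ANGLE BRACKET
    "\u3008": "<",   # LEFT ANGLE BRACKET
    "\u015B": "s",   # LATIN SMALL LETTER S WITH ACUTE
    "\u015D": "s",   # LATIN SMALL LETTER S WITH CIRCUMFLEX
    "\u03C3": "s",   # GREEK SMALL LETTER SIGMA
    "\u2032": "'",   # PRIME
}


def best_fit(text):
    """Simulate a lossy conversion that silently substitutes look-alikes."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)


def is_blocked(text):
    """Hypothetical filter: block the -moz-binding CSS property."""
    return "-moz-binding" in text.lower()


def survives_strict_encoding(text, charset="ascii"):
    """True if text encodes without any substitution; False otherwise."""
    try:
        text.encode(charset, errors="strict")
        return True
    except UnicodeEncodeError:
        return False


PAYLOAD = "-\uFF4Doz-binding()"

blocked_before = is_blocked(PAYLOAD)           # filter runs on the raw input
blocked_after = is_blocked(best_fit(PAYLOAD))  # what a later layer receives
print(blocked_before, blocked_after)
```

Refusing unmappable characters, as `survives_strict_encoding` does, is one way to avoid the silent substitution in this sketch; it is in the spirit of the guidance above rather than a reproduction of any specific Windows or .NET API.
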
Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: Filter evasion, enable code execution
NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose
İ becomes I + ◌̇
  U+0130 → U+0049 U+0307
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ becomes <
  U+FE64 → U+003C
toNFKC(“﹤script>”) = “<script>”

Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC: first defense against visual spoofing

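A short Python sketch of why the order of normalization and validation matters, using the standard `unicodedata` module. The validator `contains_markup` and the two pipeline functions are hypothetical and deliberately simplistic.

```python
import unicodedata


def contains_markup(text):
    """Hypothetical validator: reject any literal '<'."""
    return "<" in text


def validate_then_normalize(text):
    """Unsafe order: the check runs before NFKC changes the string."""
    if contains_markup(text):
        raise ValueError("markup rejected")
    return unicodedata.normalize("NFKC", text)


def normalize_then_validate(text):
    """Safer order: the check sees the string that will actually be used."""
    normalized = unicodedata.normalize("NFKC", text)
    if contains_markup(normalized):
        raise ValueError("markup rejected")
    return normalized


PAYLOAD = "\uFE64script>"  # starts with SMALL LESS-THAN SIGN, not '<'

unsafe_result = validate_then_normalize(PAYLOAD)
decomposed_i = unicodedata.normalize("NFD", "\u0130")
print(unsafe_result)
```

With the unsafe order the payload passes the check and only afterwards turns into a real tag; with the safer order the same input is rejected.
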
Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)

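To make the C0 A7 example concrete, here is a Python sketch that contrasts a deliberately naive two-byte decoder with Python's strict UTF-8 decoder. `naive_decode_two_byte` is a hypothetical stand-in for a lenient legacy decoder, not code from any real framework.

```python
def naive_decode_two_byte(lead, trail):
    """Insecure: combines the payload bits without checking shortest form."""
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


def is_well_formed_utf8(raw):
    """Validate UTF-8 strictly; overlong forms are reported as ill-formed."""
    try:
        raw.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


OVERLONG_APOSTROPHE = b"\xC0\xA7"

lenient_view = naive_decode_two_byte(0xC0, 0xA7)
strict_ok = is_well_formed_utf8(OVERLONG_APOSTROPHE)
print(repr(lenient_view), strict_ok)
```

The lenient decoder produces an apostrophe from bytes that contain no 0x27, while strict validation rejects the sequence outright.
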
Attack Vectors: Directory traversal
How many ways can you say
• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8, U+002F: httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two Dot Leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system

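The normalization-based test cases can be reproduced with a few lines of Python. This is a sketch under assumptions: the path string and the helper names are invented for the example, and the traversal check is far simpler than a real one.

```python
import unicodedata


def utf8_hex(text):
    """UTF-8 bytes of text as an uppercase hex string."""
    return text.encode("utf-8").hex().upper()


def naive_has_traversal(path):
    """Look for a literal '..' path segment between ASCII slashes."""
    return ".." in path.split("/")


def normalized_has_traversal(path):
    """Run the same check after NFKC, i.e. on what a later layer may see."""
    return naive_has_traversal(unicodedata.normalize("NFKC", path))


# Hypothetical path using FULLWIDTH SOLIDUS and TWO DOT LEADER.
CANDIDATE = "root\uFF0F\u2025\uFF0Fsystem"

seen_by_filter = naive_has_traversal(CANDIDATE)
seen_after_nfkc = normalized_has_traversal(CANDIDATE)
print(utf8_hex("\uFF0F"), utf8_hex("\u2215"), utf8_hex("\u2025"))
print(seen_by_filter, seen_after_nfkc)
```

Note that U+2215 DIVISION SLASH is unchanged by NFKC, which is consistent with the slide listing it under best-fit rather than normalization.
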
Root Causes: Handling the Unexpected
• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, enable code execution

Root Causes: Handling the Unexpected: Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>
<img src=[0xC2]> onerror=alert(1)<br >
becomes
<img src=> onerror=alert(1)<br >

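The <41 C2 3E 41> example can be sketched in Python. `overconsuming_decode` is a hypothetical, intentionally flawed decoder that only handles one- and two-byte sequences; it is contrasted with Python's built-in decoder using `errors="replace"`.

```python
def overconsuming_decode(raw):
    """Insecure sketch: a two-byte lead always swallows the following byte."""
    out = []
    i = 0
    while i < len(raw):
        byte = raw[i]
        if byte < 0x80:
            out.append(chr(byte))
            i += 1
        elif 0xC2 <= byte <= 0xDF:
            has_trail = i + 1 < len(raw) and 0x80 <= raw[i + 1] <= 0xBF
            if has_trail:
                out.append(chr(((byte & 0x1F) << 6) | (raw[i + 1] & 0x3F)))
            # Bug being illustrated: skip two bytes even when the second
            # byte is not a continuation byte, so it is silently lost.
            i += 2
        else:
            i += 1  # other bytes are dropped in this simplified sketch
    return "".join(out)


ILL_FORMED = bytes.fromhex("41C23E41")

overconsumed = overconsuming_decode(ILL_FORMED)
replaced = ILL_FORMED.decode("utf-8", errors="replace")
print(repr(overconsumed), repr(replaced))
```

The flawed decoder loses the `>` that followed the stray lead byte, while the built-in decoder keeps it and marks only the bad byte with U+FFFD.
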
Root Causes: Handling the Unexpected: Character-substitution
Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected: Character-deletion
“deletion of noncharacters” (UTR-36)
<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe

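A minimal Python sketch comparing deletion with U+FFFD replacement for the `<scr[U+FEFF]ipt>` case. The filter and helper names are hypothetical and the filter is intentionally crude.

```python
BOM = "\uFEFF"
REPLACEMENT = "\uFFFD"


def filter_allows(text):
    """Hypothetical filter: reject anything containing '<script'."""
    return "<script" not in text.lower()


def delete_bom(text):
    """Insecure handling: silently drop U+FEFF wherever it appears."""
    return text.replace(BOM, "")


def replace_bom(text):
    """Safer handling: substitute U+FFFD so the gap stays visible."""
    return text.replace(BOM, REPLACEMENT)


PAYLOAD = "<scr\uFEFFipt>"

passes_filter = filter_allows(PAYLOAD)
after_deletion = delete_bom(PAYLOAD)
after_replacement = replace_bom(PAYLOAD)
print(passes_filter, after_deletion, after_replacement)
```

Deletion reassembles the tag after the filter has already approved the input; replacement leaves a string that still does not contain `<script`.
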
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion
<a href=“java[U+FEFF]script:alert(„XSS‟)>
Can be nastier
<a h[U+FEFF]ref=“java[U+FEFF]script:al[U+FEFF]ert(„XSS‟)>
DEMO: Safari BOM injection for XSS
A Closer Look: The BOM
BOM: U+FEFF

Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, enable code execution
toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”
len(x) != len(toLower(x))

Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET

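The casing issues can be sketched in Python. One assumption needs flagging: Python's own `str.lower()` applies the full case mapping, so the slide's `toLower("İ") == "i"` behavior is modeled here with a hypothetical `simple_lower` helper that uses the one-to-one simple lowercase mapping for U+0130. The filter is also hypothetical.

```python
# Assumption for illustration: a "simple" one-to-one lowercase mapping,
# as in the simple lowercase field of UnicodeData.txt, where U+0130 maps to "i".
SIMPLE_LOWER = {"\u0130": "i"}


def simple_lower(text):
    """Lowercase with the simple mapping for U+0130, str.lower() otherwise."""
    return "".join(SIMPLE_LOWER.get(ch, ch.lower()) for ch in text)


def filter_allows(text):
    """Hypothetical filter: reject anything containing 'script'."""
    return "script" not in text


PAYLOAD = "scr\u0130pt"

# Unsafe order: validate first, change case afterwards.
passed_filter = filter_allows(PAYLOAD)
after_casing = simple_lower(PAYLOAD)

# Safer order: change case first, then validate the result.
passed_after_casing = filter_allows(simple_lower(PAYLOAD))

# Full case mappings can change the length of a string.
full_lower = "\u0130".lower()  # Python applies the full mapping here
full_upper = "\u00DF".upper()  # LATIN SMALL LETTER SHARP S

print(passed_filter, after_casing, passed_after_casing)
print(len(full_lower), full_upper)
```
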
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows
Casing - maximum expansion factors
Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390
Source: Unicode Technical Report 36

Root Causes: Buffer Overflows
Normalization - maximum expansion factors
Operation   UTF      Factor   Sample
NFC         8        3X       U+1D160
NFC         16, 32   3X       U+FB2C
NFD         8        3X       ΐ U+0390
NFD         16, 32   4X       ᾂ U+1F82
NFKC/NFKD   8        11X      ﷺ U+FDFA
NFKC/NFKD   16, 32   18X      ﷺ U+FDFA
Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

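The expansion factors in the tables can be checked with Python's standard library. This is a sketch: `expansion_factor` and `safe_buffer_bytes` are hypothetical helper names, and `utf-16-le` is used so that no byte order mark is counted in the sizes.

```python
import unicodedata


def expansion_factor(original, transformed, encoding):
    """Ratio of encoded size (in bytes) after vs. before a transformation."""
    return len(transformed.encode(encoding)) / len(original.encode(encoding))


def nfkc(text):
    """Apply Unicode Normalization Form KC."""
    return unicodedata.normalize("NFKC", text)


def safe_buffer_bytes(text, encoding, max_factor):
    """Worst-case byte budget for a transformed copy of text."""
    return len(text.encode(encoding)) * max_factor


SAMPLE = "\uFDFA"  # the NFKC/NFKD sample from the table

chars_after = len(nfkc(SAMPLE))
utf8_factor = expansion_factor(SAMPLE, nfkc(SAMPLE), "utf-8")
utf16_factor = expansion_factor(SAMPLE, nfkc(SAMPLE), "utf-16-le")
lower_factor = expansion_factor("\u023A", "\u023A".lower(), "utf-8")

print(chars_after, utf8_factor, utf16_factor, lower_factor)
```

Sizing the destination buffer from the worst-case factor for the operation and encoding in use, rather than from the input length, is the point of the bytes-versus-chars guidance.
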
March 2009 copy 2009 Chris Weber
bull White space and line breaks
ndash Eg when U+180E acts like U+0020
bull Quotation marks
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesControlling Syntax
March 2009 copy 2009 Chris Weber
bull Manipulate HTML parsers and javascriptinterpreters
bull Control protocols
wwwcasabasecuritycom
Attacks and ExploitsControlling syntax
March 2009 copy 2009 Chris Weber
bull Unicode formatter characters exploited for XSS
ndash Damage Filter evasion controlling syntax
ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting
ndash Root Cause Interpreting ldquowhite spacerdquo
ndash A problem with HTML 40 spec
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
lta href=[U+180E]onclick=alert()gt
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Opera White Space Characters
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Case Study Opera
MVSU+180E
March 2009 copy 2009 Chris Weber
bull Question specifications
bull Be carefulhellip
wwwcasabasecuritycom
Root CausesGuidance for Controlling Syntax
March 2009 copy 2009 Chris Weber
1) Character stabilityndash IDNANameprep based on Unicode 32
2) Designsndash Specs are carefully designed but not always perfect
bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of
U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo
ndash HTML 401 bull Defines four whitespace characters and explicitly leaves
handling other characters up to implementer
wwwcasabasecuritycom
Root CausesSpecifications
March 2009 copy 2009 Chris Weber
bull Converting between charsets is dangerous
bull Mapping tables and algorithms vary across platforms
bull Impact Filter evasion Enable code execution Data-loss
wwwcasabasecuritycom
Root CausesCharset Transformations
March 2009 copy 2009 Chris Weber
bull Avoid if possible
bull Use Unicode as the broker
bull Beware the PUA mappings
bull Transform case and normalize prior to validation and redisplay
wwwcasabasecuritycom
Root CausesGuidance for Charset Transformations
March 2009 copy 2009 Chris Weber
bull Some charset identifiers are ill-defined
bull Vendor implementations vary
bull User-agents may sniff if confused
bull Attackers manipulate behavior
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCharset Mismatches
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Root CausesCharset Mismatches
Content-Type charset=ISO-8859-1
ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt
Attacker-controlled input
March 2009 copy 2009 Chris Weber
bull Force UTF-8
bull Error if uncertain
wwwcasabasecuritycom
Root CausesGuidance for Charset Mismatches
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Watcher
ndash Web-app security testing and auditing
bull Visual Spoofing Detection API
ndash Providing guarantees against Visual Spoofing and Homograph attacks
wwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
bull Unicode transformation hot-spotsbull User-controlled HTMLbull Cross-domain issuesbull Insecure cookiesbull Insecure HTTPHTTPS transitionsbull SSL protocol and certificate issuesbull XSS hot-spotsbull Flash issuesbull Silverlight issuesbull Information disclosure
wwwcasabasecuritycom
ToolsWatcher ndash Some of the Passive Checks Included
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
httpwebsecuritytoolcodeplexcom
wwwcasabasecuritycom
ToolsWatcher - Web-app Security Testing and Auditing
March 2009 copy 2009 Chris Weber
bull Problemndash Unicode enables visual-spoofing-maximus
bull Solutionndash Confusable detection
ndash Invisibles detection
ndash Syntax spoof detection
ndash more
wwwcasabasecuritycom
ToolsVisual Spoofing Detection API
March 2009 copy 2009 Chris Weber
bull Cross-platform component library written in C
bull Can be applied in user-agents or any softwarendash Browsers
ndash Email clients
bull Planned for release Fall 2009
bull Email me with questions
wwwcasabasecuritycom
ToolsVisual Spoofing Detection API
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
ToolsVisual Spoofing Protection Demo
Thank you
Contact me with questions new test cases or ideas to share
Visit my website for test cases Unicode and security tools and the Anti-Visual-Spoofing API
Chris WeberwwwlookoutnetCasaba Security
wwwcasabasecuritycom
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidirectional Controls (Bidi)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
bull httpunicodeorgreportstr9
ndash ldquoThese characters are to be avoided wherever possible because of security concernsrdquo
ndash forbidden in IDNA
U+202D (LEFT-TO-RIGHT OVERRIDE)
U+202E (RIGHT-TO-LEFT OVERRIDE)
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
March 2009 copy 2009 Chris Weber
Commonly occur in charset transformations and even innocuous APIrsquos
Impact Filter evasion Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When prime becomes
U+2032 PRIME
wwwcasabasecuritycom
Root CausesBest-fit mappings
March 2009 copy 2009 Chris Weber
Net runtime will marshall a string as LPStr to a pinvoke function
How can we best-fit the lt character
bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket
How can we best-fit the s character
bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex
To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]
wwwcasabasecuritycom
Windows best-fit pInvokeBest-fit mappings
March 2009 copy 2009 Chris Weber
bull Scrutinize charactercharset manipulation APIrsquos
bull Use EncoderFallback with SystemTextEncoding
bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
bull Use Unicode end-to-end
wwwcasabasecuritycom
Root CausesGuidance for Best-Fit mappings
Case Study, Social Networking: Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root cause: best-fit mappings
-moz-binding() was not allowed, but...
-[U+FF4D]oz-binding() would best-fit map to it
Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: filter evasion, enabling code execution
NFD: Decompose (canonical)
NFC: Decompose (canonical), then recompose
NFKD: Decompose (compatibility)
NFKC: Decompose (compatibility), then recompose
İ (U+0130) becomes I (U+0049) + combining dot above (U+0307)
But are there dangerous characters?
You bet... with NFKC and NFKD you could control HTML or other parsing
﹤ (U+FE64) becomes < (U+003C)
toNFKC("﹤script>") = "<script>"
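The two transformations on these slides can be reproduced with Python's standard `unicodedata` module. This is a small sketch; the helper `code_points` is a hypothetical convenience for printing.

```python
import unicodedata


def code_points(text):
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(char):04X}" for char in text)


dotted_i = "\u0130"          # LATIN CAPITAL LETTER I WITH DOT ABOVE
small_lt = "\uFE64script>"   # starts with SMALL LESS-THAN SIGN

# Canonical decomposition splits the letter from its combining dot.
nfd = unicodedata.normalize("NFD", dotted_i)
print(code_points(nfd))

# Compatibility normalization folds the small form into ASCII '<'.
nfkc = unicodedata.normalize("NFKC", small_lt)
print(nfkc)

# Canonical normalization alone leaves the small form untouched.
nfc = unicodedata.normalize("NFC", small_lt)
print(code_points(nfc[0]))
```

Only the compatibility forms (NFKC, NFKD) produce the ASCII angle bracket here, which is why a filter that runs before such a normalization can be bypassed.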
Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC is a first defense against visual spoofing
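A sketch of the ordering this guidance describes: normalize first, then validate the normalized form, and only ever use the normalized form afterwards. The `FORBIDDEN` set and the function name `accept` are hypothetical stand-ins for a real validation policy.

```python
import unicodedata

# Hypothetical policy: reject anything that ends up containing angle brackets.
FORBIDDEN = ("<", ">")


def accept(text):
    """Normalize to NFKC, validate the result, and return the normalized text."""
    normalized = unicodedata.normalize("NFKC", text)
    if any(char in normalized for char in FORBIDDEN):
        raise ValueError("forbidden character after normalization")
    return normalized


# Full-width letters are folded to ASCII and pass.
print(accept("\uFF48\uFF45\uFF4C\uFF4C\uFF4F"))

# Small-form angle brackets are caught because validation sees the folded text.
try:
    accept("\uFE64script\uFE65")
except ValueError as error:
    print("rejected:", error)
```

Returning the normalized string matters: if the caller kept the original and normalized it again later, the check would once more run on different text than what is used.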
Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: filter evasion, enabling code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '
Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
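To make the C0 A7 case concrete, here is a sketch contrasting a deliberately flawed two-byte decoder (hypothetical, written only to show the arithmetic) with Python's built-in strict UTF-8 decoder, which throws on the overlong form.

```python
def naive_decode_2byte(data):
    """Deliberately flawed: decodes a 2-byte sequence with no shortest-form check."""
    lead, trail = data[0], data[1]
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


def strict_decode(data):
    """Validate UTF-8 and raise UnicodeDecodeError on ill-formed input."""
    return data.decode("utf-8", errors="strict")


overlong_apostrophe = b"\xC0\xA7"

# The flawed decoder turns the overlong sequence into U+0027.
print(repr(naive_decode_2byte(overlong_apostrophe)))

# The strict decoder refuses it.
try:
    strict_decode(overlong_apostrophe)
except UnicodeDecodeError as error:
    print("rejected:", error.reason)
```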
Attack Vectors: Directory traversal
How many ways can you say
Attack Vectors
• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8 (U+002F): httpsiterootC0AEsystem
  – Full-width Solidus (U+FF0F), normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash (U+2215), best-fit: httpsiteroot E28895system
  – Two Dot Leader (U+2025), normalized KC or KD: httpsiterootE280A5 E280A5 system
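The byte sequences in these test cases can be checked with a short Python sketch, which also shows which of the look-alike characters fold to ASCII under NFKC. The helper names `describe` and `has_dot_dot_segment` and the sample path are hypothetical.

```python
import unicodedata


def describe(char):
    """Return the UTF-8 bytes (hex) and the NFKC form of a single character."""
    return {
        "utf8": char.encode("utf-8").hex().upper(),
        "nfkc": unicodedata.normalize("NFKC", char),
    }


def has_dot_dot_segment(path):
    """Check for a '..' path segment after compatibility normalization."""
    normalized = unicodedata.normalize("NFKC", path)
    return ".." in normalized.split("/")


for name, char in (
    ("FULLWIDTH SOLIDUS", "\uFF0F"),
    ("DIVISION SLASH", "\u2215"),
    ("TWO DOT LEADER", "\u2025"),
):
    print(name, describe(char))

# A path built only from look-alikes hides the traversal from a raw check.
disguised = "root\uFF0F\u2025\uFF0Fsystem"
print(".." in disguised.split("/"), has_dot_dot_segment(disguised))
```

Note that the division slash is unchanged by NFKC; it only becomes a solidus through a best-fit charset conversion, so normalization alone does not cover that case.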
Root Causes: Handling the Unexpected
• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: filter evasion, enabling code execution
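A sketch of how the three kinds of unexpected input listed above might be recognized in Python. The function `classify` and its labels are hypothetical, and the "unassigned" result depends on the Unicode version bundled with the interpreter.

```python
import unicodedata


def classify(char):
    """Label a character as unassigned, surrogate, byte order mark, or ok."""
    category = unicodedata.category(char)
    if category == "Cn":
        # Not assigned in the Unicode version this Python ships with.
        return "unassigned"
    if category == "Cs":
        return "surrogate"
    if char == "\uFEFF":
        return "byte order mark"
    return "ok"


for sample in ("\u2073", "\ud800", "\ufeff", "A"):
    print(f"U+{ord(sample):04X}", classify(sample))

# Half a surrogate pair cannot be encoded as well-formed UTF-8.
try:
    "\ud800".encode("utf-8")
except UnicodeEncodeError as error:
    print("rejected:", error.reason)
```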
Root Causes: Handling the Unexpected (Over-consumption)
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>
<img src=[0xC2]> onerror=alert(1)<br >
becomes
<img src=> onerror=alert(1)<br >
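The byte example above (41 C2 3E 41 becoming 41 41) can be reproduced with a deliberately flawed decoder. `overconsuming_decode` is a hypothetical illustration of the bug, not any real library's behavior; the comparison uses Python's built-in decoder with `errors="replace"`.

```python
def overconsuming_decode(data):
    """Deliberately flawed: any non-ASCII lead byte swallows the byte after it."""
    out = []
    index = 0
    while index < len(data):
        byte = data[index]
        if byte < 0x80:
            out.append(chr(byte))
            index += 1
        else:
            # Flaw: skip the lead byte AND the next byte without checking
            # that the next byte is a valid trail byte (0x80..0xBF).
            index += 2
    return "".join(out)


ill_formed = bytes([0x41, 0xC2, 0x3E, 0x41])

# The '>' (0x3E) is eaten along with the stray lead byte.
print(repr(overconsuming_decode(ill_formed)))

# Python's decoder replaces only the bad byte and keeps the '>'.
print(repr(ill_formed.decode("utf-8", errors="replace")))
```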
Root Causes: Handling the Unexpected (Character-substitution)
Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad
Root Causes: Handling the Unexpected (Character-deletion)
"deletion of noncharacters" (UTR #36)
<scr[U+FEFF]ipt> becomes <script>
Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  – A common alternative is '?', which can be safe
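A sketch of why deletion is the dangerous option and substitution with U+FFFD is the safer one. The blocklist check and all function names are hypothetical and far simpler than a real filter.

```python
REPLACEMENT = "\uFFFD"
BOM = "\uFEFF"


def blocklist_hit(text):
    """Hypothetical filter: flags input containing the literal '<script'."""
    return "<script" in text.lower()


def delete_bom(text):
    """Insecure correction: silently drops U+FEFF."""
    return text.replace(BOM, "")


def replace_bom(text):
    """Safer correction: substitutes U+FFFD so the token stays broken."""
    return text.replace(BOM, REPLACEMENT)


payload = "<scr\uFEFFipt>"

print(blocklist_hit(payload))               # filter sees nothing wrong
print(blocklist_hit(delete_bom(payload)))   # deletion re-creates the tag
print(blocklist_hit(replace_bom(payload)))  # substitution does not
```

Failing outright on the unexpected character, the first bullet above, avoids the question of what to substitute entirely.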
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (the buffer overflow of the Web)
Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root cause: character deletion
<a href="java[U+FEFF]script:alert('XSS')">
Can be nastier:
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">
Safari BOM injection for XSS
DEMO
A Closer Look: The BOM
BOM: U+FEFF
Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enabling code execution
toLower("İ") == "i"
toLower("scrİpt") == "script"
len(x) != len(toLower(x))
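The length change can be observed directly in Python. One caveat, stated as an observation about Python rather than about the slide: Python's locale-independent `str.lower()` maps İ to i followed by U+0307, whereas other casing implementations or locales may yield a plain i as shown above. The helper `expansion` is hypothetical.

```python
def expansion(before, after, encoding):
    """Ratio of encoded sizes after/before a casing operation."""
    return len(after.encode(encoding)) / len(before.encode(encoding))


dotted_i = "\u0130"          # İ
lowered = dotted_i.lower()   # full Unicode case mapping in Python
print(len(dotted_i), len(lowered))

# Lowercasing U+023A grows the UTF-8 form from 2 bytes to 3.
a_stroke = "\u023A"
print(expansion(a_stroke, a_stroke.lower(), "utf-8"))

# Uppercasing U+0390 yields three code points.
iota = "\u0390"
print(expansion(iota, iota.upper(), "utf-8"))
print(expansion(iota, iota.upper(), "utf-16-le"))
```

Any buffer sized from the input length before a casing call can therefore be too small afterwards.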
Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enabling code execution
Root Causes: Buffer Overflows
Casing: maximum expansion factors
Operation  UTF        Factor  Sample
Lower      8          1.5X    Ⱥ U+023A
Lower      16, 32     1X      A U+0041
Upper      8, 16, 32  3X      ΐ U+0390
Source: Unicode Technical Report #36
Root Causes: Buffer Overflows
Normalization: maximum expansion factors
Operation  UTF     Factor  Sample
NFC        8       3X      𝅘𝅥𝅮 U+1D160
NFC        16, 32  3X      שּׁ U+FB2C
NFD        8       3X      ΐ U+0390
NFD        16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
NFKC/NFKD  16, 32  18X     ﷺ U+FDFA
Source: Unicode Technical Report #36
Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
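The expansion factors quoted from UTR #36 can be recomputed with a few lines of Python, which also illustrates the difference between counting code units and counting bytes. `expansion_factor` is a hypothetical helper; `utf-16-le` is used so that no byte order mark is added to the encoded length.

```python
import unicodedata


def expansion_factor(char, form, encoding):
    """Encoded size after normalization divided by encoded size before."""
    normalized = unicodedata.normalize(form, char)
    return len(normalized.encode(encoding)) / len(char.encode(encoding))


samples = (
    ("\uFDFA", "NFKC", "utf-8"),
    ("\uFDFA", "NFKC", "utf-16-le"),
    ("\u0390", "NFD", "utf-8"),
    ("\u1F82", "NFD", "utf-16-le"),
)
for char, form, encoding in samples:
    factor = expansion_factor(char, form, encoding)
    print(f"U+{ord(char):04X} {form} {encoding}: {factor}x")
```

A destination buffer allocated at the input's size, or at a fixed multiple chosen without these worst cases in mind, is where the overflow risk comes from.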
Root Causes: Controlling Syntax
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enabling code execution
Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Case Study: Opera
• Unicode formatter characters exploited for XSS
  – Damage: filter evasion, controlling syntax
  – Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root cause: interpreting "white space"
  – A problem with the HTML 4.0 spec
<a href=[U+180E]onclick=alert()>
Opera White Space Characters
DEMO
MVS: U+180E
Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful...
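One way to "be careful" is to flag any separator or format character that a downstream parser might treat as white space. The sketch below is an assumption-laden example: the category set, the `ALLOWED` policy, and the function name are all hypothetical, and U+180E is matched whether the bundled Unicode data classifies it as a space separator or as a format character.

```python
import unicodedata

# General categories that can act as invisible or spacing syntax.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf"}

# Hypothetical policy: only the ordinary ASCII space is accepted.
ALLOWED = {" "}


def suspicious_separators(text):
    """List code points that may be interpreted as white space by a parser."""
    return [
        f"U+{ord(char):04X}"
        for char in text
        if unicodedata.category(char) in SUSPECT_CATEGORIES and char not in ALLOWED
    ]


print(suspicious_separators("<a href=#\u180Eonclick=alert()>"))
print(suspicious_separators("<a href=# onclick=alert()>"))
```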
Root Causes: Specifications
1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enabling code execution, data loss
Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enabling code execution
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input
Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
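A small sketch of "force UTF-8, error if uncertain" on the receiving side. The function names and the policy (every declaration present must say UTF-8, and at least one must be present) are assumptions made for illustration.

```python
def resolve_charset(header_charset, meta_charset):
    """Return 'utf-8' only when every declared charset agrees on it."""
    declared = {
        value.strip().lower()
        for value in (header_charset, meta_charset)
        if value
    }
    if declared != {"utf-8"}:
        raise ValueError(f"refusing ambiguous or non-UTF-8 charset: {sorted(declared)}")
    return "utf-8"


def decode_body(body, header_charset, meta_charset):
    """Decode strictly with the resolved charset; never sniff or guess."""
    return body.decode(resolve_charset(header_charset, meta_charset), errors="strict")


print(decode_body("caf\u00e9".encode("utf-8"), "UTF-8", "utf-8"))

# The mismatch from the slide is rejected rather than guessed at.
try:
    decode_body(b"...", "ISO-8859-1", "shift_jis")
except ValueError as error:
    print("rejected:", error)
```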
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
Tools
• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against visual spoofing and homograph attacks
Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher - Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com
Tools: Visual Spoofing Detection API
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net, Casaba Security
www.casabasecurity.com
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides
March 2009 copy 2009 Chris Weber
Commonly occur in charset transformations and even innocuous APIrsquos
Impact Filter evasion Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When prime becomes
U+2032 PRIME
wwwcasabasecuritycom
Root CausesBest-fit mappings
March 2009 copy 2009 Chris Weber
Net runtime will marshall a string as LPStr to a pinvoke function
How can we best-fit the lt character
bull U+2329 maps to U+003c Left-Pointing Angle Bracketbull U+3008 maps to U+003c Left Angle Bracket
How can we best-fit the s character
bull U+015b maps to U+0073 Latin Small Letter S With Acutebull U+015d maps to U+0073 Latin Small Letter S With Circumflex
To deal with this specify a LPWStr type instead of LPStr[MarshalAs(UnmanagedTypeLPWStr)]
wwwcasabasecuritycom
Windows best-fit pInvokeBest-fit mappings
March 2009 copy 2009 Chris Weber
bull Scrutinize charactercharset manipulation APIrsquos
bull Use EncoderFallback with SystemTextEncoding
bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
bull Use Unicode end-to-end
wwwcasabasecuritycom
Root CausesGuidance for Best-Fit mappings
March 2009 copy 2009 Chris Weber
bull A popular social networking site in 2008
bull Implemented complex filtering logic to prevent XSS
ndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting
ndash Root Cause best-fit mappings
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
-moz-binding()
was not allowed buthellip
-[U+ff4d]oz-binding()
would best-fit map
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
Normalizing strings after validation is dangerous
Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
NFD - Decompose (canonical)
NFC - Decompose (canonical) Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility) Recompose
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
İ becomes I +
wwwcasabasecuritycom
Root CausesNormalization
U+0130 U+0049 U+0307
March 2009 copy 2009 Chris Weber
But are there dangerous characters
You bethellip with NFKC and NFKD you could control HTML or other parsing
﹤ becomes lt
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
﹤ becomes lt
toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
Normalize strings before validation
NFKC first defense against Visual spoofing
wwwcasabasecuritycom
Root CausesGuidance for Normalization
March 2009 copy 2009 Chris Weber
Non-shortest or overlong UTF-8
Impact Filter evasion Enable code execution
Application gets C0A7
OSFramework sees 27
Database gets
wwwcasabasecuritycom
Root CausesNon-shortest form UTF-8
March 2009 copy 2009 Chris Weber
bull Unicode specification forbids
ndash Generation of non-shortest form
ndash Interpretation of non-shortest form for BMP
bull Validate UTF-8 encoding (throw on error)
wwwcasabasecuritycom
Root CausesGuidance for Non-shortest form UTF-8
March 2009 copy 2009 Chris Weber
How many ways can you say
wwwcasabasecuritycom
Attack VectorsDirectory traversal
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Directory traversal test casesndash httpsiterootsystem
ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem
ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem
ndash Division Slash U+2215 best-fithttpsiteroot E28895system
ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system
wwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Unassigned code points
ndash U+2073
bull Illegal code points
ndash Half a surrogate pair
bull Code points with special meaning
ndash U+FEFF is the BOM
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesHandling the Unexpected
March 2009 copy 2009 Chris Weber
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
lt41 C2 3E 41gt becomes
lt41 41gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
ltimg src=[0xC2]gt onerror=alert(1)ltbr gt
becomes
ltimg src=gt onerror=alert(1)ltbr gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
Correcting insecurely rather than failing
ndash Substituting a lsquorsquo or a lsquorsquo would be bad
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-substitution
March 2009 copy 2009 Chris Weber
ldquodeletion of noncharactersrdquo (UTR-36)
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
ltscr[U+FEFF]iptgt becomes ltscriptgt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
bull Fail or error
bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe
wwwcasabasecuritycom
Root CausesSolutions for Handling the Unexpected
March 2009 copy 2009 Chris Weber
bull Bypass filters WAFrsquos NIDS and validation
bull Exploit delivery techniques
ndash Eg Cross-site scripting (buffer overflow of the Web)
wwwcasabasecuritycom
Attack VectorsFilter evasion
March 2009 copy 2009 Chris Weber
Safari and Firefox BOM consumptionndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting
ndash Root Cause Character deletion
lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt
Can be nastier
lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt
wwwcasabasecuritycom
Case Study Apple and Mozilla
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Safari BOM injection for XSS
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
A Closer Look The BOM
BOMU+FEFF
March 2009 copy 2009 Chris Weber
bull Attackers manipulate casing operations to inject otherwise prohibited characters
bull Casing can multiply the buffer sizes needed
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
toLower(ldquoİrdquo) == ldquoirdquo
toLower(ldquoscrİptrdquo) == ldquoscriptrdquo
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
len(x) = len(toLower(x))
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
bull Perform casing operations before validation
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Casing
March 2009 copy 2009 Chris Weber
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution
Casing - maximum expansion factors
Operation   UTF          Factor   Sample
Lower       8            1.5      Ⱥ U+023A
Lower       16, 32       1        A U+0041
Upper       8, 16, 32    3        ΐ U+0390
Source: Unicode Technical Report #36
Normalization - maximum expansion factors
Operation   UTF          Factor   Sample
NFC         8            3X       U+1D160
NFC         16, 32       3X       U+FB2C
NFD         8            3X       ΐ U+0390
NFD         16, 32       4X       ᾂ U+1F82
NFKC/NFKD   8            11X      ﷺ U+FDFA
NFKC/NFKD   16, 32       18X      ﷺ U+FDFA
Source: Unicode Technical Report #36
Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
– ICU, .NET
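The expansion factors in the normalization table can be checked with a short sketch using Python's standard `unicodedata` module. The function name `expansion_factor` is an assumption for illustration; "utf-16-le" is used so that no byte order mark is counted.

```python
import unicodedata


def expansion_factor(text: str, form: str, encoding: str) -> float:
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(text.encode(encoding))
    after = len(unicodedata.normalize(form, text).encode(encoding))
    return after / before


print(expansion_factor("\ufdfa", "NFKC", "utf-8"))      # U+FDFA in UTF-8
print(expansion_factor("\ufdfa", "NFKC", "utf-16-le"))  # U+FDFA in UTF-16
print(expansion_factor("\u1f82", "NFD", "utf-16-le"))   # U+1F82 in UTF-16
print(expansion_factor("\u0390", "NFD", "utf-8"))       # U+0390 in UTF-8
```

A destination buffer sized from the input length alone would be too small by these factors.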
Root Causes: Controlling Syntax
• White space and line breaks
– E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution
Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Case Study: Opera
• Unicode formatter characters exploited for XSS
– Damage: Filter evasion, controlling syntax
– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
– Root Cause: Interpreting "white space"
– A problem with HTML 4.0 spec
<a href=[U+180E]onclick=alert()>
DEMO: Opera White Space Characters
MVS (Mongolian Vowel Separator): U+180E
Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…
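One way to act on this guidance is to flag any separator or format character that is not on an explicit allowlist before markup reaches a parser. The sketch below is an assumption-laden illustration, not Opera's fix: the allowlist holds the four white space characters HTML 4.01 names plus CR and LF, and the category set is a choice made here to catch U+180E regardless of which Unicode version the runtime ships.

```python
import unicodedata

# White space named by HTML 4.01 (space, tab, form feed, zero width space),
# plus CR and LF as line breaks.
ALLOWED = {"\u0020", "\u0009", "\u000c", "\u200b", "\r", "\n"}

# Separator, format, and control categories (an assumption for this sketch).
SUSPICIOUS_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}


def unexpected_separators(markup: str) -> list:
    """List code points that may act as white space but are not allowlisted."""
    return [
        f"U+{ord(ch):04X}"
        for ch in markup
        if ch not in ALLOWED and unicodedata.category(ch) in SUSPICIOUS_CATEGORIES
    ]


print(unexpected_separators("<a href=\u180eonclick=alert()>"))
```

A filter could reject input outright when this list is non-empty rather than trying to repair it.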
Root Causes: Specifications
1) Character stability
– IDNA/Nameprep based on Unicode 3.2
2) Designs
– Specs are carefully designed but not always perfect
• This could have been a problem
– "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
– HTML 4.01
• Defines four whitespace characters and explicitly leaves handling other characters up to implementer
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data loss
Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input
Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
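The sketch below illustrates why a header/meta mismatch matters, using Python's standard codecs. The same two bytes decode to one harmless character under Shift_JIS but expose a backslash under ISO-8859-1. `decode_declared` is a hypothetical helper that follows the "error if uncertain" guidance by refusing to decode when the two declarations disagree.

```python
import codecs


def decode_declared(raw: bytes, header_charset: str, meta_charset: str) -> str:
    """Decode only when the HTTP header and the meta tag name the same codec."""
    header = codecs.lookup(header_charset).name
    meta = codecs.lookup(meta_charset).name
    if header != meta:
        raise ValueError(f"charset mismatch: header={header}, meta={meta}")
    return raw.decode(header, errors="strict")


raw = b"\x83\x5c"
as_shift_jis = raw.decode("shift_jis")   # one katakana character
as_latin1 = raw.decode("iso-8859-1")     # a C1 control followed by a backslash
print(repr(as_shift_jis), repr(as_latin1))
```

A filter that inspects the text under one charset while the user-agent renders it under the other is checking a different string from the one that gets interpreted.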
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
Tools
• Watcher
– Web-app security testing and auditing
• Visual Spoofing Detection API
– Providing guarantees against Visual Spoofing and Homograph attacks
Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher - Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com
Tools: Visual Spoofing Detection API
• Problem
– Unicode enables visual-spoofing-maximus
• Solution
– Confusable detection
– Invisibles detection
– Syntax spoof detection
– more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
– Browsers
– Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net, Casaba Security
Root Causes: Best-fit mappings
Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, Enable code execution
When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
U+2032 PRIME
Windows best-fit P/Invoke: Best-fit mappings
The .NET runtime will marshal a string as LPStr to a P/Invoke function
How can we best-fit the < character?
• U+2329 maps to U+003C: Left-Pointing Angle Bracket
• U+3008 maps to U+003C: Left Angle Bracket
How can we best-fit the s character?
• U+015B maps to U+0073: Latin Small Letter S With Acute
• U+015D maps to U+0073: Latin Small Letter S With Circumflex
To deal with this, specify an LPWStr type instead of LPStr:
[MarshalAs(UnmanagedType.LPWStr)]
Root Causes: Guidance for Best-Fit mappings
• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
Case Study: Social Networking - Best-fit mappings
• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
– Root Cause: Best-fit mappings
-moz-binding()
was not allowed but…
-[U+FF4D]oz-binding()
would best-fit map
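The "fail instead of substituting" guidance can be sketched in Python, with one caveat: Python's built-in codecs do not perform Windows-style best-fit mapping at all, so this does not reproduce the vulnerable behavior. It only shows the safe outcome that WC_NO_BEST_FIT_CHARS or an exception-throwing EncoderFallback aims for. The helper name `to_code_page` and the choice of cp1252 are assumptions.

```python
def to_code_page(text: str, code_page: str = "cp1252") -> bytes:
    """Convert to a legacy code page, refusing any character that does not fit."""
    try:
        return text.encode(code_page, errors="strict")
    except UnicodeEncodeError as exc:
        # No silent mapping of U+FF4D to 'm' or U+03C3 to 's': report and stop.
        bad = text[exc.start]
        raise ValueError(f"U+{ord(bad):04X} has no mapping in {code_page}") from exc


print(to_code_page("-moz-binding()"))
try:
    to_code_page("-\uff4doz-binding()")
except ValueError as err:
    print(err)
```

With a strict conversion, the fullwidth variant is rejected at the boundary instead of turning into the prohibited keyword after the filter has run.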
Root Causes: Normalization
Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution
NFD - Decompose (canonical)
NFC - Decompose (canonical), Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility), Recompose
İ (U+0130) becomes I (U+0049) + U+0307
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ (U+FE64) becomes < (U+003C)
toNFKC("﹤script>") = "<script>"
Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC: first defense against visual spoofing
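The ordering problem can be sketched with Python's standard `unicodedata` module. The validator `looks_safe` is a deliberately simple, hypothetical check; the two wrapper functions differ only in whether NFKC runs before or after it.

```python
import unicodedata


def looks_safe(text: str) -> bool:
    """Hypothetical validator: reject anything containing '<script'."""
    return "<script" not in text.lower()


def validate_then_normalize(text: str) -> str:
    """Unsafe order: the validator never sees the normalized form."""
    if not looks_safe(text):
        raise ValueError("rejected")
    return unicodedata.normalize("NFKC", text)


def normalize_then_validate(text: str) -> str:
    """Safer order: validate exactly the string that will be used."""
    normalized = unicodedata.normalize("NFKC", text)
    if not looks_safe(normalized):
        raise ValueError("rejected")
    return normalized


payload = "\ufe64script>"  # U+FE64 SMALL LESS-THAN SIGN, then 'script>'
print(validate_then_normalize(payload))
```

The second function raises on the same payload, because U+FE64 has already become U+003C by the time the check runs.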
Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '
Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
– Generation of non-shortest form
– Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
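A small sketch of how C0 A7 turns into an apostrophe: `naive_decode_two_byte` is a hypothetical decoder that applies the two-byte UTF-8 bit layout without checking for shortest form, while `strict_decode` relies on Python's built-in UTF-8 codec, which rejects the sequence.

```python
def naive_decode_two_byte(data: bytes) -> str:
    """Unsafe: decode a 2-byte UTF-8 sequence with no shortest-form check."""
    lead, trail = data
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


def strict_decode(data: bytes) -> str:
    """Validate the encoding and throw on any ill-formed sequence."""
    return data.decode("utf-8", errors="strict")


overlong = b"\xc0\xa7"
print(repr(naive_decode_two_byte(overlong)))  # the 0x27 the OS/framework sees
try:
    strict_decode(overlong)
except UnicodeDecodeError as err:
    print("rejected:", err.reason)
```

If the application layer compares bytes and a lower layer decodes leniently, the apostrophe only appears after validation is over.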
Attack Vectors: Directory traversal
How many ways can you say "../"?
• Directory traversal test cases
– http://site/root/../system
– Overlong UTF-8, U+002F: http://site/root/[C0 AE]system
– Full-width Solidus U+FF0F, normalized KC or KD: http://site/root/[EF BC 8F]system
– Division Slash U+2215, best-fit: http://site/root/[E2 88 95]system
– Two dot leader U+2025, normalized KC or KD: http://site/root/[E2 80 A5][E2 80 A5]system
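The byte values in these test cases, and which of the characters NFKC actually folds, can be checked with a short Python sketch. The helper name `describe` is an assumption; only the standard `unicodedata` module is used.

```python
import unicodedata

CANDIDATES = {
    "FULLWIDTH SOLIDUS": "\uff0f",
    "DIVISION SLASH": "\u2215",
    "TWO DOT LEADER": "\u2025",
}


def describe(ch: str) -> tuple:
    """Return the UTF-8 bytes (hex) and the NFKC form of one character."""
    return ch.encode("utf-8").hex().upper(), unicodedata.normalize("NFKC", ch)


for name, ch in CANDIDATES.items():
    utf8_hex, folded = describe(ch)
    print(f"U+{ord(ch):04X} {name}: bytes {utf8_hex}, NFKC -> {folded!r}")
```

U+2215 is left unchanged by NFKC, which matches its listing as a best-fit case rather than a normalization case.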
Root Causes: Handling the Unexpected
• Unassigned code points
– U+2073
• Illegal code points
– Half a surrogate pair
• Code points with special meaning
– U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution
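A rough classifier for the three kinds of unexpected input listed above, as a Python sketch. The function name `classify` and its labels are made up for illustration, and the "unassigned" result depends on the Unicode version bundled with the Python runtime (general category Cn also covers noncharacters).

```python
import unicodedata


def classify(ch: str) -> str:
    """Label a single code point as one of the 'unexpected' kinds, or 'ok'."""
    cp = ord(ch)
    if 0xD800 <= cp <= 0xDFFF:
        return "lone surrogate"
    if cp == 0xFEFF:
        return "special meaning (BOM)"
    if unicodedata.category(ch) == "Cn":
        return "unassigned"
    return "ok"


for sample in ("\u2073", "\ud800", "\ufeff", "A"):
    print(f"U+{ord(sample):04X}", classify(sample))
```

Python itself refuses to encode a lone surrogate as UTF-8, which is one example of failing rather than guessing.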
Root Causes: Handling the Unexpected - Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>
<img src=[0xC2]> onerror=alert(1)<br /> becomes
<img src=> onerror=alert(1)<br />
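Over-consumption can be reproduced with a deliberately broken decoder. `over_consuming_decode` below is a hypothetical sketch, not any real library: on a two-byte lead byte it always swallows the following byte, even when that byte is not a valid trail byte. Python's own UTF-8 codec with `errors="replace"` is shown for contrast.

```python
def over_consuming_decode(data: bytes) -> str:
    """Unsafe sketch: a lead byte 0xC2-0xDF always consumes the next byte."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if 0xC2 <= b <= 0xDF:
            nxt = data[i + 1] if i + 1 < len(data) else None
            if nxt is not None and 0x80 <= nxt <= 0xBF:
                out.append(chr(((b & 0x1F) << 6) | (nxt & 0x3F)))
            # Bug: an ill-formed pair is dropped whole, including the next byte.
            i += 2
        else:
            out.append(chr(b))  # remaining bytes treated as single characters
            i += 1
    return "".join(out)


data = bytes([0x41, 0xC2, 0x3E, 0x41])
print(repr(over_consuming_decode(data)))
print(repr(data.decode("utf-8", errors="replace")))
```

The built-in codec replaces only the stray lead byte with U+FFFD and keeps the 0x3E that follows, so the closing `>` survives.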
Root Causes: Handling the Unexpected - Character-substitution
Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad
Root Causes: Handling the Unexpected - Character-deletion
"deletion of noncharacters" (UTR #36)
<scr[U+FEFF]ipt> becomes <script>
Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
– A common alternative is ‘?’ which can be safe
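Replacing instead of deleting can be sketched as follows. Treating a mid-string U+FEFF as unexpected is an assumption made for this example, and the helper names are hypothetical; the noncharacter ranges (U+FDD0 to U+FDEF, and code points ending in FFFE or FFFF) follow the Unicode definition.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def is_unexpected(ch: str) -> bool:
    """True for U+FEFF and for Unicode noncharacters."""
    cp = ord(ch)
    return cp == 0xFEFF or 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def replace_unexpected(text: str) -> str:
    """Substitute U+FFFD so surrounding characters are never joined together."""
    return "".join(REPLACEMENT if is_unexpected(ch) else ch for ch in text)


def delete_unexpected(text: str) -> str:
    """The risky alternative: deletion can re-form a filtered keyword."""
    return "".join(ch for ch in text if not is_unexpected(ch))


sample = "<scr\ufeffipt>"
print(repr(delete_unexpected(sample)))
print(repr(replace_unexpected(sample)))
```

Because the replacement keeps a visible character in place, "scr" and "ipt" stay separated and the filtered keyword does not reappear.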
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
– E.g. Cross-site scripting (buffer overflow of the Web)
bull Scrutinize charactercharset manipulation APIrsquos
bull Use EncoderFallback with SystemTextEncoding
bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
bull Use Unicode end-to-end
wwwcasabasecuritycom
Root CausesGuidance for Best-Fit mappings
March 2009 copy 2009 Chris Weber
bull A popular social networking site in 2008
bull Implemented complex filtering logic to prevent XSS
ndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting
ndash Root Cause best-fit mappings
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
-moz-binding()
was not allowed buthellip
-[U+ff4d]oz-binding()
would best-fit map
wwwcasabasecuritycom
Case Study Social NetworkingBest-fit mappings
March 2009 copy 2009 Chris Weber
Normalizing strings after validation is dangerous
Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
NFD - Decompose (canonical)
NFC - Decompose (canonical) Recompose
NFKD - Decompose (compatibility)
NFKC - Decompose (compatibility) Recompose
wwwcasabasecuritycom
Root CausesNormalization
March 2009 copy 2009 Chris Weber
İ becomes I +
wwwcasabasecuritycom
Root CausesNormalization
U+0130 U+0049 U+0307
March 2009 copy 2009 Chris Weber
But are there dangerous characters
You bethellip with NFKC and NFKD you could control HTML or other parsing
﹤ becomes lt
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
﹤ becomes lt
toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
Normalize strings before validation
NFKC first defense against Visual spoofing
wwwcasabasecuritycom
Root CausesGuidance for Normalization
March 2009 copy 2009 Chris Weber
Root Causes: Non-Shortest Form UTF-8
Non-shortest or overlong UTF-8
Impact: filter evasion, enabling code execution
Application gets C0 A7
OS/Framework sees 27
Database gets ' (apostrophe)
Root Causes: Guidance for Non-Shortest Form UTF-8
• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
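A small sketch of the C0 A7 case follows. `naive_two_byte_decode` is a deliberately lax, hypothetical decoder that shows why the overlong form collapses to 0x27, while `is_well_formed_utf8` relies on Python's built-in decoder, which rejects it.

```python
def naive_two_byte_decode(lead: int, trail: int) -> int:
    """Lax decoder (illustration only): combines payload bits without
    checking that the result needed two bytes in the first place."""
    return ((lead & 0x1F) << 6) | (trail & 0x3F)


def is_well_formed_utf8(data: bytes) -> bool:
    """Validate by decoding strictly; any ill-formed sequence fails."""
    try:
        data.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False
```

With these definitions, `naive_two_byte_decode(0xC0, 0xA7)` yields 0x27 (the apostrophe), whereas `is_well_formed_utf8(b"\xc0\xa7")` is false.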
Attack Vectors: Directory Traversal
How many ways can you say
• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8, U+002F: httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two Dot Leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
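The byte sequences in the test cases, and the way normalization folds them, can be checked with a short sketch. `describe` is a hypothetical helper; it reports each character's UTF-8 bytes and its NFKC form.

```python
import unicodedata


def describe(ch: str) -> tuple:
    """Return (UTF-8 bytes as upper-case hex, NFKC-normalized form)."""
    return ch.encode("utf-8").hex().upper(), unicodedata.normalize("NFKC", ch)


fullwidth_solidus = describe("\uff0f")  # folds to '/' under NFKC
division_slash = describe("\u2215")     # unchanged by NFKC; a best-fit concern instead
two_dot_leader = describe("\u2025")     # folds to '..' under NFKC
```

This matches the labels in the list: U+FF0F and U+2025 become dangerous through compatibility normalization, while U+2215 only becomes a slash where a best-fit charset conversion maps it to one.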
Root Causes: Handling the Unexpected
• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: filter evasion, enabling code execution
Root Causes: Handling the Unexpected (Over-consumption)
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>
<img src=[0xC2]> onerror=alert(1)<br >
becomes
<img src=> onerror=alert(1)<br >
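To make over-consumption concrete, the sketch below contrasts a deliberately flawed toy decoder with Python's built-in UTF-8 decoder. `naive_decode` is hypothetical and exists only to reproduce the `41 C2 3E 41` behavior; it is not modeled on any particular product.

```python
def naive_decode(data: bytes) -> str:
    """Flawed on purpose: a 2-byte lead swallows the next byte unchecked."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if 0xC2 <= b <= 0xDF and i + 1 < len(data):
            trail = data[i + 1]
            if 0x80 <= trail <= 0xBF:
                out.append(chr(((b & 0x1F) << 6) | (trail & 0x3F)))
            # Bug: an invalid trail byte is discarded together with the lead.
            i += 2
        else:
            out.append(chr(b) if b < 0x80 else "\ufffd")
            i += 1
    return "".join(out)


def safe_decode(data: bytes) -> str:
    """Built-in decoder: replaces only the bad lead byte, keeps the '>'."""
    return data.decode("utf-8", errors="replace")
```

The toy decoder turns `A`, `0xC2`, `>`, `A` into `AA`, losing the `>`; the built-in decoder keeps it and marks the bad byte with U+FFFD.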
Root Causes: Handling the Unexpected (Character substitution)
Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad
Root Causes: Handling the Unexpected (Character deletion)
"deletion of noncharacters" (UTR-36)
<scr[U+FEFF]ipt> becomes <script>
Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
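A minimal sketch of "replace, do not delete", assuming the application wants to tolerate a single leading BOM and treat any other U+FEFF as unexpected. The function name `neutralize_bom` is hypothetical.

```python
BOM = "\ufeff"
REPLACEMENT = "\ufffd"


def neutralize_bom(text: str) -> str:
    """Drop one leading BOM, then replace any remaining U+FEFF with U+FFFD.

    Replacing keeps the surrounding characters apart, so 'scr' and 'ipt'
    cannot silently join into 'script' after filtering has already run.
    """
    body = text[1:] if text.startswith(BOM) else text
    return body.replace(BOM, REPLACEMENT)


# What plain deletion would do to the slide's example (the unsafe outcome).
deleted = "<scr\ufeffipt>".replace(BOM, "")
```

Raising an error instead of replacing is the stricter option and fits the first bullet, "Fail or error".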
Attack Vectors: Filter Evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)
Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root cause: character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>
DEMO: Safari BOM injection for XSS
A Closer Look: The BOM (U+FEFF)
Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enabling code execution
toLower("İ") == "i"
toLower("scrİpt") == "script"
len(x) != len(toLower(x))
Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
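The length change caused by casing can be observed directly in Python. Python's default `str.lower()` maps U+0130 to "i" plus a combining dot above (U+0307) rather than to a bare "i" as written on the slide, so the exact result depends on the case-mapping rules a platform applies. The helper name below is an assumption for illustration.

```python
def lower_before_validation(text: str) -> str:
    """Apply the casing operation first so validation sees the final form."""
    return text.lower()


capital_i_dot = "\u0130"
lowered = lower_before_validation(capital_i_dot)

# Upper-casing can grow a string too: sharp s becomes two letters.
sharp_s_upper = "\u00df".upper()
```

Any buffer sized from the length of the input before casing may therefore be too small afterwards.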
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enabling code execution
Casing: maximum expansion factors
Operation   UTF          Factor   Sample
Lower       8            1.5      Ⱥ U+023A
Lower       16, 32       1        A U+0041
Upper       8, 16, 32    3        ΐ U+0390
Source: Unicode Technical Report 36
Normalization: maximum expansion factors
Operation   UTF          Factor   Sample
NFC         8            3X       𝅘𝅥𝅮 U+1D160
NFC         16, 32       3X       שּׁ U+FB2C
NFD         8            3X       ΐ U+0390
NFD         16, 32       4X       ᾂ U+1F82
NFKC/NFKD   8            11X      ﷺ U+FDFA
NFKC/NFKD   16, 32       18X      ﷺ U+FDFA
Source: Unicode Technical Report 36
Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
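The expansion factors in the tables can be recomputed for the sample characters. This is a sketch using Python's `unicodedata`; the two function names are hypothetical, and `utf-16-le` is chosen so that no BOM is counted in the byte length.

```python
import unicodedata


def normalization_factor(ch: str, form: str, encoding: str) -> float:
    """Encoded size after normalization divided by encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before


def casing_factor(ch: str, operation: str, encoding: str) -> float:
    """Encoded size after lower/upper casing divided by size before."""
    converted = ch.lower() if operation == "lower" else ch.upper()
    return len(converted.encode(encoding)) / len(ch.encode(encoding))
```

For example, `normalization_factor("\ufdfa", "NFKC", "utf-8")` gives 11.0 and the same call with `"utf-16-le"` gives 18.0, so a buffer sized for the input must allow for that growth.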
Root Causes: Controlling Syntax
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enabling code execution
Attacks and Exploits: Controlling Syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Case Study: Opera
• Unicode formatter characters exploited for XSS
  – Damage: filter evasion, controlling syntax
  – Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root cause: interpreting "white space"
  – A problem with the HTML 4.0 spec
<a href=[U+180E]onclick=alert()>
DEMO: Opera White Space Characters
MVS (U+180E)
Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…
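One cautious approach is to flag any separator, format, or control character other than a plain space before input reaches a parser that might treat it as white space. The category set and the function name below are assumptions for illustration. U+180E has been classified as a space separator in some Unicode versions and as a format character in others, so the sketch checks both.

```python
import unicodedata

# General categories that a lenient parser might treat as syntax-relevant.
SUSPICIOUS_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}


def suspicious_characters(text: str) -> list:
    """Return (index, code point label) for each non-space separator,
    format, or control character found in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if ch != " " and unicodedata.category(ch) in SUSPICIOUS_CATEGORIES
    ]
```

Applied to the Opera example, the function reports U+180E at index 8, directly after `href=`.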
Root Causes: Specifications
1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enabling code execution, data loss
Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
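"Use Unicode as the broker" can be sketched as a two-step conversion that always passes through a Unicode string and fails on anything that does not map exactly. The function name `transcode` is hypothetical.

```python
def transcode(data: bytes, source: str, target: str = "utf-8") -> bytes:
    """Convert between charsets via Unicode, raising on any bad mapping.

    Decoding and encoding are both strict, so ill-formed input or an
    unrepresentable character stops the conversion instead of being
    silently substituted.
    """
    text = data.decode(source, errors="strict")
    return text.encode(target, errors="strict")
```

Validation, case transformation, and normalization would then operate on the intermediate Unicode string rather than on either byte form.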
Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enabling code execution
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input
Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
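A rough sketch of "force UTF-8, error if uncertain": extract the charset from the HTTP header and from the meta tag, and refuse to continue unless everything that was declared is UTF-8. The regular expression and both function names are assumptions for illustration; production code would use a real header and HTML parser.

```python
import re
from typing import Optional

# Simplified pattern; assumes the charset label uses common ASCII characters.
_CHARSET_RE = re.compile(r"charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)", re.IGNORECASE)


def extract_charset(value: str) -> Optional[str]:
    """Return the lower-cased charset label found in value, if any."""
    match = _CHARSET_RE.search(value)
    return match.group(1).lower() if match else None


def resolve_charset(http_header: str, meta_tag: str) -> str:
    """Accept only an unambiguous UTF-8 declaration; raise otherwise."""
    declared = {extract_charset(http_header), extract_charset(meta_tag)} - {None}
    if declared != {"utf-8"}:
        raise ValueError(f"ambiguous or non-UTF-8 charset: {sorted(declared)}")
    return "utf-8"
```

With the mismatched header and meta tag from the example above, `resolve_charset` raises rather than leaving the choice to browser sniffing.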
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
Tools
• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against visual spoofing and homograph attacks
Tools: Watcher (Some of the Passive Checks Included)
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher (Web-app Security Testing and Auditing)
http://websecuritytool.codeplex.com
Tools: Visual Spoofing Detection API
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net, Casaba Security
www.casabasecurity.com
March 2009 copy 2009 Chris Weber
Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution
Root Causes: Normalization
NFD: Decompose (canonical)
NFC: Decompose (canonical), then Recompose
NFKD: Decompose (compatibility)
NFKC: Decompose (compatibility), then Recompose
Root Causes: Normalization
İ (U+0130) becomes I (U+0049) + ◌̇ (U+0307)
Root Causes: Normalization
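The decomposition of U+0130 can be checked with a few lines of Python. This is a minimal sketch using the standard `unicodedata` module; the helper name `code_points` is made up for illustration and is not from the talk.

```python
import unicodedata


def code_points(text: str) -> str:
    """Render a string as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


DOTTED_CAPITAL_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE

# Apply each of the four normalization forms to the same input.
forms = {
    form: unicodedata.normalize(form, DOTTED_CAPITAL_I)
    for form in ("NFD", "NFC", "NFKD", "NFKC")
}

for form, result in forms.items():
    print(f"{form}: {code_points(result)}")
```

The decomposing forms (NFD, NFKD) split the character into two code points, while the composing forms (NFC, NFKC) return the single precomposed code point.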
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ (U+FE64) becomes < (U+003C)
Root Causes: Normalization
﹤ (U+FE64) becomes < (U+003C)
toNFKC(“﹤script>”) = “<script>”
Root Causes: Normalization
Normalize strings before validation
NFKC: first defense against visual spoofing
Root Causes: Guidance for Normalization
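The ordering problem can be sketched in Python. The filter below is a deliberately naive stand-in (an assumption for illustration, not a real sanitizer), and the function names are hypothetical. It only shows why normalizing after the check lets a compatibility character through.

```python
import unicodedata

FORBIDDEN = ("<", ">")  # toy filter: reject markup delimiters


def is_safe(text: str) -> bool:
    """Return True when none of the forbidden characters are present."""
    return not any(ch in text for ch in FORBIDDEN)


def validate_then_normalize(text: str) -> str:
    """UNSAFE order: the check runs before NFKC changes the string."""
    if not is_safe(text):
        raise ValueError("rejected")
    return unicodedata.normalize("NFKC", text)


def normalize_then_validate(text: str) -> str:
    """Safer order: the check sees exactly what later code will see."""
    normalized = unicodedata.normalize("NFKC", text)
    if not is_safe(normalized):
        raise ValueError("rejected after normalization")
    return normalized


# U+FE64 and U+FE65 are the small less-than and greater-than signs.
payload = "\uFE64script\uFE65"
```

With this payload, the first function returns a string containing ASCII angle brackets, while the second raises.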
Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets ' (U+0027)
Root Causes: Non-shortest form UTF-8
• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
Root Causes: Guidance for Non-shortest form UTF-8
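A short Python sketch of the C0 A7 case. The function `naive_decode_two_byte` is a hypothetical decoder that skips the shortest-form check; it is here only to show how the overlong pair collapses to 0x27. Python's built-in strict UTF-8 decoder rejects the same bytes.

```python
def naive_decode_two_byte(lead: int, trail: int) -> int:
    """UNSAFE: decode a 2-byte UTF-8 sequence without a shortest-form check."""
    return ((lead & 0x1F) << 6) | (trail & 0x3F)


def strict_decode(data: bytes) -> str:
    """Decode UTF-8 and raise UnicodeDecodeError on any ill-formed input."""
    return data.decode("utf-8", errors="strict")


overlong_apostrophe = bytes([0xC0, 0xA7])

# The lenient decoder yields U+0027, the apostrophe a filter never saw.
print(hex(naive_decode_two_byte(0xC0, 0xA7)))

try:
    strict_decode(overlong_apostrophe)
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```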
How many ways can you say
Attack Vectors: Directory traversal
• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8, U+002F: httpsiterootC0AEsystem (note: C0 AE is the overlong form of U+002E; the overlong form of U+002F is C0 AF)
  – Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
Attack Vectors
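The normalization-based test cases can be reproduced in Python. This sketch covers only NFKC; best-fit mappings are done by charset converters on some platforms and are not modeled here. The function `contains_traversal` and the sample path are assumptions for illustration.

```python
import unicodedata


def nfkc(text: str) -> str:
    """Apply compatibility normalization (NFKC)."""
    return unicodedata.normalize("NFKC", text)


def contains_traversal(path: str) -> bool:
    """Check for a '..' segment after NFKC normalization."""
    return ".." in nfkc(path).split("/")


FULLWIDTH_SOLIDUS = "\uFF0F"
DIVISION_SLASH = "\u2215"
TWO_DOT_LEADER = "\u2025"

# Hypothetical path that contains no ASCII dot or slash at all.
disguised = "root" + FULLWIDTH_SOLIDUS + TWO_DOT_LEADER + FULLWIDTH_SOLIDUS + "system"

print(nfkc(disguised))
print(contains_traversal(disguised))
```

U+2215 has no compatibility decomposition, so NFKC leaves it alone. It becomes a slash only where a best-fit conversion is applied.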
• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution
Root Causes: Handling the Unexpected
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes
<41 41>
Root Causes: Handling the Unexpected: Over-consumption
<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
Root Causes: Handling the Unexpected: Over-consumption
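The byte sequence 41 C2 3E 41 can be run through two decoders for comparison. `overconsuming_decode` is a hypothetical, deliberately flawed decoder written for this sketch. It is contrasted with Python's built-in decoder using `errors="replace"`, which substitutes only the bad lead byte.

```python
def overconsuming_decode(data: bytes) -> str:
    """UNSAFE sketch: a 2-byte lead always swallows the byte after it."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif 0xC2 <= b <= 0xDF and i + 1 < len(data):
            trail = data[i + 1]
            if 0x80 <= trail <= 0xBF:
                out.append(chr(((b & 0x1F) << 6) | (trail & 0x3F)))
            # Ill-formed pair: both bytes are dropped, including the
            # innocent byte that was never a continuation byte.
            i += 2
        else:
            i += 1  # anything else is silently dropped in this sketch
    return "".join(out)


sample = bytes([0x41, 0xC2, 0x3E, 0x41])

print(repr(overconsuming_decode(sample)))
print(repr(sample.decode("utf-8", errors="replace")))
```

The flawed decoder loses the `>` (0x3E), which is the behavior the slide describes. The built-in decoder keeps it and marks only the stray lead byte with U+FFFD.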
Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad
Root Causes: Handling the Unexpected: Character-substitution
“deletion of noncharacters” (UTR-36)
Root Causes: Handling the Unexpected: Character-deletion
<scr[U+FEFF]ipt> becomes <script>
Root Causes: Handling the Unexpected: Character-deletion
• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
Root Causes: Solutions for Handling the Unexpected
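Deletion and U+FFFD replacement can be compared in a small Python sketch. The blocklist check is a toy assumption and the function names are hypothetical. It only shows that deleting U+FEFF after filtering reassembles the blocked token, while replacing it does not.

```python
BOM = "\uFEFF"
REPLACEMENT = "\uFFFD"


def blocklist_allows(text: str) -> bool:
    """Toy filter: allow anything that does not contain '<script>'."""
    return "<script>" not in text.lower()


def delete_bom(text: str) -> str:
    """Risky handling: silently remove every U+FEFF."""
    return text.replace(BOM, "")


def replace_bom(text: str) -> str:
    """Safer handling: substitute U+FFFD so the text stays broken apart."""
    return text.replace(BOM, REPLACEMENT)


payload = "<scr" + BOM + "ipt>"

print(blocklist_allows(payload))           # the filter sees no '<script>'
print(delete_bom(payload) == "<script>")   # deletion rebuilds the tag
print("<script>" in replace_bom(payload))  # replacement does not
```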
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. Cross-site scripting (buffer overflow of the Web)
Attack Vectors: Filter evasion
Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')>
Can be nastier
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>
Case Study: Apple and Mozilla
DEMO
Safari BOM injection for XSS
A Closer Look: The BOM
BOM: U+FEFF
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution
Root Causes: Casing
toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”
Root Causes: Casing
len(x) != len(toLower(x))
Root Causes: Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
Root Causes: Guidance for Casing
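Python's built-in case mappings show how casing changes string length. Note that Python uses the default Unicode mappings, so U+0130 lowercases to i followed by U+0307 rather than the plain i shown on the slide; locale-aware frameworks can behave differently. The helper names are hypothetical.

```python
from typing import Tuple


def case_lengths(text: str) -> Tuple[int, int, int]:
    """Return (original, lowercased, uppercased) lengths in code points."""
    return len(text), len(text.lower()), len(text.upper())


def validate_after_casing(text: str, forbidden: str = "script") -> str:
    """Case-fold first, then run the check on the folded result."""
    folded = text.casefold()
    if forbidden in folded:
        raise ValueError("forbidden token after case folding")
    return folded


print(case_lengths("\u0130"))  # dotted capital I grows when lowercased
print(case_lengths("\u0390"))  # Greek iota with dialytika and tonos grows when uppercased
```

Validating the folded string means the check runs on the same text that later code will compare or display.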
• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution
Root Causes: Buffer Overflows
Casing: maximum expansion factors
Operation  UTF        Factor  Sample
Lower      8          1.5X    Ⱥ U+023A
Lower      16, 32     1X      A U+0041
Upper      8, 16, 32  3X      ΐ U+0390
Source: Unicode Technical Report 36
Root Causes: Buffer Overflows
Normalization: maximum expansion factors
Operation  UTF        Factor  Sample
NFC        8          3X      𝅘𝅥𝅮 U+1D160
NFC        16, 32     3X      שּׁ U+FB2C
NFD        8          3X      ΐ U+0390
NFD        16, 32     4X      ᾂ U+1F82
NFKC/NFKD  8          11X     ﷺ U+FDFA
NFKC/NFKD  16, 32     18X     ﷺ U+FDFA
Source: Unicode Technical Report 36
Root Causes: Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
Root Causes: Guidance for Buffer Overflows
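The expansion factors can be recomputed by measuring encoded sizes before and after normalization. This is a minimal sketch with a hypothetical helper, `expansion`. The little-endian codec names are used so that Python does not prepend a BOM to the byte counts.

```python
import unicodedata


def expansion(ch: str, form: str, encoding: str) -> float:
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before


SAMPLES = [
    ("\U0001D160", "NFC", "utf-8"),
    ("\u0390", "NFD", "utf-8"),
    ("\u1F82", "NFD", "utf-16-le"),
    ("\uFDFA", "NFKC", "utf-8"),
    ("\uFDFA", "NFKC", "utf-16-le"),
]

for ch, form, encoding in SAMPLES:
    print(f"U+{ord(ch):04X} {form} {encoding}: {expansion(ch, form, encoding)}X")
```

A buffer sized from the input length alone can be too small by these factors once the string is normalized.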
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution
Root Causes: Controlling Syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Attacks and Exploits: Controlling syntax
• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with HTML 4.0 spec
Case Study: Opera
<a href=[U+180E]onclick=alert()>
Case Study: Opera
DEMO
Opera White Space Characters
Case Study: Opera
MVS: U+180E
• Question specifications
• Be careful…
Root Causes: Guidance for Controlling Syntax
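One defensive reading of "be careful" is to flag any separator or format character outside an explicit allowlist, rather than trusting a parser's notion of white space. In this sketch the allowlist is the four white space characters of HTML 4.01 plus CR and LF, and the category set is an assumption chosen for illustration. The function name is hypothetical.

```python
import unicodedata

HTML401_WHITESPACE = {"\u0020", "\u0009", "\u000C", "\u200B"}
LINE_BREAKS = {"\u000D", "\u000A"}

# Space separators, line/paragraph separators, format and control characters.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}


def unexpected_separators(text: str) -> list:
    """List code points that may act as white space but are not allowlisted."""
    allowed = HTML401_WHITESPACE | LINE_BREAKS
    return [
        f"U+{ord(ch):04X}"
        for ch in text
        if ch not in allowed and unicodedata.category(ch) in SUSPECT_CATEGORIES
    ]


print(unexpected_separators("\u180Eonclick=alert()"))
print(unexpected_separators("a b\tc"))
```

U+180E has been classified as a space separator in some Unicode versions and as a format character in others, so the check covers both categories.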
1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer
Root Causes: Specifications
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss
Root Causes: Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
Root Causes: Guidance for Charset Transformations
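"Use Unicode as the broker" can be sketched as a two-step conversion that decodes to Unicode and then encodes to the target, failing on anything that does not map cleanly. The function name `transcode` is hypothetical. Python's codecs do not apply best-fit mappings, so an unmappable character raises an error instead of being silently substituted.

```python
def transcode(raw: bytes, source: str, target: str) -> bytes:
    """Convert between charsets through Unicode, failing on any bad mapping."""
    text = raw.decode(source, errors="strict")    # source bytes -> Unicode
    return text.encode(target, errors="strict")   # Unicode -> target bytes


latin1_bytes = "\u00e9".encode("iso-8859-1")
print(transcode(latin1_bytes, "iso-8859-1", "utf-8"))

try:
    transcode("\u65e5\u672c".encode("shift_jis"), "shift_jis", "iso-8859-1")
except UnicodeEncodeError as exc:
    print("unmappable:", exc.reason)
```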
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution
Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input
Root Causes: Charset Mismatches
• Force UTF-8
• Error if uncertain
Root Causes: Guidance for Charset Mismatches
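The two guidance points can be sketched for a server: always declare UTF-8 on output, and refuse input whose charset is missing or different instead of guessing. Both function names are hypothetical, and a real application would hook this into its web framework.

```python
from typing import Optional, Tuple


def content_type_header(media_type: str = "text/html") -> Tuple[str, str]:
    """Build a Content-Type header that always declares UTF-8."""
    return ("Content-Type", f"{media_type}; charset=utf-8")


def decode_body(raw: bytes, declared_charset: Optional[str]) -> str:
    """Accept only input explicitly declared as UTF-8, and decode it strictly."""
    if declared_charset is None or declared_charset.strip().lower() not in ("utf-8", "utf8"):
        raise ValueError("refusing to guess the charset")
    return raw.decode("utf-8", errors="strict")


print(content_type_header())
print(decode_body("caf\u00e9".encode("utf-8"), "UTF-8"))
```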
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
Exploiting Unicode-enabled Software: Agenda
• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks
Tools
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher – Some of the Passive Checks Included
http://websecuritytool.codeplex.com
Tools: Watcher - Web-app Security Testing and Auditing
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
Tools: Visual Spoofing Detection API
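As a rough illustration of what "invisibles detection" might involve, the sketch below flags format characters (general category Cf) in a string. This is not the API described in the talk. The function name and the choice of category are assumptions made for this example.

```python
import unicodedata


def find_invisibles(text: str) -> list:
    """Return (index, code point) pairs for format characters (category Cf)."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


# Hypothetical label containing a zero width space between "pay" and "pal".
print(find_invisibles("pay\u200Bpal"))
print(find_invisibles("paypal"))
```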
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Detection API
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net, Casaba Security
www.casabasecurity.com
İ becomes I +
wwwcasabasecuritycom
Root CausesNormalization
U+0130 U+0049 U+0307
March 2009 copy 2009 Chris Weber
But are there dangerous characters
You bethellip with NFKC and NFKD you could control HTML or other parsing
﹤ becomes lt
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
﹤ becomes lt
toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo
wwwcasabasecuritycom
Root CausesNormalization
U+FE64 U+003C
March 2009 copy 2009 Chris Weber
Normalize strings before validation
NFKC first defense against Visual spoofing
wwwcasabasecuritycom
Root CausesGuidance for Normalization
March 2009 copy 2009 Chris Weber
Non-shortest or overlong UTF-8
Impact Filter evasion Enable code execution
Application gets C0A7
OSFramework sees 27
Database gets
wwwcasabasecuritycom
Root CausesNon-shortest form UTF-8
March 2009 copy 2009 Chris Weber
bull Unicode specification forbids
ndash Generation of non-shortest form
ndash Interpretation of non-shortest form for BMP
bull Validate UTF-8 encoding (throw on error)
wwwcasabasecuritycom
Root CausesGuidance for Non-shortest form UTF-8
March 2009 copy 2009 Chris Weber
How many ways can you say
wwwcasabasecuritycom
Attack VectorsDirectory traversal
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Directory traversal test casesndash httpsiterootsystem
ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem
ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem
ndash Division Slash U+2215 best-fithttpsiteroot E28895system
ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system
wwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Unassigned code points
ndash U+2073
bull Illegal code points
ndash Half a surrogate pair
bull Code points with special meaning
ndash U+FEFF is the BOM
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesHandling the Unexpected
March 2009 copy 2009 Chris Weber
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
lt41 C2 3E 41gt becomes
lt41 41gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
ltimg src=[0xC2]gt onerror=alert(1)ltbr gt
becomes
ltimg src=gt onerror=alert(1)ltbr gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
Correcting insecurely rather than failing
ndash Substituting a lsquorsquo or a lsquorsquo would be bad
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-substitution
March 2009 copy 2009 Chris Weber
ldquodeletion of noncharactersrdquo (UTR-36)
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
ltscr[U+FEFF]iptgt becomes ltscriptgt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
bull Fail or error
bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe
wwwcasabasecuritycom
Root CausesSolutions for Handling the Unexpected
March 2009 copy 2009 Chris Weber
bull Bypass filters WAFrsquos NIDS and validation
bull Exploit delivery techniques
ndash Eg Cross-site scripting (buffer overflow of the Web)
wwwcasabasecuritycom
Attack VectorsFilter evasion
March 2009 copy 2009 Chris Weber
Safari and Firefox BOM consumptionndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting
ndash Root Cause Character deletion
lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt
Can be nastier
lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt
wwwcasabasecuritycom
Case Study Apple and Mozilla
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Safari BOM injection for XSS
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
A Closer Look The BOM
BOMU+FEFF
March 2009 copy 2009 Chris Weber
bull Attackers manipulate casing operations to inject otherwise prohibited characters
bull Casing can multiply the buffer sizes needed
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
toLower(ldquoİrdquo) == ldquoirdquo
toLower(ldquoscrİptrdquo) == ldquoscriptrdquo
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
len(x) = len(toLower(x))
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
bull Perform casing operations before validation
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Casing
March 2009 copy 2009 Chris Weber
bull Incorrect assumptions about string sizes (chars vs bytes)
bull Improper width calculations
bull Impact Enable code execution
wwwcasabasecuritycom
Root CausesBuffer Overflows
March 2009 copy 2009 Chris Weber
Casing - maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
Lower 8 15 Ⱥ U+023A
16 32 1 A U+0041
Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
Normalization- maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
NFC8 3X 119136 U+1D160
16 32 3X ש U+FB2C
NFD8 3X ΐ U+0390
16 32 4X ᾂ U+1F82
NFKCNFKD8 11X
ملسو هيلع هللا ىلص U+FDFA16 32 18X
Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
bull Know the difference between bytes and chars
bull Secure coding
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Buffer Overflows
March 2009 copy 2009 Chris Weber
bull White space and line breaks
ndash Eg when U+180E acts like U+0020
bull Quotation marks
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesControlling Syntax
March 2009 copy 2009 Chris Weber
bull Manipulate HTML parsers and javascriptinterpreters
bull Control protocols
wwwcasabasecuritycom
Attacks and ExploitsControlling syntax
March 2009 copy 2009 Chris Weber
bull Unicode formatter characters exploited for XSS
ndash Damage Filter evasion controlling syntax
ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting
ndash Root Cause Interpreting ldquowhite spacerdquo
ndash A problem with HTML 40 spec
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
lta href=[U+180E]onclick=alert()gt
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Opera White Space Characters
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Case Study Opera
MVSU+180E
March 2009 copy 2009 Chris Weber
bull Question specifications
bull Be carefulhellip
wwwcasabasecuritycom
Root CausesGuidance for Controlling Syntax
March 2009 copy 2009 Chris Weber
1) Character stabilityndash IDNANameprep based on Unicode 32
2) Designsndash Specs are carefully designed but not always perfect
bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of
U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo
ndash HTML 401 bull Defines four whitespace characters and explicitly leaves
handling other characters up to implementer
wwwcasabasecuritycom
Root CausesSpecifications
March 2009 copy 2009 Chris Weber
bull Converting between charsets is dangerous
bull Mapping tables and algorithms vary across platforms
bull Impact Filter evasion Enable code execution Data-loss
wwwcasabasecuritycom
Root CausesCharset Transformations
March 2009 copy 2009 Chris Weber
bull Avoid if possible
bull Use Unicode as the broker
bull Beware the PUA mappings
bull Transform case and normalize prior to validation and redisplay
wwwcasabasecuritycom
Root CausesGuidance for Charset Transformations
March 2009 copy 2009 Chris Weber
bull Some charset identifiers are ill-defined
bull Vendor implementations vary
bull User-agents may sniff if confused
bull Attackers manipulate behavior
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCharset Mismatches
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Root CausesCharset Mismatches
Content-Type charset=ISO-8859-1
ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt
Attacker-controlled input
March 2009 copy 2009 Chris Weber
bull Force UTF-8
bull Error if uncertain
wwwcasabasecuritycom
Root CausesGuidance for Charset Mismatches
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Watcher
ndash Web-app security testing and auditing
bull Visual Spoofing Detection API
ndash Providing guarantees against Visual Spoofing and Homograph attacks
wwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
bull Unicode transformation hot-spotsbull User-controlled HTMLbull Cross-domain issuesbull Insecure cookiesbull Insecure HTTPHTTPS transitionsbull SSL protocol and certificate issuesbull XSS hot-spotsbull Flash issuesbull Silverlight issuesbull Information disclosure
wwwcasabasecuritycom
ToolsWatcher ndash Some of the Passive Checks Included
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
httpwebsecuritytoolcodeplexcom
wwwcasabasecuritycom
ToolsWatcher - Web-app Security Testing and Auditing
Tools: Visual Spoofing Detection API
• Problem
  - Unicode enables visual-spoofing-maximus
• Solution
  - Confusable detection
  - Invisibles detection
  - Syntax spoof detection
  - more
Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  - Browsers
  - Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share.
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.
Chris Weber, www.lookout.net, Casaba Security
www.casabasecurity.com
Root Causes: Normalization
But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing
﹤ (U+FE64) becomes < (U+003C)
toNFKC("﹤script>") = "<script>"
Root Causes: Guidance for Normalization
Normalize strings before validation
NFKC first defense against Visual spoofing
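The normalization guidance can be sketched with Python's standard `unicodedata` module. The two filter functions below are hypothetical stand-ins for an application's validation step, not code from the talk; they differ only in whether NFKC is applied before the check.

```python
import unicodedata

def is_safe_naive(text: str) -> bool:
    """Hypothetical filter: rejects '<' but does not normalize first."""
    return "<" not in text

def is_safe_normalized(text: str) -> bool:
    """Same check, applied after NFKC normalization."""
    return "<" not in unicodedata.normalize("NFKC", text)

# U+FE64 SMALL LESS-THAN SIGN followed by "script>"
payload = "\uFE64script>"

print(is_safe_naive(payload))        # the raw string contains no U+003C
print(is_safe_normalized(payload))   # NFKC maps U+FE64 to U+003C
print(unicodedata.normalize("NFKC", payload))
```

If a later layer normalizes the string after the naive check has passed, the tag delimiter appears downstream of the validation, which is why the order of the two steps matters.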
Root Causes: Non-shortest form UTF-8
Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution
Application gets C0 A7
OS/Framework sees 27
Database gets '
Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  - Generation of non-shortest form
  - Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
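A short sketch of "validate UTF-8 encoding (throw on error)". The helper `overlong_two_byte` is a hypothetical function written for this example; it builds the ill-formed two-byte form of an ASCII code point so that a strict decoder can be shown rejecting it. Python's built-in UTF-8 codec is used as the strict decoder.

```python
def overlong_two_byte(code_point: int) -> bytes:
    """Build the ill-formed (overlong) two-byte UTF-8 form of an ASCII code point."""
    if not 0 <= code_point < 0x80:
        raise ValueError("this sketch only handles ASCII code points")
    return bytes([0xC0 | (code_point >> 6), 0x80 | (code_point & 0x3F)])

def decode_strict(data: bytes) -> str:
    """Decode UTF-8, raising UnicodeDecodeError on any ill-formed sequence."""
    return data.decode("utf-8", errors="strict")

apostrophe = overlong_two_byte(0x27)
print(apostrophe.hex().upper())   # the bytes the application receives

try:
    decode_strict(apostrophe)
    rejected = False
except UnicodeDecodeError:
    rejected = True
print(rejected)
```

The same helper gives C0 AE for U+002E and C0 AF for U+002F, so the arithmetic can be used to check which character a given overlong pair stands for.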
Attack Vectors: Directory traversal
How many ways can you say
Attack Vectors
• Directory traversal test cases
  - httpsiterootsystem
  - Overlong UTF-8, U+002F: httpsiterootC0AEsystem
  - Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
  - Division Slash U+2215, best-fit: httpsiteroot E28895system
  - Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
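The byte sequences in the test cases above can be checked with a small sketch. It prints the UTF-8 encoding of each look-alike character and what NFKC turns it into; the dictionary name `CANDIDATES` is an assumption of this example.

```python
import unicodedata

# Characters named in the test cases, keyed by their Unicode names.
CANDIDATES = {
    "FULLWIDTH SOLIDUS": "\uFF0F",
    "DIVISION SLASH": "\u2215",
    "TWO DOT LEADER": "\u2025",
}

for name, ch in CANDIDATES.items():
    utf8 = ch.encode("utf-8").hex().upper()
    nfkc = unicodedata.normalize("NFKC", ch)
    print(f"{name}: UTF-8 {utf8}, NFKC -> {nfkc!r}")
```

NFKC leaves the division slash unchanged, which fits the slide's labelling of that case as a best-fit mapping rather than a normalization result; the full-width solidus and the two dot leader do normalize to path-significant ASCII.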
Root Causes: Handling the Unexpected
• Unassigned code points
  - U+2073
• Illegal code points
  - Half a surrogate pair
• Code points with special meaning
  - U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution
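One way to spot the three kinds of code point listed above is the general category reported by Python's `unicodedata` module. This is only a sketch: the `classify` function and its labels are invented for the example, and the result for an unassigned code point depends on the Unicode version bundled with the interpreter.

```python
import unicodedata

def classify(ch: str) -> str:
    """Return a coarse label for a single code point (labels are this sketch's own)."""
    category = unicodedata.category(ch)
    if category == "Cn":
        return "unassigned"
    if category == "Cs":
        return "surrogate"
    if category == "Cf":
        return "format"
    return "other"

for ch in ("\u2073", "\uD800", "\uFEFF", "A"):
    print(f"U+{ord(ch):04X}: {classify(ch)}")

# A lone surrogate cannot be encoded as well-formed UTF-8.
try:
    "\uD800".encode("utf-8")
    lone_surrogate_encodable = True
except UnicodeEncodeError:
    lone_surrogate_encodable = False
print(lone_surrogate_encodable)
```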
Root Causes: Handling the Unexpected: Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>
Root Causes: Handling the Unexpected: Over-consumption
<img src=[0xC2]> onerror=alert(1)<br >
becomes
<img src=> onerror=alert(1)<br >
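The over-consumption bug can be reproduced with a deliberately broken decoder. `naive_decode` below is a hypothetical implementation written for this sketch, not code from any real product; it is compared with Python's built-in UTF-8 codec, which replaces only the ill-formed lead byte and resumes at the next byte.

```python
def naive_decode(data: bytes) -> str:
    """Hypothetical buggy decoder: after a two-byte lead it always skips the
    next byte, even when that byte is not a valid continuation byte."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif 0xC2 <= b <= 0xDF and i + 1 < len(data):
            trail = data[i + 1]
            if 0x80 <= trail <= 0xBF:
                out.append(chr(((b & 0x1F) << 6) | (trail & 0x3F)))
            # BUG: an invalid trail byte is swallowed together with the lead.
            i += 2
        else:
            i += 1  # other bytes are dropped; enough for this demonstration
    return "".join(out)

data = bytes([0x41, 0xC2, 0x3E, 0x41])

print(naive_decode(data))                       # the '>' (0x3E) is gone
print(data.decode("utf-8", errors="replace"))   # '>' survives, C2 becomes U+FFFD
```

In the `<img>` example, the byte that disappears is the one that would have closed the tag, so the text after it is parsed as part of the tag.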
Root Causes: Handling the Unexpected: Character-substitution
Correcting insecurely rather than failing
  - Substituting a ‘’ or a ‘’ would be bad
Root Causes: Handling the Unexpected: Character-deletion
“deletion of noncharacters” (UTR-36)
<scr[U+FEFF]ipt> becomes <script>
Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  - A common alternative is ‘’ which can be safe
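The difference between deleting an unexpected character and replacing it with U+FFFD can be shown in a few lines. The filter and the two clean-up functions are hypothetical, written only to contrast the two behaviours.

```python
BOM = "\uFEFF"

def contains_script_tag(text: str) -> bool:
    """Hypothetical filter: looks for the literal string '<script>'."""
    return "<script>" in text.lower()

def delete_bom(text: str) -> str:
    """Unsafe clean-up: silently removes U+FEFF."""
    return text.replace(BOM, "")

def replace_bom(text: str) -> str:
    """Safer clean-up: substitutes U+FFFD REPLACEMENT CHARACTER."""
    return text.replace(BOM, "\uFFFD")

payload = "<scr\uFEFFipt>"

print(contains_script_tag(payload))                # filter runs before clean-up
print(contains_script_tag(delete_bom(payload)))    # deletion re-forms the tag
print(contains_script_tag(replace_bom(payload)))   # replacement keeps it broken
```

Deletion joins the two halves of the tag after the filter has already passed the input; replacement keeps a visible marker in place so the halves never meet.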
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  - E.g. Cross-site scripting (buffer overflow of the Web)
Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  - Root Cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')">
Can be nastier:
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">
DEMO: Safari BOM injection for XSS
A Closer Look: The BOM
BOM: U+FEFF
Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution
Root Causes: Casing
toLower("İ") == "i"
toLower("scrİpt") == "script"
len(x) != len(toLower(x))
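A quick check of how casing changes string length, using Python's built-in `str.lower()` and `str.upper()`. Python applies the full Unicode case mappings, so lowercasing U+0130 here yields "i" followed by U+0307 COMBINING DOT ABOVE rather than the bare "i" shown above; other frameworks and locales may behave differently. The helper `same_length_after_lower` is an assumption of this sketch.

```python
# str.lower() and str.upper() in Python apply Unicode full case mappings.
dotted_capital_i = "\u0130"   # LATIN CAPITAL LETTER I WITH DOT ABOVE
lowered = dotted_capital_i.lower()

print(len(dotted_capital_i), len(lowered))
print([f"U+{ord(c):04X}" for c in lowered])

iota = "\u0390"               # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
print(len(iota), len(iota.upper()))

def same_length_after_lower(text: str) -> bool:
    """True when lowercasing leaves the number of code points unchanged."""
    return len(text) == len(text.lower())

print(same_length_after_lower("scr\u0130pt"))
```

Any buffer sized from the length of the input before the case operation is therefore too small for some inputs.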
Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution
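The chars-versus-bytes distinction can be made concrete by measuring one string in several units. The `sizes` helper is a hypothetical name introduced for this sketch.

```python
def sizes(text: str) -> dict:
    """Report the length of `text` in code points and in encoded units."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        "utf16_units": len(text.encode("utf-16-le")) // 2,
        "utf32_bytes": len(text.encode("utf-32-le")),
    }

print(sizes("A"))             # ASCII letter
print(sizes("\u00E9"))        # e with acute accent
print(sizes("\U0001D160"))    # a character outside the BMP
```

A buffer allocated from `code_points` and then filled with UTF-8 bytes or UTF-16 code units is undersized for the second and third inputs.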
Root Causes: Buffer Overflows
Casing: maximum expansion factors
Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390
Source: Unicode Technical Report 36
Root Causes: Buffer Overflows
Normalization: maximum expansion factors
Operation   UTF      Factor   Sample
NFC         8        3X       𝅘𝅥𝅮 U+1D160
NFC         16, 32   3X       שּׁ U+FB2C
NFD         8        3X       ΐ U+0390
NFD         16, 32   4X       ᾂ U+1F82
NFKC/NFKD   8        11X      ﷺ U+FDFA
NFKC/NFKD   16, 32   18X      ﷺ U+FDFA
Source: Unicode Technical Report 36
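The expansion factors in these tables can be recomputed with the standard library. The `expansion` function is a sketch written for this purpose; its name and parameters are assumptions, and the results depend on the Unicode data shipped with the Python interpreter.

```python
import unicodedata

def expansion(ch: str, form: str, encoding: str, unit: int) -> float:
    """Ratio of encoded size after normalization to encoded size before.

    `unit` is the size in bytes of one code unit in `encoding`.
    """
    before = len(ch.encode(encoding)) // unit
    after = len(unicodedata.normalize(form, ch).encode(encoding)) // unit
    return after / before

print(expansion("\uFDFA", "NFKC", "utf-8", 1))       # NFKC, UTF-8
print(expansion("\uFDFA", "NFKC", "utf-16-le", 2))   # NFKC, UTF-16
print(expansion("\u0390", "NFD", "utf-8", 1))        # NFD, UTF-8
print(expansion("\u1F82", "NFD", "utf-16-le", 2))    # NFD, UTF-16

# Casing: lowercasing U+023A moves from a 2-byte to a 3-byte UTF-8 sequence.
lower_factor = len("\u023A".lower().encode("utf-8")) / len("\u023A".encode("utf-8"))
print(lower_factor)
```

Sizing an output buffer as input size times the worst-case factor for the operation and encoding in use is one conservative way to apply these numbers.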
Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET
Root Causes: Controlling Syntax
• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution
Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Case Study: Opera
• Unicode formatter characters exploited for XSS
  - Damage: Filter evasion, controlling syntax
  - Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root Cause: Interpreting “white space”
  - A problem with HTML 4.0 spec
Case Study: Opera
<a href=[U+180E]onclick=alert()>
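The disagreement about what counts as white space can be sketched with two explicit separator sets. Everything here is hypothetical: `FILTER_WS` stands for a filter that knows only ASCII white space, `PARSER_WS` for a lenient parser that also treats U+180E as a separator, and the `href=x` value is a placeholder for the example.

```python
# Whitespace a hypothetical filter recognizes (space, tab, LF, CR, form feed).
FILTER_WS = " \t\n\r\f"
# A hypothetical lenient parser that also treats U+180E as a separator.
PARSER_WS = FILTER_WS + "\u180E"

def attribute_names(tag_body: str, whitespace: str) -> list:
    """Split a tag body on the given whitespace set and return attribute names."""
    for ch in whitespace:
        tag_body = tag_body.replace(ch, " ")
    return [token.split("=", 1)[0] for token in tag_body.split(" ") if token]

tag_body = "a href=x\u180Eonclick=alert()"

print(attribute_names(tag_body, FILTER_WS))   # the filter's view of the tag
print(attribute_names(tag_body, PARSER_WS))   # the lenient parser's view
```

The filter sees no `onclick` attribute to block, while the parser that honors U+180E as a separator does see one.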
DEMO: Opera White Space Characters
Case Study: Opera
MVS (Mongolian Vowel Separator): U+180E
Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…
Root Causes: Specifications
1) Character stability
  - IDNA/Nameprep based on Unicode 3.2
2) Designs
  - Specs are carefully designed but not always perfect
• This could have been a problem
  - “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  - HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss
March 2009 copy 2009 Chris Weber
Non-shortest or overlong UTF-8
Impact Filter evasion Enable code execution
Application gets C0A7
OSFramework sees 27
Database gets
wwwcasabasecuritycom
Root CausesNon-shortest form UTF-8
Root Causes: Guidance for Non-shortest form UTF-8
• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
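The guidance above can be sketched in Python. The function names are made up for illustration. The "naive" decoder is a deliberately unsafe stand-in for a component that skips the shortest-form check, and the strict path relies only on Python's built-in UTF-8 codec, which rejects overlong sequences.

```python
def naive_decode_2byte(seq: bytes) -> str:
    """UNSAFE illustration: decode a 2-byte UTF-8 sequence without a shortest-form check."""
    lead, trail = seq
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


def strict_decode(data: bytes) -> str:
    """Validate and decode; raises UnicodeDecodeError on any ill-formed input."""
    return data.decode("utf-8", errors="strict")


overlong = bytes([0xC0, 0xA7])  # the bytes from the slide

# A lenient decoder hands back the apostrophe that an upstream filter never saw.
print(repr(naive_decode_2byte(overlong)))

# A validating decoder throws instead of interpreting the sequence.
try:
    strict_decode(overlong)
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```

The point of the sketch is the ordering: validate the encoding once, at the boundary, and fail hard, so no later layer gets a chance to reinterpret the bytes.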
Attack Vectors: Directory traversal
How many ways can you say "../"?
Attack Vectors
• Directory traversal test cases
  – httpsiterootsystem
  – Overlong UTF-8, U+002F: httpsiterootC0AEsystem
  – Full-width Solidus U+FF0F, Normalized KC or KD: httpsiteroot EFBC8Fsystem
  – Division Slash U+2215, best-fit: httpsiteroot E28895system
  – Two dot leader U+2025, Normalized KC or KD: httpsiterootE280A5 E280A5 system
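A minimal sketch of why the normalization test cases matter, using Python's standard unicodedata module. The helper names and the sample path are illustrative assumptions, not part of the talk. Compatibility normalization (NFKC) turns U+FF0F into "/" and U+2025 into "..", while U+2215 has no compatibility mapping and only becomes "/" under a platform's best-fit conversion, which this sketch does not model.

```python
import unicodedata


def canonicalize(path: str) -> str:
    """Apply compatibility normalization before any security decision."""
    return unicodedata.normalize("NFKC", path)


def is_traversal(path: str) -> bool:
    """Look for '..' path segments only AFTER normalizing the input."""
    return ".." in canonicalize(path).split("/")


# Hypothetical input: no ASCII dot or slash appears between 'root' and 'system'.
sneaky = "root\uFF0F\u2025\uFF0Fsystem"

print(".." in sneaky.split("/"))  # a check on the raw string finds nothing
print(canonicalize(sneaky))       # normalization reveals the traversal
print(is_traversal(sneaky))
```

A filter that inspects the raw string and a back end that normalizes later will disagree about what the path contains, which is the gap these test cases probe.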
Root Causes: Handling the Unexpected
• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution
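As a rough illustration of the three classes listed above, the following sketch classifies a code point using the general category reported by Python's unicodedata module. The function name and the labels it returns are assumptions made for this example, and the answer for unassigned code points depends on the Unicode version bundled with the interpreter.

```python
import unicodedata


def classify(cp: int) -> str:
    """Label a code point as one of the 'unexpected' classes, or as ordinary."""
    category = unicodedata.category(chr(cp))
    if category == "Cs":
        return "surrogate"                   # half of a surrogate pair
    if category == "Cn":
        return "unassigned or noncharacter"  # e.g. U+2073
    if cp == 0xFEFF:
        return "byte order mark"             # special meaning
    return "ordinary"


for cp in (0x2073, 0xD800, 0xFEFF, 0x0041):
    print(f"U+{cp:04X}: {classify(cp)}")

# A lone surrogate cannot be serialized as well-formed UTF-8.
try:
    "\ud800".encode("utf-8")
except UnicodeEncodeError as exc:
    print("cannot encode:", exc.reason)
```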
Root Causes: Handling the Unexpected – Over-consumption
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes
<41 41>
Root Causes: Handling the Unexpected – Over-consumption
<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
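The byte-level example from the slides, <41 C2 3E 41>, can be reproduced with a small sketch. The over-consuming decoder below is a deliberately broken toy written for this illustration (it only understands ASCII and 2-byte lead bytes). It is compared against Python's built-in decoder, which replaces only the ill-formed lead byte and keeps the byte that follows.

```python
def over_consuming_decode(data: bytes) -> str:
    """UNSAFE toy decoder: an ill-formed 2-byte lead swallows the next byte too."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if 0xC2 <= b <= 0xDF:  # lead byte of a 2-byte sequence
            trail = data[i + 1] if i + 1 < len(data) else None
            if trail is not None and 0x80 <= trail <= 0xBF:
                out.append(chr(((b & 0x1F) << 6) | (trail & 0x3F)))
            # Bug being illustrated: always skip two bytes, valid trail or not.
            i += 2
        else:
            out.append(chr(b))  # toy handling: treat every other byte as one character
            i += 1
    return "".join(out)


data = bytes([0x41, 0xC2, 0x3E, 0x41])

print(repr(over_consuming_decode(data)))          # the '>' (0x3E) has vanished
print(repr(data.decode("utf-8", errors="replace")))  # '>' survives, C2 becomes U+FFFD
```

When the swallowed byte is a quote or an angle bracket, the markup that follows is parsed in a different context than the filter assumed.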
Root Causes: Handling the Unexpected – Character-substitution
Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad
Root Causes: Handling the Unexpected – Character-deletion
“deletion of noncharacters” (UTR-36)
Root Causes: Handling the Unexpected – Character-deletion
<scr[U+FEFF]ipt> becomes <script>
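The deletion problem comes down to ordering: a filter inspects the string, and a later component silently removes a character. The sketch below models that with three hypothetical functions. None of them is taken from a real product, and the "safer" variant is only one possible policy (reject U+FEFF anywhere except the first position).

```python
BOM = "\uFEFF"


def naive_filter(html: str) -> bool:
    """Return True if the input looks safe. Runs BEFORE any deletion happens."""
    return "<script" not in html.lower()


def downstream_strip(html: str) -> str:
    """Stand-in for a later component that silently deletes U+FEFF."""
    return html.replace(BOM, "")


def safer_filter(html: str) -> bool:
    """Fail instead of correcting: refuse U+FEFF anywhere after position 0."""
    if BOM in html[1:]:
        return False
    return "<script" not in html.lower()


payload = "<scr" + BOM + "ipt>"

print(naive_filter(payload))            # the filter sees no '<script'
print(downstream_strip(payload))        # what the next layer actually processes
print(safer_filter(payload))            # rejected outright
```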
Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
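Python's codec error handlers map fairly directly onto the options on this slide, so they make a convenient illustration. The wrapper names are invented for this sketch, and the byte 0xFF is used simply because it can never appear in well-formed UTF-8.

```python
def decode_or_fail(data: bytes) -> str:
    """'Fail or error': strict decoding raises UnicodeDecodeError."""
    return data.decode("utf-8")


def decode_with_replacement(data: bytes) -> str:
    """'Use U+FFFD instead': each ill-formed sequence becomes U+FFFD."""
    return data.decode("utf-8", errors="replace")


def decode_with_deletion(data: bytes) -> str:
    """Anti-pattern, shown for contrast: ill-formed bytes are silently dropped."""
    return data.decode("utf-8", errors="ignore")


data = b"<scr\xffipt>"

print(repr(decode_with_replacement(data)))  # the tag name stays broken
print(repr(decode_with_deletion(data)))     # deletion stitches '<script>' together
try:
    decode_or_fail(data)
except UnicodeDecodeError:
    print("strict decoding refused the input")
```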
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. Cross-site scripting (buffer overflow of the Web)
Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')">
Can be nastier
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">
Demo: Safari BOM injection for XSS
A Closer Look: The BOM
BOM: U+FEFF
Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution
Root Causes: Casing
toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”
Root Causes: Casing
len(x) != len(toLower(x))
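A short sketch of both casing problems. Python's own str.lower() is locale-independent and maps U+0130 (İ) to "i" followed by U+0307, which already changes the string length. The mapping to a plain "i" shown on the slide corresponds to Turkish/Azeri-style tailoring, which is modeled here by a hypothetical, partial helper called turkic_lower. The two wrapper functions are likewise illustrative assumptions.

```python
def turkic_lower(s: str) -> str:
    """Hypothetical, partial stand-in for a tailored lowercase that maps U+0130 to 'i'."""
    return s.replace("\u0130", "i").lower()


def validate_then_lower(s: str) -> str:
    """UNSAFE order: the check runs before the casing operation."""
    if "script" in s:
        raise ValueError("blocked")
    return turkic_lower(s)


def lower_then_validate(s: str) -> str:
    """Safer order: perform the casing operation first, then validate the result."""
    lowered = turkic_lower(s)
    if "script" in lowered:
        raise ValueError("blocked")
    return lowered


payload = "scr\u0130pt"

print(validate_then_lower(payload))   # the prohibited word appears after the check
print(len("\u0130"), len("\u0130".lower()))  # the length changes under lowercasing
try:
    lower_then_validate(payload)
except ValueError:
    print("blocked after casing")
```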
Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution
Root Causes: Buffer Overflows
Casing - maximum expansion factors
Operation  UTF        Factor  Sample
Lower      8          1.5     Ⱥ U+023A
Lower      16, 32     1       A U+0041
Upper      8, 16, 32  3       ΐ U+0390
Source: Unicode Technical Report 36
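The factors in the casing table can be checked with a few lines of Python. The helper name casing_expansion is an assumption for this sketch, "utf-16-le" is used so that no byte order mark is counted, and the results depend on the full case mappings shipped with the interpreter.

```python
def casing_expansion(sample: str, operation, encoding: str) -> float:
    """Ratio of encoded size after a casing operation to encoded size before it."""
    before = len(sample.encode(encoding))
    after = len(operation(sample).encode(encoding))
    return after / before


# Lowercasing U+023A in UTF-8: 2 bytes grow to 3.
print(casing_expansion("\u023A", str.lower, "utf-8"))

# Uppercasing U+0390: one code point becomes three, in UTF-8 and UTF-16 alike.
print(casing_expansion("\u0390", str.upper, "utf-8"))
print(casing_expansion("\u0390", str.upper, "utf-16-le"))
```

A buffer sized from the input length alone is therefore too small for the worst case; sizing by the maximum factor for the operation and encoding avoids the overflow.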
Root Causes: Buffer Overflows
Normalization - maximum expansion factors
Operation  UTF     Factor  Sample
NFC        8       3X      𝅘𝅥 U+1D160
NFC        16, 32  3X      ש U+FB2C
NFD        8       3X      ΐ U+0390
NFD        16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
NFKC/NFKD  16, 32  18X     ﷺ U+FDFA
Source: Unicode Technical Report 36
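The normalization factors can be reproduced the same way with the standard unicodedata module. The helper name normalization_expansion is an assumption for this sketch, and "utf-16-le" is chosen so that the encoder does not prepend a byte order mark.

```python
import unicodedata


def normalization_expansion(sample: str, form: str, encoding: str) -> float:
    """Ratio of encoded size after normalization to encoded size before it."""
    before = len(sample.encode(encoding))
    after = len(unicodedata.normalize(form, sample).encode(encoding))
    return after / before


# U+FDFA is the worst case for the compatibility forms.
print(normalization_expansion("\uFDFA", "NFKC", "utf-8"))      # bytes in UTF-8
print(normalization_expansion("\uFDFA", "NFKC", "utf-16-le"))  # code units in UTF-16

# Even NFC can grow a string, because some characters never recompose.
print(normalization_expansion("\U0001D160", "NFC", "utf-8"))
```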
Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
Root Causes: Controlling Syntax
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution
Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Case Study: Opera
• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with HTML 4.0 spec
Case Study: Opera
<a href=[U+180E]onclick=alert()>
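One defensive reading of this case is to flag any separator-like character that the markup specification does not list as white space. The sketch below does that with Unicode general categories. The function name, the category set, and the list of four HTML 4.01 white space characters (space, tab, form feed, zero-width space) are my assumptions for this illustration and should be checked against the specification.

```python
import unicodedata

# Assumed HTML 4.01 white space set, plus CR and LF as line breaks.
ALLOWED = {"\u0020", "\u0009", "\u000C", "\u200B", "\r", "\n"}

# Categories a lenient parser might treat as a separator: spaces, line and
# paragraph separators, format characters, and control characters.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}


def unexpected_separators(markup: str) -> list:
    """Return code points that could act as white space but are not in ALLOWED."""
    found = []
    for ch in markup:
        if ch in ALLOWED:
            continue
        if unicodedata.category(ch) in SUSPECT_CATEGORIES:
            found.append(f"U+{ord(ch):04X}")
    return found


print(unexpected_separators("<a href=\u180Eonclick=alert()>"))
print(unexpected_separators('<a href="x">'))
```

U+180E is caught whether the interpreter's Unicode data classifies it as a space separator or as a format character, since both categories are in the suspect set.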
Demo: Opera White Space Characters
Case Study: Opera
MVS: U+180E
Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…
Root Causes: Specifications
1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss
Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay
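To illustrate why converting out of Unicode is risky, the sketch below contrasts a "best-fit" style conversion with a strict one. Python's codecs do not perform best-fit mapping, so the BEST_FIT table here is a tiny hand-written stand-in for a platform mapping table, and both function names are hypothetical.

```python
# Illustrative only: two entries standing in for a platform's best-fit table.
BEST_FIT = {"\u2215": "/", "\uFF0F": "/"}


def best_fit_encode(text: str) -> bytes:
    """UNSAFE style of conversion: silently substitute look-alike characters."""
    mapped = "".join(BEST_FIT.get(ch, ch) for ch in text)
    return mapped.encode("cp1252", errors="replace")


def strict_encode(text: str) -> bytes:
    """Fail instead of guessing when a character has no exact mapping."""
    return text.encode("cp1252")


validated = "..\u2215etc"  # contains no ASCII slash, so a slash filter passes it

print(best_fit_encode(validated))  # a real '/' appears after validation
try:
    strict_encode(validated)
except UnicodeEncodeError:
    print("strict conversion refused the input")
```

Transforming first and validating the result, as the guidance suggests, closes the gap the best-fit path opens.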
Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution
Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input
Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
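A small sketch of why a mismatch matters and what "force UTF-8, error if uncertain" could look like. The byte pair 0x83 0x5C is a single katakana character in Shift_JIS, but under ISO-8859-1 its second byte is a backslash. The function pick_charset and its policy are assumptions made for this example, built only on Python's standard codecs.lookup.

```python
import codecs


def pick_charset(http_charset, meta_charset):
    """Accept the document only if every declared charset is UTF-8; otherwise error."""
    declared = {codecs.lookup(name).name
                for name in (http_charset, meta_charset) if name}
    if declared != {"utf-8"}:
        raise ValueError(f"refusing ambiguous or non-UTF-8 charsets: {sorted(declared)}")
    return "utf-8"


raw = b"\x83\x5c"

# The same bytes mean different things to a filter and to a user-agent.
print(repr(raw.decode("shift_jis")))   # one character
print(repr(raw.decode("iso-8859-1")))  # two characters, the second a backslash

try:
    pick_charset("ISO-8859-1", "shift_jis")
except ValueError as exc:
    print(exc)
```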
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Directory traversal test casesndash httpsiterootsystem
ndash Overlong UTF8 U+002FhttpsiterootC0AEsystem
ndash Full-width Solidus U+FF0F Normalized KC or KDhttpsiteroot EFBC8Fsystem
ndash Division Slash U+2215 best-fithttpsiteroot E28895system
ndash Two dot leader U+2025 Normalized KC or KDbull httpsiterootE280A5 E280A5 system
wwwcasabasecuritycom
Attack Vectors
March 2009 copy 2009 Chris Weber
bull Unassigned code points
ndash U+2073
bull Illegal code points
ndash Half a surrogate pair
bull Code points with special meaning
ndash U+FEFF is the BOM
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesHandling the Unexpected
March 2009 copy 2009 Chris Weber
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
lt41 C2 3E 41gt becomes
lt41 41gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
ltimg src=[0xC2]gt onerror=alert(1)ltbr gt
becomes
ltimg src=gt onerror=alert(1)ltbr gt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Over-consumption
March 2009 copy 2009 Chris Weber
Correcting insecurely rather than failing
ndash Substituting a lsquorsquo or a lsquorsquo would be bad
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-substitution
March 2009 copy 2009 Chris Weber
ldquodeletion of noncharactersrdquo (UTR-36)
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
ltscr[U+FEFF]iptgt becomes ltscriptgt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
bull Fail or error
bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe
wwwcasabasecuritycom
Root CausesSolutions for Handling the Unexpected
March 2009 copy 2009 Chris Weber
bull Bypass filters WAFrsquos NIDS and validation
bull Exploit delivery techniques
ndash Eg Cross-site scripting (buffer overflow of the Web)
wwwcasabasecuritycom
Attack VectorsFilter evasion
March 2009 copy 2009 Chris Weber
Safari and Firefox BOM consumptionndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting
ndash Root Cause Character deletion
lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt
Can be nastier
lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt
wwwcasabasecuritycom
Case Study Apple and Mozilla
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Safari BOM injection for XSS
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
A Closer Look The BOM
BOMU+FEFF
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, enable code execution
Root Causes: Casing
toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”
Root Causes: Casing
len(x) != len(toLower(x))
Root Causes: Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
– ICU, .NET
Root Causes: Guidance for Casing
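The guidance to case first and validate second can be illustrated with Python's built-in `str.lower()`. This is a sketch, not the presenter's code: the deny-listed token and both function names are hypothetical. Python maps U+0130 to two code points rather than to a plain "i", so the example uses U+212A KELVIN SIGN, which lowercases to ASCII "k".

```python
FORBIDDEN = "onclick"  # hypothetical deny-listed token


def validate_then_lower(text: str) -> str:
    """Unsafe order: the check runs on the text before casing changes it."""
    if FORBIDDEN in text:
        raise ValueError("forbidden token")
    return text.lower()


def lower_then_validate(text: str) -> str:
    """Safer order: the check runs on the final, case-mapped form."""
    lowered = text.lower()
    if FORBIDDEN in lowered:
        raise ValueError("forbidden token")
    return lowered


payload = "onclic\u212a=alert(1)"  # ends "onclic" with KELVIN SIGN, not ASCII "k"

print(validate_then_lower(payload))  # the forbidden token appears only after casing

try:
    lower_then_validate(payload)
except ValueError as exc:
    print("rejected:", exc)

# Casing can also change length: U+0130 lowercases to "i" plus U+0307 in Python.
print(len("\u0130"), len("\u0130".lower()))
```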
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution
Root Causes: Buffer Overflows
Casing: maximum expansion factors
Root Causes: Buffer Overflows
Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390
Source: Unicode Technical Report 36
Normalization: maximum expansion factors
Root Causes: Buffer Overflows
Operation   UTF         Factor   Sample
NFC         8           3X       𝅘𝅥𝅮 U+1D160
NFC         16, 32      3X       שּׁ U+FB2C
NFD         8           3X       ΐ U+0390
NFD         16, 32      4X       ᾂ U+1F82
NFKC/NFKD   8           11X      ﷺ U+FDFA
NFKC/NFKD   16, 32      18X      ﷺ U+FDFA
Source: Unicode Technical Report 36
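Several of the expansion factors in these tables can be reproduced with Python's standard library. The helper name `expansion` is an assumption made for this sketch, and the factor is computed as a ratio of encoded byte counts, which equals the ratio of code units for a given encoding form.

```python
import unicodedata


def expansion(before: str, after: str, encoding: str) -> float:
    """Encoded size of `after` divided by encoded size of `before`."""
    return len(after.encode(encoding)) / len(before.encode(encoding))


# (label, original, transformed, encoding); "utf-16-le" avoids counting a BOM.
CASES = [
    ("lower U+023A, UTF-8", "\u023a", "\u023a".lower(), "utf-8"),
    ("upper U+0390, UTF-8", "\u0390", "\u0390".upper(), "utf-8"),
    ("NFD U+1F82, UTF-16", "\u1f82", unicodedata.normalize("NFD", "\u1f82"), "utf-16-le"),
    ("NFKC U+FDFA, UTF-8", "\ufdfa", unicodedata.normalize("NFKC", "\ufdfa"), "utf-8"),
    ("NFKC U+FDFA, UTF-16", "\ufdfa", unicodedata.normalize("NFKC", "\ufdfa"), "utf-16-le"),
]

for label, before, after, encoding in CASES:
    print(f"{label}: {expansion(before, after, encoding):g}x")
```

A buffer sized from the input length alone is too small for any of these operations; sizing from the worst-case factor, or letting the library allocate the result, avoids the overflow.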
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
– ICU, .NET
Root Causes: Guidance for Buffer Overflows
• White space and line breaks
– E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, enable code execution
Root Causes: Controlling Syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Attacks and Exploits: Controlling syntax
• Unicode formatter characters exploited for XSS
– Damage: Filter evasion, controlling syntax
– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
– Root Cause: Interpreting “white space”
– A problem with the HTML 4.0 spec
Case Study: Opera
<a href=[U+180E]onclick=alert()>
Case Study: Opera
DEMO
Opera White Space Characters
Case Study: Opera
MVS: U+180E
• Question specifications
• Be careful…
Root Causes: Guidance for Controlling Syntax
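The Opera case comes down to a filter and a parser disagreeing about what counts as white space. The sketch below assumes a hypothetical filter that only recognizes ASCII white space before an event-handler attribute, and a stricter check that flags any other separator, format, or control character. Both function names and the category list are illustrative choices, not part of the original talk.

```python
import re
import unicodedata

# Hypothetical filter: an event handler must be preceded by ASCII white space.
ASCII_SPACE_HANDLER = re.compile(r"[ \t\r\n\f]on\w+\s*=", re.IGNORECASE)

ASCII_WHITESPACE = " \t\r\n\f"
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}  # separators, format, control


def naive_has_event_handler(tag: str) -> bool:
    return ASCII_SPACE_HANDLER.search(tag) is not None


def has_unexpected_separator(tag: str) -> bool:
    """True if the tag contains a space-like or invisible character outside ASCII white space."""
    return any(
        ch not in ASCII_WHITESPACE and unicodedata.category(ch) in SUSPECT_CATEGORIES
        for ch in tag
    )


payload = "<a href=\u180eonclick=alert()>"   # U+180E MONGOLIAN VOWEL SEPARATOR
ordinary = "<a href=x onclick=alert()>"

print(naive_has_event_handler(ordinary))   # caught by the ASCII-only pattern
print(naive_has_event_handler(payload))    # missed: U+180E is not ASCII white space
print(has_unexpected_separator(payload))   # flagged by the category check
```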
1) Character stability
– IDNA/Nameprep based on Unicode 3.2
2) Designs
– Specs are carefully designed but not always perfect
• This could have been a problem
– “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
– HTML 4.01
• Defines four whitespace characters and explicitly leaves handling other characters up to the implementer
Root Causes: Specifications
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, enable code execution, data loss
Root Causes: Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay
Root Causes: Guidance for Charset Transformations
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, enable code execution
Root Causes: Charset Mismatches
Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input
• Force UTF-8
• Error if uncertain
Root Causes: Guidance for Charset Mismatches
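To make the mismatch concrete, the sketch below decodes the same two bytes under the two charsets named on the slide, then shows the "force UTF-8, error if uncertain" guidance as a strict decode. The byte pair and the function name `decode_forced_utf8` are chosen for illustration only.

```python
raw = b"\x83\x5c"  # illustrative attacker-controlled bytes

# One component trusts the <meta> tag, another trusts the HTTP header.
as_shift_jis = raw.decode("shift_jis")   # a single katakana character, U+30BD
as_latin_1 = raw.decode("iso-8859-1")    # U+0083 followed by a backslash

print(len(as_shift_jis), len(as_latin_1))
print("\\" in as_shift_jis, "\\" in as_latin_1)  # only one view contains a backslash


def decode_forced_utf8(data: bytes) -> str:
    """Force UTF-8 and raise on anything that is not well-formed."""
    return data.decode("utf-8", errors="strict")


try:
    decode_forced_utf8(raw)
except UnicodeDecodeError:
    print("rejected: not well-formed UTF-8")
```

A filter that validates the Shift_JIS view never sees the backslash that a Latin-1 consumer will act on, which is why a single enforced encoding is safer than letting each layer choose.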
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
Exploiting Unicode-enabled Software: Agenda
• Watcher
– Web-app security testing and auditing
• Visual Spoofing Detection API
– Providing guarantees against Visual Spoofing and Homograph attacks
Tools
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher – Some of the Passive Checks Included
Tools
http://websecuritytool.codeplex.com
Tools: Watcher – Web-app Security Testing and Auditing
• Problem
– Unicode enables visual-spoofing-maximus
• Solution
– Confusable detection
– Invisibles detection
– Syntax spoof detection
– more
Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
– Browsers
– Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Detection API
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net, Casaba Security
www.casabasecurity.com
• Directory traversal test cases
– httpsiterootsystem
– Overlong UTF-8, U+002F: httpsiterootC0AEsystem
– Full-width Solidus U+FF0F, normalized KC or KD: httpsiteroot EFBC8Fsystem
– Division Slash U+2215, best-fit: httpsiteroot E28895system
– Two dot leader U+2025, normalized KC or KD: httpsiterootE280A5 E280A5 system
Attack Vectors
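The normalization-based test cases above can be checked against Python's `unicodedata` module. The sketch assumes a hypothetical `check_segment` validator for a single path segment; it normalizes first and validates second, which is what makes the compatibility characters visible to the check.

```python
import unicodedata


def check_segment(segment: str) -> str:
    """Hypothetical validator: normalize to NFKC, then reject traversal syntax."""
    normalized = unicodedata.normalize("NFKC", segment)
    if normalized in {".", ".."} or "/" in normalized or "\\" in normalized:
        raise ValueError("path traversal attempt")
    return normalized


# U+FF0F FULLWIDTH SOLIDUS and U+2025 TWO DOT LEADER change under NFKC.
print(unicodedata.normalize("NFKC", "\uff0f"))
print(unicodedata.normalize("NFKC", "\u2025"))

# U+2215 DIVISION SLASH is left alone by normalization; the slide attributes
# its risk to best-fit charset mappings instead.
print(unicodedata.normalize("NFKC", "\u2215") == "\u2215")

# Overlong UTF-8 such as C0 AE is ill-formed and rejected by a strict decoder.
try:
    b"\xc0\xae".decode("utf-8")
except UnicodeDecodeError:
    print("overlong sequence rejected")

for candidate in ("system", "\u2025", "a\uff0fb"):
    try:
        print("accepted:", check_segment(candidate))
    except ValueError as exc:
        print("rejected:", exc)
```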
• Unassigned code points
– U+2073
• Illegal code points
– Half a surrogate pair
• Code points with special meaning
– U+FEFF is the BOM
• Impact: Filter evasion, enable code execution
Root Causes: Handling the Unexpected
Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes
<41 C2 3E 41> becomes <41 41>
Root Causes: Handling the Unexpected: Over-consumption
<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
Root Causes: Handling the Unexpected: Over-consumption
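The byte-level example above, <41 C2 3E 41> becoming <41 41>, can be reproduced with a deliberately flawed decoder. `over_consuming_decode` is a hypothetical function written only to show the bug; it handles ASCII and two-byte lead bytes and is not a real UTF-8 implementation. Python's own decoder is shown next to it for contrast.

```python
def over_consuming_decode(raw: bytes) -> str:
    """Flawed on purpose: a two-byte lead byte always swallows the next byte."""
    out = []
    i = 0
    while i < len(raw):
        byte = raw[i]
        if 0xC2 <= byte <= 0xDF:          # lead byte of a two-byte sequence
            pair = raw[i:i + 2]
            try:
                out.append(pair.decode("utf-8"))
            except UnicodeDecodeError:
                pass                       # the flaw: both bytes are discarded
            i += 2
        else:
            out.append(chr(byte))          # simplification: treat as a single character
            i += 1
    return "".join(out)


raw = bytes.fromhex("41c23e41")            # "A", lone lead byte 0xC2, ">", "A"

print(over_consuming_decode(raw))                  # the ">" disappears with the lead byte
print(raw.decode("utf-8", errors="replace"))       # only 0xC2 is replaced; ">" survives
```

When the swallowed byte is a quote or an angle bracket, the markup that follows is parsed in a different context than the filter expected, which is the basis of the `<img>` example on the slide.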
Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad
Root Causes: Handling the Unexpected: Character-substitution
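Root Causes: Handling the Unexpected - Over-consumption
<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src=> onerror=alert(1)<br />
The decoder treats 0xC2 as the lead byte of a two-byte sequence and swallows the byte that follows it, so markup that came after the byte is lost.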
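The over-consumption behavior can be sketched with a deliberately flawed decoder. `naive_utf8_decode` is a hypothetical illustration, not code from any browser, and the quotes around the attribute value in `payload` are an assumption added so the swallowed byte is visible. Python's built-in decoder is shown for contrast.

```python
def naive_utf8_decode(data: bytes) -> str:
    """Deliberately flawed decoder (hypothetical): trusts lead bytes blindly."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # Plain ASCII byte.
            out.append(chr(b))
            i += 1
        elif 0xC2 <= b <= 0xDF:
            # BUG: the next byte is consumed without checking that it is a
            # continuation byte in the range 0x80..0xBF.
            nxt = data[i + 1] if i + 1 < len(data) else 0
            out.append(chr(((b & 0x1F) << 6) | (nxt & 0x3F)))
            i += 2
        else:
            # Longer sequences are out of scope for this sketch.
            out.append("\ufffd")
            i += 1
    return "".join(out)


payload = b'<img src="\xc2"> onerror=alert(1)'

# The flawed decoder eats the closing quote together with 0xC2.
broken = naive_utf8_decode(payload)

# The built-in decoder replaces only the bad byte and keeps the quote.
safe = payload.decode("utf-8", errors="replace")
```

With the closing quote gone, the attribute value in `broken` runs on into the text that follows, which is how the trailing `onerror` ends up inside the tag.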
Root Causes: Handling the Unexpected - Character-substitution
Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad
Root Causes: Handling the Unexpected - Character-deletion
“deletion of noncharacters” (UTR-36)
<scr[U+FEFF]ipt> becomes <script>
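A small Python sketch of why deletion after filtering is dangerous. Both functions are hypothetical stand-ins: `contains_script_tag` for a blocklist filter, `strip_bom_everywhere` for a downstream consumer that silently drops U+FEFF.

```python
BOM = "\ufeff"


def contains_script_tag(s: str) -> bool:
    """Toy blocklist check (hypothetical filter)."""
    return "<script" in s.lower()


def strip_bom_everywhere(s: str) -> str:
    """Models a consumer that silently deletes every U+FEFF."""
    return s.replace(BOM, "")


payload = "<scr\ufeffipt>alert(1)</scr\ufeffipt>"

# The filter sees no "<script" because U+FEFF splits the tag name.
passed_filter = not contains_script_tag(payload)

# The consumer then deletes U+FEFF and reassembles the tag.
delivered = strip_bom_everywhere(payload)
```

The filter and the consumer disagree about what the string is, and the attacker only needs the consumer's view to be executable.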
Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
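One way to apply "fail or use U+FFFD" to noncharacters, sketched in Python. The `sanitize` function and its `strict` flag are hypothetical names; the noncharacter ranges (U+FDD0..U+FDEF and the last two code points of every plane) come from the Unicode Standard.

```python
REPLACEMENT = "\ufffd"


def is_noncharacter(cp: int) -> bool:
    """True for U+FDD0..U+FDEF and for U+xxFFFE / U+xxFFFF in any plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def sanitize(text: str, strict: bool = False) -> str:
    """Replace noncharacters with U+FFFD, or raise when strict is set."""
    out = []
    for index, ch in enumerate(text):
        if is_noncharacter(ord(ch)):
            if strict:
                raise ValueError(f"noncharacter U+{ord(ch):04X} at offset {index}")
            # Substitute rather than delete, so neighbors never join up.
            out.append(REPLACEMENT)
        else:
            out.append(ch)
    return "".join(out)
```

Because the offending code point is replaced in place, `sanitize("<scr\uffffipt>")` cannot collapse into `<script>` the way deletion would.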
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)
Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root Cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')">
Can be nastier:
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">
DEMO: Safari BOM injection for XSS
A Closer Look: The BOM
BOM: U+FEFF
Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enabling code execution
toLower("İ") == "i"
toLower("scrİpt") == "script"
len(x) != len(toLower(x))
Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
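The ordering advice can be sketched in Python. The two functions and the blocked keyword are hypothetical. The example uses U+212A KELVIN SIGN, which lower-cases to ASCII "k", because Python's `str.lower()` applies full case mapping and turns "İ" (U+0130) into "i" plus U+0307 rather than a bare "i"; results for "İ" depend on the implementation.

```python
def unsafe_check_then_fold(value: str) -> str:
    """Validates first and lower-cases afterwards: the order to avoid."""
    if "onclick" in value:
        raise ValueError("blocked attribute")
    return value.lower()


def fold_then_check(value: str) -> str:
    """Lower-cases first, then validates the string that will actually be used."""
    folded = value.lower()
    if "onclick" in folded:
        raise ValueError("blocked attribute")
    return folded


KELVIN = "\u212a"  # KELVIN SIGN, lower-cases to ASCII "k"
payload = "onclic" + KELVIN + "=alert(1)"

# Casing can also change length: one code point in, two out.
dotted_i_lower = "\u0130".lower()
```

`unsafe_check_then_fold(payload)` returns a string containing `onclick`, while `fold_then_check(payload)` raises.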
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: enabling code execution
Root Causes: Buffer Overflows
Casing - maximum expansion factors
Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390
Source: Unicode Technical Report 36
Normalization - maximum expansion factors
Operation   UTF         Factor   Sample
NFC         8           3X       U+1D160
NFC         16, 32      3X       ש U+FB2C
NFD         8           3X       ΐ U+0390
NFD         16, 32      4X       ᾂ U+1F82
NFKC/NFKD   8           11X      ﷺ U+FDFA
NFKC/NFKD   16, 32      18X      ﷺ U+FDFA
Source: Unicode Technical Report 36
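Several of these factors can be reproduced with Python's built-in case mapping and `unicodedata.normalize`. This is only a spot check of the sample code points from the tables; the helper name `growth` is hypothetical, and the values depend on the Unicode data shipped with the interpreter.

```python
import unicodedata


def growth(before: str, after: str, encoding: str) -> float:
    """Ratio of encoded sizes in bytes, after divided by before."""
    return len(after.encode(encoding)) / len(before.encode(encoding))


CASES = [
    ("lower U+023A", "\u023a", "\u023a".lower()),
    ("upper U+0390", "\u0390", "\u0390".upper()),
    ("NFC U+1D160", "\U0001d160", unicodedata.normalize("NFC", "\U0001d160")),
    ("NFD U+1F82", "\u1f82", unicodedata.normalize("NFD", "\u1f82")),
    ("NFKC U+FDFA", "\ufdfa", unicodedata.normalize("NFKC", "\ufdfa")),
]

for label, before, after in CASES:
    # Print the UTF-8 and UTF-16 growth factor for each sample.
    print(label, growth(before, after, "utf-8"), growth(before, after, "utf-16-le"))
```

A buffer sized from the input length alone is too small whenever one of these operations runs between measuring and copying.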
Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
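The bytes-versus-chars distinction, sketched in Python. `truncate_utf8` is a hypothetical helper that fits text into a byte budget without leaving half of a multi-byte sequence at the end; it assumes the input is a well-formed `str`.

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Cut text to at most max_bytes of UTF-8 without splitting a character."""
    raw = text.encode("utf-8")[:max_bytes]
    # errors="ignore" drops the partial sequence the byte slice may leave behind.
    return raw.decode("utf-8", errors="ignore")


word = "h\u00e9llo"
char_count = len(word)                  # counts code points
byte_count = len(word.encode("utf-8"))  # counts bytes; the accented letter takes two
```

Sizing a buffer from `char_count` when the data is stored as UTF-8 bytes is the kind of mismatch the slide warns about.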
Root Causes: Controlling Syntax
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enabling code execution
Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Case Study: Opera
• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with the HTML 4.0 spec
<a href=[U+180E]onclick=alert()>
DEMO: Opera White Space Characters
MVS (Mongolian Vowel Separator): U+180E
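A filter can refuse to guess which characters a parser will treat as white space. The Python sketch below reports every separator, format, or control character outside an explicit allow-list. The allow-list mirrors the four white space characters HTML 4.01 defines plus CR and LF; the function name and category set are assumptions for illustration, and a stricter filter might drop U+200B from the list.

```python
import unicodedata

# White space defined by HTML 4.01: space, tab, form feed, zero-width space.
HTML4_WHITESPACE = {"\u0020", "\u0009", "\u000c", "\u200b"}
LINE_BREAKS = {"\r", "\n"}

# Separator, format, and control categories a parser might treat as a delimiter.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}


def unexpected_separators(markup: str) -> list:
    """Return (offset, code point) pairs for space-like characters not allowed."""
    hits = []
    for index, ch in enumerate(markup):
        if ch in HTML4_WHITESPACE or ch in LINE_BREAKS:
            continue
        if unicodedata.category(ch) in SUSPECT_CATEGORIES:
            hits.append((index, f"U+{ord(ch):04X}"))
    return hits
```

U+180E is reported whether the interpreter's Unicode data classifies it as a space separator (older versions) or as a format character (newer versions), since both categories are in the suspect set.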
Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…
Root Causes: Specifications
1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
  • This could have been a problem
    – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer
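The quoted text leaves a choice between ignoring an interior U+FEFF and treating it as an error. A sketch of the error option in Python follows; `decode_utf8_strict_bom` is a hypothetical name, and `codecs.BOM_UTF8` is the standard-library constant for the three-byte UTF-8 signature.

```python
import codecs


def decode_utf8_strict_bom(raw: bytes) -> str:
    """Strip one leading BOM, then reject U+FEFF anywhere else."""
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]
    text = raw.decode("utf-8", errors="strict")
    offset = text.find("\ufeff")
    if offset != -1:
        # Treat an interior U+FEFF as an error instead of silently dropping it.
        raise ValueError(f"U+FEFF found at character offset {offset}")
    return text
```

Choosing the error branch means a string such as `java[U+FEFF]script` is rejected outright rather than repaired into something executable.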
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enabling code execution, data loss
Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
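Two of these points, sketched in Python with hypothetical function names. `transcode` goes through Unicode with strict error handling, so unmappable characters raise instead of being substituted. `canonical_form` normalizes and case-folds before any validation runs; using NFKC here is an assumption, and an application may need a different normalization form.

```python
import unicodedata


def transcode(raw: bytes, source: str, target: str) -> bytes:
    """Convert between charsets with Unicode as the broker, failing on gaps."""
    text = raw.decode(source, errors="strict")
    return text.encode(target, errors="strict")


def canonical_form(text: str) -> str:
    """Normalize (NFKC, by assumption) and case-fold before validating."""
    return unicodedata.normalize("NFKC", text).casefold()


def is_allowed(text: str) -> bool:
    """Toy validation that runs only on the canonical form."""
    return "<script" not in canonical_form(text)
```

With this ordering, fullwidth look-alikes such as U+FF1C and U+FF1E are folded to `<` and `>` before the check, so `is_allowed("\uff1cSCRIPT\uff1e")` is rejected.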
ltscr[U+FEFF]iptgt becomes ltscriptgt
wwwcasabasecuritycom
Root CausesHandling the Unexpected Character-deletion
March 2009 copy 2009 Chris Weber
bull Fail or error
bull Use U+FFFD instead ndash A common alternative is lsquorsquo which can be safe
wwwcasabasecuritycom
Root CausesSolutions for Handling the Unexpected
March 2009 copy 2009 Chris Weber
bull Bypass filters WAFrsquos NIDS and validation
bull Exploit delivery techniques
ndash Eg Cross-site scripting (buffer overflow of the Web)
wwwcasabasecuritycom
Attack VectorsFilter evasion
March 2009 copy 2009 Chris Weber
Safari and Firefox BOM consumptionndash Attack Filter evasion code execution
ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting
ndash Root Cause Character deletion
lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt
Can be nastier
lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt
wwwcasabasecuritycom
Case Study Apple and Mozilla
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Safari BOM injection for XSS
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
A Closer Look The BOM
BOMU+FEFF
March 2009 copy 2009 Chris Weber
bull Attackers manipulate casing operations to inject otherwise prohibited characters
bull Casing can multiply the buffer sizes needed
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
toLower(ldquoİrdquo) == ldquoirdquo
toLower(ldquoscrİptrdquo) == ldquoscriptrdquo
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
len(x) = len(toLower(x))
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
bull Perform casing operations before validation
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Casing
March 2009 copy 2009 Chris Weber
bull Incorrect assumptions about string sizes (chars vs bytes)
bull Improper width calculations
bull Impact Enable code execution
wwwcasabasecuritycom
Root CausesBuffer Overflows
March 2009 copy 2009 Chris Weber
Casing - maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
Lower 8 15 Ⱥ U+023A
16 32 1 A U+0041
Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
Normalization- maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
NFC8 3X 119136 U+1D160
16 32 3X ש U+FB2C
NFD8 3X ΐ U+0390
16 32 4X ᾂ U+1F82
NFKCNFKD8 11X
ملسو هيلع هللا ىلص U+FDFA16 32 18X
Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
bull Know the difference between bytes and chars
bull Secure coding
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Buffer Overflows
March 2009 copy 2009 Chris Weber
bull White space and line breaks
ndash Eg when U+180E acts like U+0020
bull Quotation marks
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesControlling Syntax
March 2009 copy 2009 Chris Weber
bull Manipulate HTML parsers and javascriptinterpreters
bull Control protocols
wwwcasabasecuritycom
Attacks and ExploitsControlling syntax
March 2009 copy 2009 Chris Weber
bull Unicode formatter characters exploited for XSS
ndash Damage Filter evasion controlling syntax
ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting
ndash Root Cause Interpreting ldquowhite spacerdquo
ndash A problem with HTML 40 spec
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
lta href=[U+180E]onclick=alert()gt
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Opera White Space Characters
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Case Study Opera
MVSU+180E
March 2009 copy 2009 Chris Weber
bull Question specifications
bull Be carefulhellip
wwwcasabasecuritycom
Root CausesGuidance for Controlling Syntax
March 2009 copy 2009 Chris Weber
1) Character stabilityndash IDNANameprep based on Unicode 32
2) Designsndash Specs are carefully designed but not always perfect
bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of
U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo
ndash HTML 401 bull Defines four whitespace characters and explicitly leaves
handling other characters up to implementer
wwwcasabasecuritycom
Root CausesSpecifications
March 2009 copy 2009 Chris Weber
bull Converting between charsets is dangerous
bull Mapping tables and algorithms vary across platforms
bull Impact Filter evasion Enable code execution Data-loss
wwwcasabasecuritycom
Root CausesCharset Transformations
March 2009 copy 2009 Chris Weber
bull Avoid if possible
bull Use Unicode as the broker
bull Beware the PUA mappings
bull Transform case and normalize prior to validation and redisplay
wwwcasabasecuritycom
Root CausesGuidance for Charset Transformations
March 2009 copy 2009 Chris Weber
bull Some charset identifiers are ill-defined
bull Vendor implementations vary
bull User-agents may sniff if confused
bull Attackers manipulate behavior
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCharset Mismatches
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Root CausesCharset Mismatches
Content-Type charset=ISO-8859-1
ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt
Attacker-controlled input
March 2009 copy 2009 Chris Weber
bull Force UTF-8
bull Error if uncertain
wwwcasabasecuritycom
Root CausesGuidance for Charset Mismatches
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Watcher
ndash Web-app security testing and auditing
bull Visual Spoofing Detection API
ndash Providing guarantees against Visual Spoofing and Homograph attacks
wwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
bull Unicode transformation hot-spotsbull User-controlled HTMLbull Cross-domain issuesbull Insecure cookiesbull Insecure HTTPHTTPS transitionsbull SSL protocol and certificate issuesbull XSS hot-spotsbull Flash issuesbull Silverlight issuesbull Information disclosure
wwwcasabasecuritycom
ToolsWatcher ndash Some of the Passive Checks Included
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
httpwebsecuritytoolcodeplexcom
wwwcasabasecuritycom
ToolsWatcher - Web-app Security Testing and Auditing
Tools: Visual Spoofing Detection API
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
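The "invisibles detection" idea can be illustrated with a few lines of Python. This is only a sketch under the assumption that "invisible" means Unicode general category Cf (format characters); it is not the API described in this talk, and the function name is hypothetical.

```python
import unicodedata


def find_invisibles(text: str) -> list:
    """Return (index, code point) pairs for format-category (Cf) characters."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


# A zero-width space hidden inside an otherwise ordinary label.
print(find_invisibles("pay\u200bpal"))
print(find_invisibles("paypal"))
```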
Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber, www.lookout.net, Casaba Security
Root Causes: Solutions for Handling the Unexpected
• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
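The options on this slide map onto Python's standard decoder error handlers, shown here as an illustration. The input bytes are made up; `strict`, `replace`, and `ignore` are Python's own handler names.

```python
malformed = b"java\xffscript"  # 0xFF can never appear in well-formed UTF-8

# Option 1: fail or error.
try:
    malformed.decode("utf-8", errors="strict")
    failed = False
except UnicodeDecodeError:
    failed = True

# Option 2: substitute U+FFFD, which keeps the two halves apart.
replaced = malformed.decode("utf-8", errors="replace")

# What to avoid: silently deleting the bad byte joins the halves together.
ignored = malformed.decode("utf-8", errors="ignore")

print(failed, repr(replaced), repr(ignored))
```

Deleting the unexpected byte is the risky choice here because it creates a token that was not present in the input any filter inspected.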
Attack Vectors: Filter evasion
• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. Cross-site scripting (buffer overflow of the Web)
Case Study: Apple and Mozilla
Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion
<a href="java[U+FEFF]script:alert('XSS')">
Can be nastier:
<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">
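The character-deletion root cause can be reproduced in a few lines. Both functions below are hypothetical stand-ins: one for a server-side filter, one for a consumer that silently drops U+FEFF as this slide describes. Neither is taken from Safari or Firefox code.

```python
def naive_filter_allows(value: str) -> bool:
    """Hypothetical filter: reject values that contain 'javascript:'."""
    return "javascript:" not in value.lower()


def consumer_view(value: str) -> str:
    """Hypothetical consumer that deletes every U+FEFF before interpreting."""
    return value.replace("\ufeff", "")


payload = "java\ufeffscript:alert('XSS')"

print(naive_filter_allows(payload))                 # the filter sees no match
print(consumer_view(payload))                       # the consumer sees the scheme
print(naive_filter_allows(consumer_view(payload)))  # too late to catch it
```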
Safari BOM injection for XSS
DEMO
A Closer Look: The BOM
BOM: U+FEFF
Root Causes: Casing
• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution
Root Causes: Casing
toLower("İ") == "i"
toLower("scrİpt") == "script"
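The exact result of lower-casing "İ" depends on the platform and locale. Python's `str.lower()`, for instance, returns "i" followed by U+0307 rather than a bare "i". The same class of problem is easy to show in Python with other characters whose case mappings land on ASCII letters. The filter below is a hypothetical example, not one from the talk.

```python
def filter_allows(value: str) -> bool:
    """Hypothetical filter that lower-cases, then looks for 'script'."""
    return "script" not in value.lower()


# U+017F LATIN SMALL LETTER LONG S is already lower case, so the filter
# never sees an ASCII "s"; upper-casing later turns it into "S".
payload = "<\u017fcript>"
after_casing = payload.upper()

print(filter_allows(payload), after_casing)
print("\u212a".lower())        # KELVIN SIGN lower-cases to ASCII "k"
print(ascii("\u0130".lower())) # Python's result for the slide's example
```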
Root Causes: Casing
len(x) != len(toLower(x))
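A quick Python check of the point on this slide, using characters whose case mappings change the number of code points. The sample characters are chosen for illustration.

```python
samples = {
    "\u00df": str.upper,  # LATIN SMALL LETTER SHARP S
    "\u0130": str.lower,  # LATIN CAPITAL LETTER I WITH DOT ABOVE
    "\u0390": str.upper,  # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
}

# Map each sample to (length before, length after) its casing operation.
lengths = {ch: (len(ch), len(op(ch))) for ch, op in samples.items()}

for ch, (before, after) in lengths.items():
    print(f"U+{ord(ch):04X}: {before} -> {after} code points")
```

A buffer sized from the input length alone is therefore not guaranteed to hold the cased output.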
Root Causes: Guidance for Casing
• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
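A sketch of the first bullet, assuming the application needs upper-cased text downstream. The function name and the prohibited token are hypothetical. The point is only the ordering: transform first, validate the transformed value, and pass that same value on.

```python
def upper_then_validate(value: str) -> str:
    """Apply the casing step first, then validate what will actually be used."""
    transformed = value.upper()
    if "SCRIPT" in transformed:
        raise ValueError("prohibited token present after casing")
    return transformed


print(upper_then_validate("hello"))

try:
    upper_then_validate("<\u017fcript>")  # long s becomes "S" when upper-cased
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```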
Root Causes: Buffer Overflows
• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution
Root Causes: Buffer Overflows
Casing: maximum expansion factors
Operation  UTF        Factor  Sample
Lower      8          1.5X    Ⱥ U+023A
           16, 32     1X      A U+0041
Upper      8, 16, 32  3X      ΐ U+0390
Source: Unicode Technical Report 36
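The factors in the casing table can be checked against Python's built-in case mappings. This is a sketch: the helper name is hypothetical, and the little-endian codec names are used so that no byte order mark is counted in the sizes.

```python
def expansion(before: str, after: str, encoding: str) -> float:
    """Ratio of encoded sizes, in bytes, after and before a casing operation."""
    return len(after.encode(encoding)) / len(before.encode(encoding))


stroke_a = "\u023a"         # sample for Lower in the table
iota_dialytika = "\u0390"   # sample for Upper in the table

print(expansion(stroke_a, stroke_a.lower(), "utf-8"))
print(expansion("A", "A".lower(), "utf-16-le"))
for enc in ("utf-8", "utf-16-le", "utf-32-le"):
    print(enc, expansion(iota_dialytika, iota_dialytika.upper(), enc))
```

The same string has three different sizes here (code points, UTF-8 bytes, UTF-16 code units), which is the "chars vs. bytes" distinction the buffer overflow slides warn about.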
Root Causes: Buffer Overflows
Normalization: maximum expansion factors
Operation  UTF     Factor  Sample
NFC        8       3X      U+1D160
           16, 32  3X      U+FB2C
NFD        8       3X      ΐ U+0390
           16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
           16, 32  18X     ﷺ U+FDFA
Source: Unicode Technical Report 36
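The normalization factors can be reproduced with Python's `unicodedata` module. The helper name is hypothetical; the sample code points are the ones listed in the table.

```python
import unicodedata


def expansion(ch: str, form: str, encoding: str) -> float:
    """Ratio of encoded sizes, in bytes, after and before normalization."""
    normalized = unicodedata.normalize(form, ch)
    return len(normalized.encode(encoding)) / len(ch.encode(encoding))


rows = [
    ("\U0001d160", "NFC", "utf-8"),
    ("\ufb2c", "NFC", "utf-16-le"),
    ("\u0390", "NFD", "utf-8"),
    ("\u1f82", "NFD", "utf-16-le"),
    ("\ufdfa", "NFKD", "utf-8"),
    ("\ufdfa", "NFKD", "utf-16-le"),
]
for ch, form, enc in rows:
    print(f"U+{ord(ch):04X} {form} {enc}: {expansion(ch, form, enc)}")
```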
Root Causes: Guidance for Buffer Overflows
• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
Root Causes: Controlling Syntax
• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution
Attacks and Exploits: Controlling syntax
• Manipulate HTML parsers and JavaScript interpreters
• Control protocols
Case Study: Opera
• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting "white space"
  – A problem with HTML 4.0 spec
Case Study: Opera
<a href=[U+180E]onclick=alert()>
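A small Python model of the disagreement behind this example. Both pieces are hypothetical: a filter that only recognizes ASCII white space before an event-handler attribute, and a lenient tokenizer that also treats U+180E as a separator. Neither is Opera's actual parsing code.

```python
import re

# Hypothetical filter: an event handler counts only if ASCII white space
# precedes it.
EVENT_HANDLER = re.compile(r"[ \t\r\n\f]on[a-z]+[ \t\r\n\f]*=", re.IGNORECASE)

# Hypothetical lenient parser: U+180E is accepted as a separator too.
LENIENT_SEPARATORS = re.compile("[ \t\r\n\f\u180e]+")


def filter_allows(markup: str) -> bool:
    return EVENT_HANDLER.search(markup) is None


def attribute_names(tag: str) -> list:
    """Split a single tag on the lenient separators and return attribute names."""
    parts = LENIENT_SEPARATORS.split(tag.strip("<>"))
    return [part.split("=", 1)[0] for part in parts[1:] if part]


payload = "<a href=\u180eonclick=alert()>"
print(filter_allows(payload))
print(attribute_names(payload))
```

The filter and the parser disagree about where one attribute ends and the next begins, which is all the attack needs.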
Opera White Space Characters
DEMO
Case Study: Opera
MVS (Mongolian Vowel Separator): U+180E
Root Causes: Guidance for Controlling Syntax
• Question specifications
• Be careful…
Root Causes: Specifications
1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer
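The quoted text offers two choices for a U+FEFF in the middle of a file: ignore it or treat it as an error. A sketch of the second choice in Python follows. The function name is hypothetical.

```python
BOM = "\ufeff"


def strip_leading_bom_strict(text: str) -> str:
    """Accept U+FEFF only as a leading byte order mark; reject it elsewhere."""
    if text.startswith(BOM):
        text = text[1:]
    if BOM in text:
        raise ValueError("U+FEFF found in the middle of the input")
    return text


print(strip_leading_bom_strict("\ufeffhello"))

try:
    strip_leading_bom_strict("java\ufeffscript:")
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

Raising here avoids the character-deletion behavior that made the BOM useful for filter evasion.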
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss
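As an illustration of mapping differences and data loss, here is one byte decoded through two of Python's codecs whose labels are often treated as interchangeable, followed by a lossy conversion back. The byte value is chosen for the example.

```python
byte = b"\x80"

via_latin1 = byte.decode("iso-8859-1")  # maps to the C1 control U+0080
via_cp1252 = byte.decode("cp1252")      # maps to U+20AC EURO SIGN

# Converting the other way loses data when the target has no mapping.
lossy = "\u20ac".encode("iso-8859-1", errors="replace")

print(ascii(via_latin1), ascii(via_cp1252), lossy)
```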
Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
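A sketch of the last bullet in Python. It assumes NFKC normalization plus case folding as the canonical form and uses a hypothetical prohibited substring. A real application would choose the normalization form and the checks to fit its own data.

```python
import unicodedata


def canonicalize(value: str) -> str:
    """Normalize, then case-fold, so validation sees the final form."""
    return unicodedata.normalize("NFKC", value).casefold()


def validate(value: str) -> str:
    """Hypothetical validator that runs only on the canonical form."""
    canonical = canonicalize(value)
    if "<script" in canonical:
        raise ValueError("prohibited markup after normalization")
    return canonical


print(validate("Hello"))

try:
    validate("\uff1cSCRIPT\uff1e")  # fullwidth angle brackets
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```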
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
A Closer Look The BOM
BOMU+FEFF
March 2009 copy 2009 Chris Weber
bull Attackers manipulate casing operations to inject otherwise prohibited characters
bull Casing can multiply the buffer sizes needed
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
toLower(ldquoİrdquo) == ldquoirdquo
toLower(ldquoscrİptrdquo) == ldquoscriptrdquo
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
len(x) = len(toLower(x))
wwwcasabasecuritycom
Root CausesCasing
March 2009 copy 2009 Chris Weber
bull Perform casing operations before validation
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Casing
March 2009 copy 2009 Chris Weber
bull Incorrect assumptions about string sizes (chars vs bytes)
bull Improper width calculations
bull Impact Enable code execution
wwwcasabasecuritycom
Root CausesBuffer Overflows
March 2009 copy 2009 Chris Weber
Casing - maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
Lower 8 15 Ⱥ U+023A
16 32 1 A U+0041
Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
Normalization- maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
NFC8 3X 119136 U+1D160
16 32 3X ש U+FB2C
NFD8 3X ΐ U+0390
16 32 4X ᾂ U+1F82
NFKCNFKD8 11X
ملسو هيلع هللا ىلص U+FDFA16 32 18X
Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
bull Know the difference between bytes and chars
bull Secure coding
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Buffer Overflows
March 2009 copy 2009 Chris Weber
bull White space and line breaks
ndash Eg when U+180E acts like U+0020
bull Quotation marks
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesControlling Syntax
March 2009 copy 2009 Chris Weber
bull Manipulate HTML parsers and javascriptinterpreters
bull Control protocols
wwwcasabasecuritycom
Attacks and ExploitsControlling syntax
March 2009 copy 2009 Chris Weber
• Unicode formatter characters exploited for XSS
– Damage: filter evasion, controlling syntax
– Exploit: bypass filtering logic with specially crafted characters to achieve cross-site scripting
– Root cause: interpreting "white space"
– A problem with the HTML 4.0 spec
Case Study: Opera
<a href=[U+180E]onclick=alert()>
Case Study: Opera
DEMO
Opera White Space Characters
Case Study: Opera
MVS (Mongolian Vowel Separator): U+180E
• Question specifications
• Be careful…
Root Causes: Guidance for Controlling Syntax
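One defensive reading of "be careful" is to flag any space-like or invisible character that is not on an explicit allowlist, instead of trusting a parser's idea of white space. The sketch below does that with Python's `unicodedata` module. The allowlist and the category set are assumptions chosen for this example; U+180E is caught under either its older category (Zs) or its newer one (Cf), depending on the Unicode version bundled with the interpreter.

```python
import unicodedata

# Assumed allowlist for this sketch: plain ASCII separators only.
ALLOWED = {" ", "\t", "\n", "\r", "\f"}

# General categories covering separators, format characters and controls.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}


def unexpected_separators(markup: str) -> list:
    """List (index, code point) for space-like or invisible characters
    that fall outside the allowlist."""
    hits = []
    for index, ch in enumerate(markup):
        if ch in ALLOWED:
            continue
        if unicodedata.category(ch) in SUSPECT_CATEGORIES:
            hits.append((index, f"U+{ord(ch):04X}"))
    return hits


print(unexpected_separators("<a href=\u180eonclick=alert()>"))
```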
1) Character stability
– IDNA/Nameprep is based on Unicode 3.2
2) Designs
– Specs are carefully designed but not always perfect
• This could have been a problem
– "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
– HTML 4.01
• Defines four whitespace characters and explicitly leaves handling of other characters up to the implementer
Root Causes: Specifications
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enabling code execution, data loss
Root Causes: Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA (Private Use Area) mappings
• Transform case and normalize prior to validation and redisplay
Root Causes: Guidance for Charset Transformations
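A minimal sketch of "Unicode as the broker", assuming Python's built-in codecs: decode the source bytes strictly into a Unicode string, normalize, and only then encode to the target charset, failing on anything that does not map cleanly. The function name `transcode` and the choice of NFC are assumptions for this example; the slide does not name a normalization form.

```python
import unicodedata


def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert between charsets through Unicode, failing loudly on bad input."""
    text = data.decode(source, errors="strict")   # legacy bytes -> Unicode
    text = unicodedata.normalize("NFC", text)     # normalize before any validation
    return text.encode(target, errors="strict")   # Unicode -> target charset


print(transcode("\u00e9".encode("iso-8859-1"), "iso-8859-1", "utf-8"))
```

With `errors="strict"` there is no silent best-fit substitution: a character with no mapping in the target raises `UnicodeEncodeError` instead of being replaced by a lookalike.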
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enabling code execution
Root Causes: Charset Mismatches
Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input
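Why the mismatch matters can be seen by decoding the same bytes under both declared charsets. In the sketch below the two-byte input is a made-up example: under ISO-8859-1 the second byte is a backslash, while under Shift_JIS both bytes form the single character ソ (U+30BD), so a filter and a parser that disagree on the charset see different text.

```python
raw = b"\x83\x5c"  # two bytes of hypothetical attacker-controlled input

as_latin1 = raw.decode("iso-8859-1")  # view of a filter that trusts the HTTP header
as_sjis = raw.decode("shift_jis")     # view of a parser that honors the meta tag

print(len(as_latin1), "\\" in as_latin1)  # two characters, backslash present
print(len(as_sjis), "\\" in as_sjis)      # one character, backslash absorbed
```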
• Force UTF-8
• Error if uncertain
Root Causes: Guidance for Charset Mismatches
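Both bullets can be sketched in a few lines, assuming a server that controls its own response headers. The function names are hypothetical; the only library behavior relied on is Python's strict UTF-8 decoder, which rejects malformed and overlong sequences.

```python
def decode_request_body(raw: bytes) -> str:
    """Accept only well-formed UTF-8; reject rather than guess a charset."""
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError("input is not valid UTF-8") from exc


def content_type_header(mime: str = "text/html") -> str:
    """Always declare the charset explicitly so the user-agent need not sniff."""
    return f"Content-Type: {mime}; charset=utf-8"


print(content_type_header())
```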
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
Exploiting Unicode-enabled Software: Agenda
• Watcher
– Web-app security testing and auditing
• Visual Spoofing Detection API
– Providing guarantees against visual spoofing and homograph attacks
Tools
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher – Some of the Passive Checks Included
Tools
http://websecuritytool.codeplex.com
Tools: Watcher – Web-app Security Testing and Auditing
• Problem
– Unicode enables visual-spoofing-maximus
• Solution
– Confusable detection
– Invisibles detection
– Syntax spoof detection
– more
Tools: Visual Spoofing Detection API
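The API itself is not shown in the slides, so the following is only a rough sketch of two of the listed ideas, built on Python's standard `unicodedata` module and not on the author's library. It flags format characters (category Cf, most of which have no visible glyph) and makes a crude mixed-script guess from the first word of each letter's Unicode name. A real confusable detector would use the Unicode script property and confusables data; this approximation is an assumption made for illustration.

```python
import unicodedata


def invisibles(text: str) -> list:
    """Code points in general category Cf (format characters)."""
    return [f"U+{ord(ch):04X}" for ch in text if unicodedata.category(ch) == "Cf"]


def rough_scripts(text: str) -> set:
    """Crude script guess: first word of each letter's Unicode character name."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def looks_mixed(text: str) -> bool:
    """True when letters appear to come from more than one script."""
    return len(rough_scripts(text)) > 1


print(rough_scripts("p\u0430ypal"))   # second letter is CYRILLIC SMALL LETTER A
print(invisibles("pay\u200bpal"))     # contains ZERO WIDTH SPACE
```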
• Cross-platform component library written in C
• Can be applied in user-agents or any software
– Browsers
– Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Detection API
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share.
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.
Chris Weber, www.lookout.net, Casaba Security
bull Incorrect assumptions about string sizes (chars vs bytes)
bull Improper width calculations
bull Impact Enable code execution
wwwcasabasecuritycom
Root CausesBuffer Overflows
March 2009 copy 2009 Chris Weber
Casing - maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
Lower 8 15 Ⱥ U+023A
16 32 1 A U+0041
Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
Normalization- maximum expansion factors
wwwcasabasecuritycom
Root CausesBuffer Overflows
Operation UTF Factor Sample
NFC8 3X 119136 U+1D160
16 32 3X ש U+FB2C
NFD8 3X ΐ U+0390
16 32 4X ᾂ U+1F82
NFKCNFKD8 11X
ملسو هيلع هللا ىلص U+FDFA16 32 18X
Source Unicode Technical Report 36
March 2009 copy 2009 Chris Weber
bull Know the difference between bytes and chars
bull Secure coding
bull Leverage existing frameworks and APIrsquos
ndash ICU Net
wwwcasabasecuritycom
Root CausesGuidance for Buffer Overflows
March 2009 copy 2009 Chris Weber
bull White space and line breaks
ndash Eg when U+180E acts like U+0020
bull Quotation marks
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesControlling Syntax
March 2009 copy 2009 Chris Weber
bull Manipulate HTML parsers and javascriptinterpreters
bull Control protocols
wwwcasabasecuritycom
Attacks and ExploitsControlling syntax
March 2009 copy 2009 Chris Weber
bull Unicode formatter characters exploited for XSS
ndash Damage Filter evasion controlling syntax
ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting
ndash Root Cause Interpreting ldquowhite spacerdquo
ndash A problem with HTML 40 spec
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
lta href=[U+180E]onclick=alert()gt
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Opera White Space Characters
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Case Study Opera
MVSU+180E
March 2009 copy 2009 Chris Weber
bull Question specifications
bull Be carefulhellip
wwwcasabasecuritycom
Root CausesGuidance for Controlling Syntax
Root Causes: Specifications
1) Character stability
– IDNA/Nameprep based on Unicode 3.2
2) Designs
– Specs are carefully designed but not always perfect
• This could have been a problem
– “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
– HTML 4.01
• Defines four whitespace characters and explicitly leaves handling other characters up to the implementer
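Both specification points can be made concrete. The sketch below assumes the stricter of the two options in the quoted text (treat an interior U+FEFF as an error) and lists the four characters HTML 4.01 section 9.1 names as white space; the function names are made up for illustration.

```python
BOM = "\ufeff"

# Space, tab, form feed and zero-width space, as listed in HTML 4.01 section 9.1.
HTML401_WHITE_SPACE = {"\u0020", "\u0009", "\u000c", "\u200b"}

def strip_bom_strict(text: str) -> str:
    """Remove one leading U+FEFF and treat any later U+FEFF as an error."""
    if text.startswith(BOM):
        text = text[1:]
    if BOM in text:
        raise ValueError(f"U+FEFF at offset {text.index(BOM)} is not a byte order mark")
    return text

def is_html401_white_space(ch: str) -> bool:
    """True only for the four characters the spec names explicitly."""
    return ch in HTML401_WHITE_SPACE
```

Anything outside that set, U+180E included, is where implementations are free to differ.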
Root Causes: Charset Transformations
• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enable code execution, data loss
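A round-trip check is one simple way to detect a lossy conversion. This is a sketch using Python's built-in codecs, which fail or substitute "?" for unmappable characters; other converters may instead substitute a similar-looking character, which is harder to spot. The helper name is hypothetical.

```python
def survives_round_trip(text: str, charset: str) -> bool:
    """True if text encodes to charset and decodes back unchanged."""
    try:
        return text.encode(charset).decode(charset) == text
    except UnicodeError:
        return False

print(survives_round_trip("abc", "iso-8859-1"))
print(survives_round_trip("\u0390", "iso-8859-1"))
# With a lenient error handler the character is silently replaced.
print("\u0390".encode("iso-8859-1", errors="replace"))
```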
Root Causes: Guidance for Charset Transformations
• Avoid if possible
• Use Unicode as the broker
• Beware the PUA (Private Use Area) mappings
• Transform case and normalize prior to validation and redisplay
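The last bullet matters because a check on raw input can pass while the same input, once normalized later, contains syntax. A minimal sketch, assuming NFKC plus case folding as the canonical form; `contains_script_tag` is a hypothetical validation rule used only to show the ordering problem.

```python
import unicodedata

def canonicalize(value: str) -> str:
    """Apply NFKC normalization, then case folding, before any validation."""
    return unicodedata.normalize("NFKC", value).casefold()

def contains_script_tag(value: str) -> bool:
    # Hypothetical rule; a real filter would be far more thorough.
    return "<script" in value

raw = "\uff1cSCRIPT\uff1e"   # fullwidth less-than and greater-than signs
print(contains_script_tag(raw))                 # check on raw input
print(contains_script_tag(canonicalize(raw)))   # check after canonicalization
```

The same canonical form should be the one that is stored and redisplayed, so that what was validated is what gets used.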
Root Causes: Charset Mismatches
• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enable code execution
Root Causes: Charset Mismatches
Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input
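A test harness could flag this kind of disagreement between the HTTP header and the meta tag. The sketch below is an assumption about how such a check might look, not how any particular tool does it; it compares labels literally and does not resolve aliases such as latin1 versus ISO-8859-1.

```python
import re

CHARSET_RE = re.compile(r"""charset\s*=\s*["']?([A-Za-z0-9._:-]+)""", re.IGNORECASE)

def declared_charset(value: str):
    """Extract the charset label from a Content-Type value or meta tag, if any."""
    match = CHARSET_RE.search(value)
    return match.group(1).lower() if match else None

def charsets_disagree(http_content_type: str, meta_tag: str) -> bool:
    """True when both sources declare a charset and the labels differ."""
    header = declared_charset(http_content_type)
    meta = declared_charset(meta_tag)
    return header is not None and meta is not None and header != meta

print(charsets_disagree(
    "text/html; charset=ISO-8859-1",
    '<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">',
))
```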
Root Causes: Guidance for Charset Mismatches
• Force UTF-8
• Error if uncertain
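In code, the two bullets amount to declaring UTF-8 explicitly on output and refusing to guess on input. A minimal sketch with hypothetical helper names, relying on Python's strict UTF-8 decoder:

```python
def decode_utf8_or_fail(data: bytes) -> str:
    """Decode as UTF-8 and raise instead of guessing another charset."""
    # errors="strict" rejects malformed, overlong and truncated sequences.
    return data.decode("utf-8", errors="strict")

def content_type_header(media_type: str = "text/html") -> str:
    """Build a Content-Type value that always states UTF-8 explicitly."""
    return f"{media_type}; charset=utf-8"

print(content_type_header())
try:
    decode_utf8_or_fail(b"\xc0\xaf")   # overlong encoding of "/"
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```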
Exploiting Unicode-enabled Software: Agenda
• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
Tools
• Watcher
– Web-app security testing and auditing
• Visual Spoofing Detection API
– Providing guarantees against Visual Spoofing and Homograph attacks
Tools: Watcher – Some of the Passive Checks Included
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher - Web-app Security Testing and Auditing
http://websecuritytool.codeplex.com
Tools: Visual Spoofing Detection API
• Problem
– Unicode enables visual-spoofing-maximus
• Solution
– Confusable detection
– Invisibles detection
– Syntax spoof detection
– more
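To give a feel for what confusable detection involves, here is a deliberately crude mixed-script heuristic. It is not the Visual Spoofing Detection API and does not reflect its design: Python's `unicodedata` has no script property, so the first word of each character name is used as a rough stand-in, which misclassifies names such as "FULLWIDTH LATIN ...".

```python
import unicodedata

def scripts_used(label: str) -> set:
    """Approximate the scripts in label from the first word of each character name."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            # Name prefix as a rough proxy for the script of the character.
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def looks_mixed_script(label: str) -> bool:
    """True when letters from more than one (approximate) script appear."""
    return len(scripts_used(label)) > 1

print(scripts_used("p\u0430ypal"))      # second letter is CYRILLIC SMALL LETTER A
print(looks_mixed_script("paypal"))
```

A production check would use real script data and a confusables table rather than character names.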
Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
– Browsers
– Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber
www.lookout.net
Casaba Security
www.casabasecurity.com
March 2009 copy 2009 Chris Weber
bull Unicode formatter characters exploited for XSS
ndash Damage Filter evasion controlling syntax
ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting
ndash Root Cause Interpreting ldquowhite spacerdquo
ndash A problem with HTML 40 spec
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
lta href=[U+180E]onclick=alert()gt
wwwcasabasecuritycom
Case Study Opera
March 2009 copy 2009 Chris Weber
DEMO
wwwcasabasecuritycom
Opera White Space Characters
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Case Study Opera
MVSU+180E
March 2009 copy 2009 Chris Weber
bull Question specifications
bull Be carefulhellip
wwwcasabasecuritycom
Root CausesGuidance for Controlling Syntax
March 2009 copy 2009 Chris Weber
1) Character stabilityndash IDNANameprep based on Unicode 32
2) Designsndash Specs are carefully designed but not always perfect
bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of
U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo
ndash HTML 401 bull Defines four whitespace characters and explicitly leaves
handling other characters up to implementer
wwwcasabasecuritycom
Root CausesSpecifications
March 2009 copy 2009 Chris Weber
bull Converting between charsets is dangerous
bull Mapping tables and algorithms vary across platforms
bull Impact Filter evasion Enable code execution Data-loss
wwwcasabasecuritycom
Root CausesCharset Transformations
March 2009 copy 2009 Chris Weber
bull Avoid if possible
bull Use Unicode as the broker
bull Beware the PUA mappings
bull Transform case and normalize prior to validation and redisplay
wwwcasabasecuritycom
Root CausesGuidance for Charset Transformations
March 2009 copy 2009 Chris Weber
bull Some charset identifiers are ill-defined
bull Vendor implementations vary
bull User-agents may sniff if confused
bull Attackers manipulate behavior
bull Impact Filter evasion Enable code execution
wwwcasabasecuritycom
Root CausesCharset Mismatches
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Root CausesCharset Mismatches
Content-Type charset=ISO-8859-1
ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt
Attacker-controlled input
March 2009 copy 2009 Chris Weber
bull Force UTF-8
bull Error if uncertain
wwwcasabasecuritycom
Root CausesGuidance for Charset Mismatches
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
• Watcher
  – Web-app security testing and auditing
• Visual Spoofing Detection API
  – Providing guarantees against Visual Spoofing and Homograph attacks
Tools
• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure
Tools: Watcher – Some of the Passive Checks Included
http://websecuritytool.codeplex.com
Tools: Watcher – Web-app Security Testing and Auditing
• Problem
  – Unicode enables visual-spoofing-maximus
• Solution
  – Confusable detection
  – Invisibles detection
  – Syntax spoof detection
  – more
Tools: Visual Spoofing Detection API
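To make "invisibles detection" and mixed-script checks concrete, here is a rough sketch. It is not the Visual Spoofing Detection API from the talk: the function names are hypothetical, and the script guess uses the first word of each character's Unicode name as a crude stand-in for the real Script property and confusables data.

```python
import unicodedata


def invisible_chars(label: str):
    """Return (offset, code point) for format characters (category Cf)."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(label)
        if unicodedata.category(ch) == "Cf"
    ]


def rough_scripts(label: str):
    """Guess the scripts of the letters from their Unicode names (heuristic)."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])  # e.g. "LATIN", "CYRILLIC"
    return scripts


def looks_suspicious(label: str) -> bool:
    """Flag labels with invisible characters or letters from several scripts."""
    return bool(invisible_chars(label)) or len(rough_scripts(label)) > 1
```

For example, a label that swaps Latin "a" for Cyrillic U+0430 reports two scripts, and a label with an embedded U+200B reports one invisible character.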
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  – Browsers
  – Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Detection API
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API
Chris Weber
www.lookout.net
Casaba Security
www.casabasecurity.com
<a href=[U+180E]onclick=alert()>
Case Study: Opera
DEMO
Opera White Space Characters
Case Study: Opera
MVS U+180E
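The Opera case suggests a defensive check: before markup reaches a parser, flag any separator-like character that the filter itself does not treat as whitespace. The sketch below is an assumption-laden illustration; the allowed set and the function name are hypothetical, and the category list is deliberately broad because U+180E has been classified both as a space separator and as a format character across Unicode versions.

```python
import unicodedata

# Whitespace the filter is assumed to understand (an assumption for this sketch).
ALLOWED_WS = {" ", "\t", "\n", "\r", "\f"}

# Separator, format, and control categories a lenient parser might skip or split on.
RISKY_CATEGORIES = ("Zs", "Zl", "Zp", "Cf", "Cc")


def suspicious_separators(markup: str):
    """Return (offset, code point) for separator-like characters outside ALLOWED_WS."""
    hits = []
    for i, ch in enumerate(markup):
        if ch in ALLOWED_WS:
            continue
        if unicodedata.category(ch) in RISKY_CATEGORIES:
            hits.append((i, f"U+{ord(ch):04X}"))
    return hits


# The payload from the case study, with a real U+180E in place of the placeholder.
print(suspicious_separators("<a href=\u180eonclick=alert()>"))
```

A filter that sees no space between href= and onclick would treat the input as one attribute value, while a user-agent that accepts U+180E as whitespace would see two attributes.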
• Question specifications
• Be careful…
Root Causes: Guidance for Controlling Syntax
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Root CausesCharset Mismatches
Content-Type charset=ISO-8859-1
ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt
Attacker-controlled input
March 2009 copy 2009 Chris Weber
bull Force UTF-8
bull Error if uncertain
wwwcasabasecuritycom
Root CausesGuidance for Charset Mismatches
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Unicode crash course
bull Root Causes
bull Attack Vectors
bull Tools
wwwcasabasecuritycom
Exploiting Unicode-enabled SoftwareAgenda
March 2009 copy 2009 Chris Weber
bull Watcher
ndash Web-app security testing and auditing
bull Visual Spoofing Detection API
ndash Providing guarantees against Visual Spoofing and Homograph attacks
wwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
bull Unicode transformation hot-spotsbull User-controlled HTMLbull Cross-domain issuesbull Insecure cookiesbull Insecure HTTPHTTPS transitionsbull SSL protocol and certificate issuesbull XSS hot-spotsbull Flash issuesbull Silverlight issuesbull Information disclosure
wwwcasabasecuritycom
ToolsWatcher ndash Some of the Passive Checks Included
March 2009 copy 2009 Chris Weberwwwcasabasecuritycom
Tools
March 2009 copy 2009 Chris Weber
httpwebsecuritytoolcodeplexcom
wwwcasabasecuritycom
ToolsWatcher - Web-app Security Testing and Auditing
March 2009 copy 2009 Chris Weber
bull Problemndash Unicode enables visual-spoofing-maximus
bull Solutionndash Confusable detection
ndash Invisibles detection
ndash Syntax spoof detection
ndash more
wwwcasabasecuritycom
ToolsVisual Spoofing Detection API
Tools: Visual Spoofing Detection API
• Cross-platform component library written in C
• Can be applied in user-agents or any software
  - Browsers
  - Email clients
• Planned for release Fall 2009
• Email me with questions
Tools: Visual Spoofing Protection Demo
Thank you
Contact me with questions, new test cases, or ideas to share.
Visit my website for test cases, Unicode and security tools, and the Anti-Visual-Spoofing API.
Chris Weber, www.lookout.net, Casaba Security
www.casabasecurity.com