presentation saffi keisari m arc s tevens, a lexander s otirov, j acob a ppelbaum, a rjen l enstra,...
TRANSCRIPT
Presentation
Saffi Keisari
MARC STEVENS, ALEXANDER SOTIROV,JACOB APPELBAUM, ARJEN LENSTRA, DAVID MOLNAR,DAG ARNE OSVIK AND BENNE DE WEGER
1
CONTENTCONTENT
2
EXPLOIT OF MD5 COLLISION
3
EXPLOIT OF MD5 COLLISION
4
HASHES AND MESSAGE DIGEST
Hash is also called message digest One-way function: d=h(m) but no h’(d)=m
Cannot find the message given a digest
Cannot find m1, m2, where d1=d2
Arbitrary-length message to fixed-length digest Randomness
any bit in the outputs ‘1’ half the time each output: 50% ‘1’ bits
5
USING HASH FOR AUTHENTICATION
Alice to Bob: challenge rA
Bob to Alice: MD(KAB|rA)
Bob to Alice: rB
Alice to Bob: MD(KAB|rB) Only need to compare MD results
6
7
MD5: MESSAGE DIGEST VERSION 5
input Message
Output 128 bits Digest
o Until recently the most widely used hash algorithm
o in recent times have both brute-force & cryptanalytic concerns
o Specified as Internet standard RFC1321 8
MD5 OVERVIEW
9
MD5 OVERVIEW 1. Padding:
Pad the message with: first the ‘1’-bit, next as many ‘0’ bits until the resulting bit lengthequals 448 mod 512, and finally the bit length of the original message as a 64-bit little-endianinteger. The total bit length of the padded message is 512N for a positive integer N.
2. Partitioning: The padded message is partitioned into N consecutive 512-bit blocks M1,M2, . . . ,MN.
3. Processing: MD5 goes through N + 1 states IHVi, for 0 i N, called the intermediate hash values. Each intermediate hash value IHVi consists of four 32-bit words ai, bi, ci, di. For i = 0 these are initialized to fixed public values: IHV0 = (a0, b0, c0, d0) = (6745230116, EFCDAB8916, 98BADCFE16, 1032547616), and for i = 1, 2, . . .N intermediate hash value IHVi is computed using the MD5
compression function described in detail below: IHVi = MD5Compress(IHVi−1,Mi).
4. Output: The resulting hash value is the last intermediate hash value IHVN, expressed as the
concatenation of the sequence of bytes, each usually shown in 2 digit hexadecimal representation, given by the four words aN, bN, cN, dN using Little-Endian. E.g. in this manner IHV0 will be expressed as the hexadecimal string
0123456789ABCDEFFEDCBA9876543210 10
MD5 PROCESS
As many stages as the number of 512-bit blocks in the final padded message
Digest: 4 32-bit words: MD=A|B|C|D Every message block contains 16 32-bit words: m0|
m1|m2…|m15
Digest MD0 initialized to: A=01234567,B=89abcdef,C=fedcba98, D=76543210
Every stage consists of 4 passes over the message block, each modifying MD
Each block 4 rounds, each round 16 steps
11
PROCESSING OF BLOCK MI - 4 PASSES
ABCD=fF(ABCD,mi,T[1..16])
ABCD=fG(ABCD,mi,T[17..32])
ABCD=fH(ABCD,mi,T[33..48])
ABCD=fI(ABCD,mi,T[49..64])
mi
+ + + +
A B C D
MDi
MD i+1
12
Functions and Random Numbers F(x,y,z) == (xy)(~x z), (for 0 ≤ t< 16)
G(x,y,z) == (x z) (y ~ z), (for 16 ≤ t< 32)
H(x,y,z) == xy z, (for 32 ≤ t< 48)
I(x,y,z) == y(x ~z) , (for 48 ≤ t< 64)
DIFFERENT PASSES...
Input: mt – a 32-bit word from the message
With different shift every round Tt – int(232 * abs(sin(i+1))), 0<i<65
Provided a randomized set of 32-bit patterns, which eliminate any regularities in the input data
ABCD: current MD Output:
ABCD: new MD
13
MD5 COMPRESSION FUNCTION Each round has 16 steps of the form:
a = b+((a+g(b,c,d)+X[k]+T[i])<<<s) a,b,c,d refer to the 4 words of the buffer, but
used in varying permutations note this updates 1 word only of the buffer after 16 steps each word is updated 4 times
where g(b,c,d) is a different nonlinear function in each round (F,G,H,I)
14
MD5 COMPRESSION FUNCTION
15
2004: First collision for MD5 [Wang,Yu]: Two 128 byte messages with same MD5 hash value
Identical prefix collision attack Messages differ only in 128 consecutive ‘random’
bytes Bytes before or after may not differ Currently: <1 sec on single pc core
Same MD5 hash value same signature
MD5( ) = MD5( )
COLLISIONS FOR MD5
16
COLLISIONS FOR MD5 2004 Due to the iterative structure of MD5 and to the fact that IHV0 can have any
128 bit value, such collisions can be combined into larger inputs. Namely, for any given prefix P and any given suffix S a pair of "collision blocks" {C,C'} can be computed such that MD5(P||C||S) = MD5(P||C'||S). We use the term "collision block" for a specially crafted bit string that is inserted into another bit string to achieve a collision. One collision block may consist of several input blocks, even including partial input blocks. The collision blocks of [WY] consist of precisely two consecutive input blocks.
17
COLLISIONS FOR MD5 2004
Their attack is based on a combined additive and XOR differential method. Using this differential, they have constructed 2 differential paths for the compression function of MD5 which are to be used consecutively to generate a collision of MD5 itself. Their constructed differential paths describe precisely how differences between the two pairs (IHV,B) and (IHV`,B`), of an intermediate hash value and an accompanying message block, propagate through the compression function.
They describe the integer difference (−1, 0 or +1) in every bit of the intermediate working states Qt and even specific values for some bits.
Using a collision finding algorithm they search for a collision consisting of two consecutive pairs of blocks (B0,B`0) and (B1,B`1 ), satisfying the 2 differential paths which starts from arbitrary IHˆV = IHˆV 0. Therefore the attack can be used to create two messages M and M0 with the same hash that only differ slightly in two subsequent blocks as shown in the following outline where IHˆV = IHVk for some k:
18
CHOSEN-PREFIX COLLISIONS 2006
Chosen-prefix collision (CPC) attack [Stevens, Lenstra, de Weger]
New stronger type of collisions Choose two arbitrary files (same length) Make them collide by appending 716 ‘random’ bytes Currently: 1 day on quad-core pc w/ only 588 bytes
Example: Colliding certificates with different identities
MD5 harmful for digital signatures
MD5( ) = MD5( )
19
MD5 Compression: IHV, M vs IHV’, M’ Analyze propagation of differences Choose δM=M’-M
Which achieves (partial) elimination of δIHV at end
Construct set of equations Sufficient conditions
Solve set of equations Actual M, M’
Repeat until δ IHV=0
CHOSEN-PREFIX COLLISIONS
20
BIRTHDAY PROBLEM
How many people do you need so that the probability of having two of them share the same birthday is > 50% ?
Random sample of n birthdays (input) taken from k (365, output) kn total number of possibilities (k)n=k(k-1)…(k-n+1) possibilities without duplicate birthday Probability of no repetition:
p = (k)n/kn 1 - n(n-1)/2k For k=366, minimum n = 23 n(n-1)/2 pairs, each pair has a probability 1/k of having the same
output n(n-1)/2k > 50% n>k1/2
21
BIRTHDAY PROBLEM - OUR CASE
After approximately √ (∏ |V| /2) iterations one may expect to have encountered a Collision
Let p be the probability that a birthday collision satisfies additional conditions that cannot be captured by V or f.
where f deterministic function f : V V, where different points x and y such that f(x) = f(y)
in our case the Ctr =√ (∏ |V| /(2p))
where V: Search space p:probability
22
CHOSEN-PREFIX COLLISIONS
Not all δIHVs can be eliminated First perform birthday search
Find δIHVs of specific forme.g. δHV=(0,x,x,y)
Extend search to lower # near-collision blocks Appends 64 to 96 bits to prefixes Then iteratively eliminate differences in δIHV Till δIHV=(0,0,0,0)
Vlastimil Klima, Finding MD5 collisions on a notebook PC using multi-message modifications,
Cryptology ePrint Archive, Report 2005/102, 2005, http://eprint.iacr.org/2005/102.
Vlastimil Klima, Tunnels in hash functions: MD5 collisions within a minute, Cryptology
ePrint Archive, Report 2006/105, 2006, http://eprint.iacr.org/2006/105.23
CHOSEN-PREFIX COLLISIONS Thus, due to the iterative structure of MD5, the chosen-prefix collision construction method is able to produce
on input of any pair of chosen prefixes {P,P'} and any suffix S, a pair of collision blocks {C,C'}, such that MD5(P||C||S) = MD5(P'||C'||S).
To be more precise, for given {P,P'} the collision blocks are constructed as follows. First P, P' are padded with bit strings A, A' of any useful size and contents to achieve equal lengths of P||A and P'||A'. Then, using a "birthdaying" step, bit strings B, B' are produced such that the resulting P||A||B and P'||A'||B' have equal length, being a multiple of 512, and such that the resulting IHVs at this point have a prespecified structure. This enables the construction of a pair of "near collision blocks" {NC,NC'} that gives rise to a collision. Each near collision block will consist of a number of 512 bit input blocks. Thus, with C = A||B||NC and C' = A'||B'||NC', the resulting IHVs are identical. Consequently, MD5(P||C) = MD5(P'||C'), and thus also for any suffix S it is the case that MD5(P||C||S) = MD5(P'||C'||S).
24
2006 EXAMPLE COLLIDING CERTIFICATES
serial number
validity period
“Arjen K. Lenstra”
real certRSA key8192 bits
X.509 extensions
valid signature
identical bytes(copied from real cert)
collision bits(computed)
chosen prefix(different)
serial number
validity period
“Marc Stevens”
real certRSA key8192 bits
X.509 extensions
valid signature
set bythe CA
25
COLLISION CONSTRAINS
In our target application, generating a rogue CA certificate, we have to deal with two hard limits: Because the CA that is supposed to sign our
(legitimate) certificate does not accept certification requests for RSA modulo larger than 2048 bits, each of our suffixes S and S` and their common appendage T must fit in 2048 bits. This implies that we can use at most 3 near-collision blocks. (each block 512 bits)
Furthermore, to reliably predict the serial number, the entire construction must be performed within a few days.
26
PARTIAL DIFFERENTIAL PATHS
27
oW: a larger value allows elimination of more differences in δIHV per near-collision blockoBut increases the cost of constructing each near-collision block by a factor of roughly 22w.oHowever due to the blow-up factor of 22w only the values 2, 3, 4, and 5 are of interestothe new ones vary the carry propagations in the last 3 steps and the boolean function difference in the last step. oThis change affects the working state only in difference δ Q64
COMPLEXITY, MEMORY REQUIREMENT
28
r: near-collision blocks
The first chosen-prefix collision example from [#1] used a 96-bit birth-day search space V with |V|= 296 to find a δ IHV = (δa; δb; δc; δd) with δa = 0, δb = δc = δd. This search can be continued until a birthday collision is found that requires a sufficiently small number of near-collision blocks, which leads to a trade-of between the birthday search and the number of blocks.
If one would aim for just 3 near-collision blocks, one expects 257:33 MD5 compressions for the 96-bit birthday search, which would take about 50 days on 215 PlayStation 3 game consoles.
By leaving δb free, we get an improved 64-bit search space (cf. [#2]).
In the resulting birthday collisions, the differences in δb compared to δc were handled by the differential path from [#2, section 7.4] which corresponds to δQ64 = δ2q ¨δ2q+21 mod 32
#1 http://www.win.tue.nl/hashclash/
#2 http://www.win.tue.nl/hashclash/Nostradamus/
COMPLEXITY, MEMORY REQUIREMENT
29
r: near-collision blocks
We can do a (64 + k)-bit search similar to the one above, but with δb = δc mod 2k.
Since δb does not introduce new differences compared to δc in the lower k bits, the average number of near-collision blocks may be reduced - in particular when taking advantage of our new family of differential paths - while incurring a higher birthdaying cost.
For any targeted number of near-collision blocks, this leads to a trade-of between the birthdaying
cost and space requirements (unless the number of blocks is at least 6, since then 241MB suffices for the plausible choice w = 2)
COMPLEXITY, MEMORY REQUIREMENT
30
The overall chosen-prefix collision construction takes on average less than a day on the cluster of PS3s.With 150Mb per PS3 Where W=5,K=8,M=30G, takes √ 2 more time the excepted
r: near-collision blocks
64 + k with δb = δc mod 2k.
COLLISION IMPROVEMENTS
Allow extra bit differences in last step
Eliminate more IHV differences per block Decreases avg. # collision bytes required Increases collision search complexity O(22w)
Arbitrary bitdifferencesw
Birthday search complexity
30
35
40
45
50
0 1 2 3 4 5 6
w
Ctr: 2
x
Blocks35791113
31
COLLISION IMPROVEMENTS
Birthday search for δIHV=(δa,δb,δc,δd)
of the form: δa=0, δd=δc Short CPC: very high memory requirements
New trade-off: δb=δc mod 2k, 0·k·32 Trade memory vs complexity
w=5: £210 vs £29
δaδdδcδb
Birthday search complexity (w=3)
20
25
30
35
40
45
40 50 60
Ctr: 2x
Mem
ory
: 2
x B
Blocks345
32
COLLISION IMPROVEMENTS
Rogue CA construction (<2048 bits) Cluster of 215 PlayStation3s
Performing like 8600 pc cores Complexity 250 using 30GB:
1 day on cluster Complexity 248.2 using a few TBs:
1 day on 20 PS3s and 1 pc1 day on 8 NVIDIA GeForce GTX280s1 day on Amazon EC2 at the cost of $2,000
Normal CPC Complexity approx. 239 (<1 day on quadcore pc) 33
OVERVIEW SSL
34
Web Deployment: Web server Email Servers(POP3, IMAP) Many Other service(IRC, SSL VPN. Etc…)
Very good at preventing eavesdropping Asymmetric key exchange(RSA) Symmetric crypto for data encryption
Man in The Middle attacks Preventing by establishing a chain of trust from
web site digital certificate to trusted certificate authority
CERTIFICATION AUTHORITIES (CAS)
Website digital certificates must be signedby a trusted Certificate Authority
Browsers ship with a list of trusted Cas Firefox 3 includes 135 trusted CA certs
CAs’ responsibilities: Verify the identity of the requestor Verify domain ownership for SSL certs Revoke bad certificates
35
36
1. User generates private key2. User creates a Certificate Signing Request (CSR) containing
• user identity• domain name• public key
3. CA processes the CSR• validates user identity• validates domain ownership• signs and returns the certificate
4. User installs private key and certificate on aweb server
Obtaining certificates
Obtaining certificates
ATTACK ON CERTIFICATION AUTHORITIES
We were able to create a sub-CA signed by a known trusted CA (RapidSSL) Not by default known by major web browsers But is trusted as it is signed by a known CA
Same effect as subverting a known trusted CA
Possible because one particular commercial CA used MD5 to create signatures
MD5 known to have significant weaknesses since 2004 had weaknesses in procedures
37
ATTACK ON CERTIFICATION AUTHORITIES
In July 2008 check if the colliding certificates attack could be applied to a real Certification Authority.
The first step was to identify the CAs that still used MD5. Unfortunately it is not possible to determine the hash function
a CA uses from the CA certificate. We had to look at (website) certificates issued by the CAs
instead. Over the course of a week we spidered the web and collected
more than 100,000 SSL certificates, of which about 30,000 were signed by CAs trusted by Firefox.
There were six CAs that had issued certificates signed with MD5 in 2008:
38
ATTACK ON CERTIFICATION AUTHORITIES RapidSSL
C=US, O=Equifax Secure Inc., CN=Equifax Secure Global eBusiness CA-1 FreeSSL (free trial certificates offered by RapidSSL)
C=US, ST=UT, L=Salt Lake City, O=The USERTRUST Network, OU=http://www.usertrust.com, CN=UTN-USERFirst-Network Applications
TC TrustCenter AGC=DE, ST=Hamburg, L=Hamburg, O=TC TrustCenter for Security in Data Networks GmbH, OU=TC TrustCenter Class 3 CA/[email protected]
RSA Data SecurityC=US, O=RSA Data Security, Inc., OU=Secure Server Certification Authority
ThawteC=ZA, ST=Western Cape, L=Cape Town, O=Thawte Consulting cc, OU=Certification Services Division, CN=Thawte Premium Server CA/[email protected]
verisign.co.jpO=VeriSign Trust Network, OU=VeriSign, Inc., OU=VeriSign International Server CA - Class 3, OU=www.verisign.com/CPS Incorp.by Ref. LIABILITY LTD.(c)97 VeriSign
Out of the 30,000 certificates that collected, about 9,000 were signed using MD5, and 97% of those were issued by RapidSSL.
It was quite surprising that so many CAs are still using MD5, considering that MD5 has been known to be insecure since the first collisions were presented in 2004. Since these CAs had ignored all previous warnings by the cryptographic community
39
CREATING A SUB-CA
serial number
validity period
real cert domainname
real certRSA key
max 2048 bits
X.509 extensions
valid signature
rogue CA cert
rogue CA RSA key
rogue CA X.509extensions
Netscape CommentExtension
(contents ignored bybrowsers)
valid signature
identical bytes(copied from real cert)
collision bits(computed)
chosen prefix(different)
CA bit!
40
OBSTACLES
Predicting serial number and validity period Total computation < a few days Max 204 collision bytes instead of 716
Limit by the CA RapidSSL Greatly increases computational time 17 months on 1000 pc cores
41
PREDICTIONS RapidSSL uses a fully automated system Certificate issued exactly 6 seconds after
clicking Validity period one year plus one day + 6 sec
RapidSSL uses sequential serial numbers: Nov 3 07:44:08 2008 GMT 643006 Nov 3 07:45:02 2008 GMT 643007 Nov 3 07:46:02 2008 GMT 643008 Nov 3 07:47:03 2008 GMT 643009 Nov 3 07:48:02 2008 GMT 643010 Nov 3 07:49:02 2008 GMT 643011 Nov 3 07:50:02 2008 GMT 643012 Nov 3 07:51:12 2008 GMT 643013 Nov 3 07:51:29 2008 GMT 643014 Nov 3 07:52:02 2008 GMT ?
42
PREDICTIONS
Estimate: 800-1000 certificates per weekendProcedure:
1. Get the serial number S on Friday2. Predict the value for time T on Sunday
to be S+10003. Generate the collision bytes4. Shortly before time T buy enough certs to
increment the counter to S+9995. Send colliding request at time T
and get serial number S+1000
43
PREDICTIONS
44
RESULT
Success at 4th attempt Generated CA signature for real cert
also valid for rogue CA cert Explicit safeguards:
Validity period limited to August 2004 Private key remains secret
Major browsers and affected CAs were informed in advance Responded quickly and adequately MD5 abandoned by CAs hours
after public presentation45
SINGLE BLOCK CPC
Birthday search for δIHV that can be reduced to 0 with single near-collision block
New approach: New fastest near-collision attack (compl. 215) Allow extra factor 226 in collision finding compl. Results in set of 223.3 usable δIHVs of the form
δa=-25, δd=-25+225, δc=-25 mod 220
Total complexity: approx. 253.2
Example single block CPC in paper
46
REAL VS. ROGUE CERTIVICATE
47
CA CERTIFICATE byte 0 - 3:header
byte 4 - 8:version number (3, default value)
byte 9 - 13:serial number (643015, set by CA and predicted by us)
byte 14 - 28:signature algorithm ("md5withRSAEncryption", set by CA)
byte 29 - 120:issuer Distinguished Name (CA default value)
byte 121 - 152:validity ("from 3 Nov. 2008 7:52:02 to 4 Nov. 2009 7:52:02", set by CA and predicted by us)
byte 153 - 440:subject Distinguished Name(Country = "US",Organisation = "i.broke.the.internet.and.all.i.got.was.this.t-shirt.phreedom.org",Organisational Unit (3 fields set by the CA) Common Name = "i.broke.the.internet.and.all.i.got.was.this.t-shirt.phreedom.org",Country, Organisation and Common Name set by us in the CSR)
byte 441 - 734:subject public key info:byte 441 - 444:header
byte 445 - 459:public key algorithm ("RSAEncryption", set by us in the CSR)
byte 460 - 468:headers
byte 469 - 729:RSA modulus (2048 bit value, set by us in the CSR)
byte 730 - 734:RSA public exponent (65537, set by us in the CSR)
byte 735 - 926:extensions:byte 735 - 740:headers
byte 741 - 756:key usage ("digital signature, nonrepudiation, key encipherment, data encipherment", set by CA)
byte 757 - 881:subject key identifier, crl distribution points, authority key identifier (uninteresting fields, set by CA)
byte 882 - 912:extended key usage ("server authentication, client authentication", set by CA)
byte 913 - 926:basic constraints ("CA = FALSE, Path Length = None", set by CA)
48
byte 0 - 3:header
byte 4 - 8:version number (3, default value)
byte 9 - 11:serial number (65, arbitrary choice)
byte 12 - 26:signature algorithm ("md5withRSAEncryption", set by us in the CSR)
byte 27 - 118:issuer Distinguished Name (CA default value)
byte 119 - 150:validity ("from 31 Jul. 2004 0:00:00 to 2 Sep. 2004 0:00:00“, set by CA and predicted by us)
byte 153 - 212:subject Distinguished Name(Common Name = "MD5 Collisions Inc. (http://www.phreedom.org/md5)")
byte 213 - 374:subject public key info:
byte 213 - 215:header
byte 216 - 230:public key algorithm ("RSAEncryption“, ", set by us in the CSR)
byte 231 - 237:headers
byte 238 - 369:RSA modulus (1024 bit value) , , set by us in the CSR
byte 370 - 374:RSA public exponent (65537) , , set by us in the CSR
byte 375 - 926:extensions:
byte 375 - 378:headers
byte 379 - 395:key usage ("digital signature, nonrepudiation, certificate signing, offline CRL signing, CRL signing")
byte 396 - 412:basic constraints ("CA = TRUE, Path Length = None")
byte 413 - 476:subject key identifier, authority key identifier (uninteresting fields)
byte 477 - 926:tumor (Netscape extension)
49
ROGUE CA CERTIFICATE
CONCLUSION Collision attacks on MD5 form real threat
Hard to replace broken crypto primitives MD5 used by major CAs
4 years after first collision attacks Crypto primitives can be broken overnight What to do when e.g. SHA-1 really falls, say
yesterday? How to make replacement of primitives easier?
Source code implementation released:http://code.google.com/p/hashclash(Support for CELL/PS3 & CUDA) 50
PROGRESS OF COLLISION ATTACKS
Attack complexities for MD5, SHA-1 and SHA-2
(logarithmic: 38 means 238 ¼ 1day on 1pc)
51
THE AUTHORS – THE GEEKS GANG
52http://www.win.tue.nl/hashclash/rogue-ca/
53