aaron bedra - effective software security teams
INFOSECSUMMIT 2016
FORMAL VERIFICATION OF SECURE SOFTWARE SYSTEMS
Aaron Bedra, Chief Security Officer, Eligible, Inc.
@abedra
FORMAL METHODS
THIS IS A HUGE CONCEPT AND WE ARE ONLY GOING TO SCRATCH THE SURFACE
BUT LET'S TALK ABOUT SOME OF THE BASIC CATEGORIES
SPECIFICATION
DEVELOPMENT
VERIFICATION
TODAY WE WILL TALK ABOUT ALL OF THESE
IN VARYING DEGREES OF DETAIL
WE'RE GOING TO SKIP A WHOLE LOT OF THE BASICS
AND TALK ABOUT A LOT OF THINGS YOU MAY NOT HAVE SEEN BEFORE
IF YOU HAVEN'T BEEN PROPERLY INTRODUCED TO FORMAL METHODS, YOU WILL LEAVE WITH A LOT OF THINGS TO LOOK INTO
WHY?
THREAT MODELS
SOMETIMES IT'S WORTH DOING THE WORK
HIGH-ASSURANCE CYBER MILITARY SYSTEMS (HACMS)
https://www.youtube.com/watch?v=3aFeGGyi19A
EXAMPLE: IMPLEMENTATION OF CRYPTOGRAPHIC ALGORITHMS
SOME LANGUAGES ARE BETTER FOR MODELING ALGORITHMS THAN OTHERS
BUT THEY AREN'T ALWAYS GOOD FOR THE ACTUAL IMPLEMENTATION
THE GOOD NEWS IS THAT WE CAN HAVE OUR CAKE AND EAT IT TOO!
WARNING: THE FOLLOWING EXAMPLES ARE NOT SUITABLE CRYPTOGRAPHIC IMPLEMENTATIONS
CAESAR'S CIPHER
e(x) = (x + k) mod 26
VERY EASY TO MODEL AND IMPLEMENT
BUT WHAT IF I WANT TO MAKE SURE THAT IT WAS DONE CORRECTLY?
OUR MODEL
import Data.Char (chr, ord)
import Data.Ix (inRange)

caesar :: Int -> String -> String
caesar k = map f
  where
    f c
      | inRange ('A', 'Z') c = chr $ ord 'A' + (ord c - ord 'A' + k) `mod` 26
      | otherwise = c
AN IMPLEMENTATION
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

char *caesar(int shift, char *input) {
  size_t len = strlen(input);
  /* One extra byte so the result is always NUL-terminated. */
  char *output = malloc(len + 1);
  memset(output, '\0', len + 1);

  for (size_t x = 0; x < len; x++) {
    if (isalpha((unsigned char)input[x])) {
      int c = toupper((unsigned char)input[x]);
      c = (((c - 65) + shift) % 26) + 65;
      output[x] = c;
    } else {
      output[x] = input[x];
    }
  }

  return output;
}
BUT HOW DO WE COMPARE THEM?
WE COULD DO SOME PROPERTY BASED TESTING
SIMPLE EQUIVALENCE PROPERTY
equivalenceProperty = \s -> (caesar 2 s) == ???
WE CAN'T TEST EQUIVALENCE UNLESS WE CAN ACTUALLY RUN THE OTHER CODE
LUCKILY, HASKELL HAS GOOD FFI SUPPORT
WE IMPORT AND WRAP OUR NATIVE CODE
foreign import ccall "caesar.h caesar"
  c_caesar :: CInt -> CString -> CString

native_caesar :: Int -> String -> IO String
native_caesar shift input = withCString input $ \c_str ->
  peekCString (c_caesar (fromIntegral shift) c_str)
NOW WE CAN TRY REFERENCE AND IMPLEMENTATION IN THE SAME PLACE
$ ghci caesar.hs caesar.so
*Main> caesar 2 "ATTACKATDAWN"
"CVVCEMCVFCYP"
*Main> native_caesar 2 "ATTACKATDAWN"
"CVVCEMCVFCYP"
BUT WE DO HAVE SOMETHING TO ADDRESS
native_caesar :: Int -> String -> IO String
caesar :: Int -> String -> String
THIS WORKS FINE IN THE REPL, BUT IT WON'T WORK ELSEWHERE
AVERT YOUR EYES!
unsafeEq :: IO String -> String -> Bool
unsafeEq x y = unsafePerformIO(x) == y
WE CAN NOW WRITE A PROPERTY. GIVEN AN INPUT, ENSURE THAT THE OUTPUT OF BOTH FUNCTIONS IS EQUIVALENT
equivalenceProperty =
  forAll genSafeString $ \str ->
    unsafeEq (native_caesar 2 str) (caesar 2 str)
GENSAFESTRING IS A CUSTOM GENERATOR FOR ONLY ALPHA INPUTS
genSafeChar :: Gen Char
genSafeChar = elements ['A'..'Z']

genSafeString :: Gen String
genSafeString = listOf genSafeChar

newtype SafeString = SafeString { unwrapSafeString :: String }
  deriving Show

instance Arbitrary SafeString where
  arbitrary = SafeString <$> genSafeString
NOW WE CAN ACTUALLY RUN SOME TESTS!
$ ghci caesar.hs caesar.so
*Main> quickCheck equivalenceProperty
+++ OK, passed 100 tests.
BUT THAT'S NOT ENOUGH TESTS. LET'S TURN IT UP A NOTCH
deepCheck p = quickCheckWith stdArgs{ maxSuccess = 1000000 } p
*Main> deepCheck equivalenceProperty
+++ OK, passed 1000000 tests.
IS THAT ENOUGH? WHAT ABOUT A BILLION? MORE?
QUICKCHECK IS A WONDERFUL TOOL, BUT IT DOESN'T GET US ALL THE WAY THERE
THAT WAS ALSO A LOT OF WORK!
THERE HAS TO BE A BETTER WAY TO APPROACH THIS
CRYPTOL
A domain-specific language for specifying cryptographic algorithms
A SIMPLE EXAMPLE
sqDiff1 (x, y) = x^^2 - y^^2
sqDiff2 (x, y) = (x-y) * (x+y)

sqDiffsCorrect : ([8], [8]) -> Bit
property sqDiffsCorrect (x, y) = sqDiff1 (x, y) == sqDiff2 (x, y)
Cryptol> :l ha.cry
Loading module Cryptol
Loading module Main
Main> :t sqDiffsCorrect
sqDiffsCorrect : ([8], [8]) -> Bit
Main> sqDiffsCorrect (3, 5)
True
Main> :prove sqDiffsCorrect
Q.E.D.
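The proof goes through because both expressions are evaluated modulo 2^8, where the difference-of-squares identity still holds. As a rough cross-check outside Cryptol, a minimal C sketch can enumerate all 65,536 input pairs. The names `sq_diff1`, `sq_diff2`, and `sq_diff_mismatches` are made up for this illustration, and `uint8_t` is assumed to stand in for Cryptol's `[8]`.

```c
#include <stdint.h>

/* x^2 - y^2, truncated to 8 bits like Cryptol's [8]. */
uint8_t sq_diff1(uint8_t x, uint8_t y) {
    return (uint8_t)(x * x - y * y);
}

/* (x - y) * (x + y), truncating each intermediate result to 8 bits. */
uint8_t sq_diff2(uint8_t x, uint8_t y) {
    uint8_t diff = (uint8_t)(x - y);
    uint8_t sum = (uint8_t)(x + y);
    return (uint8_t)(diff * sum);
}

/* Count the input pairs on which the two definitions disagree. */
int sq_diff_mismatches(void) {
    int mismatches = 0;
    for (int x = 0; x < 256; x++) {
        for (int y = 0; y < 256; y++) {
            if (sq_diff1((uint8_t)x, (uint8_t)y) !=
                sq_diff2((uint8_t)x, (uint8_t)y)) {
                mismatches++;
            }
        }
    }
    return mismatches;
}
```

Enumeration is only practical here because the input space is tiny. `:prove` hands the same question to a solver, which is what makes wider inputs tractable.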
THE GOAL IS TO MAKE REFERENCE IMPLEMENTATIONS OF CRYPTOGRAPHIC ALGORITHMS
caesar : {n} ([8], String n) -> String n
caesar (s, msg) = [ shift x | x <- msg ]
  where map = ['A' .. 'Z'] <<< s
        shift c = map @ (c - 'A')
$ ~/.local/bin/cryptol caesar.cry
Loading module Cryptol
Loading module Main
Main> :set ascii=on
Main> caesar (2, "ATTACKATDAWN")
"CVVCEMCVFCYP"
SINCE CRYPTOL WAS PURPOSE BUILT WE CAN MAKE BETTER MODELS
WE ALSO GET AN SMT SOLVER ALONG WITH QUICKCHECK
USE CHECK WHEN DEVELOPING, THEN PROVE WHEN YOU REALLY WANT TO TEST
validMessage : {n} (fin n) => String n -> Bit
validMessage = all (\c -> elem (c, ['A' .. 'Z']))

property caesarCorrect (d,msg) =
  if validMessage msg
  then dCaesar(d, caesar(d, msg)) == msg
  else True

Cryptol> :prove caesarCorrect : ([8], String(10)) -> Bit
Q.E.D.
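The round-trip property says that decrypting an encrypted message returns the original whenever the message is made of upper-case letters. Since the cipher treats each letter independently, a minimal C sketch can check the same idea for every key in the 8-bit range and every letter. The slides do not show `dCaesar`, so the decryption below is an assumption (a shift by the negated key), and all function names are hypothetical.

```c
#include <stddef.h>

/* Shift one upper-case ASCII letter by k positions (k may be negative).
   Any other byte is passed through unchanged. */
char shift_char(char c, int k) {
    if (c < 'A' || c > 'Z') {
        return c;
    }
    int offset = ((c - 'A') + k) % 26;
    if (offset < 0) {
        offset += 26; /* C's % can yield a negative remainder */
    }
    return (char)('A' + offset);
}

/* Encrypt a NUL-terminated string into out (same capacity as in). */
void caesar_encrypt(int k, const char *in, char *out) {
    size_t i = 0;
    for (; in[i] != '\0'; i++) {
        out[i] = shift_char(in[i], k);
    }
    out[i] = '\0';
}

/* Assumed inverse of caesar_encrypt: shift back by the same key. */
void caesar_decrypt(int k, const char *in, char *out) {
    caesar_encrypt(-k, in, out);
}

/* Count (key, letter) pairs for which decrypt(encrypt(c)) != c. */
int caesar_roundtrip_failures(void) {
    int failures = 0;
    for (int k = 0; k < 256; k++) {
        for (char c = 'A'; c <= 'Z'; c++) {
            if (shift_char(shift_char(c, k), -k) != c) {
                failures++;
            }
        }
    }
    return failures;
}
```

This covers only 256 x 26 cases, which is enough here because of the per-letter structure. The Cryptol proof makes no such structural assumption and covers every 10-character message directly.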
WE NOW HAVE A BETTER WAY TO MODEL, BUT WHAT ABOUT THE INTEROP?
SOFTWARE ANALYSIS WORKBENCH (SAW)
Another Galois creation that provides formal verification tools for C, Java, and Cryptol
A BASIC EXAMPLE
int add_reference(int x, int y) {
  return x + y;
}

int add_with_bug(int x, int y) {
  if (x == 5) return 42;
  return x + y;
}
$ clang -c -emit-llvm add.c -o add.bc
ANALYZE WITH SAW
l <- llvm_load_module "add.bc";

print "Extract terms";
add_ref <- llvm_extract l "add_reference" llvm_pure;
add_bug <- llvm_extract l "add_with_bug" llvm_pure;

print "Find bug via SAT search";
let {{ thm1 x y = add_ref x y != add_bug x y }};
result <- sat abc {{ thm1 }};
print result;

print "Find bug via failed proof";
let {{ thm2 x y = add_ref x y == add_bug x y }};
result <- prove abc {{ thm2 }};
print result;
LET SAW FIND THE BUGS
$ saw add.saw
Loading module Cryptol
Loading file "add.saw"
Loading llvm bytecode
Extract terms
Find bug via SAT search
Sat: [("x",5),("y",35)]
Find bug via failed proof
Invalid: [("x",5),("y",35)]
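The reported counterexample can be replayed directly against the two C functions. The sketch below copies them from the slide and adds two hypothetical helpers, `disagree` and `count_disagreements`, to show how narrow the bug is: only x = 5 triggers it, and even then y = 37 hides it because 5 + 37 is 42 anyway.

```c
/* Copied from the slide: the reference and the buggy variant. */
int add_reference(int x, int y) {
    return x + y;
}

int add_with_bug(int x, int y) {
    if (x == 5) return 42;
    return x + y;
}

/* Returns 1 when the two functions disagree on (x, y), 0 otherwise. */
int disagree(int x, int y) {
    return add_reference(x, y) != add_with_bug(x, y);
}

/* Count the x values in [lo, hi] that expose the bug for a fixed y. */
int count_disagreements(int lo, int hi, int y) {
    int count = 0;
    for (int x = lo; x <= hi; x++) {
        if (disagree(x, y)) {
            count++;
        }
    }
    return count;
}
```

A single triggering value of x out of the whole `int` range is the kind of case that random testing is unlikely to hit, while a solver can find it by reasoning about the branch condition.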
SPRINKLE IN SOME JAVA
public class Add {
  public int add_reference(int x, int y) {
    return x + y;
  }

  public int add_with_bug(int x, int y) {
    if (x == 7) return 42;
    return x + y;
  }
}
$ javac -g Add.java
MIX THE LANGUAGE TERMS WITH SAW
j <- java_load_class "Add";

print "Extract java terms";
java_add_ref <- java_extract j "add_reference" java_pure;
java_add_bug <- java_extract j "add_with_bug" java_pure;

print "C reference, Java bug";
let {{ thm3 x y = add_ref x y == java_add_bug x y }};
result <- prove abc {{ thm3 }};
print result;

print "Java reference, C bug";
let {{ thm4 x y = java_add_ref x y == add_bug x y }};
result <- prove abc {{ thm4 }};
print result;
FIND BUGS ACROSS LANGUAGES
Extract java terms
C reference, Java bug
Invalid: [("x",7),("y",33)]
Java reference, C bug
Invalid: [("x",5),("y",35)]
NOW WE HAVE A BETTER MODEL AND MORE POWERFUL INTEROP*
A MORE SUITABLE EXAMPLE
LET'S TAKE A CRYPTOL REFERENCE FOR AES
aesEncrypt : ([128], [AESKeySize]) -> [128]
aesEncrypt (pt, key) = stateToMsg (AESFinalRound (k, rounds ! 0))
  where (kInit, ks, k) = ExpandKey key
        state0 = AddRoundKey(kInit, msgToState pt)
        rounds = [state0] # [ AESRound (rk, s) | rk <- ks
                                                | s <- rounds
                            ]

aesDecrypt : ([128], [AESKeySize]) -> [128]
aesDecrypt (ct, key) = stateToMsg (AESFinalInvRound (k, rounds ! 0))
  where (k, ks, kInit) = ExpandKey key
        state0 = AddRoundKey(kInit, msgToState ct)
        rounds = [state0] # [ AESInvRound (rk, s) | rk <- reverse ks
                                                   | s <- rounds
                            ]
AND A C IMPLEMENTATION FOR AES
void aes128BlockEncrypt(const SWord32 *pt, const SWord32 *key, SWord32 *ct) {
  // No way this is going to fit on a slide...
}
AND VERIFY THE IMPLEMENTATION MEETS THE REFERENCE
import "AES.cry";
let {{
  aesExtract x = aesEncrypt (pt,key)
    where [pt,key] = split (join (reverse (groupBy`{32} x)))
}};

let main = do {
  f <- load_aig "aes.aig";
  g <- bitblast {{ aesExtract }};

  print "Checking equivalence";
  res <- cec f g;
  print res;

  write_aig "aes-ref.aig" {{ aesExtract }};
};
THE COMBINATIONAL EQUIVALENCE CHECK FINDS NETWORK BEHAVIOR MISMATCHES
$ saw aes.saw
Loading module Cryptol
Loading file "aes.saw"
Loading module AES
Loading LLVM implementation
Bitblasting Cryptol implementation
Checking equivalence
Invalid: 11566282233780696786424107357608548 2245194316165030070846998742492792053295122
WE MADE A COUPLE OF LEAPS THERE
SAW CALCULATES AN AND-INVERTER GRAPH (AIG) REPRESENTATION
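An and-inverter graph expresses a Boolean function using only two-input AND nodes and optional inversion on the edges. As a minimal sketch of the idea (not SAW's internal representation), the C code below encodes XOR this way. It borrows the literal numbering used by the AIGER file format, where a literal is 2 * index plus an inversion bit; the type and function names are hypothetical.

```c
/* A literal is 2 * index + inv, where inv = 1 means "inverted".
   Index 0 is the constant false, so literal 0 is false and 1 is true. */
typedef unsigned lit_t;

typedef struct {
    lit_t left;
    lit_t right;
} and_node;

enum { NUM_INPUTS = 2, NUM_ANDS = 3 };

/* XOR(a, b) = NOT( NOT(a AND NOT b) AND NOT(NOT a AND b) ).
   Inputs: a is index 1 (literal 2), b is index 2 (literal 4).
   AND nodes take indices 3, 4, 5 (literals 6, 8, 10). */
static const and_node xor_graph[NUM_ANDS] = {
    { 2, 5 }, /* index 3: a AND NOT b */
    { 3, 4 }, /* index 4: NOT a AND b */
    { 7, 9 }, /* index 5: NOT n3 AND NOT n4 */
};
static const lit_t xor_output = 11; /* NOT n5 */

/* Look up the value of a literal, applying its inversion bit. */
static int lit_value(const int *value, lit_t l) {
    return value[l >> 1] ^ (int)(l & 1u);
}

/* Evaluate the graph on one pair of input bits. */
int eval_xor_aig(int a, int b) {
    int value[1 + NUM_INPUTS + NUM_ANDS];
    value[0] = 0;
    value[1] = a & 1;
    value[2] = b & 1;
    for (int i = 0; i < NUM_ANDS; i++) {
        value[1 + NUM_INPUTS + i] =
            lit_value(value, xor_graph[i].left) &
            lit_value(value, xor_graph[i].right);
    }
    return lit_value(value, xor_output);
}
```

Because every circuit reduces to the same two primitives, two very different programs can be compared once both are in this form.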
YOU CAN INVOKE THE LOGIC SYNTHESIS TOOL ON THE MODELS
$ abc
UC Berkeley, ABC 1.01 (compiled Mar 8 2015 01:00:49)
abc 01> cec aes.aig aes-ref.aig
Networks are NOT EQUIVALENT.
Verification failed for at least 64 outputs: po000 po005 po007 ...
Output po000: Value in Network1 = 0. Value in Network2 = 1.
Input pattern: pi000=0 pi128=0 pi001=1 pi129=0 pi002=0 pi130=0 pi003=0 pi131=0
pi004=1 pi132=0 pi005=0 pi133=0 pi006=0 pi134=0
... output elided ...
pi092=0 pi220=0 pi093=0 pi221=0 pi094=0 pi222=0 pi095=0 pi223=0
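A combinational equivalence check asks whether any input pattern makes two circuits produce different outputs, and reports that pattern when one exists. The C sketch below shows the idea by brute force on a 16-bit parity circuit. It is an illustration under that assumption only: ABC does not enumerate inputs like this, and the `parity_*` functions and `find_mismatch` are hypothetical names.

```c
#include <stdint.h>

typedef uint8_t (*circuit_fn)(uint16_t inputs);

/* Reference: XOR all 16 input bits together, one at a time. */
uint8_t parity_reference(uint16_t x) {
    uint8_t p = 0;
    for (int i = 0; i < 16; i++) {
        p ^= (uint8_t)((x >> i) & 1u);
    }
    return p;
}

/* Optimized: fold the word in half repeatedly. */
uint8_t parity_folded(uint16_t x) {
    x ^= x >> 8;
    x ^= x >> 4;
    x ^= x >> 2;
    x ^= x >> 1;
    return (uint8_t)(x & 1u);
}

/* Buggy: the final fold is missing, so odd-numbered bits are ignored. */
uint8_t parity_buggy(uint16_t x) {
    x ^= x >> 8;
    x ^= x >> 4;
    x ^= x >> 2;
    return (uint8_t)(x & 1u);
}

/* XOR the two outputs for every input pattern. Returns -1 when the
   circuits agree everywhere, otherwise the first differing pattern. */
long find_mismatch(circuit_fn f, circuit_fn g) {
    for (uint32_t in = 0; in <= 0xFFFFu; in++) {
        if ((f((uint16_t)in) ^ g((uint16_t)in)) != 0) {
            return (long)in;
        }
    }
    return -1;
}
```

Comparing `parity_reference` with `parity_folded` finds no mismatch, while comparing it with `parity_buggy` returns a concrete input pattern, much like the `Input pattern:` line in the ABC output.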
THESE ARE INCREDIBLY POWERFUL TOOLS AT YOUR DISPOSAL
LEARNING HOW TO USE THEM PROPERLY WILL TAKE SOME TIME
LET'S FLIP OUR VIEW A BIT
AND THINK ABOUT THE OUTSIDE IN
CRYPTOGRAPHIC PROTOCOL SHAPES ANALYZER (CPSA)
A tool for analyzing protocol executions and searching through possible execution states to determine the soundness of a given protocol
WE COULD ANALYZE CRYPTOGRAPHIC PROTOCOLS, BUT LET'S DO SOMETHING A LITTLE DIFFERENT
AUTHENTICATION
IMAGINE WE HAVE A CENTRALIZED AUTHENTICATION SERVICE AND TWO OTHER HTTP SERVICES
FIRST, WE DEFINE OUR AUTH SERVICE
(defprotocol authenticate basic
  (defrole auth-service
    (vars (auth-service user name) (time method text) (ssl skey))
    (trace
      (recv (enc user (pass user) ssl))
      (send (enc (token user time method auth-service) ssl)))
    (uniq-orig (uuid user)))
THEN OUR SERVICES
  (defrole service
    (vars (auth-service user name) (time method req resp text) (ssl skey))
    (trace
      (recv (enc req (token user time method auth-service) ssl))
      (send (enc resp ssl)))
    (uniq-orig resp))
AND FINALLY OUR END USER
  (defrole end-user
    (vars (auth-service user name)
          (time method req0 resp0 req1 resp1 text)
          (ssl0 ssl1 ssl2 skey))
    (trace
      ;; authenticate
      (send (enc user (pass user) ssl0))
      (recv (enc (token user time method auth-service) ssl0))
      ;; request first protected resource (resp0) from service s0
      (send (enc req0 (token user time method auth-service) ssl1))
      (recv (enc resp0 ssl1))
      ;; request another protected resource (resp1) from service s1
      (send (enc req1 (token user time method auth-service) ssl2))
      (recv (enc resp1 ssl2)))
    (uniq-orig (pass user))))
NOW WE DEFINE THE SKELETONS
FIRST, WE DEFINE THE AUTH SERVICE AND END USER INTERACTIONS
(defskeleton authenticate
  (vars (a u s0 s1 name) (req0 req1 resp0 resp1 text))
  (defstrand auth-service 2 (auth-service a) (user u) (ssl (ltk a u)))
  (defstrand end-user 6 (auth-service a) (user u)
    (ssl0 (ltk a u)) (ssl1 (ltk s0 u)) (ssl2 (ltk s1 u))
    (req0 req0) (resp0 resp0) (req1 req1) (resp1 resp1))
NEXT, OUR SERVICE INTERACTIONS
  (defstrand service 2 (auth-service a) (user u) (ssl (ltk s0 u))
    (req req0) (resp resp0))
  (defstrand service 2 (auth-service a) (user u) (ssl (ltk s1 u))
    (req req1) (resp resp1))
AND FINALLY THE MOST IMPORTANT PART
  ;; assume SSL is not broken
  (non-orig (ltk a u) (ltk s0 u) (ltk s1 u))
  ;; assume the Auth Service signing key remains uncompromised
  (non-orig (privk "sig" a)))
RUNNING IT SHOWS THAT OUR SERVICE IS INTACT
BUT IF WE "BREAK" SSL
  ;; assume SSL is not broken
  ;;(non-orig (ltk a u) (ltk s0 u) (ltk s1 u))
  ;; assume the Auth Service signing key remains uncompromised
  (non-orig (privk "sig" a)))
THINGS GET A LITTLE MORE INTERESTING
SUMMARY
THERE HAVE BEEN LOTS OF INNOVATIONS IN THIS SPACE OVER THE PAST FEW YEARS
THE TOOLS BRING THESE IDEAS WITHIN REACH
YOU CAN DO THE DILIGENCE TO MAKE SURE CRITICAL COMPONENTS ARE DESIGNED AND BUILT CORRECTLY
REFERENCES
- saw.galois.com
- cryptol.net
- github.com/GaloisInc/llvm-verifier
- www.mitre.org/publications/technical-papers/completeness-of-cpsa