Checking the Boundaries of Static Analysis

Upload: thomasdullien

Post on 14-Apr-2018


  • 7/29/2019 Checking the Boundaries of Static Analysis

    1/77

    Checking the boundaries of static analysis

    Halvar Flake - Syscan 2013

    Latest slides: http://goo.gl/pcDkV

    2/77

    Introduction

    Survey of some research on static analysis

    This talk: Focus on two big different flavors: Abstract Interpretation and SMT-centric analysis

    Convergence of the approaches

    3/77

    Abstract Interpretation

    SMT-based analysis techniques

    4/77

    Topics (Part 1)

    Introduction to SMT-based analyses and abstract interpretation

    Common weaknesses of AI implementations

    Ideal world: How they could be fixed

    Possible improvements through SMT solvers

    5/77

    Topics (Part 2)

    What bugs are hard to find ?

    What bugs are easy to find ?

    Why are browsers the "perfect storm" for static analyzers ?

    6/77

    Warning (Part 1)

    Topic requires a lot of machinery

    Machinery ~ rope when rock-climbing

    Needed to climb safely

    Talk will focus on describing the vistas, not on teaching detailed knotting techniques

    7/77

    Warning (Part 2)

    Survey talk

    "Other people's work"

    Incomplete bibliography at the end

    8/77

    SMT-based analyses

    Idea: Generate set of equations from code

    Find solutions to equations

    9/77

    SMT Solvers - Lazy

    off-the-shelf SAT to find assignment for boolean "skeleton"

    subsolvers deal with individual clauses

    combination of solvers for various theories

    "bitvectors", "arrays", etc.

    10/77

    SMT Solvers

    Lots of progress

    SAT revolution + IT Security + MSR

    z3

    Whitebox fuzzing / input crafting at MS: Solvers very mature, solve huge formulas

    11/77

    SMT Solvers

    Formulas based on concrete paths through the program

    Size of formulas dependent on number of operations in trace

    Heavily influenced by quantity of memory writes

    12/77

    SMT Solvers in practice

    Great black-box oracle that will say "one solution is X" or "no solution" to given equation

    Ability to solve complex equations quite impressive

    Translation from most IRs to SMT input format near-trivial

    13/77

    Abstract Interpretation

    "In line X of the program, the property Y will always be true."

    Derives such statements in a structured way

    How does it work ?

    14/77

    Concrete Execution

    Input state: a = 4, b = 10

    a = a + b;

    Output state: a = 14, b = 10

    15/77

    Concrete Execution

    [Figure: the statement a = a + b maps a set of input states (e.g. a = 4, b = 10) to a set of output states (e.g. a = 14, b = 10)]

    16-19/77

    Concrete vs. Abstract Interpretation

    [Figure, built up over four slides: a = a + b maps a set of input states (e.g. a = 4, b = 10) to a set of output states (e.g. a = 14, b = 10). Abstraction maps sets of states to abstract domain elements, the abstract transform maps the input abstract domain element to the output abstract domain element, and concretization maps an abstract domain element back to a set of states.]

    20/77

    Abstract vs. concrete domains

    Compute on "simpler" structures.

    Need "join" ("set union") and "intersect".

    Translate effects of code to abstract domain. Call them "transforms".

    21/77

    Example: Intervals

    Incoming set of states contains 990 elements:

    (a = 20, b = 5),

    (a = 22, b = 6),

    ...,

    (a = 2000, b = 5)

    Approximate with interval: (a is in [20...2000], b is in [5...6])

    Intervals can be joined and intersected.
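
    The interval abstraction above can be sketched in a few lines. This is a minimal illustration, not code from the talk: the names alpha, join and meet are assumptions, and the concrete state set is a hypothetical one built to match the first and last states shown on the slide.

```python
from typing import Iterable, Optional, Tuple

Interval = Tuple[int, int]  # closed interval (lo, hi)


def alpha(values: Iterable[int]) -> Interval:
    """Abstraction: the smallest interval containing every value."""
    vals = list(values)
    return (min(vals), max(vals))


def join(x: Interval, y: Interval) -> Interval:
    """Over-approximates set union: smallest interval covering both."""
    return (min(x[0], y[0]), max(x[1], y[1]))


def meet(x: Interval, y: Interval) -> Optional[Interval]:
    """Intersection; None stands for the empty interval."""
    lo, hi = max(x[0], y[0]), min(x[1], y[1])
    return (lo, hi) if lo <= hi else None


# Hypothetical concrete states: a = 20, 22, ..., 2000 with b alternating 5, 6.
states = [(a, 5 + (i % 2)) for i, a in enumerate(range(20, 2001, 2))]

abs_a = alpha(a for a, _ in states)
abs_b = alpha(b for _, b in states)
print(abs_a, abs_b)  # (20, 2000) (5, 6)
```

    Note how much is forgotten: the interval for a also admits odd values, which never occur in the concrete set.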

    22/77

    Compute on abstract domain

    a in [20..2000], b in [5..6]

    a = a + b;

    a in [25..2006], b in [5..6]
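
    A minimal sketch of this abstract transform, assuming states are stored as a dict from variable name to interval (a representation chosen here for illustration only). The final check enumerates the concrete states to confirm that the abstract result covers every concrete result.

```python
from itertools import product


def add_intervals(x, y):
    """Interval addition: add lower bounds and upper bounds."""
    return (x[0] + y[0], x[1] + y[1])


def transform_add(state):
    """Abstract transform for the statement `a = a + b`; b is unchanged."""
    return {"a": add_intervals(state["a"], state["b"]), "b": state["b"]}


def gamma(interval):
    """Concretization: all integers inside the interval."""
    return range(interval[0], interval[1] + 1)


pre = {"a": (20, 2000), "b": (5, 6)}
post = transform_add(pre)
print(post)  # {'a': (25, 2006), 'b': (5, 6)}

# Soundness check: every concrete a + b lies inside the abstract result.
sound = all(
    post["a"][0] <= a + b <= post["a"][1]
    for a, b in product(gamma(pre["a"]), gamma(pre["b"]))
)
print(sound)  # True
```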

    23/77

    "Abstract Interpretation"

    Interpreting (~executing) the program on an abstract domain instead of on a concrete set of states.

    24/77

    Central Questions

    Properties of good abstract domains ?

    What abstract domains can we think of ?

    Intervals, polyhedral domains etc.

    How to lift concrete operations to abstract domains ?

    25/77

    Common pitfalls & problems

    Myopic static analysis

    Clumsy construction of products of domains

    Codependency of intermediate representation and abstract domains

    Summarization of heap cells and implications

    26/77

    Improve with solvers ?

    Myopic static analysis

    Clumsy construction of products of domains

    Codependency of intermediate representation and abstract domains

    Summarization of heap cells and implications

    27/77

    Myopic static analysis

    Sequence of instructions

    Each instruction is lifted to abstract domain

    Each instruction overapproximates concrete operation

    Cascade of imprecision leads to failure

    28/77

    Myopic static analysis

    29/77

    A fancy way of writing NOP

    x1 = x & 0xFF

    x2 = x & 0xFF00

    x3 = x & 0xFF0000

    x4 = x & 0xFF000000

    res = x1

    res = res | x2

    res = res | x3

    res = res | x4

    30-37/77

    [Built up over eight slides: interval analysis of the code above, one statement at a time, starting from x = [0,0x1BEEF]. Each line shows the statement and the interval derived after it.]

    x1 = x & 0xFF            x1 = [0,0xFF]

    x2 = x & 0xFF00          x2 = [0,0xFF00]

    x3 = x & 0xFF0000        x3 = [0,0x10000]

    x4 = x & 0xFF000000      x4 = [0,0]

    res = x1                 res = [0,0xFF]

    res = res | x2           res = [0,0xFFFF]

    res = res | x3           res = [0,0x1FFFF]

    res = res | x4

    The code is a NOP, so the exact result would be res = x = [0,0x1BEEF].
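
    The cascade can be reproduced with a small sketch. The two interval transformers below are one plausible choice, picked because they are sound for non-negative values and yield the same numbers as the slides; they are assumptions, not necessarily the transformers used in the talk.

```python
def and_const(interval, mask):
    """Interval transform for `x & mask` (non-negative x and mask).

    x <= hi means x only uses the low hi.bit_length() bits, so the result
    can only contain mask bits inside that range.
    """
    _, hi = interval
    reachable_bits = (1 << hi.bit_length()) - 1
    return (0, mask & reachable_bits)


def or_intervals(a, b):
    """Interval transform for `a | b` (non-negative operands).

    a | b >= max(a, b), and the result fits in as many bits as the larger
    upper bound, so round that bound up to all-ones.
    """
    lo = max(a[0], b[0])
    hi = (1 << max(a[1], b[1]).bit_length()) - 1
    return (lo, hi)


x = (0, 0x1BEEF)
x1 = and_const(x, 0xFF)
x2 = and_const(x, 0xFF00)
x3 = and_const(x, 0xFF0000)
x4 = and_const(x, 0xFF000000)
res = x1
res = or_intervals(res, x2)
res = or_intervals(res, x3)
res = or_intervals(res, x4)

print([hex(v) for v in x3])   # ['0x0', '0x10000']
print([hex(v) for v in res])  # ['0x0', '0x1ffff']
```

    Every step is sound in isolation, yet the final interval [0,0x1FFFF] is wider than the input interval [0,0x1BEEF], even though the code does not change the value at all.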

    38/77

    Myopic static analysis: Summary

    39/77

    Clumsy product domains

    Multiple abstract domains can be combined

    Often necessary: More than one property needs to be tracked

    Motivating example: Strided intervals for binary analysis

    40/77

    Motivating example: Strided Intervals

    Byte array with intervals in 4-byte memory cells

    Array read via interval index

    x = array[ y ] where y = [0,12]

    Without alignment: x = [0, 0x30000000]

    [0, 0x10] [0, 0x20] [0, 0x30] [0, 50] [0, 100]

    41/77

    Strided Intervals

    Byte array with intervals in 4-byte memory cells

    [0, 0x10] [0, 0x20] [0, 0x30] [0, 50] [0, 100]

    10 00 00 00 20 00 00 00 30 00 00 00

    00 00 00 10

    20 00 00 00

    00 20 00 00

    00 00 20 00

    00 00 00 20

    30 00 00 00

    00 30 00 00

    00 00 30 00

    00 00 00 30

    [0, 0x30000000]

    42/77

    Strided Intervals

    Alignment information helps

    x = array[ y ] where (y = [0,12] and 0 mod 4)

    [0, 0x10] [0, 0x20] [0, 0x30] [0, 50] [0, 100]

    00 00 00 10 00 00 00 20 00 00 00 30

    [0, 0x30]
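
    The effect of alignment can be checked with a small experiment. This sketch makes several simplifying assumptions: memory is little-endian, only the first three cells are read (offsets 0 to 8), and each cell holds the upper bound of its interval as a stand-in for the whole interval.

```python
import struct

# Upper bounds of the first three 4-byte cells: [0,0x10], [0,0x20], [0,0x30].
CELL_MAX = [0x10, 0x20, 0x30]
memory = struct.pack("<3I", *CELL_MAX)  # 12 bytes, little-endian


def read_u32(offset):
    """Read a 4-byte little-endian value at an arbitrary byte offset."""
    return struct.unpack_from("<I", memory, offset)[0]


def upper_bound(offsets):
    """Largest value any read at the given offsets can produce."""
    return max(read_u32(off) for off in offsets)


any_offset = range(0, 9)           # index is only known to be in [0, 8]
aligned_offsets = range(0, 9, 4)   # index in [0, 8] and 0 mod 4

print(hex(upper_bound(any_offset)))       # 0x30000000
print(hex(upper_bound(aligned_offsets)))  # 0x30
```

    Without the stride, reads that straddle two cells shift a small byte into the most significant position, which is where the huge upper bound comes from.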

    43/77

    Clumsy product domains

    Cartesian product is easy to build

    We know how to compute on Intervals

    We know how to compute on Alignment

    Imprecision creeps in

    44-46/77

    Clumsy product domains

    [Built up over three slides: each line shows the statement and the abstract state derived after it.]

    x1 = read_16();           x1 = [0,0xFFFF], 0 mod 1

    x1 &= 0xFFFFFFF0          x1 = [0,0xFFFF], 0 mod 16

    if x1 < 110 go right      x1 = [0,110], 0 mod 16

    x1 += 10

    47/77

    Clumsy product domains

    [0, 110], 0 mod 16 makes no sense

    It is a valid element of the Cartesian product (Interval x Alignment)

    It is not a very sane abstraction from the set of possible states

    Better: [0, 96], 0 mod 16
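
    The reduction step can be sketched as follows. The function name reduce_product and the (remainder, modulus) encoding of "r mod m" are assumptions made for this illustration; the modulus is assumed to be at least 1.

```python
def reduce_product(interval, congruence):
    """Tighten [lo, hi] so both bounds satisfy value % m == r.

    Returns None if no value in the interval satisfies the congruence.
    """
    (lo, hi), (r, m) = interval, congruence
    new_lo = lo + ((r - lo) % m)   # smallest value >= lo with value % m == r
    new_hi = hi - ((hi - r) % m)   # largest value <= hi with value % m == r
    if new_lo > new_hi:
        return None
    return (new_lo, new_hi), (r, m)


print(reduce_product((0, 110), (0, 16)))  # ((0, 96), (0, 16))
```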

    48/77

    Clumsy product domains

    Component-wise transforms make things worse:

    [0, 110] x (0 mod 16) + [10, 10] x (0 mod 1)

    = [10, 120] x (0 mod 1)

    We want to compute on the reduced product

    Rewrite all transforms :-(
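
    To see how much the component-wise transform loses, the sketch below compares it with the exact set of results obtained by enumerating concrete values. The encoding of an element as ((lo, hi), (remainder, modulus)) and the gcd-based congruence addition are assumptions made for this illustration; the modulus is assumed to be at least 1.

```python
from math import gcd


def add_componentwise(x, y):
    """Add interval parts and congruence parts independently."""
    (ilo, ihi), (r1, m1) = x
    (jlo, jhi), (r2, m2) = y
    m = gcd(m1, m2)
    return (ilo + jlo, ihi + jhi), ((r1 + r2) % m, m)


def concretize(elem):
    """All integers inside the interval that satisfy the congruence."""
    (lo, hi), (r, m) = elem
    return {v for v in range(lo, hi + 1) if v % m == r}


x1 = ((0, 110), (0, 16))
ten = ((10, 10), (0, 1))

naive = add_componentwise(x1, ten)
exact = {a + b for a in concretize(x1) for b in concretize(ten)}

print(naive)                   # ((10, 120), (0, 1))
print(sorted(exact))           # [10, 26, 42, 58, 74, 90, 106]
print(len(concretize(naive)))  # 111
```

    The exact result is described by [10, 106] with remainder 10 mod 16, seven values in total, while the component-wise result admits 111 values and has lost the alignment information entirely.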

    49/77

    Clumsy product domains: Summary

    We want to compute on the reduced product

    We can only easily compute on the Cartesian product

    Leads to cascading imprecision

    50/77

    Codependent IRs and domains

    Different parties implement different IRs

    REIL, RREIL, (TSL) etc.

    Intention: Make it "easy" to implement static analysers

    51/77

    Codependent IRs and domains

    ILs and the domains end up "married"

    REIL works for bitvector stuff

    RREIL works for relational operations (intervals etc.)

    TSL similar to REIL (but much more generic) - bitvector oriented

    52/77

    Codependent IRs and domains

    Hard to derive the relational operator ">"

    Proposal by [RREIL]: Translate to language with relational operators

    Useful for certain relational domains

    REIL for "ja":

    t0 = (ZF == 0)

    t1 = (CF == 0)

    t2 = (t0 & t1)

    if t2: goto ...
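
    The relational meaning hidden in those flag operations can be checked by brute force. This sketch assumes the flags were set by an x86-style cmp a, b on 8-bit unsigned values; the helper names are made up for the illustration.

```python
def cmp_flags(a, b, bits=8):
    """ZF and CF after an x86-style `cmp a, b` (a - b) on unsigned values."""
    mask = (1 << bits) - 1
    zf = int(((a - b) & mask) == 0)  # result is zero
    cf = int(a < b)                  # borrow out of the subtraction
    return zf, cf


def ja_taken(zf, cf):
    """The flag-level condition from the slide."""
    t0 = int(zf == 0)
    t1 = int(cf == 0)
    t2 = t0 & t1
    return bool(t2)


# The flag-level condition agrees with unsigned `a > b` for all 8-bit inputs.
agree = all(
    ja_taken(*cmp_flags(a, b)) == (a > b)
    for a in range(256)
    for b in range(256)
)
print(agree)  # True
```

    A relational domain wants the fact a > b directly; from the three bit-level assignments alone it has to rediscover it.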

    53/77

    Codependent IRs and domains

    Design for REIL was influenced by a bitvector-oriented domain

    Design for RREIL was influenced by relational domains

    Frequent symptom: Have domain, IL doesn't work well for domain -- need to adapt IL

    Defeats the purpose of an IL ?

    54/77

    Recap

    Myopic static analysis

    Clumsiness of constructing products of domains

    Codependency of intermediate representation and abstract domains

    Summarization of heap cells and implications

    55/77

    How to improve ? Ideal world

    Assume you have an algorithm which ...

    ... given semantic spec of instructions

    ... given a description of the abstract domain

    computes good / tight transformers for you

    56/77

    How to improve ? Ideal world

    Codependency: Solved

    Clumsy direct product: Also solved. Algorithm would rewrite transformers.

    Myopic analysis: Mostly solved. Treat entire basic block as one instruction. Compute transformers.

    57/77

    Reality

    [GulTi06] and [CoCoMa11] deal with reduced product construction (via Nelson-Oppen-like methods)

    [BraKi10], [KiSo10] and [Reg04] design algorithms for special abstract domains to compute transforms

    [ThaRe12CAV] and [ThaRe12SAS] derive transforms automatically for some generic domains

    Links:

    [GulTi06] http://research.microsoft.com/pubs/70270/comb_pldi06.pdf

    [CoCoMa11] http://software.imdea.org/~mauborgn/publi/fossacs11.pdf

    [BraKi10] http://embedded.rwth-aachen.de/lib/exe/fetch.php?media=bib:bk10.pdf

    [KiSo10] http://crest.cs.ucl.ac.uk/cow/7/slides/AutomaticAbstractionforCongruences-A.King.pdf

    [Reg04] http://www.cs.utah.edu/~regehr/papers/asplos04/

    [ThaRe12CAV] http://www.cs.wisc.edu/wpis/papers/CAV12-Staalmarck.pdf

    [ThaRe12SAS] http://research.cs.wisc.edu/wpis/papers/sas12-bilateral-alphahat.pdf


    58/77

    Practical Home-Brew tips: Myopia

    SMT solvers can be great for "post-basic-block-narrowing"

    Domain-specific - but use SMT as oracle to "narrow down" results

    Example: Bsearch after myopic analysis

    59/77

    x1 = x & 0xFF

    x2 = x & 0xFF00

    x3 = x & 0xFF0000

    x4 = x & 0xFF000000

    res = x1

    res = res | x2

    res = res | x3

    res = res | x4

    x=[0,0x1BEEF], x1=[0,0xFF], x2=[0,0xFF00], x3=[0,0x10000], x4=[0,0], res=[0,0x1FFFF]

    (assert (bvugt res #x0001FFFF))

    (assert (bvugt res #x0000FFFF))

    (assert (bvugt res #x00017FFF))

    (..)

    (assert (bvugt res #x0001BEEF))
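
    The binary search can be sketched without a real solver. In the sketch below, res_can_exceed is a brute-force stand-in for the SMT query "is (bvugt res bound) satisfiable for x in [0,0x1BEEF]"; all function names are assumptions made for this illustration, and an actual implementation would hand each query to an SMT solver instead.

```python
def fancy_nop(x):
    """The basic block from the slides, executed concretely."""
    x1 = x & 0xFF
    x2 = x & 0xFF00
    x3 = x & 0xFF0000
    x4 = x & 0xFF000000
    res = x1
    res = res | x2
    res = res | x3
    res = res | x4
    return res


X_RANGE = (0, 0x1BEEF)


def res_can_exceed(bound):
    """Oracle stand-in: does some x in X_RANGE give res > bound?"""
    lo, hi = X_RANGE
    return any(fancy_nop(x) > bound for x in range(lo, hi + 1))


def narrow_upper_bound(lo, hi):
    """Smallest bound in [lo, hi] that res can never exceed.

    Assumes hi is already a sound upper bound (from the myopic analysis).
    """
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if res_can_exceed(mid):
            lo = mid + 1   # some result lies above mid
        else:
            hi = mid       # mid is still a sound upper bound
    return lo, queries


tight, queries = narrow_upper_bound(0, 0x1FFFF)
print(hex(tight))  # 0x1beef
```

    The number of oracle queries grows with the logarithm of the imprecise interval's width, so the narrowing stays cheap even when the myopic result is far too wide.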

    60/77

    Practical Home-Brew tips: Product Domains

    Hope your solver can operate on both domains

    Use similar narrowing techniques

    Have the solver deal with communication between domains

    Domain-specific, but black-box oracle really helps

    61/77

    Summarization of heap cells

    Heap-allocated objects need to be tracked

    In binary analyses, usually summarized by allocation address

    In source analyses, often summarized by type

    62/77

    Summarization of heap cells

    Common design: "most recently allocated"

    One heap cell to represent the most recently allocated cell of type X

    One heap cell to represent all other instances

    63-64/77

    Summarization of heap cells

    [Figure, shown on two slides: a doubly linked list of heap cells, each with Forward and Back pointers and a value field (0x200, 0x100, 0x400, 0x800, 0x400, 0x200). One cell is labeled "Most recently allocated heap cell". A "Summary Node" with Forward and Back pointers has the value field [0,800].]

    65/77

    Summarization of heap cells

    Problematic for UAF analysis

    Example:

    [Figure: doubly linked list of five nodes; each node has Forward and Back pointers, an object * field and a bIsValid flag]

    66/77

    Summarization of heap cells

    Problematic for UAF analysis

    Example:

    valid *   valid *   invalid *   valid *   valid *

    false   true   true   true   true

    67/77

    Summarization of heap cells

    Problematic for UAF analysis

    Example:

    valid or invalid

    Summary Node

    true or false

    68/77

    Summarization of heap cells

    Fieldwise "join" vs. whole-structure join

    Data structures ~ logically grouped data fields

    Most analyses treat them as "random bag of data fields"

    Summary nodes often have illogical state - leads to more illogical state and false positives
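
    The difference between the two joins can be made concrete with sets. This sketch assumes, hypothetically, that the program maintains the invariant "bIsValid is false exactly when the object pointer is invalid"; the cell values and function names are made up for the illustration.

```python
# Each concrete cell is (pointer state, bIsValid flag).
cells = [
    ("valid", True),
    ("valid", True),
    ("invalid", False),
    ("valid", True),
    ("valid", True),
]


def fieldwise_join(cells):
    """Summarize each field on its own, forgetting how fields pair up."""
    pointers = {ptr for ptr, _ in cells}
    flags = {flag for _, flag in cells}
    return pointers, flags


def concretize_fieldwise(summary):
    """Every pointer/flag combination the field-wise summary admits."""
    pointers, flags = summary
    return {(ptr, flag) for ptr in pointers for flag in flags}


def whole_structure_join(cells):
    """Summarize whole cells, keeping the pairing of fields."""
    return set(cells)


admitted = concretize_fieldwise(fieldwise_join(cells))
spurious = admitted - whole_structure_join(cells)
print(sorted(spurious))  # [('invalid', True), ('valid', False)]
```

    The spurious state ('invalid', True) is the troublesome one for use-after-free analysis: code guarded by a bIsValid check appears to dereference an invalid pointer, which shows up as a false positive.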

    69/77

    AI becomes more solver-driven

    3 of 4 discussed problems can be improved by solver-like techniques

    Solvers are "cheap and really fast undergrad calculators"

    AI community is adopting solvers (both tools and language) - and benefiting

    70/77

    Solvers become AI driven ?

    SMT solving algorithms can be rewritten in terms of abstract interpretation

    Unclear if this will lead to benefits for the solver community

    DPLL vs. Staalmarck ?

    Better solvers through abstraction ?

    71/77

    Abstract Interpretation

    SMT-based analysis techniques

    Solver-augmented AI

    AI-augmented solvers ?

    72/77

    What bugs are hard to find ?

    Analysis of memory-copying loops with multiple internal states

    crackaddr_vuln.c vs. crackaddr_notvuln.c

    Analysis of things that get too imprecise by heap summaries (UAF a good example)

    Analysis of situations with unclear control flow or missing data

    Links:

    https://docs.google.com/file/d/0B5hBKwgSgYFaZXdrdVhSdFNZVWc/edit?usp=sharing

    https://docs.google.com/file/d/0B5hBKwgSgYFaTHJqMlZGRlBsTWM/edit?usp=sharing


    73/77

    What bugs are easy to find ?

    Unencumbered by heap summarization

    Unencumbered by loopy state machines

    Things where the entire code flow is easily determined

    74/77

    What bugs are easy to find ?

    Straight-from-packet or straight-from-userland integer overflows

    Straight-from-packet or straight-from-userland bad API calls

    Straight-from-userland double-fetches

    75/77

    Why are browsers a perfect storm for static analyzers ?

    Extremely large and C++

    Multithreaded

    Mix pointers & values in bizarre ways (lowest 3 bits indicate JS type etc.)
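
    A toy version of such a tagging scheme shows why it hurts numeric domains. The tag values and helper names below are hypothetical and do not describe any particular JavaScript engine; the sketch only assumes that the lowest 3 bits of a machine word carry a type tag.

```python
TAG_BITS = 3
TAG_MASK = (1 << TAG_BITS) - 1

# Hypothetical tag assignment.
TAG_POINTER = 0
TAG_INT = 1


def box_int(value):
    """Store a small integer in the upper bits, tag in the low 3 bits."""
    return (value << TAG_BITS) | TAG_INT


def box_pointer(address):
    """Pointers are 8-byte aligned, so their low 3 bits are free for a tag."""
    assert address & TAG_MASK == 0, "address must be 8-byte aligned"
    return address | TAG_POINTER


def unbox(word):
    """Split a machine word into (tag, payload)."""
    tag = word & TAG_MASK
    payload = word >> TAG_BITS if tag == TAG_INT else word & ~TAG_MASK
    return tag, payload


raw_sum = box_int(2) + box_int(3)             # naive addition of the raw words
fixed_sum = box_int(2) + box_int(3) - TAG_INT  # tag-aware addition

print(unbox(raw_sum))    # (2, 40): the tag is wrong, not a boxed int
print(unbox(fixed_sum))  # (1, 5)
```

    The same word is a pointer or a shifted integer depending on its low bits, so an analysis that tracks only an interval per word cannot tell the two apart.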

    76/77

    Why are browsers a perfect storm for static analyzers ?

    Control-flow attacker-driven through Javascript

    JITed code

    State machines in state machines

    77/77

    That's it :-)

    Questions ?