lundnet: robust jet identification using graph neural networks...lundrepresentationofajet i...

29
LundNet: Robust jet identification using graph neural networks 4th IML Machine Learning Workshop, CERN, 22 October 2019 Frédéric Dreyer arXiv:20xx.xxxxx work in progress with Huilin Qu

Upload: others

Post on 28-Jan-2021

2 views

Category:

Documents


0 download

TRANSCRIPT

  • LundNet: Robust jet identificationusing graph neural networks4th IML Machine Learning Workshop, CERN, 22 October 2019

    Frédéric Dreyer

    arXiv:20xx.xxxxxwork in progress with Huilin Qu

    http://arxiv.org/abs/

  • Boosted objects at the LHC

    I At LHC energies, EW-scale particles (W/Z/t. . . ) are often producedwith ?C �

  • Boosted objects at the LHC

    I At LHC energies, EW-scale particles (W/Z/t. . . ) are often producedwith ?C �

  • Lund diagrams

    I Lund diagrams in the (ln I�, ln�)plane are a very useful way ofrepresenting emissions.

    I Different kinematic regimes areclearly separated, used to illustratebranching phase space in partonshower Monte Carlo simulations andin perturbative QCD resummations.

    I Soft-collinear emissions are emitteduniformly in the Lund plane

    3F2 ∝ B3I

    I

    3��

    Primary Lund-plane regions

    soft-collinear

    hard-collinear (largez)

    ISR(larg

    eΔ)

    non-pert. (small kt)

    MPI/UE

    ln(R/Δ)ln(k

    t/GeV)

    Frédéric Dreyer 2/17

  • Lund diagrams

    Features such as mass, angle and momentum can easily be read from aLund diagram.

    jet mass ≡

  • Lund diagrams for substructure

    Substructure algorithms can often also be interpreted as cuts in the Lundplane.

    [Dasgupta, Fregoso, Marzani, Salam JHEP 1309 (2013) 029]

    Frédéric Dreyer 4/17

    https://arxiv.org/abs/1307.0007

  • Studying jets in the Lund plane

    Lund diagrams can provide a useful approach to study a range of jet-relatedquestions

    I First-principle calculations of Lund-plane variables.I Constrain MC generators, in the perturbative and non-perturbative

    regions.I Brings many soft-drop related observables into a single framework.I Impact of medium interactions in heavy-ion collisions.I Experimental measurements and precision comparison to data.I Basis of generative models for fast simulations.I Boosted object tagging using Machine Learning methods.

    Frédéric Dreyer 5/17

  • Lund plane representation

    To create a Lund plane representation of a jet, recluster a jet 9 with theCambridge/Aachen algorithm then decluster the jet following the hardestbranch.

    1. Undo the last clustering step, defining two subjets 91 , 92ordered in ?C .

    2. Save the kinematics of the current declusteringΔ ≡ (H1 − H2)2 + ()1 − )2)2 , :C ≡ ?C2Δ,

  • Lund plane representation

    ln k

    (c)

    ln k

    t

    ln k

    t

    LU

    ND

    DIA

    GR

    AM

    PR

    IM

    AR

    Y L

    UN

    D P

    LA

    NE

    JE

    T

    (b)

    (a)

    (b)

    (a) (c)

    tt

    ln k

    (c)

    ln 1/ ∆

    ln 1/ ∆ ln 1/ ∆

    ln 1/ ∆

    (b)

    (c)

    (b)

    (b)(b)

    (c)

    Frédéric Dreyer 6/17

  • Lund representation of a jet

    I Each jet has an imageassociated with its primarydeclustering.

    I For a C/A jet, Lund plane is filledleft to right as we progressthrough declusterings of hardestbranch.

    I Additional information such asazimuthal angle # can beattached to each point. 0 1 2 3 4 5 6 7 8ln(R/ )

    4

    2

    0

    2

    4

    6

    ln(k

    t/GeV

    )

    Lund image for a 2 TeV QCD jet

    Frédéric Dreyer 7/17

  • Lund representation of a jet

    I Each jet has an imageassociated with its primarydeclustering.

    I For a C/A jet, Lund plane is filledleft to right as we progressthrough declusterings of hardestbranch.

    I Additional information such asazimuthal angle # can beattached to each point. 0 1 2 3 4 5 6 7 8ln(R/ )

    4

    2

    0

    2

    4

    6

    ln(k

    t/GeV

    )

    Lund image for a 2 TeV QCD jet

    Frédéric Dreyer 7/17

  • Lund representation of a jet

    I Each jet has an imageassociated with its primarydeclustering.

    I For a C/A jet, Lund plane is filledleft to right as we progressthrough declusterings of hardestbranch.

    I Additional information such asazimuthal angle # can beattached to each point. 0 1 2 3 4 5 6 7 8ln(R/ )

    4

    2

    0

    2

    4

    6

    ln(k

    t/GeV

    )

    Lund image for a 2 TeV QCD jet

    Frédéric Dreyer 7/17

  • Lund representation of a jet

    I Each jet has an imageassociated with its primarydeclustering.

    I For a C/A jet, Lund plane is filledleft to right as we progressthrough declusterings of hardestbranch.

    I Additional information such asazimuthal angle # can beattached to each point. 0 1 2 3 4 5 6 7 8ln(R/ )

    4

    2

    0

    2

    4

    6

    ln(k

    t/GeV

    )

    Lund image for a 2 TeV QCD jet

    Frédéric Dreyer 7/17

  • Lund representation of a jet

    I Each jet has an imageassociated with its primarydeclustering.

    I For a C/A jet, Lund plane is filledleft to right as we progressthrough declusterings of hardestbranch.

    I Additional information such asazimuthal angle # can beattached to each point. 0 1 2 3 4 5 6 7 8ln(R/ )

    4

    2

    0

    2

    4

    6

    ln(k

    t/GeV

    )

    Lund image for a 2 TeV QCD jet

    Frédéric Dreyer 7/17

  • Lund representation of a jet

    I Each jet has an imageassociated with its primarydeclustering.

    I For a C/A jet, Lund plane is filledleft to right as we progressthrough declusterings of hardestbranch.

    I Additional information such asazimuthal angle # can beattached to each point. 0 1 2 3 4 5 6 7 8ln(R/ )

    4

    2

    0

    2

    4

    6

    ln(k

    t/GeV

    )

    Lund image for a 2 TeV QCD jet

    Frédéric Dreyer 7/17

  • Lund representation of a jet

    I Each jet has an imageassociated with its primarydeclustering.

    I For a C/A jet, Lund plane is filledleft to right as we progressthrough declusterings of hardestbranch.

    I Additional information such asazimuthal angle # can beattached to each point. 0 1 2 3 4 5 6 7 8ln(R/ )

    4

    2

    0

    2

    4

    6

    ln(k

    t/GeV

    )

    Lund image for a 2 TeV QCD jet

    Frédéric Dreyer 7/17

  • Jets as Lund images

    Average over declusterings of hardest branch for 2 TeV QCD jets.

    0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0ln(R/ )

    2

    1

    0

    1

    2

    3

    4

    5

    6

    7

    ln(k

    t/GeV

    )

    Pythia8.230(Monash13)s = 14 TeV, pt > 2 TeV

    QCD jets, averaged primary Lund plane

    0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9( , kt)

    0

    0.05

    0.1

    0.15

    0.2

    0.25

    0.02 0.2 0.5 0.01 0.1 1

    16.4

  • Jets as Lund images

    Average over declusterings of hardest branch for 2 TeV QCD jets.

    0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0ln(R/ )

    2

    1

    0

    1

    2

    3

    4

    5

    6

    7

    ln(k

    t/GeV

    )

    Pythia8.230(Monash13)s = 14 TeV, pt > 2 TeV

    QCD jets, averaged primary Lund plane

    0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9( , kt)

    Perturbative

    Non-perturbative– – – – – – – – –

    Primary Lund-plane regions

    soft-collinear

    hard-collinear (largez)

    ISR(larg

    eΔ)

    non-pert. (small kt)

    MPI/UE

    ln(R/Δ)ln(k

    t/GeV)

    Non-perturbative region clearly separated from perturbative one.Frédéric Dreyer 8/17

  • Jets as Lund images

    Average over declusterings of hardest branch for 2 TeV QCD jets.

    0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0ln(R/ )

    2

    1

    0

    1

    2

    3

    4

    5

    6

    7

    ln(k

    t/GeV

    )

    Pythia8.230(Monash13), partons = 14 TeV, pt > 2 TeV

    QCD jets, averaged primary Lund plane

    0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9( , kt)

    Perturbative

    Non-perturbative– – – – – – – – –

    Primary Lund-plane regions

    soft-collinear

    hard-collinear (largez)

    ISR(larg

    eΔ)

    non-pert. (small kt)

    MPI/UE

    ln(R/Δ)ln(k

    t/GeV)

    Non-perturbative region clearly separated from perturbative one.

    hadron+MPI→ parton

    Frédéric Dreyer 8/17

  • Measurement of the Lund plane

    Lund images provide an opportunity for experimental measurements andcomparisons with theory

    Frédéric Dreyer 9/17

  • Lund images for QCD and W jets

    I Hard splittings clearly visible, along the diagonal line with jet mass< = 2 TeV

    QCD jets, averaged primary Lund plane

    0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9( , kt)

    0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0ln(R/ )

    2

    1

    0

    1

    2

    3

    4

    5

    6

    7

    ln(k

    t/GeV

    )

    Pythia8.230(Monash13)s = 14 TeV, pt > 2 TeV

    W jets, averaged primary Lund plane

    0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9S( , kt)

    Frédéric Dreyer 10/17

  • Boosted] tagging

    I LL approach alreadyprovides substantialimprovement overbest-performing substructureobservable.

    I LSTM network substantiallyimproves on results obtainedwith other methods.

    I Large gain in performance,particularly at higherefficiencies.

    0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0125

    102050

    100200500

    100020005000

    1/QC

    D

    Pythia8(Monash13)hadron+MPI

    Delphes+SPRA1pt > 2 TeV

    QCD rejection v. W efficiency

    Jet Image+CNNLund Image+CNNLund+DNNLund+LSTMLund+likelihoodD[loose]2 +BDT

    0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0W

    0.20.30.50.7

    11.52

    3ra

    tio to

    Lun

    d+lo

    g.lik

    .

    Frédéric Dreyer 11/17

  • Using the full jet information

    I Performance can be improved further by taking secondary Lund planesinto account, particularly relevant for top tagging.

    I Dynamic Graph CNN based methods perform particularly well, treatingthe full Lund diagram as a vertices on a graph.

    [Wang, Sun, Liu, Sarma, Bronstein, Solomon arXiv:1801.07829][Qu, Gouskos, arXiv:1902.08570]

    Frédéric Dreyer 12/17

    https://arxiv.org/abs/1801.07829https://arxiv.org/abs/1902.08570

  • LundNet models

    We study two separate models, one which optimises performance and onedesigned to reduce the complexity of the model and improve its robustness.

    In both cases, the graph network starts by constructing a graph for a jet,with each node corresponding to a Lund declustering.

    FullLundNet:

    The kinematic input for each node is (ln I, lnΔ,#, ln

  • Boosted] tagging revisited

    I Graph-based methodsoutperform our previousbenchmark significantly.

    I The FullLundNet modelprovides a slightimprovement overParticleNet.

    I FastLundNet achievesalmost the sameperformance with alightweight and robust model. 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

    W

    1

    10

    100

    1000

    10000

    1/QC

    D

    Pythia 8.223 simulationsignal: pp WW, background: pp jj

    anti-kt R = 1 jets, pt > 500 GeV

    QCD rejection v. W tagging efficiency

    RecNN (Louppe et al. '17)Lund+LSTM (Dreyer et al. '18)FastLundNetFullLundNetParticleNet (Gouskos et al. '19)

    Frédéric Dreyer 14/17

  • Boosted top tagging

    I For top tagging primarydeclustering sequencedoesn’t capture the fullsubstructure information.

    I Can achieve large gains inperformance by taking intoaccount the full tree.

    I FullLundNet model performsparticularly well, showingvast improvement overprevious benchmarks. 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

    Top

    1

    10

    100

    1000

    10000

    1/QC

    D

    Pythia 8.223 simulationsignal: pp tt, background: pp jj

    anti-kt R = 1 jets

    pt > 500 GeV

    QCD rejection v. Top tagging efficiency

    RecNN (Louppe et al. '17)Lund+LSTM (Dreyer et al. '18)FastLundNetFullLundNetParticleNet (Gouskos et al. '19)

    Frédéric Dreyer 15/17

  • Robustness to non-perturbative effectsI Performance compared to resilience to MPI and hadronisation corrections.I Vary cut on :C , which reduces sensitivity to the non-perturbative region.

    5

    50

    10

    1 10

    anti kt R=1 jets, pt>2 TeV, εW=0.4signal: pp->WW, background: pp->jj

    Pythia 8.223 simulation

    perf

    orm

    ance

    resilience to non-perturbative effects

    performance v. resilience

    Lund+LSTM (Dreyer et al 18)FastLundNetFullLundNet

    ParticleNet (Gouskos et al. 19)RecNN (Louppe et al. 17)

    Δ& = & − &′

    � =©«Δ&2

    (

    〈&〉2(

    +Δ&2

    〈&〉2�

    ª®®¬− 12

    〈&〉 = 12 (& + &′)

    I FullLundNet & ParticleNetreach high performance,but are not particularlyresilient to NP effects.

    I FastLundNet modelachieves high resiliencewhile retaining most of theperformance.

    (c.f. arXiv:1803.07977)

    & ,√& Q

    CD

    Frédéric Dreyer 16/17

    https://arxiv.org/abs/1803.07977

  • CONCLUSIONS

  • Conclusions

    I Jet substructure provides a unique practical playground for recentdevelopments in machine learning.

    I Discussed a new way to study and exploit radiation patterns in a jetusing the Lund plane.

    I Introduced two models, FullLundNet and FastLundNet, which canachieve state-of-the-art performance on jet tagging benchmarks.

    I Combination of physical insight and machine learning allows formodels that combine performance and robustness.

    Frédéric Dreyer 17/17

    Lund images and CNNsConclusions