chap02g-neural network model

Upload: ahmadkhreiss

Post on 05-Jul-2018


  • 8/15/2019 Chap02g-Neural Network Model

    ADVANCED INFORMATION RETRIEVAL

    Chapter 02: Modeling - Neural Network Model

    Neural Network Model

    • A neural network is an oversimplified representation of the neuron interconnections in the human brain:
    • nodes are processing units
    • edges are synaptic connections; the strength of a propagating signal is modelled by a weight assigned to each edge
    • the state of a node is defined by its activation level
    • depending on its activation level, a node might issue an output signal

    Neural Networks

    • Neural Networks: complex learning systems recognized in animal brains
    • Single neuron has simple structure
    • Interconnected sets of neurons perform complex learning tasks
    • Human brain has on the order of 10^15 synaptic connections
    • Artificial Neural Networks attempt to replicate non-linear learning found in nature

    [Diagram of a biological neuron: Dendrites, Cell Body, Axon]

    Neural Networks (cont'd)

    • Dendrites gather inputs from other neurons and combine information
    • Then generate non-linear response when threshold reached
    • Signal sent to other neurons via axon
    • Artificial neuron model is similar
    • Data inputs (x_i) are collected from upstream neurons and input to combination function (sigma)

    [Diagram of an artificial neuron: inputs x_1, x_2, ..., x_n feed the combination function Σ, which produces output y]

    Neural Networks (cont'd)

    • Activation function reads combined input and produces non-linear response (y)
    • Response channeled downstream to other neurons
    • What problems are applicable to Neural Networks?
    • Quite robust with respect to noisy data
    • Can learn and work around erroneous data
    • Results opaque to human interpretation
    • Often require long training times
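    The artificial neuron described on the last two slides can be sketched in a few lines of Python. This is a minimal illustration; the input and weight values are the ones used for Node A in the worked example later in the deck.

    ```python
    import math

    def neuron(inputs, weights, bias):
        # Combination function: weighted sum of inputs plus the constant input
        net = bias + sum(w * x for w, x in zip(weights, inputs))
        # Sigmoid activation turns the combined input into a non-linear response
        return 1.0 / (1.0 + math.exp(-net))

    # Node A values from the worked example later in the deck
    y = neuron([0.4, 0.2, 0.7], [0.6, 0.8, 0.6], bias=0.5)  # about 0.7892
    ```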

    Input and Output Encoding

    • Neural Networks require attribute values encoded to [0, 1]
    • Numeric: apply min-max normalization to continuous variables
    • Works well when Min and Max known
    • Also assumes new data values occur within Min-Max range
    • Values outside range may be rejected or mapped to Min or Max

    X* = (X - min(X)) / (max(X) - min(X)) = (X - min(X)) / range(X)
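    A minimal sketch of min-max normalization in Python (assumes the max and min of the values differ):

    ```python
    def min_max_normalize(values):
        # X* = (X - min(X)) / (max(X) - min(X)); assumes max > min
        lo, hi = min(values), max(values)
        return [(v - lo) / (hi - lo) for v in values]

    min_max_normalize([20, 50, 80])  # [0.0, 0.5, 1.0]
    ```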

    Input and Output Encoding (cont'd)

    • Output: Neural Networks always return continuous values in [0, 1]
    • Many classification problems have two outcomes
    • Solution uses threshold established a priori in single output node to separate classes
    • For example, target variable is "leave" or "stay"
    • Threshold value is "leave if output >= 0.67"
    • Single output node value of 0.72 classifies record as "leave"
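    The thresholding step can be sketched as follows; `classify` is a hypothetical helper, and the 0.67 cutoff and 0.72 output follow this slide's example.

    ```python
    def classify(output, threshold=0.67):
        # Two-outcome classification from a single continuous output node
        return "leave" if output >= threshold else "stay"

    classify(0.72)  # "leave"
    ```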

    Simple Example of a Neural Network

    • Neural Network consists of layered, feedforward, completely connected network of nodes
    • Feedforward restricts network flow to single direction
    • Flow does not loop or cycle
    • Network composed of two or more layers

    [Diagram: Input Layer (Node 1, Node 2, Node 3), Hidden Layer (Node A, Node B), Output Layer (Node Z), with connection weights W0A, W1A, W2A, W3A, W0B, W1B, W2B, W3B, W0Z, WAZ, WBZ]

    Simple Example of a Neural Network (cont'd)

    • Most networks have Input, Hidden, Output layers
    • Network may contain more than one hidden layer
    • Network is completely connected
    • Each node in given layer is connected to every node in next layer
    • Every connection has a weight (W_ij) associated with it
    • Weight values randomly assigned 0 to 1 by algorithm
    • Number of input nodes dependent on number of predictors
    • Number of hidden and output nodes configurable

    Simple Example of a Neural Network (cont'd)

    • Combination function produces linear combination of node inputs and connection weights to single scalar value
    • For node j, x_ij is the ith input; W_ij is weight associated with ith input node
    • I + 1 inputs to node j
    • x_1, x_2, ..., x_I are inputs from upstream nodes
    • x_0 is constant input value = 1.0
    • Each node has an extra constant input W_0j x_0j = W_0j

    net_j = Σ_i W_ij x_ij = W_0j x_0j + W_1j x_1j + ... + W_Ij x_Ij

    [Diagram repeats the network: Input Layer (Nodes 1-3), Hidden Layer (Nodes A, B), Output Layer (Node Z)]

    Simple Example of a Neural Network (cont'd)

    • The scalar value computed for hidden layer Node A equals:

    x_0 = 1.0   W_0A = 0.5   W_0B = 0.7   W_0Z = 0.5
    x_1 = 0.4   W_1A = 0.6   W_1B = 0.9   W_AZ = 0.9
    x_2 = 0.2   W_2A = 0.8   W_2B = 0.8   W_BZ = 0.9
    x_3 = 0.7   W_3A = 0.6   W_3B = 0.4

    net_A = Σ_i W_iA x_iA = W_0A(1.0) + W_1A x_1A + W_2A x_2A + W_3A x_3A
          = 0.5 + 0.6(0.4) + 0.8(0.2) + 0.6(0.7) = 1.32

    • For Node A, net_A = 1.32 is input to activation function
    • Neurons "fire" in biological organisms: signals sent between neurons when combination of inputs crosses threshold
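    The net_A computation above can be checked in a few lines of Python:

    ```python
    x = [1.0, 0.4, 0.2, 0.7]    # x_0 = 1.0 is the constant input
    w_A = [0.5, 0.6, 0.8, 0.6]  # W_0A, W_1A, W_2A, W_3A

    # Combination function: weighted sum over all inputs to Node A
    net_A = sum(w * xi for w, xi in zip(w_A, x))  # 1.32
    ```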

    Simple Example of a Neural Network (cont'd)

    • Firing response not necessarily linearly related to increase in input stimulation
    • Neural Networks model behavior using non-linear activation function
    • Sigmoid function most commonly used
    • In Node A, sigmoid function takes net_A = 1.32 as input and produces output

    y = 1 / (1 + e^(-x))

    y = 1 / (1 + e^(-1.32)) = 0.7892

    Simple Example of a Neural Network (cont'd)

    • Node A outputs 0.7892 along connection to Node Z, and it becomes a component of net_Z
    • Before net_Z is computed, contribution from Node B required
    • Node Z combines outputs from Node A and Node B through net_Z

    net_B = Σ_i W_iB x_iB = W_0B(1.0) + W_1B x_1B + W_2B x_2B + W_3B x_3B
          = 0.7 + 0.9(0.4) + 0.8(0.2) + 0.4(0.7) = 1.5

    and f(net_B) = 1 / (1 + e^(-1.5)) = 0.8176

    Simple Example of a Neural Network (cont'd)

    • Inputs to Node Z are not data attribute values
    • Rather, they are outputs from the sigmoid function in upstream nodes
    • Value 0.8750 output from Neural Network on first pass
    • Represents predicted value for target variable, given first observation

    net_Z = Σ_i W_iZ x_iZ = W_0Z(1.0) + W_AZ x_AZ + W_BZ x_BZ
          = 0.5 + 0.9(0.7892) + 0.9(0.8176) = 1.9461

    finally, f(net_Z) = 1 / (1 + e^(-1.9461)) = 0.8750
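    The whole first forward pass worked through on the last few slides can be reproduced in Python:

    ```python
    import math

    def sigmoid(t):
        return 1.0 / (1.0 + math.exp(-t))

    x = [1.0, 0.4, 0.2, 0.7]        # x_0 = 1.0 constant input
    w_A = [0.5, 0.6, 0.8, 0.6]      # weights into Node A
    w_B = [0.7, 0.9, 0.8, 0.4]      # weights into Node B
    w0Z, wAZ, wBZ = 0.5, 0.9, 0.9   # weights into Node Z

    out_A = sigmoid(sum(w * xi for w, xi in zip(w_A, x)))   # about 0.7892
    out_B = sigmoid(sum(w * xi for w, xi in zip(w_B, x)))   # about 0.8176
    out_Z = sigmoid(w0Z * 1.0 + wAZ * out_A + wBZ * out_B)  # about 0.8750
    ```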

    Sigmoid Activation Function

    • Sigmoid function combines nearly linear, curvilinear, and nearly constant behavior depending on input value
    • Function nearly linear for domain values -1 < x < 1
    • Becomes curvilinear as values move away from center
    • At extreme values, f(x) is nearly constant
    • Moderate increments in x produce variable increases in f(x), depending on location of x
    • Sometimes called "Squashing Function"
    • Takes real-valued input and returns values in [0, 1]

    Back-Propagation

    • Neural Networks are a supervised learning method
    • Require target variable
    • Each observation passed through network results in output value
    • Output value compared to actual value of target variable
    • (Actual - Output) = Error
    • Prediction error analogous to residuals in regression models
    • Most networks use Sum of Squared Errors (SSE) to measure how well predictions fit target values

    SSE = Σ_records Σ_output nodes (actual - output)^2
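    The SSE measure can be sketched as follows; the two (actual, output) pairs are illustrative values only.

    ```python
    def sse(pairs):
        # pairs: (actual, output) for every output node of every record
        return sum((actual - output) ** 2 for actual, output in pairs)

    sse([(0.8, 0.875), (1.0, 0.9)])  # about 0.0156
    ```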

    Back-Propagation (cont'd)

    • Squared prediction errors summed over all output nodes, and all records in data set
    • Model weights constructed that minimize SSE
    • Actual values that minimize SSE are unknown
    • Weights estimated, given the data set

    Back-Propagation Rules

    • Back-propagation percolates prediction error for record back through network
    • Partitioned responsibility for prediction error assigned to various connections
    • Back-propagation rules defined (Mitchell):

    w_ij,NEW = w_ij,CURRENT + Δw_ij, where Δw_ij = η δ_j x_ij

    where:
    η = learning rate
    x_ij signifies the ith input to node j
    δ_j represents the responsibility for a particular error belonging to node j

    Back-Propagation Rules (cont'd)

    • Error responsibility computed using partial derivative of the sigmoid function with respect to net_j
    • Values take one of two forms:

    δ_j = output_j (1 - output_j)(actual_j - output_j)      for output layer nodes
    δ_j = output_j (1 - output_j) Σ_DOWNSTREAM W_jk δ_k     for hidden layer nodes

    where Σ_DOWNSTREAM refers to the weighted sum of error responsibilities for nodes downstream

    • Rules show why input values require normalization
    • Large input values x_ij would dominate weight adjustment
    • Error propagation would be overwhelmed, and learning stifled
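    Both forms of the error responsibility can be sketched directly from these rules; the numeric values plugged in below come from the worked example on the following slides.

    ```python
    def delta_output(output, actual):
        # Output-layer form: output(1 - output)(actual - output)
        return output * (1 - output) * (actual - output)

    def delta_hidden(output, downstream):
        # Hidden-layer form: downstream is a list of (W_jk, delta_k) pairs
        return output * (1 - output) * sum(w * d for w, d in downstream)

    d_Z = delta_output(0.875, 0.8)            # about -0.0082
    d_A = delta_hidden(0.7892, [(0.9, d_Z)])  # about -0.00123
    ```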

    Example of Back-Propagation

    • Recall that first pass through network yielded output = 0.8750
    • Assume actual target value = 0.8, and learning rate η = 0.1
    • Prediction error = 0.8 - 0.8750 = -0.0750
    • Neural Networks use stochastic back-propagation
    • Weights updated after each record processed by network
    • Adjusting the weights using back-propagation shown next
    • Error responsibility for Node Z, an output node, found first:

    δ_Z = output_Z (1 - output_Z)(actual_Z - output_Z) = 0.875(1 - 0.875)(0.8 - 0.875) = -0.0082

    [Diagram repeats the network: Input Layer (Nodes 1-3), Hidden Layer (Nodes A, B), Output Layer (Node Z)]

    Example of Back-Propagation (cont'd)

    • Now adjust "constant" weight w_0Z using rules
    • Move upstream to Node A, a hidden layer node
    • Only node downstream from Node A is Node Z

    Δw_0Z = η δ_Z (1) = 0.1(-0.0082)(1) = -0.00082
    w_0Z,NEW = w_0Z,CURRENT + Δw_0Z = 0.5 - 0.00082 = 0.49918

    δ_A = output_A (1 - output_A) Σ_DOWNSTREAM W_jk δ_k = 0.7892(1 - 0.7892)(0.9)(-0.0082) = -0.00123

    Example of Back-Propagation (cont'd)

    • Adjust weight w_AZ using back-propagation rules
    • Connection weight between Node A and Node Z adjusted from 0.9 to 0.899353
    • Next, Node B is a hidden layer node
    • Only node downstream from Node B is Node Z

    Δw_AZ = η δ_Z OUTPUT_A = 0.1(-0.0082)(0.7892) = -0.000647
    w_AZ,NEW = w_AZ,CURRENT + Δw_AZ = 0.9 - 0.000647 = 0.899353

    δ_B = output_B (1 - output_B) Σ_DOWNSTREAM W_jk δ_k = 0.8176(1 - 0.8176)(0.9)(-0.0082) = -0.0011

    Example of Back-Propagation (cont'd)

    • Adjust weight w_BZ using back-propagation rules
    • Connection weight between Node B and Node Z adjusted from 0.9 to 0.89933
    • Similarly, application of back-propagation rules continues to input layer nodes
    • Weights {w_1A, w_2A, w_3A, w_0A} and {w_1B, w_2B, w_3B, w_0B} updated by process

    Δw_BZ = η δ_Z OUTPUT_B = 0.1(-0.0082)(0.8176) = -0.00067
    w_BZ,NEW = w_BZ,CURRENT + Δw_BZ = 0.9 - 0.00067 = 0.89933
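    The stochastic weight updates worked through on these slides can be reproduced as:

    ```python
    eta = 0.1                        # learning rate
    d_Z = -0.0082                    # error responsibility of output Node Z
    out_A, out_B = 0.7892, 0.8176    # sigmoid outputs of the hidden nodes

    # w_new = w_current + eta * delta * input, per the back-propagation rules
    w0Z = 0.5 + eta * d_Z * 1.0      # 0.49918
    wAZ = 0.9 + eta * d_Z * out_A    # about 0.899353
    wBZ = 0.9 + eta * d_Z * out_B    # about 0.89933
    ```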

    Example of Back-Propagation (cont'd)

    • Now, all network weights in model are updated
    • Each iteration based on single record from data set
    • Summary:
    • Network calculated predicted value for target variable
    • Prediction error derived
    • Prediction error percolated back through network
    • Weights adjusted to generate smaller prediction error
    • Process repeats record by record

    Termination Criteria

    • Many passes through data set performed
    • Constantly adjusting weights to reduce prediction error
    • When to terminate?
    • Stopping criterion may be computational "clock" time?
    • Short training times likely result in poor model
    • Terminate when SSE reaches threshold level?
    • Neural Networks are prone to overfitting
    • Memorizing patterns rather than generalizing
    • And ...

    Learning Rate

    • Recall Learning Rate η (Greek "eta") is a constant
    • Helps adjust weights toward global minimum for SSE
    • Small Learning Rate: with small learning rate, weight adjustments are small
    • Network takes unacceptably long time converging to solution
    • Large Learning Rate: suppose algorithm close to optimal solution
    • With large learning rate, network likely to "overshoot" optimal solution

    0 < η < 1, where η = learning rate
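    The overshoot effect of a large learning rate can be seen on a toy one-dimensional problem (an illustrative sketch, not from the slides):

    ```python
    def gradient_descent(eta, w=5.0, steps=20):
        # Minimize f(w) = w^2 (global minimum at w = 0); gradient is 2w
        for _ in range(steps):
            w -= eta * 2 * w
        return w

    small = gradient_descent(0.01)  # still far from 0: converges slowly
    large = gradient_descent(1.1)   # |w| grows: each step overshoots the minimum
    ```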

    Neural Network for IR: from the work by Wilkinson & Hingston, SIGIR 1991

    [Diagram: three-layer network with query term nodes (k_a, k_b, k_c), document term nodes (k_1, ..., k_t), and document nodes (d_1, ..., d_j, d_j+1, ..., d_N)]

    Neural Network for IR

    Three-layer network

    Signals propagate across the network

    First level of propagation:
    query terms issue the first signals
    these signals propagate across the network to reach the document nodes

    Second level of propagation:
    document nodes might themselves generate new signals which affect the document term nodes
    document term nodes might respond with new signals of their own

    Quantifying Signal Propagation

    Normalize signal strength (max = 1)

    Query terms emit initial signal equal to 1

    Weight associated with an edge from a query term node k_i to a document term node k_i:

    W_iq = w_iq / sqrt(Σ_i w_iq^2)

    Weight associated with an edge from a document term node k_i to a document node d_j:

    W_ij = w_ij / sqrt(Σ_i w_ij^2)

    Quantifying Signal Propagation (cont'd)

    After the first level of signal propagation, the activation level of a document node d_j is given by:

    Σ_i W_iq W_ij = Σ_i w_iq w_ij / ( sqrt(Σ_i w_iq^2) × sqrt(Σ_i w_ij^2) )

    which is exactly the ranking of the Vector model

    New signals might be exchanged among document term nodes and document nodes in a process analogous to a feedback cycle

    A minimum threshold should be enforced to avoid spurious signal generation
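    Under this normalization, the first-level activation of a document node is exactly a cosine score, which a short sketch can confirm (the term weights here are hypothetical):

    ```python
    import math

    def normalize(ws):
        # Divide each weight by the sqrt of the sum of squared weights
        norm = math.sqrt(sum(w * w for w in ws))
        return [w / norm for w in ws]

    w_q = [1.0, 0.5, 0.0]  # hypothetical query term weights
    w_d = [0.8, 0.4, 0.2]  # hypothetical weights for one document d_j

    # Activation of d_j after the first level of propagation
    activation = sum(a * b for a, b in zip(normalize(w_q), normalize(w_d)))

    # The Vector model's cosine ranking gives the same number
    cosine = (sum(a * b for a, b in zip(w_q, w_d))
              / (math.sqrt(sum(a * a for a in w_q)) * math.sqrt(sum(b * b for b in w_d))))
    ```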

    Conclusions

    The model provides an interesting formulation of the IR problem

    The model has not been tested extensively

    It is not clear what improvements the model might provide