
Page 1:

University Studies 15A: Consciousness I
Neural Network Modeling

Page 2:

The Neuron

[Figure: a biological neuron, with dendrites (input), a cell nucleus, and an axon (output), beside the schematic model: inputs i(a), i(b), i(c) with weights $w_a$, $w_b$, $w_c$, a threshold $t$, and an output $o$.]

The Schematic Model of a Neuron

$$\text{output} = \begin{cases} 1 & \text{if } \sum_i \text{input}(i)\, w_i > t \\ 0 & \text{otherwise} \end{cases}$$
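To make the schematic concrete, here is a minimal Python sketch of this threshold unit (the function name and the example values are illustrative, not from the slides):

```python
# A minimal sketch of the schematic neuron above.
def neuron_output(inputs, weights, t):
    """Return 1 if the weighted sum of inputs exceeds the threshold t, else 0."""
    activation = sum(i * w for i, w in zip(inputs, weights))
    return 1 if activation > t else 0

# Example: three inputs i(a), i(b), i(c) with weights w_a, w_b, w_c.
print(neuron_output([1, 0, 1], [0.5, 0.9, 0.3], t=0.6))  # 0.8 > 0.6 -> 1
```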

Page 3:

[Figure: a perceptron with inputs i(1), i(2), i(3), ..., i(i), ..., i(n), connection weights $w_1, w_2, w_3, \ldots, w_i, \ldots, w_n$, and a single output unit with threshold $t$.]

The Perceptron: A Single-layered Neural Network

$$\text{output} = \begin{cases} 1 & \text{if } \sum_{i=1}^{n} i(i)\, w_i \ge t \\ 0 & \text{otherwise} \end{cases}$$

t = threshold activation level

Page 4:

[Figure: the same perceptron diagram as on Page 3.]

We can think of the input into the perceptron as a vector:

$$\begin{bmatrix} i(1) \\ i(2) \\ \vdots \\ i(i) \\ \vdots \\ i(n) \end{bmatrix}$$

Page 5:

[Figure: the same perceptron diagram as on Page 3.]

We can think of the weights of the connections between the units in the perceptron as a vector:

$$\begin{bmatrix} w(1) \\ w(2) \\ \vdots \\ w(i) \\ \vdots \\ w(n) \end{bmatrix}$$

Page 6:

Linear Algebra: the Mathematics of Many Dimensions (Quick Version)

Quantities in one dimension: the scalar $n$

Quantities in two dimensions: the two-dimensional vector $(a, b)$

[Figure: the vector (a, b) drawn in the plane with components a and b.]

The vector (a, b) has both an amount and a direction.

Page 7:

More generally, we can think of vectors in an n-dimensional space (n can be arbitrarily large):

$$(a_1, a_2, a_3, \ldots, a_i, \ldots, a_n)$$

Or, in simpler notation: $\vec{a}$

The vector arithmetic we need for neural networks is simple.

Vector addition:

$$\vec{a} + \vec{b} = (a_1 + b_1,\ a_2 + b_2,\ a_3 + b_3,\ \ldots,\ a_i + b_i,\ \ldots,\ a_n + b_n)$$

Page 8:

Multiplying a vector by a scalar: if $\vec{a} = (a_1, a_2, a_3, \ldots, a_i, \ldots, a_n)$, then:

$$\vec{b} = 2\vec{a} = (2a_1,\ 2a_2,\ 2a_3,\ \ldots,\ 2a_i,\ \ldots,\ 2a_n)$$

Or, more generally:

$$n\vec{a} = (na_1,\ na_2,\ na_3,\ \ldots,\ na_i,\ \ldots,\ na_n)$$
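If you want to experiment, these operations are one-liners in Python with NumPy (our choice of library; the slides do not specify one):

```python
# Vector addition and scalar multiplication from Pages 7-8.
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

print(a + b)   # vector addition: [5. 7. 9.]
print(2 * a)   # scalar multiplication: [2. 4. 6.]
```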

Page 9:

Some general properties of addition should be clear:

$$\vec{a} + \vec{b} = \vec{b} + \vec{a}$$

$$(\vec{a} + \vec{b}) + \vec{c} = \vec{a} + (\vec{b} + \vec{c})$$

Now that we have these facts, we can introduce two important features of vectors: linear combination and linear independence.

$\vec{c}$ is a linear combination of $\vec{a}$ and $\vec{b}$ if there are scalars m and n such that $m\vec{a} + n\vec{b} = \vec{c}$.

Page 10:

Otherwise, $\vec{c}$ is linearly independent of $\vec{a}$ and $\vec{b}$.

In an n-dimensional space, any set of n vectors that are linearly independent of one another can be used (in linear combination) to describe all the other vectors in the space.

That set of vectors spans the space.
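As a numerical sketch of these ideas (NumPy assumed; the example vectors are ours): two linearly independent vectors span a 2-dimensional space, so any other vector in it is a linear combination of them.

```python
# Linear independence and span in 2-D (illustrative values).
import numpy as np

basis = np.array([[1.0, 0.0],
                  [1.0, 1.0]])            # two vectors in a 2-D space
print(np.linalg.matrix_rank(basis))       # 2 -> the vectors are linearly independent

c = np.array([3.0, 5.0])
m, n = np.linalg.solve(basis.T, c)        # find scalars with m*(1,0) + n*(1,1) = c
print(m, n)                               # -2.0 5.0: c is a linear combination
```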

Page 11:

More Vector Math: Multiplication (Inner Product)

Another important mathematical operation performed on vectors is the inner product (which is a scalar quantity).

$$\vec{a} \cdot \vec{b} = \sum_{i=1}^{n} a_i b_i$$

Given this, the following equalities should be clear:

$$\vec{a} \cdot \vec{b} = \vec{b} \cdot \vec{a}$$

$$\vec{a} \cdot (\vec{b} + \vec{c}) = \vec{a} \cdot \vec{b} + \vec{a} \cdot \vec{c}$$
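These identities are easy to verify numerically; a quick sketch (NumPy assumed, values illustrative):

```python
# Checking the inner-product identities from Page 11.
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
c = np.array([7.0, 8.0, 9.0])

print(np.dot(a, b))                       # sum of a_i * b_i = 32.0
print(np.dot(a, b) == np.dot(b, a))       # commutativity: True
print(np.dot(a, b + c) == np.dot(a, b) + np.dot(a, c))  # distributivity: True
```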

Page 12:

[Figure: the same perceptron diagram as on Page 3.]

For the Perceptron: activation is the inner product of the input vector and the weighting vector.

$$\sum_{i=1}^{n} i(i)\, w_i = \vec{i} \cdot \vec{w}$$
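In code, the whole perceptron collapses to a dot product and a comparison; a minimal sketch (NumPy assumed; the threshold and values are illustrative):

```python
# The perceptron of Page 3, written with the inner product of Page 12.
import numpy as np

def perceptron(i_vec, w_vec, t):
    """Fire (1) if the inner product of input and weights reaches threshold t."""
    return 1 if np.dot(i_vec, w_vec) >= t else 0

i_vec = np.array([1.0, 0.0, 1.0, 1.0])
w_vec = np.array([0.2, 0.8, 0.4, 0.1])
print(perceptron(i_vec, w_vec, t=0.5))    # 0.7 >= 0.5 -> output 1
```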

Page 13:

[Figure: two connected layers. Layer A units a(1), a(2), a(3), ..., a(i), ..., a(n) feed Layer B units b(1), ..., b(j), ..., b(m), which produce outputs O1, ..., Oj, ..., Om.]

Now let's complicate the model: two layers of neurons are connected.

Page 14:

[Figure: the same two-layer diagram as on Page 13.]

For the link between the ith unit in Layer A and the jth unit in Layer B, there is a connection weight $w_{i,j}$.

Page 15:

This set of weights defines a weighting matrix of dimension (m, n) (columns for Layer A, rows for Layer B):

$$W_{n,m} = \begin{bmatrix} w_{1,1} & w_{2,1} & \cdots & w_{n,1} \\ w_{1,2} & w_{2,2} & \cdots & w_{n,2} \\ \vdots & \vdots & \ddots & \vdots \\ w_{1,m} & w_{2,m} & \cdots & w_{n,m} \end{bmatrix}$$

For our purposes, it is perhaps best to think of matrices as entities that transform vectors from a space of one dimensionality into a space of a different dimensionality, in a way determined by the values in the rows and columns of the matrix.

Page 16:

[Figure: the same two-layer diagram as on Page 13.]

One can describe the output from Layer A as a vector: $\vec{O}_A$

The activation values of Layer B are also a vector: $\vec{Net}_B$

Page 17:

Putting everything together, we have the equation:

$$\vec{Net}_B = W \cdot \vec{O}_A$$

Finally, because the output from Layer B depends on the threshold $t_j$ for each unit:

$$\vec{O}_B = f(\vec{Net}_B) = f(W \cdot \vec{O}_A)$$

So... what does all of this get us?
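Here is what that equation looks like as a two-layer forward pass, with a hard threshold as f (a sketch; the sizes, weights, and thresholds are illustrative, not from the slides):

```python
# Net_B = W . O_A, then O_B = f(Net_B) with a per-unit threshold t_j.
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 3                               # size of Layer A, size of Layer B
W = rng.random((m, n))                    # weighting matrix: rows for B, columns for A
o_A = np.array([1.0, 0.0, 1.0, 1.0])      # output vector from Layer A
t = np.full(m, 1.0)                       # threshold t_j for each Layer B unit

net_B = W @ o_A                           # each entry is an inner product
o_B = (net_B >= t).astype(int)            # f: fire where activation reaches threshold
print(net_B, o_B)
```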

Page 18:

We now can describe Hebb's learning rule:

$$\Delta w_{ij} = \varepsilon\, a_i b_j$$

where $a_i$ and $b_j$ are the output values of the ith unit in Layer A and the jth unit in Layer B, $w_{ij}$ is the connection weight between the units, and $\varepsilon$ is a learning parameter.

Note that if either $a_i$ or $b_j$ is 0, the weight does not change. Note also that the vectors and matrix can be arbitrarily large, since we now are tracking relations between individual units.
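One Hebbian update over the whole weighting matrix is just an outer product of the two layers' output vectors; a sketch (NumPy assumed, the value of ε is illustrative):

```python
# Hebb's rule from Page 18: Delta w_ij = epsilon * a_i * b_j, for all i, j at once.
import numpy as np

epsilon = 0.1                             # learning parameter
a = np.array([1, 0, 1, 1])                # Layer A outputs a_i
b = np.array([0, 1, 1])                   # Layer B outputs b_j

W = np.zeros((len(b), len(a)))            # rows for Layer B, columns for Layer A
W += epsilon * np.outer(b, a)             # co-active pairs strengthen
print(W)                                  # wherever a_i or b_j was 0, no change
```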

Page 19:

So... where does this simple learning rule get us? Let's begin with a simple system:

1. Sixteen input units are connected to two output units.

2. Only two input units are active at a time.

3. The active units must be horizontal or vertical neighbors.

4. Only one output unit can be active at a time (inhibition is marked by the black dots).

Page 20:

If one trains the network via Hebbian learning using a series of activations that follow the neighbor rule, the system settles into a stable set of weights.

[Figure: the network's weights after Trial 1, Trial 2, and Trial 3. Filled circle: Output Unit 1 gave the input from that unit a higher weight. Empty circle: Output Unit 2 gave the input from that unit a higher weight. Heavy line: when the two input units were active, Output Unit 1 won the competition. Thin line: Output Unit 2 won the competition.]

Since the output units organize their responses through mutual inhibition, they must find some feature to divide the input domain, and so they discover a topographic map.
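A rough simulation of this setup is short enough to sketch (not from the slides; the learning rate, trial count, and weight normalization are assumptions needed to keep competitive learning stable):

```python
# 16 inputs on a 4x4 grid, two winner-take-all output units, Hebbian updates
# driven by pairs of neighboring inputs, as on Page 19.
import numpy as np

rng = np.random.default_rng(0)
W = rng.random((2, 16)) * 0.1             # weights from 16 inputs to 2 outputs

def neighbor_pair():
    """Pick two horizontally or vertically adjacent cells of the 4x4 grid."""
    r, c = rng.integers(4), rng.integers(3)
    if rng.random() < 0.5:
        return r * 4 + c, r * 4 + c + 1   # horizontal neighbors in row r
    return c * 4 + r, (c + 1) * 4 + r     # vertical neighbors in column r

for _ in range(2000):
    x = np.zeros(16)
    x[list(neighbor_pair())] = 1.0        # only two neighboring inputs active
    winner = np.argmax(W @ x)             # mutual inhibition: one output wins
    W[winner] += 0.05 * x                 # Hebb: strengthen the winner's links
    W[winner] /= np.linalg.norm(W[winner])  # keep the weights bounded

print(np.argmax(W, axis=0).reshape(4, 4)) # which unit "owns" each input cell
```

When the training activations follow the neighbor rule, the two output units typically end up owning two contiguous regions of the grid, the topographic division described above.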

Page 21:

This simple example resembles more complex cases:

In the retina, there are "on-center, off-surround" and "off-center, on-surround" cells that send their activation to the thalamus. The thalamus processes the information and passes it along to the primary visual cortex.

[Figure: on-center off-surround cells (active and inactive) and off-center on-surround cells (active and inactive).]

Page 22:

The so-called "simple cells" of the primary visual cortex organize themselves through mutual inhibition as they divide the inputs from the thalamus (LGN).

They divide the input space by learning to respond to line segments at a specified angle. Some respond to 45°, some to 72°, and so on. V1 uses "coarse coding": not all angles are represented. Instead, angles in between can be represented by linear combinations of activation vectors.
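A toy illustration of coarse coding (the two preferred angles follow the slide; the rest is an illustrative construction of ours): an angle between two units' preferred angles can be written as a linear combination of their activation vectors.

```python
# Representing 60 degrees with units tuned to 45 and 72 degrees.
import numpy as np

def unit_vector(deg):
    rad = np.deg2rad(deg)
    return np.array([np.cos(rad), np.sin(rad)])

basis = np.column_stack([unit_vector(45), unit_vector(72)])  # preferred angles
target = unit_vector(60)                                     # an angle in between

m, n = np.linalg.solve(basis, target)     # m * v45 + n * v72 = v60
print(m, n)
```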

Page 23:

In the simple example of the artificial network, the network had input that followed certain regularities, and it divided the input space in half to match the binary output dimension. In the visual system as well, there are regularities that the system captures, constrained by its input and output design. In V1, the simple cells start with angled line segments. There are regularities among the patterns of line segments taken from the natural world that the complex cells then capture.

Page 24:

Neural networks extract patterns and divide an input space.

This can lead to odd results with implications for biological neural networks.

James McClelland tested the ability of a neural network to build a classification tree based on closeness of attributes.

He built a network that could handle simple property statements like:

Robin can grow, move, fly.
Oak can grow.
Salmon has scales, gills, skin.
Robin has wings, feathers, skin.
Oak has bark, branches, leaves, roots.

Page 25:

Baars and Gage discuss this and give the design:

Page 26:

What Baars and Gage do not discuss is the next step.

McClelland fed the system facts about penguins:

Penguin can swim, move, grow.
Penguin has wings, feathers, skin.

The result was a tree that did a good job:

Page 27:

The results were profoundly different depending on whether the system was given the facts about penguins interleaved with facts about the other objects, or it was all penguins all the time (we'll come back to this result when we discuss memory):

Page 28:

People explored the properties of networks as pattern extractors from many different angles.

For example, trying to teach a system how to handle relative clauses proved very hard.

Then people tried modeling the system on a feature of the child’s brain: that not all of the memory resources are there from the beginning.

They built a system that initially had limited short-term memory to handle sentence structure and simply jettisoned all complexities. Then they slowly expanded the size of short-term memory, and the system mimicked children’s behavior in acquiring the ability to handle relative clauses.

That is, their models taught them to respect issues of timing and resources.

Page 29:

Another aspect of neural networks in the brain that people explored through artificial networks is recurrency, when nodes in networks loop back on themselves.

One absolutely crucial feature of recurrent networks is the ability to complete partial patterns:

The image of the Dalmatian is very incomplete, but the brain feeds back knowledge of Dalmatians to the visual system, which then produces a yet more complete view and cycles in loops until perception settles into "Dalmatian."
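A classic way to see pattern completion in miniature is a Hopfield-style recurrent network (our example; the slides do not name a specific architecture):

```python
# Store two +1/-1 patterns in a recurrent weight matrix, then complete a
# corrupted cue by looping the state through the weights until it settles.
import numpy as np

patterns = np.array([[1, -1, 1, -1, 1, -1],
                     [1, 1, 1, -1, -1, -1]])
W = sum(np.outer(p, p) for p in patterns).astype(float)
np.fill_diagonal(W, 0)                    # no unit feeds back directly on itself

state = np.array([1, -1, 1, -1, 1, 1])    # pattern 0 with its last unit corrupted
for _ in range(5):                        # recurrent cycles
    state = np.where(W @ state >= 0, 1, -1)
print(state)                              # settles into the stored pattern
```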

Page 30:

However, if a recurrent system insists on settling into its best guess, it will never be able to learn anything new.

Artificial neural network modelers borrowed from the brain again to provide a way to selectively shut down recurrent connections:

Whenever a new pattern produces high activity because the mutually inhibiting neurons in the "IT cortex" cannot quickly settle into a pattern, that activity drives the "Basal Forebrain" to release acetylcholine, which shuts down the recurrent connections in the IT cortex and allows the new pattern to be assimilated into the organization of the input space.

Page 31:

The use of recurrent connections is central to an important model for learning in neural networks that does not rely on the "back-propagation of error" initially used to train the hidden units that overcame the limitations of perceptrons.

The first layer of memory suggests its best guess, F1, and passes it to the second layer. If that layer also finds a pattern like F1, the recurrent connection quickly leads to a settled state. If it finds an F2 that is too different from F1, it passes activation back to the first layer, which removes the block on resetting the second layer, sends it the input, and retrains the system.

It really does work.
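The description above suggests a simple match-and-reset loop; a very rough sketch (ours; the similarity measure, threshold, and "retraining" step are assumptions, not the model's actual mechanics):

```python
# Settle when a stored pattern matches the incoming best guess closely enough;
# otherwise reset and learn the input as a new pattern.
import numpy as np

memory = []                               # stored prototypes (the second layer)
MATCH = 0.8                               # how close F1 and F2 must be to settle

def present(pattern):
    for proto in memory:
        similarity = np.dot(pattern, proto) / (
            np.linalg.norm(pattern) * np.linalg.norm(proto))
        if similarity >= MATCH:           # patterns agree: a settled state
            return proto
    memory.append(pattern)                # too different: reset and retrain
    return pattern

present(np.array([1.0, 0.0, 1.0]))        # stored as new
present(np.array([0.9, 0.1, 1.0]))        # close to the first: settles
present(np.array([0.0, 1.0, 0.0]))        # too different: stored as new
```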

Page 32:

These sorts of pattern-completing, self-modifying networks appear throughout the brain.

Baars and Gage stress that 90% of the connections between the thalamus and V1 go from V1 to the thalamus as re-entrant connections rather than feed-forward input.

Many neural net modelers have developed systems based on re-entrant brain connectivity:

Page 33:

What needs to be stressed is that the neural network modelers can test the behavior of networks (for example, the effect of fear arousal from the amygdala on visual object recognition in the IT (inferotemporal cortex)).

Artificial neural networks have given us a model for memory in the changing of synaptic connection strengths (the weighting matrix).

The models present us with "objects" at all levels: the ways the network divides the input space into a space with an implicit dimensionality. For V1, for example, that space is largely composed of angled line segments, and then of more complex combinations of line segments. As one goes higher into the visual cortex, ever more complex, mutually differentiated patterns define the "base vectors" that describe the object space.

Artificial neural networks give us testable ways to think about how the brain operates that simply were not available before these models were developed.

Page 34:

The success of the neural network models and the growing sophistication of our understanding of them allow us to approach a system like visual consciousness with tools that can help us explore the dynamics.