
Connectionism


1. Introduction to Connectionism

This is a brief introduction to connectionist networks (also called artificial neural networks). It follows the text from Chapter 3 of Learning to See Analogies: A Connectionist Exploration. The pseudocode is written in Python, and the actual code examples use the Conx library from pyrorobotics.com.

1.1. History of Artificial Neural Networks

Highlights from the history of artificial neural networks:

Personally, I feel that AI has split into two distinct paradigms: rational (rule-based) models and emergent (connectionist) models.

These two paradigms, in my opinion, have little to do with one another. That is, emergent models can certainly show rational, rule-like behavior, but the implementation of emergent models has nothing to do with how rational models operate.

Let's explore this issue.

1.2. Machine Learning

Given an input A, the system can respond with output B. This could be done with a simple lookup table:

Input   Output
A       B
C       D
E       F
G       H
I       J
...     ...

Do you see any problems with this approach?
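
In code, this table is nothing more than a dictionary lookup. A minimal sketch (with made-up symbolic entries) shows the obvious limitation: an input that is not already in the table gets no answer at all, so there is no generalization.

# A minimal sketch of the table-lookup approach (entries are made up).
table = {"A": "B", "C": "D", "E": "F", "G": "H", "I": "J"}

def respond(x):
    # Works only for inputs seen before; nothing is inferred for new inputs.
    return table.get(x, "???")

print(respond("A"))   # -> B
print(respond("K"))   # -> ??? (the table says nothing about unseen inputs)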

1.3. Network Mechanics

A neural network is composed of the following items:

  1. Network, composed of

    1. layers, composed of

      1. units/nodes

      2. an activation function

    2. weights between layers

  2. Training patterns, composed of

    1. input patterns

    2. target patterns

  3. Testing patterns, held out to test what the network has learned

These training patterns are presented to the network repeatedly (in epochs).
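
Concretely, the training patterns can be kept as two parallel lists, and one epoch is a single pass over all of them. The sketch below is hypothetical (it happens to use the XOR data that appears again in the Conx example later):

# Hypothetical training set: input patterns paired with target patterns (XOR).
inputs  = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
targets = [[0.0], [1.0], [1.0], [0.0]]

for epoch in range(1000):                        # present the set repeatedly
    for pattern, target in zip(inputs, targets):
        pass  # forward-propagate pattern, compare to target, adjust weights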

The network can be thought of as computing a single function, g(A) -> B. But it may be more accurate to think of the pattern A being associated with output B.

Let's see how this would work.

1.3.1. Forward propagation of activation

Networks are often grouped into layers. This makes the forward computation easy to implement.

http://bubo.brynmawr.edu/~dblank/images/connectionism/three-layer-net.gif

Proof: ANNs are Turing-machine equivalent; see Franklin and Garzon.

Proof: A three-layer network can compute anything that an N-layer network can compute; if a many-layered network can compute something, then there is a three-layer network that can compute the same thing. However, that doesn't say anything about whether that computation can be learned!

The node:

http://bubo.brynmawr.edu/~dblank/images/connectionism/node.gif

http://bubo.brynmawr.edu/~dblank/images/connectionism/sigma.gif

The net input is a weighted sum of all the incoming activations plus the node's bias value:

for m in toNodes:
    netInput[m] = bias[m]
    for i in fromNodes:
        netInput[m] += (weight[m][i] * activation[i])

where weight[m][i] is the weight, or connection strength, from the i-th node to the m-th node, activation[i] is the activation signal of the i-th node, and bias[m] is the bias value of the m-th node.
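
For a node with two incoming connections this is just a couple of multiply-adds. The numbers below are made up purely for illustration:

# Made-up values for a single node m with two incoming nodes.
bias_m    = 0.1
weights_m = [0.5, -0.3]        # weight[m][0], weight[m][1]
incoming  = [1.0,  0.8]        # activation[0], activation[1]

net_input_m = bias_m + sum(w * a for w, a in zip(weights_m, incoming))
print(net_input_m)             # 0.1 + 0.5*1.0 + (-0.3)*0.8 = 0.36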

After computing the net input, each node has to compute its output activation. The activation function used in backprop networks is generally the logistic (sigmoid) function:

from math import exp

def activationFunction(netInput):
    return 1.0 / (1.0 + exp(-netInput))

for m in toNodes:
    activation[m] = activationFunction(netInput[m])

http://bubo.brynmawr.edu/~dblank/images/connectionism/activation.gif

http://bubo.brynmawr.edu/~dblank/images/connectionism/logistic.gif
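
Putting the net input and the activation function together gives a complete forward pass through one layer. The sketch below is self-contained; the weights, biases, and input activations are made-up values, not taken from the figures above:

from math import exp

def activationFunction(netInput):
    return 1.0 / (1.0 + exp(-netInput))

# Made-up example: two "from" nodes feeding two "to" nodes.
weight = [[0.5, -0.3],         # weights into to-node 0
          [0.2,  0.7]]         # weights into to-node 1
bias = [0.1, -0.4]
activation = [1.0, 0.8]        # activations of the from-nodes

outputs = []
for m in range(len(bias)):
    netInput = bias[m]
    for i in range(len(activation)):
        netInput += weight[m][i] * activation[i]
    outputs.append(activationFunction(netInput))

print(outputs)                 # each output is squashed into (0, 1)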

However, without training, this can't do much. You already know one method of getting some weights. How would that work?

Hint: Darwin.

That would work, but would be slow. Why? A better method is the backpropagation of error.

1.3.2. Backpropagation of Error

For each training pattern, the weights are adjusted in proportion to the error between the desired and actual outputs:

for m in toNodes:
    # error at node m is how far the actual output is from the target
    error[m] = (desiredOutput[m] - actualOutput[m])
    # scale the error by the derivative of the logistic activation function
    delta[m] = error[m] * actualOutput[m] * (1 - actualOutput[m])
    for i in fromNodes:
        # EPSILON is the learning rate; MOMENTUM reuses the previous update
        weightUpdate[m][i] = (EPSILON * delta[m] * actualOutput[i]) + (MOMENTUM * weightUpdate[m][i])
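
The loop above computes the deltas and weight updates only for the weights coming into the output layer. To train a hidden layer, the error has to be carried backward through the weights; a sketch in the same notation (assuming the same arrays as above, plus a hiddenNodes list) looks like this:

for i in hiddenNodes:
    # a hidden node's error is the weighted sum of the deltas of the nodes it feeds
    error[i] = 0.0
    for m in toNodes:
        error[i] += delta[m] * weight[m][i]
    # same logistic derivative as before
    delta[i] = error[i] * activation[i] * (1 - activation[i])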

1.3.3. Example Representations

http://bubo.brynmawr.edu/~dblank/images/connectionism/count-table.gif

http://bubo.brynmawr.edu/~dblank/images/connectionism/localist.gif

http://bubo.brynmawr.edu/~dblank/images/connectionism/distributed-rep.gif

1.3.4. Example Problem

Building Neural Networks using Conx

1.4. Related Networks

http://bubo.brynmawr.edu/~dblank/images/connectionism/srn.gif

1.5. Why Neural Networks?

  1. They can learn a function that we may not know how to program.

  2. When they learn, they generalize.

  3. They are the easiest way to show different levels of computation.

# File: NNxor.py
# import all the conx API
from pyrobot.brain.conx import *

# create the network
n = Network()

# add layers in the order they will be connected
n.addLayer('input',2)      # The input layer has two nodes
n.addLayer('hidden', 2)    # The hidden layer has two nodes
n.addLayer('output',1)     # The output layer has one node
n.connect('input','hidden','output')  # connect the layers in order: input -> hidden -> output

# provide training patterns (inputs and outputs)
n.setInputs([[0.0,0.0],[0.0,1.0],[1.0,0.0],[1.0,1.0]])
n.setOutputs([[0.0],[1.0],[1.0],[0.0]])

# set learning parameters
n.setEpsilon(0.5)
n.setTolerance(0.2)
n.setReportRate(1)

# learn
n.train()

How does this generalize? Run the file interactively with python -i NNxor.py, then try:

>>> n.propagate(input = [.5, .5])

Is that what you would expect? How does the network generalize overall? Add the following to your file:

def symbol(n):
    # map an activation in [0, 1] to a single character, '.' (low) to '#' (high)
    return ".123456789#"[int(round(n * 10))]

def test(net):
    # sweep the unit square and print the network's output at each input point
    resolution = 50.0
    for i1 in range(0, int(resolution)):
        print "   ",
        for i2 in range(0, int(resolution)):
            output = net.propagate(input = [i1/resolution, i2/resolution])
            print symbol(output[0]),
        print
    print

And try this:

>>> test(n)
>>> n.initialize()
>>> n.train()
>>> test(n)

1.6. Levels of Computation

Example of Holistic Computation: A Case Study of RAAM, and the associated paper.

1.7. Points to Ponder

  1. Can an artificial neural network learn to do something that it wasn't explicitly trained to do?

  2. Does an artificial neural network just learn a set of rules?