
PyroModuleNeuralNetworksAdvanced


Associating Images with Labels

In 1992, Kim Plunkett, Chris Sinha, Martin F. Moller, and Ole Strandsby published a paper called Symbol grounding or the emergence of symbols? Vocabulary growth in children and a connectionist net in the journal Connection Science (volume 4, numbers 3 and 4). In it they describe a neural network model that associates distorted prototype images with labels. The network is divided into two main pathways, an image pathway and a label pathway. Each pathway has its own hidden layer; the two pathways then meet in a shared hidden layer before connecting to the output layers. Implementing this model demonstrates the flexibility of Conx. Below is a much simplified version of the model.

from pyrobot.brain.conx import * 
 
# create a network to do the image labeling task 
 
# imageOutput labelOutput 
#       ^         ^ 
#        \       / 
#      sharedHidden 
#        ^       ^ 
#       /         \ 
# imageHidden labelHidden 
#      ^           ^ 
#      |           | 
#  imageInput  labelInput 
 
n = Network() 

# add layers
n.add(Layer('imageInput', 9))  
n.add(Layer('labelInput', 3))  
n.add(Layer('imageHidden', 3))
n.add(Layer('labelHidden', 3))
n.add(Layer('sharedHidden', 5))
n.add(Layer('imageOutput', 9)) 
n.add(Layer('labelOutput', 3)) 

# add connections
n.connect('imageInput', 'imageHidden') 
n.connect('labelInput', 'labelHidden') 
n.connect('imageHidden', 'sharedHidden') 
n.connect('labelHidden', 'sharedHidden') 
n.connect('sharedHidden', 'imageOutput') 
n.connect('sharedHidden', 'labelOutput') 

# associate layers
n.associate('imageInput','imageOutput')
n.associate('labelInput','labelOutput')

# provide training patterns 
 
# Below are some crude prototype images 
# modeled after the letters X, T, and L 
# called ximg, timg, and limg.  In addition, 
# there are three simple distortions for 
# each of these images (named a, b, and c). 
ximg = [1.0, 0.0, 1.0, 
        0.0, 1.0, 0.0, 
        1.0, 0.0, 1.0] 
ximga= [0.0, 0.0, 1.0, 
        0.0, 1.0, 0.0, 
        1.0, 0.0, 1.0] 
ximgb= [1.0, 0.0, 1.0, 
        0.0, 0.0, 0.0, 
        1.0, 0.0, 1.0] 
ximgc= [1.0, 0.0, 1.0, 
        0.0, 1.0, 0.0, 
        0.0, 0.0, 1.0] 
timg = [1.0, 1.0, 1.0, 
        0.0, 1.0, 0.0, 
        0.0, 1.0, 0.0] 
timga= [0.0, 1.0, 1.0, 
        0.0, 1.0, 0.0, 
        0.0, 1.0, 0.0] 
timgb= [1.0, 1.0, 1.0, 
        0.0, 1.0, 0.0, 
        0.0, 0.0, 0.0] 
timgc= [1.0, 1.0, 0.0, 
        0.0, 1.0, 0.0, 
        0.0, 1.0, 0.0] 
limg = [1.0, 0.0, 0.0, 
        1.0, 0.0, 0.0, 
        1.0, 1.0, 1.0] 
limga= [0.0, 0.0, 0.0, 
        1.0, 0.0, 0.0, 
        1.0, 1.0, 1.0] 
limgb= [1.0, 0.0, 0.0, 
        1.0, 0.0, 0.0, 
        1.0, 0.0, 1.0] 
limgc= [1.0, 0.0, 0.0, 
        1.0, 0.0, 0.0, 
        0.0, 1.0, 1.0] 
 
# These are the arbitrary, orthogonal labels 
# for the three types of images. 
xlab = [1.0, 0.0, 0.0] 
tlab = [0.0, 1.0, 0.0] 
llab = [0.0, 0.0, 1.0] 
 
# This is the training set.  An epoch of 
# training consists of auto-associating 
# each of the images and their appropriate 
# labels.  Recall that the Plunkett et al. 
# paper discussed two styles of training: 
# one stage and phased.  Currently this only 
# does the one stage variety.  

# extend each image with its label, then use the mapInput method
ximga.extend(xlab)
ximgb.extend(xlab)
ximgc.extend(xlab)
timga.extend(tlab)
timgb.extend(tlab)
timgc.extend(tlab)
limga.extend(llab)
limgb.extend(llab)
limgc.extend(llab)

# set learning parameters 
n.setEpsilon(0.3) 
n.setMomentum(0.1) 
n.setTolerance(0.1) 
n.setStopPercent(.98)

# set inputs, targets are associated (automatic)
patterns = [ximga, ximgb, ximgc,
            timga, timgb, timgc,
            limga, limgb, limgc]

n.setInputs(patterns)

# map each layer to its offset within the pattern vector
n.mapInput('imageInput',0)
n.mapInput('labelInput',9)
n.mapTarget('imageOutput',0)
n.mapTarget('labelOutput',9)

# learn 
n.setLearning(1)
n.train()

# This is a blank image and label used in testing. 
bimg = [0.0, 0.0, 0.0, 
        0.0, 0.0, 0.0, 
        0.0, 0.0, 0.0] 
blab = [0.0, 0.0, 0.0] 
 
# Try presenting an image alone, then a label alone, 
# and then both together for each prototype image. 
testImages = [ximg, bimg, ximg, 
              timg, bimg, timg, 
              limg, bimg, limg] 
testLabels = [blab, xlab, xlab, 
              blab, tlab, tlab, 
              blab, llab, llab] 
targImages = [ximg, ximg, ximg, 
              timg, timg, timg, 
              limg, limg, limg] 
targLabels = [xlab, xlab, xlab, 
              tlab, tlab, tlab, 
              llab, llab, llab] 

print "Training ended" 
# examine results 
n.setLearning(0)   
n.setInteractive(1)   
for pat in range(len(testImages)): 
    print "testing pattern", pat 
    n.step(imageInput = testImages[pat], \
           imageOutput = targImages[pat], \
           labelInput = testLabels[pat], \
           labelOutput = targLabels[pat]) 
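The mapInput and mapTarget calls above tell Conx at what offset each layer's values begin within every 12-element training pattern. The slicing they imply can be illustrated in plain Python (a minimal sketch, not Conx's actual implementation):

```python
# Each training pattern is a 9-element image followed by a 3-element label.
pattern = [1.0, 0.0, 1.0,
           0.0, 1.0, 0.0,
           1.0, 0.0, 1.0,   # an "X" image (9 values)
           1.0, 0.0, 0.0]   # its label (3 values)

IMAGE_SIZE = 9   # size of the imageInput/imageOutput layers
LABEL_SIZE = 3   # size of the labelInput/labelOutput layers

# mapInput('imageInput', 0) corresponds to the slice starting at offset 0:
image_part = pattern[0:IMAGE_SIZE]
# mapInput('labelInput', 9) corresponds to the slice starting at offset 9:
label_part = pattern[IMAGE_SIZE:IMAGE_SIZE + LABEL_SIZE]

print(image_part)  # the 9 image pixels
print(label_part)  # the 3 label units
```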

Adapting this program to use real camera images

When adapting this model for real camera images, the amount of processing increases dramatically: a typical frame contains hundreds or thousands of pixels, so a single epoch may take 15 or 20 minutes to complete. As a result, there are several changes you should make to the program.
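One way to keep training tractable is to downsample each frame to a small grid and scale pixel values into the 0-1 range before presenting it to the network. The helper below is a hypothetical sketch of that preprocessing step; the function name and grid size are assumptions, not part of Pyro.

```python
def imageToPattern(pixels, width, height, gridW=8, gridH=8, maxVal=255.0):
    """Downsample a grayscale image (a flat, row-major list of pixel
    values) to a gridW x gridH grid of 0-1 activations."""
    pattern = []
    for gy in range(gridH):
        for gx in range(gridW):
            # average the block of source pixels that maps to this cell
            x0 = gx * width // gridW
            x1 = max(x0 + 1, (gx + 1) * width // gridW)
            y0 = gy * height // gridH
            y1 = max(y0 + 1, (gy + 1) * height // gridH)
            total = 0.0
            for y in range(y0, y1):
                for x in range(x0, x1):
                    total += pixels[y * width + x]
            pattern.append(total / ((x1 - x0) * (y1 - y0) * maxVal))
    return pattern

# a 16x16 all-white frame reduces to 64 activations of 1.0
frame = [255] * (16 * 16)
vec = imageToPattern(frame, 16, 16)
print(len(vec), vec[0])  # 64 1.0
```

The resulting 64-element vectors would replace the 9-element toy images above, with the layer sizes and mapInput offsets adjusted to match.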

Saving the weights

When running long experiments of this kind, it is important to save the weights of the network periodically during training. Then if the experiment gets interrupted for some reason, it can be restarted with the most recently saved weights rather than having to restart from scratch.

We can replace n.train() in the example above with the following loop:

# the sweep loop
epoch = 1
tssErr = 1.0
totalCorrect = 0
totalCount = 1
while epoch < 1000 and totalCorrect * 1.0 / totalCount < n.stopPercent:
    (tssErr, totalCorrect, totalCount, totalPCorrect) = n.sweep()
    print "Epoch #%6d" % epoch, "| TSS Error: %.2f" % tssErr, \
          "| Correct =", totalCorrect * 1.0 / totalCount, \
          "| RMS Error: %.2f" % n.RMSError()
    if epoch % 10 == 0:
        n.saveWeightsToFile("epoch" + str(epoch) + ".wts")
    epoch += 1

# save final weights
n.saveWeightsToFile("final.wts")

In this loop, the modulus test (epoch % 10 == 0) saves the weights every 10 epochs under a filename that records the epoch number, and once the loop ends the final weights are saved to final.wts.
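To restart an interrupted run, you can look for the highest-numbered checkpoint file and resume from it. A possible sketch (the file-naming scheme matches the loop above; the helper name is an assumption):

```python
import glob
import re

def latestCheckpoint(directory="."):
    """Return (epoch, filename) for the most recent epochN.wts file,
    or (0, None) if no checkpoint exists."""
    best = (0, None)
    for name in glob.glob(directory + "/epoch*.wts"):
        m = re.search(r"epoch(\d+)\.wts$", name)
        if m and int(m.group(1)) > best[0]:
            best = (int(m.group(1)), name)
    return best

# usage sketch:
# epoch, fname = latestCheckpoint()
# if fname is not None:
#     n.loadWeightsFromFile(fname)   # resume training from this point
#     epoch += 1
```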

Saving the test results

When adapting this model for more realistic images, doing the testing interactively (as shown previously) may not be practical. Rather than seeing the activations of the entire network, it would be more useful to simply see the total error for each output layer. Below is a testing program, which can be run after training has been completed, that saves these error results to a file.

from pyrobot.brain.conx import *

# Recreate the network architecture
n = Network()

# add layers
n.add(Layer('imageInput', 9))  
n.add(Layer('labelInput', 3))  
n.add(Layer('imageHidden', 3))
n.add(Layer('labelHidden', 3))
n.add(Layer('sharedHidden', 5))
n.add(Layer('imageOutput', 9)) 
n.add(Layer('labelOutput', 3)) 

# add connections
n.connect('imageInput', 'imageHidden') 
n.connect('labelInput', 'labelHidden') 
n.connect('imageHidden', 'sharedHidden') 
n.connect('labelHidden', 'sharedHidden') 
n.connect('sharedHidden', 'imageOutput') 
n.connect('sharedHidden', 'labelOutput')

print "Loading saved weights"
n.loadWeightsFromFile("final.wts")
print "Done"

# provide testing patterns

ximg = [1.0, 0.0, 1.0,
        0.0, 1.0, 0.0,
        1.0, 0.0, 1.0]
ximgd= [1.0, 0.0, 1.0,
        0.0, 1.0, 0.0,
        1.0, 0.0, 0.0]
timg = [1.0, 1.0, 1.0,
        0.0, 1.0, 0.0,
        0.0, 1.0, 0.0]
timgd= [1.0, 1.0, 1.0,
        0.0, 0.0, 0.0,
        0.0, 1.0, 0.0]
limg = [1.0, 0.0, 0.0,
        1.0, 0.0, 0.0,
        1.0, 1.0, 1.0]
limgd= [1.0, 0.0, 0.0,
        1.0, 0.0, 0.0,
        1.0, 1.0, 0.0]

# These are the arbitrary, orthogonal labels
# for the three types of images.

xlab = [1.0, 0.0, 0.0]
tlab = [0.0, 1.0, 0.0]
llab = [0.0, 0.0, 1.0]

# This is a blank image and label used in testing.

bimg = [0.0, 0.0, 0.0,
        0.0, 0.0, 0.0,
        0.0, 0.0, 0.0]
blab = [0.0, 0.0, 0.0]


testImages = [ximgd, bimg, ximgd,
              timgd, bimg, timgd,
              limgd, bimg, limgd]
testLabels = [blab, xlab, xlab,
              blab, tlab, tlab,
              blab, llab, llab]
targImages = [ximg, ximg, ximg,
              timg, timg, timg,
              limg, limg, limg]
targLabels = [xlab, xlab, xlab,
              tlab, tlab, tlab,
              llab, llab, llab]

# save error results to a file
n.setLearning(0)  
out = open("shared.err", "w") 

for pat in range(len(testImages)):
    print "testing pattern", pat
    n.step(imageInput = testImages[pat], \
           imageOutput = targImages[pat], \
           labelInput = testLabels[pat], \
           labelOutput = targLabels[pat])
    imgErr = n.getLayer('imageOutput').TSSError()
    labErr = n.getLayer('labelOutput').TSSError()
    out.write("Pattern " + str(pat) + "\n")
    out.write("imageOutput error: " + str(imgErr) + "\n")
    out.write("labelOutput error: " + str(labErr) + "\n")
    out.write("----------------------------------------------\n")

# close the file
out.close()
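Once shared.err has been written, the per-pattern errors can be read back for plotting or summary statistics. Below is a small parser sketch; it assumes the exact line format written by the testing program above.

```python
def readErrorFile(filename):
    """Parse the error file written by the testing program into a list
    of (imageError, labelError) tuples, one per test pattern."""
    results = []
    imgErr = None
    with open(filename) as f:
        for line in f:
            if line.startswith("imageOutput error:"):
                imgErr = float(line.split(":")[1])
            elif line.startswith("labelOutput error:"):
                results.append((imgErr, float(line.split(":")[1])))
    return results

# usage sketch:
# for pat, (imgErr, labErr) in enumerate(readErrorFile("shared.err")):
#     print(pat, imgErr, labErr)
```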

Further Reading

Plunkett, K., Sinha, C., Moller, M. F., & Strandsby, O. (1992). Symbol grounding or the emergence of symbols? Vocabulary growth in children and a connectionist net. Connection Science, 4(3-4).
