Building Neural Networks Using Conx
To create neural networks in Pyro we will use the Conx module, a Python-based package that provides an API for neural network scripting. Conx was designed to be used by connectionist researchers, as well as a teaching tool for AI and robotics courses. The idea behind this system is to allow experimenters to quickly and easily create, train, and test basic architectures such as feedforward and simple recurrent neural networks. Conx can be used independently of Pyro for any kind of neural network modeling. It can also be used within Pyro to write neural-network-controlled brains for robots, which makes it an excellent choice for robot learning experiments.
A First Network
Let us first jump right into Conx to create a neural network that will learn the AND function from the previous section. Later, we will return and introduce the API more formally. The training data set for the AND network is reproduced below.
Input A   Input B   Output
  0.0       0.0      0.0
  0.0       1.0      0.0
  1.0       0.0      0.0
  1.0       1.0      1.0
Example: A First Network
It only takes a few commands to create a network in Conx. The program below implements a network containing an input layer with two input nodes and an output layer with one output node to solve the AND problem. The AND network should return an output close to 1.0 when both of its inputs are 1.0. Otherwise it should return an output close to 0.0. Given the values in the data set, we do not need to do any scaling. We will set EPSILON to 0.5, report progress after every epoch, and accept values as correct within a tolerance of 20% (i.e., 0.2).
# import all the conx API
from pyrobot.brain.conx import *

# create the network
n = Network()

# add layers in the order they will be connected
n.addLayer('input', 2)        # The input layer has two nodes
n.addLayer('output', 1)       # The output layer has one node
n.connect('input', 'output')  # The input layer is connected to the output layer

# provide training patterns (inputs and outputs)
n.setInputs([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
n.setOutputs([[0.0], [0.0], [0.0], [1.0]])

# set learning parameters
n.setEpsilon(0.5)
n.setTolerance(0.2)
n.setReportRate(1)

# learn
n.train()
As demonstrated in the program above, the main steps to creating a network in Conx are:
Create an instance of the Network class.
Add the appropriate number of layers to the network. The addLayer method expects a name, given as a string, and the number of nodes in the layer.
Connect the appropriate layers together by giving the "from" name and then the "to" name.
Set the patterns to be learned. Both the methods setInputs and setOutputs expect a list of lists. A teacher pattern should be in the same list position as its corresponding input pattern.
Set the parameters for learning, including the learning rate (epsilon), tolerance, and report rate. The report rate causes progress messages to be printed every n epochs; in this example, messages are printed after every epoch.
Lastly, train the network.
Go ahead and copy the above program and save it in a file called NNand.py. Then run it in Python, as shown below. You should see output similar to the following:
$ python NNand.py
Epoch #     1 | TSS Error: 1.02 | Correct = 0.0 | RMS Error: 0.48
Epoch #     2 | TSS Error: 1.03 | Correct = 0.0 | RMS Error: 0.78
Epoch #     3 | TSS Error: 0.79 | Correct = 0.0 | RMS Error: 0.26
Epoch #     4 | TSS Error: 0.72 | Correct = 0.25 | RMS Error: 0.34
Epoch #     5 | TSS Error: 0.46 | Correct = 0.25 | RMS Error: 0.36
Epoch #     6 | TSS Error: 0.41 | Correct = 0.25 | RMS Error: 0.45
Epoch #     7 | TSS Error: 0.32 | Correct = 0.25 | RMS Error: 0.24
Epoch #     8 | TSS Error: 0.23 | Correct = 0.25 | RMS Error: 0.32
Epoch #     9 | TSS Error: 0.21 | Correct = 0.25 | RMS Error: 0.24
Epoch #    10 | TSS Error: 0.18 | Correct = 0.25 | RMS Error: 0.24
Epoch #    11 | TSS Error: 0.15 | Correct = 0.25 | RMS Error: 0.22
Epoch #    12 | TSS Error: 0.13 | Correct = 0.5 | RMS Error: 0.21
Epoch #    13 | TSS Error: 0.12 | Correct = 0.75 | RMS Error: 0.18
Epoch #    14 | TSS Error: 0.12 | Correct = 0.75 | RMS Error: 0.27
Epoch #    15 | TSS Error: 0.11 | Correct = 0.75 | RMS Error: 0.26
Epoch #    16 | TSS Error: 0.09 | Correct = 0.75 | RMS Error: 0.16
Epoch #    17 | TSS Error: 0.09 | Correct = 1.0 | RMS Error: 0.18
----------------------------------------------------
Final #    18 | TSS Error: 0.09 | Correct = 1.0
----------------------------------------------------
The above output shows that this particular network learned the AND data set after 18 epochs of training. Each line of the output is reported after every epoch (the report rate was set to 1) and contains the epoch number, the TSS (total sum of squares) error for that epoch, the fraction of patterns correctly identified by the network, and the RMS (root mean square) error. Notice that the TSS error decreased from 1.02 to 0.09 and, after a brief rise at epoch 2, fell steadily from one epoch to the next.
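The error measures in the report can be reproduced with a few lines of plain Python. The sketch below assumes the conventional definitions (TSS as the sum of squared errors over all patterns, RMS as the root of the mean squared error, and "correct" as within tolerance of the target); Conx's exact bookkeeping may differ slightly, and the output values here are illustrative rather than taken from a real run.

```python
import math

# One epoch's targets and actual outputs for the four AND patterns
# (illustrative values, not from a real run)
targets = [0.0, 0.0, 0.0, 1.0]
outputs = [0.1, 0.2, 0.15, 0.7]

# TSS: total sum of squared errors over all patterns
tss = sum((t - o) ** 2 for t, o in zip(targets, outputs))

# RMS: root of the mean squared error
rms = math.sqrt(tss / len(targets))

# A pattern counts as correct when the output is within tolerance of its target
tolerance = 0.2
correct = sum(abs(t - o) <= tolerance for t, o in zip(targets, outputs))

print(tss, rms, correct / float(len(targets)))
```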
Exercise 1: Run the above program with different values of EPSILON (say, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2) and record the number of epochs it takes to train. Fill in the table below:
Exercise 2: Repeat the above and run the experiment several times for the same values of EPSILON. You will notice that it takes a different number of epochs each time. Record at least five runs for each value of EPSILON in the table below.
EPSILON   Epochs: Trial #1   Trial #2   Trial #3   Trial #4   Trial #5   Average # Epochs
Testing the trained network
In order to see exactly how the network is responding to each input pattern once it is trained, add the following lines to the end of your NNand.py program. Here we turn the learning off, so that we can test the network without changing the weights. Then we turn interactive on, so that the activations of the network will be displayed. Finally, we propagate the set of patterns through the network to see the results.
# verify learning
n.setLearning(0)
n.setInteractive(1)
n.sweep()
Run the program with the above modifications. After training is complete, the program starts presenting input patterns to the network and gives you the output generated by the network. After each pattern, you will be prompted to quit or continue. Hitting the RETURN key will continue with another pattern and hitting 'q' will quit. This way you can test and ensure that the network has actually learned. A session is shown below:
Final #    16 | TSS Error: 0.10 | Correct = 1.0
----------------------------------------------------
-----------------------------------Pattern # 3
Display network 'Backprop Network':
=============================
Display Layer 'output' (kind Output):
Target    : 0.00
Activation: 0.17
=============================
Display Layer 'input' (kind Input):
Activation: 1.00 0.00
--More-- [quit, go]
-----------------------------------Pattern # 4
Display network 'Backprop Network':
=============================
Display Layer 'output' (kind Output):
Target    : 1.00
Activation: 0.81
=============================
Display Layer 'input' (kind Input):
Activation: 1.00 1.00
--More-- [quit, go]
-----------------------------------Pattern # 2
Display network 'Backprop Network':
=============================
Display Layer 'output' (kind Output):
Target    : 0.00
Activation: 0.16
=============================
Display Layer 'input' (kind Input):
Activation: 0.00 1.00
--More-- [quit, go]
-----------------------------------Pattern # 1
Display network 'Backprop Network':
=============================
Display Layer 'output' (kind Output):
Target    : 0.00
Activation: 0.01
=============================
Display Layer 'input' (kind Input):
Activation: 0.00 0.00
Notice that all patterns are correct within the specified tolerance (0.2). Also notice that the patterns are presented in random order. In the API you can control how patterns are presented during testing and during training. By default, patterns are presented in random order.
If you would like to see how the weights have changed from their initial values to the final values, you can save the weights to a file as shown below. For the AND network there are only three parameters: a bias for the output node, and two weights from the input nodes to the output node.
# save the initial random weights
n.saveWeightsToFile("before.wts")
n.train()
# save the learned weights
n.saveWeightsToFile("after.wts")
Run the program with the above changes. After training is complete (you may also go through a testing phase here if the testing code is still there), you will see the two weight files in your working directory. For the example above, the weight file has the following contents:
-4.9495515489301756 3.4171966132860616 3.3065517183219439
The first weight is the bias and the other two are the weights on the two inputs. Thus, for example, when you apply pattern #4 (1.0, 1.0), which should produce an output close to 1.0, you can apply the transfer function yourself as follows:
First, compute the net input: -4.9495515489301756 + 1.0*3.4171966132860616 + 1.0*3.3065517183219439 = 1.774 (approx)

Next, apply the activation function to compute the output: f(1.774) = 1/(1 + e^(-1.774)) = 0.855 (approx)
0.855 is within the tolerance (i.e., close enough to 1.0), so the output is as expected. You can also verify this in the interactive output above. Since this was a very simple network, it was easy to demonstrate the calculation, which makes for an understandable example. In general, you will have a large input layer connected to a hidden layer, and the hidden layer connected to the output layer, so the calculation above quickly becomes formidable to do by hand.
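The hand calculation above can be checked in plain Python. This is an independent sketch of the same arithmetic, not Conx code; the three numbers are the weights from the example file above.

```python
import math

def sigmoid(x):
    # the logistic activation function used in the hand calculation
    return 1.0 / (1.0 + math.exp(-x))

bias = -4.9495515489301756
w_a = 3.4171966132860616
w_b = 3.3065517183219439

# pattern #4: both inputs are 1.0
net = bias + 1.0 * w_a + 1.0 * w_b
out = sigmoid(net)
print(round(net, 3), round(out, 3))  # net is about 1.774, output about 0.855
```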
Saved weights can be reloaded into a network at a later date using the method loadWeightsFromFile(filename). Before reading in the weights, however, you must first create a network with the same architecture as the one that saved them.
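The weight file shown earlier is plain text, so you can inspect it with ordinary Python. The sketch below writes and reads three numbers the way the AND example's file looks; it is only an illustration of the file's contents (with a hypothetical file name), not Conx's own saveWeightsToFile/loadWeightsFromFile implementation.

```python
# hypothetical file name; the three values match the AND example above
weights = [-4.9495515489301756, 3.4171966132860616, 3.3065517183219439]

with open("demo.wts", "w") as f:
    for w in weights:
        # repr() keeps full floating-point precision in the text file
        f.write("%r\n" % w)

# read the weights back and confirm the round trip preserved them exactly
with open("demo.wts") as f:
    loaded = [float(line) for line in f]

print(loaded == weights)  # True
```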
To interactively examine a weight between two units:
n.getWeights("fromLayerName", "toLayerName")[fromPos][toPos]
# or this new, shorter syntax:
n["fromLayerName", "toLayerName"][fromPos][toPos]
Notice that when passing arguments into functions in Conx, always use the order "from" first, then "to".
To examine a unit's bias weight:
n.getLayer("layerName").weight[pos]
# or this new, shorter syntax, in which the bias is an array:
n["layerName"].weight[pos]
# or, yet another perspective, in which you access a Node object:
n["layerName"][pos].weight
Exercise 3: Modify the AND network to create an OR network
Copy your NNand.py program to one called NNor.py. Modify the outputs appropriately in your new file to solve the OR problem. The OR network should output a 1.0 when either of its inputs is 1.0; otherwise it should output a 0.0. Check the weights after training is complete and verify that they produce the correct output.
Exercise 4: Modify the AND network to create an XOR network
Copy your NNand.py program to one called NNxor.py. Modify the outputs appropriately in your new file to solve the XOR problem. The XOR network should output a 1.0 when exactly one of its inputs is 1.0; otherwise it should output a 0.0. The XOR problem is a well-known example of a task that a simple two-layer network cannot solve. Before adding a hidden layer, try training the simple two-layer network on the XOR problem. What happens? In order to learn XOR, you will need to add a hidden layer of at least two nodes.
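To see why a hidden layer of two nodes is enough, here is a sketch in plain Python with hand-picked weights (not weights that backprop would necessarily find): one hidden unit roughly computes OR, the other roughly computes NAND, and the output unit ANDs them together, which is exactly XOR.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def xor_net(a, b):
    # hand-picked weights; the large magnitudes push each sigmoid near 0 or 1
    h1 = sigmoid(20 * a + 20 * b - 10)    # roughly OR(a, b)
    h2 = sigmoid(-20 * a - 20 * b + 30)   # roughly NAND(a, b)
    return sigmoid(20 * h1 + 20 * h2 - 30)  # roughly AND(h1, h2)

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    target = float(a != b)
    out = xor_net(a, b)
    # every pattern lands within the 0.2 tolerance used earlier
    assert abs(out - target) < 0.2
    print(a, b, round(out, 2))
```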
Conx is sensitive to the order in which the layers are added. It is important to add the layers in the order they will be connected to one another, i.e., input first, then hidden, and finally output. Be sure to connect the input to the hidden and the hidden to the output. Conx provides a shortcut for building a standard three-layer network, since it is done so frequently: the method addLayers(), shown in the next section.