The C implementation of quickprop was developed by Terry Regier at the University of California, Berkeley. It is a translation of Scott Fahlman's Common Lisp code. The purpose of this exercise was to determine whether the Conx version of quickprop was identical to the C version. To find out, I built identically structured networks in both systems, started them from identical weights, and stepped through the process, examining the weight change that each system calculated for the next step.
I found three differences in how the algorithm was handled.
The first was simply a switch setting. Fahlman's algorithm defaults the mu factor to 1.75; Conx used 2.25.
The second difference is the number of nodes considered when calculating the weight changes. Fahlman's calculations included the bias node; Conx did not. (This is referred to as the n change in the results.)
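One place where the node count could matter is sketched below. It assumes the count is used to divide the learning rate by a unit's fan-in (sometimes called split epsilon), which this note does not state. The function name and arguments are hypothetical and are not taken from Conx or the C code.

```python
def split_epsilon(epsilon, n_incoming, include_bias=True):
    """Scale the learning rate by a unit's fan-in.

    Hypothetical helper: with include_bias=True the bias node counts
    as one extra incoming connection, so the divisor grows by one.
    """
    n = n_incoming + 1 if include_bias else n_incoming
    return epsilon / n
```

Under this assumption, counting the bias node makes every step slightly smaller, and the effect is largest for units with few incoming connections.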
Finally, when Fahlman's program first runs the algorithm, the wed (weight error delta) values are initialized to the decay factor times the weight of the connections at each node. The Conx algorithm did not do this initial calculation. (This is referred to as the init change in the results.)
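A minimal sketch of that initial calculation follows. It assumes the weights are stored as one list of incoming weights per node. The function name, the data layout, and the decay value are illustrative only and do not come from either code base.

```python
def init_wed(weights, decay):
    """Return starting wed values: decay factor times each weight."""
    return [[decay * w for w in row] for row in weights]


# Two nodes with two incoming connections each (made-up numbers).
weights = [[0.5, -0.25], [1.0, 0.0]]
wed = init_wed(weights, decay=0.5)
```

With a decay factor of zero every starting value is zero, which is equivalent to skipping the calculation.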
These three changes were implemented in the Conx code, and I ran the 10-5-10, 4-2-4, and 8-3-8 experiments that are described in Fahlman's paper. I wondered whether the initialization of the weight error delta was just an oversight, so I decided to run the tests both with and without this change, as well as with and without the bias node included. The results were consistently poorer than Fahlman's, so I also ran a few trials of the 10-5-10 test using the C code. Although I did not run as many trials, it seemed that the C code was not going to perform as well as Fahlman's code either. The results can be seen here.
We decided to change the default setting for the mu factor in Conx to 1.75 and to include the bias node in the weight change calculations. We decided not to initialize the delta weights because doing so did not seem to add any significant improvement. In addition, Fahlman's paper considers how to handle the situation when the weight delta error is zero, either at startup or during the process. The weight change function handles this condition.
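The sketch below shows one way a quickprop weight change for a single connection can be written, with mu defaulting to 1.75. It is not the Conx or C code. It assumes that slope and prev_slope hold dE/dw for the current and previous epoch, that the zero case means a previous weight change of exactly zero, and that this case falls back to plain gradient descent. All names and the epsilon default are hypothetical.

```python
def quickprop_step(slope, prev_slope, prev_step, epsilon=0.5, mu=1.75):
    """Return the next weight change for one connection (a sketch)."""
    if prev_step == 0.0:
        # Nothing to extrapolate from (startup or a stalled weight):
        # take an ordinary gradient descent step instead.
        return -epsilon * slope

    shrink = mu / (1.0 + mu)
    step = 0.0

    # Add the gradient term only while the slope still points the
    # same way as the previous step.
    if (prev_step > 0.0 and slope < 0.0) or (prev_step < 0.0 and slope > 0.0):
        step -= epsilon * slope

    # Slope has barely shrunk: cap growth at mu times the last step.
    if prev_step > 0.0:
        capped = slope < shrink * prev_slope
    else:
        capped = slope > shrink * prev_slope

    if capped:
        step += mu * prev_step
    elif prev_slope != slope:
        # Jump toward the minimum of the parabola fitted through the
        # two slope measurements.
        step += prev_step * slope / (prev_slope - slope)
    # If the two slopes are equal the parabola is undefined; this
    # sketch adds no quadratic term (a guard added here, not taken
    # from the source).
    return step
```

For example, with a previous step of 1.0, a slope that moved from -2.0 to -1.0, and epsilon set to 0.0, the function returns 1.0: the slope is halfway to zero, so the fitted parabola places the minimum one more step of the same size ahead.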
