Developmental Robotics group Summer 2006
This page documents the activities of the Developmental Robotics Research Group for summer 2006. The group is composed of faculty member Douglas Blank and students George Dahl (Swarthmore), Julia Ferraioli (Bryn Mawr), and Leslie McTavish (Bryn Mawr).
Our goal is to explore the possibility of a robot that can begin to learn, starting with nothing but a seed program, everything that it needs to exhibit intelligent behavior. We imagine a robot that is self-motivated to explore what it doesn't understand and that becomes bored with what it does. This research is part of a long-term agenda to explore developmental systems. We are considering (and developing) various neural network models of learning and memory.
Past summers' research:
-
DevelopmentalRoboticsSummer2002: PyroModuleSelfOrganizingMap
-
DevelopmentalRoboticsSummer2005: XORNoise, Emergent Framework for DevRob
Here, we will keep a log of some of the major activities carried out and/or planned for the summer.
Go to DevelopmentalRobotics for a starting point to these activities.
Summer Schedule
We will have regular group meetings at some day/time TBA.
-
Doug will be away May 22 - May 24 (Seattle, WA)
-
Doug will be away June 1 - June 6 (Bloomington, IN)
-
AAAI Robot Competition and Exhibition July 16 - July 20 (Boston, MA)
Events that Summer Science Research Fellowship Recipients are required to attend:
-
Safety Training: June 1st, 9:30am - 12noon, PSB 243
-
Science Writing & Poster Design: June 15th, 1pm - 4pm, PSB 232
-
Career Day: June 29th, 2pm - 4pm, PSB 243
-
Ethics/Lab Survival Skills Workshop: July 13th, TBA, TBA
-
Career and Resume Workshop: July 27th, TBA, TBA
-
Poster Session: September 7th, 3pm - 5pm, Campus Center
                   2006
        May                    June
Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa
    1  2  3  4  5  6                1  2  3
 7  8  9 10 11 12 13    4  5  6  7  8  9 10
14 15 16 17 18 19 20   11 12 13 14 15 16 17
21 22 23 24 25 26 27   18 19 20 21 22 23 24
28 29 30 31            25 26 27 28 29 30
        July                  August
Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa
                   1          1  2  3  4  5
 2  3  4  5  6  7  8    6  7  8  9 10 11 12
 9 10 11 12 13 14 15   13 14 15 16 17 18 19
16 17 18 19 20 21 22   20 21 22 23 24 25 26
23 24 25 26 27 28 29   27 28 29 30 31
30 31
Ideas to Explore
-
Explore Growing Neural Gas (GNG)
-
Explore Cascade Correlation
-
Explore attention and learning
Log
-
A log of weekly/daily events that others should know about.
Week 1
This week, we primarily read the papers listed in the paper summaries section below. We discussed them and also ventured into the subjects of self-organizing maps and resource allocating vector quantizers. After discussing different types of neural networks, we implemented our own, based on the bumper values of a simulated robot (see DevelopmentalRoboticsExperiment1).
Week 2
Summary of the Quickprop Algorithm (QuickPropSummary)
Week 3
Added mdp and scipy to the pyrobot yum FC5 repository. To install: yum install python-scipy python-mdp. I had to disable some fancy things in mdp to get it to work with the newest version of scipy.
Implemented k-means in Python, along with a quick-and-dirty chart for visualizing the resulting clusters. Changed scatter.py a bit to make customizing colors and sizes easier. Experimented more with GNG and mdp.
Worked on implementing cascade correlation in conx. Currently debugging.
Worked on defining the differences between Fahlman's C code for quickprop and the conx version.
Week 4
Debugging Cascade-correlation continues. Weights grow to infinity for an unknown reason. Upgraded from print statements to Pdb. -G
Cascade-correlation issues seem to be resolved. I figured out why the weights grew to infinity and fixed the problem. Discussion linked to below. -G
Week 5
EvolutionOfLanguage - demonstration of running Pyrobot Simulator in faster-than-real time on an evolving robot problem.
SummerResearchAbstract -- a work in progress about the role of social interaction in developmental systems.
Discussed elements of proposed developmental system. They included pressures, attention, intrinsic motivation, social interaction, mimicry and goals. Read papers from Connection Science to be discussed in Week 6. Experimented with pyrobot's worlds and brains. Created a brain that moved in circles by following its longest sensor. Recreated DevelopmentalRoboticsExperiment1.
Week 6
Modified the avoid brain slightly to respond to differences between the right and left sensors when head-on with an object. Gathered data from this brain to train a neural network. With the weights resulting from this training, the robot seems to follow a path rather than avoid objects, which I did not expect. Trained it again in an empty room to get less biased data, which seemed to work better than the tutorial world. Details in ModifiedAvoidTraining.
Week 7
Worked on comparing Fahlman's candidate training phase to our candidate training phase. Discussion on: CandidateTrainingDifferences -G
Links
Papers
-
Combining Cascade-correlation and Temporal Difference Learning
-
Cascade-correlation Learning Architecture Fahlman's Original Cascade-correlation paper
-
Recurrent Cascade-correlation Cascade-correlation for networks with recurrent units
-
The improved Rprop algorithm A batch learning algorithm that may have some advantages over Quickprop
-
http://www-2.cs.cmu.edu/afs/cs.cmu.edu/user/sef/www/publications/qp-tr.ps Faster-Learning Variations on Back-Propagation: An Empirical Study (In this paper Fahlman introduced the Quickprop learning algorithm)
-
Connection Science - Special Issue on Developmental Robotics (id: student, password: TBA)
-
Gender and Name Phonology - Paper for Psychology Dept
-
PCA - A Tutorial on Principal Components Analysis - Lindsay I Smith.
-
Backprop Tricks - Look at Section 4.1 for discussion of online versus offline training (batch versus stochastic incremental)
-
Large Scale Online Learning - Argues for asymptotic advantages of online algorithms in cases where training data is abundant but computational resources are scarce
-
The playground experiment: Task-independent development of a curious robot Oudeyer, Kaplan, Hafner and Whyte on IAC learning
Applets
-
GNG Applet - an applet that shows different types of competitive models, very interesting
-
K-Means Applet - an applet that demonstrates k-means...you can create your own clusters, data points, and choose your own metric
Misc
-
SOM images - example map made from black and white Khepera images
-
GovernorForNeuralNetworks - example of the governor applied to the "Figure Eight" hallway problem
-
SciPy - scientific python
-
MDP - Modular toolkit for Data Processing
-
Recurrent Cascade Correlation - Fahlman's RCC code
-
The Playground Experiment - Movies, description, etc...
Paper Summaries
-
Blank, D.S., Kumar, D., Meeden, L., and Marshall, J. (2005). Bringing up robot: Fundamental mechanisms for creating a self-motivated, self-organizing architecture. Cybernetics and Systems, 36(2).
PDF
The focus of this paper was to steer us away from the traditional preconception (or misconception) of looking at robotics with an anthropomorphic bias. While traditional approaches to tasks might be good enough for humans, it is extremely unlikely that a robot will perceive the same task in the same way. Thus enters the idea of developmental robotics, which centers on the idea of letting the robot not only decide its own methods to do a task, but pretty much explore everything else on its own as well. Let the robot discover its own capabilities, and then create goals and behaviours based on a developmental algorithm. This eliminates altogether the need for prespecified tasks.
The robot will shy away from environments that seem chaotic, but become interested in "new and exciting" environments. In other words, it will be attracted to situations it cannot predict. To do this, it must be able to discover abstractions in its environment, using structures such as self-organizing maps (SOMs) and resource allocating vector quantizers (RAVQs). See links below for more information. To avoid catastrophic forgetting, a network governor is implemented using a RAVQ, which regulates the flow of training patterns. This way, more common patterns are not seen by the network as frequently as they appear. Therefore, less common patterns are less likely to be overwritten, or "forgotten". Once training is complete, the governor is taken out of the picture, and the network acts alone on the tasks.
Two experiments are addressed in this paper: one was a wall-following task using a governor, and the other was an abstraction experiment that tried to find goals. The details may be found in the link above.
-
Blank, Douglas S., Lewis, Joshua M., and Marshall, James B. (2005) The Multiple Roles of Anticipation in Developmental Robotics. AAAI Fall Symposium Workshop Notes, From Reactive to Anticipatory Cognitive Embodied Systems. AAAI Press.
PDF
This paper discusses some of the ways in which anticipation can be used as a method of enhancing learning systems in developmental robotics. These types of systems are based on the Markovian principle: that any action to be taken is based on the current state. This presents a chicken-and-egg type of dilemma, but it can be solved by allowing the system to evolve its representations of perceptions and actions.
A simple recurrent network (SRN) was developed which used a two-stage training scheme. In one phase of this training scheme, the system combined its current motor 'in' action with its sensory state and context (past experience) to predict what its next sensory state would be. The prediction was compared to the actual state, and a spatial map of prediction error was generated. The task of the robot was to focus its attention on the area of greatest error. This technique created a robot that kept its focus on a moving decoy until it was able to predict the decoy's actions. It then became "bored" and turned its attention away.
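The "focus on the area of greatest error" step can be sketched in a few lines. This is only an illustration, not the paper's implementation: it assumes the predicted and actual sensory states are equally sized 2-D grids of numbers and that error is measured as squared difference. The function names prediction_error_map and focus_of_attention are hypothetical.

```python
def prediction_error_map(predicted, actual):
    """Element-wise squared error between two equally sized 2-D grids.

    Assumption: squared difference stands in for whatever error
    measure the paper actually used.
    """
    return [[(p - a) ** 2 for p, a in zip(p_row, a_row)]
            for p_row, a_row in zip(predicted, actual)]


def focus_of_attention(error_map):
    """Return the (row, column) of the largest prediction error."""
    cells = ((err, r, c)
             for r, row in enumerate(error_map)
             for c, err in enumerate(row))
    best = max(cells, key=lambda item: item[0])
    return best[1], best[2]


if __name__ == "__main__":
    predicted = [[0.0, 0.0], [0.0, 0.0]]
    actual = [[0.1, 0.0], [0.0, 0.9]]
    errors = prediction_error_map(predicted, actual)
    print(focus_of_attention(errors))
```

Once the network predicts a region well, its error drops and the maximum moves elsewhere, which is one way to read the "boredom" described above.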
One problem encountered was that the robot could potentially become fixated on random patterns that it would never be able to predict. The solution was to enable the robot to 'predict' that the situation was unlearnable. To do this, random noisy patterns were introduced into the training patterns, and the network was eventually able to produce valid outputs.
Another technique introduced to facilitate learning was a set of strategies categorized as hints. The idea of catalytic hints, which are carefully constructed learning pairs, was expanded to "autocatalytic hints". The two variations of hints developed were Error Anticipation (EA) and Hidden Layer Anticipation (HLA). It was found that while both improved learning speed, HLA had a much greater impact than EA on both difficult and noisy training sets.
-
Marshall, J., Blank, D., and Meeden, L. (2004). An Emergent Framework for Self-Motivation in Developmental Robotics. International Conference on Development and Learning, 2004.
PDF
-
Daszykowski, M., Walczak, B., and Massart, D. L. (2002). On the Optimal Partitioning of Data with K-Means, Growing K-Means, Neural Gas, and Growing Neural Gas. American Chemical Society, 2002.
PDF
This paper explores four different clustering algorithms: k-means, growing k-means, neural gas and growing neural gas. The goal of all of these algorithms is to partition data into "n" clusters, based on the compositional, or natural, tendencies of the data. K-means picks "k" random points in "n"-dimensional space to be the cluster centers. It then assigns data points to each cluster based on a distance metric. After one run through all of the data, every point has been assigned to a cluster center. The cluster center position is updated to be the mean of the points assigned to it. This process is repeated until the cluster centers are relatively stable.
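The k-means loop described above can be sketched as follows. This is a minimal sketch, not the implementation mentioned in the Week 3 log. It assumes squared Euclidean distance as the metric, starts the centers on randomly chosen data points rather than arbitrary points in space, and stops when the assignments no longer change. The names kmeans, squared_distance, and mean_vector are ours.

```python
import random


def squared_distance(v1, v2):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(v1, v2))


def mean_vector(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = float(len(points))
    return [sum(column) / n for column in zip(*points)]


def kmeans(data, k, max_iterations=100, seed=0):
    """Batch k-means; returns (centers, assignments)."""
    rng = random.Random(seed)
    # Assumption: centers start on k randomly chosen data points.
    centers = [list(p) for p in rng.sample(data, k)]
    assignments = None
    for _ in range(max_iterations):
        # Assign each data point to its nearest cluster center.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in data
        ]
        if new_assignments == assignments:
            break  # assignments are stable, so the centers are too
        assignments = new_assignments
        # Move each center to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(data, assignments) if a == c]
            if members:  # an empty cluster keeps its old center
                centers[c] = mean_vector(members)
    return centers, assignments


if __name__ == "__main__":
    data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
    centers, assignments = kmeans(data, 2)
    print(centers, assignments)
```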
Neural Gas also uses "k" cluster centers, each initialized to a random weight vector. Each data point is compared to every cluster center's weight vector to determine the distances. The distances are sorted to find the winning cluster center and its neighbors, and the weights of the winner and its neighbors are adjusted according to an equation given in the paper. Growing Neural Gas uses the same idea with a few additions. It starts with two cluster centers, initialized to random weight vectors, and an edge between them with an age of zero. A data point is selected at random, and the two closest cluster centers are found and connected by an edge whose age is initialized to zero. If the edge already exists, its age is reset to zero. The cluster center closest to the data point and its topological neighbors are moved fractionally toward the data point. After a certain number of iterations, a new cluster center is introduced between the two nodes that have the maximum squared distance between them and is connected by edges to both. Edges that reach a maximum age die, and cluster centers that have no emanating edges also die.
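As a rough illustration of the rank-based Neural Gas adjustment, here is one update step. It assumes the commonly cited form in which each center moves toward the data point by a step of epsilon * exp(-rank / lam), where rank 0 is the winner; the exact equation and parameter schedule in the paper may differ. The name neural_gas_step and the parameters epsilon and lam are ours.

```python
import math


def neural_gas_step(centers, x, epsilon=0.5, lam=1.0):
    """Apply one rank-based Neural Gas update in place for data point x.

    Returns the center indices ordered from nearest to farthest.
    """
    # Rank the centers by squared distance to x (rank 0 is the winner).
    order = sorted(
        range(len(centers)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(centers[i], x)))
    for rank, i in enumerate(order):
        # Assumed update rule: the step size decays exponentially with rank.
        step = epsilon * math.exp(-rank / lam)
        centers[i] = [w + step * (v - w) for w, v in zip(centers[i], x)]
    return order


if __name__ == "__main__":
    centers = [[0.0, 0.0], [10.0, 10.0]]
    print(neural_gas_step(centers, [1.0, 1.0]))
    print(centers)
```

In practice epsilon and lam are usually decreased over training so that the centers settle; this sketch keeps them fixed.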
Growing K-Means works similarly to both Growing Neural Gas and k-means. It starts with two cluster centers placed randomly in the attribute space of the data and then randomly selects a data point. The closest cluster center is found and moved fractionally toward the data point. Once the cluster centers have seen the whole data set, each data point is assigned to its nearest cluster center, and the squared distance between the data point and its cluster center is added to that center's dispersion variable. A new node is inserted midway between the cluster center with the highest dispersion and the farthest data point assigned to that cluster center. If the maximum number of nodes is reached, the learning rate is decreased.
The paper concludes that while k-means has its advantages (namely, speed), it is outperformed by Growing Neural Gas and Growing K-Means, and that GNG is (slightly) outperformed by Growing K-Means. However, the differences are slight and may be specific to the experiments reported in the paper.
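The Growing K-Means steps can be sketched as follows. This is our reading of the summary, not the paper's code. It assumes the starting centers are copies of random data points and a fixed learning rate, and it simply stops after one more pass once the maximum number of nodes is reached instead of decreasing the learning rate. The names growing_kmeans, sq_dist, and nearest are ours.

```python
import random


def sq_dist(v1, v2):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(v1, v2))


def nearest(centers, p):
    """Index of the cluster center closest to point p."""
    return min(range(len(centers)), key=lambda c: sq_dist(centers[c], p))


def growing_kmeans(data, max_nodes, rate=0.1, seed=0):
    """Grow cluster centers one at a time; returns the list of centers."""
    rng = random.Random(seed)
    # Assumption: the two starting centers are copies of random data points.
    centers = [list(p) for p in rng.sample(data, 2)]
    while True:
        # Online pass: move the nearest center fractionally toward each point.
        for p in rng.sample(data, len(data)):
            c = nearest(centers, p)
            centers[c] = [w + rate * (v - w) for w, v in zip(centers[c], p)]
        if len(centers) >= max_nodes:
            return centers  # simplification: stop rather than lower the rate
        # Accumulate each center's dispersion and remember its farthest point.
        dispersion = [0.0] * len(centers)
        farthest = [None] * len(centers)
        for p in data:
            c = nearest(centers, p)
            d = sq_dist(centers[c], p)
            dispersion[c] += d
            if farthest[c] is None or d > farthest[c][0]:
                farthest[c] = (d, p)
        worst = max(range(len(centers)), key=lambda c: dispersion[c])
        if dispersion[worst] == 0.0:
            return centers  # every point sits on a center; nothing to split
        # Insert a node midway between the worst center and its farthest point.
        far_point = farthest[worst][1]
        centers.append(
            [(w + v) / 2.0 for w, v in zip(centers[worst], far_point)])


if __name__ == "__main__":
    data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10],
            [0, 10], [1, 10], [0, 11]]
    print(growing_kmeans(data, 3))
```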
Glossary and Concepts
-
Principal Component Analysis (PCA):
-
Simple Recurrent Network (SRN):
-
Self-Organizing Map (SOM):
-
Resource Allocating Vector Quantizer (RAVQ) and Adaptive RAVQ:
-
Backpropagation of Error Network:
-
Reinforcement Learning (RL):
-
Euclidean Distance vs. City-block Distance (aka, Manhattan Distance)
Sample Code
Shell
To experiment with Player/Stage on FC5:
player /usr/share/stage/worlds/everything.cfg & playerv
Common CVS commands:
cvs -d :pserver:anonymous@compscitest.brynmawr.edu:/cvs login
alias cvs='cvs -d :pserver:anonymous@compscitest.brynmawr.edu:/cvs'
cvs co pyrobot
cvs update -d
cvs commit
See also PyroDeveloperCVS
Python
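from math import sqrt

def euclideanDistance(v1, v2):
    # Straight-line distance: square root of the summed squared differences.
    total = 0
    for i in range(len(v1)):
        total += (v1[i] - v2[i]) ** 2
    return sqrt(total)

def cityblockDistance(v1, v2):
    # Manhattan distance: sum of the absolute coordinate differences.
    total = 0
    for i in range(len(v1)):
        total += abs(v1[i] - v2[i])
    return total

# Compact equivalents of the two functions above.
def eDistance(v1, v2):
    return sqrt(sum([(i - j) ** 2 for (i, j) in zip(v1, v2)]))

def cbDistance(v1, v2):
    return sum([abs(i - j) for (i, j) in zip(v1, v2)])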
Experiments
To be Installed
YUM installs
-
audacity
-
mono-winforms
-
pyrobot player stage
-
beagle
-
wv
-
lapack lapack-devel
-
atlas atlas-devel
-
python-psyco
-
python-nltk
-
python-numeric
-
numpy
-
python-scipy python-mdp
-
wxPython wxGTK wxGTK-gl
-
gnuplot
-
ipython
-
lam
-
xorg-x11-drv-nvidia kmod-nvidia-smp
-
freeglut-devel
RPM
-
acroread
Make Install
-
pyMPI
Python Tools
-
ipython - a better interactive Python shell. Run yum install ipython to use it.
-
pdb - Python Debugger, from inside Python. For reference, see this pydoc. The subsections detail commands.
-
Pdb - A quick introduction to Pdb
-
idle - a simple integrated development environment (IDE) in Python. Run python /usr/lib/python2.4/idlelib/idle.py
