From chrisspen at gmail.com Mon Jan 8 21:40:33 2007 From: chrisspen at gmail.com (Chris S) Date: Mon Jan 8 21:40:33 2007 Subject: [Pyro-users] Time Difference Learning with Conx Message-ID: I'm trying to use Conx to train a couple of networks to aid in a board game engine, and I'd like to get some feedback on my strategy. The networks will be used in a minimax search to prune the state tree and evaluate the player's position. Both networks will take the board as input, with a layer for each position on the board. One network, the move suggester, will output a layer for each action, indicating how much performing that action will increase the player's score. The second network, the score estimator, will output estimates of each player's score. My question is: what is the best way to train these networks? My current strategy is to do nothing until the game is over. I'll use a static algorithm to reliably score the end game state. Then I'll create a training corpus by taking the score and iterating through each move in the game, creating training sets in the form of [boardstate, finalscore]. For the move suggester, I think I'd have to disable the output layers for the actions not selected. This method seems fairly primitive and naive, but I've never done temporal difference learning with neural networks before. Is there a better way? Any suggestions are appreciated. Regards, Chris From bthom at cs.hmc.edu Mon Jan 8 22:09:04 2007 From: bthom at cs.hmc.edu (belinda thom) Date: Mon Jan 8 22:09:29 2007 Subject: [Pyro-users] Time Difference Learning with Conx In-Reply-To: References: Message-ID: <3DC7E1B4-BAEA-4D10-B506-B344301C2110@cs.hmc.edu> I'm doing something similar right now (although I'm not at present using Conx). I used the algorithm Tom Mitchell suggests at the end of the first chapter of his textbook, Machine Learning. The method is described for a linear activation function, and I don't believe it is any different for non-linear cases.
Updates are easily done for _each_ play in the game as follows: your current training estimate of the value of a state is compared to the value the function estimates for the "best" (in a minimax sense) next state (the state in which the player will next play). Check out his text; it's pretty clear. HTH, --b On Jan 8, 2007, at 6:40 PM, Chris S wrote: > I'm trying to use Conx to train a couple of networks to aid in a board > game engine, and I'd like to get some feedback on my strategy. The > networks will be used in a minimax search to prune the state tree and > evaluate the player's position. Both networks will take the board as > input, with a layer for each position on the board. One network, the > move suggester, will output a layer for each action, indicating how > much performing that action will increase the player's score. The > second network, the score estimator, will output estimates of each > player's score. > > My question is: what is the best way to train these networks? My > current strategy is to do nothing until the game is over. I'll use a > static algorithm to reliably score the end game state. Then I'll > create a training corpus by taking the score and iterating through > each move in the game, creating training sets in the form of > [boardstate, finalscore]. For the move suggester, I think I'd have to > disable the output layers for the actions not selected. > > This method seems fairly primitive and naive, but I've never done temporal > difference learning with neural networks before. Is there a better > way? Any suggestions are appreciated.
> > Regards, > Chris > _______________________________________________ > Pyro-users mailing list > Pyro-users@pyrorobotics.org > http://emergent.brynmawr.edu/mailman/listinfo/pyro-users From chrisspen at gmail.com Mon Jan 8 22:26:34 2007 From: chrisspen at gmail.com (Chris S) Date: Mon Jan 8 22:26:33 2007 Subject: [Pyro-users] Time Difference Learning with Conx In-Reply-To: <3DC7E1B4-BAEA-4D10-B506-B344301C2110@cs.hmc.edu> References: <3DC7E1B4-BAEA-4D10-B506-B344301C2110@cs.hmc.edu> Message-ID: Thanks, I'll check out that text. Chris On 1/8/07, belinda thom wrote: > I'm doing something similar right now (although I'm not at present > using Conx). > > I used the algorithm Tom Mitchell suggests at the end of the first > chapter of his textbook, Machine Learning. > > The method is described for a linear activation function, and I don't > believe it is any different for non-linear cases. > > Updates are easily done for _each_ play in the game as follows: your > current training estimate of the value of a state is compared to the > value the function estimates for the "best" (in a minimax sense) next > state (the state in which the player will next play). Check out his text; it's > pretty clear. > > HTH, > --b From Matthew2.Studley at uwe.ac.uk Wed Jan 10 04:38:30 2007 From: Matthew2.Studley at uwe.ac.uk (matthew studley) Date: Wed Jan 10 04:45:38 2007 Subject: [Pyro-users] Re: Pyro-users Digest, Vol 36, Issue 1 In-Reply-To: <200701091700.l09H03dV026306@emergent.brynmawr.edu> References: <200701091700.l09H03dV026306@emergent.brynmawr.edu> Message-ID: <1168421910.3049.62.camel@localhost.localdomain> > My question is: what is the best way to train these networks? My > current strategy is to do nothing until the game is over. I'll use a > static algorithm to reliably score the end game state. Then I'll > create a training corpus by taking the score and iterating through > each move in the game, creating training sets in the form of > [boardstate, finalscore].
I think you'll run into problems with this training strategy; is each move worth the final score? You might want to look at the TD-Lambda, Q-learning, or Sarsa algorithms. See the book "Reinforcement Learning" by Sutton and Barto; it's online at: http://www.cs.ualberta.ca/%7Esutton/book/ebook/the-book.html Some work by IBM using TD-Lambda to train an ANN to play backgammon: http://www.research.ibm.com/massive/tdl.html regards Matt -- Dr Matthew Studley Artificial Intelligence Group Faculty of Computer Science, Engineering and Mathematics University of the West of England Coldharbour Lane Frenchay Bristol UK BS16 1QY ================================= tel: +44 (0) 11732 83177 mob: +44 (0) 7712 659022 From edzela at yahoo.com Thu Jan 11 09:14:13 2007 From: edzela at yahoo.com (wester zela) Date: Thu Jan 11 09:14:12 2007 Subject: [Pyro-users] automatic parking car Message-ID: <20070111141413.68563.qmail@web39706.mail.mud.yahoo.com> Hi all, I appreciate your help with this topic. I have been working with Pyrobot for only a few months, and I want to know whether I can simulate a car with sensors. Has anyone worked on this? How would I do it? I want to apply and research some algorithms for automatic car parking. Thanks in advance Wester From khasymskia at wlu.edu Sun Jan 14 21:56:44 2007 From: khasymskia at wlu.edu (Alexander Khasymski) Date: Sun Jan 14 21:57:04 2007 Subject: [Pyro-users] Communication between robots Message-ID: <45AAA715.81AA.00A9.0@wlu.edu> Hi, I was wondering if there is a way to implement simple message passing between two robots with different brains? I'm doing a project in language evolution and need a way for robots to pass short strings (5-6 characters) to each other.
The only obvious way I could think of was implementing a brain that controls both robots, in which case message passing would boil down to modifying a few global variables. However, I could not find out how to create such a brain, apart from the fact that it is possible to do so. Any suggestions, comments, or examples of something similar will be greatly appreciated. Thanks, Aleksandr From knerr at cs.swarthmore.edu Mon Jan 15 15:59:40 2007 From: knerr at cs.swarthmore.edu (Jeff Knerr) Date: Mon Jan 15 15:59:39 2007 Subject: [Pyro-users] build problem (libplayerc/playerc.h) on debian Message-ID: <20070115205940.GA28515@basil.cs.swarthmore.edu> Hi Doug/Pyro-users. I'm trying to build a new version of pyrobot and am getting some compile errors: make[1]: Entering directory `/home/knerr/pyrobot/camera/player' swig -python -c++ -I../device/ -I/usr/local/include -I/usr/local/include -o PlayerCam.cc PlayerCam.i cc -c -I/usr/include/python2.4 -I../device/ -I/usr/local/include -I/usr/local/include PlayerCam.cc -o playercam.o In file included from PlayerCam.cc:2567: PlayerCam.h:5:32: libplayerc/playerc.h: No such file or directory In file included from PlayerCam.cc:2567: ... I have a /usr/local/include/playerc.h, but not a libplayerc/playerc.h. If I change the PlayerCam.h to just include "playerc.h", it still doesn't work: cc -Wall -Wno-unused -D_POSIX_THREADS -D_POSIX_THREAD_SAFE_FUNCTIONS -D_REENTRANT -DPOSIX -D__x86__ -D__linux__ -D__OSVERSION__=2 -frepo -DUSINGTHREADS -DLINUX -D_GNU_SOURCE -I/usr/include/python2.4 -I../device/ -I/usr/local/include -I/usr/local/include -o PlayerCam.o -c PlayerCam.cpp PlayerCam.cpp: In constructor `PlayerCam::PlayerCam(char*, int)': PlayerCam.cpp:19: error: `PLAYER_OPEN_MODE' undeclared (first use this function) PlayerCam.cpp:19: error: (Each undeclared identifier is reported only once for each function it appears in.)
make[1]: *** [PlayerCam.o] Error 1 make[1]: Leaving directory `/home/knerr/pyrobot/camera/player' make: *** [camera/player] Error 2 I've built this before, but it's been 6+ months, so maybe some other things (player? stage?) need to be upgraded first? Let me know if you have any ideas. Thanks. jeff debian sarge, 2.6.16 kernel new cvs download of pyrobot (jan 15, 2007) python2.4 (also tried 2.3, no difference) player-1.6.4, cvs version of stage from May 2005, swig-1.3.31, gazebo-0.5.1 From dblank at brynmawr.edu Tue Jan 16 13:09:20 2007 From: dblank at brynmawr.edu (Douglas S. Blank) Date: Tue Jan 16 13:09:41 2007 Subject: [Pyro-users] build problem (libplayerc/playerc.h) on debian In-Reply-To: <20070115205940.GA28515@basil.cs.swarthmore.edu> References: <20070115205940.GA28515@basil.cs.swarthmore.edu> Message-ID: <45AD14D0.7090908@brynmawr.edu> Jeff, I would only upgrade Player/Stage to 2.0.2; it has changed dramatically in the year after that (and Pyro hasn't been upgraded to match yet). When I install player-2.0.2 from sources, I get: /usr/local/include/player-2.0/libplayerc/playerc.h So it looks like you might have a really old player install. -Doug Jeff Knerr wrote: > Hi Doug/Pyro-users. I'm trying to build a new version of pyrobot and am > getting some compile errors: > > make[1]: Entering directory `/home/knerr/pyrobot/camera/player' > swig -python -c++ -I../device/ -I/usr/local/include -I/usr/local/include -o PlayerCam.cc PlayerCam.i > cc -c -I/usr/include/python2.4 -I../device/ -I/usr/local/include -I/usr/local/include PlayerCam.cc -o playercam.o > In file included from PlayerCam.cc:2567: > PlayerCam.h:5:32: libplayerc/playerc.h: No such file or directory > In file included from PlayerCam.cc:2567: > ... > > I have a /usr/local/include/playerc.h, but not a libplayerc/playerc.h.
> If I change the PlayerCam.h to just include "playerc.h", it still doesn't work: > > cc -Wall -Wno-unused -D_POSIX_THREADS -D_POSIX_THREAD_SAFE_FUNCTIONS -D_REENTRANT -DPOSIX -D__x86__ -D__linux__ -D__OSVERSION__=2 -frepo -DUSINGTHREADS -DLINUX -D_GNU_SOURCE -I/usr/include/python2.4 -I../device/ -I/usr/local/include -I/usr/local/include -o PlayerCam.o -c PlayerCam.cpp > PlayerCam.cpp: In constructor `PlayerCam::PlayerCam(char*, int)': > PlayerCam.cpp:19: error: `PLAYER_OPEN_MODE' undeclared (first use this > function) > PlayerCam.cpp:19: error: (Each undeclared identifier is reported only once for > each function it appears in.) > make[1]: *** [PlayerCam.o] Error 1 > make[1]: Leaving directory `/home/knerr/pyrobot/camera/player' > make: *** [camera/player] Error 2 > > I've built this before, but it's been 6+ months, so maybe some other things > (player? stage?) need to be upgraded first? > > Let me know if you have any ideas. Thanks. > > jeff > > debian sarge, 2.6.16 kernel > new cvs download of pyrobot (jan 15, 2007) > python2.4 (also tried 2.3, no difference) > player-1.6.4, cvs version of stage from May 2005, swig-1.3.31, gazebo-0.5.1 From knerr at cs.swarthmore.edu Wed Jan 24 13:23:28 2007 From: knerr at cs.swarthmore.edu (Jeff Knerr) Date: Wed Jan 24 13:23:28 2007 Subject: [Pyro-users] build problem (libplayerc/playerc.h) on debian In-Reply-To: <45AD14D0.7090908@brynmawr.edu> References: <20070115205940.GA28515@basil.cs.swarthmore.edu> <45AD14D0.7090908@brynmawr.edu> Message-ID: <20070124182327.GA17114@basil.cs.swarthmore.edu> >> I would only upgrade Player/Stage to 2.0.2; it has changed dramatically >> in the year after that (and Pyro hasn't been upgraded to match yet). Thanks, Doug. That worked!
jeff From bthom at cs.hmc.edu Thu Jan 25 16:44:03 2007 From: bthom at cs.hmc.edu (belinda thom) Date: Thu Jan 25 16:46:31 2007 Subject: [Pyro-users] Mac OS X install question Message-ID: Hi, I am on a Mac G5, running OS X 10.4.8. As reported elsewhere (http://emergent.brynmawr.edu/pipermail/pyro-users/2006-November/000447.html), I wasn't using an actual camera, and so only needed the PyrobotSimulator. I thus answered "no" to all the options (except the clustering one) when doing the configure. I now wish to play w/some of the other code and am wondering if any Mac/pyro users know how to avoid the following "-shared" error (appended), which arises when I try to include the Imaging option. I am using the latest developer tools and X11, although I don't need X11 b/c the version of Python I'm using comes w/native Aqua-Tk. Ideas welcome. --b 26 % sudo make (cd ./camera/device && make) g++ -O3 -Wall -Wno-unused -D_POSIX_THREADS -D_POSIX_THREAD_SAFE_FUNCTIONS -D_REENTRANT -DPOSIX -D__x86__ -D__linux__ -D__OSVERSION__=2 -frepo -DUSINGTHREADS -DLINUX -D_GNU_SOURCE -I/Library/Frameworks/Python.framework/Versions/2.4/include/python2.4 -shared -c -o Device.o Device.cpp powerpc-apple-darwin8-g++-4.0.1: unrecognized option '-shared' (cd ./vision/cvision && make) swig -I../../camera/device/ -python -c++ -o Vision.cc Vision.i cc -c -I/Library/Frameworks/Python.framework/Versions/2.4/include/python2.4 -I../../camera/device/ Vision.cc -o vision.o cc -O3 -Wall -Wno-unused -D_POSIX_THREADS -D_POSIX_THREAD_SAFE_FUNCTIONS -D_REENTRANT -DPOSIX -D__x86__ -D__linux__ -D__OSVERSION__=2 -frepo -DUSINGTHREADS -DLINUX -D_GNU_SOURCE -I/Library/Frameworks/Python.framework/Versions/2.4/include/python2.4 -I../../camera/device/ -shared vision.o Vision.o ../../camera/device/Device.o -o _vision.so -lstdc++ -ldl -lpthread powerpc-apple-darwin8-gcc-4.0.1: unrecognized option '-shared' /usr/bin/ld: multiple definitions of symbol _init_vision vision.o definition of _init_vision in section
(__TEXT,__text) Vision.o definition of _init_vision in section (__TEXT,__text) collect2: ld returned 1 exit status make[1]: *** [_vision.so] Error 1 make: *** [vision/cvision] Error 2 From chrisspen at gmail.com Fri Jan 26 12:26:42 2007 From: chrisspen at gmail.com (Chris Spencer) Date: Fri Jan 26 12:26:41 2007 Subject: [Pyro-users] Conx with Reinforcement Learning In-Reply-To: <45880C4C.5000304@brynmawr.edu> References: <45880C4C.5000304@brynmawr.edu> Message-ID: On 12/19/06, Douglas S. Blank wrote: > Chris, > > Yes, there is the beginning of some support in Conx for similar > techniques. You can at least use this as an example. Take a look at the > SigmaNetwork in pyrobot/brain/conx.py. It is an example of a type of > CRBP (complementary reinforcement backprop). It works like this: [snip] Thanks for the great example. However, is it possible for SigmaNetwork to train multiple distinct outputs? In your example, you have 11 output nodes attempting to approximate XOR. Suppose I wanted a network that would approximate a policy function, requiring that each output node represent a unique action. Am I right in thinking that SigmaNetwork would be unable to perform this task, since it requires that all the output nodes approximate a single target? Also, are you aware of any studies comparing CRBP to TD-Lambda? It would be interesting to know which performs better. Regards, Chris From dblank at brynmawr.edu Sun Jan 28 16:14:21 2007 From: dblank at brynmawr.edu (Douglas S. Blank) Date: Sun Jan 28 16:14:46 2007 Subject: [Pyro-users] Conx with Reinforcement Learning In-Reply-To: References: <45880C4C.5000304@brynmawr.edu> Message-ID: <45BD122D.1070301@brynmawr.edu> Chris Spencer wrote: > On 12/19/06, Douglas S. Blank wrote: >> Chris, >> >> Yes, there is the beginning of some support in Conx for similar >> techniques. You can at least use this as an example. Take a look at the >> SigmaNetwork in pyrobot/brain/conx.py.
It is an example of a type of >> CRBP (complementary reinforcement backprop). It works like this: > [snip] > > Thanks for the great example. However, is it possible for SigmaNetwork > to train multiple distinct outputs? In your example, you have 11 > output nodes attempting to approximate XOR. Suppose I wanted a network > that would approximate a policy function, requiring that each output > node represent a unique action. Am I right in thinking that > SigmaNetwork would be unable to perform this task, since it requires > that all the output nodes approximate a single target? As it is written, yes, you are right: the whole output layer computes a single value. But you should be able to adapt the source code of the SigmaNetwork so that each subset of output nodes computes just one value, letting the entire output layer compute as many values as you want (with overlapping subsets, too, if you wish). > Also, are you aware of any studies comparing CRBP to TD-Lambda? It > would be interesting to know which performs better. I don't know of any, but I would be suspicious of any general statement about one always performing better than the other. -Doug > Regards, > Chris From dblank at brynmawr.edu Sun Jan 28 16:20:02 2007 From: dblank at brynmawr.edu (Douglas S. Blank) Date: Sun Jan 28 16:20:26 2007 Subject: [Pyro-users] Communication between robots In-Reply-To: <45AAA715.81AA.00A9.0@wlu.edu> References: <45AAA715.81AA.00A9.0@wlu.edu> Message-ID: <45BD1382.9010504@brynmawr.edu> Alexander Khasymski wrote: > Hi, > > I was wondering if there is a way to implement simple message passing > between two robots with different brains? I'm doing a project in > language evolution and need a way for robots to pass short strings (5-6 > characters) to each other.
The only obvious way I could think of was > implementing a brain that controls both robots, in which case message > passing would boil down to modifying a few global variables. However, I > could not find out how to create such a brain, apart from the fact that it > is possible to do so. Any suggestions, comments, or examples of something > similar will be greatly appreciated. You could have one Python program control many brains, but it might just be easier to implement some simple message passing. Take a look at the other PyRO (Python Remote Objects), for example. Also, it is very easy just to open a socket and send strings, or serialized Python objects. Take a look at the "pickle" module. Depending on the speed at which you wish to send messages, you could use the simple Instant Messaging protocol that I have written for a new project called Myro. You can find the sources at http://wiki.roboteducation.org/ under Developer. Look at the code in chat.py. It is relatively slow, but allows a human to also send messages, and provides an easy debugger. The fastest, easiest method, though, is just to open direct connections to/from each robot. -Doug > Thanks, > > Aleksandr From dblank at brynmawr.edu Sun Jan 28 16:21:47 2007 From: dblank at brynmawr.edu (Douglas S. Blank) Date: Sun Jan 28 16:22:09 2007 Subject: [Pyro-users] automatic parking car In-Reply-To: <20070111141413.68563.qmail@web39706.mail.mud.yahoo.com> References: <20070111141413.68563.qmail@web39706.mail.mud.yahoo.com> Message-ID: <45BD13EB.2090708@brynmawr.edu> wester zela wrote: > Hi all, > I appreciate your help with this topic. I have been working with Pyrobot for only a few months, and I want to know whether I can simulate a car with sensors. Has anyone worked on this? How would I do it? I want to apply and research some algorithms for automatic car parking.
You could develop a car in the Pyrobot Simulator very quickly, and fairly easily write neural networks, symbolic AI, GAs, or some other method to park your simcar in Pyro. -Doug > Thanks in advance > > Wester From knerr at cs.swarthmore.edu Mon Jan 29 10:43:37 2007 From: knerr at cs.swarthmore.edu (Jeff Knerr) Date: Mon Jan 29 10:43:35 2007 Subject: [Pyro-users] run problem on debian/player-2.0.2 In-Reply-To: <45AD14D0.7090908@brynmawr.edu> References: <20070115205940.GA28515@basil.cs.swarthmore.edu> <45AD14D0.7090908@brynmawr.edu> Message-ID: <20070129154337.GA8108@bay.cs.swarthmore.edu> >> I would only upgrade Player/Stage to 2.0.2; Hey Doug/Pyro-users. I was able to build player/stage 2.0.2 and pyrobot, but now I'm having some problems running it: - when I try Server=StageSimulator and choose any of the .cfg files in pyrobot/plugins/worlds/Stage, I get "failed to parse config file" errors like this: /usr/local/pyrobot/plugins/worlds/Stage/tutorial.cfg:14 error: unknown interface: [position] error : Initialization failed for driver "stage" error : failed to parse config file /usr/local/pyrobot/plugins/worlds/Stage/tutorial.cfg I saw an earlier post from 2006-08-01 saying "Pyrobot isn't quite finished for use with Player 2.0.2". Is that still the problem here? - when I try Server=StageSimulator and choose a cfg file from /usr/local/share/stage/worlds, it loads fine, but then fails with this when I try to load Robot=Player6665.py: ImportError: /usr/local/lib/python2.4/site-packages/_playerc.so: undefined symbol: playerc_client_set_request_timeout Any ideas on this one?
The Gazebo and Pyrobot simulators seem to work well! And when I tried the Quick Start stuff from the player manual, that all worked. Thanks. jeff
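Editor's sketch, referring back to the "Time Difference Learning with Conx" thread at the top of this archive: the per-move update that belinda thom describes (after the first chapter of Mitchell's Machine Learning) might look roughly like the following for a linear evaluator. This is a minimal illustration under stated assumptions, not Conx code and not anyone's actual implementation: the names value and train_on_game, the feature vectors, and the learning rate are all hypothetical. Each board's training target is the current estimate for the learner's next board, the last board's target is the real final score, and the weights move by a least-mean-squares step.

```python
# Sketch only: a linear board evaluator trained move by move.
# All names, features, and constants here are hypothetical.

def value(weights, features):
    """Linear estimate of a board's value: dot(weights, features)."""
    return sum(w * x for w, x in zip(weights, features))


def train_on_game(weights, trace, final_score, rate=0.1):
    """Update weights in place from one finished game.

    trace holds the feature vectors of the boards the learner faced,
    in playing order. The target for each board is the current estimate
    of the learner's next board; the target for the last board is the
    true final score.
    """
    for i, features in enumerate(trace):
        if i + 1 < len(trace):
            target = value(weights, trace[i + 1])  # bootstrap from successor
        else:
            target = final_score                   # terminal: use real score
        error = target - value(weights, features)
        for j, x in enumerate(features):
            weights[j] += rate * error * x         # LMS weight update
    return weights


if __name__ == "__main__":
    # Toy game: three boards, each described by two made-up features.
    game = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
    w = [0.0, 0.0]
    for _ in range(200):
        train_on_game(w, game, final_score=1.0)
    print([round(value(w, b), 3) for b in game])
```

Compared with labelling every board with the final score, this lets early positions inherit value gradually from later ones, which is closer to the TD-style methods Matthew Studley points to. A non-linear network would replace value and the inner weight loop with a forward pass and a backprop step toward the same target.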