1. CREW
1.1. Main Application Pages (5)
1.1.1. Title
1.1.2. Participants
1.1.2.1. Students
- Catherine Chiu, Senior (class of 2004), Computer Science Major
- Ioana Butoi, Junior (class of 2005), Computer Science Major
- Darby Thompson, Junior (class of 2005), Computer Science Major
1.1.2.2. Faculty
- Professor Douglas Blank, Assistant Professor of Computer Science, Bryn Mawr College
- Professor Deepak Kumar, Associate Professor of Computer Science, Bryn Mawr College
1.1.3. General Project Description
The purpose of the project is to integrate several techniques and algorithms to make a robot give tours of our Science Building. The robot will also be capable of giving directions from its current location to any room in the building. Besides being mobile, the robot will interact with people using voice and computer vision. While such a robot may be viewed as a novelty, it will serve to generate more interest in the college's Computer Science program. It will also raise public awareness of the state of the art in robotics and will be an attempt to make people feel comfortable around robots. For us, as computer science students, the project presents several challenges.
Our Science Building has a complex layout that is confusing even to its regular inhabitants. Students unfamiliar with the building often get lost and end up arriving late for classes and meetings. Our robot will be capable of providing directions on demand.
We realize that creating a perfect tour guide robot is a huge task, since it can involve several research-level issues and problems that remain unsolved. For example, when we say that the robot is going to interact with a person using voice and computer vision, we mean that we will integrate available techniques and algorithms rather than develop new ones.
A few robots designed to give museum tours have been created by research teams. RHINO is a robot that gave tours in the "Deutsches Museum Bonn" for six days [1]. Another example is MINERVA, an interactive tour-guide robot which gave tours at the Smithsonian Museum of American History for two weeks [2].
What we are proposing is not an unsolved problem, although it has not been done in many places. It will be a challenge for us, because the people who have succeeded in this kind of project were part of a research team and not undergraduate students like us. Given our background and preparation, we feel that this project will appropriately challenge us and give us an opportunity to apply our computer science knowledge.
1.1.4. Specific Questions/Hypothesis (to be addressed)
Solving the task described above will require us to learn and integrate several techniques and algorithms dealing with mapping, localization, path planning, collision avoidance, task planning and user interaction.
In order for the robot to give tours and directions, it must first be equipped with a map or floor plan of the building. Maps come in two forms: static and dynamic. A static map does not change (e.g. a simple floor plan), whereas a dynamic map incorporates changes in the position of obstacles. Commonly used mapping techniques include occupancy grids, texture maps and traditional graphs.
Localization is the robot's ability to know its position in the building at any time. It is important for the robot to be able to estimate its position so that it knows where to move to get to the next place on the tour. Localization reconciles the robot's internal representation of the map with its perception, and relies heavily on sensor readings and motor values. Both RHINO and MINERVA used a version of Markov localization, which uses probabilities to determine the most likely location of the robot [1,2].
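To make the idea of Markov localization concrete, the following is a minimal sketch of a one-dimensional histogram filter. It is not the implementation used by RHINO or MINERVA; the corridor layout, the labels "door" and "wall", and all probability values are hypothetical and chosen only for illustration.

```python
def normalize(belief):
    """Rescale a list of weights so that it sums to 1."""
    total = sum(belief)
    return [b / total for b in belief]

def sense(belief, world, reading, p_hit=0.8, p_miss=0.2):
    """Bayes update: weight each cell by how well it explains the reading."""
    weighted = [b * (p_hit if cell == reading else p_miss)
                for b, cell in zip(belief, world)]
    return normalize(weighted)

def move(belief, step, p_exact=0.8):
    """Shift the belief by `step` cells in a circular corridor, allowing slip."""
    n = len(belief)
    p_slip = (1.0 - p_exact) / 2
    moved = [0.0] * n
    for i, b in enumerate(belief):
        moved[(i + step) % n] += b * p_exact      # moved as commanded
        moved[(i + step - 1) % n] += b * p_slip   # undershot by one cell
        moved[(i + step + 1) % n] += b * p_slip   # overshot by one cell
    return moved

# Hypothetical corridor: each cell is labelled by what the sensor would see.
world = ["door", "door", "wall", "wall", "wall"]
belief = [1.0 / len(world)] * len(world)   # start fully uncertain
belief = sense(belief, world, "door")
belief = move(belief, 1)
belief = sense(belief, world, "door")
best_cell = max(range(len(world)), key=lambda i: belief[i])
```

After sensing a door, moving one cell and sensing a door again, the belief concentrates on the second cell, which is the only position consistent with both readings and the motion.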
How will the robot get from one room to another while giving a tour? When asked to be taken to a specific room, which route should be taken? These are questions usually addressed by a path-planning algorithm. Path-planning algorithms are either static (once defined, the path will not change) or dynamic (responding to changing obstructions). They come in forms ranging from graph search trees to occupancy grids. The team who developed MINERVA used a coastal planner, which generated a path between two exhibits that minimized the chances of getting lost by staying close enough to walls that the robot was not left in open spaces where sensor data would no longer be useful [2].
As the robot moves around the building, we must ensure that it does not bump into objects or people. The robot must therefore automatically stop or turn away from an obstacle as soon as it detects one within a specified distance.
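The stop-or-turn rule described above can be sketched as a simple threshold check. This is an illustration only: the function name, the two-reading interface, the speeds and the 0.5 m threshold are assumptions, and a positive rotation is assumed to mean a left turn.

```python
def avoid(front_left, front_right, stop_distance=0.5):
    """Return (translate, rotate) from two front range readings in meters.

    All numeric values are hypothetical; positive rotate is assumed to turn left.
    """
    nearest = min(front_left, front_right)
    if nearest >= stop_distance:
        return (0.3, 0.0)      # path is clear: drive forward
    if front_left < front_right:
        return (0.0, -0.4)     # obstacle nearer on the left: stop and turn right
    return (0.0, 0.4)          # obstacle nearer on the right or centered: turn left
```

For example, `avoid(0.2, 1.0)` stops forward motion and turns right, away from the obstacle on the left.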
The robot needs to be able to coordinate the various activities that it can perform, involving either motion or interaction. To do this, a task planner takes commands or readings from the microphone, camera, path planning, obstacle avoidance, etc. and translates them into motor values or speech output. RHINO used GOLOG, a language used to specify complex actions, and GOLEX, which translates the actions from GOLOG into the most basic commands for the software running the robot.
Interacting using speech inherently involves natural language understanding. However, this in itself is a large project, so for our purposes we will restrict the interaction to a small, well-defined set of sentences. For example:
visitor: Help
robot: Would you like a tour?
visitor: No
robot: Would you like to be shown to a room?
visitor: No
robot: Would you like directions to a room?
visitor: Yes
robot: Where would you like to go?
visitor: Room 230
robot: Go down this hall, turn right at the end, the room is on your left.
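A restricted dialogue like the one above can be modeled as a small table of states, each with one yes/no question. The sketch below is one possible encoding; the state names and the terminal action labels (START_TOUR, ASK_ROOM_ESCORT, ASK_ROOM_DIRECTIONS, GOODBYE) are hypothetical.

```python
# Each state maps to (prompt, next state on "yes", next state on "no").
# Names in capitals are hypothetical terminal actions, not dialogue states.
DIALOGUE = {
    "tour": ("Would you like a tour?", "START_TOUR", "escort"),
    "escort": ("Would you like to be shown to a room?", "ASK_ROOM_ESCORT", "directions"),
    "directions": ("Would you like directions to a room?", "ASK_ROOM_DIRECTIONS", "GOODBYE"),
}

def run_dialogue(answers):
    """Walk the menu with a list of yes/no answers; return prompts and outcome."""
    state, prompts = "tour", []
    for answer in answers:
        prompt, on_yes, on_no = DIALOGUE[state]
        prompts.append(prompt)
        state = on_yes if answer.strip().lower() == "yes" else on_no
        if state not in DIALOGUE:   # reached a terminal action
            break
    return prompts, state

prompts, outcome = run_dialogue(["No", "No", "Yes"])
```

With the answers from the example exchange, the robot asks all three questions and ends in the state where it asks for a room number. Keeping the dialogue in a table means new questions can be added without changing the control logic.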
To be able to start the tour process, the robot must first learn how to recognize people. Given this ability, the robot will be able to stand at the front door and ask incoming people if they would like a tour or need to be shown to a specific room. Using a camera, the robot should also be able to recognize people whom it has seen previously that day and should respond 'hello again' or something similar. Once the robot has been adapted to do this, it will also be made to recognize an obstacle as human and will perhaps ask the person to move out of the way. The techniques used for this were employed by an hors d'oeuvres serving robot named Alfred, created by a team at Swarthmore College [5].
1.1.5. Plan of Work
We will use one of the following robots: Pioneer 2-DX or Elektro (as seen below). These robots are equipped with sonar sensors and cameras. The Pioneer 2-DX is also equipped with rear bumpers (sensors that tell it if it has touched something) and a gripper. Elektro has laser sensors, which provide more accurate information about the environment (distances to objects, object forms).
The software that we will use to control the behavior of the robots is 'Pyro', which is being developed by a team that includes both Professor Blank and Professor Kumar [3]. Pyro stands for Python Robotics. All robot behaviors and functions are programmed in the Python programming language.
The robot will be given a map of the building (in the form of an occupancy grid), which it will then adapt to its own representation of the environment. It will build on this map by collecting data while displaying a basic innate behavior to learn its environment. In real time, the robot will constantly renew the map as obstacles move, using sensor readings and images from the camera to detect changes in the occupancy grid. Thus it will use a dynamic mapping procedure.
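One common way to keep an occupancy grid up to date from noisy sensor readings is a log-odds update. The sketch below is an assumption about how such a grid could be maintained, not the method this proposal commits to; the class name and the sensor confidence `p_hit` are hypothetical.

```python
import math

def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))

class OccupancyGrid:
    """Grid of log-odds values; 0.0 means the cell is equally likely free or occupied."""

    def __init__(self, rows, cols):
        self.log_odds = [[0.0] * cols for _ in range(rows)]

    def update(self, row, col, occupied, p_hit=0.7):
        """Fold in one reading; p_hit is a hypothetical sensor confidence."""
        delta = logit(p_hit)
        self.log_odds[row][col] += delta if occupied else -delta

    def probability(self, row, col):
        """Return the current occupancy probability of a cell."""
        return 1.0 - 1.0 / (1.0 + math.exp(self.log_odds[row][col]))

grid = OccupancyGrid(3, 3)
grid.update(1, 1, occupied=True)   # a sonar reading reports an obstacle
grid.update(1, 1, occupied=True)   # a second reading agrees
```

Repeated agreeing readings push a cell's probability toward 0 or 1, while a contradicting reading pulls it back, which is what lets the map follow obstacles that move.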
Having the map of the building is not necessarily enough. The robot also needs to know where it is on the map in order to know what its next move should be. The robot will use its sensors (sonars, IR and camera) to find specific features of a place, together with its motor data (distance and rotation), to localize itself. It will do this by using probabilistic measures combined with landmark recognition. Once the robot has localized itself, it will find a way to get to the next target using path planning.
Once an occupancy grid has been created and the robot has localized itself, we will integrate a program currently being developed by the same team that created Pyro. This program is given a grid, a starting location and a finishing location, and finds a path of grid squares from start to finish. The path planning will be dynamic and will be updated in real time to incorporate new obstacles. This will also enable the robot to give directions to people as it is giving a tour.
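The following sketch illustrates the kind of grid search described above, using breadth-first search over free cells. It is not the Pyro team's program, whose algorithm is not specified here; the function name and the convention that 0 marks a free cell and 1 an occupied cell are assumptions.

```python
from collections import deque

def find_path(grid, start, goal):
    """Breadth-first search over free cells (0 = free, 1 = occupied).

    Returns a list of (row, col) cells from start to goal, or None if blocked.
    """
    rows, cols = len(grid), len(grid[0])
    came_from = {start: None}
    frontier = deque([start])
    while frontier:
        cell = frontier.popleft()
        if cell == goal:
            path = []
            while cell is not None:      # walk back to the start
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            in_bounds = 0 <= nr < rows and 0 <= nc < cols
            if in_bounds and grid[nr][nc] == 0 and nxt not in came_from:
                came_from[nxt] = cell
                frontier.append(nxt)
    return None

# Hypothetical 3x3 floor plan with a wall across most of the middle row.
floor = [[0, 0, 0],
         [1, 1, 0],
         [0, 0, 0]]
route = find_path(floor, (0, 0), (2, 0))
```

Rerunning the search whenever the grid changes is the simplest way to make the planning dynamic; on a large map, an incremental planner would avoid recomputing the whole path.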
The robot will be trained to avoid obstacles using a Neural Network. Once taught, it will be equipped to navigate through the environment without harming others or itself. This Neural Network will be especially important since during certain times of the day there are large crowds in the hallways. Accompanying the automatic motor values associated with avoidance (stopping, turning out of the way), if the robot recognizes the obstacle as a person, it will interact with the human ('Excuse me!'). We have already written and tested obstacle avoidance using neural nets on several robots as part of a term project in the Developmental Robotics course and also in the subsequent summer research program in 2003.
Task planning will be built into the 'brain' with a series of if/else statements. In the brain, the inputs and commands from various other parts of the program (e.g. path planning, voice commands) will be translated into changes in the map, producing a current location, motor values and predicted sensor readings.
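A series of if/else statements of this kind might look like the sketch below, where safety is checked first and navigation last. The dictionary keys, action names and priority order are hypothetical and do not reflect the Pyro brain interface.

```python
def choose_action(state):
    """Pick one action from a dict of hypothetical inputs, highest priority first."""
    if state.get("obstacle_close"):
        if state.get("obstacle_is_person"):
            return ("speak", "Excuse me!")
        return ("turn_away", None)
    elif state.get("voice_command") == "help":
        return ("speak", "Would you like a tour?")
    elif state.get("next_waypoint") is not None:
        return ("drive_to", state["next_waypoint"])
    else:
        return ("wait", None)

action = choose_action({"obstacle_close": True, "obstacle_is_person": True})
```

Ordering the branches by priority means that an obstacle always overrides a pending voice command or waypoint.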
Camera images will be processed for landmark detection, which supports the robot's localization, and for the recognition of people. Using color histograms, shape recognition and movement, the robot will be able to detect whether it is looking at a human and will also be able to recognize a previously seen person (identifying them using a mixture of clothing color and voice recognition). We will achieve this by integrating and building upon the program developed by Professor Blank and fellow researchers [6]. The robot will also be able to use a microphone to pick up sound from the surrounding area. By integrating voice recognition software with the robot, we will be able to filter out background noise to an extent, take voice commands and respond. We are proposing to use the Sphinx speech recognition software from CMU [7]. For speech generation, we plan to use the Festival Speech Synthesis System from Edinburgh [8].
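As an illustration of recognizing a person by clothing color, the sketch below compares normalized color histograms using histogram intersection. It is not the program from [6]; the function names, the bin count and the pixel format (a non-empty list of (r, g, b) tuples with values from 0 to 255) are assumptions.

```python
def color_histogram(pixels, bins=4):
    """Normalized histogram over quantized (r, g, b) values in 0..255.

    `pixels` is assumed to be a non-empty list of (r, g, b) tuples.
    """
    step = 256 // bins
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        # Clamp so that bin counts that do not divide 256 stay in range.
        qr, qg, qb = (min(v // step, bins - 1) for v in (r, g, b))
        hist[qr * bins * bins + qg * bins + qb] += 1.0
    total = float(len(pixels))
    return [h / total for h in hist]

def intersection(hist_a, hist_b):
    """Histogram intersection: 1.0 for identical distributions, 0.0 for disjoint."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))

# Hypothetical clothing patches: a red shirt seen twice and a blue one.
red_shirt = [(250, 10, 10)] * 4
blue_shirt = [(10, 10, 250)] * 4
same = intersection(color_histogram(red_shirt), color_histogram(red_shirt))
different = intersection(color_histogram(red_shirt), color_histogram(blue_shirt))
```

A score near 1.0 suggests the same clothing as before, and a threshold on the score would decide whether to say 'hello again'. Lighting changes shift the colors, which is one reason to combine this cue with others.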
1.1.6. Expected Outcomes
So far, based on our coursework in AI, Robotics, Summer Research and Developmental Robotics, we have acquired substantial hands-on experience working with the Pioneer and Elektro robots (among others). We are well accustomed to programming all kinds of robot behaviors and learning experiments using the Pyro software. This summer, we will be doing research work on mapping algorithms that use occupancy grids. In the work we are proposing, we will be learning about path planning using occupancy grids, voice recognition and generation, and integration of all the working components through Pyro.
We plan to release all of our programs as open-source software. We will also produce a DVD movie that will record our work on the project as it progresses through the year, resulting ultimately in a series of clips that demonstrate all the behaviors of the tour guide robot. We plan to write about our results in senior theses and will also send articles for publicity (campus and national magazines and newspapers) and for publication in student research conferences (NCUR, SIGCSE, AI Magazine, etc.).
1.1.7. References
1. W. Burgard, A.B. Cremers, D. Fox, G. Lakemeyer, D. Hähnel, D. Schulz, W. Steiner, and S. Thrun. The museum tour-guide robot RHINO. In Proceedings of the 14. Fachgespräch Autonome Mobile Systeme (AMS '98). Springer Verlag, 1998.
2. S. Thrun, M. Bennewitz, W. Burgard, A.B. Cremers, F. Dellaert, D. Fox, D. Hähnel, C. Rosenberg, N. Roy, J. Schulte, and D. Schulz. MINERVA: A second generation mobile tour-guide robot. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 1999.
3. D. Blank, L. Meeden, and D. Kumar. Python Robotics: An Environment for Exploring Robotics Beyond LEGOs. In Proceedings of the Thirty-Fourth SIGCSE Technical Symposium on Computer Science Education, Reno, Nevada, ACM Press, February 2003.
4. J. Howell and B. Randall Donald. Practical Mobile Robot Self-Localization. In Proceedings of the 2000 International Conference on Robotics and Automation, San Francisco, CA, April 24-28, 2000.
5. L. Meeden, B. Maxwell, N. Saka Addo, L. Brown, P. Dickson, J. Ng, S. Olshfski, E. Silk, and J. Wales. Alfred: The Robot Waiter Who Remembers You. Autonomous Robots, 2001.
6. D. Blank, G. Beavers, W. Arensman, C. Caloianu, T. Fujiwara, S. McCaul, and C. Shaw. A Robot Team that Can Search, Rescue, and Serve Cookies: Experiments in Multi-modal Person Identification and Multi-robot Sound Localization. In Proceedings of the 2001 Twelfth Annual Midwest Artificial Intelligence and Cognitive Science Society Meetings, 2001.
7. The software to perform speech recognition is available from CMU, Sphinx. http://www.speech.cs.cmu.edu/sphinx/
8. The software to convert text to speech is available from the University of Edinburgh, Festival Speech Synthesis System. http://www.cstr.ed.ac.uk/projects/festival/
1.1.8. Student Activity and Responsibility
The students will have to do the research necessary to complete the project. We will work on integrating the different parts of the project and on testing the results to make sure that everything works as it is supposed to. We will have to program in Python and use Pyro (which is Python based).
1.1.9. Faculty Activity and Responsibility
Our faculty sponsors will continue the development of Pyro and make sure that the features needed for the project will be part of the program. They will provide the lab and the robots necessary for the project. They will also give assistance on how the robots work.
1.1.10. Budget (and justification if requests are made beyond student stipends)
The following items will need to be purchased: Speakers and microphone, DVD disks, blank CDs, digital video tape,
2. Information Pages
Faculty pages must include the following:
- name
- school/department address
- email address
- relevant background in this area (1-2 paragraphs or URL)
