1. The Importance of Starting Small
This is a presentation of Jeff Elman's "The Importance of Starting Small" (and related papers). See http://crl.ucsd.edu/~elman/ for original papers.
For very different results on similar experiments, see http://www.cnbc.cmu.edu/~plaut/papers/pdf/RohdePlaut97CogSci.startingSmall.pdf
1.1. Questions
Puzzling set of questions
In addition, if the brain evolved to deal with perceptual data, why is language so symbolic? Why does language appear to be so rule-driven when it appears to be implemented in a non-rule-oriented manner?
1.2. Ways to be innate
- Representational innateness
- Architectural innateness
- Chronotopic innateness
1.3. Properties of Language
the cat who the dogs chase runs toward me
Gold (1967) proved that, for formal languages, it is impossible to learn such sentences without negative evidence, that is, without being told things like: the sentence "Bunnies is cuddly" is not grammatical.
But children do learn language without many negative examples.
Therefore (WARNING: leap approaching!), children must have innate knowledge about the form that language will take, and the rest is just fine-tuning.
- It could be the case that this proof has nothing to do with natural language learning
- It could be the case that negative evidence is given
Could such a language be learned by a simple artificial system?
1.4. The Experiments
1.5. Results
Experiment #1: The network was trained on a corpus of sentences, and it failed miserably. It learned some, but overall very poorly. Maybe Gold was right?
Experiment #2: The network was trained according to a strict schedule, 5 epochs each:
| Phase 1 | 10,000 simple sentences | 0 complex sentences |
| Phase 2 | 7,500 simple sentences | 2,500 complex sentences |
| Phase 3 | 5,000 simple sentences | 5,000 complex sentences |
| Phase 4 | 2,500 simple sentences | 7,500 complex sentences |
| Phase 5 | 0 simple sentences | 10,000 complex sentences |
This worked! However, it was very ad hoc and probably took some fiddling by a graduate student to get it to work correctly. In addition, the environment was manipulated in a manner very much unlike the way that children learn language; they are exposed to it in all its complexity from early on. Could it be learned without such a strict manipulative schedule?
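The five-phase schedule from Experiment #2 can be sketched in code. This is only an illustration of the table above, not Elman's actual training script: the function names, the sampling with replacement, and the placeholder sentence pools are all assumptions made here, and the network itself is left out.

```python
import random

# (simple, complex) sentence counts per phase, copied from the table above.
PHASES = [(10_000, 0), (7_500, 2_500), (5_000, 5_000), (2_500, 7_500), (0, 10_000)]
EPOCHS_PER_PHASE = 5  # "5 epochs each"


def build_phase_corpus(simple_pool, complex_pool, n_simple, n_complex, rng):
    """Sample a shuffled corpus with the requested simple/complex mix.

    Sampling with replacement from fixed pools is an assumption of this sketch.
    """
    corpus = [rng.choice(simple_pool) for _ in range(n_simple)]
    corpus += [rng.choice(complex_pool) for _ in range(n_complex)]
    rng.shuffle(corpus)
    return corpus


def incremental_schedule(simple_pool, complex_pool, seed=0):
    """Yield (phase, epoch, corpus) in the order the network would see them."""
    rng = random.Random(seed)
    for phase, (n_simple, n_complex) in enumerate(PHASES, start=1):
        corpus = build_phase_corpus(simple_pool, complex_pool, n_simple, n_complex, rng)
        for epoch in range(1, EPOCHS_PER_PHASE + 1):
            yield phase, epoch, corpus
```

A training loop would iterate over `incremental_schedule(...)` and present each corpus to the network once per yielded epoch. How much hand-tuning is baked into `PHASES` is exactly the "ad hoc" concern raised above.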
Experiment #3: Every 2 or 3 words, the context bank of the network was wiped out by overwriting it with random patterns. The interval between wipe-outs was slowly increased, until no wipe-outs were made at all.
This worked!
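The wipe-out procedure from Experiment #3 might look roughly like the sketch below. Everything here is hypothetical scaffolding: `step` stands in for one forward pass of the network (word and old context in, new context out), the initial context value of 0.5 is an assumption, and the `WINDOWS` list after the first entry is an invented example of "slowly increasing" the interval, since the notes only give the starting value of 2 or 3 words.

```python
import random

# Gap (in words) between wipe-outs for each training stage.
# Only the first entry comes from the notes; the rest are assumed for illustration.
# None means "never wipe the context".
WINDOWS = [(2, 3), (3, 4), (4, 5), (5, 6), None]


def reset_points(n_words, window, rng):
    """Return the word indices at which the context bank is wiped."""
    points = set()
    if window is None:
        return points
    lo, hi = window
    i = rng.randint(lo, hi)
    while i < n_words:
        points.add(i)
        i += rng.randint(lo, hi)
    return points


def run_with_wipeouts(words, step, context_size, window, rng):
    """Feed words through step(word, context) -> new_context.

    Before processing a word at a reset point, the context is overwritten
    with a random pattern, so the network loses its memory of earlier words.
    """
    context = [0.5] * context_size
    resets = reset_points(len(words), window, rng)
    for i, word in enumerate(words):
        if i in resets:
            context = [rng.random() for _ in range(context_size)]
        context = step(word, context)
    return context, resets
```

Training would then loop over `WINDOWS` in order, calling `run_with_wipeouts` on the full, unfiltered corpus at each stage. The point of the design is that the input stays complex throughout; only the network's effective memory span starts small and grows.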
