1. The Importance of Starting Small

This is a presentation of Jeff Elman's "The Importance of Starting Small" (and related papers). See http://crl.ucsd.edu/~elman/ for original papers.

For very different results on similar experiments, see http://www.cnbc.cmu.edu/~plaut/papers/pdf/RohdePlaut97CogSci.startingSmall.pdf

1.1. Questions

http://bubo.brynmawr.edu/~dblank/startSmall/questions.gif

Puzzling set of questions

In addition, if the brain evolved to deal with perceptual data, why is language so symbolic? Why does language appear to be so rule-driven when it appears to be implemented in a non-rule-oriented manner?

1.2. Ways to be innate

  1. Representational innateness

  2. Architectural innateness

  3. Chronotopic innateness

http://bubo.brynmawr.edu/~dblank/startSmall/innate.gif

1.3. Properties of Language

the cat who the dogs chase runs toward me

http://bubo.brynmawr.edu/~dblank/startSmall/phrase.gif

Gold (1967) proved that, for formal languages of this kind, a learner cannot identify the grammar from positive examples alone. It also needs negative evidence, such as being told that the sentence "Bunnies is cuddly" is not grammatical.

But children do learn language without many negative examples.

Therefore (WARNING: leap approaching!), children must have innate knowledge about the form that language will take, and the rest is just fine-tuning.

  1. It could be the case that this proof has nothing to do with natural language learning

  2. It could be the case that negative evidence is given

Could such a language be learned by a simple artificial system?

1.4. The Experiments

http://bubo.brynmawr.edu/~dblank/startSmall/grammar-1.gif

http://bubo.brynmawr.edu/~dblank/startSmall/table1.gif

http://bubo.brynmawr.edu/~dblank/startSmall/network.gif

1.5. Results

Experiment #1: The network was trained on a corpus of sentences, and it failed miserably. It learned something, but performed very poorly overall. Maybe Gold was right?

Experiment #2: The network was trained according to a strict schedule, 5 epochs each:

Phase   Simple sentences   Complex sentences
1       10,000             0
2       7,500              2,500
3       5,000              5,000
4       2,500              7,500
5       0                  10,000
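
The schedule above can be sketched as a small generator that builds each phase's training mix. This is only an illustration: the pools `simple_pool` and `complex_pool`, the sampling with replacement, and the function names are assumptions made here, not details from the original papers.

```python
import random

# (simple, complex) sentence counts per phase, taken from the schedule above.
PHASES = [
    (10_000, 0),
    (7_500, 2_500),
    (5_000, 5_000),
    (2_500, 7_500),
    (0, 10_000),
]
EPOCHS_PER_PHASE = 5


def build_phase_corpus(simple_pool, complex_pool, n_simple, n_complex, rng):
    """Sample (with replacement) a shuffled mix of simple and complex sentences."""
    corpus = [rng.choice(simple_pool) for _ in range(n_simple)]
    corpus += [rng.choice(complex_pool) for _ in range(n_complex)]
    rng.shuffle(corpus)
    return corpus


def training_schedule(simple_pool, complex_pool, seed=0):
    """Yield (phase, epoch, corpus) tuples in training order."""
    rng = random.Random(seed)
    for phase, (n_simple, n_complex) in enumerate(PHASES, start=1):
        corpus = build_phase_corpus(
            simple_pool, complex_pool, n_simple, n_complex, rng
        )
        # The same phase corpus is presented for every epoch of the phase.
        for epoch in range(1, EPOCHS_PER_PHASE + 1):
            yield phase, epoch, corpus
```

A training loop would iterate over `training_schedule(...)` and present each corpus to the network once per yielded epoch.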

This worked! However, it was very ad hoc and probably took some fiddling by a graduate student to get it to work correctly. In addition, the environment was manipulated in a manner very much unlike the way that children learn language; they are exposed to it in all its complexity from early on. Could it be learned without such a strict manipulative schedule?

Experiment #3: The context bank of the network was wiped out every 2 or 3 words by overwriting it with random patterns. The interval between wipe-outs was slowly increased, until no wipe-outs were made.

This worked!
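
The wipe-out idea in Experiment #3 can be sketched in plain Python. Everything specific here is an illustrative assumption rather than the paper's actual network or settings: the toy recurrent step (hidden layer only), the `wipe_interval` parameter, the starting context of 0.5, and the use of uniform random values for the wiped context.

```python
import math
import random


def srn_step(x, context, w_in, w_ctx):
    """One forward step of a toy simple recurrent network (hidden layer only)."""
    hidden = []
    for j in range(len(w_in)):
        total = sum(w * xi for w, xi in zip(w_in[j], x))
        total += sum(w * ci for w, ci in zip(w_ctx[j], context))
        hidden.append(1.0 / (1.0 + math.exp(-total)))  # logistic activation
    return hidden


def run_with_wipeouts(inputs, w_in, w_ctx, wipe_interval, rng):
    """Process a word sequence, overwriting the context every few words.

    wipe_interval: (low, high) range for the number of words between
    wipe-outs, or None to disable wipe-outs (the final stage of training).
    Returns the hidden states and the positions where the context was wiped.
    """
    n_hidden = len(w_in)
    context = [0.5] * n_hidden  # assumed neutral starting context
    next_wipe = rng.randint(*wipe_interval) if wipe_interval else None
    states, wiped_at = [], []
    for t, x in enumerate(inputs):
        if next_wipe is not None and t == next_wipe:
            # Replace the memory of earlier words with a random pattern.
            context = [rng.random() for _ in range(n_hidden)]
            wiped_at.append(t)
            next_wipe = t + rng.randint(*wipe_interval)
        # The new hidden state becomes the context for the next word.
        context = srn_step(x, context, w_in, w_ctx)
        states.append(context)
    return states, wiped_at
```

Starting with `wipe_interval=(2, 3)`, widening the range over training, and finally passing `None` corresponds to slowly lengthening the network's effective memory window.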

http://bubo.brynmawr.edu/~dblank/startSmall/fig2.gif

http://bubo.brynmawr.edu/~dblank/startSmall/fig3.gif

http://bubo.brynmawr.edu/~dblank/startSmall/fig4.gif

http://bubo.brynmawr.edu/~dblank/startSmall/fig5.gif

http://bubo.brynmawr.edu/~dblank/startSmall/hiddenspace.gif

http://bubo.brynmawr.edu/~dblank/startSmall/skulls.gif