Differences
This shows you the differences between two versions of the page.
aslin_newport [2015/11/22 23:16] – silvia | aslin_newport [2016/02/13 11:03] (current) – silvia | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Statistical Learning: From Acquiring | + | ====== Statistical Learning: From Acquiring Specific Items to Forming General Rules ====== |
- | Specific Items to Forming General Rules ====== | + | |
---- | ---- | ||
Line 28: | Line 27: | ||
listening to) the input. | listening to) the input. | ||
\\ | \\ | ||
+ | Saffran et al. (1996) suggested the term statistical learning to | ||
+ | refer to the process by which learners acquire information about | ||
+ | distributions of elements in the input. In their continuous syllable streams, the probability that one syllable followed another | ||
+ | within a word (the transitional probability) was 1.0, whereas | ||
+ | the transitional probability of syllable pairs at word boundaries | ||
+ | was 0.33. | ||
+ | \\ | ||
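The two transitional-probability values quoted above can be reproduced with a small sketch. Everything named here is an assumption for illustration: the four three-syllable "words" are placeholder syllables, and `WORD_ORDER` is a hand-built sequence in which each word is followed by each of the other three equally often and never by itself.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """Estimate P(next | current) for every adjacent syllable pair."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # The final syllable has no successor, so it is left out of the denominators.
    first_counts = Counter(syllables[:-1])
    return {(x, y): n / first_counts[x] for (x, y), n in pair_counts.items()}

# Hypothetical vocabulary: four words of three syllables each (placeholders).
WORDS = {
    "A": ["tu", "pi", "ro"],
    "B": ["go", "la", "bu"],
    "C": ["bi", "da", "ku"],
    "D": ["pa", "do", "ti"],
}

# Hypothetical word order: every ordered pair of distinct words occurs exactly once.
WORD_ORDER = "ABACADBCBDCDA"

stream = [syllable for label in WORD_ORDER for syllable in WORDS[label]]
tps = transitional_probabilities(stream)

# Pairs inside a word versus pairs that straddle a word boundary.
within_word = {tps[(w[i], w[i + 1])] for w in WORDS.values() for i in range(2)}
word_final = {w[-1] for w in WORDS.values()}
across_boundary = {p for (x, _), p in tps.items() if x in word_final}

print("within-word TPs:", within_word)
print("boundary TPs:", across_boundary)
```

In this toy stream a dip in transitional probability marks every word boundary, which is the cue the quoted passage describes; real input would of course be noisier.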
+ | Thus, statistical learning is a powerful and domain-general | ||
+ | mechanism available early in development to infants | ||
+ | who are naïve (i.e., uninstructed) about how to negotiate a | ||
+ | complex learning task. | ||
+ | These results show that a statistical-learning mechanism | ||
+ | enables learners to extract one or more statistics and use this | ||
+ | information to make an implicit decision about the stimulus materials | ||
+ | that were present in the input. This ability is important for | ||
+ | learning which syllables form words, for estimating the number | ||
+ | of peaks in a distribution of speech sounds, and for discovering | ||
+ | which visual features form the parts of a scene. But this does not | ||
+ | address the question of how learners form rules—abstractions | ||
+ | about patterns that could be generalized to elements that have | ||
+ | never been seen or heard. How do learners who are exposed to a | ||
+ | subset of the possible patterns in their input go beyond this to | ||
+ | infer a set of general principles or “rules of the game”? | ||
+ | \\ | ||
+ | Several studies have documented that infants can make the | ||
+ | inductive leap from observed stimuli to novel stimuli that follow | ||
+ | the same rules. | ||
+ | \\ | ||
+ | **Some researchers have claimed that statistical learning and | ||
+ | rule learning are two separate mechanisms, because statistical | ||
+ | learning involves learning about elements that have been presented | ||
+ | during exposure, whereas rule learning can be applied | ||
+ | to novel elements and novel combinations** (see Endress & | ||
+ | Bonatti, 2007; Marcus, 2000). **But why do learners sometimes | ||
+ | keep track of the specific elements in the input they are | ||
+ | exposed to and at other times learn a rule that extends beyond | ||
+ | the specifics of the input? An alternate hypothesis is that these | ||
+ | two processes are in fact not distinct, but rather are different | ||
+ | outcomes of the same learning mechanism.** | ||
+ | \\ | ||
+ | \\ | ||
+ | // | ||
+ | \\ | ||
+ | \\ | ||
+ | For example, some stimulus dimensions are naturally more | ||
+ | salient than others. **If stimuli are encoded in terms of their | ||
+ | salient dimensions rather than their specific details, then learners | ||
+ | will appear to generalize a rule by applying it to all stimuli | ||
+ | that exhibit the same pattern on these salient dimensions.** | ||
+ | \\ | ||
+ | \\ | ||
+ | //MyNote//: what triggers encoding in terms of the salient dimensions that apply to all stimuli? | ||
+ | \\ | ||
+ | \\ | ||
+ | Although perceptual cues can serve as powerful constraints on | ||
+ | statistical learning, perceptual salience is not how most rules | ||
+ | are defined in the natural environment. | ||
+ | \\ | ||
+ | **Learners acquire rules when patterns in the input indicate | ||
+ | that several elements occur interchangeably in the same contexts, but acquire specific instances when the patterns | ||
+ | apply only to the individual elements.** | ||
+ | \\ | ||
+ | \\ | ||
+ | //MyNote//: **CRUCIAL POINT**: what features of the input indicate that elements occur interchangeably? | ||
+ | \\ | ||
+ | \\ | ||
+ | For example, Xu and | ||
+ | Tenenbaum (2007) have shown that if children hear the word | ||
+ | “glim” applied to three different dogs, they will infer that | ||
+ | “glim” means dog. In contrast, if “glim” is used three times to | ||
+ | refer to the same dog, children interpret it as the dog’s name. | ||
+ | The same contrast between learning items and learning rules | ||
+ | can occur for syllable and word sequences. | ||
+ | \\ | ||
+ | Gerken (2006) has made this argument by reconsidering | ||
+ | and modifying the design of the Marcus et al. (1999) rule-learning | ||
+ | experiment (see Fig. 3). Marcus et al. presented 16 | ||
+ | different AAB strings in the learning phase of their experiment. | ||
+ | Notice in Figure 3 that four strings ended in di, four | ||
+ | ended in je, four ended in li, and four ended in we. Thus, | ||
+ | infants could have learned the general AAB rule, or they could | ||
+ | have learned a more specific pattern: that every string ended in | ||
+ | di, je, li, or we. The more consistent or reliable cue was the | ||
+ | repetition of the first two syllables—the AAB rule—because it | ||
+ | applied to every string, whereas the “ends in di (or je, or li, or | ||
+ | we)” rule applied to only one-fourth of the strings. | ||
+ | Gerken (2006) asked whether infants presented with a subset | ||
+ | of the 16 strings from the Marcus et al. (1999) study would | ||
+ | favor the “repetition of the first two syllables” rule or the | ||
+ | “ends in di, je, li, or we” rule. Infants who heard only four | ||
+ | AAB strings that ended in the same syllable (e.g., di in the | ||
+ | leftmost column of Fig. 3) were tested on two equally plausible | ||
+ | rules: (1) all strings involve an AAB repetition, and (2) all | ||
+ | strings end in di. These infants failed to generalize the first | ||
+ | rule to a novel string that retained the AAB pattern but did not | ||
+ | end in di. In contrast, infants who heard only four AAB strings | ||
+ | lying along the diagonal in Figure 3 replicated the Marcus | ||
+ | et al. result. Because these strings shared an AAB pattern but | ||
+ | ended in four different syllables, only the AAB rule was | ||
+ | reliable. | ||
+ | \\ | ||
+ | \\ | ||
+ | //MyNote//: **QUESTION**: | ||
+ | \\ | ||
+ | Consider the set of 4 strings: //leledi, wiwije, jijili, dedewe// | ||
+ | \\ | ||
+ | The following rules are equally reliable for all strings: | ||
+ | \\ | ||
+ | 1. AAB | ||
+ | \\ | ||
+ | 2. starts with 2x //le, wi, ji or de// | ||
+ | \\ | ||
+ | 3. ends in //di, je, li, we// | ||
+ | \\ | ||
+ | Why do learners sometimes stick to the narrow generalizations [2,3] and sometimes make a wider generalization (category-based) [1]? | ||
+ | \\ | ||
+ | \\ | ||
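The contrast between the two exposure sets in Gerken (2006) can be made concrete with a rough sketch. It assumes two-letter syllables and the syllable inventory visible in the strings quoted above (A syllables le, wi, ji, de; B syllables di, je, li, we); the function names and the two summary measures are hypothetical and are not taken from either study.

```python
from collections import Counter

def syllables(string, size=2):
    """Split a string such as 'leledi' into fixed-width syllables."""
    return [string[i:i + size] for i in range(0, len(string), size)]

def is_aab(string):
    """True if the first two syllables match and the third differs."""
    first, second, third = syllables(string)
    return first == second and first != third

def rule_support(strings):
    """Fraction of strings covered by the AAB rule and by the best single 'ends in X' rule."""
    final_counts = Counter(syllables(s)[-1] for s in strings)
    return {
        "aab_fraction": sum(is_aab(s) for s in strings) / len(strings),
        "best_single_ending_fraction": max(final_counts.values()) / len(strings),
    }

# Assumed syllable inventory, inferred from the strings quoted in this note.
A_SYLLABLES = ["le", "wi", "ji", "de"]
B_SYLLABLES = ["di", "je", "li", "we"]

# "Column" condition: four AAB strings that all end in di.
column = [a + a + "di" for a in A_SYLLABLES]
# "Diagonal" condition: four AAB strings with four different endings.
diagonal = [a + a + b for a, b in zip(A_SYLLABLES, B_SYLLABLES)]

print("column:  ", rule_support(column))
print("diagonal:", rule_support(diagonal))
```

Under these assumptions the column set supports both candidate rules for every string, whereas in the diagonal set any single "ends in X" rule covers only a quarter of the strings, so only the AAB pattern is reliable there.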
+ | In recent work, we (Reeder, Newport, & Aslin, 2009, 2010) | ||
+ | demonstrated a similar phenomenon—and described some of | ||
+ | the principles for its operation—in the learning of an artificial-language | ||
+ | grammar. In our experiments, | ||
+ | presented with sentences made up of nonsense words that | ||
+ | came from three different grammatical categories (A, X, and | ||
+ | B), much like subjects, verbs, and direct objects in sentences | ||
+ | such as “Bill ate lunch.” Depending on the experiment, the | ||
+ | input included sentences in which **all of the words within a | ||
+ | particular category occurred in the same contexts** (e.g., words | ||
+ | X1, X2, and X3 all occurred after any of the A words and before | ||
+ | any of the B words), or **the input included only sentences in | ||
+ | which the X words occurred in a limited number of overlapping | ||
+ | A-word or B-word contexts**. | ||
+ | Adult learners are surprisingly sensitive to these differences. | ||
+ | Our results showed that **// | ||
+ | depended on the precise degree of overlap among word | ||
+ | contexts that they heard in the input, and also on the consistency | ||
+ | with which a particular A or B word was missing from | ||
+ | possible X-word contexts// | ||
+ | \\ | ||
+ | \\ | ||
+ | **Adults generalize rules when the | ||
+ | shared contexts are largely the same, with only an occasional | ||
+ | absence of overlap (i.e., a “gap”). However, when the gaps are | ||
+ | persistent, adults judge them to be legitimate exceptions to the | ||
+ | rule and no longer generalize to these contexts.** | ||
+ | \\ | ||
+ | \\ | ||
+ | //MyNote//: this is a broad description of the observed results, but no explanation as to why this is the case, and no precision in describing: " | ||
+ | \\ | ||
+ | \\ | ||
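One possible way to make "degree of overlap among word contexts" concrete is sketched below. This is purely an illustrative assumption, not the measure used by Reeder et al.: each X word is represented by the set of (A-word, B-word) frames it was heard in, and overlap is scored as Jaccard similarity between those sets. The word labels and both toy inputs are hypothetical.

```python
from itertools import combinations

def context_overlap(contexts_a, contexts_b):
    """Jaccard similarity between two sets of (A-word, B-word) contexts."""
    union = contexts_a | contexts_b
    return len(contexts_a & contexts_b) / len(union) if union else 0.0

def mean_pairwise_overlap(contexts_by_word):
    """Average overlap across all pairs of X words."""
    pairs = list(combinations(contexts_by_word.values(), 2))
    return sum(context_overlap(a, b) for a, b in pairs) / len(pairs)

A_WORDS = ["A1", "A2", "A3"]
B_WORDS = ["B1", "B2", "B3"]
ALL_CONTEXTS = {(a, b) for a in A_WORDS for b in B_WORDS}

# Dense input: every X word occurs in nearly every frame, with one occasional gap.
dense = {x: set(ALL_CONTEXTS) for x in ("X1", "X2", "X3")}
dense["X3"].discard(("A3", "B3"))

# Sparse input: each X word occurs in only a few, partially shared frames.
sparse = {
    "X1": {("A1", "B1"), ("A2", "B2")},
    "X2": {("A2", "B2"), ("A3", "B3")},
    "X3": {("A3", "B3"), ("A1", "B1")},
}

print("dense overlap: ", mean_pairwise_overlap(dense))
print("sparse overlap:", mean_pairwise_overlap(sparse))
```

A score like this captures only how much the X words share contexts. It says nothing about how consistently a particular gap recurs, which the quoted passage treats as a separate factor.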
+ | Thus, similar | ||
+ | to the results of Gerken (2006), our findings showed that it | ||
+ | was the consistency of context cues that led learners to generalize | ||
+ | rules to novel strings, and it was the inconsistency of | ||
+ | context cues that kept learners from generalizing and led them | ||
+ | to treat some strings as exceptions. | ||
+ | The key point here is that in terms of the reliability of context | ||
+ | cues, statistical learning and rule learning are not different | ||
+ | mechanisms (see Orban, Fiser, Aslin, & Lengyel, 2008). When | ||
+ | there are strong perceptual cues, such as the repetition of elements | ||
+ | in an AAB sequence, a statistical-learning mechanism | ||
+ | can compute the regularities of the repetitions (i.e., they are | ||
+ | either present or absent) or of the elements themselves (e.g., | ||
+ | the particular syllables). And, as hypothesized by Gerken | ||
+ | (2006) and Reeder et al. (2009, 2010), even when there are no | ||
+ | perceptual cues, the consistency of how the context cues are | ||
+ | distributed across strings of input determines whether a rule is | ||
+ | formed—enabling generalization to novel strings—or whether | ||
+ | specific instances are learned. According to this hypothesis, | ||
+ | statistical learning is a single mechanism whose outcome | ||
+ | applies either to elements that have been experienced or to | ||
+ | generalization beyond experienced elements, depending on | ||
+ | the manner and consistency with which elements are patterned | ||
+ | in the learner’s input. Importantly, | ||
+ | accomplished without instruction, | ||
+ | structured input. | ||
+ | ---- | ||
+ | **Conclusion: | ||
+ | \\ | ||
+ | Perceptual salience and the patterning of context cues are not | ||
+ | the only factors that can influence what learners acquire via a | ||
+ | statistical-learning mechanism. An extensive literature in linguistics | ||
+ | has argued that languages of the world display a small | ||
+ | number of universal patterns—or a few highly common patterns, | ||
+ | out of many that are possible—and has suggested that | ||
+ | language learners will fail to acquire languages that do not | ||
+ | exhibit these regularities (Chomsky, 1965, 1995). | ||
+ | Recently, a number of studies using | ||
+ | artificial grammars have indeed shown that both children and | ||
+ | adults will more readily acquire languages that observe the | ||
+ | universal or more typologically common patterns found in | ||
+ | natural languages. | ||
+ | For example, Hudson Kam and Newport (2005, 2009) and | ||
+ | Austin and Newport (2011) presented adults and children with | ||
+ | miniature languages containing inconsistent, | ||
+ | occurring forms (e.g., nouns were followed by the nonsense | ||
+ | word ka 67% of the time and by the nonsense word po the | ||
+ | remaining 33% of the time). This type of probabilistic variation | ||
+ | is not characteristic of natural languages, but it does occur | ||
+ | in the speech of nonnative speakers who make grammatical | ||
+ | errors. Adult learners in these experiments matched the probabilistic | ||
+ | variation they had heard in their input when they produced | ||
+ | sentences using the miniature language, but young | ||
+ | children formed a regular rule, producing ka virtually all of the | ||
+ | time, thereby restoring to the language the type of regularity | ||
+ | that is more characteristic of natural languages. | ||
+ | \\ | ||
+ | \\ | ||
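The adult and child outcomes described above can be caricatured as two idealized output policies. This is a minimal sketch under stated assumptions: the 67/33 split comes from the example in the text, while the two function names and the all-or-none regularizer are hypothetical simplifications (the text says children produced ka "virtually" all of the time, not always).

```python
from collections import Counter

def matching_learner(input_forms):
    """Reproduce each form at the rate it occurred in the input (probability matching)."""
    counts = Counter(input_forms)
    total = sum(counts.values())
    return {form: n / total for form, n in counts.items()}

def regularizing_learner(input_forms):
    """Idealized regularizer: always produce the most frequent input form."""
    counts = Counter(input_forms)
    majority_form, _ = counts.most_common(1)[0]
    return {form: (1.0 if form == majority_form else 0.0) for form in counts}

# Input in which nouns are followed by "ka" 67% of the time and "po" 33% of the time.
input_forms = ["ka"] * 67 + ["po"] * 33

print("matching:    ", matching_learner(input_forms))
print("regularizing:", regularizing_learner(input_forms))
```

The point of the contrast is that the regularizing output is not a veridical copy of the input statistics, even though both policies start from the same counts.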
+ | It is not always clear why learners acquire certain types of | ||
+ | patterns more easily than others (and why languages therefore | ||
+ | more commonly exhibit these patterns). Some word orders | ||
+ | place prominent words in more consistent positions across different | ||
+ | types of phrases; other patterns are more internally regular | ||
+ | or conform better to the left-to-right biases of auditory | ||
+ | processing. A full understanding of the principles underlying | ||
+ | these learning outcomes awaits further research. What is clear, | ||
+ | however, is that statistical learning is not simply a veridical | ||
+ | reproduction of the stimulus input. Learning is shaped by a | ||
+ | number of constraints on perception and memory, at least | ||
+ | some of which may apply not only to languages but also to | ||
+ | nonlinguistic patterns. |