Differences
This shows you the differences between two versions of the page.
| aslin_newport [2015/11/22 23:16] – silvia | aslin_newport [2016/02/13 11:03] (current) – silvia | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| - | ====== Statistical Learning: From Acquiring | + | ====== Statistical Learning: From Acquiring Specific Items to Forming General Rules ====== |
| - | Specific Items to Forming General Rules ====== | + | |
| ---- | ---- | ||
| Line 28: | Line 27: | ||
| listening to) the input. | listening to) the input. | ||
| \\ | \\ | ||
| + | Saffran et al. (1996) suggested the term statistical learning to | ||
| + | refer to the process by which learners acquire information about | ||
| + | distributions of elements in the input. In their study, the probability that one syllable followed another | ||
| + | within a word (the transitional probability) was 1.0, whereas | ||
| + | the transitional probability of syllable pairs at word boundaries | ||
| + | was 0.33. | ||
| + | \\ | ||
| + | Thus, statistical learning is a powerful and domain-general | ||
| + | mechanism available early in development to infants | ||
| + | who are naïve (i.e., uninstructed) about how to negotiate a | ||
| + | complex learning task. | ||
| + | These results show that a statistical-learning mechanism | ||
| + | enables learners to extract one or more statistics and use this | ||
| + | information to make an implicit decision about the stimulus materials | ||
| + | that were present in the input. This ability is important for | ||
| + | learning which syllables form words, for estimating the number | ||
| + | of peaks in a distribution of speech sounds, and for discovering | ||
| + | which visual features form the parts of a scene. But this does not | ||
| + | address the question of how learners form rules—abstractions | ||
| + | about patterns that could be generalized to elements that have | ||
| + | never been seen or heard. How do learners who are exposed to a | ||
| + | subset of the possible patterns in their input go beyond this to | ||
| + | infer a set of general principles or “rules of the game”? | ||
| + | \\ | ||
| + | Several studies have documented that infants can make the | ||
| + | inductive leap from observed stimuli to novel stimuli that follow | ||
| + | the same rules. | ||
| + | \\ | ||
| + | **Some researchers have claimed that statistical learning and | ||
| + | rule learning are two separate mechanisms, because statistical | ||
| + | learning involves learning about elements that have been presented | ||
| + | during exposure, whereas rule learning can be applied | ||
| + | to novel elements and novel combinations** (see Endress & | ||
| + | Bonatti, 2007; Marcus, 2000). **But why do learners sometimes | ||
| + | keep track of the specific elements in the input they are | ||
| + | exposed to and at other times learn a rule that extends beyond | ||
| + | the specifics of the input? An alternate hypothesis is that these | ||
| + | two processes are in fact not distinct, but rather are different | ||
| + | outcomes of the same learning mechanism.** | ||
| + | \\ | ||
| + | \\ | ||
| + | // | ||
| + | \\ | ||
| + | \\ | ||
| + | For example, some stimulus dimensions are naturally more | ||
| + | salient than others. **If stimuli are encoded in terms of their | ||
| + | salient dimensions rather than their specific details, then learners | ||
| + | will appear to generalize a rule by applying it to all stimuli | ||
| + | that exhibit the same pattern on these salient dimensions.** | ||
| + | \\ | ||
| + | \\ | ||
| + | //MyNote//: what triggers encoding in terms of the salient dimensions that apply to all stimuli? | ||
| + | \\ | ||
| + | \\ | ||
| + | Although perceptual cues can serve as powerful constraints on | ||
| + | statistical learning, perceptual salience is not how most rules | ||
| + | are defined in the natural environment. | ||
| + | \\ | ||
| + | **Learners acquire rules when patterns in the input indicate | ||
| + | that several elements occur interchangeably in the same contexts, but acquire specific instances when the patterns | ||
| + | apply only to the individual elements.** | ||
| + | \\ | ||
| + | \\ | ||
| + | //MyNote//: **CRUCIAL POINT**: what features of the input indicate that elements occur interchangeably? | ||
| + | \\ | ||
| + | \\ | ||
| + | For example, Xu and | ||
| + | Tenenbaum (2007) have shown that if children hear the word | ||
| + | “glim” applied to three different dogs, they will infer that | ||
| + | “glim” means dog. In contrast, if “glim” is used three times to | ||
| + | refer to the same dog, children interpret it as the dog’s name. | ||
| + | The same contrast between learning items and learning rules | ||
| + | can occur for syllable and word sequences. | ||
| + | \\ | ||
| + | Gerken (2006) has made this argument by reconsidering | ||
| + | and modifying the design of the Marcus et al. (1999) rule-learning | ||
| + | experiment (see Fig. 3). Marcus et al. presented 16 | ||
| + | different AAB strings in the learning phase of their experiment. | ||
| + | Notice in Figure 3 that four strings ended in di, four | ||
| + | ended in je, four ended in li, and four ended in we. Thus, | ||
| + | infants could have learned the general AAB rule, or they could | ||
| + | have learned a more specific pattern: that every string ended in | ||
| + | di, je, li, or we. The more consistent or reliable cue was the | ||
| + | repetition of the first two syllables—the AAB rule—because it | ||
| + | applied to every string, whereas the “ends in di (or je, or li, or | ||
| + | we)” rule applied to only one-fourth of the strings. | ||
| + | Gerken (2006) asked whether infants presented with a subset | ||
| + | of the 16 strings from the Marcus et al. (1999) study would | ||
| + | favor the “repetition of the first two syllables” rule or the | ||
| + | “ends in di, je, li, or we” rule. Infants who heard only four | ||
| + | AAB strings that ended in the same syllable (e.g., di in the | ||
| + | leftmost column of Fig. 3) were tested on two equally plausible | ||
| + | rules: (1) all strings involve an AAB repetition, and (2) all | ||
| + | strings end in di. These infants failed to generalize the first | ||
| + | rule to a novel string that retained the AAB pattern but did not | ||
| + | end in di. In contrast, infants who heard only four AAB strings | ||
| + | lying along the diagonal in Figure 3 replicated the Marcus | ||
| + | et al. result. Because these strings shared an AAB pattern but | ||
| + | ended in four different syllables, only the AAB rule was | ||
| + | reliable. | ||
| + | \\ | ||
| + | \\ | ||
| + | //MyNote//: **QUESTION**: | ||
| + | \\ | ||
| + | Consider the set of 4 strings: //leledi, wiwije, jijili, dedewe// | ||
| + | \\ | ||
| + | The following rules are equally reliable for all strings: | ||
| + | \\ | ||
| + | 1. AAB | ||
| + | \\ | ||
| + | 2. starts with 2x //le, wi, ji or de// | ||
| + | \\ | ||
| + | 3. ends in //di, je, li, we// | ||
| + | \\ | ||
| + | Why do learners sometimes stick to the narrow generalizations [2,3] and sometimes make a wider generalization (category-based) [1]? | ||
| + | \\ | ||
| + | \\ | ||
| + | In recent work, we (Reeder, Newport, & Aslin, 2009, 2010) | ||
| + | demonstrated a similar phenomenon—and described some of | ||
| + | the principles for its operation—in the learning of an artificial-language | ||
| + | grammar. In our experiments, adult learners were | ||
| + | presented with sentences made up of nonsense words that | ||
| + | came from three different grammatical categories (A, X, and | ||
| + | B), much like subjects, verbs, and direct objects in sentences | ||
| + | such as “Bill ate lunch.” Depending on the experiment, the | ||
| + | input included sentences in which **all of the words within a | ||
| + | particular category occurred in the same contexts** (e.g., words | ||
| + | X1, X2, and X3 all occurred after any of the A words and before | ||
| + | any of the B words), or **the input included only sentences in | ||
| + | which the X words occurred in a limited number of overlapping | ||
| + | A-word or B-word contexts**. | ||
| + | Adult learners are surprisingly sensitive to these differences. | ||
| + | Our results showed that **// | ||
| + | depended on the precise degree of overlap among word | ||
| + | contexts that they heard in the input, and also on the consistency | ||
| + | with which a particular A or B word was missing from | ||
| + | possible X-word contexts// | ||
| + | \\ | ||
| + | \\ | ||
| + | **Adults generalize rules when the | ||
| + | shared contexts are largely the same, with only an occasional | ||
| + | absence of overlap (i.e., a “gap”). However, when the gaps are | ||
| + | persistent, adults judge them to be legitimate exceptions to the | ||
| + | rule and no longer generalize to these contexts.** | ||
| + | \\ | ||
| + | \\ | ||
| + | //MyNote//: this is a broad description of the observed results, but no explanation as to why this is the case, and no precision in describing: " | ||
| + | \\ | ||
| + | \\ | ||
| + | Thus, similar | ||
| + | to the results of Gerken (2006), our findings showed that it | ||
| + | was the consistency of context cues that led learners to generalize | ||
| + | rules to novel strings, and it was the inconsistency of | ||
| + | context cues that kept learners from generalizing and led them | ||
| + | to treat some strings as exceptions. | ||
| + | The key point here is that in terms of the reliability of context | ||
| + | cues, statistical learning and rule learning are not different | ||
| + | mechanisms (see Orban, Fiser, Aslin, & Lengyel, 2008). When | ||
| + | there are strong perceptual cues, such as the repetition of elements | ||
| + | in an AAB sequence, a statistical-learning mechanism | ||
| + | can compute the regularities of the repetitions (i.e., they are | ||
| + | either present or absent) or of the elements themselves (e.g., | ||
| + | the particular syllables). And, as hypothesized by Gerken | ||
| + | (2006) and Reeder et al. (2009, 2010), even when there are no | ||
| + | perceptual cues, the consistency of how the context cues are | ||
| + | distributed across strings of input determines whether a rule is | ||
| + | formed—enabling generalization to novel strings—or whether | ||
| + | specific instances are learned. According to this hypothesis, | ||
| + | statistical learning is a single mechanism whose outcome | ||
| + | applies either to elements that have been experienced or to | ||
| + | generalization beyond experienced elements, depending on | ||
| + | the manner and consistency with which elements are patterned | ||
| + | in the learner’s input. Importantly, | ||
| + | accomplished without instruction, | ||
| + | structured input. | ||
| + | ---- | ||
| + | **Conclusion: | ||
| + | \\ | ||
| + | Perceptual salience and the patterning of context cues are not | ||
| + | the only factors that can influence what learners acquire via a | ||
| + | statistical-learning mechanism. An extensive literature in linguistics | ||
| + | has argued that languages of the world display a small | ||
| + | number of universal patterns—or a few highly common patterns, | ||
| + | out of many that are possible—and has suggested that | ||
| + | language learners will fail to acquire languages that do not | ||
| + | exhibit these regularities (Chomsky, 1965, 1995). | ||
| + | Recently, a number of studies using | ||
| + | artificial grammars have indeed shown that both children and | ||
| + | adults will more readily acquire languages that observe the | ||
| + | universal or more typologically common patterns found in | ||
| + | natural languages. | ||
| + | For example, Hudson Kam and Newport (2005, 2009) and | ||
| + | Austin and Newport (2011) presented adults and children with | ||
| + | miniature languages containing inconsistent, probabilistically | ||
| + | occurring forms (e.g., nouns were followed by the nonsense | ||
| + | word ka 67% of the time and by the nonsense word po the | ||
| + | remaining 33% of the time). This type of probabilistic variation | ||
| + | is not characteristic of natural languages, but it does occur | ||
| + | in the speech of nonnative speakers who make grammatical | ||
| + | errors. Adult learners in these experiments matched the probabilistic | ||
| + | variation they had heard in their input when they produced | ||
| + | sentences using the miniature language, but young | ||
| + | children formed a regular rule, producing ka virtually all of the | ||
| + | time, thereby restoring to the language the type of regularity | ||
| + | that is more characteristic of natural languages. | ||
| + | \\ | ||
| + | \\ | ||
| + | It is not always clear why learners acquire certain types of | ||
| + | patterns more easily than others (and why languages therefore | ||
| + | more commonly exhibit these patterns). Some word orders | ||
| + | place prominent words in more consistent positions across different | ||
| + | types of phrases; other patterns are more internally regular | ||
| + | or conform better to the left-to-right biases of auditory | ||
| + | processing. A full understanding of the principles underlying | ||
| + | these learning outcomes awaits further research. What is clear, | ||
| + | however, is that statistical learning is not simply a veridical | ||
| + | reproduction of the stimulus input. Learning is shaped by a | ||
| + | number of constraints on perception and memory, at least | ||
| + | some of which may apply not only to languages but also to | ||
| + | nonlinguistic patterns. | ||
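
//MyNote//: the transitional probabilities quoted near the top of these notes (1.0 within a word, 0.33 at word boundaries) can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the syllables and three-syllable "words" are hypothetical placeholders rather than the original stimuli, and the stream is assumed to be built from four words drawn at random with no word repeated immediately. The function names `transitional_probabilities` and `make_stream` are made up for this note.

```python
import random
from collections import Counter

# Hypothetical three-syllable "words" (placeholders, not the original stimuli).
# Every syllable belongs to exactly one word.
WORDS = [
    ("ba", "di", "ku"),
    ("go", "la", "tu"),
    ("pi", "ro", "se"),
    ("mo", "ne", "fa"),
]


def transitional_probabilities(syllables):
    """Estimate P(next = y | current = x) for every adjacent pair (x, y)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it has a successor in the stream.
    first_counts = Counter(syllables[:-1])
    return {(x, y): n / first_counts[x] for (x, y), n in pair_counts.items()}


def make_stream(n_words, seed=0):
    """Concatenate randomly chosen words, never repeating a word immediately."""
    rng = random.Random(seed)
    stream, previous = [], None
    for _ in range(n_words):
        word = rng.choice([w for w in WORDS if w != previous])
        stream.extend(word)
        previous = word
    return stream


stream = make_stream(3000)
tp = transitional_probabilities(stream)

within = tp[("ba", "di")]    # pair inside a word
boundary = tp[("ku", "go")]  # pair spanning a word boundary

print(f"within-word TP:   {within:.2f}")
print(f"word-boundary TP: {boundary:.2f}")
```

Under these assumptions a within-word pair always has probability 1.0, because each syllable occurs in only one word, while a boundary pair comes out near 1/3, because each word can be followed by any of the three other words. A learner tracking these values could therefore place word boundaries at the dips in transitional probability.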