Abstract

LIMITS AND VARIATIONS OF LINGUISTIC GENERALIZATIONS
Silvia Radulescu (Utrecht University), Frank Wijnen (Utrecht University) & Sergey Avrutin (Utrecht University)
- published in the Proceedings of the 20th Annual Conference on Architectures and Mechanisms for Language Processing (AMLaP), p. 207, Edinburgh, Scotland, 2014

When confronted with the challenge of learning their native language, children infer generalized rules impressively fast from a limited set of linguistic items, and apply those rules to strings of words they have never heard before. This study investigates what triggers, and what limits, the inductive leap from memorizing specific items to extracting general rules. Our new entropy-based approach predicts that generalization is a cognitive mechanism resulting from the interaction between the complexity of the linguistic input (its entropy) and the limited processing and memory capacity of the human brain (i.e. a limited channel capacity).

It has been argued that children detect patterns in auditory input, such as phonotactic regularities (Chambers, Onishi & Fisher, 2003) and word boundaries (Saffran, Aslin & Newport, 1996), by statistical learning. However, since statistical learning amounts to computing the probability that one sound in the input follows another, it cannot account for the abstraction of rules beyond the input. A conceptual distinction has been drawn between the types of abstractions learners make (Gómez & Gerken, 2000): abstractions based on perceptual characteristics of specific elements (ba follows ba) and category-based abstractions (relations between abstract categories, e.g. Noun-Verb-Adverb). It has also been proposed that humans have an algebra-like system (Marcus, Vijayan, Rao & Vishton, 1999) for extracting general rules that apply to categories of items, such as “the first item is the same as the third item” (li_na_li). An algebra-like system addresses generalization to novel input, but it does not explain how humans tune into such algebraic rules, or which factors in the linguistic input, if any, facilitate or impede this process. Our entropy model addresses these questions and bridges the gap between previous findings, unifying them under one consistent account. According to this model, low complexity in the linguistic input allows memorization of specific items, whereas higher input complexity, which overloads the channel capacity, drives the tendency to generalize, that is, to reduce the number of features that individual items are coded for, to group items into abstract categories, and to acquire relations between these categories.
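
To make this contrast concrete, here is a minimal sketch (with invented syllables, not the actual stimuli): item-bound statistical learning tracks transitional probabilities between specific syllables, while a category-based AAB rule carries over to syllables never seen in training.

  from collections import Counter

  training = [("li", "li", "na"), ("ba", "ba", "po"), ("ko", "ko", "ga")]

  # Item-bound statistics: P(next | current) estimated from bigram counts.
  bigrams, unigrams = Counter(), Counter()
  for string in training:
      for current, nxt in zip(string, string[1:]):
          bigrams[(current, nxt)] += 1
          unigrams[current] += 1

  def transitional_probability(current, nxt):
      return bigrams[(current, nxt)] / unigrams[current] if unigrams[current] else 0.0

  # Category-based rule: the abstract AAB structure, regardless of which
  # syllables fill the slots.
  def matches_aab(string):
      first, second, third = string
      return first == second and first != third

  print(transitional_probability("li", "li"))  # 0.5: attested transition
  print(transitional_probability("zu", "zu"))  # 0.0: novel syllable, no statistics
  print(matches_aab(("zu", "zu", "fe")))       # True: the rule extends to novel items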

To probe the effect of input complexity on the process of generalization, we exposed adult participants to three-syllable AAB strings implementing a miniature artificial grammar. To obtain three degrees of input complexity, two factors were manipulated: the number of different syllables and the number of repetitions of each syllable. Entropy (a function of the number of items and their probability of occurrence) served as the measure of input complexity, yielding three experimental conditions: low entropy (4×6 A-syllables / 4×6 B-syllables), medium entropy (2×12 As / 2×12 Bs) and high entropy (1×24 As / 1×24 Bs). Participants gave grammaticality judgments on four types of test strings: grammatical AAB strings with trained syllables, grammatical AAB strings with novel syllables, ungrammatical ABC strings with novel syllables (three different new syllables), and ungrammatical ABC strings with trained syllables. The results showed that the higher the input complexity, the stronger the tendency to abstract away from specific items and make a category-based generalization (i.e. to judge novel strings with an AAB structure grammatical). Performance on ungrammatical ABC strings with trained syllables was roughly U-shaped: participants in the low entropy condition correctly rejected these test strings significantly more often than participants in the medium entropy condition, but only slightly more often than participants in the high entropy condition. These results point to the interplay between two tendencies, perceptually-bound learning and category-based abstraction, which work against each other, pushing in different directions depending on the strength each has gained.
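
In a design where all syllables within a condition are equiprobable, Shannon entropy, H = -Σ p(x) log2 p(x), reduces to log2 of the number of distinct syllables. A minimal sketch of the computation follows, assuming (our reading of the notation above) that m×n means m repetitions of each of n distinct syllables; the syllable labels are placeholders, not the actual stimuli.

  from collections import Counter
  from math import log2

  def entropy(tokens):
      # Shannon entropy H = -sum(p * log2(p)) over the token frequencies.
      counts = Counter(tokens)
      total = sum(counts.values())
      return -sum((n / total) * log2(n / total) for n in counts.values())

  # Each condition has 24 A-syllable tokens in total; only the trade-off
  # between syllable types and repetitions differs.
  conditions = {
      "low (4x6)":     [f"A{i}" for i in range(6) for _ in range(4)],
      "medium (2x12)": [f"A{i}" for i in range(12) for _ in range(2)],
      "high (1x24)":   [f"A{i}" for i in range(24)],
  }
  for name, tokens in conditions.items():
      print(f"{name}: H = {entropy(tokens):.2f} bits")
  # low: 2.58 bits, medium: 3.58 bits, high: 4.58 bits -- entropy rises as
  # the same 24 tokens are spread over more distinct syllable types.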

This entropy model can account for the inductive leap from perceptually-bound learning, which memorizes and produces constructions with items encountered in the input, to a category-based mechanism that productively applies abstract rules, as a response to the degree of complexity (entropy) in the environment. Unlike previous accounts, the model also provides a quantitative measure of the likelihood of generalization across different ranges of input complexity.
