Publications

Prosodic cues enhance rule learning by changing speech segmentation mechanisms

authors:

Abstract

Human learners, including infants, are highly sensitive to structure in their environment. Statistical learning refers to the process of extracting this structure. A major question in language acquisition in the past few decades has been the extent to which infants use statistical learning mechanisms to acquire their native language. There have been many demonstrations showing infants’ ability to extract structures in linguistic input, such as the transitional probability between adjacent elements. This paper reviews current research on how statistical learning contributes to language acquisition. Current research is extending the initial findings of infants’ sensitivity to basic statistical information in many different directions, including investigating how infants represent regularities, learn about different levels of language, and integrate information across situations. These current directions emphasize studying statistical language learning in context: within language, within the infant learner, and within the environment as a whole.

What is statistical learning? In its broadest sense, statistical learning entails the discovery of patterns in the input. This type of learning could range, in principle, from the supervised learning found in operant conditioning (learning that a certain behavior leads to reinforcement or punishment), to unsupervised pattern detection, to the sophisticated probability learning exemplified in Bayesian models. The types of patterns tracked by a statistical learning mechanism could be quite simple, such as a frequency count, or more complex, such as conditional probability. Likewise, the elements over which the computations are done could vary in complexity, as with geometric shapes and faces, or in concreteness, as with syllables and syntactic categories.

The field of language acquisition has taken special interest in the idea of statistical learning because of the rapidity with which infants typically acquire their native language, despite the complexity of the structures to be acquired. The goal of this review is not to cover the well-trodden recent history of this area (for useful overviews, see Refs 1,2). Instead, we will highlight current directions in this field, with an eye toward the next phase of research on statistical language learning. A decade ago, the driving question in this area was whether infants actually track statistics in linguistic input. The answer to that question appears to be an unequivocal yes. Given that infants are clearly good pattern learners, the next set of questions concerns how infants use those patterns.

This review is thus organized around some of the most interesting directions in which statistical language learning research is heading: upward through the levels of language structure beyond the initial task studied in this area, word segmentation; inward to connect with other cognitive mechanisms; and outward to ask whether statistics are actually useful given the rich input characteristic of natural languages. While this review will pose more questions than it will answer, we hope it will help to elucidate the next crucial steps for this burgeoning field of research.

In language acquisition, the term ‘statistical learning’ is most closely associated with tracking sequential statistics, typically transitional probabilities (TPs), in word segmentation or grammar learning tasks. A TP is the conditional probability of Y given X in the sequence XY. Typically, experimental materials are designed so that TPs can be calculated over the ‘phonetic’ content of the speech stream, such as segments, syllables, or words. However, a broad understanding of statistical learning incorporates both a greater range of possible computations and more aspects of the speech stream. It is possible that learners are computing any of several basic statistics, such as frequency of individual elements, frequency of co-occurrence, mutual information, or many others. Prosodic patterns, stress patterns, distributional cues such as frequent frames, phonotactic patterns, the physical context of the interaction (e.g., objects in view), and the social context of the interaction (e.g., the speaker’s eye gaze direction) could all enter into the computations of the learner. All of these types of regularities provide probabilistic information regarding language structure and use and are potentially helpful for learning about where words begin and end, lexical category membership, grammatical structure, and word meanings. While the primary focus of research to date has been demonstrating infant sensitivity to these regularities, it is also clear that no single cue is sufficient for acquiring any aspect of language, nor are cues independent of one another. The field is now moving toward an integrative approach, asking how infant learners bring together multiple cues, both within domains (e.g., within the auditory stream) and across domains (e.g., between the auditory stream and the visual context), and how information is integrated and used over time (e.g., associating meanings with word forms that have been segmented using statistical cues).
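
To make the definition of a TP concrete, the following is a minimal sketch of how the conditional probability of Y given X could be estimated from a syllable stream, as the count of the pair XY divided by the count of X. The syllables, the three made-up "words", and their ordering are hypothetical illustrations and are not materials from any study discussed here; the function name `transitional_probabilities` is likewise an assumption chosen for this example.

```python
from collections import Counter


def transitional_probabilities(stream):
    """Estimate P(y | x) for every adjacent pair (x, y) in a sequence.

    TP(x -> y) = count(x followed by y) / count(x in non-final position).
    """
    pair_counts = Counter(zip(stream, stream[1:]))
    # The final element has no successor, so it is excluded from the denominator.
    first_counts = Counter(stream[:-1])
    return {(x, y): n / first_counts[x] for (x, y), n in pair_counts.items()}


# Hypothetical toy language: three trisyllabic "words" (assumed for illustration).
words = {
    "A": ["bi", "da", "ku"],
    "B": ["pa", "do", "ti"],
    "C": ["go", "la", "bu"],
}
# An arbitrary word order, concatenated with no pauses between words.
order = "ABCACBABCBAC"
stream = [syllable for w in order for syllable in words[w]]

tps = transitional_probabilities(stream)

print(tps[("bi", "da")])  # within-word transition: 1.0
print(tps[("ku", "pa")])  # across a word boundary: 0.5
```

In this toy stream, transitions inside a word are perfectly predictable, while transitions across word boundaries are not. A dip in TP is therefore one kind of sequential cue that a learner could, in principle, use to posit a word boundary; the same counting scheme extends to other statistics mentioned above, such as raw co-occurrence frequency.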