Dr. Michael Ramscar
Recent Presentations


Invited talks

  • Does associative learning work?
    23rd November, 2020 — Bridging silos: Learning mechanisms for first language. On-line Workshop, Department of Psychology, University of Alberta, Canada.

  • How systems of morphological contrasts contribute to the discriminative process of communication
    7th February, 2020 — Workshop on Implications of psycho-computational modelling for Morphological Theory (PsyComMT 2020), Vienna, Austria.


Invited talks

  • The discriminative nature of human communication systems.
    16th July, 2018 — Department of Psychological Sciences, Liverpool University, Liverpool, UK

      Linguists often talk about language using words like encode, semantics and communicate. They use these words in the absence of clear formal definitions, the assumption being that these will emerge out of further study. By contrast, information theory has provided precise mathematical definitions for at least some of these terms, albeit within an abstract formal framework that focuses on discriminating the actual message sent in a signal from all of the possible messages that might have been sent. In this talk, I will describe how the precise, if abstract, nature of information theory makes clear predictions about the kind of statistical structures that we should expect to find throughout any communicative code, and show how these exact structures are to be found everywhere in natural languages, even in arcane areas such as personal name grammars and grammatical gender systems (which many linguists consider to be barely communicative at all). I will show how a geometric distribution of name tokens is universal across all major languages, and describe how this gives rise to the communicative functions of personal names. I will then show how the sets of nouns and verbs that English speakers actually encounter in communication also have a geometric distribution, such that they appear to support the same kind of communicative process. Finally, I will describe some of the implications the empirical phenomena I identify have for our understanding of human communication and cognition.

    • The discriminative nature of human communication systems.
      27th June, 2018 — Invited Talk, The Schultink Lecture LOT Summer School, Groningen University, Netherlands




    Invited talks

    • The information structure of discriminative human communication systems.
      27th November, 2017 — Invited Colloquium Czech National Corpus, Prague, Charles University, Czech Republic

    • The discriminative nature of human communication.
      26th November, 2017 — Invited Colloquium Linguistics Department, Charles University, Prague, Czech Republic

    • Self-Organization and Optimization in Collective Linguistic Behavior.
      20th November, 2017 — Invited Talk Workshop on Population effects on languages (PopLang), Lyon, France

        It is a truth universally acknowledged, that a single man in possession of a good fortune must be in want of a wife. — Jane Austen, Pride and Prejudice
        This talk describes a much less recognized and perhaps far more surprising truth: that property and marriage, in conjunction with population size, have governed -- and continue to govern -- naming practices in English-speaking countries in a remarkably law-like way. In doing so, I shall also describe another little recognized truth, that ought to be less surprising than it is likely to be to many: that at a micro-level, human linguistic behavior is exquisitely sensitive to the information it communicates (and receives), such that at a macro-level within communities of speakers -- and over generational time amongst their ancestors -- collective linguistic behavior self-organizes to optimize its efficiency in response to communicative and environmental pressures. This self-organizing behavior has led to the evolution of the highly structured systems we call languages, and it continues to maintain these systems in response to communicative and environmental pressures even today. I shall illustrate each of these points in relation to an aspect of linguistic behavior that linguists have traditionally considered to be barely communicative, and largely unstructured: the naming of individuals within linguistic communities. First, I shall show that the name grammars of Western and Sinosphere languages are in fact highly structured communication systems that historically shared what seems to be a universal form; and second, I will show that contrary to widespread belief, there appears to have been no fundamental change in the principles governing English naming practices between the early modern period, where for centuries it was common for half of all English males to be called either John, William or Thomas, and today, where the apparent volatility of naming has led to the annual publication of what appear to be ever-changing lists of most popular names. Finally, I will describe work showing how the same self-organizing structure can be seen across the rest of the grammar, and lay out some of the implications that the phenomena identified here have for theoretical understandings of human communication and cognition.

    • Gendered insights into the syntax and semantics of English adjectives.
      31st August, 2017 — Invited Plenary International Symposium on Polyadjectival Nominal Phrases, Edge Hill University, Ormskirk, UK.

    • The discriminative nature of human communication
      26th January, 2017 — Colloquium: Cambridge Linguistics Forum, University of Cambridge, UK.

        Information theory has shown that exponential distributions are beneficial to the design of efficient communication systems, because they are both optimal for coding purposes and memoryless. It has recently been shown that Sinosphere family names are exponentially distributed, and I will show how, consistent with this, the empirical distributions of names -- and other classes of lexical items -- that English speakers and hearers engage with in moment-to-moment communication are also exponential, such that the Zipfian distributions long thought to play a functional role in language are an artifact of the mixing of these empirical distributions. I will illustrate the detailed workings of the communicative process that this distributional structure supports by presenting a full account of the incremental, discriminative syntactic and semantic properties of personal names. I will further show that the distributional structures supporting this process are universal to the world’s major languages. Finally, I will describe the implications that the phenomena identified here have for theoretical understandings of human communication and cognition.

    • What replication problems tell us about the possibilities for psychological science
      17th February, 2017 — Invited talk: Leopoldina (German National Academy of Sciences) Workshop on Replication in Psychology, Würzburg University, Germany.




    Invited talks

    • Learning, competition and the nature of morphology
      21st February 2016 — Invited Plenary: 17th International Morphology Meeting, Vienna, Austria.

        For most of the last century, the study of language has largely assumed an atomistic model in which linguistic signals comprise discrete, minimal form elements which are in turn associated with discrete, minimal elements of meaning. Accordingly, production has been seen to involve the composition of messages from an inventory of form (i.e., morphological) elements, and comprehension the subsequent decomposition of these messages. Research in linguistics has thus tended to focus on identifying and classifying these elements, and on attempting to formulate lossless processes of composition and decomposition (Bloomfield, 1933; Matthews, 1991). This program has raised as many questions as answers, especially when it comes to specifying the nature of form-meaning associations (Blevins, 2016).

        By contrast, across the same period behavioral and neuroscience research based on human and animal models has revealed that “associative learning” is a discriminative process (Ramscar, Dye, & McCauley, 2013). Learners acquire predictive understandings of their environments through competitive mechanisms that tune systems of internal cue representations to eliminate or reduce any uncertainty they promote. Critically, models of this process better fit empirical data when these cue representations do not map discretely onto the aspects of the environment learners come to discriminate (Ramscar & Port, 2015). Seen from this perspective, languages are probabilistic communication systems (Shannon, 1948; Ramscar & Baayen, 2013) that exhibit continuous variation within a multidimensional space of form-meaning contrasts. Discrete descriptions of these systems at either an individual (psychological) or community (linguistic) level are thus necessarily idealizations. Since idealizations inevitably lose information, the different types of idealizations explored in different atomistic models of morphology over the past century can be seen to differ mainly in terms of the kinds of information that they lose.

    • Learning, discrimination and language
      15th April 2016 — Colloquium: Institut für Vergleichende Sprachwissenschaft, Universität Zürich, Switzerland.

    • Learning, discrimination and language
      4th May 2016 — Colloquium: Division of Psychology & Language Sciences, University College London, UK.

    • The discriminative nature of human communication
      October 7, 2016 — Institute for Mathematical Behavioral Sciences, UC Irvine, California, USA.





    Invited talks

    • Nonlinear dynamics of lifelong learning: The myth of cognitive decline
      15th January 2015 — Colloquium: Department of Linguistics, Saarland University, Saarbrücken, Germany.

        As adults age, their reaction times slow across a range of psychometric tests. This has been widely taken to show that cognitive information-processing capacities decline over the course of adulthood. I will show that these response patterns, which are typically taken as evidence for (and measures of) declining cognitive-processing capacities, arise naturally out of basic principles of learning. These basic, formal learning principles both correctly identify the pattern of performance exhibited by very young children in word learning tasks (which can differ greatly from that of young adults) and successfully predict that older adults will exhibit far greater sensitivity to fine-grained properties of test stimuli than younger adults. Taken together, the findings I present show that the patterns of change observed in cognitive performance across the lifespan simply reflect the consequences of learning from the kind of statistical distributions that typify human experience. Once the information-processing loads that are inevitably imposed by learning from this experience are controlled for, it appears that the performance changes that are usually taken as evidence of innate abilities in infants, or declining cognitive capacities in older adults, support little more than the unsurprising idea that the way individuals choose between, or recall, items will be affected by the number of items there are and what an individual has already learned about them. I will consider the implications of this for our scientific and cultural understanding of lifelong cognitive development.

    • Nonlinear dynamics of lifelong learning: The myth of cognitive decline [video]
      12th February 2015 — Colloquium: Department of Cognitive Science, Rensselaer Polytechnic Institute, New York.

    • Discriminative Language Learning
      27th February 2015 — Invited Talk: Workshop on Child Language Development, University of Manchester, United Kingdom.

    • The discriminative brain
      13th July 2015 — Invited Talk: The First International Quantitative Morphology Meeting - IQMM1. Belgrade, Serbia

    • Frequency of what, exactly? Linguistic metrics and the discriminative stance
      30th July 2015 — Invited Talk: Frequency metrics in psycho- and sociolinguistics. Freiberg, Germany

    • Explorations in discrimination learning and language
      31st July 2015 — Invited Talk: Frequency metrics in psycho- and sociolinguistics. Freiberg, Germany

    • Nonlinear dynamics of lifelong learning: The myth of cognitive decline
      30th October 2015 — Colloquium: Laboratoire Dynamique Du Langage, Université Lyon 2, France

    • Sticks and stones: some social and cognitive consequences of names
      1st December 2015 — Colloquium: LS Psychologie II, Universität Würzburg, Germany

    • Learning and information processing across the lifespan: why it is time to take the superstition out of healthy aging
      3rd December 2015 — Invited Talk: Workshop on the Future of Aging: Centre for Integrative Neuroscience, Universität Tübingen, Germany

    • From learning to language: how cognitive development shapes human communication
      11th December 2015 — Invited Talk: Studying Language Learning From the Laboratory to the Classroom, Universität Tübingen, Germany





    Invited talks

    • Nonlinear dynamics of lifelong learning: The myth of cognitive decline
      24th January 2014 — Colloquium: Basque Center on Cognition, Brain and Language, San Sebastián, Spain.

    • Nonlinear dynamics of lifelong learning: The myth of cognitive decline
      21st February 2014 — Colloquium: Department of Psychology, University of Sheffield, Sheffield, United Kingdom.

    • Using corpora to model lifelong learning
      3rd June 2014 — Invited talk: Workshop on Corpus resources for quantitative and psycholinguistic analysis, Eszterházy College, Hungary.

    • Expectation and negative evidence in morphological learning: the curious absence of “mouses” in adult speech
      21st July 2014 — Invited talk: The Stuff Words are Made of: An International Conference on the Cross-linguistic Comparison of Indo-Germanic and Semitic Languages, University of Konstanz, Germany.

    • The discriminative approach to language learning and communication
      4th September 2014 — Workshop on Information Theory, 8th International Conference on Construction Grammar, University of Osnabrück, Germany.

        The study of language has been dominated by combinatorial models of linguistic communication. The shortcomings of these models can be distilled into a single fundamental flaw: they assume “forms” of representations -- and processes acting upon them -- that could never be realized on any computational device because they violate some very basic mathematical principles of coding. Because of this, combinatorial models of language — and, to a large extent, our folk intuitions about language — ultimately make little computational sense. If the psychological processes that allow our minds to communicate are to be understood computationally, we will need a different, more profitable way of thinking about language. To this end, I will describe a discriminative model of communication that is both grounded in, and, more importantly, consistent with formal theories of coding, information processing and learning. I will then describe a surprisingly wide range of predictions and empirical findings that lend strong support to this alternative model of language. The approach I describe offers not only a far more profitable and productive conception of communication for scientists and engineers studying language (and cognitive processes more generally), it is also one that has benefits for the users of language: improving our understanding of the way in which we actually communicate with one another can allow the way we learn, teach and even use language to be refined and improved on.

    • Language learning: The discriminative approach
      14th October 2014 — Colloquium: Symposium on Child Language Development, Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands.

    • Language as a complex adaptive discriminative system
      22nd October 2014 — Colloquium: Linguistics Department, Lyon University, Lyon, France.

    • What can linguistics tell us about cognitive aging?
      14th November 2014 — Invited talk: SfSB, Tübingen University, Germany.

    Peer reviewed conference talks

    • Information, discrimination learning… and morphology
      30th May 2014 — Symposium on Information Theory in Morphology, 16th International Morphology Meeting, Budapest, Hungary. Joint presentation with Jim Blevins.

    • How network complexity affects our understanding of cognitive processing across the lifespan
      23rd November, 2014 — 55th Annual Meeting of the Psychonomic Society, Long Beach, California.








    Invited talks


    • Language as Prediction
      B.F. Skinner Lecturer, 37th Annual Convention of the Association for Behavior Analysis International, Denver, CO.

        In this talk, I'll explore the idea that when humans communicate, they engage in a process of joint prediction. When talking, speakers use a rich set of cultural and experiential priors to produce behavior that they expect will change the beliefs or behavior of others. Speakers use semantic cues to activate appropriate linguistic units. These words and chunks, along with other developing contextual cues, then activate subsequent linguistic units as speakers generate the utterances they believe are most likely to bring about changes in listeners' beliefs or behavior. At the same time, listeners, far from being passive decoders of tokens of meaning, are using broadly the same process to predictively build up their understanding of what is being said. Listeners use both learned semantic cues to words, and words themselves as cues to other words, in order to predict the behavior and intentions of speakers. Successful communication thus relies both on the collaboration between speaker and listener, and the degree to which shared prior knowledge enables mutual predictability. An attractive property of this approach is that it allows human communication to be couched in terms compatible with theories of learning.


      Information structure and learning: The artificiality of grammar [slides]
      Quantitative Measures in Morphology and Morphological Development, San Diego, CA.


      Discrimination, prediction and language: The importance of being wrong
      Cognitive Science Colloquium Series, Indiana University.

        What is going on in the mind of someone who is speaking, or listening to someone speak? I will present a series of findings in support of the idea that human communication is best conceived of as a process of mutual prediction, based on a system of conventionalized cultural and experiential cues. These ideas are grounded in the notion that cognition and learning are themselves fundamentally predictive: that the purpose of a cognitive system is to successfully predict events in the environment, and that the purpose of learning is to minimize error in doing so. Conceiving of language from within this kind of predictive framework not only makes sense, it can also provide formal solutions to many common puzzles about language learning, and explain the function of otherwise baffling linguistic structures such as grammatical gender. I’ll present evidence from computational and behavioral experiments which suggest that grammar and understanding are predictive and probabilistic, and that human languages -- and even the names we give our children -- are shaped by well established information and learning theoretic principles.


      Program Colloquia, Department of Linguistics, Tübingen University.


    Peer reviewed conference talks


      Manipulating information structure as a method of localizing information processing in the brain
      18th Annual Cognitive Neuroscience Meeting, San Francisco, CA.


        When formalized in terms of prediction and cue competition, symbolic learning takes two forms: learning to predict labels from the features of objects and events (Feature-to-Label learning), or learning to predict features from labels (Label-to-Feature learning). When the information available in training is structured in one or another of these formats, qualitative differences in symbolic learning occur. Discrimination learning is facilitated when objects precede labels (FL), because the structure of information promotes cue competition between individual features. However, this competition is inhibited when labels predict objects (LF; Ramscar et al., 2010). We report an fMRI investigation of these Feature-Label-Ordering effects in learning. Participants were trained and tested on a category-learning task while the frequency of confusable categories was manipulated so that successful discrimination was essential to successful categorization. Participants trained to predict labels from features (FL) showed higher levels of dorsal striatal activity (caudate and putamen), which correlated with overall performance at test. The opposite pattern was observed with ventrolateral prefrontal cortex (VLPFC) activation, which was greater in participants trained to predict features from labels (LF), and which correlated negatively with performance on the difficult-to-discriminate low-frequency items. The increased striatal activity we observed in the FL-trained participants is consistent with evidence linking this area to discrimination learning, while the correlation between VLPFC activity and poorer discrimination in the LF-trained participants supports the idea that the structure of information in training forced participants to rely on working memory, fixating on cues that were frequent, salient, and yet ultimately uninformative.


      How children learn to value numbers: Information structure and the acquisition of numerical understanding
      33rd Annual Meeting of the Cognitive Science Society, Boston, MA
      *Winner of the 2011 CogSci CaSL Prize sponsored by the Institute of Education Sciences

        Although number words are common in everyday speech, for most children, learning these words is an arduous, drawn-out process. Here we present a formal, computational analysis of number learning that suggests that the unhelpful structure of the linguistic input available to children may be a large contributor to this delay, and that manipulating this structure should greatly facilitate learning. A training experiment with three-year-olds confirms these predictions, demonstrating that significant, rapid gains in numerical understanding and competence are possible given appropriately structured training. At the same time, the experiment illustrates how little benefit children derive from the usual training that parents and educators provide. Given the efficacy of our intervention, the ease with which it can be adopted by parents, and the large body of research showing how strongly early numerical ability predicts later educational outcomes, this simple discovery could have potentially far-reaching import.


      The Evolution of Noun Classification in Two Germanic Languages.
      44th Annual Meeting of the Society for Mathematical Psychology, Boston, MA.

        For generations, linguists, philosophers and psychologists have accepted the idea that grammatical gender serves no functional purpose, even though it has co-evolved across many different languages. We question this ‘purposeless’ assumption by considering the case of the German gender system and examining whether gendered determiners might play an informative role in language processing. An information theoretic analysis of German reveals that the gender system serves to make nouns more predictable in context. Moreover, like other subsystems of language - such as verb inflection - the gender system is more specifically informative about high frequency items than low frequency items. To further assess the functional role that gender plays, we then compare German to modern English, a Germanic language that has largely shed its gender system. We find that grammatical gender allows German speakers to use a wider variety of nouns after articles. However, it appears that English has systematically compensated for its diminished gender system by extending the use of prenominal adjectives, employing them with greater frequency as the frequency of the nouns they precede decreases. We show that not only do English prenominal adjectives help to make nouns more predictable in context, but that the distribution of prenominal adjectives is organized to optimize this function by ensuring that prenominal adjectives provide more support for low frequency nouns than high frequency nouns, thereby helping to make all nouns equally predictable in context. We consider the implications of these findings for our wider understanding of language and communication.


    Lab Talks


      Princeton University, Department of Psychology.
      University of Pennsylvania, Department of Psychology.
      University of California at San Diego, Department of Linguistics.