overview

  • theoretical & experimental pragmatics
    • natural language quantifiers
    • typicality of quantifier \(\textit{some}\)
  • generalized linear model
    • types of dependent variables
    • predictors & link functions
  • probabilistic model
    • gradient salience of alternative expressions
    • one predictor feeds two link functions

natural language quantifiers

[figure: penguinLogic]

natural language quantifiers

some examples:

  • \(\textit{no}\), \(\textit{some}\), \(\textit{all}\)
  • \(\textit{most}\), \(\textit{many}\), \(\textit{few}\)
  • \(\textit{three}\), \(\textit{at least 4}\), \(\textit{between 8 and 12}\)
  • \(\textit{less than Jake drank}\), \(\textit{2 more than Jake could possibly drink}\)

logical semantics

  • "No \(A\) is \(B\)" is true iff there is no \(A\) that is also a \(B\).
  • "Some \(A\) is \(B\)" is true iff there is at least one \(A\) that is also a \(B\).
  • "All \(A\) are \(B\)" is true iff there is no \(A\) that is not also a \(B\).
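The three truth conditions above can be read as set relations. A minimal Python sketch, assuming \(A\) and \(B\) are finite sets (the function names and the example sets are illustrative, not from the slides):

```python
def no(a, b):
    """'No A is B': there is no A that is also a B."""
    return len(a & b) == 0

def some(a, b):
    """'Some A is B': there is at least one A that is also a B."""
    return len(a & b) >= 1

def all_(a, b):
    """'All A are B': there is no A that is not also a B."""
    return len(a - b) == 0

# Hypothetical scene: 10 circles, 3 of them black.
circles = set(range(10))
black = {0, 1, 2}

print(no(circles, black), some(circles, black), all_(circles, black))
# False True False
```

Under these definitions "some" stays true when all circles are black, which is the logical reading the later slides contrast with pragmatic use.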

test your intuitions

"None/some/all of the circles are black."

[figures: example circle displays (0balls, 1balls, 2balls, 3balls, 4balls)]

experimental data (preview)

truth-value judgements for "Some of the circles are black."

pragmatics

Herbert Paul Grice

life & work

  • March 13, 1913 - August 28, 1988
  • Oxford & Berkeley
  • natural language philosophy
    • non-natural meaning
    • implicature

[figure: grice]

implicature

utterance meaning = semantic meaning + rational language use

Gricean pragmatics

Cooperative Principle

Make your conversational contribution such as is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange in which you are engaged.

Maxim of Quantity

  1. Make your contribution as informative as is required for the current purposes of the exchange.
  2. Do not make your contribution more informative than is required.

Maxim of Relevance

Be relevant!

Grice 1975 "Logic and Conversation"

Gricean language use

When would a cooperative speaker say: "Some of the 10 circles are black"?

no. of black balls   probability of using "some"   salient alternative
0                    very, very low                "none"
1                    very low                      "one"
2                    low                           "two"
3                    meh                           "three"
4-6                  high                          ???
7-9                  lower                         "most"
10                   low                           "all"

upshot

The pragmatic felicity of a description \(m\) for a situation \(c\) is a measure of how adequate \(m\) is for a given purpose of talk relative to alternative descriptions.

upshot

pragmatic felicity depends on:

  • purpose of conversation
  • salience of alternatives

pragmatic felicity is an elusive notion

ideally, we'd like to have a formal, even quantitative notion

  • enter computational pragmatics

experiments

overview

  • replication/extension of previous work
    • van Tiel (2014), Degen & Tanenhaus (2015)
  • 4 experimental variants:
    • binary truth-value judgements vs. 7-point rating scale
    • include filler sentences with \(\textit{many}\) and \(\textit{most}\) or not
  • participants recruited via Amazon's Mechanical Turk

[figure: expTable]

truth-value judgement task

binary

rating scale task

ordinal

results

methodological puzzles

  • do binary and ordinal tasks measure the same thing?
    • one is about truth, the other about "goodness"
  • is what either task measures influenced by presence/absence of alternatives?
  • how would we answer these questions with standard statistical techniques?

generalized linear model

recap: simple regression

data

head(cars) 
##   speed dist
## 1     4    2
## 2     4   10
## 3     7    4
## 4     7   22
## 5     8   16
## 6     9   10

model

\[\beta_0, \beta_1 \sim \text{Norm}(0, 1000)\] \[\sigma^2_{\epsilon} \sim \text{Unif}(0, 1000)\]

\[\mu_i = \beta_0 + \beta_1 x_i\] \[y_i \sim \text{Norm}(\mu_i, \sigma^2_{\epsilon})\]
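As a rough point of comparison for the linear predictor \(\mu_i = \beta_0 + \beta_1 x_i\), the sketch below fits the line by ordinary least squares. This is a swapped-in technique, not the Bayesian fit specified above, and it uses only the six rows printed by `head(cars)`, so the numbers are illustrative.

```python
# The six rows shown by head(cars); the full data set has more rows.
speed = [4, 4, 7, 7, 8, 9]
dist = [2, 10, 4, 22, 16, 10]

def ols_fit(x, y):
    """Closed-form least-squares estimates for y = beta0 + beta1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta1 = sxy / sxx
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1

beta0, beta1 = ols_fit(speed, dist)

# Linear predictor and residuals, matching mu_i = beta0 + beta1 * x_i.
mu = [beta0 + beta1 * xi for xi in speed]
residuals = [yi - mi for yi, mi in zip(dist, mu)]

# Unbiased estimate of the residual variance (n - 2 degrees of freedom).
sigma2 = sum(r ** 2 for r in residuals) / (len(speed) - 2)

print(round(beta0, 3), round(beta1, 3))
# 0.992 1.488
```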

generalized linear model

[figure: glm_scheme]

common link & likelihood functions

logistic function

\[\text{logistic}(\eta, \theta, \gamma) = \frac{1}{(1 + \exp(-\gamma (\eta - \theta)))}\]

threshold \(\theta\)

gain \(\gamma\)
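A small Python sketch of the logistic function as defined above, to show what threshold and gain do. The default values and the example inputs are assumptions for illustration only.

```python
import math

def logistic(eta, theta=0.0, gamma=1.0):
    """Logistic link with threshold theta and gain gamma.

    theta is the predictor value at which the output is exactly 0.5;
    gamma controls how steeply the curve rises around theta.
    """
    return 1.0 / (1.0 + math.exp(-gamma * (eta - theta)))

# At the threshold the output is 0.5, whatever the gain.
at_threshold = logistic(0.5, theta=0.5, gamma=3.0)

# The same distance above the threshold gives a more extreme
# probability when the gain is higher.
shallow = logistic(0.7, theta=0.5, gamma=2.0)
steep = logistic(0.7, theta=0.5, gamma=20.0)
```

Shifting \(\theta\) moves the curve left or right; increasing \(\gamma\) pushes it toward a step function at \(\theta\).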

threshold-Phi model

[figure: threshPhi]

Kruschke, 2015, Chapter 23
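The slides give no formula for this model, so the following is a sketch of the usual thresholded cumulative-normal (ordered probit) construction: a latent predictor \(\eta\) with noise \(\sigma\) is cut by ordered thresholds, and the probability of rating \(d\) is the normal mass between adjacent thresholds. The threshold values, \(\sigma\), and the function names are hypothetical.

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def rating_probs(eta, thresholds, sigma=1.0):
    """Probabilities of ratings 1..K given K-1 increasing thresholds.

    P(d) = Phi((t_d - eta) / sigma) - Phi((t_{d-1} - eta) / sigma),
    with t_0 = -infinity and t_K = +infinity.
    """
    cumulative = [phi((t - eta) / sigma) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs

# Hypothetical thresholds for a 7-point scale (6 cut points).
thresholds = [1.5, 2.5, 3.5, 4.5, 5.5, 6.5]
probs = rating_probs(4.0, thresholds, sigma=1.0)
```

With the latent value centred on the scale, rating 4 is the most probable and the distribution is symmetric around it.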

pragmatic felicity model

idea

  • quantitative notion of pragmatic felicity \(F\) replaces predictor \(\eta\)
    • \(F\) is (function of) relative expected utility:
      • goodness of description (with "some") compared to alternative descriptions?
    • data-driven approach to infer gradient salience of alternatives
  • \(F\) feeds into two link functions:
    • logistic model for binary truth-value judgements
    • thresholded-Phi model for rating scale judgements

full model

[figure: modelGraph]

set up

  • conditions \(c \in \{0, \dots, 10\}\): number of black balls
  • degrees \(d \in \{1, \dots, 7\}\)
  • messages \(m \in M = \{\textit{none}, \textit{one}, \textit{two}, \textit{three}, \textit{many}, \textit{most}, \textit{all}, \textit{some}\}\)
  • semantics:
##       c=0 c=1 c=2 c=3 c=4 c=5 c=6 c=7 c=8 c=9 c=10
## none    1   0   0   0   0   0   0   0   0   0    0
## one     0   1   0   0   0   0   0   0   0   0    0
## two     0   0   1   0   0   0   0   0   0   0    0
## three   0   0   0   1   0   0   0   0   0   0    0
## many    0   0   0   0   0   1   1   1   1   1    1
## most    0   0   0   0   0   0   1   1   1   1    1
## all     0   0   0   0   0   0   0   0   0   0    1
## some    0   1   1   1   1   1   1   1   1   1    1

Gricean speakers

literal listener picks literal interpretation (uniformly at random):

\[ P_{LL}(c \mid m) = \text{Uniform}(c \mid \{ c' \mid m \text{ is true in } c' \} ) \]

utility for true \(c\) and interpretation \(c'\):

\[ U(c, c' \ ; \ \pi) = \exp(- \pi \ (c - c')^2 ) \]

expected utility:

\[ \text{EU}(m, c \ ; \ \pi) = \sum_{c'} P_{LL}(c' \mid m) \ U(c, c' \ ; \ \pi) \]

Gricean speakers choose maximally informative/useful messages:

\[ m \in \arg \max_{m' \in M} \text{EU}(m', c \ ; \ \pi) \]

(cf. Frank & Goodman, 2012, Science; Franke, 2014, Proceedings CogSci)
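The definitions above can be sketched in Python as follows. The truth conditions are copied from the semantics table; the value \(\pi = 1\) and all function names are assumptions for illustration.

```python
import math

# Truth conditions from the semantics table: message -> set of true c.
SEMANTICS = {
    "none": {0},
    "one": {1},
    "two": {2},
    "three": {3},
    "many": set(range(5, 11)),
    "most": set(range(6, 11)),
    "all": {10},
    "some": set(range(1, 11)),
}

def literal_listener(m):
    """P_LL(c | m): uniform over the conditions in which m is true."""
    true_states = SEMANTICS[m]
    return {c: 1.0 / len(true_states) for c in true_states}

def utility(c, c_prime, pi):
    """U(c, c'; pi) = exp(-pi * (c - c')^2)."""
    return math.exp(-pi * (c - c_prime) ** 2)

def expected_utility(m, c, pi):
    """EU(m, c; pi): listener-weighted utility of sending m in c."""
    return sum(p * utility(c, cp, pi)
               for cp, p in literal_listener(m).items())

def best_messages(c, pi):
    """All messages that maximise expected utility in condition c."""
    eus = {m: expected_utility(m, c, pi) for m in SEMANTICS}
    top = max(eus.values())
    return [m for m, eu in eus.items() if math.isclose(eu, top)]

PI = 1.0  # hypothetical value, chosen only for the demonstration
print(best_messages(0, PI), best_messages(2, PI), best_messages(10, PI))
# ['none'] ['two'] ['all']
```

Note that the utility rewards closeness of the listener's guess rather than truth of the message, so the argmax ranges over all of \(M\).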

pragmatic felicity

scaled expected utility given set \(X\) of entertained alternatives:

\[ \text{EU}^*(c , X \ ; \ \pi) = \frac{\text{EU}(\textit{some}, c \ ; \ \pi) - \min_{m \in X} \text{EU}(m, c \ ; \ \pi)}{\max_{m \in X} \text{EU}(m, c \ ; \ \pi) - \min_{m \in X} \text{EU}(m, c \ ; \ \pi)} \]

salience of alternatives \(m \in M \setminus \{ \textit{some} \}\):

\[ s_m \sim \text{Beta}(1,1) \]

probability of entertaining \(X \subseteq M\) (crudely assume independence!):

\[ P(X \mid \vec{s}) = \prod_{m \in X} s_m \prod_{m \in M \setminus X} \ (1-s_m) \]

expected relative felicity:

\[ \text{F}(c \ ; \ \vec{s}, \pi) = \sum_X P(X \mid \vec{s}) \ \text{EU}^*(c , X \ ; \ \pi) \]
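A Python sketch of the felicity computation, under several assumptions that the slides do not settle: \(X\) ranges over subsets of the alternatives to "some"; "some" itself is always included when taking the minimum and maximum; and when no alternative is entertained (so that maximum and minimum coincide) the scaled utility is set to 1. The salience values and \(\pi\) are hypothetical placeholders for quantities the model infers.

```python
import math
from itertools import combinations

SEMANTICS = {
    "none": {0}, "one": {1}, "two": {2}, "three": {3},
    "many": set(range(5, 11)), "most": set(range(6, 11)),
    "all": {10}, "some": set(range(1, 11)),
}
ALTERNATIVES = [m for m in SEMANTICS if m != "some"]

def expected_utility(m, c, pi):
    """EU(m, c; pi) with a uniform literal listener."""
    states = SEMANTICS[m]
    return sum(math.exp(-pi * (c - cp) ** 2) for cp in states) / len(states)

def scaled_eu(c, X, pi):
    """EU*(c, X; pi): position of 'some' between worst and best option."""
    # Assumption: 'some' is always part of the comparison set.
    eus = [expected_utility(m, c, pi) for m in set(X) | {"some"}]
    lo, hi = min(eus), max(eus)
    if hi == lo:  # Assumption: no competitor means maximal felicity.
        return 1.0
    return (expected_utility("some", c, pi) - lo) / (hi - lo)

def prob_of_set(X, salience):
    """P(X | s): independent inclusion of each alternative."""
    p = 1.0
    for m, s in salience.items():
        p *= s if m in X else (1.0 - s)
    return p

def felicity(c, salience, pi):
    """F(c; s, pi): expectation of EU* over all sets of alternatives."""
    alts = list(salience)
    total = 0.0
    for k in range(len(alts) + 1):
        for X in combinations(alts, k):
            total += prob_of_set(X, salience) * scaled_eu(c, X, pi)
    return total

salience = {m: 0.5 for m in ALTERNATIVES}  # hypothetical values
PI = 1.0                                   # hypothetical value
F = [felicity(c, salience, PI) for c in range(11)]
```

Enumerating every subset is exponential in the number of alternatives, which is manageable here because the message set is small.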

full model

[figure: modelGraph]

results

MCMC set up

  • model implemented in JAGS (Plummer 2003)
  • 10,000 samples after 10,000 burn-in steps (2 chains, every second sample used)
  • convergence checked visually and by \(\hat{R}\) (Gelman & Rubin 1992)
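To illustrate the convergence check, here is a sketch of thinning and of the basic Gelman-Rubin \(\hat{R}\) statistic for chains of equal length. The simulated draws stand in for real JAGS output and are purely hypothetical; the function names are not from any MCMC library.

```python
import random
import statistics

def thin(chain, step=2):
    """Keep every step-th sample of a chain."""
    return chain[::step]

def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(ch) for ch in chains]
    # W: average within-chain variance.
    within = statistics.fmean([statistics.variance(ch) for ch in chains])
    # B: between-chain variance, scaled by chain length.
    between = n * statistics.variance(chain_means)
    var_hat = (n - 1) / n * within + between / n
    return (var_hat / within) ** 0.5

rng = random.Random(1)

# Two hypothetical chains sampling the same distribution.
mixed = [thin([rng.gauss(0.0, 1.0) for _ in range(10000)])
         for _ in range(2)]

# Two hypothetical chains stuck in different regions.
stuck = [thin([rng.gauss(mu, 1.0) for _ in range(10000)])
         for mu in (0.0, 3.0)]

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
```

Values close to 1 are consistent with convergence; values well above 1 indicate that the chains disagree.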

## 256 sets to create.

posteriors: link function parameters

posteriors: model parameters

posteriors: salience

posteriors: salience differences

posteriors: pragmatic felicity

posteriors: felicity differences

posterior predictive checks

conclusions

conclusions

  • idea that truth-value and rating-scale tasks measure the same thing is tenable
  • measure: scaled relative expected utility under variably salient alternatives
  • this is influenced by presence/absence of alternatives
  • theory-driven probabilistic modeling can advance methodological debate
  • important to make explicit link functions part of full data-generating model