Barriers to effective giving and concern for global priorities: evidence synthesis and a field experiment on donors’ response to effectiveness information

David Reinstein (Exeter); ‘Barriers’ synthesis: Nick Fitz, Ari Kagan, Janek Kretschmer, Luke Arundel. Field experiment: Scott Dickerson (Bayesian); reported experiments with Kiki Koutmeridou/DonorsVoice advisory team

08/12/2020

Outline


  1. ‘Effective donation behavior’: an important goal and measure, engaging ‘you lot’
  • Actively ‘move’ £100s of billions/year \(\rightarrow\) effective GP

    • about £300b/year from US alone; cf. Good Ventures 272 million USD in 2019, GiveWell ‘directed’ approx. 161 mln/year
  • (Ineffective) giving: a measure of concern for GP and ‘effectiveness’

  • Drivers of personal/professional/political choices

  • GPI: i. ‘Priorities’ only matter if ‘actionable’ \(\leftarrow\) popular will; ii. definition of barriers; iii. ‘Enlightenment project’ as (existential) priority

  2. Collaborative project: Barriers to effectiveness in giving (taxonomy/outline); ‘puzzle’
  3. Prior evidence and gaps, our empirical approach
  4. DonorsVoice mailing experiment (many co-authors) i) Setup ii) Results (and stat. discussion)
  5. Preliminary conclusions, proposed future directions (time permitting)

Definition of ‘effective giving’ for our purposes, motivation (EA audiences may skip this)

Definition

Impact aka effectiveness of a charity

\(B(G_i)\): Beneficial outcome achieved by charity \(i\) with total donations \(G_i\)


  • Ultimate outcome – Lives saved, QALY added, etc.

  • Not ‘output’ – ‘nets provided’ nor ‘paintings purchased’

Impact of a donation:

\(B'(G_i)\) for the marginal donor

  • GiveWell and others attempt to measure this


We know:

\(B'(G_i)\) is much larger for most impactful vs. most popular charities.
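
A rough way to make this comparison concrete is a first-order approximation. This is only a sketch: the gift size \(d\) and the labels \(e\) (a highly effective charity) and \(p\) (a popular charity) are notation introduced here, not taken from the slides.

\[
B(G_i + d) - B(G_i) \approx B'(G_i)\, d,
\qquad
\Delta B \approx \bigl(B'(G_e) - B'(G_p)\bigr)\, d > 0 \quad \text{when } B'(G_e) > B'(G_p).
\]

Under this approximation, redirecting a small gift \(d\) from charity \(p\) to charity \(e\) raises the total benefit roughly in proportion to the gap in marginal impact.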

Raises questions

  1. “Why don’t we give more to the most effective charities and to those most in need?”,


and


  2. “Why are we not more efficient with our giving choices?”


A ‘Puzzle’?

Barriers to effectiveness in giving; project, brief taxonomy/outline


Review at bit.ly/eg_barriers, extending innovationsinfundraising.org; mini-meta-analyses planned

Collaborators: David Reinstein, Nick Fitz, Ari Kagan, Luke Arundel, Ben Grodeck, Janek Kretschmer, others

“Increasing effective charitable giving: The puzzle, what we know, what we need to know next” project

See bit.ly/eg_barriers


Barriers: a workable categorization

  1. No moral-utilitarian concerns

  2. (Psychological) distance \(\rightarrow\) (lack of) emotional arousal/awareness


LT: Temporal distance

  2. Identity & other/self-perception; 3. Inertia and systemic factors; 4. Quantitative biases

… and…
  1. Obstacles/aversion to doing evaluations; analytical/empathetic clash


  • Reluctant to evaluate (taboo tradeoffs, ‘social v. market norm’, scary to consider LT …)
  • The evaluation process switches off empathy \(\rightarrow\) focus of this ‘first’ project

‘Processing of effectiveness information’ and spontaneous/deliberate responses

  • Heuristic (fast) \(\rightarrow\) spontaneous generosity?

  • Deliberative (slow) \(\rightarrow\) thoughtful giving … or ‘calculated greed’

  • Also: relational models theory (Fiske, ’91); motivated reasoning (e.g., Exley ’19) \(\rightarrow\) all suggest analytical information \(\rightarrow\) less giving

Prior evidence (and gaps) on donor responses to (analytical) effectiveness information

Evidence brief: key findings

  • Small, Loewenstein, Slovic, ’07 (1) (Lab char’l giving), prime analytic (vs emotion): NEGATIVE

  • Karlan & Wood, ’17 (2), field mailing, scientific impact info: NULL, tight-ish bounds; (+/- for ‘prior large/small donors’)

  • **Bergh & Reinstein, ’20** Effectiveness (and other) information; \(\times\) emotional information: close-to-zero effect for each (small bounds in pooled analysis of ‘donation share’)

  • Caviola ea, ’20: ‘debunking’ misconceptions and informing about (GiveWell) effectiveness increased the effectiveness of stated (hypothetical) giving choices/intentions

  • Mulesky, ’20, evidence suggests (unrealistic) impact information increased hypothetical donations

  • Parson, ’07, field mailing, numeric overhead info: POSITIVE effect for previous donor subset

  • Mixed/null/positive evidence of impact of ‘real-world ratings’ (Yörük ’16; Brown ea ’17; Gordon ea ’09)

  • Mixed evidence (lab; charity/non) of ‘excuse-driven information seeking’ (Exley ’16b; Fong & O, ’10; Metzger & G ’19)

This sort of experimental evidence

A variety of lab/web studies (Bergh/Reinstein)

Identifiable victims effect/ deservingness vs deliberation


Claim: Better to portray an individual (child) than convey the total affected (Small & Loewenstein, ’03; Small et al, ’07; Kogut & R, ’05)

  • Driven by System-1 empathy, switched off by analytic thinking

Small, Loewenstein, Slovic (2007):


[Study 3] “individuals who faced an identifiable victim donated more…”


  • “…than those who faced victim statistics, p < .01,”


  • …“than those who faced [both] an identifiable victim [and] statistics, p < .05.”

Small et al, ’07, Study 4


Priming analytic thinking reduced donations to an identifiable victim relative to a feeling-based thinking prime.


Yet, the primes had no distinct effect on donations to statistical victims, which is symptomatic of the difficulty in generating feelings for these victims.


Tightly bounded null, but … nonlinearity?

Verkaik (2016)


While previous studies have convincingly shown that providing output information, informing donors of how their donation is used, increases generosity (Cryder & L, ’10; Cryder ea ’13; Aknin ea ’13)


…the evidence on the effects of impact information are more mixed, with mainly null effects (Metzger & G ’15; Karlan & W, ’14; Baron & S, ’10; Caviola ea ’14, Berman ea ’15)

Null or UNDERPOWERED?

Ratings and information in general: mixed evidence

  • Yörük (2016, JEMS): RD w/ Charity Navigator; significant for ‘small’ charities only

    • See also Brown ea (2017), Gordon ea (2009)

“Effectiveness” info

Karlan and Wood (2017)

Add scientific impact text to real charitable appeal (& remove emotional text):

\(\rightarrow\) little net effect

\(\rightarrow\) reduced (increased) giving among small (large) prior donors (not a preregistered hypothesis)


Potential confounds, specificity

Details of Karlan first wave: SCIENCE vs EMOTION

According to studies on our programs in Peru that used rigorous scientific methodologies, women who have received both loans and business education saw their profits grow, even when compared to women who just received loans for their businesses. But the real difference comes when times are slow. The study showed that women in Freedom from Hunger’s Credit with Education program kept their profits strong–ensuring that their families would not suffer, but thrive.


Because of caring people like you, Freedom from Hunger was able to offer Sebastiana a self-help path toward achieving her dream of getting “a little land to farm” and pass down to her children. As Sebastiana’s young son, Aurelio, runs up to hug her, she says, “I do whatever I can for my children.”

Information as an ‘excuse’ not to give; allows motivated reasoning

Exley, 2016b: Greater discounting of ‘less-efficient’ charity in charity-charity decision-making than in charity-self d-m


Fong & O, ’10:

“Dictators [charitable giving] who acquire information mostly use it to withhold resources from less-preferred types, leading to a drastic decline in aggregate transfers”

But…



Metzger & G, ’19

Lab donations to high/low-performing NGO

  • More purchasing of ‘recipient type’ than ‘impact’ info

  • Mixed & weak evidence on excuse-driven information-seeking


Caveats…

Our preferred empirical approach

  • Naturalistic environments, meaningful choices

  • Show robustness across setups/frames

  • Honest presentation of evidence, allowing integration with other work

DonorsVoice mailing experiment - setup

Co-authors: David Reinstein, Elizabeth Keenan, Ayelet Gneezy, Hengchen Dai, Enrico Rubaltelli, Stephan Dickert, Kiki Koutmeridou, and Peter Ayton

We are running this subject to the final say of the charity. We have proposed that the Treatment emails (but not the control emails) will include a sentence/fragment such as the following in both a captioned photo in the email, and the email text:

“Last year, we were able to provide [general provision of an outcome here relevant to the charity] to a [recipient unit] with just $[small amount of money].”

(from prereg)

We plan to perform standard nonparametric statistical tests of the effect of this treatment on

  • Average gift amount (including zeroes)

  • Incidence/number of people making a gift, [and] incidence of gifts of exactly $10.

In particular, we will focus on Fisher’s exact test (for incidence) and the standard rank sum and t-tests for the donation amounts.

We will report confidence intervals on our estimates, and make inferences on reasonable bounds on our effect, even if it is a ‘null effect’.

Power calculations

Response rates in previous such emails were extremely low: approximately 1 per 3,000 emails. Our power calculations suggest that we have .29 power to detect a 50% effect, and 0.90 power to detect approximately a 100% (doubling) on incidence…

Because of this limited power, we will ask the charity to run this trial a second time with an equivalent-sized sample. [Which we did.]
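
A minimal sketch of how such a power figure could be approximated, assuming a two-sided z-test for a difference in proportions at the 5% level, the 1-in-3,000 baseline quoted above, and roughly 91,300 emails per arm (the first-trial figure). The function name `power_two_proportions` is introduced here for illustration. The preregistered 0.29 and 0.90 may rest on a different test or sample size, so this approximation need not reproduce them.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_control, relative_effect, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test for a difference in proportions.

    Normal approximation with unpooled variance; an assumption, not
    necessarily the calculation used in the preregistration.
    """
    p_treat = p_control * (1 + relative_effect)
    se = sqrt(p_control * (1 - p_control) / n_per_arm
              + p_treat * (1 - p_treat) / n_per_arm)
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    z_effect = abs(p_treat - p_control) / se
    # Probability of rejecting in either tail under the alternative
    return nd.cdf(z_effect - z_crit) + nd.cdf(-z_effect - z_crit)


baseline = 1 / 3000   # response rate quoted in the prereg text
n_per_arm = 91_300    # assumed: first-trial emails per condition

for effect in (0.5, 1.0):
    power = power_two_proportions(baseline, effect, n_per_arm)
    print(f"relative effect {effect:.0%}: approximate power {power:.2f}")
```

With such rare responses, the exact test and the approximation can differ noticeably, which is one reason to treat these numbers as rough guides only.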

Stopping rule

We aim to continue this treatment in future charity appeals until we can statistically bound (with 95% confidence) the impact of the treatment on both incidence and average donation within a margin of 1/3 of the incidence and average donation in the control condition.
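
One possible reading of this rule for incidence, as a sketch: stop once the half-width of a 95% confidence interval for the difference in conversion rates is no larger than one third of the control rate. The Wald (normal-approximation) interval and the function name `stopping_rule_met` are assumptions made here; the preregistration does not specify the interval type.

```python
from math import sqrt
from statistics import NormalDist


def stopping_rule_met(conv_control, n_control, conv_treat, n_treat,
                      margin_share=1 / 3, confidence=0.95):
    """Check whether the CI half-width for the rate difference is within
    margin_share of the control rate (one interpretation of the rule)."""
    p_c = conv_control / n_control
    p_t = conv_treat / n_treat
    se = sqrt(p_c * (1 - p_c) / n_control + p_t * (1 - p_t) / n_treat)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * se
    margin = margin_share * p_c
    return half_width <= margin, half_width, margin


# Pooled direct-conversion counts from the results table (both years)
met, half_width, margin = stopping_rule_met(74, 131_175, 109, 131_173)
print(f"half-width {half_width * 1e4:.2f} per 10k, "
      f"margin {margin * 1e4:.2f} per 10k, rule met: {met}")
```

The same check would need to be repeated for average donation, which requires gift-level data not shown in these slides.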

Setup

Context

Charity: A large US religiously-associated international poverty relief charity

Timing: Emails sent out at the same time within each trial

First trial: 21-Nov-2018 ‘Thanksgiving email’ Second trial: Nov 2019 (also Thanksgiving email)


Sample size and composition:

First trial:

  • Charity’s standard email list (previous donors with emails)
  • Approx 182,600 emails sent out, 91.3k in each condition

Second trial:

  • 79,754 emails sent out, exactly half in each condition

Setup (first trial)

Setup (second trial)

(Very similar to first trial, but more realistic impact info)


DonorsVoice mailing experiment: Results

Overall summary statistics (and some tests)

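| run | treatment | recipients | bounces | opens | clicks | conversions |
|---|---|---|---|---|---|---|
| both | control | 131175 | 180 | 29047 | 681 | 74 |
| both | treat | 131173 | 178 | 28558 | 645 | 109 |
| 2018 | control | 91298 | 39 | 16906 | 414 | 27 |
| 2018 | treat | 91296 | 42 | 16195 | 371 | 71 |
| 2019 | control | 39877 | 141 | 12141 | 267 | 47 |
| 2019 | treat | 39877 | 136 | 12363 | 274 | 38 |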

Fisher: 95% CI for OR, ‘donations over $100 within 7 days (opened emails)’ = [0.992, 1.688]

| run | treatment | recipients | opens | clicks | conversions | conv_per_10k_recip | conv_per_click |
|---|---|---|---|---|---|---|---|
| both | control | 131175 | 29047 | 681 | 74 | 5.6 | 0.11 |
| both | treat | 131173 | 28558 | 645 | 109 | 8.3 | 0.17 |
| 2018 | control | 91298 | 16906 | 414 | 27 | 3.0 | 0.07 |
| 2018 | treat | 91296 | 16195 | 371 | 71 | 7.8 | 0.19 |
| 2019 | control | 39877 | 12141 | 267 | 47 | 11.8 | 0.18 |
| 2019 | treat | 39877 | 12363 | 274 | 38 | 9.5 | 0.14 |
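
The two derived columns follow directly from the counts. A short sketch of the arithmetic, where the function name `derived_rates` is introduced here for illustration:

```python
def derived_rates(recipients, clicks, conversions):
    """Return (conversions per 10,000 recipients, conversions per click),
    rounded as in the summary table."""
    conv_per_10k_recip = round(conversions / recipients * 10_000, 1)
    conv_per_click = round(conversions / clicks, 2)
    return conv_per_10k_recip, conv_per_click


# (run, treatment): (recipients, clicks, conversions), copied from the table
counts = {
    ("2018", "control"): (91_298, 414, 27),
    ("2018", "treat"): (91_296, 371, 71),
    ("2019", "control"): (39_877, 267, 47),
    ("2019", "treat"): (39_877, 274, 38),
}

for (run, arm), (recipients, clicks, conversions) in counts.items():
    per_10k, per_click = derived_rates(recipients, clicks, conversions)
    print(run, arm, per_10k, per_click)
```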

Cost (Impact per dollar) information treatment \(\rightarrow\)

  1. Slightly lower rate of opened emails:
  • \(\frac{16816}{91298}\) (18.4%) in control vs \(\frac{16105}{91296}\) (17.6%) in treatment
  • Highly significant in Fisher’s exact test (\(p<0.001\))
  2. Slightly (insignificantly) lower rate of click-through after opening
  • 2.3% in treatment vs 2.5% in control (95% CI for OR: 0.81, 1.08)

Next 7 days, all channels, for email-openers: 267 > 241 (previous table)

  • “Marginally insignificant” in Fisher’s exact (\(p=0.1\), 95% CI OR: (0.97,1.39))

Key Q: Does including impact information affect the propensity to donate?

Fisher tests: impact info treatment (relative to baseline)

| Experiment | Odds ratio (estimate) | p-value | CI low | CI high |
|---|---|---|---|---|
| Opens, 2018 | 0.95 | 0.00 | 0.93 | 0.97 |
| Opens, 2019 | 1.03 | 0.09 | 1.00 | 1.06 |
| Opens, both years | 0.98 | 0.02 | 0.96 | 1.00 |
| Clicks, 2018 | 0.90 | 0.13 | 0.78 | 1.03 |
| Clicks, 2019 | 1.03 | 0.80 | 0.86 | 1.22 |
| Clicks, both years | 0.95 | 0.34 | 0.85 | 1.06 |
| Direct conversions, 2018 | 2.63 | 0.00 | 1.67 | 4.26 |
| All conversions, 2018 | 1.11 | 0.25 | 0.93 | 1.32 |
| Direct conversions, 2019 | 0.81 | 0.39 | 0.51 | 1.27 |
| Direct conversions, both years | 1.47 | 0.01 | 1.09 | 2.01 |
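
As a rough cross-check on the direct-conversion rows, the sample odds ratio can be computed from the summary counts. This sketch swaps in a Wald (log odds ratio) interval rather than the exact conditional interval that accompanies Fisher’s test, so the bounds will differ somewhat from those reported. The function name `odds_ratio_wald` is introduced here for illustration.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_wald(events_treat, n_treat, events_control, n_control,
                    confidence=0.95):
    """Sample odds ratio (treatment vs control) with a Wald interval.

    Note: this is NOT Fisher's exact interval; it is a normal approximation
    on the log odds ratio, used here only as a sanity check.
    """
    a, b = events_treat, n_treat - events_treat
    c, d = events_control, n_control - events_control
    estimate = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return (estimate,
            exp(log(estimate) - z * se_log),
            exp(log(estimate) + z * se_log))


# Direct conversions out of recipients, from the summary statistics table
rows = {
    "Direct conversions, 2018": (71, 91_296, 27, 91_298),
    "Direct conversions, 2019": (38, 39_877, 47, 39_877),
    "Direct conversions, both years": (109, 131_173, 74, 131_175),
}
for label, args in rows.items():
    est, low, high = odds_ratio_wald(*args)
    print(f"{label}: OR {est:.2f}, approx. 95% CI ({low:.2f}, {high:.2f})")
```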

Bayesian inference/credible intervals (incidence)

Draws from Beta(0.5, 100) prior

Comparing the two probabilities from the different priors

| Experiment | Prob. Treatment > Control (Uniform Prior) | Prob. Treatment > Control (Informative Prior) |
|---|---|---|
| Opens, 2018 | 0.00074% | 0.00074% |
| Opens, 2019 | 95.7903% | 95.87664% |
| Opens, both years | 0.91281% | 0.9709% |
| Clicks, 2018 | 6.16993% | 6.16069% |
| Clicks, 2019 | 62.05441% | 62.26238% |
| Clicks, both years | 16.09107% | 16.09281% |
| Direct conversions, 2018 | 99.99961% | 99.99967% |
| All conversions, 2018 | 87.58211% | 87.60764% |
| Direct conversions, 2019 | 16.57803% | 16.44126% |
| Direct conversions, both years | 99.51893% | 99.53089% |
Bayesian tests for difference in proportion (relative to control), Uniform prior. Credible intervals for ∂ (LB = lower bound, UB = upper bound)

| Experiment | 99% LB | 99% UB | 95% LB | 95% UB | 90% LB | 90% UB | 80% LB | 80% UB |
|---|---|---|---|---|---|---|---|---|
| Direct conversions, 2018 | 2.067 | 7.7 | 2.72 | 7.0 | 3.05 | 6.6 | 3.43 | 6.23 |
| All conversions, 2018 | -3.518 | 9.2 | -1.99 | 7.7 | -1.21 | 6.9 | -0.31 | 6.01 |
| Direct conversions, 2019 | -8.389 | 3.8 | -6.87 | 2.3 | -6.11 | 1.6 | -5.25 | 0.72 |
| Direct conversions, both years | 0.014 | 5.4 | 0.65 | 4.7 | 0.97 | 4.4 | 1.34 | 4.00 |
Bayesian tests for difference in proportion (relative to control), Beta(0.5, 100) prior. Credible intervals for ∂ (LB = lower bound, UB = upper bound)

| Experiment | 99% LB | 99% UB | 95% LB | 95% UB | 90% LB | 90% UB | 80% LB | 80% UB |
|---|---|---|---|---|---|---|---|---|
| Direct conv., 2018 | 2.080 | 7.7 | 2.72 | 7.0 | 3.05 | 6.6 | 3.43 | 6.2 |
| All conv., 2018 | -3.503 | 9.2 | -1.98 | 7.7 | -1.21 | 6.9 | -0.31 | 6.0 |
| Direct, 2019 | -8.332 | 3.7 | -6.83 | 2.3 | -6.08 | 1.5 | -5.22 | 0.7 |
| Direct conv., both years | 0.023 | 5.4 | 0.65 | 4.7 | 0.98 | 4.4 | 1.35 | 4.0 |
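
A minimal Monte Carlo sketch of this kind of comparison, assuming independent Beta posteriors for the control and treatment conversion rates. Reporting the difference per 10,000 recipients and using equal-tailed intervals are assumptions made here; they appear consistent with the magnitudes in the tables but are not stated in the slides. The function name `posterior_comparison` is introduced for illustration.

```python
import random


def posterior_comparison(conv_control, n_control, conv_treat, n_treat,
                         prior=(1.0, 1.0), draws=20_000, seed=0):
    """Simulate Beta posteriors for two rates; return P(treat > control)
    and an equal-tailed 95% credible interval for the difference,
    expressed per 10,000 recipients (assumed scale)."""
    rng = random.Random(seed)
    a0, b0 = prior
    diffs = []
    for _ in range(draws):
        p_c = rng.betavariate(a0 + conv_control, b0 + n_control - conv_control)
        p_t = rng.betavariate(a0 + conv_treat, b0 + n_treat - conv_treat)
        diffs.append((p_t - p_c) * 10_000)
    diffs.sort()
    prob_treat_higher = sum(d > 0 for d in diffs) / draws
    lower = diffs[int(0.025 * draws)]
    upper = diffs[int(0.975 * draws) - 1]
    return prob_treat_higher, lower, upper


# Direct conversions, 2018, under the uniform Beta(1, 1) prior
print(posterior_comparison(27, 91_298, 71, 91_296))
# Same data under the Beta(0.5, 100) prior named in the slides
print(posterior_comparison(27, 91_298, 71, 91_296, prior=(0.5, 100.0)))
```

With samples this large, both priors are dominated by the data, which fits the near-identical columns in the probability table.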

Key Q: Does including impact information affect the amount raised (average amount given)?

| run | treatment | rev_per_recip ($) | av_pos_gift ($) |
|---|---|---|---|
| both | control | 0.14 | 248 |
| both | treat | 0.09 | 103 |
| 2018 | control | 0.16 | 537 |
| 2018 | treat | 0.07 | 90 |
| 2019 | control | 0.10 | 82 |
| 2019 | treat | 0.12 | 128 |
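
The two columns are linked by simple arithmetic. The sketch below assumes that `rev_per_recip` equals direct conversions times the average positive gift, divided by recipients; this reading is not stated in the slides but reproduces the tabulated figures. The function name `revenue_per_recipient` is introduced for illustration.

```python
def revenue_per_recipient(conversions, av_pos_gift, recipients):
    """Revenue per email sent: (number of gifts x mean positive gift) / emails."""
    return conversions * av_pos_gift / recipients


# (run, treatment): (conversions, av_pos_gift, recipients), from the tables
rows = {
    ("2018", "control"): (27, 537, 91_298),
    ("2018", "treat"): (71, 90, 91_296),
    ("2019", "control"): (47, 82, 39_877),
    ("2019", "treat"): (38, 128, 39_877),
}

for (run, arm), (conversions, gift, recipients) in rows.items():
    value = revenue_per_recipient(conversions, gift, recipients)
    print(run, arm, f"{value:.2f}")
```

In 2018 the treatment arm had more gifts but a much lower average gift, which is how a higher conversion rate coexists with lower revenue per email.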

Mixed results on avg. don., amount raised

Average revenue/email, 2018

A. Via email clickthrough:

  • Treatment $0.07 per email; Control $0.16 per email ($90.46 vs $536.89 conditional on a positive gift, CoP)

  • Ranksum: insignificant overall, strongly significant (but probably misleading) for CoP (mean ranks for latter: 30.3 vs 47.0)

B. 2018 – Next 7 days (among email openers)

  • Treatment $0.48 per email vs Control $0.34

  • Ranksum: marginally insignificant overall (p=0.10) and for CoP

  • T-test: marginally insignificant in levels (p = 0.10, CI [-1.84, 0.161]); winsorised at 1000 (p = 0.17, CI [-1.054, 0.182])

  • Difference seems driven by largest donations
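
To illustrate why winsorising matters when a few large gifts dominate, here is a small sketch. The gift amounts are hypothetical and are not the experiment’s data; capping at a fixed value of 1000 and the function name `winsorise_upper` are assumptions about what ‘winsorised at 1000’ means here.

```python
from statistics import mean


def winsorise_upper(gifts, cap=1000):
    """Replace any gift above the cap with the cap itself."""
    return [min(gift, cap) for gift in gifts]


# HYPOTHETICAL gifts per email-opener (mostly zeroes), for illustration only
control = [0] * 95 + [20, 25, 50, 100, 150]
treatment = [0] * 95 + [20, 25, 50, 100, 5000]

for label, gifts in (("control", control), ("treatment", treatment)):
    raw = mean(gifts)
    capped = mean(winsorise_upper(gifts))
    print(f"{label}: raw mean {raw:.2f}, winsorised mean {capped:.2f}")
```

In this toy example a single very large gift accounts for most of the gap in raw means, and the gap shrinks sharply once gifts are capped.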

C. 2019

  • Rank sum test (donations including zeroes): p = 0.31

  • T-tests cannot be computed (they didn’t give us the sums of squares)

Preliminary conclusions, future directions (time permitting)

Preliminary take-aways: Don’t fear the info?

We have mixed evidence on the impact of analytical efficiency information. More analysis and synthesis is needed. Some evidence that ‘unrealistic positive’ efficiency information increases giving. The ‘crowding out of emotion’ doesn’t seem to be a strong effect (Bergh and Reinstein), but more power is needed.

Future directions

  • Meta-analyses and systematic review (including our own and others’ work; we have the data, need to analyse it)

    • Follow (PRISMA?) guidelines, consider publication and file-drawer bias
  • Further field experiments involving social fundraising

  • Encouraging co-authors and collaborators