10  Analytical (impact) information \(\rightarrow\) generosity

Note: this relates to an ongoing project of David Reinstein with several co-authors, including ongoing field experiments and a meta-analysis that is underway.

Much of this project is being openly presented in the (in-progress) “Impact of impact treatments on giving: field experiments and synthesis” bookdown, a project organised in the dualprocess repo.

An earlier set of presentation slides is hosted here (press ‘O’ to see the full multi-dimensional slide map); the outline slide is here.

Here we consider the impact of being presented with (or actively pursuing) effectiveness information (which is naturally analytical information) on generosity and the willingness to donate. (We also consider some work involving analytical information about charities that would not be seen as particularly relevant to impact in the sense described above.)1

Further ‘grant-worthy’ motivation (unfold).

As noted above, scientific evidence suggests that organizations’ “cost per outcome” differs substantially, perhaps by a factor of 1000 or more (Jamison, Karlan, and Schechter 2008). This has motivated an increasing focus on charity effectiveness, spearheaded by initiatives from the Rockefeller and Gates foundations. Furthermore, organizations like GiveWell now provide direct ratings on the basis of per-dollar impact (e.g., cost per life saved), and are reaching out to larger audiences. This approach might boost giving by leveraging donors’ preferences for contributions “to be put to good use”, i.e., for direct interventions.

E.g., in Aknin et al. (2013), participants reported greater happiness when the impact of their contribution was highlighted (see also van Iwaarden et al. 2009).

However, while we have rigorous evidence on what works and doesn’t work in anti-poverty and health interventions (acknowledged by the 2019 Economics Nobel Prize), we actually know very little about how potential donors react to this impact information!

10.1 Exposure to cost effectiveness and impact information (analytical information) might reduce generosity

  • May turn off System-1 and reduce giving

  • Statistics diminish impact of ‘identifiable victim’

References: Small, Loewenstein, and Slovic (2007), Karlan and Wood (2017), Bergh and Reinstein (2020), Smeets, Bauer, and Gneezy (2015)

See also: Parsons (2007)

10.1.1 Outline of the evidence

A subjective outline of the evidence (Reinstein):

We consider a range of published work (from a non-systematised literature review) including laboratory and hypothetical experiments, and non-experimental analysis of related questions (such as the impact of general charitable ratings). However, we focus on field and ‘field-like’ experiments involving truthful analytical ‘impact-per-dollar’ relevant information in settings involving substantial donations.

The experimental evidence is limited, mixed and overall indeterminate. We could only find a single large-scale field trial closely bearing on ‘impact-per-dollar’ information. Karlan and Wood (2017) ran a field experiment on a charitable mailing campaign for Freedom From Hunger, reporting mixed results (null overall, positive for some subgroups, negative for others), some of which were underpowered. Small, Loewenstein, and Slovic (2007) (involving laboratory experiments with real donations) find that giving to an identifiable victim is reduced when statistics are also presented and that “priming analytic thinking reduced donations to an identifiable victim relative to a feeling-based thinking prime.” Bergh and Reinstein (2020)’s work (M-turk charitable giving, plus one small-scale field-ish context) mainly finds a near-zero effect (e.g., in pooled analysis across all six experiments; see table 1 right column, table 2 bottom, and appendix page 51).

Further related evidence is also indeterminate (see fold).

Trussel and Parsons (2007) find a positive impact of providing a particular type of analytical charity financial information on a subset of donors.

There is also mixed evidence on the impact of real-world charity ratings (Yörük 2016; Brown et al. 2017; Gordon et al. ’19), and mixed evidence on ‘excuse-driven information seeking’ (Exley ’16b; Fong and Oberholzer-Gee 2010; Metzger and Günther 2019).

There is some further related evidence from lab experiments, but the results are limited. Fong and Oberholzer-Gee (2010) apparently find that exogenous information about recipients increases donations, but they do not report this specifically. There is some speculation, but again mixed evidence, that individuals already in a “system 2” (deliberative) frame are more likely to be positively affected by impact information. There is also a distinction to be further explored between “output information” (how the donation is used) and “impact information”; the former is seen to increase generosity in several studies. [Reference needed here]

We are planning a more careful multilevel/meta-analysis of this work, incorporating ‘new evidence’ from a set of field experiments run by Reinstein and others involving two distinct charities and three campaigns.

Presenting “Effectiveness” and other types of analytical efficiency information

Karlan and Wood (2017) (KW) ran a field experiment on a charitable mailing involving scientific impact information. Their treatment involved adding scientific impact text to an appeal, while removing some emotional text:

Effectiveness treatment language:

According to studies on our programs in Peru that used rigorous scientific methodologies, women who have received both loans and business education saw their profits grow, even when compared to women who just received loans for their businesses. But the real difference comes when times are slow. The study showed that women in Freedom from Hunger’s Credit with Education program kept their profits strong–ensuring that their families would not suffer, but thrive.

Control (emotional) treatment language:

Because of caring people like you, Freedom from Hunger was able to offer Sebastiana a self-help path toward achieving her dream of getting “a little land to farm” and pass down to her children. As Sebastiana’s young son, Aurelio, runs up to hug her, she says, “I do whatever I can for my children.”

They find a null effect overall with fairly tight bounds. They report a near-0 impact on incidence (in table 2, column 1), with a standard error under 1%, relative to a baseline incidence of 14%. They estimate an impact on the amount donated of +2.35 USD, with a standard error of 1.98; this is relative to a mean of 14.17 USD. (In their Winsorised results the point estimate is -0.074, the standard error is 0.82, and the mean is 11.30.)
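To make the “fairly tight bounds” claim concrete, the following is a minimal sketch that converts the point estimates and standard errors quoted above into approximate 95% confidence intervals, both in USD and as a share of the corresponding mean. The normal approximation (z = 1.96) and the helper name `ci_95` are assumptions made for illustration; they are not taken from Karlan and Wood (2017).

```python
# Approximate 95% confidence intervals from a point estimate and standard error.
# Assumption: normal approximation with z = 1.96 (not stated in the source).
Z_95 = 1.96


def ci_95(estimate, std_error, baseline_mean):
    """Return ((lower, upper) in raw units, (lower, upper) as shares of the baseline mean)."""
    half_width = Z_95 * std_error
    lower, upper = estimate - half_width, estimate + half_width
    return (lower, upper), (lower / baseline_mean, upper / baseline_mean)


# Figures quoted in the text: amount donated (USD), unwinsorised and Winsorised.
raw_usd, raw_share = ci_95(2.35, 1.98, 14.17)
wins_usd, wins_share = ci_95(-0.074, 0.82, 11.30)

for label, usd, share in [("Unwinsorised", raw_usd, raw_share),
                          ("Winsorised", wins_usd, wins_share)]:
    print(f"{label}: {usd[0]:+.2f} to {usd[1]:+.2f} USD "
          f"({share[0]:+.1%} to {share[1]:+.1%} of the mean)")
```

Under this approximation the unwinsorised interval runs from roughly -1.5 to +6.2 USD and the Winsorised one from roughly -1.7 to +1.5 USD, so how “tight” the bound looks depends on which specification is used.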

When they differentiate this by an ex-post classification (not preregistered), they find positive effects of this information for ‘prior large donors’ and negative effects for ‘prior small donors’.

This is probably the most substantial and relevant study. We consider it further in our meta-analysis below.

Nonetheless, the study presents only a single piece of evidence from a specific context. It also has some potential limitations in considering our main question.

  1. Their treatment changed two things at once: impact information was added while emotional information was removed

  2. It may have been entangled with a Yale/institution effect

  3. The nature of ‘impact’ information was not particularly quantitative; it did not express an ‘impact per dollar’

Design and results summary:

Treatment(s): Add scientific impact text to base script (which included a standard qualitative story about an individual beneficiary), while removing emotional text:

\(\rightarrow\) little net effect

\(\rightarrow\) reduced (increased) giving among small (large) prior donors (not a preregistered hypothesis)

Parsons (2007)

2 x 2 mailing appeal for People with Aids Coalition-Houston, with several treatments:

  • adding “service efforts and accomplishment info” (SEA),
  • and adding favorable “FINANCIAL” spending/overhead ratio information.

The FINANCIAL treatment (alone) \(\rightarrow\) 180% increase in odds of donating among prior donors (\(p<0.05\))

(The other effects are mainly insignificant, with wide error margins, suggesting the study was underpowered.)
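A 180% increase in the odds of donating corresponds to an odds ratio of 2.8, which is not the same as a 180% increase in the probability of donating. The sketch below shows the conversion for a few baseline donation rates. These baseline rates are purely hypothetical (the text above does not report the baseline for prior donors), and the function name `apply_odds_ratio` is an illustrative choice.

```python
# Convert an odds ratio into an implied probability, given a baseline probability.
def apply_odds_ratio(baseline_prob, odds_ratio):
    """Probability implied by multiplying the baseline odds by `odds_ratio`."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


ODDS_RATIO = 1.0 + 1.80  # a 180% increase in odds, as reported above

# Hypothetical baseline donation rates, NOT taken from the paper.
for baseline in (0.05, 0.10, 0.20):
    treated = apply_odds_ratio(baseline, ODDS_RATIO)
    print(f"baseline {baseline:.0%} -> treated {treated:.1%} "
          f"(relative change in probability: {treated / baseline - 1:+.0%})")
```

The relative change in probability is always smaller than the change in odds and shrinks as the baseline rate rises, which matters when comparing this result with studies that report effects on incidence directly.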


Further details from Parsons (unfold, direct quotes from paper)

(This is all direct quotations:)

Potential donors were sent, via a direct mail campaign, fundraising appeals containing varying amounts of financial and nonfinancial information in order to determine whether individual donors are more likely to contribute when accounting information or voluntary disclosures are provided …

A logistic regression provides evidence that some donors who have previously donated use financial accounting information when making a donation decision. The results are inconclusive regarding whether donors use nonfinancial service efforts and accomplishments disclosures to determine whether and how much to give, but participants in the lab experiment judged the nonfinancial disclosures to be useful for making a giving decision

Both experiments use a two-by-two design to manipulate the direct provision of (1) financial information (derived from mandatory informational tax filings which are available only if requested by the donor) and (2) voluntary disclosure of nonfinancial accounting information (not otherwise available to the donor). By analyzing actual cash receipts from the fundraising appeal, I find that donors who had previously contributed to the organization are more likely to donate when financial accounting information is directly provided. New prospective donors make larger contributions when either financial information or voluntary, nonfinancial accounting information is included with a basic fundraising appeal, but differences are not statistically significant.

The first manipulation is to include financial information drawn from the audited financial statements with the basic fundraising appeal. Summary charts and graphs, instead of full financial statements complete with footnotes, are used to highlight the efficiency measures typically emphasized in previous literature. The financial information indicates that 92.5 percent of the entity’s expenditures were directed to program expenses in the prior year. This figure compares favorably with the 60 percent suggested minimum level recommended by the National Charities Information Bureau

The second manipulation is to include a voluntary disclosure of service efforts and accomplishments (SEA) that describes the organization’s past efforts to serve its beneficiaries and gives specific information about the success of its programs (see Appendix C). This information is in narrative form and written in lay terms. It provides both output (quantity of product or service produced) and outcomes (results) information as defined in Hatry et al. (1990).

Charity: People with Aids Coalition-Houston

Limitations of Parsons (2007) (for the present question)

  • The efficiency measures they use, while largely analytical (thus, helpfully, potentially triggering dual-process shifts), are not meaningful ‘impact-per-dollar’ or ‘output-per-dollar’ information

  • They highlight a differentiated treatment effect, but this may be an ex-post comparison. They don’t report pre-registration.

  • Unclear at times, missing some important information:

    • Is the model a logit or LPM?

    • Not effect-coded; no measure of overall impact of FINANCIAL across both SEA treatments.

    • Confidence intervals are not shown

Bergh and Reinstein (2020) (Lab and field experiments)

From Abstract:

Across six experiments we examined how images of identified victims interact with information about charity efficiency (money toward program) and effectiveness (program outcome). We further examined if the images primarily get people to donate (yes/no), while efficiency/effectiveness might provide a tuning mechanism for how much to give. Results showed that images influenced the propensity to donate and induced participants donate their full bonuses, indicating heuristic effects. Efficiency and effectiveness information had no effects on donations.2

Notes on above table:

(Note that in the above, all binary variables are ‘effect coded’.) Considering ‘Don share’ (the share of the endowment contributed), as well as the linear probability model, we see that the pooled effect of the effectiveness information is fairly tightly bounded around zero. Even at the 95% lower bound, the effect is no more than an 11% reduction in the share donated (.04/0.35), and a 10% reduction in incidence (.06/.62).
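The two relative bounds quoted above are simply the absolute lower-bound effects divided by the corresponding baseline means. A minimal sketch of that arithmetic follows; the function name `relative_reduction` is an illustrative assumption, and the inputs are the figures given in the text.

```python
# Express an absolute lower-bound effect as a share of the baseline mean.
def relative_reduction(abs_reduction, baseline_mean):
    """Size of a reduction relative to the baseline (both on the same scale)."""
    return abs_reduction / baseline_mean


# Inputs as quoted in the text: (.04/0.35) and (.06/.62).
share_donated = relative_reduction(0.04, 0.35)
incidence = relative_reduction(0.06, 0.62)

print(f"Share donated: at most a {share_donated:.0%} reduction")
print(f"Incidence: at most a {incidence:.0%} reduction")
```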

Study 6 had the most straightforward ‘impact information’ as per our definitions.

Above we see estimates of odds ratios, relative to the control group, of the incidence of donating to RB (Carter Center: the river-blindness charity) and GD (Guide Dogs for the blind). Confidence intervals reveal a lack of power. However, there is suggestive evidence (p=0.09 and p=0.16, respectively) that the image led people to be less likely to donate to GD and more likely to donate to RB. This may have been driven by the African appearance of the blind girl depicted.

The wide confidence intervals of the odds ratios suggest that Study 6 had limited statistical power.

Caviola, Schubert, and Nemirow (2020) (Hypothetical/intentional experiments; information treatments, debunking/debiasing)3

  • Series of four hypothetical experiments, mainly on M-Turk

Abstract: … Across six tasks (Studies 1a, 1b), we found support for both explanations. Among lay donors, we observed multiple misconceptions—regarding disaster relief, overhead costs, donation splitting, and the relative effectiveness of local and foreign charities—that reduced the effectiveness of their giving. [*]

Similarly, we found that they were unfamiliar with the most effective charities (Studies 2a, 2b). Debunking these misconceptions and informing people about effectiveness boosted effective donations [*]; however, a portion of lay donors continued to give ineffectively to satisfy their personal preferences.

By contrast, a sample of self-identified effective altruists gave [*] effectively across all tasks.

(Notes on this paper are adapted from Janek Kretschmer’s work.)

[*] All experiments in this paper involved hypothetical (stated) giving ‘choices’.

Summarised findings (as reported by authors) and main points:

  • Both ‘lack of knowledge’ and preferences impede effective giving

  • It is important to understand the (ambivalent) relationship between the desire to do the most good and ‘knowing about the best option’ and ‘actually giving to the most effective charities’

  • There are huge differences between ‘EA donors’ and the general population [?] (lay people) in terms of knowledge and preferences

  • Study 2 finds strong effects of informing participants about GiveWell’s top-recommended charities:

    • e.g., in Study 2a, it resulted in 41% of participants expressing an interest in donating to one of them.
  • Misconceptions about charities and preferences for ineffective options were closely linked

Mulesky (2020) (Hypothetical choices; realistic versus over-optimistic impact information)

(Unfold key elements of abstract)

… investigates whether donor-driven market incentives can explain why nonprofit organizations routinely lack an evidentiary basis to justify their operations. Across three preregistered online survey experiments (N = 1,058), two domains (global health and human rights), and two types of interventions (medical care and issue-awareness), this paper finds that people do not reward exceptionally positive charitable impact, but they do punish charities that admit their programs were ineffective. Charities are only rewarded for revealing information about their impact when the results are unrealistic and unattainable. The results also demonstrate that donors are more sensitive to information about administrative overhead than they are to direct information about impact.


  • Participants “recruited through Positly, an online study recruiter that solicits high quality participants from Amazon’s M-Turk”

  • Series of experiments giving participants a hypothetical choice: “you have the opportunity to donate 100 USD to i. [described charity] or ii. Some other international health charity.” They needed to hypothetically allocate the 100 USD completely between these two.

Study 1 treatments:

  • No Impact information
  • Null Impact information (“study found that Human Rights International did not have the desired impact, suggesting that reform is needed to increase effectiveness”)
  • Realistic impact information: “… results of the study suggest … [HHI] saves the life of one person for each 2000 USD spent”
  • Unrealistic impact information: “… for each 150 USD spent”

Study 2: Similar to Study 1, but for preventing HIV infections4

Results considered

Stated finding:

Charities are only rewarded for revealing information about their impact when the results are unrealistic and unattainable.

In fact, the mean donations were ordered: Unrealistic (Impact information) \(>\) Realistic \(>\) No(ne) \(>\) Null.

While the difference between Realistic and None was not significant by the standard tests, the confidence interval does not rule out an effect of up to almost 15 percentage points.

Other comments (unfold)

Comments from Janek Kretschmer:

  • The author considers three aspects of the ‘problem’ with beliefs: no-reward for actual high impact, punishment for low impact, and reward for unrealistic high impact. This might be reduced to the simple assumption that actual cost-effectiveness information can not live up to donors’ expectations [which] makes an investment into scientific evaluation unprofitable for NGOs.

  • The results may largely be explained by the evaluation bias (Caviola et al. 2014). This could be emphasized further.

  • In study 3, the presentation of possible overhead spending is potentially too negative.

  • Also, after accounting for overhead, the difference between the two scenarios is small (2000 USD vs. 1800 USD), and more mental resources are required to reach this conclusion

  • More information about the “another charity option” would be helpful. Ideally, the author would provide the participants’ decision screen in the appendix.

When the author wonders whether “active donors […] differ from the general population,” she asks a relevant question (cf. Karlan and Wood, 2017). However, she simply provides the mean donation (M = 70.9 USD) to Human Health International among respondents who have donated in the past twelve months. Thus, it can be said that there is no statistical difference from the mean donation (M = 69.9 USD) among all respondents, including non-donors. However, the more interesting question, “do large donors react differently to the four conditions?”, remains unanswered.

Evidence: Analytical information \(\times\) emotional information and ‘identifiable victim’

(Kogut and Ritov 2011; Loewenstein and Small 2007; Small, Loewenstein, and Slovic 2007; Drouvelis and Grosskopf 2016; Caviola et al. 2014)

Small, Loewenstein, and Slovic (2007), studies 3-4

Study 3:

individuals who faced an identifiable victim donated more than those who faced victim statistics, p < .01, and also donated more than those who faced an identifiable victim in conjunction with statistics, p < .05.

(They interpret the statistics as possibly ‘debiasing’ the IVE)

Study 4: “Priming analytic thinking [math problems] reduced donations to an identifiable victim relative to a feeling-based thinking prime [“impression questionnaire”]. Yet, the primes had no distinct effect on donations to statistical victims, which is symptomatic of the difficulty in generating feelings for these victims.”


Basic design has key strengths: Double-blind, real donations, distractors, careful use of language.

Primes: Note, the latter non-effect appears tightly bounded; but this could simply be driven by nonlinearity. If people gave little to statistical victims, there is less room for this to decrease further. A classic problem when considering interactions.

Ideas42 Summary:

Researchers gave study participants the opportunity to donate $0-5 to famine relief efforts at Save the Children (n = 159). One group received letters that included a picture and brief description of a little girl. A second group received letters describing factual information about food security, and a third group received letters with both the little girl’s profile and factual information. The photo and description prompted an emotion-based response, raising more than twice as much money as the factual solicitation. Including factual information with the girl’s profile reduced this effect, with no significant difference in giving between those who received both pieces and those who received factual information only

Bergh and Reinstein, 2020, SPSS

10.2 Overall ‘net’ responses to charity ratings

There is a small body of evidence on how charity quality ratings (which are not typically ‘impact’ ratings as we have defined them) affect, or at least correlate with, a charity’s fundraising success. The effect of these ratings presumably relates both to individuals’ willingness to seek out and process this information (as in our discussion of CBA), and to the impact of this information on an individual’s generosity.

10.2.1 Evidence on responses to charity ratings

One characterization, from Bergh and Reinstein (2020):

Some work further suggests that changes in charity ratings lead to changes in charity revenues (e.g., Gordon, Knock, and Neely (2009); Yörük (2016)), but it is unclear if this is driven by efficiency evaluations per se. For instance, people might respond to the number of stars given to a charity without deeply considering what these stars represent.

Yörük (2016).

“Charity ratings” Journal of Economics & Management Strategy, 25(1), 195-219.

Relevance: Reasonably strong causal evidence that in general, charity ratings may boost a charity’s fundraising, at least for some types of charities. However, this is based on Charity Navigator ratings, which do not generally agree with our measures of impact.


Type of evidence: Observational, claiming causality through a regression discontinuity framework

  • Charity Navigator stars are based on a continuous score across categories
  • Identification via RD: Impact of crossing a ‘star’ threshold on amounts raised

Background mechanisms and related evidence: the role of consumer reviews and independent ratings in for-profit sectors.5

Key findings:

  • For relatively smaller and unknown charities, a one-star increase in ratings is (causally) associated with a 19.5 percent increase in the amount of charitable contributions received

10.2.2 Evidence for “Charity choices do not respond ‘efficiently’ to information about costs and benefits of charity”

Null (2011) ran a set of experiments at Kiwanis/Rotary clubs and with “professional subjects” (presumably, university administrators) at the Berkeley X-lab. The former strictly involved allocations among charities; in the latter, what was not given away could be kept. For the main reported treatments, participants made a series of decisions under different incentives (mostly on the same page and thus simultaneously?). The “prize” was $100; in each session only one decision from one subject was chosen for actual payment/donations.

Many participants who choose to donate positive amounts to multiple charities in earlier (?stages) continue to donate to multiple charities when one charity is given a better match rate; they only “imperfectly substitute” (and some even substitute away from the now “lower-priced” charity). She attributes this both to risk aversion (diminishing utility in each charity’s actual impact, along with uncertainty about this impact) and to a version of “warm glow” with a diminishing marginal benefit in the amount given to each charity.

She also introduces exogenous risks over matching rates, and notes that roughly 2/3 of those that choose to shift only imperfectly are not measured to be “risk averse”.

However, this could also be attributed to a simple failure to make these cost-benefit calculations (as she also found some evidence suggesting misunderstanding of the nature of these incentives).
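To illustrate the cost-benefit calculation that participants may be failing to make, the sketch below computes the donor’s “price” of delivering 1 USD to a charity under a match, and compares an even split with full substitution toward the better-matched charity. The match rates (1:1 versus none) are hypothetical and not taken from Null (2011); only the 100 USD amount comes from the text. The comparison also assumes a donor who cares only about total dollars received across charities, which is precisely the benchmark that many participants appear to depart from.

```python
# Effective price of giving under a match, and what each allocation delivers.
def price_of_giving(match_rate):
    """Donor's cost per 1 USD received by the charity, with each donated dollar matched at `match_rate`."""
    return 1.0 / (1.0 + match_rate)


def charity_receipts(donation, match_rate):
    """Total received by the charity: the donation plus the match."""
    return donation * (1.0 + match_rate)


BUDGET = 100.0                 # the 100 USD "prize" mentioned in the text
MATCH_A, MATCH_B = 1.0, 0.0    # hypothetical: charity A matched 1:1, charity B unmatched

even_split = charity_receipts(BUDGET / 2, MATCH_A) + charity_receipts(BUDGET / 2, MATCH_B)
all_to_a = charity_receipts(BUDGET, MATCH_A)

print(f"Price of giving to A: {price_of_giving(MATCH_A):.2f} USD per USD received")
print(f"Price of giving to B: {price_of_giving(MATCH_B):.2f} USD per USD received")
print(f"Even split delivers {even_split:.0f} USD; giving all to A delivers {all_to_a:.0f} USD")
```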

Note: There is a great deal of other evidence on this point that should be incorporated. The Null experiment makes this particularly explicit, but I believe there is substantial additional work, both involving experiments that participants were aware of, and in more natural contexts.

* Consider also (Metzger and Günther 2019).

See also

Vesterlund (2003), Chhaochharia, Ghosh, et al. (2008), Landry et al. (2010), Brown, Meer, and Williams (2016)

Gordon, Knock, and Neely (2009)


There is also (informal?) evidence that people may dramatically underestimate the cost of saving lives.6

  1. Effectiveness information may also affect how donors perceive the social signaling value of their donation. We return to this in the signaling and social pressures… section.

  2. Need to delve into this further: tight null effects or underpowered studies? Consider confidence intervals of effects reported, as in tables below. These need some clarification and improved formatting.

  3. CONSIDER: This paper/result may not inform ‘evaluation aversion’; should we move it to another section or subsection?

  4. The third study revolves around the addition of overhead information, with what might be seen as a fairly negative presentation. I do not discuss it here.

  5. E.g., Luca (2011): a one-star increase in online rating leads to a 5 to 9 percent increase in revenue; Jin and Sorensen (2006): health plan ratings have a significant impact on individuals’ health plan choices; Reinstein and Snyder (2005): positive expert reviews have a significant effect on the box office revenue of movies.

  6. DR: I am considering whether this should be seen as a relevant barrier. It is important to note in trying to measure the impact of effectiveness information. Evidence comes from Caviola et al. 2019 and Greenberg et al. 2017.