These files were downloaded from email and stored in dualprocess/data/dv_data, or exist only in Kiki’s files.
Full spreadsheet index of these, issues and key statistics: data/dv_data/table_of_sheets_sent.xlsx
2018 test. Files in order in which they were received:
2019.02.26 TYVideo Impact Test DV
2019.02.26 TYVideo Impact Test DV2-no outlier
Stats for DV Formula_TYVid(23rd August)
2019.08.27 TYVideo Impact Test DV 2
TYEmailVideo_Results2.xlsx: includes both ‘direct gifts from email’ and other (indirect) gifts; does not include the 9k gift because it had not happened yet (KK: first thing CRS sent us)
Other gifts (not direct) “7 days after the person opened the email, gifts made through any channel” (seems surprisingly high)
Q: check we interpret it correctly… it seems plausible but we want to know.
Q: what date is this up to (when had CRS checked their database)?
Within this Excel file (TYEmailVideo_Results2.xlsx):
The first table is total as of XX date
Second table ‘Direct Gifts from Email’: we think this means people who clicked on email link and donated
Third table is everything, broken down by donation range
Fourth table is everything, broken down by ‘Form Name’ … but we are not sure what it means
2019.02.26 TYVideo Impact Test DV.xlsx: Sent to Kiki, includes the 9k outlier; only included direct gifts
Note: this may be the same file as TYVideo Impact Test DV 2.xlsx
2019.02.26 TYVideo Impact Test DV2-no outlier.xlsx: Does not include the 9k outlier but otherwise has everything above (and no [in?]direct gifts)
unlinked-2019.02.26 TYVideo Impact Test DV2-no outlier.xlsx: Stored version of the above, removing problematic linked personal data
Kiki: ‘But I did manage to clear up what was happening with our test results. A donation of $9,000 was made in the control condition between the first set of data they sent me and the second. That’s what created all the confusion and discrepancies.’
Note: The ‘statistics’ files only include direct-from-email donations.
unlinked-2019.02.26 TYVideo Impact Test DV2-no outlier.xlsx has a different breakdown of donations in ranges/treatments than TYEmailVideo_Results2.xlsx, even among the ‘email only’ group; some bins have more and some have fewer:
Control conversions, bins (<50, 50-99, 100-499, 1000+):
TYEmailVideo_Results2.xlsx : 6, 9, 13, 3
unlinked-2019.02.26 TYVideo Impact Test DV2-no outlier.xlsx: 4, 6, 13, 3
If the data came later, then why are there fewer conversions here, given that these are NOT in the outlier category?
Treated conversions, bins (<50, 50-99, 100-499, 500-999):
TYEmailVideo_Results2.xlsx: 19, 17, 25, 6
unlinked-2019.02.26 TYVideo Impact Test DV2-no outlier.xlsx: 26, 19, 21, 5
Here we have more conversions in some bins and fewer in others, and not only in the ‘outlier’ bins.
[Kiki] Number of conversions between the first and second file: some are higher (which is normal since more donations came in) but some are lower. I remember talking to CRS about this and they said that some donations don’t go through, which could explain seeing fewer conversions in a later file.
→ So the later file is more trustworthy.
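The bin-by-bin discrepancy can be checked mechanically. The following is a minimal Python sketch, not part of the original analysis: the counts are copied from the lines above, while the dictionary layout and the helper name `bin_changes` are illustrative assumptions.

```python
# Bin counts copied from the notes above; names and layout are illustrative only.
first_file = "TYEmailVideo_Results2.xlsx"
later_file = "unlinked-2019.02.26 TYVideo Impact Test DV2-no outlier.xlsx"

counts = {
    "control": {
        "bins": ["<50", "50-99", "100-499", "1000+"],
        first_file: [6, 9, 13, 3],
        later_file: [4, 6, 13, 3],
    },
    "treated": {
        "bins": ["<50", "50-99", "100-499", "500-999"],
        first_file: [19, 17, 25, 6],
        later_file: [26, 19, 21, 5],
    },
}


def bin_changes(arm):
    """Return {bin label: later count minus first count} for one arm."""
    entry = counts[arm]
    return {
        label: later - first
        for label, first, later in zip(
            entry["bins"], entry[first_file], entry[later_file]
        )
    }


for arm in counts:
    changes = bin_changes(arm)
    # A negative value means the later file lists fewer conversions in that bin.
    print(arm, changes, "net change:", sum(changes.values()))
```

On these numbers the control arm loses conversions in the two lowest bins, while the treated arm gains in the two lowest bins and loses in the two highest, so the differences are not confined to the outlier bins.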
[Kiki] Revenue for the control condition between the first and second file: it’s way higher in the second one. Looking closer at the gift distributions, we found an outlier donation of 9k, which led to the 3rd file being sent to us without it.
Stats for DV Formula_TYVid.xlsx: Requested statistics including ranks and sums of squares. Statistics clarifying individual donations and rankings.
Stats for DV formulas_TY Video-ID-removed.xlsx: As above, but also includes full list of donation behaviours (with ID removed)
16 Dec 2019 email “FW: thank you video a/b test” thread
tyvid-email-test-dv-2019.xlsx: renaming of TYVid Email Test DV.xlsx (removing links). Probably includes only direct donations from email; probably does not censor outliers.
Stats for DV formulas_TYVid DV.xlsx: Requested statistics including ranks
2019.12.06 TYVid Email Test DV.xlsx: same content as “tyvid-email-test-dv-2019.xlsx”. This is the correct stat, Larissa confirms.
Q: These two sheets do not agree; the latter has fewer conversions; 44 and 35 donations … these may have been done on different days?
Resolution: the larger number is correct; Larissa checked and found 85 donations.
We were able to put together a connected dataset across both years with identifying information stripped. However, we are not permitted to share this ‘raw data’ publicly.
For our own records, the (.gitignored) files used for this are the following:
dualprocess/data/dv_data/archive-and-original/ID_link_2019.02.26 TYVideo Impact Test DV2-no outlier (1).csv, with identifying information stripped from 2019.02.26 TYVideo Impact Test DV2-no outlier (1).xlsx, mentioned above.
dualprocess/data/dv_data/archive-and-original/ID_link_2019.12.06 TYVid Email Test DV.csv, with identifying information stripped from 2019.12.06 TYVid Email Test DV.csv, mentioned above.
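How the ID_link files were actually produced is not recorded in these notes. As a rough illustration of the stripping step only, here is a minimal Python sketch using the standard csv module. The column names (`donor_id`, `name`, `email`, `treatment`, `gift`) and the two sample rows are hypothetical and are not taken from the CRS files.

```python
import csv
import io

# Hypothetical identifier columns; the real headers in the CRS files
# are not recorded in these notes.
ID_COLUMNS = {"donor_id", "name", "email"}


def strip_identifiers(rows, id_columns=ID_COLUMNS):
    """Yield a copy of each row dict without the identifying columns."""
    for row in rows:
        yield {key: value for key, value in row.items() if key not in id_columns}


def strip_csv(src, dst, id_columns=ID_COLUMNS):
    """Read CSV text from file object src and write a de-identified CSV to dst."""
    reader = csv.DictReader(src)
    kept = [col for col in reader.fieldnames if col not in id_columns]
    writer = csv.DictWriter(dst, fieldnames=kept, lineterminator="\n")
    writer.writeheader()
    for row in strip_identifiers(reader, id_columns):
        writer.writerow(row)


# Made-up two-row input, for illustration only (not CRS data).
sample = io.StringIO(
    "donor_id,email,treatment,gift\n"
    "1,a@example.org,control,25\n"
    "2,b@example.org,video,0\n"
)
out = io.StringIO()
strip_csv(sample, out)
print(out.getvalue())
```

Dropping columns by name keeps the remaining fields byte-for-byte as they were received. Linking donors across the two years would additionally require a non-identifying key, which this sketch does not attempt.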
|TYEmailVideo_Results2.xlsx||for indirect gifts only?||2019-01-17||2018||1||TYVideoEmail||yes||yes||31||67||241||267||yes||yes||no||$5,366||7248||12614||30931||43448||no|
|2019.08.27 TYVideo Impact Test DV2.xlsx||yes?||2019-08-27||2018||NA||NA||yes||NA||NA||NA||NA||NA||NA||NA||NA||NA||NA||NA||NA||NA||NA|
|Stats for DV Formula_TYVid.xlsx||NA||2019-08-23||2018||2||‘TY Video’, WRD (ignore latter)||yes||no||27||71||NA||NA||yes||no||yes||NA||NA||recoverable||NA||NA||yes|
|Stats for DV formulas_TY Video-ID-removed.xlsx||yes||2019-08-23||2018||2||‘TY Video’, WRD (ignore latter)||yes||no||27||71||NA||NA||yes||no||yes||14496||6423||recoverable||NA||NA||yes|
|TYVid Email Test DV.xlsx||no (redundant)||2019-12-16||2019||1||TYVid||NA||NA||NA||NA||NA||NA||NA||NA||NA||NA||NA||NA||NA||NA||recoverable?|
|Stats for DV formulas_TYVid DV.xlsx||for ranks only?||2019-12-16||2019||1||stats||yes||no?||44||35||NA||NA||yes||no||yes||NA||NA||recoverable||NA||NA||yes|
|(2019.12.06 TYVid Email Test DV.xlsx)||(yes)||2020-04-29||2019||NA||NA||NA||NA||NA||NA||NA||NA||NA||NA||NA||NA||NA||NA||NA||NA||NA|
|filename||Key missing elements|
|TYEmailVideo_Results2.xlsx||Sum of squares and ranks for donations from all modes|
||Counts and amounts by treatment for gifts from all modes including indirect|
|2019.08.27 TYVideo Impact Test DV2.xlsx||NA|
|Stats for DV Formula_TYVid.xlsx||Similar statistics for gifts from all modes including indirect|
|Stats for DV formulas_TY Video-ID-removed.xlsx||Similar statistics for gifts from all modes including indirect|
|TYVid Email Test DV.xlsx||NA|
|Stats for DV formulas_TYVid DV.xlsx||NA|
|(2019.12.06 TYVid Email Test DV.xlsx)||NA|
|TYEmailVideo_Results2.xlsx||what is ‘form name’?||NA|
||Different breakdown of donations in bins than for TYEmailVideo_Results2.xlsx; some less, some more||Explanation (Kiki/CRS) – more donations came in, some were cancelled or never paid; latter file more reliable|
|2019.08.27 TYVideo Impact Test DV2.xlsx||Updates the above through August, but the only difference is 1000 more email opens and 1 click.||As of 20 Feb 2020, this is in Kiki’s folder only.|
|Stats for DV Formula_TYVid.xlsx||Not sure on the date||NA|
|Stats for DV formulas_TY Video-ID-removed.xlsx||Not sure on the date||NA|
|TYVid Email Test DV.xlsx||NA||NA|
|tyvid-email-test-dv-2019.xlsx||renaming of ‘TYVid Email Test DV.xls’, removing links||order of impact/anchor switched in table|
|Stats for DV formulas_TYVid DV.xlsx||Why are fewer gifts listed here vs. in ‘tyvid-email-test-dv-2019.xlsx’? It cannot be explained by outliers||Kiki: When I asked I got the attached response which didn’t shed any light. They never replied back to my response/hypothesis. I would trust the first file since they might have messed something up when they were preparing the formula file.|
|(2019.12.06 TYVid Email Test DV.xlsx)||same content as “tyvid-email-test-dv-2019.xlsx”||CRS confirms this is the correct one|
TODO: re-construct data from original files using R, no Excel work