Social Experiments “for” Skeptics: #1.1
Posted: 07 December 2008 06:05 PM

Howdy, All.  This posting was inspired by (a) some recent media attention to events involving atheism and (b) my possibly naive view that we (skeptics, freethinkers, naturalists, etc.) could advance some of our “causes” more effectively using science.  Below I’ll describe a hypothetical study about how to promote skepticism and point out several ways it might be improved.  On one hand this is just an interesting (to me) exercise in designing a “field” study (i.e., not in a well-controlled lab).  On the other hand, such studies are used in, say, large-scale public-health interventions (e.g., consider program-evaluation studies such as those by David MacKinnon and many others); is it far-fetched to think CFI or similar organizations might invest in studies like this as an empirical basis for allocating their limited resources?  I’m curious about others’ thoughts on any aspect of this; I’ll try to make time to check back in and contribute to any discussion.

RESEARCH QUESTION: Suppose some organization wants to increase what I’ll call doubt about god (DG) among people in some fairly large geographic region (e.g., a U.S. state), and they want to figure out how best to do this.  Let’s say they’re considering just two types of ad campaigns:

A. Display on city buses light-hearted signs to the effect that “god’s not real so just enjoy life.”

B. Display somber signs in public places that effectively say “religion is a delusion that makes people mean and stupid.”

HYPOTHETICAL STUDY: Consider this fairly simple experiment to compare A versus B: Pick two cities in the region, randomly assign one ad campaign to each city (call these City A and City B), run each ad campaign in its respective city for four weeks (simultaneously), and after the campaigns end, survey members of each community to assess DG with the question “Do you believe in god?”  Suppose we did this, and in City A and City B the percentages of respondents who expressed DG (i.e., answered something like “No”) were pA = 15% and pB = 5%.  Looks like ad campaign A won, right?  Well, kind of, but this simple study is subject to scads of problems.
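For concreteness, here’s a minimal sketch (in Python) of that naive post-only comparison, assuming, purely for illustration, 1,000 respondents surveyed in each city; the sample size is made up and not part of the scenario above.

    import math

    # Hypothetical post-campaign survey results from the scenario above,
    # with an assumed (made-up) sample size of 1,000 respondents per city.
    n_a, n_b = 1000, 1000
    p_a, p_b = 0.15, 0.05   # proportion expressing doubt about god (DG)

    # Two-proportion z-test for the difference p_a - p_b.
    p_pool = (p_a * n_a + p_b * n_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se

    print(f"difference = {p_a - p_b:.2f}, z = {z:.1f}")
    # With samples this large, the 10-point gap is far outside sampling error;
    # the real trouble is everything else, as described below.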

Because my remarks about this study are sort of long-winded, I’ll post them as replies in a few minutes.  CAUTION: Before reading further you may want to consider for yourself what can be concluded from this study’s results, whether any alternative explanations are plausible, and how to improve the study.

Adam

Posted: 07 December 2008 06:20 PM   [ # 1 ]

Social Experiments “for” Skeptics: #1.2

Okay, so in the initial post I posed a fairly simplistic research question and described a little experiment—so-called because the cities were assigned randomly to the ad campaigns—to address it.  Now I’ll point out some problems related to the interpretation that campaign A won; you may or may not agree with these points (and I may disagree with them after more thought).

CRITIQUE PART I:
To start seeing some of the study’s problems and considering how to design a better one, imagine that immediately before the ad campaigns these cities’ DG percentages were pA0 = 25% and pB0 = 10%.  Whoa!  It now appears both campaigns lowered DG, but B worsened it less (a 5-point drop) than A (a 10-point drop).  (For now I’ll ignore two important but slightly technical issues: Some of these percentages may be indistinguishable from each other due to sampling error, and differences may not be the best way to compare percentages—in fact, B lowered DG more than A if we consider relative differences [e.g., a 40% drop for A, a 50% drop for B] or odds ratios.)

But wait, there’s more!  Suppose we’d randomly assigned another city to be a “control” city, say City C, in which no DG signs were posted, and before and after the ad-campaign period this city’s DG percentages were pC0 = 20% and pC = 5%.  So now it seems both the A and B campaigns may actually have been effective in that they helped avoid the substantial drop observed in City C.  (To help make comparisons, it could be helpful to organize the results in a table or plot; I couldn’t get the crude plain-text versions I made to appear properly in this font.)
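Here’s a rough plain-text version of that table, built only from the hypothetical percentages above:

    City   Campaign           Before   After   Change
    A      A (bus signs)        25%     15%    -10 points
    B      B (somber signs)     10%      5%     -5 points
    C      control (no signs)   20%      5%    -15 points

And here’s a small sketch of the three ways of comparing before and after mentioned in the parenthetical (absolute difference, relative difference, odds ratio), again just illustrating the arithmetic on the same made-up numbers:

    # Hypothetical before/after DG proportions from the scenario above.
    cities = {"A": (0.25, 0.15), "B": (0.10, 0.05), "C (control)": (0.20, 0.05)}

    def odds(p):
        """Convert a proportion to odds, e.g., 0.25 -> 0.25/0.75."""
        return p / (1 - p)

    for city, (before, after) in cities.items():
        abs_diff = after - before                 # percentage-point change
        rel_diff = (after - before) / before      # e.g., -0.40 means a 40% drop
        odds_ratio = odds(after) / odds(before)   # < 1 means DG became less likely
        print(f"{city}: {abs_diff:+.2f} ({rel_diff:+.0%} relative), OR = {odds_ratio:.2f}")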

What’s going on here, and what do these results tell us about the effects of the two focal ad campaigns on DG?  Well, it’s hard to say, given the substantial pre-ad differences among cities and City C’s strange drop in DG.  One plausible explanation is that (a) cities in this region simply vary a lot in DG prevalence, and (b) something besides the ads happened in the region during the course of the experiment to decrease DG (e.g., a widely publicized miracle, a region-wide effort by religious groups to counter the ads).

At any rate, the above example suggests a few potential improvements to our initial study:

1. Include a pre-ad survey and a control condition (e.g., no signs, or signs about something other than god): Both of these facilitate meaningful comparisons given that we can’t control pre-existing differences among cities or influential events that happen during the experiment; random assignment might reduce the importance of the pre-ad survey, but I think it could still be useful (though I’d have to think harder about exactly why).  A crude sketch of this kind of pre/post/control comparison follows this list.

2. Choose cities that are similar in some respects: This would likely reduce pre-experiment variation, though at the expense of limiting generalizability (i.e., our findings would apply to a smaller universe of cities).

3. Monitor regional news or relevant activity during the experiment to detect events that might influence DG for all or parts of the focal population.

4. Include more than one city in each condition (i.e., A, B, control): Although having at least two per condition complicates analyses—whose details I’m ignoring—it helps separate a particular campaign’s effect from its associated city’s effect (e.g., City A’s DG drop from 25% to 15% may be due to something about City A or something about campaign A [or both]).
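As promised in item 1, here’s a crude sketch of the kind of pre/post/control comparison these improvements enable: a bare-bones difference-in-differences calculation on the made-up numbers above (just the basic idea, not an analysis I’d commit to).

    # Hypothetical before/after DG percentages (in percentage points);
    # City C is the no-sign control city.
    before = {"A": 25, "B": 10, "C": 20}
    after  = {"A": 15, "B":  5, "C":  5}

    change = {city: after[city] - before[city] for city in before}
    control_change = change["C"]

    # Difference-in-differences: each campaign city's change relative to the
    # control city's change over the same period.
    for city in ("A", "B"):
        did = change[city] - control_change
        print(f"Campaign {city}: raw change {change[city]:+d} points, "
              f"{did:+d} points relative to control")

By this crude reckoning both campaigns beat doing nothing, and B comes out ahead of A, roughly the reverse of the post-only comparison; but with one city per condition the result is still confounded with whatever is special about each city (hence improvement 4).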

I have additional thoughts, but because they’re about different aspects of the study design I’ll stop this here and post them in a second “reply.”

Adam

Posted: 07 December 2008 06:31 PM   [ # 2 ]

Social Experiments “for” Skeptics: #1.3

CRITIQUE PART II: My first “reply” included some initial thoughts about the hypothetical study I’d described.  Below are some other ideas off the top of my head that might be worth considering before, say, committing resources to actually doing the study (in case anyone’s seriously considering it).  The first several of these are potential problems with the initial study, and the later ones are more about extensions to address relevant issues.  (In practice it’d be good to be more systematic and thorough by considering threats to the four major types of validity—the above thoughts mainly concern statistical-conclusion and internal validity—but I’ll defer that.)

1. Did the survey actually include people likely to see the signs?  For instance, what if people who work in a given city tended to see the signs but those who live in that city were surveyed about DG?  Or what if some people saw both signs (e.g., commuters)?  It might be useful to include a “manipulation check” in the survey by assessing which sign (if any) each respondent saw (or heard about?)—being careful not to let this influence their response to the DG question (e.g., by asking the DG question first).

2. Was DG measured appropriately?  Whatever abstract, latent construct DG is meant to represent, one self-report item probably doesn’t measure it reliably or validly.  Would the results have been markedly different had the question been “Do you believe in a god?”, “Do you believe god exists?”, “Is there a god?”, “Are you sure there’s a god?”, “Do you doubt god?”, or any of several other variants?  What if some type of rating scale had been used instead, or perhaps indirect or behavioral measures (e.g., physiological responses to god-related cues)?  If there’s not a psychometrically sound measure of DG that’s practically feasible for the survey (e.g., not too time-consuming or complicated for respondents), a pilot study could be used to develop one.

3. How do people in the focal region actually interpret the message of each type of sign?  For instance, what does it make them think about, or how does it make them feel?  This could be addressed using focus groups or some sort of qualitative research—not something I’m used to, so I’d be interested in thoughts about this.

4. Do the ad campaigns have negative side effects, like increasing the prevalence of “immoral” behavior, mental-health problems, natural disasters, pestilence, famine, etc.?  If so, this would be important to know so these side effects could be avoided.  If not, this would be valuable empirical evidence for responding to critics who say the ads are bad for the communities.

5. Do the ad campaigns differ in their long-term effects on DG, such as weeks, months, or years after the campaign ends?  In some respects this is more important than their immediate impact, and for the planning of future campaigns it could be useful to know how long it takes for a campaign’s effect to “wear off.”

6. Which type of campaign is more cost effective?  For example, with the above results campaign A might be judged much more cost effective if its signs were substantially cheaper to produce and display, yielding more bang for the buck.

7. Does a given ad campaign work better for some types of communities than others, or for some types of persons than others?  To address these questions we’d need to measure some things about the communities and about the individual respondents to use them as explanatory variables in more sophisticated analyses (e.g., multilevel/mixed models); a rough sketch of this kind of analysis follows this list.

8. Are there interesting effects of variations in the ads, such as particular wording, visual design features (e.g., colors, layout, graphics, fonts), or placement of the signs (e.g., side of bus, back of bus, inside bus)?  What about variations in the timing of the ad campaigns, such as the overall length or maybe the frequency of rotation among alternative signs (e.g., a different sign each week)?

9. How could similar campaigns be implemented in communities without public transportation?  Posted flyers?  Billboards?  Sandwich boards?  Rented ad vehicles?  Sky-writing?

10. Would other study designs be better?  For example, we might include the same people in both the before and after surveys, and/or we might use more than one campaign in each of the cities, with breaks in between and counterbalancing of the order (e.g., ABC in one city, BCA in another, CAB in a third, and reflected versions of these sequences in three more); a tiny sketch of generating such sequences also follows this list.  Other design strategies could be used to deal with issues that are somewhat technical, such as Latin squares or fractional factorials to study things that can be manipulated experimentally (e.g., features of the signs or their placement), or blocking cities on important features to reduce error variance due to factors that can’t be manipulated (e.g., demographics, pre-campaign measures of religious behavior).
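Regarding item 7, here’s a rough sketch of the kind of analysis I have in mind.  The data are simulated and the variable names (dg, campaign, college_town) are invented for illustration, and I’ve used an ordinary logistic regression with an interaction term as a stand-in; a proper analysis would use a multilevel/mixed model to respect the clustering of respondents within cities.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)

    # Simulated illustrative data: one row per respondent, with a made-up
    # binary outcome (dg) and a made-up community-level covariate.
    n = 3000
    campaign = rng.choice(["A", "B", "control"], size=n)
    college_town = rng.integers(0, 2, size=n)             # hypothetical city feature
    base = {"A": 0.15, "B": 0.05, "control": 0.05}
    p = np.array([base[c] for c in campaign]) + 0.05 * college_town
    dg = rng.binomial(1, p)

    df = pd.DataFrame({"dg": dg, "campaign": campaign, "college_town": college_town})

    # Does the campaign effect differ by community type?  The campaign-by-
    # covariate interaction terms are the thing to look at.
    fit = smf.logit("dg ~ C(campaign) * college_town", data=df).fit()
    print(fit.summary())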
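And for item 10, a tiny sketch of generating the counterbalanced orderings mentioned there (the three cyclic rotations of A, B, C plus their reflections) and assigning them to six hypothetical cities:

    # Counterbalanced orderings of three conditions (e.g., campaigns A and B
    # plus a control C): the three cyclic rotations and their reversals.
    conditions = ["A", "B", "C"]

    def rotations(seq):
        """All cyclic rotations of a sequence, e.g., ABC, BCA, CAB."""
        return [seq[i:] + seq[:i] for i in range(len(seq))]

    orders = rotations(conditions)
    orders += [list(reversed(o)) for o in orders]   # the "reflected" versions

    for city, order in enumerate(orders, start=1):
        print(f"City {city}: {' -> '.join(order)}")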


That’s about all I have time for right now.  Again, I’m curious not only about any thoughts on this particular hypothetical study—admittedly a kind of silly toy example, but perhaps instructive—but also about the bigger issue of whether CFI or similar organizations might ever undertake large-scale social experiments like this to gather scientific evidence about how they might more effectively accomplish their stated missions.


Cheers.

Adam