It’s the elusive golden chalice of evaluation: proving FOR SURE that a SINGLE program made a REAL difference for those served. To meet such a high standard of evidence, evaluators turn to experimental methods. The primary experimental method is the randomized controlled trial – or RCT. An RCT starts by taking a group of eligible people and randomly assigning them to either a “treatment” group that receives services or a control group that does not. You then gather relevant quantitative data (e.g., surveys, census data, administrative records) before and after your program and compare how your treatment group changed with how the control group changed; a minimal sketch of this assign-and-compare logic appears after the list below. This is the closest we in the social sciences get to putting anything in a petri dish since our last biology class. Why would you make such efforts to fit people and their lives into a strict scientific structure? Done correctly, such a study can answer that golden question: it can provide the strongest possible evidence that your program really did make a difference for participants. However enticing this prospect, the evidence is only as strong as the study’s faithfulness to the scientific inquiry process, and this method poses serious challenges:
  • The program must be implemented without change during the study period. This makes the approach an ill fit for those who wish to continually evolve their program and/or take a participatory approach to program design and implementation (since the design is settled before participants are enrolled).
  • The population must be large enough to support statistical analysis of its results.
  • You must truly randomly assign your groups. Random assignment can present ethical challenges: you will need to effectively deny services to people who qualify. You cannot allow people to self-select into the program, and your study may be compromised if participants know whether they have been (or will be) selected (particularly if you are attempting a “double-blind” study, which is typical in medical research).
  • You must have a full understanding of the context of your study participants’ lives (both treatment and control groups) to know you are isolating YOUR program’s impact. Related events, such as a policy change or a concurrent program, may affect the indicators you are tracking and make it impossible to truly attribute any change in those indicators to your own program.
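To make the assign-and-compare logic concrete, here is a minimal sketch in Python with made-up numbers. The indicator, sample size, and effect size are all hypothetical, and a real study would collect the “post” measurements through surveys or records rather than simulating them:

    import random
    from statistics import mean

    rng = random.Random(0)

    # 1. Start with the eligible population, measured once before the program
    #    (hypothetical baseline scores on whatever indicator you track).
    eligible = [{"id": i, "pre": rng.gauss(50, 10)} for i in range(200)]

    # 2. Randomly assign each person to the treatment or control group.
    rng.shuffle(eligible)
    treatment, control = eligible[:100], eligible[100:]

    # 3. After the program period, measure the same indicator again.
    #    (Simulated here; a real study would use survey or records data.)
    for person in treatment:
        person["post"] = person["pre"] + rng.gauss(5, 5)   # hypothetical program effect
    for person in control:
        person["post"] = person["pre"] + rng.gauss(0, 5)   # change without the program

    # 4. Compare how much each group changed, on average.
    def avg_change(group):
        return mean(p["post"] - p["pre"] for p in group)

    effect = avg_change(treatment) - avg_change(control)
    print(f"Estimated program effect: {effect:.1f} points")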
Given these challenges, we often feel there are evaluation designs that make better use of resources and more closely align with organizational strengths and values. These can include quasi-experimental methods (which use statistics or other tools to approximate the results of an RCT), qualitative methods, and growth models. These methods require less adaptation of your program design, but they do require you to have a strong, complete dataset. As ever, the Improve Group always suggests starting any evaluation design with careful thinking about your questions, the purpose of your evaluation, and your resources, rather than deciding on your methodology first.
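As an illustration of how a quasi-experimental method can approximate an RCT without random assignment, here is a rough sketch of one common technique: matching each participant to a similar non-participant on a baseline measure, then comparing changes. The field names, single matching variable, and numbers are hypothetical, and real quasi-experimental designs typically match on many characteristics at once:

    import random
    from statistics import mean

    def matched_effect(participants, comparisons):
        """Match each participant to the closest unmatched non-participant
        on the pre-program score, then compare pre-to-post changes."""
        used, diffs = set(), []
        for p in participants:
            pool = [c for c in comparisons if c["id"] not in used]
            match = min(pool, key=lambda c: abs(c["pre"] - p["pre"]))
            used.add(match["id"])
            diffs.append((p["post"] - p["pre"]) - (match["post"] - match["pre"]))
        return mean(diffs)

    # Hypothetical records: people who happened to join the program vs. a
    # larger pool of similar people who did not.
    rng = random.Random(1)
    joined = [{"id": i, "pre": rng.gauss(50, 10)} for i in range(50)]
    did_not = [{"id": 100 + i, "pre": rng.gauss(48, 10)} for i in range(150)]
    for person in joined:
        person["post"] = person["pre"] + rng.gauss(5, 5)   # hypothetical program effect
    for person in did_not:
        person["post"] = person["pre"] + rng.gauss(0, 5)   # change without the program

    print(f"Matched estimate of program effect: {matched_effect(joined, did_not):.1f}")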
