Evaluating Crime Prevention through Social Development Projects: Handbook for Community Groups


Acknowledgements

Evaluating Crime Prevention through Social Development Projects was developed as part of a train-the-trainer program to equip program staff at the National Crime Prevention Centre (NCPC) with the knowledge and resources needed to encourage and enhance the evaluation capacity of community groups involved in crime prevention through social development projects.

The Trainer's Guide and accompanying Handbook for Community Groups were developed by Mary Sehl, Senior Evaluation Analyst, National Crime Prevention Centre. As with most projects, this project could not have happened without the advice and support of many others. Special thanks are due to Colleen Ryan, now with Health Canada, for identifying the need for this initiative and to members of the project's advisory committee for their valuable direction:

Special thanks are also due to Brin Sharp, whose training and advice in adult education techniques are reflected throughout the Trainer's Guide, and to Susan Howe for the energy and support she brought to this initiative and for her help in planning and co-facilitating a pilot training session in Vancouver. Her creativity is reflected in many of the exercises used in the training workshops.

Thank you also to all members of the NCPC evaluation unit for their comments and advice on the training package and for the energy they have brought to delivering the training to NCPC program staff. Particular thanks go to Muguette Lemaire, Senior Evaluation Analyst, Montreal, for her extensive help with the French version of the Trainer's Guide and Handbook for Community Groups and to Carolyn Scott for taking a lead role in planning the evaluation of this initiative. Tim Peters, Antoine Bourdages and Wayne Stryde also deserve much appreciation for their direction and ongoing support for this project as members of NCPC's management team.

Mary Sehl
Senior Evaluation Analyst
NCPC

Why evaluation training?

The National Crime Prevention Centre (NCPC) sees evaluation as a tool for project management and learning. Evaluation is not done simply to prove that a project worked, but also to learn about and improve the way it works.

Community groups interested in applying for NCPC funding to support their crime prevention through social development project will be expected to play different roles in evaluation depending on the stream of funding they receive.

Projects funded through the Crime Prevention Action Fund (CPAF) develop innovative ways to prevent crime. The CPAF helps people working at the ground level undertake activities that deal with the root causes of crime. It aims to build partnerships among the policing, community health, voluntary, and private sectors to enhance community capacity to prevent crime through social development. It helps community groups make their crime prevention efforts more sustainable and increase public awareness of and support for crime prevention activities.

Projects funded under CPAF conduct evaluations to:

The Policing, Corrections and Communication Fund (PCCF) supports projects where community partners work together to prevent crime primarily through social development. It is intended for law enforcement agencies, community corrections groups/organizations, Aboriginal communities, community-based organizations and the municipalities in which they work.

If you have received or are interested in funding through the Crime Prevention Action Fund or the Policing, Corrections and Communication Fund, this training will help to improve your ability to develop a sound project plan and to conduct a credible evaluation.

The Research Knowledge and Development Fund (RKDF) supports a range of research activities, demonstration projects, knowledge transfer initiatives and evaluations that identify and analyze gaps in the current body of knowledge related to crime prevention in Canada; create new knowledge in areas where gaps have been identified; synthesize the results of existing research; and contribute to a growing awareness and recognition of promising practices and models for community-based crime prevention. Projects are intended to demonstrate what works and what is promising in reducing the risk factors associated with crime and victimization. Third-party evaluators are hired to conduct rigorous evaluations of these projects in order to identify the costs, benefits, and overall effectiveness of innovative efforts to prevent crime.

Project management and staff will work closely with the third-party evaluator, and in most cases will be involved in the collection of information for the evaluation. If you are interested in the RKDF, this training is an important way to improve your understanding of evaluation and your ability to work with an evaluation contractor.

For more information about NCPC funding programs, see the National Crime Prevention Strategy web site at www.publicsafety.gc.ca/ncpc.

Organization of this Handbook

This Handbook is organized into seven chapters that correspond to the seven modules of the Crime Prevention through Social Development Evaluation Training package. The end of each chapter provides a glossary of terms used in the chapter and a list of resources relevant to the topics covered. Worksheets used in the training sections are provided at the close of each chapter.

We hope you find the handbook a helpful reference during the training sessions and long after you have completed them. We encourage you to make use of the resources in the resource section of each chapter as you plan evaluations of your crime prevention projects.

Module 1: An overview of evaluation

Learning Objectives

Why should we care about evaluation?

If we care about preventing crime and victimization in Canadian communities, it only makes sense to care about what works in reducing crime and victimization. The only way to know this for sure is to invest in evaluation.

Why don't we evaluate?

Time

We know that time is especially a problem for community groups that are operating on shoestring budgets. It takes time to plan an evaluation, implement it, analyze data, report the results, and review their implications for project activities. The good news is that much of this work is part of good project management and can be integrated into daily activities.

Money

You may feel that the costs devoted to evaluation could be better spent on project activities. It is true that, at a minimum, evaluation requires costs in staff time. Rigorous third-party evaluation can cost a lot more. Local university or college faculty members and students can sometimes provide free help as part of student internships or projects.

Expertise

Your group may have little experience in planning projects that are eligible for government funding. We hope to show you how the knowledge and skills needed to plan projects are similar to those needed to plan evaluations.

Evaluation has a reputation for being complex and requiring outside expertise. While sometimes expertise is needed to conduct statistical analyses or to help determine how to answer evaluation questions, simpler evaluations can be done in-house. We'll talk more about this in this section of your handbook.

Intrusiveness

To answer questions like "Who are we reaching?" or "Did the project result in changes in attitude or behaviour?", evaluations ask questions about people's life experiences, their attitudes, and their behaviours.

We have found that project staff are often more concerned about the intrusiveness of these questions than are the participants in their projects. It is important to remember that participation in an evaluation should always be voluntary. Participants should always be told they can refuse to answer questions or can end their participation in the evaluation at any time without affecting their involvement in project activities.

We already know the project is effective

You probably have a lot of stories or anecdotes that have proved to you the effectiveness of the project you are planning. You might feel this is more than enough evidence to prove the planned activities are effective. Evaluation helps to provide an evidence base so others can also be convinced.

Philosophy

You may feel the work you do cannot be quantified in numbers or described in a simple "linear" way. Projects often have many parts. They can affect participants in subtle, unanticipated ways. It's true that evaluations sometimes fail to capture the complexity of project activities and the ways in which they work. Adding evaluation questions that give participants and staff an open place to tell their stories can ensure these aspects are captured.

Long-term change vs. short-term funds

It seems contradictory. On the one hand, the National Crime Prevention Centre (NCPC) provides funds for only a short time; on the other, it recognizes that change often takes a long time to occur. While it is ideal to track change over the long term, you do not need to: if you can make a strong argument for why your project's short-term outcomes are likely to lead to long-term change, you can focus on measuring the short-term changes.

This Handbook for Community Groups and the accompanying training sessions will show you how to develop a logic model. A strong logic model that shows how project activities will lead to short- and long-term outcomes and how they all link together demonstrates to others how the short-term changes your project accomplishes can lead to further changes long after the project ends.

Fear

It's natural to worry that a negative evaluation might mean your group will not be able to get further funding. But projects often get better as a result of evaluations that show how some of their results could be improved. Evaluation is a good way to show funders you are interested in continuous improvement.

Why evaluate?

Decision making, managing the project

Evaluation is part of good management. It doesn't have to involve a lot of time or money, but some time and some money should be devoted to evaluation if you want to manage your project effectively. How ambitious your evaluation will be is likely to depend on the size and budget of your project.

You are probably already doing some kind of evaluation, at least in informal ways. You might be asking questions about participant satisfaction. Or you might be assessing the need for additional staff.

Evaluation can answer questions that need to be answered in order to ensure good project management. For example, you might ask:

Project improvement

If your group has been involved in other community projects, you have probably made changes to improve aspects of these projects over time. For example, you may have changed the location or time in order to improve access. You may have engaged a new partner to increase referrals. You may even have evaluated these changes to learn if they made a difference. Documenting what worked can help others to learn from your project. It can help to improve not just your project, but also other projects in your community or across Canada.

Did the project work?

If we really care about crime prevention, we'll want to know that what we do works. After all, why invest our time and money in something that, in the end, isn't making a difference?

If it's hard to document whether a project prevents crime or reduces victimization, we can often show how it reduces factors associated with crime and victimization (risk factors) or increases factors that help to prevent crime or to reduce victimization (protective factors). We'll also want to know how it works so that others can copy it.

Unanticipated outcomes

Sometimes projects have effects we never predicted. These can be good or bad. For example:

Accountability

Taxpayers want to know their money is spent wisely. Government needs to be accountable for the dollars it spends on community projects. Failure to document whether these projects make a difference results in questions from the Auditor General, politicians, and ultimately, fellow Canadians.

Too often we forget that we also need to be accountable to participants in community projects. They deserve to have the opportunity to express their views about what works or doesn't work and to learn from evaluation reports about the ability of programs to achieve their intended outcomes. Front-line staff often express concerns that evaluations ask too many questions of participants and that these questions are too intrusive. These are legitimate concerns. But we often find that when participants are approached in a positive way and given an overview of the purpose of the evaluation and their role in it, they are excited about their role as "research assistants." They are interested in "what works."

Public relations/fundraising

Strong project results are the best tool to promote your project and to encourage others to donate money or provide resources to sustain it.

What is evaluation?

Evaluations do not necessarily do all of the things listed above. Some focus more specifically on reviewing the project's development and examining project activities to assess whether the project is being offered in the way it was intended (process evaluation). Others focus more on the last three points and assess whether the project achieved its intended outcomes (outcome evaluation).

Types of evaluation

Needs assessment

A needs assessment is used to learn what the people or communities that you hope to reach might need in general or in relation to a specific issue. For example, you might want to find out about safety issues in your community, about access to services, or about the extent to which your community is dealing with a certain type of crime or a certain form of victimization.

Resource assessment

A resource assessment is used to assess the resources or skills that exist among the people or communities with which you hope to work. It is often conducted alongside a needs assessment. Resource assessments identify the skills that community members can contribute to a project and resources such as community space, in-kind and financial donations, volunteer time, and other attributes that can be tapped by your crime prevention project.

Evaluability assessment

An evaluability assessment is done to determine whether a project is ready for a formal evaluation. It can suggest which evaluation approaches or methods would best suit the project.

Project monitoring

Project monitoring counts specific project activities and operations. This is a very limited kind of evaluation that helps to monitor, but not assess, the project.

Formative

Also known as process evaluation, a formative evaluation tells how the project is operating, whether it is being implemented the way it was planned, and whether problems in implementation have emerged (for example, it might identify that a project is reaching a less at-risk group than it intended, that staff do not have the necessary training, that project locations are not accessible, or that project hours do not meet participant needs).

Outcome

An outcome evaluation examines the extent to which a project has achieved the outcomes it set at the outset.

Summative

Summative evaluations examine the overall effectiveness and impact of a project, its quality, and whether its ongoing cost can be sustained.

Cost-effectiveness

A cost-effectiveness study examines the relationship between project costs and project outcomes. It assesses the cost associated with each level of improvement in outcome.

Cost-benefit

Cost-benefit analysis is like cost-effectiveness analysis in that it looks at the relationship between project costs and outcomes (or benefits). But a cost-benefit study assigns a dollar value to the outcome or benefit so that a ratio can be obtained to show the number of dollars spent and the number of dollars saved. A well-known cost-benefit analysis was done of the Perry Preschool initiative in the United States. It concluded that for every one dollar spent, more than seven dollars were saved (Barnett, 1993, cited in Schweinhart, 2002).
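If it helps to see the arithmetic, here is a minimal sketch in Python; the dollar figures are hypothetical and were chosen only to mirror a roughly one-to-seven ratio like the one reported above.

    # Hypothetical figures for illustration only; substitute your project's actual
    # costs and the dollar value you assign to its outcomes (benefits).
    program_cost = 15000         # dollars spent per participant
    monetized_benefits = 105000  # dollar value of outcomes, e.g. avoided justice and victim costs

    benefit_cost_ratio = monetized_benefits / program_cost
    print(f"Each dollar spent returns about ${benefit_cost_ratio:.2f} in benefits")  # about $7.00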

Some major approaches

External evaluation

This approach employs an external evaluator (a third party or person/organization not previously associated with the project being evaluated) to conduct the evaluation. Using an evaluator who is not part of the organization being evaluated increases the perceived objectivity of the results. External evaluators may be used in all of the approaches described below. Outside contractors are often hired to facilitate participatory or empowerment evaluations.

Utilization-focused

This approach focuses on what project managers and staff need to know to assist with project decision making and improvement.

Participatory

This is a method that involves participants in all aspects of the evaluation, from identifying the evaluation questions to deciding what information to collect, how to do it, and how to interpret the findings.

Empowerment

This is an approach that uses evaluation concepts, techniques, and findings to help community groups improve their programs and services. The evaluator acts as a coach or facilitator to help project staff and participants through a process of self-evaluation and reflection. Empowerment evaluation follows three steps: a) establishing a mission or vision statement, b) identifying and prioritizing the most significant program activities and rating how well the program is doing in each of those activities, and c) planning strategies to achieve future project improvement goals (Fetterman, 2002).

The approach you choose for your evaluation will depend on the evaluation's purpose. If you wish to learn ways to improve the services you offer, a utilization-focused or an empowerment approach might be appropriate. If you want to convince outside organizations that you are having a positive impact on participants, an external evaluator will help to assure objectivity.

The basic steps of evaluation
Steps (illustrated with the STOP Fraud Against Seniors project as an example)

Identify goals (anticipated outcomes)
  • Reduce the incidence of fraud against seniors
  • Increase partnerships between seniors' organizations, police, crime prevention organizations, and business associations
  • Increase public awareness of fraud against seniors
  • Increase seniors' knowledge of practices that reduce vulnerability to fraud
  • Ensure the project's sustainability by the end of the funding period
Describe the project (project activities)
  • Develop a coalition of seniors' organizations, police, neighbourhood groups, local businesses, municipal recreation centres, and seniors' housing
  • Offer a series of workshops for seniors on fraud and on strategies to prevent victimization
  • Train volunteer participants to deliver the workshop series and to form a speakers' bureau
  • Develop public awareness activities directed at seniors and their families, including public service advertisements, a STOPfraudagainstseniors.com web site, fridge magnets, and the speakers' bureau
Identify what you want to know (evaluation questions)
  • Was the project carried out as planned?
  • Did the project reach seniors identified as most vulnerable to victimization?
  • Was the project successful in achieving its objectives?
Identify data sources and data collection tools
  • Partners will be surveyed to determine their satisfaction with the project and its relevance to their work
  • A random sample of 100 community members will be asked to participate in a telephone survey before and after the public awareness campaign to assess awareness of fraud against seniors
  • Workshop participants will complete pre-post questionnaires about strategies to reduce fraud victimization
  • Police reports of fraud against seniors will be analyzed
  • The number of volunteer speakers and the number of workshops and talks delivered will be monitored
Collect the information
  • Student interns will assist with data collection
Organize the information
  • A contractor will enter the data into a database
Analyze the data (the contractor will analyze the data for):
  • Partner satisfaction
  • Pre-post change in community awareness, seniors' knowledge, and police reports
  • Program outputs
Report the results, identify next steps
  • Final report to funders
  • Fact sheet on evaluation results to partners and community members
  • Community forum with seniors' associations

When to bring in evaluation professionals

Community groups often think evaluation requires the services of an expert outsider. While expert help is sometimes needed, it's not always required. Projects funded under the Crime Prevention Action Fund (CPAF) often manage their evaluations themselves. Some choose to contract with an outside evaluator on a short-term basis to undertake key activities. For example, they may hire an evaluator to help them identify or develop appropriate data collection instruments, to develop a database, or to analyse evaluation data.

Projects funded under the Research and Knowledge Development Fund (RKDF), on the other hand, always rely on outside evaluators to ensure a rigorous and objective assessment of their project's effectiveness.

This series of workshops is intended to help you to better understand the basic steps of evaluation. We hope you'll see evaluation as an ongoing part of good project management.

Of course, there are times when you will not have the evaluation knowledge or the time and resources needed to conduct your own evaluations. Here are some situations in which an outside evaluator might be useful:

If you decide to hire an external evaluator, think about using your time with the evaluator as a learning opportunity. Consider adding to the evaluator's contract a requirement that he or she prepare you to use evaluation as an ongoing practice to manage your project effectively.

Glossary of terms

Analyze (data)

Analyzing data involves bringing some sense or meaning to the information you have collected. In the case of qualitative data, this might involve categorizing the information you collected into themes that summarize what was said. In the case of quantitative data, descriptive statistics and, in some cases, statistical tests are used to provide meaning to raw numbers. This might involve, for example, identifying the mean or average response, the range of responses from highest to lowest, or the statistical likelihood that a change in scores over time is due to more than just chance. More information about analyzing data is provided in Module 6 of this Handbook.

Comparison (or control) group

Community-based research usually refers to a comparison group rather than a control group, the term more often used in experimental research. A comparison group is a group of participants who have characteristics similar to those of participants in the program or project being evaluated, but who are not exposed to the project activities.

Data

Data is another word for information that is collected to provide knowledge or insight into a particular issue.

Evaluability assessment

An evaluability assessment is a way of assessing whether a project is ready for a formal evaluation. It can suggest which evaluation approaches or methods will best suit the project.

Experimental group

An experimental group is a group of people who participate in an intervention (or program). The results for this experimental group can be compared to those of a comparison group who do not receive the intervention. The comparison group should have similar characteristics to those of the experimental group, except that they do not receive the intervention under study. The difference in results between the two groups is then measured.

Formative evaluation

Formative evaluation assesses the design, plan, and operation of a program. It reports on whether the project is being implemented the way it was planned and whether problems in implementation have emerged.

Logic model

A logic model is a way of describing a project or program. It is a tool to help in project planning and evaluation. A logic model describes the resources and activities that contribute to a project and the logical links that lead from project activities to the project's expected outcomes. Logic models are often depicted as a flow chart that includes the project's inputs, activities, outputs, and outcomes.

Needs assessment

A needs assessment is a way to collect and analyze information about the needs of local communities or groups in general or in relation to specific issues.

Outcome evaluation

Outcome evaluation assesses the short- and long-term outcomes that result from participation in a project or program.

Pre-post testing

Pre-post testing involves administering the same instrument before and after an intervention or program.

Process evaluation

A process evaluation reviews project development and examines project activities to assess whether the project is being offered in the way it was intended and to identify areas where project administration and delivery can be improved.

Random sample

A random sample is made up of individuals who have an equal opportunity of being selected from a larger population. Whether any one individual from the larger population is selected for the sample is determined by chance.

Resource assessment

A resource assessment is used to assess the resources or skills that exist among the people or communities with which a project plans to work.

Sample

A sample is a subgroup of a larger population. It is studied to gain information about an entire population.

Summative evaluation

A summative evaluation examines the overall effectiveness and impact of a project, its quality, and whether its ongoing cost can be sustained.

References

Einspruch, E.L., & Deck, D.D. (1999, November). Outcomes of peer support groups. Retrieved March 16, 2004, from
http://www.rmcorp.com/Project/PIeval/Peer.pdf

Fetterman, D. (2002). Collaborative, participatory, and empowerment evaluation. Retrieved March 16, 2004, from
http://www.stanford.edu/~davidf/empowermentevaluation.html

Ottawa Police Services. (2001, August). You can do it: A practical tool kit to evaluating police and community crime prevention programs. Retrieved March 16, 2004, from http://dsp-psd.communication.gc.ca/Collection/J2-180-2001E.pdf

Schweinhart, L.J. (2002, June). How the High/Scope Perry Preschool study grew: A researcher's tale. Phi Delta Kappa Center for Evaluation, Development, and Research, Research Bulletin No. 32. Retrieved March 16, 2004, from
http://www.highscope.org

Suggested resources

Websites

Bureau of Justice Assistance Evaluation
Evaluation Strategies for Human Services Programs
http://www.bja.evaluationwebsite.org/html/documents/evaluation_strat

This website provides a "Road Map" which answers the following questions: What is evaluation? Why do we conduct evaluation? What types of programs are evaluated? When do we evaluate?

Centre for Substance Abuse Prevention
Prevention Pathways
http://pathwayscourses.samhsa.gov/samhsa_pathways/courses/index.htm

This website offers free tutorials on various evaluation topics. "Evaluation for the Unevaluated 101" is an excellent introductory course that addresses the main components of evaluation and why evaluation is important.

United Way of America
Outcome Measurement Resource Network
http://www.unitedway.com

This website is a good starting point to learn the basics of outcome measurement. It includes an introduction to outcome measurement and a discussion of why it is important.

Guides and Manuals

Annie E. Casey Foundation
When and How to use External Evaluators
http://www.aecf.org/publications/data/using_external_evaluators.pdf

This publication reports on various issues related to hiring an external evaluator. It includes questions to use when interviewing external evaluators and suggestions for managing evaluation contracts.

Health Canada
Guide to Project Evaluation: A Participatory Approach
http://www.phac-aspc.gc.ca/ph-sp/phdd/resources/guide/index.htm

Chapters One and Two of this guide provide a basic introduction to evaluation. The remainder of the guide provides useful advice for data collection, analysis, and reporting.

U.S. Department of Health and Human Services - Administration for Children and Families
The Program Manager's Guide to Evaluation
http://www.acf.hhs.gov/programs/opre/other_resrch/pm_guide_eval/reports/pmguide/pmguide_toc.html

The Program Manager's Guide consists of nine chapters that address the purpose of evaluation and its main components. An additional feature of this guide is a discussion about hiring and managing external evaluators.

W.K. Kellogg Foundation
Evaluation Handbook
http://www.wkkf.org/Pubs/Tools/Evaluation/Pub770.pdf

This handbook introduces evaluation as a practical and useful tool, and assists the user in creating a blueprint of evaluation.

Textbooks

Research Methods Knowledge Base
Introduction to Evaluation
http://www.socialresearchmethods.net/

This on-line textbook introduces the user to evaluation, its basic definitions, goals, methods, and the overall evaluation process. It includes answers to frequently asked questions about evaluation.

Newsletter

Centre for Community Enterprise
Making Waves, "The 'Who' of Evaluation," Vol. 11, No. 2.
http://www.cedworks.com/waves03.html

This article addresses the issues involved in using an outside evaluator and recommends use of a combination of internal and external expertise.

Module 1 Worksheets

Worksheet #1

What is in a name? How many evaluation terms can you find?

Worksheet #2

Why should we care?

Why should we care about crime prevention in your community?

Module 2: Setting the stage for evaluation – Preparing a logic model

Learning Objectives

Step 1: Identify project goals (outcomes) and who you intend to serve

A good project plan clearly identifies your goals or outcomes and the population you plan to serve. It tells others where you are headed. (In keeping with the planning tools included in the application guide for the Crime Prevention Action Fund, we're using the words "goals" and "outcomes" interchangeably.)

Presenting the project's logic
Step 1 Examples
Goals (anticipated outcomes) – What you expect the project to accomplish or change
  • Reduce the incidence of crime against seniors
  • Reduce seniors' vulnerability to common frauds and scams
Priority group – Who you intend to serve
  • Seniors living in community "A"
  • Seniors from specific ethno-cultural groups

This is the first step in presenting the project's logic.

This training module shows how to develop a project plan, which will also serve as the first part of the evaluation plan. The two fit together. In the next training module, we will explain how to develop the rest of the evaluation plan.

Your knowledge of community needs and resources will help you to identify the goals of your project and the group of people it will serve. It is best to base your knowledge of community needs and resources on an objective assessment. You may already be familiar with needs assessments and resource assessments. These are research and project management tools that can help you plan your project. We have included some resources on needs assessments at the end of this chapter.

If you represent an agency or service that is planning a project, be sure to include members from the community you hope to serve at the project planning stage. You will want to know:

Bringing together human service agencies and community members to discuss their unique perspectives can result in a stronger project.

When you bring everyone together, we suggest you give them hints about writing good project goals. We've listed some below.

Hints for developing project goals

Saying that a project goal is "to provide recreational opportunities" does not tell us anything about the purpose of those recreational activities or the changes they are expected to bring about. Programs are developed to make change, not simply to deliver products or services.

Goals stating that these recreational opportunities will increase teamwork and leadership skills, or reduce vandalism in the after-school hours, are what we call SMART goals.

Be SMART

Specific

Is the goal/outcome specific? Is it clear? If you want to increase community safety, specify the particular changes you are trying to achieve to increase safety. Mention the particular group you are targeting – such as seniors, children or youth – and the particular issue you are trying to change. Here are some sample goals related to community safety:

Measurable

Will you be able to measure (see) change? Will you be able to answer whether or not you achieved your goal? For example, how would you measure "improved partnerships"? Consider rewriting this goal to specify what changes will take place. Here are some examples that are easier to measure:

Achievable, attainable

Will the project be able to achieve the outcomes it set out? It may not be realistic to set "reduced crime" as a goal or outcome if the project activities focus on increasing coordination of services or improving awareness of a particular issue.

Relevant, realistic

Does the goal mean something to people involved in the project? Be realistic about what you can do, keeping in mind the resources available to you.

Trackable, timely

Don't set long-term goals for a short-term project. Focus on something that can be completed within the project period.

Goals/Outcomes: Indicating the direction of change

Here are some examples of the kinds of words that goals or outcomes should include. They are action words that indicate the direction of change – that is, whether something will be reduced or increased.

What's next? Project components, inputs, activities, & outputs

Now that you have completed Step 1, you can fill in the remaining steps to complete your project plan. These steps are sandwiched between the priority group and the achievement of your goals or final outcomes. They show how you will accomplish the goals you have set. They are the key components of your project's logic model. Each step naturally leads to the next. Although the outcomes come last in the logic model, they are identified up front in order to show where we're headed.

Logic model

Splash and Ripple

Take a look at the Splash and Ripple Primer located at: http://www.ucgf.ca/English/Downloads/RBMSept2003.pdf (PLAN:NET Ltd., 2003).

It provides a wonderful metaphor to help you remember the key components of a logic model and how they fit together.

It talks about a person standing over a pond and holding a rock. When the person drops the rock in the pond, it creates a splash and then a series of ripples. If we liken this image to the steps in developing a project plan or logic model:

Control decreases as the ripples spread, just as it does as we move toward longer-term outcomes. Influences other than the project are more likely to intervene as time passes. We can contribute toward the longer-term outcomes, but we can rarely control them.
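To make these components concrete, here is a rough sketch of a logic model written out as a simple Python data structure, using the STOP Fraud Against Seniors example from Module 1; the specific entries are illustrative assumptions, not the handbook's own figure.

    # A hypothetical logic model expressed as a dictionary. Each key is one
    # component of the model; the splash-and-ripple image is noted in comments.
    logic_model = {
        "inputs": ["NCPC funding", "project coordinator", "volunteer time"],       # the rock (drop)
        "activities": ["fraud-awareness workshops for seniors"],                   # falling into the pond
        "outputs": ["12 workshops delivered", "200 seniors attend"],               # the splash
        "short_term_outcomes": ["seniors' knowledge of common scams increases"],   # inner ripples
        "intermediate_outcomes": ["seniors adopt practices that reduce their vulnerability to fraud"],
        "long_term_outcomes": ["reduced incidence of fraud against seniors"],      # outer ripple
    }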

What is a logic model?

A logic model is a way of describing a project. It describes what goes in and out of your project. It answers questions in five areas:

Some sample outcomes:

Short term

Intermediate

Long term (impact)

Why develop a logic model?

Because your project is likely to change as a result of all kinds of influences, you should review your logic model regularly to ensure it continues to reflect your project's goals, activities, and anticipated outcomes.

The logic model

The logic model shown on the previous page is just a sample of what a logic model might look like. Logic models can be depicted in chart form, as on Worksheet #3, or as a flow chart, as shown on the previous page.

The flow chart helps to show how the various parts of the logic model link together. These links are an important part of the logic model. They show the logic between the different parts of the project. You should have a rationale to explain why each activity you plan is likely to lead to a particular outcome or outcomes. If a combination of activities results in a particular outcome, the lines in the flow chart should reflect that logic.

The following checklist can help you check how well you're doing in preparing your logic model.

Logic model check list

Glossary of terms

Evaluability assessment

An evaluability assessment is a way of assessing whether a project is ready for a formal evaluation. It can suggest which evaluation approaches or methods would best suit the project.

Input

Inputs refer to the resources invested in the delivery of a program or project. Sample inputs include funding, human resources (both paid and volunteer), equipment, or services. Inputs may be funded through a project budget or provided in-kind by project partners or volunteers.

Logic model

A logic model is a way of describing a project or program. It is a tool to help in project planning and evaluation. A logic model describes the resources and activities that contribute to a project and the logical links that lead from project activities to the project's expected outcomes. Logic models are often depicted as a flow chart that includes the project's inputs, activities, outputs, and outcomes.

Needs assessment

A needs assessment is a way to collect and analyze information about the needs of local communities or groups, either in general or in relation to specific issues.

Output

Outputs refer to the concrete results anticipated to occur after a project or activity is delivered. Examples of outputs include the number of flyers or materials distributed, the number of referrals made or workshops offered, or the number of participants who attend a particular service or activity.

Resource assessment

A resource assessment is a way to collect and analyze information about the resources within a particular community or group. Resources can include people or things that can support the community being assessed (e.g., financial resources, the skills and abilities of community members, community space, community programs or activities).

References

PLAN:NET Ltd. (2003, September). Splash and ripple: Planning and managing for results. Retrieved March 18, 2004, from
http://www.ucgf.ca/English/Downloads/RBMSept2003.pdf

Suggested Resources

Websites

Canadian Outcomes Research Institute
http://hmrp.net/canadianoutcomesinstitute/

This website offers general outcome measurement resources and a variety of resources related to logic models.

Innovation Network Online
http://www.innonet.org/

There is no charge to register on this network. It provides general guides to evaluation and an array of logic model resources. Innovation Network Online also provides an interactive Logic Model Builder that assists the user in developing a logic model.

University of Wisconsin
Program Development and Evaluation
http://www.uwex.edu/ces/pdande/evaluation/evallogicmodel.html

This website provides resources, worksheets, and examples of program logic models.

W.K. Kellogg Foundation
Evaluation Toolkit
http://www.wkkf.org/Programming/Resources.aspx?CID=281

This website provides resources on developing logic models and general resources for program evaluation.

Manuals and Guides

Innovation Network Online
Logic Model Workbook
http://www.innonetdev.org/

This workbook is available from Innovation Network Online. It provides a step-by-step process for creating a logic model. Items discussed in the workbook include goals, resources, activities, outputs, and outcomes. Additional logic model resources are provided.

Western Centre for Substance Abuse and Prevention
Building a Successful Prevention Program
http://casat.unr.edu/westcapt/bestpractices/eval.htm

This comprehensive guide illustrates program evaluation using a logic model. Topics include planning an evaluation, building a logic model, and conducting an evaluation using a logic model.

W.K. Kellogg Foundation
Logic Model Development Guide
http://www.wkkf.org/Pubs/Tools/Evaluation/Pub3669.pdf

This is a comprehensive guide to building your own logic model and includes examples and worksheets.

Module 2 Worksheets

Worksheet #1 Step 1: Developing a project/evaluation plan

Title of Crime Prevention Project:

Priority Group:

Project Goals/Outcomes:

Worksheet #2 What is wrong with these outcomes?

  1. Review the sample outcomes provided.
  2. List problems with the outcomes provided.
  3. Re-write the outcomes to correct the problems you have identified.

Worksheet #3:

Inputs → Activities → Outputs → Outcomes → Impacts
Inputs (Drop) | Activities (Fall into pond) | Outputs (Splash) | Outcomes (Ripples): Short-term, Intermediate | Impacts (Long-term outcomes) (Outer ripple)

Worksheet #4a:

Case study: Teaching young people to deal with abusive relationships

Step 1: Goals and priority group

Goals

Priority Group

The goals listed above were set for the project in the planning stage. The next step is to develop a logic model for the project. As you work through the model, you may identify more specific outcomes than the goals identified here.

Logic model check list

Worksheet #4b: Case study: Youth development project for at-risk aboriginal youth

Step 1: Goals and priority group

Goals

Priority Group

The goals listed above were set for the project in the planning stage. The next step is to develop a logic model for the project. As you work through the model, you may identify more specific outcomes than the goals identified here.

Logic model check list

Worksheet #4c: Case study: Network and coalition building

Step 1: Goals and priority group

Goals

Priority Group

The goals listed above were set for the project in the planning stage. The next step is to develop a logic model for the project.

As you work through the model, you may identify more specific outcomes than the goals identified here.

Logic model check list

Module 3: Developing an evaluation plan

Learning objectives

What have we got so far?

As you will remember from Module 2, the relationships between project goals and inputs, activities, outputs, and outcomes are outlined in the logic model by drawing lines to show how they relate to each other. The assumptions behind these relationships are not portrayed in the model, but it is a good idea to identify these assumptions in the evaluation plan. For each outcome identified, a rationale should be provided to explain why the activity is likely to lead to the particular outcome.

Evaluations of projects funded under the Research and Knowledge Development Fund prepare a theory of change that tests the assumptions made in the logic model against what is known from existing literature.

If you are interested in learning more about how to write up the assumptions behind your logic model or to check the logic of your program, check out the web site:
http://www.theoryofchange.org/html/example.html

What's next?

What should an evaluation plan include?

We completed the first two steps (shown in black) in Module 2. In this Module, we will review the remaining steps in developing an evaluation plan.

Identifying evaluation questions

Identifying indicators

What is an "indicator"?

There are two kinds of indicators:

So, an indicator must be something we expect to change or vary from the time the project begins (known as the baseline) until a later point when the project activities have taken place and are likely to have had an impact.

Indicators can focus on inputs, outputs, or outcomes, but they should be narrowly defined in a way that precisely captures what you're trying to measure. Indicators are probably the trickiest part of designing an evaluation. They should:

How to choose good indicators

Think back to the splash and ripple metaphor (PLAN:NET Ltd., 2003) we used in Module 2. If we wanted to measure what impact that drop in the pond had, what would be a good indicator? (Hint: the ripples are the outcomes).

Let's say we decided an indicator of the drop's impact would be the circumference of the outer ripple:

Some additional considerations when choosing indicators

A good reference tool to help you select indicators can be found in Splash and Ripple: Planning and Managing for Results (PLAN:NET Ltd., 2003).

Identifying information sources

Once you have identified your indicators, you will need to think about who will provide the information you need. It's best to use a number of sources of information.

Researchers often talk about the importance of triangulation. This refers to bringing together information from more than one source. For example, you might analyze results from a pre-post survey, a focus group, and a review of project files. You can then compare the results of each of these separate information sources to confirm whether they are saying similar things. If more than one source reports similar information, you can feel more confident in the validity of the results you report.

The following page provides a list of some typical sources of information and suggested ways to gather information from them:

Choosing data collection methods

How can you get the information?

When deciding what data to collect, how much to collect, and from where, avoid stretching your capacity to collect information. Develop priorities and start with information you can obtain within your organization, such as information on your project's activities and procedures. You can always amend your evaluation plan if you find something surprising that warrants further research.

Although their resources are limited, even projects funded under the Crime Prevention Action Fund should try to find some ways to measure the impact of their projects – how did the project make a difference? – and not just the process. At the proposal development stage, groups applying for CPAF funding should think about ways to build the evaluation component into their project activities. Focus groups, for example, can sometimes serve two purposes: furthering community development while, at the same time, gathering perspectives on what has worked or has not worked to date.

In all cases, it's important to ensure informed consent is provided for any information collected. If information collection will include photographs or videos of participants, always obtain participants' permission to use the photos/videos in whatever way is anticipated. When collecting information from or taking pictures of children and youth, first obtain permission from their parents.

Who should get the information?

Decisions about who should collect evaluation information will depend on a number of factors: convenience, the need for objectivity, issues related to the quality of the information collected, and the protection of confidentiality. Sometimes it will be most practical and convenient to have project staff gather information from participants. For example, when staff are already gathering intake information to better understand participant needs when entering a project, it makes sense to adapt the intake interview to include questions for evaluation purposes. In other situations, it is best to have a third party collect the information in order to reduce bias. In still other situations, participants may self-complete questionnaires. When self-completion is considered as an option, potential threats to the quality of data, such as literacy or comprehension of English or French as a second language, should be taken into account.

Closely tied to consideration of who will collect the information is the question of confidentiality. Think about how you will protect the confidentiality, and in some cases the anonymity, of those who provide the information.

When?

Some options include:

When collecting outcome information, at a minimum, you should try to gather information:

Information collected before a project or activity begins is known as baseline information. It shows what the situation was like in the community or for individual participants before the project or activity began or before individual participants entered the project or activity.

In addition to collecting information after the project is over (or after participants complete a series of project activities), it is a good idea to collect outcome information at another point six months to one year after the intervention. This will allow you to see if any of the changes found immediately after the intervention last over time. This longer-term follow-up may not be possible for small CPAF projects with budget or time limitations.

Factors to consider

  • Appropriateness
  • Acceptance by respondents
  • Resources needed for analysis
  • Credibility

Qualitative vs. quantitative data

1. Quantitative

Quantitative measures tend to look at:

Quantitative data answer questions such as: How much? How many? They can be obtained from questionnaires in the form of rated scales, checklists, or true/false questions. These methods are often used to assess changes in attitudes or behaviour.

Quantitative data can be compared across different populations, studies, or time. As an example, the new CPAF application guide has a number of multiple-choice questions. The NCPC can roll up the responses to these questions to learn more about the kinds of projects being proposed across the country, in specific provinces or communities, or over different periods of time.

Quantitative measures are often simple to administer and easy to score and interpret. However, depending on the level of sophistication of the data you collect and the analyses you want to do, you may need someone with a background in statistics to analyze your quantitative data.

Here are some examples of questions that result in quantitative data:

Rated scale:

1. Please rate your satisfaction with this program.

Forced choice or close-ended question:

2. Which services did you receive? (Check all that apply.)

True/False question:

3. Indicate whether the following statements are true or false:

  1. Jealousy is a sign of love. T F
  2. When a woman gets hit by her partner, she must have provoked him in some way. T F
2. Qualitative

Qualitative measures provide descriptive information that explains how and why things occurred. When combined with quantitative measures, qualitative measures can provide context to the results of a study. On their own, they provide rich information that can help to explore different issues, including how and why projects work the way they do.

Here are some examples of questions that result in qualitative data:

  1. How would you describe your agency's involvement in the relationship-abuse prevention project?
  2. How effective do you think the community coalition has been in raising student awareness of the warning signs of relationship abuse?
Strengths and weaknesses
Quantitative

  Strengths:
  • Easier to combine data to get overall results
  • Seen as objective
  • Analysis can be done quickly

  Weaknesses:
  • Difficult to design good questions
  • Doesn't provide in-depth information
  • Less personal

Qualitative

  Strengths:
  • Provides in-depth "rich" information
  • Easier to design questions
  • Fits within oral tradition

  Weaknesses:
  • Analysis is time consuming
  • May not be suitable for large samples
  • Difficult to combine data across participants

Considerations when choosing survey methods

The table on the following page can help you to consider the data collection method best suited to your evaluation. When "yes" is indicated for a particular option, that means it will work in the situation cited in the left column.

It's important to remember that, while telephone interviews have many advantages, they may not be the best method if many of your project participants have limited incomes and may not have phones.

Considering the data collection method best suited to your evaluation
Important Consideration | Mail-out Survey | Telephone Interview | Face-to-face Interview | Focus Group
Large sample needed | Yes | Maybe | No | No
Require high response rate | No | Maybe | Yes | Yes
Target specific groups | No | Maybe | Yes | Yes
Issues are complicated | Maybe | No | Yes | Yes
Must have open-ended questions | No | Maybe | Yes | Yes
Need to probe for details | No | Maybe | Yes | Yes
Trained interviewers are not available | Yes | No | No | No
Results required quickly | No | Maybe | Yes | Yes
Budget is limited | Yes | Maybe | No | No

Deciding whether to "sample"

Sampling is the process of choosing a subset or sample of people to study in order to make generalizations about the larger population or group.

Evaluations do not necessarily require a sampling strategy, but sampling can reduce the resources required to collect and analyze information. In cases where the number of participants is small (for example, less than 30 participants), collect information from all participants. Generally, if it is simple and inexpensive to include all participants in your study, it is best to do so. If, on the other hand, the population being studied is large, sampling can reduce the resources needed for data collection and analysis. Choosing a sample from a very large population can even help to reduce error.
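As a rough illustration of what a simple random sample looks like in practice, the short Python sketch below draws 80 names at random from a hypothetical list of 500 participants; the list and the sample size are assumptions made only for the example.

    import random

    # Hypothetical sampling frame: everyone in the larger population you want to study.
    participants = ["participant_{}".format(i) for i in range(1, 501)]  # 500 people

    # Each person has an equal chance of being chosen, so the result is a random sample.
    sample = random.sample(participants, 80)
    print(sample[:5])  # first few names drawn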

When determining your sample size, consider:

Choosing a sample

Deciding how many to include in a sample

Tables are available to help you identify the appropriate sample size for your study. A sample size table is included in You Can Do It: A Practical Tool Kit to Evaluating Police and Community Crime Prevention Programs (p. 52). (See the web site reference at the end of this module of your handbook.)

Remember to factor in the expected response rate when choosing a sample size.

Think about how many responses you will ideally need, factor in the anticipated response rate, then choose your sample size. Here is an example of how you can do this:
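The handbook's own worked example is not reproduced here, so the sketch below shows the arithmetic with assumed numbers: suppose you need about 100 completed questionnaires and expect roughly half of the people you contact to respond.

    import math

    responses_needed = 100        # completed questionnaires you want to analyze (assumed)
    expected_response_rate = 0.5  # e.g. past mail-out surveys returned about 50% (assumed)

    # Divide the responses you need by the expected response rate to get the sample size.
    sample_size = math.ceil(responses_needed / expected_response_rate)
    print(sample_size)  # 200 people should be contacted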

Analyzing the results

Your evaluation plan should propose how you will analyze the information you obtain. We discuss some basic ways to analyze quantitative and qualitative data below.

Quantitative

Analysis involves sorting out the meaning and value of the data you have collected. We will talk more about how to analyze results in Module 6 of this handbook, but we provide a very brief overview here.

First of all, don't panic! Analysis does not have to be difficult. You can do simple analyses:

See pages 4 to 8 of Module 6 of this handbook for information about how to calculate these basic statistics.
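As a small, hypothetical illustration of the kind of basic analysis Module 6 describes, the Python sketch below calculates averages, a range, and the average pre-post change for six made-up questionnaire scores.

    # Hypothetical pre- and post-workshop knowledge scores (out of 10) for six participants.
    pre_scores = [4, 5, 3, 6, 4, 5]
    post_scores = [7, 8, 6, 8, 7, 9]

    def mean(values):
        return sum(values) / len(values)

    print("Average score before:", mean(pre_scores))    # 4.5
    print("Average score after:", mean(post_scores))    # 7.5
    print("Range after:", min(post_scores), "to", max(post_scores))              # 6 to 9
    print("Average pre-post change:", mean(post_scores) - mean(pre_scores))      # 3.0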

Qualitative

Small amounts of qualitative data can be summarized to provide an overall picture of what was said. As an example, if you have a few open-ended questions on a satisfaction survey, you can simply summarize the responses. ("Open-ended" means the questions do not have forced responses, such as yes/no, true/false, or a fixed number of responses from which the respondent must select.) As you're summarizing the responses, if some responses come up time and time again, list the number of times these more frequent responses were provided. This will give a sense of how common these responses were across respondents.

If you have larger amounts of data, content analysis can be done. This is a process where patterns or themes in the data are identified, given a code or name, and categorized. If you conduct a number of interviews with open-ended questions, you will need to do this more formal kind of analysis. When you report on the analysis, list and discuss each of the major themes identified (for example, teen respondents interpreted jealousy as a sign of love), summarizing what was said about this theme and providing representative quotations that reflect what was said.

Many people prefer to manually identify categories in the data by carefully reading through the transcripts of interviews, recording in the margins the themes they identify, and highlighting the quotes that best represent these themes. They feel this process enables them to gain a closer and more in-depth understanding of the data. Others prefer to use software programs (e.g., NVIVO/NUD*IST) that can do the categorizing of themes or patterns for them.

More information about content analysis is provided on pages 9 to 11 of Module 6.
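As a rough sketch of how the counting step of content analysis can be supported with a few lines of Python (the themes, keywords, and responses below are invented for illustration, and careful reading of the transcripts is still how themes get identified in the first place):

    # Count how many hypothetical open-ended responses touch on each theme.
    responses = [
        "My boyfriend checks my phone because he loves me",
        "Jealousy just shows he cares",
        "I learned that controlling behaviour is a warning sign",
    ]

    theme_keywords = {
        "jealousy interpreted as love": ["jealous", "loves me", "cares"],
        "recognizes warning signs": ["warning sign", "controlling"],
    }

    for theme, keywords in theme_keywords.items():
        count = sum(any(word in response.lower() for word in keywords) for response in responses)
        print(theme, ":", count, "of", len(responses), "responses")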

Reporting the results

Your evaluation plan should propose how you will communicate your results. Here are some things to consider when deciding how you will report the results (Ottawa Police Services, 2001):

You should also keep in mind who your audience will be. Different methods of presentation are suited to different audiences. Some ways to present the results of your evaluation include:

Think about who should receive your evaluation findings. Don't forget to include project participants in your list. They deserve to know what you have learned.

As you plan your reporting strategy, make sure you plan to address the issues the reader or user will see as important. Check the requirements of your funder. What do they want to know?

No matter who your audience will be, use plain language. That way no one will be left out because they don't understand the jargon unique to your project or your area of expertise. Remember that participants may need a different reporting style than funders and other partners.

Finally, make sure you deliver your report on time!

Glossary of terms

Baseline

Information collected before a project or activity begins is known as baseline information. It shows what the situation was like in the community or for individual participants before the project or activity began or before individual participants entered the project or activity.

Close-ended question

Close-ended questions ask respondents to choose from a list of possible answers. A multiple-choice question is an example of a close-ended question.

Content analysis

Content analysis is the process by which patterns or themes in qualitative data are identified, given a code, and categorized (Patton, 1990).

Cultural appropriateness

Cultural appropriateness refers to the degree to which a measure is appropriate and sensitive to cultural variation. If members of a particular cultural group are not included in the validation and standardization studies used to develop an evaluation tool, the tool may not be appropriate to use with that cultural group (Ogden/Boyes Associates Ltd., 2001).

Focus group

Focus groups are one method of data collection. They normally involve less than 15 people. A facilitator asks the group a series of questions to gain their perceptions and opinions on a particular topic. Their responses are recorded.

In-depth interview

An in-depth interview is a guided conversation between an interviewer and a respondent. The interviewer asks a series of open-ended questions. The interviewer normally follows a guide, but may deviate from the guide to pursue a line of questioning relevant to a particular thought or idea.

Indicator

An indicator is information that is collected about a particular process or outcome. For example, an indicator of partner satisfaction with a project might be the number of referrals partners make to the project, the number of partnership meetings they attend, or their responses to a satisfaction questionnaire.

Informed consent

Participants in research and evaluation studies, or their guardians, should provide free (voluntary) and informed consent. This normally involves providing written consent, but other methods of recording consent may be appropriate for particular groups. Informed consent procedures should include disclosure of all information to be collected in the research; information about the nature and purpose of the research, the identity of the researcher, the expected duration and nature of participation, any potential harms and benefits of participation in the research, how the results of the research will be used and with whom they will be shared; and assurance that participants may drop out of the research or refuse to participate in any part of it without being penalized in any way. More information about the nature of informed consent is available in the Tri-Council Policy Statement: Ethical Conduct for Research Involving Humans (see http://www.pre.ethics.gc.ca/english/policystatement/policystatement.cfm).

Open-ended question

Open-ended questions allow respondents to answer in their own words rather than being restricted to a set of predetermined categories of response.

Pre-post survey

A pre-post survey involves administering the same survey instrument before and after an intervention or program.

Qualitative data

Qualitative data are a descriptive form of information presented in a non-numerical format. Qualitative data result from open-ended questions. They are normally collected in one of three different ways: a) in-depth, open-ended interviews, b) direct observation, or c) written documents (Patton, 1990).

Quantitative data

Quantitative data are numeric measurements. They tell us about quantity, frequency, intensity, and duration.

Random sample

Each member of a random sample has an equal opportunity of being selected from a larger population. Whether any one person from the larger population is selected for the sample is determined by chance.

Sample

A sample is a subgroup of a larger population. It is studied to gain information about an entire population.

Theory of change

A theory of change is a way to describe the assumptions or rationale for why a program or set of project activities is likely to lead to particular outcomes. It outlines the steps between each activity and its ultimate impact and cites theories that support the assumptions made about the links between the activity and its outcomes or impact.

Triangulation

Triangulation is a process that involves collecting information about similar questions or issues using different methods (e.g., interviews, questionnaires, focus groups) and/or different sources of information (e.g., staff and participants). Responses from various sources or methods are then compared to determine if they support or contradict each other.

References

The Aspen Institute Roundtable on Comprehensive Community Initiatives. (2003). Theory of change.org. Retrieved October 18, 2004, from
http://www.theoryofchange.org/html/example.html

Interagency Advisory Panel on Research Ethics. (1998). Tri-Council policy statement: Ethical conduct for research involving humans (with 2000, 2002 updates). Retrieved October 18, 2004, from
http://www.pre.ethics.gc.ca/english/policystatement/policystatement.cfm

Ogden/Boyes Associates Ltd. (2001). CAPC program evaluation tool kit: Tools and strategies for monitoring and evaluating programs involving children, families, and communities. Unpublished report for Health Canada, Population and Public Health Branch, Alberta/Northwest Territories Region, Calgary, AB.

Ottawa Police Services. (2001, August). You can do it: A practical tool kit to evaluating police and community crime prevention programs. Retrieved October 18, 2004, from
http://www.ottawapolice.ca/en/resources/publications/pdf/you%5Fcan%5Fdo%5Fit%5Fevaluation%5Ftoolkit.pdf

Patton, M. Q. (1990). Qualitative evaluation and research methods. London: Sage.

PLAN:NET Ltd. (2003, September). Splash and ripple: Planning and managing for results. Retrieved October 18, 2004, from
http://www.ucgf.ca/English/Downloads/RBMSept2003.pdf

Suggested Resources

Websites

Centre for Substance Abuse Prevention
Prevention Pathways

http://pathwayscourses.samhsa.gov/

This website offers tutorials on various evaluation components. "Evaluation for the Unevaluated 102" is an excellent course that provides information on developing an evaluation plan.

Community Toolbox
Developing an Evaluation Plan

http://ctb.ku.edu/tools/en/section_1352.htm

This site provides basic information on key considerations when developing an evaluation plan.

Management Assistance Program for Non-Profits (MAP)
Basic Guide to Program Evaluation

http://www.managementhelp.org/evaluatn/fnl_eval.htm#anchor1581634

This website provides information to assist in the development of an evaluation plan. Also included are key considerations when planning evaluation.

North Central Regional Educational Laboratory
Evaluation Design Matrix

http://www.ncrel.org/tech/tpd/res/matrix.htm

This matrix assists in outlining important components in evaluation planning, and is useful for developing your own evaluation plan.

United States Agency for International Development
Performance Monitoring and Evaluation Tips

http://www.dec.org/pdf_docs/pnaby215.pdf

This publication outlines the importance of performance monitoring plans, and identifies key areas to consider when planning an evaluation.

Textbooks

Trochim, William M. (2000).
The Research Methods Knowledge Base, 2nd Edition.

http://www.socialresearchmethods.net/

This is an excellent on-line textbook, which covers all topics relating to research methods, including evaluation planning and sampling.

Module 3 Worksheets

Worksheet #1

Identifying data collection methods: Case study
Columns: Indicator; Data collection method (source of information, tool/instrument used, frequency of collection); Rationale

Worksheet #2

Planning & organizing your data collection
Columns: Evaluation questions; Key indicators; Information sources; Resources needed to collect information
Sub-columns: Source; Tools to use; Frequency of collection; Dates; Persons; Time

Module 4: Data collection methods

Learning objectives

What is "data"?

Data is a Latin word for "information."

It sounds technical, but data collection is simply collecting information from people. There are some tricks to doing a good job at data collection. We'll learn more about those in this module.

Let's begin by looking at different sources of data (or information).

Official sources

Evaluators often use official sources of data to assess whether change occurs in a community over time. Below we have listed typical sources of this information for crime prevention projects, along with some of the pros and cons of using them.

Crime reports may underestimate actual crime, particularly for sensitive crimes such as domestic violence, sexual assault, and even some types of fraud where people are less likely to report crimes due to embarrassment or fear of community perceptions.

Police records should be treated with caution. Information on charges and arrests doesn't necessarily reflect actual rates of crime, and can be influenced by changes in legislation or policing policy (e.g., a crackdown on drug trafficking). Information on convictions tends to be more reliable, but is subject to other influences such as the quality of legal representation received by the accused.

Despite these problems, arrest data tend to be one of the best measures we have of recidivism.

School records can include information on absenteeism, suspensions, academic success or failure, and special needs. Permission is required to obtain school records for individual students. Some provinces report on overall rates of academic success by school. For example, the Ontario Ministry of Education and Training provides school-level information about scores on province-wide tests administered in certain grades.

Census data can be used to describe demographic characteristics of Canadians as a whole, those living within a particular province or territory, or those living in as small an area as a census tract. (A census tract is a small area with a population of 2,500 to 8,000 within a large urban centre [Statistics Canada, n.d.]). The census provides information such as family income, education level, ethnicity, religious affiliation, housing type, marital status, or number of children in a household. This information can be used to create a community profile. But keep in mind one caution: Detailed census information, particularly that at the census tract level, is often years out of date once it is made publicly available.

Public or community health departments view "health" in the broadest sense, including physical, emotional and spiritual well-being. They are concerned with health at both the individual and community level. As a result, they have an interest in community safety and crime prevention. Health departments have staff who analyze census data for your community. They also conduct their own community-level research. They can be a good source of information about your community. Ask your local public health department if it can provide resources or join your project partnership. Public health staff may be interested in helping you to conduct needs assessments or other community research.

Surveys

Pro: Surveys are relatively easy to administer.
Con: Surveys can be difficult to construct.

You have probably completed a survey at some point in your life. It might have been for the census, for a product or service, or as part of your job. Some skill is needed to develop a good survey.

Surveys can be administered by:

Telephone and in-person surveys conducted in an interview format should be done in a consistent way. They should include a standardized introduction to the survey. Respondents should always be told how the information will be used and should sign a consent form that assures them their information will be kept confidential and anonymity will be protected. Interviewers should be trained to record responses accurately. Survey forms can facilitate accuracy and ease of completion by including check boxes and probable categories of responses. For example, if the survey asks respondents about their level of education, the survey form might have potential categories that can be checked off by the interviewer, such as:

The interviewer can then check the relevant category rather than writing the response in full.

Surveys are more difficult to construct than it may first appear:

Survey tips

What is wrong with these questions?

Try to identify the problems with the questions below.

  1. On a scale of 1 to 5, how would you rate this project?
    • 1
    • 2
    • 3
    • 4
    • 5
  2. I feel safe in my community.
    • usually
    • sometimes
    • hardly ever
    • never
  3. Please check the age range that best represents your age.
    • 20-30 yrs.
    • 30-40 yrs.
    • 40-50 yrs.
    • 60+ yrs.
  4. Check the response that best represents your situation.
    • No children
    • Pregnant woman
    • Parent with children
  5. Did you learn anything new about evaluation and developing your crime prevention project?
    • Yes
    • No
  6. In what areas of the housing complex do you feel unsafe?
    • parking lots
    • stairwells
    • elevators
    • hallways
    • playgrounds
    • other (specify)
  7. This project is offered to families with children aged 7-10 in the Radcliffe neighbourhood. Please indicate which of the following groups the project should be expanded to include:
    • Families with younger children
    • Families with older children
    • Families from other neighbourhoods

Standardized tests

Don't be fooled by the word "test." There are no right or wrong answers to standardized tests. They are typically used to assess individual attitudes and behaviours. Evaluators often use them to assess changes experienced by program participants over time.

Knowledge tests can also be used to evaluate changes in knowledge that may occur as a result of an intervention.

You may already be familiar with some standardized measures such as the Rosenberg Self-Esteem Scale (Rosenberg, 1989), a self-completed measure that has been translated into many languages. You can find a copy of the Rosenberg Self-Esteem Scale at: http://www.bsos.umd.edu/socy/grad/socpsy_rosenberg.html.

Another standardized test you may be familiar with is the Child Behavior Checklist (Achenbach & Edelbrock, 1983). It comes in two versions, one completed by the child's parent and one completed by the child's teacher.

Standardized tests can also be used to assess community attributes. For example, the Sense of Community Index (Chavis, Hogge, McMillan, & Wandersman, 1986) can provide information about community cohesion. You can obtain a copy of the Sense of Community Index at http://www.capablecommunity.com/pubs/SCIndex.PDF.

Tests are "standardized" by administering them to large groups of people who, ideally, are similar to those who will complete the measure for assessment purposes. The data collected in the process of standardizing the test provide information about the way the average person completes the test. These "norms" help those who administer the test to interpret test scores. They enable the administrator to determine if a person's score is high, average, or low compared to the "norm."

Standardized tests are generally protected by copyright. Sometimes, such as in the case of the Rosenberg scale, the copyright holder allows the test to be used without cost under certain circumstances. In other cases, like that of the Child Behavior Checklist, the test must be purchased from the publisher. Some tests can only be administered by trained professionals.

Validity and reliability

As you will recall, we talked about validity and reliability in Module 3. These are especially important considerations when using assessment measures.

Validity refers to the ability of a measure to assess what it is intended to measure. "Validating" an instrument or scale normally involves administering the test along with other tests that measure similar attitudes or behaviours to a large group of people. The results are compared to determine if there is a correlation between the results of the measure being tested and the other measures already shown to measure associated characteristics. For example, the Rosenberg Self-Esteem Scale was assessed against self-reports and ratings by nurses and peers of constructs associated with self-esteem such as depression, anxiety, and peer-group reputation. When the results of the measure being tested correlate with the results of other related measures, the measure is considered to have "validity."

Reliability refers to the ability of a measure to provide consistent information over time and when completed by different groups. It is normally assessed by having the same respondents complete the test on a number of occasions and comparing the results from one occasion to the next. When a test is reliable, there is little variation in responses over time. This means the test is not likely to be swayed by changes in mood or circumstance, which is generally a good thing. It does, however, carry a risk for program evaluation: scales with very high reliability may be less likely to register change even after an intervention has occurred.
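
Both validity checks and reliability checks usually come down to a correlation. The sketch below, with entirely invented scores, correlates the same hypothetical respondents' results from two administrations of a scale; a value close to 1 suggests the scale gives consistent results. The same calculation would be used to compare a new measure against an established measure when assessing validity:

    from statistics import mean, stdev

    # Hypothetical scores from the same seven respondents on two occasions
    time_1 = [12, 18, 25, 30, 22, 15, 28]
    time_2 = [14, 17, 26, 29, 21, 16, 27]

    def pearson_r(xs, ys):
        """Pearson correlation between two equally long lists of scores."""
        mx, my = mean(xs), mean(ys)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
        return cov / (stdev(xs) * stdev(ys))

    print(f"Test-retest correlation: {pearson_r(time_1, time_2):.2f}")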

Once a test has been validated and reliability tested, it is generally not recommended that individual researchers change the wording or items in the test. Such changes could affect the instrument's validity and reliability. On the other hand, it's important to recognize that the testing of assessment measures is often done with large groups of undergraduate students who are relatively well educated and are generally middle class, white, and American. As a result, these tests may use language that is too sophisticated or include items that are culturally based, making them inappropriate for participants in community settings. If possible, it is best to use scales that have been tested across a range of cultures and socioeconomic classes. When these are not available, try pilot testing the scales you hope to use with your participant population. This will give you an idea as to whether the test is a good fit for your participants.

Testing tips

Many people view standardized tests as the most objective way to measure knowledge, attitudes, and behaviours. The fact that standardized instruments are often widely tested and carefully researched supports this view. But there are some important factors that can influence the credibility of results on standardized tests.

In-depth interviews

In-depth interviews tend to be more qualitative than surveys, but the same rules for developing questions apply.

An interview guide can help to direct the interviewer. The interview guide looks a bit like a survey, with the questions the interviewer will ask listed in the order they will be asked. Some interview guides are more flexible than others. They provide general guidance to the interviewer, but he or she is free to expand the list of questions to pursue a particular line of thought or story shared by the respondent. Other interview guides are more restrictive, limiting the interviewer to the questions included in the guide.

Depending on the nature of the interview, the interview guide might include blank spaces for the interviewer to record responses directly on the guide. Forced-choice responses are sometimes listed so the interviewer can simply check the response provided. Sometimes the interviewer is directed to read a list of possible responses to the respondent. Other times the interviewer is asked not to read the responses, but is directed to check the category that best represents the response provided by the respondent. Interview guides sometimes suggest probes to be used when respondents have trouble providing an answer.

What's a probe?

A probe is used to prompt the respondent to provide more information. Some ways to do this include:

Practicing with other interviewers will help ensure each of you interprets the questions consistently.

We recommend that you pilot test the interview guide with a small number of people to identify potential problems. This can help to point out problems related to the interpretation of questions or to the order or the wording of questions. It will also give you an estimate of the amount of time needed to complete the interview.

Interview tips

The following tips are from Prairie Research Consultants (2001). Check out their resource on in-depth interviews at http://www.pra.ca/resources/indepth.pdf.

Before:

During:

After:

Focus groups

When are focus groups useful?

When are they not useful?

In-depth interviewing or self-completed surveys are more appropriate when sensitive information is discussed or when tension exists among stakeholders. Focus groups are used to collect data, not to solve problems.

Focus group tips

Observation

Observation is often a good way to collect information about how communities or program participants respond to a particular project. Observers should be given some guidelines to help them record their observations. The observations they make can provide rich information that will enliven an evaluation report. But, first, observers should obtain permission to record what they observe.

Observation tips

Multiple methods

You can strengthen your evaluation by collecting information about the same issue from the different groups involved in your project (e.g., partners, participants, staff, managers) and by using different methods (e.g., focus groups, observation, surveys, etc.). When all the data are collected, compare the information you obtained from the various sources and methods to see if they support or contradict each other. This is called triangulation. You will have more confidence in the findings if you know that different stakeholders told you similar things or that various methods pointed to the same results.

For example, if you are documenting changes in levels of vandalism, you might want to compare results across sources and methods:

Ideally, you should collect this information at different times (e.g., before, during, and after the vandalism prevention project).

If all of the sources tell you the same thing, you can be fairly confident in the results you obtained. If there are significant contradictions in what you find, you may be aware of possible explanations for them. Or, you might want to go back to some of your sources to get their opinions on what the contradictions might mean.

Glossary of terms

Focus group

A focus group is a group of selected individuals who are invited to discuss a particular issue in order to provide insight, comments, recommendations, or observations about the issue. Focus groups are a means to collect information and can be used to assist in the evaluation of a particular program.

Interview guide

An interview guide provides structure to research or evaluation interviews. It provides instructions as to how the interviewer should introduce the interview and, if it hasn't already been done, inform the respondent of consent procedures and obtain written consent. It lists the interview questions in the order they should be asked. Interview guides can provide loose guidelines for the interviewer or precise instructions as to how to present interview questions.

Pilot test

A pilot test involves pre-testing evaluation instruments with a few representatives similar to those who will be completing the instruments for the evaluation. Problems with the instruments are noted so the instruments can be revised before the evaluation is implemented.

Probing questions (or probes)

A probe is a question that assists in bringing forth a more detailed response or additional information based on the respondent's original answer.

Social desirability

Respondents sometimes answer questions in a way they believe will please the researcher or in a way that presents their attitudes, opinions, or behaviours in a positive way. They provide what they perceive to be socially desirable responses. Standardized scales have been developed to identify whether respondents are answering questions in a socially desirable rather than a truthful manner. These are known as "social desirability scales."

Standardization

A standardized measure is one that has been administered to a very large group of people similar to those with whom the measure would be used. The data collected from this group serve as a comparison for interpreting results from participants in a program evaluation or research study. Standardized tests allow you to determine if a person's test score is high, average, or low as compared to the norm (Ogden/Boyes Associates Ltd., 2001).

Triangulation

Triangulation is a process that involves collecting information about similar questions or issues using different methods (e.g., interviews, questionnaires, focus groups) or from different sources of information (e.g., staff and participants). Responses from various sources or methods are then compared to determine if they support or contradict each other.

References

Achenbach, T.M., & Edelbrock, C. (1983). Manual for the child behavior checklist and revised child behavior profile. Burlington, VT: Queen City Printers.

Chavis, D.M., Hogge, J.H., McMillan, D.W., & Wandersman, A. (1986). Sense of community through Brunswick's lens: A first look. Journal of Community Psychology, 14(1), 24-40.

Ogden/Boyes Associates Ltd. (2001). CAPC program evaluation tool kit: Tools and strategies for monitoring and evaluating programs involving children, families, and communities. Unpublished report for Health Canada, Population and Public Health Branch, Alberta/Northwest Territories Region, Calgary, AB.

Prairie Research Associates, Inc. (2001). The in-depth interview. Retrieved September 2, 2004, from
http://www.pra.ca/resources/indepth.pdf

Rosenberg, M. (1989). Society and the adolescent self-image (revised ed.). Middletown, CT: Wesleyan University Press.

Statistics Canada. (n.d.). 2001 census dictionary. Retrieved October 19, 2004, from
http://www12.statcan.ca/english/census01/Products/Reference/dict/geo013.htm

Suggested Resources

Websites

American Statistical Association
What is a Survey?
http://www.amstat.org/sections/srms/whatsurvey.html

This electronic brochure discusses issues related to survey research such as planning, data collection, and the quality of the survey.

Bureau of Justice Assistance Center for Program Evaluation
Data Collection
http://www.ojp.usdoj.gov/BJA/evaluation/guide/dc1.htm

This website contains information and links to various data collection strategies.

Innovation Network Online Evaluation Resource Center
http://www.innonet.org/index.php?section_id=62&content_id=142

This website presents information and helpful suggestions related to survey research. Also available is a resource on data collection instruments.

Management Assistance Program for Non-Profits
Basic Guide to Program Evaluation
http://www.managementhelp.org/evaluatn/fnl_eval.htm#anchor1585345

This website provides helpful information about data collection methods including questionnaires, focus groups, and surveys.

University of Wisconsin
Program Development and Evaluation
http://www.uwex.edu/ces/pdande/evaluation/evaldocs.html

This user-friendly website offers many links relating to all aspects of data collection.

Manuals and Guides

Duval County Health Department
Essentials for Survey Research and Analysis: A Workbook for Community Researchers
http://www.tfn.net/%7epolland/quest.htm

This workbook comprises 12 lessons, ranging from how to identify different levels of data to collecting and reporting data. It focuses on survey research.

National Science Foundation
User Friendly Handbook for Project Evaluation
http://www.nsf.gov/pubs/2002/nsf02057/nsf02057.pdf

This handbook provides detailed information about various data collection methods, including helpful tips and examples.

National Science Foundation
User-Friendly Handbook for Mixed Method Evaluations
http://www.ehr.nsf.gov/EHR/REC/pubs/NSF97-153/pdf/mm_eval.pdf

This user-friendly guide provides information about qualitative and quantitative evaluation designs and the data collection methods associated with each.

Horizon Research Inc.
Taking Stock: A practical guide to evaluating your own programs.
http://www.horizon-research.com/reports/1997/stock.pdf

This practical guide is an excellent resource that explains how to collect both qualitative and quantitative data. It proposes several strategies for data collection.

Textbook

Trochim, William M. (2000).
The Research Methods Knowledge Base, 2nd Edition.
http://www.socialresearchmethods.net/

This on-line textbook covers research methods.

What is wrong with these questions? (see p. 5)
  1. Vague, ambiguous wording
    • Rate the project on what? – satisfaction, quality, accessibility??
    • Rate what aspect of the project? – content, staff, location, service, specific components?
    • What do the numbers represent? Is 1 extremely satisfied and 5 extremely unsatisfied?
    • Some researchers prefer to avoid scales with a midpoint like the one shown in this question (i.e., scales with an odd, as opposed to an even number of responses). When there is a true midpoint, respondents who are uncertain what they think will often choose this middle score. This, unfortunately, doesn't tell the researcher too much. Using a four-point scale can force the respondent to select a response that leans in one direction or another.
  2. The scale with forced choice responses is not evenly balanced
    • The options are unbalanced because there is one positive, one neutral, and two negative responses.
  3. The response categories are not exclusive
    • There is overlap between the age groups (e.g., if you are 30, you could check either of the first two categories).
  4. The response categories are not exclusive
    • The respondent could be pregnant but also a parent with children OR a person with no children. How does she decide which option to check?
  5. Double-barrelled question
    • Ask separately: Did you learn anything new about evaluation? Did you learn anything about developing your crime prevention project? You might also want to change Yes/No to something like: Learned a Lot, Learned Some, and Learned Very Little/Nothing. Better yet, change it to an open-ended (qualitative) question.
  6. Leading question
    • The question assumes areas are unsafe. It should be preceded by a question asking if the respondent feels unsafe in the housing complex. If not, a note should instruct the respondent to skip this question and move on to the next.
  7. Leading question
    • The question assumes the project should be expanded.

Module 4 Worksheets

Worksheet #1 Survey writing and interviewing

Instructions:

Recorders: You are preparing the survey or interview guide for all group members, so make sure everyone can read your writing.

Surveyors: Include true/false, rated scales, multiple choice, and open-ended questions in your surveys.

Interviewers: Include both open-ended questions (those that leave the response open to the respondent) and close-ended questions (those that provide a fixed menu of responses) in your interviews.

Survey tips

Use the space below and the following page to develop your survey or interview guide. Use the back of the page if you need more space.

Interview tips

Before:
During:

Worksheet #2 Focus groups and observations

Follow the instructions on the flip chart.

Use the space below to develop your focus group or observation guide. When you are done, photocopy the recorder's version of the guide, making enough copies for each member of your group.

Focus group tips
Observation tips

Module 5: Evaluation design

Learning objectives

In previous training modules, we reviewed various methods for collecting quantitative and qualitative measures of project performance. In Module 6, we will be taking a more detailed look at the analyses of these measures. The current module focuses on evaluation designs, or the ways in which we estimate the impact of project activities.

What is evaluation design?

An evaluation design involves:

  1. A set of quantitative or qualitative measurements of project performance and
  2. A set of analyses that use those measurements to answer key questions about project performance.

Evaluation designs include ways to describe project resources, activities, and outcomes as well as methods for estimating the impact of project activities (Wholey, Hatry, & Newcomer, 1995).

Levels of evaluation

Before we discuss evaluation designs in more detail, we'll review the different levels of evaluation.

Project monitoring involves counting specific project activities and operations. It tracks how or what the project is doing. For example, you might track how many activities you offer, how many staff are involved, how many hours they work, how many participants attend activities, how many partners are involved and whether these things change over time. You probably do this already!

Process evaluation assesses project processes and procedures and the connections among project activities. Process evaluations ask how the project is operating and how to make it better. For example, you might want to know whether your project is reaching who it intended to reach. Are activities occurring in the way they were originally planned? Is the number of participants affecting staff workloads or the amount of service available to participants?

Outcome evaluation assesses project impact and effectiveness. For example, a project working with youth at risk of involvement in crime or victimization might ask questions such as: Did participants stay in school longer? Did they have less contact with police or become victims less often? How can the program be improved to better meet its goals? Good outcome evaluations include some project monitoring and process evaluation.

Most evaluations include components from each level. We are going to focus on designs for outcome evaluations.

Threats to the validity of evaluation results

In Module 3, we talked about the importance of choosing indicators that are valid. The validity of the evaluation design as a whole is just as important as the validity of the specific indicators you choose. There are a few kinds of validity (see sidebar on the next page), but we are going to focus on internal validity.

Internal validity refers to the extent to which we can feel confident that the changes an evaluation identifies in the community or among project participants are, in fact, the result of the project. Choosing an evaluation design that rules out alternative explanations for your results best ensures its internal validity.

We'll show you some of the typical threats to the internal validity of evaluation results. A strong evaluation design will counter these threats. It will increase your confidence that the results you find are due to your project and not to outside factors.

History – Remember Sir John A.? He represents a different kind of history than the one we mean here. In evaluation, "history" refers to an outside event, not related to your project, that influences the changes your evaluation is able to track over time. There might have been some "history" in the community that affected the changes you detect, even though it had nothing to do with your project.

For example, a project focused on reducing substance use may choose to use police charges as an indicator of change. If changes to police policies about laying charges for simple possession took place during the project time frame, using this indicator could lead to a misinterpretation of the effects of the project. An increase or decrease in the number of charges might be interpreted to mean there was a change in substance use when, in fact, the increase or decrease resulted from a change in police policies.

Maturation – Maturation refers to the changes that occur naturally due to the passage of time. For example, children change as they grow up. Change in their behaviour or attitudes over time might just reflect the normal process of maturation.

Studying developmental changes in children and youth can be especially difficult because they naturally experience change as they mature, regardless of their involvement in projects. To rule out this threat, you could compare the changes experienced by the project participants to those experienced by members of a comparison group not involved in the project. If the project group experienced similar changes to those of the comparison group, it would suggest the changes might simply have been due to maturation. If the changes experienced by the project group exceeded those experienced by the comparison group, you could be more confident that the changes were not the result of maturation alone, but resulted from the project activities.
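
As a rough sketch of that comparison (all scores below are invented), you can look at the average change in each group; if the project group improves much more than the comparison group, maturation alone becomes a less convincing explanation:

    from statistics import mean

    # Hypothetical pretest and posttest scores
    project_pre = [10, 12, 9, 11, 13]
    project_post = [16, 18, 15, 17, 19]
    comparison_pre = [10, 11, 12, 9, 13]
    comparison_post = [11, 12, 13, 10, 14]

    project_change = mean(project_post) - mean(project_pre)          # 6.0
    comparison_change = mean(comparison_post) - mean(comparison_pre)  # 1.0

    print("Average change, project group:   ", project_change)
    print("Average change, comparison group:", comparison_change)
    print("Difference between the two changes:", project_change - comparison_change)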

Other types of validity

Here are definitions of three other types of validity:

Too much information!?

If this just seems like "too much information," don't worry. You don't need this level of detail. (We just thought you might be interested!) The main thing to remember is that threats to validity are factors other than participation in project activities that lead to change in the group being studied. Validity threats can lead to false conclusions about your project's ability or failure to obtain the results you anticipated.

Selection – The threat of selection results from the fact that people who choose to join your project activities might be different from those who do not.

For example, a project offering workshops about fraud against seniors might draw seniors who are more likely to seek out information, making them less vulnerable to fraud in the first place. This is another good reason to use a comparison group whenever possible. If you could look at change among those seniors who came out to project workshops and compare them to a similar group of seniors who did not participate, you could rule out the threat of selection.

However, you would want to ensure that no bias was involved in selecting members of the participant and comparison groups. We talk more about ways to avoid selection bias in project and comparison groups at the end of this module when we discuss experimental and quasi-experimental designs.

Experimental designs help to reduce selection bias. Participants are randomly assigned to a project and a comparison group.

When random assignment is not possible, quasi-experimental designs can be used. They use other means to create a comparison group, while still attempting to control for selection biases. For example, a comparison group might be drawn from a waiting list for the project or from a similar community that has members with similar histories to those of the project's participants.

Mortality – Mortality has some similarities to the threat of selection. It results when the analysis of change focuses on participants who complete an activity. Those who complete an activity might be those who were most motivated to succeed, while those who dropped out might have been more at risk in the first place.

For example, let's say a project focusing on anger management asked participants to complete a questionnaire about anger management before and after the program. Suppose some participants were charged with offences and incarcerated before the program ended. When it came time to do the post-test, they weren't available to be tested, so their scores could not be included in the post-test average. Yet they were likely more at risk to begin with, while those who completed the program were probably the most likely to succeed even without it. For this reason, it is a good idea to collect some additional information about participant history and demographics at pre- and post-tests. Then you can compare those who dropped out with those who stayed in the project to see if they differed in some way from the outset.

Testing – The more often you measure something, the more familiar participants will be with the test. Their responses might be influenced by how they responded the first time. This is a good reason to avoid doing pre- and post-tests within too short a period of time.

Instrumentation – Sometimes the test or questionnaire you use as an indicator is itself a problem. For example, if your evaluation used two interviewers who asked the same set of questions in different ways or who recorded the responses differently, the results from the interview might differ because of the way the interview was administered rather than because of the effect of the project.

Statistical Regression – Sometimes factors such as having a particularly bad day or not getting enough sleep the previous night might result in some participants scoring lower on a pretest measure than they would score under normal circumstances. This is particularly true when the pretest measure has a low reliability rating. (In Module 4 we explained that reliability means the measure provides consistent information over time). If those participants who had very low scores then complete the posttest measure on a more normal day, their score will likely improve, but at least part of this improvement will not be due to the program or intervention they received. Generally, very high or very low pretest scorers often move closer to the mean score over time. This is called "regression to the mean" or statistical regression. This threat to validity is particularly a problem when participants are placed in an experimental group based on the more extreme scores they received on the pretest.
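
The small simulation below (entirely made-up data) shows the effect: each person has a stable "true" level plus some day-to-day noise, and the people who happened to score lowest on the pretest score closer to the overall average on the posttest, even though nothing about them has changed:

    import random

    random.seed(1)
    true_levels = [random.gauss(50, 10) for _ in range(1000)]   # stable characteristic
    pretest = [t + random.gauss(0, 8) for t in true_levels]     # true level + day-to-day noise
    posttest = [t + random.gauss(0, 8) for t in true_levels]    # same people, new noise

    # Pick the 100 lowest pretest scorers and compare their averages on each test.
    lowest = sorted(range(1000), key=lambda i: pretest[i])[:100]
    pre_avg = sum(pretest[i] for i in lowest) / 100
    post_avg = sum(posttest[i] for i in lowest) / 100

    print(f"Lowest scorers, pretest average:  {pre_avg:.1f}")
    print(f"Lowest scorers, posttest average: {post_avg:.1f}  (closer to the overall mean of 50)")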

Evaluation design

In the earlier training modules we have talked about:

As you are considering your data collection methods, you will also need to consider the design of your evaluation:

  1. What design will best suit the project you wish to evaluate?
  2. How will the design counter potential threats to validity?

Design #1: Single group posttest only

Let's say:

X = your project intervention
O = when you will measure or assess project change

If participants are involved in your project activities and you measure the outcomes afterward, it would look like this:

X: What your project does.
O: When you measure the outcome.


Let's say your project involved activities to increase youth pride in their cultural heritage. This is represented by X.

After these activities, project staff asked youth who participated in the activities to answer some questions to assess their knowledge about their culture, their feelings of belonging, and their pride in their culture. O represents this assessment.

Your evaluation involved an assessment of the project outcome at one point only: after the activities occurred.

Some questions:

Looking back at the list of threats to validity, what threats do you think this design might be vulnerable to? What would you compare the outcomes to?

How will you know the outcomes are a result of the project and not from outside factors? What do you know about how much youth knew about their culture and how they felt about their culture before they became involved in the program?

What do you know about the intervention experienced by participants or the activities aimed at achieving the outcomes? What changes would you recommend to the design to help answer these questions?

Design #2: Single group pre- and posttest

Once again, let's say:

X = your project intervention
O = when you will measure or assess project change

If you measure the change before and after the program, it would look like this:

O: Pretest assessment
X: Project activities
O: Posttest assessment

Let's say this is the same project we discussed on the previous page. This time youths' knowledge of their culture, their feelings of belonging, and their pride in their culture were measured before the project activities and then again afterward.

Here are some questions to help you think further about this design.

Questions:

Is this design better than the posttest-only design?

What threats to validity is it vulnerable to?

What would you compare the outcomes to?

How will you know the outcomes are a result of the program and not from outside factors? What do you know about how much youth knew about their culture and how they felt about their culture BEFORE they became involved in the program?

What do you know about the intervention experienced by participants or the activities aimed at achieving the outcomes? What changes would you recommend to the design to help answer the questions that remain unanswered?

Design #3: Comparison group posttest only

In this design, you measure the results for participants involved in your project after the project activities. You use the same measure with a second group that was not involved in your project. You test the second group at the same time as you test project participants.

O1 = the assessment of project participants
O2 = the assessment of a comparison group

If you were to use the symbols to show how this design works, it would look like this:

Project group: X O1
Comparison group: O2

There is no X for the comparison group because they did not participate in an intervention.

If we go back to our previous example, in this design the youths' knowledge, feelings of belonging, and pride in their culture were measured after the project activities AND we have a comparison group whose members were not involved in the project and who were tested at the same time.

Questions:

What do you think of this design?

Is it better or worse than the previous designs?

What threats to validity is it vulnerable to?

What would you compare the outcomes to?

How will you know the outcomes are a result of the program and not from outside factors?

What do you know about how much youth knew about their culture and how they felt about their culture BEFORE they became involved in the project?

How do you know whether the comparison group was treated differently from the project group? What changes would you recommend to the design to help answer the questions that remain unanswered?

Design #4: Comparison group pretest and posttest

This design involves measuring the results for participants involved in your project before and after the project activities. The same measures are used with a second group not involved in your project. You test them at the same times as project participants.

If you were to use the symbols to show how this design works, it would look like this:

Project group: O1 X O1
Comparison group: O2 O2

As with Design #3, there is no X for the comparison group because they did not participate in an intervention.

Let's go back to our sample project to increase young people's pride in their cultural heritage. In this design, participants' knowledge, feelings of belonging, and pride in their culture were measured before and after the project activities. A comparison group whose members were not involved in the project was tested at the same times.

Questions:

What do you think of this design?

Is it better than the previous designs?

What threats to validity is it vulnerable to?

What would you compare the outcomes to?

How will you know the outcomes are a result of the program and not from outside factors?

What do you know about how much youth knew about their culture and how they felt about their culture BEFORE they became involved in the project?

How do you know whether the comparison group was treated differently from the project group?

Design #5: Time-series designs

Most community groups involved in crime prevention through social development are unlikely to have the resources or time to do a time-series design. But in some cases, these may be possible, so we are providing a brief outline of these designs. Time-series designs are similar to pre-posttest designs except they take measurements a few times before the project takes place and an equal number of times after it takes place. Time-series designs can involve a single group only or a comparison group. Using the symbols we've used to depict the designs on the previous pages, we provide some sample designs.

A single-group time-series design might look like this:

O O O X O O O

In this case, assessments are taken at three different times before the project activities and at three times after the project activities.

A comparison-group time-series design might look like this:

Project group: O O O X O O O
Comparison group: O O O (no X) O O O

As with the single-group example, the pattern shown here includes pre-assessments at three different times – this time involving both the project and the comparison group – and post-assessments on three different occasions after the project activities are complete.

A single-group interrupted time-series design would look like this:

O O X O O X O O

Interrupted time-series designs take measurements between components of an intervention or project to get a better sense of how the different parts of the intervention work together to produce outcomes. In this case, assessments are done on two occasions before a project activity or a set of project activities takes place. Assessments are again made on two occasions after the activity or set of activities. Another project activity or set of project activities then takes place and assessments are made on two later occasions.

A comparison-group interrupted time-series design would look like this:

Project group: O O X O O X O O
Comparison group: O O (no X) O O (no X) O O

The pattern shown here is the same as that of the single-group design. A comparison group that receives assessments at the same time as the project group is added to this design.

Generally, time-series designs can be stronger than their pre-post counterparts. Analysis of time-series results can give a better idea of trends. For example, if you take measurements every three months, three times before the project occurs and three times afterward, you can see if the period over which the project took place led to a clear change in the outcomes. A comparison-group time-series design will strengthen your ability to attribute those outcomes to the project and not to other outside factors.

These designs face similar threats to validity as their pre-posttest counterparts. Testing can pose a more serious threat to time-series designs when measures such as questionnaires are administered directly to participants because they are taken frequently throughout the evaluation period. This will not be the case if the measures used are school attendance records, grades, or police reports that do not involve participants directly. One thing to keep in mind with time-series designs: Statistical analyses of these designs can be very complex.

A word about comparison groups

It is clear from what we've learned so far that the use of a comparison group in an evaluation will strengthen our ability to draw conclusions about effectiveness. Yet we all know this ideal is not so easy to accomplish. It is often very difficult for evaluations of community-based programs to obtain a reasonable comparison group. So what can be done to strengthen your conclusions in the absence of a comparison group? Here are a few suggestions:

Use your creativity to think of other ways to strengthen your findings.

Further words of caution

If you can find a comparison group to strengthen your evaluation design, it's important to recognize that some comparison groups are better than others. First, here is the "gold standard" in evaluation design:

Random selection is used in what is known as experimental design. Participants are randomly assigned to either the comparison or the experimental (intervention) group. For example, youth who have been in trouble with the law might either be assigned to a program to reduce further conflict with the law or to a non-program group from which comparable information is obtained. They are assigned to either the project group or the comparison group in a random way (e.g., every second youth goes to the alternate group). This helps to ensure there are no systematic differences between the two groups.

Often random selection is not possible so we instead try to construct a comparison group from a readily accessible group of similar candidates. For example, in the case of the program for youth in conflict with the law, we might ask youth on a waiting list or youth from a similar neighbourhood to participate in the comparison group. This is known as a quasi-experimental design. In quasi-experimental designs it is important to ensure both groups are comparable. Demographic information such as information about age, ethnicity, level of education, income, and previous encounters with the law is collected from both groups to help us determine if the comparison group is comparable to the intervention group.
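
A minimal sketch of random assignment, assuming you have a simple list of eligible participants (the names below are invented): the list is shuffled and split in half, so chance alone decides who ends up in the project group and who ends up in the comparison group.

    import random

    eligible = ["Alex", "Bo", "Chris", "Dana", "Eli", "Farah", "Gil", "Hana"]

    random.shuffle(eligible)                 # put the list in random order
    midpoint = len(eligible) // 2
    project_group = eligible[:midpoint]      # first half receives the intervention
    comparison_group = eligible[midpoint:]   # second half is the comparison group

    print("Project group:   ", project_group)
    print("Comparison group:", comparison_group)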

Glossary of terms

Comparison (or control) group

A comparison group is called a control group in laboratory settings. Since researchers have far less "control" over community-based settings, it is known as a comparison group in this context. The comparison group is made up of people who have similar characteristics to participants in the project being evaluated, but who do not receive exposure to the project.

Experimental group

An experimental group is a group of people who participate in an intervention (or project). The outcomes experienced by the experimental group can be compared to those of a comparison group who do not receive the intervention. The comparison group should have similar characteristics to those of the experimental group, except that they do not receive the intervention under study. The difference of effects between the two groups is then measured.

Pre-post testing

Pre-post testing involves administering the same instrument before and after an intervention or program.

Random selection

Random selection means that people (or communities) have an equal opportunity of being selected to be part of either a comparison or an intervention group. Whether any one individual or community is selected to be part of one group or the other is determined by chance.

Sample

A sample is a subgroup of a larger population. It is studied to gain information about an entire population.

References

Clark, A. (1999). Evaluation research: An introduction to principles, methods and practice. Thousand Oaks, CA: Sage.

Wholey, J.S., Hatry, H.P., & Newcomer, K.E. (1995). Handbook of practical program evaluation. San Francisco: Jossey-Bass.

Suggested Resources

Websites

North Central Regional Educational Laboratory (NCREL)
Evaluation Design and Tools

http://www.ncrel.org/tandl/eval2.htm

This website provides information about evaluation design. It answers three main questions: Why should you evaluate your project? What should you evaluate? How should you evaluate? Common threats to validity are discussed.

Northwest Regional Educational Laboratory
Impact Evaluation

http://www.nwrac.org/whole-school/impact_c.html

This website offers helpful information on various evaluation designs and models.

Manuals and Guides

National Science Foundation
User-Friendly Handbook for Mixed Method Evaluations
http://www.ehr.nsf.gov/EHR/REC/pubs/NSF97-153/start.htm

This user-friendly guide provides information about both qualitative and quantitative research designs.

Treasury Board of Canada
Program Evaluation Methods: Measurement and Attribution of Program Results, Vol. 3
http://www.tbs-sct.gc.ca/eval/pubs/meth/pem-mep_e.pdf

This guide addresses various methodological considerations. It includes an extensive chapter on evaluation designs and strategies.

The Urban Institute
Evaluation Strategies for Human Services Programs
http://www.urban.org/url.cfm?ID=306619

This guide provides insight into various evaluation issues, including a comprehensive section on developing an evaluation design.

Books

Trochim, William M. (2000).
The Research Methods Knowledge Base, 2nd Edition.
http://www.socialresearchmethods.net/

This on-line textbook guides the user through evaluation design and the strengths and weaknesses associated with various designs.

Evaluation design questions (pp. 5 - 9)

Design #1

What threats do you think this design might be vulnerable to?
It may be vulnerable to all threats except testing (unless participants had previously been exposed to the test in some other situation). It's hard to know for sure whether it is vulnerable to problems with instrumentation or statistical regression because we don't know whether the test was a good measure of the outcome, whether it was administered in a consistent manner, or how the participants would have scored before the program.

What would you compare the outcomes to?
There is nothing to which the results can be compared.

How will you know the outcomes are a result of the program and not from outside factors?
There is no way to know this for certain.

What do you know about how much youth knew about their culture and how they felt about their culture BEFORE they became involved in the program?
You won't know this. Without a pretest, there is no information about participants' prior knowledge or feelings.

What do you know about the intervention experienced by participants or the activities aimed at achieving the outcomes?
You won't know this without conducting a process evaluation or doing some kind of project monitoring. If process information about how the intervention was delivered is not collected, it's hard to draw conclusions about what worked or did not work.

Design #2

What threats to validity is it vulnerable to?
It may be vulnerable to all threats to validity. It's hard to know for sure whether it is vulnerable to problems with instrumentation because we don't know whether the test was a good measure of the outcome or whether it was administered in a consistent manner.

What would you compare the outcomes to?
This time you can compare the outcome to the pretest results.

How will you know the outcomes are a result of the program and not from outside factors?
Once again, there is no way to know this for sure.

What do you know about how much youth knew about their culture and how they felt about their culture BEFORE they became involved in the program?
The pretest will tell you something about the youths' knowledge and feelings before their involvement in the program.

What do you know about the intervention experienced by participants or the activities aimed at achieving the outcomes?
Once again, you won't know this without doing a process evaluation or project monitoring.

Design #3

What threats to validity is it vulnerable to?
It is less vulnerable to history, maturation, selection and mortality than the single-group designs. However, there could be a selection bias in determining who is in the comparison group and who is in the program group. Random assignment to these two groups would address this concern. It is likely to be vulnerable to testing only if the participants were previously exposed to the test in some other situation. It is hard to know if it's vulnerable to instrumentation or statistical regression because we don't know whether the test was a good measure of the outcome, whether it was administered in a consistent manner, or how the participants would have scored before the program.

What would you compare the outcomes to?
You can compare the outcomes for the project group to those for the comparison group. If the outcomes are better for the project group than the comparison group, this is an indication that the project has had an effect.

How will you know the outcomes are a result of the program and not from outside factors?
If the project participants do better than the comparison group, it suggests the project made the difference. However, you cannot be certain about this without knowing if there were differences between the comparison and project groups at the outset. It would also help to know something about any influences affecting the comparison group.

What do you know about how much youth knew about their culture and how they felt about their culture BEFORE they became involved in the project?
You won't know this with this design.

How will you know that the comparison group was treated differently than the project group?
You would need to document the project activities and know more about any activities to which the comparison group might have been exposed.

Design #4

What threats to validity is it vulnerable to?
Of the four designs, it is the least vulnerable to threats to validity. However, there could still be a selection bias in determining who is in the comparison group and who is in the program group. Random assignment to these two groups would address this concern. This design may be vulnerable to testing if the pre- and posttests are administered too close together in time. It is hard to know if it's vulnerable to instrumentation because we don't know whether the test was a good measure of the outcome or whether it was administered in a consistent manner.

What would you compare the outcomes to?
You can compare the change over time between the project group and the comparison group. This helps to rule out most threats to validity, provided the two groups are comparable at the outset.

How will you know the outcomes are a result of the program and not from outside factors?
If the project participants do better than the comparison group, it suggests the project made the difference. However, some doubt remains unless we know something about the influences affecting the comparison group and the extent to which the program and comparison groups were comparable at the outset.

What do you know about how much youth knew about their culture and how they felt about their culture BEFORE they became involved in the project?
You will have this information from the pretest.

How do you know that the comparison group was treated differently than the project group?
You would need to document the project activities and know more about activities to which the comparison group might have been exposed.

Module 5 Worksheets

Worksheet #1 Evaluation Design

Design Type:

Proposed measures, when they will be administered, and who will administer them:

Strategies/rationales for dealing with threats to validity:

1 – History

2 – Maturation

3 – Mortality

4 – Testing

5 – Instrumentation

Module 6: Analyzing data and reporting results [Footnote 1]

Learning objectives

The art of interpreting data

Evaluation is part art, part science. If you are analyzing numerical data, it's important to keep in mind that the numbers you obtain are not outcomes in themselves but indicators of the outcome. Need a reminder of what an indicator is?

An indicator is a variable (or information) that measures one aspect of a program or project. It indicates whether a project has met a particular goal.

But indicators don't tell the whole story. They should be reported in the context of the factors that may influence them. This is part of the art of reporting results. Factors that can influence results include:

It is best to consider how you will analyze your data before you begin collecting it. Your plan of analysis should be part of your evaluation plan. Be open to different methods of analysis. It is always a good idea to consult others who can give you advice about the best way to analyze the data you hope to obtain. As you plan your evaluation, talk to academics, consultants, or others in the field of crime prevention through social development to gain ideas on the most appropriate methods of analysis. Once you have analyzed the data, participants, partners, and staff might be able to shed light on possible explanations for the results you obtain. Include them in your consultations about how to interpret your evaluation results.

The science of analyzing data

While art is involved in the interpretation and reporting of data, there is only room for science when recording and analyzing results. This is one place where accuracy and attention to detail are essential.

Imagine the consequences of mistakenly entering participants' scores on measures of behaviour or attitudes into the wrong column of a database. This would lead you to draw conclusions based on inaccurate information. Or think about the consequences of basing conclusions on participants' responses to questions they had consistently misunderstood. These examples give you an idea of why it is important to take care in implementing all aspects of your evaluation.

Here are some things to watch:

Make sure qualitative and quantitative data are accurate and complete:

Key steps in data analysis

Analyzing results

As we mentioned in Module 3, analysis involves sorting out the meaning and value of the data you have collected. In that module we provided you with descriptions of basic terms used in data analysis. We have included them again in the box shown on this page.

Frequencies and Percentages

Let's say we asked 472 people whether they agreed, disagreed, or were neutral about whether they felt safer in their neighbourhood after a community worker had organized a series of workshops that brought residents together to discuss common issues of concern and to get to know each other better.

The results are presented in the table below.
Response Frequency Percentage
Agree 288 61%
Neutral 112 24%
Disagree 72 15%
TOTAL 472 100%

You probably already know how to calculate a percentage. Just in case, we'll use the number of people who agreed with the statement to demonstrate how it is calculated.

The number of respondents who said they agreed with the statement (288) is divided by the total number of respondents (472), then multiplied by 100 to give the percentage of respondents who agreed that their neighbourhood felt safer (61%).

288 ÷ 472 × 100 = 61%
% of total responses = frequency of response ÷ total number of responses × 100
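If you have your frequencies in a spreadsheet or a small program, the same formula takes only a few lines. Here is a minimal sketch in Python using the figures from the table above.

```python
# Percentage = frequency of response ÷ total number of responses × 100
frequencies = {"Agree": 288, "Neutral": 112, "Disagree": 72}
total = sum(frequencies.values())  # 472

for response, count in frequencies.items():
    percentage = count / total * 100
    print(f"{response}: {count} responses ({percentage:.0f}%)")
# Agree: 288 responses (61%)
# Neutral: 112 responses (24%)
# Disagree: 72 responses (15%)
```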

Getting more sophisticated...
Age Agree % (Freq) Neutral % (Freq) Disagree % (Freq) Total % (Freq)
Age 18-29 31% (38) 25% (31) 44% (55) 26% (124)
Age 30-59 58% (114) 36% (71) 6% (12) 42% (197)
Age 60+ 90% (136) 7% (10) 3% (5) 32% (151)
TOTAL 61% (288) 24% (112) 15% (72) 100% (472)

You can provide more information by further breaking down the frequencies and percentages by the different categories of people who responded. For example, you could break the neighbourhood residents down by age, gender, level of income, housing type, or any other factor that is likely to influence people's perceptions.

In the table above, we categorized the responses of the neighbourhood group by age. The table shows the percentage and the frequency (in brackets) of responses for each age group. For example, 31% of residents aged 18 to 29 agree that the neighbourhood feels safer, and this age group represents 26% of the total group of respondents. The row and the column labeled TOTAL should always add up to 100%.
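If your responses are stored in a data file, a tool such as the open-source pandas library can produce this kind of breakdown for you. The sketch below is illustrative only; the file name and the column names ("age_group" and "response") are assumptions made for the example.

```python
# A minimal, illustrative cross-tabulation with pandas.
import pandas as pd

df = pd.read_csv("survey_responses.csv")  # hypothetical file: one row per respondent

# Counts of each response within each age group (margins=True adds the TOTAL row/column)
counts = pd.crosstab(df["age_group"], df["response"], margins=True)

# The same table as row percentages (each age group adds up to 100%)
percentages = pd.crosstab(df["age_group"], df["response"], normalize="index") * 100

print(counts)
print(percentages.round(0))
```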

Statistical significance

If you had collected the opinions of neighbourhood residents before the series of workshops had begun, you could break the responses down by those received before (pre) and after (post) the project. This would show you whether residents' views changed from before to after the project. Of course, as we know from Module 5, you would have to be cautious in attributing this change to the workshop series without having similar information from a comparison neighbourhood that was not exposed to the workshops. There are more sophisticated statistical calculations that can tell you whether the differences in opinions before and after a program or intervention are "statistically significant" and unlikely to have occurred due to chance.

The higher the level of significance, the less likely the result is due to chance. If a result is significant at the .01 level, there is a one-in-100 likelihood (or probability) that it is due to chance.

Why .01? Because 1 ÷ 100 = .01. Thus, .01 represents a one-in-100 likelihood of being due to chance.

1 ÷ 1000 = .001 or a one-in-1,000 likelihood of being due to chance.

1 ÷ 20 = .05 or a one-in-20 likelihood of being due to chance.

You will often see results from a research study or evaluation listed at a certain level of significance. For example, you might see: p < .01. This means the probability of a result being due to chance is less than one in one hundred.

General practice in professional journals has been to report findings at a minimum probability level of .05. Someone with expertise in statistics can advise you on the appropriate probability level for your analysis. Keep in mind that when your sample size is very large, even very small differences can turn out to be statistically significant, so a stricter level (such as .01), along with attention to whether the difference is meaningful in practical terms, is often recommended.

Most statistical tests require that certain assumptions about the data being analyzed be met. For example, when comparing means from two separate groups using a t-test, we assume the samples are drawn from populations with a normal distribution and equal standard deviations. A normal distribution means most scores fall around the centre with a smaller, but equal proportion falling toward either end of the range of scores. A graph of a normal distribution looks a bit like a bell, so it is sometimes referred to as a "bell curve."

Normal distribution or "bell curve"

We won't get into the specifics of what a standard deviation is. A simple definition is that it is a measure of the extent to which scores are spread between the highest and lowest scores and deviate from the mean or average score. The standard deviation increases in proportion to the spread of the scores.

Rather than make things too complicated here, we have included some resources at the end of this chapter to help you learn more about variance and standard deviations. We have also provided a more technical definition of a standard deviation in the glossary. The recommended resources included in this chapter will be helpful if you are interested in doing tests of significance. The use of tests of significance requires some expertise to ensure you are making the correct calculations and are using data that follow the assumptions needed for specific tests. A consultant can help you with this.
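To give you a sense of what such a test looks like in practice, here is a minimal sketch using the open-source SciPy library with made-up scores for a project group and a comparison group. A consultant would also check the assumptions described above before relying on the result.

```python
# Illustrative only: made-up scores for a project group and a comparison group.
from scipy import stats

project_scores    = [14, 18, 15, 17, 19, 16, 18, 15, 17, 16]
comparison_scores = [13, 14, 15, 12, 16, 14, 13, 15, 14, 13]

# An independent-samples t-test compares the two group means.
t_statistic, p_value = stats.ttest_ind(project_scores, comparison_scores)

print(f"t = {t_statistic:.2f}, p = {p_value:.3f}")
if p_value < 0.05:
    print("The difference is statistically significant at the .05 level.")
else:
    print("The difference could plausibly be due to chance.")
```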

The three Ms

The three Ms – the Mode, Median, and Mean – are formally known as Measures of Central Tendency. You are probably familiar with these measures. We introduced them in Module 3. The three Ms, or measures of central tendency, are three different ways of summarizing where most responses from a group of people fall. The table below shows the level of satisfaction of 51 people who indicated on a five-point scale how satisfied they were with a training workshop on evaluating community-based projects. A score of 1 indicated they were not at all satisfied and a score of 5 indicated they were very satisfied.

Level of satisfaction of 51 workshop participants
Level of Satisfaction Frequency Percentage
5 5 9.8%
4 15 29.4%
3 19 37.3%
2 10 19.6%
1 2 3.9%
TOTAL 51 100%

The mean is the "average" score. It is calculated by adding all the values, then dividing by the total number of values. To calculate the mean for the example on the chart above, we determine the total value of all of the scores indicated by respondents, then divide by the total number of scores (51) to obtain a mean score of 3.22:

Mean = (5 × 5 + 4 × 15 + 3 × 19 + 2 × 10 + 1 × 2) ÷ 51 = 3.22

The median tells you where the midpoint of the scores is found. You obtain the median by ranking all responses from highest to lowest, then finding the middle response (in this case, the 26th response). If there is an even number of cases, the median is the point halfway between the two middle cases. For example, if there were 52 responses to the satisfaction scale, the median would be halfway between the scores of the 26th and 27th cases, if they were different.

Median = the 26th response (the midpoint) from highest to lowest = 3

The mode is the response most frequently given. In this case most people (19 of 51) rated their satisfaction as 3.

Mode (the most frequent response) = 3
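You can check calculations like these with Python's built-in statistics module. The sketch below rebuilds the 51 individual satisfaction scores from the frequency table above.

```python
import statistics

# Rebuild the 51 individual scores from the frequency table:
# five 5s, fifteen 4s, nineteen 3s, ten 2s and two 1s.
scores = [5] * 5 + [4] * 15 + [3] * 19 + [2] * 10 + [1] * 2

print("Mean:",   round(statistics.mean(scores), 2))  # 3.22
print("Median:", statistics.median(scores))          # 3
print("Mode:",   statistics.mode(scores))            # 3
```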

Which one do I use? Mean, median, or mode?

Here are some hints about when to use each:

Of the three measures of central tendency, the mean is most affected by outlying responses. In the satisfaction example we have used, the scores fall in what is known as a normal distribution or bell curve (i.e., most scores fall in the middle, with a relatively equal but smaller number at the high and low end). As a result, all three measures – the mode, mean, and median – are close to each other. Sometimes a few scores at either the high or low extreme can influence the mean, making the median a better measure of the central point. In this case, there is a slight influence on the mean resulting from the higher number of people who rate their satisfaction level as 5. Thus, while the mode and median are 3, the mean is 3.22.

Analysis tips

When to seek help

We have mentioned that consultants can help you with your analysis. You may be wondering when you can do the work yourself and when you should hire someone to help you with it. We recommend that you hire someone who understands statistics to help with more sophisticated analyses such as tests of statistical significance. A consultant can also help you determine what kind of data you'll need, the sample size required, and other things to consider before collecting your data.

When conducting more sophisticated statistical analysis, consultants can help to ensure:

Standardized tests

In Module 4 we discussed standardized tests. Standardized tests are often used to assess individual knowledge, attitudes, and behaviours. They generally involve a number of statements each of which is followed by a scale (e.g., a four-point scale ranging from strongly agree to strongly disagree) on which the respondent indicates the perspective that most applies to him or her. The process of standardizing a test involves administering it to a large group of people who, ideally, are similar to those who will complete the measure when it is used in the field. The data collected from this large group serve as a comparison to help interpret respondents' scores. Standardization allows you to determine if an individual's score is high, average, or low compared to the "norm."

Because standardized scales are constructed so that a number of items contribute toward the assessment of one attribute or concept, the results of individual items should generally not be reported on their own. In Module 5, we talked about validity, the ability of a measure to assess what it is intended to measure. We also talked about reliability, the ability of a measure to provide consistent information over time and when completed by different groups. A single item on a standardized scale is likely to have low validity and reliability when used on its own. It might be subject to misinterpretation by the respondent or may be more likely to result in some respondents providing socially desirable responses. The use of a total score on a scale or subscale is less likely to be vulnerable to these forms of response bias. For this reason, it is best to report results from standardized scales based on the total result of a scale or subscale.
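As a simple illustration of reporting totals rather than single items, here is a short sketch with hypothetical item responses; the item names, the subscales, and the four-point scoring are assumptions made for the example, not features of any particular standardized test.

```python
# Hypothetical responses from one person on a 6-item scale scored 1-4,
# where items 1-3 form one subscale and items 4-6 form another.
item_scores = {"item1": 3, "item2": 4, "item3": 2,
               "item4": 1, "item5": 2, "item6": 2}

subscale_a = item_scores["item1"] + item_scores["item2"] + item_scores["item3"]
subscale_b = item_scores["item4"] + item_scores["item5"] + item_scores["item6"]
total      = subscale_a + subscale_b

# Report the totals (and compare them to the published norms for the scale),
# not the answers to single items.
print(f"Subscale A: {subscale_a}, Subscale B: {subscale_b}, Total: {total}")
```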

Qualitative data [Footnote 2]

You can analyze small amounts of qualitative data by summarizing the responses in order to provide an overall picture of what was said.

When you have large amounts of qualitative data, a more systematic method of analysis should be used. Content analysis is a process where patterns in qualitative data are identified, given a code or name, and categorized (Patton, 1990).

Analyzing qualitative data is usually a time-consuming process, but software programs such as NUD*IST and NVIVO (see http://www.qsrinternational.com/) have been developed to help with the coding of qualitative data.
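If you have only a small number of coded excerpts, you do not need special software. The sketch below, with made-up codes, shows how coded excerpts can be tallied by theme in a few lines of Python.

```python
from collections import Counter

# Each entry is the code written in the margin beside one excerpt (hypothetical data).
codes = ["seeking support", "avoiding the problem", "seeking support",
         "actively seeking a solution", "ignoring the problem",
         "seeking support", "actively seeking a solution"]

# Count how often each theme was coded, from most to least frequent.
tally = Counter(codes)
for theme, count in tally.most_common():
    print(f"{theme}: {count} excerpt(s)")
```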

If you are going to analyze your qualitative data by hand using paper copies of the data (e.g., transcripts of interviews or copies of open-ended responses to written questions), make at least four copies before you begin your analyses (Patton, 1980). If you are analyzing transcripts of interviews, ask the transcriber to leave wide margins on each page to allow for coding. Interview transcripts should always be transcribed verbatim. They should never be summarized before the analysis stage.

Tips for preparing paper copies

The four copies of your data will be used as follows:

  1. Safe storage
  2. To be referred to throughout the analysis
  3. For coding margins
  4. (or more copies) For cutting and pasting onto memos

After the data are collected, transcribed, and copied, you are ready for analysis. The following is a brief summary of steps used in content analysis:

  1. Review and organize data – Review the transcripts (or written responses to questions). Make notes about emerging themes and the main categories into which data seem to fit.
  2. Code data – Go through the transcripts (or written responses) carefully. Look for information relevant to the evaluation questions generated at the beginning of the evaluation process and for emerging themes and categories. "Code" the data by writing the topic, category, or theme in the right-hand margin of the page.

    Example:

    Let's say you are coding the responses of teen project participants to a question about what they learned about problem solving. You might code the responses as to whether they fell into the general categories of ignoring the problem, seeking support from others, escaping or avoiding the problem, or actively seeking a solution to the problem.
  3. Construct memos – Now that all of the data are coded, use the codes you have placed in the right-hand margins of the transcripts or interview responses to review your data by category. You should be looking for contradictions, themes consistent within categories, or linkages between categories. Summarize these in written memos.

    Example

    Let's say you have interviewed project staff and staff from partner organizations. One of the interview questions asked respondents about their perceptions of how the partnership was working. You might have coded these perceptions under the general title "perceptions of partnership." Within this category you might notice contradictions in how different groups perceive the partnership. You could develop a memo describing the contradictions you identified. Another memo might describe any other patterns you identified in how respondents viewed the partnership.

    Miles and Huberman (1984) note that memos "do not just report data, but tie different pieces of data together in a cluster, or they show that a particular piece of data is an instance of a general concept" (p. 69). Memos help to frame the way in which you will present the results of your analyses. They outline the major themes, contradictions, linkages, and categories of data.
  4. Cut and paste – Cut quotes from the raw data that pertain to the themes or concepts each memo presents, and paste them directly onto the memo. These quotes will act as examples of the concepts or themes you are identifying. Always remember to code the quote with information as to its source (not the person's name, but the type of respondent – a participant, partner, staff person, etc.) and page number.
  5. Check conclusions – When possible, use questions arising from the memos you have constructed as a basis for further interviews with participants. These interviews will be more successful if you provide participants with a summary of your preliminary conclusions before the interview. Miles and Huberman (1984) and Guba and Lincoln (1985) recommend these checks as a way to correct and verify the data.
  6. Revise your original assertions – Transcribe and analyze the follow-up interview data following steps 1 to 4 above and use the resulting information to revise your original assertions.

Reporting your results

How you choose to present your data is to some extent a matter of preference. Some people prefer tables and others prefer graphs. Sometimes one method is better suited to presenting a particular kind of information than another. Here is some general information about the strengths of different methods of presentation:

Let's review in more detail these different methods of reporting results.

Bar charts

Participant-reported feelings of safety in 2001 (n=90) and 2003 (n=88)

Bar charts are good for comparing two sets of data. For example, the chart shown above presents values across categories (e.g., always, sometimes, never) and over time (e.g., 2001 and 2003). Bar charts are also useful for comparing pre-post data. The horizontal axis represents the categories (always, usually, etc.), while the vertical axis represents the frequency of occurrence of the responses (in this case in percentages).

When you prepare a bar chart, remember to include a "legend." The legend is the box on the right that explains how the colours or patterns used on the bars relate to the year the data were collected. Label both axes and give the graph a title that fully explains its contents. Indicate the sample size. In this case, the sample size differs from one time to another, so both are shown.

Bar charts can be misleading.

The bar chart below presents the same data as the chart shown earlier. The earlier chart has a vertical axis that is cut off at 50% rather than 100%. The proportions between the bars stay the same, but the shortened vertical axis allows the bars to better fill the graph. This presentation can mislead the eye into thinking more people usually or sometimes feel safe than actually do. It is not incorrect to present the data this way, but the reader should always check the vertical axis to see whether it presents the full scale or only a portion of it. While you can get away with not showing the top range of a scale on the vertical axis, you should always start the scale at zero.

The maximum value on the vertical axis of the graph shown below is 100%. It is less likely to trick the eye than the earlier chart, but it doesn't fill the page as nicely! Pay attention to the vertical axis, and don't let your eye be tricked into thinking more people responded in a certain way than they actually did.

Participant-reported feelings of safety in 2001 (n=90) and 2003 (n=88)
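If you or a helper are preparing charts with software, here is a minimal sketch using the open-source matplotlib library. The response categories and percentages are made up for the example; note how the vertical axis is labelled, starts at zero, and runs to 100%, and how the legend identifies the two years.

```python
# A minimal sketch of a grouped bar chart with matplotlib; the categories and
# percentages below are hypothetical.
import matplotlib.pyplot as plt

categories = ["Always", "Usually", "Sometimes", "Never"]   # hypothetical response categories
pct_2001 = [20, 35, 30, 15]                                 # hypothetical percentages, 2001 (n=90)
pct_2003 = [30, 40, 22, 8]                                  # hypothetical percentages, 2003 (n=88)

x = range(len(categories))
width = 0.35
plt.bar([i - width / 2 for i in x], pct_2001, width, label="2001 (n=90)")
plt.bar([i + width / 2 for i in x], pct_2003, width, label="2003 (n=88)")

plt.xticks(list(x), categories)           # label the horizontal axis categories
plt.ylabel("Percentage of respondents")   # label the vertical axis
plt.ylim(0, 100)                          # start the scale at zero and show the full range
plt.title("Participant-reported feelings of safety, 2001 and 2003")
plt.legend()                              # the legend explains which bar belongs to which year
plt.show()
```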

Line graphs

Line graphs are a good way to show continuous change such as how feelings of safety increase and decrease over time. Line graphs are especially useful for reporting trends. They can be used to compare change experienced by more than one group or in more than one area by including different lines in the graph.

The horizontal axis on the line graph shown below represents the points at which the data were collected. The vertical axis represents the frequency of responses (in this case, the percentage who perceived the area to be safe).

Remember to include a descriptive title and legend and to label the axes of your line graph. Use equal increments on the scale.

Resident Perceptions of Safety in Areas of the Housing Complex: 1998 (n=200), 2000 (n=192), 2002 (n=216) and 2004 (n=230)

Below we show another example of playing with the vertical axis. In this case, we present the same data as in the previous graph, but we narrowed the range of the vertical axis to start at 30 and end at 90. As we mentioned earlier, it is not a good idea to have the vertical axis start higher than zero. As you can see from the graph below, this misleads the eye by exaggerating the extent to which perceptions of safety changed over the six-year period.

Resident Perceptions of Safety in Areas of the Housing Complex: 1998 (n=200), 2000 (n=192), 2002 (n=216) and 2004 (n=230)

Pie charts

Pie charts are a good way to show the various components that make up a larger group. They should be used when the data are portrayed as percentages of a whole. If, for example, you are describing the demographic make-up of your participant population, a pie chart is a good way to present the population by level of education, income group, or marital status.

Pie charts should be presented with a legend. Each category should be identified with a value label that shows what percentage of the whole it represents.

Level of education (n=100)
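Here is a matching sketch for a pie chart, again using matplotlib, with made-up education categories and counts; the autopct setting adds the percentage value label to each slice.

```python
# A minimal sketch of a pie chart with matplotlib; the categories and counts are hypothetical.
import matplotlib.pyplot as plt

labels = ["Less than high school", "High school", "College", "University"]  # hypothetical categories
counts = [18, 42, 25, 15]                                                   # hypothetical counts (n=100)

plt.pie(counts, labels=labels, autopct="%1.0f%%")  # autopct adds the percentage label to each slice
plt.title("Level of education (n=100)")
plt.show()
```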

Tables

Tables are a good way to present the relationships between information. They can also be used to present work plans and progress in activities.

Give the table a complete title. Label all rows and columns within the table. If symbols are used, provide a legend to explain them. The table below presents the differences in demographic characteristics of a treatment (or intervention) group and a comparison group.

Demographic Comparison of Treatment vs. Comparison Groups
  Treatment (n=54) Comparison (n=47)
Mean age 27 24
Mean education (in years) 14 12
Mean income (monthly) $1400 $1250
% Female 55% 52%

One more tip...

Correlation ≠ Causation

This is one of the most important things to remember as you report the results of your analysis. The fact that two things are correlated does not necessarily mean one thing caused the other.

For example, if adolescents who smoke are also more likely to drop out of school, that does not mean they drop out of school because they smoke. What does it mean? It means only that those who smoke are more likely to drop out. Nothing more.

Likewise, even if there is a correlation between reduced involvement in crime and program participation, we can't say with absolute certainty that participants in the program were less involved in crime because of that program. We can say that participants in a particular program were less likely to be involved in crime. If you've done your research right and have eliminated other possible explanations for the reduced involvement in crime of project participants, you can say your results are promising. You can suggest the program may be having an effect.

Questions to consider in your report

When reporting the results of your evaluation, try to answer the questions your key stakeholders will want answered. Here are some questions to consider:

Match your presentation style to your audience

Writing a report can be a good way to summarize the results of your evaluation, but it is not the only way. How you communicate your results will depend on the audience you are trying to reach. Your funder might prefer a written report. A fact sheet written in plain language or a community forum might work best for project participants.

If you are providing a written report, make sure you provide an executive summary. An executive summary can be shared more widely than a full report. It is a good way to reach interested readers who do not have time to read a long report.

Videos or photographs that show aspects of community change are another effective reporting technique. Let's say you wanted to show the extent to which your project resulted in changes to the amount of graffiti in your community. A series of photographs showing the changes over time would be an ideal way to present your project's results. Another project might want to show how community action resulted in greater pedestrian traffic in an area that was formerly avoided by neighbourhood residents. A series of photographs or videos taken over time would make a compelling statement.

Glossary of terms

Correlation

Correlation refers to the relationship between two variables. It does not mean that one variable causes the other, but simply that one variable is related to another. From a statistical perspective, correlation is measured through a correlation coefficient (r). It measures the similarity or the strength of association between two variables. Statistical correlations range from minus one to plus one. The further the value is away from zero, the more the two variables are related (Worsley, Hoen, Thelander, & Women's Health Centre at St. Joseph's Health Centre, 2002).

Indicator

An indicator is a variable (or information) that measures one aspect of a program or project. It indicates whether a project has met a particular goal. There should be at least one indicator for each significant element of the project (i.e., at least one for each outcome identified in the logic model).

Inter-rater reliability

Inter-rater reliability is used to examine the extent to which different raters or observers agree when measuring the same phenomena (Aspen Institute Roundtable on Comprehensive Community Initiatives, 1999).

Normal distribution

A normal distribution means the responses or scores from a particular population are distributed in a way in which most responses fall around the centre with a smaller but equal proportion falling toward either end of the range of scores. A line graph showing a normal distribution looks a bit like a bell, so it is sometimes referred to as a "bell curve."

Norms

Norms indicate the average scores on survey items. The scores of respondents to a survey can be compared to the survey's norms in order to determine how they compare to the average population or to a population with characteristics similar to their own.

Pilot test

A pilot test is a way to test out a particular instrument before it is used in a study or evaluation. For example, you might want to try out a survey you have developed with a few potential respondents to see if they understand it and if the questions provide you with the kind of information you are hoping to obtain.

Scale/subscale

A scale is a test where questions that measure the same thing, or different aspects of the same thing, are linked together (Rittenhouse, Campbell, & Dalto, 2002). A subscale measures one aspect of the larger scale. For example, if a scale were used to measure problem-solving skills, one subscale might assess the person's ability to seek the support or advice of others when faced with a problem.

Socially desirable

Sometimes respondents provide responses to a survey or measure that they believe are most favourable to their self-esteem, are most in agreement with perceived social norms (Polland, 1998), or that the researcher will want to hear. These are considered socially desirable responses.

Standard deviation

A standard deviation is a measure of the extent to which scores are variable or are spread around the mean or average score. The statistical definition of a standard deviation is the square root of the variance of the scores. Variance is the average of the squared differences between each individual score in a set of scores and the mean of that set of scores. The higher the standard deviation, the more widely the scores are spread around the mean.

Standardization

A standardized measure is one that has been administered to a very large group of people similar to those with whom the measure would be used. The data collected from this group serve as a comparison for interpreting individual, small-group, or program measure results. Standardized tests allow you to determine if an individual's test score is high, average, or low as compared to the norm (Ogden/Boyes Associates Ltd., 2001).

Statistical significance

Tests of statistical significance are done to determine if results are due to chance or are likely to reflect a real difference or change.

T-test

A t-test is a statistical test used to test the difference between the means obtained from two different populations or from the same population but under different conditions. For example, a t-test might be used to determine if the difference in the mean scores obtained on a test administered to participants in a project and that obtained by members of a comparison group is statistically significant (i.e., unlikely to be obtained by chance). It might also be used to test the statistical significance of the difference in the mean scores obtained on a test administered to participants before and after a project.

References

Aspen Institute Roundtable on Comprehensive Community Initiatives. (1999). Measures for community research. Retrieved November 2, 2004, from
http://www.aspenmeasures.org/html/glossary.htm

Guba, E.G., & Lincoln, Y.S. (1985). Naturalistic inquiry. Beverly Hills, CA: Sage.

Miles, M. B., & Huberman, M. A. (1984). Qualitative data analysis: A sourcebook of new methods. Beverly Hills, CA: Sage.

Ogden/Boyes Associates Ltd. (2001). CAPC program evaluation tool kit: Tools and strategies for monitoring and evaluating programs involving children, families and communities. Unpublished report for Health Canada, Population and Public Health Branch, Alberta/Northwest Territories Region, Calgary, AB.

Ottawa Police Services. (2001, August). You can do it: A practical tool kit to evaluating police and community crime prevention programs. Retrieved October 6, 2004, from http://www.ottawapolice.ca/en/resources/publication/pdf/you%5Fcan%5Fdo%5Fit%5Fevaluation%5Ftoolkit.pdf

Patton, M. Q. (1980). Qualitative evaluation methods. Beverly Hills, CA: Sage.

Patton, M. Q. (1990). Qualitative evaluation and research methods. London: Sage.

Polland, R.J. (1998). Essentials of survey research and analysis: A workbook for community researchers. Retrieved November 4, 2004, from
http://www.tfn.net/%7Epolland/quest.htm

Rittenhouse, T., Campbell, J., & Dalto, M. (2002, February 15). Dressed-down research terms: A glossary for non-researchers. Retrieved November 4, 2004, from the Missouri Institute of Mental Health Web site: http://www.cstprogram.org/PCS&T/Research%20Glossary/Dressed_Down_Glossary.pdf

Worsley, J., Hoen, B., Thelander, M., & Women's Health Centre at St. Joseph's Health Centre. (2002, September). Community Action Program for Children: Ontario regional evaluation final report. Toronto, ON: Health Canada, Population and Public Health Branch, Ontario Region.

Suggested Resources

Websites

Centre for Substance Abuse Prevention
Prevention Pathways
http://pathwayscourses.samhsa.gov/courses.htm

This series of online tutorials provides information about data analysis. The on-line course titled "Wading through the Data Swamp" includes topics such as descriptive statistics, correlation coefficients, t-tests, and chi-square analysis.

Corporation for National and Community Service
Project Star: AmeriCorps Program Applicant Performance Measurement Toolkit
http://www.projectstar.org/star/AmeriCorps/pmtoolkit.htm

This website includes easy-to-use step-by-step tools for analyzing performance measurement data. A reporting checklist provides guidelines on what to include in an evaluation report.

Innovation Network Online
http://www.innonet.org/

Free registration on this website entitles you to a number of excellent resources, including a statistics tutorial and a sample outline for a final evaluation report. Once you have registered, go to the resources section for these tools.

Microsoft Education
Analyzing Data with Excel 2002
http://www.microsoft.com/Education/Excel2002Tutorial.aspx

An on-line tutorial demonstrates how to analyze data using the Excel program.

Plain Language Network
Plain Language Online Training Program
http://www.plainlanguagenetwork.org/plaintrain/index.html

This network offers information on using plain language in various subject matters. Also available in French: http://www.plainlanguagenetwork.org/plaintrain/Francais/

Robert Niles.com
Statistics Every Writer Should Know
http://nilesonline.com/stats/

This excellent website is intended to help reporters better understand the statistics they encounter as journalists. As such, it is easy to use and intended for the lay reader. The site includes simple lessons on basic statistics and a table to help you determine appropriate sample sizes. The site also provides links to recommended reading on statistics, again for the lay reader.

University of Wisconsin Program Development and Evaluation
http://www.uwex.edu/ces/pdande/evaluation/evaldocs.html

This is an excellent website that offers learning tools on the analysis of qualitative and quantitative data and on reporting evaluation results. It provides tips for preparing graphs and charts.

Western Centre for Substance Abuse Prevention
Step 7: Evaluation
http://casat.unr.edu/bestpractices/eval.htm

This website provides information on best practices in setting up a prevention program, including the implementation of an evaluation. It includes a useful section on analyzing, using, and interpreting data, both qualitative and quantitative.

Journals

Practical Assessment, Research and Evaluation
http://pareonline.net/

This on-line journal includes articles on data analysis and interpreting results.

Books/Manuals

Crime Prevention is Everybody's Business: A Handbook for Working Together

Pages 76-79 of this manual provide information about how to organize your final evaluation report.

Bigwood, S., & Spore, M. (2003). Presenting numbers, tables and charts. New York: Oxford University Press.

This is a good reference book to help you present your data.

Trochim, W. (2000).
The Research Methods Knowledge Base, 2nd Edition.
http://www.socialresearchmethods.net/

This on-line textbook provides extensive information about data analysis and the reporting process.

Ottawa Police Services. (2001, August).
You can do it: A practical tool kit to evaluating police and community crime prevention programs. http://www.ottawapolice.ca/en/resources/publications/pdf/you%5Fcan%5Fdo%5Fit%5Fevaluation%5Ftoolkit.pdf

Pages 78-79 of this manual provide information about how to organize your final evaluation report.

Module 6 Worksheets

Worksheet #1

Analysis of results

Worksheet #2

Reporting results [Footnote 3]

  1. Prepare a three-minute presentation of the results of your analysis.
  2. Suggest what factors may have influenced the results (participant, project, process).
  3. Use at least one graph, chart, or table.
  4. Draw some conclusions: From these data it looks like ...
  5. If the results were different from what you expected, why do you think that is?

Module 7: Evaluation challenges and solutions

Learning objectives

This module describes some of the key challenges faced in evaluating crime prevention through social development projects. The description of each challenge is followed by some potential solutions to get your evaluation on its way.

Challenge #1: Getting "buy-in"

In order for any evaluation to succeed, it will need the support of partners, project managers, front-line staff, and participants. First, let's look at some of the challenges you might face in getting the buy-in of project staff and partners. We'll talk about participant involvement in Challenge #2.

Project staff

Project staff may have many reasons for being reluctant to support an evaluation of the program or project in which they work. As you review the challenges we have listed below, remember that staff working in your project may share some but not all of these anxieties. Or, they may not share them at all: some staff will be excited at the opportunity to participate in an evaluation, learn new skills, and demonstrate that the activities in which they are involved make a difference in the lives of others.

Partners

Community-based projects involved in crime prevention through social development generally work in partnerships or coalitions. Your organization may co-deliver project activities with other community groups, human service agencies, the educational system, police, and voluntary groups. Your partners might receive funding from different sources to support these activities. As a result, all of your partners and their funders will likely have an interest in your project's evaluation. Sometimes partners are a key source of information about project activities and outcomes. Their support – and often their active participation in the evaluation – is essential.

Solutions

Create a shared vision - Involving staff, partners and participants in evaluation planning can help to reduce anxiety and fear. If all of your key stakeholders are involved in identifying outcomes and potential indicators of success, they are more likely to support the evaluation. Consult with staff, partners, and participants about how to capture the kinds of change they see and experience.

View evaluation as research & development - Project staff are often concerned about how evaluation will affect the project's long-term prospects. It's important to take time to answer questions about the evaluation and address concerns. Explain that evaluation is intended to show what works and where improvements can be made. We should not expect that everything we do in community and human services will succeed, especially when we're working in the complex area of human behaviour. Successful businesses invest much time and resources in research and development (or "R&D") before they get their products right. Evaluation can help us learn from and refine what we do. It's not meant to judge anyone's work, but to advance our ability to make a difference in community life.

Embed evaluation in project activities - When evaluation is seen as a way to improve our ability to make change, we tend to see why it needs to be an integral part of project activities. It should not be considered an "add-on" responsibility, but a part of our day-to-day work. Integrating evaluation tools into project activities can help to address concerns about workload. Here are some ways to do it:

Provide training and support – Ensure staff get the training and support they need to feel confident in their skills and ability to do all aspects of their job, including evaluation. Identify those who need assistance. Employ experienced or knowledgeable staff as mentors to those who need help. Seek support from local universities or colleges to help with staff preparation. Local health units and foundations may also be able to help. Use this handbook and some of the exercises you learned in these training sessions at in-service staff training opportunities.

Consult funders – Funders can be flexible. (It's true!) If you and your partners have multiple funders, each with their own evaluation requirements, there may be creative ways to make these requirements fit together. If they truly compete with each other, approach your funder about the problem. Be prepared with some ideas about how you can meet them part way.

Define success – If partners have different views about what represents "success," include indicators for different success outcomes. But be careful not to measure more than you can realistically collect and analyze.

Challenge #2: Participant involvement

Maintaining participant involvement can be a challenge. While it is often simple enough to obtain participant consent in the first place, as time goes on it can be difficult to maintain participation in follow-up surveys or interviews.

Solutions

Evaluation planning – Include participants in evaluation planning. A good way to start is to involve participants in developing a logic model. Use participatory methods – like the one we used in Module 2 – to involve participants in identifying what they see as the outcomes for the project. Involving them in project and evaluation planning ensures participants have a say in what the project is doing. After all, if you want to encourage community or individual change, you'll want to know whether these changes fit with what participants want for themselves. Involving them in the design phase will increase their "buy in" both for the project and the evaluation.

Consent forms and scripts for staff – Obtaining participant consent is an important part of doing evaluation. We'll talk more about this when we discuss research ethics. Projects can give their staff a script to help them explain the purpose of the evaluation and the important role of participants in identifying what works in social development projects. Written consent forms should be worded in plain language. Staff should read the consent form aloud in case participants have literacy problems. Projects that serve people who speak English or French as a second language should have the consent form translated.

Incentives - Token gifts, cash incentives, or food vouchers can be offered as a way to acknowledge the time participants take to complete questionnaires or to participate in interviews or focus groups. If your budget doesn't allow for this, consider approaching local businesses for donations. This is a good way to let local businesses know about your project and the importance you place on learning whether it works. At the same time, it's a good way for businesses to let project participants know what they have to offer.

Convenience – Arrange for participants to complete surveys or interviews at home or just before or after project activities. Phone interviews can work if the participant has a private place to talk. Focus on what you "need to know" and avoid information that would simply be "nice to know." This will help to keep surveys short and avoid respondent fatigue.

Larger samples – There will always be a certain number of participants who drop out of the project or who choose not to participate in follow-up interviews or surveys. Select a large enough sample to ensure there are enough completed surveys or interviews at the end of your data collection period.

Feedback – Participants can be kept in the loop by providing updates on evaluation results. Projects can do this in many ways – fact sheets and community forums are well suited to keeping participants informed.

Challenge #3: Data collection

An evaluation's success rests on getting accurate and complete information. It is often too late to go back for information if you later realize it is missing. Yet despite the crucial role of good data in evaluations, there are often many players involved in data collection, resulting in less control over this aspect of the evaluation than you might like.

Consider an evaluation where front-line staff collect participant information and record notes about project activities, while partners provide information about the people they refer to the project and, when they have continued contact with participants, assess the changes those participants experience. (Some crime prevention projects affiliated with schools, for example, ask teachers to report on any changes they have seen in the behaviour of students who participate in project activities.) Community workers and neighbourhood leaders are asked to participate in interviews and focus groups to share information about community perceptions of a project. As you can see, this evaluation relies on a variety of players to provide accurate and complete information.

Control over data collection becomes even more challenging when projects rely on partners to offer components of their project activities. In these situations, the project sponsor must rely on partner organizations to provide basic attendance information or to obtain baseline information from participants at the time they enter the program and outcome information at a follow-up time.

Needless to say, the challenge of obtaining accurate and complete data is closely linked to the challenges of obtaining staff and partner and participant support for an evaluation.

Solutions

If partners are involved in collecting data, develop signed protocols outlining how participant consent will be obtained, what information will be collected, and how the information will be stored and exchanged. Put systems and agreements in place at the outset and not midway through the project.

Of course, even the best-laid plans can fall into disarray. A common problem is that evaluators or program planners fail to review evaluation information until they need to analyze it. At that point, it's too late to remedy common problems such as misunderstood questions, incomplete forms, or the failure to provide identifying information that links pre- and posttest results. Make sure you try out your data collection instruments with a small group before you start formal data collection. Review completed instruments regularly to ensure questions were understood and the information is complete. This will avoid surprises at the analysis stage.

Challenge #4: Realistic outcomes

Success = reduced crime and victimization

It seems pretty straightforward, doesn't it? The best way to evaluate the success of crime prevention projects is to look at the extent to which they prevent crime and victimization in their communities.

But linking reduced crime and victimization to a particular project can be difficult. Crime and victimization rates are affected by many factors outside of the control of a community project:

You can probably think of many more examples of the difficulties of linking crime prevention to particular project activities.

So, if community rates of crime and victimization are not realistic outcomes, what are? You might decide that you at least have more control over crime reduction among the people directly involved in your project's activities.

Thus, success = reduced crime by project participants.

You can request access to court and police records to see if participant involvement in crime changes over time (with the written permission of participants, of course). Many projects involved in crime prevention through social development use this as an outcome. It can be a very useful way to determine what changes result from a project. But some caution is needed here too. Let's look at some of the difficulties associated with different sources of information about participant involvement in crime.

Solutions

First, remember that you must decide at the start of your project what information (indicators) you will use to assess success. If you wait until the project is underway, you might find that it's too late to get some of the information you would have needed.

If your project is based on a solid logic model, you might not need to measure actual change in crime or victimization. You can instead measure change in some of the short-term outcomes that the logic model suggests will eventually lead to change in crime and victimization.

Below we have listed some alternatives to measures of crime and victimization, but these will depend on the type of intervention you have planned. We have also provided some suggestions to help you if you choose to use measures that assess reduced involvement in crime and victimization.

Consider:

Alternative measures to assess participant or community change might include:

More than one measure – Imagine you were planning to look at changes to the number of contacts with police among youth involved in your gang-exit program and among a comparison group. You believe the fact that youth in your project had previous contact with police might make them more likely to be the subject of police interest in the longer term. This could lead to continued contact with police regardless of whether they commit offences. Collecting information about the number of convictions as well as police contacts will help to determine if the youth involved in the project were judged by a court to have been involved in crime or were simply the subject of continued police interest.

In another example, you may be concerned that participants in an anti-bullying program are providing socially desirable responses on your pre-post questionnaire about their involvement in bullying activities. You could use school records about behaviour in school and suspensions or expulsions from school to corroborate the results of the questionnaire.

The measures you choose do not have to relate directly to crime, but can provide information about outcomes such as anger management or peer relations that are related to progress toward the longer-term outcome of reduced involvement in crime or reduced victimization.

Collect information about the type of offence – This is a particularly good idea if you want to look at recidivism. Even if participants do re-offend, these offences may not be as serious as earlier ones, suggesting some progress toward change.

Other important considerations – When using criminal involvement as an indicator of success, consider the following suggestions:

Challenge #5: Data analysis

Analyzing the information you collect in your evaluation might seem like an intimidating task. If so, you are not alone. Community and human service workers don't often have training or experience in data analysis, whether qualitative or quantitative. Who isn't intimidated when they are asked to do something they do not feel qualified to do?!

Solutions

First, just as you should decide at the start of your project what indicators you will use to assess its outcomes, you should decide at the start of your project how you will analyze the data you plan to obtain.

If you have chosen to hire an external evaluator, you can ask them to analyze the data that is collected. If you're completing your evaluation on your own, here are some suggested resources to help you analyze your data.
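If you do take on a basic quantitative analysis yourself, it can be as simple as summarizing pre- and post-program questionnaire scores. The sketch below is only an illustration of the kind of summary you might produce, not a prescribed method; it is written in Python, and the file name and column names are hypothetical placeholders you would replace with your own.

```python
# Minimal sketch: summarizing pre/post questionnaire scores.
# Assumes a CSV file named "participant_scores.csv" (hypothetical) with one row
# per participant and columns "participant_id", "pre_score", "post_score".
import csv

changes = []
with open("participant_scores.csv", newline="") as f:
    for row in csv.DictReader(f):
        # Include only participants who completed both questionnaires.
        if row["pre_score"] and row["post_score"]:
            changes.append(float(row["post_score"]) - float(row["pre_score"]))

if changes:
    average_change = sum(changes) / len(changes)
    improved = sum(1 for c in changes if c > 0)
    print(f"Participants with both scores: {len(changes)}")
    print(f"Average change in score: {average_change:.1f}")
    print(f"Number who improved: {improved} of {len(changes)}")
```

Even a simple summary like this (how many participants improved, and by how much on average) can anchor a discussion of your findings with staff and partners.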

Challenge #6: Finding an evaluator

You may not have the financial resources within your budget to pay for an outside evaluator. But even if you are lucky enough to have money to hire one, it can be a challenge to find someone who is suited to your evaluation project and who will provide quality work. This can be a particular problem in rural and remote communities.

Solutions

Even if you don't have the resources within your budget to hire an outside evaluator, don't give up. Partner organizations and local universities or colleges may have staff or students who are interested in the work you are doing and can help with the evaluation.

Here are some options for finding external evaluators, both paid and unpaid.

Challenge #7: Ethical considerations

Ethical issues can arise when conducting research. We have listed some of them here. You may be aware of others that are particular to the population you serve.

Solutions

Many resources exist to help you conduct your evaluation in a way that respects participants and follows guidelines for research ethics. Make sure you follow the key principles of research ethics:

Community research ethics boards (REBs) are available in some provinces to help community groups review the ethics of research activities. Ensuring your evaluation has passed an ethics review by a REB can help to relieve concerns about ethics. For example, while project staff may be concerned about random assignment, ethics boards may not share their concern. When the effectiveness of an intervention is not known, random assignment may be considered the most ethical way to determine who gets service and who is assigned to a waiting list or a comparison group. Bear in mind that, because we can't assume that project activities are effective, withholding a service may be no more harmful than providing it. Even when evaluating new drugs that may help to save lives, random assignment is used to determine if the drug is effective.

Research ethics boards follow the Tri-Council Policy Statement: Ethical Conduct for Research Involving Humans in their review of research studies. It includes guidelines for handling situations such as naturalistic observation, one of the challenges we listed earlier. If a REB is not available in your community, reviewing your evaluation plan in light of the Tri-Council Policy Statement is a good way to ensure you are following standard guidelines for research ethics (see http://www.pre.ethics.gc.ca/english/policystatement/policystatement.cfm for a copy of the Tri-Council Policy Statement).

University research offices can provide sample resources related to research ethics such as consent forms and scripts for recruiting research participants in an ethical manner. The web site of the University of Waterloo Office of Research is a good example (see http://www.research.uwaterloo.ca).

Voluntary participation is a cornerstone of evaluation research. Always explain to participants that they can choose not to participate in the evaluation, they may refuse to answer any questions they do not wish to answer, and they can withdraw their consent to participate at any time. Reassure participants that if they decide not to participate in the evaluation or to withdraw from it at any time, their decision will not affect their ability to participate in project activities.

Fully inform participants about what they will be asked to do as part of the evaluation. Explain whether the evaluation will include surveys, observation, requests to obtain school or court records, interviews with referral sources, or any other personal information. Let them know how this information will be used. When informing participants about evaluation activities, stress the role of the evaluation in project improvement. Ensure participants understand the intent is to evaluate the program, not them.

Obtain written consent. Obtain parental consent when children or youth under the age of 17 are involved.

All personal information should be kept confidential. Evaluation findings should always be reported in a way that will not reveal individual identities. But be clear about the limits to confidentiality and explain these limits to project participants. This applies not just to information obtained through an evaluation, but to information obtained during project activities too. Provinces and territories have legislation that requires staff in social or community services to report to child welfare authorities information that suggests a child is at risk of abuse and/or neglect. Review the child welfare legislation in your province or territory.

Where participants are involved in the judicial system, courts can subpoena records that might provide further information about the defendant's personal circumstances. Evaluation (and project) participants should be made aware of these limits to confidentiality.

Ensure you are informed about cultural practices and traditions that will need to be respected both in the evaluation and in your project activities. The Tri-Council Policy Statement: Ethical Conduct for Research Involving Humans includes a section with recommended practices for research involving Aboriginal people (see http://www.pre.ethics.gc.ca/english/policystatement/section6.cfm).

Use language that is respectful of project participants in your evaluation reports. Evaluation results should be written or delivered in a way that is accessible to project participants.

Challenge #8: Reflecting cultural and community differences

Some communities and cultural groups have unique histories and circumstances that can provide additional challenges to evaluation planners.

Solutions

Participatory methods – Involving community members in evaluation planning can go a long way toward overcoming resistance to evaluation. If participants are involved in planning the project and determining the indicators of success, they are likely to be more willing to participate in data collection. Participatory methods also give potential participants a voice in selecting the evaluation measures.

Qualitative methods – Qualitative measures are well suited to storytelling and oral traditions. As such, they are particularly suited to Aboriginal cultures. While quantitative measures may still be required to tell the full story of a project's ability to achieve its outcomes, the use of qualitative measures will provide an in-depth and rich context for the evaluation.

Culturally appropriate measures – While many standardized measures are inappropriate for some cultural groups, some measures have been tested with a variety of cultures. Others, such as those developed for the Aboriginal Head Start (AHS) programs in Canada and the U.S., have been developed with particular groups in mind. The AHS programs would be a good starting point for culturally appropriate measures used with Aboriginal children. If it is not possible to find measures specifically developed for a particular population, ask representatives from the cultural groups you serve to review potential survey questions for evidence of cultural bias.

Multiple measures – When sample sizes are small, use more than one method of data gathering. You can use the results from the various methods to corroborate one another.

Training – Remote communities and groups new to Canada may have limited experience not just with evaluation, but with developing community-based projects. In such cases, look for ways to obtain training in project planning and design. These evaluation modules developed by the NCPC may be a good start. Modules 2 and 3 focus on techniques that are useful both for evaluation planning and for project planning.

Interpreters – If your project serves new immigrants and/or refugees who have limited knowledge of English or French, set aside some money for translators/interpreters or make provisions to access volunteers who can help with these tasks. This is important not just for your evaluation, but for project activities too. Because it is not always possible to assess the accuracy of translation when oral interpretation is provided, ensure the interpreters you hire have a sound knowledge of both languages.

Translation – Materials used for evaluation purposes should ideally be translated into the language to be used and then translated back into the original language (reverse translation). This will allow you to assess the accuracy of the translation. We know that few projects will have the resources to do this, however. At a minimum, ensure that more than one person reviews the translation for accuracy.

What other evaluation challenges come to mind?

You have undoubtedly encountered or can anticipate other challenges in evaluating community-based social development projects. Take some time to think about what they include. What are some possible solutions? We've left some space for you to record them.

Resources and supports

Because resources available in different locations across Canada can differ widely, this list is far from comprehensive. You may know other resources unique to your province or community that can help with various aspects of your evaluation. Talk to your partners and other stakeholders to learn about sources of support in your community.

Glossary of terms

Indicator

An indicator is a variable (or information) that measures one aspect of a program or project. It indicates whether a project has met a particular goal. There should be at least one indicator for each significant element of the project (i.e., at least one for each outcome identified in the logic model).

Logic model

A logic model is a way of describing a project or program. It is a tool to help in project planning and evaluation. A logic model describes the resources and activities that contribute to a project and the logical links that lead from project activities to the project's expected outcomes. Logic models are often depicted as a flow chart that includes the project's inputs, activities, outputs, and outcomes.

Naturalistic observation

Naturalistic observation is a method of data collection in which the researcher or evaluator observes project activities and records information about them in a structured and systematic way. Observation as a data collection method is discussed further in Module 4 of this handbook.

Respondent fatigue

Respondent fatigue can occur when participants in an evaluation are asked to respond to too many questions at once. The questionnaire, interview, or focus group becomes tedious and respondents pay less attention to their responses in an effort to complete the questionnaire or to end the interview or focus group.

Socially desirable

Sometimes respondents give answers to a survey or measure that they believe are most favourable to their self-esteem, most in agreement with perceived social norms (Polland, 1998), or most likely to be what the researcher wants to hear. These are considered socially desirable responses.

Standardized measures

A standardized measure is one that has been administered to a very large group of people similar to those with whom the measure would be used. The data collected from this group serve as a comparison for interpreting individual, small-group, or program measure results. Standardized tests allow you to determine if an individual's test score is high, average, or low as compared to the norm (Ogden/Boyes Associates Ltd., 2001).
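For example, to say whether an individual's score is high, average, or low compared to the norm, evaluators often convert a raw score to a standard (z) score using the norm group's mean and standard deviation published in the measure's manual. The sketch below, in Python, illustrates the calculation only; the norm mean and standard deviation shown are made-up placeholder values, not figures from any actual measure.

```python
# Minimal sketch: comparing an individual's raw score to published norms.
# The norm mean and standard deviation here are made-up placeholder values;
# a real standardized measure supplies these in its manual.
NORM_MEAN = 50.0
NORM_SD = 10.0

def z_score(raw_score: float) -> float:
    """Number of standard deviations the raw score sits above or below the norm mean."""
    return (raw_score - NORM_MEAN) / NORM_SD

for score in (38, 50, 63):
    z = z_score(score)
    # Rough rule of thumb for illustration: within one standard deviation is "about average".
    label = "about average" if abs(z) < 1 else ("above the norm" if z > 0 else "below the norm")
    print(f"Raw score {score}: z = {z:+.1f} ({label})")
```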

References

Interagency Advisory Panel on Research Ethics. (1998). Tri-Council policy statement: Ethical conduct for research involving humans (with 2000, 2002 updates). Retrieved October 12, 2004, from
http://www.pre.ethics.gc.ca/english/policystatement/policystatement.cfm

Ogden/Boyes Associates Ltd. (2001). CAPC program evaluation tool kit: Tools and strategies for monitoring and evaluating programs involving children, families and communities. Unpublished report for Health Canada, Population and Public Health Branch, Alberta/Northwest Territories Region, Calgary, AB.

Polland, R.J. (1998). Essentials of survey research and analysis: A workbook for community researchers. Retrieved November 4, 2004, from
http://www.tfn.net/%7Epolland/quest.htm

Resources

Websites

American Evaluation Association
Guiding Principles for Evaluators
http://www.eval.org/Guiding%20Principles.htm

This document was prepared by the American Evaluation Association, the professional body that represents evaluators in the United States. It provides information on professional conduct in program evaluation.

Canadian Evaluation Society
Guidelines for Ethical Conduct
http://www.evaluationcanada.ca//site.cgi?s=5&ss=4&_lang=an

This document outlines ethics guidelines for members of the Canadian Evaluation Society (CES), the professional body of evaluators in Canada.

Interagency Advisory Panel on Research Ethics
Tri-Council Policy Statement: Ethical Conduct for Research Involving Humans
http://www.pre.ethics.gc.ca/english/policystatement/policystatement.cfm

The Tri-Council Policy Statement provides ethics guidelines for research funded by three federal granting agencies: the Natural Sciences and Engineering Research Council, the Social Sciences and Humanities Research Council, and the Canadian Institutes of Health Research.

Government of Canada – Evaluation and Data Development
Evaluation Forum Newsletter
http://www11.hrdc-drhc.gc.ca/pls/edd/evalForNews.main

The Evaluation Forum Newsletter addresses topics such as cultural sensitivity in evaluation and conducting evaluations with Aboriginal communities.

National Council on Ethics in Human Research
http://www.ncehr.medical.org/english/home.php

This website provides various resources on ethical issues in research.

University of Michigan
Program Evaluation Standards
http://www.eval.org/EvaluationDocuments/progeval.html

Standards of conduct for individuals conducting evaluation research are listed on this site.

Manual and Guides

Holt, J.D. (1993).
"How About Evaluation: A Handbook about project self evaluation for First Nations and Inuit Communities." Department of National Health and Welfare, Medical Services Branch.

This guide is geared toward First Nations and Inuit communities that are evaluating projects and programs.

Ministry of the Solicitor General of Canada. (1998).

This manual is useful not just for First Nations communities involved in evaluation research, but for anyone interested in evaluating a crime prevention project. The self-evaluation manual leads the reader through the steps of evaluation.

National Evaluation of Sure Start
Conducting Ethical Research
http://www.ness.bbk.ac.uk/documents/GuidanceReports/165.pdf

This guide provides advice on conducting research in a way that reflects research ethics.

University of Victoria
Protocols and Principles for Conducting Research in an Indigenous Context
http://web.uvic.ca/igov/programs/masters/igov_598/protocol.pdf

This guide provides information on ethical conduct in research involving Aboriginal communities.

Textbook

Trochim, W. (2000).
The Research Methods Knowledge Base, 2nd Edition.
http://www.socialresearchmethods.net/

This on-line textbook provides a simple overview of ethical considerations in evaluation research.

Module 7 Worksheets

Worksheet #1

Challenge #1: Getting staff and partner buy-in

Solutions:

Worksheet #2

Challenge #2: Participant buy-in

Solutions:

Worksheet #3

Challenge #3: Data collection

Solutions:

Worksheet #4

Challenge #4: Realistic outcomes

Solutions:

Worksheet #5

Challenge #5: Data analysis

Solutions:

Worksheet #6

Challenge #6: Finding an evaluator

Solutions:

Worksheet #7

Challenge #7: Ethical considerations

Solutions:

Worksheet #8

Challenge #8: Reflecting cultural and community differences

Solutions:

General Resource List

Websites

American Evaluation Association
http://www.eval.org/

This is the web site for the U.S. equivalent of the Canadian Evaluation Society. The site provides information, links, and products related to evaluation.

Canadian Evaluation Society
http://www.evaluationcanada.ca

This is the website for the national organization of Canadian evaluators. It offers information, links, journals, and newsletters in both English and French.

Innovation Network Online
http://www.innonet.org

This site provides useful resources for evaluation research, ranging from general information to specific topic areas.

Management Assistance Program for Non-profits (MAP)
Basic Guide to Program Evaluation
http://www.mapnp.org/library/evaluatn/fnl_eval.htm

This website is relevant to community organizations that are conducting program evaluation. Topics range from basic evaluation concepts to challenges and issues.

U.S. Department of Education
Research and Statistics
http://www.ed.gov/rschstat/landing.jhtml?src=rt

This site provides a variety of resources related to research, evaluation, and best practices for education and prevention programs.

United Way of America
Outcomes Measurement Resource Network
http://national.unitedway.org/outcomes/

This site is an excellent source for an introduction to outcome measurement. It lists useful resources related to evaluation research.

United Way of Toronto
PEOD Evaluation Clearinghouse
http://www.unitedwaytoronto.com/PEOD/index.html

This site is highly recommended as a clearinghouse for evaluation guides and instruments appropriate for evaluation research.

University of Wisconsin – Extension
Program Development and Evaluation
http://www.uwex.edu/ces/pdande/evaluation/index.html

This website includes extensive coverage of evaluation information, publications, instruments, and suggestions.

Western Michigan University
Key Evaluation Checklist
http://www.wmich.edu/evalctr/checklists/kec.pdf

This document provides a checklist to help in evaluation planning and implementation.

W.K. Kellogg Foundation
Evaluation Toolkit
http://www.wkkf.org/Programming/Overview.aspx?CID=281

This website provides access to downloadable publications and resources related to evaluation design.

Manuals and Guides

Child Survival
Participatory Program Evaluation Manual: Involving Program Stakeholders in the Evaluation Process
http://www.childsurvival.com/documents/PEManual.pdf

This is a comprehensive manual that thoroughly covers the processes of program evaluation. Additional features of this manual include addressing program evaluation challenges.

Health Canada
Guide to Project Evaluation: A Participatory Approach
http://www.phac-aspc.gc.ca/ph-sp/phdd/resources/guide/index.htm

This is an excellent guide for beginners in evaluation. It outlines all aspects of evaluation research.

Management Assistance Program for Non-Profits (MAP)
Basic Guide to Outcome Evaluation for Non-profit Organizations with Very Limited Resources
http://www.mapnp.org/library/evaluatn/outcomes.htm

This guide provides an overview of the basic steps in outcome evaluation. It is geared to non-profit agencies with limited resources.

National Science Foundation Directorate for Education and Human Resources
User-Friendly Handbook for Project Evaluation
http://www.nsf.gov/pubs/2002/nsf02057/nsf02057.pdf

This handbook provides an excellent overview of program evaluation, from planning to reporting results.

Horizon Research, Inc.
Taking Stock: A Practical Guide to Evaluating Your Own Programs.
http://www.horizon-research.com/reports/1997/stock.pdf

This guide provides a comprehensive overview of program evaluation.

Journals

Canadian Journal of Program Evaluation
c/o Canadian Evaluation Society, 1485 Laperriere Ave., Ottawa, ON K1Z 7S8
http://www.evaluationcanada.ca/site.cgi?s=4&ss=2&_lang=an

This journal covers a wide range of evaluation topics. Electronic access is restricted to members of the Canadian Evaluation Society (CES). Non-members can access the document at some university libraries. Memberships can be obtained through the CES website.

Practical Assessment, Research and Evaluation (PARE)
http://pareonline.net

This on-line journal provides articles pertaining to various evaluation research subjects.

Textbooks

Trochim, W.M. (2002).
Research Methods Knowledge Base, 2nd Edition
http://www.socialresearchmethods.net/

This on-line textbook introduces the user to evaluation, its basic definitions, goals, methods, and the overall evaluation process. It includes answers to frequently asked questions about evaluation.

Newsletters

Harvard Family Research Project
The Evaluation Exchange: Emerging Strategies in Evaluating Child and Family Services
http://gseweb.harvard.edu/~hfrp/eval.html

This free newsletter provides insight into various emerging issues and topics related to evaluation research.

Community/Crime Prevention Resources

Websites

These websites provide information relating to crime prevention and evaluation in First Nations communities.

National Crime Prevention Strategy
www.publicsafety.gc.ca/ncpc

The National Strategy's website offers information regarding evaluation and examples of prevention programs that have undergone evaluation. (Available in French)

Manuals and Guides

International Centre for the Prevention of Crime
From Knowledge to Policy and Practice: What Role for Evaluation?
http://www.crime-prevention-intl.org/publications/pub_4_2.pdf

This publication explores evaluation in the context of prevention. Various government frameworks for evaluation and prevention are illustrated.

National Strategy on Community Safety and Crime Prevention
You Can Do It: A Practical Tool Kit to Evaluating Police and Community Crime Prevention Programs
http://dsp-psd.pwgsc.gc.ca/Collection/J2-180-2001E.pdf

This tool kit, developed by Ottawa Police Services, provides an overview of evaluation and information for planning and implementing an evaluation and communicating results.

Northern Territory Department of Justice, Office of Crime Prevention
Guide for Community Crime Prevention Partnerships
http://www.nt.gov.au/justice/ocp/docs/guide.pdf

This guide provides insight into assessing the need for crime prevention, creating prevention partnerships, and implementing an action plan.

Footnotes

  1. Special thanks to Brenda Simpson, who graciously shared the contents of her workshop entitled Analysing, Interpreting and Reporting Outcomes. Some of the ideas in this module are taken from her workshop.

  2. This section is adapted with permission from Kenton, N., & Sehl, M. (2002). Community Action Program for Children (CAPC) regional evaluation tool kit. Toronto: Health Canada.

  3. This exercise is adapted from: Analysing, Interpreting and Reporting Outcomes Workshop by Brenda Simpson.
