Some form of evaluation will certainly be a feature of your project, even if it is only based on subjective perceptions and general observations.
The most important question in evaluation is: what was the purpose or objective of the project - what was the desired outcome? It is unreasonable to expect one small project to change everything overnight, so set realistic targets.
If you hope to evaluate in a way which gives results of any consistency, reliability or validity, your planning at the outset will need to cover this area. To wait until the end of the project before attempting to evaluate is a recipe for confusion and self-deception - all you will be able to check after the event is whether the 'consumers' say they were satisfied, and whether they have more or less positive attitudes towards their experiences. While this kind of 'evaluation' can give the project organiser a nice warm rosy glow, the 'grateful testimonials' approach will be regarded with cynicism by many hard-headed professionals.
Having said that, to evaluate thoroughly can be fairly time consuming, and there is no point in devoting scarce time to this part of the exercise unless you have particular objectives in mind.
The primary purpose of evaluation is to check whether what you have done has worked, so that you can adjust or improve your organisation on a subsequent occasion. Beyond that, evaluation information can be useful in a host of other ways. Positive evaluation information can be extremely valuable to feed back, in suitable form, to the helping participants to enhance their motivation even further, and it may also serve a wide range of social, educational and quasi-political purposes - not least improving the project co-ordinator's own motivation. But if in the final analysis there is no clear purpose in mind, then don't do it, for it is likely to be a waste of time.
Firstly, a consideration of the research design is necessary. Most adequate summative evaluation will include at least a 'before and after' assessment of some sort, hence the need to plan evaluation before the project starts. The school may already routinely collect data in the curriculum area in question, and thus information on progress prior to peer tutoring and after peer tutoring could be readily available. Wherever possible, it is useful to establish a control or comparison group of children who are not experiencing peer assisted learning, or are having some alternative experience which includes as much time on the curriculum task and individual attention as the peer tutoring project.
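If you want to go beyond informal comparison, the sketch below shows one minimal way of handling 'before and after' data alongside a comparison group. It is written in Python purely for illustration; the scores and group sizes are invented, and a simple t-test on gain scores is only one option among several (an analysis of covariance on post-test scores would be a more rigorous alternative).

```python
# A minimal sketch of a 'before and after' comparison against a control group.
# All scores below are invented; a gain-score t-test is just one simple option.
from scipy import stats

# Hypothetical pre- and post-test scores for each child
tutored_pre  = [12, 15, 9, 14, 11, 13, 10, 16]
tutored_post = [18, 20, 14, 19, 15, 18, 13, 21]
control_pre  = [13, 14, 10, 15, 12, 11, 9, 16]
control_post = [15, 15, 12, 17, 13, 12, 10, 18]

# Gain score (post minus pre) for each child
tutored_gain = [post - pre for pre, post in zip(tutored_pre, tutored_post)]
control_gain = [post - pre for pre, post in zip(control_pre, control_post)]

print("Mean gain (tutored):", sum(tutored_gain) / len(tutored_gain))
print("Mean gain (control):", sum(control_gain) / len(control_gain))

# Independent-samples t-test on the gain scores
result = stats.ttest_ind(tutored_gain, control_gain)
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.3f}")
```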
Attainment and cognitive gains should also be checked for both tutors and tutees, since the helpers may be expected to gain as much as, if not more than, the helped in these areas. Attitudinal and social gains should also be assessed in both helpers and helped. Especially in cross-age projects, separate measuring instruments may be necessary for the helpers and the helped.
In measuring attainment gains, decisions have to be taken about whether to use some form of norm-referenced testing (to compare progress with normal expectations) or some form of criterion-referenced testing (to check mastery of the specific knowledge, tasks and skills learned). The latter approach tends to give bigger gains per se, because the assessment is more closely related to what has been learned, but the former gives a better index of generalisation to other materials. Some form of assessment delivered and scored by computer (as discussed above) could greatly reduce time costs for the teacher.
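To make the distinction concrete, here is a small worked sketch of the two kinds of gain a project might report; every figure is invented for illustration only.

```python
# Worked sketch of criterion-referenced vs norm-referenced gains (invented figures).

# Criterion-referenced: mastery of the specific items actually taught
items_taught = 20
mastered_before, mastered_after = 6, 16
criterion_gain = (mastered_after - mastered_before) / items_taught
print(f"Criterion-referenced gain: {criterion_gain:.0%} of taught items newly mastered")

# Norm-referenced: movement in a standardised score (population mean 100, SD 15),
# i.e. progress relative to normal expectations for the child's age
standardised_before, standardised_after = 92, 97
print(f"Norm-referenced gain: {standardised_after - standardised_before} standardised-score points")
```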
Some finer-grained analysis of the children's work could also be applied, perhaps including error frequency counts, increases in speed, and so on. For either norm- or criterion-referenced measurement, a decision will have to be made about whether to assess individually or in a group - essentially a choice between a quick and easy but less reliable approach and a slow, time-consuming but more detailed and trustworthy one.
The social gains which can accrue from helping will probably be evident from direct observation, but attempts could be made to measure this in other ways, referring to either improved relationships or improved behaviour or both. Unfortunately the paper and pencil measures available for this purpose (most typically checklists, rating scales, sociometry and so forth) tend to be of low reliability. The number of disciplinary referrals over time can be counted, but again these tend to be of doubtful reliability. Attitudinal data tends to be equally nebulous. This can come from individual or group interview or discussion, which could be tape recorded for later analysis, or from a variety of questionnaires, rating scales or checklists.
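If you do rely on a checklist or rating scale, it is worth at least a rough check on its reliability before leaning on the results. The sketch below computes Cronbach's alpha (a common index of internal consistency) for a hypothetical four-item rating scale; the scale, the ratings and the 0.7 rule of thumb are illustrative assumptions, not part of this manual's procedures.

```python
# Sketch of an internal-consistency (Cronbach's alpha) check on a rating scale.
# The four-item scale and the ratings are invented for illustration.
import statistics

# Rows = children, columns = items on the hypothetical rating scale
ratings = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [1, 2, 2, 1],
    [3, 3, 4, 3],
]

k = len(ratings[0])                                   # number of items
item_variances = [statistics.variance(col) for col in zip(*ratings)]
total_variance = statistics.variance([sum(row) for row in ratings])
alpha = (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

print(f"Cronbach's alpha = {alpha:.2f}")   # values much below 0.7 suggest low reliability
```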
The collection of 'process' data about the organisational effectiveness of the project is essential, to show how smoothly the organisation actually ran in practice.
Attendance rates, distribution of lengths of helping sessions, information from observational checklists, and so forth can all be analysed and related to summative outcome information. You might cumulate and synthesise your own informal observations regarding learning behaviours and styles or developments in meta-cognitive awareness.
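As one illustration of how such process data might be pulled together, the sketch below summarises invented diary/record-card data and relates attendance to attainment gains. The record structure and the 20-session target are assumptions made for the example, and the correlation function requires Python 3.10 or later.

```python
# Sketch of summarising invented diary/record-card 'process' data and relating
# attendance to attainment gain (statistics.correlation needs Python 3.10+).
import statistics

# Hypothetical records: sessions attended (out of 20 planned), mean session
# length in minutes, and attainment gain score for each tutoring pair
records = [
    {"pair": "A", "sessions": 18, "mean_minutes": 22, "gain": 7},
    {"pair": "B", "sessions": 12, "mean_minutes": 15, "gain": 3},
    {"pair": "C", "sessions": 20, "mean_minutes": 25, "gain": 8},
    {"pair": "D", "sessions": 15, "mean_minutes": 18, "gain": 4},
]

attendance = [r["sessions"] / 20 for r in records]
gains = [r["gain"] for r in records]
lengths = [r["mean_minutes"] for r in records]

print(f"Mean attendance rate: {statistics.mean(attendance):.0%}")
print(f"Mean session length: {statistics.mean(lengths):.0f} minutes")

# Does better attendance go with bigger attainment gains?
print(f"Attendance/gain correlation: {statistics.correlation(attendance, gains):.2f}")
```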
Despite your best efforts, you can be sure that there will be some surprises. Observation may indicate a variety of unpredicted side effects, both positive and negative. You will be interested to see generalisation by project students to other times, other materials and other curriculum areas, and from the helping sessions to other students (who may be helped or begin helping spontaneously even though not part of the project). Once you have this kind of motivation and enthusiasm beginning to bubble in the peer group, you will soon begin to think of ways of capitalising upon it.
Only rarely does peer assisted learning generate contagious enthusiasm right through an establishment. However, spontaneous generalisation from the specific helping curriculum to other wider areas of school life may be evident - thus there may be evidence elsewhere of improved examination results, a higher percentage of academic assignment completion across the curriculum, and so forth.
At this stage, it is easy to be so persuaded by the positive impact of your efforts that you devote all your time to establishing new, grander and more wide ranging peer tutoring projects. A word of caution is necessary. Many educational innovations have shown good short-term impact, but at longer term follow-up the positive results have been found to have 'washed out'. It is thus worth devoting a little of your precious time to both short- and long-term follow-up of social and attainment gains in your original project group.
For the project to have been really worthwhile, some enduring effect should be perceptible six months later, and maybe even twelve months later. How realistic it is to expect one short intervention to have an impact that remains discernible much longer than this is a matter for considerable debate among educational evaluators. In the long run, an accumulation of spontaneous and random events is likely to mask the impact of almost anything.
There are considerable problems of assessment and measurement in mathematics, particularly with young children. Consider early how you intend to measure whether the objectives or outcomes have been attained, and this will lead you to consider qualitative measures, quantitative measures, or both.
Especially if tests are to be used, the selection of a measuring instrument for quantitative outcome evaluation depends on the purposes underlying the project, and whether group or individual administration is practical. The age, range of ability, and level of attainment of the children also affect the choice of test. Norm-referenced, criterion-referenced or diagnostic tests can be used. Commercial normed group maths tests tend to be very wide-ranging and rather superficial and insensitive, since maths is such a vast subject and mathematical competence is far from being a unitary skill. In addition to the discussion about research earlier in this section, more detailed information about suitable tests will be found in Topping and Bamford (1998b).
Qualitative evaluation often relies on the subjective experiences of the people who are involved with the project. Evaluation insights can be gained through discussion and questionnaires. Questionnaires might need to be read out to weaker readers. Changes in attitudes can be measured by scales and questionnaires, by open-ended discussion, or by structured direct observation. Designing questionnaires is not easy, and care must be taken not to introduce bias. A sample of types of questionnaires that have been used with parents and pupils will be found in the Reproducibles section of this manual. The Structured Planning Format also includes a detailed section concerning evaluation choices (see Reproducibles section of this manual).
Process data should also be gathered during the project, to indicate how smoothly the organisation is running. Diary/record cards yield weekly information which can be summarised.
If your evaluation outcomes seem disappointing, think carefully about why this might have been. Were your objectives too wide-ranging, or unrealistically high? Did you measure only the outcomes you wanted to see, forgetting that other key players might have had quite different objectives, which were not measured? Was the project organisation appropriate in principle, but just not properly implemented (for whatever reasons), so your results are not a fair test of the programme when properly implemented? Ask yourself about the measuring instruments used - were they relevant or sensitive enough? Or was it that the programme design was perfectly appropriate for some cultural settings and contexts, but not really for the one in which you operate? In short, poor outcome results might be due to wrong objectives, wrong organisational planning, poor implementation, wrong measures, or some combination thereof. If you suffer from any of these, take steps to fix the problem.