Advice for teachers -

Measurement terms

VCE Psychology requires that students can distinguish between and apply the terms ‘repeatability’, ‘reproducibility’, ‘reliability’ and ‘validity’ when analysing their own and others’ investigation findings. An understanding of the terms ‘accuracy’ and ‘precision’ may also be important in the analysis and discussion of investigations of a quantitative nature.

Replication of procedures: repeatability and reproducibility

For reasonable conclusions to be drawn, experimental data and results must be more than one-off findings: they should be repeatable and reproducible. Repeatability refers to the closeness of agreement between independent results obtained with the same method on identical test material, under the same conditions (same operator, same apparatus and/or same laboratory). Reproducibility refers to the closeness of agreement between independent results obtained with the same method on identical test material but under different conditions (different operators, different apparatus and/or different laboratories). The purposes of reproducing experiments include checking claimed precision and uncovering any systematic errors, arising in one experiment or group or another, that may affect accuracy. Reproducibility is often used as a test of the reliability of an experiment.


Experimental reliability refers to the likelihood that another experimenter who performs exactly the same experiment under the same conditions will generate the same results (within a very narrow range of values). Experiments that use human judgment may not always produce reliable results. Small sample sizes or insufficient trials may also yield results that are not reliable.


A measurement is ‘valid’ if it measures what it claims to be measuring; for example, a test of memory should measure memory and not something else (such as intelligence or emotional state). Both experimental design and the implementation should be considered when evaluating validity.

A distinction can be made between internal and external validity; both types are relevant to evaluating the validity of a research investigation or procedure:

  • Internal validity refers to whether there is a causal relationship between the independent and dependent variables or whether any observed effects of the investigation or procedure are due to some other factor. Internal validity can be improved through a number of methods including using a particular type of research design, controlling extraneous variables, using standardised instructions and procedures, counterbalancing and eliminating experimenter effects.
  • External validity refers to the extent to which the results of an investigation can be generalised to other settings, other people and over time. External validity can be improved through a number of methods including conducting experiments in settings natural to the research question of interest and using random sampling to select participants.

Experimental data are said to be valid if the measurements that have been made are affected by a single independent variable only. They are not valid if the investigation is flawed, for example because controlled variables have been allowed to change or because there is observer bias.


Experimental accuracy refers to how close the experimental result obtained is to the accepted, or ‘true’, value of the particular quantity subject to measurement. The true value is the value that would be found if the quantity could be measured perfectly. For example, if an experiment determines that a given substance has a mass of 2.70 g, but the actual or known mass is 3.20 g, then the measurement is not accurate because it is not close to the known value. The difference between a measured value and the true value is known as the ‘measurement error’.
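
The arithmetic in the mass example can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `measurement_error` is hypothetical, and the sign convention (measured value minus true value) is an assumption, since the text defines measurement error only as the difference between the two values.

```python
def measurement_error(measured, true_value):
    """Return the difference between a measured value and the true value.

    Sign convention (an assumption): measured minus true, so a negative
    result means the measurement lies below the true value.
    """
    return measured - true_value


# Values from the example in the text: measured 2.70 g, known mass 3.20 g.
error_g = measurement_error(2.70, 3.20)
print(f"Measurement error: {error_g:.2f} g")
```

Under this convention the measured mass lies 0.50 g below the known mass, which is why the measurement is described as not accurate.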

‘Accuracy’ is not a quantity and therefore cannot be given a numerical value. A measurement may be described as ‘more accurate’ when its method and/or instruments clearly reduce measurement error, for example when a triggered electronic timer system is used instead of a hand-operated stopwatch.


Experimental precision refers to how closely two or more measurement values agree with each other. A set of precise measurements will have very little spread about their mean value. For example, if a given substance was weighed five times, and a mass of 2.70 g was obtained each time, then the experimental data are precise. However, this gives no indication of how close the results are to the true value, so precision is a separate consideration from accuracy: if the true mass in the above example was 3.20 g, then these data are precise but inaccurate.

Quantitatively, measures of precision include the standard deviation around a mean and other measures of spread. A quantitative treatment of precision is beyond the scope of the VCE Psychology Study Design.
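
For teachers who want to see the distinction numerically, the weighing example can be sketched as follows. This goes beyond what students are expected to do, and the choices here are assumptions: the sample standard deviation (`statistics.stdev`) is used as one possible measure of spread, and the variable names are illustrative.

```python
from statistics import mean, stdev

TRUE_MASS_G = 3.20  # known mass from the example in the text

# Five weighings of the same substance, as in the example in the text.
weighings_g = [2.70, 2.70, 2.70, 2.70, 2.70]

average_g = mean(weighings_g)
spread_g = stdev(weighings_g)        # spread about the mean (precision)
offset_g = average_g - TRUE_MASS_G   # distance of the mean from the true value (accuracy)

print(f"mean = {average_g:.2f} g, spread = {spread_g:.2f} g, offset = {offset_g:.2f} g")
```

The spread is zero, so the data are precise, while the mean still sits 0.50 g below the true mass, so the data are inaccurate.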

Statistical analysis of data

In VCE Psychology, students are expected to calculate the mean as a measure of central tendency for a set of data. They are also expected to understand, qualitatively, that standard deviation is used to summarise the spread of data values around the mean. Students should recognise that standard deviation can be useful for comparing the means and the spread of two or more population samples, and in particular that:

  • although data sets may have the same mean they may not have the same degree of variation, or spread, in the data
  • a higher standard deviation represents greater variation, or spread, in the data set.

Calculations of variance, standard deviation and significance between two sets of data are beyond the scope of the VCE Psychology Study Design.
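
Although students are not expected to perform the calculation, teachers may find it useful to see the two dot points above with numbers attached. The sketch below uses two made-up sets of scores (hypothetical data, not drawn from any study) and Python's `statistics` module. Using the sample standard deviation is an assumption made for illustration.

```python
from statistics import mean, stdev

# Hypothetical scores for two groups, invented for illustration only.
group_a = [9, 10, 10, 10, 11]
group_b = [2, 6, 10, 14, 18]

mean_a, mean_b = mean(group_a), mean(group_b)
sd_a, sd_b = stdev(group_a), stdev(group_b)  # sample standard deviations

print(f"Group A: mean = {mean_a}, standard deviation = {sd_a:.2f}")
print(f"Group B: mean = {mean_b}, standard deviation = {sd_b:.2f}")
```

Both groups have a mean of 10, but Group B has the higher standard deviation, reflecting the greater variation in its scores.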


Readings that lie a long way from other results are sometimes called outliers. Outliers must be further analysed and accounted for, rather than being automatically dismissed. Repeating readings may be useful in further examining an outlier.
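
One way of picking out readings for a closer look is sketched below. The text does not prescribe any rule for identifying outliers, so everything here is an assumption made for illustration: the function name `flag_outliers`, the use of the median absolute deviation, the threshold of 3, and the reaction-time readings themselves (hypothetical data). The sketch only flags readings for further analysis; it does not remove them.

```python
from statistics import median


def flag_outliers(readings, threshold=3.0):
    """Return readings that lie a long way from the others.

    Assumed rule (not from the text): a reading is flagged when its
    distance from the median exceeds `threshold` times the median
    absolute deviation (MAD) of the readings.
    """
    centre = median(readings)
    mad = median(abs(r - centre) for r in readings)
    if mad == 0:
        # At least half the readings equal the median; flag any that differ.
        return [r for r in readings if r != centre]
    return [r for r in readings if abs(r - centre) > threshold * mad]


# Hypothetical reaction times in seconds, invented for illustration only.
readings_s = [0.42, 0.45, 0.44, 0.43, 0.91, 0.44]
flagged = flag_outliers(readings_s)
print("Readings to examine further:", flagged)
```

Here only the 0.91 s reading is flagged. In keeping with the advice above, a flagged reading would then be examined and accounted for, for instance by repeating the reading, rather than being discarded automatically.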