Psychometrics in Education: Overview of Assessment Validation

Psychometrics is the science concerned with developing the instruments and theories behind educational and psychological assessments, which aim to measure a person’s knowledge, attitudes, practices and personality. Psychometric practice rests on three essential principles:

-Validity

-Reliability

-Standardization

Validity

When formulating an assessment tool, one of the most important considerations is its ‘validity’. The term refers to the ability of a test instrument to measure what it is supposed to measure. To establish this, a test instrument has to be scrutinized on the following grounds:

-Is the test acceptable to all its stakeholders?

-Does the test assess the necessary content areas intended to be assessed?

-How does the test instrument perform against different groups and similar test instruments?

Accordingly, a validation process can be divided into the categories described below.

Content validity

Content validity is perhaps the most important part of determining validity. Subject matter experts should take the lead in determining whether the test instrument actually measures candidates on the intended subject areas or the defined course objectives.

The subject matter experts would need to discuss the test against its content blueprint and decide whether each item assesses the intended objectives of a particular subject area. Items that do not may need to be replaced or modified.
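
The bookkeeping side of such a blueprint review can be sketched in a few lines of Python. This is only an illustration: the judgement itself still comes from the subject matter experts, and the function name, the objective labels and the item identifiers are all hypothetical.

```python
def review_against_blueprint(blueprint, expert_mapping):
    """Summarise an expert review of test items against a content blueprint.

    blueprint      -- set of objectives the test is meant to cover
    expert_mapping -- dict of item id -> objective the experts judged the
                      item to assess, or None if it matches no objective
    Returns (items to replace or modify, objectives with no item).
    """
    flagged_items = sorted(
        item for item, objective in expert_mapping.items()
        if objective not in blueprint
    )
    covered = {obj for obj in expert_mapping.values() if obj in blueprint}
    uncovered_objectives = sorted(blueprint - covered)
    return flagged_items, uncovered_objectives


# Made-up example data, for illustration only.
blueprint = {"anatomy", "physiology", "pharmacology"}
expert_mapping = {
    "Q1": "anatomy",
    "Q2": "physiology",
    "Q3": None,        # experts found no matching objective
    "Q4": "anatomy",
}
flagged_items, uncovered_objectives = review_against_blueprint(
    blueprint, expert_mapping
)
```

Here the flagged items are the candidates for replacement or modification, and the uncovered objectives show where the test still lacks items.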

Concurrent validity

In concurrent validity, the proposed test instrument is evaluated on how it performs across two groups of subjects who are already known to be either ‘good’ or ‘bad’ performers at other exams. If the test results correlate well with the discrimination already made by other test instruments over the same groups, the instrument is considered to have high concurrent validity.
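
One common way to put a number on this agreement is the point-biserial correlation between the new test scores and the existing good/bad classification. The sketch below assumes that approach; the scores and group labels are invented for illustration and are not taken from any real exam.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(scores, known_good):
    """Correlate test scores with a prior good/bad classification.

    scores     -- scores on the proposed test instrument
    known_good -- True where other exams already rated the candidate 'good'
    """
    if len(scores) != len(known_good):
        raise ValueError("scores and known_good must have the same length")
    good = [s for s, g in zip(scores, known_good) if g]
    bad = [s for s, g in zip(scores, known_good) if not g]
    if not good or not bad:
        raise ValueError("both groups need at least one candidate")
    spread = pstdev(scores)  # population standard deviation of all scores
    if spread == 0:
        raise ValueError("scores must not all be identical")
    n = len(scores)
    # r_pb = (mean_good - mean_bad) / sd * sqrt(n_good * n_bad / n^2)
    return (mean(good) - mean(bad)) / spread * sqrt(len(good) * len(bad)) / n


# Made-up data: three candidates rated 'good' elsewhere, three rated 'bad'.
new_test_scores = [78, 85, 90, 52, 47, 60]
rated_good = [True, True, True, False, False, False]
r_pb = point_biserial(new_test_scores, rated_good)
print(f"concurrent validity coefficient: {r_pb:.2f}")
```

A coefficient close to 1 means the new instrument separates the two groups in the same way the earlier exams did.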

Predictive validity

In contrast to concurrent validity, predictive validity refers to the ability of the test instrument to predict the future performance of a candidate. The analysis therefore spans two points in time: students’ performance on the test instrument should correlate with their performance on a test instrument applied at a later date, or with their work performance in practice.
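
As a minimal sketch, this can be expressed as a Pearson correlation between scores collected at the two points in time. The choice of Pearson's coefficient is an assumption here, and the two score lists are hypothetical; each position represents the same student.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("each list must contain some variation")
    return sxy / sqrt(sxx * syy)


# Made-up data: the same six students, measured now and at a later date.
scores_now = [55, 62, 70, 48, 81, 66]
scores_later = [58, 65, 74, 50, 85, 63]
predictive_r = pearson_r(scores_now, scores_later)
print(f"predictive validity coefficient: {predictive_r:.2f}")
```

The higher the coefficient, the better the earlier scores anticipate the later ones. A supervisor's rating of work performance could take the place of the later test scores.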

Construct validity

In construct validity, the focus is on finding out whether the test instrument assesses a latent trait pertaining to a particular ‘construct’. In theory, a construct is a characteristic that cannot be measured directly, is not overtly apparent and can only be inferred through an indirect measure. Construct validity is therefore established using two methods:

-Compare between two candidate groups (e.g. 1st year residents and final year residents)

-Compare with other exams (e.g. OSCE vs Short cases, OSCE vs MCQ)

If the correlation between the two is high, ‘convergent validity’ is said to exist between them; if the correlation is low, it is described as ‘divergent validity’.
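
The comparison with other exams can be sketched as follows. The cut-off values of 0.7 and 0.3 are arbitrary assumptions chosen for the illustration, not established thresholds, and all scores are invented.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("each list must contain some variation")
    return sxy / sqrt(sxx * syy)


def classify(r, high=0.7, low=0.3):
    """Label a correlation; `high` and `low` are assumed cut-offs."""
    if r >= high:
        return "convergent"
    if abs(r) <= low:
        return "divergent"
    return "inconclusive"


# Made-up scores for the same six candidates on three exams.
osce = [60, 72, 65, 80, 55, 90]
other_exams = {
    "short_cases": [58, 70, 68, 82, 50, 88],
    "mcq": [70, 64, 66, 71, 68, 67],
}
labels = {
    name: classify(pearson_r(osce, scores))
    for name, scores in other_exams.items()
}
print(labels)
```

With real data the interpretation depends on whether the two exams are meant to assess the same construct, so the labels should be read alongside the test design rather than on their own.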

Face validity

This refers to the general acceptability of a test instrument to its stakeholders. The main focus is on those who are not subject matter experts, such as candidates, industry representatives and institutional policymakers. The decision on face validity is not based on any statistical analysis but on how acceptable these stakeholder groups find the test.