Abstract

Objective: To evaluate the feasibility, reliability and acceptability of the mini clinical evaluation exercise (mini-CEX) for performance assessment of international medical graduates (IMGs).

Design, setting and participants: Observational study of 209 patient encounters involving 28 IMGs and 35 examiners at three metropolitan teaching hospitals in New South Wales, Victoria and Queensland, September–December 2006.

Main outcome measures: Reliability of the mini-CEX, estimated by generalisability (G) analysis; acceptability, evaluated by a written survey of the examiners and IMGs.

Results: The G coefficient for eight encounters was 0.88, and a decision study indicated that 10 encounters would achieve a reliability of 0.90. Almost half of the responding IMGs (7/16) and most responding examiners (14/18) were satisfied with the mini-CEX as a learning tool, and most valued the immediate feedback it provides, a key strength of the format.

Conclusion: The mini-CEX is a reliable tool for performance assessment of IMGs, and is acceptable to both learners and supervisors.

Assessing the performance of junior doctors in the workplace is important but challenging. The optimal approach is direct observation of doctors’ interactions with patients, comprising multiple assessments by multiple examiners across a variety of patient problems. Clinical supervisors are best placed to observe and certify trainees, but often do not observe them directly.1 In most instances, performance assessment is not done well, as it requires multiple sampling over time.2 In-training assessments completed at the end of a term introduce a “halo effect”.3
Most of these problems can be overcome by the mini clinical evaluation exercise (mini-CEX), developed by the American Board of Internal Medicine.4 The mini-CEX involves direct observation of a trainee in a focused clinical encounter, followed by immediate feedback. The assessment is recorded on a rating form that has been shown to have high internal consistency and reliability among internal medicine trainees, yielding scores comparable with those of a high-stakes clinical examination.5,6 Because it is based on real patient encounters, the mini-CEX also has higher fidelity than other assessment formats.7
International medical graduates (IMGs) comprise about 25% of the medical workforce in developed countries.8 Certifying them for registration is a major task of medical boards and registration authorities in Australia and other countries.9 The Australian Medical Council (AMC) has conducted clinical examinations to assess IMGs since 1978.10 Successful candidates undertake 12 months of supervised practice before obtaining full registration. However, the workplace competence and performance of IMGs who have passed the current AMC clinical examination have been criticised.11
The study was conducted in three large metropolitan teaching hospitals in Australia, one each in New South Wales, Queensland and Victoria, as part of a larger international collaborative study with the Medical Council of Canada. The ethics committee in each centre approved the study.
All IMGs at the participating hospitals who had passed the AMC clinical examination in the previous 12 months, together with 50 potential examiners, were invited to volunteer for the study. All IMGs gave written informed consent.
The following skills were rated: medical interviewing, physical examination, professionalism/humanistic qualities, counselling, clinical judgement, organisation/efficiency and overall clinical competence. Ratings were on a nine-point scale, where 1–3 signified unsatisfactory performance; 4–6, satisfactory performance; and 7–9, superior performance at a mid-postgraduate year 1 level. Examiners were also asked to grade the encounters as “met expectations”, “borderline” or “did not meet expectations”.
The examination comprised four assessments in emergency medicine (two physical examination, one history taking and one counselling), three in medicine (one history taking, one management and one counselling), and three in surgery (two physical examination and one management). These three specialties were selected because terms in each are compulsory for internship in most Australian states and territories.
Reliability was assessed using generalisability theory analysis. Generalisability analyses (G studies) allow estimation of the variance components associated with the different examination conditions (eg, types of tasks, number of tasks, and number of markers). The ratio of the variance component for the object of measurement (in this case, differences between IMGs) to the total variance (the sum of this component and the error variance) yields an estimate of reliability: the generalisability (G) coefficient, with values ranging from 0 to 1. The effect of changes in the examination conditions (eg, an increase in the number of tasks to be performed by the IMG, or a change in the number of markers for each task) can be modelled to inform decisions on optimising the measurement (decision [D] studies).
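For illustration, in a simplified design in which encounters are the only facet of error (the full analysis may partition error across further facets, such as examiners and tasks), the G coefficient for a mean score over $n_e$ encounters takes the form:

\[
G = \frac{\sigma^2_{\mathrm{IMG}}}{\sigma^2_{\mathrm{IMG}} + \sigma^2_{\mathrm{error}}/n_e}
\]

where $\sigma^2_{\mathrm{IMG}}$ is the variance between IMGs (the signal of interest) and $\sigma^2_{\mathrm{error}}$ is the residual error variance for a single encounter. Averaging over more encounters shrinks the error term, so G rises towards 1 as $n_e$ increases.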
All 28 IMGs who had passed the AMC examinations within the previous 12 months and 35 examiners volunteered to participate in the study. Twenty-two examiners were trained in assessing the mini-CEX; the remaining 13 examiners participated without training. The examiners included specialists and specialist trainees in internal medicine, surgery and emergency medicine.
The 28 IMGs were assessed by the 35 examiners on 209 clinical encounters: 122 assessments were done in wards, 70 in emergency departments, eight in intensive care units, six in outpatient clinics and two in offices; the location was not recorded for one encounter. The mean number of mini-CEXs completed was 7.2 per IMG (range, 2–13) and 6.0 per examiner (range, 1–20). Assessments were scored as “met expectations” for 150 encounters, “borderline” for 40, and “did not meet expectations” for 19 (9%, involving 12 IMGs). The average mini-CEX duration was 20 minutes (range, 6–45 minutes), and the average time for feedback was 12 minutes (range, 3–20 minutes). Examiners rated the complexity of encounters as low for 19, moderate for 150 and high for 31; data were missing for nine encounters.
Because of differences in the number of encounters per participant, we included a maximum of eight encounters in the generalisability study. The results of the variance components estimation are shown in Box 1. The G coefficient for eight encounters was 0.88. As a measure of precision, the standard error of measurement for the design with eight encounters was estimated at 0.35 (that is, on 19 of 20 occasions, the “true” score of an IMG will fall within ± 0.69 of the observed score). The results of the D study indicated that 10 encounters were necessary to achieve a reliability of 0.90 (Box 2).
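These figures are mutually consistent under the simplified one-facet model sketched above (a back-of-envelope check only, not a substitute for the full variance-component estimates in Box 1). Writing $r = \sigma^2_{\mathrm{IMG}}/\sigma^2_{\mathrm{error}}$:

\[
G(n_e) = \frac{n_e r}{n_e r + 1}, \qquad
G(8) = 0.88 \;\Rightarrow\; r = \frac{0.88}{8 \times 0.12} \approx 0.92, \qquad
G(10) = \frac{10 \times 0.92}{10 \times 0.92 + 1} \approx 0.90
\]

and the quoted 95% band around an observed score is $1.96 \times \mathrm{SEM} = 1.96 \times 0.35 \approx 0.69$.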
In the evaluation survey, 16/28 IMGs (57%) and 18/35 examiners (51%) responded. Most respondents (10 IMGs; 15 examiners) never or only occasionally experienced difficulty arranging the mini-CEX encounters. When problems did occur, they were often due to rostering issues and patients being away from the wards.
Under the conditions and settings used, the mini-CEX reliably assessed the clinical performance of IMGs with eight to 10 encounters, consistent with the results of other studies.7 As the mini-CEX is conducted in the workplace with real patients, it has high fidelity, and it proved acceptable to both IMGs and examiners. A fail rate of 9% (19/209 encounters), spread across 12 IMGs, is concerning, given that these IMGs had all passed the AMC clinical examination.