Analysis of Clinical Instructors' Level of Agreement in Interpreting Supportive Comments on the Clinical Performance Instrument
Purpose
Most physical therapist (PT) education programs1 use the Clinical Performance Instrument (CPI),2 which has been tested for validity and reliability,3,4 to evaluate students' clinical performance. To complete the CPI, clinical instructors (CIs) must include narrative comments. Directors of Clinical Education (DCEs) look for congruence between those comments and the points marked on the rating scale when determining final grades. A regional group of DCEs noticed inconsistencies between narrative comments and the rating points used on CPIs, particularly related to "entry-level performance" (ELP). The purpose of this study was to determine the level of agreement among raters on whether comments support a rating of ELP on the CPI, before and after focus group discussions.
Methods/Description
Participants were 52 CIs from 4 states, representing 5 practice settings. Each participant reviewed 5 narrative comments, one for each of 5 categories on the CPI (safety, clinical reasoning, communication, examination, and professional behavior). The comments were constructed from language commonly used by CIs on completed CPIs. Participants completed a pretest and post-test indicating whether they agreed, disagreed, or were undecided that a student met ELP, based on the comment for each category. Pretests were administered at the beginning of the workshop; post-tests were completed after participants engaged in focus group discussions about their interpretations of language used in the CPI instructions. The post-test comprised 10 narrative comments: 5 identical to the pretest (post-test1) and 5 different (post-test2) but addressing the same 5 categories.
Results/Outcomes
Using generalizability theory, index of dependability coefficients were calculated to measure the absolute level of agreement among CI raters. Values below 0.40 were considered poor and values of 0.40-0.59 fair.5 The results were: pretest = 0.29, post-test1 = 0.46, post-test2 = 0.33. In addition, the percentage of identical ratings (agreements) across all pairwise comparisons among CIs was determined for each category. Agreement for the "red-flag" categories of clinical reasoning, professional behavior, and communication ranged from 33.3% to 58.1% across the 3 tests. Agreement for safety was highest, ranging from 65.2% to 95.3% on the 3 tests. Using a 95% confidence interval, analysis indicated that the focus group discussions did not produce a statistically significant improvement in raters' level of agreement on post-test1 or post-test2.
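The abstract does not specify how the pairwise agreement percentages were computed; a minimal sketch of percent agreement across all rater pairs for one category, assuming each rater gives a single categorical rating ("agree", "disagree", or "undecided") per item, might look like this:

```python
from itertools import combinations


def pairwise_agreement(ratings):
    """Fraction of rater pairs giving the identical rating for one item.

    ratings: list of categorical ratings, one per rater
    (e.g., "agree", "disagree", "undecided").
    """
    pairs = list(combinations(ratings, 2))
    matches = sum(a == b for a, b in pairs)
    return matches / len(pairs)


# Hypothetical example: 4 raters, 3 agree and 1 disagrees.
# Of the 6 rater pairs, 3 match, so agreement is 0.5.
print(pairwise_agreement(["agree", "agree", "disagree", "agree"]))  # 0.5
```

This illustrates only the pairwise percent-agreement statistic; the index of dependability from generalizability theory is a separate variance-components calculation not shown here.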
Conclusions/Relevance to the conference theme: The Pursuit of Excellence in Physical Therapy Education
The CPI is an outcome measure used to support PT students' readiness to enter professional practice, and CPI data often support PT education programs' accreditation documentation. This study suggests that CIs may not consistently interpret language used to describe students' clinical behavior, especially for communication, professional behavior, and clinical reasoning in relation to ELP. Further study is needed to ensure that CPI users consistently interpret the language used to describe students' clinical performance, especially on "red-flag" items.
References
1. American Physical Therapy Association. About the PT CPI Version 2006 Update. http://www.apta.org/PTCPI/Version2006/. Accessed April 9, 2016.
2. American Physical Therapy Association. Physical Therapist Clinical Performance Instrument for Students. Alexandria, VA: American Physical Therapy Association; 2006.
3. Roach KE, Frost J, Francis NJ, Giles G, Nordrum JT, Delitto A. Validation of the Revised Physical Therapist Clinical Performance Instrument (PT CPI): Version 2006. Phys Ther. 2012;92(3):416-428.
4. Adams CL, Glavin K, Hutchins K, Lee T, Zimmerman C. An Evaluation of the Internal Reliability, Construct Validity, and Predictive Validity of the Physical Therapist Clinical Performance Instrument (PT CPI). J Phys Ther Ed. 2008;22(2):42-50.
5. Cicchetti DV, Sparrow SA. Developing criteria for establishing interrater reliability of specific items: Applications to assessment of adaptive behavior. Am J Ment Defic. 1981;86:127-137.