Rapid Responses to:
Rapid Responses published:
Alan M Jette, Director Health & Disability Research Institute, Boston University School of Public Health, Stephen M Haley, Wei Tao, Pengsheng Ni, Richard Moed, Doug Meyers, and Matthew Zurek.
ajette{at}bu.edu Alan M Jette, et al.
We appreciate Dr Hart’s thoughtful letter in response to our article. We agree with Dr Hart’s major points and would like to comment on a few of the issues he raises.

We agree with Dr Hart that differential item functioning across patients with various impairments is clinically relevant and an issue deserving of additional study. Although this issue was not the focus of our paper, we did examine the potential presence of differential item functioning in the AM-PAC items most frequently administered in the AM-PAC-CAT across outpatients in our sample with different primary impairments. Only one item out of 36 displayed significant differential item functioning. We suspect that more focused constructs such as basic mobility and daily activity function may have less potential for significant differential item functioning than broader health-related concepts. We do agree with Dr Hart that differential item functioning is an issue that should be examined both during item bank development and in CAT applications in various patient populations.(1)

With respect to Dr Hart’s suggestion that the daily activity and basic mobility domains of the AM-PAC might not be distinct in an outpatient population, we wish to clarify that outpatients were represented in our calibration samples. The AM-PAC’s combined calibration samples of 1041 patients in post-acute care included patients from 4 different care settings: Outpatient Therapy (N=237); Home Health Care (N=246); Skilled Nursing or Transitional Care (N=138); and Inpatient Rehabilitation (N=420).(2) The AM-PAC was intentionally developed and tested in samples drawn from several post-acute care settings to provide users with one instrument able to track functional recovery across settings throughout an entire episode of post-acute care. In separate analyses done on the outpatient sample used in this pilot study, we confirmed a distinction between the daily activity and basic mobility domains of the AM-PAC. We saw only a moderate positive correlation between the basic mobility and daily activity scales (0.40 on admission and 0.55 at discharge), suggesting the psychometric and clinical merits of keeping these two domains of activity function separate and distinct.

Given the dynamic nature of CAT outcome instruments, a feature that allows for periodic refinements and updating, Dr Hart raises an interesting concern about the challenge of keeping users (as well as journal editors and reviewers) current with pertinent changes in the various CAT instruments being used with increasing frequency in health care. We agree that this is an important issue that must be taken seriously. Our current thinking is that CAT instrument developers might look to the broader software development field for guidance on how this might be accomplished efficiently, by adopting a policy of labeling different versions of CAT software. For instance, although ‘version 1’ (AM-PAC-CATv1) was examined in this pilot study, ‘version 2’ (AM-PAC-CATv2) is soon to be released and will be the subject of future study. Accurate labeling of software and instruments may help readers and users keep track of the evolution of CAT software.

Again, we thank Dr Hart for his letter and look forward to further discussions of these and related issues relevant to the introduction and use of CAT instruments in health care.

References
1. Haley SM, Coster WJ, Andres PL, Ludlow LH, et al. Activity outcome measurement for post-acute care. Med Care. 2004;42(1 Suppl):I-49-I-69.
2. Haley S, Ni P, Hambleton R, Slavin M, Jette A. Computer adaptive testing improved accuracy and precision of scores over random item selection in a physical functioning item bank. J Clin Epidemiol. 2006;59:1174-1182.
Dennis L. Hart, Physical Therapist, Focus On Therapeutic Outcomes, Inc.
dsailhart{at}verizon.net Dennis L. Hart
The authors are to be complimented for a strong paper describing a methodologically complex process in an understandable manner. Although others have used computerized adaptive testing (CAT) applications in outpatient rehabilitation for several years,(2) Jette et al are the first to publish results of a practical application of a CAT in a peer-reviewed journal.

The strength of the work by Jette et al(1) lies in the process used to develop the product. Item response theory (IRT) methods and CAT applications have the potential to be the foundation of outcomes measurement development in rehabilitation, just as IRT and CAT were in educational measurement.(3) We should not forget that IRT and CAT are not new; they are just new to rehabilitation and medicine. Jette et al discuss in the current paper and earlier work how these methods can be used to develop a new outcomes scale, assess the strengths and weaknesses of the scale, and improve the scale when deficits are identified via practical application in busy clinics. These methods are sorely needed for many of the common paper-and-pencil instruments that are so popular in rehabilitation.

The study is not without limitations, many of which are detailed nicely by Jette et al. One psychometric issue not discussed relates to differential item functioning (DIF).(4) DIF occurs when patients from different groups (for example, patients with hip versus knee impairments) have different probabilities of endorsing item response categories.
In clinical terms, this could mean that patients with knee impairments perceive the act of squatting as more difficult than do patients with hip impairments, which is clinically logical and important.(5) DIF is common in patients treated in outpatient rehabilitation.(5,6) When DIF is present and of practical importance, the lack of control for DIF can erode the validity of the outcomes measure.(3) One of the strengths of IRT techniques is the ability to detect and possibly control for DIF. However, when DIF is identified but of no practical importance, DIF can be ignored when calibrating items.(7,8) Discussion of DIF, at least by body part treated, would have strengthened the Jette et al paper, particularly because differences in item calibrations by body part treated(6) have been published for the physical functioning items of the SF-36,(9) which appear to be included in the AM-PAC-CAT item bank.(10) From previous evidence,(5,6) it would not be unexpected to find DIF between patients with hip, knee, or foot/ankle impairments, between patients with shoulder compared to elbow/wrist/hand impairments, and between patients with lumbar compared to cervical impairments, for body mobility and activity item banks.

The authors rely on earlier factor analytic work(10) that identified the mobility and daily activity constructs. Although the conceptual foundation identifying these two factors appears sound, there is evidence that these factors might not be distinctly separate. In the original sample that did not contain patients treated in outpatient facilities,(10) factor loadings supported grouping items into the mobility and activity factors. However, in the current study of outpatients, the results obtained using CATs developed from a sample that may not have included outpatients provide some evidence supporting the need for further unidimensionality testing.
Specifically, in the current study, mobility measures were most responsive for patients with lower-extremity impairments compared with patients with spine or upper-extremity impairments. The lowest effect size for both scales and all impairments was for patients with upper-extremity impairments using the mobility scale. However, the greatest effect size for the activity scale was recorded for the patients with upper-extremity impairments. These results are clinically logical, given the items and sample. However, do the activity and mobility items really describe different constructs for patients treated in outpatient clinics? Could the mobility and activity items be combined into one item bank that is ‘essentially unidimensional,’ where one dimension is dominant, possibly in the presence of one or more minor dimensions,(11) without erosion of the scale psychometrics? Do patients’ impairments demand different scales in order to assess the most appropriate construct of interest to the patient, that is, mobility for lower versus activity for upper-extremity impairments? Do more difficult items (assessed using item calibrations) describe a separate construct compared to easier items, regardless of construct (mobility versus activity)? If payers were to reimburse outpatient therapy services for value (unit of functional improvement per dollar cost),(12) which construct should be used, that is, should we assess mobility for patients with lower-extremity impairments and activity for upper-extremity impairments? Which construct is more important for patients with cervical or lumbar impairments? The Jette et al(1) results combined with the results of Hart et al(5,7,8) suggest the need for further assessment of item unidimensionality in patients receiving outpatient therapy. In addition, given that CATs are continuously evolving, how do developers, journal editors, and users keep current with pertinent CAT changes? 

The results describing the responsiveness and construct validity of the AM-PAC-CAT measures support previous work using CATs applied in outpatient rehabilitation. For example, the effect size for prospectively collected data using body part-specific CATs was 0.92 on average in an earlier study,(2) which is similar to the highest effect size for patients with lower-extremity impairment in the Jette et al study. However, Stratford and Riddle(13) recommend using an external standard to assess sensitivity to change in a sample of patients who are likely to change at different rates. Such analyses are recommended for future AM-PAC-CAT investigations. Furthermore, construct validity results using CATs in the Hart and Connolly(2) report are similar to the results reported by Jette et al. Taken together, these results support the conclusion that CAT administrations produce responsive and valid estimates of ICF activity measures in patients receiving outpatient therapy.

Jette et al describe in detail the content balancing performed by their CAT. However, is content balancing, which was developed for educational tests, as important in outpatient rehabilitation as it is in educational testing? The answer may be ‘probably.’ Given that the primary advantage of CAT applications is reduced respondent burden without erosion of measure precision and validity, the answer may be that providers should take advantage of the efficiency of CATs and collect more data. In this way, providers in busy clinics can assess multiple constructs of interest efficiently, such as mobility, activity, and fear-avoidance.(14) CATs should save clinicians time when assessing multiple constructs.

Finally, the collaboration of good researchers and a proprietary database management company (eg, CRE Care, LLC) facilitated the implementation of the current study.
The integration of good science, electronic application of psychometrically sound outcomes instruments in busy clinics, and a journal’s need to publish scientifically sound material produced a result that may affect clinical practice positively. As payers progress toward new methods of payment that may include value-based purchasing,(2,12,15,16) proprietary database management companies may become more important, as they manage large databases of scientifically sound outcomes measures without undue political pressures. Jette et al and the editors of PTJ have taken the “high road” by publishing this paper, and the readers will be the beneficiaries.

Thank you for the opportunity to contribute to this important discussion.

Dennis L Hart, PT, PhD
Director of Consulting and Research
Focus On Therapeutic Outcomes, Inc
White Stone, VA, USA

References
1. Jette AM, Haley SM, Tao W, et al. Prospective evaluation of the AM-PAC-CAT in outpatient rehabilitation settings. Phys Ther. 2007;87:385-398.
2. Hart DL, Connolly JB. Pay-for-Performance for Physical Therapy and Occupational Therapy: Medicare Part B Services. Grant #18-P-93066/9-01: Health & Human Services/Centers for Medicare & Medicaid Services; 2006.
3. Wainer H, ed. Computerized Adaptive Testing: A Primer. 2nd ed. Mahwah, NJ: Lawrence Erlbaum Associates; 2000.
4. Crane PK, Hart DL, Gibbons LE, Cook KF. A 37-item shoulder functional status item pool had negligible differential item functioning. J Clin Epidemiol. 2006;59:478-484.
5. Hart DL, Mioduski JE, Stratford PW. Simulated computerized adaptive tests for measuring functional status were efficient with good discriminant validity in patients with hip, knee, or foot/ankle impairments. J Clin Epidemiol. 2005;58:629-638.
6. Hart DL. Assessment of unidimensionality of physical functioning in patients receiving therapy in acute, orthopedic outpatient centers. J Outcome Meas. 2000;4:413-430.
7. Hart DL, Cook KF, Mioduski JE, Teal CR, Crane PK. Simulated computerized adaptive test for patients with shoulder impairments was efficient and produced valid measures of function. J Clin Epidemiol. 2006;59:290-298.
8. Hart DL, Mioduski JE, Werneke MW, Stratford PW. Simulated computerized adaptive test for patients with lumbar spine impairments was efficient and produced valid measures of function. J Clin Epidemiol. 2006;59:947-956.
9. Ware JE Jr, Sherbourne CD. The MOS 36-item short-form health survey (SF-36). I. Conceptual framework and item selection. Med Care. 1992;30:473-483.
10. Haley SM, Coster WJ, Andres PL, et al. Activity outcome measurement for postacute care. Med Care. 2004;42(1 Suppl):I49-61.
11. Stout WF. A new item response theory modeling approach with applications to unidimensionality assessment and ability estimation. Psychometrika. 1990;55(2):293-325.
12. Porter ME, Teisberg EO. Redefining Health Care: Creating Value-Based Competition on Results. Boston, MA: Harvard Business School Press; 2006.
13. Stratford PW, Riddle DL. Assessing sensitivity to change: choosing the appropriate change coefficient. Health Qual Life Outcomes. 2005;3:23.
14. Waddell G, Newton M, Henderson I, Somerville D, Main CJ. A Fear-Avoidance Beliefs Questionnaire (FABQ) and the role of fear-avoidance beliefs in chronic low back pain and disability. Pain. 1993;52:157-168.
15. Institute of Medicine. Crossing the Quality Chasm: A New Health System for the 21st Century. Washington, DC: National Academy Press; 2001.
16. Institute of Medicine. Rewarding Provider Performance: Aligning Incentives in Medicare. Washington, DC: National Academies Press; 2006.
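
Illustrative note on DIF: Dr Hart's letter describes DIF as patients from different groups having different probabilities of endorsing an item response. Neither letter states which statistical method was used to screen for DIF, so the sketch below substitutes one common screening approach, the Mantel-Haenszel common odds ratio, purely as an illustration. It assumes a single dichotomous item (endorsed or not) and patients already matched into strata of overall function. The function names, group labels, strata, and counts are all hypothetical and are not taken from either study.

```python
from collections import defaultdict


def mantel_haenszel_odds_ratio(records, reference_group):
    """Common odds ratio of endorsing an item, reference vs. other group.

    records: iterable of (group, stratum, endorsed) tuples, where stratum is a
    matching level (for example, a band of overall functional score) and
    endorsed is True/False. Values near 1.0 suggest little DIF.
    """
    # One 2x2 table per stratum: [ref yes, ref no, focal yes, focal no]
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for group, stratum, endorsed in records:
        row = 0 if group == reference_group else 2
        col = 0 if endorsed else 1
        tables[stratum][row + col] += 1

    numerator = denominator = 0.0
    for ref_yes, ref_no, focal_yes, focal_no in tables.values():
        total = ref_yes + ref_no + focal_yes + focal_no
        numerator += ref_yes * focal_no / total
        denominator += ref_no * focal_yes / total
    if denominator == 0:
        raise ValueError("odds ratio undefined for these counts")
    return numerator / denominator


def expand(group, stratum, n_yes, n_no):
    """Turn summary counts into individual (group, stratum, endorsed) records."""
    return [(group, stratum, True)] * n_yes + [(group, stratum, False)] * n_no


# Hypothetical counts, invented for illustration only: patients matched on
# overall function ("low"/"high") reporting whether they can squat without
# difficulty, split by hip versus knee impairment.
hypothetical_records = (
    expand("hip", "low", 10, 10) + expand("knee", "low", 5, 15)
    + expand("hip", "high", 16, 4) + expand("knee", "high", 10, 10)
)
odds_ratio = mantel_haenszel_odds_ratio(hypothetical_records, reference_group="hip")
print(f"Mantel-Haenszel odds ratio: {odds_ratio:.2f}")
```

With these invented counts the odds ratio is well above 1, meaning that at the same overall level of function the hypothetical hip group endorses the squatting item more readily than the knee group. Whether a difference of that size is of practical importance is a separate judgment, as the letter notes.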