Research Reports
SR Barreca, MScPT(Dip), is Assistant Clinical Professor, McMaster University, Hamilton, Ontario, Canada, and Research Clinician, Orthopedic and Rehabilitation Services, Hamilton Health Sciences, Hamilton, Ontario, Canada
PW Stratford, MScPT, is Professor, School of Rehabilitation Science, and Associate Member, Department of Clinical Epidemiology and Biostatistics, McMaster University
LM Masters, MScOT, is Research Therapist, Orthopedic and Rehabilitation Services, Hamilton Health Sciences
CL Lambert, BScPT, is Physical Therapist, Orthopedic and Rehabilitation Services, Hamilton Health Sciences
J Griffiths, BScPT, is Physical Therapist, Orthopedic and Rehabilitation Services, Hamilton Health Sciences
Address all correspondence to Ms Barreca (barreca{at}hhsc.ca) at Hamilton Health Sciences, Box 2000, Station A, Holbrook 1, Chedoke Site, Hamilton, Ontario, Canada L8M 3Z5
Submitted May 24, 2005;
Accepted September 1, 2005
| Abstract |
|---|
Subjects. One hundred five people with upper-limb dysfunction following a stroke were stratified into 2 impairment groups (mild to moderate and severe), which were expected to change by different amounts.
Methods. The CAHAI-13 and ARAT were administered twice (time between assessments varied from 2 to 6 weeks). Receiver operating characteristic curves, Pearson product moment coefficient of correlation, and regression analyses were used.
Results. Receiver operating characteristic curve areas (CAHAI-13=0.86, CAHAI-9=0.82, ARAT=0.72) were significantly greater for the CAHAI versions. Scores on both CAHAI versions had identical levels of cross-sectional validity.
Discussion and Conclusion. Both CAHAI versions demonstrated more sensitivity to change than the ARAT. It remains unclear whether the CAHAI-9 provides precise estimates of CAHAI-13 scores at the individual level. [Barreca SR, Stratford PW, Masters LM, et al. Comparing 2 versions of the Chedoke Arm and Hand Activity Inventory with the Action Research Arm Test. Phys Ther. 2006;86:245–253.]
Key Words: Arm, Cerebrovascular accident, Hand, Outcome assessment, Recovery of function
| Introduction |
|---|
Over the years, there has been dissatisfaction with the ability to assess recovery in the paretic upper limb of people who have had a stroke. This dissatisfaction has led to the development and application of numerous outcome measures. For example, the Consensus Panel for the Management of the Post Stroke Arm and Hand identified 88 upper-extremity measures.2 Of these measures, only the Action Research Arm Test (ARAT)3 and the Fugl-Meyer Test–arm subscale (FMA)4 were cited more than 10 times. Existing measures have been criticized for focusing on impairments or consisting of contrived tasks that do not reflect real-life activities.5 The Chedoke Arm and Hand Activity Inventory (CAHAI) was developed in an attempt to overcome these shortcomings.6
Comprehensive descriptions of the conceptual framework, development, and psychometric properties of the CAHAI are detailed elsewhere.6–8 In brief, this measure was conceived to be consistent with the World Health Organization's "activity" domain.9 Candidate items originated from the literature as well as from people who have had a stroke and their family members. Content validity and sound psychometric properties played prominent roles in determining the final CAHAI item composition. Preference was given to items that reflected real-life bilateral functional activities and maximized the range of normative upper-limb movements and grasps. The final measure consists of 13 items (hereafter referred to as the CAHAI-13) and takes approximately 25 minutes to administer and score. Each item is scored on a 7-point scale similar to that of the Functional Independence Measure.10
Previous investigations7,8 have supported the psychometric properties of the CAHAI-13, and the following measurement properties have been reported: test-retest reliability, Shrout and Fleiss type 2,1 intraclass correlation coefficient=.98; standard error of measurement=2.8 CAHAI-13 points; minimal detectable change score (MDC90)=6.3 CAHAI points for 90% of the patients tested at 2 points in time; CAHAI-13 correlations with the ARAT and Chedoke-McMaster Stroke Assessment (CMSA)11 arm and hand components of .93 and .81, respectively; and with respect to longitudinal validity (also referred to as "sensitivity to change"),12 an area under a receiver operating characteristic (ROC) curve of 0.95. Specifically, the ROC curve analysis examines a measure's ability to distinguish different amounts of change between groups with acute stroke and chronic stroke. The area under the ROC curve, which can vary from 0 to 1 with greater areas representing better measures, represents the measure's ability to distinguish different amounts of change. A curve area of 0.50 would be expected on the basis of chance alone. High levels of internal consistency (coefficient alpha=.98) and factorial validity (the principal component analysis factor loadings for all items exceeded 0.76 and accounted for 81.9% of the variance) also have been reported.6
With so many upper-limb measures available, it is natural to question the development of another measure. Although the merits of the CAHAI-13 could be argued on its theoretical framework,6 the true test of a measure's ability demands a head-to-head comparison with one of the more highly regarded measures currently available. The Consensus Panel2 found the ARAT and FMA to be the most frequently cited measures, and van der Lee et al13 found the ARAT was more responsive than the FMA in people who had a median time of 3.6 years since their stroke (limits of agreement: ARAT=5.7–6.2, FMA=5.0–6.6; responsiveness ratio: ARAT=2.03, FMA=0.41). In addition, the ARAT items more closely approximated the World Health Organization's "activity" domain8 compared with those of the FMA. For these reasons, we chose to compare the CAHAI-13 with the ARAT. Given a previous parameter estimation study that demonstrated a high correlation between CAHAI-13 and ARAT scores, we were interested in determining whether the longitudinal validity of CAHAI-13 scores was greater than that of ARAT scores. That is, was the CAHAI-13 more adept than the ARAT in detecting true change in upper-limb function over time?
A noted barrier to the successful implementation of standardized outcome measures is the time it takes to administer and score the measures.14 For example, the ARAT and the CAHAI-13 both take approximately 20 to 25 minutes to administer and score. With the thought of increased efficiency in mind, a previous study8 investigated the feasibility of reducing the number of CAHAI-13 items and ultimately its administration time. This exploratory study investigated 7-, 8-, and 9-item versions abstracted from the CAHAI-13. The results suggested that shorter versions of the measure seemed viable, and this finding prompted the present comparison of the CAHAI-13 with a shorter 9-item version (CAHAI-9).8
There were 2 objectives to this study: (1) to determine whether the longitudinal validity of scores on 2 versions of the CAHAI was significantly greater than that of scores on the ARAT and (2) to determine whether the cross-sectional and longitudinal validity of the CAHAI-13 scores was significantly greater than that of the CAHAI-9 scores.
| Method |
|---|
Study Design and Procedures
The CAHAI-13 was administered to the participants by their treating therapist at their initial visit and following completion of their rehabilitation program. A total of 11 treating occupational therapists and physical therapists, with an average of 8.5 years (range=0–10 years) of experience, participated. These therapists were required to have completed and passed the CMSA training course and to have been trained in the administration and scoring of the CAHAI to an 85% criterion in a one-day training workshop. The CAHAI-13 and ARAT also were administered by 1 of 2 research therapists within 36 hours of being administered by the treating therapist. Random assignment determined whether the treating therapist or the research therapist would complete the assessment first and which measure, the CAHAI or the ARAT, was administered first by the research therapist. The treating and research therapists were blinded to each other's assessment results. Prior to the study, written guidelines for the ARAT were developed from the literature.3,13,15 Information from these assessments was used to compare the longitudinal validity of scores on the CAHAI-13, CAHAI-9, and ARAT and to examine the cross-sectional convergent validity of scores on the 2 versions of the CAHAI. The CAHAI-9 scores were abstracted from the CAHAI-13 scores. In addition, the CMSA was administered at the initial assessment, and the information was used to categorize participants for the longitudinal validity component of this investigation.
The longitudinal validity component applied a strong group construct validation design that made use of the clinical history of the condition being studied.16,17 A previous investigation7 showed that people with stroke and with mild to moderate impairment (defined as a combined initial arm and hand CMSA score between 7 and 11) who were 8 weeks or less poststroke changed more than individuals with severe impairment (defined as a combined initial arm and hand CMSA score of 5 or less) who were 3 months to 1 year poststroke. We applied this classification to define 2 groups who were expected to change by different amounts and will subsequently refer to the groups as "mild/moderate" and "severe." Table 1 provides a summary of the sample's characteristics. Given that the purpose was to compare the longitudinal validity of scores on the 2 CAHAI versions and the ARAT and not to resolve what accounted for the change, no attempt was made to standardize the patients' rehabilitation programs.
Measures
CAHAI-13
This measure evaluates 13 bilateral real-life activities (Appendix) using a 7-point scale (1=client performs less than 25% of the task; 2=upper limb stabilizes during task, with client performing 25% to 49% of the effort to complete the task; 3=upper limb partially manipulates and stabilizes during task, with client performing 50% to 74% of the task; 4=upper limb requires light touch to manipulate or stabilize during task, with client performing 75% or more of the effort to complete the task; 5=requires supervision, coaxing, or cueing; 6=requires use of assistive devices or requires more than reasonable time, or there are safety concerns; and 7=complete independence). Total scores are obtained by summing the item scores. Accordingly, total scores can range from 13 to 91, with higher scores reflecting greater ability.
CAHAI-9
The following items were eliminated from the CAHAI-13: "zip up the zipper," "clean a pair of eyeglasses," "carry bag up the stairs," and "place container on table."8
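The scoring rules above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' scoring procedure: the function names are hypothetical, and because the Appendix listing all 13 items is not reproduced here, the 9 retained items are represented by placeholder names. Only the 4 eliminated item names come from the text.

```python
# Items dropped from the CAHAI-13 to form the CAHAI-9, as named in the text.
DROPPED_FOR_CAHAI9 = {
    "zip up the zipper",
    "clean a pair of eyeglasses",
    "carry bag up the stairs",
    "place container on table",
}


def total_score(item_scores):
    """Sum item scores after checking that each is an integer from 1 to 7."""
    for item, score in item_scores.items():
        if not isinstance(score, int) or not 1 <= score <= 7:
            raise ValueError(f"{item!r}: score must be an integer from 1 to 7")
    return sum(item_scores.values())


def cahai13_total(item_scores):
    """Total CAHAI-13 score (possible range 13 to 91)."""
    if len(item_scores) != 13:
        raise ValueError("CAHAI-13 needs exactly 13 item scores")
    return total_score(item_scores)


def cahai9_total(item_scores):
    """Abstract the CAHAI-9 total from a full set of CAHAI-13 item scores."""
    missing = DROPPED_FOR_CAHAI9 - set(item_scores)
    if len(item_scores) != 13 or missing:
        raise ValueError("expected 13 items, including the 4 eliminated ones")
    kept = {k: v for k, v in item_scores.items() if k not in DROPPED_FOR_CAHAI9}
    return total_score(kept)


# Hypothetical patient; placeholder names stand in for the 9 retained items.
scores = {f"retained item {i}": 4 for i in range(1, 10)}
scores.update({name: 5 for name in DROPPED_FOR_CAHAI9})
print(cahai13_total(scores), cahai9_total(scores))
```

Because the CAHAI-9 is abstracted from the same administration, no separate testing session is implied; its total can range from 9 to 63.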
ARAT
This measure, which consists of 19 items that assess some aspects of function and impairment, uses a 4-point ordinal quantitative scale (0=no movement possible, 1=movement partially performed, 2=movement performed but abnormally, 3=movement performed normally)3 to give a maximum total raw score of 57. Spearman rank-order correlation coefficient and intraclass correlation coefficient have been used to demonstrate high levels of interrater reliability (.99) and test-retest reliability (.98) for this measure.18
Data Analysis
Receiver operating characteristic19 curve analysis was applied to determine whether the known group longitudinal validity coefficients (expressed as the area under the ROC curve) of the 2 CAHAI versions exceeded that of the ARAT. Curve areas were compared using the method of Hanley and McNeil20 for data derived from the same cases, hereafter referred to as "dependent data." The difference in the areas under the ROC curves for the measures was divided by the standard error of the difference in areas, giving z values for comparison.21
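The comparison of dependent curve areas can be sketched as follows. This is an illustration of the general Hanley and McNeil approach, not a reproduction of the authors' SPSS analysis. The function names are hypothetical, the group sizes and the correlation between areas used in the example are assumed values chosen only to make the code run, and in the published method the correlation between areas is obtained from a lookup table rather than supplied directly.

```python
import math


def auc_standard_error(area, n_more, n_less):
    """Hanley-McNeil (1982) approximation to the standard error of an ROC area.

    n_more and n_less are the sizes of the two groups being discriminated.
    """
    q1 = area / (2 - area)
    q2 = 2 * area**2 / (1 + area)
    variance = (
        area * (1 - area)
        + (n_more - 1) * (q1 - area**2)
        + (n_less - 1) * (q2 - area**2)
    ) / (n_more * n_less)
    return math.sqrt(variance)


def compare_dependent_aucs(area1, area2, se1, se2, r):
    """z statistic and one-tailed P for two ROC areas from the same cases.

    r is the correlation between the two areas (an input here).
    """
    se_diff = math.sqrt(se1**2 + se2**2 - 2 * r * se1 * se2)
    z = (area1 - area2) / se_diff
    p_one_tailed = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal area
    return z, p_one_tailed


# Areas taken from the abstract; group sizes and r are ASSUMED for illustration.
se_cahai13 = auc_standard_error(0.86, n_more=52, n_less=53)
se_arat = auc_standard_error(0.72, n_more=52, n_less=53)
z, p = compare_dependent_aucs(0.86, 0.72, se_cahai13, se_arat, r=0.5)
print(f"z = {z:.2f}, one-tailed P = {p:.3f}")
```

The subtraction of the covariance term is what distinguishes the dependent-data comparison from a comparison of areas obtained on independent samples.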
The convergent cross-sectional construct validity analysis applied a Pearson product moment coefficient of correlation (r). The Meng Z test for dependent samples was applied to evaluate whether the cross-sectional validity coefficient for the CAHAI-13 exceeded that of the CAHAI-9.22 When comparing correlated correlation coefficients on the same set of patients, this analysis is considered superior to the Hotelling t test. It uses the Fisher z transformation, which converts the Pearson r to a value with an approximately normal sampling distribution and, in this way, allows a comparison of the correlations of 2 or more measures with a dependent variable (eg, gains in upper-limb function).
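A sketch of the test for two correlated correlation coefficients, following the commonly cited formula of Meng, Rosenthal, and Rubin, is shown below. The function name and the example values are hypothetical; the example correlations are not results from this study.

```python
import math


def meng_z(r1, r2, r_between, n):
    """Z statistic comparing two dependent correlations.

    r1, r2     correlations of measure 1 and measure 2 with the same criterion
    r_between  correlation between the two measures themselves
    n          number of patients (the same patients for all three correlations)
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)  # Fisher z transformation
    mean_r_sq = (r1**2 + r2**2) / 2
    f = min(1.0, (1 - r_between) / (2 * (1 - mean_r_sq)))  # f is capped at 1
    h = (1 - f * mean_r_sq) / (1 - mean_r_sq)
    return (z1 - z2) * math.sqrt((n - 3) / (2 * (1 - r_between) * h))


# Hypothetical values, for illustration only.
print(round(meng_z(r1=0.5, r2=0.3, r_between=0.6, n=103), 2))
```

When the two measures correlate equally with the criterion, the statistic is 0, which corresponds to identical levels of cross-sectional validity.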
Regression analysis, including the calculation of 95% confidence intervals (CIs) and prediction bands, was applied to describe the ability of the CAHAI-9 to predict CAHAI-13 scores and change scores.23 Confidence intervals provide the estimated range of values in which the mean CAHAI-13 value is likely to fall for a specified CAHAI-9 value; prediction bands give the estimated range of values in which an individual's CAHAI-13 value is likely to lie for a specified CAHAI-9 value.23 Data were analyzed using SPSS 11.5.*
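The difference between a confidence interval for the mean and a prediction band for an individual can be illustrated with ordinary least squares on one predictor. The sketch below uses only the Python standard library and, as a simplifying assumption, a normal critical value in place of the t distribution, which is reasonable only for samples of roughly the size used here. The function name and the data are hypothetical.

```python
import math
from statistics import NormalDist, mean


def regression_intervals(x, y, x0, level=0.95):
    """Fit y = a + b*x and return (estimate, ci_for_mean, prediction_interval) at x0."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residual_ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(residual_ss / (n - 2))          # residual standard deviation
    crit = NormalDist().inv_cdf(0.5 + level / 2)  # large-sample approximation
    estimate = intercept + slope * x0
    leverage = 1 / n + (x0 - x_bar) ** 2 / sxx
    half_ci = crit * s * math.sqrt(leverage)       # uncertainty in the mean
    half_pi = crit * s * math.sqrt(1 + leverage)   # adds individual scatter
    return (
        estimate,
        (estimate - half_ci, estimate + half_ci),
        (estimate - half_pi, estimate + half_pi),
    )


# Hypothetical paired scores (short-form total, long-form total).
short_form = [10, 20, 30, 40, 50, 60]
long_form = [14.0, 29.5, 41.0, 56.5, 69.0, 84.5]
est, ci, pi = regression_intervals(short_form, long_form, x0=35)
print(est, ci, pi)
```

The prediction band is always wider than the confidence interval because it carries the extra term for scatter of individual patients around the regression line.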
Sample Size
In a previously reported pilot study of 39 patients, we found the ROC curve areas to be 0.95 for the CAHAI-13 and 0.94 for the CAHAI-9.8 The correlation between these curve areas exceeded .90. The sample size estimate for the current study's ROC curve area comparison was based on the following assumptions, which we believed to be conservative: (1) curve area for the CAHAI-13=0.90, (2) curve area for the comparison measure=0.87, (3) correlation between curve areas=.90, (4) an important difference in curve areas=0.07, (5) type I error probability=0.05 (one-tailed), (6) type II error probability=0.20, and (7) an equal number of patients with mild to moderate and severe presentations. Applying these assumptions, a sample size of 50 people with stroke per group, or 100 in total, was required.
| Results |
|---|
Initial visit: CAHAI-13=0.70+(1.38×CAHAI-9)
Follow-up visit: CAHAI-13=0.56+(1.39×CAHAI-9)
The 95% CIs and prediction bands for the initial visit data are presented in Figure 1. The relationship for the follow-up scores (not shown) was virtually identical to that shown in Figure 1. The correlation between the change scores of the 2 CAHAI versions was .96, and the predictive equation was: CAHAI-13=0.23+(1.38×CAHAI-9). Figure 2 provides the 95% CIs and prediction bands for the change scores.
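The three predictive equations can be applied as in the following sketch. The coefficients are those reported above; the function name and dictionary keys are hypothetical. The result is an estimate of the mean CAHAI-13 value for a given CAHAI-9 value and carries no information about the width of the prediction band for an individual patient.

```python
# (intercept, slope) pairs taken from the equations reported in the text.
EQUATIONS = {
    "initial": (0.70, 1.38),
    "follow-up": (0.56, 1.39),
    "change": (0.23, 1.38),
}


def predict_mean_cahai13(cahai9, kind="initial"):
    """Estimated mean CAHAI-13 score (or change score) for a CAHAI-9 value."""
    intercept, slope = EQUATIONS[kind]
    return intercept + slope * cahai9


# Lowest and highest possible CAHAI-9 totals (9 items scored 1 to 7).
for cahai9 in (9, 63):
    print(cahai9, round(predict_mean_cahai13(cahai9, "initial"), 2))
```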
| Discussion |
|---|
Our findings were clear for the CAHAI and ARAT comparison and ambiguous for the CAHAI-13 and CAHAI-9 comparison. The longitudinal validity of data for both CAHAI versions exceeded the specified clinically important difference in area under the ROC curves of 0.07, and the observed differences were statistically significant. The results for the CAHAI-13 and CAHAI-9 comparison were as follows: (1) both measures had identical levels of cross-sectional validity; (2) mean CAHAI-9 scores accurately predicted mean CAHAI-13 scores; (3) individual CAHAI-9 scores displayed moderate variability in predicting individual CAHAI-13 scores; (4) the longitudinal validity of scores on the CAHAI-13 was statistically superior to that of scores on the CAHAI-9, although it is uncertain whether the difference is clinically important; (5) mean CAHAI-9 change scores accurately predicted mean CAHAI-13 change scores; and (6) individual CAHAI-9 change scores displayed moderate variability in predicting individual CAHAI-13 change scores.
We believe the CAHAI has several benefits over the ARAT for evaluating upper-limb dysfunction in people with stroke. From its conception, the CAHAI was designed specifically to assess upper-limb function in people with stroke, whereas the ARAT was derived from a measure, designed in 1965, to assess upper-limb dysfunction in the general neurological population.15 A second advantage is that the CAHAI consists of bilateral, real-life tasks that people with stroke deemed meaningful and important, whereas the ARAT is composed of a mix of impairment-based and contrived functional items.
Careful consideration of the theoretical constructs underpinning the CAHAI has resulted in a tool that is consistent with the current frameworks of motor learning and performance.24–27 Although the ARAT unilaterally evaluates the paretic upper limb, the CAHAI examines how the affected arm and hand stabilize or manipulate objects as part of the upper limbs working in a coordinated fashion. Lastly, the CAHAI uses materials that are easily obtained, inexpensive, and transportable, whereas the ARAT is expensive to build and difficult to move.
Turning our attention to the CAHAI-13 and CAHAI-9 comparison, it is necessary to distinguish between 2 potential goals for the CAHAI-9. The principal goal is to assess upper-limb functional recovery, and a secondary goal may be to predict CAHAI-13 scores. If the intended application of the CAHAI-9 is to predict CAHAI-13 scores and change scores for a patient, our findings suggest that there is too much error to accomplish this with a high level of precision. Accordingly, if the CAHAI-13 is administered to a patient at the initial assessment, the CAHAI-9 should not be administered at a subsequent point in time with the intent of predicting what the CAHAI-13 score would have been had it been administered at the subsequent assessment. If the purpose is to assess upper-limb recovery, there is strong evidence that the CAHAI-13 is superior to the CAHAI-9; however, based on the difference in ROC curve areas, it is less clear whether the superiority is clinically important.
Within the context of our article, the greater the area under an ROC curve, the higher the probability that the measure would correctly identify true change in upper-limb function. For example, in a pair of patients who have both truly changed by different amounts, using a measure that has a curve area of 0.80 would correctly identify 80% of the time the patient who has changed the most in how the patient uses his or her arm and hand. Although ROC curve analysis has been applied to examine competing measures' abilities to detect change for over 2 decades, we were unable to find guidelines concerning the magnitude of a clinically important difference in curve areas. We acknowledge that our choice of 0.07 is arbitrary; however, it was based on our belief that a difference of 0.10 in curve areas was too great and that a difference of 0.05 was likely to be inconsequential. Accordingly, we split the difference. If one accepts a difference in curve areas of 0.07 as important, our results do not exclude the possibility that a difference of this magnitude truly exists (Z=1.84, one-tailed P=.033) based on the observed difference of 0.04 (range=0.00–0.08). This uncertainty is a consequence of our underestimating the required sample size. Specifically, our pilot study data8 overestimated the absolute curve areas and the correlation between curve areas and underestimated the true difference in curve areas. Applying the results from our current study (ie, ROC curve areas of 0.86 and 0.82 and a correlation between curve areas of .87) and maintaining the type I and type II error probabilities declared previously and an important difference in curve areas of 0.07, approximately 151 patients with mild to moderate presentations and 151 patients with severe presentations would be required.
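The pairwise interpretation of the curve area can be made concrete with a short sketch. It computes the area as the proportion of all cross-group pairs in which the patient from the group expected to change more shows the larger change score, counting ties as half. The function name and the change scores are hypothetical and are not data from this study.

```python
def auc_from_change_scores(more_change_group, less_change_group):
    """ROC area as the proportion of correctly ordered cross-group pairs."""
    correct = 0.0
    for a in more_change_group:
        for b in less_change_group:
            if a > b:
                correct += 1.0
            elif a == b:
                correct += 0.5  # a tie carries no information either way
    return correct / (len(more_change_group) * len(less_change_group))


# Hypothetical change scores for two small groups.
mild_moderate = [3, 5, 8]
severe = [1, 2, 4]
print(round(auc_from_change_scores(mild_moderate, severe), 2))
```

A value of 0.50 from this calculation corresponds to a measure that orders such pairs no better than chance.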
| Conclusion |
|---|
| Appendix |
|---|
| Footnotes |
|---|
Ms Barreca provided concept/idea/research design and fund procurement. Ms Barreca, Mr Stratford, and Ms Masters provided writing. Ms Lambert and Mr Griffiths provided subjects and data collection. Mr Stratford provided data analysis. Ms Masters provided project management, data collection, and clerical support. The authors acknowledge the financial support provided by the Ontario Ministry of Health and Long-Term Care through the Ontario Heart & Stroke Initiative.
* SPSS Inc, 233 S Wacker Dr, Chicago, IL 60606.
| References |
|---|