Research Reports
GP Williams, PhD, is Senior Physiotherapist, Physiotherapy Department, Epworth Hospital, 89 Bridge Rd, Richmond 3121, Melbourne, Victoria, Australia (gavinw{at}epworth.org.au). Address all correspondence to Dr Williams.
KM Greenwood, PhD, is Professor, School of Health Sciences, University of Canberra
VJ Robertson, PT, PhD, is Associate Professor, University of Newcastle
PA Goldie, PT, PhD, is Associate Professor, School of Physiotherapy, La Trobe University
ME Morris, PT, PhD, FACP, is Professor, School of Physiotherapy, University of Melbourne
Submitted April 14, 2005;
Accepted September 14, 2005
| Abstract |
|---|
Key Words: Brain injuries, Neurologic gait disorders, Outcome assessment, Reliability
| Introduction |
|---|
A key requirement for the measurement of physical performance is that results are found to be reliable. Reliability refers to the extent to which a measure is consistent from one testing occasion to the next and free from error. The 3 main types of scale reliability are interrater, retest (intrarater), and internal consistency.4–6 Interrater reliability and retest reliability are particularly important for the clinical use of the HiMAT because recovery from severe TBI may take many years7, 8 and several different therapists can be involved in treatment over an extended period of time. It is important to establish the interrater reliability for a scale to ensure agreement in scoring when it is likely that different therapists will examine the same person.
Retest reliability refers to the consistency with which measurements obtained for the same person can be replicated on a different occasion. It reflects the stability of a measure for consecutive testing in which the time interval between tests is short enough that no true change has occurred but long enough to reduce the confounding effects of fatigue or practice. The investigation of retest reliability is particularly important after TBI because cognitive impairments and behavioral dysfunction also may affect physical performance. Therefore, it is important to establish the impact of confounding effects on performance because their influence may be inconsistent between tests.
Internal consistency, also a form of reliability, assesses the homogeneity of the items on a scale. For example, hopping forward, jumping, or running while throwing a ball may be considered to be high-level mobility tasks. In addition to requiring fast movement, running while throwing a ball necessitates upper-limb skills and challenges cognitive ability for judgment and planning. Such actions often are described as dual-task activities because they require 2 or more activities to be performed simultaneously. Dual-task activities were excluded during the developmental stages of the HiMAT because of the possible impact of concomitant impairments, such as upper-limb function or cognitive impairments, on mobility testing in the TBI population.1 Examining the internal consistency of the HiMAT can establish the extent to which the items relate to each other and belong together.
| Method |
|---|
The Table summarizes the age, length of posttraumatic amnesia (an indicator of severity of injury), and length of time postinjury for each of the groups. The values displayed represent the median and interquartile range. The last column in the Table displays the individual characteristics for the original group of 103 participants from which the HiMAT was developed and internal consistency was calculated. There was no significant difference in age, length of posttraumatic amnesia, or length of time postinjury between the interrater reliability group (n=17) and the cohort of 103 participants used to calculate internal consistency. There was no significant difference in age or length of posttraumatic amnesia between the retest reliability group and the original group of 103 participants, but the retest reliability group had a significantly longer time postinjury (t=2.35, df=82, P<.05) as a result of the selection criteria. Thus, the deliberate strategy of sampling for participants in the chronic recovery phase for the investigation of retest reliability was successful.
Investigation of live (not observed from a videotaped recording) interrater reliability is essential because it reflects the conditions experienced in the workplaces in which the scale is intended for use. Three physical therapists independently and concurrently scored the performance of the 17 participants. No discussion among physical therapists was allowed. The physical therapists had 11, 10, and 6 years of experience in neurologic physical therapy. Because 2 of the 3 physical therapists had no prior knowledge of, or training on, the HiMAT, an instruction sheet was provided 5 minutes before testing. The 3 physical therapists independently timed all items. The distance jumped in each of the 2 bounding items was measured from the front of the participant's foot before the jump to the participant's heel at the point of landing after the jump. This distance was measured by only one therapist because the measurement simply required recording the distance between 2 marks made on the ground and was unlikely to result in enough error to affect the HiMAT score. The therapists independently converted the scores to calculate an overall score for the HiMAT and submitted their results.
The testing procedure for the retest reliability study followed the guidelines previously published for the HiMAT.2 The 2-day break between tests allowed for adequate recovery but was short enough that natural recovery in a group of participants in the chronic recovery phase was unlikely to occur. Subjects performed the items in the same order for both test sessions to control for any ordering effect. The testing procedure for the internal consistency study also followed the previously published HiMAT guidelines.2
Data Analysis
Statistical analyses were performed with SPSS version 11.0.*
Interrater Reliability
Interrater reliability was assessed for each of the items with an intraclass correlation coefficient (ICC[2,1]), for the raw (timed) scores and the converted scores. The total scores obtained on the HiMAT were independently calculated by the physical therapists for each subject and assessed with an ICC(2,1).
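As a rough illustration of the statistic named above, the following sketch computes ICC(2,1) from a two-way analysis of variance, using the Shrout and Fleiss form of the coefficient (two-way random effects, absolute agreement, single rater). The study ran its analyses in SPSS, so this is not the authors' procedure. The function name `icc_2_1` and the example ratings are hypothetical and are not study data.

```python
from statistics import mean


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one row per participant, each row holding
    one score per rater (no missing values).
    """
    n = len(ratings)       # number of participants
    k = len(ratings[0])    # number of raters
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares: participants, raters, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical total scores for 5 participants from 3 raters (not study data).
example = [
    [12, 13, 12],
    [30, 29, 30],
    [45, 45, 46],
    [22, 23, 22],
    [38, 38, 37],
]
print(round(icc_2_1(example), 3))
```

Because the rater mean square appears in the denominator, a rater who scores consistently higher or lower than the others lowers this coefficient, which is why the absolute-agreement form suits a scale that different therapists will score on the same person.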
Retest Reliability
To investigate retest reliability, an ICC(2,1) was calculated for the total HiMAT scores. Mean difference scores were obtained by subtracting retest scores from initial scores for each participant. A paired t test was used to compare the initial and retest HiMAT scores to determine whether a practice effect had occurred. The 95% confidence intervals for determining the minimal detectable change (MDC95) on the HiMAT were calculated with the formula: MDC95 = mean difference ± (1.96 × SEM), where SEM is the standard error of measurement.
Other methods for determining MDC95 have been reported9, 10 that include an adjustment when calculating errors associated with both the test and the retest scores. The adjustment was unnecessary in this study, because the error associated with the retest scores (mean difference) could be calculated directly to accurately determine the MDC95. To calculate the SEM, the standard deviations from the initial and retest scores were pooled according to the equation outlined by Mendenhall et al.11
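The calculation described above can be sketched in a few lines. The sketch rests on two assumptions that the text does not spell out: that the pooled standard deviation is the root mean square of the two session standard deviations (reasonable when both sessions include the same participants), and that the SEM follows the common formula SEM = SD × √(1 − ICC). The function names are hypothetical.

```python
import math


def pooled_sd(sd_initial, sd_retest):
    """Pool two standard deviations from equal-sized samples (assumed form)."""
    return math.sqrt((sd_initial ** 2 + sd_retest ** 2) / 2)


def sem_from_icc(sd, icc):
    """Standard error of measurement, assuming SEM = SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1 - icc)


def mdc95(mean_difference, sem):
    """Return (lower, upper) bounds of mean difference +/- 1.96 * SEM."""
    half_width = 1.96 * sem
    return mean_difference - half_width, mean_difference + half_width


def whole_point_bounds(lower, upper):
    """Round both bounds outward for a scale scored in whole points."""
    return math.floor(lower), math.ceil(upper)
```

Centering the interval on the mean difference rather than on zero is what lets a systematic shift between sessions produce asymmetric thresholds for deterioration and improvement.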
Internal Consistency
The internal consistency of the HiMAT items was investigated with Cronbach alpha. The HiMAT scores obtained from the original group of 103 participants were assessed.
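For readers unfamiliar with the statistic, a minimal sketch of Cronbach alpha is given here. It uses the standard definition based on item variances and the variance of the total score. The function name `cronbach_alpha` and the example scores are hypothetical and are not study data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach alpha for a list of rows (participants) by items."""
    k = len(scores[0])  # number of items
    # Sample variance of each item across participants.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of each participant's total score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical item scores for 4 participants on 3 items (not study data).
example = [
    [0, 1, 1],
    [2, 2, 3],
    [3, 4, 4],
    [1, 1, 2],
]
print(round(cronbach_alpha(example), 3))
```

Alpha approaches 1 when participants who score high on one item also score high on the others, so the variance of the total score is much larger than the sum of the item variances.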
| Results |
|---|
Retest Reliability
The retest reliability of the HiMAT scores also was very high (ICC=.99), indicating that people with TBI had highly consistent performances. A comparison of mean differences identified a mean improvement of only 1.0 point (range=–1 to 3) at retest. A paired t test showed the mean differences to be significant (t=3.82, df=19, P<.001), indicating that only a very small systematic improvement had occurred. The SEM was 1.36. The standard deviations used to calculate the SEM were 13.4 for the initial test and 13.7 for the retest. The MDC95 for the HiMAT was calculated to be 1±2.66 points. Because the HiMAT can be scored only in whole numbers, the MDC95 is adjusted to –2 to +4, indicating that participants must deteriorate by 2 points or improve by 4 points for clinicians to be 95% confident that a real change has occurred.
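The arithmetic behind the adjusted interval can be written out using only the values reported above. The rounding step assumes that each bound is moved outward to the next whole point, which is consistent with the reported –2 to +4.

```latex
\begin{align*}
\mathrm{MDC}_{95} &= \text{mean difference} \pm (1.96 \times \mathrm{SEM}) = 1 \pm 2.66 \\
\text{lower bound} &= 1 - 2.66 = -1.66 \;\rightarrow\; -2 \text{ points} \\
\text{upper bound} &= 1 + 2.66 = 3.66 \;\rightarrow\; +4 \text{ points}
\end{align*}
```

The product 1.96 × 1.36 is 2.67 to 2 decimal places; the reported value of 2.66 presumably reflects an SEM carried at more than 2 decimal places.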
Internal Consistency
The internal consistency of the final version of the HiMAT was very high (Cronbach alpha=.97), indicating that the HiMAT consists of a group of homogeneous high-level mobility items.
| Discussion |
|---|
The results also showed that the retest reliability for the HiMAT was high. Nonetheless, the ICC, which measures both agreement and consistency, did not reflect the systematic change identified by the paired t test. Participants showed a mean improvement on the HiMAT of 1.0 point when retested. The mean improvement could have been attributable to a variety of factors, including natural recovery between testing sessions; improvement in physical impairments, such as strength or balance; improved cardiovascular fitness; skill acquisition or motor learning; and improved confidence in physical abilities. Several strategies were implemented to limit systematic change. First, only participants well beyond the acute recovery phase for TBI were asked to return for repeat testing. Second, we used the 2-day time interval between tests recommended for retest reliability studies of physical measures.12–14 Third, all participants were offered practice or familiarization trials to reduce the impact of impaired cognition and reduced confidence on the test score and to improve the likelihood that a true measure of mobility was attained. Familiarization trials were considered to be especially important because many of the participants had not attempted some of the high-level mobility items contained in the HiMAT since their accident. Despite these strategies being in place, a small systematic improvement was found.
It is highly unlikely that the systematic improvement at retest was attributable to natural recovery, because the median time postinjury for this group was 4.5 years and the time to retest was 2 days. The short period between tests made it highly unlikely that any true change had occurred. It is also highly unlikely that the systematic improvement was attributable to a reduction in physical impairment, an improvement in cardiovascular fitness, or skill acquisition because the interval between tests was only 2 days and participants were not given an opportunity to practice. The most likely reason for the small improvement at retest was the confidence gained during the initial test, enabling some participants to attempt the items with more vigor on the second occasion. Williams and Goldie15 obtained a similar finding when investigating several high-level mobility items in the TBI population. They found a systematic improvement in bounding distance, walking speed, and running speed when 40 people with TBI were retested. These participants were retested after a period of 2 days, were given familiarization trials and, although not as far along in the chronic recovery phase as the current group, were still well beyond the acute recovery phase for TBI (mean time postinjury=22 months). Many of the higher-level mobility items on the HiMAT represent physically challenging tasks not routinely encountered from day to day, especially given that subjects were asked to perform the majority of items as quickly as they safely could. Clinicians need to be aware that, despite strategies deliberately designed to reduce the likelihood of a practice effect, people with TBI can demonstrate a small improvement in successive trials.
The MDC95 for the HiMAT was small (1±2.66 points), representing less than 5% of the total scale. When the MDC95 is adjusted to take systematic improvement into consideration, participants must improve by 4 points or deteriorate by 2 points (the HiMAT uses only whole points) for clinicians to be 95% confident that true change has occurred. No participant deteriorated by 2 or more points or improved by 4 or more points at retest.
Internal consistency was high, indicating that the HiMAT items are homogeneous and are a reliable group of items for measuring high-level mobility. Therefore, variations in subject test scores can be attributed to differences in ability rather than measurement error. This result supports the findings of the Rasch analysis in the developmental stages of the HiMAT.2
| Conclusion |
|---|
| Footnotes |
|---|
This study was approved by the ethical standards boards of La Trobe University and Epworth Hospital.
This study was supported by a Faculty Research Grant from La Trobe University.
This work was presented at the Sixth World Congress on Brain Injury; May 6–8, 2005; Melbourne, Victoria, Australia.
* SPSS Inc, 233 S Wacker Dr, Chicago, IL 60606.
| References |
|---|