Research Reports |
J Paltamaa, MSc, is Researcher and Physiotherapist, Department of Physical Medicine and Rehabilitation, Central Hospital, Jyväskylä, Finland, and a doctoral student in the Department of Health Sciences, University of Jyväskylä, PO Box 35 (Viv), FI-40014, Jyväskylä, Finland
T Sarasoja, MD, is Neurologist, Department of Neurology, Central Hospital, Jyväskylä, Finland
E Leskinen, PhD, is Professor in Statistics, Department of Mathematics and Statistics, University of Jyväskylä
J Wikström, MD, PhD, is Senior Lecturer, Department of Neurology, University of Helsinki, Finland
E Mälkiä, PhD, is Physiotherapist and Professor in Physiotherapy, Department of Health Sciences, University of Jyväskylä, Finland
Address all correspondence to Ms Paltamaa at: jaana.paltamaa{at}sport.jyu.fi
Submitted February 22, 2007;
Accepted August 27, 2007
| Abstract |
|---|
Subjects: The participants were 120 people with MS who were ambulatory from a population-based sample.
Methods: Physical functioning was assessed by quantitative clinical measures of activities (n=5) and body functions (n=7) and by self-reported performance in self-care, mobility, and domestic life domains in the activities and participation component of the ICF at baseline and 2 years later. A participant's perception of change and a change in Expanded Disability Status Scale (EDSS) scores were used as external criteria in the analysis of the receiver operating characteristic curve and the minimally important change score. The minimal detectable change was calculated as distribution-based responsiveness.
Results: According to the external criteria, 51% of the participants showed deterioration as measured by their own perceptions compared with the 26% of the participants who showed deterioration as rated by the clinician. Regardless of the external criterion applied, the measures most responsive to deterioration were self-reported scores in self-care, mobility, and domestic life; distance walked and change in heart rate during a 6-minute walk test; 10-m walk test speeds, stride length, and cadence; repetitive squatting; and Box and Block Test scores.
Discussion and Conclusion: The results show the relative responsiveness of different measures in the subsample who deteriorated and provide data that can facilitate the interpretation of score changes in people with MS who are ambulatory for future studies and in clinical practice.
| Introduction |
|---|
Responsiveness has been defined as an instrument's ability to detect change over time.5 The main criteria for responsiveness are: (1) evidence of changes in the scores of the instrument, (2) longitudinal data comparing a group that is expected to change with a group that is expected to remain stable, and (3) populations on which responsiveness has been tested, including the time intervals between assessments, the interventions or measures involved in evaluating change, and populations assumed to be stable.5
There is no consensus in the literature on the exact definition of responsiveness. An integrated system for defining clinically meaningful change that combines anchor-based and distribution-based methods has been recommended.3,6 Anchor-based methods focus on the correspondence between change in the outcome measure of interest and change in an external criterion.3 Longitudinal anchor-based methods are preferable to cross-sectional methods because the former are more directly linked to change.6 In considering these longitudinal methods, patients' self-ratings are especially well suited for assessing perception of change from the individual's perspective and are recommended for that purpose. Clinicians' global ratings and longitudinal disease-related measures of outcomes are the most suitable methods of determining meaningful change from the clinician's perspective. Distribution-based methods include those based on statistical significance, sample variability, and measurement precision.6
In previous studies of MS, the responsiveness of the Multiple Sclerosis Impact Scale (MSIS-29) was compared with that of other self-report scales,7 and the responsiveness of 3 health-related quality-of-life instruments8 and 5 clinical rating scales used in MS research9 was reported. De Groot et al4 studied the responsiveness of 23 outcome measures in the domains of disease-specific outcomes, physical functioning, mental health, social functioning, and general health. At the group level, the most responsive measures per domain for early MS were the Medical Outcome Study 36-Item Short-Form Health Survey subscales of physical functioning and health, the Disability and Impact Profile psychological subscale, and the Rehabilitation Activities Profile occupation subscale. On the individual level, none of the measures reliably detected minimally important change (MIC).
To our knowledge, no other findings have been published with regard to the responsiveness of the physical functioning measures used in people with MS. This dearth of published data raises questions about the use of these measures for monitoring the clinical recovery of people with MS or for monitoring responses to interventions, despite the fact that a large number of different outcome measures have been widely used for these purposes.
With these considerations in mind, we conducted a 2-year follow-up study in people with MS who were ambulatory in order to provide estimates of responsiveness for a variety of physical functioning measures. The World Health Organization's International Classification of Functioning, Disability, and Health (ICF)10 formed the framework for all of the measures and data used in this study. We used both anchor-based and distribution-based methods to assess responsiveness. The specific aims of this study were: (1) to examine clinically significant deterioration in the physical functioning measures in relation both to participants' perceptions of changes and to a clinician's rating and (2) to provide evidence on changes in the scores of the physical functioning measures.
| Method |
|---|
Procedure
A single-group design method was used in this study. All participants gave their written informed consent before entering the study.
The selection of physical functioning items was based on the ICF10 and covered the physiological functions of the body system (body functions), the execution of a task or action by an individual (activities), and involvement in a life situation (participation). We inventoried these components, selected the most appropriate generic or disease-specific measures for each major domain, and linked the constructs of the measures to the ICF.14 In the activities and participation component, functioning can be subdivided into "performance" and "capacity."10 Performance describes what individuals do in their current environment, and it also can be understood as "the lived experience" of people in their actual life context.10 In this study, performance in the activities and participation component was assessed by self-reported scores in self-care, mobility, and domestic life. Quantitative clinical measures were used to assess both capacity, indicating the highest probable level of physical functioning that a person may attain in a given activity domain at a given moment, and body functions.
Each participant attended 3 test sessions over the period September 2000 to December 2002. All measures were conducted in the physical therapy department at the Central Hospital by 2 physical therapists (the primary researcher [JP] and one other therapist, both of whom had participated in the earlier interrater reliability study14). The examiners used a detailed protocol that included precise, standardized instructions. The measures were administered in the same order during each testing session. Rest breaks were given, as needed, both within and between measures. At the time of each testing, the participants had to be stable in their MS, with no ongoing relapse. Test sessions were separated by nearly 1 year in order to minimize the effect of season on functioning. The same experienced neurologist obtained all of the participants' EDSS scores every year prior to or within 3 days following the physical functioning test session.
Outcome Measures
Performance in the self-care, mobility, and domestic life domains in the activities and participation component.
The participants' performance regarding self-care, mobility, and domestic life was assessed by self-reported scores from the Functional Status Questionnaire (FSQ).15 Only FSQ items that assess aspects of physical function (ie, basic and instrumental activities of daily living) were included. In our study, these FSQ items were reclassified to enable the results to be linked to the activities and participation component of the ICF. A 4-item self-care scale (washing, toileting, dressing, and eating), a 5-item mobility scale (walking inside, climbing stairs, walking -km distances, driving a car, and using public transportation), and a 5-item domestic life scale (preparing meals; washing clothes; cleaning the house; acquiring goods and services; and taking care of plants, indoors and outdoors) were constructed.
Self-reported difficulty performing self-care, mobility, and domestic life activities during the past month was scored on the FSQ as follows: "usually did with no difficulty," "usually did with some difficulty," "usually did with much difficulty (ie, require some aid or assistance)," "usually did not do because of MS," and "usually did not do for other reasons." The calculation of each FSQ score was transformed according to published algorithms.15 The FSQ score ranges between 0 and 100, with 0 representing fully dependent and 100 fully independent performances. The reliability of the FSQ scales has been found to be high across a wide range of settings and patient populations.15–17
Measures of activities (capacity).
Participants underwent 6 quantitative clinical measures. Table 1 presents an overview of the measures, the parameters used, and data from the participants at baseline and 2 years later. The reliability of data for these measures for the people with MS has been described elsewhere.14
Changing and maintaining body positions were assessed using 3 different measures. The Berg Balance Scale (BBS) is a 14-item, performance-based instrument for individuals with some degree of balance impairment.19 Each item is scored on a 5-point scale, from 0 (cannot perform) to 4 (normal performance), giving a maximum score of 56. As exceptions to the standard instructions, the tandem stance and one-leg stance were performed on both legs, and the poorest score was recorded. The Kela Coordination Test,20 a measure developed by the Social Insurance Institution of Finland (Kela), consisted of 2 parts: walking forward and backward on a narrow plank (width 9 cm and height 4 cm) and performing a series of steps on a track. The time (in seconds) used to complete the task was measured, and the number of possible errors (stepping off from the plank or outside the marks) was counted. In the postural stability tests, which were performed on a Good Balance force platform,* participants stood with their feet 20 cm apart with eyes open and with eyes closed.21 Sway during a 30-second period was recorded, and the median velocity (velocity moment, in square millimeters per second) was analyzed as the outcome.
Walking was measured over both shorter and longer distances. In the 10-m walk test, time needed to walk a distance of 10 m was measured by Newtest photocells at normal gait speed ("own speed") and at maximum gait speed ("fast as possible"), and the velocities (in meters per second) were calculated.22,23 A 6-minute walk test (6MWT) was conducted as described by Guyatt and colleagues.24,25 The distance walked during the 6MWT was recorded to the nearest 10 m covered.
Measures of body functions.
We used 7 measures of the physiological functions of the body. Table 1 presents an overview of the measures, the parameters used, and data from the participants at baseline and 2 years later. The reliability of data for these measures for people with MS has been described elsewhere.14
Exercise tolerance functions were assessed by measuring heart rate (HR), physiological cost index (PCI), and rating of perceived exertion (RPE). Heart rate (in beats per minute) was recorded at rest and every 2 minutes during the 6MWT using the Polar Heartwatch. Heart rate at the end of the test and change in HR (HR at 6 minutes – HR at rest) were recorded. The PCI (in beats per meter)26 was calculated by dividing the difference between HR while walking and HR at rest (in beats per minute) by walking speed (in meters per minute). Rating of perceived exertion (RPE)27 during the last 10 seconds of the 6MWT was noted.
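The PCI calculation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the use of 6MWT distance and duration to derive walking speed, and the example values are all assumptions.

```python
def physiological_cost_index(hr_walking_bpm, hr_rest_bpm, distance_m, duration_min=6.0):
    """Return the PCI in beats per meter.

    PCI = (HR while walking - HR at rest) / walking speed,
    with HR in beats per minute and speed in meters per minute.
    """
    if distance_m <= 0 or duration_min <= 0:
        raise ValueError("distance and duration must be positive")
    speed_m_per_min = distance_m / duration_min
    return (hr_walking_bpm - hr_rest_bpm) / speed_m_per_min


# Hypothetical participant: HR of 110 while walking, 70 at rest, 480 m in 6 minutes.
pci = physiological_cost_index(110, 70, 480)
print(f"PCI = {pci:.2f} beats/m")
```

Because beats per minute are divided by meters per minute, the minutes cancel and the result is expressed in beats per meter.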
Gait pattern functions were measured at normal walking speed using a 10-m walk test. We measured actual step lengths by the method described by Kokko et al.23 Mean stride lengths (in centimeters) and cadence (in steps per minute) were calculated. In addition, the walk ratio (in meters/steps per minute)28 was calculated as a speed-independent index of gait pattern.
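As a rough sketch of how these gait parameters relate to one another, the snippet below derives step length, speed, and walk ratio from stride length and cadence. It assumes that one stride equals two steps and that the walk ratio is step length (in meters) divided by cadence (in steps per minute), which matches the stated unit; the function name and example values are hypothetical.

```python
def gait_parameters(stride_length_cm, cadence_steps_per_min):
    """Derive step length, walking speed, and walk ratio from stride length and cadence."""
    if stride_length_cm <= 0 or cadence_steps_per_min <= 0:
        raise ValueError("stride length and cadence must be positive")
    # Assumption: one stride consists of two steps.
    step_length_m = stride_length_cm / 100.0 / 2.0
    # Speed follows from step length times steps per second.
    speed_m_per_s = step_length_m * cadence_steps_per_min / 60.0
    # Walk ratio in meters per (steps per minute).
    walk_ratio = step_length_m / cadence_steps_per_min
    return {
        "step_length_m": step_length_m,
        "speed_m_per_s": speed_m_per_s,
        "walk_ratio": walk_ratio,
    }


# Hypothetical values: 130-cm stride at 110 steps per minute.
print(gait_parameters(130, 110))
```

If a person walks faster by raising step length and cadence in the same proportion, the ratio stays constant, which is why it is described as speed independent.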
Muscle power functions were assessed by measuring grip strength (force-generating capacity) and maximal isometric force of the knee extensors. Grip strength was measured using a Jamar dynamometer following the instructions of the American Society of Hand Therapists.29 The best result out of 3 trials of right-hand grip strength was taken for the analysis. Maximal isometric force of the knee extensors was measured using a custom-built David 200 dynamometer chair|| and Newtest Force Isometric Strength Testing System.30 The best result out of 3 maximal contractions was taken for the analysis, and the maximum strength (in kilograms) was recorded.
Muscle endurance functions of the upper and lower extremities were measured by the repetitive endurance tests of the Invalid Foundation of Finland.31 We used alternative dumbbell presses with standard weights (5 kg for female participants and 10 kg for male participants) to measure the endurance of the upper extremities.31 If necessary, the weights were adjusted for each participant, and that weight was used throughout the study. The participant stood with feet 15 cm apart and elbows bent with a dumbbell in each hand, raised the right-hand dumbbell upward, and then returned it to the initial position. The same movement was done with the left hand, and the alternative right and left movements were repeated as many times as possible at a constant rate. The repetition maximum was multiplied by the weights (in kilograms) lifted. The muscle endurance of the lower extremities was assessed by the number of squats performed.31 The participant stood with feet 15 cm apart, performed a squat such that both thighs were horizontal, and then returned to a standing position. The movement was repeated as many times as possible at a constant rate, and the repetition maximum was calculated. The repetition rate was 44 times per minute in both endurance tests.
The muscle tone (velocity-dependent resistance to stretch) of 4 muscle groups on each side of the body (the biceps and triceps brachii muscles, the knee flexors, and the quadriceps femoris muscle) was tested by the 6-point Modified Ashworth Scale (MAS).32 The combined upper- and lower-limb spasticity score (0–20) was the sum of the scores for the individual muscles.14
Sensation of muscle stiffness was assessed by measuring hamstring muscle flexibility using the passive straight leg raise (SLR) test.33 The angles of the SLR (in degrees) were measured using a Dualer electronic inclinometer,# and the degrees of the right side were taken for the analysis.
External criteria.
To determine whether a participant's score had changed, 2 external criteria were applied: (1) the participant's perception of change by a single item of the RAND 36-Item Health Survey (RAND-36) that indicates perceived change in health and (2) the change in EDSS scores, representing the perspective of the clinician.
The RAND-3634 is a brief questionnaire that has been well validated in the social science and medical literature and is used extensively around the world as a tool for assessing clinically relevant patient outcomes. The RAND-36 item used was: "Compared with 1 year ago, how would you rate your health in general now?" Participants rated their perceived change on a 5-point Likert-type scale as follows: "much better now than 1 year ago," "somewhat better now than 1 year ago," "about the same," "somewhat worse now than 1 year ago," and "much worse now than 1 year ago."
The EDSS,13 which is the standard measure of disease progression and degree of neurological impairment in clinical practice and clinical trials, was assessed by a neurologist. It divides functioning into 8 functional systems: pyramidal, cerebellar, brain stem, cerebral, bowel and bladder, sensory, visual, and "other." Impairment in each system is graded and then summed across the 8 systems. Scores for the total scale can range from 0 (no neurological abnormality) to 10 (death from MS).
Data Analysis
All statistical analyses were performed with SPSS, version 14.0 for Windows.** Data from the participants at baseline and 2 years later were analyzed. Group differences between participants and dropouts and differences in baseline EDSS scores and in occurrence of relapses were compared using the Mann-Whitney U test or Kruskal-Wallis test for continuous variables and the Pearson chi-square test for categorical variables. The percentage of raw agreement and the Cohen kappa were used to examine the agreement between the participants' perceptions and the clinician's ratings. Probability values below .05 were considered statistically significant.
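For readers unfamiliar with the agreement statistics named above, the following sketch computes raw agreement and the unweighted Cohen kappa for two sets of categorical ratings. It is an illustration only (the analyses in the study were run in SPSS); the function name and the example labels are hypothetical.

```python
from collections import Counter


def raw_agreement_and_kappa(ratings_a, ratings_b):
    """Return (raw agreement, unweighted Cohen kappa) for paired categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: proportion of pairs with identical categories.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the marginal proportions, summed over categories.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical classifications for 6 participants.
participant = ["deteriorated", "deteriorated", "stable", "stable", "improved", "deteriorated"]
clinician = ["deteriorated", "stable", "stable", "stable", "improved", "stable"]
agreement, kappa = raw_agreement_and_kappa(participant, clinician)
print(f"raw agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

Kappa discounts the agreement expected by chance from the marginal frequencies, so it can be low even when raw agreement looks moderate.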
Anchor-based methods.
We used an interval of 1 year between the successive sets of ratings and combined them in the 2-year follow-up results separately for both external criteria. We used trichotomous categorizations of the change scores (deteriorated, stable, or improved). For the participants' perception of change, we classified participants as follows: deteriorated (somewhat or much worse), stable (about the same), and improved (somewhat or much better) according to the RAND-36 item. For the clinician's perspective, we used a change in EDSS scores of 1 point as a cutoff point for deterioration or improvement because it has frequently been used in previous trials and has been considered as clinically meaningful for patients with a baseline EDSS score of <6.0.9,35 The participants were classified as follows: deteriorated (change in EDSS score of ≥1 point), stable (change in EDSS score of between 0 and ±0.5 point), and improved (change in EDSS score of ≤–1 point).
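The trichotomous categorization can be sketched as below. This is a hedged illustration rather than the authors' procedure: the function and dictionary names are hypothetical, the EDSS rule assumes that an increase of 1 point or more counts as deterioration and a decrease of 1 point or more as improvement, and the way the two yearly RAND-36 ratings were combined into a 2-year result is not specified in the text and is therefore not implemented.

```python
# Mapping of the RAND-36 change item responses to the three categories.
RAND36_CATEGORY = {
    "much better now than 1 year ago": "improved",
    "somewhat better now than 1 year ago": "improved",
    "about the same": "stable",
    "somewhat worse now than 1 year ago": "deteriorated",
    "much worse now than 1 year ago": "deteriorated",
}


def classify_rand36(response):
    """Classify a single RAND-36 change-item response."""
    return RAND36_CATEGORY[response]


def classify_edss_change(baseline_edss, follow_up_edss, cutoff=1.0):
    """Classify EDSS change; higher EDSS scores indicate greater disability."""
    change = follow_up_edss - baseline_edss
    if change >= cutoff:
        return "deteriorated"
    if change <= -cutoff:
        return "improved"
    return "stable"  # change between 0 and +/-0.5 point


# Hypothetical examples.
print(classify_edss_change(2.0, 3.0))   # increase of 1 point
print(classify_edss_change(2.0, 2.5))   # increase of 0.5 point
print(classify_rand36("somewhat worse now than 1 year ago"))
```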
We calculated the area under the receiver operating characteristic (ROC) curve (AUC) with its 95% confidence interval (95% CI) for each physical functioning measure using changes in scores at 2 years from the baseline. The ROC curve is a graph that compares the rate at which the threshold correctly identifies participants showing change (sensitivity on the y-axis) with the rate at which participants are identified as showing change in the measure but not in the external criterion (1 – specificity on the x-axis) (Fig. 2). Relative responsiveness was assessed separately for deterioration and improvement. For both external criteria, the scores were dichotomized using the category of stable (no change) as the reference category. To compute the AUC, we used a nonparametric method that does not make any distributional assumptions.36 The AUC is a combined measure of sensitivity and specificity and can be interpreted as the probability of correctly discriminating between participants who are deteriorated and those who are stable. It can take any value between 0 and 1. The practical lower limit for the AUC is 0.5. The bigger the AUC, the better the overall performance of the measure.
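The probabilistic interpretation of the AUC can be made concrete with a small sketch. The estimator below is the pairwise (Mann-Whitney) form of the nonparametric AUC, counting ties as one half; it is one common way to compute the quantity and is not claimed to be the routine SPSS uses internally. The function name, the `lower_is_worse` flag, and the example change scores are hypothetical.

```python
def auc_nonparametric(changes_deteriorated, changes_stable, lower_is_worse=True):
    """Probability that a deteriorated participant shows a worse change score
    than a stable participant (ties count as 0.5)."""
    if not changes_deteriorated or not changes_stable:
        raise ValueError("both groups must contain at least one change score")
    # Flip the sign so that a larger value always means "worse".
    sign = -1.0 if lower_is_worse else 1.0
    wins = 0.0
    for d in changes_deteriorated:
        for s in changes_stable:
            if sign * d > sign * s:
                wins += 1.0
            elif d == s:
                wins += 0.5
    return wins / (len(changes_deteriorated) * len(changes_stable))


# Hypothetical 2-year changes in 6MWT distance (m); a decrease is worse.
deteriorated = [-80, -40, -10, 5]
stable = [-20, 0, 10, 15]
print(auc_nonparametric(deteriorated, stable))
```

A value of 0.5 means that the change scores do not separate the two groups better than chance, which is why 0.5 is treated as the practical lower limit.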
Distribution-based methods.
The minimal detectable change (MDC) was calculated as the distribution-based responsiveness. The MDC was considered at the individual level and is presented in the units used. Responsiveness is captured as the MDC, with a confidence level of 95% at the individual level (MDC95,ind), as follows3,37:
$$\mathrm{MDC}_{95,\mathrm{ind}} = 1.96 \times \sqrt{2} \times \mathrm{SEM}$$
where SEM is the standard error of measurement, representing the extent to which the measurement of a parameter can vary. We calculated the MDC using the average SEM values obtained from our previous test-retest and interrater reliability study.14 Changes smaller than MDC95,ind cannot (with a confidence level of 95%) be reliably interpreted as "real" changes in the score for an individual.3,38 The MDCproportion was calculated to determine the proportion of the study group that achieved at least the minimal amount of reliable change.3
| Results |
|---|
The percentages of participants showing deterioration according to their own and the clinician's perspectives over the follow-up period were 51.4% (n=56) and 25.7% (n=28), respectively. A small percentage of participants improved (17.4% according to the participants' perception and 7.3% according to the clinician's rating). Thus, the results for improvement were less clear, and the data are not shown. According to the participants' perceptions, 31.2% (n=34) remained stable compared with 67.0% (n=73) from the clinician's ratings. The agreement between the participants' perceptions and the clinician's ratings in classifying the participants as deteriorated, stable, or improved was 46% (κ=0.16). Pearson chi-square analysis showed no significant association between baseline EDSS scores and deterioration according to either the participants' perceptions or the clinician's rating.
Overall, 51.4% of the participants did not report any relapse during 2-year follow-up, 28.0% had 1 relapse, and 20.6% had 2 or more relapses. The association between occurrence of relapses and deterioration according to the EDSS rating was statistically significant (Pearson χ²=15.76, df=4, P<.01); however, this was not the case when occurrence of relapses was compared with the participants' own perception of change (Pearson χ²=4.70, df=4, P=.320). With the Kruskal-Wallis test, the changes in the parameters significantly related to the occurrence of relapses were Kela Coordination Test time, postural stability test with eyes open and eyes closed, 6MWT distance, change in HR during the 6MWT, and self-reported performance in self-care and in domestic life.
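The reported probability values can be checked against the chi-square statistics with a short calculation. The sketch below assumes 4 degrees of freedom, as would follow from a table of 3 relapse categories by 3 change categories, and uses the closed-form upper-tail probability that exists for even degrees of freedom; the function name is hypothetical.

```python
import math


def chi2_upper_tail_even_df(statistic, df):
    """Upper-tail probability of a chi-square statistic for an even, positive df.

    For df = 2k: P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires an even, positive df")
    if statistic < 0:
        raise ValueError("the statistic must be non-negative")
    half = statistic / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Statistics reported in the text, with df assumed to be 4.
print(f"{chi2_upper_tail_even_df(15.76, 4):.4f}")
print(f"{chi2_upper_tail_even_df(4.70, 4):.3f}")
```

Under that assumption, the first statistic corresponds to a probability below .01 and the second to a probability close to .32, in line with the values given in the text.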
Results for deterioration obtained using the anchor-based methods are shown in Tables 2 and 3. The AUC values ranged from 0.43 to 0.76 and had wide 95% CIs. For 14 out of 26 parameters (clinician's perspective) and 11 out of 26 parameters (participants' perception), the AUC significantly differed from 0.50. For 15 out of 26 parameters (clinician's perspective) and 10 out of 26 parameters (participants' perception), the MICdeterioration significantly differed from zero.
Table 3 shows the results for deterioration according to the participants' perceptions. The AUC and MICdeterioration values for performance in the activities and participation component of the FSQ were very similar to the clinician's ratings. From the participants' perceptions, there were 5 out of 9 significant AUC values and 4 out of 9 significant MICdeterioration values for the ICF activities component, whereas there were only 3 out of 14 significant AUC values and 3 out of 14 significant MICdeterioration values for the ICF body functions component. Of the measures related to the ICF activities component, the 6MWT distance and the 10-m walk test velocity at normal speed had the highest AUC values (0.76 and 0.69, respectively) and MICdeterioration values (–53.35 and –0.14, respectively). Of the measures related to the ICF body functions component, the 10-m walk test cadence and stride length had the highest AUC values (0.69 and 0.66, respectively) and MICdeterioration values (–6.94 and –6.49, respectively).
The results for MDC as a distribution-based method are shown in Table 4. The MDCind values can be used by clinicians to assist in determining whether an individual with MS has experienced a real change. For example, the responsiveness findings with regard to the BBS demonstrated an MDC95,ind value of 2.3 points, indicating that changes in the BBS scores at the individual level need to be at least 3 points, given a confidence level of 95%, before a real change rather than a chance fluctuation can be reliably concluded. For the BBS in our study group, 25.7% of the participants exceeded at least the minimal standard of change that is considered clinically important (MDCproportion).
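The way an MDC value and an MDCproportion are obtained can be sketched as follows, assuming the commonly used form MDC = z × √2 × SEM with z = 1.96 for a 95% confidence level. The SEM of 0.83 points and the change scores are illustrative values, not data reported in the study, and the function names are hypothetical.

```python
import math


def mdc_individual(sem, z=1.96):
    """Minimal detectable change for an individual: z * sqrt(2) * SEM.

    The sqrt(2) term accounts for measurement error at both time points.
    """
    if sem < 0:
        raise ValueError("SEM must be non-negative")
    return z * math.sqrt(2.0) * sem


def mdc_proportion(change_scores, mdc):
    """Proportion of participants whose absolute change is at least the MDC."""
    if not change_scores:
        raise ValueError("at least one change score is required")
    exceeding = sum(1 for change in change_scores if abs(change) >= mdc)
    return exceeding / len(change_scores)


# Illustrative SEM for a balance scale (assumed, not taken from the study).
mdc95 = mdc_individual(0.83)
mdc90 = mdc_individual(0.83, z=1.645)
print(f"MDC95 = {mdc95:.1f} points, MDC90 = {mdc90:.1f} points")

# Hypothetical 2-year change scores for 8 participants.
changes = [-4, -3, -1, 0, 1, 2, 3, -2]
print(f"MDC proportion = {mdc_proportion(changes, mdc95):.3f}")
```

On a scale scored in whole points, an MDC of about 2.3 means that an observed change must be 3 points or more before it can be distinguished from measurement error.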
| Discussion |
|---|
Among several distribution-based methods, we decided to analyze the MDC,3 which relies on variation in the SEM. Other distribution-based methods evaluate change in relation to sample variation, such as baseline variation of the sample (effect size) and variation in change scores (standardized response mean),3,6 thus being limited indicators of responsiveness, at least to clinical epidemiological data of this kind.1 It is assumed that a real change takes place when the difference in the scores of an individual at 2 separate points exceeds the MDC. For example, the MDC95,ind of 92 m for the 6MWT distance means that a change of less than 92 m cannot (with a confidence level of 95%) reliably be interpreted as "real" change for the individual compared with chance fluctuations. However, if the individual's 6MWT distance change is over ±92 m (ie, the change noted is likely not due to measurement error), the question follows whether the change is clinically meaningful. Anchor-based methods had to be used in order to address this issue.
Anchor-based approaches require an external, independent standard to "anchor" the meaning of clinical importance. One of the essential problems with most measures used in physical therapist practice is the lack of a clear external criterion to help with the interpretation of the scores.3 For instance, what does it mean to the person with MS if he or she is able to lift an additional 5 kg in knee extension? Because a single gold standard for change is lacking, we decided not to rely on one method alone but used 2 kinds of external criteria, with both of them having some advantages and limitations.
One of the limitations of an anchor-based method that relies on self-ratings is susceptibility to recall bias.3 Such a method requires participants to be mentally able to compare a previous situation with the present situation.39 We used a question drawn from the RAND-36, which is widely accepted in MS studies. Although cognitive dysfunction is among the main symptoms of MS, the report by Gold et al40 provides evidence that cognitive impairment in MS does not affect the reliability and validity of self-report health measures. We used an interval of 1 year between the successive sets of testing in order to minimize this bias and combined ratings in the 2-year follow-up.
The Kurtzke EDSS is the most frequently used scale for rating disability in people with MS. Several limitations of the EDSS have been reported, including low reproducibility in the lower ranges of the scale, absence of both high cortical function and arm function measurements, and poor sensitivity.35 These limitations might make the EDSS relatively unsuitable as an external criterion for change. However, despite this criticism, it is a scale that is very well known among clinicians. Therefore, we used the EDSS to determine important change from a clinician's point of view. In our study, the same neurologist assessed the EDSS scores at every testing session in order to ensure the scale's reliability.
In our study, we were looking for measures that would be comparable for both external criteria. Finding such a measure would increase our confidence in the measure, because it would imply that the results obtained would have the same meaning for both the person with MS and the clinician. We found that several measures had significant AUC and MIC values for deterioration from both the participants' and the clinician's perspectives. These measures were self-reported performance in ICF activities and participation (FSQ self-care, mobility, and domestic life scores); in ICF activities such as fine hand use (Box and Block Test [BBT] of the dominant hand) and walking (10-m walk test velocities at normal and maximal speeds and 6MWT distance); and in ICF body functions such as exercise tolerance functions (6MWT HR change), gait pattern functions (10-m walk test stride length and cadence), and muscle endurance functions of the lower extremities (repetitive squatting).
Given that mobility is the paramount aim of physical therapy, it is important to know that the measures of walking are among the most responsive. One possible reason might be that the EDSS heavily emphasizes mobility.35 However, the participants' perceptions showed a similar level of responsiveness. In addition, there is evidence that gait and balance may begin to deteriorate in the early stages of MS, even when the neurological signs are mild.41 Our study group consisted of people with mild disability secondary to MS, and this might have had an effect on the results.
The agreement between the participants' perceptions and the clinician's ratings in classifying the participants as deteriorated, stable, or improved was poor. As Crosby et al6 have stated, in defining clinically meaningful change, these perspectives may not always be in agreement. One reason for the mismatch might be that the EDSS mainly explores impairments in the physiological functions of body systems (ie, body functions according to the ICF), whereas the RAND-36 item that we used measures participants' perceptions of changes in health, which have a very broad meaning. This was demonstrated in our results, where twice as many significant parameters were found in body functions according to the EDSS criterion compared with the participants' perceptions. A meaningful change for the clinician may indicate a change in the prognosis of the disease. In ICF terms,10 the RAND-36 item that we used assesses the change in well-being, including physical, mental, and social aspects. In future studies, it might be much more appropriate to ask specifically about changes in physical functioning.
Multiple sclerosis tends to progress over time; thus, the ability to assess longitudinal changes in physical functioning is critical in outcome measures used in the study of treatment efficacy and in clinical decision making. We decided to study both deterioration and improvement because the course of the disease is unpredictable and individual42 and previous studies4,43 have shown that the scores for deterioration and improvement are not necessarily equal. The separation of those participants who deteriorated and those who improved will increase the homogeneity of our data and will tend to inflate responsiveness compared with other traditional methods. Because only a small percentage of participants showed improvement, those results are not presented. The data of those participants who improved (19 participants according to their own perceptions and 8 participants according to the EDSS criterion) need further analysis.
Reporting the proportion of participants (MDCproportion) achieving a degree of change that is beyond measurement error is a more informative method for describing natural changes or the effects of interventions than overall mean change. Our results showed that, in 61% of the parameters, less than 10% of the participants' scores exceeded the MDC95,ind, indicating a stable situation in those activities or body functions. The largest fluctuations (MDCproportion over 20%) were seen for muscle power functions (grip strength and force of the knee extensors), upper-extremity muscle endurance functions (right-hand repetitive dumbbell presses), fine hand use (Box and Block Test), and the BBS for changing and maintaining body position. However, according to the external criteria, only 31% of the participants remained stable according to the participants' own perceptions compared with 67% of the participants who remained stable as rated by the clinician. The MDC seems to be more conservative in detecting change. In our study, the MDC reflected an acceptable 95% confidence level, and the multiplier of √2 was used to account for the additional uncertainty introduced by using difference scores from measurements at 2 points in time.3,37 Thus, the precision of the MDC was remarkably high, indicating that only definite changes in scores are noticed. The resulting MDC values will be lower and, therefore, less change will be required for real change if the MDC is computed using a 90% confidence level.
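The effect of the confidence level on the MDC and on the MDCproportion can be illustrated with a minimal sketch. It assumes the conventional formula MDC = z × √2 × SEM (z = 1.96 for 95% and 1.645 for 90%); the SEM value, the change scores, and the function names are hypothetical and are not taken from this study.

```python
import math

def mdc(sem: float, z: float = 1.96) -> float:
    """Minimal detectable change for an individual: z * sqrt(2) * SEM."""
    return z * math.sqrt(2) * sem

def mdc_proportion(changes, threshold):
    """Share of participants whose absolute change exceeds the threshold."""
    exceeded = sum(1 for c in changes if abs(c) > threshold)
    return exceeded / len(changes)

# Hypothetical values for a walking-speed measure (m/s), for illustration only.
sem = 0.05
changes = [-0.20, -0.05, 0.02, -0.15, 0.10, -0.01, 0.00, -0.12]

mdc95 = mdc(sem, z=1.96)    # about 0.139 m/s
mdc90 = mdc(sem, z=1.645)   # about 0.116 m/s, a lower threshold

print(f"MDC95 = {mdc95:.3f}, proportion = {mdc_proportion(changes, mdc95):.3f}")
print(f"MDC90 = {mdc90:.3f}, proportion = {mdc_proportion(changes, mdc90):.3f}")
```

With these made-up numbers, lowering the confidence level from 95% to 90% moves one additional participant above the threshold, which is the sense in which the 95% level is the more conservative choice.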
De Groot et al4 assessed some outcome measures for early MS, but most of the measures in their study were questionnaires or semistructured interviews. They used the 10-m gait test; however, comparison with our results is difficult because they did not mention whether it was conducted at the normal speed or at maximum speed and they reported time (in seconds) instead of velocity (in meters per second). Depending on the external criterion they used, they found in their sample that, to exceed measurement error, a change of 2.6 to 3.0 seconds for the 10-m timed walk test was required. Their AUC values for the clinician's and participants' perspectives (0.69 [95% CI=0.59–0.78] and 0.65 [95% CI=0.56–0.74], respectively)4 were comparable to our results for the 10-m walk test. They found, contrary to our results, that the MIC for the 10-m timed walk test did not significantly differ from zero. Disease duration was much longer in our study sample than in the subjects in the study by de Groot et al4 (average time since symptoms=12.3 years and 2.15 years, respectively), although both study samples were similarly mildly disabled (median EDSS scores of 2.0 and 2.5, respectively).
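One reason a threshold reported in seconds cannot be translated directly into meters per second is that the same change in time corresponds to different changes in speed depending on the baseline walking time. The following sketch illustrates this with hypothetical baseline times; only the 10-m distance and the 2.6-second change come from the text above.

```python
def walk_speed(distance_m: float, time_s: float) -> float:
    """Average walking speed in meters per second."""
    return distance_m / time_s

def speed_change(baseline_time_s: float, delta_time_s: float,
                 distance_m: float = 10.0) -> float:
    """Change in speed when the walk takes delta_time_s longer than baseline."""
    before = walk_speed(distance_m, baseline_time_s)
    after = walk_speed(distance_m, baseline_time_s + delta_time_s)
    return after - before

# Hypothetical baseline times (seconds) for a faster and a slower walker.
fast = speed_change(6.0, 2.6)
slow = speed_change(12.0, 2.6)

print(f"Baseline 6 s:  change of {fast:.3f} m/s")
print(f"Baseline 12 s: change of {slow:.3f} m/s")
```

The slowing of 2.6 seconds amounts to roughly 0.50 m/s for the faster walker but only about 0.15 m/s for the slower one, so a time-based threshold has no single velocity equivalent.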
Several authors4,44,45 have reported the sensitivity of the Multiple Sclerosis Functional Composite and its components. The responsiveness associated with other physical functioning measures has not been reported previously in the literature describing individuals with MS who are ambulatory, but some studies of other neurological diseases have been presented. In a study of patients with Parkinson disease,46 MDCs of 0.19 m/s for the 10-m walk test at a comfortable walking speed, of 13 steps/min for cadence, and of 2.84 points for the BBS were reported. Minimal detectable changes of 6 points for the BBS47 and 0.16 m/s for the 10-m walk test48 have been found in patients with stroke. These responsiveness values were similar to ours.
The selection of a measure in clinical practice is not guided only by its responsiveness. It is also important to select outcomes that genuinely represent the phenomena of interest. Therefore, we used the ICF as a theoretical framework. Our categories of activities and body functions contain almost the same ICF items as those influenced by physical therapy in a neurological community health care situation in the study by Finger et al.49 It is only in people who are more severely disabled secondary to MS that there is an obvious need to assess respiratory functions.49 It was difficult to make clear distinctions between the activities and participation components, considering the self-care, mobility, and domestic life domains. We decided to use the same domains for both activities and participation, with total overlap of domains using performance qualifiers.10 Otherwise, the ICF seems to be a useful tool to examine and compare the contents of the physical functioning measures used in this study.
An important strength of this study is the simultaneous assessment of several physical functioning measures. This enables a direct comparison of the measures and facilitates interpretation of the results. Head-to-head comparisons of the responsiveness of the measures for people with MS will help to determine the relative advantages of different measures, thereby enabling evidence-based choice of measures for future studies. By using separate measures for different ICF domains, it is possible to find simple outcomes that have the advantage of obvious meaning both for the person with MS and the clinician.
The present study has some limitations. The disease severity of our sample varied from an EDSS score of 0 to an EDSS score of 6.5, but overall the participants were mildly disabled. Further work is needed to determine whether responsiveness is dependent on the primary level of disability. In addition, future responsiveness studies should focus on subjects who are more severely disabled (EDSS score of ≥7.0). We used 3 repeated measures, but only the results from the first and third test sessions were used in the regression analysis. In future studies, the use of longitudinal data analysis techniques might be appropriate. Estimates of the SEM for the FSQ are missing, and thus we were not able to calculate the MDC values for the FSQ self-care, mobility, and domestic life scores. Because of the skewness of the data, the MIC scores for the FSQ, the BBS, and the MAS should be interpreted with caution. However, the significance of the MICdeterioration values was consistent with the significance of the AUC values, which were computed by using a nonparametric method. Further studies are needed concerning the relationships between the FSQ and the ICF and between the activities and body functions components.
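A nonparametric AUC makes no assumption about the distribution of the change scores, which is why it remains interpretable for skewed data. As a minimal sketch, and assuming the common Mann-Whitney formulation (the exact procedure used in this study is not restated here), the AUC is the probability that a randomly chosen participant who deteriorated shows a larger decline than a randomly chosen stable participant, with ties counted as one half. The data and the function name are hypothetical.

```python
def auc_mann_whitney(scores_changed, scores_stable):
    """Nonparametric AUC: share of (changed, stable) pairs in which the
    changed participant has the larger score; ties count as half a win."""
    wins = 0.0
    for c in scores_changed:
        for s in scores_stable:
            if c > s:
                wins += 1.0
            elif c == s:
                wins += 0.5
    return wins / (len(scores_changed) * len(scores_stable))

# Hypothetical magnitudes of decline (larger = more decline), illustration only.
deteriorated = [0.20, 0.15, 0.05, 0.12]
stable = [0.02, 0.05, 0.00, 0.10]

auc = auc_mann_whitney(deteriorated, stable)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 would indicate that the measure cannot separate the two groups better than chance, whereas values approaching 1.0 indicate good discrimination.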
| Conclusions |
|---|
| Footnotes |
|---|
Approval of the study was obtained from the Ethics Committee of the Central Finland Health Care District.
This study was supported, in part, by the Central Finland Health Care District, the Emil Aaltonen Foundation, the Finnish MS Society, the Finnish Cultural Foundation, and the Social Insurance Institution of Finland.
* Metitur Ltd, Heinämäentie 7, FIN-40250, Jyväskylä, Finland.
† Newtest Ltd, Kiviharjuntie 11, FIN-99220 Oulu, Finland.
‡ Polar Electro Oy, Professorintie 5, FIN-90440 Kempele, Finland.
§ Sammons Preston, PO Box 5071, Bolingbrook, IL 60440-5071.
|| David Industries Ltd, Tutkijankatu 2, FIN-83500 Outokumpu, Finland.
# JTech Medical Industries, 470 Lawndale Dr, Suite G, Salt Lake City, UT 84115.
** SPSS Inc, 233 S Wacker Dr, Chicago, IL 60606.
| References |
|---|