Research Reports
LI Strand, PT, is Assistant Professor, Division of Physiotherapy Science, Faculty of Medicine, University of Bergen, Ulriksdal 8c, 5009 Bergen, Norway (liv.strand{at}isf.uib.no). Address all correspondence to Ms Strand.
SL Wie, PT, is Specialist in Occupational Health, Department of Occupational Health, Municipality of Bergen, Bergen, Norway.
Submitted February 6, 1998;
Accepted October 8, 1998
| Abstract |
|---|
Key Words: Activity limitation, Functional evaluation, Pain, Reliability, Validity
| Introduction |
|---|
The World Health Organization's International Classification of Impairments, Disabilities, and Handicaps16 (ICIDH), which classifies consequences of disease, was published in 1980. The ICIDH has inspired physical therapists to select and develop functional tests for measuring disabilities,17–19 in addition to using traditional measures of impairments. The ICIDH is currently under revision because of dissatisfaction with the classification system as being primarily based on a medical model. The ICIDH-2: International Classification of Impairments, Activities, and Participation20 (ICIDH-2), presented in 1997, is thought by some health care professionals to better integrate biopsychosocial aspects of human functioning and disablement, and it represents a paradigm shift from a medical model to an integrated model. Concerns about health care outcomes have led some researchers to direct their efforts toward the assessment of functioning at the level of the whole human being in daily life. In the ICIDH-2, this need is realized by assessment tools for the individual, both for activity levels and for something now called "participation." The concept of "disabilities" has been replaced by the concept of "activities." The ICIDH-2 states that "activities may be limited in nature, duration, and quality." The words used to describe the possible limitations in activity, along with the basic perspectives of the ICIDH-2, suggest that activities should be evaluated according to how they are performed in a natural context.
Clinical tests should be developed or selected to capture the broad aspects of functioning by the individual. Self-evaluation questionnaires are increasingly being used to assess disabilities in rehabilitation studies.15 How patients perceive their functioning may be a very important aspect to measure in rehabilitation in order to establish and address the patient's health status. Self-reports, however, indicate what an individual believes he or she can do or what problems he or she believes are encountered in performing various activities; they do not necessarily reflect the individual's physical capacity or ability to perform activities. Several studies21–24 have documented discrepancies between self-reports and clinician-derived assessments of physical function.
We believe that tools to assess functional restrictions should be part of the physical therapist's evaluation in order to differentiate psychosocial and physical causes of disablement. In our opinion, however, clinician-derived assessments should be directed toward aspects important to patients. Our study was an effort to develop a simple assessment tool, the Sock Test, for evaluating performance of a simple activity of daily living that is probably important to most patients with musculoskeletal pain. These patients appear to have difficulty putting on stockings, socks, and shoes, which is a function demanding good dynamic mobility of the body. The Sock Test simulates the activity of putting on a sock in a standardized way. The examination of reliability and validity is a necessary step in the development of a new clinical test.25 In this report, the standardized Sock Test is presented. We examine intertester reliability and the test's ability to reflect perceived activity limitation among patients, to indicate restriction of the musculoskeletal system, and to predict perceived activity limitation after 1 year.
| Method |
|---|
The Sock Test
The Sock Test simulates the activity of putting on a sock. The test is standardized and does not allow alternative ways of moving. The therapist evaluates the patient's performance, observing how far the patient reaches and how easily the activity is done.
The patient should wear loose clothing. The activity is first demonstrated to the patient. The patient is then instructed to sit on a high bench, with their feet not touching the floor. The patient lifts up one leg at a time in the sagittal plane and simultaneously reaches down toward the lifted foot with both hands, one on each side, grabbing the toes with the fingertips of both hands. The foot must not touch the bench and should be in the air at all times during the test. After testing each leg once, the patient is given a score on the most limited performance. Scores are given as ordinal values from 0 (can grab the toes with fingertips and perform the action with ease) to 3 (can hardly, if at all, reach as far as the malleoli) (Fig. 1). Several compensation maneuvers may be demonstrated (Fig. 2). Compensations are not scored. If they occur, the test is explained or demonstrated to the patient again before the test is repeated.
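The scoring rule described above (each leg tested once, with the overall score taken from the most limited performance) can be sketched in a few lines. This is only an illustration: the function name `sock_test_score` and its arguments are hypothetical, and the sketch assumes that a higher ordinal value means a more limited performance, as the 0 to 3 scale suggests.

```python
def sock_test_score(left_leg: int, right_leg: int) -> int:
    """Return the overall Sock Test score for one patient.

    Each leg is scored on the ordinal scale 0 (toes grabbed with ease)
    to 3 (can hardly, if at all, reach the malleoli). The overall score
    is the most limited, that is the highest, of the two leg scores.
    """
    for score in (left_leg, right_leg):
        if score not in (0, 1, 2, 3):
            raise ValueError("Sock Test scores are ordinal values from 0 to 3")
    return max(left_leg, right_leg)


# Hypothetical patient: left leg scored 0, right leg scored 2.
overall = sock_test_score(0, 2)
```

Compensation maneuvers are not scored, so in this sketch they would be handled before scoring, by repeating the test after renewed instruction.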
=3 months, SD=2, minimum=1, maximum=20). Sixty-three percent of the patients had been sick-listed for up to 4 months, 25% had been sick-listed for 4 to 6 months, and 12% had been sick-listed for longer than 6 months. The patients were diagnosed by their physician according to the International Classification of Primary Care (ICPC).26 About half of the group (52%) had back pain, 29% of the patients had neck or shoulder pain, 12% of the patients had generalized musculoskeletal pain, and 7% of the patients had other forms of musculoskeletal pain.
The number of patients in the different parts of the study varied according to available data for the Sock Test and the other methods of measurement, and the subsamples were samples of convenience. In the clinical controlled trial, some methods of measurement were used for different lengths of time and were randomly distributed. This fact indicates that subsamples evaluated with different methods of measurement were random subsamples and that missing data were missing at random.27 Fewer patients participated in the posttest examination (n=257) than in the pretest examination (n=337). No difference in pretest Sock Test scores (P=.64), however, was found between patients who were examined at posttest and patients who were missing at posttest (n=80) (χ² =1.7, df =3). Missing data were not expected to distort the relationship between the paired measurements, and missing data were ignored in the analysis. Proportions of men and women, as well as age (mean, median, minimum, and maximum), in the major subgroups of patients, were almost identical to the description of the total sample and will not be further described.
Intertester reliability.
Patients with musculoskeletal pain (n=21) who enrolled in the clinical testing during a time period of 14 days were included to examine intertester reliability. Fifteen women and 6 men, with a mean age of 44 years (minimum=26, maximum=62), were examined. Nine patients had back pain; 8 patients had neck, shoulder, or arm pain; and 4 patients had generalized musculoskeletal pain.
Patient-perceived activity limitation and pain.
Patients (n=237) who at baseline were examined by use of the Sock Test and on the same occasion answered questions from the physical therapists about perceived problems in connection with the activity of putting on socks and shoes were included to examine the relationship between clinician-derived Sock Test scores and data from the patients.
Patients who were examined using the Sock Test and who concurrently completed the Disability Rating Index (DRI)23 at the pretest examination were included to investigate the relationship between Sock Test scores and perceived functional problems in various activities of daily living measured by the questionnaire. The DRI contains 12 questions about problems related to activities of living, each scored on a 10-cm visual analog scale. The DRI score is the mean score of all items. The questionnaire has proved to be a robust, practical research instrument, with good intrarater and interrater reproducibility and responsiveness.23 Differing numbers of patients (ie, 298–312) filled in the separate items of the questionnaire; 282 patients completed the DRI. Missing data were limited and varied between 0 and 4.5% for each item of the DRI. The missing data, therefore, were ignored in the analysis.
Patients (n=313) who were examined using the Sock Test and who concurrently completed the Norwegian Pain Questionnaire (NPQ)28 at the pretest examination were included to investigate the relationship between Sock Test scores and aspects of the pain experience. The NPQ is a Norwegian approximation of the McGill Pain Questionnaire.29 It contains a total of 106 Norwegian descriptors of pain in 18 groups: 12 sensory, 5 affective, and 1 evaluative. Each word is given an intensity value according to pain ratings on a visual analog scale. High internal consistency for all groups has been demonstrated, and the evidence indicates that the questionnaire contains the pain descriptors most commonly used by Norwegians.28 Quantitative measurements of sensory, affective, and evaluative dimensions of pain as well as a total score are obtained by summarizing intensity values of words chosen by patients to describe their pain.
Patients who had Sock Test scores above 0 at the pretest examination and were examined using the test 1 year later and who completed the DRI and the NPQ at the pretest and posttest examinations were included to investigate whether the Sock Test is responsive to change in perceived function and pain. Change was measured as pretest scores minus posttest scores. The number of patients included in the analysis varied between 149 and 153, representing paired data between the Sock Test scores and the different items of the DRI. One hundred thirty-one patients completed the DRI. Missing data were minor, with a maximum of 2.6% for each item, and were ignored in the analysis. One hundred sixty-two patients representing paired data between the Sock Test and the NPQ were included.
Restriction of the musculoskeletal system due to demographic factors.
Patients (n=326) examined using the Sock Test at the pretest examination who had concurrent data of age, body mass index (BMI) (in kilograms per square meter), and sex were included to investigate whether Sock Test scores reflected differences in restrictions of the musculoskeletal system according to differences in age, BMI, and sex.
Prediction of perceived activity limitation.
Patients (n=257) who were examined using the Sock Test at the pretest examination and who answered the question about perceived problems in putting on socks and shoes at the posttest examination were included to investigate whether pretest Sock Test scores could predict perceived difficulties at the 1-year posttest examination.
Data Collection and Analysis
Intertester reliability.
Intertester reliability between 2 physical therapists was examined. The therapists had not worked together in the clinic before the research project. Therapist 1 had worked as a physical therapist for 25 years, with 7 years in clinical practice working with patients with musculoskeletal pain and 18 years as a teacher at a college of physical therapy. Therapist 2 had worked as a physical therapist for 10 years, with 3 years in clinical practice working with patients with musculoskeletal pain and in heart rehabilitation and 7 years in occupational health service. The therapists evaluated a few patients together before the study started. One therapist explained and demonstrated the Sock Test to each patient, and both therapists scored the patient's performance of the test independently on the same occasion. In this way, reliability was not dependent on the therapist's ability to give the instruction, which may increase the estimate of reliability by eliminating a source of error that would be present in practice. Measurement agreement was assessed by weighted kappa, with weights assigned as follows: 1.0, .6667, .3333, and 0.
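The agreement weights listed above (1.0, .6667, .3333, and 0) correspond to linear weights on a 4-category scale, that is, 1 minus the absolute difference in scores divided by 3. The following is a minimal sketch of a linearly weighted kappa under that assumption. The function name `weighted_kappa` is hypothetical, the example ratings are invented for illustration, and the sketch is not the authors' analysis.

```python
def weighted_kappa(rater_a, rater_b, n_categories=4):
    """Linearly weighted kappa for two raters on an ordinal 0..k-1 scale."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same, nonempty set of patients")
    n = len(rater_a)
    k = n_categories
    # Agreement weights: 1.0 on the diagonal, falling linearly to 0
    # for the largest possible disagreement (1.0, .6667, .3333, 0 when k = 4).
    weight = [[1 - abs(i - j) / (k - 1) for j in range(k)] for i in range(k)]
    # Cross-tabulate the paired scores.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    row = [sum(observed[i]) / n for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) / n for j in range(k)]
    # Weighted observed and chance-expected agreement.
    p_obs = sum(weight[i][j] * observed[i][j] / n for i in range(k) for j in range(k))
    p_exp = sum(weight[i][j] * row[i] * col[j] for i in range(k) for j in range(k))
    if p_exp == 1:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical ratings of 6 patients by two therapists (not study data).
therapist_1 = [0, 1, 1, 2, 3, 0]
therapist_2 = [0, 1, 2, 2, 3, 1]
kappa = weighted_kappa(therapist_1, therapist_2)
```

With these weights, a disagreement of one score level still earns two thirds of the credit for full agreement, which is why weighted kappa is less punishing than unweighted kappa for near misses on an ordinal scale.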
Patient-perceived activity limitation and pain.
The patients were asked to answer the following questions on a yes or no basis: Did you have difficulty putting on socks and shoes? Did you change your way of performing the dressing activity because of musculoskeletal problems? Was the dressing activity painful? The percentages of patients who answered "yes" to the questions in relation to each of the Sock Test scores were calculated. The null hypothesis of no relationship between Sock Test scores and the patient data was examined by use of chi-square tests.
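As an illustration of this kind of test, the sketch below computes the Pearson chi-square statistic and its degrees of freedom for a table of "yes"/"no" answers by Sock Test score. A 2 × 4 table gives 3 degrees of freedom. The function name `chi_square_statistic` and the counts are hypothetical, and the sketch assumes that every row and column total is greater than zero.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    `table` is a list of rows of observed counts. All row and column
    totals are assumed to be greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the null hypothesis of no relationship.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


# Hypothetical counts (not study data): rows are "yes"/"no" answers,
# columns are Sock Test scores 0, 1, 2, and 3.
answers_by_score = [
    [10, 25, 20, 8],   # answered "yes"
    [90, 30, 5, 1],    # answered "no"
]
statistic, df = chi_square_statistic(answers_by_score)
```

The P value would then be read from the chi-square distribution with `df` degrees of freedom, for example from a statistical table or a statistics library.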
In order to examine the "sensitivity" and "specificity" of Sock Test scores to reflect perceived activity limitation among patients, the patient data were condensed in the following way. An answer of "yes" to at least one question in the preceding paragraph was considered to reflect perceived activity limitation and was coded as 1, and answers of "no" to all questions were considered to reflect no perceived activity limitation and were coded as 0. "Sensitivity" and "specificity" were examined according to each possible cutoff value of Sock Test scores.
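The condensing step and the cutoff analysis can be sketched as follows. The sketch uses the conventional definitions of sensitivity and specificity, with a Sock Test score at or above the cutoff counted as a positive test. Because the terms are placed in quotation marks in this report, the authors' computation may differ from this convention. The function names and the example data are hypothetical.

```python
def code_limitation(answers):
    """Condense yes/no answers: 1 if any answer is "yes" (True), else 0."""
    return int(any(answers))


def sensitivity_specificity(sock_scores, perceived_limitation, cutoff):
    """Conventional sensitivity and specificity for one cutoff value.

    A Sock Test score >= cutoff counts as a positive test. The reference
    is patient-perceived activity limitation (1 = yes, 0 = no). Both
    groups are assumed to contain at least one patient.
    """
    tp = fn = tn = fp = 0
    for score, limited in zip(sock_scores, perceived_limitation):
        positive = score >= cutoff
        if limited == 1:
            if positive:
                tp += 1
            else:
                fn += 1
        else:
            if positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical patients (not study data): three yes/no answers each.
answers = [
    (False, False, False),
    (False, False, False),
    (True, False, False),
    (True, True, True),
    (False, True, True),
    (False, False, False),
]
limitation = [code_limitation(a) for a in answers]
scores = [0, 0, 1, 2, 3, 1]
results = {c: sensitivity_specificity(scores, limitation, c) for c in (1, 2, 3)}
```

Under this conventional definition, raising the cutoff can only lower sensitivity or leave it unchanged, and can only raise specificity or leave it unchanged.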
Sock Test scores were correlated with questionnaire-derived scores at the pretest examination. The association between the Sock Test and concurrent items of the DRI and the NPQ at the pretest examination was examined by Spearman correlations. Seventeen correlations between scores were calculated. A significance level of .01 was chosen to account for the multiple testing.
The associations of changes from the pretest examination to the posttest examination between the Sock Test and the DRI and between the Sock Test and the NPQ were examined by Spearman correlations. Seventeen correlations were calculated between changes in scores (ie, pretest scores minus posttest scores). A significance level of .01 was chosen to account for the multiple testing.
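A Spearman correlation is a Pearson correlation computed on ranks, with tied values given the average of the ranks they span. Ties are frequent on a 4-level ordinal score. The sketch below shows this computation together with the change scores (pretest minus posttest). The function names and the example values are hypothetical, and the sketch assumes that neither variable is constant.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def change_scores(pretest, posttest):
    """Change measured as pretest score minus posttest score."""
    return [pre - post for pre, post in zip(pretest, posttest)]


# Hypothetical paired data (not study data).
sock_change = change_scores([2, 1, 3, 1, 2], [1, 1, 1, 0, 2])
dri_change = change_scores([6.0, 4.5, 8.0, 3.0, 5.5], [4.0, 4.0, 3.5, 2.5, 5.0])
rho = spearman_rho(sock_change, dri_change)
```

Each of the 17 correlations would be judged against the .01 significance level chosen in the study. The P value itself is not computed in this sketch.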
Restriction of the musculoskeletal system due to demographic factors.
Logistic regression analysis was used to examine the likelihood of scoring above 0 by the Sock Test by univariate as well as multivariate analysis of age, BMI, and sex.
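For a single two-level predictor, the odds ratio estimated by univariate logistic regression equals the cross-product ratio of the corresponding 2 × 2 table, so the univariate comparison of one group with a reference group can be sketched without fitting a model. The sketch below also gives an approximate 95% confidence interval based on the standard error of the log odds ratio. The function name and the counts are hypothetical, and all 4 cells are assumed to be greater than zero. The multivariate analysis would require a fitted regression model and is not shown.

```python
import math


def odds_ratio(group_pos, group_neg, reference_pos, reference_neg):
    """Odds ratio and approximate 95% CI for scoring above 0 on the Sock Test.

    group_pos / group_neg: patients in the compared group with a score
    above 0 / equal to 0; reference_pos / reference_neg: the same counts
    in the reference group. All counts must be greater than zero.
    """
    ratio = (group_pos * reference_neg) / (group_neg * reference_pos)
    # Standard error of the log odds ratio.
    se = math.sqrt(1 / group_pos + 1 / group_neg + 1 / reference_pos + 1 / reference_neg)
    lower = math.exp(math.log(ratio) - 1.96 * se)
    upper = math.exp(math.log(ratio) + 1.96 * se)
    return ratio, (lower, upper)


# Hypothetical counts (not study data): highest versus lowest BMI quarter.
ratio, (lower, upper) = odds_ratio(30, 10, 10, 30)
```

An interval that excludes 1 corresponds to a statistically significant association at roughly the .05 level in this simple two-group setting.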
Prediction of perceived activity limitation.
Logistic regression analysis was used to examine whether the Sock Test scores obtained during the pretest examination can be predictive of perceived difficulties in the dressing activity at the 1-year posttest examination.
| Results |
|---|
χ² =44.66, df =3), changed performance (χ² =60.73, df =3), and pain (χ² =68.01, df =3) were all significant (P <.001). Sock Test scores with a cutoff value of 1 demonstrated a "sensitivity" value of 0.77 and a "specificity" value of 0.91. Sock Test scores with cutoff values of 2 and 3 demonstrated "sensitivity" values of 0.99 and 1.00, respectively, and "specificity" values of 0.31 and 0.25, respectively (Tab. 2). The pretest scores for the Sock Test showed low or moderate correlation values with those of the DRI. Correlations were highest for the overall DRI (r =.45) and for activities such as running (r =.41), climbing stairs (r =.34), and participating in sports (r =.31) (Tab. 3). Only carrying a bag and lifting heavy objects were not significantly related (Tab. 3). Low correlation values (r =.17 and .18) were found for the sensory, affective, and total scales, but not for the evaluative scale, of the NPQ (Tab. 3).
Restriction of the Musculoskeletal System Due to Demographic Factors
An increased likelihood (P <.05) of scoring 1 or higher on the Sock Test with increases in age and BMI was demonstrated when separate variables were examined (Tab. 4). The group of patients between 51 and 65 years of age were almost 3 times more likely to score above 0 on the Sock Test than the group of patients between 21 and 35 years of age. Patients with BMI values greater than 27.1, representing the upper quarter of BMI measured, were almost 10 times more likely to score above 0 on the Sock Test than patients in the lowest quarter. When all variables were included, only BMI demonstrated an increased likelihood of Sock Test scores above 0.
| Discussion |
|---|
Intertester Reliability
The study of intertester reliability shows that Sock Test scores are somewhat reliable. The kappa statistic is considered the best approach for judging intertester agreement between categorical assessments.31 It has a maximum value of 1.00 when agreement is perfect. A value of 0 indicates agreement equal to chance. The relative strength of agreement of kappa values of .61 to .80 is "substantial" according to Landis and Koch.32 There is no value of kappa, however, that can be regarded universally as indicating good agreement. Clinical judgment must decide whether agreement is sufficiently high.31 We used weighted kappa in our study because it takes into account the degree of disagreement. In a clinical situation, we believe some discrepancies between therapists in assessing function are expected. Differentiating performance when it is close to the intersection between 2 scores is difficult. In our opinion, our high weighted kappa (.79), along with the finding that the therapists agreed completely in 16 out of 21 cases and differed by only one level in the 5 cases in which they did not agree, provides evidence for clinical acceptance. The fact that the physical therapists had rather dissimilar backgrounds, were not clinical specialists, and had tested few patients together before the reliability study started suggests that intertester reliability of test scores may be improved. The major skill needed to conduct the test, however, appears to be the ability to abide by operational definitions, as there are no physical skills involved.
Patient-Perceived Activity Limitation and Pain
The relationships that were demonstrated between Sock Test scores and percentages of patients who reported limitation of the dressing activity ("yes" answers) showed that increasing Sock Test scores reflected higher percentages in all except one case. The difference between Sock Test scores of 0 and 1, as indicated in the Sock Test procedure (Fig. 1), may seem minor, but we believe it reflects a substantial difference in the percentage of patients reporting activity limitation. A closer relationship between Sock Test scores and perceived problems might well have been established if the patient data had been graded rather than expressed as "yes" or "no" answers. The fact that the patients were asked the questions related to the dressing activity on the same occasion that the Sock Test was performed may have caused them to report in a more realistic way than if they simply were to answer a questionnaire in another context. The relationship examined by chi-square tests between Sock Test scores and activity limitations reported by the patients ("yes" or "no" answers) demonstrated that there was a relationship between the clinician-derived and patient-derived measures of function.
"Sensitivity" and "specificity" in our study were not related to a "gold standard" of disease, only to activity limitation reported by the patients (yes/no). The cutoff value of 1 for Sock Test scores yielded the highest sum score of "sensitivity" and "specificity" and demonstrated the highest value of "specificity" (Tab. 2). The cutoff value of 2, however, seems to be preferable, with a high "sensitivity" value (0.99), showing that almost all patients who had Sock Test scores of 2 or 3 reported activity limitation. The results suggest that the Sock Test has some validity as a patient-oriented test, reflecting patient-perceived limitation of the dressing activity. Care should be taken, however, in using test scores as definite objective measures of patient-perceived problems.
Scores for the Sock Test were associated with most items of the DRI, but the correlation values varied between low and moderate (Tab. 3). The highest correlation value was obtained for the overall DRI. The results are comparable to reported correlation values (r =.20–.60) between related methods of measuring health (McDowell and Newell, referred to in Salén et al23). The results suggest that Sock Test scores, to a moderate degree, reflect concurrent perceived problems in heterogeneous activities of daily life in patients with musculoskeletal pain.
Low correlation values were found between NPQ and Sock Test scores. The patients had been asked to fill in the NPQ according to the overall pain experienced during the last 2 days. The pain description, therefore, was not specifically related to the activity of putting on socks and shoes or other activities. This fact may explain why correlation values between the Sock Test and the NPQ were lower than correlation values between the Sock Test and most items of the DRI.
The results related to responsiveness indicate that changes over time as measured by the Sock Test scores are somewhat related to changes over time in problems of various activities of daily living and pain, as measured by the DRI and the NPQ, respectively. The Sock Test, therefore, is somewhat responsive to clinically important change over time. The low correlation values, however, suggest that activity limitation should be measured by clinical tests in addition to questionnaires.
Restriction of Musculoskeletal Function Due to Demographic Factors
The results demonstrate that Sock Test scores reflect restriction of musculoskeletal function due to demographic factors. The finding of an increased likelihood of Sock Test scores above 0 (indicating activity limitation) with increases in age is in accordance with previous studies showing a decline in flexibility occurring with age.33,34 The increased likelihood of Sock Test scores above 0 with increases in BMI, especially in groups of patients defined as overweight,35 appears to support the results of another study.36 Increases in BMI were shown to be related to increased risk of disability in middle-aged men.36 The analysis of separate variables suggested that women performed the Sock Test with more ease and flexibility than men did, which conforms with findings showing that women tend to move more flexibly than men.33,34 In our study, however, this indication of a sex difference disappeared completely when all variables were analyzed.
The results seem to support the validity of the Sock Test as a method of evaluating differences in musculoskeletal restrictions. The findings imply, however, that demographic factors such as age and especially BMI can represent a bias in the assessment of activity limitation.
Prediction of Perceived Activity Limitation
The results demonstrated that the Sock Test can be used to predict perceived limitation in the dressing activity 1 year after the pretest examination.
Future Studies
The patients in this study were between 21 and 64 years of age, and the applicability of the Sock Test to older and younger subjects needs to be decided. Whether the Sock Test is equally adequate to evaluate activity limitation for all patients with musculoskeletal pain or whether it should be used primarily for defined subgroups remains to be determined. Another topic for future studies is the validity of the Sock Test in predicting return to work. Normative data of test scores should also be available, derived from subjects not having disabling musculoskeletal pain.
| Conclusion |
|---|
| Acknowledgments |
|---|
| Footnotes |
|---|
The study was approved by the Regional Ethics Committee, Health Region III, Norway, and was performed according to the Helsinki Declaration. The project was approved by the Norwegian Data Inspectorate.
Part of this work was presented as an abstract and a poster at the Second International Forum for Primary Care Research on Low Back Pain, The Hague, the Netherlands, May 30–31, 1997.
| References |
|---|