Research Reports |
CY Chou, OT, MS, is Lecturer, Department of Rehabilitation, Jen-Teh Junior College of Medicine, Nursing and Management, Miaoli, Taiwan, and a doctoral student, Institute of Allied Health Sciences, College of Medicine, National Cheng Kung University, Tainan, Taiwan
CW Chien, OT, BS, is Research Assistant, School of Occupational Therapy, College of Medicine, National Taiwan University, Taipei, Taiwan
IP Hsueh, OT, MA, is Assistant Professor, School of Occupational Therapy, College of Medicine, National Taiwan University
CF Sheu, PhD, is Professor, Department of Psychology, National Chung Cheng University, Chiayi, Taiwan
CH Wang, PT, BS, is Associate Professor, School of Physical Therapy, College of Medical Technology, Chung Shan Medical University, and Department of Physical Therapy, Chung Shan Medical University Rehabilitation Hospital, Taichung, Taiwan
CL Hsieh, OT, PhD, is Professor and Chair, School of Occupational Therapy, College of Medicine, National Taiwan University
(chwang{at}csmu.edu.tw) Address all correspondence to Mr Wang at School of Physical Therapy, College of Medical Technology, Chung Shan Medical University, 110, 1 Sec, Chien-Kuo N Rd, Taichung 402, Taiwan
Submitted December 14, 2004;
Accepted July 18, 2005
| Abstract |
|---|
Subjects and Methods. A total of 226 subjects with stroke participated in this prospective study at 14 days after their stroke; 167 of these subjects also were examined at 90 days after their stroke. The BBS, Barthel Index, and Fugl-Meyer Motor Test were administered at these 2 time points. By reducing the number of tested items to no more than half of those in the original BBS (ie, making 4-, 5-, 6-, and 7-item tests) and simplifying the scoring system of the original BBS (ie, collapsing the 5-level scale into a 3-level scale [BBS-3P]), we generated a total of 8 SFBBSs.
Results. The distributions of scores for all 8 SFBBSs were acceptable but featured notable floor effects. The 4-item BBS, 5-item BBS, 5-item BBS-3P, and 7-item BBS-3P demonstrated good reliability. The subjects' scores on the 6-item BBS, 6-item BBS-3P, 7-item BBS, and 7-item BBS-3P showed excellent agreement with those on the original BBS. The 6-item BBS-3P and 7-item BBS-3P exhibited great responsiveness. Only the 7-item BBS-3P demonstrated psychometric properties that were both satisfactory and similar to those of the original BBS.
Discussion and Conclusion. The 7-item BBS-3P was found to be psychometrically similar to the original BBS. The 7-item BBS-3P, compared with the original BBS, is simpler and faster to complete in either a clinical or a research setting and is recommended. [Chou CY, Chien CW, Hsueh IP, et al. Developing a short form of the Berg Balance Scale for people with stroke.
Key Words: Balance, Cerebrovascular disorders, Item reduction
| Introduction |
|---|
coefficient has been found to be as high as .98)6 indicates, to some extent, item redundancy. These observations suggest that the BBS needs to be simplified in order to improve its utility. The simplification of a measure may include reducing the number of items or shortening the levels of scaling, or both.2,9–12 It has been revealed that certain measures simplified by one or both of these methods are psychometrically similar to the original measures.2,5,9,11 Therefore, the purpose of this study was to develop a short form of the BBS (SFBBS) that was psychometrically equivalent to the original BBS. We hypothesized that at least half of the items on the original BBS could be omitted and that the 5-level scaling could be reduced without sacrificing any psychometric properties. Thus, several SFBBSs are proposed here, and the psychometric properties of the SFBBSs were compared with those of the original BBS for a cohort of subjects who had had a stroke and who were evaluated from 14 days to 3 months after their stroke.
| Method |
|---|
Measures
The BBS has 14 items, including 1 sitting item and 13 standing items.4,6 These items are based on a 5-level scale (0–4). Its total score ranges from 0 to 56. The BBS was originally developed to screen elderly people who are at risk for falling. The psychometric properties of the scale have been found to be satisfactory for people with stroke.5–7
A simplified BBS with a 3-level scale (BBS-3P)5 was developed by collapsing the second, third, and fourth levels of the original scale into a single level. This collapsed level was scored when subjects met the criteria for the original second or higher level of the scale but not when subjects met the criteria for the highest level of the scale. The BBS-3P was found to feature psychometric properties similar to those of the original BBS. Thus, in the present study, both the BBS and the BBS-3P were used in the development of short forms with shortened scaling. For use of the BBS-3P in this study, the data retrieved for this study were recoded as 0-2-4 by collapsing the 3 middle levels of the original 5-level scale.
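The 0-2-4 recoding described above can be illustrated with a minimal sketch. The function names `collapse_to_3_levels` and `recode_bbs_3p` are hypothetical and are not taken from the study; the sketch only assumes that each item score is an integer from 0 to 4.

```python
def collapse_to_3_levels(score):
    """Map one original BBS item score (0-4) to the BBS-3P coding (0, 2, or 4)."""
    if score not in (0, 1, 2, 3, 4):
        raise ValueError(f"BBS item scores must be integers from 0 to 4, got {score!r}")
    if score == 0:
        return 0  # lowest level is kept as is
    if score == 4:
        return 4  # highest level is kept as is
    return 2      # the 3 middle levels (1, 2, 3) collapse into a single level


def recode_bbs_3p(item_scores):
    """Recode a list of original BBS item scores into BBS-3P scores."""
    return [collapse_to_3_levels(s) for s in item_scores]


print(recode_bbs_3p([0, 1, 2, 3, 4]))
```

Because the collapsed level is coded as 2, each recoded item still spans 0 to 4, so the BBS-3P total keeps the same possible range as the original total.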
The Barthel Index (BI) was developed to measure the severity of disability.14 The BI evaluates 10 basic activities of daily living items: feeding, transferring, grooming, toileting, bathing, ambulation, stair climbing, dressing, bowel control, and bladder control.13 The total possible score of the BI ranges from 0 to 100. The BI was previously shown to yield scores with good interrater reliability (intraclass correlation coefficient [ICC]=.94) and high convergent validity (Spearman ρ≥.92) for people with stroke.5,6,15,16 The BI was used to examine the convergent validity and predictive validity of data for the SFBBSs proposed in this study.
The Fugl-Meyer Motor Test (FM)17 has been used to measure motor impairment following stroke. The FM consists of 50 items of upper- and lower-extremity motor function. Each item is graded on a 3-level scale. Its total possible score ranges from 0 to 100 points, and it has been shown to yield data with good interrater reliability (ICC≥.92) and high concurrent validity (r≥.99) for people with stroke.5,18,19 The FM was used to test the convergent validity of data for the SFBBSs proposed in this study.
Procedure
Subjects consecutively enrolled in the Quality of Life After Stroke Study were examined at 14 days after the onset of stroke and reassessed at other specific time points (eg, 90 days) after stroke onset for up to 3 years after the stroke to characterize their recovery of neurologic function (eg, as measured by the FM), balance ability (eg, as measured by the BBS), functional abilities (eg, as measured by the BI), and health-related quality of life. The measures used in this study (ie, the BBS, the FM, and the BI) were administered by an occupational therapist who was not informed of the purpose of this study. The interrater reliabilities for the raters administering the BBS and the BI were satisfactory, with ICCs of .95 and .94, respectively.6,15,16
Development of SFBBSs
In this study, the method used to develop and validate the SFBBSs mainly followed that proposed by Hobart and Thompson.2 These authors selected items featuring the highest internal consistency (ie, minimizing measurement error) and the greatest responsiveness (ie, maximizing the ability to detect change). Thus, this method would appear to be especially useful for developing a measure for monitoring recovery after stroke and measuring outcome after treatment and was adopted in this study. The data retrieved for this study were randomly divided into 2 groups: a calibration group for developing the SFBBSs and a validation group for comparing the psychometric properties of the various SFBBSs with those of the original BBS.
To develop the SFBBSs, the best items were determined by selecting the items with the lowest values from an overall item index of each item.2 The overall item index of each item is the product of the 2 rank orders (ie, the rank order of the corrected item total correlation for an item and the rank order of the effect size for an item). The corrected item total correlation for an item is the correlation between the scores of an individual item and the sum of the scores of all of the items on the scale minus that item. The rank of the corrected item total correlation is useful in removing test items that have a lower correlation with the overall construct measured in the BBS. Furthermore, the effect size for an item is the mean change score (14–90 days after stroke) divided by the standard deviation of the scores at 14 days after stroke. The rank of the effect size is useful in removing test items that show little sensitivity to change. Finally, the corrected item total correlation for each item and the effect size for each item were respectively ranked, and then the product of these rank orders was computed, that is, the overall item index of each item. For example, if the item total correlation rank of a given item is 1 and its effect size rank is 4, then its overall item index is 1x4=4. Lower values for the overall item index indicated better items.
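The rank-product selection described above might be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the corrected item total correlation is assumed to be a Pearson correlation computed on the 14-day scores, and tied values are assumed to share the lowest rank.

```python
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def rank_descending(values):
    """Rank 1 = largest value; tied values share the lowest rank (an assumption)."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def overall_item_index(day14, day90):
    """day14, day90: one row per subject, one score per item, same subject order.

    Assumes every item has nonzero variance at 14 days.
    """
    n_items = len(day14[0])
    correlations, effect_sizes = [], []
    for j in range(n_items):
        item = [row[j] for row in day14]
        rest = [sum(row) - row[j] for row in day14]  # scale total minus this item
        correlations.append(pearson(item, rest))
        change = [after[j] - before[j] for before, after in zip(day14, day90)]
        effect_sizes.append(mean(change) / stdev(item))
    # Product of the two rank orders; lower values indicate better items.
    return [a * b for a, b in zip(rank_descending(correlations),
                                  rank_descending(effect_sizes))]
```

Sorting the items by this index and keeping the first 4 to 7 would then give the candidate item sets. With the example in the text, an item ranked 1 on correlation and 4 on effect size receives an index of 4.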
We hypothesized that the use of 4 to 7 best items would be adequate for the SFBBSs. Four sets of SFBBSs were generated (ie, 4-item BBS, 5-item BBS, 6-item BBS, and 7-item BBS). We also used a technique to collapse the 3 levels in the middle of the BBS into a single level. Thus, we developed an additional 4 sets of SFBBSs (ie, 4-item BBS-3P, 5-item BBS-3P, 6-item BBS-3P, and 7-item BBS-3P). Therefore, a total of 8 SFBBSs were generated.
Data Analysis
To compare the psychometric properties of the 8 SFBBSs and the original BBS, we linearly transformed the scores of the SFBBSs into the same score range as that for the original BBS (0–56). The psychometric properties tested in this study included acceptability, reliability, validity, and responsiveness.
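The text does not spell out the linear transformation, so the sketch below assumes the simplest form, a proportional rescaling of the short-form total onto 0–56. The function name and its parameters are hypothetical.

```python
def rescale_to_bbs_range(raw_total, n_items, item_max=4, full_max=56):
    """Rescale a short-form total (0 .. n_items * item_max) onto 0 .. full_max."""
    short_max = n_items * item_max
    if not 0 <= raw_total <= short_max:
        raise ValueError(f"raw_total must lie between 0 and {short_max}")
    return raw_total * full_max / short_max


# A 7-item form has a raw maximum of 28, so a raw total of 14 maps to the midpoint.
print(rescale_to_bbs_range(14, n_items=7))
```

Under this assumption, the same function serves both the 5-level and the 3-level forms, because the 0-2-4 coding keeps each item maximum at 4.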
Acceptability is a determination of whether the score distributions of a measure can match the distribution corresponding to the subjects intended to be measured.2 A measure exhibiting good acceptability should reveal observable scores spanning the entire range of the scale, with a mean score near the scale midpoint, and featuring small floor and ceiling effects, that is, less than 15% of the subjects achieving the lowest or the highest scores.2,20
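Floor and ceiling effects as defined above can be computed as the percentage of subjects at the lowest and the highest possible score. The following sketch uses a hypothetical function name and made-up scores purely for illustration.

```python
def floor_ceiling_effects(totals, min_score=0, max_score=56):
    """Return (floor %, ceiling %) for a list of total scores."""
    n = len(totals)
    floor_pct = 100 * sum(1 for t in totals if t == min_score) / n
    ceiling_pct = 100 * sum(1 for t in totals if t == max_score) / n
    return floor_pct, ceiling_pct


# Illustrative scores only, not study data.
floor_pct, ceiling_pct = floor_ceiling_effects([0, 0, 10, 20, 56, 30, 40, 8, 16, 24])
print(floor_pct > 15, ceiling_pct > 15)  # compare each against the 15% criterion
```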
Test reliability reflects the degree of precision of a measure; that is, a highly reliable measure generates little measurement error.21,22 To estimate test reliability, Hobart and Thompson2 recommended examination of the internal consistency of a specific test by use of Cronbach α coefficients to determine the intercorrelations among the items.2 It has been suggested that reliability estimations exceed .80 for group comparison studies and .95 for individual patient clinical decision making.2,21 Confidence intervals for the α coefficients were computed.2,23 Confidence intervals for individual scores for subjects with stroke were computed by calculating the standard error of measurement (SEM).21 The SEM indicates the spread of scores.24 The following 2 formulas were used: $\mathrm{SEM} = \mathrm{SD} \times \sqrt{1 - \text{reliability}}$, where SD is the standard deviation of the sample scores, and 95% confidence interval for an individual score $= \pm 1.96 \times \mathrm{SEM}$.
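A small sketch of the reliability calculations may help. It uses the standard Cronbach α formula and the SEM relation given above (standard deviation multiplied by the square root of 1 minus reliability); the function names are hypothetical, and using α as the reliability estimate in the SEM is an assumption.

```python
from math import sqrt
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per subject (at least 2 items)."""
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * sqrt(1 - reliability)


def ci95_half_width(sd, reliability):
    """Half-width of the 95% confidence interval for an individual score."""
    return 1.96 * standard_error_of_measurement(sd, reliability)
```

For instance, with an illustrative standard deviation of 10 points and a reliability of .96, the SEM is about 2 points, so an individual score would carry a 95% confidence interval of roughly ±3.9 points.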
Test validity indicates whether a measure actually determines what it has been constructed to determine.2,25 We examined the agreement between the results of the SFBBSs and the results of the original BBS at 14 days after stroke by using a random-effects model ICC and the method proposed by Bland and Altman,26 which involves plotting the scores of the difference between the original BBS and the SFBBSs against those of the average between the original BBS and the SFBBSs.26 Ideally, there should be no trend showing systematic bias in a Bland-Altman plot.26 These results are useful for determining whether the SFBBSs and the original BBS can be used interchangeably.
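The Bland-Altman quantities described above can be sketched as follows. The function name and the returned keys are hypothetical; the squared correlation between averages and differences is included as one possible way to check for a trend, and the limits of agreement are assumed to be the mean difference ± 1.96 standard deviations.

```python
from statistics import mean, stdev


def bland_altman(original, short_form):
    """Return bias, 95% limits of agreement, and r^2 of difference vs. average.

    Assumes the differences are not all identical (otherwise r^2 is undefined).
    """
    diffs = [o - s for o, s in zip(original, short_form)]
    avgs = [(o + s) / 2 for o, s in zip(original, short_form)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    # Squared Pearson correlation between averages and differences:
    # a value near 0 suggests no trend across the score range.
    ma, md = mean(avgs), mean(diffs)
    sxy = sum((a - ma) * (d - md) for a, d in zip(avgs, diffs))
    sxx = sum((a - ma) ** 2 for a in avgs)
    syy = sum((d - md) ** 2 for d in diffs)
    return {
        "bias": bias,
        "limits": (bias - half_width, bias + half_width),
        "r_squared": (sxy * sxy) / (sxx * syy),
        "points": list(zip(avgs, diffs)),  # (average, difference) pairs to plot
    }
```

Plotting the `points` pairs with horizontal lines at the bias and at the two limits would reproduce the usual Bland-Altman display.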
In addition, 3 validity indicators were examined for the comparisons of the 8 SFBBSs and the original BBS. First, the concurrent validity at 14 days after stroke was examined by computing the intercorrelations between the scores of the SFBBSs and those of the original BBS. Second, the convergent validity for the scores of the SFBBSs, the FM, and the BI at 14 days after stroke also was examined. Third, the predictive validity of scores for the SFBBSs was determined by examining the relationships between the scores of the SFBBSs at 14 days after stroke and those of the BI at 90 days after stroke.
Responsiveness reflects the effectiveness of a measure in detecting changes in the longitudinal follow-up of the participants.27,28 The extent of the responsiveness of the SFBBSs was investigated by calculating effect sizes.22,25,29 Effect sizes were determined by computing the mean of the total score difference between 14 days and 90 days after stroke for each subject, divided by the standard deviation of the total score at 14 days after stroke.16 Larger values suggest greater responsiveness. Finally, we cross-validated the main psychometric properties of the best SFBBS found by using 20 samples that were randomly and repeatedly drawn from the full sample.
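The effect size and the repeated random sampling described above might look like the sketch below. The size of each random draw is not stated in the text, so the `fraction` parameter (like the function names and the fixed seed) is a hypothetical choice made only for illustration.

```python
import random
from statistics import mean, stdev


def effect_size(day14_totals, day90_totals):
    """Mean change from 14 to 90 days divided by the SD of the 14-day scores."""
    change = [later - earlier for earlier, later in zip(day14_totals, day90_totals)]
    return mean(change) / stdev(day14_totals)


def resampled_effect_sizes(day14_totals, day90_totals, n_draws=20, fraction=0.5, seed=0):
    """Effect sizes for n_draws random subsamples drawn without replacement."""
    rng = random.Random(seed)
    size = max(2, int(len(day14_totals) * fraction))
    results = []
    for _ in range(n_draws):
        picked = rng.sample(range(len(day14_totals)), size)
        results.append(effect_size([day14_totals[i] for i in picked],
                                   [day90_totals[i] for i in picked]))
    return results
```

Comparing the spread of the resampled values with the full-sample effect size gives a rough sense of how stable the responsiveness estimate is.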
| Results |
|---|
41.6% of the subjects) for the 8 SFBBSs (Tab. 3).
coefficients (≥.95), but only the 4-item BBS, 5-item BBS, 5-item BBS-3P, and 7-item BBS-3P had lower-limit confidence intervals that met the criterion of .80 (Tab. 3). The SEM of the 8 SFBBSs ranged from 3.6 to 4.7, values that were lower than 5.6 (ie, 10% of the highest possible score of 56, such a score indicating clinical importance).30
Validity
The ICCs for the original BBS and SFBBSs were high (≥.96) (Tab. 3), indicating excellent agreement between the SFBBSs and the original BBS. The limits of agreement of the 6-item BBS, 6-item BBS-3P, 7-item BBS, and 7-item BBS-3P were about half those of the other SFBBSs, indicating that their scores for individual subjects were closer to the scores of the original BBS than were those of the other SFBBSs. Figures 1 and 2 show that only the 6-item BBS-3P and 7-item BBS-3P demonstrated no obvious systematic bias toward the BBS in the Bland-Altman plots (r²≤.04).
.97). Moreover, scores for all of the SFBBSs exhibited equivalent and high convergent validity with scores for the BI (r=.84–.86) and with scores for the FM (r=.66–.68). The extent to which each of the 8 SFBBSs was able to predict the score of the BI at 90 days after stroke also was similar to that of the original BBS and satisfactory (r=.58–.60).
.8). We found that the 7-item BBS-3P was slightly superior to the 6-item BBS-3P in acceptability, reliability, and validity (Tabs. 3 and 4). Only the 7-item BBS-3P met all of the predefined psychometric criteria, with the exception of the floor effects. Furthermore, in the 20 randomly reselected samples, the 7-item BBS-3P demonstrated satisfactory internal consistency, concurrent validity, and responsiveness relative to the original BBS (Tab. 5).
| Discussion |
|---|
Compared with the original BBS, the 7-item BBS-3P is improved in 3 significant aspects. First, the number of items is reduced by half. Second, the scoring levels are reduced from 5 to 3, thereby reducing the possibility of scoring inconsistency. Third, administration of the 7-item BBS-3P requires fewer assessment tools. For example, a stool was not necessary for the 7-item BBS-3P because of the removal of the item "placing alternate foot on stool." All of these improvements allowed the raters to complete the SFBBS within half the time required to complete the original BBS (less than 10 of the original 20 minutes). This advantage of the 7-item BBS-3P decreases the possibility of incomplete data collection and contributes to efficiency in examination.
The use of the 7-item BBS-3P in clinical and research settings can be an improvement over the use of the original BBS given that the 7-item BBS-3P has excellent agreement with the original BBS. The Bland-Altman plot revealed that there was no notable trend between the difference and the average scores of the 7-item BBS-3P and the original BBS. Thus, the 7-item BBS-3P may be used interchangeably with the original BBS. The 7-item BBS-3P is especially useful when the time available for examination is short, such as at follow-up or when the clients are too weak to endure long examinations.
From the perspective of psychometric properties, up to 7 items (eg, standing unsupported and transferring) in the original BBS were found in our study to be redundant because their application did not provide any additional psychometric information. In earlier research, similar findings of item redundancy also were obtained for measures of some other domains, such as activities of daily living or quality of life.1,2,9,10,12,16 Therefore, it is worthwhile to explore in future studies whether there is any possibility of simplifying any of the other domains of conventional measures to decrease item redundancy in the measures and to promote the utility of clinical measures. However, from the clinical point of view, some important aspects of the balance performance of individual patients (eg, standing unsupported and transferring) are not recorded after the deletion of the items. Therefore, the 7-item BBS-3P may not be able to entirely replace the original BBS in the clinical setting, especially when the specific balance functions measured by the items deleted from the original measure are deemed to be treatment goals.
In this study, we used the method described by Hobart and Thompson2 to develop the 7-item BBS-3P. In our data, 6 of the first 7 items selected from the original BBS were ranked as 1 (best) according to their corrected item total correlations, indicating that the corrected item total correlations were somewhat limited in discriminating the psychometric properties of the items of the BBS. Fortunately, this limitation did not interfere with the development of the 7-item BBS-3P. Future studies may add interrater reliability or test-retest reliability2 as supplementary criteria when too many items are ranked the same in the results for the corrected item total correlations.
A rather notable floor effect that was revealed for the 7-item BBS-3P also was found for the original BBS to a lesser extent. This notable floor effect may have resulted from the removal of the easiest item (unsupported sitting) from the 14 items of the BBS. Removing this item from the original BBS could reduce the ability of the 7-item BBS-3P to detect changes in sitting balance. As a result, the floor effect could weaken the ability of the 7-item BBS-3P to differentiate small balance function differences between people with severe stroke. Moreover, such a floor effect could reduce the responsiveness of the measure. However, we found the responsiveness of the 7-item BBS-3P to be satisfactory and very similar to that of the original BBS. Thus, the floor effect of the 7-item BBS-3P may not necessarily restrict its use for detecting balance improvement. From another point of view, the 7-item BBS-3P would benefit people who are able to attain or maintain upright stance without support, because testing easy tasks (eg, unsupported sitting) appears to be irrelevant for these people.
The psychometric properties of the 7-item BBS-3P were internally validated by use of 20 randomly reselected samples. The results of such validation testing provided strong evidence suggesting that the 7-item BBS-3P was psychometrically similar (including internal consistency, concurrent validity, and responsiveness) to the original BBS for people with stroke. Such results suggested that we did not overfit the results of the 7-item BBS-3P to this single data set and that the findings of this study were well supported.
| Conclusion |
|---|
| Footnotes |
|---|
This study was approved by an institutional review board of National Taiwan University Hospital.
This study was supported by research grants from the National Science Council (NSC-90-2815-C-002-022-B) and the National Health Research Institutes (NHRI-EX94–9204PP).
| References |
|---|