Research Reports |
J Flegel, PT, MS, is Physical Therapist, Curative Rehabilitation Services, Milwaukee, Wis. She was a student at the University of Illinois at Chicago at the time this research was completed in partial fulfillment of the requirements for her Master of Science degree in physical therapy.
THA Kolobe, PT, PhD, is Assistant Professor, Department of Physical Therapy, University of Illinois at Chicago, Chicago, Ill
Address all correspondence to Ms Flegel at 4069 N Prospect, Milwaukee, WI 53211 (USA) (Judyfpt{at}aol.com)
Submitted July 20, 2001;
Accepted February 6, 2002
| Abstract |
|---|
Key Words: Bruininks-Oseretsky Test of Motor Proficiency, Children, Infant, Motor development, Predictive validity.
| Introduction |
|---|
Given the emotional and financial burden placed on families when there is intervention with infants who are at risk for developmental delays, clinicians must consider the information about sensitivity and specificity when selecting a diagnostic test. Sensitivity and specificity pertain to how well a test performs in the presence (or absence) of the condition of interest. A sensitive test is likely to miss few infants with atypical motor development (low false negative).2 Because it is unlikely that a test will be both highly sensitive and highly specific, the decision to use a particular test depends on the type of risk for misclassification that the test user is willing to accept.2 Fletcher and associates2 recommend using a test with high sensitivity when there is an important consequence for missing a disorder (eg, a serious, progressive condition that is treatable). For children with typical motor development, a specific test is less likely to misdiagnose them as having atypical motor development (low false positive). A highly specific test is useful in substantiating a diagnosis that may have been suggested by other information.2
Because the accuracy of diagnosis of infants born prematurely increases with age,3,4 we believe it is important that, in addition to sensitivity and specificity, diagnostic tools that are used during the neonatal period also demonstrate high predictive values. Predictive values are useful in interpreting the test results of an individual infant. A positive predictive value describes the probability that an infant with a positive test result will indeed have the disorder, whereas a negative predictive value describes the probability that an infant with a negative test result will have a normal outcome.2
Many instruments have been developed to predict developmental outcome in infants, but their accuracy has varied.5 Few of these instruments focus on motor performance. In studies that have examined the predictive validity of motor performance assessments, we believe 2 problems may limit the clinical application of the findings with preterm and very young infants. The first problem is the age at which the outcome measure is administered. Evidence exists that some of the neurological and medical abnormalities observed in premature infants during the first 2 years of life may be transient (transient dystonia).6–8 Given this instability of neuromotor behavior during the first 2 years of life, it would appear that obtaining a measurement after 2 years of age would circumvent this problem.
The second issue is the infant's age at the time the predictive test is administered. The tendency in numerous predictive studies has been to test infants at 4 months of age or older, even though the test under investigation is designed to be used for younger ages (0–4 months).9,10 Consequently, there is a dearth of information regarding motor tests that could be used in NICUs or nurseries to identify infants who are at risk for developmental delays. If the objective of prediction is early identification of infants at risk, such as when the infant is still in the NICU, it is essential that very young infants are included in studies of predictive validity.
The primary purpose of our study was to examine the predictive validity of the Test of Infant Motor Performance (TIMP). We examined whether TIMP scores collected between 32 weeks gestational age and 16 weeks postterm age accurately classified the children as typically developing or developmentally delayed based on fine and gross motor performance at early school age as measured by the Bruininks-Oseretsky Test of Motor Proficiency (BOTMP).11
The TIMP, developed by Campbell and associates,12 is a comprehensive motor test designed to assess functional motor performance of infants between 32 weeks postconceptional age and 16 weeks postterm age. Initial reports on the TIMP suggest that it yields reliable measurements that are valid for discriminating optimal motor performance from poor motor performance in infants born preterm and very young infants.12–14 Test of Infant Motor Performance scores have been reported to be sensitive to changes in infants' motor performance due to maturation and medical complications.12 Infants with greater numbers of medical complications, as measured by the newborn form of the Problem-Oriented Perinatal Risk Assessment System (POPRAS),15 had lower TIMP scores than infants of the same age with fewer medical complications.
The TIMP consists of 59 items divided into 2 sections: Elicited and Observed. The Elicited section items assess the infants' motor responses to placement in various positions and to visual or auditory stimulation. The Observed section items are used to rate spontaneous movement exhibited by infants. Items for the TIMP were taken from neurologic and developmental tests, such as tests developed by Dubowitz et al16 and motor assessment procedures developed by Cioni and Prechtl.17 Additional items and scoring descriptors were developed by Campbell et al.12 The Rasch psychometric model guided the development of the items.18 The items on the TIMP have been shown to have ecological validity (ie, the Elicited section items are similar to demands placed on infants in everyday caregiving situations).19 Concurrent validity with the Alberta Infant Motor Scale was demonstrated for infants at 3 months of age.14 In our opinion, satisfactory interrater reliability of TIMP measurements among experienced examiners and satisfactory test-retest reliability (r=.89) also have been reported.12,13 Although the TIMP demonstrates attributes of a valid developmental test for very young infants, there are no reports concerning whether TIMP scores can predict motor outcome at a later age.
Because motor performance on the TIMP has been found to be correlated with scores on the POPRAS,12 the infants' medical complications cannot be overlooked when examining the predictive validity of the TIMP. Therefore, a secondary purpose of our study was to examine the relationship between perinatal risk, as measured by the POPRAS, and BOTMP scores at early school age. The POPRAS is a perinatal risk scale that was developed by Hobel and associates20 to assess prenatal and perinatal medical complications. All information needed to score the POPRAS is obtained from the medical record. Higher POPRAS scores are associated with greater medical complications, particularly during the neonatal period.20,21
| Method |
|---|
(α=.05, power for the F test=0.80, and R²=.27). Based on the discussion of guidelines for the use of correlation coefficients for health science studies in Portney and Watkins,22 we selected an R² value of .27 because it reflected a moderate degree of association for our power analysis. The minimum number of subjects necessary was determined to be 34. Sixty-five of the 137 subjects from the Campbell et al12 study were located. Two children from each age group and risk category were then randomly selected for testing, except for 7 categories (either risk or age) in which only 1 child had been located. This stratification process yielded 12 children (34.3%) who were classified as being at low risk for developmental delays, 10 children (28.6%) who were classified as being at medium risk for developmental delays, and 13 children (37.1%) who were classified as being at high risk for developmental delays. There were 4 to 6 children in each of the 7 age categories. The parent or guardian of each child gave permission for the child's participation in the study. The subjects had an average gestational age of 32 weeks (SD=5, range=24–41) and had received the TIMP at an average postnatal age of 59 days (SD=49, range=2–164). The mean POPRAS score for the total sample, as infants, was 73.7 (SD=44.3, range=2–166). At the time of testing on the BOTMP, the children ranged in age from 4 years 9 months to 7 years 3 months, with a mean age of 5 years 8 months. There were 19 male children and 16 female children. Sixteen (45.7%) of the children were white, 10 (28.6%) were Latino, and 9 (25.7%) were African American. Three children were diagnosed with cerebral palsy, and 1 child was diagnosed with Down syndrome.
Instrument
The complete battery of the BOTMP was used as a criterion measure of motor outcome. This test is used to assess the motor functioning of children from 4.5 to 14.5 years of age.11 The BOTMP consists of 8 subtests (running speed and agility, balance, bilateral coordination, strength, upper-limb coordination, response speed, visual-motor control, and upper-limb speed and dexterity), with a total of 46 separate items that provide a comprehensive index of motor proficiency as well as separate measures of both gross and fine motor skills. The BOTMP provides age-normed, composite scores, based on a T-score with a mean of 50 (SD=10).11
Reliability
Prior to data collection, interrater reliability for BOTMP scores was established using 6 children of varying abilities who were not part of the study but were within the age range of those in the study. The first author (JF) administered the BOTMP to 4 children while a therapist who used the BOTMP extensively watched and scored independently. The experienced BOTMP user administered the test to 2 children while the first author watched and scored independently. The authors have had extensive experience (7–24 years) in pediatrics, including administration and scoring of various motor tests. The intraclass correlation coefficient (2,1) for the total battery composite score was .97.
The interrater reliability and the test-retest stability of TIMP scores over a 3-day period have been reported.12,13 The Facets computer program* for Rasch psychometric analysis23 was used to determine the scoring consistency for ratings by testers. A criterion of fewer than 5% misfitting ratings (ie, the number of unexpected ratings given the infant's level of ability and the item difficulty) was used for interrater reliability.24 The correlation between the scores over a 3-day period was .89.13
Procedure
The first author administered the BOTMP according to the standardized instructions in the test manual. All children were tested at their homes, except for 3 children who were tested in a physical therapy classroom. Because the subjects for this study were young children, a failure to cooperate was an issue in some cases. This issue was handled as recommended in the BOTMP manual,11 which provides the options of postponing the completion of the session until another day or administering items in a different order than that in the manual. Only 2 children were uncooperative at initial testing, and completion of the session was postponed for them. In both cases, the test was completed in 2 additional sessions that were scheduled within 8 days of the initial test, as recommended in the BOTMP manual. The test was scored according to the instructions in the manual. For purposes of scoring the BOTMP, each child's age was calculated from his or her date of birth.
At the end of the session, the parent or guardian accompanying the child was asked to fill out a questionnaire with items for demographics and the child's medical and developmental history. This information was used to determine whether the infants had medical problems during the period between the TIMP and BOTMP testing that would put them at risk for developmental delays or disabilities. These medical problems would change the nature of the relationship between the results of the 2 tests, because the child might develop motor problems as a result of a new medical problem that did not exist at the time of TIMP testing.
The TIMP and POPRAS scores from the study by Campbell and associates12 were obtained after the administration and scoring of the BOTMP were completed. The first author, therefore, was unaware of the child's POPRAS and TIMP scores when administering the BOTMP.
Data Analysis
Sensitivity, specificity, and positive and negative predictive values were calculated to identify a TIMP cutoff score that might be used in a clinical setting to identify infants with current motor delays who were likely to have long-term motor delay or poor motor performance as assessed with the BOTMP. Poor motor performance was defined as a z score on the BOTMP of −1.5. This cutoff represents the middle range of scores that are considered "low performance" in the BOTMP manual.11 There is no recommended cutoff for the BOTMP composite score in the literature, except what is provided in the manual for the subtests.
Because the TIMP has not been normed, the z scores for the TIMP were first calculated using the means and standard deviations for each of the 7 age groups reported by Campbell et al.12 Because the TIMP z scores used for this analysis were not based on a random sample of the high-risk population, the results of this analysis must be considered preliminary. The receiver operating characteristic curves were used to explore various cutoffs on the TIMP for predicting BOTMP z scores of −1.5. The most correct classifications overall were obtained using a TIMP z score of −1.6. Given our small sample, confidence intervals (CIs) for the sensitivity, specificity, and positive and negative predictive values were calculated using the formula proposed by Fletcher et al.2 According to Fletcher and colleagues, point estimates of sensitivity and specificity values tend to be misleading when the sample is small. These authors recommended reporting the 95% CI for the range of values. Similarly, the diagnostic efficiency of the POPRAS was examined.
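The CI calculation described above can be sketched in a few lines. The text does not reproduce the formula from Fletcher et al, so the sketch below assumes the standard normal-approximation interval for a proportion, p ± 1.96 × √(p(1 − p)/n). The function name `normal_approx_ci` and the choice of n are illustrative assumptions. Using the total sample of 35 children as n is an inference from the reported numbers, not something stated by the authors.

```python
import math


def normal_approx_ci(p, n, z=1.96):
    """Normal-approximation interval for a proportion p observed in n cases.

    Assumption: this is the usual textbook formula; the article only cites
    Fletcher et al and does not print the formula itself.
    """
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    # Clamp to [0, 1] because a proportion cannot leave that range.
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Illustrative input: a sensitivity of .50 with n = 35 (total sample size).
low, high = normal_approx_ci(0.50, 35)
print(f"95% CI: {low:.2f} to {high:.2f}")  # prints "95% CI: 0.33 to 0.67"
```

With a proportion of .50, the interval widens considerably if n is taken as a smaller subgroup rather than the whole sample, which illustrates why interval estimates matter for a sample of this size.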
The relationship between TIMP scores (when adjusted for AGE) and BOTMP battery composite scores was examined using the Pearson product moment correlation coefficient (partial correlation). The TIMP total scores and BOTMP standard composite scores were used, with AGE controlled. It was necessary to control the effect of AGE because the TIMP yields a raw score, with no age norms, and because postconceptional age and TIMP scores are highly correlated (r=.83).12 A Pearson product moment correlation coefficient also was calculated to determine the relationship between the POPRAS and BOTMP battery composite scores.
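A partial correlation of this kind can be computed from the 3 pairwise Pearson coefficients. The sketch below uses the standard first-order formula, r(xy·z) = (r_xy − r_xz r_yz) / √((1 − r_xz²)(1 − r_yz²)). The helper names and all data values are made up for illustration and are not the study data.

```python
import math


def pearson_r(x, y):
    """Pearson product moment correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(x, y, control):
    """First-order partial correlation of x and y with `control` held constant."""
    r_xy = pearson_r(x, y)
    r_xc = pearson_r(x, control)
    r_yc = pearson_r(y, control)
    return (r_xy - r_xc * r_yc) / math.sqrt((1 - r_xc ** 2) * (1 - r_yc ** 2))


# Hypothetical values only: age at TIMP testing (weeks postconceptional age),
# TIMP raw scores, and BOTMP standard composite scores for 8 children.
age_weeks = [33, 35, 38, 40, 44, 48, 52, 56]
timp_raw = [40, 52, 55, 70, 78, 85, 99, 104]
botmp = [38, 52, 41, 58, 47, 44, 60, 49]

print(f"simple r  = {pearson_r(timp_raw, botmp):.2f}")
print(f"partial r = {partial_r(timp_raw, botmp, age_weeks):.2f}")
```

Because raw TIMP scores rise with age, the simple and partial coefficients can differ noticeably, which is the reason for controlling AGE before interpreting the TIMP and BOTMP relationship.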
A hierarchical multiple regression analysis was used to examine the amount of variance in the BOTMP score that was accounted for by the TIMP score (adjusted for AGE), beyond that accounted for by the POPRAS. Order of entry for the independent variables was: AGE, POPRAS score, and TIMP score. Although the POPRAS scores had been categorized in the original study (low risk=a score less than 60, medium risk=a score between 61 and 90, and high risk=a score greater than 91) based on medical complications, the scores were analyzed as continuous data in this analysis.
| Results |
|---|
Table 2 presents the sensitivity, specificity, and positive and negative predictive values using various cutoff scores for the TIMP and a z score on the BOTMP of −1.5 as the cutoff for differentiating normal versus abnormal motor outcomes. Using a cutoff z score of −1.6 for the TIMP, 4 true positive, 0 false positive, 27 true negative, and 4 false negative classifications were observed. These findings resulted in a sensitivity value of .50, a specificity value of 1.0, a positive predictive value of 1.0, and negative predictive value of .87 for the TIMP. Using the 95% CI, the sensitivity values ranged from .33 to .67, and the negative predictive values ranged from .76 to .98. Higher cutoff scores (for example, a z score of −.50) increased the sensitivity value (.75) but at the expense of a lower specificity value (.63) and a lower positive predictive value (.38) (Tab. 2). Overall, 89% of the children were correctly classified by the TIMP cutoff z score of 1.6 below the mean (z=−1.6).
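The diagnostic values reported for the TIMP follow directly from the 4 classification counts given above. The short sketch below recomputes them; the function name `diagnostic_values` is illustrative and is not part of the study's analysis.

```python
def diagnostic_values(tp, fp, tn, fn):
    """Compute standard diagnostic indices from a 2 x 2 classification table."""
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),   # delayed children correctly flagged
        "specificity": tn / (tn + fp),   # typical children correctly cleared
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / total,   # proportion correctly classified
        "prevalence": (tp + fn) / total, # proportion with poor motor outcome
    }


# Counts reported for the TIMP cutoff: 4 true positive, 0 false positive,
# 27 true negative, and 4 false negative classifications.
timp = diagnostic_values(tp=4, fp=0, tn=27, fn=4)
for name, value in timp.items():
    print(f"{name}: {value:.2f}")
# sensitivity: 0.50, specificity: 1.00, ppv: 1.00,
# npv: 0.87, accuracy: 0.89, prevalence: 0.23
```

Note that this sketch would raise a division error for a cutoff that flags no infants at all (tp + fp = 0); a fuller implementation would need to handle that case explicitly.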
| Discussion |
|---|
A certain number of false negative classifications are expected with any test; however, the relatively high number of such classifications in this study is somewhat unusual because false positive classifications are the more typical problem with infant examinations.25,26 High false negative results are a concern to clinicians because they represent children with motor delays at school age who may not have received early intervention or periodic testing based on their early test scores. Therefore, to further explore the findings on sensitivity, each false negative case was inspected (Tab. 5). In particular, the age at which the TIMP was administered was examined to determine whether there was a particular age at which the TIMP score is likely to yield a false negative prediction. Second, we examined the questionnaires that the subjects' parents had completed to determine whether any of the children had notable injuries or illnesses after the administration of the TIMP that might have affected their later motor performance. Finally, the therapists' comments on the TIMP and on the BOTMP score forms were examined to determine whether any unusual occurrences were noted during testing. All 4 of the children with false negative results were born preterm (27–32 weeks gestational age at birth) and were classified as being at moderate to high risk for developmental delays based on their POPRAS scores. Two children were diagnosed with cerebral palsy and the other 2 were diagnosed with developmental delays, based on their scores on the BOTMP.
Ferrari and associates27 examined the development of 29 infants who were at high risk for developmental delays from approximately 32 weeks postconceptional age to approximately 1 to 2 years of age. They reported that 9 of the infants whose previous neurological examination results were poor were considered developmentally normal around term age. The neurological examination results had been abnormal prior to term age, and were abnormal again several weeks later. The authors concluded that the transient normal neurological examination results were due to an interval of "normal" muscle tone (not defined by the authors) during the transition from hypotonia to hypertonia, which often occurs with abnormal motor development. Based on the observation by Ferrari and associates,27 future predictive studies using the TIMP may need to examine whether administration of the TIMP near term age may be more likely to yield a false negative classification.
Based on data from the parent questionnaires, child 1 also was reported to have had a condition that might have resulted in motor disability after the initial TIMP testing. This child had a history of epilepsy that, if not successfully controlled, might have contributed to a decline in motor performance after TIMP testing. Finally, based on comments made on the BOTMP, it is possible that the performance of child 2 on the BOTMP may have been affected by a poor attention span and hesitation in participating. The therapist noted that testing for this child had to be repeated in 3 sessions over a period of 3 days. There was no finding that we felt was consequential with child 3.
To interpret an individual child's test scores, information on the test's predictive values and the prevalence of the condition of interest is needed. The TIMP cutoff z score of −1.6 yielded excellent predictive values (ie, positive predictive values of .75–.80 are considered good28). The positive predictive value and negative predictive value were 1.00 and .87, respectively. These values suggest that, given similar prevalence rates for poor motor outcome, an infant with TIMP z scores of less than −1.6 (less than average performance) is 100% likely to demonstrate less than optimal motor proficiency at early school age, as measured by the BOTMP, whereas an infant with a TIMP score above the cutoff point has an 87% chance of showing optimal motor performance. Based on these findings, it appears that delayed motor performance, as measured by the TIMP, at this very early age in an infant's life may persist to early school age. Although positive predictive values are affected by the prevalence of the condition and are highly inflated if the prevalence is high,2 the 23% prevalence of developmental delay in our sample is consistent with (or, in some cases, lower than) that reported in the literature.29
The cutoff score of 80 provided the best diagnostic values for the POPRAS. The sensitivity and specificity were .88 and .74, respectively. Positive and negative predictive values were .50 and .95, respectively. The sensitivity of .88 suggests that the cutoff score of 80 on the POPRAS may be more effective than the TIMP cutoff score in identifying infants who are likely to experience motor delays later in life; however, the specificity values suggest that the POPRAS yields more false positive classifications than the TIMP. The results of the predictive values also indicate that half of the infants who are identified by the POPRAS as having abnormal motor development will actually have a normal motor outcome, whereas most of those identified as having normal motor development (95%) may have a normal motor outcome. The POPRAS may be more useful if the risk of overidentification of infants as having abnormal motor development is acceptable, but having some certainty that infants with motor problems are not being overlooked is necessary.
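The link between sensitivity, specificity, prevalence, and the predictive values can be illustrated with Bayes' rule. The sketch below takes the POPRAS sensitivity (.88) and specificity (.74) together with the 23% prevalence in this sample and recovers the reported predictive values. The function name and the 5% comparison prevalence are illustrative assumptions, not values from the study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# POPRAS cutoff of 80 at the sample prevalence of 23%.
ppv, npv = predictive_values(0.88, 0.74, 0.23)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")  # prints "PPV = 0.50, NPV = 0.95"

# Hypothetical lower-prevalence setting (5%) to show how the PPV falls
# when the same test is used in a population with fewer affected children.
ppv_low, npv_low = predictive_values(0.88, 0.74, 0.05)
print(f"PPV = {ppv_low:.2f}, NPV = {npv_low:.2f}")
```

The second call shows why predictive values from one sample should be applied to another setting only when the prevalence of poor motor outcome is similar.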
The moderate-to-good association between POPRAS and BOTMP scores observed in this study suggests that a child's early medical complications continue to influence motor performance at a later age. The simple correlation between the POPRAS and BOTMP was .55 (P=.001, R²=.30). We believe this finding is notable because the POPRAS was not designed for long-term prediction, but rather to predict medical risk for mortality in the perinatal and neonatal period.20 Scores on the POPRAS have been shown to be correlated with 1- and 5-minute Apgar scores30 as well as with primary cesarean section.31 The correlation between POPRAS score and later neuromotor outcome observed in our study is consistent with the findings of Campbell and Wilhelm.32 These authors reported that 60% (9 out of 15) of infants who were classified as being at high risk for developmental delay based on a combination of high POPRAS scores and other risk factors had major or minor neuromotor problems at 1 to 3 years of age.
Based on reports of their parents, 10 of the children in this study had received some type of motor therapy at some point during the interval between TIMP/POPRAS testing and BOTMP testing. The effect that this intervention might have had on the child's BOTMP score is unclear. The information on therapies received was based on parent recollection, and the reliability of the frequency, duration, and type of therapy that the child received is unknown.
Clinical Implications
A decision to use any one test for diagnostic purposes hinges on several factors: the psychometric properties of the test (including reliability, sensitivity, and specificity) and ease of administration of the test (including risk and cost factors). The TIMP appears to have good psychometric properties in terms of construct validity and reliability.12,13 Our findings suggest that the TIMP may not be useful if the clinician feels that it is very important to identify most of the children who have a poor neuromotor outcome and is willing to accept overidentification of abnormal motor development in infants who will have normal motor outcomes. The TIMP, however, appears to be a useful tool if the clinician is interested in the reverse (ie, wants confidence that overidentification of infants as having abnormal motor development is minimized and is willing to accept the risk that some infants with problems will be overlooked).
According to Fletcher and associates,2 a test with high specificity is useful when false positive results can harm the patient physically, emotionally, or financially. Although unnecessary interventions are probably not physically harmful to the infant, they have an emotional and financial cost to families. Additionally, Fletcher et al recommend that sensitive tests be used for dangerous, treatable conditions. Although it would be ideal not to overlook an infant with motor problems during early infancy, most neuromotor problems that would be identified on the TIMP actually are not considered of immediate danger to the infant. However, because there is some evidence that early intervention may be effective, false negative results are always a concern. Infants may be monitored through early periodic screening and evaluation programs, which have become a standard practice in most hospitals with level-III nurseries.33 Furthermore, there is currently no other infant motor test that has reported predictive validity from preterm ages to school age.25,34,35
In our study, the sensitivity (.88 versus .50) and the negative predictive value (.95 versus .87) were higher for the POPRAS than for the TIMP. The specificity and the positive predictive value were lower. The TIMP and POPRAS are very different from each other in terms of the items and administration of the test, which affects the ease of their administration. The TIMP is a clinical test of motor performance, whereas the POPRAS is a checklist of items from the prenatal, perinatal, and neonatal medical history in the infant's medical record. This information is usually available to clinicians working in an NICU, but would not normally be available to clinicians in a developmental follow-up clinic or early intervention (birth to 3 years of age) programs. The POPRAS does not involve examination of the infant.
Interpretation of individual scores in a specific setting hinges on positive and negative predictive values, which are determined by the sensitivity, specificity, and prevalence of the disorder in the population being tested.2 Clinicians should be aware that the prevalence reported in this study (23%) included all children with BOTMP z scores lower than −1.5, not only children with specific neuromotor related diagnoses. Using a cutoff z score of −1.6 on the TIMP resulted in very high predictive values: 1.00 for the positive predictive value and .87 for the negative predictive value. Although the negative predictive value is good, clinicians basing a decision on the results of this study must make recommendations knowing that some infants with negative test results will have a poor neuromotor outcome. Because the TIMP is administered at a very young age, clinicians may decide to readminister the TIMP, particularly for infants at higher risk for developmental delays, such as those with a POPRAS score above 80 or a low gestational age. To help interpret individual scores, the range of scores, based on the 95% CI, has been presented. Confidence intervals are particularly important when the sample size is small, as is the case in our study.
Limitations
Three limitations of our study need to be considered before using the TIMP and the cutoff z score of −1.6 to identify infants who will have long-term problems with motor performance. The first limitation was the small number of subjects, despite the fact that the sample was randomly selected from infants who participated in the original TIMP study. Because of the small sample size, 95% CIs for the predictive values have been reported. Second, TIMP z scores were not calculated based on data from a random sample of the population of infants who are at high risk for developmental delays. Use of standard TIMP scores based on normative data would provide more insight regarding the predictive value of the tool. Third, there was very little documentation regarding environmental variables that could have affected motor development during the interval between TIMP and BOTMP administration.
| Conclusion |
|---|
| Footnotes |
|---|
This study was approved by the University of Illinois at Chicago Institutional Review Board-Human Subjects Review Committee.
This work was supported, in part, by a grant from the Maternal and Child Health Bureau.
Subsequent to the submission and acceptance of this paper, Dr Kolobe became a partner in the Infant Motor Performance Scales, a provider of the Test of Infant Motor Performance.
This work was presented orally at the Combined Sections Meeting of the American Physical Therapy Association; February 26, 2000; New Orleans, La.
* MESA Press, 5835 Kimbark Ave, Chicago, IL 60637.
| References |
|---|