Lung Function in Growth and Aging
History

1971

Working with collated data is not entirely new. The first such effort to derive prediction equations for pulmonary function indices was probably that made by Polgar [1]. He and Promadhat compiled reference values for the paediatric age range from the literature published since the 1920s, applying the following set of criteria (quotation):
  • The data had to cover a substantial range of childhood development.
  • The number of subjects had to be stated in the publications and had to be proportional to the volume of information generally available for the particular test.
  • The type of population used for the study had to be described with reasonable assurance that the subjects were indeed normals.
  • Sufficient information about the methodology had to be available to assure that only data obtained by similar techniques would be compared.
  • The data had to be presented relative to standard values of growth and development, or to some standard measurement of the lung, so that they could be compared with others on the basis of a minimal number and the most practical attributes.
  • Unless the "raw data" were published the method for evaluating the results had to conform with some accepted statistical techniques; and enough numerical information had to be disclosed about the actual measurements to ensure the discovery of possible errors or misinterpretation.

The authors relied on published (a) pulmonary function means, standard deviations, and number of cases as a function of sex, height, age, weight or surface area; (b) equations relating pulmonary function to height and other variables; (c) combinations of these presentation forms. They therefore faced the task of deriving a set of new equations for pulmonary function indices based not on raw data sets, but on summary statistics in the form of tables and equations. They did this by constructing a new data set from the published values at discrete heights, and then using it to derive a new prediction equation, which was shown graphically to represent the mean of the original equations. The average percentage coefficient of variation was similarly estimated from published sets of measurements.
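
The procedure can be illustrated with a minimal sketch. It assumes that the published equations are linear in height and that they are pooled without weighting; the coefficients and the height grid are invented for illustration and are not taken from Polgar's book, and the original work may have used other functional forms.

```python
# Hypothetical published equations of the form FVC (L) = slope * height (m) + intercept.
# The coefficients are invented for illustration only.
published = [(4.0, -3.5), (4.4, -4.1), (3.8, -3.2)]


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept to (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Discrete heights (assumed grid: 1.10 m to 1.70 m in 5 cm steps).
heights = [1.10 + 0.05 * i for i in range(13)]

# Construct the synthetic data set: each equation evaluated at each height.
synthetic = [(h, a * h + b) for a, b in published for h in heights]

# Fit one new prediction equation to the synthetic data.
slope, intercept = fit_line(synthetic)
print(f"new equation: FVC = {slope:.3f} * height + {intercept:.3f}")
```

Because every equation is evaluated on the same height grid, the fitted line in this sketch coincides with the average of the published lines. If the grids or the functional forms differ between sources, the result is only an approximation to that average.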

This approach has been shown to be useful, and as of 2008 predicted values for spirometric indices based on Polgar's publication are still widely used worldwide. The procedure was also adopted by the European Community for Coal and Steel (ECCS) in the 1980s [2]. How this decision came about is of historical interest.

1983

A working party had been tasked with revising a previous ECCS recommendation on measurements of lung volumes and ventilatory flows, lung elasticity, airway resistance and gas transfer. Whereas the 1971 ECCS recommendation [3] provided predicted values for some of these indices, the working party had several reasons for wanting to provide a new set of equations. One was that the equipment and the methodology had evolved. Another was that the previous set of equations was based on a highly selected population: male workers in the coal and steel industry. Not only were the predicted values generally regarded as quite high, but the recommended prediction equations for females were not based on a single measurement in females; they were derived as an educated guess from data on males. Permission to carry out a field study to derive new equations was not obtained, because it would be too costly. The working party was also instructed not to publish on women, because the ECCS was financed by levies on coal and steel for the benefit of workers in that industry, and there were no female workers. The working party decided to ignore the instruction on reference values for females and tried to locate files with raw data in order to collate them, but to no avail. It therefore adopted the approach that Polgar had pioneered. An account of the procedures is given in [2].

Deriving predicted values from published prediction equations has several drawbacks, such as:

  • It is impossible to decide retrospectively whether the model applied to derive the published equation is the most appropriate one, and whether distributional assumptions are violated.
  • The ranges of age, height and weight were not always published.
  • The models applied differed between authors: some used logarithmic transformations, others did not. Inevitably a model for a new prediction equation could never do justice to such different models, so that there would be some loss of information.
  • It was not always entirely clear whether the published study fully excluded smokers and ex-smokers.
  • There was a paucity of information in the original publications about differences in the procedures for subject selection, quality control and derivation of summary indices.

1995

The above procedure was also used to derive predicted values for lung volumes for both adults and the paediatric age range [4].

It is obviously much more appropriate to derive prediction equations from 'raw data', obtained with acceptable measurement techniques and satisfactory quality control from a representative healthy reference population. The first such effort was published in 1995 (Pediatric Pulmonology 1995; 19: 135-142). It used 6 data sets from 5 different countries in Europe to derive a new set of regression equations, ascertaining that the new prediction equations were valid in all data sets, and assessing the differences between centres. This gave rise to the following recommendation:

"The approach adopted in this study is equally applicable to other ethnic groups, to other indices of ventilatory function, such as residual volume, functional residual capacity, total lung capacity, and transfer factor for the lung for CO. Similarly, problems need to be resolved about normal ventilatory function in other age ranges, in particular in elderly persons. There is ever increasing awareness that in the monitoring of patients or groups of special interest the assessment of longitudinal data needs to be further developed. There is potentially much to be gained by starting an international data base to this end, to which researchers who have performed studies which comply with international standards could submit their cross-sectional and longitudinal data. It will often obviate the need for costly and time-consuming new studies as so much information is already available but not exploited. We suggest that such a data base should be available for research purposes. The data base will also be of historic interest as it will allow future generations to study cohort effects. Proper selection of submitted material and management of the data base requires that an international body take responsibility. This would be a worthy project under the auspices of the European Respiratory Society and the American Thoracic Society. It would also be a worthy tribute to the approach pioneered by George Polgar."

One of the conclusions in reference 4 was:

There are as yet no validated prediction equations which span the whole age range from childhood to old age. This often leads to large discontinuities in predicted ventilatory function, as subjects move from one age group to the next. With the advent of modern statistical tools, such as splines, this deficiency might be remedied.

2008

The first publication to put the above into practice was by Stanojevic et al. [5]. It was based on 4 data sets in which the ages ranged between (1) 8-80 yr, N = 2273, (2) 4-19 yr, N = 761, (3) 5-18 yr, N = 316, and (4) 5-19 yr, N = 248.

Pooling different data sets should not be done uncritically; the requirements that need to be met are discussed elsewhere. It is possible, for example, that the same model applies to different data sets but that they differ systematically in the level of the predicted value. In that case the newly derived model will still be valid, but the coefficient of variation will be inflated. Cubic splines are a very powerful tool for modelling the relationship between dependent and predictor variables, but there is a danger that one will also model the idiosyncrasies of the combined data sets. Take, for example, FEV1/FVC, which is very dependent on age. If a number of data sets differ systematically in their FEV1/FVC ratios, do not fully overlap in age and contain widely different numbers of observations, a cubic spline with a sufficient number of degrees of freedom will follow a curved pattern to accommodate the different levels, where a straight line might be required.
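
Both effects can be reproduced with simulated data. The sketch below is illustrative only: the model coefficients, the offsets, the noise level and the age ranges are all assumptions. A straight line stands in for the spline to keep the example short, so the second part shows only a distorted slope; a flexible smoother could bend further to follow the step between data sets.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residual_ss(xs, ys):
    """Residual sum of squares around the least-squares line."""
    slope, intercept = fit_line(xs, ys)
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


rng = random.Random(0)  # fixed seed so the run is reproducible


def simulate(n, offset):
    """Hypothetical centre: FEV1 (L) = 4.0 * height (m) - 3.0 + offset + noise."""
    heights = [rng.uniform(1.5, 1.9) for _ in range(n)]
    fev1 = [4.0 * h - 3.0 + offset + rng.gauss(0.0, 0.3) for h in heights]
    return heights, fev1


# Part 1: same model in both centres, but centre B reads 0.4 L higher.
h_a, y_a = simulate(500, 0.0)
h_b, y_b = simulate(500, 0.4)
n_total = len(h_a) + len(h_b)
within_sd = ((residual_ss(h_a, y_a) + residual_ss(h_b, y_b)) / n_total) ** 0.5
pooled_sd = (residual_ss(h_a + h_b, y_a + y_b) / n_total) ** 0.5
print(f"residual SD within centres: {within_sd:.3f} L, pooled: {pooled_sd:.3f} L")

# Part 2: FEV1/FVC with a true slope of -0.002 per year in both data sets,
# partial age overlap, unequal sizes, and a systematic offset of -0.03 in B.
true_slope = -0.002
ages_a = list(range(5, 20))    # 15 observations, ages 5-19
ages_b = list(range(15, 81))   # 66 observations, ages 15-80
ratio_a = [0.95 + true_slope * a for a in ages_a]
ratio_b = [0.92 + true_slope * a for a in ages_b]
pooled_slope, _ = fit_line(ages_a + ages_b, ratio_a + ratio_b)
print(f"true slope: {true_slope}, pooled slope: {pooled_slope:.5f}")
```

In the first part the pooled fit absorbs the offset between centres into the residual scatter, which is what inflates the coefficient of variation. In the second part the offset is attributed to age, so the pooled slope is steeper than the slope within either data set.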

References

  1. Polgar G, Promadhat V. Pulmonary function testing in children: techniques and standards. Philadelphia: WB Saunders Co, 1971.

  2. Quanjer PH, editor. Standardized Lung Function Testing. Report Working Party "Standardization of Lung Function Tests", European Community for Coal and Steel. Bull Europ Physiopathol Respir 1983; 19, suppl. 5.

  3. Cara M, Hentz P. Aide-mémoire of spirographic practice for examining ventilatory function, 2nd edn. Industrial Health and Medicine series, vol 11, 1971; pp 1-130.

  4. Stocks J, Quanjer PH. Reference values for residual volume, functional residual capacity and total lung capacity. Eur Respir J 1995; 8: 492–506.

  5. Stanojevic S, Wade A, Stocks J, Hankinson J, Coates AL, Pan H, Rosenthal M, Corey M, Lebecque P, Cole TJ. Reference ranges for spirometry across all ages. A new approach. Am J Respir Crit Care Med 2008; 177: 253–260.

Last Updated on Monday, 26 July 2010 11:51