OBJECTIVES: To provide estimates and confidence intervals for the performance (detection and false-positive rates) of screening for Down's syndrome using repeated measures of biochemical markers from first and second trimester maternal serum samples taken from the same woman.

DESIGN: Stored serum from Down's syndrome cases and controls was used to provide independent test data for the assessment of screening performance of published risk algorithms and for the development and testing of new risk assessment algorithms.

SETTING: 15 screening centres across the USA, and the North York General Hospital, Toronto, Canada.

PARTICIPANTS: 78 women with pregnancies affected by Down's syndrome and 390 matched unaffected controls, with maternal blood samples obtained at 11-13 and 15-18 weeks' gestation, and women who received integrated prenatal screening at North York General Hospital during two periods: between 1 December 1999 and 31 October 2003, and between 1 October 2006 and 23 November 2007.

INTERVENTIONS: Repeated measurements (first and second trimester) of maternal serum levels of human chorionic gonadotrophin (hCG), unconjugated estriol (uE3) and pregnancy-associated plasma protein A (PAPP-A), together with alpha-fetoprotein (AFP) in the second trimester.

MAIN OUTCOME MEASURES: Detection and false-positive rates for screening with a threshold risk of 1 in 200 at term, and the detection rate achieved for a false-positive rate of 2%.

RESULTS: Published distributional models for Down's syndrome were inconsistent with the test data. When the test data were classified using these models, screening performance deteriorated substantially through the addition of repeated measures. This contradicts the very optimistic results obtained from predictive modelling of performance. Simplified distributional assumptions showed some evidence of benefit from the use of repeated measures of PAPP-A but not from repeated measures of uE3 or hCG. Each of the two test data sets was used to create new parameter estimates against which screening test performance was assessed using the other data set. The results were equivocal, but there was evidence suggesting improvement in screening performance through the use of repeated measures of PAPP-A when the first trimester sample was collected before 13 weeks' gestation. A Bayesian analysis of the combined data from the two test data sets showed that adding a second trimester repeated measurement of PAPP-A to the base test increased detection rates and reduced false-positive rates. The benefit decreased with increasing gestational age at the time of the first sample. There was no evidence of any benefit from repeated measures of hCG or uE3.

CONCLUSIONS: If realised, a reduction of 1% in false-positive rate with no loss in detection rate would give important benefits in terms of health service provision and the large number of invasive tests avoided. The Bayesian analysis, which shows evidence of benefit, is based on strong distributional assumptions and should not be regarded as confirmatory. The evidence of potential benefit suggests the need for a prospective study of repeated measurements of PAPP-A with samples from early in the first trimester. A formal clinical effectiveness and cost-effectiveness analysis should be undertaken. This study has shown that the established modelling methodology for assessing screening performance may be optimistically biased and should be interpreted with caution.
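The two main outcome measures can be illustrated with a minimal sketch. It assumes that each woman already has a computed term risk expressed as a probability (so "1 in 200" is 1/200) and that a result is screen positive when the risk is at or above the threshold; both conventions are assumptions, as are the function names. The risks below are invented purely for illustration and are not data from the study.

```python
import math
from typing import Sequence, Tuple


def rates_at_threshold(
    case_risks: Sequence[float],
    control_risks: Sequence[float],
    threshold: float = 1 / 200,
) -> Tuple[float, float]:
    """Return (detection rate, false-positive rate) at a fixed risk threshold.

    Assumption: a result is screen positive when risk >= threshold.
    """
    detection_rate = sum(r >= threshold for r in case_risks) / len(case_risks)
    false_positive_rate = sum(r >= threshold for r in control_risks) / len(control_risks)
    return detection_rate, false_positive_rate


def detection_rate_at_fpr(
    case_risks: Sequence[float],
    control_risks: Sequence[float],
    target_fpr: float = 0.02,
) -> float:
    """Return the detection rate when at most target_fpr of controls are positive."""
    ranked = sorted(control_risks, reverse=True)
    # Number of controls allowed to be screen positive (small epsilon guards
    # against floating-point error in the product).
    allowed = math.floor(target_fpr * len(ranked) + 1e-9)
    if allowed >= len(ranked):
        return 1.0  # every control may be positive, so every case is detected
    # Cases count as detected only if their risk is strictly above the
    # (allowed + 1)-th highest control risk.
    cutoff = ranked[allowed]
    return sum(r > cutoff for r in case_risks) / len(case_risks)


# Invented illustrative risks (NOT study data).
example_cases = [1 / 5, 1 / 20, 1 / 100, 1 / 250, 1 / 4000]
example_controls = [
    1 / 50, 1 / 150, 1 / 300, 1 / 500, 1 / 1000,
    1 / 2000, 1 / 3000, 1 / 5000, 1 / 8000, 1 / 10000,
]

dr, fpr = rates_at_threshold(example_cases, example_controls)
print(f"Threshold 1 in 200: detection rate {dr:.0%}, false-positive rate {fpr:.0%}")
print(f"Detection rate at 10% FPR: {detection_rate_at_fpr(example_cases, example_controls, 0.10):.0%}")
```

Fixing the false-positive rate rather than the risk threshold makes detection rates comparable across algorithms whose risk estimates are calibrated differently, which is presumably why both measures are reported.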



Publication Title

Health Technol Assess

Organisational Unit

Peninsula Medical School


Algorithms; Bayes Theorem; Biomarkers; Case-Control Studies; Chorionic Gonadotropin; Confidence Intervals; Down Syndrome; Estriol; Female; Humans; Models, Statistical; Pregnancy; Pregnancy Trimester, First; Pregnancy Trimester, Second; Pregnancy-Associated Plasma Protein-A; Prenatal Diagnosis; ROC Curve; Risk Assessment; alpha-Fetoproteins