2012-3: An empirical-process central limit theorem for complex sampling under bounds on the design effect

Uniform central limit theorems (‘Donsker theorems’) have been widely useful in semiparametric statistics, both under iid sampling and for stationary sequences and random fields. Only limited results have been available under complex sampling, especially multistage sampling. In this note we derive a complex-sampling analogue of Ossiander’s bracketing-entropy conditions for a uniform central limit theorem, under the assumption that certain design effects are uniformly bounded. We discuss the plausibility of this assumption in realistic surveys.

Thomas Lumley

2012-2: Two-phase subsampling designs for genomic resequencing studies

Targeted resequencing of DNA at specific genes or other genomic loci is now feasible for hundreds or thousands of samples, and costs for larger-scale resequencing are decreasing rapidly. For at least the next few years, resequencing will need to be confined to small subsets of the large samples on which genome-wide association studies have recently been performed. This paper describes some strategies for subsampling an existing cohort for resequencing, and for flexibly analysing the resulting data. We illustrate these strategies by describing the actual design and planned analyses for the example that motivated our research, the CHARGE-S resequencing study carried out by the CHARGE (Cohorts for Heart and Aging Research in Genomic Epidemiology) Consortium.

Thomas Lumley, Josee Dupuis, Kenneth M. Rice, Maja Barbalic, Joshua C. Bis, L. Adrienne Cupples, Bruce M. Psaty, Christopher J. O’Donnell, Eric Boerwinkle

2012-1: Partial Likelihood Ratio Tests for the Cox model under Complex Sampling

We develop an analogue of the likelihood ratio test for Cox proportional hazards models fitted to sample survey data. We look at methods for computing the asymptotic distribution of the test statistic and at ways of improving its small-sample performance. The methods are illustrated with examples using data from the National Health and Nutrition Examination Survey (NHANES) and from a stratified case-cohort study.

Thomas Lumley, Alastair Scott

Now published: read at Statistics in Medicine