The large sample size of UK Biobank, coupled with the many phenotypes completed by only a subset of participants and the many rarer binary phenotypes with lower statistical power, provides an excellent opportunity to observe how LDSR results behave across GWAS sample sizes. We consider that relationship here, and provide approximate power curves for detecting \(h^2_g\) with LD Score regression (more details in Methods).
In looking at these trends we largely focus on effective sample size, defined for binary phenotypes as
\[N_{eff} = \frac{4}{\frac{1}{N_{cases}}+\frac{1}{N_{controls}}}\] and as the standard sample size \(N\) for non-binary phenotypes. We expect \(N_{eff}\) to capture statistical power better than \(N\) for rare binary phenotypes, since it equals \(N\) when cases and controls are balanced and shrinks as the case-control ratio becomes more skewed. Below, we compare how \(N\) and \(N_{eff}\) relate to the significance of \(h^2_g\) results.
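As a quick illustration of the definition above, the following Python sketch computes \(N_{eff}\) from case and control counts. The function name `effective_sample_size` and the example counts are hypothetical, chosen only to show how the formula behaves; they are not taken from the UK Biobank analysis.

```python
def effective_sample_size(n_cases: int, n_controls: int) -> float:
    """Effective sample size for a binary phenotype: 4 / (1/N_cases + 1/N_controls)."""
    if n_cases <= 0 or n_controls <= 0:
        raise ValueError("n_cases and n_controls must both be positive")
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)


# Hypothetical balanced design: N_eff matches the total sample size (10,000).
balanced = effective_sample_size(5_000, 5_000)

# Hypothetical rare phenotype: 1,000 cases among 300,000 participants.
# N_eff is close to 4 * N_cases, far below the total sample size.
rare = effective_sample_size(1_000, 299_000)

print(f"balanced: N_eff = {balanced:.0f}")
print(f"rare:     N_eff = {rare:.0f}")
```

In the rare-phenotype example the total sample size is 300,000 but the effective sample size is just under 4,000, which is why \(N\) alone can overstate the power available for such phenotypes.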
The relationship between the SNP heritability estimate and sample size is of interest because we previously observed evidence that \(h^2_g\) estimates may be downwardly biased at low sample sizes. We’ve revisited that question in our evaluation of confidence in the current results, and reproduce the relevant figures here.