popout
GPU-accelerated local ancestry inference at biobank scale — no reference panel required.
Feed it phased WGS from a large cohort and ancestry structure falls out of the joint distribution.
How it works
With 500K+ samples, the data is the reference panel. See docs/THEORY.md
for the full mathematical treatment. The pipeline:
- SEED — Randomized SVD on a SNP subset projects all haplotypes into PCA space. GMM assigns soft ancestry labels. Number of ance
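The SEED step can be illustrated with a small CPU-only sketch. This is not popout's implementation: it uses plain NumPy instead of a GPU backend, and the function names (`simulate_haplotypes`, `randomized_pca`, `gmm_soft_labels`), the diagonal-covariance mixture, the farthest-point initialization, and the hand-picked number of components are all assumptions made for illustration. It runs on simulated 0/1 haplotypes, not real WGS data.

```python
import numpy as np


def simulate_haplotypes(n_per_pop=100, n_snps=300, seed=1):
    """Toy data: two populations with independent allele frequencies (hypothetical)."""
    rng = np.random.default_rng(seed)
    freqs = rng.uniform(0.05, 0.95, size=(2, n_snps))
    truth = np.repeat([0, 1], n_per_pop)
    X = (rng.random((2 * n_per_pop, n_snps)) < freqs[truth]).astype(np.float64)
    return X, truth


def randomized_pca(X, n_components, n_oversample=10, n_iter=2, seed=0):
    """PC scores for the rows of X (haplotypes x SNPs) via a randomized range finder."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)                       # center each SNP column
    k = n_components + n_oversample
    # Orthonormal random start in SNP space, shape (n_snps, k).
    Q, _ = np.linalg.qr(rng.standard_normal((Xc.shape[1], k)))
    for _ in range(n_iter):                       # power iterations sharpen the basis
        Q, _ = np.linalg.qr(Xc @ Q)               # (n_haplotypes, k)
        Q, _ = np.linalg.qr(Xc.T @ Q)             # (n_snps, k)
    B = Xc @ Q                                    # small projected matrix
    U, S, _ = np.linalg.svd(B, full_matrices=False)
    return U[:, :n_components] * S[:n_components]


def gmm_soft_labels(Z, n_clusters, n_iter=50):
    """EM for a diagonal-covariance Gaussian mixture; returns responsibilities."""
    n = Z.shape[0]
    # Deterministic farthest-point initialization of the component means.
    means = [Z[0]]
    for _ in range(n_clusters - 1):
        d2 = np.min([((Z - m) ** 2).sum(axis=1) for m in means], axis=0)
        means.append(Z[np.argmax(d2)])
    means = np.array(means)
    var = np.tile(Z.var(axis=0) + 1e-6, (n_clusters, 1))
    weights = np.full(n_clusters, 1.0 / n_clusters)
    for _ in range(n_iter):
        # E-step: per-component log density, normalized in a stable way.
        diff = Z[:, None, :] - means[None, :, :]             # (n, K, d)
        log_p = -0.5 * (diff ** 2 / var + np.log(2 * np.pi * var)).sum(axis=2)
        log_p += np.log(weights)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update weights, means, and per-dimension variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n
        means = (resp.T @ Z) / nk[:, None]
        diff = Z[:, None, :] - means[None, :, :]
        var = (resp[:, :, None] * diff ** 2).sum(axis=0) / nk[:, None] + 1e-6
    return resp


X, truth = simulate_haplotypes()
Z = randomized_pca(X, n_components=2)      # haplotypes projected into PCA space
resp = gmm_soft_labels(Z, n_clusters=2)    # one soft ancestry label row per haplotype
```

Each row of `resp` sums to 1 and can be read as a soft ancestry assignment for one haplotype. At biobank scale the same structure would need a GPU array library and batched I/O, which this sketch does not attempt.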