The Annals of Applied Statistics

Inferring rooted population trees using asymmetric neighbor joining

Yongliang Zhai and Alexandre Bouchard-Côté



We introduce a new inference method to estimate the evolutionary distances from any two populations to their most recent common ancestral population using single-nucleotide polymorphism allele frequencies. Our model takes fixation into consideration, making it nonreversible, and guarantees that the distribution of reconstructed ancestral frequencies is contained in the interval $[0,1]$. To scale this method to large numbers of populations, we introduce the asymmetric neighbor joining algorithm, an efficient method for reconstructing rooted bifurcating nonclock trees. Asymmetric neighbor joining provides a scalable rooting method applicable to any nonreversible evolutionary modeling setup. We explore the statistical properties of asymmetric neighbor joining and demonstrate its accuracy on synthetic data. We validate our method by reconstructing rooted phylogenetic trees from the Human Genome Diversity Panel data. Our results are obtained without using an outgroup and are consistent with the prevalent recent single-origin model.

Article information

Ann. Appl. Stat., Volume 10, Number 4 (2016), 2047-2074.

Received: November 2015
Revised: June 2016
First available in Project Euclid: 5 January 2017

Keywords: Asymmetric neighbor-joining algorithm; fixation and drift; phylogenetics; population histories; rooted tree inference; single-nucleotide polymorphism


Zhai, Yongliang; Bouchard-Côté, Alexandre. Inferring rooted population trees using asymmetric neighbor joining. Ann. Appl. Stat. 10 (2016), no. 4, 2047-2074. doi:10.1214/16-AOAS964.



Supplemental materials

  • Supplement to: “Inferring rooted population trees using asymmetric neighbor joining”. We provide additional simulation studies and proofs of the properties of the algorithms in the supplementary material [Zhai and Bouchard-Côté (2016)].