THE SPACE OF SAMPLED ANCESTOR TREES
An important property that distinguishes a sampled ancestor tree from a classical phylogenetic tree is that taxa are allowed to be internal nodes of the tree. For example, the information that a particular fossil taxon is ancestral to a particular contemporary taxon can be expressed in a sampled ancestor tree but not in a classical tree.
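As a purely illustrative sketch of this distinction (the Node class, the function name, and the taxon labels below are hypothetical and not part of the work presented), a sampled ancestor tree can be modelled as a rooted tree in which a labelled taxon may have descendants:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    # Taxon name for a sampled node; None for an unsampled (hypothetical) ancestor.
    label: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def sampled_ancestors(root: Node) -> List[str]:
    """Return the labels of taxa that sit at internal nodes of the tree."""
    found = []
    stack = [root]
    while stack:
        node = stack.pop()
        # A labelled node with descendants is a sampled ancestor.
        if node.label is not None and node.children:
            found.append(node.label)
        stack.extend(node.children)
    return sorted(found)


# Assumed example: fossil_F is a direct ancestor of extant_A; extant_B is a sister lineage.
sa_tree = Node(children=[Node("fossil_F", [Node("extant_A")]), Node("extant_B")])

# In a classical tree every taxon is a leaf, so no sampled ancestors are found.
classical_tree = Node(children=[Node("extant_A"), Node("extant_B")])
```

In this toy representation the classical tree is simply the special case in which the function returns an empty list.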
Surprisingly little is known about sampled ancestor trees from a computational and mathematical point of view. Indeed, there exists no standard coordinate system for this type of tree. This poor understanding leads to major problems in computational tree inference: MCMC methods lack efficient navigation algorithms, comparison methods lack a sound metric, and statistical summaries lack consistency. The key obstacle to solving these problems is the dimensionality of the trees.
In this work, we address this gap by proposing a novel coordinate system for phylogenetic sampled ancestor trees. The system scales naturally from continuous to discrete trees by hierarchically approximating continuous time with discrete time segments. Although the elementary moves between trees are inherited from the NNI move, their geometric and algorithmic properties differ greatly from those of NNI.
In this talk, I will introduce the coordinate system and motivate it by popular applications in computational phylogenetics. I will compare the system with classical phylogenetic trees and demonstrate its algorithmic and statistical potential. I will finish by outlining possible directions for future research.