EFFECTS OF RANDOMLY AND NON-RANDOMLY DISTRIBUTED MISSING DATA IN SUPPORT VALUES OF BAYESIAN AND PARSIMONY ANALYSIS (Invited Presentation)
Regarding Bayesian analyses, recent studies using both empirical and simulated data matrices have shown that missing data also affect the performance of this method, especially when the missing data are non-randomly distributed. Non-random distribution of missing data is quite common in paleontological data matrices, where missing entries are usually concentrated in highly incompletely scored taxa and highly incompletely scored characters. As in parsimony, the effects of the amount of missing data (and of its different patterns of distribution) on posterior clade probabilities are poorly understood.
Here we present a study of the effect that randomly and non-randomly distributed missing entries have on support values in both Bayesian and parsimony analyses of a set of empirical data matrices of morphological characters. Different regimes of missing entries were artificially added to these datasets, and the support/credibility values obtained for the modified datasets were compared with those of the original matrices (without missing data). The results of these analyses show that support/credibility values are highly sensitive to the presence of non-randomly distributed missing entries, particularly when these are concentrated in highly incompletely scored taxa. A major difference between the results of the two methods lies in the frequency of high credibility values obtained for erroneous groups in the Bayesian analyses.
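
As an illustration of how such regimes of missing entries could be added to a complete matrix, the following is a minimal sketch only. The function names, the toy matrix, the 20% fraction and the choice of two target taxa are assumptions made for the example, not the regimes used in this study. The matrix is represented as one list of character states per taxon, with "?" marking a missing entry.

```python
import random

MISSING = "?"  # conventional symbol for a missing entry in morphological matrices


def add_random_missing(matrix, fraction, rng):
    """Replace a fraction of all cells, chosen uniformly at random, with '?'."""
    n_taxa, n_chars = len(matrix), len(matrix[0])
    cells = [(t, c) for t in range(n_taxa) for c in range(n_chars)]
    chosen = rng.sample(cells, round(fraction * len(cells)))
    out = [list(row) for row in matrix]  # copy, so the original stays intact
    for t, c in chosen:
        out[t][c] = MISSING
    return out


def add_taxon_concentrated_missing(matrix, fraction, n_target_taxa, rng):
    """Add the same total number of '?' cells, but only within a few taxa."""
    n_taxa, n_chars = len(matrix), len(matrix[0])
    n_missing = round(fraction * n_taxa * n_chars)
    targets = rng.sample(range(n_taxa), n_target_taxa)
    cells = [(t, c) for t in targets for c in range(n_chars)]
    if n_missing > len(cells):
        raise ValueError("target taxa cannot hold that many missing entries")
    out = [list(row) for row in matrix]
    for t, c in rng.sample(cells, n_missing):
        out[t][c] = MISSING
    return out


def count_missing(matrix):
    """Total number of missing entries in the matrix."""
    return sum(row.count(MISSING) for row in matrix)


# Toy complete matrix: 6 taxa x 10 binary characters (hypothetical data).
rng = random.Random(0)
original = [[rng.choice("01") for _ in range(10)] for _ in range(6)]

random_regime = add_random_missing(original, 0.2, rng)
taxon_regime = add_taxon_concentrated_missing(original, 0.2, 2, rng)
```

Both modified matrices carry the same amount of missing data, so any difference in support or credibility values between them would reflect the distribution of the missing entries rather than their number. A character-concentrated regime could be sketched the same way by sampling target columns instead of target rows.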