Short answer: only if you have a posterior sample of trees (from BEAST, MrBayes, BirdTree.org, etc.) and you want the tree-topology uncertainty to show up in your pooled standard errors and p-values.
| Situation | Use this article? |
|---|---|
| One tree (published phylogeny, time-calibrated tree) | No — use multi_impute() and the mixed-types vignette. |
| Posterior sample (2 or more trees) | Yes. |
Tree uncertainty enters the analysis in two places. pigauto handles step 1. Step 2 is your responsibility because the downstream model is your choice.

    +--------------------------------------+
    |  Step 1 -- imputation                |
    |                                      |
    |  multi_impute_trees(traits, trees)   |
    |    -> T x m_per_tree completed       |
    |       data.frames, each tagged with  |
    |       the tree that produced it      |
    +------------------+-------------------+
                       |
                       v
    +--------------------------------------+
    |  Step 2 -- analysis + pool           |
    |                                      |
    |  for dataset i:                      |
    |    fit model with trees[[t_i]]       |
    |  pool_mi(fits)                       |
    |                                      |
    |  The SAME tree that produced         |
    |  dataset i is used to fit model i.   |
    +--------------------------------------+

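
The pairing rule in the diagram (dataset i is always analysed with the tree that produced it) can be sketched in base R. This is only an illustration of the bookkeeping: the tree-by-tree ordering, the placeholder strings, and the names `tree_index` and `pairs` are assumptions made for the example, not pigauto's internal layout.

```r
# Assumed layout (hypothetical): completed datasets are ordered tree by tree,
# with m_per_tree consecutive datasets per tree.
n_trees    <- 3L
m_per_tree <- 2L

# t_i for each dataset i: 1 1 2 2 3 3
tree_index <- rep(seq_len(n_trees), each = m_per_tree)

# Placeholders standing in for phylo objects and completed data.frames.
trees    <- paste0("tree_", seq_len(n_trees))
datasets <- paste0("data_", seq_along(tree_index))

# Pair every dataset with the tree that produced it; a downstream model
# would be fitted once per pair.
pairs <- Map(function(dat, t_i) list(data = dat, tree = trees[[t_i]]),
             datasets, tree_index)

length(pairs)  # M = T x m_per_tree pairs
```

In the workflow described here this pairing is carried by the tag that `multi_impute_trees()` attaches to each dataset, so it does not need to be rebuilt by hand.
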
With share_gnn = TRUE (the default), using T = 50 posterior
trees is cheap. Use one imputation per tree (M = 50 total), fit the
downstream model 50 times, each time with the matching tree, and pool
the 50 fits with Rubin’s rules.

    library(pigauto)
    data(avonet300, trees300)

    df <- avonet300
    rownames(df) <- df$Species_Key
    df$Species_Key <- NULL

    mi <- multi_impute_trees(df, trees = trees300, m_per_tree = 1L)
    # share_gnn = TRUE, reference_tree = MCC via phangorn -- all default

    fits <- with_imputations(mi, function(dat, tree) {
      dat$species <- rownames(dat)
      nlme::gls(
        log(Mass) ~ log(Wing.Length),
        correlation = ape::corBrownian(phy = tree, form = ~species),
        data = dat, method = "ML"
      )
    })

    pool_mi(fits)  # pooled SEs include both imputation and tree uncertainty

The code above is illustrative. Full execution takes ~25 min because it fits pigauto on the MCC reference tree, then runs a GLS model for each of the 50 posterior trees. Running the chunk is left to the reader.

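
To make the pooling step concrete, here is a minimal base-R sketch of Rubin's rules for a single coefficient. It is not pigauto's `pool_mi()` implementation; the function name `rubin_pool` and the example numbers are made up for illustration. Because each completed dataset comes from a different posterior tree, the between-fit variance `b` absorbs tree uncertainty as well as imputation uncertainty.

```r
# Hypothetical helper: pool one coefficient across M fits with Rubin's rules.
# `est` = coefficient estimate from each fit, `se` = its standard error.
rubin_pool <- function(est, se) {
  M <- length(est)
  stopifnot(M >= 2, length(se) == M)

  q_bar <- mean(est)                 # pooled point estimate
  w_bar <- mean(se^2)                # within-fit variance
  b     <- stats::var(est)           # between-fit variance
  t_var <- w_bar + (1 + 1 / M) * b   # total variance

  # Rubin's (1987) degrees of freedom; Inf when the fits agree exactly.
  df <- (M - 1) * (1 + w_bar / ((1 + 1 / M) * b))^2

  c(estimate = q_bar,
    se       = sqrt(t_var),
    df       = df,
    lambda   = (1 + 1 / M) * b / t_var)  # share of variance due to between-fit spread
}

# Made-up slope estimates and standard errors from five fits.
est <- c(0.50, 0.52, 0.48, 0.51, 0.49)
se  <- rep(0.05, 5)
pooled <- rubin_pool(est, se)
pooled
```

The pooled standard error is larger than any single fit's standard error whenever the estimates differ across fits, which is how the extra uncertainty reaches the p-values.
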
| T (trees) | m_per_tree | M (total datasets) | When |
|---|---|---|---|
| 50 | 1 | 50 | Default. The canonical setup of Nakagawa & de Villemereuil (2019). |
| 20 | 2 | 40 | Smaller posterior, still stable. |
| 10 | 5 | 50 | Very small posterior; per-tree variance helps. |
| <10 | increase until M >= 25 | >=25 | A runtime warning fires; Rubin’s rules are unstable below M = 25. |
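
The arithmetic behind the table is M = T x m_per_tree. As a rough sketch, the smallest m_per_tree that reaches the M = 25 floor can be computed as below. The helper `m_per_tree_for` is hypothetical and not part of pigauto; the table's recommended values sit at or above what it returns.

```r
# Hypothetical helper: smallest m_per_tree such that
# M = n_trees * m_per_tree reaches target_M.
m_per_tree_for <- function(n_trees, target_M = 25L) {
  stopifnot(all(n_trees >= 1), target_M >= 1)
  as.integer(ceiling(target_M / n_trees))
}

n_trees <- c(50L, 20L, 10L, 5L)
m       <- m_per_tree_for(n_trees)

# Minimum settings per posterior size, with the resulting total M.
data.frame(n_trees = n_trees, m_per_tree = m, M = n_trees * m)
```

Treat the result as a lower bound: with few trees, going above it (as the table does for T = 10) adds per-tree variance information at the cost of more model fits.
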